easycv.datasets.detection.data_sources package¶

class easycv.datasets.detection.data_sources.DetSourceCoco(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶

Bases: object

COCO data source.

__init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶

Parameters

ann_file – Path of annotation file.
img_prefix – Path prefix of the COCO images.
test_mode (bool, optional) – If set to True, self._filter_imgs is not applied.
filter_empty_gt (bool, optional) – If set to True, images without bounding boxes of the dataset’s classes are filtered out. This option only takes effect when test_mode=False, i.e., images are never filtered during tests.
iscrowd – Set to False for training and to True for validation.
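The interaction between test_mode and filter_empty_gt described above can be sketched as follows. This is a minimal illustration, not EasyCV's `_filter_imgs`: the helper name `filter_images` and the plain-dict annotation layout are assumptions made for the example.

```python
def filter_images(img_ids, anns_by_img, class_ids, test_mode=False, filter_empty_gt=False):
    """Return the image ids to keep (hypothetical helper, not part of EasyCV)."""
    # During testing, or when filtering is disabled, every image is kept.
    if test_mode or not filter_empty_gt:
        return list(img_ids)
    wanted = set(class_ids)
    kept = []
    for img_id in img_ids:
        # Keep the image if at least one box belongs to the dataset's classes.
        if any(ann['category_id'] in wanted for ann in anns_by_img.get(img_id, [])):
            kept.append(img_id)
    return kept


# Assumed toy annotations: image 2 has no boxes, image 3 only has a foreign class.
anns = {1: [{'category_id': 3}], 2: [], 3: [{'category_id': 9}]}
print(filter_images([1, 2, 3], anns, class_ids=[3], filter_empty_gt=True))  # [1]
print(filter_images([1, 2, 3], anns, class_ids=[3], test_mode=True, filter_empty_gt=True))  # [1, 2, 3]
```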

get_length()[source]¶: Total number of samples of data.

load_annotations(ann_file)[source]¶

Load annotation from COCO style annotation file.

Parameters: ann_file (str) – Path of annotation file.
Returns: Annotation info from COCO api.
Return type: list[dict]

get_ann_info(idx)[source]¶

Get COCO annotation by index.

Parameters: idx (int) – Index of data.
Returns: Annotation info of specified index.
Return type: dict

get_cat_ids(idx)[source]¶

Get COCO category ids by index.

Parameters: idx (int) – Index of data.
Returns: All categories in the image of specified index.
Return type: list[int]

xyxy2xywh(bbox)[source]¶

Convert xyxy style bounding boxes to xywh style for COCO evaluation.

Parameters: bbox (numpy.ndarray) – The bounding boxes, shape (4, ), in xyxy order.
Returns: The converted bounding boxes, in xywh order.
Return type: list[float]
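As a rough illustration of the conversion, the standalone function below turns one box from corner format into the [x, y, w, h] layout used by COCO annotations. It is a sketch, not the method's source: the name `xyxy_to_xywh` is hypothetical, and width and height are taken as plain differences, which may differ from the pixel convention the library uses.

```python
def xyxy_to_xywh(bbox):
    """Convert one [x1, y1, x2, y2] box to [x, y, w, h] (illustrative sketch)."""
    # Works for any sequence of four numbers, including a numpy array of shape (4,).
    x1, y1, x2, y2 = (float(v) for v in bbox)
    # The top-left corner is kept; width and height are the corner differences.
    return [x1, y1, x2 - x1, y2 - y1]


print(xyxy_to_xywh([10, 20, 50, 80]))  # [10.0, 20.0, 40.0, 60.0]
```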

pre_pipeline(results)[source]¶: Prepare results dict for pipeline.

prepare_train_img(idx)[source]¶

Get training data and annotations after pipeline.

Parameters: idx (int) – Index of data.
Returns: Training data and annotation after pipeline with new keys introduced by pipeline.
Return type: dict

get_sample(idx)[source]¶

class easycv.datasets.detection.data_sources.DetSourcePAI(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]¶

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

For the data format, please refer to: https://help.aliyun.com/document_detail/311173.html

__init__(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]¶

Parameters

path – Path of the manifest file in PAI label format.
classes – classes list
cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.
cache_on_the_fly – If set to True, samples are cached in memory during training.
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples

get_source_iterator()[source]¶: Return the data list iterator. Each item of the source iterator is passed to parse_fn together with classes for parsing, so the items the source iterator returns must match what parse_fn expects.
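The contract between the source iterator and parse_fn can be sketched as below. This is only an illustration under assumptions: `run_source` and `toy_parse_fn` are hypothetical names, and the item format (an "image label" string) is invented for the example rather than taken from the PAI manifest.

```python
def run_source(source_iterator, parse_fn, classes):
    """Feed every item of the source iterator to parse_fn (hypothetical driver)."""
    return [parse_fn(item, classes) for item in source_iterator]


def toy_parse_fn(item, classes):
    # Hypothetical parser: each item is an "image_name label_name" string.
    img_name, label_name = item.split()
    return {'filename': img_name, 'gt_label': classes.index(label_name)}


samples = run_source(iter(['a.jpg cat', 'b.jpg dog']), toy_parse_fn, classes=['cat', 'dog'])
print(samples)
```

Whatever shape the iterator yields, parse_fn has to accept that same shape as its first argument, which is why the two are configured together.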

class easycv.datasets.detection.data_sources.DetSourceRaw(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]¶

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

The data directory is organized as follows:

```
|- data_dir
    |- images
        |- 1.jpg
        |- …
    |- labels
        |- 1.txt
        |- …
```

Each label txt file is formatted as follows: the first column is the label id, and columns 2 to 5 are coordinates relative to the image width and height [x_center, y_center, bbox_w, bbox_h].

```
15 0.519398 0.544087 0.476359 0.572061
2 0.501859 0.820726 0.996281 0.332178
…
```

Example

    data_source = DetSourceRaw(
        img_root_path='/your/data_dir/images',
        label_root_path='/your/data_dir/labels',
    )
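To make the label format concrete, the sketch below parses one label row and converts the normalized [x_center, y_center, bbox_w, bbox_h] values into pixel corner coordinates. It is not the library's parse_raw: the function name `parse_label_line` and the image size used in the call are assumptions for illustration.

```python
def parse_label_line(line, img_w, img_h, delimiter=' '):
    """Parse one label row into (label_id, [x1, y1, x2, y2]) in pixels."""
    fields = line.strip().split(delimiter)
    label_id = int(fields[0])
    xc, yc, bw, bh = (float(v) for v in fields[1:5])
    # Scale the normalized centre/size values to pixels, then move to corners.
    x1 = (xc - bw / 2) * img_w
    y1 = (yc - bh / 2) * img_h
    x2 = (xc + bw / 2) * img_w
    y2 = (yc + bh / 2) * img_h
    return label_id, [x1, y1, x2, y2]


# Assumed 640x480 image and a made-up label row.
label_id, box = parse_label_line('15 0.5 0.5 0.25 0.5', img_w=640, img_h=480)
print(label_id, box)  # 15 [240.0, 120.0, 400.0, 360.0]
```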

__init__(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]¶

Parameters

img_root_path – images dir path
label_root_path – labels dir path
classes (list, optional) – classes list
cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.
cache_on_the_fly – If set to True, samples are cached in memory during training.
delimeter – Delimiter of the txt file.
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
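The difference between cache_at_init and cache_on_the_fly is eager versus lazy caching. The toy class below sketches that distinction; `CachedSource` and `counting_parse` are hypothetical and do not reflect how DetSourceBase is implemented.

```python
class CachedSource:
    """Toy data source illustrating the two caching flags (not EasyCV code)."""

    def __init__(self, items, parse_fn, cache_at_init=False, cache_on_the_fly=False):
        self.items = list(items)
        self.parse_fn = parse_fn
        self.cache_on_the_fly = cache_on_the_fly
        self.cache = {}
        if cache_at_init:
            # Eager: parse everything up front, paying the cost once.
            for idx, item in enumerate(self.items):
                self.cache[idx] = parse_fn(item)

    def get_sample(self, idx):
        if idx in self.cache:
            return self.cache[idx]
        sample = self.parse_fn(self.items[idx])
        if self.cache_on_the_fly:
            # Lazy: remember a sample the first time it is requested.
            self.cache[idx] = sample
        return sample


calls = []


def counting_parse(item):
    # Records every real parse so cache hits can be observed.
    calls.append(item)
    return item.upper()


source = CachedSource(['a', 'b'], counting_parse, cache_on_the_fly=True)
source.get_sample(0)
source.get_sample(0)
print(len(calls))  # 1, the second request is served from the cache
```

Eager caching trades startup time and memory for uniform access cost, while lazy caching only pays for the samples that are actually requested.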

get_source_iterator()[source]¶: Return the data list iterator. Each item of the source iterator is passed to parse_fn together with classes for parsing, so the items the source iterator returns must match what parse_fn expects.

post_process_fn(result_dict)[source]¶

class easycv.datasets.detection.data_sources.DetSourceVOC(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

The data directory is organized as follows:

```
|- voc_data
    |- ImageSets
        |- Main
            |- train.txt
            |- …
    |- JPEGImages
        |- 00001.jpg
        |- …
    |- Annotations
        |- 00001.xml
        |- …
```

Example1:

    data_source = DetSourceVOC(
        path='/your/voc_data/ImageSets/Main/train.txt',
        classes=${VOC_CLASSES},
    )

Example2:

    data_source = DetSourceVOC(
        path='/your/voc_data/train.txt',
        classes=${VOC_CLASSES},
        img_root_path='/your/voc_data/images',
        label_root_path='/your/voc_data/annotations',
    )

__init__(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶

Parameters

path – Path of the image id list file in ImageSets/Main/.
classes – classes list
img_root_path – Image directory path. If None, the image directory is located relative to path, following the VOC data format.
label_root_path – Label directory path. If None, the label directory is located relative to path, following the VOC data format.
cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.
cache_on_the_fly – If set to True, samples are cached in memory during training.
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
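The fallback for img_root_path and label_root_path can be pictured as walking up from the id list file to the dataset root. The sketch below assumes the standard VOC layout shown in the class description; `infer_voc_dirs` is a hypothetical helper, and the actual lookup in DetSourceVOC may differ.

```python
import os


def infer_voc_dirs(path, img_root_path=None, label_root_path=None):
    """Guess the image and label dirs from an ImageSets/Main/*.txt path (illustrative)."""
    # <root>/ImageSets/Main/train.txt -> <root>
    voc_root = os.path.dirname(os.path.dirname(os.path.dirname(path)))
    if img_root_path is None:
        img_root_path = os.path.join(voc_root, 'JPEGImages')
    if label_root_path is None:
        label_root_path = os.path.join(voc_root, 'Annotations')
    return img_root_path, label_root_path


print(infer_voc_dirs('/your/voc_data/ImageSets/Main/train.txt'))
```

Explicitly passed directories are returned unchanged, which matches the case where the id list file lives outside ImageSets/Main/.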

get_source_iterator()[source]¶: Return the data list iterator. Each item of the source iterator is passed to parse_fn together with classes for parsing, so the items the source iterator returns must match what parse_fn expects.

Submodules¶

easycv.datasets.detection.data_sources.coco module¶

class easycv.datasets.detection.data_sources.coco.DetSourceCoco(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶

Bases: object

COCO data source.

__init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶

Parameters

ann_file – Path of annotation file.
img_prefix – Path prefix of the COCO images.
test_mode (bool, optional) – If set to True, self._filter_imgs is not applied.
filter_empty_gt (bool, optional) – If set to True, images without bounding boxes of the dataset’s classes are filtered out. This option only takes effect when test_mode=False, i.e., images are never filtered during tests.
iscrowd – Set to False for training and to True for validation.

get_length()[source]¶: Total number of samples of data.

load_annotations(ann_file)[source]¶

Load annotation from COCO style annotation file.

Parameters: ann_file (str) – Path of annotation file.
Returns: Annotation info from COCO api.
Return type: list[dict]

get_ann_info(idx)[source]¶

Get COCO annotation by index.

Parameters: idx (int) – Index of data.
Returns: Annotation info of specified index.
Return type: dict

get_cat_ids(idx)[source]¶

Get COCO category ids by index.

Parameters: idx (int) – Index of data.
Returns: All categories in the image of specified index.
Return type: list[int]

xyxy2xywh(bbox)[source]¶

Convert xyxy style bounding boxes to xywh style for COCO evaluation.

Parameters: bbox (numpy.ndarray) – The bounding boxes, shape (4, ), in xyxy order.
Returns: The converted bounding boxes, in xywh order.
Return type: list[float]

pre_pipeline(results)[source]¶: Prepare results dict for pipeline.

prepare_train_img(idx)[source]¶

Get training data and annotations after pipeline.

Parameters: idx (int) – Index of data.
Returns: Training data and annotation after pipeline with new keys introduced by pipeline.
Return type: dict

get_sample(idx)[source]¶

easycv.datasets.detection.data_sources.pai_format module¶

easycv.datasets.detection.data_sources.pai_format.get_prior_task_id(keys)[source]¶: The task id that ends with check has the highest priority.
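The priority rule can be sketched as below. This is a hypothetical stand-in, not the EasyCV function: the name `pick_prior_task_id`, the example keys, and the fallback to the first key are all assumptions, since the description only states that a key ending with check wins.

```python
def pick_prior_task_id(keys):
    """Prefer a key ending with 'check' (hypothetical stand-in, not the EasyCV function)."""
    for key in keys:
        if key.endswith('check'):
            return key
    # Assumed fallback: with no 'check' key, take the first key, if any.
    return next(iter(keys), None)


# Made-up task ids for illustration only.
print(pick_prior_task_id(['label-1', 'label-1-check']))  # label-1-check
```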

easycv.datasets.detection.data_sources.pai_format.is_itag_v2(row)[source]¶: The key for the data source is picUrl in v1 and source in v2.
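A version check based on that key difference might look like the sketch below. It assumes the row is a flat dict with the key at the top level, which may not match the real manifest layout; `looks_like_itag_v2` is a hypothetical name and the example rows are invented.

```python
def looks_like_itag_v2(row):
    """Guess the manifest version from the data-source key (illustrative only)."""
    # Assumption: row is a flat dict; a real manifest row may nest these keys.
    if 'source' in row:
        return True
    if 'picUrl' in row:
        return False
    raise KeyError("row has neither 'source' nor 'picUrl'")


print(looks_like_itag_v2({'source': 'oss://bucket/a.jpg'}))  # True
print(looks_like_itag_v2({'picUrl': 'oss://bucket/a.jpg'}))  # False
```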

easycv.datasets.detection.data_sources.pai_format.parser_manifest_row_str(row_str, classes)[source]¶

class easycv.datasets.detection.data_sources.pai_format.DetSourcePAI(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]¶

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

For the data format, please refer to: https://help.aliyun.com/document_detail/311173.html

__init__(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]¶

Parameters

path – Path of the manifest file in PAI label format.
classes – classes list
cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.
cache_on_the_fly – If set to True, samples are cached in memory during training.
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples

get_source_iterator()[source]¶: Return the data list iterator. Each item of the source iterator is passed to parse_fn together with classes for parsing, so the items the source iterator returns must match what parse_fn expects.

easycv.datasets.detection.data_sources.raw module¶

easycv.datasets.detection.data_sources.raw.parse_raw(source_iter, classes=None, delimeter=' ')[source]¶

class easycv.datasets.detection.data_sources.raw.DetSourceRaw(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]¶

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

The data directory is organized as follows:

```
|- data_dir
    |- images
        |- 1.jpg
        |- …
    |- labels
        |- 1.txt
        |- …
```

Each label txt file is formatted as follows: the first column is the label id, and columns 2 to 5 are coordinates relative to the image width and height [x_center, y_center, bbox_w, bbox_h].

```
15 0.519398 0.544087 0.476359 0.572061
2 0.501859 0.820726 0.996281 0.332178
…
```

Example

    data_source = DetSourceRaw(
        img_root_path='/your/data_dir/images',
        label_root_path='/your/data_dir/labels',
    )

__init__(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]¶

Parameters

img_root_path – images dir path
label_root_path – labels dir path
classes (list, optional) – classes list
cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.
cache_on_the_fly – If set to True, samples are cached in memory during training.
delimeter – Delimiter of the txt file.
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples

get_source_iterator()[source]¶: Return the data list iterator. Each item of the source iterator is passed to parse_fn together with classes for parsing, so the items the source iterator returns must match what parse_fn expects.

post_process_fn(result_dict)[source]¶

easycv.datasets.detection.data_sources.utils module¶

easycv.datasets.detection.data_sources.utils.exif_size(img)[source]¶

easycv.datasets.detection.data_sources.voc module¶

easycv.datasets.detection.data_sources.voc.parse_xml(source_item, classes)[source]¶
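parse_xml is listed without a description. As a rough picture of what parsing a VOC annotation involves, the sketch below reads object names and boxes from an inline XML string with the standard library. It is not the EasyCV implementation: `parse_voc_xml`, the returned keys `gt_bboxes` and `gt_labels`, and the sample XML are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up annotation in the usual VOC layout.
VOC_XML = """<annotation>
  <size><width>500</width><height>375</height></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
  <object>
    <name>unicorn</name>
    <bndbox><xmin>1</xmin><ymin>2</ymin><xmax>3</xmax><ymax>4</ymax></bndbox>
  </object>
</annotation>"""


def parse_voc_xml(xml_text, classes):
    """Extract boxes and label indices for objects whose name is in classes."""
    root = ET.fromstring(xml_text)
    boxes, labels = [], []
    for obj in root.findall('object'):
        name = obj.findtext('name')
        if name not in classes:
            continue  # skip categories outside the requested class list
        bnd = obj.find('bndbox')
        boxes.append([float(bnd.findtext(tag)) for tag in ('xmin', 'ymin', 'xmax', 'ymax')])
        labels.append(classes.index(name))
    return {'gt_bboxes': boxes, 'gt_labels': labels}


print(parse_voc_xml(VOC_XML, classes=['cat', 'dog']))
# {'gt_bboxes': [[48.0, 240.0, 195.0, 371.0]], 'gt_labels': [1]}
```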

class easycv.datasets.detection.data_sources.voc.DetSourceVOC(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

The data directory is organized as follows:

```
|- voc_data
    |- ImageSets
        |- Main
            |- train.txt
            |- …
    |- JPEGImages
        |- 00001.jpg
        |- …
    |- Annotations
        |- 00001.xml
        |- …
```

Example1:

    data_source = DetSourceVOC(
        path='/your/voc_data/ImageSets/Main/train.txt',
        classes=${VOC_CLASSES},
    )

Example2:

    data_source = DetSourceVOC(
        path='/your/voc_data/train.txt',
        classes=${VOC_CLASSES},
        img_root_path='/your/voc_data/images',
        label_root_path='/your/voc_data/annotations',
    )

__init__(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶

Parameters

path – Path of the image id list file in ImageSets/Main/.
classes – classes list
img_root_path – Image directory path. If None, the image directory is located relative to path, following the VOC data format.
label_root_path – Label directory path. If None, the label directory is located relative to path, following the VOC data format.
cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.
cache_on_the_fly – If set to True, samples are cached in memory during training.
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples

get_source_iterator()[source]¶: Return the data list iterator. Each item of the source iterator is passed to parse_fn together with classes for parsing, so the items the source iterator returns must match what parse_fn expects.