easycv.datasets.detection.data_sources package

class easycv.datasets.detection.data_sources.DetSourceCoco(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]

Bases: object

COCO data source.

__init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]
Parameters
  • ann_file – Path of annotation file.

  • img_prefix – Path prefix of the COCO images.

  • test_mode (bool, optional) – If set to True, self._filter_imgs is not applied.

  • filter_empty_gt (bool, optional) – If set to True, images without bounding boxes of the dataset’s classes are filtered out. This option only takes effect when test_mode=False, i.e., images are never filtered during tests.

  • iscrowd – Set to False for training and True for validation.
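
The interaction between test_mode and filter_empty_gt can be sketched as follows. This is only an illustration of the documented behaviour, not EasyCV's implementation: the function keep_image_indices and its bboxes_per_image input are hypothetical, and the real self._filter_imgs may apply further checks.

```python
def keep_image_indices(bboxes_per_image, test_mode=False, filter_empty_gt=False):
    """Return the indices of the images that survive filtering.

    bboxes_per_image is a hypothetical input: one list of boxes per image.
    """
    # During tests nothing is filtered, whatever filter_empty_gt says.
    if test_mode or not filter_empty_gt:
        return list(range(len(bboxes_per_image)))
    # Otherwise drop the images that have no bounding boxes.
    return [i for i, boxes in enumerate(bboxes_per_image) if len(boxes) > 0]


boxes = [[[0, 0, 10, 10]], [], [[5, 5, 20, 20]]]
print(keep_image_indices(boxes, filter_empty_gt=True))  # [0, 2]
print(keep_image_indices(boxes, test_mode=True, filter_empty_gt=True))  # [0, 1, 2]
```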

get_length()[source]

Total number of samples of data.

load_annotations(ann_file)[source]

Load annotation from COCO style annotation file.

Parameters

ann_file (str) – Path of annotation file.

Returns

Annotation info from COCO api.

Return type

list[dict]

get_ann_info(idx)[source]

Get COCO annotation by index.

Parameters

idx (int) – Index of data.

Returns

Annotation info of specified index.

Return type

dict

get_cat_ids(idx)[source]

Get COCO category ids by index.

Parameters

idx (int) – Index of data.

Returns

All categories in the image of specified index.

Return type

list[int]

xyxy2xywh(bbox)[source]

Convert xyxy style bounding boxes to xywh style for COCO evaluation.

Parameters

bbox (numpy.ndarray) – The bounding boxes, shape (4, ), in xyxy order.

Returns

The converted bounding boxes, in xywh order.

Return type

list[float]
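
As a minimal sketch of the conversion described above (not the library's own code), the xywh form keeps the top-left corner and replaces the bottom-right corner with width and height. The function name xyxy_to_xywh is chosen for illustration, and the sketch accepts any sequence of four numbers rather than requiring a numpy.ndarray.

```python
def xyxy_to_xywh(bbox):
    """Convert [x1, y1, x2, y2] to COCO-style [x, y, width, height]."""
    x1, y1, x2, y2 = (float(v) for v in bbox)
    return [x1, y1, x2 - x1, y2 - y1]


print(xyxy_to_xywh([10.0, 20.0, 50.0, 80.0]))  # [10.0, 20.0, 40.0, 60.0]
```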

pre_pipeline(results)[source]

Prepare results dict for pipeline.

prepare_train_img(idx)[source]

Get training data and annotations after pipeline.

Parameters

idx (int) – Index of data.

Returns

Training data and annotation after pipeline with new keys introduced by pipeline.

Return type

dict

get_sample(idx)[source]

class easycv.datasets.detection.data_sources.DetSourcePAI(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

For the data format, refer to: https://help.aliyun.com/document_detail/311173.html

__init__(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]
Parameters
  • path – Path of the manifest file in PAI label format.

  • classes – List of class names.

  • cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.

  • cache_on_the_fly – If set to True, samples are cached in memory during training.

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples
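
The difference between the two caching options can be illustrated with a small sketch. It is an assumption about the general idea (eager versus lazy caching), not EasyCV's implementation: the class CachedSource and the parse callable are hypothetical.

```python
class CachedSource:
    """Hypothetical wrapper illustrating the two caching modes."""

    def __init__(self, items, parse, cache_at_init=False, cache_on_the_fly=False):
        self.items = items
        self.parse = parse
        self.cache_on_the_fly = cache_on_the_fly
        self.cache = {}
        if cache_at_init:
            # Eager: parse every sample once, up front.
            self.cache = {i: parse(item) for i, item in enumerate(items)}

    def get_sample(self, idx):
        if idx in self.cache:
            return self.cache[idx]
        sample = self.parse(self.items[idx])
        if self.cache_on_the_fly:
            # Lazy: remember the sample the first time it is requested.
            self.cache[idx] = sample
        return sample


calls = []


def parse(item):
    calls.append(item)  # record every real parse
    return item.upper()


lazy = CachedSource(['a', 'b'], parse, cache_on_the_fly=True)
lazy.get_sample(0)
lazy.get_sample(0)  # served from the cache, parse is not called again
print(calls)  # ['a']
```

Eager caching pays the parsing cost once at construction time, while lazy caching spreads it over the first pass through the data.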

get_source_iterator()[source]

Return an iterator over the data list. Each item yielded by the source iterator is passed to parse_fn together with classes, so the items must be in the form that parse_fn expects.
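
The contract between the source iterator and parse_fn can be sketched as below. The item format ("filename,label") and both function bodies are hypothetical, chosen only to show that each item is handed to parse_fn along with classes; they are not the PAI manifest format.

```python
def get_source_iterator(lines):
    """Hypothetical source iterator: yields one raw item per sample."""
    return iter(lines)


def parse_fn(item, classes):
    """Hypothetical parser: turns one raw item into a sample dict."""
    name, label = item.split(',')
    return {'filename': name, 'gt_label': classes.index(label)}


classes = ['cat', 'dog']
samples = [parse_fn(item, classes)
           for item in get_source_iterator(['a.jpg,dog', 'b.jpg,cat'])]
print(samples)
```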

class easycv.datasets.detection.data_sources.DetSourceRaw(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

The data directory is laid out as follows:
```
|- data_dir
    |-images
        |-1.jpg
        |-…
    |-labels
        |-1.txt
        |-…
```
The label txt file is laid out as follows. The first column is the label id, and columns 2 to 5 are coordinates relative to the image width and height [x_center, y_center, bbox_w, bbox_h].
```
15 0.519398 0.544087 0.476359 0.572061
2 0.501859 0.820726 0.996281 0.332178
…
```
Example:

    data_source = DetSourceRaw(
        img_root_path='/your/data_dir/images',
        label_root_path='/your/data_dir/labels',
    )
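
To make the label format concrete, the sketch below parses one label row and scales the relative [x_center, y_center, bbox_w, bbox_h] values to pixel corner coordinates. It only illustrates the file format: the function parse_label_line is hypothetical, and whether parse_raw returns boxes in this form is an assumption.

```python
def parse_label_line(line, img_w, img_h, delimeter=' '):
    """Parse one label row into (label_id, [x1, y1, x2, y2]) in pixels."""
    fields = line.strip().split(delimeter)
    label_id = int(fields[0])
    xc, yc, w, h = (float(v) for v in fields[1:5])
    # Scale the relative values by the image size.
    xc, w = xc * img_w, w * img_w
    yc, h = yc * img_h, h * img_h
    # Convert centre/size to corner coordinates.
    return label_id, [xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2]


label_id, box = parse_label_line('15 0.5 0.5 0.5 0.5', img_w=100, img_h=200)
print(label_id, box)  # 15 [25.0, 50.0, 75.0, 150.0]
```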

__init__(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]
Parameters
  • img_root_path – images dir path

  • label_root_path – labels dir path

  • classes (list, optional) – classes list

  • cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.

  • cache_on_the_fly – If set to True, samples are cached in memory during training.

  • delimeter – Delimiter used in the label txt files.

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return an iterator over the data list. Each item yielded by the source iterator is passed to parse_fn together with classes, so the items must be in the form that parse_fn expects.

post_process_fn(result_dict)[source]

class easycv.datasets.detection.data_sources.DetSourceVOC(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

The data directory is laid out as follows:
```
|- voc_data
    |-ImageSets
        |-Main
            |-train.txt
            |-…
    |-JPEGImages
        |-00001.jpg
        |-…
    |-Annotations
        |-00001.xml
        |-…
```
Example 1:

    data_source = DetSourceVOC(
        path='/your/voc_data/ImageSets/Main/train.txt',
        classes=${VOC_CLASSES},
    )

Example 2:

    data_source = DetSourceVOC(
        path='/your/voc_data/train.txt',
        classes=${VOC_CLASSES},
        img_root_path='/your/voc_data/images',
        label_root_path='/your/voc_data/annotations'
    )

__init__(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – Path of the image id list file in ImageSets/Main/.

  • classes – List of class names.

  • img_root_path – Image directory path. If None, the image directory is inferred from path according to the VOC directory layout.

  • label_root_path – Label directory path. If None, the label directory is inferred from path according to the VOC directory layout.

  • cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.

  • cache_on_the_fly – If set to True, samples are cached in memory during training.

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples
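
One plausible way to infer the image and label directories from path, and to build per-sample file names from img_suffix and label_suffix, is sketched below. It assumes the standard VOC layout shown above; the helper names infer_voc_dirs and sample_paths are hypothetical, and the actual inference logic in DetSourceVOC may differ.

```python
import posixpath


def infer_voc_dirs(path, img_root_path=None, label_root_path=None):
    """Guess image/label dirs from <root>/ImageSets/Main/<split>.txt."""
    # Three dirname calls: file -> Main -> ImageSets -> <root>.
    root = posixpath.dirname(posixpath.dirname(posixpath.dirname(path)))
    if img_root_path is None:
        img_root_path = posixpath.join(root, 'JPEGImages')
    if label_root_path is None:
        label_root_path = posixpath.join(root, 'Annotations')
    return img_root_path, label_root_path


def sample_paths(img_id, img_dir, label_dir, img_suffix='.jpg', label_suffix='.xml'):
    """Build the image and label file paths for one image id."""
    return (posixpath.join(img_dir, img_id + img_suffix),
            posixpath.join(label_dir, img_id + label_suffix))


img_dir, label_dir = infer_voc_dirs('/your/voc_data/ImageSets/Main/train.txt')
print(sample_paths('00001', img_dir, label_dir))
```

Explicitly passed directories take precedence in this sketch, which mirrors the second usage example where the id list file does not live under ImageSets/Main/.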

get_source_iterator()[source]

Return an iterator over the data list. Each item yielded by the source iterator is passed to parse_fn together with classes, so the items must be in the form that parse_fn expects.

Submodules

easycv.datasets.detection.data_sources.coco module

class easycv.datasets.detection.data_sources.coco.DetSourceCoco(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]

Bases: object

COCO data source.

__init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]
Parameters
  • ann_file – Path of annotation file.

  • img_prefix – Path prefix of the COCO images.

  • test_mode (bool, optional) – If set to True, self._filter_imgs is not applied.

  • filter_empty_gt (bool, optional) – If set to True, images without bounding boxes of the dataset’s classes are filtered out. This option only takes effect when test_mode=False, i.e., images are never filtered during tests.

  • iscrowd – Set to False for training and True for validation.

get_length()[source]

Total number of samples of data.

load_annotations(ann_file)[source]

Load annotation from COCO style annotation file.

Parameters

ann_file (str) – Path of annotation file.

Returns

Annotation info from COCO api.

Return type

list[dict]

get_ann_info(idx)[source]

Get COCO annotation by index.

Parameters

idx (int) – Index of data.

Returns

Annotation info of specified index.

Return type

dict

get_cat_ids(idx)[source]

Get COCO category ids by index.

Parameters

idx (int) – Index of data.

Returns

All categories in the image of specified index.

Return type

list[int]

xyxy2xywh(bbox)[source]

Convert xyxy style bounding boxes to xywh style for COCO evaluation.

Parameters

bbox (numpy.ndarray) – The bounding boxes, shape (4, ), in xyxy order.

Returns

The converted bounding boxes, in xywh order.

Return type

list[float]

pre_pipeline(results)[source]

Prepare results dict for pipeline.

prepare_train_img(idx)[source]

Get training data and annotations after pipeline.

Parameters

idx (int) – Index of data.

Returns

Training data and annotation after pipeline with new keys introduced by pipeline.

Return type

dict

get_sample(idx)[source]

easycv.datasets.detection.data_sources.pai_format module

easycv.datasets.detection.data_sources.pai_format.get_prior_task_id(keys)[source]

The task id that ends with check has the highest priority.

easycv.datasets.detection.data_sources.pai_format.is_itag_v2(row)[source]

The keyword of the data source is picUrl in v1, but source in v2.

easycv.datasets.detection.data_sources.pai_format.parser_manifest_row_str(row_str, classes)[source]

class easycv.datasets.detection.data_sources.pai_format.DetSourcePAI(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

For the data format, refer to: https://help.aliyun.com/document_detail/311173.html

__init__(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]
Parameters
  • path – Path of the manifest file in PAI label format.

  • classes – List of class names.

  • cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.

  • cache_on_the_fly – If set to True, samples are cached in memory during training.

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return an iterator over the data list. Each item yielded by the source iterator is passed to parse_fn together with classes, so the items must be in the form that parse_fn expects.

easycv.datasets.detection.data_sources.raw module

easycv.datasets.detection.data_sources.raw.parse_raw(source_iter, classes=None, delimeter=' ')[source]

class easycv.datasets.detection.data_sources.raw.DetSourceRaw(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

The data directory is laid out as follows:
```
|- data_dir
    |-images
        |-1.jpg
        |-…
    |-labels
        |-1.txt
        |-…
```
The label txt file is laid out as follows. The first column is the label id, and columns 2 to 5 are coordinates relative to the image width and height [x_center, y_center, bbox_w, bbox_h].
```
15 0.519398 0.544087 0.476359 0.572061
2 0.501859 0.820726 0.996281 0.332178
…
```
Example:

    data_source = DetSourceRaw(
        img_root_path='/your/data_dir/images',
        label_root_path='/your/data_dir/labels',
    )

__init__(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]
Parameters
  • img_root_path – images dir path

  • label_root_path – labels dir path

  • classes (list, optional) – classes list

  • cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.

  • cache_on_the_fly – If set to True, samples are cached in memory during training.

  • delimeter – Delimiter used in the label txt files.

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return an iterator over the data list. Each item yielded by the source iterator is passed to parse_fn together with classes, so the items must be in the form that parse_fn expects.

post_process_fn(result_dict)[source]

easycv.datasets.detection.data_sources.utils module

easycv.datasets.detection.data_sources.utils.exif_size(img)[source]

easycv.datasets.detection.data_sources.voc module

easycv.datasets.detection.data_sources.voc.parse_xml(source_item, classes)[source]

class easycv.datasets.detection.data_sources.voc.DetSourceVOC(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

The data directory is laid out as follows:
```
|- voc_data
    |-ImageSets
        |-Main
            |-train.txt
            |-…
    |-JPEGImages
        |-00001.jpg
        |-…
    |-Annotations
        |-00001.xml
        |-…
```
Example 1:

    data_source = DetSourceVOC(
        path='/your/voc_data/ImageSets/Main/train.txt',
        classes=${VOC_CLASSES},
    )

Example 2:

    data_source = DetSourceVOC(
        path='/your/voc_data/train.txt',
        classes=${VOC_CLASSES},
        img_root_path='/your/voc_data/images',
        label_root_path='/your/voc_data/annotations'
    )

__init__(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – Path of the image id list file in ImageSets/Main/.

  • classes – List of class names.

  • img_root_path – Image directory path. If None, the image directory is inferred from path according to the VOC directory layout.

  • label_root_path – Label directory path. If None, the label directory is inferred from path according to the VOC directory layout.

  • cache_at_init – If set to True, samples are cached in memory in __init__ for faster training.

  • cache_on_the_fly – If set to True, samples are cached in memory during training.

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return an iterator over the data list. Each item yielded by the source iterator is passed to parse_fn together with classes, so the items must be in the form that parse_fn expects.