easycv.datasets.shared package

class easycv.datasets.shared.ConcatDataset(datasets)[source]

Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]

A wrapper for concatenated datasets.

Same as torch.utils.data.dataset.ConcatDataset, but also concatenates each dataset's group flag for image aspect ratio.

Parameters

datasets (list[Dataset]) – A list of datasets.

__init__(datasets)[source]

Initialize self. See help(type(self)) for accurate signature.

datasets: List[torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]]
cumulative_sizes: List[int]
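
A minimal usage sketch (the toy dataset and its flag attribute are illustrative assumptions; EasyCV datasets carry this group flag themselves):

    import numpy as np
    from torch.utils.data import Dataset

    from easycv.datasets.shared import ConcatDataset

    class ToyDataset(Dataset):
        # hypothetical dataset exposing a per-sample group flag
        def __init__(self, n):
            self.flag = np.zeros(n, dtype=np.uint8)

        def __len__(self):
            return len(self.flag)

        def __getitem__(self, idx):
            return idx

    merged = ConcatDataset([ToyDataset(10), ToyDataset(5)])
    print(len(merged))              # 15, as with torch's ConcatDataset
    print(merged.cumulative_sizes)  # [10, 15]
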
class easycv.datasets.shared.RepeatDataset(dataset, times)[source]

Bases: object

A wrapper that repeats a dataset.

The length of the repeated dataset is times * len(dataset). This is useful when the data loading time is long but the dataset is small; using RepeatDataset reduces the data loading time between epochs.

Parameters
  • dataset (Dataset) – The dataset to be repeated.

  • times (int) – Repeat times.

__init__(dataset, times)[source]

Initialize self. See help(type(self)) for accurate signature.
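
A minimal usage sketch (the tiny dataset is an illustrative assumption):

    from torch.utils.data import Dataset

    from easycv.datasets.shared import RepeatDataset

    class TinyDataset(Dataset):
        # hypothetical 10-sample dataset
        def __len__(self):
            return 10

        def __getitem__(self, idx):
            return idx

    repeated = RepeatDataset(TinyDataset(), times=5)
    print(len(repeated))  # 50: times * len(dataset)
    # indices presumably wrap around the original dataset,
    # so repeated[37] corresponds to TinyDataset()[7]
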

class easycv.datasets.shared.OdpsReader(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Bases: object

__init__(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Initialize the ODPS reader and set up the data source to load data from an ODPS table.

Parameters
  • table_name (str) – Name of the ODPS table to load.

  • selected_cols (list(str)) – Columns to select.

  • excluded_cols (list(str)) – Columns to exclude.

  • random_start (bool) – Whether to start reading the ODPS table at a random offset.

  • odps_io_config (dict) – ODPS config containing access_id, access_key and endpoint.

  • image_col (list(str)) – Names of the image columns.

  • image_type (list(str)) – Types of the image columns; url and base64 are supported. Must have the same length as image_col, or be empty.

Returns

None

reset_reader(dataloader_workid, dataloader_worknum)[source]
b64_decode()[source]
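
A hypothetical construction sketch (all credentials and names below are placeholders, not working values):

    from easycv.datasets.shared import OdpsReader

    odps_io_config = dict(
        access_id='<ACCESS_ID>',
        access_key='<ACCESS_KEY>',
        endpoint='<ODPS_ENDPOINT>',
    )

    reader = OdpsReader(
        table_name='my_project.my_table',  # placeholder table name
        selected_cols=['url_image', 'label'],
        image_col=['url_image'],
        image_type=['url'],                # images stored as URLs
        random_start=True,
        odps_io_config=odps_io_config,
    )
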
class easycv.datasets.shared.RawDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(scores, keyword, logger=None)[source]
class easycv.datasets.shared.BaseDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Base Dataset

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

abstract evaluate(results, evaluators, logger=None, **kwargs)[source]
visualize(results, **kwargs)[source]

Visualize the model output results on validation data.

Returns: A dictionary.

If image visualization is added, the returned dict contains:

images: List of visualized images.

img_metas: List with one entry per test image, each a dict of image meta info containing filename, img_shape, origin_img_shape, scale_factor and so on.
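
A minimal subclass sketch: BaseDataset leaves evaluate() abstract, so concrete datasets implement it. The evaluator interface used here (evaluator.evaluate(results)) is an assumption for illustration:

    from easycv.datasets.shared import BaseDataset

    class MyDataset(BaseDataset):

        def evaluate(self, results, evaluators, logger=None, **kwargs):
            eval_result = {}
            for evaluator in evaluators:
                # merge each evaluator's metric dict into one result dict
                eval_result.update(evaluator.evaluate(results))
            return eval_result
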

class easycv.datasets.shared.MultiViewDataset(data_source, num_views, pipelines)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

The dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or by multiple pipelines.

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]
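
An illustrative multi-crop style configuration, on the plausible reading that the i-th pipeline is applied num_views[i] times (the transform configs and the data source are placeholder assumptions):

    num_views = [2, 6]  # 2 large views + 6 small views per image
    pipelines = [
        [dict(type='RandomResizedCrop', size=224)],  # pipeline for large views
        [dict(type='RandomResizedCrop', size=96)],   # pipeline for small views
    ]
    # dataset = MultiViewDataset(data_source, num_views, pipelines)
    # dataset[i] would then hold 8 processed views of image i
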

Submodules

easycv.datasets.shared.base module

class easycv.datasets.shared.base.BaseDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Base Dataset

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

abstract evaluate(results, evaluators, logger=None, **kwargs)[source]
visualize(results, **kwargs)[source]

Visualize the model output results on validation data.

Returns: A dictionary.

If image visualization is added, the returned dict contains:

images: List of visualized images.

img_metas: List with one entry per test image, each a dict of image meta info containing filename, img_shape, origin_img_shape, scale_factor and so on.

easycv.datasets.shared.dali_tfrecord_imagenet module

class easycv.datasets.shared.dali_tfrecord_imagenet.DaliLoaderWrapper(dali_loader, batch_size, label_offset=0)[source]

Bases: object

__init__(dali_loader, batch_size, label_offset=0)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]

Evaluate the classification task.

Parameters
  • results – A dict of lists of tensors containing prediction and ground-truth info; each prediction tensor is N x C, and the ground-truth labels match it along N.

  • evaluators – A list of evaluators.

Returns

a dict of floats, the values of the different metrics

Return type

eval_result
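
A sketch of the expected results layout (the key names 'prob' and 'gt_labels' are illustrative assumptions):

    import torch

    N, C = 8, 1000
    results = {
        'prob': [torch.softmax(torch.randn(N, C), dim=1)],  # N x C predictions
        'gt_labels': [torch.randint(0, C, (N,))],           # N ground-truth labels
    }
    # eval_result = loader_wrapper.evaluate(results, evaluators=[evaluator])
    # -> a dict mapping metric names to float values
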

class easycv.datasets.shared.dali_tfrecord_imagenet.DaliImageNetTFRecordDataSet(data_source, pipeline, distributed, batch_size, label_offset=0, random_shuffle=True, workers_per_gpu=2)[source]

Bases: object

__init__(data_source, pipeline, distributed, batch_size, label_offset=0, random_shuffle=True, workers_per_gpu=2)[source]

Initialize self. See help(type(self)) for accurate signature.

get_dataloader()[source]
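
A hypothetical construction sketch (the data source and DALI pipeline configs are placeholders that depend on the TFRecord setup in use):

    from easycv.datasets.shared.dali_tfrecord_imagenet import DaliImageNetTFRecordDataSet

    dataset = DaliImageNetTFRecordDataSet(
        data_source=data_source_cfg,  # placeholder TFRecord data source config
        pipeline=pipeline_cfg,        # placeholder DALI pipeline config
        distributed=False,
        batch_size=64,
        label_offset=1,               # e.g. shift 1-based labels to 0-based
        random_shuffle=True,
        workers_per_gpu=2,
    )
    loader = dataset.get_dataloader()  # iterable dataloader over the TFRecords
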

easycv.datasets.shared.dali_tfrecord_multi_view module

class easycv.datasets.shared.dali_tfrecord_multi_view.DaliLoaderWrapper(dali_loader, batch_size, return_list)[source]

Bases: object

__init__(dali_loader, batch_size, return_list)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.dali_tfrecord_multi_view.DaliTFRecordMultiViewDataset(data_source, num_views, pipelines, distributed, batch_size, random_shuffle=True, workers_per_gpu=2)[source]

Bases: object

Adapted to DALI: the dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or by multiple pipelines.

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines, distributed, batch_size, random_shuffle=True, workers_per_gpu=2)[source]

Initialize self. See help(type(self)) for accurate signature.

get_dataloader()[source]

easycv.datasets.shared.dataset_wrappers module

class easycv.datasets.shared.dataset_wrappers.ConcatDataset(datasets)[source]

Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]

A wrapper for concatenated datasets.

Same as torch.utils.data.dataset.ConcatDataset, but also concatenates each dataset's group flag for image aspect ratio.

Parameters

datasets (list[Dataset]) – A list of datasets.

__init__(datasets)[source]

Initialize self. See help(type(self)) for accurate signature.

datasets: List[torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]]
cumulative_sizes: List[int]
class easycv.datasets.shared.dataset_wrappers.RepeatDataset(dataset, times)[source]

Bases: object

A wrapper that repeats a dataset.

The length of the repeated dataset is times * len(dataset). This is useful when the data loading time is long but the dataset is small; using RepeatDataset reduces the data loading time between epochs.

Parameters
  • dataset (Dataset) – The dataset to be repeated.

  • times (int) – Repeat times.

__init__(dataset, times)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.shared.multi_view module

class easycv.datasets.shared.multi_view.MultiViewDataset(data_source, num_views, pipelines)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

The dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or by multiple pipelines.

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]

easycv.datasets.shared.odps_reader module

easycv.datasets.shared.odps_reader.set_dataloader_workid(value)[source]
easycv.datasets.shared.odps_reader.set_dataloader_worknum(value)[source]
easycv.datasets.shared.odps_reader.get_dist_image(img_url, max_try=10)[source]
class easycv.datasets.shared.odps_reader.OdpsReader(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Bases: object

__init__(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Initialize the ODPS reader and set up the data source to load data from an ODPS table.

Parameters
  • table_name (str) – Name of the ODPS table to load.

  • selected_cols (list(str)) – Columns to select.

  • excluded_cols (list(str)) – Columns to exclude.

  • random_start (bool) – Whether to start reading the ODPS table at a random offset.

  • odps_io_config (dict) – ODPS config containing access_id, access_key and endpoint.

  • image_col (list(str)) – Names of the image columns.

  • image_type (list(str)) – Types of the image columns; url and base64 are supported. Must have the same length as image_col, or be empty.

Returns

None

reset_reader(dataloader_workid, dataloader_worknum)[source]
b64_decode()[source]

easycv.datasets.shared.raw module

class easycv.datasets.shared.raw.RawDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(scores, keyword, logger=None)[source]