easycv.datasets.shared package

class easycv.datasets.shared.ConcatDataset(datasets)[source]

Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]

A wrapper for concatenated datasets.

Same as torch.utils.data.dataset.ConcatDataset, but also concatenates each dataset's group flag for image aspect ratio.

Parameters

datasets (list[Dataset]) – A list of datasets.

__init__(datasets)[source]

Initialize self. See help(type(self)) for accurate signature.

datasets: List[torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]]
cumulative_sizes: List[int]
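
A minimal usage sketch (the toy dataset and its flag attribute are illustrative assumptions; EasyCV datasets carry this group flag themselves):

    import numpy as np
    from torch.utils.data import Dataset

    from easycv.datasets.shared import ConcatDataset

    class ToyDataset(Dataset):
        # hypothetical dataset exposing a per-sample group flag
        def __init__(self, n):
            self.flag = np.zeros(n, dtype=np.uint8)

        def __len__(self):
            return len(self.flag)

        def __getitem__(self, idx):
            return idx

    merged = ConcatDataset([ToyDataset(10), ToyDataset(5)])
    print(len(merged))              # 15, as with torch's ConcatDataset
    print(merged.cumulative_sizes)  # [10, 15]
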
class easycv.datasets.shared.RepeatDataset(dataset, times)[source]

Bases: object

A wrapper that repeats a dataset.

The length of the repeated dataset is times * len(dataset). This is useful when the data loading time is long but the dataset is small; using RepeatDataset reduces the data loading time between epochs.

Parameters
  • dataset (Dataset) – The dataset to be repeated.

  • times (int) – Repeat times.

__init__(dataset, times)[source]

Initialize self. See help(type(self)) for accurate signature.
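
A minimal usage sketch (the tiny dataset is an illustrative assumption):

    from torch.utils.data import Dataset

    from easycv.datasets.shared import RepeatDataset

    class TinyDataset(Dataset):
        # hypothetical 10-sample dataset
        def __len__(self):
            return 10

        def __getitem__(self, idx):
            return idx

    repeated = RepeatDataset(TinyDataset(), times=5)
    print(len(repeated))  # 50: times * len(dataset)
    # indices presumably wrap around the original dataset,
    # so repeated[37] corresponds to TinyDataset()[7]
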

class easycv.datasets.shared.OdpsReader(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Bases: object

__init__(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Initialize the ODPS reader and set up the data source to load data from an ODPS table.

Parameters
  • table_name (str) – Name of the ODPS table to load.

  • selected_cols (list(str)) – Columns to select.

  • excluded_cols (list(str)) – Columns to exclude.

  • random_start (bool) – Whether to start reading the ODPS table at a random offset.

  • odps_io_config (dict) – ODPS config containing access_id, access_key and endpoint.

  • image_col (list(str)) – Names of the image columns.

  • image_type (list(str)) – Types of the image columns; url and base64 are supported. Must have the same length as image_col, or be empty.

Returns

None

reset_reader(dataloader_workid, dataloader_worknum)[source]
b64_decode()[source]
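
A hypothetical construction sketch (all credentials and names below are placeholders, not working values):

    from easycv.datasets.shared import OdpsReader

    odps_io_config = dict(
        access_id='<ACCESS_ID>',
        access_key='<ACCESS_KEY>',
        endpoint='<ODPS_ENDPOINT>',
    )

    reader = OdpsReader(
        table_name='my_project.my_table',  # placeholder table name
        selected_cols=['url_image', 'label'],
        image_col=['url_image'],
        image_type=['url'],                # images stored as URLs
        random_start=True,
        odps_io_config=odps_io_config,
    )
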
class easycv.datasets.shared.RawDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(scores, keyword, logger=None)[source]
class easycv.datasets.shared.BaseDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Base Dataset

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

abstract evaluate(results, evaluators, logger=None, **kwargs)[source]
visualize(results, **kwargs)[source]

Visualize the model output results on validation data.

Returns: A dictionary.

If image visualization is added, the returned dict contains:

images: List of visualized images.

img_metas: List with one entry per test image, each a dict of image meta info containing filename, img_shape, origin_img_shape, scale_factor and so on.
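
A minimal subclass sketch: BaseDataset leaves evaluate() abstract, so concrete datasets implement it. The evaluator interface used here (evaluator.evaluate(results)) is an assumption for illustration:

    from easycv.datasets.shared import BaseDataset

    class MyDataset(BaseDataset):

        def evaluate(self, results, evaluators, logger=None, **kwargs):
            eval_result = {}
            for evaluator in evaluators:
                # merge each evaluator's metric dict into one result dict
                eval_result.update(evaluator.evaluate(results))
            return eval_result
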

class easycv.datasets.shared.MultiViewDataset(data_source, num_views, pipelines)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

The dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or by multiple pipelines.

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]
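
An illustrative multi-crop style configuration, on the plausible reading that the i-th pipeline is applied num_views[i] times (the transform configs and the data source are placeholder assumptions):

    num_views = [2, 6]  # 2 large views + 6 small views per image
    pipelines = [
        [dict(type='RandomResizedCrop', size=224)],  # pipeline for large views
        [dict(type='RandomResizedCrop', size=96)],   # pipeline for small views
    ]
    # dataset = MultiViewDataset(data_source, num_views, pipelines)
    # dataset[i] would then hold 8 processed views of image i
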

Submodules

easycv.datasets.shared.base module

class easycv.datasets.shared.base.BaseDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Base Dataset

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

abstract evaluate(results, evaluators, logger=None, **kwargs)[source]
visualize(results, **kwargs)[source]

Visualize the model output results on validation data.

Returns: A dictionary.

If image visualization is added, the returned dict contains:

images: List of visualized images.

img_metas: List with one entry per test image, each a dict of image meta info containing filename, img_shape, origin_img_shape, scale_factor and so on.

easycv.datasets.shared.dali_tfrecord_imagenet module

class easycv.datasets.shared.dali_tfrecord_imagenet.DaliLoaderWrapper(dali_loader, batch_size, label_offset=0)[source]

Bases: object

__init__(dali_loader, batch_size, label_offset=0)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]

Evaluate the classification task.

Parameters
  • results – A dict of lists of tensors containing prediction and ground-truth info; each prediction tensor is N x C, and the ground-truth labels match it along N.

  • evaluators – A list of evaluators.

Returns

a dict of floats, the values of the different metrics

Return type

eval_result
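
A sketch of the expected results layout (the key names 'prob' and 'gt_labels' are illustrative assumptions):

    import torch

    N, C = 8, 1000
    results = {
        'prob': [torch.softmax(torch.randn(N, C), dim=1)],  # N x C predictions
        'gt_labels': [torch.randint(0, C, (N,))],           # N ground-truth labels
    }
    # eval_result = loader_wrapper.evaluate(results, evaluators=[evaluator])
    # -> a dict mapping metric names to float values
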

class easycv.datasets.shared.dali_tfrecord_imagenet.DaliImageNetTFRecordDataSet(data_source, pipeline, distributed, batch_size, label_offset=0, random_shuffle=True, workers_per_gpu=2)[source]

Bases: object

__init__(data_source, pipeline, distributed, batch_size, label_offset=0, random_shuffle=True, workers_per_gpu=2)[source]

Initialize self. See help(type(self)) for accurate signature.

get_dataloader()[source]
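
A hypothetical construction sketch (the data source and DALI pipeline configs are placeholders that depend on the TFRecord setup in use):

    from easycv.datasets.shared.dali_tfrecord_imagenet import DaliImageNetTFRecordDataSet

    dataset = DaliImageNetTFRecordDataSet(
        data_source=data_source_cfg,  # placeholder TFRecord data source config
        pipeline=pipeline_cfg,        # placeholder DALI pipeline config
        distributed=False,
        batch_size=64,
        label_offset=1,               # e.g. shift 1-based labels to 0-based
        random_shuffle=True,
        workers_per_gpu=2,
    )
    loader = dataset.get_dataloader()  # iterable dataloader over the TFRecords
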

easycv.datasets.shared.dali_tfrecord_multi_view module

class easycv.datasets.shared.dali_tfrecord_multi_view.DaliLoaderWrapper(dali_loader, batch_size, return_list)[source]

Bases: object

__init__(dali_loader, batch_size, return_list)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.dali_tfrecord_multi_view.DaliTFRecordMultiViewDataset(data_source, num_views, pipelines, distributed, batch_size, random_shuffle=True, workers_per_gpu=2)[source]

Bases: object

Adapted to DALI: the dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or by multiple pipelines.

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines, distributed, batch_size, random_shuffle=True, workers_per_gpu=2)[source]

Initialize self. See help(type(self)) for accurate signature.

get_dataloader()[source]

easycv.datasets.shared.dataset_wrappers module

class easycv.datasets.shared.dataset_wrappers.ConcatDataset(datasets)[source]

Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]

A wrapper for concatenated datasets.

Same as torch.utils.data.dataset.ConcatDataset, but also concatenates each dataset's group flag for image aspect ratio.

Parameters

datasets (list[Dataset]) – A list of datasets.

__init__(datasets)[source]

Initialize self. See help(type(self)) for accurate signature.

datasets: List[torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]]
cumulative_sizes: List[int]
class easycv.datasets.shared.dataset_wrappers.RepeatDataset(dataset, times)[source]

Bases: object

A wrapper that repeats a dataset.

The length of the repeated dataset is times * len(dataset). This is useful when the data loading time is long but the dataset is small; using RepeatDataset reduces the data loading time between epochs.

Parameters
  • dataset (Dataset) – The dataset to be repeated.

  • times (int) – Repeat times.

__init__(dataset, times)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.shared.multi_view module

class easycv.datasets.shared.multi_view.MultiViewDataset(data_source, num_views, pipelines)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

The dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or by multiple pipelines.

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]

easycv.datasets.shared.odps_reader module

easycv.datasets.shared.odps_reader.set_dataloader_workid(value)[source]
easycv.datasets.shared.odps_reader.set_dataloader_worknum(value)[source]
easycv.datasets.shared.odps_reader.get_dist_image(img_url, max_try=10)[source]
class easycv.datasets.shared.odps_reader.OdpsReader(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Bases: object

__init__(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Initialize the ODPS reader and set up the data source to load data from an ODPS table.

Parameters
  • table_name (str) – Name of the ODPS table to load.

  • selected_cols (list(str)) – Columns to select.

  • excluded_cols (list(str)) – Columns to exclude.

  • random_start (bool) – Whether to start reading the ODPS table at a random offset.

  • odps_io_config (dict) – ODPS config containing access_id, access_key and endpoint.

  • image_col (list(str)) – Names of the image columns.

  • image_type (list(str)) – Types of the image columns; url and base64 are supported. Must have the same length as image_col, or be empty.

Returns

None

reset_reader(dataloader_workid, dataloader_worknum)[source]
b64_decode()[source]

easycv.datasets.shared.raw module

class easycv.datasets.shared.raw.RawDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(scores, keyword, logger=None)[source]