easycv.datasets.shared package
- class easycv.datasets.shared.ConcatDataset(datasets)
Bases:
torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]
A wrapper of concatenated dataset.
Same as torch.utils.data.dataset.ConcatDataset, but also concatenates the group flag used for image aspect ratio.
- Parameters
datasets (list[Dataset]) – A list of datasets.
- datasets: List[torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]]
- cumulative_sizes: List[int]
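The idea of concatenating datasets together with their group flags can be illustrated with a minimal, dependency-free sketch. This is not the EasyCV implementation: ToyDataset, ConcatWithFlag and the meaning of the flag values are assumptions made for illustration, and only the cumulative_sizes bookkeeping mirrors the attribute documented above.

```python
import bisect
from itertools import accumulate


class ToyDataset:
    """Hypothetical stand-in for a dataset that carries a per-sample group flag."""

    def __init__(self, samples, flag):
        self.samples = list(samples)
        # Assumed meaning for this sketch: one aspect-ratio group id per sample.
        self.flag = list(flag)

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]


class ConcatWithFlag:
    """Sketch of a concatenation wrapper that also joins the group flags."""

    def __init__(self, datasets):
        self.datasets = list(datasets)
        # Running totals of the dataset lengths, e.g. [2, 5] for lengths 2 and 3.
        self.cumulative_sizes = list(accumulate(len(d) for d in self.datasets))
        # Join the flags only when every wrapped dataset provides one.
        if all(hasattr(d, 'flag') for d in self.datasets):
            self.flag = [f for d in self.datasets for f in d.flag]

    def __len__(self):
        return self.cumulative_sizes[-1] if self.cumulative_sizes else 0

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Locate the dataset holding the global index, then the local offset.
        dataset_idx = bisect.bisect_right(self.cumulative_sizes, idx)
        start = self.cumulative_sizes[dataset_idx - 1] if dataset_idx else 0
        return self.datasets[dataset_idx][idx - start]


a = ToyDataset(['a0', 'a1'], flag=[0, 1])
b = ToyDataset(['b0', 'b1', 'b2'], flag=[1, 1, 0])
merged = ConcatWithFlag([a, b])
```

In this sketch the joined flag list stays aligned with the global sample index, so a sampler that groups samples by flag can treat the wrapper like a single dataset.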
- class easycv.datasets.shared.RepeatDataset(dataset, times)
Bases:
object
A wrapper of repeated dataset.
The length of the repeated dataset is the original length multiplied by times. This is useful when data loading is slow but the dataset is small: using RepeatDataset can reduce the data loading time between epochs.
- Parameters
dataset (Dataset) – The dataset to be repeated.
times (int) – Number of times to repeat the dataset.
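A repeat wrapper of this kind can be sketched in a few lines of plain Python. The class name RepeatSketch is hypothetical and the code is an illustration of the idea under the assumption that indices wrap around the original dataset, not a copy of the EasyCV source.

```python
class RepeatSketch:
    """Illustrative wrapper: reports a longer length and wraps indices."""

    def __init__(self, dataset, times):
        self.dataset = dataset
        self.times = times
        self._ori_len = len(dataset)

    def __len__(self):
        # The wrapper is `times` times as long as the wrapped dataset.
        return self.times * self._ori_len

    def __getitem__(self, idx):
        if not 0 <= idx < len(self):
            raise IndexError(idx)
        # Map the global index back onto the original dataset.
        return self.dataset[idx % self._ori_len]


repeated = RepeatSketch(['x', 'y', 'z'], times=4)
```

With such a wrapper, one pass over the repeated dataset covers the original data times times, so per-epoch costs such as restarting loader workers are paid less often.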
- class easycv.datasets.shared.OdpsReader(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])
Bases:
object
- __init__(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])
Initialize the ODPS reader and the data source used to load data from an ODPS table.
- Parameters
table_name (str) – ODPS table to load.
selected_cols (list(str)) – Columns to select.
excluded_cols (list(str)) – Columns to exclude.
random_start (bool) – Whether to use a random start position for the ODPS table.
odps_io_config (dict) – ODPS config containing access_id, access_key and endpoint.
image_col (list(str)) – Image column names.
image_type (list(str)) – Image column types; url and base64 are supported. Must have the same length as image_col, or length 0.
- Returns
None
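The argument constraints listed above can be checked before a reader is constructed. The helper below is hypothetical (check_reader_args is not part of EasyCV), the credential strings are placeholders, and it assumes that image_type must either be empty or match image_col in length.

```python
# Keys the parameter list above says odps_io_config contains.
REQUIRED_IO_KEYS = ('access_id', 'access_key', 'endpoint')
# Image column types named in the parameter list above.
SUPPORTED_IMAGE_TYPES = ('url', 'base64')


def check_reader_args(odps_io_config, image_col, image_type):
    """Hypothetical pre-flight check; returns a column -> type mapping."""
    missing = [k for k in REQUIRED_IO_KEYS if k not in odps_io_config]
    if missing:
        raise ValueError('odps_io_config is missing keys: %s' % missing)
    # Assumption: an empty image_type is allowed, otherwise lengths must match.
    if image_type and len(image_type) != len(image_col):
        raise ValueError('image_type must match image_col in length or be empty')
    unsupported = [t for t in image_type if t not in SUPPORTED_IMAGE_TYPES]
    if unsupported:
        raise ValueError('unsupported image types: %s' % unsupported)
    return dict(zip(image_col, image_type))


io_config = {
    'access_id': '<your-access-id>',      # placeholder, not a real credential
    'access_key': '<your-access-key>',    # placeholder, not a real credential
    'endpoint': '<your-odps-endpoint>',   # placeholder endpoint
}
column_types = check_reader_args(io_config, ['url_image'], ['url'])
```

Validated values like these would then be passed to OdpsReader together with table_name, as in the signature documented above.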
- class easycv.datasets.shared.RawDataset(data_source, pipeline, profiling=False)
Bases:
Generic[torch.utils.data.dataset.T_co]
- class easycv.datasets.shared.BaseDataset(data_source, pipeline, profiling=False)
Bases:
Generic[torch.utils.data.dataset.T_co]
Base Dataset
- __init__(data_source, pipeline, profiling=False)
Initialize self. See help(type(self)) for accurate signature.
- visualize(results, **kwargs)
Visualize the model output results on validation data.
- Returns
A dictionary. If image visualization is added, the dict contains:
images: List of visualized images.
img_metas: List with one entry per test image; each entry is a dict of image meta info containing filename, img_shape, origin_img_shape, scale_factor and so on.
- class easycv.datasets.shared.MultiViewDataset(data_source, num_views, pipelines)
Bases:
Generic[torch.utils.data.dataset.T_co]
The dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or multiple pipelines.
- Parameters
num_views (list) – The number of different views.
pipelines (list[list[dict]]) – A list of pipelines.
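One plausible reading of how num_views and pipelines pair up is that the i-th pipeline is applied num_views[i] times to the same image. The sketch below illustrates that reading only. It assumes the two lists are index-aligned, uses plain callables in place of the list[dict] pipeline configs, and uses 'img' as an assumed output key; make_views is a hypothetical helper, not an EasyCV function.

```python
def make_views(img, num_views, pipelines):
    """Apply pipelines[i] to img num_views[i] times and collect all views."""
    if len(num_views) != len(pipelines):
        raise ValueError('num_views and pipelines must have the same length')
    views = []
    for count, pipeline in zip(num_views, pipelines):
        for _ in range(count):
            # Real pipelines are usually random augmentations, so repeated
            # calls would yield different views of the same image.
            views.append(pipeline(img))
    return {'img': views}


# Stand-in "pipelines" operating on a number instead of an image.
def add_one(x):
    return x + 1


def times_ten(x):
    return x * 10


out = make_views(5, num_views=[2, 3], pipelines=[add_one, times_ten])
```

Here the output holds five views in total: two from the first pipeline followed by three from the second.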
Submodules
easycv.datasets.shared.base module
- class easycv.datasets.shared.base.BaseDataset(data_source, pipeline, profiling=False)
Bases:
Generic[torch.utils.data.dataset.T_co]
Base Dataset
- __init__(data_source, pipeline, profiling=False)
Initialize self. See help(type(self)) for accurate signature.
- visualize(results, **kwargs)
Visualize the model output results on validation data.
- Returns
A dictionary. If image visualization is added, the dict contains:
images: List of visualized images.
img_metas: List with one entry per test image; each entry is a dict of image meta info containing filename, img_shape, origin_img_shape, scale_factor and so on.
easycv.datasets.shared.dali_tfrecord_imagenet module
- class easycv.datasets.shared.dali_tfrecord_imagenet.DaliLoaderWrapper(dali_loader, batch_size, label_offset=0)
Bases:
object
- __init__(dali_loader, batch_size, label_offset=0)
Initialize self. See help(type(self)) for accurate signature.
- evaluate(results, evaluators, logger=None)
Evaluate a classification task.
- Parameters
results – A dict of lists of tensors, including prediction and ground-truth info. The prediction tensor is NxC, and the same applies to the ground-truth labels.
evaluators – A list of evaluators.
- Returns
A dict of floats holding the different metric values.
- Return type
eval_result
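The evaluation flow described above, where each evaluator turns predictions and ground truth into named float metrics, can be sketched without any framework. Everything here is an assumption for illustration: the result keys 'prediction' and 'groundtruth', the use of nested lists in place of tensors, integer class labels in place of the ground-truth tensor, and the helper names top1_accuracy and run_evaluators.

```python
def top1_accuracy(scores, labels):
    """Fraction of rows whose highest-scoring class equals the label."""
    correct = 0
    for row, label in zip(scores, labels):
        predicted = max(range(len(row)), key=row.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


def run_evaluators(results, evaluators):
    """Collect one float per (name, function) evaluator into a dict."""
    eval_result = {}
    for name, metric_fn in evaluators:
        eval_result[name] = float(
            metric_fn(results['prediction'], results['groundtruth']))
    return eval_result


results = {
    # N x C scores: 4 samples, 2 classes.
    'prediction': [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7], [0.6, 0.4]],
    # Integer class labels, a simplification for this sketch.
    'groundtruth': [1, 0, 0, 0],
}
eval_result = run_evaluators(results, [('top1', top1_accuracy)])
```

In this toy input three of the four argmax predictions match their labels, so the resulting dict maps 'top1' to 0.75.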
- class easycv.datasets.shared.dali_tfrecord_imagenet.DaliImageNetTFRecordDataSet(data_source, pipeline, distributed, batch_size, label_offset=0, random_shuffle=True, workers_per_gpu=2)
Bases:
object
easycv.datasets.shared.dali_tfrecord_multi_view module
- class easycv.datasets.shared.dali_tfrecord_multi_view.DaliLoaderWrapper(dali_loader, batch_size, return_list)
Bases:
object
- class easycv.datasets.shared.dali_tfrecord_multi_view.DaliTFRecordMultiViewDataset(data_source, num_views, pipelines, distributed, batch_size, random_shuffle=True, workers_per_gpu=2)
Bases:
object
Adapted to DALI, this dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or multiple pipelines.
- Parameters
num_views (list) – The number of different views.
pipelines (list[list[dict]]) – A list of pipelines.
easycv.datasets.shared.dataset_wrappers module
- class easycv.datasets.shared.dataset_wrappers.ConcatDataset(datasets)
Bases:
torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]
A wrapper of concatenated dataset.
Same as torch.utils.data.dataset.ConcatDataset, but also concatenates the group flag used for image aspect ratio.
- Parameters
datasets (list[Dataset]) – A list of datasets.
- datasets: List[torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]]
- cumulative_sizes: List[int]
- class easycv.datasets.shared.dataset_wrappers.RepeatDataset(dataset, times)
Bases:
object
A wrapper of repeated dataset.
The length of the repeated dataset is the original length multiplied by times. This is useful when data loading is slow but the dataset is small: using RepeatDataset can reduce the data loading time between epochs.
- Parameters
dataset (Dataset) – The dataset to be repeated.
times (int) – Number of times to repeat the dataset.
easycv.datasets.shared.multi_view module
- class easycv.datasets.shared.multi_view.MultiViewDataset(data_source, num_views, pipelines)
Bases:
Generic[torch.utils.data.dataset.T_co]
The dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or multiple pipelines.
- Parameters
num_views (list) – The number of different views.
pipelines (list[list[dict]]) – A list of pipelines.
easycv.datasets.shared.odps_reader module
- class easycv.datasets.shared.odps_reader.OdpsReader(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])
Bases:
object
- __init__(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])
Initialize the ODPS reader and the data source used to load data from an ODPS table.
- Parameters
table_name (str) – ODPS table to load.
selected_cols (list(str)) – Columns to select.
excluded_cols (list(str)) – Columns to exclude.
random_start (bool) – Whether to use a random start position for the ODPS table.
odps_io_config (dict) – ODPS config containing access_id, access_key and endpoint.
image_col (list(str)) – Image column names.
image_type (list(str)) – Image column types; url and base64 are supported. Must have the same length as image_col, or length 0.
- Returns
None