EasyCV DOCUMENTATION¶
EasyCV is an all-in-one computer vision toolbox based on PyTorch, mainly focusing on self-supervised learning, image classification, metric learning, object detection, and so on.
Prepare Datasets¶
EasyCV provides various datasets for multiple tasks. Please refer to the following guide for data preparation and keep the same data structure.
Image Classification¶
Cifar10¶
CIFAR-10 is a labeled subset of the 80 million tiny images dataset. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
It consists of 60000 32x32 colour images in 10 classes, with 6000 images per class.
There are 50000 training images and 10000 test images.
Here is the list of classes in the CIFAR-10: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.
For more detailed information, please refer to CIFAR.
Download¶
Download data from cifar-10-python.tar.gz (163MB) and uncompress the files to data/cifar10.
Directory structure is as follows:
data/cifar10
└── cifar-10-batches-py
├── batches.meta
├── data_batch_1
├── data_batch_2
├── data_batch_3
├── data_batch_4
├── data_batch_5
├── readme.html
├── read.py
└── test_batch
Cifar100¶
CIFAR-100 is a labeled subset of the 80 million tiny images dataset. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.
This dataset is just like the CIFAR-10, except it has 100 classes containing 600 images each.
There are 500 training images and 100 testing images per class.
The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a “fine” label (the class to which it belongs) and a “coarse” label (the superclass to which it belongs).
For more detailed information, please refer to CIFAR.
Download¶
Download data from cifar-100-python.tar.gz (161MB) and uncompress the files to data/cifar100.
Directory structure should be as follows:
data/cifar100
└── cifar-100-python
├── file.txt~
├── meta
├── test
├── train
Imagenet-1k¶
ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns).
It is used in the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) and is a benchmark for image classification.
For more detailed information, please refer to ImageNet.
Download¶
ILSVRC2012 is widely used; download it as follows:
Go to the download URL, register an account, and log in.
ILSVRC2012 is recommended; download the following files:
Training images (Task 1 & 2). 138GB.
Validation images (all tasks). 6.3GB.
Unzip the downloaded file.
Use this script to get the data meta files.
Directory structure should be as follows:
data/imagenet
└── train
└── n01440764
└── n01443537
└── ...
└── val
└── n01440764
└── n01443537
└── ...
└── meta
├── train.txt
├── val.txt
├── ...
Imagenet-1k-TFrecords¶
The original ImageNet raw images packed in TFRecord format.
For more detailed information about Imagenet dataset, please refer to ImageNet.
Download¶
Go to the download URL, register an account, and log in.
The dataset is divided into two parts, part0 (79GB) and part1 (75GB); you need to download both.
Directory structure should be as follows, put the image file and the idx file in the same folder:
data/imagenet
└── train
├── train-00000-of-01024
├── train-00000-of-01024.idx
├── train-00001-of-01024
├── train-00001-of-01024.idx
├── ...
└── validation
├── validation-00000-of-00128
├── validation-00000-of-00128.idx
├── validation-00001-of-00128
├── validation-00001-of-00128.idx
├── ...
Object Detection¶
PAI-iTAG detection¶
PAI-iTAG is a platform for intelligent data annotation, which supports the annotation of various data types such as images, texts, videos, and audios, as well as multi-modal mixed annotation.
Please refer to 智能标注iTAG for file format and data annotation.
COCO2017¶
The COCO dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.
The COCO dataset has been updated for several editions, and coco2017 is widely used. In 2017, the training/validation split was 118K/5K, and the test set is a subset of 41K images from the 2015 test set.
For more detailed information, please refer to COCO.
Download¶
Download train2017.zip (18G), val2017.zip (1G), and annotations_trainval2017.zip (241MB), and uncompress the files to data/coco2017.
Directory structure is as follows:
data/coco2017
└── annotations
├── instances_train2017.json
├── instances_val2017.json
└── train2017
├── 000000000009.jpg
├── 000000000025.jpg
├── ...
└── val2017
├── 000000000139.jpg
├── 000000000285.jpg
├── ...
VOC2007¶
PASCAL VOC 2007 is a dataset for image recognition. The twenty object classes that have been selected are:
Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
Each image in this dataset has pixel-level segmentation annotations, bounding box annotations, and object class annotations.
For more detailed information, please refer to voc2007.
Download¶
Download VOCtrainval_06-Nov-2007.tar (439MB) and uncompress the files to data/VOCdevkit.
Directory structure is as follows:
data/VOCdevkit
└── VOC2007
└── Annotations
├── 000005.xml
├── 001010.xml
├── ...
└── JPEGImages
├── 000005.jpg
├── 001010.jpg
├── ...
└── SegmentationClass
├── 000005.png
├── 001010.png
├── ...
└── SegmentationObject
├── 000005.png
├── 001010.png
├── ...
└── ImageSets
└── Layout
├── train.txt
├── trainval.txt
├── val.txt
└── Main
├── train.txt
├── val.txt
├── ...
└── Segmentation
├── train.txt
├── trainval.txt
├── val.txt
VOC2012¶
The PASCAL VOC 2012 dataset contains 20 object categories including:
Person: person
Animal: bird, cat, cow, dog, horse, sheep
Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train
Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor
Each image in this dataset has pixel-level segmentation annotations, bounding box annotations, and object class annotations.
For more detailed information, please refer to voc2012.
Download¶
Download VOCtrainval_11-May-2012.tar (2G) and uncompress the files to data/VOCdevkit.
Directory structure is as follows:
data/VOCdevkit
└── VOC2012
└── Annotations
├── 000005.xml
├── 001010.xml
├── ...
└── JPEGImages
├── 000005.jpg
├── 001010.jpg
├── ...
└── SegmentationClass
├── 000005.png
├── 001010.png
├── ...
└── SegmentationObject
├── 000005.png
├── 001010.png
├── ...
└── ImageSets
└── Layout
├── train.txt
├── trainval.txt
├── val.txt
└── Main
├── train.txt
├── val.txt
├── ...
└── Segmentation
├── train.txt
├── trainval.txt
├── val.txt
Self-Supervised Learning¶
Imagenet-1k¶
Refer to Image Classification: Imagenet-1k.
Imagenet-1k-TFrecords¶
Refer to Image Classification: Imagenet-1k-TFrecords.
Pose¶
COCO2017¶
The COCO dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.
The COCO dataset has been updated for several editions, and coco2017 is widely used. In 2017, the training/validation split was 118K/5K, and the test set is a subset of 41K images from the 2015 test set.
For more detailed information, please refer to COCO.
Download¶
Download it as follows:
Download data: train2017.zip (18G) , val2017.zip (1G)
Download annotations: annotations_trainval2017.zip (241MB)
Download person detection results: HRNet-Human-Pose-Estimation provides person detection result of COCO val2017 to reproduce our multi-person pose estimation results. Please download from OneDrive or GoogleDrive (26.2MB).
Then uncompress the files to data/coco2017; the directory structure is as follows:
data/coco2017
└── annotations
├── person_keypoints_train2017.json
├── person_keypoints_val2017.json
└── person_detection_results
├── COCO_val2017_detections_AP_H_56_person.json
├── COCO_test-dev2017_detections_AP_H_609_person.json
└── train2017
├── 000000000009.jpg
├── 000000000025.jpg
├── ...
└── val2017
├── 000000000139.jpg
├── 000000000285.jpg
├── ...
Image Segmentation¶
COCO Stuff 164k¶
For COCO Stuff 164k dataset, please run the following commands to download and convert the augmented dataset.
# download
mkdir coco_stuff164k && cd coco_stuff164k
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip
# unzip
unzip train2017.zip -d images/
unzip val2017.zip -d images/
unzip stuffthingmaps_trainval2017.zip -d annotations/
# --nproc means 8 process for conversion, which could be omitted as well.
python tools/prepare_data/coco_stuff164k.py /path/to/coco_stuff164k --nproc 8
By convention, mask labels in /path/to/coco_stuff164k/annotations/*2017/*_labelTrainIds.png are used for COCO Stuff 164k training and testing.
The details of this dataset can be found here.
Object Detection 3D¶
NuScenes¶
Download nuScenes V1.0 full dataset data and CAN bus expansion data HERE. Prepare nuscenes data by running:
python tools/prepare_data/prepare_nuscenes.py \
--root_path=./data/nuscenes \
--canbus_root_path=./data/canbus \
--out_dir=./data/nuscenes \
--version=v1.0
It will generate nuscenes_infos_temporal_{train,val}.pkl files.
The data structure is as follows:
data/nuscenes
├── can_bus
├── nuscenes-v1.0
│ ├── maps
│ ├── samples
│ ├── sweeps
│ ├── v1.0-test
│ ├── v1.0-trainval
│ ├── nuscenes_infos_temporal_train.pkl
│ ├── nuscenes_infos_temporal_val.pkl
Quick Start¶
Prerequisites¶
python >= 3.6
Pytorch >= 1.5
mmcv >= 1.2.0
nvidia-dali == 0.25.0
Installation¶
Prepare environment¶
Create a conda virtual environment and activate it.
conda create -n ev python=3.6 -y
conda activate ev
Install PyTorch and torchvision
The master branch works with PyTorch 1.5.1 or higher.
conda install pytorch==1.7.0 torchvision==0.8.0 -c pytorch
Install some python dependencies
Replace {cu_version} and {torch_version} with the versions used in your environment.
# install mmcv
pip install mmcv-full==1.4.4 -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
# for example, install mmcv-full for cuda10.1 and pytorch 1.7.0
pip install mmcv-full==1.4.4 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html
# install nvidia-dali
pip install http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/third_party/nvidia_dali_cuda100-0.25.0-1535750-py3-none-manylinux2014_x86_64.whl
# install common_io for MaxCompute table read (optional)
pip install https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/third_party/common_io-0.3.0-cp36-cp36m-linux_x86_64.whl
Install EasyCV
You can simply install easycv with the following command:
pip install pai-easycv
or clone the repository and then install it:
git clone https://github.com/Alibaba/EasyCV.git
cd easycv
pip install -r requirements.txt
pip install -v -e .  # or "python setup.py develop"
Install pai_nni and blade_compression
When you use model quantization and pruning, you need to install pai_nni and blade_compression with the following commands:
# install torch >= 1.8.0
pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0
# install mmcv >= 1.3.0 (torch version >= 1.8.0 does not support mmcv version < 1.3.0)
pip install mmcv-full==1.4.4 -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
# install onnx and pai_nni
pip install onnx
pip install https://pai-nni.oss-cn-zhangjiakou.aliyuncs.com/release/2.5/pai_nni-2.5-py3-none-manylinux1_x86_64.whl
# install blade_compression
pip install http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/third_party/blade_compression-0.0.1-py3-none-any.whl
If you want to use MSDeformAttn, you need to compile the CUDA operators:
cd easycv/thirdparty/deformable_attention/
python setup.py build install
# unit test (should see all checking is True)
python test.py
cd ../../..
Verification¶
Simple verification
from easycv.apis import *
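A slightly more complete check is sketched below; it only verifies that the package imports and reports its version (the version attribute is expected to exist, but treat this as an illustrative check):
import easycv
from easycv.apis import *

# If the imports above succeed, the installation is working.
print(easycv.__version__)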
You can also verify your installation using the following quick-start examples.
Self-supervised Learning Model Zoo¶
Pretrained models¶
MAE¶
Pretrained on ImageNet dataset.
Config | Backbone | Params (backbone/total) | Train memory (GB) | Inference time (V100) (ms/img) | Flops | Epochs | Download
---|---|---|---|---|---|---|---
mae_vit_base_patch16_8xb64_400e | ViT-B/16 | 85M/111M | 9.5 | 8.03 | 9.8G | 400 | model
mae_vit_base_patch16_8xb64_1600e | ViT-B/16 | 85M/111M | 9.5 | 8.03 | 9.8G | 1600 | model
mae_vit_large_patch16_8xb32_1600e | ViT-L/16 | 303M/329M | 11.3 | 16.30 | 20.8G | 1600 | model
Fast ConvMAE¶
Pretrained on ImageNet dataset.
Config | Backbone | Params (backbone/total) | Train memory (GB) | Flops | Inference time (V100) (ms/img) | Total train time | Epochs | Download
---|---|---|---|---|---|---|---|---
fast_convmae_vit_base_patch16_8xb64_50e | ConvViT-B/16 | 88M/115M | 30.3 | 45.1G | 6.88 | 20h (8*A100) | 50 | model - log
The FLOPs of Fast ConvMAE are about four times those of MAE, because the mask of MAE only retains 25% of the tokens in each forward pass, while the mask of Fast ConvMAE adopts a complementary strategy, dividing the mask into four complementary parts with 25% of the tokens each. This is equivalent to learning from four samples in each forward pass, achieving four times the learning effect.
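The complementary masking idea can be illustrated with a small sketch (illustrative only, not the EasyCV implementation): shuffle the token indices once and split them into four disjoint 25% groups, so the four forward passes together cover every token exactly once.
import numpy as np

num_tokens = 196                       # e.g. 14x14 patches for a 224x224 image with 16x16 patches
perm = np.random.permutation(num_tokens)
parts = np.split(perm, 4)              # four complementary index sets, 25% of the tokens each

masks = []
for idx in parts:
    visible = np.zeros(num_tokens, dtype=bool)
    visible[idx] = True                # tokens kept (visible) in this forward pass
    masks.append(visible)

# Sanity check: the visible sets are disjoint and together cover all tokens.
assert sum(int(m.sum()) for m in masks) == num_tokens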
DINO¶
Pretrained on ImageNet dataset.
Config | Backbone | Params (backbone/total) | Train memory (GB) | Inference time (V100) (ms/img) | Epochs | Download
---|---|---|---|---|---|---
dino_deit_small_p16_8xb32_100e | DeiT-S/16 | 21M/88M | 10.5 | 6.17 | 100 | model
MoBY¶
Pretrained on ImageNet dataset.
Config | Backbone | Params (backbone/total) | Flops | Train memory (GB) | Inference time (V100) (ms/img) | Epochs | Download
---|---|---|---|---|---|---|---
moby_deit_small_p16_4xb128_300e | DeiT-S/16 | 21M/26M | 18.6G | 21.4 | 6.17 | 300 | model - log
moby_swin_tiny_8xb64_300e | Swin-T | 27M/33M | 18.1G | 16.1 | 9.74 | 300 | model - log
MoCo V2¶
Pretrained on ImageNet dataset.
Config | Backbone | Params (backbone/total) | Flops | Train memory (GB) | Inference time (V100) (ms/img) | Epochs | Download
---|---|---|---|---|---|---|---
mocov2_resnet50_8xb32_200e | ResNet50 | 23M/28M | 8.2G | 5.4 | 8.59 | 200 | model
SwAV¶
Pretrained on ImageNet dataset.
Config | Backbone | Params (backbone/total) | Flops | Train memory (GB) | Inference time (V100) (ms/img) | Epochs | Download
---|---|---|---|---|---|---|---
swav_resnet50_8xb32_200e | ResNet50 | 23M/28M | 12.9G | 11.3 | 8.59 | 200 | model - log
Benchmarks¶
For detailed usage of benchmark tools, please refer to benchmark README.md.
ImageNet Linear Evaluation¶
Algorithm | Linear Eval Config | Pretrained Config | Top-1 (%) | Download
---|---|---|---|---
SwAV | swav_resnet50_8xb2048_20e_feature | swav_resnet50_8xb32_200e | 73.618 | log
DINO | dino_deit_small_p16_8xb2048_20e_feature | dino_deit_small_p16_8xb32_100e | 71.248 | log
MoBY | moby_deit_small_p16_8xb2048_30e_feature | moby_deit_small_p16_4xb128_300e | 72.214 | log
MoCo-v2 | mocov2_resnet50_8xb2048_40e_feature | mocov2_resnet50_8xb32_200e | 66.8 | log
ImageNet Finetuning¶
Algorithm | Finetune Config | Pretrained Config | Top-1 (%) | Download
---|---|---|---|---
MAE | mae_vit_base_patch16_8xb64_100e_lrdecay075_fintune | mae_vit_base_patch16_8xb64_400e | 83.13 | finetune model - log
MAE | mae_vit_base_patch16_8xb64_100e_lrdecay065_fintune | mae_vit_base_patch16_8xb64_1600e | 83.55 | finetune model - log
MAE | mae_vit_large_patch16_8xb16_50e_lrdecay075_fintune | mae_vit_large_patch16_8xb32_1600e | 85.70 | finetune model - log
Fast ConvMAE | fast_convmae_vit_base_patch16_8xb64_100e_fintune | fast_convmae_vit_base_patch16_8xb64_50e | 84.37 | finetune model - log
COCO2017 Object Detection¶
Algorithm | Eval Config | Pretrained Config | mAP (Box) | mAP (Mask) | Download
---|---|---|---|---|---
Fast ConvMAE | mask_rcnn_conv_vitdet_50e_coco | fast_convmae_vit_base_patch16_8xb64_50e | 51.3 | 45.6 | eval model
SwAV | mask_rcnn_r50_fpn_1x_coco | swav_resnet50_8xb32_200e | 40.38 | 36.48 | eval model - log
MoCo-v2 | mask_rcnn_r50_fpn_1x_coco | mocov2_resnet50_8xb32_200e | 39.9 | 35.8 | eval model - log
MoBY | mask_rcnn_swin_tiny_1x_coco | moby_swin_tiny_8xb64_300e | 43.11 | 39.37 | eval model - log
VOC2012 Aug Semantic Segmentation¶
Algorithm | Eval Config | Pretrained Config | mIOU | Download
---|---|---|---|---
SwAV | fcn_r50-d8_512x512_60e_voc12aug | swav_resnet50_8xb32_200e | 63.91 | eval model - log
MoCo-v2 | fcn_r50-d8_512x512_60e_voc12aug | mocov2_resnet50_8xb32_200e | 68.49 | eval model - log
Detection Model Zoo¶
Inference uses a V100 16G GPU by default.
YOLOX-PAI¶
Pretrained on the COCO2017 dataset. (The speed results are optimized with PAI-Blade and include only the model inference time. To learn about end2end inference time, you can refer to export.md.)
Algorithm | Config | Params | Speed (V100, fp16, b32) | mAP val 0.5:0.95 | AP val 50 | Download
---|---|---|---|---|---|---
YOLOX-s | yolox_s_8xb16_300e_coco | 9M | 0.68ms | 40.0 | 58.9 | model - log
PAI-YOLOXs | yoloxs_pai_8xb16_300e_coco | 16M | 0.71ms | 41.4 | 60.0 | model - log
PAI-YOLOXs-ASFF | yoloxs_pai_asff_8xb16_300e_coco | 21M | 0.87ms | 42.8 | 61.8 | model - log
PAI-YOLOXs-ASFF-TOOD3 | yoloxs_pai_asff_tood3_8xb16_300e_coco | 24M | 1.15ms | 43.9 | 62.1 | model - log
YOLOX-m | yolox_m_8xb16_300e_coco | 25M | 1.52ms | 46.3 | 64.9 | model - log
YOLOX-l | yolox_l_8xb8_300e_coco | 54M | 2.47ms | 48.9 | 67.5 | model - log
YOLOX-x | yolox_x_8xb8_300e_coco | 99M | 4.74ms | 50.9 | 69.2 | model - log
YOLOX-tiny | yolox_tiny_8xb16_300e_coco | 5M | 0.28ms | 31.5 | 49.2 | model - log
YOLOX-nano | yolox_nano_8xb16_300e_coco | 2.2M | 0.19ms | 26.5 | 42.6 | model - log
ViTDet¶
Algorithm | Config | Params (backbone/total) | Train memory (GB) | Inference time (V100) (ms/img) | bbox_mAP val 0.5:0.95 | mask_mAP val 0.5:0.95 | Download
---|---|---|---|---|---|---|---
ViTDet_MaskRCNN | vitdet_maskrcnn | 86M/111M | 13.3 (fp16) | 138ms | 50.65 | 45.41 | model - log
FCOS¶
Algorithm | Config | Params (backbone/total) | Train memory (GB) | Inference time (V100) (ms/img) | mAP val 0.5:0.95 | AP val 50 | Download
---|---|---|---|---|---|---|---
FCOS-r50(caffe) | fcos-r50 | 23M/32M | 5.0 | 85.8ms | 38.58 | 57.18 | model - log
FCOS-r50(torch) | fcos-r50 | 23M/32M | 4.0 (fp16) | 105.3ms | 38.88 | 58.01 | model - log
DETR¶
Algorithm | Config | Params (backbone/total) | Train memory (GB) | Inference time (V100) (ms/img) | bbox_mAP val 0.5:0.95 | AP val 50 | Download
---|---|---|---|---|---|---|---
DETR-r50 | detr-r50 | 23M/41M | 8.5 | 48.5ms | 39.92 | 60.52 | model - log
DAB-DETR-r50 | dab-detr-r50 | 23M/43M | 2.6 | 58.5ms | 42.52 | 63.03 | model - log
DN-DETR-r50 | dab-detr-r50 | 23M/43M | 7.8 | 58.5ms | 44.39 | 64.66 | model - log
DINO¶
Algorithm | Config | Params (backbone/total) | Inference time (V100) (ms/img) | bbox_mAP val 0.5:0.95 | AP val 50 | Download | Comment
---|---|---|---|---|---|---|---
DINO_4sc_r50_12e | DINO_4sc_r50_12e | 23M/47M | 184ms | 48.71 | 66.27 | model - log | Inference uses V100 32G
DINO_4sc_r50_36e | DINO_4sc_r50_36e | 23M/47M | 184ms | 50.69 | 68.60 | model - log | Inference uses V100 32G
DINO_4sc_swinl_12e | DINO_4sc_swinl_12e | 195M/217M | 155ms | 56.86 | 75.61 | model - log | Inference uses V100 32G
DINO_4sc_swinl_36e | DINO_4sc_swinl_36e | 195M/217M | 155ms | 58.04 | 76.76 | model - log | Inference uses V100 32G
DINO_5sc_swinl_36e | DINO_5sc_swinl_36e | 195M/217M | 235ms | 58.47 | 77.10 | model - log | Inference uses V100 32G
DINO++_5sc_swinl_18e | DINO++_5sc_swinl_18e | 195M/218M | 325ms | 63.39 | 80.25 | model - log | Inference uses A100 80G
Develop¶
1. Code Style¶
We adopt PEP8 as the preferred code style.
We use the following tools for linting and formatting: flake8, yapf, and isort.
Style configurations of yapf and isort can be found in setup.cfg.
We use a pre-commit hook that checks and formats for flake8, yapf, seed-isort-config, isort, and trailing whitespaces, fixes end-of-files, and sorts requirements.txt automatically on every commit.
The config for a pre-commit hook is stored in .pre-commit-config.
After you clone the repository, you will need to install and initialize the pre-commit hook.
pip install -r requirements/tests.txt
From the repository folder
pre-commit install
After this, code linters and formatters will be enforced on every commit.
If you want to use pre-commit to check all the files, you can run
pre-commit run --all-files
If you only want to format and lint your code, you can run
sh scripts/linter.sh
2. Test¶
2.1 Unit test¶
bash scripts/ci_test.sh
2.2 Test data storage¶
As we need a lot of data for testing, including images and models, we use git lfs to store those large files.
Install git-lfs (version >= 2.5.0).
for mac
brew install git-lfs
git lfs install
for centos, please download rpm from git-lfs github release website
wget http://101374-public.oss-cn-hangzhou-zmf.aliyuncs.com/git-lfs-3.2.0-1.el7.x86_64.rpm
sudo rpm -ivh git-lfs-3.2.0-1.el7.x86_64.rpm
git lfs install
for ubuntu
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install
Track your data type using git lfs; for example, to track png files:
git lfs track "*.png"
Add your test files to the data/test/ folder; you can create directories if needed.
git add data/test/test.png
commit your test data to remote branch
git commit -m "xxx"
To pull data from the remote repo, do it the same way you pull regular git files.
git pull origin branch_name
3. Build pip package¶
python setup.py sdist bdist_wheel
self-supervised learning tutorial¶
Data Preparation¶
To download the dataset, please refer to prepare_data.md.
Self-supervised learning supports imagenet (raw and tfrecord) format data.
Imagenet format¶
You can download ImageNet data or use your own unlabeled image data. You should provide a directory which contains images for self-supervised training and a filelist which contains image paths relative to the root directory. For example, the image directory is as follows:
images/
├── 0001.jpg
├── 0002.jpg
├── 0003.jpg
|...
└── 9999.jpg
The content of the filelist is:
0001.jpg
0002.jpg
0003.jpg
...
9999.jpg
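If you need to generate such a filelist from an image directory, a minimal sketch using only the standard library (the paths images/ and filelist.txt are placeholders):
import os

image_root = 'images'  # the image directory shown above
# Write one image file name per line, relative to the root directory.
with open('filelist.txt', 'w') as f:
    for name in sorted(os.listdir(image_root)):
        if name.lower().endswith(('.jpg', '.jpeg', '.png')):
            f.write(name + '\n')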
Local & PAI-DSW¶
We use configs/selfsup/mocov2/mocov2_rn50_8xb32_200e_jpg.py as an example config, in which two config variables should be modified:
data_train_list = 'filelist.txt'
data_train_root = 'images'
Training¶
Single gpu:
python tools/train.py \
${CONFIG_PATH} \
--work_dir ${WORK_DIR}
Multi gpus:
bash tools/dist_train.sh \
${NUM_GPUS} \
${CONFIG_PATH} \
--work_dir ${WORK_DIR}
Arguments
- NUM_GPUS: number of gpus
- CONFIG_PATH: the config file path of a selfsup method
- WORK_DIR: your path to save models and logs
Examples:
Edit the data_root path in ${CONFIG_PATH} to your own data path.
GPUS=8
bash tools/dist_train.sh configs/selfsup/mocov2/mocov2_rn50_8xb32_200e_jpg.py $GPUS
Export model¶
python tools/export.py \
${CONFIG_PATH} \
${CHECKPOINT} \
${EXPORT_PATH}
Arguments
- CONFIG_PATH: the config file path of a selfsup method
- CHECKPOINT: your checkpoint file of a selfsup method, named as epoch_*.pth
- EXPORT_PATH: your path to save the exported model
Examples:
python tools/export.py configs/selfsup/mocov2/mocov2_rn50_8xb32_200e_jpg.py \
work_dirs/selfsup/mocov2/epoch_200.pth \
work_dirs/selfsup/mocov2/epoch_200_export.pth
Feature extract¶
Download test_image
import cv2
from easycv.predictors.feature_extractor import TorchFeatureExtractor
output_ckpt = 'work_dirs/selfsup/mocov2/epoch_200_export.pth'
fe = TorchFeatureExtractor(output_ckpt)
img = cv2.imread('248347732153_1040.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
feature = fe.predict([img])
print(feature[0]['feature'].shape)
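The extracted features can then be compared across images, e.g. with cosine similarity. This continues the snippet above and assumes the 'feature' entry behaves like a 1-D array (as suggested by the .shape call); in practice the second feature would come from a different image:
import numpy as np

def cosine_similarity(a, b):
    # Normalize both vectors and take the dot product.
    a = a / (np.linalg.norm(a) + 1e-12)
    b = b / (np.linalg.norm(b) + 1e-12)
    return float(np.dot(a, b))

feat1 = np.asarray(feature[0]['feature']).reshape(-1)
# The same image is reused here only to keep the sketch short.
feat2 = np.asarray(fe.predict([img])[0]['feature']).reshape(-1)
print(cosine_similarity(feat1, feat2))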
YOLOX-PAI Tutorial¶
Introduction¶
Welcome to YOLOX-PAI! YOLOX-PAI is an incremental work of YOLOX based on PAI-EasyCV. We use various existing detection methods and PAI-Blade to boost the performance. We also provide an efficient way for end2end object detection.
In brief, our main contributions are:
Investigate various detection methods upon YOLOX to achieve SOTA object detection results.
Provide an easy way to use PAI-Blade to accelerate the inference process.
Provide a convenient way to train/evaluate/export YOLOX-PAI model and conduct end2end object detection.
To learn more details of YOLOX-PAI, you can refer to our technical report or arxiv paper.
Data preparation¶
To download the dataset, please refer to prepare_data.md.
YOLOX supports both COCO format and PAI-iTAG detection format.
COCO format¶
To use coco data to train detection, you can refer to configs/detection/yolox/yolox_s_8xb16_300e_coco.py for more configuration details.
PAI-Itag detection format¶
To use pai-itag detection format data to train detection, you can refer to configs/detection/yolox/yolox_s_8xb16_300e_coco_pai.py for more configuration details.
Quick Start¶
To use COCO format data, use config file configs/detection/yolox/yolox_s_8xb16_300e_coco.py
To use PAI-Itag format data, use config file configs/detection/yolox/yolox_s_8xb16_300e_coco_pai.py
You can use the quick_start.md for local installation or use our provided docker images (for both training and inference).
Pull Docker¶
sudo docker pull registry.cn-shanghai.aliyuncs.com/pai-ai-test/pai-easycv:yolox-pai
Start Container¶
sudo nvidia-docker run -it -v path:path --name easycv_yolox_pai --shm-size=10g --network=host registry.cn-shanghai.aliyuncs.com/pai-ai-test/pai-easycv:yolox-pai
Train¶
Single gpu:
python tools/train.py \
${CONFIG_PATH} \
--work_dir ${WORK_DIR}
Multi gpus:
bash tools/dist_train.sh \
${NUM_GPUS} \
${CONFIG_PATH} \
--work_dir ${WORK_DIR}
Arguments
- NUM_GPUS: number of gpus
- CONFIG_PATH: the config file path of a detection method
- WORK_DIR: your path to save models and logs
Examples:
Edit the data_root path in ${CONFIG_PATH} to your own data path.
GPUS=8
bash tools/dist_train.sh configs/detection/yolox/yolox_s_8xb16_300e_coco.py $GPUS
Evaluation¶
The pretrained model of YOLOX-PAI can be found here.
Single gpu:
python tools/eval.py \
${CONFIG_PATH} \
${CHECKPOINT} \
--eval
Multi gpus:
bash tools/dist_test.sh \
${CONFIG_PATH} \
${NUM_GPUS} \
${CHECKPOINT} \
--eval
Arguments
- CONFIG_PATH: the config file path of a detection method
- NUM_GPUS: number of gpus
- CHECKPOINT: the checkpoint file named as epoch_*.pth
Examples:
GPUS=8
bash tools/dist_test.sh configs/detection/yolox/yolox_s_8xb16_300e_coco.py $GPUS work_dirs/detection/yolox/epoch_300.pth --eval
Export model¶
python tools/export.py \
${CONFIG_PATH} \
${CHECKPOINT} \
${EXPORT_PATH}
For more details of the export process, you can refer to export.md.
Arguments
- CONFIG_PATH: the config file path of a detection method
- CHECKPOINT: your checkpoint file of a detection method, named as epoch_*.pth
- EXPORT_PATH: your path to save the exported model
Examples:
python tools/export.py configs/detection/yolox/yolox_s_8xb16_300e_coco.py \
work_dirs/detection/yolox/epoch_300.pth \
work_dirs/detection/yolox/epoch_300_export.pth
Inference¶
Download the exported models (preprocess, model, meta) or export your own model. Put them in the following format:
export_blade/
epoch_300_pre_notrt.pt.blade
epoch_300_pre_notrt.pt.blade.config.json
epoch_300_pre_notrt.pt.preprocess
Download test_image
import cv2
from easycv.predictors import TorchYoloXPredictor
output_ckpt = 'export_blade/epoch_300_pre_notrt.pt.blade'
detector = TorchYoloXPredictor(output_ckpt,use_trt_efficientnms=False)
img = cv2.imread('000000017627.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
output = detector.predict([img])
print(output)
# visualize image
image = img.copy()
for box, cls_name in zip(output[0]['detection_boxes'], output[0]['detection_class_names']):
# box is [x1,y1,x2,y2]
box = [int(b) for b in box]
image = cv2.rectangle(image, tuple(box[:2]), tuple(box[2:4]), (0,255,0), 2)
cv2.putText(image, cls_name, (box[0], box[1]-5), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0,0,255), 2)
cv2.imwrite('result.jpg',image)
image classification tutorial¶
Data Preparation¶
To download the dataset, please refer to prepare_data.md.
Image classification supports cifar and imagenet (raw and tfrecord) format data.
Cifar¶
To use Cifar data to train classification, you can refer to configs/classification/cifar10/swintiny_b64_5e_jpg.py for more configuration details.
Imagenet format¶
You can also use your self-defined data which follows the imagenet format. You should provide a root directory which contains images for classification training and a filelist which contains image paths relative to the root directory. For example, the image root directory is as follows:
images/
├── 0001.jpg
├── 0002.jpg
├── 0003.jpg
|...
└── 9999.jpg
Each line of the filelist consists of two parts: the subpath to the image file starting from the image root directory, and the class label string for the corresponding image, separated by a space:
0001.jpg label1
0002.jpg label2
0003.jpg label3
...
9999.jpg label9999
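If your labels live in a separate mapping (for example a Python dict from file name to label string), a minimal sketch for writing such a filelist; the file names and labels below are placeholders:
```python
# Hypothetical label mapping; replace it with however you store your labels.
labels = {'0001.jpg': 'label1', '0002.jpg': 'label2', '0003.jpg': 'label3'}

# Write one line per image: "<subpath> <label>", separated by a space.
with open('filelist.txt', 'w') as f:
    for name, label in sorted(labels.items()):
        f.write(f'{name} {label}\n')
```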
To use Imagenet format data to train classification, you can refer to configs/classification/imagenet/imagenet_rn50_jpg.py for more configuration details.
Local & PAI-DSW¶
Training¶
Single gpu:
python tools/train.py \
${CONFIG_PATH} \
--work_dir ${WORK_DIR}
Multi gpus:
bash tools/dist_train.sh \
${NUM_GPUS} \
${CONFIG_PATH} \
--work_dir ${WORK_DIR}
Arguments
- NUM_GPUS: number of gpus
- CONFIG_PATH: the config file path of an image classification method
- WORK_DIR: your path to save models and logs
Examples:
Edit the data_root path in ${CONFIG_PATH} to your own data path.
single gpu training:
```shell
python tools/train.py configs/classification/cifar10/swintiny_b64_5e_jpg.py --work_dir work_dirs/classification/cifar10/swintiny --fp16
```
multi gpu training
```shell
GPUS=8
bash tools/dist_train.sh configs/classification/cifar10/swintiny_b64_5e_jpg.py $GPUS --fp16
```
training using python api
```python
import easycv.tools
import os
# config_path can be a local file or http url
config_path = 'configs/classification/cifar10/swintiny_b64_5e_jpg.py'
easycv.tools.train(config_path, gpus=8, fp16=False, master_port=29527)
```
Evaluation¶
Single gpu:
python tools/eval.py \
${CONFIG_PATH} \
${CHECKPOINT} \
--eval
Multi gpus:
bash tools/dist_test.sh \
${CONFIG_PATH} \
${NUM_GPUS} \
${CHECKPOINT} \
--eval
Arguments
- CONFIG_PATH: the config file path of an image classification method
- NUM_GPUS: number of gpus
- CHECKPOINT: the checkpoint file named as epoch_*.pth
Examples:
single gpu evaluation
```shell
python tools/eval.py configs/classification/cifar10/swintiny_b64_5e_jpg.py work_dirs/classification/cifar10/swintiny/epoch_350.pth --eval --fp16
```
multi-gpu evaluation
```shell
GPUS=8
bash tools/dist_test.sh configs/classification/cifar10/swintiny_b64_5e_jpg.py $GPUS work_dirs/classification/cifar10/swintiny/epoch_350.pth --eval --fp16
```
evaluation using python api
```python
import easycv.tools
import os
os.environ['CUDA_VISIBLE_DEVICES']='3,4,5,6'
config_path = 'configs/classification/cifar10/swintiny_b64_5e_jpg.py'
checkpoint_path = 'work_dirs/classification/cifar10/swintiny/epoch_350.pth'
easycv.tools.eval(config_path, checkpoint_path, gpus=8)
```
Export model for inference¶
If SyncBN is configured, we should replace it with BN in the config file:
```python
# imagenet_rn50.py
model = dict(
...
backbone=dict(
...
norm_cfg=dict(type='BN')), # SyncBN --> BN
...)
```
```shell
python tools/export.py configs/classification/cifar10/swintiny_b64_5e_jpg.py \
work_dirs/classification/cifar10/swintiny/epoch_350.pth \
work_dirs/classification/cifar10/swintiny/epoch_350_export.pth
```
or using python api
```python
import easycv.tools
config_path = './imagenet_rn50.py'
checkpoint_path = 'oss://pai-vision-data-hz/pretrained_models/easycv/resnet/resnet50.pth'
export_path = './resnet50_export.pt'
easycv.tools.export(config_path, checkpoint_path, export_path)
```
Inference¶
Download [test_image](http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/cifar10/qince_data/predict/aeroplane_s_000004.png)
```python
import cv2
from easycv.predictors.classifier import TorchClassifier
output_ckpt = 'work_dirs/classification/cifar10/swintiny/epoch_350_export.pth'
tcls = TorchClassifier(output_ckpt)
img = cv2.imread('aeroplane_s_000004.png')
# input image should be RGB order
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
output = tcls.predict([img])
print(output)
```
file tutorial¶
The file module of easycv supports operations on both local and oss files. For an oss introduction, please refer to: https://www.aliyun.com/product/oss .
If you operate oss files, you need to refer to access_oss to authorize oss first.
Support operations¶
access_oss¶
Authorize oss.
Method1:
from easycv.file import io
io.access_oss(
ak_id='your_accesskey_id',
ak_secret='your_accesskey_secret',
hosts='your endpoint' or ['your endpoint1', 'your endpoint2'],
buckets='your bucket' or ['your bucket1', 'your bucket2'])
Method2:
Add the oss config to your local file ~/.ossutilconfig, as follows:
For more oss config information, please refer to: https://help.aliyun.com/document_detail/120072.html
[Credentials]
language = CH
endpoint = your endpoint
accessKeyID = your_accesskey_id
accessKeySecret = your_accesskey_secret
[Bucket-Endpoint]
bucket1 = endpoint1
bucket2 = endpoint2
If you want to modify the path of the default oss config file (~/.ossutilconfig), you can do as follows:
$ export OSS_CONFIG_FILE='your oss config file path'
Then run the following command; the config file will be read by default to authorize oss.
from easycv.file import io
io.access_oss()
Method3:
Set the following environment variables; EasyCV will automatically parse them for authorization:
import os
os.environ['OSS_ACCESS_KEY_ID'] = 'your_accesskey_id'
os.environ['OSS_ACCESS_KEY_SECRET'] = 'your_accesskey_secret'
os.environ['OSS_ENDPOINTS'] = 'your endpoint1,your endpoint2' # split with ","
os.environ['OSS_BUCKETS'] = 'your bucket1,your bucket2' # split with ","
open¶
Supports w, wb, a, r, rb modes on oss paths. For local paths, the usage is the same as the python built-in open.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
# Write something to an oss file.
with io.open('oss://bucket_name/demo.txt', 'w') as f:
f.write("test")
# Read from an oss file.
with io.open('oss://bucket_name/demo.txt', 'r') as f:
print(f.read())
Example for local:
from easycv.file import io
# Write something to a local file.
with io.open('/your/local/path/demo.txt', 'w') as f:
f.write("test")
# Read from a local file.
with io.open('/your/local/path/demo.txt', 'r') as f:
print(f.read())
exists¶
Whether the file exists; same usage as os.path.exists. Supports local and oss paths.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
ret = io.exists('oss://bucket_name/dir')
print(ret)
Example for local:
from easycv.file import io
ret = io.exists('/your/local/dir')
print(ret)
move¶
Move src to dst; same usage as shutil.move. Supports local and oss paths.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
# move oss file to local
io.move('oss://bucket_name/file.txt', '/your/local/path/file.txt')
# move oss file to oss
io.move('oss://bucket_name/dir1/file.txt', 'oss://bucket_name/dir2/file.txt')
# move local file to oss
io.move('/your/local/file.txt', 'oss://bucket_name/file.txt')
# move directory
io.move('oss://bucket_name/dir1/', 'oss://bucket_name/dir2/')
Example for local:
from easycv.file import io
# move local file to local
io.move('/your/local/path1/file.txt', '/your/local/path2/file.txt')
# move local dir to local
io.move('/your/local/dir1', '/your/local/dir2')
copy¶
Copy a file from src to dst. Same usage as shutil.copyfile. If you want to copy a directory, please refer to [copytree](#copytree).
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
# Copy a file from local to oss:
io.copy('/your/local/file.txt', 'oss://bucket/dir/file.txt')
# Copy a oss file to local:
io.copy('oss://bucket/dir/file.txt', '/your/local/file.txt')
# Copy a file from oss to oss:
io.copy('oss://bucket/dir/file.txt', 'oss://bucket/dir/file2.txt')
Example for local:
from easycv.file import io
# Copy a file from local to local:
io.copy('/your/local/path1/file.txt', '/your/local/path2/file.txt')
copytree¶
Copy files recursively from src to dst. Same usage as shutil.copytree. If you want to copy a file, please use [copy](#copy).
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
# copy files from local to oss
io.copytree(src='/your/local/dir1', dst='oss://bucket_name/dir2')
# copy files from oss to local
io.copytree(src='oss://bucket_name/dir2', dst='/your/local/dir1')
# copy files from oss to oss
io.copytree(src='oss://bucket_name/dir1', dst='oss://bucket_name/dir2')
Example for local:
from easycv.file import io
# copy files from local to local
io.copytree(src='/your/local/dir1', dst='/your/local/dir2')
listdir¶
List all objects in path. Same usage as os.listdir.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
ret = io.listdir('oss://bucket/dir', recursive=True)
print(ret)
Example for local:
from easycv.file import io
ret = io.listdir('/your/local/dir', recursive=True)
print(ret)
remove¶
Remove a file or a directory recursively. Same usage as os.remove or shutil.rmtree.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
# Remove an oss file
io.remove('oss://bucket_name/file.txt')
# Remove an oss directory
io.remove('oss://bucket_name/dir/')
Example for local:
from easycv.file import io
# Remove a local file
io.remove('/your/local/path/file.txt')
# Remove a local directory
io.remove('/your/local/dir/')
rmtree¶
Remove directory recursively; same usage as shutil.rmtree.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
io.remove('oss://bucket_name/dir_name/')
Example for local:
from easycv.file import io
io.remove('/your/local/dir/')
makedirs¶
Create directories recursively; same usage as os.makedirs.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
io.makedirs('oss://bucket/new_dir/')
Example for local:
from easycv.file import io
io.makedirs('/your/local/new_dir/')
isdir¶
Return whether a path is a directory; same usage as os.path.isdir.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')  # only needed for oss paths; refer to `io.access_oss`
ret = io.isdir('oss://bucket/dir/')
print(ret)
Example for local:
from easycv.file import io
ret = io.isdir('your/local/dir/')
print(ret)
isfile¶
Return whether a path is a file; same usage as os.path.isfile.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
ret = io.isfile('oss://bucket/file.txt')
print(ret)
Example for local:
from easycv.file import io
ret = io.isfile('/your/local/path/file.txt')
print(ret)
glob¶
Return a list of paths matching a pathname pattern.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
ret = io.glob('oss://bucket/dir/*.txt')
print(ret)
Example for local:
from easycv.file import io
ret = io.glob('/your/local/dir/*.txt')
print(ret)
size¶
Get the size of a file; same usage as os.path.getsize.
Example for oss:
For io.access_oss, please refer to [access_oss](#access_oss).
from easycv.file import io
io.access_oss('your oss config')
size = io.size('oss://bucket/file.txt')
print(size)
Example for local:
from easycv.file import io
size = io.size('/your/local/path/file.txt')
print(size)
v 0.11.0 (09/05/2023)¶
Highlights¶
Support EasyCV as a plug-in for [modelscope](https://github.com/modelscope/modelscope).
v 0.10.0 (06/03/2023)¶
Highlights¶
Support STDC, STGCN, ReID and Multi-len MOT.
Support multiple processes for predictor data preprocessing. For models that spend more time on data preprocessing, the speedup can reach more than 50%.
New Features¶
Improvements¶
Speed up inference for face detector when using mtcnn. (#273)
Add mobilenet config for itag and imagenet dataset, and optimize ClsSourceImageList api to support string label. (#276) (#283)
Support multi-rows replacement for first order parameter. (#282)
Add a tool to convert itag dataset to raw dataset. (#290)
Add PoseTopDownPredictor to replace TorchPoseTopDownPredictorWithDetector. (#296)
v 0.8.0 (5/12/2022)¶
Highlights¶
New Features¶
Improvements¶
Unify the parsing method of config scripts, and support both local and pai platform products (#235)
Add more data source apis for open source datasets, involving classification, detection, segmentation and keypoints tasks. And part of the data source apis support automatic download. For more information, please refer to data_hub (#206 #229)
Add confusion matrix metric for Classification models (#241)
Add prediction script (#239)
v 0.7.0 (3/11/2022)¶
Highlights¶
New Features¶
Support semantic mask2former (#199)
Support face 2d keypoint detection (#191)
Support hand keypoints detection (#191)
Support wholebody keypoint detection (#207)
Support auto hyperparameter optimization of NNI (#211)
Add DeiT III (#171)
Add semantic segmentation model SegFormer (#191)
Add 3d detection model BEVFormer (#203)
Improvements¶
v 0.2.2 (07/04/2022)¶
initial commit & first release
SOTA SSL Algorithms
EasyCV provides state-of-the-art algorithms in self-supervised learning based on contrastive learning, such as SimCLR, MoCo V2, SwAV, and DINO, as well as MAE based on masked image modeling. We also provide standard benchmark tools for ssl model evaluation.
Vision Transformers
EasyCV aims to provide plenty of vision transformer models trained either with supervised learning or self-supervised learning, such as ViT, Swin Transformer, and XCiT. More models will be added in the future.
Functionality & Extensibility
In addition to SSL, EasyCV also supports image classification, object detection, and metric learning, and more areas will be supported in the future. Although covering different areas, EasyCV decomposes the framework into different components such as dataset, model, and running hook, making it easy to add new components and combine them with existing modules. EasyCV provides a simple and comprehensive interface for inference. Additionally, all models are supported on PAI-EAS, where they can be easily deployed as online services with automatic scaling and service monitoring.
Efficiency
EasyCV supports multi-gpu and multi-worker training. EasyCV uses DALI to accelerate data io and preprocessing, and uses fp16 to accelerate the training process. For inference optimization, EasyCV exports models using jit script, which can be optimized by PAI-Blade.
easycv.apis package¶
Submodules¶
easycv.apis.export module¶
- easycv.apis.export.export(cfg, ckpt_path, filename, model=None, **kwargs)[source]¶
export model for inference
- Parameters
cfg – Config object
ckpt_path (str) – path to checkpoint file
filename (str) – filename to save exported models
model (nn.module) – model instance
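A hedged usage sketch of this function: it assumes the config can be loaded with mmcv.Config.fromfile (the docstring only says "Config object"); the checkpoint and output paths are placeholders, and the config name is one referenced elsewhere in this documentation.
from mmcv import Config
from easycv.apis.export import export

# Placeholder paths; adjust to your own config and checkpoint.
cfg = Config.fromfile('configs/classification/imagenet/imagenet_rn50_jpg.py')
export(cfg, ckpt_path='work_dirs/epoch_100.pth', filename='work_dirs/epoch_100_export.pth')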
- class easycv.apis.export.PreProcess(target_size: Tuple[int, int] = (640, 640), keep_ratio: bool = True)[source]¶
Bases:
object
Process the data input to model.
- Parameters
target_size (Tuple[int, int]) – output spatial size.
keep_ratio (bool) – Whether to keep the aspect ratio when resizing the image.
- class easycv.apis.export.ModelExportWrapper(model, example_inputs, trace_model: bool = True)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(model, example_inputs, trace_model: bool = True) → None[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(image)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.apis.export.ProcessExportWrapper(example_inputs, process_fn: Optional[Callable] = None)[source]¶
Bases:
torch.nn.modules.module.Module
Split out the preprocess so that it can be wrapped as a preprocess jit model; the preprocess procedure cannot be optimized in an end2end blade model due to dynamic shape problems.
- __init__(example_inputs, process_fn: Optional[Callable] = None) → None[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(image)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.apis.test module¶
- easycv.apis.test.single_cpu_test(model, data_loader, mode='test', show=False, out_dir=None, show_score_thr=0.3, **kwargs)[source]¶
- easycv.apis.test.single_gpu_test(model, data_loader, mode='test', use_fp16=False, **kwargs)[source]¶
Test model with a single gpu.
This method tests the model with a single gpu.
- Parameters
model (str) – Model to be tested.
data_loader (nn.Dataloader) – Pytorch data loader.
mode – mode for model to forward
use_fp16 – Use fp16 inference
- Returns
The prediction results.
- Return type
list
- easycv.apis.test.multi_gpu_test(model, data_loader, mode='test', tmpdir=None, gpu_collect=False, use_fp16=False, **kwargs)[source]¶
Test model with multiple gpus.
This method tests model with multiple gpus and collects the results under two different modes: gpu and cpu modes. By setting ‘gpu_collect=True’ it encodes results to gpu tensors and use gpu communication for results collection. On cpu mode it saves the results on different gpus to ‘tmpdir’ and collects them by the rank 0 worker.
- Parameters
model (str) – Model to be tested.
data_loader (nn.Dataloader) – Pytorch data loader.
mode – mode for model to forward
tmpdir (str) – Path of directory to save the temporary results from different gpus under cpu mode.
gpu_collect (bool) – Option to use either gpu or cpu to collect results.
use_fp16 – Use fp16 inference
- Returns
The prediction results.
- Return type
list
easycv.apis.train module¶
- easycv.apis.train.init_random_seed(seed=None, device='cuda')[source]¶
Initialize random seed. If the seed is not set, the seed will be automatically randomized and then broadcast to all processes to prevent some potential bugs.
- Parameters
seed (int, Optional) – The seed. Default to None.
device (str) – The device where the seed will be put on. Default to 'cuda'.
- Returns
Seed to be used.
- Return type
int
- easycv.apis.train.set_random_seed(seed, deterministic=False)[source]¶
Set random seed.
- Parameters
seed (int) – Seed to be used.
deterministic (bool) – Whether to set the deterministic option for CUDNN backend, i.e., set torch.backends.cudnn.deterministic to True and torch.backends.cudnn.benchmark to False. Default: False.
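A brief usage sketch combining the two helpers above; signatures are taken from this page, and the distributed setup itself is omitted:
from easycv.apis.train import init_random_seed, set_random_seed

seed = init_random_seed(None)               # pick a seed and broadcast it to all processes
set_random_seed(seed, deterministic=False)  # seed the RNGs; see the parameters above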
- easycv.apis.train.train_model(model, data_loaders, cfg, distributed=False, timestamp=None, meta=None, use_fp16=False, validate=True, gpu_collect=True)[source]¶
Training API.
- Parameters
model (
nn.Module
) – user defined modeldata_loaders – a list of dataloader for training data
cfg – config object
distributed – distributed training or not
timestamp – time str formated as ‘%Y%m%d_%H%M%S’
meta – a dict containing meta data info, such as env_info, seed, iter, epoch
use_fp16 – use fp16 training or not
validate – do evaluation while training
gpu_collect – use gpu collect or cpu collect for tensor gathering
- easycv.apis.train.build_optimizer(model, optimizer_cfg)[source]¶
Build optimizer from configs.
- Parameters
model (
nn.Module
) – The model with parameters to be optimized.optimizer_cfg (dict) –
The config dict of the optimizer.
- Positional fields are:
type: class name of the optimizer.
lr: base learning rate.
- Optional fields are:
any arguments of the corresponding optimizer type, e.g., weight_decay, momentum, etc.
paramwise_options: a dict with regular expression as keys to match parameter names and a dict containing options as values. Options include 6 fields: lr, lr_mult, momentum, momentum_mult, weight_decay, weight_decay_mult.
- Returns
The initialized optimizer.
- Return type
torch.optim.Optimizer
Example
>>> model = torch.nn.modules.Conv1d(1, 1, 1)
>>> paramwise_options = {
>>>     '(bn|gn)(\d+)?.(weight|bias)': dict(weight_decay_mult=0.1),
>>>     '\Ahead.': dict(lr_mult=10, momentum=0)}
>>> optimizer_cfg = dict(type='SGD', lr=0.01, momentum=0.9,
>>>                      weight_decay=0.0001,
>>>                      paramwise_options=paramwise_options)
>>> optimizer = build_optimizer(model, optimizer_cfg)
easycv.datasets package¶
Subpackages¶
easycv.datasets.classification package¶
- class easycv.datasets.classification.ClsDataset(data_source, pipeline)[source]¶
Bases:
Generic
[torch.utils.data.dataset.T_co
]Dataset for classification
- Parameters
data_source – data source to parse input data
pipeline – transforms list
- __init__(data_source, pipeline)[source]¶
Initialize self. See help(type(self)) for accurate signature.
- evaluate(results, evaluators, logger=None, topk=(1, 5))[source]¶
evaluate classification task
- Parameters
results – a dict of list of tensor, including prediction and groundtruth info, where prediction tensor is NxC,and the same with groundtruth labels.
evaluators – a list of evaluator
- Returns
a dict of float, different metric values
- Return type
eval_result
- visualize(results, vis_num=10, **kwargs)[source]¶
Visualize the model output on validation data. :param results: A dictionary containing
class: List of length number of test images. img_metas: List of length number of test images,
dict of image meta info, containing filename, img_shape, origin_img_shape and so on.
- Parameters
vis_num – number of images visualized
- Returns: A dictionary containing
images: Visualized images, list of np.ndarray. img_metas: List of length number of test images,
dict of image meta info, containing filename, img_shape, origin_img_shape and so on.
- class easycv.datasets.classification.ClsOdpsDataset(data_source, pipeline, image_key='url_image', label_key='label', **kwargs)[source]¶
Bases:
Generic
[torch.utils.data.dataset.T_co
]Dataset for rotation prediction
Subpackages¶
easycv.datasets.classification.data_sources package¶
- class easycv.datasets.classification.data_sources.ClsSourceCifar10(root, split, download=True)[source]¶
Bases:
object
- CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']¶
- class easycv.datasets.classification.data_sources.ClsSourceCifar100(root, split, download=True)[source]¶
Bases:
object
- CLASSES = None¶
- class easycv.datasets.classification.data_sources.ClsSourceImageListByClass(root, list_file, m_per_class=2, delimeter=' ', split_huge_listfile_byrank=False, cache_path='data/', max_try=20)[source]¶
Bases:
object
Get the same m_per_class samples by the label idx.
- Parameters
list_file – str / list(str), str means a input image list file path, this file contains records as image_path label in list_file list(str) means multi image list, each one contains some records as image_path label
root – str / list(str), root path for image_path, each list_file will need a root.
m_per_class – num of samples for each class.
delimeter – str, delimeter of each line in the list_file
split_huge_listfile_byrank – Adapt to the situation that the memory cannot fully load a huge amount of data list. If split, data list will be split to each rank.
cache_path – if split_huge_listfile_byrank is true, cache list_file will be saved to cache_path.
max_try – int, max try numbers of reading image
- class easycv.datasets.classification.data_sources.ClsSourceImageList(list_file, root='', delimeter=' ', split_huge_listfile_byrank=False, split_label_balance=False, cache_path='data/', class_list=None)[source]¶
Bases:
object
data source for classification :param list_file: str / list(str), str means a input image list file path,
this file contains records as image_path label in list_file list(str) means multi image list, each one contains some records as image_path label
- Parameters
root – str / list(str), root path for image_path, each list_file will need a root, if len(root) < len(list_file), we will use root[-1] to fill root list.
delimeter – str, delimeter of each line in the list_file
split_huge_listfile_byrank – Adapt to the situation that the memory cannot fully load a huge amount of data list. If split, data list will be split to each rank.
split_label_balance – if split_huge_listfile_byrank is true, whether split with label balance
cache_path – if split_huge_listfile_byrank is true, cache list_file will be saved to cache_path.
- class easycv.datasets.classification.data_sources.ClsSourceItag(list_file, root='', class_list=None)[source]¶
Bases:
easycv.datasets.classification.data_sources.image_list.ClsSourceImageList
data source itag for classification :param list_file: str / list(str), str means a input image list file path,
this file contains records as image_path label in list_file list(str) means multi image list, each one contains some records as image_path label
- class easycv.datasets.classification.data_sources.ClsSourceImageNetTFRecord(list_file='', root='', file_pattern=None, cache_path='data/cache/', max_try=10)[source]¶
Bases:
object
data source for imagenet tfrecord.
- class easycv.datasets.classification.data_sources.ClsSourceCUB(*args, ann_file, image_class_labels_file, train_test_split_file, test_mode, data_prefix, **kwargs)[source]¶
Bases:
object
The CUB-200-2011 Dataset. Support the CUB-200-2011 Dataset. Comparing with the CUB-200 Dataset, there are much more pictures in CUB-200-2011. :param ann_file: the annotation file.
images.txt in CUB.
- Parameters
image_class_labels_file (str) – the label file. image_class_labels.txt in CUB.
train_test_split_file (str) – the split file. train_test_split_file.txt in CUB.
- CLASSES = ['Black_footed_Albatross', 'Laysan_Albatross', 'Sooty_Albatross', 'Groove_billed_Ani', 'Crested_Auklet', 'Least_Auklet', 'Parakeet_Auklet', 'Rhinoceros_Auklet', 'Brewer_Blackbird', 'Red_winged_Blackbird', 'Rusty_Blackbird', 'Yellow_headed_Blackbird', 'Bobolink', 'Indigo_Bunting', 'Lazuli_Bunting', 'Painted_Bunting', 'Cardinal', 'Spotted_Catbird', 'Gray_Catbird', 'Yellow_breasted_Chat', 'Eastern_Towhee', 'Chuck_will_Widow', 'Brandt_Cormorant', 'Red_faced_Cormorant', 'Pelagic_Cormorant', 'Bronzed_Cowbird', 'Shiny_Cowbird', 'Brown_Creeper', 'American_Crow', 'Fish_Crow', 'Black_billed_Cuckoo', 'Mangrove_Cuckoo', 'Yellow_billed_Cuckoo', 'Gray_crowned_Rosy_Finch', 'Purple_Finch', 'Northern_Flicker', 'Acadian_Flycatcher', 'Great_Crested_Flycatcher', 'Least_Flycatcher', 'Olive_sided_Flycatcher', 'Scissor_tailed_Flycatcher', 'Vermilion_Flycatcher', 'Yellow_bellied_Flycatcher', 'Frigatebird', 'Northern_Fulmar', 'Gadwall', 'American_Goldfinch', 'European_Goldfinch', 'Boat_tailed_Grackle', 'Eared_Grebe', 'Horned_Grebe', 'Pied_billed_Grebe', 'Western_Grebe', 'Blue_Grosbeak', 'Evening_Grosbeak', 'Pine_Grosbeak', 'Rose_breasted_Grosbeak', 'Pigeon_Guillemot', 'California_Gull', 'Glaucous_winged_Gull', 'Heermann_Gull', 'Herring_Gull', 'Ivory_Gull', 'Ring_billed_Gull', 'Slaty_backed_Gull', 'Western_Gull', 'Anna_Hummingbird', 'Ruby_throated_Hummingbird', 'Rufous_Hummingbird', 'Green_Violetear', 'Long_tailed_Jaeger', 'Pomarine_Jaeger', 'Blue_Jay', 'Florida_Jay', 'Green_Jay', 'Dark_eyed_Junco', 'Tropical_Kingbird', 'Gray_Kingbird', 'Belted_Kingfisher', 'Green_Kingfisher', 'Pied_Kingfisher', 'Ringed_Kingfisher', 'White_breasted_Kingfisher', 'Red_legged_Kittiwake', 'Horned_Lark', 'Pacific_Loon', 'Mallard', 'Western_Meadowlark', 'Hooded_Merganser', 'Red_breasted_Merganser', 'Mockingbird', 'Nighthawk', 'Clark_Nutcracker', 'White_breasted_Nuthatch', 'Baltimore_Oriole', 'Hooded_Oriole', 'Orchard_Oriole', 'Scott_Oriole', 'Ovenbird', 'Brown_Pelican', 'White_Pelican', 'Western_Wood_Pewee', 'Sayornis', 'American_Pipit', 'Whip_poor_Will', 'Horned_Puffin', 'Common_Raven', 'White_necked_Raven', 'American_Redstart', 'Geococcyx', 'Loggerhead_Shrike', 'Great_Grey_Shrike', 'Baird_Sparrow', 'Black_throated_Sparrow', 'Brewer_Sparrow', 'Chipping_Sparrow', 'Clay_colored_Sparrow', 'House_Sparrow', 'Field_Sparrow', 'Fox_Sparrow', 'Grasshopper_Sparrow', 'Harris_Sparrow', 'Henslow_Sparrow', 'Le_Conte_Sparrow', 'Lincoln_Sparrow', 'Nelson_Sharp_tailed_Sparrow', 'Savannah_Sparrow', 'Seaside_Sparrow', 'Song_Sparrow', 'Tree_Sparrow', 'Vesper_Sparrow', 'White_crowned_Sparrow', 'White_throated_Sparrow', 'Cape_Glossy_Starling', 'Bank_Swallow', 'Barn_Swallow', 'Cliff_Swallow', 'Tree_Swallow', 'Scarlet_Tanager', 'Summer_Tanager', 'Artic_Tern', 'Black_Tern', 'Caspian_Tern', 'Common_Tern', 'Elegant_Tern', 'Forsters_Tern', 'Least_Tern', 'Green_tailed_Towhee', 'Brown_Thrasher', 'Sage_Thrasher', 'Black_capped_Vireo', 'Blue_headed_Vireo', 'Philadelphia_Vireo', 'Red_eyed_Vireo', 'Warbling_Vireo', 'White_eyed_Vireo', 'Yellow_throated_Vireo', 'Bay_breasted_Warbler', 'Black_and_white_Warbler', 'Black_throated_Blue_Warbler', 'Blue_winged_Warbler', 'Canada_Warbler', 'Cape_May_Warbler', 'Cerulean_Warbler', 'Chestnut_sided_Warbler', 'Golden_winged_Warbler', 'Hooded_Warbler', 'Kentucky_Warbler', 'Magnolia_Warbler', 'Mourning_Warbler', 'Myrtle_Warbler', 'Nashville_Warbler', 'Orange_crowned_Warbler', 'Palm_Warbler', 'Pine_Warbler', 'Prairie_Warbler', 'Prothonotary_Warbler', 'Swainson_Warbler', 'Tennessee_Warbler', 'Wilson_Warbler', 
'Worm_eating_Warbler', 'Yellow_Warbler', 'Northern_Waterthrush', 'Louisiana_Waterthrush', 'Bohemian_Waxwing', 'Cedar_Waxwing', 'American_Three_toed_Woodpecker', 'Pileated_Woodpecker', 'Red_bellied_Woodpecker', 'Red_cockaded_Woodpecker', 'Red_headed_Woodpecker', 'Downy_Woodpecker', 'Bewick_Wren', 'Cactus_Wren', 'Carolina_Wren', 'House_Wren', 'Marsh_Wren', 'Rock_Wren', 'Winter_Wren', 'Common_Yellowthroat']¶
- class easycv.datasets.classification.data_sources.ClsSourceImageNet1k(root, split)[source]¶
Bases:
object
- class easycv.datasets.classification.data_sources.ClsSourceCaltech101(root, download=True)[source]¶
Bases:
object
- class easycv.datasets.classification.data_sources.ClsSourceCaltech256(root, download=True)[source]¶
Bases:
object
- class easycv.datasets.classification.data_sources.ClsSourceFlowers102(root, split, download=False)[source]¶
Bases:
object
- class easycv.datasets.classification.data_sources.ClsSourceMnist(root, split, download=True)[source]¶
Bases:
object
- class easycv.datasets.classification.data_sources.ClsSourceFashionMnist(root, split, download=True)[source]¶
Bases:
object
- class easycv.datasets.classification.data_sources.class_list.ClsSourceImageListByClass(root, list_file, m_per_class=2, delimeter=' ', split_huge_listfile_byrank=False, cache_path='data/', max_try=20)[source]¶
Bases:
object
Sample m_per_class images for each class label.
- Parameters
list_file – str / list(str). A str is the path of an image list file whose lines are image_path label records; a list(str) gives multiple such list files, each with the same record format.
root – str / list(str), root path for image_path, each list_file will need a root.
m_per_class – number of samples drawn for each class.
delimeter – str, the delimiter used to separate fields on each line of the list_file.
split_huge_listfile_byrank – use when the data list is too large to fit fully in memory; if enabled, the data list is split across ranks.
cache_path – if split_huge_listfile_byrank is True, the per-rank cache of list_file is saved to cache_path.
max_try – int, maximum number of attempts to read an image.
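A minimal usage sketch (file paths and label values are hypothetical; the config-dict style mirrors other EasyCV data sources). Each line of list_file is expected to look like "image_path label", for example:

images/0001.jpg 0
images/0002.jpg 0
images/0003.jpg 1

data_source = dict(
    type='ClsSourceImageListByClass',
    root='data/train/',            # prefix joined to each image_path
    list_file='data/train_list.txt',
    m_per_class=4,                 # sample 4 images per class label
    delimeter=' ',                 # note the parameter spelling used by this API
)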
- class easycv.datasets.classification.data_sources.image_list.ClsSourceImageList(list_file, root='', delimeter=' ', split_huge_listfile_byrank=False, split_label_balance=False, cache_path='data/', class_list=None)[source]¶
Bases:
object
Data source for classification.
- Parameters
list_file – str / list(str). A str is the path of an image list file whose lines are image_path label records; a list(str) gives multiple such list files, each with the same record format.
root – str / list(str), root path for image_path; each list_file needs a root. If len(root) < len(list_file), root[-1] is reused to fill the root list.
delimeter – str, the delimiter used to separate fields on each line of the list_file.
split_huge_listfile_byrank – use when the data list is too large to fit fully in memory; if enabled, the data list is split across ranks.
split_label_balance – if split_huge_listfile_byrank is True, whether to split with label balance.
cache_path – if split_huge_listfile_byrank is True, the per-rank cache of list_file is saved to cache_path.
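A minimal sketch of the plain image-list source (hypothetical paths). With two list files and a single root, root[-1] is reused for the second list, as described above:

data_source = dict(
    type='ClsSourceImageList',
    list_file=['data/meta/train_part0.txt', 'data/meta/train_part1.txt'],
    root=['data/train/'],          # root[-1] fills the missing entry
    delimeter=' ',
)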
- class easycv.datasets.classification.data_sources.image_list.ClsSourceItag(list_file, root='', class_list=None)[source]¶
Bases:
easycv.datasets.classification.data_sources.image_list.ClsSourceImageList
iTAG data source for classification. list_file is interpreted as in ClsSourceImageList: a str is the path of an image list file whose lines are image_path label records; a list(str) gives multiple such list files.
easycv.datasets.classification.pipelines package¶
- class easycv.datasets.classification.pipelines.MMAutoAugment(policies=[[{'type': 'Posterize', 'bits': 4, 'prob': 0.4}, {'type': 'Rotate', 'angle': 30.0, 'prob': 0.6}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 5, 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}], [{'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 6, 'prob': 0.8}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'Rotate', 'angle': 10.0, 'prob': 0.2}, {'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 30.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.0}, {'type': 'Equalize', 'prob': 0.8}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8}, {'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8}], [{'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4}, {'type': 'Invert', 'prob': 0.6}], [{'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal'}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}]], hparams={'pad_val': 128})[source]¶
Bases:
object
Auto augmentation. This data augmentation is proposed in AutoAugment: Learning Augmentation Policies from Data.
- Parameters
policies – The policies of auto augmentation. Each policy in policies is a specific augmentation policy composed of several augmentation dicts. When AutoAugment is called, a random policy from policies is selected to augment images.
hparams (dict) – Configs of hyperparameters. Hyperparameters will be used in policies that require these arguments if these arguments are not set in policy dicts. Defaults to use _HPARAMS_DEFAULT.
- __init__(policies=[[{'type': 'Posterize', 'bits': 4, 'prob': 0.4}, {'type': 'Rotate', 'angle': 30.0, 'prob': 0.6}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 5, 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}], [{'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 6, 'prob': 0.8}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'Rotate', 'angle': 10.0, 'prob': 0.2}, {'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 30.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.0}, {'type': 'Equalize', 'prob': 0.8}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8}, {'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8}], [{'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4}, {'type': 'Invert', 'prob': 0.6}], [{'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal'}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}]], hparams={'pad_val': 128})[source]¶
Initialize self. See help(type(self)) for accurate signature.
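A minimal sketch of placing MMAutoAugment in a classification pipeline, assuming the class is registered in the transforms registry under its class name; the surrounding transform names are illustrative only. Omitting policies keeps the default policy set shown in the signature above:

train_pipeline = [
    dict(type='RandomResizedCrop', size=224),                # illustrative
    dict(type='MMAutoAugment', hparams=dict(pad_val=128)),   # one random policy per image
    dict(type='RandomHorizontalFlip'),                        # illustrative
]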
- class easycv.datasets.classification.pipelines.MMRandAugment(num_policies, magnitude_level, magnitude_std=0.0, total_level=30, policies=[{'type': 'AutoContrast'}, {'type': 'Equalize'}, {'type': 'Invert'}, {'type': 'Rotate', 'magnitude_key': 'angle', 'magnitude_range': (0, 30)}, {'type': 'Posterize', 'magnitude_key': 'bits', 'magnitude_range': (4, 0)}, {'type': 'Solarize', 'magnitude_key': 'thr', 'magnitude_range': (256, 0)}, {'type': 'SolarizeAdd', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 110)}, {'type': 'ColorTransform', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Contrast', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Brightness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Sharpness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'horizontal'}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'vertical'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'horizontal'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'vertical'}], hparams={'pad_val': 128})[source]¶
Bases:
object
Random augmentation. This data augmentation is proposed in RandAugment: Practical automated data augmentation with a reduced search space.
- Parameters
policies – The policies of random augmentation. Each policy in policies is one specific augmentation policy (dict). Every policy must at least have the key type, indicating the type of augmentation. For augmentations that take a magnitude (the argument is named differently across augmentations), magnitude_key and magnitude_range give the name of the magnitude argument (str) and the range of the magnitude (a tuple (val1, val2)), respectively. Note that val1 is not necessarily less than val2.
num_policies (int) – Number of policies to select from policies each time.
magnitude_level (int | float) – Magnitude level for all the augmentation selected.
total_level (int | float) – Total level for the magnitude. Defaults to 30.
magnitude_std (Number | str) – Deviation of the magnitude noise applied.
If a positive number, the magnitude is sampled from a normal distribution (mean=magnitude, std=magnitude_std).
If 0 or a negative number, the magnitude remains unchanged.
If the str “inf”, the magnitude is sampled from a uniform distribution (range=[min, magnitude]).
hparams (dict) – Configs of hyperparameters. Hyperparameters will be used in policies that require these arguments if these arguments are not set in policy dicts. Defaults to use _HPARAMS_DEFAULT.
Note
magnitude_std introduces some randomness into the policy, following https://github.com/rwightman/pytorch-image-models. When magnitude_std=0, the magnitude is calculated as:
\text{magnitude} = \frac{\text{magnitude\_level}}{\text{total\_level}} \times (\text{val2} - \text{val1}) + \text{val1}
- __init__(num_policies, magnitude_level, magnitude_std=0.0, total_level=30, policies=[{'type': 'AutoContrast'}, {'type': 'Equalize'}, {'type': 'Invert'}, {'type': 'Rotate', 'magnitude_key': 'angle', 'magnitude_range': (0, 30)}, {'type': 'Posterize', 'magnitude_key': 'bits', 'magnitude_range': (4, 0)}, {'type': 'Solarize', 'magnitude_key': 'thr', 'magnitude_range': (256, 0)}, {'type': 'SolarizeAdd', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 110)}, {'type': 'ColorTransform', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Contrast', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Brightness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Sharpness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'horizontal'}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'vertical'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'horizontal'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'vertical'}], hparams={'pad_val': 128})[source]¶
Initialize self. See help(type(self)) for accurate signature.
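A minimal sketch of an MMRandAugment entry, assuming the class is registered under its class name. As a worked instance of the formula above: with magnitude_level=9 and total_level=30, a policy whose magnitude_range is (0, 0.9) gets magnitude = 9 / 30 * (0.9 - 0) + 0 = 0.27.

train_pipeline = [
    dict(
        type='MMRandAugment',
        num_policies=2,        # apply 2 randomly chosen policies per image
        magnitude_level=9,     # shared magnitude level across policies
        magnitude_std=0.5,     # add Gaussian noise to the magnitude
        hparams=dict(pad_val=128),
    ),
]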
- class easycv.datasets.classification.pipelines.MMRandomErasing(erase_prob=0.5, min_area_ratio=0.02, max_area_ratio=0.4, aspect_range=(0.3, 3.3333333333333335), mode='const', fill_color=(128, 128, 128), fill_std=None)[source]¶
Bases:
object
Randomly selects a rectangular region in an image and erases its pixels.
- Parameters
erase_prob – Probability that the image will be randomly erased. Default: 0.5
min_area_ratio (float) – Minimum erased area / input image area Default: 0.02
max_area_ratio (float) – Maximum erased area / input image area Default: 0.4
aspect_range (sequence | float) – Aspect ratio range of the erased area. If a float, it is converted to (aspect_ratio, 1/aspect_ratio). Default: (3/10, 10/3)
mode (str) – Fill method for the erased area: const (default) assigns the same value to all pixels; rand assigns each pixel a random value in [0, 255].
fill_color (sequence | Number) – Base color filled in erased area. Defaults to (128, 128, 128).
fill_std (sequence | Number, optional) – If set and mode is ‘rand’, fill the erased area with random colors from a normal distribution (mean=fill_color, std=fill_std); if not set, fill the erased area with random colors from a uniform distribution over [0, 255]. Defaults to None.
Note
See Random Erasing Data Augmentation. The paper provides 4 modes: RE-R, RE-M, RE-0, RE-255, and uses RE-M by default. The configs of these 4 modes are: RE-R: RandomErasing(mode=’rand’); RE-M: RandomErasing(mode=’const’, fill_color=(123.67, 116.3, 103.5)); RE-0: RandomErasing(mode=’const’, fill_color=0); RE-255: RandomErasing(mode=’const’, fill_color=255).
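A minimal sketch of the RE-M variant described in the note above, written as a pipeline entry and assuming the class is registered under its class name:

train_pipeline = [
    dict(
        type='MMRandomErasing',
        erase_prob=0.5,
        mode='const',
        fill_color=(123.67, 116.3, 103.5),   # RE-M mean-color fill
    ),
]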
- easycv.datasets.classification.pipelines.auto_augment.random_negative(value, random_negative_prob)[source]¶
Randomly negate value based on random_negative_prob.
- easycv.datasets.classification.pipelines.auto_augment.merge_hparams(policy: dict, hparams: dict)[source]¶
Merge hyperparameters into a policy config. Only the hyperparameters required by the policy are merged.
- Parameters
policy (dict) – Original policy config dict.
hparams (dict) – Hyperparameters to be merged.
- Returns
Policy config dict after adding hparams.
- Return type
dict
- class easycv.datasets.classification.pipelines.auto_augment.MMAutoAugment(policies=[[{'type': 'Posterize', 'bits': 4, 'prob': 0.4}, {'type': 'Rotate', 'angle': 30.0, 'prob': 0.6}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 5, 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}], [{'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 6, 'prob': 0.8}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'Rotate', 'angle': 10.0, 'prob': 0.2}, {'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 30.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.0}, {'type': 'Equalize', 'prob': 0.8}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8}, {'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8}], [{'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4}, {'type': 'Invert', 'prob': 0.6}], [{'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal'}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}]], hparams={'pad_val': 128})[source]¶
Bases:
object
Auto augmentation. This data augmentation is proposed in AutoAugment: Learning Augmentation Policies from Data.
- Parameters
policies – The policies of auto augmentation. Each policy in policies is a specific augmentation policy composed of several augmentation dicts. When AutoAugment is called, a random policy from policies is selected to augment images.
hparams (dict) – Configs of hyperparameters. Hyperparameters will be used in policies that require these arguments if these arguments are not set in policy dicts. Defaults to use _HPARAMS_DEFAULT.
- __init__(policies=[[{'type': 'Posterize', 'bits': 4, 'prob': 0.4}, {'type': 'Rotate', 'angle': 30.0, 'prob': 0.6}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 5, 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}], [{'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 6, 'prob': 0.8}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'Rotate', 'angle': 10.0, 'prob': 0.2}, {'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 30.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.0}, {'type': 'Equalize', 'prob': 0.8}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8}, {'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8}], [{'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4}, {'type': 'Invert', 'prob': 0.6}], [{'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal'}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}]], hparams={'pad_val': 128})[source]¶
Initialize self. See help(type(self)) for accurate signature.
- class easycv.datasets.classification.pipelines.auto_augment.MMRandAugment(num_policies, magnitude_level, magnitude_std=0.0, total_level=30, policies=[{'type': 'AutoContrast'}, {'type': 'Equalize'}, {'type': 'Invert'}, {'type': 'Rotate', 'magnitude_key': 'angle', 'magnitude_range': (0, 30)}, {'type': 'Posterize', 'magnitude_key': 'bits', 'magnitude_range': (4, 0)}, {'type': 'Solarize', 'magnitude_key': 'thr', 'magnitude_range': (256, 0)}, {'type': 'SolarizeAdd', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 110)}, {'type': 'ColorTransform', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Contrast', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Brightness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Sharpness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'horizontal'}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'vertical'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'horizontal'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'vertical'}], hparams={'pad_val': 128})[source]¶
Bases:
object
Random augmentation. This data augmentation is proposed in RandAugment: Practical automated data augmentation with a reduced search space.
- Parameters
policies – The policies of random augmentation. Each policy in policies is one specific augmentation policy (dict). Every policy must at least have the key type, indicating the type of augmentation. For augmentations that take a magnitude (the argument is named differently across augmentations), magnitude_key and magnitude_range give the name of the magnitude argument (str) and the range of the magnitude (a tuple (val1, val2)), respectively. Note that val1 is not necessarily less than val2.
num_policies (int) – Number of policies to select from policies each time.
magnitude_level (int | float) – Magnitude level for all the augmentation selected.
total_level (int | float) – Total level for the magnitude. Defaults to 30.
magnitude_std (Number | str) – Deviation of the magnitude noise applied.
If a positive number, the magnitude is sampled from a normal distribution (mean=magnitude, std=magnitude_std).
If 0 or a negative number, the magnitude remains unchanged.
If the str “inf”, the magnitude is sampled from a uniform distribution (range=[min, magnitude]).
hparams (dict) – Configs of hyperparameters. Hyperparameters will be used in policies that require these arguments if these arguments are not set in policy dicts. Defaults to use _HPARAMS_DEFAULT.
Note
magnitude_std introduces some randomness into the policy, following https://github.com/rwightman/pytorch-image-models. When magnitude_std=0, the magnitude is calculated as:
\text{magnitude} = \frac{\text{magnitude\_level}}{\text{total\_level}} \times (\text{val2} - \text{val1}) + \text{val1}
- __init__(num_policies, magnitude_level, magnitude_std=0.0, total_level=30, policies=[{'type': 'AutoContrast'}, {'type': 'Equalize'}, {'type': 'Invert'}, {'type': 'Rotate', 'magnitude_key': 'angle', 'magnitude_range': (0, 30)}, {'type': 'Posterize', 'magnitude_key': 'bits', 'magnitude_range': (4, 0)}, {'type': 'Solarize', 'magnitude_key': 'thr', 'magnitude_range': (256, 0)}, {'type': 'SolarizeAdd', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 110)}, {'type': 'ColorTransform', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Contrast', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Brightness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Sharpness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'horizontal'}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'vertical'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'horizontal'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'vertical'}], hparams={'pad_val': 128})[source]¶
Initialize self. See help(type(self)) for accurate signature.
- class easycv.datasets.classification.pipelines.auto_augment.Shear(magnitude, pad_val=128, prob=0.5, direction='horizontal', random_negative_prob=0.5, interpolation='bicubic')[source]¶
Bases:
object
Shear images.
- Parameters
magnitude (int | float) – The magnitude used for shear.
pad_val – Pixel value for constant fill. If a sequence of length 3, it is used to pad the R, G, B channels respectively. Defaults to 128.
prob (float) – The probability of performing Shear, which should be in range [0, 1]. Defaults to 0.5.
direction (str) – The shearing direction. Options are ‘horizontal’ and ‘vertical’. Defaults to ‘horizontal’.
random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.
interpolation (str) – Interpolation method. Options are ‘nearest’, ‘bilinear’, ‘bicubic’, ‘area’, ‘lanczos’. Defaults to ‘bicubic’.
- class easycv.datasets.classification.pipelines.auto_augment.Translate(magnitude, pad_val=128, prob=0.5, direction='horizontal', random_negative_prob=0.5, interpolation='nearest')[source]¶
Bases:
object
Translate images.
- Parameters
magnitude – The magnitude used for translation. Note that the offset is calculated as magnitude * size in the corresponding direction; with a magnitude of 1, the whole image would be moved out of range.
pad_val (int, Sequence[int]) – Pixel pad_val value for constant fill. If a sequence of length 3, it is used to pad the R, G, B channels respectively. Defaults to 128.
prob (float) – The probability of performing translate, which should be in range [0, 1]. Defaults to 0.5.
direction (str) – The translating direction. Options are ‘horizontal’ and ‘vertical’. Defaults to ‘horizontal’.
random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.
interpolation (str) – Interpolation method. Options are ‘nearest’, ‘bilinear’, ‘bicubic’, ‘area’, ‘lanczos’. Defaults to ‘nearest’.
- class easycv.datasets.classification.pipelines.auto_augment.Rotate(angle, center=None, scale=1.0, pad_val=128, prob=0.5, random_negative_prob=0.5, interpolation='nearest')[source]¶
Bases:
object
Rotate images.
- Parameters
angle – The angle used for rotation. Positive values stand for clockwise rotation.
center (tuple[float], optional) – Center point (w, h) of the rotation in the source image. If None, the center of the image will be used. Defaults to None.
scale (float) – Isotropic scale factor. Defaults to 1.0.
pad_val (int, Sequence[int]) – Pixel pad_val value for constant fill. If a sequence of length 3, it is used to pad the R, G, B channels respectively. Defaults to 128.
prob (float) – The probability of performing Rotate, which should be in range [0, 1]. Defaults to 0.5.
random_negative_prob (float) – The probability that turns the angle negative, which should be in range [0,1]. Defaults to 0.5.
interpolation (str) – Interpolation method. Options are ‘nearest’, ‘bilinear’, ‘bicubic’, ‘area’, ‘lanczos’. Defaults to ‘nearest’.
- class easycv.datasets.classification.pipelines.auto_augment.AutoContrast(prob=0.5)[source]¶
Bases:
object
Automatically adjust image contrast.
- Parameters
prob (float) – The probability of performing AutoContrast, which should be in range [0, 1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.Invert(prob=0.5)[source]¶
Bases:
object
Invert images.
- Parameters
prob (float) – The probability of performing Invert, which should be in range [0, 1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.Equalize(prob=0.5)[source]¶
Bases:
object
Equalize the image histogram.
- Parameters
prob (float) – The probability of performing Equalize, which should be in range [0, 1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.Solarize(thr, prob=0.5)[source]¶
Bases:
object
Solarize images (invert all pixel values above a threshold).
- Parameters
thr – The threshold above which pixel values are inverted.
prob (float) – The probability of performing Solarize, which should be in range [0, 1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.SolarizeAdd(magnitude, thr=128, prob=0.5)[source]¶
Bases:
object
SolarizeAdd images (add a certain value to pixels below a threshold).
- Parameters
magnitude (int | float) – The value to be added to pixels below thr.
thr – The threshold below which pixel values are adjusted.
prob (float) – The probability of performing SolarizeAdd, which should be in range [0, 1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.Posterize(bits, prob=0.5)[source]¶
Bases:
object
Posterize images (reduce the number of bits for each color channel).
- Parameters
bits – Number of bits for each pixel in the output image, which should be less than or equal to 8.
prob (float) – The probability of performing Posterize, which should be in range [0, 1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.Contrast(magnitude, prob=0.5, random_negative_prob=0.5)[source]¶
Bases:
object
Adjust image contrast.
- Parameters
magnitude – The magnitude used for adjusting contrast. A positive magnitude enhances the contrast and a negative magnitude makes the image grayer. A magnitude of 0 returns the original image.
prob (float) – The probability of performing contrast adjustment, which should be in range [0, 1]. Defaults to 0.5.
random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.ColorTransform(magnitude, prob=0.5, random_negative_prob=0.5)[source]¶
Bases:
object
Adjust image color balance.
- Parameters
magnitude – The magnitude used for the color transform. A positive magnitude enhances the color and a negative magnitude makes the image grayer. A magnitude of 0 returns the original image.
prob (float) – The probability of performing ColorTransform, which should be in range [0, 1]. Defaults to 0.5.
random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.Brightness(magnitude, prob=0.5, random_negative_prob=0.5)[source]¶
Bases:
object
Adjust image brightness.
- Parameters
magnitude – The magnitude used for adjusting brightness. A positive magnitude enhances the brightness and a negative magnitude makes the image darker. A magnitude of 0 returns the original image.
prob (float) – The probability of performing brightness adjustment, which should be in range [0, 1]. Defaults to 0.5.
random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.Sharpness(magnitude, prob=0.5, random_negative_prob=0.5)[source]¶
Bases:
object
Adjust image sharpness.
- Parameters
magnitude – The magnitude used for adjusting sharpness. A positive magnitude enhances the sharpness and a negative magnitude blurs the image. A magnitude of 0 returns the original image.
prob (float) – The probability of performing sharpness adjustment, which should be in range [0, 1]. Defaults to 0.5.
random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.auto_augment.Cutout(shape, pad_val=128, prob=0.5)[source]¶
Bases:
object
Cutout images.
- Parameters
shape – Expected cutout shape (h, w). If given as a single value, the same value is used for both h and w.
pad_val (int, Sequence[int]) – Pixel pad_val value for constant fill. If it is a sequence, it must have the same length as the image channels. Defaults to 128.
prob (float) – The probability of performing Cutout, which should be in range [0, 1]. Defaults to 0.5.
- class easycv.datasets.classification.pipelines.transform.MMRandomErasing(erase_prob=0.5, min_area_ratio=0.02, max_area_ratio=0.4, aspect_range=(0.3, 3.3333333333333335), mode='const', fill_color=(128, 128, 128), fill_std=None)[source]¶
Bases:
object
Randomly selects a rectangular region in an image and erases its pixels.
- Parameters
erase_prob – Probability that the image will be randomly erased. Default: 0.5
min_area_ratio (float) – Minimum erased area / input image area Default: 0.02
max_area_ratio (float) – Maximum erased area / input image area Default: 0.4
aspect_range (sequence | float) – Aspect ratio range of the erased area. If a float, it is converted to (aspect_ratio, 1/aspect_ratio). Default: (3/10, 10/3)
mode (str) – Fill method for the erased area: const (default) assigns the same value to all pixels; rand assigns each pixel a random value in [0, 255].
fill_color (sequence | Number) – Base color filled in erased area. Defaults to (128, 128, 128).
fill_std (sequence | Number, optional) – If set and mode is ‘rand’, fill the erased area with random colors from a normal distribution (mean=fill_color, std=fill_std); if not set, fill the erased area with random colors from a uniform distribution over [0, 255]. Defaults to None.
Note
See Random Erasing Data Augmentation. The paper provides 4 modes: RE-R, RE-M, RE-0, RE-255, and uses RE-M by default. The configs of these 4 modes are: RE-R: RandomErasing(mode=’rand’); RE-M: RandomErasing(mode=’const’, fill_color=(123.67, 116.3, 103.5)); RE-0: RandomErasing(mode=’const’, fill_color=0); RE-255: RandomErasing(mode=’const’, fill_color=255).
Submodules¶
easycv.datasets.classification.odps module¶
- class easycv.datasets.classification.odps.ClsOdpsDataset(data_source, pipeline, image_key='url_image', label_key='label', **kwargs)[source]¶
Bases:
Generic[torch.utils.data.dataset.T_co]
Dataset for rotation prediction
easycv.datasets.classification.raw module¶
- class easycv.datasets.classification.raw.ClsDataset(data_source, pipeline)[source]¶
Bases:
Generic[torch.utils.data.dataset.T_co]
Dataset for classification
- Parameters
data_source – data source to parse input data
pipeline – transforms list
- __init__(data_source, pipeline)[source]¶
Initialize self. See help(type(self)) for accurate signature.
- evaluate(results, evaluators, logger=None, topk=(1, 5))[source]¶
evaluate classification task
- Parameters
results – a dict of lists of tensors, including prediction and groundtruth info, where the prediction tensor is N x C, and likewise for the groundtruth labels.
evaluators – a list of evaluator
- Returns
a dict of float, different metric values
- Return type
eval_result
- visualize(results, vis_num=10, **kwargs)[source]¶
Visualize the model output on validation data.
- Parameters
results – A dictionary containing: class – a list with one entry per test image; img_metas – a list with one entry per test image, each a dict of image meta info (filename, img_shape, origin_img_shape and so on).
vis_num – number of images to visualize
- Returns: A dictionary containing
images – visualized images, a list of np.ndarray.
img_metas – a list with one entry per test image, each a dict of image meta info (filename, img_shape, origin_img_shape and so on).
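A minimal sketch of a classification dataset config built from a data source and a transform pipeline (paths and transform names are hypothetical):

train_dataset = dict(
    type='ClsDataset',
    data_source=dict(
        type='ClsSourceImageList',
        list_file='data/meta/train.txt',
        root='data/train/',
    ),
    pipeline=[
        dict(type='RandomResizedCrop', size=224),   # illustrative
        dict(type='RandomHorizontalFlip'),          # illustrative
    ],
)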
easycv.datasets.detection package¶
- class easycv.datasets.detection.DetDataset(data_source, pipeline, profiling=False, classes=None)[source]¶
Bases:
Generic[torch.utils.data.dataset.T_co]
Dataset for Detection
- __init__(data_source, pipeline, profiling=False, classes=None)[source]¶
- Parameters
data_source – Data_source config dict
pipeline – Pipeline config list
profiling – If set True, will print pipeline time
classes – A list of class names, used in evaluation for result and groundtruth visualization
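A minimal sketch of a detection dataset config (hypothetical paths; transform names are illustrative). classes is only needed for visualizing results and ground truth during evaluation:

val_dataset = dict(
    type='DetDataset',
    data_source=dict(
        type='DetSourceCoco',
        ann_file='data/coco/annotations/instances_val2017.json',
        img_prefix='data/coco/val2017/',
        pipeline=[dict(type='LoadImageFromFile')],   # source-level pipeline, illustrative
        test_mode=True,
        iscrowd=True,
    ),
    pipeline=[dict(type='MMResize', img_scale=(1333, 800), keep_ratio=True)],  # illustrative
    classes=None,
)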
- evaluate(results, evaluators=None, logger=None)[source]¶
Evaluates the detection boxes.
results – A dictionary containing:
- detection_boxes: List of length number of test images.
Float32 numpy array of shape [num_boxes, 4] and format [ymin, xmin, ymax, xmax] in absolute image coordinates.
- detection_scores: List of length number of test images,
detection scores for the boxes, float32 numpy array of shape [num_boxes].
- detection_classes: List of length number of test images,
integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.
- img_metas: List of length number of test images,
dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.
- Parameters
evaluators – evaluators to calculate metric with results and groundtruth_dict
- visualize(results, vis_num=10, score_thr=0.3, **kwargs)[source]¶
Visualize the model output on validation data.
results – A dictionary containing:
- detection_boxes: List of length number of test images.
Float32 numpy array of shape [num_boxes, 4] and format [ymin, xmin, ymax, xmax] in absolute image coordinates.
- detection_scores: List of length number of test images,
detection scores for the boxes, float32 numpy array of shape [num_boxes].
- detection_classes: List of length number of test images,
integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.
- img_metas: List of length number of test images,
dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.
- Parameters
vis_num – number of images visualized
score_thr – The threshold to filter box, boxes with scores greater than score_thr will be kept.
- Returns: A dictionary containing
images: Visualized images. img_metas: List of length number of test images,
dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.
- class easycv.datasets.detection.DetImagesMixDataset(data_source, pipeline, dynamic_scale=None, skip_type_keys=None, profiling=False, classes=None, yolo_format=True, label_padding=True)[source]¶
Bases:
Generic[torch.utils.data.dataset.T_co]
A wrapper of multiple images mixed dataset.
Suitable for training on multiple images mixed data augmentation like mosaic and mixup. For the augmentation pipeline of mixed image data, the get_indexes method needs to be provided to obtain the image indexes, and you can set skip_flags to change the pipeline running process. At the same time, we provide the dynamic_scale parameter to dynamically change the output image size.
output boxes format: cx, cy, w, h
- Parameters
data_source (DetSourceCoco) – The dataset to be mixed.
pipeline (Sequence[dict]) – Sequence of transform objects or config dicts to be composed.
dynamic_scale (tuple[int], optional) – The image scale can be changed dynamically. Default to None.
skip_type_keys (list[str], optional) – Sequence of type string to be skip pipeline. Default to None.
label_padding – whether to pad the output labels to shape [N, 120, 5].
- __init__(data_source, pipeline, dynamic_scale=None, skip_type_keys=None, profiling=False, classes=None, yolo_format=True, label_padding=True)[source]¶
- Parameters
data_source – Data_source config dict
pipeline – Pipeline config list
profiling – If set True, will print pipeline time
classes – A list of class names, used in evaluation for result and groundtruth visualization
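A minimal sketch of a mixed-image training dataset for mosaic-style augmentation; 'MMMosaic' and the other transform names are illustrative assumptions, not guaranteed registry names:

train_dataset = dict(
    type='DetImagesMixDataset',
    data_source=dict(
        type='DetSourceCoco',
        ann_file='data/coco/annotations/instances_train2017.json',
        img_prefix='data/coco/train2017/',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(type='LoadAnnotations', with_bbox=True),
        ],
    ),
    pipeline=[
        dict(type='MMMosaic', img_scale=(640, 640), pad_val=114.0),  # illustrative
        dict(type='MMRandomFlip', flip_ratio=0.5),                   # illustrative
    ],
    dynamic_scale=(640, 640),   # output scale, changeable by a hook
    label_padding=True,         # pad labels to [N, 120, 5]
)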
- update_skip_type_keys(skip_type_keys)[source]¶
Update skip_type_keys. It is called by an external hook.
- Parameters
skip_type_keys (list[str], optional) – Sequence of type string to be skip pipeline.
- update_dynamic_scale(dynamic_scale)[source]¶
Update dynamic_scale. It is called by an external hook.
- Parameters
dynamic_scale (tuple[int]) – The image scale can be changed dynamically.
- results2json(results, outfile_prefix)[source]¶
Dump the detection results to a COCO style json file.
There are 3 types of results: proposals, bbox predictions, mask predictions, and they have different data types. This method will automatically recognize the type, and dump them to json files.
- Parameters
results (list[list | tuple | ndarray]) – Testing results of the dataset.
outfile_prefix (str) – The filename prefix of the json files. If the prefix is “somepath/xxx”, the json files will be named “somepath/xxx.bbox.json”, “somepath/xxx.segm.json”, “somepath/xxx.proposal.json”.
- Returns
Possible keys are “bbox”, “segm”, “proposal”, and values are corresponding filenames.
- Return type
dict[str, str]
- format_results(results, jsonfile_prefix=None, **kwargs)[source]¶
Format the results to json (standard format for COCO evaluation).
- Parameters
results (list[tuple | numpy.ndarray]) – Testing results of the dataset.
jsonfile_prefix (str | None) – The prefix of json files. It includes the file path and the prefix of filename, e.g., “a/b/prefix”. If not specified, a temp file will be created. Default: None.
- Returns
(result_files, tmp_dir), result_files is a dict containing the json filepaths, tmp_dir is the temporal directory created for saving json files when jsonfile_prefix is not specified.
- Return type
tuple
Subpackages¶
easycv.datasets.detection.data_sources package¶
- class easycv.datasets.detection.data_sources.DetSourceCoco(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶
Bases:
object
coco data source
- __init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶
- Parameters
ann_file – Path of annotation file.
img_prefix – coco path prefix
test_mode (bool, optional) – If set True, self._filter_imgs will not work.
filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.
iscrowd – set to False for training and True for validation.
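A minimal sketch of COCO-style sources for training and validation (hypothetical paths; pipeline entries are illustrative). Per the parameter notes above, iscrowd is typically False for training and True for validation:

train_source = dict(
    type='DetSourceCoco',
    ann_file='data/coco/annotations/instances_train2017.json',
    img_prefix='data/coco/train2017/',
    pipeline=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
    ],
    filter_empty_gt=True,   # drop images with no boxes of the target classes
    iscrowd=False,
)
val_source = dict(
    type='DetSourceCoco',
    ann_file='data/coco/annotations/instances_val2017.json',
    img_prefix='data/coco/val2017/',
    pipeline=[dict(type='LoadImageFromFile')],
    test_mode=True,         # disables image filtering
    iscrowd=True,
)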
- load_annotations(ann_file)[source]¶
Load annotation from COCO style annotation file. :param ann_file: Path of annotation file. :type ann_file: str
- Returns
Annotation info from COCO api.
- Return type
list[dict]
- get_ann_info(idx)[source]¶
Get COCO annotation by index. :param idx: Index of data. :type idx: int
- Returns
Annotation info of specified index.
- Return type
dict
- get_cat_ids(idx)[source]¶
Get COCO category ids by index. :param idx: Index of data. :type idx: int
- Returns
All categories in the image of specified index.
- Return type
list[int]
- class easycv.datasets.detection.data_sources.DetSourceCocoPanoptic(ann_file, pan_ann_file, img_prefix, seg_prefix, pipeline, outfile_prefix='test/test_pan', test_mode=False, filter_empty_gt=False, thing_classes=None, stuff_classes=None, iscrowd=False)[source]¶
Bases:
easycv.datasets.detection.data_sources.coco.DetSourceCoco
cocopanoptic data source
- __init__(ann_file, pan_ann_file, img_prefix, seg_prefix, pipeline, outfile_prefix='test/test_pan', test_mode=False, filter_empty_gt=False, thing_classes=None, stuff_classes=None, iscrowd=False)[source]¶
- Parameters
ann_file (str) – Path of coco detection annotation file
pan_ann_file (str) – Path of coco panoptic annotation file
img_prefix (str) – Path of image file
seg_prefix (str) – Path of semantic image file
pipeline (list[dict]) – list of data augmentation operations
outfile_prefix (str, optional) – The filename prefix of the output files. If the prefix is “somepath/xxx”, the json files will be named “somepath/xxx.panoptic.json”, “somepath/xxx.bbox.json”, “somepath/xxx.segm.json”
test_mode (bool, optional) – If set True, self._filter_imgs will not work.
filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.
thing_classes (list[str], optional) – list of thing classes. Defaults to None.
stuff_classes (list[str], optional) – list of stuff classes. Defaults to None.
iscrowd (bool, optional) – set to False for training and True for validation. Defaults to False.
- load_annotations_pan(ann_file)[source]¶
Load annotation from COCO Panoptic style annotation file.
- Parameters
ann_file (str) – Path of annotation file.
- Returns
Annotation info from COCO api.
- Return type
list[dict]
- get_ann_info_pan(idx)[source]¶
Get COCO annotation by index.
- Parameters
idx (int) – Index of data.
- Returns
Annotation info of specified index.
- Return type
dict
- prepare_train_img(idx)[source]¶
Get training data and annotations after pipeline.
- Parameters
idx (int) – Index of data.
- Returns
Training data and annotation after pipeline with new keys introduced by pipeline.
- Return type
dict
- results2json(results)[source]¶
Dump the results to a COCO style json file.
There are 4 types of results: proposals, bbox predictions, mask predictions, panoptic segmentation predictions, and they have different data types. This method will automatically recognize the type, and dump them to json files.
[
  {
    'pan_results': np.array,  # shape (h, w)
    # ins_results, which includes bboxes and RLE encoded masks, is optional
    'ins_results': (list[np.array], list[list[str]])
  },
  ...
]
- Parameters
results (list[dict]) – Testing results of the dataset.
- Returns
Possible keys are “panoptic”, “bbox”, “segm”, “proposal”, and values are corresponding filenames.
- Return type
dict[str, str]
- get_gt_json(result_files)[source]¶
Get the inputs for COCO panoptic evaluation.
- Parameters
result_files (dict) – path of predict result
- Returns
gt_json (dict) – gt label
gt_folder (str) – path of gt files
pred_json (dict) – predict result
pred_folder (str) – path of prediction files
categories (dict) – panoptic categories
- class easycv.datasets.detection.data_sources.DetSourceObjects365(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=[], iscrowd=False)[source]¶
Bases:
easycv.datasets.detection.data_sources.coco.DetSourceCoco
objects365 data source.
- __init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=[], iscrowd=False)[source]¶
- Parameters
ann_file – Path of annotation file.
img_prefix – coco path prefix
test_mode (bool, optional) – If set True, self._filter_imgs will not work.
filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.
iscrowd – set to False for training and True for validation.
- class easycv.datasets.detection.data_sources.DetSourcePAI(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
data format please refer to: https://help.aliyun.com/document_detail/311173.html
- __init__(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]¶
- Parameters
path – Path of manifest path with pai label format
classes – classes list
cache_at_init – if set True, will cache in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache in memory during training
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.DetSourceRaw(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
The data dir is organized as follows:
|- data_dir
   |- images
   |- labels
In each label txt file, the first column is the label id, and columns 2 to 5 are coordinates relative to the image width and height: [x_center, y_center, bbox_w, bbox_h]. For example:
15 0.519398 0.544087 0.476359 0.572061
2 0.501859 0.820726 0.996281 0.332178
...
Example:
data_source = DetSourceRaw(
    img_root_path='/your/data_dir/images',
    label_root_path='/your/data_dir/labels',
)
- __init__(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]¶
- Parameters
img_root_path – images dir path
label_root_path – labels dir path
classes (list, optional) – classes list
cache_at_init – if set True, will cache in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache in memory during training
delimeter – the delimiter used in each txt label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.DetSourceVOC(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
The data dir follows the VOC layout:
|- voc_data
   |- ImageSets/Main
   |- ...
Example 1:
data_source = DetSourceVOC(
    path='/your/voc_data/ImageSets/Main/train.txt',
    classes=${VOC_CLASSES},
)
Example 2:
data_source = DetSourceVOC(
    path='/your/voc_data/train.txt',
    classes=${VOC_CLASSES},
    img_root_path='/your/voc_data/images',
    label_root_path='/your/voc_data/annotations',
)
- __init__(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
- Parameters
path – path of img id list file in ImageSets/Main/
classes – classes list
img_root_path – image dir path; if None, the image dir is inferred from the relative location of path according to the VOC data format.
label_root_path – label dir path; if None, the label dir is inferred from the relative location of path according to the VOC data format.
cache_at_init – if set True, will cache in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.DetSourceVOC2007(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.voc.DetSourceVOC
- __init__(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
- Parameters
path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading
download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used
split – train or val
classes – classes list
img_root_path – image dir path; if None, the image dir is inferred from the relative location of path according to the VOC data format.
label_root_path – label dir path; if None, the label dir is inferred from the relative location of path according to the VOC data format.
cache_at_init – if set True, will cache in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.DetSourceVOC2012(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.voc.DetSourceVOC
- __init__(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
- Parameters
path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading
download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used
split – train or val
classes – classes list
img_root_path – image dir path; if None, the image dir is inferred from the relative location of path according to the VOC data format.
label_root_path – label dir path; if None, the label dir is inferred from the relative location of path according to the VOC data format.
cache_at_init – if set True, will cache in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.DetSourceCoco2017(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶
Bases:
easycv.datasets.detection.data_sources.coco.DetSourceCoco
coco2017 data source
- __init__(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶
- Parameters
path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading
download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used
split – train or val
test_mode (bool, optional) – If set True, self._filter_imgs will not work.
filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.
iscrowd – set to False for training and True for validation.
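A minimal sketch of the auto-download COCO2017 source; per the parameter notes above, omitting path with download=True falls back to a temporary directory (the pipeline entries are illustrative):

train_source = dict(
    type='DetSourceCoco2017',
    pipeline=[
        dict(type='LoadImageFromFile'),
        dict(type='LoadAnnotations', with_bbox=True),
    ],
    path='data/coco2017/',   # optional; omit to use a temporary directory
    download=True,
    split='train',
)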
- class easycv.datasets.detection.data_sources.DetSourceLvis(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.coco.DetSourceCoco
lvis data source
- __init__(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False, **kwargs)[source]¶
- Parameters
path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading.
download – If True, the dataset is automatically downloaded to the path directory. If False, automatic download is disabled and the data already in path is used.
split – train or val
test_mode (bool, optional) – If set True, self._filter_imgs will not work.
filter_empty_gt (bool, optional) – If set True, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.
iscrowd – set to False for training and True for validation.
- cfg = {'dataset': 'images', 'links': ['https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_train.json.zip', 'https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_val.json.zip', 'http://images.cocodataset.org/zips/train2017.zip', 'http://images.cocodataset.org/zips/val2017.zip'], 'train': 'lvis_v1_train.json', 'val': 'lvis_v1_val.json'}¶
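Example (a usage sketch; the LVIS archives listed in cfg above are fetched automatically when download=True, and the pipeline entry is an illustrative assumption):
```python
from easycv.datasets.detection.data_sources import DetSourceLvis

data_source = DetSourceLvis(
    pipeline=[dict(type='MMResize', img_scale=(640, 640), keep_ratio=True)],
    download=True,
    split='val',
    iscrowd=True)   # True for validation per the parameter note above
```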
- class easycv.datasets.detection.data_sources.DetSourceWiderPerson(path, classes=['pedestrians', 'riders', 'partially-visible persons', 'ignore regions', 'crowd'], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.txt', parse_fn=<function parse_txt>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
- CLASSES = ['pedestrians', 'riders', 'partially-visible persons', 'ignore regions', 'crowd']¶
dataset_name=’Wider Person’, paper_info=@article{zhang2019widerperson, Author = {Zhang, Shifeng and Xie, Yiliang and Wan, Jun and Xia, Hansheng and Li, Stan Z. and Guo, Guodong}, journal = {IEEE Transactions on Multimedia (TMM)}, Title = {WiderPerson: A Diverse Dataset for Dense Pedestrian Detection in the Wild}, Year = {2019}}
- __init__(path, classes=['pedestrians', 'riders', 'partially-visible persons', 'ignore regions', 'crowd'], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.txt', parse_fn=<function parse_txt>, num_processes=1, **kwargs) → None[source]¶
- Parameters
path – path of img id list file in root
classes – classes list
img_root_path – image dir path; if None, the image dir is inferred from path according to the WiderPerson data format.
label_root_path – label dir path; if None, the label dir is inferred from path according to the WiderPerson data format.
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
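Example (a usage sketch; the path is a placeholder for the WiderPerson image-id list file, and image/label dirs are inferred from it per the WiderPerson layout):
```python
from easycv.datasets.detection.data_sources import DetSourceWiderPerson

data_source = DetSourceWiderPerson(
    path='/your/WiderPerson/train.txt',  # placeholder image-id list file
    cache_at_init=False,
    cache_on_the_fly=True)               # cache samples lazily during training
```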
- class easycv.datasets.detection.data_sources.DetSourceAfricanWildlife(path, classes=['buffalo', 'elephant', 'rhino', 'zebra'], cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.txt', parse_fn=<function parse_txt>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
data dir is as follows:
```
|- data
```
Example1:
data_source = DetSourceAfricanWildlife(
    path='/your/data/', classes=${CLASSES},
)
- CLASSES = ['buffalo', 'elephant', 'rhino', 'zebra']¶
- __init__(path, classes=['buffalo', 'elephant', 'rhino', 'zebra'], cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.txt', parse_fn=<function parse_txt>, num_processes=1, **kwargs) → None[source]¶
- Parameters
path – path of img id list file in root
classes – classes list
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.DetSourcePet(path, classes_id=1, img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
data dir is as follows:
```
|- data
```
Example0:
data_source = DetSourcePet(
    path='/your/data/annotations/annotations/trainval.txt', classes_id=1 or 2 or 3,
)
Example1:
data_source = DetSourcePet(
    path='/your/data/annotations/annotations/trainval.txt', classes_id=1 or 2 or 3,
    img_root_path='/your/data/images',
    label_root_path='/your/data/annotations/annotations/xmls'
)
- CLASSES_CFG = {1: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37], 2: [1, 2], 3: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]}¶
- __init__(path, classes_id=1, img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
- Parameters
path – path of img id list file in pet format
classes_id – label mapping mode: 1 = breed ids 1:37, 2 = 1:Cat 2:Dog, 3 = per-species breed ids (see CLASSES_CFG)
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.DetSourceWiderFace(ann_file, img_prefix, classes='blur', cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parse_load>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
- CLASSES = {'blur': ['clear', 'normal blur', 'heavy blur'], 'expression': ['typical expression', 'exaggerate expression'], 'illumination': ['normal illumination', 'extreme illumination'], 'invalid': ['false (valid image)', 'true (invalid image)'], 'occlusion': ['no occlusion', 'partial occlusion', 'heavy occlusion'], 'pose': ['typical pose', 'atypical pose']}¶
Citation: @inproceedings{yang2016wider, Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou}, Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, Title = {WIDER FACE: A Face Detection Benchmark}, Year = {2016}}
- __init__(ann_file, img_prefix, classes='blur', cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parse_load>, num_processes=1, **kwargs) → None[source]¶
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held.
classes (str) – classes, default='blur'
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
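Example (a usage sketch; the paths are placeholders, and classes selects which WIDER FACE attribute from CLASSES above is used as the label set):
```python
from easycv.datasets.detection.data_sources import DetSourceWiderFace

data_source = DetSourceWiderFace(
    ann_file='/your/widerface/train_annotations.txt',      # placeholder path
    img_prefix='/your/widerface/WIDER_train/images',       # placeholder path
    classes='blur')
```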
- class easycv.datasets.detection.data_sources.DetSourceCrowdHuman(ann_file, img_prefix, gt_op='vbox', classes=['mask', 'person'], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parse_load>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
- CLASSES = ['mask', 'person']¶
- Citation:
@article{shao2018crowdhuman, title={CrowdHuman: A Benchmark for Detecting Human in a Crowd}, author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian}, journal={arXiv preprint arXiv:1805.00123}, year={2018}
}
- __init__(ann_file, img_prefix, gt_op='vbox', classes=['mask', 'person'], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parse_load>, num_processes=1, **kwargs) → None[source]¶
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held.
gt_op (str) – vbox (visible box), fbox (full box), hbox (head box), default vbox
classes (list) – classes, default=['mask', 'person']
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
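Example (a usage sketch; the paths are placeholders for the CrowdHuman annotation file and image directory):
```python
from easycv.datasets.detection.data_sources import DetSourceCrowdHuman

data_source = DetSourceCrowdHuman(
    ann_file='/your/crowdhuman/annotation_train.odgt',  # placeholder path
    img_prefix='/your/crowdhuman/Images',               # placeholder path
    gt_op='vbox')  # 'vbox' = visible box, 'fbox' = full box, 'hbox' = head box
```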
- class easycv.datasets.detection.data_sources.coco.DetSourceCoco(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶
Bases:
object
coco data source
- __init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶
- Parameters
ann_file – Path of annotation file.
img_prefix – coco path prefix
test_mode (bool, optional) – If set True, self._filter_imgs will not work.
filter_empty_gt (bool, optional) – If set True, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.
iscrowd – set to False for training and True for validation.
- load_annotations(ann_file)[source]¶
Load annotation from COCO style annotation file. :param ann_file: Path of annotation file. :type ann_file: str
- Returns
Annotation info from COCO api.
- Return type
list[dict]
- get_ann_info(idx)[source]¶
Get COCO annotation by index. :param idx: Index of data. :type idx: int
- Returns
Annotation info of specified index.
- Return type
dict
- get_cat_ids(idx)[source]¶
Get COCO category ids by index. :param idx: Index of data. :type idx: int
- Returns
All categories in the image of specified index.
- Return type
list[int]
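Example (a usage sketch with standard COCO-format paths as placeholders; the pipeline entry is an illustrative assumption):
```python
from easycv.datasets.detection.data_sources.coco import DetSourceCoco

source = DetSourceCoco(
    ann_file='/your/coco/annotations/instances_train2017.json',  # placeholder
    img_prefix='/your/coco/train2017/',                          # placeholder
    pipeline=[dict(type='MMResize', img_scale=(640, 640), keep_ratio=True)],
    filter_empty_gt=True,
    iscrowd=False)

ann = source.get_ann_info(0)      # annotation dict of sample 0
cat_ids = source.get_cat_ids(0)   # list[int] of category ids present in sample 0
```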
- class easycv.datasets.detection.data_sources.coco.DetSourceCoco2017(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶
Bases:
easycv.datasets.detection.data_sources.coco.DetSourceCoco
coco2017 data source
- __init__(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]¶
- Parameters
path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading.
download – If True, the dataset is automatically downloaded to the path directory. If False, automatic download is disabled and the data already in path is used.
split – train or val
test_mode (bool, optional) – If set True, self._filter_imgs will not work.
filter_empty_gt (bool, optional) – If set True, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.
iscrowd – set to False for training and True for validation.
- class easycv.datasets.detection.data_sources.coco.DetSourceTinyPerson(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=['sea_person', 'earth_person'], iscrowd=False)[source]¶
Bases:
easycv.datasets.detection.data_sources.coco.DetSourceCoco
TINY PERSON data source
- CLASSES = ['sea_person', 'earth_person']¶
- __init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=['sea_person', 'earth_person'], iscrowd=False)[source]¶
- Parameters
ann_file – Path of annotation file.
img_prefix – coco path prefix
test_mode (bool, optional) – If set True, self._filter_imgs will not work.
filter_empty_gt (bool, optional) – If set True, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.
iscrowd – set to False for training and True for validation.
- easycv.datasets.detection.data_sources.pai_format.get_prior_task_id(keys)[source]¶
The task id ending with 'check' has the highest priority.
- easycv.datasets.detection.data_sources.pai_format.is_itag_v2(row)[source]¶
The keyword of the data source is picUrl in v1, but is source in v2
- easycv.datasets.detection.data_sources.pai_format.parser_manifest_row_str(row_str, classes)[source]¶
- class easycv.datasets.detection.data_sources.pai_format.DetSourcePAI(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
data format please refer to: https://help.aliyun.com/document_detail/311173.html
- __init__(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]¶
- Parameters
path – Path of manifest path with pai label format
classes – classes list
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
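Example (a usage sketch; the manifest path and class list are placeholders for an exported PAI-iTAG manifest):
```python
from easycv.datasets.detection.data_sources import DetSourcePAI

data_source = DetSourcePAI(
    path='/your/data/train.manifest',  # placeholder manifest in PAI label format
    classes=['person', 'car'])         # placeholder class list
```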
- easycv.datasets.detection.data_sources.raw.parse_raw(source_iter, classes=None, delimeter=' ')[source]¶
- class easycv.datasets.detection.data_sources.raw.DetSourceRaw(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
data dir is as follows:
```
|- data_dir
```
Label txt file is as follows: the first column is the label id, and columns 2 to 5 are coordinates relative to the image width and height [x_center, y_center, bbox_w, bbox_h], e.g.:
```
15 0.519398 0.544087 0.476359 0.572061
2 0.501859 0.820726 0.996281 0.332178
...
```
Example:
data_source = DetSourceRaw(
    img_root_path='/your/data_dir/images', label_root_path='/your/data_dir/labels',
)
- __init__(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]¶
- Parameters
img_root_path – images dir path
label_root_path – labels dir path
classes (list, optional) – classes list
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
delimeter – delimiter of the txt file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.voc.DetSourceVOC(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.base.DetSourceBase
data dir is as follows:
```
|- voc_data
```
Example1:
data_source = DetSourceVOC(
    path='/your/voc_data/ImageSets/Main/train.txt', classes=${VOC_CLASSES},
)
Example2:
data_source = DetSourceVOC(
    path='/your/voc_data/train.txt', classes=${VOC_CLASSES},
    img_root_path='/your/voc_data/images',
    label_root_path='/your/voc_data/annotations'
)
- __init__(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
- Parameters
path – path of img id list file in ImageSets/Main/
classes – classes list
img_root_path – image dir path; if None, the image dir is inferred from path according to the VOC data format.
label_root_path – label dir path; if None, the label dir is inferred from path according to the VOC data format.
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.voc.DetSourceVOC2012(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.voc.DetSourceVOC
- __init__(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
- Parameters
path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading.
download – If True, the dataset is automatically downloaded to the path directory. If False, automatic download is disabled and the data already in path is used.
split – train or val
classes – classes list
img_root_path – image dir path; if None, the image dir is inferred from path according to the VOC data format.
label_root_path – label dir path; if None, the label dir is inferred from path according to the VOC data format.
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
- class easycv.datasets.detection.data_sources.voc.DetSourceVOC2007(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
Bases:
easycv.datasets.detection.data_sources.voc.DetSourceVOC
- __init__(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]¶
- Parameters
path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading.
download – If True, the dataset is automatically downloaded to the path directory. If False, automatic download is disabled and the data already in path is used.
split – train or val
classes – classes list
img_root_path – image dir path; if None, the image dir is inferred from path according to the VOC data format.
label_root_path – label dir path; if None, the label dir is inferred from path according to the VOC data format.
cache_at_init – if set True, will cache samples in memory in __init__ for faster training
cache_on_the_fly – if set True, will cache samples in memory during training
img_suffix – suffix of image file
label_suffix – suffix of label file
parse_fn – parse function to parse item of source iterator
num_processes – number of processes to parse samples
easycv.datasets.detection.pipelines package¶
- class easycv.datasets.detection.pipelines.MMToTensor[source]¶
Bases:
object
Transform image to Tensor. Required key: ‘img’. Modifies key: ‘img’. :param results: contain all information about training. :type results: dict
- class easycv.datasets.detection.pipelines.NormalizeTensor(mean, std)[source]¶
Bases:
object
Normalize the Tensor image (CxHxW), with mean and std. Required key: ‘img’. Modifies key: ‘img’. :param mean: Mean values of 3 channels. :type mean: list[float] :param std: Std values of 3 channels. :type std: list[float]
- class easycv.datasets.detection.pipelines.MMMosaic(img_scale=(640, 640), center_ratio_range=(0.5, 1.5), pad_val=114)[source]¶
Bases:
object
Mosaic augmentation. Given 4 images, mosaic transform combines them into one output image. The output image is composed of the parts from each sub-image.
[mosaic layout diagram: the output canvas is split at (center_x, center_y) into four quadrants, one per sub-image (the cropped image3 and image4 sit below center_y), with padding wherever a sub-image does not fill its quadrant]
- The mosaic transform steps are as follows:
Choose the mosaic center as the intersections of 4 images
Get the left top image according to the index, and randomly sample another 3 images from the custom dataset.
Sub image will be cropped if image is larger than mosaic patch
- Parameters
img_scale (Sequence[int]) – Image size after mosaic pipeline of single image. Default to (640, 640).
center_ratio_range (Sequence[float]) – Center ratio range of mosaic output. Default to (0.5, 1.5).
pad_val (int) – Pad value. Default to 114.
- class easycv.datasets.detection.pipelines.MMMixUp(img_scale=(640, 640), ratio_range=(0.5, 1.5), flip_ratio=0.5, pad_val=114, max_iters=15, min_bbox_size=5, min_area_ratio=0.2, max_aspect_ratio=20)[source]¶
Bases:
object
MixUp data augmentation.
- The mixup transform steps are as follows:
Another random image is picked by dataset and embedded in the top left patch(after padding and resizing)
The target of mixup transform is the weighted average of mixup image and origin image.
- Parameters
img_scale (Sequence[int]) – Image output size after mixup pipeline. Default: (640, 640).
ratio_range (Sequence[float]) – Scale ratio of mixup image. Default: (0.5, 1.5).
flip_ratio (float) – Horizontal flip ratio of mixup image. Default: 0.5.
pad_val (int) – Pad value. Default: 114.
max_iters (int) – The maximum number of iterations. If the number of iterations is greater than max_iters, but gt_bbox is still empty, then the iteration is terminated. Default: 15.
min_bbox_size (float) – Width and height threshold to filter bboxes. If the height or width of a box is smaller than this value, it will be removed. Default: 5.
min_area_ratio (float) – Threshold of area ratio between original bboxes and wrapped bboxes. If smaller than this value, the box will be removed. Default: 0.2.
max_aspect_ratio (float) – Aspect ratio of width and height threshold to filter bboxes. If max(h/w, w/h) larger than this value, the box will be removed. Default: 20.
- class easycv.datasets.detection.pipelines.MMRandomAffine(max_rotate_degree=10.0, max_translate_ratio=0.1, scaling_ratio_range=(0.5, 1.5), max_shear_degree=2.0, border=(0, 0), border_val=(114, 114, 114), min_bbox_size=2, min_area_ratio=0.2, max_aspect_ratio=20)[source]¶
Bases:
object
Random affine transform data augmentation. for yolox This operation randomly generates affine transform matrix which including rotation, translation, shear and scaling transforms. :param max_rotate_degree: Maximum degrees of rotation transform.
Default: 10.
- Parameters
max_translate_ratio (float) – Maximum ratio of translation. Default: 0.1.
scaling_ratio_range (tuple[float]) – Min and max ratio of scaling transform. Default: (0.5, 1.5).
max_shear_degree (float) – Maximum degrees of shear transform. Default: 2.
border (tuple[int]) – Distance from height and width sides of input image to adjust output shape. Only used in mosaic dataset. Default: (0, 0).
border_val (tuple[int]) – Border padding values of 3 channels. Default: (114, 114, 114).
min_bbox_size (float) – Width and height threshold to filter bboxes. If the height or width of a box is smaller than this value, it will be removed. Default: 2.
min_area_ratio (float) – Threshold of area ratio between original bboxes and wrapped bboxes. If smaller than this value, the box will be removed. Default: 0.2.
max_aspect_ratio (float) – Aspect ratio of width and height threshold to filter bboxes. If max(h/w, w/h) larger than this value, the box will be removed.
- class easycv.datasets.detection.pipelines.MMPhotoMetricDistortion(brightness_delta=32, contrast_range=(0.5, 1.5), saturation_range=(0.5, 1.5), hue_delta=18)[source]¶
Bases:
object
Apply photometric distortion to image sequentially, every transformation is applied with a probability of 0.5. The position of random contrast is in second or second to last. 1. random brightness 2. random contrast (mode 0) 3. convert color from BGR to HSV 4. random saturation 5. random hue 6. convert color from HSV to BGR 7. random contrast (mode 1) 8. randomly swap channels :param brightness_delta: delta of brightness. :type brightness_delta: int :param contrast_range: range of contrast. :type contrast_range: tuple :param saturation_range: range of saturation. :type saturation_range: tuple :param hue_delta: delta of hue. :type hue_delta: int
- class easycv.datasets.detection.pipelines.MMResize(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', override=False)[source]¶
Bases:
object
Resize images & bbox & mask. This transform resizes the input image to some scale. Bboxes and masks are then resized with the same scale factor. If the input dict contains the key “scale”, then the scale in the input dict is used, otherwise the specified scale in the init method is used. If the input dict contains the key “scale_factor” (if MultiScaleFlipAug does not give img_scale but scale_factor), the actual scale will be computed by image shape and scale_factor. img_scale can either be a tuple (single-scale) or a list of tuple (multi-scale). There are 3 multiscale modes:
- ratio_range is not None: randomly sample a ratio from the ratio range and multiply it with the image scale.
- ratio_range is None and multiscale_mode == "range": randomly sample a scale from the multiscale range.
- ratio_range is None and multiscale_mode == "value": randomly sample a scale from multiple scales.
:param img_scale: Images scales for resizing. :type img_scale: tuple or list[tuple] :param multiscale_mode: Either “range” or “value”. :type multiscale_mode: str :param ratio_range: (min_ratio, max_ratio) :type ratio_range: tuple[float] :param keep_ratio: Whether to keep the aspect ratio when resizing the image.
- Parameters
bbox_clip_border (bool, optional) – Whether clip the objects outside the border of the image. Defaults to True.
backend (str) – Image resize backend, choices are ‘cv2’ and ‘pillow’. These two backends generates slightly different results. Defaults to ‘cv2’.
override (bool, optional) – Whether to override scale and scale_factor so as to call resize twice. Default False. If True, after the first resizing, the existed scale and scale_factor will be ignored so the second resizing can be allowed. This option is a work-around for multiple times of resize in DETR. Defaults to False.
- __init__(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', override=False)[source]¶
Initialize self. See help(type(self)) for accurate signature.
- static random_select(img_scales)[source]¶
Randomly select an img_scale from given candidates. :param img_scales: Images scales for selection. :type img_scales: list[tuple]
- Returns
Returns a tuple (img_scale, scale_idx), where img_scale is the selected image scale and scale_idx is the selected index in the given candidates.
- Return type
(tuple, int)
- static random_sample(img_scales)[source]¶
Randomly sample an img_scale when multiscale_mode=='range'. :param img_scales: Images scale range for sampling. There must be two tuples in img_scales, which specify the lower and upper bound of image scales.
- Returns
Returns a tuple (img_scale, None), where img_scale is the sampled scale and None is just a placeholder to be consistent with random_select().
- Return type
(tuple, None)
- static random_sample_ratio(img_scale, ratio_range)[source]¶
Randomly sample an img_scale when ratio_range is specified. A ratio will be randomly sampled from the range specified by ratio_range, then multiplied with img_scale to generate the sampled scale. :param img_scale: Images scale base to multiply with ratio. :type img_scale: tuple :param ratio_range: The minimum and maximum ratio to scale the img_scale.
- Returns
Returns a tuple (scale, None), where scale is the sampled ratio multiplied with img_scale and None is just a placeholder to be consistent with random_select().
- Return type
(tuple, None)
- class easycv.datasets.detection.pipelines.MMRandomFlip(flip_ratio=None, direction='horizontal')[source]¶
Bases:
object
Flip the image & bbox & mask. If the input dict contains the key “flip”, then the flag will be used, otherwise it will be randomly decided by a ratio specified in the init method. When random flip is enabled, flip_ratio/direction can either be a float/string or a tuple of float/string. There are 3 flip modes:
- flip_ratio is float, direction is string: the image will be flipped along direction with probability of flip_ratio. E.g., flip_ratio=0.5, direction='horizontal', then the image will be horizontally flipped with probability of 0.5.
- flip_ratio is float, direction is list of string: the image will be flipped along direction[i] with probability of flip_ratio/len(direction). E.g., flip_ratio=0.5, direction=['horizontal', 'vertical'], then the image will be horizontally flipped with probability of 0.25, vertically with probability of 0.25.
- flip_ratio is list of float, direction is list of string: given len(flip_ratio) == len(direction), the image will be flipped along direction[i] with probability of flip_ratio[i]. E.g., flip_ratio=[0.3, 0.5], direction=['horizontal', 'vertical'], then the image will be horizontally flipped with probability of 0.3, vertically with probability of 0.5.
- Parameters
flip_ratio (float | list[float], optional) – The flipping probability. Default: None.
direction (str | list[str], optional) – The flipping direction. Options are ‘horizontal’, ‘vertical’, ‘diagonal’. Default: ‘horizontal’. If input is a list, the length must equal flip_ratio. Each element in flip_ratio indicates the flip probability of the corresponding direction.
- __init__(flip_ratio=None, direction='horizontal')[source]¶
Initialize self. See help(type(self)) for accurate signature.
- bbox_flip(bboxes, img_shape, direction)[source]¶
Flip bboxes horizontally. :param bboxes: Bounding boxes, shape (…, 4*k) :type bboxes: numpy.ndarray :param img_shape: Image shape (height, width) :type img_shape: tuple[int] :param direction: Flip direction. Options are ‘horizontal’,
‘vertical’.
- Returns
Flipped bounding boxes.
- Return type
numpy.ndarray
- class easycv.datasets.detection.pipelines.MMPad(size=None, size_divisor=None, pad_to_square=False, pad_val={'img': 0, 'masks': 0, 'seg': 255})[source]¶
Bases:
object
Pad the image & mask. There are two padding modes: (1) pad to a fixed size and (2) pad to the minimum size that is divisible by some number. Added keys are “pad_shape”, “pad_fixed_size”, “pad_size_divisor”, :param size: Fixed padding size. :type size: tuple, optional :param size_divisor: The divisor of padded size. :type size_divisor: int, optional :param pad_to_square: Whether to pad the image into a square.
Currently only used for YOLOX. Default: False.
- Parameters
pad_val (dict, optional) – A dict for padding value, the default value is dict(img=0, masks=0, seg=255).
- class easycv.datasets.detection.pipelines.MMNormalize(mean, std, to_rgb=True)[source]¶
Bases:
object
Normalize the image. Added key is “img_norm_cfg”. :param mean: Mean values of 3 channels. :type mean: sequence :param std: Std values of 3 channels. :type std: sequence :param to_rgb: Whether to convert the image from BGR to RGB,
default is true.
- class easycv.datasets.detection.pipelines.LoadImageFromFile(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]¶
Bases:
object
Load an image from file. Required keys are “img_prefix” and “img_info” (a dict that must contain the key “filename”). Added or updated keys are “filename”, “img”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0) and “img_norm_cfg” (means=0 and stds=1). :param to_float32: Whether to convert the loaded image to a float32
numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.
- Parameters
color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
- class easycv.datasets.detection.pipelines.LoadImageFromWebcam(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]¶
Bases:
easycv.datasets.detection.pipelines.mm_transforms.LoadImageFromFile
Load an image from webcam.
Similar to LoadImageFromFile, but the image read from webcam is in results['img'].
- class easycv.datasets.detection.pipelines.LoadMultiChannelImageFromFiles(to_float32=False, color_type='unchanged', file_client_args={'backend': 'disk'})[source]¶
Bases:
object
Load multi-channel images from a list of separate channel files. Required keys are “img_prefix” and “img_info” (a dict that must contain the key “filename”, which is expected to be a list of filenames). Added or updated keys are “filename”, “img”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0) and “img_norm_cfg” (means=0 and stds=1). :param to_float32: Whether to convert the loaded image to a float32
numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.
- Parameters
color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
- class easycv.datasets.detection.pipelines.LoadAnnotations(with_bbox=True, with_label=True, with_mask=False, with_seg=False, poly2mask=True, file_client_args={'backend': 'disk'})[source]¶
Bases:
object
Load multiple types of annotations. :param with_bbox: Whether to parse and load the bbox annotation.
Default: True.
- Parameters
with_label (bool) – Whether to parse and load the label annotation. Default: True.
with_mask (bool) – Whether to parse and load the mask annotation. Default: False.
with_seg (bool) – Whether to parse and load the semantic segmentation annotation. Default: False.
poly2mask (bool) – Whether to convert the instance masks from polygons to bitmaps. Default: True.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
- class easycv.datasets.detection.pipelines.MMMultiScaleFlipAug(transforms, img_scale=None, scale_factor=None, flip=False, flip_direction='horizontal')[source]¶
Bases:
object
Test-time augmentation with multiple scales and flipping. An example configuration is as follows:
img_scale=[(1333, 400), (1333, 800)], flip=True, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict(type='Normalize', **img_norm_cfg), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']), ]
After MultiScaleFlipAug with the above configuration, the results are wrapped into lists of the same length as follows:
dict( img=[...], img_shape=[...], scale=[(1333, 400), (1333, 400), (1333, 800), (1333, 800)] flip=[False, True, False, True] ... )
- Parameters
transforms (list[dict]) – Transforms to apply in each augmentation.
img_scale (tuple | list[tuple] | None) – Images scales for resizing.
scale_factor (float | list[float] | None) – Scale factors for resizing.
flip (bool) – Whether apply flip augmentation. Default: False.
flip_direction (str | list[str]) – Flip augmentation directions, options are “horizontal”, “vertical” and “diagonal”. If flip_direction is a list, multiple flip augmentations will be applied. It has no effect when flip == False. Default: “horizontal”.
- class easycv.datasets.detection.pipelines.MMRandomCrop(crop_size, crop_type='absolute', allow_negative_crop=False, recompute_bbox=False, bbox_clip_border=True)[source]¶
Bases:
object
Random crop the image & bboxes & masks.
The absolute crop_size is sampled based on crop_type and image_size, then the cropped results are generated.
- Parameters
crop_size (tuple) – The relative ratio or absolute pixels of height and width.
crop_type (str, optional) – one of “relative_range”, “relative”, “absolute”, “absolute_range”. “relative” randomly crops (h * crop_size[0], w * crop_size[1]) part from an input of size (h, w). “relative_range” uniformly samples relative crop size from range [crop_size[0], 1] and [crop_size[1], 1] for height and width respectively. “absolute” crops from an input with absolute size (crop_size[0], crop_size[1]). “absolute_range” uniformly samples crop_h in range [crop_size[0], min(h, crop_size[1])] and crop_w in range [crop_size[0], min(w, crop_size[1])]. Default “absolute”.
allow_negative_crop (bool, optional) – Whether to allow a crop that does not contain any bbox area. Default False.
recompute_bbox (bool, optional) – Whether to re-compute the boxes based on cropped instance masks. Default False.
bbox_clip_border (bool, optional) – Whether clip the objects outside the border of the image. Defaults to True.
Note
- If the image is smaller than the absolute crop size, return the
original image.
The keys for bboxes, labels and masks must be aligned. That is, gt_bboxes corresponds to gt_labels and gt_masks, and gt_bboxes_ignore corresponds to gt_labels_ignore and gt_masks_ignore.
If the crop does not contain any gt-bbox region and allow_negative_crop is set to False, skip this image.
- class easycv.datasets.detection.pipelines.MMFilterAnnotations(min_gt_bbox_wh=(1.0, 1.0), min_gt_mask_area=1, by_box=True, by_mask=False, keep_empty=True)[source]¶
Bases:
object
Filter invalid annotations. :param min_gt_bbox_wh: Minimum width and height of ground truth
boxes. Default: (1., 1.)
- Parameters
min_gt_mask_area (int) – Minimum foreground area of ground truth masks. Default: 1
by_box (bool) – Filter instances with bounding boxes not meeting the min_gt_bbox_wh threshold. Default: True
by_mask (bool) – Filter instances with masks not meeting min_gt_mask_area threshold. Default: False
keep_empty (bool) – Whether to return None when it becomes an empty bbox after filtering. Default: True
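To illustrate how the transform classes above are typically combined, below is a sketch of a YOLOX-style training pipeline expressed as config dicts. The exact transform set, ordering, and normalization statistics are illustrative assumptions, not a prescribed recipe.
```python
# Sketch of a detection training pipeline built from the transforms documented above.
img_scale = (640, 640)
train_pipeline = [
    dict(type='MMMosaic', img_scale=img_scale, pad_val=114),
    dict(type='MMRandomAffine', scaling_ratio_range=(0.5, 1.5)),
    dict(type='MMMixUp', img_scale=img_scale, ratio_range=(0.5, 1.5), pad_val=114),
    dict(type='MMPhotoMetricDistortion'),
    dict(type='MMRandomFlip', flip_ratio=0.5),
    dict(type='MMResize', img_scale=img_scale, keep_ratio=True),
    dict(type='MMPad', pad_to_square=True),
    # Mean/std below are the common ImageNet statistics, used here only as an example.
    dict(type='MMNormalize', mean=[123.675, 116.28, 103.53],
         std=[58.395, 57.12, 57.375], to_rgb=True),
    dict(type='MMToTensor'),
]
```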
- class easycv.datasets.detection.pipelines.mm_transforms.MMToTensor[source]¶
Bases:
object
Transform image to Tensor. Required key: ‘img’. Modifies key: ‘img’. :param results: contain all information about training. :type results: dict
- class easycv.datasets.detection.pipelines.mm_transforms.NormalizeTensor(mean, std)[source]¶
Bases:
object
Normalize the Tensor image (CxHxW), with mean and std. Required key: ‘img’. Modifies key: ‘img’. :param mean: Mean values of 3 channels. :type mean: list[float] :param std: Std values of 3 channels. :type std: list[float]
- class easycv.datasets.detection.pipelines.mm_transforms.MMMosaic(img_scale=(640, 640), center_ratio_range=(0.5, 1.5), pad_val=114)[source]¶
Bases:
object
Mosaic augmentation. Given 4 images, mosaic transform combines them into one output image. The output image is composed of the parts from each sub-image.
[mosaic layout diagram: the output canvas is split at (center_x, center_y) into four quadrants, one per sub-image (the cropped image3 and image4 sit below center_y), with padding wherever a sub-image does not fill its quadrant]
- The mosaic transform steps are as follows:
Choose the mosaic center as the intersections of 4 images
Get the left top image according to the index, and randomly sample another 3 images from the custom dataset.
Sub image will be cropped if image is larger than mosaic patch
- Parameters
img_scale (Sequence[int]) – Image size after mosaic pipeline of single image. Default to (640, 640).
center_ratio_range (Sequence[float]) – Center ratio range of mosaic output. Default to (0.5, 1.5).
pad_val (int) – Pad value. Default to 114.
- class easycv.datasets.detection.pipelines.mm_transforms.MMMixUp(img_scale=(640, 640), ratio_range=(0.5, 1.5), flip_ratio=0.5, pad_val=114, max_iters=15, min_bbox_size=5, min_area_ratio=0.2, max_aspect_ratio=20)[source]¶
Bases:
object
MixUp data augmentation.
- The mixup transform steps are as follows:
Another random image is picked by dataset and embedded in the top left patch(after padding and resizing)
The target of mixup transform is the weighted average of mixup image and origin image.
- Parameters
img_scale (Sequence[int]) – Image output size after mixup pipeline. Default: (640, 640).
ratio_range (Sequence[float]) – Scale ratio of mixup image. Default: (0.5, 1.5).
flip_ratio (float) – Horizontal flip ratio of mixup image. Default: 0.5.
pad_val (int) – Pad value. Default: 114.
max_iters (int) – The maximum number of iterations. If the number of iterations is greater than max_iters, but gt_bbox is still empty, then the iteration is terminated. Default: 15.
min_bbox_size (float) – Width and height threshold to filter bboxes. If the height or width of a box is smaller than this value, it will be removed. Default: 5.
min_area_ratio (float) – Threshold of area ratio between original bboxes and wrapped bboxes. If smaller than this value, the box will be removed. Default: 0.2.
max_aspect_ratio (float) – Aspect ratio of width and height threshold to filter bboxes. If max(h/w, w/h) larger than this value, the box will be removed. Default: 20.
- class easycv.datasets.detection.pipelines.mm_transforms.MMRandomAffine(max_rotate_degree=10.0, max_translate_ratio=0.1, scaling_ratio_range=(0.5, 1.5), max_shear_degree=2.0, border=(0, 0), border_val=(114, 114, 114), min_bbox_size=2, min_area_ratio=0.2, max_aspect_ratio=20)[source]¶
Bases:
object
Random affine transform data augmentation. for yolox This operation randomly generates affine transform matrix which including rotation, translation, shear and scaling transforms. :param max_rotate_degree: Maximum degrees of rotation transform.
Default: 10.
- Parameters
max_translate_ratio (float) – Maximum ratio of translation. Default: 0.1.
scaling_ratio_range (tuple[float]) – Min and max ratio of scaling transform. Default: (0.5, 1.5).
max_shear_degree (float) – Maximum degrees of shear transform. Default: 2.
border (tuple[int]) – Distance from height and width sides of input image to adjust output shape. Only used in mosaic dataset. Default: (0, 0).
border_val (tuple[int]) – Border padding values of 3 channels. Default: (114, 114, 114).
min_bbox_size (float) – Width and height threshold to filter bboxes. If the height or width of a box is smaller than this value, it will be removed. Default: 2.
min_area_ratio (float) – Threshold of area ratio between original bboxes and wrapped bboxes. If smaller than this value, the box will be removed. Default: 0.2.
max_aspect_ratio (float) – Aspect ratio of width and height threshold to filter bboxes. If max(h/w, w/h) larger than this value, the box will be removed.
- class easycv.datasets.detection.pipelines.mm_transforms.MMPhotoMetricDistortion(brightness_delta=32, contrast_range=(0.5, 1.5), saturation_range=(0.5, 1.5), hue_delta=18)[source]¶
Bases:
object
Apply photometric distortion to image sequentially, every transformation is applied with a probability of 0.5. The position of random contrast is in second or second to last. 1. random brightness 2. random contrast (mode 0) 3. convert color from BGR to HSV 4. random saturation 5. random hue 6. convert color from HSV to BGR 7. random contrast (mode 1) 8. randomly swap channels :param brightness_delta: delta of brightness. :type brightness_delta: int :param contrast_range: range of contrast. :type contrast_range: tuple :param saturation_range: range of saturation. :type saturation_range: tuple :param hue_delta: delta of hue. :type hue_delta: int
- class easycv.datasets.detection.pipelines.mm_transforms.MMResize(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', override=False)[source]¶
Bases:
object
Resize images & bbox & mask. This transform resizes the input image to some scale. Bboxes and masks are then resized with the same scale factor. If the input dict contains the key “scale”, then the scale in the input dict is used, otherwise the specified scale in the init method is used. If the input dict contains the key “scale_factor” (if MultiScaleFlipAug does not give img_scale but scale_factor), the actual scale will be computed by image shape and scale_factor. img_scale can either be a tuple (single-scale) or a list of tuple (multi-scale). There are 3 multiscale modes:
- ratio_range is not None: randomly sample a ratio from the ratio range and multiply it with the image scale.
- ratio_range is None and multiscale_mode == "range": randomly sample a scale from the multiscale range.
- ratio_range is None and multiscale_mode == "value": randomly sample a scale from multiple scales.
:param img_scale: Images scales for resizing. :type img_scale: tuple or list[tuple] :param multiscale_mode: Either “range” or “value”. :type multiscale_mode: str :param ratio_range: (min_ratio, max_ratio) :type ratio_range: tuple[float] :param keep_ratio: Whether to keep the aspect ratio when resizing the image.
- Parameters
bbox_clip_border (bool, optional) – Whether clip the objects outside the border of the image. Defaults to True.
backend (str) – Image resize backend, choices are ‘cv2’ and ‘pillow’. These two backends generates slightly different results. Defaults to ‘cv2’.
override (bool, optional) – Whether to override scale and scale_factor so as to call resize twice. Default False. If True, after the first resizing, the existed scale and scale_factor will be ignored so the second resizing can be allowed. This option is a work-around for multiple times of resize in DETR. Defaults to False.
- __init__(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', override=False)[source]¶
Initialize self. See help(type(self)) for accurate signature.
- static random_select(img_scales)[source]¶
Randomly select an img_scale from given candidates. :param img_scales: Images scales for selection. :type img_scales: list[tuple]
- Returns
Returns a tuple (img_scale, scale_idx), where img_scale is the selected image scale and scale_idx is the selected index in the given candidates.
- Return type
(tuple, int)
- static random_sample(img_scales)[source]¶
Randomly sample an img_scale when multiscale_mode=='range'. :param img_scales: Images scale range for sampling. There must be two tuples in img_scales, which specify the lower and upper bound of image scales.
- Returns
Returns a tuple (img_scale, None), where img_scale is the sampled scale and None is just a placeholder to be consistent with random_select().
- Return type
(tuple, None)
- static random_sample_ratio(img_scale, ratio_range)[source]¶
Randomly sample an img_scale when ratio_range is specified. A ratio will be randomly sampled from the range specified by ratio_range, then multiplied with img_scale to generate the sampled scale. :param img_scale: Images scale base to multiply with ratio. :type img_scale: tuple :param ratio_range: The minimum and maximum ratio to scale the img_scale.
- Returns
Returns a tuple (scale, None), where scale is the sampled ratio multiplied with img_scale and None is just a placeholder to be consistent with random_select().
- Return type
(tuple, None)
- class easycv.datasets.detection.pipelines.mm_transforms.MMRandomFlip(flip_ratio=None, direction='horizontal')[source]¶
Bases:
object
Flip the image & bbox & mask. If the input dict contains the key “flip”, then the flag will be used, otherwise it will be randomly decided by a ratio specified in the init method. When random flip is enabled, flip_ratio/direction can either be a float/string or a tuple of float/string. There are 3 flip modes:
- flip_ratio is float, direction is string: the image will be flipped along direction with probability of flip_ratio. E.g., flip_ratio=0.5, direction='horizontal', then the image will be horizontally flipped with probability of 0.5.
- flip_ratio is float, direction is list of string: the image will be flipped along direction[i] with probability of flip_ratio/len(direction). E.g., flip_ratio=0.5, direction=['horizontal', 'vertical'], then the image will be horizontally flipped with probability of 0.25, vertically with probability of 0.25.
- flip_ratio is list of float, direction is list of string: given len(flip_ratio) == len(direction), the image will be flipped along direction[i] with probability of flip_ratio[i]. E.g., flip_ratio=[0.3, 0.5], direction=['horizontal', 'vertical'], then the image will be horizontally flipped with probability of 0.3, vertically with probability of 0.5.
- Parameters
flip_ratio (float | list[float], optional) – The flipping probability. Default: None.
direction (str | list[str], optional) – The flipping direction. Options are ‘horizontal’, ‘vertical’, ‘diagonal’. Default: ‘horizontal’. If input is a list, the length must equal flip_ratio. Each element in flip_ratio indicates the flip probability of the corresponding direction.
- __init__(flip_ratio=None, direction='horizontal')[source]¶
Initialize self. See help(type(self)) for accurate signature.
- bbox_flip(bboxes, img_shape, direction)[source]¶
Flip bboxes horizontally. :param bboxes: Bounding boxes, shape (…, 4*k) :type bboxes: numpy.ndarray :param img_shape: Image shape (height, width) :type img_shape: tuple[int] :param direction: Flip direction. Options are ‘horizontal’,
‘vertical’.
- Returns
Flipped bounding boxes.
- Return type
numpy.ndarray
- class easycv.datasets.detection.pipelines.mm_transforms.MMRandomCrop(crop_size, crop_type='absolute', allow_negative_crop=False, recompute_bbox=False, bbox_clip_border=True)[source]¶
Bases:
object
Random crop the image & bboxes & masks.
The absolute crop_size is sampled based on crop_type and image_size, then the cropped results are generated.
- Parameters
crop_size (tuple) – The relative ratio or absolute pixels of height and width.
crop_type (str, optional) – one of “relative_range”, “relative”, “absolute”, “absolute_range”. “relative” randomly crops (h * crop_size[0], w * crop_size[1]) part from an input of size (h, w). “relative_range” uniformly samples relative crop size from range [crop_size[0], 1] and [crop_size[1], 1] for height and width respectively. “absolute” crops from an input with absolute size (crop_size[0], crop_size[1]). “absolute_range” uniformly samples crop_h in range [crop_size[0], min(h, crop_size[1])] and crop_w in range [crop_size[0], min(w, crop_size[1])]. Default “absolute”.
allow_negative_crop (bool, optional) – Whether to allow a crop that does not contain any bbox area. Default False.
recompute_bbox (bool, optional) – Whether to re-compute the boxes based on cropped instance masks. Default False.
bbox_clip_border (bool, optional) – Whether clip the objects outside the border of the image. Defaults to True.
Note
- If the image is smaller than the absolute crop size, return the
original image.
The keys for bboxes, labels and masks must be aligned. That is, gt_bboxes corresponds to gt_labels and gt_masks, and gt_bboxes_ignore corresponds to gt_labels_ignore and gt_masks_ignore.
If the crop does not contain any gt-bbox region and allow_negative_crop is set to False, skip this image.
- class easycv.datasets.detection.pipelines.mm_transforms.MMPad(size=None, size_divisor=None, pad_to_square=False, pad_val={'img': 0, 'masks': 0, 'seg': 255})[source]¶
Bases:
object
Pad the image & mask. There are two padding modes: (1) pad to a fixed size and (2) pad to the minimum size that is divisible by some number. Added keys are “pad_shape”, “pad_fixed_size”, “pad_size_divisor”, :param size: Fixed padding size. :type size: tuple, optional :param size_divisor: The divisor of padded size. :type size_divisor: int, optional :param pad_to_square: Whether to pad the image into a square.
Currently only used for YOLOX. Default: False.
- Parameters
pad_val (dict, optional) – A dict for padding value, the default value is dict(img=0, masks=0, seg=255).
- class easycv.datasets.detection.pipelines.mm_transforms.MMNormalize(mean, std, to_rgb=True)[source]¶
Bases:
object
Normalize the image. Added key is “img_norm_cfg”. :param mean: Mean values of 3 channels. :type mean: sequence :param std: Std values of 3 channels. :type std: sequence :param to_rgb: Whether to convert the image from BGR to RGB,
default is true.
- class easycv.datasets.detection.pipelines.mm_transforms.LoadImageFromFile(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]¶
Bases:
object
Load an image from file. Required keys are “img_prefix” and “img_info” (a dict that must contain the key “filename”). Added or updated keys are “filename”, “img”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0) and “img_norm_cfg” (means=0 and stds=1). :param to_float32: Whether to convert the loaded image to a float32
numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.
- Parameters
color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
- class easycv.datasets.detection.pipelines.mm_transforms.LoadImageFromWebcam(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]¶
Bases:
easycv.datasets.detection.pipelines.mm_transforms.LoadImageFromFile
Load an image from webcam.
Similar to LoadImageFromFile, but the image read from webcam is in results['img'].
- class easycv.datasets.detection.pipelines.mm_transforms.LoadMultiChannelImageFromFiles(to_float32=False, color_type='unchanged', file_client_args={'backend': 'disk'})[source]¶
Bases:
object
Load multi-channel images from a list of separate channel files. Required keys are “img_prefix” and “img_info” (a dict that must contain the key “filename”, which is expected to be a list of filenames). Added or updated keys are “filename”, “img”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0) and “img_norm_cfg” (means=0 and stds=1). :param to_float32: Whether to convert the loaded image to a float32
numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.
- Parameters
color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
- class easycv.datasets.detection.pipelines.mm_transforms.LoadAnnotations(with_bbox=True, with_label=True, with_mask=False, with_seg=False, poly2mask=True, file_client_args={'backend': 'disk'})[source]¶
Bases:
object
Load multiple types of annotations. :param with_bbox: Whether to parse and load the bbox annotation.
Default: True.
- Parameters
with_label (bool) – Whether to parse and load the label annotation. Default: True.
with_mask (bool) – Whether to parse and load the mask annotation. Default: False.
with_seg (bool) – Whether to parse and load the semantic segmentation annotation. Default: False.
poly2mask (bool) – Whether to convert the instance masks from polygons to bitmaps. Default: True.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
- class easycv.datasets.detection.pipelines.mm_transforms.LoadPanopticAnnotations(with_bbox=True, with_label=True, with_mask=True, with_seg=True, file_client_args={'backend': 'disk'})[source]¶
Bases:
easycv.datasets.detection.pipelines.mm_transforms.LoadAnnotations
Load multiple types of panoptic annotations.
- Parameters
with_bbox (bool) – Whether to parse and load the bbox annotation. Default: True.
with_label (bool) – Whether to parse and load the label annotation. Default: True.
with_mask (bool) – Whether to parse and load the mask annotation. Default: True.
with_seg (bool) – Whether to parse and load the semantic segmentation annotation. Default: True.
file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').
- class easycv.datasets.detection.pipelines.mm_transforms.MMMultiScaleFlipAug(transforms, img_scale=None, scale_factor=None, flip=False, flip_direction='horizontal')[source]¶
Bases:
object
Test-time augmentation with multiple scales and flipping. An example configuration is as follows:
img_scale=[(1333, 400), (1333, 800)], flip=True, transforms=[ dict(type='Resize', keep_ratio=True), dict(type='RandomFlip'), dict(type='Normalize', **img_norm_cfg), dict(type='Pad', size_divisor=32), dict(type='ImageToTensor', keys=['img']), dict(type='Collect', keys=['img']), ]
After MultiScaleFLipAug with above configuration, the results are wrapped into lists of the same length as followed: .. code-block:
dict( img=[...], img_shape=[...], scale=[(1333, 400), (1333, 400), (1333, 800), (1333, 800)] flip=[False, True, False, True] ... )
- Parameters
transforms (list[dict]) – Transforms to apply in each augmentation.
img_scale (tuple | list[tuple] | None) – Images scales for resizing.
scale_factor (float | list[float] | None) – Scale factors for resizing.
flip (bool) – Whether to apply flip augmentation. Default: False.
flip_direction (str | list[str]) – Flip augmentation directions, options are “horizontal”, “vertical” and “diagonal”. If flip_direction is a list, multiple flip augmentations will be applied. It has no effect when flip == False. Default: “horizontal”.
- class easycv.datasets.detection.pipelines.mm_transforms.MMFilterAnnotations(min_gt_bbox_wh=(1.0, 1.0), min_gt_mask_area=1, by_box=True, by_mask=False, keep_empty=True)[source]¶
Bases:
object
Filter invalid annotations.
- Parameters
min_gt_bbox_wh (tuple[float]) – Minimum width and height of ground truth boxes. Default: (1., 1.)
min_gt_mask_area (int) – Minimum foreground area of ground truth masks. Default: 1
by_box (bool) – Filter instances with bounding boxes not meeting the min_gt_bbox_wh threshold. Default: True
by_mask (bool) – Filter instances with masks not meeting the min_gt_mask_area threshold. Default: False
keep_empty (bool) – Whether to return None when the results become empty after filtering. Default: True
Submodules¶
easycv.datasets.detection.mix module¶
- class easycv.datasets.detection.mix.DetImagesMixDataset(data_source, pipeline, dynamic_scale=None, skip_type_keys=None, profiling=False, classes=None, yolo_format=True, label_padding=True)[source]¶
Bases:
Generic
[torch.utils.data.dataset.T_co]
A wrapper of multiple images mixed dataset.
Suitable for training on multiple images mixed data augmentation like mosaic and mixup. For the augmentation pipeline of mixed image data, the get_indexes method needs to be provided to obtain the image indexes, and you can set skip_flags to change the pipeline running process. At the same time, we provide the dynamic_scale parameter to dynamically change the output image size.
output boxes format: cx, cy, w, h
- Parameters
data_source (DetSourceCoco) – The dataset to be mixed.
pipeline (Sequence[dict]) – Sequence of transform object or config dict to be composed.
dynamic_scale (tuple[int], optional) – The image scale can be changed dynamically. Default to None.
skip_type_keys (list[str], optional) – Sequence of transform type strings to be skipped in the pipeline. Default to None.
label_padding – if True, pad the output labels to a fixed shape of [N, 120, 5].
- __init__(data_source, pipeline, dynamic_scale=None, skip_type_keys=None, profiling=False, classes=None, yolo_format=True, label_padding=True)[source]¶
Args:
data_source: Data_source config dict
pipeline: Pipeline config list
profiling: If set True, will print pipeline time
classes: A list of class names, used in evaluation for result and groundtruth visualization
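A minimal config sketch of declaring such a dataset for mosaic/mixup training (a hedged example, not an exact EasyCV config: the source arguments, transform names and scales below are illustrative placeholders):
train_dataset = dict(
    type='DetImagesMixDataset',
    data_source=dict(
        type='DetSourceCoco',
        ann_file='data/coco/annotations/instances_train2017.json',
        img_prefix='data/coco/train2017/'),
    pipeline=[
        dict(type='MMMosaic', img_scale=(640, 640)),
        dict(type='MMRandomAffine'),
        dict(type='MMMixUp', img_scale=(640, 640)),
    ],
    dynamic_scale=(640, 640),
    label_padding=True)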
- update_skip_type_keys(skip_type_keys)[source]¶
Update skip_type_keys. It is called by an external hook.
- Parameters
skip_type_keys (list[str], optional) – Sequence of transform type strings to be skipped in the pipeline.
- update_dynamic_scale(dynamic_scale)[source]¶
Update dynamic_scale. It is called by an external hook.
- Parameters
dynamic_scale (tuple[int]) – The image scale can be changed dynamically.
- results2json(results, outfile_prefix)[source]¶
Dump the detection results to a COCO style json file.
There are 3 types of results: proposals, bbox predictions, mask predictions, and they have different data types. This method will automatically recognize the type, and dump them to json files.
- Parameters
results (list[list | tuple | ndarray]) – Testing results of the dataset.
outfile_prefix (str) – The filename prefix of the json files. If the prefix is “somepath/xxx”, the json files will be named “somepath/xxx.bbox.json”, “somepath/xxx.segm.json”, “somepath/xxx.proposal.json”.
- Returns
dict[str, str]: Possible keys are “bbox”, “segm”, “proposal”, and values are corresponding filenames.
- Return type
dict[str, str]
- format_results(results, jsonfile_prefix=None, **kwargs)[source]¶
Format the results to json (standard format for COCO evaluation).
- Parameters
results (list[tuple | numpy.ndarray]) – Testing results of the dataset.
jsonfile_prefix (str | None) – The prefix of json files. It includes the file path and the prefix of filename, e.g., “a/b/prefix”. If not specified, a temp file will be created. Default: None.
- Returns
(result_files, tmp_dir), result_files is a dict containing the json filepaths, tmp_dir is the temporary directory created for saving json files when jsonfile_prefix is not specified.
- Return type
tuple
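A short usage sketch, assuming dataset is a built detection dataset and results follows the format described above (tmp_dir is assumed to be a tempfile.TemporaryDirectory, as in the mmdet convention):
# dump COCO-style json files next to the given prefix
result_files, tmp_dir = dataset.format_results(results, jsonfile_prefix='work_dirs/eval/det')
print(result_files)  # e.g. {'bbox': 'work_dirs/eval/det.bbox.json', ...}
if tmp_dir is not None:
    tmp_dir.cleanup()  # release the temporary directory if one was created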
easycv.datasets.detection.raw module¶
- class easycv.datasets.detection.raw.DetDataset(data_source, pipeline, profiling=False, classes=None)[source]¶
Bases:
Generic
[torch.utils.data.dataset.T_co]
Dataset for Detection
- __init__(data_source, pipeline, profiling=False, classes=None)[source]¶
- Parameters
data_source – Data_source config dict
pipeline – Pipeline config list
profiling – If set True, will print pipeline time
classes – A list of class names, used in evaluation for result and groundtruth visualization
- evaluate(results, evaluators=None, logger=None)[source]¶
Evaluates the detection boxes.
- Parameters
results – A dictionary containing:
- detection_boxes: List of length number of test images. Float32 numpy array of shape [num_boxes, 4] and format [ymin, xmin, ymax, xmax] in absolute image coordinates.
- detection_scores: List of length number of test images, detection scores for the boxes, float32 numpy array of shape [num_boxes].
- detection_classes: List of length number of test images, integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.
- img_metas: List of length number of test images, dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.
evaluators – evaluators to calculate metric with results and groundtruth_dict
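A sketch of the expected results structure (shapes follow the description above; the values are dummies and the evaluator below is a placeholder):
import numpy as np

results = dict(
    detection_boxes=[np.array([[10., 20., 110., 220.]], dtype=np.float32)],  # [ymin, xmin, ymax, xmax]
    detection_scores=[np.array([0.9], dtype=np.float32)],
    detection_classes=[np.array([1], dtype=np.int32)],  # 1-indexed class ids
    img_metas=[dict(filename='000001.jpg', img_shape=(640, 640, 3), scale_factor=1.0)],
)
# eval_res = dataset.evaluate(results, evaluators=[my_coco_evaluator])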
- visualize(results, vis_num=10, score_thr=0.3, **kwargs)[source]¶
Visualize the model output on validation data.
- Parameters
results – A dictionary containing:
- detection_boxes: List of length number of test images. Float32 numpy array of shape [num_boxes, 4] and format [ymin, xmin, ymax, xmax] in absolute image coordinates.
- detection_scores: List of length number of test images, detection scores for the boxes, float32 numpy array of shape [num_boxes].
- detection_classes: List of length number of test images, integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.
- img_metas: List of length number of test images, dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.
vis_num – number of images to visualize
score_thr – The threshold to filter boxes; boxes with scores greater than score_thr will be kept.
- Returns
A dictionary containing:
images: Visualized images.
img_metas: List of length number of test images, dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.
easycv.datasets.loader package¶
- class easycv.datasets.loader.GroupSampler(dataset, samples_per_gpu=1)[source]¶
Bases:
Generic
[torch.utils.data.sampler.T_co
]
- class easycv.datasets.loader.DistributedGroupSampler(dataset, samples_per_gpu=1, seed=0, num_replicas=None, rank=None)[source]¶
Bases:
Generic
[torch.utils.data.sampler.T_co]
Sampler that restricts data loading to a subset of the dataset. It is especially useful in conjunction with torch.nn.parallel.DistributedDataParallel. In such a case, each process can pass a DistributedSampler instance as a DataLoader sampler, and load a subset of the original dataset that is exclusive to it.
Note: Dataset is assumed to be of constant size.
- Parameters
dataset – Dataset used for sampling.
seed (int, Optional) – The seed. Default to 0.
num_replicas (optional) – Number of processes participating in distributed training.
rank (optional) – Rank of the current process within num_replicas.
- easycv.datasets.loader.build_dataloader(dataset, imgs_per_gpu, workers_per_gpu, num_gpus=1, dist=True, shuffle=True, replace=False, seed=None, reuse_worker_cache=False, odps_config=None, persistent_workers=False, collate_hooks=None, use_repeated_augment_sampler=False, sampler=None, pin_memory=False, **kwargs)[source]¶
Build PyTorch DataLoader. In distributed training, each GPU/process has a dataloader. In non-distributed training, there is only one dataloader for all GPUs.
- Parameters
dataset (Dataset) – A PyTorch dataset.
imgs_per_gpu (int) – Number of images on each GPU, i.e., batch size of each GPU.
workers_per_gpu (int) – How many subprocesses to use for data loading for each GPU.
num_gpus (int) – Number of GPUs. Only used in non-distributed training.
dist (bool) – Distributed training/test or not. Default: True.
shuffle (bool) – Whether to shuffle the data at every epoch. Default: True.
replace (bool) – Whether to sample with replacement in random shuffle. It only takes effect when shuffle is True.
seed (int, Optional) – The seed. Default to None.
reuse_worker_cache (bool) – If set true, will reuse worker process so that cached data in worker process can be reused.
persistent_workers (bool) – For PyTorch >= 1.7, persistent_workers=True can be used to avoid re-creating the data workers before each epoch, which speeds up the start of each epoch.
use_repeated_augment_sampler (bool) – If set true, it will use RASampler. Default: False.
kwargs – any keyword argument to be used to initialize DataLoader
- Returns
A PyTorch dataloader.
- Return type
DataLoader
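A minimal usage sketch, assuming dataset is an already built EasyCV dataset instance (use dist=True and a distributed launcher for multi-GPU training):
from easycv.datasets.loader import build_dataloader

data_loader = build_dataloader(
    dataset,
    imgs_per_gpu=16,      # batch size per GPU
    workers_per_gpu=4,    # dataloader worker processes per GPU
    num_gpus=1,
    dist=False,
    shuffle=True,
    seed=42)
for data_batch in data_loader:
    ...  # feed data_batch to the model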
- class easycv.datasets.loader.DistributedGivenIterationSampler(dataset, total_iter, batch_size, num_replicas=None, rank=None, last_iter=- 1)[source]¶
Bases:
Generic
[torch.utils.data.sampler.T_co
]
- class easycv.datasets.loader.DistributedMPSampler(dataset, num_replicas=None, rank=None, shuffle=True, split_huge_listfile_byrank=False, **kwargs)[source]¶
Bases:
torch.utils.data.sampler.Sampler
[torch.utils.data.distributed.T_co]
- __init__(dataset, num_replicas=None, rank=None, shuffle=True, split_huge_listfile_byrank=False, **kwargs)[source]¶
A distributed sampler which supports sampling m instances from one class at a time for classification datasets.
dataset: pytorch dataset object
num_replicas (optional): Number of processes participating in distributed training.
rank (optional): Rank of the current process within num_replicas.
shuffle (optional): If true (default), the sampler will shuffle the indices.
split_huge_listfile_byrank: if split, return all indices for each rank, because the list for each rank has been split before building the dataset in distributed training.
- class easycv.datasets.loader.RASampler(dataset, num_replicas=None, rank=None, shuffle=True, num_repeats: int = 3, **kwargs)[source]¶
Bases:
Generic
[torch.utils.data.sampler.T_co]
Sampler that restricts data loading to a subset of the dataset for distributed training, with repeated augmentation. It ensures that each augmented version of a sample will be visible to a different process (GPU). Heavily based on torch.utils.data.DistributedSampler.
- class easycv.datasets.loader.DistributedSampler(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, replace=False, split_huge_listfile_byrank=False)[source]¶
Bases:
torch.utils.data.sampler.Sampler
[torch.utils.data.distributed.T_co]
- __init__(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, replace=False, split_huge_listfile_byrank=False)[source]¶
A distributed sampler which supports sampling m instances from one class at a time for classification datasets.
- Parameters
dataset – pytorch dataset object
num_replicas – Number of processes participating in distributed training.
rank (optional) – Rank of the current process within num_replicas.
shuffle (optional) – If true (default), sampler will shuffle the indices
seed (int, Optional) – The seed. Default to 0.
split_huge_listfile_byrank – if split, return all indices for each rank, because the list for each rank has been split before building the dataset in distributed training
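A minimal sketch of plugging the sampler into a plain PyTorch DataLoader (build_dataloader above normally does this for you; world_size and rank are assumed to come from your launcher, and set_epoch is assumed to behave like torch's DistributedSampler):
from torch.utils.data import DataLoader
from easycv.datasets.loader import DistributedSampler

sampler = DistributedSampler(dataset, num_replicas=world_size, rank=rank, shuffle=True, seed=0)
loader = DataLoader(dataset, batch_size=16, sampler=sampler, num_workers=4)
for epoch in range(max_epochs):
    sampler.set_epoch(epoch)  # re-seed shuffling each epoch
    for data_batch in loader:
        ...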
Submodules¶
easycv.datasets.loader.build_loader module¶
- easycv.datasets.loader.build_loader.build_dataloader(dataset, imgs_per_gpu, workers_per_gpu, num_gpus=1, dist=True, shuffle=True, replace=False, seed=None, reuse_worker_cache=False, odps_config=None, persistent_workers=False, collate_hooks=None, use_repeated_augment_sampler=False, sampler=None, pin_memory=False, **kwargs)[source]¶
Build PyTorch DataLoader. In distributed training, each GPU/process has a dataloader. In non-distributed training, there is only one dataloader for all GPUs.
- Parameters
dataset (Dataset) – A PyTorch dataset.
imgs_per_gpu (int) – Number of images on each GPU, i.e., batch size of each GPU.
workers_per_gpu (int) – How many subprocesses to use for data loading for each GPU.
num_gpus (int) – Number of GPUs. Only used in non-distributed training.
dist (bool) – Distributed training/test or not. Default: True.
shuffle (bool) – Whether to shuffle the data at every epoch. Default: True.
replace (bool) – Whether to sample with replacement in random shuffle. It only takes effect when shuffle is True.
seed (int, Optional) – The seed. Default to None.
reuse_worker_cache (bool) – If set true, will reuse worker process so that cached data in worker process can be reused.
persistent_workers (bool) – For PyTorch >= 1.7, persistent_workers=True can be used to avoid re-creating the data workers before each epoch, which speeds up the start of each epoch.
use_repeated_augment_sampler (bool) – If set true, it will use RASampler. Default: False.
kwargs – any keyword argument to be used to initialize DataLoader
- Returns
A PyTorch dataloader.
- Return type
DataLoader
- easycv.datasets.loader.build_loader.worker_init_fn(worker_id, num_workers, rank, seed, odps_config=None)[source]¶
- class easycv.datasets.loader.build_loader.InfiniteDataLoader(*args, **kwargs)[source]¶
Bases:
Generic
[torch.utils.data.dataloader.T_co]
Dataloader that reuses workers (see https://github.com/pytorch/pytorch/issues/15849). Uses the same syntax as the vanilla DataLoader.
- dataset: torch.utils.data.dataset.Dataset[torch.utils.data.dataloader.T_co]¶
- batch_size: Optional[int]¶
- num_workers: int¶
- pin_memory: bool¶
- drop_last: bool¶
- timeout: float¶
- sampler: Union[torch.utils.data.sampler.Sampler, Iterable]¶
- pin_memory_device: str¶
- prefetch_factor: int¶
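A short usage sketch; since it accepts the same constructor arguments as torch.utils.data.DataLoader, it can be dropped in where worker start-up cost between epochs matters:
from easycv.datasets.loader.build_loader import InfiniteDataLoader

loader = InfiniteDataLoader(dataset, batch_size=16, num_workers=4, shuffle=True)
for epoch in range(max_epochs):
    for data_batch in loader:  # worker processes are kept alive across epochs
        ...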
easycv.datasets.loader.sampler module¶
- class easycv.datasets.loader.sampler.DistributedMPSampler(dataset, num_replicas=None, rank=None, shuffle=True, split_huge_listfile_byrank=False, **kwargs)[source]¶
Bases:
torch.utils.data.sampler.Sampler
[torch.utils.data.distributed.T_co]
- __init__(dataset, num_replicas=None, rank=None, shuffle=True, split_huge_listfile_byrank=False, **kwargs)[source]¶
A distributed sampler which supports sampling m instances from one class at a time for classification datasets.
dataset: pytorch dataset object
num_replicas (optional): Number of processes participating in distributed training.
rank (optional): Rank of the current process within num_replicas.
shuffle (optional): If true (default), the sampler will shuffle the indices.
split_huge_listfile_byrank: if split, return all indices for each rank, because the list for each rank has been split before building the dataset in distributed training.
- class easycv.datasets.loader.sampler.DistributedSampler(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, replace=False, split_huge_listfile_byrank=False)[source]¶
Bases:
torch.utils.data.sampler.Sampler
[torch.utils.data.distributed.T_co]
- __init__(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, replace=False, split_huge_listfile_byrank=False)[source]¶
A distributed sampler which supports sampling m instances from one class at a time for classification datasets.
- Parameters
dataset – pytorch dataset object
num_replicas – Number of processes participating in distributed training.
rank (optional) – Rank of the current process within num_replicas.
shuffle (optional) – If true (default), sampler will shuffle the indices
seed (int, Optional) – The seed. Default to 0.
split_huge_listfile_byrank – if split, return all indices for each rank, because the list for each rank has been split before building the dataset in distributed training
- class easycv.datasets.loader.sampler.GroupSampler(dataset, samples_per_gpu=1)[source]¶
Bases:
Generic
[torch.utils.data.sampler.T_co
]
- class easycv.datasets.loader.sampler.DistributedGroupSampler(dataset, samples_per_gpu=1, seed=0, num_replicas=None, rank=None)[source]¶
Bases:
Generic
[torch.utils.data.sampler.T_co]
Sampler that restricts data loading to a subset of the dataset. It is especially useful in conjunction with torch.nn.parallel.DistributedDataParallel. In such a case, each process can pass a DistributedSampler instance as a DataLoader sampler, and load a subset of the original dataset that is exclusive to it.
Note: Dataset is assumed to be of constant size.
- Parameters
dataset – Dataset used for sampling.
seed (int, Optional) – The seed. Default to 0.
num_replicas (optional) – Number of processes participating in distributed training.
rank (optional) – Rank of the current process within num_replicas.
- class easycv.datasets.loader.sampler.DistributedGivenIterationSampler(dataset, total_iter, batch_size, num_replicas=None, rank=None, last_iter=- 1)[source]¶
Bases:
Generic
[torch.utils.data.sampler.T_co
]
- class easycv.datasets.loader.sampler.RASampler(dataset, num_replicas=None, rank=None, shuffle=True, num_repeats: int = 3, **kwargs)[source]¶
Bases:
Generic
[torch.utils.data.sampler.T_co]
Sampler that restricts data loading to a subset of the dataset for distributed training, with repeated augmentation. It ensures that each augmented version of a sample will be visible to a different process (GPU). Heavily based on torch.utils.data.DistributedSampler.
easycv.datasets.pose package¶
- class easycv.datasets.pose.PoseTopDownDataset(data_source, pipeline, profiling=False)[source]¶
Bases:
Generic
[torch.utils.data.dataset.T_co]
PoseTopDownDataset dataset for top-down pose estimation. The dataset loads raw features and applies the specified transforms to return a dict containing the image tensors and other information.
- Parameters
data_source – Data_source config dict
pipeline – Pipeline config list
profiling – If set True, will print pipeline time
- class easycv.datasets.pose.HandCocoWholeBodyDataset(data_source, pipeline, profiling=False)[source]¶
Bases:
Generic
[torch.utils.data.dataset.T_co]
CocoWholeBodyDataset for top-down hand pose estimation.
- Parameters
data_source – Data_source config dict
pipeline – Pipeline config list
profiling – If set True, will print pipeline time
- class easycv.datasets.pose.WholeBodyCocoTopDownDataset(data_source, pipeline, profiling=False)[source]¶
Bases:
Generic
[torch.utils.data.dataset.T_co]
CocoWholeBodyDataset dataset for top-down pose estimation.
- Parameters
data_source – Data_source config dict
pipeline – Pipeline config list
profiling – If set True, will print pipeline time
Subpackages¶
easycv.datasets.pose.data_sources package¶
- class easycv.datasets.pose.data_sources.PoseTopDownSourceCoco(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]¶
Bases:
easycv.datasets.pose.data_sources.top_down.PoseTopDownSource
CocoSource for top-down pose estimation.
Microsoft COCO: Common Objects in Context, ECCV 2014. More details can be found in the paper.
The source loads raw features to build a data meta object containing the image info, annotation info and others.
COCO keypoint indexes:
0: 'nose', 1: 'left_eye', 2: 'right_eye', 3: 'left_ear', 4: 'right_ear', 5: 'left_shoulder', 6: 'right_shoulder', 7: 'left_elbow', 8: 'right_elbow', 9: 'left_wrist', 10: 'right_wrist', 11: 'left_hip', 12: 'right_hip', 13: 'left_knee', 14: 'right_knee', 15: 'left_ankle', 16: 'right_ankle'
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held. Default: None.
data_cfg (dict) – config
dataset_info (DatasetInfo) – A class containing all dataset info.
test_mode (bool) – Store True when building test or validation dataset. Default: False.
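A minimal config sketch for this source (a hedged example: the annotation paths are placeholders, and the data_cfg keys shown are the usual top-down settings, not an exhaustive list):
data_source = dict(
    type='PoseTopDownSourceCoco',
    ann_file='data/coco/annotations/person_keypoints_train2017.json',
    img_prefix='data/coco/train2017/',
    data_cfg=dict(
        image_size=[192, 256],   # model input size (w, h)
        heatmap_size=[48, 64],
        num_joints=17),
    test_mode=False)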
- class easycv.datasets.pose.data_sources.PoseTopDownSource(ann_file, img_prefix, data_cfg, dataset_info, coco_style=True, test_mode=False)[source]¶
Bases:
object
Class for keypoint 2D top-down pose estimation with single-view RGB image as the data source.
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held. Default: None.
data_cfg (dict) – config
dataset_info (DatasetInfo) – A class containing all dataset info.
coco_style (bool) – Whether the annotation json is coco-style. Default: True
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.HandCocoPoseTopDownSource(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]¶
Bases:
easycv.datasets.pose.data_sources.top_down.PoseTopDownSource
Coco Whole-Body-Hand Source for top-down hand pose estimation.
“Whole-Body Human Pose Estimation in the Wild”, ECCV 2020. More details can be found in the paper.
The dataset loads raw features and applies the specified transforms to return a dict containing the image tensors and other information.
COCO-WholeBody Hand keypoint indexes:
0: 'wrist', 1: 'thumb1', 2: 'thumb2', 3: 'thumb3', 4: 'thumb4', 5: 'forefinger1', 6: 'forefinger2', 7: 'forefinger3', 8: 'forefinger4', 9: 'middle_finger1', 10: 'middle_finger2', 11: 'middle_finger3', 12: 'middle_finger4', 13: 'ring_finger1', 14: 'ring_finger2', 15: 'ring_finger3', 16: 'ring_finger4', 17: 'pinky_finger1', 18: 'pinky_finger2', 19: 'pinky_finger3', 20: 'pinky_finger4'
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held. Default: None.
data_cfg (dict) – config
dataset_info (DatasetInfo) – A class containing all dataset info.
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.WholeBodyCocoTopDownSource(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]¶
Bases:
easycv.datasets.pose.data_sources.top_down.PoseTopDownSource
CocoWholeBodyDataset dataset for top-down pose estimation.
“Whole-Body Human Pose Estimation in the Wild”, ECCV 2020. More details can be found in the paper.
The dataset loads raw features and applies the specified transforms to return a dict containing the image tensors and other information.
COCO-WholeBody keypoint indexes:
0-16: 17 body keypoints, 17-22: 6 foot keypoints, 23-90: 68 face keypoints, 91-132: 42 hand keypoints In total, we have 133 keypoints for wholebody pose estimation.
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held. Default: None.
data_cfg (dict) – config
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.PoseTopDownSourceCoco2017(data_cfg, path='', download=True, split='train', dataset_info=None, test_mode=False)[source]¶
Bases:
easycv.datasets.pose.data_sources.coco.PoseTopDownSourceCoco
- Parameters
path – target dir
download – whether to download
split – train or val
data_cfg (dict) – config
dataset_info (DatasetInfo) – A class containing all dataset info.
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.PoseTopDownSourceCrowdPose(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False, **kwargs)[source]¶
Bases:
easycv.datasets.pose.data_sources.top_down.PoseTopDownSource
CrowdPose keypoint indexes:
0 ‘left_shoulder’, 1 ‘right_shoulder’, 2 ‘left_elbow’, 3 ‘right_elbow’, 4 ‘left_wrist’, 5 ‘right_wrist’, 6 ‘left_hip’, 7 ‘right_hip’, 8 ‘left_knee’, 9 ‘right_knee’, 10 ‘left_ankle’, 11 ‘right_ankle’, 12 ‘head’, 13 ‘neck’
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held. Default: None.
data_cfg (dict) – config
dataset_info (DatasetInfo) – A class containing all dataset info.
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.PoseTopDownSourceChHuman(ann_file, img_prefix, data_cfg, subset=None, dataset_info=None, test_mode=False, **kwargs)[source]¶
Bases:
easycv.datasets.pose.data_sources.top_down.PoseTopDownSource
Oc Human Source for top-down pose estimation.
The source loads raw features to build a data meta object containing the image info, annotation info and others.
Oc Human keypoint indexes:
0: 'nose', 1: 'left_eye', 2: 'right_eye', 3: 'left_ear', 4: 'right_ear', 5: 'left_shoulder', 6: 'right_shoulder', 7: 'left_elbow', 8: 'right_elbow', 9: 'left_wrist', 10: 'right_wrist', 11: 'left_hip', 12: 'right_hip', 13: 'left_knee', 14: 'right_knee', 15: 'left_ankle', 16: 'right_ankle'
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held. Default: None.
data_cfg (dict) – config
subset – Applicable to non-coco or coco-style datasets: for non-coco-style data, set subset to 'train', 'val' or 'test'; for coco-style data, set subset to None.
dataset_info (DatasetInfo) – A class containing all dataset info.
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.PoseTopDownSourceMpii(data_cfg, path='/home/docs/.cache/easycv/', download=False, dataset_info=None, test_mode=False, **kwargs)[source]¶
Bases:
easycv.datasets.pose.data_sources.top_down.PoseTopDownSource
MPII Source for top-down pose estimation.
The source loads raw features to build a data meta object containing the image info, annotation info and others.
MPII keypoint indexes:
0: 'right_ankle', 1: 'right_knee', 2: 'right_hip', 3: 'left_hip', 4: 'right_ear', 5: 'left_ankle', 6: 'pelvis', 7: 'thorax', 8: 'neck', 9: 'head', 10: 'right_wrist', 11: 'right_elbow', 12: 'right_shoulder', 13: 'left_shoulder', 14: 'left_elbow', 15: 'left_wrist'
- Parameters
data_cfg (dict) – config
path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading
download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used
dataset_info (DatasetInfo) – A class containing all dataset info.
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.coco.PoseTopDownSourceCoco(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]¶
Bases:
easycv.datasets.pose.data_sources.top_down.PoseTopDownSource
CocoSource for top-down pose estimation.
Microsoft COCO: Common Objects in Context, ECCV 2014. More details can be found in the paper.
The source loads raw features to build a data meta object containing the image info, annotation info and others.
COCO keypoint indexes:
0: 'nose', 1: 'left_eye', 2: 'right_eye', 3: 'left_ear', 4: 'right_ear', 5: 'left_shoulder', 6: 'right_shoulder', 7: 'left_elbow', 8: 'right_elbow', 9: 'left_wrist', 10: 'right_wrist', 11: 'left_hip', 12: 'right_hip', 13: 'left_knee', 14: 'right_knee', 15: 'left_ankle', 16: 'right_ankle'
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held. Default: None.
data_cfg (dict) – config
dataset_info (DatasetInfo) – A class containing all dataset info.
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.coco.PoseTopDownSourceCoco2017(data_cfg, path='', download=True, split='train', dataset_info=None, test_mode=False)[source]¶
Bases:
easycv.datasets.pose.data_sources.coco.PoseTopDownSourceCoco
- Parameters
path – target dir
download – whether to download
split – train or val
data_cfg (dict) – config
dataset_info (DatasetInfo) – A class containing all dataset info.
test_mode (bool) – Store True when building test or validation dataset. Default: False.
- class easycv.datasets.pose.data_sources.top_down.PoseTopDownSource(ann_file, img_prefix, data_cfg, dataset_info, coco_style=True, test_mode=False)[source]¶
Bases:
object
Class for keypoint 2D top-down pose estimation with single-view RGB image as the data source.
- Parameters
ann_file (str) – Path to the annotation file.
img_prefix (str) – Path to a directory where images are held. Default: None.
data_cfg (dict) – config
dataset_info (DatasetInfo) – A class containing all dataset info.
coco_style (bool) – Whether the annotation json is coco-style. Default: True
test_mode (bool) – Store True when building test or validation dataset. Default: False.
easycv.datasets.pose.pipelines package¶
- class easycv.datasets.pose.pipelines.PoseCollect(keys, meta_keys, meta_name='img_metas')[source]¶
Bases:
object
Collect data from the loader relevant to the specific task.
This keeps the items in keys as it is, and collects items in meta_keys into a meta item called meta_name. This is usually the last stage of the data loader pipeline. For example, when keys=’imgs’, meta_keys=(‘filename’, ‘label’, ‘original_shape’), meta_name=’img_metas’, the results will be a dict with keys ‘imgs’ and ‘img_metas’, where ‘img_metas’ is a DataContainer of another dict with keys ‘filename’, ‘label’, ‘original_shape’.
- Parameters
keys (Sequence[str|tuple]) – Required keys to be collected. If a tuple (key, key_new) is given as an element, the item retrieved by key will be renamed as key_new in collected data.
meta_name (str) – The name of the key that contains meta information. This key is always populated. Default: “img_metas”.
meta_keys (Sequence[str|tuple]) – Keys that are collected under meta_name. The contents of the meta_name dictionary depends on meta_keys.
- class easycv.datasets.pose.pipelines.TopDownRandomFlip(flip_prob=0.5)[source]¶
Bases:
object
Data augmentation with random image flip.
Required keys: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘center’ and ‘ann_info’. Modifies key: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘center’ and ‘flipped’.
- Parameters
flip (bool) – Option to perform random flip.
flip_prob (float) – Probability of flip.
- class easycv.datasets.pose.pipelines.TopDownHalfBodyTransform(num_joints_half_body=8, prob_half_body=0.3)[source]¶
Bases:
object
Data augmentation with half-body transform. Keep only the upper body or the lower body at random.
Required keys: ‘joints_3d’, ‘joints_3d_visible’, and ‘ann_info’. Modifies key: ‘scale’ and ‘center’.
- Parameters
num_joints_half_body (int) – Threshold for performing the half-body transform. If the body has fewer joints than num_joints_half_body, this step is ignored.
prob_half_body (float) – Probability of half-body transform.
- class easycv.datasets.pose.pipelines.TopDownGetRandomScaleRotation(rot_factor=40, scale_factor=0.5, rot_prob=0.6)[source]¶
Bases:
object
Data augmentation with random scaling & rotating.
Required key: ‘scale’. Modifies key: ‘scale’ and ‘rotation’.
- Parameters
rot_factor (int) – Rotating to [-2*rot_factor, 2*rot_factor].
scale_factor (float) – Scaling to [1-scale_factor, 1+scale_factor].
rot_prob (float) – Probability of random rotation.
- class easycv.datasets.pose.pipelines.TopDownAffine(use_udp=False)[source]¶
Bases:
object
Affine transform the image to produce the model input.
Required keys: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’, ‘scale’, ‘rotation’ and ‘center’. Modified keys: ‘img’, ‘joints_3d’, and ‘joints_3d_visible’.
- Parameters
use_udp (bool) – To use unbiased data processing. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
- class easycv.datasets.pose.pipelines.TopDownGenerateTarget(sigma=2, kernel=(11, 11), valid_radius_factor=0.0546875, target_type='GaussianHeatmap', encoding='MSRA', unbiased_encoding=False)[source]¶
Bases:
object
Generate the target heatmap.
Required keys: ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’. Modified keys: ‘target’, and ‘target_weight’.
- Parameters
sigma – Sigma of heatmap gaussian for ‘MSRA’ approach.
kernel – Kernel of heatmap gaussian for ‘Megvii’ approach.
encoding (str) – Approach to generate target heatmaps. Currently supported approaches: ‘MSRA’, ‘Megvii’, ‘UDP’. Default:’MSRA’
unbiased_encoding (bool) – Option to use unbiased encoding methods. Paper ref: Zhang et al. Distribution-Aware Coordinate Representation for Human Pose Estimation (CVPR 2020).
keypoint_pose_distance – Keypoint pose distance for UDP. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
target_type (str) – supported targets: ‘GaussianHeatmap’, ‘CombinedTarget’. Default:’GaussianHeatmap’ CombinedTarget: The combination of classification target (response map) and regression target (offset map). Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
- class easycv.datasets.pose.pipelines.TopDownGenerateTargetRegression[source]¶
Bases:
object
Generate the target regression vector (coordinates).
Required keys: ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’. Modified keys: ‘target’, and ‘target_weight’.
- class easycv.datasets.pose.pipelines.TopDownRandomTranslation(trans_factor=0.15, trans_prob=1.0)[source]¶
Bases:
object
Data augmentation with random translation.
Required key: ‘scale’ and ‘center’. Modifies key: ‘center’.
Notes
bbox height: H bbox width: W
- Parameters
trans_factor (float) – Translating center to [-trans_factor, trans_factor] * [W, H] + center.
trans_prob (float) – Probability of random translation.
- class easycv.datasets.pose.pipelines.TopDownRandomShiftBboxCenter(shift_factor: float = 0.16, prob: float = 0.3)[source]¶
Bases:
object
Random shift the bbox center.
Required key: ‘center’, ‘scale’
Modifies key: ‘center’
- Parameters
shift_factor (float) – The factor to control the shift range, which is scale*pixel_std*scale_factor. Default: 0.16
prob (float) – Probability of applying random shift. Default: 0.3
- pixel_std: float = 200.0¶
- class easycv.datasets.pose.pipelines.TopDownGetBboxCenterScale(padding: float = 1.25)[source]¶
Bases:
object
Convert bbox from [x, y, w, h] to center and scale.
The center is the coordinates of the bbox center, and the scale is the bbox width and height normalized by a scale factor.
Required key: ‘bbox’, ‘ann_info’
Modifies key: ‘center’, ‘scale’
- Parameters
padding (float) – bbox padding scale that will be multiplied to scale. Default: 1.25
- pixel_std: float = 200.0¶
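A sketch of a typical top-down training pipeline assembled from the transforms above (a hedged example: the ordering and hyper-parameters are illustrative, and tensor conversion/normalization steps from your config should be added where appropriate):
train_pipeline = [
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(type='TopDownHalfBodyTransform', num_joints_half_body=8, prob_half_body=0.3),
    dict(type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(type='TopDownGenerateTarget', sigma=2),
    dict(
        type='PoseCollect',
        keys=['img', 'target', 'target_weight'],
        meta_keys=['image_file', 'joints_3d', 'joints_3d_visible', 'center', 'scale', 'rotation', 'flip_pairs']),
]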
- class easycv.datasets.pose.pipelines.transforms.PoseCollect(keys, meta_keys, meta_name='img_metas')[source]¶
Bases:
object
Collect data from the loader relevant to the specific task.
This keeps the items in keys as it is, and collects items in meta_keys into a meta item called meta_name. This is usually the last stage of the data loader pipeline. For example, when keys=’imgs’, meta_keys=(‘filename’, ‘label’, ‘original_shape’), meta_name=’img_metas’, the results will be a dict with keys ‘imgs’ and ‘img_metas’, where ‘img_metas’ is a DataContainer of another dict with keys ‘filename’, ‘label’, ‘original_shape’.
- Parameters
keys (Sequence[str|tuple]) – Required keys to be collected. If a tuple (key, key_new) is given as an element, the item retrieved by key will be renamed as key_new in collected data.
meta_name (str) – The name of the key that contains meta information. This key is always populated. Default: “img_metas”.
meta_keys (Sequence[str|tuple]) – Keys that are collected under meta_name. The contents of the meta_name dictionary depends on meta_keys.
- class easycv.datasets.pose.pipelines.transforms.TopDownRandomFlip(flip_prob=0.5)[source]¶
Bases:
object
Data augmentation with random image flip.
Required keys: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘center’ and ‘ann_info’. Modifies key: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘center’ and ‘flipped’.
- Parameters
flip (bool) – Option to perform random flip.
flip_prob (float) – Probability of flip.
- class easycv.datasets.pose.pipelines.transforms.TopDownHalfBodyTransform(num_joints_half_body=8, prob_half_body=0.3)[source]¶
Bases:
object
Data augmentation with half-body transform. Keep only the upper body or the lower body at random.
Required keys: ‘joints_3d’, ‘joints_3d_visible’, and ‘ann_info’. Modifies key: ‘scale’ and ‘center’.
- Parameters
num_joints_half_body (int) – Threshold for performing the half-body transform. If the body has fewer joints than num_joints_half_body, this step is ignored.
prob_half_body (float) – Probability of half-body transform.
- class easycv.datasets.pose.pipelines.transforms.TopDownGetRandomScaleRotation(rot_factor=40, scale_factor=0.5, rot_prob=0.6)[source]¶
Bases:
object
Data augmentation with random scaling & rotating.
Required key: ‘scale’. Modifies key: ‘scale’ and ‘rotation’.
- Parameters
rot_factor (int) – Rotating to [-2*rot_factor, 2*rot_factor].
scale_factor (float) – Scaling to [1-scale_factor, 1+scale_factor].
rot_prob (float) – Probability of random rotation.
- class easycv.datasets.pose.pipelines.transforms.TopDownAffine(use_udp=False)[source]¶
Bases:
object
Affine transform the image to produce the model input.
Required keys: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’, ‘scale’, ‘rotation’ and ‘center’. Modified keys: ‘img’, ‘joints_3d’, and ‘joints_3d_visible’.
- Parameters
use_udp (bool) – To use unbiased data processing. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
- class easycv.datasets.pose.pipelines.transforms.TopDownGenerateTarget(sigma=2, kernel=(11, 11), valid_radius_factor=0.0546875, target_type='GaussianHeatmap', encoding='MSRA', unbiased_encoding=False)[source]¶
Bases:
object
Generate the target heatmap.
Required keys: ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’. Modified keys: ‘target’, and ‘target_weight’.
- Parameters
sigma – Sigma of heatmap gaussian for ‘MSRA’ approach.
kernel – Kernel of heatmap gaussian for ‘Megvii’ approach.
encoding (str) – Approach to generate target heatmaps. Currently supported approaches: ‘MSRA’, ‘Megvii’, ‘UDP’. Default:’MSRA’
unbiased_encoding (bool) – Option to use unbiased encoding methods. Paper ref: Zhang et al. Distribution-Aware Coordinate Representation for Human Pose Estimation (CVPR 2020).
keypoint_pose_distance – Keypoint pose distance for UDP. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
target_type (str) – supported targets: ‘GaussianHeatmap’, ‘CombinedTarget’. Default:’GaussianHeatmap’ CombinedTarget: The combination of classification target (response map) and regression target (offset map). Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
- class easycv.datasets.pose.pipelines.transforms.TopDownGenerateTargetRegression[source]¶
Bases:
object
Generate the target regression vector (coordinates).
Required keys: ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’. Modified keys: ‘target’, and ‘target_weight’.
- class easycv.datasets.pose.pipelines.transforms.TopDownRandomTranslation(trans_factor=0.15, trans_prob=1.0)[source]¶
Bases:
object
Data augmentation with random translation.
Required key: ‘scale’ and ‘center’. Modifies key: ‘center’.
Notes
bbox height: H bbox width: W
- Parameters
trans_factor (float) – Translating center to [-trans_factor, trans_factor] * [W, H] + center.
trans_prob (float) – Probability of random translation.
- easycv.datasets.pose.pipelines.transforms.bbox_xywh2cs(bbox, aspect_ratio, padding=1.0, pixel_std=200.0)[source]¶
Transform the bbox format from (x,y,w,h) into (center, scale)
- Parameters
bbox (ndarray) – Single bbox in (x, y, w, h)
aspect_ratio (float) – The expected bbox aspect ratio (w over h)
padding (float) – Bbox padding factor that will be multiplied to scale. Default: 1.0
pixel_std (float) – The scale normalization factor. Default: 200.0
- Returns
A tuple containing center and scale. - np.ndarray[float32](2,): Center of the bbox (x, y). - np.ndarray[float32](2,): Scale of the bbox w & h.
- Return type
tuple
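A small sketch of the conventional xywh-to-center/scale conversion this function performs; the aspect-ratio handling below follows the common mmpose-style convention and is an assumption, not a copy of the EasyCV implementation:
import numpy as np

def xywh2cs_sketch(bbox, aspect_ratio, padding=1.0, pixel_std=200.0):
    x, y, w, h = bbox[:4]
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)
    # enlarge the shorter side so the box matches the expected aspect ratio (w / h)
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    elif w < aspect_ratio * h:
        w = h * aspect_ratio
    scale = np.array([w, h], dtype=np.float32) / pixel_std * padding
    return center, scale

# e.g. a 100x200 box mapped for a 192x256 input (aspect ratio 0.75)
center, scale = xywh2cs_sketch(np.array([50, 50, 100, 200]), aspect_ratio=192 / 256)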
- easycv.datasets.pose.pipelines.transforms.bbox_cs2xyxy(center, scale, padding=1.0, pixel_std=200.0)[source]¶
- class easycv.datasets.pose.pipelines.transforms.TopDownGetBboxCenterScale(padding: float = 1.25)[source]¶
Bases:
object
Convert bbox from [x, y, w, h] to center and scale.
The center is the coordinates of the bbox center, and the scale is the bbox width and height normalized by a scale factor.
Required key: ‘bbox’, ‘ann_info’
Modifies key: ‘center’, ‘scale’
- Parameters
padding (float) – bbox padding scale that will be multiplied to scale. Default: 1.25
- pixel_std: float = 200.0¶
- class easycv.datasets.pose.pipelines.transforms.TopDownRandomShiftBboxCenter(shift_factor: float = 0.16, prob: float = 0.3)[source]¶
Bases:
object
Random shift the bbox center.
Required key: ‘center’, ‘scale’
Modifies key: ‘center’
- Parameters
shift_factor (float) – The factor to control the shift range, which is scale*pixel_std*scale_factor. Default: 0.16
prob (float) – Probability of applying random shift. Default: 0.3
- pixel_std: float = 200.0¶
Submodules¶
easycv.datasets.pose.top_down module¶
- class easycv.datasets.pose.top_down.PoseTopDownDataset(data_source, pipeline, profiling=False)[source]¶
Bases:
Generic
[torch.utils.data.dataset.T_co]
PoseTopDownDataset dataset for top-down pose estimation. The dataset loads raw features and applies the specified transforms to return a dict containing the image tensors and other information.
- Parameters
data_source – Data_source config dict
pipeline – Pipeline config list
profiling – If set True, will print pipeline time
easycv.datasets.selfsup package¶
Subpackages¶
easycv.datasets.selfsup.data_sources package¶
- class easycv.datasets.selfsup.data_sources.SSLSourceImageList(list_file, root='', max_try=20)[source]¶
Bases:
object
datasource for classification
- Parameters
list_file – str or list(str). A str means a single image-list file path; this file contains records of the form image_path label. A list(str) means multiple image-list files, each containing such records.
root – str or list(str), root path for image_path. Each list_file needs a root; if len(root) < len(list_file), root[-1] is used to fill the root list.
max_try – int, maximum number of attempts to read an image
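A sketch of the expected list-file contents and a matching data_source config (the paths are placeholders):
# data/imagenet/meta/train.txt, one record per line: "image_path label"
#   n01440764/n01440764_10026.JPEG 0
#   n01440764/n01440764_10027.JPEG 0

data_source = dict(
    type='SSLSourceImageList',
    list_file='data/imagenet/meta/train.txt',
    root='data/imagenet/train/',
    max_try=20)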
- class easycv.datasets.selfsup.data_sources.SSLSourceImageNetFeature(root_path, training=True, data_keyword='feat1', label_keyword='label', dynamic_load=True)[source]¶
Bases:
object
- class easycv.datasets.selfsup.data_sources.image_list.SSLSourceImageList(list_file, root='', max_try=20)[source]¶
Bases:
object
datasource for classification
- Parameters
list_file – str or list(str). A str means a single image-list file path; this file contains records of the form image_path label. A list(str) means multiple image-list files, each containing such records.
root – str or list(str), root path for image_path. Each list_file needs a root; if len(root) < len(list_file), root[-1] is used to fill the root list.
max_try – int, maximum number of attempts to read an image
easycv.datasets.selfsup.pipelines package¶
- class easycv.datasets.selfsup.pipelines.RandomAppliedTrans(transforms, p=0.5)[source]¶
Bases:
object
Randomly applied transformations.
- Parameters
transforms (List[Dict]) – List of transformations in dictionaries.
- class easycv.datasets.selfsup.pipelines.Lighting[source]¶
Bases:
object
Lighting noise (AlexNet-style PCA-based noise).
- class easycv.datasets.selfsup.pipelines.transforms.MAEFtAugment(input_size=None, color_jitter=None, auto_augment=None, interpolation=None, re_prob=None, re_mode=None, re_count=None, mean=None, std=None, is_train=True)[source]¶
Bases:
object
RandAugment data augmentation method based on “RandAugment: Practical automated data augmentation with a reduced search space”. This code is borrowed from <https://github.com/pengzhiliang/MAE-pytorch>.
- Parameters
input_size (int) – images input size
color_jitter (float) – Color jitter factor
auto_augment – Use AutoAugment policy
interpolation – Training interpolation
re_prob – Random erase prob
re_mode – Random erase mode
re_count – Random erase count
mean – mean used for normalization
std – std used for normalization
is_train – If True, use the full augmentation strategy
- class easycv.datasets.selfsup.pipelines.transforms.RandomAppliedTrans(transforms, p=0.5)[source]¶
Bases:
object
Randomly applied transformations.
- Parameters
transforms (List[Dict]) – List of transformations in dictionaries.
easycv.datasets.utils package¶
Submodules¶
easycv.datasets.utils.tfrecord_util module¶
- easycv.datasets.utils.tfrecord_util.download_tfrecord(file_list_or_path, target_path, slice_count=1, slice_id=0, force=False)[source]¶
Download data from OSS. The GPU processes download the data in slices: each GPU process downloads part of the data, and the number of slices equals the number of GPU processes. Supports TFRecords in ImageNet style.
- Parameters
file_list_or_path – A list of absolute data paths or a path str. If type(file_list_or_path) == list, it is used as the file list directly; if type(file_list_or_path) == str, the file list is read via open(file_list_or_path).readlines().
target_path – A str, download path
slice_count – Download worker num
slice_id – Download worker ID
force – If false, skip download if the file already exists in the target path. If true, recopy and replace the original file.
- Returns
path – list of str, downloaded tfrecord paths
index_path – list of str, downloaded tfrecord idx paths
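A minimal usage sketch (the OSS list file and target path are placeholders; slice_count/slice_id normally come from the number of GPU processes and the rank of the current process):
from easycv.datasets.utils.tfrecord_util import download_tfrecord

path, index_path = download_tfrecord(
    file_list_or_path='oss://my-bucket/imagenet_tfrecord/train_list.txt',
    target_path='data/imagenet_tfrecord/',
    slice_count=8,   # total number of GPU processes
    slice_id=0,      # rank of this process
    force=False)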
Submodules¶
easycv.datasets.builder module¶
easycv.datasets.registry module¶
easycv.hooks package¶
- class easycv.hooks.BestCkptSaverHook(by_epoch=True, save_optimizer=True, best_metric_name=[], best_metric_type=[], **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Save checkpoints periodically.
- Parameters
by_epoch (bool) – Saving checkpoints by epoch or by iteration. Default: True.
save_optimizer (bool) – Whether to save optimizer state_dict in the checkpoint. It is usually used for resuming experiments. Default: True.
best_metric_name (List(str)) – metric names used to save the best checkpoints, such as “neck_top1”. Default: [] (do not save anything).
best_metric_type (List(str)) – metric types used to define “best”, should be “max” or “min”; if len(best_metric_type) < len(best_metric_name), “max” is appended for the remaining metrics.
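A hedged config sketch for keeping the best checkpoint by a single metric; the metric name must match what your evaluator actually reports, and whether the hook is wired in via custom_hooks or another config key depends on your EasyCV config layout:
custom_hooks = [
    dict(
        type='BestCkptSaverHook',
        by_epoch=True,
        save_optimizer=True,
        best_metric_name=['neck_top1'],
        best_metric_type=['max']),
]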
- class easycv.hooks.BYOLHook(end_momentum=1.0, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Hook in BYOL
- This hook includes the momentum adjustment in BYOL, following:
m = 1 - (1 - m_0) * (cos(pi * k / K) + 1) / 2, where k is the current step and K is the total number of steps.
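A small sketch of that schedule as plain arithmetic (end_momentum generalizes the 1 in the formula, matching the hook's end_momentum argument):
import math

def byol_momentum(k, K, m_0=0.996, end_momentum=1.0):
    # m = end - (end - m_0) * (cos(pi * k / K) + 1) / 2
    return end_momentum - (end_momentum - m_0) * (math.cos(math.pi * k / K) + 1) / 2

print(byol_momentum(0, 1000))     # 0.996: starts at the base momentum
print(byol_momentum(1000, 1000))  # 1.0: reaches end_momentum at the last step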
- class easycv.hooks.DINOHook(momentum_teacher=0.996, weight_decay=0.04, weight_decay_end=0.4, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Hook in DINO
- class easycv.hooks.EMAHook(decay=0.9999, copy_model_attr=())[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Hook to carry out Exponential Moving Average
- class easycv.hooks.DistEvalHook(dataloader, interval=1, mode='test', initial=False, gpu_collect=False, flush_buffer=True, broadcast_bn_buffer=True, **eval_kwargs)[source]¶
Bases:
easycv.hooks.eval_hook.EvalHook
Distributed evaluation hook.
- dataloader¶
A PyTorch dataloader.
- Type
DataLoader
- interval¶
Evaluation interval (by epochs). Default: 1.
- Type
int
- mode¶
model forward mode
- Type
str
- tmpdir¶
Temporary directory to save the results of all processes. Default: None.
- Type
str | None
- gpu_collect¶
Whether to use gpu or cpu to collect results. Default: False.
- Type
bool
- broadcast_bn_buffer¶
Whether to broadcast the buffer(running_mean and running_var) of rank 0 to other rank before evaluation. Default: True.
- Type
bool
- class easycv.hooks.EvalHook(dataloader, initial=False, interval=1, mode='test', flush_buffer=True, **eval_kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Evaluation hook.
- dataloader¶
A PyTorch dataloader.
- Type
DataLoader
- interval¶
Evaluation interval (by epochs). Default: 1.
- Type
int
- mode¶
model forward mode
- Type
str
- flush_buffer¶
flush log buffer
- Type
bool
- class easycv.hooks.ExportHook(cfg, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', export_after_each_ckpt=False)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
export model when training on pai
- class easycv.hooks.Extractor(dataset, imgs_per_gpu, workers_per_gpu, dist_mode=False)[source]¶
Bases:
object
- class easycv.hooks.OptimizerHook(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], multiply_key=[], multiply_rate=[])[source]¶
Bases:
mmcv.runner.hooks.optimizer.OptimizerHook
- __init__(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], multiply_key=[], multiply_rate=[])[source]¶
ignore_key: [str, …], ignore_key[i] is the name of parameters whose gradients will be set to zero before every optimizer step when epoch < ignore_key_epoch[i].
ignore_key_epoch: [int, …], when epoch < ignore_key_epoch[i], the gradient of ignore_key[i] will be set to zero.
multiply_key: [str, …], multiply_key[i] is the name of parameters that will use a different learning rate ratio, given by multiply_rate.
multiply_rate: [float, …], multiply_rate[i] is the corresponding ratio.
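A hedged sketch of constructing the hook so that parameters whose names contain 'backbone' are frozen for the first 5 epochs while 'head' parameters use a 10x learning-rate ratio (the name prefixes are placeholders that must match your model's parameter names):
from easycv.hooks import OptimizerHook

optimizer_hook = OptimizerHook(
    update_interval=1,
    grad_clip=dict(max_norm=10.0),
    ignore_key=['backbone'],
    ignore_key_epoch=[5],
    multiply_key=['head'],
    multiply_rate=[10.0])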
- class easycv.hooks.OSSSyncHook(work_dir, oss_work_dir, interval=1, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', other_file_list=[], iter_interval=None)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
upload log files and checkpoints to oss when training on pai
- __init__(work_dir, oss_work_dir, interval=1, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', other_file_list=[], iter_interval=None)[source]¶
- Parameters
work_dir – work_dir in cfg
oss_work_dir – oss directory where to upload local files in work_dir
interval – upload frequency
ckpt_filename_tmpl – checkpoint filename template
other_file_list – other file need to be upload to oss
iter_interval – upload frequency by iteration interval; defaults to None, which means uploads are only triggered by the epoch-based interval
- class easycv.hooks.TIMEHook(end_momentum=1.0, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
This hook shows the elapsed time of the runner's running process
- class easycv.hooks.SWAVHook(gpu_batch_size=32, dump_path='data/', **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Hook in SWAV
- class easycv.hooks.SyncNormHook(no_aug_epochs=15, interval=1, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Synchronize Norm states after training epoch, currently used in YOLOX.
- Parameters
no_aug_epochs (int) – The number of epochs at the end of training in which to switch to the synchronizing-norm interval. Default: 15.
interval (int) – Synchronizing norm interval. Default: 1.
- class easycv.hooks.SyncRandomSizeHook(ratio_range=(14, 26), img_scale=(640, 640), interval=10, device='cuda', **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Change and synchronize the random image size across ranks, currently used in YOLOX.
- Parameters
ratio_range (tuple[int]) – Random ratio range. It will be multiplied by 32, and then change the dataset output image size. Default: (14, 26).
img_scale (tuple[int]) – Size of input image. Default: (640, 640).
interval (int) – The interval of change image size. Default: 10.
device (torch.device | str) – device for returned tensors. Default: ‘cuda’.
- class easycv.hooks.TensorboardLoggerHookV2(log_dir=None, interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[source]¶
Bases:
mmcv.runner.hooks.logger.tensorboard.TensorboardLoggerHook
- class easycv.hooks.WandbLoggerHookV2(init_kwargs=None, interval=10, ignore_last=True, reset_flag=False, commit=True, by_epoch=True, with_step=True)[source]¶
Bases:
mmcv.runner.hooks.logger.wandb.WandbLoggerHook
- class easycv.hooks.YOLOXLrUpdaterHook(num_last_epochs, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.lr_updater.CosineAnnealingLrUpdaterHook
YOLOX learning rate scheme.
There are two main differences between YOLOXLrUpdaterHook and CosineAnnealingLrUpdaterHook:
- When the current running epoch is greater than max_epoch - last_epoch, a fixed learning rate will be used.
- The exp warmup scheme is different from the LrUpdaterHook in MMCV.
- Parameters
num_last_epochs (int) – The number of epochs with a fixed learning rate before the end of the training.
- class easycv.hooks.YOLOXModeSwitchHook(no_aug_epochs=15, skip_type_keys=('MMMosaic', 'MMRandomAffine', 'MMMixUp'), **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Switch the mode of YOLOX during training.
This hook turns off the mosaic and mixup data augmentation and switches to use L1 loss in bbox_head.
- Parameters
no_aug_epochs – The number of epochs at the end of training in which to turn off the data augmentation and switch to L1 loss. Default: 15.
- class easycv.hooks.MixupCollateHook(**kwargs)[source]¶
Bases:
easycv.hooks.collate_hook.BaseCollateHook
Mixup the data batch; should be used after merging a list of samples to form a mini-batch of Tensor(s).
- class easycv.hooks.PreLoggerHook(interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[source]¶
Bases:
mmcv.runner.hooks.logger.base.LoggerHook
- class easycv.hooks.StepFixCosineAnnealingLrUpdaterHook(min_lr=None, min_lr_ratio=None, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.lr_updater.CosineAnnealingLrUpdaterHook
- class easycv.hooks.CosineAnnealingWarmupByEpochLrUpdaterHook(min_lr=None, min_lr_ratio=None, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.lr_updater.CosineAnnealingLrUpdaterHook
- class easycv.hooks.ThroughputHook(warmup_iters=0, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Count the throughput per second of all steps in the history. warmup_iters can be set to skip the calculation of the first few steps, if the initialization of the first few steps is slow.
- class easycv.hooks.AMPFP16OptimizerHook(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], loss_scale={})[source]¶
Bases:
easycv.hooks.optimizer_hook.OptimizerHook
- __init__(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], loss_scale={})[source]¶
ignore_key: [str, …], ignore_key[i] is the name of parameters whose gradients will be set to zero before every optimizer step when epoch < ignore_key_epoch[i].
ignore_key_epoch: [int, …], when epoch < ignore_key_epoch[i], the gradient of ignore_key[i] will be set to zero.
loss_scale (float | dict): grad scale config. If loss_scale is a float, static loss scaling will be used with the specified scale. It can also be a dict containing the arguments of GradScaler. For PyTorch >= 1.6, we use the official torch.cuda.amp.GradScaler; please refer to https://pytorch.org/docs/stable/amp.html#torch.cuda.amp.GradScaler for the parameters.
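A hedged sketch of constructing the AMP hook with a dynamic loss-scale configuration; the loss_scale keys below are standard torch.cuda.amp.GradScaler arguments:
from easycv.hooks import AMPFP16OptimizerHook

amp_optimizer_hook = AMPFP16OptimizerHook(
    update_interval=1,
    grad_clip=dict(max_norm=10.0),
    loss_scale=dict(init_scale=2.**16, growth_interval=2000))  # forwarded to torch.cuda.amp.GradScaler
# loss_scale=512.0 would instead enable static loss scaling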
Submodules¶
easycv.hooks.best_ckpt_saver_hook module¶
- class easycv.hooks.best_ckpt_saver_hook.BestCkptSaverHook(by_epoch=True, save_optimizer=True, best_metric_name=[], best_metric_type=[], **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Save checkpoints periodically.
- Parameters
by_epoch (bool) – Saving checkpoints by epoch or by iteration. Default: True.
save_optimizer (bool) – Whether to save optimizer state_dict in the checkpoint. It is usually used for resuming experiments. Default: True.
best_metric_name (List(str)) – metric names used to save the best checkpoints, such as “neck_top1”. Default: [] (do not save anything).
best_metric_type (List(str)) – metric types used to define “best”, should be “max” or “min”; if len(best_metric_type) < len(best_metric_name), “max” is appended for the remaining metrics.
easycv.hooks.byol_hook module¶
- class easycv.hooks.byol_hook.BYOLHook(end_momentum=1.0, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Hook in BYOL
- This hook includes the momentum adjustment in BYOL, following:
m = 1 - (1 - m_0) * (cos(pi * k / K) + 1) / 2, where k is the current step and K is the total number of steps.
easycv.hooks.dino_hook module¶
- easycv.hooks.dino_hook.cosine_scheduler(base_value, final_value, epochs, niter_per_ep, warmup_epochs=0, start_warmup_value=0)[source]¶
- class easycv.hooks.dino_hook.DINOHook(momentum_teacher=0.996, weight_decay=0.04, weight_decay_end=0.4, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Hook in DINO
easycv.hooks.ema_hook module¶
- class easycv.hooks.ema_hook.ModelEMA(model, decay=0.9999, updates=0)[source]¶
Bases:
object
Model Exponential Moving Average from https://github.com/rwightman/pytorch-image-models. Keeps a moving average of everything in the model state_dict (parameters and buffers). This is intended to provide functionality like https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage. A smoothed version of the weights is necessary for some training schemes to perform well. This class is sensitive to where it is initialized in the sequence of model init, GPU assignment and distributed training wrappers.
In Yolo5s, EMA helps increase mAP from 0.27 to 0.353.
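A hedged usage sketch of a ModelEMA of this style; the update/ema attribute names follow the rwightman/ultralytics pattern referenced above and should be treated as assumptions:

from easycv.hooks.ema_hook import ModelEMA

# assume `model`, `optimizer` and `dataloader` are already built
ema = ModelEMA(model, decay=0.9999)
for images, targets in dataloader:      # hypothetical training loop
    loss = model(images, targets)
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    ema.update(model)                   # refresh the smoothed copy after each step
# evaluate with the averaged weights (ema.ema) instead of `model`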
- class easycv.hooks.ema_hook.EMAHook(decay=0.9999, copy_model_attr=())[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Hook to carry out Exponential Moving Average
easycv.hooks.eval_hook module¶
- class easycv.hooks.eval_hook.EvalHook(dataloader, initial=False, interval=1, mode='test', flush_buffer=True, **eval_kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Evaluation hook.
- dataloader¶
A PyTorch dataloader.
- Type
DataLoader
- interval¶
Evaluation interval (by epochs). Default: 1.
- Type
int
- mode¶
model forward mode
- Type
str
- flush_buffer¶
flush log buffer
- Type
bool
- class easycv.hooks.eval_hook.DistEvalHook(dataloader, interval=1, mode='test', initial=False, gpu_collect=False, flush_buffer=True, broadcast_bn_buffer=True, **eval_kwargs)[source]¶
Bases:
easycv.hooks.eval_hook.EvalHook
Distributed evaluation hook.
- dataloader¶
A PyTorch dataloader.
- Type
DataLoader
- interval¶
Evaluation interval (by epochs). Default: 1.
- Type
int
- mode¶
model forward mode
- Type
str
- tmpdir¶
Temporary directory to save the results of all processes. Default: None.
- Type
str | None
- gpu_collect¶
Whether to use gpu or cpu to collect results. Default: False.
- Type
bool
- broadcast_bn_buffer¶
Whether to broadcast the buffer(running_mean and running_var) of rank 0 to other rank before evaluation. Default: True.
- Type
bool
easycv.hooks.export_hook module¶
- class easycv.hooks.export_hook.ExportHook(cfg, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', export_after_each_ckpt=False)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
export model when training on pai
easycv.hooks.extractor module¶
easycv.hooks.optimizer_hook module¶
- class easycv.hooks.optimizer_hook.OptimizerHook(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], multiply_key=[], multiply_rate=[])[source]¶
Bases:
mmcv.runner.hooks.optimizer.OptimizerHook
- __init__(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], multiply_key=[], multiply_rate=[])[source]¶
ignore_key: [str, …], names of parameters whose gradient will be set to zero before every optimizer step when epoch < ignore_key_epoch[i]. ignore_key_epoch: [int, …], when epoch < ignore_key_epoch[i], the gradient of ignore_key[i] will be set to zero. multiply_key: [str, …], names of parameters for which a different learning rate ratio is set via multiply_rate. multiply_rate: [float, …], the corresponding learning rate ratios.
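A hypothetical config fragment showing how these arguments fit together; the parameter names passed to ignore_key and multiply_key are placeholders:

optimizer_config = dict(
    type='OptimizerHook',
    grad_clip=None,
    ignore_key=['backbone'],      # zero the backbone gradients ...
    ignore_key_epoch=[5],         # ... while epoch < 5, effectively freezing it early on
    multiply_key=['head.fc'],     # scale the learning rate of the head parameters ...
    multiply_rate=[10.0],         # ... by a factor of 10
)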
- class easycv.hooks.optimizer_hook.AMPFP16OptimizerHook(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], loss_scale={})[source]¶
Bases:
easycv.hooks.optimizer_hook.OptimizerHook
- __init__(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], loss_scale={})[source]¶
ignore_key: [str, …], names of parameters whose gradient will be set to zero before every optimizer step when epoch < ignore_key_epoch[i]. ignore_key_epoch: [int, …], when epoch < ignore_key_epoch[i], the gradient of ignore_key[i] will be set to zero. loss_scale (float | dict): grad scale config. If loss_scale is a float, static loss scaling will be used with the specified scale.
It can also be a dict containing the arguments of GradScaler. For PyTorch >= 1.6, the official torch.cuda.amp.GradScaler is used; please refer to https://pytorch.org/docs/stable/amp.html#torch.cuda.amp.GradScaler for the parameters.
easycv.hooks.oss_sync_hook module¶
- class easycv.hooks.oss_sync_hook.OSSSyncHook(work_dir, oss_work_dir, interval=1, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', other_file_list=[], iter_interval=None)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
upload log files and checkpoints to oss when training on pai
- __init__(work_dir, oss_work_dir, interval=1, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', other_file_list=[], iter_interval=None)[source]¶
- Parameters
work_dir – work_dir in cfg
oss_work_dir – oss directory where to upload local files in work_dir
interval – upload frequency
ckpt_filename_tmpl – checkpoint filename template
other_file_list – other files that need to be uploaded to oss
iter_interval – upload frequency by iteration interval; defaults to None, which means uploading by iteration is disabled unless explicitly set
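A hypothetical custom_hooks entry; the oss path and file names are placeholders:

custom_hooks = [
    dict(
        type='OSSSyncHook',
        work_dir='work_dirs/exp1',
        oss_work_dir='oss://my-bucket/experiments/exp1',   # placeholder bucket path
        interval=1,                                        # upload once per epoch
        other_file_list=['train.log'],                     # placeholder extra file to upload
    )
]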
easycv.hooks.registry module¶
easycv.hooks.show_time_hook module¶
easycv.hooks.swav_hook module¶
easycv.hooks.sync_norm_hook module¶
- class easycv.hooks.sync_norm_hook.SyncNormHook(no_aug_epochs=15, interval=1, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Synchronize Norm states after training epoch, currently used in YOLOX.
- Parameters
no_aug_epochs (int) – The number of epochs at the end of training during which norm states are synchronized at the given interval. Default: 15.
interval (int) – Synchronizing norm interval. Default: 1.
easycv.hooks.sync_random_size_hook module¶
- class easycv.hooks.sync_random_size_hook.SyncRandomSizeHook(ratio_range=(14, 26), img_scale=(640, 640), interval=10, device='cuda', **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Change and synchronize the random image size across ranks, currently used in YOLOX.
- Parameters
ratio_range (tuple[int]) – Random ratio range. It is multiplied by 32 to determine the dataset output image size. Default: (14, 26).
img_scale (tuple[int]) – Size of input image. Default: (640, 640).
interval (int) – The interval (in iterations) at which the image size is changed. Default: 10.
device (torch.device | str) – device for returned tensors. Default: ‘cuda’.
easycv.hooks.tensorboard module¶
- class easycv.hooks.tensorboard.TensorboardLoggerHookV2(log_dir=None, interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[source]¶
Bases:
mmcv.runner.hooks.logger.tensorboard.TensorboardLoggerHook
easycv.hooks.wandb module¶
- class easycv.hooks.wandb.WandbLoggerHookV2(init_kwargs=None, interval=10, ignore_last=True, reset_flag=False, commit=True, by_epoch=True, with_step=True)[source]¶
Bases:
mmcv.runner.hooks.logger.wandb.WandbLoggerHook
easycv.hooks.yolox_lr_hook module¶
- class easycv.hooks.yolox_lr_hook.YOLOXLrUpdaterHook(num_last_epochs, **kwargs)[source]¶
Bases:
mmcv.runner.hooks.lr_updater.CosineAnnealingLrUpdaterHook
YOLOX learning rate scheme.
There are two main differences between YOLOXLrUpdaterHook and CosineAnnealingLrUpdaterHook:
- When the current running epoch is greater than max_epoch - num_last_epochs, a fixed learning rate is used.
- The exp warmup scheme is different from the LrUpdaterHook in MMCV.
- Parameters
num_last_epochs (int) – The number of epochs with a fixed learning rate before the end of the training.
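As a sketch, such a scheduler is usually selected through the lr_config of an mmcv-style config; the concrete values below are assumptions for illustration:

lr_config = dict(
    policy='YOLOX',          # resolves to YOLOXLrUpdaterHook
    warmup='exp',
    by_epoch=False,
    warmup_by_epoch=True,
    warmup_ratio=1,
    warmup_iters=5,          # warm up for 5 epochs
    num_last_epochs=15,      # keep a fixed learning rate for the final 15 epochs
    min_lr_ratio=0.05,
)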
easycv.hooks.yolox_mode_switch_hook module¶
- class easycv.hooks.yolox_mode_switch_hook.YOLOXModeSwitchHook(no_aug_epochs=15, skip_type_keys=('MMMosaic', 'MMRandomAffine', 'MMMixUp'), **kwargs)[source]¶
Bases:
mmcv.runner.hooks.hook.Hook
Switch the mode of YOLOX during training.
This hook turns off the mosaic and mixup data augmentation and switches to use L1 loss in bbox_head.
- Parameters
no_aug_epochs – The number of epochs at the end of training during which the data augmentation is turned off and L1 loss is used. Default: 15.
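The YOLOX-specific hooks above (mode switch, norm sync and random-size sync) are typically registered together via custom_hooks; a hedged sketch with illustrative values:

custom_hooks = [
    dict(type='YOLOXModeSwitchHook', no_aug_epochs=15),
    dict(type='SyncNormHook', no_aug_epochs=15, interval=10),
    dict(type='SyncRandomSizeHook', ratio_range=(14, 26), img_scale=(640, 640), interval=10),
]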
easycv.predictors package¶
Submodules¶
easycv.predictors.base module¶
- class easycv.predictors.base.Predictor(model_path, numpy_to_pil=True)[source]¶
Bases:
object
- class easycv.predictors.base.InputProcessor(cfg, pipelines=None, batch_size=1, threads=8, mode='BGR')[source]¶
Bases:
object
Base input processor for processing input samples.
- Parameters
cfg (Config) – Config instance.
pipelines (list[dict]) – Data pipeline configs.
batch_size (int) – batch size for forward.
threads (int) – Number of processes to process inputs.
mode (str) – The image mode into the model.
- __init__(cfg, pipelines=None, batch_size=1, threads=8, mode='BGR')[source]¶
Initialize self. See help(type(self)) for accurate signature.
- class easycv.predictors.base.OutputProcessor[source]¶
Bases:
object
Base output processor for processing model outputs.
- class easycv.predictors.base.PredictorV2(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, input_processor_threads=8, mode='BGR')[source]¶
Bases:
object
Base predict pipeline.
- Parameters
model_path (str) – Path of model path.
config_file (Optional[str]) – config file path for model and processor to init. Defaults to None.
batch_size (int) – batch size for forward.
device (str | torch.device) – Support str (‘cuda’ or ‘cpu’) or torch.device; if None, detect device automatically.
save_results (bool) – Whether to save predict results.
save_path (str) – File path for saving results, only valid when save_results is True.
pipelines (list[dict]) – Data pipeline configs.
input_processor_threads (int) – Number of processes to process inputs.
mode (str) – The image mode into the model.
- __init__(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, input_processor_threads=8, mode='BGR')[source]¶
Initialize self. See help(type(self)) for accurate signature.
- prepare_model()[source]¶
Build model from config file by default. If the model is not loaded from a configuration file, e.g. torch jit model, you need to reimplement it.
easycv.predictors.builder module¶
easycv.predictors.classifier module¶
- class easycv.predictors.classifier.ClsInputProcessor(cfg, pipelines=None, batch_size=1, pil_input=True, threads=8, mode='BGR')[source]¶
Bases:
easycv.predictors.base.InputProcessor
Process inputs for classification models.
- Parameters
cfg (Config) – Config instance.
pipelines (list[dict]) – Data pipeline configs.
batch_size (int) – batch size for forward.
pil_input (bool) – Whether to use PIL image. If the processor needs PIL input, set to True. Default: True.
threads (int) – Number of processes to process inputs.
mode (str) – The image mode into the model.
- class easycv.predictors.classifier.ClsOutputProcessor(topk=1, label_map={})[source]¶
Bases:
easycv.predictors.base.OutputProcessor
Output processor for processing classification model outputs.
- Parameters
topk (int) – Return top-k results. Default: 1.
label_map (dict) – Dict of class id to class name.
- class easycv.predictors.classifier.ClassificationPredictor(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, topk=1, pil_input=True, label_map_path=None, input_processor_threads=8, mode='BGR', *args, **kwargs)[source]¶
Bases:
easycv.predictors.base.PredictorV2
Predictor for classification.
- Parameters
model_path (str) – Path of model path.
config_file (Optional[str]) – config file path for model and processor to init. Defaults to None.
batch_size (int) – batch size for forward.
device (str) – Support ‘cuda’ or ‘cpu’; if None, detect device automatically.
save_results (bool) – Whether to save predict results.
save_path (str) – File path for saving results, only valid when save_results is True.
pipelines (list[dict]) – Data pipeline configs.
topk (int) – Return top-k results. Default: 1.
pil_input (bool) – Whether to use PIL image. If the processor needs PIL input, set to True. Default: True.
label_map_path (str) – File path of the saved labels list.
input_processor_threads (int) – Number of processes to process inputs.
mode (str) – The image mode into the model.
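A hedged usage sketch, assuming (as for the other predictors) that the instance is callable on a list of image paths; all file paths are placeholders:

from easycv.predictors.classifier import ClassificationPredictor

predictor = ClassificationPredictor(
    model_path='work_dirs/cls/epoch_100_export.pt',        # placeholder model file
    label_map_path='data/imagenet/meta/label_map.txt',     # placeholder label list
    topk=5,
)
results = predictor(['demo/cat.jpg'])   # one result dict per input image
print(results[0])                       # e.g. top-5 class names and probabilities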
easycv.predictors.detector module¶
- class easycv.predictors.detector.DetInputProcessor(cfg, pipelines=None, batch_size=1, threads=8, mode='BGR')[source]¶
- class easycv.predictors.detector.DetOutputProcessor(score_thresh, classes=None)[source]¶
- class easycv.predictors.detector.DetectionPredictor(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, score_threshold=0.5, input_processor_threads=8, mode='BGR', *arg, **kwargs)[source]¶
Bases:
easycv.predictors.base.PredictorV2
Generic Detection Predictor. It will filter bbox results by score_threshold.
- Parameters
model_path (str) – Path of model path.
config_file (Optional[str]) – config file path for model and processor to init. Defaults to None.
batch_size (int) – batch size for forward.
device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device, if is None, detect device automatically.
save_results (bool) – Whether to save predict results.
save_path (str) – File path for saving results, only valid when save_results is True.
pipelines (list[dict]) – Data pipeline configs.
input_processor_threads (int) – Number of processes to process inputs.
mode (str) – The image mode into the model.
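A minimal usage sketch under the same calling-convention assumption; the paths are placeholders:

from easycv.predictors.detector import DetectionPredictor

predictor = DetectionPredictor(
    model_path='work_dirs/det/epoch_300_export.pt',   # placeholder model file
    score_threshold=0.5,                              # drop boxes below this score
)
results = predictor(['demo/street.jpg'])
# each result is expected to hold the filtered boxes, scores and class ids for one image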
- class easycv.predictors.detector.YoloXInputProcessor(cfg, pipelines=None, batch_size=1, model_type='raw', jit_processor_path=None, device=None, threads=8, mode='BGR')[source]¶
Bases:
easycv.predictors.detector.DetInputProcessor
Input processor for yolox.
- Parameters
cfg (Config) – Config instance.
pipelines (list[dict]) – Data pipeline configs.
batch_size (int) – batch size for forward.
model_type (str) – “raw” or “jit” or “blade”
jit_processor_path (str) – File of the saved processing operator of torch jit type.
device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device, if is None, detect device automatically.
threads (int) – Number of processes to process inputs.
mode (str) – The image mode into the model.
- class easycv.predictors.detector.YoloXOutputProcessor(score_thresh=0.5, model_type='raw', test_conf=0.01, nms_thre=0.65, use_trt_efficientnms=False, classes=None)[source]¶
- class easycv.predictors.detector.YoloXPredictor(model_path, config_file=None, batch_size=1, use_trt_efficientnms=False, device=None, save_results=False, save_path=None, pipelines=None, max_det=100, score_thresh=0.5, nms_thresh=None, test_conf=None, input_processor_threads=8, mode='BGR', model_type=None)[source]¶
Bases:
easycv.predictors.detector.DetectionPredictor
Detection predictor for Yolox.
- Parameters
model_path (str) – Path of model path.
config_file (Optional[str]) – config file path for model and processor to init. Defaults to None.
batch_size (int) – batch size for forward.
use_trt_efficientnms (bool) – Whether the TensorRT efficient NMS operation is used in the saved model.
device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device, if is None, detect device automatically.
save_results (bool) – Whether to save predict results.
save_path (str) – File path for saving results, only valid when save_results is True.
pipelines (list[dict]) – Data pipeline configs.
max_det (int) – Maximum number of detection output boxes.
score_thresh (float) – Score threshold to filter box.
nms_thresh (float) – Nms threshold to filter box.
input_processor_threads (int) – Number of processes to process inputs.
mode (str) – The image mode into the model.
- __init__(model_path, config_file=None, batch_size=1, use_trt_efficientnms=False, device=None, save_results=False, save_path=None, pipelines=None, max_det=100, score_thresh=0.5, nms_thresh=None, test_conf=None, input_processor_threads=8, mode='BGR', model_type=None)[source]¶
Initialize self. See help(type(self)) for accurate signature.
- prepare_model()[source]¶
Build model from config file by default. If the model is not loaded from a configuration file, e.g. torch jit model, you need to reimplement it.
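A hedged example of driving YoloXPredictor; the output key names are assumptions based on the detection conventions used elsewhere in this documentation:

from easycv.predictors.detector import YoloXPredictor

predictor = YoloXPredictor(
    model_path='work_dirs/yolox/epoch_300_export.pt',   # placeholder model file
    score_thresh=0.5,
)
outputs = predictor(['demo/street.jpg'])
for det in outputs:
    # assumed result keys: detection_boxes, detection_scores, detection_classes
    print(det['detection_boxes'], det['detection_scores'])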
- class easycv.predictors.detector.TorchFaceDetector(model_path=None, model_config=None)[source]¶
Bases:
easycv.predictors.interface.PredictorInterface
- __init__(model_path=None, model_config=None)[source]¶
Init model; adds face detection and alignment for image input.
- Parameters
model_path – model file path
model_config – config string for model to init, in json format
- get_output_type()[source]¶
In this function the user should return a type dict, indicating what type each output of the predictor should be converted to:
* type json: data will be serialized to a json str.
* type image: data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} will be empty.
* type video: data will be converted to an encoded video binary and written to an oss file.
- Example:
return {‘image’: ‘image’, ‘feature’: ‘json’}
indicating that the image data in the output dict will be saved to an image file and the feature in the output dict will be converted to json.
- predict(input_data_list, batch_size=- 1, threshold=0.95)[source]¶
Run the session to predict a number of samples, using the given batch_size.
- Parameters
input_data_list – a list of numpy array, each array is a sample to be predicted
batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime
- Returns
- a list of dict, each dict is the prediction result of one sample
eg, {“output1”: value1, “output2”: value2}, the value type can be python int str float, and numpy array
- Return type
result
- Raises
If the number of detected faces in an image is not exactly 1, nothing is done for that image –
- class easycv.predictors.detector.TorchYoloXClassifierPredictor(models_root_dir, max_det=100, cls_score_thresh=0.01, det_model_config=None, cls_model_config=None)[source]¶
Bases:
easycv.predictors.interface.PredictorInterface
- __init__(models_root_dir, max_det=100, cls_score_thresh=0.01, det_model_config=None, cls_model_config=None)[source]¶
Init model; adds a yolox detection predictor and a classification predictor for image input.
- Parameters
models_root_dir – expects models_root_dir/detection/*.pth and models_root_dir/classification/*.pth
det_model_config – config string for detection model to init, in json format
cls_model_config – config string for classification model to init, in json format
- predict(input_data_list, batch_size=- 1)[source]¶
Run the session to predict a number of samples, using the given batch_size.
- Parameters
input_data_list – a list of numpy array(in rgb order), each array is a sample to be predicted
batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime
- Returns
- a list of dict, each dict is the prediction result of one sample
eg, {“output1”: value1, “output2”: value2}, the value type can be python int str float, and numpy array
- Return type
result
easycv.predictors.feature_extractor module¶
- class easycv.predictors.feature_extractor.TorchFeatureExtractor(model_path, model_config=None)[source]¶
Bases:
easycv.predictors.interface.PredictorInterface
- __init__(model_path, model_config=None)[source]¶
init model
- Parameters
model_path – model file path
model_config – config string for model to init, in json format
- get_output_type()[source]¶
In this function the user should return a type dict, indicating what type each output of the predictor should be converted to:
* type json: data will be serialized to a json str.
* type image: data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} will be empty.
* type video: data will be converted to an encoded video binary and written to an oss file.
- Example:
return {‘image’: ‘image’, ‘feature’: ‘json’}
indicating that the image data in the output dict will be saved to an image file and the feature in the output dict will be converted to json.
- predict(input_data_list, batch_size=- 1)[source]¶
Run the session to predict a number of samples, using the given batch_size.
- Parameters
input_data_list – a list of numpy array, each array is a sample to be predicted
batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime
- Returns
- a list of dict, each dict is the prediction result of one sample
eg, {“output1”: value1, “output2”: value2}, the value type can be python int str float, and numpy array
- Return type
result
- class easycv.predictors.feature_extractor.TorchFaceFeatureExtractor(model_path, model_config=None)[source]¶
Bases:
easycv.predictors.interface.PredictorInterface
- __init__(model_path, model_config=None)[source]¶
Init model; adds face detection and alignment for image input.
- Parameters
model_path – model file path
model_config – config string for model to init, in json format
- get_output_type()[source]¶
In this function the user should return a type dict, indicating what type each output of the predictor should be converted to:
* type json: data will be serialized to a json str.
* type image: data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} will be empty.
* type video: data will be converted to an encoded video binary and written to an oss file.
- Example:
return {‘image’: ‘image’, ‘feature’: ‘json’}
indicating that the image data in the output dict will be saved to an image file and the feature in the output dict will be converted to json.
- predict(input_data_list, batch_size=- 1, detect_and_align=True)[source]¶
Run the session to predict a number of samples, using the given batch_size.
- Parameters
input_data_list – a list of numpy array or PIL.Image, each array is a sample to be predicted
batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime
detect_and_align – True to detect and align before feature extractor
- Returns
- a list of dict, each dict is the prediction result of one sample
eg, {“output1”: value1, “output2”: value2}, the value type can be python int str float, and numpy array
- Return type
result
- Raises
If the number of detected faces in an image is not exactly 1, nothing is done for that image –
- class easycv.predictors.feature_extractor.TorchMultiFaceFeatureExtractor(model_path, model_config=None)[source]¶
Bases:
easycv.predictors.interface.PredictorInterface
- __init__(model_path, model_config=None)[source]¶
Init model; adds face detection and alignment for image input.
- Parameters
model_path – model file path
model_config – config string for model to init, in json format
- get_output_type()[source]¶
In this function the user should return a type dict, indicating what type each output of the predictor should be converted to:
* type json: data will be serialized to a json str.
* type image: data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} will be empty.
* type video: data will be converted to an encoded video binary and written to an oss file.
- Example:
return {‘image’: ‘image’, ‘feature’: ‘json’}
indicating that the image data in the output dict will be saved to an image file and the feature in the output dict will be converted to json.
- predict(input_data_list, batch_size=- 1, detect_and_align=True)[source]¶
Run the session to predict a number of samples, using the given batch_size.
- Parameters
input_data_list – a list of numpy array or PIL.Image, each array is a sample to be predicted
batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime
detect_and_align – True to detect and align before feature extractor
- Returns
- a list of dict, each dict is the prediction result of one sample
eg, {“output1”: value1, “output2”: value2}, the value type can be python int str float, and numpy array
- Return type
result
- Raises
If the number of detected faces in an image is not exactly 1, nothing is done for that image –
- class easycv.predictors.feature_extractor.TorchFaceAttrExtractor(model_path, model_config=None, face_threshold=0.95, attr_method=['distribute_sum', 'softmax', 'softmax'], attr_name=['age', 'gender', 'emo'])[source]¶
Bases:
easycv.predictors.interface.PredictorInterface
- __init__(model_path, model_config=None, face_threshold=0.95, attr_method=['distribute_sum', 'softmax', 'softmax'], attr_name=['age', 'gender', 'emo'])[source]¶
init model
- Parameters
model_path – model file path
model_config – config string for model to init, in json format
attr_method –
softmax: do softmax for feature_dim 1
distribute_sum: do softmax and prob sum
- get_output_type()[source]¶
In this function the user should return a type dict, indicating what type each output of the predictor should be converted to:
* type json: data will be serialized to a json str.
* type image: data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} will be empty.
* type video: data will be converted to an encoded video binary and written to an oss file.
- Example:
return {‘image’: ‘image’, ‘feature’: ‘json’}
indicating that the image data in the output dict will be saved to an image file and the feature in the output dict will be converted to json.
- predict(input_data_list, batch_size=- 1)[source]¶
Run the session to predict a number of samples, using the given batch_size.
- Parameters
input_data_list – a list of numpy array, each array is a sample to be predicted
batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime
- Returns
- a list of dict, each dict is the prediction result of one sample
eg, {“output1”: value1, “output2”: value2}, the value type can be python int str float, and numpy array
- Return type
result
easycv.predictors.interface module¶
- class easycv.predictors.interface.PredictorInterface(model_path, model_config=None)[source]¶
Bases:
object
- version = 1¶
- __init__(model_path, model_config=None)[source]¶
init model
- Parameters
model_path – init model from this directory
model_config – config string for model to init, in json format
- abstract predict(input_data, batch_size)[source]¶
Run the session to predict a number of samples, using the given batch_size.
- Parameters
input_data – a list of numpy array, each array is a sample to be predicted
batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime
- Returns
- a list of dict, each dict is the prediction result of one sample
eg, {“output1”: value1, “output2”: value2}, the value type can be python int str float, and numpy array
- Return type
result
- get_output_type()[source]¶
In this function the user should return a type dict, indicating what type each output of the predictor should be converted to:
* type json: data will be serialized to a json str.
* type image: data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} will be empty.
* type video: data will be converted to an encoded video binary and written to an oss file.
- Example:
return {‘image’: ‘image’, ‘feature’: ‘json’}
indicating that the image data in the output dict will be saved to an image file and the feature in the output dict will be converted to json.
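To implement a custom predictor against this interface, a minimal subclass might look like the following sketch; the model loading and inference bodies are placeholders:

from easycv.predictors.interface import PredictorInterface

class MyPredictor(PredictorInterface):

    def __init__(self, model_path, model_config=None):
        # placeholder: load your model from model_path here
        self.model = None

    def predict(self, input_data, batch_size):
        # return one result dict per input sample
        return [{'feature': [0.0]} for _ in input_data]

    def get_output_type(self):
        return {'feature': 'json'}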
- class easycv.predictors.interface.PredictorInterfaceV2(model_path, model_config=None)[source]¶
Bases:
easycv.predictors.interface.PredictorInterface
- version = 2¶
- __init__(model_path, model_config=None)[source]¶
init model
- Parameters
model_path – init model from this directory
model_config – config string for model to init, in json format
- get_output_type()[source]¶
In this function the user should return a type dict, indicating what type each output of the predictor should be converted to:
* type json: data will be serialized to a json str.
* type image: data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} will be empty.
* type video: data will be converted to an encoded video binary and written to an oss file.
- Example:
return {‘image’: ‘image’, ‘feature’: ‘json’}
indicating that the image data in the output dict will be saved to an image file and the feature in the output dict will be converted to json.
- abstract predict(input_data_dict_list, batch_size)[source]¶
Run the session to predict a number of samples, using the given batch_size.
- Parameters
input_data_dict_list – a list of dict, each dict is a sample data to be predicted
batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime
- Returns
- a list of dict, each dict is the prediction result of one sample
eg, {“output1”: value1, “output2”: value2}, the value type can be python int str float, and numpy array
- Return type
result
easycv.predictors.pose_predictor module¶
- easycv.predictors.pose_predictor.vis_pose_result(model, img, result, radius=4, thickness=1, kpt_score_thr=0.3, bbox_color='green', dataset_info=None, out_file=None, pose_kpt_color=None, pose_link_color=None, text_color='white', font_scale=0.5, bbox_thickness=1, win_name='', show=False, wait_time=0)[source]¶
Visualize the detection results on the image.
- Parameters
model (nn.Module) – The loaded detector.
img (str | np.ndarray) – Image filename or loaded image.
result (list[dict]) – The results to draw over img (bbox_result, pose_result).
radius (int) – Radius of circles.
thickness (int) – Thickness of lines.
kpt_score_thr (float) – The threshold to visualize the keypoints.
skeleton (list[tuple()]) – Default None.
out_file (str or None) – The filename of the output visualization image.
show (bool) – Whether to show the image. Default: False.
wait_time (int) – Value of waitKey param. Default: 0.
out_file – The filename to write the image. Default: None.
- class easycv.predictors.pose_predictor.PoseTopDownInputProcessor(cfg, dataset_info, detection_predictor_config, bbox_thr=None, pipelines=None, batch_size=1, cat_id=None, mode='BGR')[source]¶
- class easycv.predictors.pose_predictor.PoseTopDownPredictor(model_path, config_file=None, detection_predictor_config=None, batch_size=1, bbox_thr=None, cat_id=None, device=None, pipelines=None, save_results=False, save_path=None, mode='BGR', model_type=None, *args, **kwargs)[source]¶
Bases:
easycv.predictors.base.PredictorV2
Pose topdown predictor.
- Parameters
model_path (str) – Path of model path.
config_file (Optional[str]) – Config file path for model and processor to init. Defaults to None.
detection_predictor_config (dict) – Dict of person detection model predictor config, e.g. dict(type="", model_path="", config_file="", ......)
batch_size (int) – Batch size for forward.
bbox_thr (float) – Bounding box threshold to filter output results of detection model
cat_id (int | str) – Category id or name to filter target objects.
device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device, if is None, detect device automatically.
save_results (bool) – Whether to save predict results.
save_path (str) – File path for saving results, only valid when save_results is True.
pipelines (list[dict]) – Data pipeline configs.
mode (str) – The image mode into the model.
- __init__(model_path, config_file=None, detection_predictor_config=None, batch_size=1, bbox_thr=None, cat_id=None, device=None, pipelines=None, save_results=False, save_path=None, mode='BGR', model_type=None, *args, **kwargs)[source]¶
Initialize self. See help(type(self)) for accurate signature.
- prepare_model()[source]¶
Build model from config file by default. If the model is not loaded from a configuration file, e.g. torch jit model, you need to reimplement it.
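A hedged usage sketch; the detection predictor config, file paths and category id are placeholders:

from easycv.predictors.pose_predictor import PoseTopDownPredictor

pose_predictor = PoseTopDownPredictor(
    model_path='work_dirs/pose/epoch_210_export.pt',        # placeholder model file
    detection_predictor_config=dict(
        type='DetectionPredictor',
        model_path='work_dirs/det/epoch_300_export.pt',
        score_threshold=0.5),
    bbox_thr=0.3,
    cat_id=0,                                               # placeholder id of the target class
)
results = pose_predictor(['demo/people.jpg'])   # keypoints for every kept person box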
easycv.core package¶
Subpackages¶
easycv.core.evaluation package¶
Subpackages¶
easycv.core.evaluation.custom_cocotools package¶
- class easycv.core.evaluation.custom_cocotools.cocoeval.COCOeval(cocoGt=None, cocoDt=None, iouType='segm', sigmas=None)[source]¶
Bases:
object
- __init__(cocoGt=None, cocoDt=None, iouType='segm', sigmas=None)[source]¶
Initialize CocoEval using coco APIs for gt and dt.
- Parameters
cocoGt – coco object with ground truth annotations
cocoDt – coco object with detection results
iouType – type of iou to be computed; bbox for detection task, segm for segmentation task
sigmas – keypoint labelling sigmas.
- Returns
None
- evaluate()[source]¶
Run per image evaluation on given images and store results (a list of dict) in self.evalImgs.
- Returns
None
- evaluateImg(imgId, catId, aRng, maxDet)[source]¶
Perform evaluation for a single category and image.
- Parameters
imgId – image id, string
catId – category id, string
aRng – area range, tuple
maxDet – maximum detection number
- Returns
dict (single image results)
- accumulate(p=None)[source]¶
Accumulate per image evaluation results and store the result in self.eval.
- Parameters
p – input params for evaluation
- Returns
None
- summarize()[source]¶
Compute and display summary metrics for evaluation results. Note this function can only be applied with the default parameter setting.
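The evaluation flow with this class mirrors pycocotools; a short sketch assuming cocoGt and cocoDt COCO objects are already built:

from easycv.core.evaluation.custom_cocotools.cocoeval import COCOeval

coco_eval = COCOeval(cocoGt=coco_gt, cocoDt=coco_dt, iouType='bbox')
coco_eval.evaluate()     # per-image, per-category evaluation
coco_eval.accumulate()   # aggregate into precision/recall arrays
coco_eval.summarize()    # print the standard COCO summary metrics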
Submodules¶
easycv.core.evaluation.ap module¶
easycv.core.evaluation.auc_eval module¶
- class easycv.core.evaluation.auc_eval.AucEvaluator(dataset_name=None, metric_names=['neck_auc'], neck_num=None)[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
AUC evaluator for binary classification only.
easycv.core.evaluation.base_evaluator module¶
- class easycv.core.evaluation.base_evaluator.Evaluator(dataset_name=None, metric_names=[])[source]¶
Bases:
object
Evaluator interface
- __init__(dataset_name=None, metric_names=[])[source]¶
Construct eval ops from tensor
- Parameters
dataset_name (str) – dataset name to be evaluated
metric_names (List[str]) – metric names this evaluator will return
- property metric_names¶
easycv.core.evaluation.builder module¶
easycv.core.evaluation.classification_eval module¶
- class easycv.core.evaluation.classification_eval.ClsEvaluator(topk=(1, 5), dataset_name=None, metric_names=['neck_top1'], neck_num=None, class_list=None)[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
Classification evaluator.
- __init__(topk=(1, 5), dataset_name=None, metric_names=['neck_top1'], neck_num=None, class_list=None)[source]¶
- Parameters
top_k (int, tuple) – int or tuple of int, evaluate top_k accuracy
dataset_name – eval dataset name
metric_names – eval metrics name
neck_num – some models contain multiple necks to support multitask; neck_num means using the output of the neck_num-th neck of the model for evaluation
- class easycv.core.evaluation.classification_eval.MultiLabelEvaluator(dataset_name=None, metric_names=['mAP'])[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
Multilabel Classification evaluator.
- __init__(dataset_name=None, metric_names=['mAP'])[source]¶
- Parameters
dataset_name – eval dataset name
metric_names – eval metrics name
- mAP(pred, target)[source]¶
Calculate the mean average precision with respect to classes.
- Parameters
pred – The model prediction with shape (N, C), where C is the number of classes.
target (torch.Tensor | np.ndarray) – The target of each prediction with shape (N, C), where C is the number of classes. 1 stands for positive examples, 0 stands for negative examples and -1 stands for difficult examples.
- Returns
A single float as mAP value.
- Return type
float
- average_precision(pred, target)[source]¶
Calculate the average precision for a single class. AP summarizes a precision-recall curve as the weighted mean of maximum precisions obtained for any r’ > r, where r is the recall:
\text{AP} = \sum_n (R_n - R_{n-1}) P_n
Note that no approximation is involved since the curve is piecewise constant.
- Parameters
pred (np.ndarray) – The model prediction with shape (N, ).
target (np.ndarray) – The target of each prediction with shape (N, ).
- Returns
a single float as average precision value.
- Return type
float
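An independent numpy sketch of the formula above (positives labelled 1, negatives 0; difficult examples are ignored here although the evaluator also supports them):

import numpy as np

def average_precision_sketch(pred, target):
    # AP = sum_n (R_n - R_{n-1}) * P_n for one class; pred: scores (N,), target: {0, 1} labels (N,)
    order = np.argsort(-pred)                        # rank predictions by descending score
    target = target[order]
    tp = np.cumsum(target == 1)
    precision = tp / np.arange(1, len(target) + 1)   # precision after each prediction
    recall = tp / max(int((target == 1).sum()), 1)   # recall after each prediction
    recall_steps = np.diff(np.concatenate(([0.0], recall)))
    return float(np.sum(precision * recall_steps))

# average_precision_sketch(np.array([0.9, 0.4, 0.7]), np.array([1, 0, 1])) -> 1.0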
easycv.core.evaluation.coco_evaluation module¶
Class for evaluating object detections with COCO metrics.
- class easycv.core.evaluation.coco_evaluation.CocoDetectionEvaluator(classes, include_metrics_per_category=False, all_metrics_per_category=False, coco_analyze=False, dataset_name=None, metric_names=['DetectionBoxes_Precision/mAP'])[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
Class to evaluate COCO detection metrics.
- __init__(classes, include_metrics_per_category=False, all_metrics_per_category=False, coco_analyze=False, dataset_name=None, metric_names=['DetectionBoxes_Precision/mAP'])[source]¶
Constructor.
- Parameters
classes – a list of class name
include_metrics_per_category – If True, include metrics for each category.
all_metrics_per_category – Whether to include all the summary metrics for each category in per_category_ap. Be careful with setting it to true if you have more than a handful of categories, because it will pollute your mldash.
coco_analyze – If True, will analyze the detection result using coco analysis.
dataset_name – If not None, dataset_name will be inserted to each metric name.
- add_single_ground_truth_image_info(image_id, groundtruth_dict)[source]¶
Adds groundtruth for a single image to be used for evaluation.
If the image has already been added, a warning is logged, and groundtruth is ignored.
- Parameters
image_id – A unique string/integer identifier for the image.
groundtruth_dict –
A dictionary containing
- InputDataFields.groundtruth_boxes
float32 numpy array of shape [num_boxes, 4] containing num_boxes groundtruth boxes of the format [ymin, xmin, ymax, xmax] in absolute image coordinates.
- InputDataFields.groundtruth_classes
integer numpy array of shape [num_boxes] containing 1-indexed groundtruth classes for the boxes. InputDataFields.groundtruth_is_crowd (optional): integer numpy array of shape [num_boxes] containing iscrowd flag for groundtruth boxes.
- add_single_detected_image_info(image_id, detections_dict)[source]¶
Adds detections for a single image to be used for evaluation.
If a detection has already been added for this image id, a warning is logged, and the detection is skipped.
- Parameters
image_id – A unique string/integer identifier for the image.
detections_dict –
A dictionary containing
- DetectionResultFields.detection_boxes
float32 numpy array of shape [num_boxes, 4] containing num_boxes detection boxes of the format [ymin, xmin, ymax, xmax] in absolute image coordinates.
- DetectionResultFields.detection_scores
float32 numpy array of shape [num_boxes] containing detection scores for the boxes.
- DetectionResultFields.detection_classes
integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.
- Raises
ValueError – If groundtruth for the image_id is not available.
- class easycv.core.evaluation.coco_evaluation.CocoMaskEvaluator(classes, include_metrics_per_category=False, dataset_name=None, metric_names=['DetectionMasks_Precision/mAP'])[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
Class to evaluate COCO detection metrics.
- __init__(classes, include_metrics_per_category=False, dataset_name=None, metric_names=['DetectionMasks_Precision/mAP'])[source]¶
Constructor.
- Parameters
categories – A list of dicts, each of which has the following keys: id: (required) an integer id uniquely identifying this category; name: (required) string representing the category name, e.g. ‘cat’, ‘dog’.
include_metrics_per_category – If True, include metrics for each category.
- add_single_ground_truth_image_info(image_id, groundtruth_dict)[source]¶
Adds groundtruth for a single image to be used for evaluation.
If the image has already been added, a warning is logged, and groundtruth is ignored.
- Parameters
image_id – A unique string/integer identifier for the image.
groundtruth_dict –
A dictionary containing :InputDataFields.groundtruth_boxes: float32 numpy array of shape
[num_boxes, 4] containing num_boxes groundtruth boxes of the format [ymin, xmin, ymax, xmax] in absolute image coordinates.
- InputDataFields.groundtruth_classes
integer numpy array of shape [num_boxes] containing 1-indexed groundtruth classes for the boxes.
- InputDataFields.groundtruth_instance_masks
uint8 numpy array of shape [num_boxes, image_height, image_width] containing groundtruth masks corresponding to the boxes. The elements of the array must be in {0, 1}.
- add_single_detected_image_info(image_id, detections_dict)[source]¶
Adds detections for a single image to be used for evaluation.
If a detection has already been added for this image id, a warning is logged, and the detection is skipped.
- Parameters
image_id – A unique string/integer identifier for the image.
detections_dict –
A dictionary containing
- DetectionResultFields.detection_scores
float32 numpy array of shape [num_boxes] containing detection scores for the boxes.
- DetectionResultFields.detection_classes
integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.
- DetectionResultFields.detection_masks
optional uint8 numpy array of shape [num_boxes, image_height, image_width] containing instance masks corresponding to the boxes. The elements of the array must be in {0, 1}.
- Raises
ValueError – If groundtruth for the image_id is not available or if spatial shapes of groundtruth_instance_masks and detection_masks are incompatible.
- class easycv.core.evaluation.coco_evaluation.CoCoPoseTopDownEvaluator(dataset_name=None, metric_names=['AP'], **kwargs)[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
Class to evaluate COCO keypoint topdown metrics.
- class easycv.core.evaluation.coco_evaluation.CocoPanopticEvaluator(dataset_name=None, metric_names=['PQ'], classes=None, file_client_args={'backend': 'disk'}, **kwargs)[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
Class to evaluate COCO panoptic metrics.
- __init__(dataset_name=None, metric_names=['PQ'], classes=None, file_client_args={'backend': 'disk'}, **kwargs)[source]¶
Construct eval ops from tensor
- Parameters
dataset_name (str) – dataset name to be evaluated
metric_names (List[str]) – metric names this evaluator will return
- easycv.core.evaluation.coco_evaluation.pq_compute_single_core(proc_id, annotation_set, gt_folder, pred_folder, categories, file_client=None, print_log=False)[source]¶
The single core function to evaluate the metric of Panoptic Segmentation.
Same as the function with the same name in panopticapi. Only the function to load the images is changed to use the file client.
- Parameters
proc_id (int) – The id of the mini process.
gt_folder (str) – The path of the ground truth images.
pred_folder (str) – The path of the prediction images.
categories (str) – The categories of the dataset.
file_client (object) – The file client of the dataset. If None, the backend will be set to disk.
print_log (bool) – Whether to print the log. Defaults to False.
- easycv.core.evaluation.coco_evaluation.pq_compute_multi_core(matched_annotations_list, gt_folder, pred_folder, categories, file_client=None, nproc=32)[source]¶
Evaluate the metrics of Panoptic Segmentation with multithreading.
Same as the function with the same name in panopticapi.
- Parameters
matched_annotations_list (list) – The matched annotation list. Each element is a tuple of annotations of the same image with the format (gt_anns, pred_anns).
gt_folder (str) – The path of the ground truth images.
pred_folder (str) – The path of the prediction images.
categories (str) – The categories of the dataset.
file_client (object) – The file client of the dataset. If None, the backend will be set to disk.
nproc (int) – Number of processes for panoptic quality computing. Defaults to 32. When nproc exceeds the number of cpu cores, the number of cpu cores is used.
easycv.core.evaluation.coco_tools module¶
Wrappers for third party pycocotools to be used within object_detection.
Note that nothing in this file is tensorflow related and thus cannot be called directly as a slim metric, for example.
TODO(jonathanhuang): wrap as a slim metric in metrics.py
Usage example: given a set of images with ids in the list image_ids and corresponding lists of numpy arrays encoding groundtruth (boxes and classes) and detections (boxes, scores and classes), where elements of each list correspond to detections/annotations of a single image, then evaluation (in multi-class mode) can be invoked as follows:
groundtruth_dict = coco_tools.ExportGroundtruthToCOCO(
    image_ids, groundtruth_boxes_list, groundtruth_classes_list,
    max_num_classes, output_path=None)
detections_list = coco_tools.ExportDetectionsToCOCO(
    image_ids, detection_boxes_list, detection_scores_list,
    detection_classes_list, output_path=None)
groundtruth = coco_tools.COCOWrapper(groundtruth_dict)
detections = groundtruth.LoadAnnotations(detections_list)
evaluator = coco_tools.COCOEvalWrapper(groundtruth, detections, agnostic_mode=False)
metrics = evaluator.ComputeMetrics()
- class easycv.core.evaluation.coco_tools.COCOWrapper(dataset, detection_type='bbox')[source]¶
Bases:
xtcocotools.coco.COCO
Wrapper for the pycocotools COCO class.
- __init__(dataset, detection_type='bbox')[source]¶
COCOWrapper constructor.
See http://mscoco.org/dataset/#format for a description of the format. By default, the coco.COCO class constructor reads from a JSON file. This function duplicates the same behavior but loads from a dictionary, allowing us to perform evaluation without writing to external storage.
- Parameters
dataset – a dictionary holding bounding box annotations in the COCO format.
detection_type – type of detections being wrapped. Can be one of [‘bbox’, ‘segmentation’]
- Raises
ValueError – if detection_type is unsupported.
- LoadAnnotations(annotations)[source]¶
Load annotations dictionary into COCO datastructure.
See http://mscoco.org/dataset/#format for a description of the annotations format. As above, this function replicates the default behavior of the API but does not require writing to external storage.
- Parameters
annotations – python list holding object detection results where each detection is encoded as a dict with required keys [‘image_id’, ‘category_id’, ‘score’] and one of [‘bbox’, ‘segmentation’] based on detection_type.
- Returns
a coco.COCO datastructure holding object detection annotations results
- Raises
ValueError – if annotations is not a list
ValueError – if annotations do not correspond to the images contained in self.
- class easycv.core.evaluation.coco_tools.COCOEvalWrapper(groundtruth=None, detections=None, agnostic_mode=False, iou_type='bbox')[source]¶
Bases:
easycv.core.evaluation.custom_cocotools.cocoeval.COCOeval
Wrapper for the pycocotools COCOeval class.
To evaluate, create two objects (groundtruth_dict and detections_list) using the conventions listed at http://mscoco.org/dataset/#format. Then call evaluation as follows:
groundtruth = coco_tools.COCOWrapper(groundtruth_dict)
detections = groundtruth.LoadAnnotations(detections_list)
evaluator = coco_tools.COCOEvalWrapper(groundtruth, detections, agnostic_mode=False)
metrics = evaluator.ComputeMetrics()
- __init__(groundtruth=None, detections=None, agnostic_mode=False, iou_type='bbox')[source]¶
COCOEvalWrapper constructor.
Note that for the area-based metrics to be meaningful, detection and groundtruth boxes must be in image coordinates measured in pixels.
- Parameters
groundtruth – a coco.COCO (or coco_tools.COCOWrapper) object holding groundtruth annotations
detections – a coco.COCO (or coco_tools.COCOWrapper) object holding detections
agnostic_mode – boolean (default: False). If True, evaluation ignores class labels, treating all detections as proposals.
iou_type – IOU type to use for evaluation. Supports bbox or segm.
- GetCategory(category_id)[source]¶
Fetches dictionary holding category information given category id.
- Parameters
category_id – integer id
- Returns
dictionary holding ‘id’, ‘name’.
- ComputeMetrics(include_metrics_per_category=False, all_metrics_per_category=False)[source]¶
Computes detection metrics.
- Parameters
include_metrics_per_category – If True, will include metrics per category.
all_metrics_per_category – If true, include all the summary metrics for each category in per_category_ap. Be careful with setting it to true if you have more than a handful of categories, because it will pollute your mldash.
- Returns
- a dictionary holding:
‘Precision/mAP’: mean average precision over classes averaged over IOU thresholds ranging from .5 to .95 with .05 increments
‘Precision/mAP@.50IOU’: mean average precision at 50% IOU
‘Precision/mAP@.75IOU’: mean average precision at 75% IOU
‘Precision/mAP (small)’: mean average precision for small objects (area < 32^2 pixels)
‘Precision/mAP (medium)’: mean average precision for medium sized objects (32^2 pixels < area < 96^2 pixels)
‘Precision/mAP (large)’: mean average precision for large objects (96^2 pixels < area < 10000^2 pixels)
‘Recall/AR@1’: average recall with 1 detection
‘Recall/AR@10’: average recall with 10 detections
‘Recall/AR@100’: average recall with 100 detections
‘Recall/AR@100 (small)’: average recall for small objects with 100 detections
‘Recall/AR@100 (medium)’: average recall for medium objects with 100 detections
‘Recall/AR@100 (large)’: average recall for large objects with 100 detections
per_category_ap: a dictionary holding category-specific results with keys of the form ‘Precision mAP ByCategory/category’ (without the supercategory part if no supercategories exist). For backward compatibility, ‘PerformanceByCategory’ is included in the output regardless of all_metrics_per_category. If evaluating in class-agnostic mode, per_category_ap is an empty dictionary.
- Return type
summary_metrics
- Raises
ValueError – If category_stats does not exist.
- Analyze()[source]¶
Analyze detection results.
Args:
- Returns
- A dictionary containing images of analyzing result images,
key is the image name, value is a [H,W,3] numpy array which represent the image content. You can refer to http://cocodataset.org/#detection-eval section 4 Analysis code.
- easycv.core.evaluation.coco_tools.ExportSingleImageGroundtruthToCoco(image_id, next_annotation_id, category_id_set, groundtruth_boxes, groundtruth_classes, groundtruth_masks=None, groundtruth_is_crowd=None, super_categories=None)[source]¶
Export groundtruth of a single image to COCO format.
This function converts groundtruth detection annotations represented as numpy arrays to dictionaries that can be ingested by the COCO evaluation API. Note that the image_ids provided here must match the ones given to ExportSingleImageDetectionsToCoco. We assume that boxes and classes are in correspondence - that is: groundtruth_boxes[i, :], and groundtruth_classes[i] are associated with the same groundtruth annotation.
In the exported result, “area” fields are always set to the area of the groundtruth bounding box.
- Parameters
image_id – a unique image identifier either of type integer or string.
next_annotation_id – integer specifying the first id to use for the groundtruth annotations. All annotations are assigned a continuous integer id starting from this value.
category_id_set – A set of valid class ids. Groundtruth with classes not in category_id_set are dropped.
groundtruth_boxes – numpy array (float32) with shape [num_gt_boxes, 4]
groundtruth_classes – numpy array (int) with shape [num_gt_boxes]
groundtruth_masks – optional uint8 numpy array of shape [num_detections, image_height, image_width] containing detection_masks.
groundtruth_is_crowd – optional numpy array (int) with shape [num_gt_boxes] indicating whether groundtruth boxes are crowd.
super_categories – optional list of str indicating each box super category
- Returns
a list of groundtruth annotations for a single image in the COCO format.
- Raises
ValueError – if (1) groundtruth_boxes and groundtruth_classes do not have the right lengths or (2) if each of the elements inside these lists do not have the correct shapes or (3) if image_ids are not integers
- easycv.core.evaluation.coco_tools.ExportGroundtruthToCOCO(image_ids, groundtruth_boxes, groundtruth_classes, categories, output_path=None)[source]¶
Export groundtruth detection annotations in numpy arrays to COCO API.
This function converts a set of groundtruth detection annotations represented as numpy arrays to dictionaries that can be ingested by the COCO API. Inputs to this function are three lists: image ids for each groundtruth image, groundtruth boxes for each image and groundtruth classes respectively. Note that the image_ids provided here must match the ones given to the ExportDetectionsToCOCO function in order for evaluation to work properly. We assume that for each image, boxes, scores and classes are in correspondence — that is: image_id[i], groundtruth_boxes[i, :] and groundtruth_classes[i] are associated with the same groundtruth annotation.
In the exported result, “area” fields are always set to the area of the groundtruth bounding box and “iscrowd” fields are always set to 0. TODO(jonathanhuang): pass in “iscrowd” array for evaluating on COCO dataset.
- Parameters
image_ids – a list of unique image identifier either of type integer or string.
groundtruth_boxes – list of numpy arrays with shape [num_gt_boxes, 4] (note that num_gt_boxes can be different for each entry in the list)
groundtruth_classes – list of numpy arrays (int) with shape [num_gt_boxes] (note that num_gt_boxes can be different for each entry in the list)
categories –
a list of dictionaries representing all possible categories. Each dict in this list has the following keys:
’id’: (required) an integer id uniquely identifying this category ‘name’: (required) string representing category name
e.g., ‘cat’, ‘dog’, ‘pizza’
- ’supercategory’: (optional) string representing the supercategory
e.g., ‘animal’, ‘vehicle’, ‘food’, etc
output_path – (optional) path for exporting result to JSON
- Returns
dictionary that can be read by COCO API
- Raises
ValueError – if (1) groundtruth_boxes and groundtruth_classes do not have the right lengths or (2) if each of the elements inside these lists do not have the correct shapes or (3) if image_ids are not integers
- easycv.core.evaluation.coco_tools.ExportSingleImageDetectionBoxesToCoco(image_id, category_id_set, detection_boxes, detection_scores, detection_classes)[source]¶
Export detections of a single image to COCO format.
This function converts detections represented as numpy arrays to dictionaries that can be ingested by the COCO evaluation API. Note that the image_ids provided here must match the ones given to ExportSingleImageGroundtruthToCoco. We assume that boxes and classes are in correspondence - that is: boxes[i, :] and classes[i] are associated with the same groundtruth annotation.
- Parameters
image_id – unique image identifier either of type integer or string.
category_id_set – A set of valid class ids. Detections with classes not in category_id_set are dropped.
detection_boxes – float numpy array of shape [num_detections, 4] containing detection boxes.
detection_scores – float numpy array of shape [num_detections] containing scores for the detection boxes.
detection_classes – integer numpy array of shape [num_detections] containing the classes for detection boxes.
- Returns
a list of detection annotations for a single image in the COCO format.
- Raises
ValueError – if (1) detection_boxes, detection_scores and detection_classes do not have the right lengths or (2) if each of the elements inside these lists do not have the correct shapes or (3) if image_ids are not integers.
- easycv.core.evaluation.coco_tools.ExportSingleImageDetectionMasksToCoco(image_id, category_id_set, detection_masks, detection_scores, detection_classes)[source]¶
Export detection masks of a single image to COCO format.
This function converts detections represented as numpy arrays to dictionaries that can be ingested by the COCO evaluation API. We assume that detection_masks, detection_scores, and detection_classes are in correspondence - that is: detection_masks[i, :], detection_classes[i] and detection_scores[i]
are associated with the same annotation.
- Parameters
image_id – unique image identifier either of type integer or string.
category_id_set – A set of valid class ids. Detections with classes not in category_id_set are dropped.
detection_masks – uint8 numpy array of shape [num_detections, image_height, image_width] containing detection_masks.
detection_scores – float numpy array of shape [num_detections] containing scores for detection masks.
detection_classes – integer numpy array of shape [num_detections] containing the classes for detection masks.
- Returns
a list of detection mask annotations for a single image in the COCO format.
- Raises
ValueError – if (1) detection_masks, detection_scores and detection_classes do not have the right lengths or (2) if any of the elements inside these lists does not have the correct shape or (3) if image_ids are not integers.
- easycv.core.evaluation.coco_tools.ExportDetectionsToCOCO(image_ids, detection_boxes, detection_scores, detection_classes, categories, output_path=None)[source]¶
Export detection annotations in numpy arrays to COCO API.
This function converts a set of predicted detections represented as numpy arrays to dictionaries that can be ingested by the COCO API. Inputs to this function are lists, consisting of boxes, scores and classes, respectively, corresponding to each image for which detections have been produced. Note that the image_ids provided here must match the ones given to the ExportGroundtruthToCOCO function in order for evaluation to work properly.
We assume that for each image, boxes, scores and classes are in correspondence — that is: detection_boxes[i, :], detection_scores[i] and detection_classes[i] are associated with the same detection.
- Parameters
image_ids – a list of unique image identifier either of type integer or string.
detection_boxes – list of numpy arrays with shape [num_detection_boxes, 4]
detection_scores – list of numpy arrays (float) with shape [num_detection_boxes]. Note that num_detection_boxes can be different for each entry in the list.
detection_classes – list of numpy arrays (int) with shape [num_detection_boxes]. Note that num_detection_boxes can be different for each entry in the list.
categories – a list of dictionaries representing all possible categories. Each dict in this list must have an integer ‘id’ key uniquely identifying this category.
output_path – (optional) path for exporting result to JSON
- Returns
list of dictionaries that can be read by COCO API, where each entry corresponds to a single detection and has keys from: [‘image_id’, ‘category_id’, ‘bbox’, ‘score’].
- Raises
ValueError – if (1) detection_boxes and detection_classes do not have the right lengths or (2) if any of the elements inside these lists does not have the correct shape or (3) if image_ids are not integers.
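Example (a minimal sketch exporting predictions for the same image ids used in the groundtruth export above; the boxes, scores and classes are illustrative values only):
>>> import numpy as np
>>> from easycv.core.evaluation.coco_tools import ExportDetectionsToCOCO
>>> categories = [{'id': 1, 'name': 'cat'}, {'id': 2, 'name': 'dog'}]
>>> image_ids = ['img_0001', 'img_0002']
>>> detection_boxes = [np.array([[12., 11., 58., 48.]]), np.array([[1., 2., 22., 19.]])]
>>> detection_scores = [np.array([0.9]), np.array([0.75])]
>>> detection_classes = [np.array([1]), np.array([1])]
>>> det_list = ExportDetectionsToCOCO(image_ids, detection_boxes, detection_scores, detection_classes, categories)
>>> # each entry is a dict with keys from ['image_id', 'category_id', 'bbox', 'score']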
- easycv.core.evaluation.coco_tools.ExportSegmentsToCOCO(image_ids, detection_masks, detection_scores, detection_classes, categories, output_path=None)[source]¶
Export segmentation masks in numpy arrays to COCO API.
This function converts a set of predicted instance masks represented as numpy arrays to dictionaries that can be ingested by the COCO API. Inputs to this function are lists, consisting of segments, scores and classes, respectively, corresponding to each image for which detections have been produced.
Note that this function is recommended for small datasets. For large datasets it should be combined with a merge function (e.g. in a map-reduce pipeline), otherwise memory consumption becomes large.
We assume that for each image, masks, scores and classes are in correspondence — that is: detection_masks[i, :, :, :], detection_scores[i] and detection_classes[i] are associated with the same detection.
- Parameters
image_ids – list of image ids (typically ints or strings)
detection_masks – list of numpy arrays with shape [num_detection, h, w, 1] and type uint8. The height and width should match the shape of corresponding image.
detection_scores – list of numpy arrays (float) with shape [num_detection]. Note that num_detection can be different for each entry in the list.
detection_classes – list of numpy arrays (int) with shape [num_detection]. Note that num_detection can be different for each entry in the list.
categories – a list of dictionaries representing all possible categories. Each dict in this list must have an integer ‘id’ key uniquely identifying this category.
output_path – (optional) path for exporting result to JSON
- Returns
list of dictionaries that can be read by COCO API, where each entry corresponds to a single detection and has keys from: [‘image_id’, ‘category_id’, ‘segmentation’, ‘score’].
- Raises
ValueError – if detection_masks and detection_classes do not have the right lengths or if any of the elements inside these lists does not have the correct shape.
- easycv.core.evaluation.coco_tools.ExportKeypointsToCOCO(image_ids, detection_keypoints, detection_scores, detection_classes, categories, output_path=None)[source]¶
Exports keypoints in numpy arrays to COCO API.
This function converts a set of predicted keypoints represented as numpy arrays to dictionaries that can be ingested by the COCO API. Inputs to this function are lists, consisting of keypoints, scores and classes, respectively, corresponding to each image for which detections have been produced.
We assume that for each image, keypoints, scores and classes are in correspondence — that is: detection_keypoints[i, :, :, :], detection_scores[i] and detection_classes[i] are associated with the same detection.
- Parameters
image_ids – list of image ids (typically ints or strings)
detection_keypoints – list of numpy arrays with shape [num_detection, num_keypoints, 2] and type float32 in absolute x-y coordinates.
detection_scores – list of numpy arrays (float) with shape [num_detection]. Note that num_detection can be different for each entry in the list.
detection_classes – list of numpy arrays (int) with shape [num_detection]. Note that num_detection can be different for each entry in the list.
categories – a list of dictionaries representing all possible categories. Each dict in this list must have an integer ‘id’ key uniquely identifying this category and an integer ‘num_keypoints’ key specifying the number of keypoints the category has.
output_path – (optional) path for exporting result to JSON
- Returns
list of dictionaries that can be read by COCO API, where each entry corresponds to a single detection and has keys from: [‘image_id’, ‘category_id’, ‘keypoints’, ‘score’].
- Raises
ValueError – if detection_keypoints and detection_classes do not have the right lengths or if any of the elements inside these lists does not have the correct shape.
easycv.core.evaluation.faceid_pair_eval module¶
- easycv.core.evaluation.faceid_pair_eval.calculate_roc(thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10, pca=0)[source]¶
- easycv.core.evaluation.faceid_pair_eval.calculate_val(thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10)[source]¶
- easycv.core.evaluation.faceid_pair_eval.faceid_evaluate(embeddings, actual_issame, nrof_folds=10, pca=0)[source]¶
Do a KFold (nrof_folds) faceid pair-match test for the embeddings.
- Parameters
embeddings – [N x C] input embeddings of the whole dataset
actual_issame – [N/2, 1] labels indicating whether each pair matches
nrof_folds – KFold number
pca – if > 0, do PCA and transform the embeddings to [N, pca] features
- Returns
KFold average best accuracy and best threshold
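Example (a minimal sketch on random embeddings; shapes follow the docstring, and the result is assumed to unpack as (accuracy, threshold) per the return description above):
>>> import numpy as np
>>> from easycv.core.evaluation.faceid_pair_eval import faceid_evaluate
>>> embeddings = np.random.randn(100, 128).astype(np.float32)   # [N, C], N = 2 * num_pairs
>>> actual_issame = np.random.rand(50) > 0.5                     # [N/2] pair labels
>>> acc, threshold = faceid_evaluate(embeddings, actual_issame, nrof_folds=10, pca=0)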
- class easycv.core.evaluation.faceid_pair_eval.FaceIDPairEvaluator(dataset_name=None, metric_names=['acc'], kfold=10, pca=0)[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
FaceIDPairEvaluator evaluator. Takes N x 2 embedding pairs and their match labels, performs a k-fold threshold search and returns the average best accuracy.
- __init__(dataset_name=None, metric_names=['acc'], kfold=10, pca=0)[source]¶
Faceid small-dataset evaluator that performs pair-match validation.
- Parameters
dataset_name – name of the faceid validation set, one of [lfw, agedb_30, cfp_ff, cfp_fw, calfw]
kfold – KFold number for the train/val split
pca – PCA dimensions; if > 0, apply PCA to the input features and transform them to [n, pca]
- Returns
None
easycv.core.evaluation.metric_registry module¶
- class easycv.core.evaluation.metric_registry.MetricRegistry[source]¶
Bases:
object
- register_default_best_metric(cls, metric_name, metric_cmp_op='max')[source]¶
Register default best metric for each evaluator
- Parameters
cls (object) – class object
metric_name (str or List[str]) – default best metric name
metric_cmp_op (str or List[str]) – metric compare operation, should be one of [“max”, “min”]
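Example (a hypothetical sketch; MyEvaluator is a stand-in for a real Evaluator subclass, and a fresh registry is constructed here only for illustration, whereas EasyCV normally keeps a shared registry instance):
>>> from easycv.core.evaluation.metric_registry import MetricRegistry
>>> registry = MetricRegistry()
>>> class MyEvaluator: pass
>>> registry.register_default_best_metric(MyEvaluator, metric_name='acc', metric_cmp_op='max')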
easycv.core.evaluation.mse_eval module¶
- class easycv.core.evaluation.mse_eval.MSEEvaluator(dataset_name=None, metric_names=['avg_mse'], neck_num=None)[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
MSEEvaluator evaluator.
easycv.core.evaluation.retrival_topk_eval module¶
- class easycv.core.evaluation.retrival_topk_eval.RetrivalTopKEvaluator(topk=(1, 2, 4, 8), norm=0, metric='cos', pca=0, dataset_name=None, metric_names=['R@K=1'], save_results=False, save_results_dir='', feature_keyword=['neck'])[source]¶
Bases:
easycv.core.evaluation.base_evaluator.Evaluator
RetrivalTopK evaluator. It performs top-K retrieval by measuring the distance between each sample and all others, takes the K nearest neighbours and counts whether the ID matches (hit = 1, miss = 0). The final metric is the average retrieval rate over all queries.
easycv.core.evaluation.top_down_eval module¶
- easycv.core.evaluation.top_down_eval.pose_pck_accuracy(output, target, mask, thr=0.05, normalize=None)[source]¶
Calculate the pose accuracy of PCK for each individual keypoint and the averaged accuracy across all keypoints from heatmaps.
Note
PCK metric measures accuracy of the localization of the body joints. The distances between predicted positions and the ground-truth ones are typically normalized by the bounding box size. The threshold (thr) of the normalized distance is commonly set as 0.05, 0.1 or 0.2 etc.
batch_size: N
num_keypoints: K
heatmap height: H
heatmap width: W
- Parameters
output (np.ndarray[N, K, H, W]) – Model output heatmaps.
target (np.ndarray[N, K, H, W]) – Groundtruth heatmaps.
mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.
thr (float) – Threshold of PCK calculation. Default 0.05.
normalize (np.ndarray[N, 2]) – Normalization factor for H&W.
- Returns
A tuple containing keypoint accuracy.
np.ndarray[K]: Accuracy of each keypoint.
float: Averaged accuracy across all keypoints.
int: Number of valid keypoints.
- Return type
tuple
- easycv.core.evaluation.top_down_eval.keypoint_pck_accuracy(pred, gt, mask, thr, normalize)[source]¶
Calculate the pose accuracy of PCK for each individual keypoint and the averaged accuracy across all keypoints for coordinates.
Note
PCK metric measures accuracy of the localization of the body joints. The distances between predicted positions and the ground-truth ones are typically normalized by the bounding box size. The threshold (thr) of the normalized distance is commonly set as 0.05, 0.1 or 0.2 etc.
batch_size: N
num_keypoints: K
- Parameters
pred (np.ndarray[N, K, 2]) – Predicted keypoint location.
gt (np.ndarray[N, K, 2]) – Groundtruth keypoint location.
mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.
thr (float) – Threshold of PCK calculation.
normalize (np.ndarray[N, 2]) – Normalization factor for H&W.
- Returns
A tuple containing keypoint accuracy.
acc (np.ndarray[K]): Accuracy of each keypoint.
avg_acc (float): Averaged accuracy across all keypoints.
cnt (int): Number of valid keypoints.
- Return type
tuple
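Example (a minimal sketch on synthetic coordinates; shapes follow the docstring, and the noise level and normalization factor are illustrative):
>>> import numpy as np
>>> from easycv.core.evaluation.top_down_eval import keypoint_pck_accuracy
>>> gt = np.random.rand(4, 17, 2) * 100          # [N, K, 2] groundtruth locations
>>> pred = gt + np.random.randn(4, 17, 2)        # predictions with small localization noise
>>> mask = np.ones((4, 17), dtype=bool)          # all joints visible
>>> normalize = np.full((4, 2), 100.0)           # e.g. bounding-box size per sample
>>> acc, avg_acc, cnt = keypoint_pck_accuracy(pred, gt, mask, thr=0.2, normalize=normalize)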
- easycv.core.evaluation.top_down_eval.keypoint_auc(pred, gt, mask, normalize, num_step=20)[source]¶
Calculate the pose accuracy of PCK for each individual keypoint and the averaged accuracy across all keypoints for coordinates.
Note
batch_size: N
num_keypoints: K
- Parameters
pred (np.ndarray[N, K, 2]) – Predicted keypoint location.
gt (np.ndarray[N, K, 2]) – Groundtruth keypoint location.
mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.
normalize (float) – Normalization factor.
- Returns
Area under curve.
- Return type
float
- easycv.core.evaluation.top_down_eval.keypoint_nme(pred, gt, mask, normalize_factor)[source]¶
Calculate the normalized mean error (NME).
Note
batch_size: N
num_keypoints: K
- Parameters
pred (np.ndarray[N, K, 2]) – Predicted keypoint location.
gt (np.ndarray[N, K, 2]) – Groundtruth keypoint location.
mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.
normalize_factor (np.ndarray[N, 2]) – Normalization factor.
- Returns
normalized mean error
- Return type
float
- easycv.core.evaluation.top_down_eval.keypoint_epe(pred, gt, mask)[source]¶
Calculate the end-point error.
Note
batch_size: N
num_keypoints: K
- Parameters
pred (np.ndarray[N, K, 2]) – Predicted keypoint location.
gt (np.ndarray[N, K, 2]) – Groundtruth keypoint location.
mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.
- Returns
Average end-point error.
- Return type
float
- easycv.core.evaluation.top_down_eval.post_dark_udp(coords, batch_heatmaps, kernel=3)[source]¶
DARK post-processing, implemented with UDP. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020). Zhang et al. Distribution-Aware Coordinate Representation for Human Pose Estimation (CVPR 2020).
Note
batch size: B
num keypoints: K
num persons: N
height of heatmaps: H
width of heatmaps: W
B=1 for bottom_up paradigm where all persons share the same heatmap.
B=N for top_down paradigm where each person has its own heatmaps.
- Parameters
coords (np.ndarray[N, K, 2]) – Initial coordinates of human pose.
batch_heatmaps (np.ndarray[B, K, H, W]) – batch_heatmaps
kernel (int) – Gaussian kernel size (K) for modulation.
- Returns
Refined coordinates.
- Return type
res (np.ndarray[N, K, 2])
- easycv.core.evaluation.top_down_eval.keypoints_from_heatmaps(heatmaps, center, scale, unbiased=False, post_process='default', kernel=11, valid_radius_factor=0.0546875, use_udp=False, target_type='GaussianHeatmap')[source]¶
Get final keypoint predictions from heatmaps and transform them back to the image.
Note
batch size: N
num keypoints: K
heatmap height: H
heatmap width: W
- Parameters
heatmaps (np.ndarray[N, K, H, W], dtype=float32) – model predicted heatmaps.
center (np.ndarray[N, 2]) – Center of the bounding box (x, y).
scale (np.ndarray[N, 2]) – Scale of the bounding box wrt height/width.
post_process (str/None) – Choice of methods to post-process heatmaps. Currently supported: None, ‘default’, ‘unbiased’, ‘megvii’.
unbiased (bool) – Option to use unbiased decoding. Mutually exclusive with megvii. Note: this arg is deprecated and unbiased=True can be replaced by post_process=’unbiased’ Paper ref: Zhang et al. Distribution-Aware Coordinate Representation for Human Pose Estimation (CVPR 2020).
kernel (int) – Gaussian kernel size (K) for modulation, which should match the heatmap gaussian sigma when training. K=17 for sigma=3 and K=11 for sigma=2.
valid_radius_factor (float) – The radius factor of the positive area in classification heatmap for UDP.
use_udp (bool) – Use unbiased data processing.
target_type (str) – ‘GaussianHeatmap’ or ‘CombinedTarget’. GaussianHeatmap: Classification target with gaussian distribution. CombinedTarget: The combination of classification target (response map) and regression target (offset map). Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
- Returns
A tuple containing keypoint predictions and scores.
preds (np.ndarray[N, K, 2]): Predicted keypoint location in images.
maxvals (np.ndarray[N, K, 1]): Scores (confidence) of the keypoints.
- Return type
tuple
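Example (a minimal decoding sketch on random heatmaps; center and scale describe each instance's bounding box in the original image, and all numbers are illustrative):
>>> import numpy as np
>>> from easycv.core.evaluation.top_down_eval import keypoints_from_heatmaps
>>> heatmaps = np.random.rand(2, 17, 64, 48).astype(np.float32)        # [N, K, H, W]
>>> center = np.array([[128., 128.], [200., 150.]], dtype=np.float32)   # (x, y) per instance
>>> scale = np.array([[1.0, 1.5], [0.8, 1.2]], dtype=np.float32)
>>> preds, maxvals = keypoints_from_heatmaps(heatmaps, center, scale, post_process='default')
>>> # preds: [N, K, 2] image-space locations, maxvals: [N, K, 1] confidences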
easycv.core.optimizer package¶
Submodules¶
easycv.core.optimizer.lars module¶
- class easycv.core.optimizer.lars.LARS(params, lr=<required parameter>, momentum=0, dampening=0, weight_decay=0, eta=0.001, nesterov=False)[source]¶
Bases:
torch.optim.optimizer.Optimizer
Implements layer-wise adaptive rate scaling for SGD.
- Parameters
params (iterable) – iterable of parameters to optimize or dicts defining parameter groups
lr (float) – base learning rate (gamma_0)
momentum (float, optional) – momentum factor (default: 0) (“m”)
weight_decay (float, optional) – weight decay (L2 penalty) (default: 0) (“beta”)
dampening (float, optional) – dampening for momentum (default: 0)
eta (float, optional) – LARS coefficient
nesterov (bool, optional) – enables Nesterov momentum (default: False)
Based on Algorithm 1 of the following paper by You, Gitman, and Ginsburg. Large Batch Training of Convolutional Networks:
Example
>>> optimizer = LARS(model.parameters(), lr=0.1, momentum=0.9,
>>>                  weight_decay=1e-4, eta=1e-3)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()
easycv.core.optimizer.ranger module¶
- easycv.core.optimizer.ranger.centralized_gradient(x, use_gc=True, gc_conv_only=False)[source]¶
credit - https://github.com/Yonghongwei/Gradient-Centralization
- class easycv.core.optimizer.ranger.Ranger(params, lr=0.001, alpha=0.5, k=6, N_sma_threshhold=5, betas=(0.95, 0.999), eps=1e-05, weight_decay=0, use_gc=True, gc_conv_only=False, gc_loc=True)[source]¶
Bases:
torch.optim.optimizer.Optimizer
Adam+LookAhead: refer to https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer
- __init__(params, lr=0.001, alpha=0.5, k=6, N_sma_threshhold=5, betas=(0.95, 0.999), eps=1e-05, weight_decay=0, use_gc=True, gc_conv_only=False, gc_loc=True)[source]¶
Initialize self. See help(type(self)) for accurate signature.
- step(closure=None)[source]¶
Performs a single optimization step (parameter update).
- Parameters
closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
Note
Unless otherwise specified, this function should not modify the
.grad
field of the parameters.
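Example (a minimal training-step sketch; the tiny linear model and random batch are placeholders):
>>> import torch
>>> from easycv.core.optimizer.ranger import Ranger
>>> model = torch.nn.Linear(16, 4)
>>> optimizer = Ranger(model.parameters(), lr=1e-3, weight_decay=1e-4)
>>> loss = torch.nn.functional.cross_entropy(model(torch.randn(8, 16)), torch.randint(0, 4, (8,)))
>>> optimizer.zero_grad()
>>> loss.backward()
>>> optimizer.step()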
easycv.core.post_processing package¶
- easycv.core.post_processing.affine_transform(pt, trans_mat)[source]¶
Apply an affine transformation to the points.
- Parameters
pt (np.ndarray) – a 2 dimensional point to be transformed
trans_mat (np.ndarray) – 2x3 matrix of an affine transform
- Returns
Transformed points.
- Return type
np.ndarray
- easycv.core.post_processing.flip_back(output_flipped, flip_pairs, target_type='GaussianHeatmap')[source]¶
Flip the flipped heatmaps back to the original form.
Note
batch_size: N
num_keypoints: K
heatmap height: H
heatmap width: W
- Parameters
output_flipped (np.ndarray[N, K, H, W]) – The output heatmaps obtained from the flipped images.
flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).
target_type (str) – GaussianHeatmap or CombinedTarget
- Returns
heatmaps that flipped back to the original image
- Return type
np.ndarray
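Example (a minimal sketch; the flip_pairs below are illustrative indices of mirrored keypoints, e.g. left/right joints):
>>> import numpy as np
>>> from easycv.core.post_processing import flip_back
>>> output_flipped = np.random.rand(2, 4, 64, 48).astype(np.float32)   # [N, K, H, W]
>>> flip_pairs = [[0, 1], [2, 3]]
>>> restored = flip_back(output_flipped, flip_pairs, target_type='GaussianHeatmap')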
- easycv.core.post_processing.fliplr_joints(joints_3d, joints_3d_visible, img_width, flip_pairs)[source]¶
Flip human joints horizontally.
Note
num_keypoints: K
- Parameters
joints_3d (np.ndarray([K, 3])) – Coordinates of keypoints.
joints_3d_visible (np.ndarray([K, 1])) – Visibility of keypoints.
img_width (int) – Image width.
flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).
- Returns
Flipped human joints.
joints_3d_flipped (np.ndarray([K, 3])): Flipped joints.
joints_3d_visible_flipped (np.ndarray([K, 1])): Joint visibility.
- Return type
tuple
- easycv.core.post_processing.fliplr_regression(regression, flip_pairs, center_mode='static', center_x=0.5, center_index=0)[source]¶
Flip human joints horizontally.
Note
batch_size: N
num_keypoint: K
- Parameters
regression (np.ndarray([..., K, C])) – Coordinates of keypoints, where K is the joint number and C is the dimension. Example shapes are:
- [N, K, C]: a batch of keypoints, where N is the batch size.
- [N, T, K, C]: a batch of pose sequences, where T is the frame number.
flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).
center_mode (str) – The mode to set the center location on the x-axis to flip around. Options are:
- static: use a static x value (see center_x also)
- root: use a root joint (see center_index also)
center_x (float) – Set the x-axis location of the flip center. Only used when center_mode=static.
center_index (int) – Set the index of the root joint, whose x location will be used as the flip center. Only used when center_mode=root.
- Returns
Flipped human joints.
regression_flipped (np.ndarray([…, K, C])): Flipped joints.
- Return type
tuple
- easycv.core.post_processing.get_affine_transform(center, scale, rot, output_size, shift=(0.0, 0.0), inv=False)[source]¶
Get the affine transform matrix, given the center/scale/rot/output_size.
- Parameters
center (np.ndarray[2, ]) – Center of the bounding box (x, y).
scale (np.ndarray[2, ]) – Scale of the bounding box wrt [width, height].
rot (float) – Rotation angle (degree).
output_size (np.ndarray[2, ] | list(2,)) – Size of the destination heatmaps.
shift (0-100%) – Shift translation ratio wrt the width/height. Default (0., 0.).
inv (bool) – Option to inverse the affine transform direction. (inv=False: src->dst or inv=True: dst->src)
- Returns
The transform matrix.
- Return type
np.ndarray
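Example (a minimal sketch chaining get_affine_transform with affine_transform to map one point into the crop; the values are illustrative and scale follows whatever unit convention the library uses for bounding-box scale):
>>> import numpy as np
>>> from easycv.core.post_processing import get_affine_transform, affine_transform
>>> center = np.array([320., 240.], dtype=np.float32)    # bbox center (x, y)
>>> scale = np.array([1.2, 1.6], dtype=np.float32)       # bbox scale wrt [width, height]
>>> trans = get_affine_transform(center, scale, 0., [192, 256])
>>> pt_out = affine_transform(np.array([300., 220.]), trans)   # point in the output plane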
- easycv.core.post_processing.get_warp_matrix(theta, size_input, size_dst, size_target)[source]¶
Calculate the transformation matrix under the constraint of unbiased. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
- Parameters
theta (float) – Rotation angle in degrees.
size_input (np.ndarray) – Size of input image [w, h].
size_dst (np.ndarray) – Size of output image [w, h].
size_target (np.ndarray) – Size of ROI in input plane [w, h].
- Returns
A matrix for transformation.
- Return type
matrix (np.ndarray)
- easycv.core.post_processing.rotate_point(pt, angle_rad)[source]¶
Rotate a point by an angle.
- Parameters
pt (list[float]) – 2 dimensional point to be rotated
angle_rad (float) – rotation angle by radian
- Returns
Rotated point.
- Return type
list[float]
- easycv.core.post_processing.transform_preds(coords, center, scale, output_size, use_udp=False)[source]¶
Get final keypoint predictions from heatmaps and apply scaling and translation to map them back to the image.
Note
num_keypoints: K
- Parameters
coords (np.ndarray[K, ndims]) –
If ndims=2, coords are predicted keypoint locations.
If ndims=4, coords are composed of (x, y, scores, tags).
If ndims=5, coords are composed of (x, y, scores, tags, flipped_tags).
center (np.ndarray[2, ]) – Center of the bounding box (x, y).
scale (np.ndarray[2, ]) – Scale of the bounding box wrt [width, height].
output_size (np.ndarray[2, ] | list(2,)) – Size of the destination heatmaps.
use_udp (bool) – Use unbiased data processing
- Returns
Predicted coordinates in the images.
- Return type
np.ndarray
- easycv.core.post_processing.warp_affine_joints(joints, mat)[source]¶
Apply affine transformation defined by the transform matrix on the joints.
- Parameters
joints (np.ndarray[..., 2]) – Origin coordinate of joints.
mat (np.ndarray[3, 2]) – The affine matrix.
- Returns
Result coordinate of joints.
- Return type
matrix (np.ndarray[…, 2])
- easycv.core.post_processing.oks_nms(kpts_db, thr, sigmas=None, vis_thr=None)[source]¶
OKS NMS implementations.
- Parameters
kpts_db – keypoints.
thr – Retain overlap < thr.
sigmas – standard deviation of keypoint labelling.
vis_thr – threshold of the keypoint visibility.
- Returns
indexes to keep.
- Return type
np.ndarray
- easycv.core.post_processing.soft_oks_nms(kpts_db, thr, max_dets=20, sigmas=None, vis_thr=None)[source]¶
Soft OKS NMS implementations.
- Parameters
kpts_db –
thr – retain oks overlap < thr.
max_dets – max number of detections to keep.
sigmas – Keypoint labelling uncertainty.
- Returns
indexes to keep.
- Return type
np.ndarray
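Example (a hypothetical sketch, assuming each kpts_db entry is a dict with ‘keypoints’ ([K, 3] array of x, y, score), ‘score’ and ‘area’ fields as in common top-down pose codebases; check the implementation for the exact expected fields):
>>> import numpy as np
>>> from easycv.core.post_processing import oks_nms
>>> kpts_db = [{'keypoints': np.random.rand(17, 3), 'score': 0.9, 'area': 5000.0},
...            {'keypoints': np.random.rand(17, 3), 'score': 0.8, 'area': 4800.0}]
>>> keep = oks_nms(kpts_db, thr=0.9)   # indices of detections to keep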
Submodules¶
easycv.core.post_processing.nms module¶
- easycv.core.post_processing.nms.oks_iou(g, d, a_g, a_d, sigmas=None, vis_thr=None)[source]¶
Calculate oks ious.
- Parameters
g – Ground truth keypoints.
d – Detected keypoints.
a_g – Area of the ground truth object.
a_d – Area of the detected object.
sigmas – standard deviation of keypoint labelling.
vis_thr – threshold of the keypoint visibility.
- Returns
The oks ious.
- Return type
list
- easycv.core.post_processing.nms.oks_nms(kpts_db, thr, sigmas=None, vis_thr=None)[source]¶
OKS NMS implementations.
- Parameters
kpts_db – keypoints.
thr – Retain overlap < thr.
sigmas – standard deviation of keypoint labelling.
vis_thr – threshold of the keypoint visibility.
- Returns
indexes to keep.
- Return type
np.ndarray
- easycv.core.post_processing.nms.soft_oks_nms(kpts_db, thr, max_dets=20, sigmas=None, vis_thr=None)[source]¶
Soft OKS NMS implementations.
- Parameters
kpts_db –
thr – retain oks overlap < thr.
max_dets – max number of detections to keep.
sigmas – Keypoint labelling uncertainty.
- Returns
indexes to keep.
- Return type
np.ndarray
easycv.core.post_processing.pose_transforms module¶
- easycv.core.post_processing.pose_transforms.fliplr_joints(joints_3d, joints_3d_visible, img_width, flip_pairs)[source]¶
Flip human joints horizontally.
Note
num_keypoints: K
- Parameters
joints_3d (np.ndarray([K, 3])) – Coordinates of keypoints.
joints_3d_visible (np.ndarray([K, 1])) – Visibility of keypoints.
img_width (int) – Image width.
flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).
- Returns
Flipped human joints.
joints_3d_flipped (np.ndarray([K, 3])): Flipped joints.
joints_3d_visible_flipped (np.ndarray([K, 1])): Joint visibility.
- Return type
tuple
- easycv.core.post_processing.pose_transforms.fliplr_regression(regression, flip_pairs, center_mode='static', center_x=0.5, center_index=0)[source]¶
Flip human joints horizontally.
Note
batch_size: N
num_keypoint: K
- Parameters
regression (np.ndarray([..., K, C])) – Coordinates of keypoints, where K is the joint number and C is the dimension. Example shapes are:
- [N, K, C]: a batch of keypoints, where N is the batch size.
- [N, T, K, C]: a batch of pose sequences, where T is the frame number.
flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).
center_mode (str) – The mode to set the center location on the x-axis to flip around. Options are:
- static: use a static x value (see center_x also)
- root: use a root joint (see center_index also)
center_x (float) – Set the x-axis location of the flip center. Only used when center_mode=static.
center_index (int) – Set the index of the root joint, whose x location will be used as the flip center. Only used when center_mode=root.
- Returns
Flipped human joints.
regression_flipped (np.ndarray([…, K, C])): Flipped joints.
- Return type
tuple
- easycv.core.post_processing.pose_transforms.flip_back(output_flipped, flip_pairs, target_type='GaussianHeatmap')[source]¶
Flip the flipped heatmaps back to the original form.
Note
batch_size: N
num_keypoints: K
heatmap height: H
heatmap width: W
- Parameters
output_flipped (np.ndarray[N, K, H, W]) – The output heatmaps obtained from the flipped images.
flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).
target_type (str) – GaussianHeatmap or CombinedTarget
- Returns
heatmaps that flipped back to the original image
- Return type
np.ndarray
- easycv.core.post_processing.pose_transforms.transform_preds(coords, center, scale, output_size, use_udp=False)[source]¶
Get final keypoint predictions from heatmaps and apply scaling and translation to map them back to the image.
Note
num_keypoints: K
- Parameters
coords (np.ndarray[K, ndims]) –
If ndims=2, coords are predicted keypoint locations.
If ndims=4, coords are composed of (x, y, scores, tags).
If ndims=5, coords are composed of (x, y, scores, tags, flipped_tags).
center (np.ndarray[2, ]) – Center of the bounding box (x, y).
scale (np.ndarray[2, ]) – Scale of the bounding box wrt [width, height].
output_size (np.ndarray[2, ] | list(2,)) – Size of the destination heatmaps.
use_udp (bool) – Use unbiased data processing
- Returns
Predicted coordinates in the images.
- Return type
np.ndarray
- easycv.core.post_processing.pose_transforms.get_affine_transform(center, scale, rot, output_size, shift=(0.0, 0.0), inv=False)[source]¶
Get the affine transform matrix, given the center/scale/rot/output_size.
- Parameters
center (np.ndarray[2, ]) – Center of the bounding box (x, y).
scale (np.ndarray[2, ]) – Scale of the bounding box wrt [width, height].
rot (float) – Rotation angle (degree).
output_size (np.ndarray[2, ] | list(2,)) – Size of the destination heatmaps.
shift (0-100%) – Shift translation ratio wrt the width/height. Default (0., 0.).
inv (bool) – Option to inverse the affine transform direction. (inv=False: src->dst or inv=True: dst->src)
- Returns
The transform matrix.
- Return type
np.ndarray
- easycv.core.post_processing.pose_transforms.affine_transform(pt, trans_mat)[source]¶
Apply an affine transformation to the points.
- Parameters
pt (np.ndarray) – a 2 dimensional point to be transformed
trans_mat (np.ndarray) – 2x3 matrix of an affine transform
- Returns
Transformed points.
- Return type
np.ndarray
- easycv.core.post_processing.pose_transforms.rotate_point(pt, angle_rad)[source]¶
Rotate a point by an angle.
- Parameters
pt (list[float]) – 2 dimensional point to be rotated
angle_rad (float) – rotation angle by radian
- Returns
Rotated point.
- Return type
list[float]
- easycv.core.post_processing.pose_transforms.get_warp_matrix(theta, size_input, size_dst, size_target)[source]¶
Calculate the transformation matrix under the constraint of unbiased. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).
- Parameters
theta (float) – Rotation angle in degrees.
size_input (np.ndarray) – Size of input image [w, h].
size_dst (np.ndarray) – Size of output image [w, h].
size_target (np.ndarray) – Size of ROI in input plane [w, h].
- Returns
A matrix for transformation.
- Return type
matrix (np.ndarray)
- easycv.core.post_processing.pose_transforms.warp_affine_joints(joints, mat)[source]¶
Apply affine transformation defined by the transform matrix on the joints.
- Parameters
joints (np.ndarray[..., 2]) – Origin coordinate of joints.
mat (np.ndarray[3, 2]) – The affine matrix.
- Returns
Result coordinate of joints.
- Return type
matrix (np.ndarray[…, 2])
easycv.core.visualization package¶
- easycv.core.visualization.imshow_bboxes(img, bboxes, labels=None, colors='green', text_color='white', font_size=20, thickness=1, font_scale=0.5, show=True, win_name='', wait_time=0, out_file=None)[source]¶
Draw bboxes with labels (optional) on an image. This is a wrapper of mmcv.imshow_bboxes.
- Parameters
img (str or ndarray) – The image to be displayed.
bboxes (ndarray) – ndarray of shape (k, 4), each row is a bbox in format [x1, y1, x2, y2].
labels (str or list[str], optional) – labels of each bbox.
colors (list[str or tuple or Color]) – A list of colors.
text_color (str or tuple or Color) – Color of texts.
font_size (int) – Size of font.
thickness (int) – Thickness of lines.
font_scale (float) – Font scales of texts.
show (bool) – Whether to show the image.
win_name (str) – The window name.
wait_time (int) – Value of waitKey param.
out_file (str, optional) – The filename to write the image.
- Returns
The image with bboxes drawn on it.
- Return type
ndarray
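Example (a minimal sketch drawing two labelled boxes on a blank image and writing the result to a file instead of opening a window; the file name and labels are illustrative):
>>> import numpy as np
>>> from easycv.core.visualization import imshow_bboxes
>>> img = np.zeros((480, 640, 3), dtype=np.uint8)
>>> bboxes = np.array([[50, 60, 200, 220], [300, 100, 450, 300]], dtype=np.float32)
>>> vis = imshow_bboxes(img, bboxes, labels=['cat', 'dog'], show=False, out_file='vis_bboxes.jpg')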
- easycv.core.visualization.imshow_keypoints(img, pose_result, skeleton=None, kpt_score_thr=0.3, pose_kpt_color=None, pose_link_color=None, radius=4, thickness=1, show_keypoint_weight=False)[source]¶
Draw keypoints and links on an image.
- Parameters
img (str or Tensor) – The image to draw poses on. If an image array is given, it will be modified in-place.
pose_result (list[kpts]) – The poses to draw. Each element kpts is a set of K keypoints as a Kx3 numpy.ndarray, where each keypoint is represented as x, y, score.
kpt_score_thr (float, optional) – Minimum score of keypoints to be shown. Default: 0.3.
pose_kpt_color (np.array[Nx3]) – Color of N keypoints. If None, the keypoints will not be drawn.
pose_link_color (np.array[Mx3]) – Color of M links. If None, the links will not be drawn.
thickness (int) – Thickness of lines.
- easycv.core.visualization.imshow_label(img, labels, text_color='blue', font_size=20, thickness=1, font_scale=0.5, intervel=5, show=True, win_name='', wait_time=0, out_file=None)[source]¶
Draw images with labels on an image.
- Parameters
img (str or ndarray) – The image to be displayed.
labels (str or list[str]) – labels of each image.
text_color (str or tuple or Color) – Color of texts.
font_size (int) – Size of font.
thickness (int) – Thickness of lines.
font_scale (float) – Font scales of texts.
intervel (int) – interval pixels between multiple labels
show (bool) – Whether to show the image.
win_name (str) – The window name.
wait_time (int) – Value of waitKey param.
out_file (str, optional) – The filename to write the image.
- Returns
The image with bboxes drawn on it.
- Return type
ndarray
Submodules¶
easycv.core.visualization.image module¶
- easycv.core.visualization.image.put_text(img, xy, text, fill, size=20)[source]¶
Put text on an image; Chinese text is supported.
- easycv.core.visualization.image.imshow_label(img, labels, text_color='blue', font_size=20, thickness=1, font_scale=0.5, intervel=5, show=True, win_name='', wait_time=0, out_file=None)[source]¶
Draw images with labels on an image.
- Parameters
img (str or ndarray) – The image to be displayed.
labels (str or list[str]) – labels of each image.
text_color (str or tuple or Color) – Color of texts.
font_size (int) – Size of font.
thickness (int) – Thickness of lines.
font_scale (float) – Font scales of texts.
intervel (int) – interval pixels between multiple labels
show (bool) – Whether to show the image.
win_name (str) – The window name.
wait_time (int) – Value of waitKey param.
out_file (str, optional) – The filename to write the image.
- Returns
The image with bboxes drawn on it.
- Return type
ndarray
- easycv.core.visualization.image.imshow_bboxes(img, bboxes, labels=None, colors='green', text_color='white', font_size=20, thickness=1, font_scale=0.5, show=True, win_name='', wait_time=0, out_file=None)[source]¶
Draw bboxes with labels (optional) on an image. This is a wrapper of mmcv.imshow_bboxes.
- Parameters
img (str or ndarray) – The image to be displayed.
bboxes (ndarray) – ndarray of shape (k, 4), each row is a bbox in format [x1, y1, x2, y2].
labels (str or list[str], optional) – labels of each bbox.
colors (list[str or tuple or Color]) – A list of colors.
text_color (str or tuple or Color) – Color of texts.
font_size (int) – Size of font.
thickness (int) – Thickness of lines.
font_scale (float) – Font scales of texts.
show (bool) – Whether to show the image.
win_name (str) – The window name.
wait_time (int) – Value of waitKey param.
out_file (str, optional) – The filename to write the image.
- Returns
The image with bboxes drawn on it.
- Return type
ndarray
- easycv.core.visualization.image.imshow_keypoints(img, pose_result, skeleton=None, kpt_score_thr=0.3, pose_kpt_color=None, pose_link_color=None, radius=4, thickness=1, show_keypoint_weight=False)[source]¶
Draw keypoints and links on an image.
- Parameters
img (str or Tensor) – The image to draw poses on. If an image array is given, it will be modified in-place.
pose_result (list[kpts]) – The poses to draw. Each element kpts is a set of K keypoints as a Kx3 numpy.ndarray, where each keypoint is represented as x, y, score.
kpt_score_thr (float, optional) – Minimum score of keypoints to be shown. Default: 0.3.
pose_kpt_color (np.array[Nx3]) – Color of N keypoints. If None, the keypoints will not be drawn.
pose_link_color (np.array[Mx3]) – Color of M links. If None, the links will not be drawn.
thickness (int) – Thickness of lines.
Submodules¶
easycv.core.standard_fields module¶
Contains classes specifying naming conventions used for object detection.
- Specifies:
InputDataFields: standard fields used by reader/preprocessor/batcher.
DetectionResultFields: standard fields returned by object detector.
BoxListFields: standard fields used by BoxList.
TfExampleFields: standard fields for the tf-example data format (go/tf-example).
- class easycv.core.standard_fields.InputDataFields[source]¶
Bases:
object
Names for the input tensors.
Holds the standard data field names to use for identifying input tensors. This should be used by the decoder to identify keys for the returned tensor_dict containing input tensors. And it should be used by the model to identify the tensors it needs.
- image¶
image.
- original_image¶
image in the original input size.
- key¶
unique key corresponding to image.
- source_id¶
source of the original image.
- filename¶
original filename of the dataset (without common path).
- groundtruth_image_classes¶
image-level class labels.
- groundtruth_boxes¶
coordinates of the ground truth boxes in the image.
- groundtruth_classes¶
box-level class labels.
- groundtruth_label_types¶
box-level label types (e.g. explicit negative).
- groundtruth_is_crowd¶
[DEPRECATED, use groundtruth_group_of instead] is the groundtruth a single object or a crowd.
- groundtruth_area¶
area of a groundtruth segment.
- groundtruth_difficult¶
is a difficult object
- groundtruth_group_of¶
is a group_of objects, e.g. multiple objects of the same class, forming a connected group, where instances are heavily occluding each other.
- proposal_boxes¶
coordinates of object proposal boxes.
- proposal_objectness¶
objectness score of each proposal.
- groundtruth_instance_masks¶
ground truth instance masks.
- groundtruth_instance_boundaries¶
ground truth instance boundaries.
- groundtruth_instance_classes¶
instance mask-level class labels.
- groundtruth_keypoints¶
ground truth keypoints.
- groundtruth_keypoint_visibilities¶
ground truth keypoint visibilities.
- groundtruth_label_scores¶
groundtruth label scores.
- groundtruth_weights¶
groundtruth weight factor for bounding boxes.
- num_groundtruth_boxes¶
number of groundtruth boxes.
- true_image_shapes¶
true shapes of images in the resized images, as resized images can be padded with zeros.
- image = 'image'¶
- mask = 'mask'¶
- width = 'width'¶
- height = 'height'¶
- original_image = 'original_image'¶
- optical_flow = 'optical_flow'¶
- key = 'key'¶
- source_id = 'source_id'¶
- filename = 'filename'¶
- dataset_name = 'dataset_name'¶
- groundtruth_image_classes = 'groundtruth_image_classes'¶
- groundtruth_image_classes_num = 'groundtruth_image_classes_num'¶
- groundtruth_boxes = 'groundtruth_boxes'¶
- groundtruth_classes = 'groundtruth_classes'¶
- groundtruth_label_types = 'groundtruth_label_types'¶
- groundtruth_is_crowd = 'groundtruth_is_crowd'¶
- groundtruth_area = 'groundtruth_area'¶
- groundtruth_difficult = 'groundtruth_difficult'¶
- groundtruth_group_of = 'groundtruth_group_of'¶
- proposal_boxes = 'proposal_boxes'¶
- proposal_objectness = 'proposal_objectness'¶
- groundtruth_instance_masks = 'groundtruth_instance_masks'¶
- groundtruth_instance_boundaries = 'groundtruth_instance_boundaries'¶
- groundtruth_instance_classes = 'groundtruth_instance_classes'¶
- groundtruth_keypoints = 'groundtruth_keypoints'¶
- groundtruth_keypoint_visibilities = 'groundtruth_keypoint_visibilities'¶
- groundtruth_label_scores = 'groundtruth_label_scores'¶
- groundtruth_weights = 'groundtruth_weights'¶
- num_groundtruth_boxes = 'num_groundtruth_boxes'¶
- true_image_shape = 'true_image_shape'¶
- original_image_shape = 'original_image_shape'¶
- original_instance_masks = 'original_instance_masks'¶
- groundtruth_boxes_absolute = 'groundtruth_boxes_absolute'¶
- groundtruth_keypoints_absolute = 'groundtruth_keypoints_absolute'¶
- label_map = 'label_map'¶
- char_dict = 'char_dict'¶
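Example (a small sketch showing how the constants are meant to be used as shared dictionary keys; the placeholder strings stand in for real tensors):
>>> from easycv.core.standard_fields import InputDataFields as fields
>>> sample = {fields.image: 'image tensor', fields.groundtruth_boxes: 'Nx4 boxes', fields.groundtruth_classes: 'N class ids'}
>>> sorted(sample.keys())
['groundtruth_boxes', 'groundtruth_classes', 'image']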
- class easycv.core.standard_fields.DetectionResultFields[source]¶
Bases:
object
Naming conventions for storing the output of the detector.
- source_id¶
source of the original image.
- key¶
unique key corresponding to image.
- detection_boxes¶
coordinates of the detection boxes in the image.
- detection_scores¶
detection scores for the detection boxes in the image.
- detection_classes¶
detection-level class labels.
- detection_masks¶
contains a segmentation mask for each detection box.
- detection_boundaries¶
contains an object boundary for each detection box.
- detection_keypoints¶
contains detection keypoints for each detection box.
- num_detections¶
number of detections in the batch.
- source_id = 'source_id'¶
- key = 'key'¶
- detection_boxes = 'detection_boxes'¶
- detection_scores = 'detection_scores'¶
- detection_classes = 'detection_classes'¶
- detection_masks = 'detection_masks'¶
- detection_boundaries = 'detection_boundaries'¶
- detection_keypoints = 'detection_keypoints'¶
- num_detections = 'num_detections'¶
- class easycv.core.standard_fields.TfExampleFields[source]¶
Bases:
object
TF-example proto feature names for object detection.
Holds the standard feature names to load from an Example proto for object detection.
- image_encoded¶
JPEG encoded string
- image_format¶
image format, e.g. “JPEG”
- filename¶
filename
- channels¶
number of channels of image
- colorspace¶
colorspace, e.g. “RGB”
- height¶
height of image in pixels, e.g. 462
- width¶
width of image in pixels, e.g. 581
- source_id¶
original source of the image
- object_class_text¶
labels in text format, e.g. [“person”, “cat”]
- object_class_label¶
labels in numbers, e.g. [16, 8]
- object_bbox_xmin¶
xmin coordinates of groundtruth box, e.g. 10, 30
- object_bbox_xmax¶
xmax coordinates of groundtruth box, e.g. 50, 40
- object_bbox_ymin¶
ymin coordinates of groundtruth box, e.g. 40, 50
- object_bbox_ymax¶
ymax coordinates of groundtruth box, e.g. 80, 70
- object_view¶
viewpoint of object, e.g. [“frontal”, “left”]
- object_truncated¶
is object truncated, e.g. [true, false]
- object_occluded¶
is object occluded, e.g. [true, false]
- object_difficult¶
is object difficult, e.g. [true, false]
- object_group_of¶
is object a single object or a group of objects
- object_depiction¶
is object a depiction
- object_is_crowd¶
[DEPRECATED, use object_group_of instead] is the object a single object or a crowd
- object_segment_area¶
the area of the segment.
- object_weight¶
a weight factor for the object’s bounding box.
- instance_masks¶
instance segmentation masks.
- instance_boundaries¶
instance boundaries.
- instance_classes¶
Classes for each instance segmentation mask.
- detection_class_label¶
class label in numbers.
- detection_bbox_ymin¶
ymin coordinates of a detection box.
- detection_bbox_xmin¶
xmin coordinates of a detection box.
- detection_bbox_ymax¶
ymax coordinates of a detection box.
- detection_bbox_xmax¶
xmax coordinates of a detection box.
- detection_score¶
detection score for the class label and box.
- image_encoded = 'image/encoded'¶
- image_format = 'image/format'¶
- filename = 'image/filename'¶
- channels = 'image/channels'¶
- colorspace = 'image/colorspace'¶
- height = 'image/height'¶
- width = 'image/width'¶
- source_id = 'image/source_id'¶
- object_class_text = 'image/object/class/text'¶
- object_class_label = 'image/object/class/label'¶
- object_bbox_ymin = 'image/object/bbox/ymin'¶
- object_bbox_xmin = 'image/object/bbox/xmin'¶
- object_bbox_ymax = 'image/object/bbox/ymax'¶
- object_bbox_xmax = 'image/object/bbox/xmax'¶
- object_view = 'image/object/view'¶
- object_truncated = 'image/object/truncated'¶
- object_occluded = 'image/object/occluded'¶
- object_difficult = 'image/object/difficult'¶
- object_group_of = 'image/object/group_of'¶
- object_depiction = 'image/object/depiction'¶
- object_is_crowd = 'image/object/is_crowd'¶
- object_segment_area = 'image/object/segment/area'¶
- object_weight = 'image/object/weight'¶
- instance_masks = 'image/segmentation/object'¶
- instance_boundaries = 'image/boundaries/object'¶
- instance_classes = 'image/segmentation/object/class'¶
- detection_class_label = 'image/detection/label'¶
- detection_bbox_ymin = 'image/detection/bbox/ymin'¶
- detection_bbox_xmin = 'image/detection/bbox/xmin'¶
- detection_bbox_ymax = 'image/detection/bbox/ymax'¶
- detection_bbox_xmax = 'image/detection/bbox/xmax'¶
- detection_score = 'image/detection/score'¶
easycv.models package¶
Subpackages¶
easycv.models.backbones package¶
Submodules¶
easycv.models.backbones.benchmark_mlp module¶
- class easycv.models.backbones.benchmark_mlp.BenchMarkMLP(feature_num, num_classes=1000, avg_pool=False, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(feature_num, num_classes=1000, avg_pool=False, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.bninception module¶
This model is taken from the official PyTorch model zoo. - torchvision.models.mobilenet.py on 31st Aug, 2019
- class easycv.models.backbones.bninception.BNInception(num_classes=0)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(num_classes=0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.darknet module¶
- class easycv.models.backbones.darknet.Darknet(depth, in_channels=3, stem_out_channels=32, out_features=('dark3', 'dark4', 'dark5'))[source]¶
Bases:
torch.nn.modules.module.Module
- depth2blocks = {21: [1, 2, 2, 1], 53: [2, 8, 8, 4]}¶
- __init__(depth, in_channels=3, stem_out_channels=32, out_features=('dark3', 'dark4', 'dark5'))[source]¶
- Parameters
depth (int) – depth of darknet used in model, usually use [21, 53] for this param.
in_channels (int) – number of input channels, for example, use 3 for RGB image.
stem_out_channels (int) – number of output chanels of darknet stem. It decides channels of darknet layer2 to layer5.
out_features (Tuple[str]) – desired output layer name.
- make_group_layer(in_channels: int, num_blocks: int, stride: int = 1)[source]¶
starts with a conv layer, then stacks num_blocks ResLayer blocks
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
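Example (a minimal construction sketch; the input resolution is arbitrary, and the structure of the returned features depends on out_features):
>>> import torch
>>> from easycv.models.backbones.darknet import Darknet
>>> backbone = Darknet(depth=53, in_channels=3, stem_out_channels=32, out_features=('dark3', 'dark4', 'dark5'))
>>> feats = backbone(torch.randn(1, 3, 416, 416))   # multi-scale feature maps for the requested layers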
- class easycv.models.backbones.darknet.CSPDarknet(dep_mul, wid_mul, out_features=('dark3', 'dark4', 'dark5'), depthwise=False, act='silu', spp_type='spp')[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(dep_mul, wid_mul, out_features=('dark3', 'dark4', 'dark5'), depthwise=False, act='silu', spp_type='spp')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.genet module¶
- class easycv.models.backbones.genet.PlainNetBasicBlockClass(in_channels=0, out_channels=0, stride=1, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(in_channels=0, out_channels=0, stride=1, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.AdaptiveAvgPool(out_channels, output_size, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(out_channels, output_size, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.BN(out_channels=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(out_channels=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.ConvDW(out_channels=None, kernel_size=None, stride=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(out_channels=None, kernel_size=None, stride=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.ConvKX(in_channels=None, out_channels=None, kernel_size=None, stride=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(in_channels=None, out_channels=None, kernel_size=None, stride=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.Flatten(out_channels, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(out_channels, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.Linear(in_channels=None, out_channels=None, bias=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(in_channels=None, out_channels=None, bias=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.MaxPool(out_channels, kernel_size, stride, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(out_channels, kernel_size, stride, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.MultiSumBlock(inner_block_list, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(inner_block_list, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.RELU(out_channels, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(out_channels, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.ResBlock(inner_block_list, in_channels=None, stride=None, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
ResBlock(in_channles, inner_blocks_str). If in_channels is missing, use inner_block_list[0].in_channels as in_channels
- __init__(inner_block_list, in_channels=None, stride=None, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.Sequential(inner_block_list, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(inner_block_list, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.SuperResKXKX(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.SuperResK1KX(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.SuperResK1KXK1(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.SuperResK1DWK1(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.genet.SuperResK1DW(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Bases:
easycv.models.backbones.genet.PlainNetBasicBlockClass
- __init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class easycv.models.backbones.genet.PlainNet(plainnet_struct_idx=None, num_classes=0, no_create=False, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- training: bool¶
- __init__(plainnet_struct_idx=None, num_classes=0, no_create=False, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
easycv.models.backbones.hrnet module¶
- easycv.models.backbones.hrnet.get_expansion(block, expansion=None)[source]¶
Get the expansion of a residual block.
The block expansion will be obtained in the following order:
1. If expansion is given, just return it.
2. If block has the attribute expansion, then return block.expansion.
3. Return the default value according to the block type: 1 for BasicBlock and 4 for Bottleneck.
- Parameters
block (class) – The block class.
expansion (int | None) – The given expansion ratio.
- Returns
The expansion of the block.
- Return type
int
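As an informal illustration of the lookup order above (a sketch, assuming the function and the Bottleneck class are imported from this module as documented):
>>> from easycv.models.backbones.hrnet import Bottleneck, get_expansion
>>> get_expansion(Bottleneck)                # default value for Bottleneck blocks
4
>>> get_expansion(Bottleneck, expansion=2)   # an explicit value is returned unchanged
2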
- class easycv.models.backbones.hrnet.Bottleneck(in_channels, out_channels, expansion=4, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]¶
Bases:
torch.nn.modules.module.Module
Bottleneck block for ResNet.
- Parameters
in_channels (int) – Input channels of this block.
out_channels (int) – Output channels of this block.
expansion (int) – The ratio of out_channels/mid_channels where mid_channels is the input/output channels of conv2. Default: 4.
stride (int) – stride of the block. Default: 1
dilation (int) – dilation of convolution. Default: 1
downsample (nn.Module) – downsample operation on identity branch. Default: None.
style (str) – "pytorch" or "caffe". If set to "pytorch", the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer. Default: "pytorch".
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.
conv_cfg (dict) – dictionary to construct and config conv layer. Default: None
norm_cfg (dict) – dictionary to construct and config norm layer. Default: dict(type=’BN’)
- __init__(in_channels, out_channels, expansion=4, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property norm1¶
the normalization layer named “norm1”
- Type
nn.Module
- property norm2¶
the normalization layer named “norm2”
- Type
nn.Module
- property norm3¶
the normalization layer named “norm3”
- Type
nn.Module
- training: bool¶
- class easycv.models.backbones.hrnet.HRModule(num_branches, blocks, num_blocks, in_channels, num_channels, multiscale_output=False, with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, upsample_cfg={'align_corners': None, 'mode': 'nearest'})[source]¶
Bases:
torch.nn.modules.module.Module
High-Resolution Module for HRNet.
In this module, every branch has 4 BasicBlocks/Bottlenecks. Fusion/Exchange is in this module.
- __init__(num_branches, blocks, num_blocks, in_channels, num_channels, multiscale_output=False, with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, upsample_cfg={'align_corners': None, 'mode': 'nearest'})[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
- class easycv.models.backbones.hrnet.HRNet(arch='w32', extra=None, in_channels=3, conv_cfg=None, norm_cfg={'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=False, multi_scale_output=False)[source]¶
Bases:
torch.nn.modules.module.Module
HRNet backbone.
High-Resolution Representations for Labeling Pixels and Regions
- Parameters
extra (dict) – detailed configuration for each stage of HRNet.
in_channels (int) – Number of input image channels. Default: 3.
conv_cfg (dict) – dictionary to construct and config conv layer.
norm_cfg (dict) – dictionary to construct and config norm layer.
norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only. Default: False
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.
zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.
Example
>>> from mmpose.models import HRNet
>>> import torch
>>> extra = dict(
>>>     stage1=dict(
>>>         num_modules=1,
>>>         num_branches=1,
>>>         block='BOTTLENECK',
>>>         num_blocks=(4, ),
>>>         num_channels=(64, )),
>>>     stage2=dict(
>>>         num_modules=1,
>>>         num_branches=2,
>>>         block='BASIC',
>>>         num_blocks=(4, 4),
>>>         num_channels=(32, 64)),
>>>     stage3=dict(
>>>         num_modules=4,
>>>         num_branches=3,
>>>         block='BASIC',
>>>         num_blocks=(4, 4, 4),
>>>         num_channels=(32, 64, 128)),
>>>     stage4=dict(
>>>         num_modules=3,
>>>         num_branches=4,
>>>         block='BASIC',
>>>         num_blocks=(4, 4, 4, 4),
>>>         num_channels=(32, 64, 128, 256)))
>>> self = HRNet(extra, in_channels=1)
>>> self.eval()
>>> inputs = torch.rand(1, 1, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 32, 8, 8)
- blocks_dict = {'BASIC': <class 'easycv.models.backbones.resnet.BasicBlock'>, 'BOTTLENECK': <class 'easycv.models.backbones.hrnet.Bottleneck'>}¶
- arch_zoo = {'w18': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (18, 36)], [4, 3, 'BASIC', (4, 4, 4), (18, 36, 72)], [3, 4, 'BASIC', (4, 4, 4, 4), (18, 36, 72, 144)]], 'w30': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (30, 60)], [4, 3, 'BASIC', (4, 4, 4), (30, 60, 120)], [3, 4, 'BASIC', (4, 4, 4, 4), (30, 60, 120, 240)]], 'w32': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (32, 64)], [4, 3, 'BASIC', (4, 4, 4), (32, 64, 128)], [3, 4, 'BASIC', (4, 4, 4, 4), (32, 64, 128, 256)]], 'w40': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (40, 80)], [4, 3, 'BASIC', (4, 4, 4), (40, 80, 160)], [3, 4, 'BASIC', (4, 4, 4, 4), (40, 80, 160, 320)]], 'w44': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (44, 88)], [4, 3, 'BASIC', (4, 4, 4), (44, 88, 176)], [3, 4, 'BASIC', (4, 4, 4, 4), (44, 88, 176, 352)]], 'w48': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (48, 96)], [4, 3, 'BASIC', (4, 4, 4), (48, 96, 192)], [3, 4, 'BASIC', (4, 4, 4, 4), (48, 96, 192, 384)]], 'w64': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (64, 128)], [4, 3, 'BASIC', (4, 4, 4), (64, 128, 256)], [3, 4, 'BASIC', (4, 4, 4, 4), (64, 128, 256, 512)]]}¶
- __init__(arch='w32', extra=None, in_channels=3, conv_cfg=None, norm_cfg={'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=False, multi_scale_output=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property norm1¶
the normalization layer named “norm1”
- Type
nn.Module
- property norm2¶
the normalization layer named “norm2”
- Type
nn.Module
- training: bool¶
easycv.models.backbones.inceptionv3 module¶
This model is taken from the official PyTorch model zoo. - torchvision.models.inception.py on 31st Aug, 2019
- class easycv.models.backbones.inceptionv3.Inception3(num_classes: int = 0, aux_logits: bool = True, transform_input: bool = False)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(num_classes: int = 0, aux_logits: bool = True, transform_input: bool = False) → None[source]¶
- Parameters
num_classes – number of classes based on dataset.
aux_logits – If True, adds two auxiliary branches that can improve training. Default: False when pretrained is True otherwise True
transform_input – If True, preprocesses the input according to the method with which it was trained on ImageNet. Default: False
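A minimal construction sketch (assumptions: the standard Inception v3 input size of 299x299 is expected, and the exact format of the forward output is not documented here, so it is left unprinted):
>>> import torch
>>> from easycv.models.backbones.inceptionv3 import Inception3
>>> model = Inception3(num_classes=0, aux_logits=False)
>>> model.eval()
>>> out = model(torch.rand(1, 3, 299, 299))  # standard Inception v3 input size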
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.lighthrnet module¶
- easycv.models.backbones.lighthrnet.channel_shuffle(x, groups)[source]¶
Channel Shuffle operation.
This function enables cross-group information flow for multiple groups convolution layers.
- Parameters
x (Tensor) – The input tensor.
groups (int) – The number of groups to divide the input tensor in the channel dimension.
- Returns
The output tensor after channel shuffle operation.
- Return type
Tensor
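A small sanity-check sketch, assuming the standard reshape-and-transpose shuffle used in ShuffleNet-style blocks (the shape is preserved, only the channel order changes):
>>> import torch
>>> from easycv.models.backbones.lighthrnet import channel_shuffle
>>> x = torch.arange(8, dtype=torch.float32).reshape(1, 8, 1, 1)
>>> channel_shuffle(x, groups=2).flatten()
tensor([0., 4., 1., 5., 2., 6., 3., 7.])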
- class easycv.models.backbones.lighthrnet.SpatialWeighting(channels, ratio=16, conv_cfg=None, norm_cfg=None, act_cfg=({'type': 'ReLU'}, {'type': 'Sigmoid'}))[source]¶
Bases:
torch.nn.modules.module.Module
Spatial weighting module.
- Parameters
channels (int) – The channels of the module.
ratio (int) – channel reduction ratio.
conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Default: None.
act_cfg (dict) – Config dict for activation layer. Default: (dict(type=’ReLU’), dict(type=’Sigmoid’)). The last ConvModule uses Sigmoid by default.
- __init__(channels, ratio=16, conv_cfg=None, norm_cfg=None, act_cfg=({'type': 'ReLU'}, {'type': 'Sigmoid'}))[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.lighthrnet.CrossResolutionWeighting(channels, ratio=16, conv_cfg=None, norm_cfg=None, act_cfg=({'type': 'ReLU'}, {'type': 'Sigmoid'}))[source]¶
Bases:
torch.nn.modules.module.Module
Cross-resolution channel weighting module.
- Parameters
channels (int) – The channels of the module.
ratio (int) – channel reduction ratio.
conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Default: None.
act_cfg (dict) – Config dict for activation layer. Default: (dict(type=’ReLU’), dict(type=’Sigmoid’)). The last ConvModule uses Sigmoid by default.
- __init__(channels, ratio=16, conv_cfg=None, norm_cfg=None, act_cfg=({'type': 'ReLU'}, {'type': 'Sigmoid'}))[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.lighthrnet.ConditionalChannelWeighting(in_channels, stride, reduce_ratio, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]¶
Bases:
torch.nn.modules.module.Module
Conditional channel weighting block.
- Parameters
in_channels (int) – The input channels of the block.
stride (int) – Stride of the 3x3 convolution layer.
reduce_ratio (int) – channel reduction ratio.
conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.
- __init__(in_channels, stride, reduce_ratio, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.lighthrnet.Stem(in_channels, stem_channels, out_channels, expand_ratio, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]¶
Bases:
torch.nn.modules.module.Module
Stem network block.
- Parameters
in_channels (int) – The input channels of the block.
stem_channels (int) – Output channels of the stem layer.
out_channels (int) – The output channels of the block.
expand_ratio (int) – adjusts number of channels of the hidden layer in InvertedResidual by this amount.
conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.
- __init__(in_channels, stem_channels, out_channels, expand_ratio, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.lighthrnet.IterativeHead(in_channels, norm_cfg={'type': 'BN'})[source]¶
Bases:
torch.nn.modules.module.Module
Extra iterative head for feature learning.
- Parameters
in_channels (int) – The input channels of the block.
norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).
- __init__(in_channels, norm_cfg={'type': 'BN'})[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.lighthrnet.ShuffleUnit(in_channels, out_channels, stride=1, conv_cfg=None, norm_cfg={'type': 'BN'}, act_cfg={'type': 'ReLU'}, with_cp=False)[source]¶
Bases:
torch.nn.modules.module.Module
InvertedResidual block for ShuffleNetV2 backbone.
- Parameters
in_channels (int) – The input channels of the block.
out_channels (int) – The output channels of the block.
stride (int) – Stride of the 3x3 convolution layer. Default: 1
conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.
norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).
act_cfg (dict) – Config dict for activation layer. Default: dict(type=’ReLU’).
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.
- __init__(in_channels, out_channels, stride=1, conv_cfg=None, norm_cfg={'type': 'BN'}, act_cfg={'type': 'ReLU'}, with_cp=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.lighthrnet.LiteHRModule(num_branches, num_blocks, in_channels, reduce_ratio, module_type, multiscale_output=False, with_fuse=True, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]¶
Bases:
torch.nn.modules.module.Module
High-Resolution Module for LiteHRNet.
It contains conditional channel weighting blocks and shuffle blocks.
- Parameters
num_branches (int) – Number of branches in the module.
num_blocks (int) – Number of blocks in the module.
in_channels (list(int)) – Number of input image channels.
reduce_ratio (int) – Channel reduction ratio.
module_type (str) – ‘LITE’ or ‘NAIVE’
multiscale_output (bool) – Whether to output multi-scale features.
with_fuse (bool) – Whether to use fuse layers.
conv_cfg (dict) – dictionary to construct and config conv layer.
norm_cfg (dict) – dictionary to construct and config norm layer.
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.
- __init__(num_branches, num_blocks, in_channels, reduce_ratio, module_type, multiscale_output=False, with_fuse=True, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
- class easycv.models.backbones.lighthrnet.LiteHRNet(extra, in_channels=3, conv_cfg=None, norm_cfg={'type': 'BN'}, norm_eval=False, with_cp=False)[source]¶
Bases:
torch.nn.modules.module.Module
Lite-HRNet backbone.
Lite-HRNet: A Lightweight High-Resolution Network
Code adapted from https://github.com/HRNet/Lite-HRNet/blob/hrnet/models/backbones/litehrnet.py
- Parameters
extra (dict) – detailed configuration for each stage of HRNet.
in_channels (int) – Number of input image channels. Default: 3.
conv_cfg (dict) – dictionary to construct and config conv layer.
norm_cfg (dict) – dictionary to construct and config norm layer.
norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only. Default: False
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.
Example
>>> from mmpose.models import LiteHRNet
>>> import torch
>>> extra = dict(
>>>     stem=dict(stem_channels=32, out_channels=32, expand_ratio=1),
>>>     num_stages=3,
>>>     stages_spec=dict(
>>>         num_modules=(2, 4, 2),
>>>         num_branches=(2, 3, 4),
>>>         num_blocks=(2, 2, 2),
>>>         module_type=('LITE', 'LITE', 'LITE'),
>>>         with_fuse=(True, True, True),
>>>         reduce_ratios=(8, 8, 8),
>>>         num_channels=(
>>>             (40, 80),
>>>             (40, 80, 160),
>>>             (40, 80, 160, 320),
>>>         )),
>>>     with_head=False)
>>> self = LiteHRNet(extra, in_channels=1)
>>> self.eval()
>>> inputs = torch.rand(1, 1, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 40, 8, 8)
- __init__(extra, in_channels=3, conv_cfg=None, norm_cfg={'type': 'BN'}, norm_eval=False, with_cp=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- init_weights()[source]¶
Initialize the weights in backbone.
- Parameters
pretrained (str, optional) – Path to pre-trained weights. Defaults to None.
- training: bool¶
easycv.models.backbones.mae_vit_transformer module¶
Mostly copy-paste from https://github.com/facebookresearch/mae/blob/main/models_mae.py
- class easycv.models.backbones.mae_vit_transformer.MaskedAutoencoderViT(img_size=224, patch_size=16, in_chans=3, embed_dim=1024, depth=24, num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]¶
Bases:
torch.nn.modules.module.Module
- Masked Autoencoder with VisionTransformer backbone.
MaskedAutoencoderViT is mostly the same as vit_tranformer_dynamic, but adds a random_masking function. A MaskedAutoencoderViT model can be loaded by vit_tranformer_dynamic.
- Parameters
img_size (int) – input image size
patch_size (int) – patch size
in_chans (int) – input image channels
embed_dim (int) – feature dimensions
depth (int) – number of encoder layers
num_heads (int) – Parallel attention heads
mlp_ratio (float) – mlp ratio
norm_layer – type of normalization layer
- __init__(img_size=224, patch_size=16, in_chans=3, embed_dim=1024, depth=24, num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- random_masking(x, mask_ratio)[source]¶
Perform per-sample random masking by per-sample shuffling. Per-sample shuffling is done by argsorting random noise. x: [N, L, D], sequence
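The masking idea itself can be sketched in a few standalone lines (an illustration of argsort-based per-sample shuffling under the shapes above, not the method's exact return values):
>>> import torch
>>> N, L, D, mask_ratio = 2, 16, 8, 0.75
>>> x = torch.rand(N, L, D)
>>> len_keep = int(L * (1 - mask_ratio))
>>> noise = torch.rand(N, L)                   # per-sample random noise
>>> ids_shuffle = torch.argsort(noise, dim=1)  # ascending: small noise is kept
>>> ids_keep = ids_shuffle[:, :len_keep]
>>> x_masked = torch.gather(x, 1, ids_keep.unsqueeze(-1).repeat(1, 1, D))
>>> x_masked.shape
torch.Size([2, 4, 8])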
- forward(x, mask_ratio)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.mnasnet module¶
This model is taken from the official PyTorch model zoo. - torchvision.models.mnasnet.py on 31st Aug, 2019
- class easycv.models.backbones.mnasnet.MNASNet(alpha, num_classes=0, dropout=0.2)[source]¶
Bases:
torch.nn.modules.module.Module
MNASNet, as described in https://arxiv.org/pdf/1807.11626.pdf.
>>> model = MNASNet(1.0, num_classes=1000)
>>> x = torch.rand(1, 3, 224, 224)
>>> y = model(x)
>>> y.dim()
1
>>> y.nelement()
1000
- __init__(alpha, num_classes=0, dropout=0.2)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.mobilenetv2 module¶
This model is taken from the official PyTorch model zoo. - torchvision.models.mobilenet.py on 31st Aug, 2019
- class easycv.models.backbones.mobilenetv2.MobileNetV2(num_classes=0, width_multi=1.0, inverted_residual_setting=None, round_nearest=8)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(num_classes=0, width_multi=1.0, inverted_residual_setting=None, round_nearest=8)[source]¶
MobileNet V2 main class.
- Parameters
num_classes (int) – Number of classes
width_multi (float) – Width multiplier - adjusts number of channels in each layer by this amount
inverted_residual_setting – Network structure
round_nearest (int) – Round the number of channels in each layer to be a multiple of this number. Set to 1 to turn off rounding.
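A minimal construction sketch (the format of the backbone's forward output is not documented here, so it is left unprinted):
>>> import torch
>>> from easycv.models.backbones.mobilenetv2 import MobileNetV2
>>> model = MobileNetV2(width_multi=1.0, round_nearest=8)
>>> model.eval()
>>> out = model(torch.rand(1, 3, 224, 224))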
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.network_blocks module¶
- class easycv.models.backbones.network_blocks.SiLU(inplace=True)[source]¶
Bases:
torch.nn.modules.module.Module
export-friendly inplace version of nn.SiLU()
- __init__(inplace=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- static forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.HSiLU(inplace=True)[source]¶
Bases:
torch.nn.modules.module.Module
export-friendly inplace version of nn.SiLU(); hardsigmoid is better than sigmoid when used for edge models
- __init__(inplace=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- static forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.BaseConv(in_channels, out_channels, ksize, stride, groups=1, bias=False, act='silu')[source]¶
Bases:
torch.nn.modules.module.Module
A Conv2d -> Batchnorm -> silu/leaky relu block
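For instance, with the usual same-padding convention (padding=(ksize-1)//2, as in YOLOX), stride 2 is expected to halve the spatial size; a sketch under that assumption:
>>> import torch
>>> from easycv.models.backbones.network_blocks import BaseConv
>>> conv = BaseConv(in_channels=3, out_channels=16, ksize=3, stride=2)
>>> conv(torch.rand(1, 3, 64, 64)).shape
torch.Size([1, 16, 32, 32])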
- __init__(in_channels, out_channels, ksize, stride, groups=1, bias=False, act='silu')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.DWConv(in_channels, out_channels, ksize, stride=1, act='silu')[source]¶
Bases:
torch.nn.modules.module.Module
Depthwise Conv + Conv
- __init__(in_channels, out_channels, ksize, stride=1, act='silu')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.Bottleneck(in_channels, out_channels, shortcut=True, expansion=0.5, depthwise=False, act='silu')[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(in_channels, out_channels, shortcut=True, expansion=0.5, depthwise=False, act='silu')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.ResLayer(in_channels: int)[source]¶
Bases:
torch.nn.modules.module.Module
Residual layer with in_channels inputs.
- __init__(in_channels: int)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.SPPFBottleneck(in_channels, out_channels, kernel_size=5, activation='silu')[source]¶
Bases:
torch.nn.modules.module.Module
Spatial pyramid pooling layer used in YOLOv3-SPP
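A usage sketch, assuming the pooling branches use padded max-pools so the spatial size is preserved:
>>> import torch
>>> from easycv.models.backbones.network_blocks import SPPFBottleneck
>>> spp = SPPFBottleneck(in_channels=256, out_channels=256, kernel_size=5)
>>> spp(torch.rand(1, 256, 20, 20)).shape
torch.Size([1, 256, 20, 20])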
- __init__(in_channels, out_channels, kernel_size=5, activation='silu')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.SPPBottleneck(in_channels, out_channels, kernel_sizes=(5, 9, 13), activation='silu')[source]¶
Bases:
torch.nn.modules.module.Module
Spatial pyramid pooling layer used in YOLOv3-SPP
- __init__(in_channels, out_channels, kernel_sizes=(5, 9, 13), activation='silu')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.CSPLayer(in_channels, out_channels, n=1, shortcut=True, expansion=0.5, depthwise=False, act='silu')[source]¶
Bases:
torch.nn.modules.module.Module
CSP Bottleneck with 3 convolutions
- __init__(in_channels, out_channels, n=1, shortcut=True, expansion=0.5, depthwise=False, act='silu')[source]¶
- Parameters
in_channels (int) – input channels.
out_channels (int) – output channels.
n (int) – number of Bottlenecks. Default value: 1.
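A usage sketch (assuming, as in YOLOX-style CSP layers, that the spatial resolution is preserved and the output channel count follows out_channels):
>>> import torch
>>> from easycv.models.backbones.network_blocks import CSPLayer
>>> layer = CSPLayer(in_channels=64, out_channels=128, n=1)
>>> layer(torch.rand(1, 64, 32, 32)).shape
torch.Size([1, 128, 32, 32])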
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.Focus(in_channels, out_channels, ksize=1, stride=1, act='silu')[source]¶
Bases:
torch.nn.modules.module.Module
Focus width and height information into channel space.
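Focus-style blocks rearrange each 2x2 pixel neighbourhood into the channel dimension before a convolution, so H and W are halved; a sketch under that assumption:
>>> import torch
>>> from easycv.models.backbones.network_blocks import Focus
>>> focus = Focus(in_channels=3, out_channels=32, ksize=3)
>>> focus(torch.rand(1, 3, 64, 64)).shape
torch.Size([1, 32, 32, 32])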
- __init__(in_channels, out_channels, ksize=1, stride=1, act='silu')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.GSConv(c1, c2, k=1, s=1, g=1, act='silu')[source]¶
Bases:
torch.nn.modules.module.Module
GSConv is used to merge the channel information of DSConv and BaseConv. You can refer to https://github.com/AlanLi1997/slim-neck-by-gsconv for more details.
- __init__(c1, c2, k=1, s=1, g=1, act='silu')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.GSBottleneck(c1, c2, k=3, s=1)[source]¶
Bases:
torch.nn.modules.module.Module
GSBottleneck stacks GSConv layers. You can refer to https://github.com/AlanLi1997/slim-neck-by-gsconv for more details.
- __init__(c1, c2, k=3, s=1)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.network_blocks.VoVGSCSP(c1, c2, n=1, shortcut=True, g=1, e=0.5)[source]¶
Bases:
torch.nn.modules.module.Module
VoVGSCSP is a new neck structure used in CSPNet. You can refer to https://github.com/AlanLi1997/slim-neck-by-gsconv for more details.
- __init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.pytorch_image_models_wrapper module¶
- class easycv.models.backbones.pytorch_image_models_wrapper.PytorchImageModelWrapper(model_name='resnet50', scriptable=None, exportable=None, no_jit=None, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
Support backbones from pytorch-image-models. The PyTorch community has lots of awesome contributions for image models. PyTorch Image Models (timm) is a collection of image models that aims to pull together a wide variety of SOTA models with the ability to reproduce ImageNet training results. Model pages can be found at https://rwightman.github.io/pytorch-image-models/models/. References: https://github.com/rwightman/pytorch-image-models
- __init__(model_name='resnet50', scriptable=None, exportable=None, no_jit=None, **kwargs)[source]¶
Inits PytorchImageModelWrapper by timm.create_model.
- Parameters
model_name (str) – name of model to instantiate
scriptable (bool) – set layer config so that model is jit scriptable (not working for all models yet)
exportable (bool) – set layer config so that model is traceable / ONNX exportable (not fully impl/obeyed yet)
no_jit (bool) – set layer config so that model doesn't utilize jit scripted layers (so far activations only)
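A hedged usage sketch (assumes timm is installed and 'resnet50' is an available timm model name; the format of the wrapper's forward output is not documented here, so it is left unprinted):
>>> import torch
>>> from easycv.models.backbones.pytorch_image_models_wrapper import PytorchImageModelWrapper
>>> model = PytorchImageModelWrapper(model_name='resnet50')
>>> model.init_weights(pretrained=False)   # skip loading pretrained weights
>>> feats = model(torch.rand(1, 3, 224, 224))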
- init_weights(pretrained=None)[source]¶
- Parameters
pretrained – if pretrained == True, load the model from the default path; if pretrained == False or None, load from init weights. If model_name is in timm_model_names, load the model from the timm default path; if model_name is in _MODEL_MAP, load the model from the easycv default path.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.backbones.resnest module¶
ResNet variants
- class easycv.models.backbones.resnest.SplAtConv2d(in_channels, channels, kernel_size, stride=(1, 1), padding=(0, 0), dilation=(1, 1), groups=1, bias=True, radix=2, reduction_factor=4, rectify=False, rectify_avg=False, norm_layer=None, dropblock_prob=0.0, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
Split-Attention Conv2d
- __init__(in_channels, channels, kernel_size, stride=(1, 1), padding=(0, 0), dilation=(1, 1), groups=1, bias=True, radix=2, reduction_factor=4, rectify=False, rectify_avg=False, norm_layer=None, dropblock_prob=0.0, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.resnest.rSoftMax(radix, cardinality)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(radix, cardinality)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.resnest.GlobalAvgPool2d[source]¶
Bases:
torch.nn.modules.module.Module
- forward(inputs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.resnest.Bottleneck(inplanes, planes, stride=1, downsample=None, radix=1, cardinality=1, bottleneck_width=64, avd=False, avd_first=False, dilation=1, is_first=False, rectified_conv=False, rectify_avg=False, norm_layer=None, dropblock_prob=0.0, last_gamma=False)[source]¶
Bases:
torch.nn.modules.module.Module
ResNet Bottleneck
- expansion = 4¶
- __init__(inplanes, planes, stride=1, downsample=None, radix=1, cardinality=1, bottleneck_width=64, avd=False, avd_first=False, dilation=1, is_first=False, rectified_conv=False, rectify_avg=False, norm_layer=None, dropblock_prob=0.0, last_gamma=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.resnest.ResNeSt(depth=None, block=<class 'easycv.models.backbones.resnest.Bottleneck'>, layers=[3, 4, 6, 3], radix=2, groups=1, bottleneck_width=64, num_classes=0, dilated=False, dilation=1, deep_stem=True, stem_width=32, avg_down=True, rectified_conv=False, rectify_avg=False, avd=False, avd_first=False, final_drop=0.0, dropblock_prob=0, last_gamma=False, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>)[source]¶
Bases:
torch.nn.modules.module.Module
ResNet Variants
- Parameters
block (Block) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.
layers (list of int) – Numbers of layers in each block
classes (int, default 1000) – Number of classification classes.
dilated (bool, default False) – Applying dilation strategy to pretrained ResNet yielding a stride-8 model, typically used in Semantic Segmentation.
norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).
Reference –
He, Kaiming, et al. “Deep residual learning for image recognition.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
Yu, Fisher, and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions.”
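A construction sketch based on the arch_settings below (depth=50 selects layers (3, 4, 6, 3) with stem_width 32); the format of the forward output is not shown here:
>>> import torch
>>> from easycv.models.backbones.resnest import ResNeSt
>>> model = ResNeSt(depth=50, num_classes=0)
>>> model.eval()
>>> out = model(torch.rand(1, 3, 224, 224))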
- arch_settings = {50: ((3, 4, 6, 3), 32), 101: ((3, 4, 23, 3), 64), 200: ((3, 24, 36, 3), 64), 269: ((3, 30, 48, 8), 64)}¶
- __init__(depth=None, block=<class 'easycv.models.backbones.resnest.Bottleneck'>, layers=[3, 4, 6, 3], radix=2, groups=1, bottleneck_width=64, num_classes=0, dilated=False, dilation=1, deep_stem=True, stem_width=32, avg_down=True, rectified_conv=False, rectify_avg=False, avd=False, avd_first=False, final_drop=0.0, dropblock_prob=0, last_gamma=False, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
easycv.models.backbones.resnet module¶
- class easycv.models.backbones.resnet.BasicBlock(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, frelu=False, dcn=None)[source]¶
Bases:
torch.nn.modules.module.Module
- expansion = 1¶
- __init__(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, frelu=False, dcn=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property norm1¶
- property norm2¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.resnet.Bottleneck(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, frelu=False, dcn=None)[source]¶
Bases:
torch.nn.modules.module.Module
- expansion = 4¶
- __init__(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, frelu=False, dcn=None)[source]¶
Bottleneck block for ResNet. If style is “pytorch”, the stride-two layer is the 3x3 conv layer, if it is “caffe”, the stride-two layer is the first 1x1 conv layer.
- property norm1¶
- property norm2¶
- property norm3¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- easycv.models.backbones.resnet.make_res_layer(block, inplanes, planes, blocks, stride=1, dilation=1, style='pytorch', avg_down=False, with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, dcn=None, frelu=False, multi_grid=None, contract_dilation=False)[source]¶
- class easycv.models.backbones.resnet.ResNet(depth, in_channels=3, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), style='pytorch', deep_stem=False, avg_down=False, num_classes=0, frozen_stages=- 1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, dcn=None, stage_with_dcn=(False, False, False, False), with_cp=False, frelu=False, original_inplanes=64, stem_channels=64, zero_init_residual=False, multi_grid=None, contract_dilation=False)[source]¶
Bases:
torch.nn.modules.module.Module
ResNet backbone.
- Parameters
depth (int) – Depth of resnet, from {18, 34, 50, 101, 152}.
in_channels (int) – Number of input image channels. Normally 3.
num_stages (int) – Resnet stages, normally 4.
strides (Sequence[int]) – Strides of the first block of each stage.
dilations (Sequence[int]) – Dilation of each stage.
out_indices (Sequence[int]) – Output from which stages.
style (str) – pytorch or caffe. If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer.
deep_stem (bool) – Replace 7x7 conv in input stem with 3 3x3 conv. Default: False.
avg_down (bool) – Use AvgPool instead of stride conv when downsampling in the bottleneck. Default: False.
frozen_stages (int) – Stages to be frozen (stop grad and set eval mode). -1 means not freezing any parameters.
norm_cfg (dict) – dictionary to construct and config norm layer.
norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only.
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.
original_inplanes – start channel for first block, default=64
stem_channels (int) – Number of stem channels. Default: 64.
zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.
multi_grid (Sequence[int]|None) – Multi grid dilation rates of last stage. Default: None.
contract_dilation (bool) – Whether to contract the first dilation of each layer. Default: False.
Example
>>> from easycv.models import ResNet
>>> import torch
>>> self = ResNet(depth=18)
>>> self.eval()
>>> inputs = torch.rand(1, 3, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 64, 8, 8)
(1, 128, 4, 4)
(1, 256, 2, 2)
(1, 512, 1, 1)
- arch_settings = {10: (<class 'easycv.models.backbones.resnet.BasicBlock'>, (1, 1, 1, 1)), 18: (<class 'easycv.models.backbones.resnet.BasicBlock'>, (2, 2, 2, 2)), 34: (<class 'easycv.models.backbones.resnet.BasicBlock'>, (3, 4, 6, 3)), 50: (<class 'easycv.models.backbones.resnet.Bottleneck'>, (3, 4, 6, 3)), 101: (<class 'easycv.models.backbones.resnet.Bottleneck'>, (3, 4, 23, 3)), 152: (<class 'easycv.models.backbones.resnet.Bottleneck'>, (3, 8, 36, 3))}¶
- __init__(depth, in_channels=3, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), style='pytorch', deep_stem=False, avg_down=False, num_classes=0, frozen_stages=- 1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, dcn=None, stage_with_dcn=(False, False, False, False), with_cp=False, frelu=False, original_inplanes=64, stem_channels=64, zero_init_residual=False, multi_grid=None, contract_dilation=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property norm1¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- train(mode=True)[source]¶
Sets the module in training mode.
This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
- Parameters
mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.
- Returns
self
- Return type
Module
- training: bool¶
- class easycv.models.backbones.resnet.ResNetV1c(**kwargs)[source]¶
Bases:
easycv.models.backbones.resnet.ResNet
Compared to ResNet, ResNetV1c replaces the 7x7 conv in the input stem with three 3x3 convs. For more details please refer to <https://arxiv.org/abs/1812.01187>.
- training: bool¶
- class easycv.models.backbones.resnet.ResNetV1d(**kwargs)[source]¶
Bases:
easycv.models.backbones.resnet.ResNet
Compared to ResNet, ResNetV1d replaces the 7x7 conv in the input stem with three 3x3 convs. And in the downsampling block, a 2x2 avg_pool with stride 2 is added before conv, whose stride is changed to 1.
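Since both variants subclass ResNet, they accept the same keyword arguments; a brief construction sketch (assuming they merely preset the deep-stem / avg-down options described above):
>>> from easycv.models.backbones.resnet import ResNetV1c, ResNetV1d
>>> model_c = ResNetV1c(depth=50)   # 3x3 deep stem
>>> model_d = ResNetV1d(depth=50)   # deep stem plus avg-pool downsampling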
- training: bool¶
easycv.models.backbones.resnet_jit module¶
- class easycv.models.backbones.resnet_jit.BasicBlock(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]¶
Bases:
torch.nn.modules.module.Module
- expansion = 1¶
- __init__(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property norm1¶
- property norm2¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.resnet_jit.Bottleneck(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]¶
Bases:
torch.nn.modules.module.Module
- expansion = 4¶
- __init__(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]¶
Bottleneck block for ResNet. If style is “pytorch”, the stride-two layer is the 3x3 conv layer, if it is “caffe”, the stride-two layer is the first 1x1 conv layer.
- property norm1¶
- property norm2¶
- property norm3¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- easycv.models.backbones.resnet_jit.make_res_layer(block, inplanes, planes, blocks, stride=1, dilation=1, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]¶
- class easycv.models.backbones.resnet_jit.ResNetJIT(depth, in_channels=3, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), style='pytorch', frozen_stages=- 1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=False)[source]¶
Bases:
torch.nn.modules.module.Module
ResNet backbone.
- Parameters
depth (int) – Depth of resnet, from {18, 34, 50, 101, 152}.
in_channels (int) – Number of input image channels. Normally 3.
num_stages (int) – Resnet stages, normally 4.
strides (Sequence[int]) – Strides of the first block of each stage.
dilations (Sequence[int]) – Dilation of each stage.
out_indices (Sequence[int]) – Output from which stages.
style (str) – pytorch or caffe. If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer.
frozen_stages (int) – Stages to be frozen (stop grad and set eval mode). -1 means not freezing any parameters.
norm_cfg (dict) – dictionary to construct and config norm layer.
norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only.
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.
zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.
Example
>>> from easycv.models import ResNet
>>> import torch
>>> self = ResNet(depth=18)
>>> self.eval()
>>> inputs = torch.rand(1, 3, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 64, 8, 8)
(1, 128, 4, 4)
(1, 256, 2, 2)
(1, 512, 1, 1)
- arch_settings = {18: (<class 'easycv.models.backbones.resnet_jit.BasicBlock'>, (2, 2, 2, 2)), 34: (<class 'easycv.models.backbones.resnet_jit.BasicBlock'>, (3, 4, 6, 3)), 50: (<class 'easycv.models.backbones.resnet_jit.Bottleneck'>, (3, 4, 6, 3)), 101: (<class 'easycv.models.backbones.resnet_jit.Bottleneck'>, (3, 4, 23, 3)), 152: (<class 'easycv.models.backbones.resnet_jit.Bottleneck'>, (3, 8, 36, 3))}¶
- __init__(depth, in_channels=3, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), style='pytorch', frozen_stages=- 1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property norm1¶
- training: bool¶
- forward(x: torch.Tensor) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- train(mode=True)[source]¶
Sets the module in training mode.
This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.
- Parameters
mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.
- Returns
self
- Return type
Module
easycv.models.backbones.resnext module¶
- class easycv.models.backbones.resnext.Bottleneck(inplanes, planes, groups=1, base_width=4, **kwargs)[source]¶
Bases:
easycv.models.backbones.resnet.Bottleneck
- __init__(inplanes, planes, groups=1, base_width=4, **kwargs)[source]¶
Bottleneck block for ResNeXt. If style is “pytorch”, the stride-two layer is the 3x3 conv layer, if it is “caffe”, the stride-two layer is the first 1x1 conv layer.
- training: bool¶
- easycv.models.backbones.resnext.make_res_layer(block, inplanes, planes, blocks, stride=1, dilation=1, groups=1, base_width=4, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]¶
- class easycv.models.backbones.resnext.ResNeXt(groups=1, base_width=4, **kwargs)[source]¶
Bases:
easycv.models.backbones.resnet.ResNet
ResNeXt backbone.
- Parameters
depth (int) – Depth of resnet, from {18, 34, 50, 101, 152}.
in_channels (int) – Number of input image channels. Normally 3.
num_stages (int) – Resnet stages, normally 4.
groups (int) – Group of resnext.
base_width (int) – Base width of resnext.
strides (Sequence[int]) – Strides of the first block of each stage.
dilations (Sequence[int]) – Dilation of each stage.
out_indices (Sequence[int]) – Output from which stages.
style (str) – pytorch or caffe. If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer.
frozen_stages (int) – Stages to be frozen (all param fixed). -1 means not freezing any parameters.
norm_cfg (dict) – dictionary to construct and config norm layer.
norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only.
with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.
zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.
Example
>>> from easycv.models import ResNeXt
>>> import torch
>>> self = ResNeXt(depth=50)
>>> self.eval()
>>> inputs = torch.rand(1, 3, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 256, 8, 8)
(1, 512, 4, 4)
(1, 1024, 2, 2)
(1, 2048, 1, 1)
- arch_settings = {50: (<class 'easycv.models.backbones.resnext.Bottleneck'>, (3, 4, 6, 3)), 101: (<class 'easycv.models.backbones.resnext.Bottleneck'>, (3, 4, 23, 3)), 152: (<class 'easycv.models.backbones.resnext.Bottleneck'>, (3, 8, 36, 3))}¶
- __init__(groups=1, base_width=4, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
easycv.models.backbones.shuffle_transformer module¶
- class easycv.models.backbones.shuffle_transformer.Mlp(in_features, hidden_features=None, out_features=None, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, drop=0.0, stride=False)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(in_features, hidden_features=None, out_features=None, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, drop=0.0, stride=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.shuffle_transformer.Attention(dim, num_heads, window_size=1, shuffle=False, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0, relative_pos_embedding=False)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(dim, num_heads, window_size=1, shuffle=False, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0, relative_pos_embedding=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.shuffle_transformer.Block(dim, out_dim, num_heads, window_size=1, shuffle=False, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, stride=False, relative_pos_embedding=False)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(dim, out_dim, num_heads, window_size=1, shuffle=False, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, stride=False, relative_pos_embedding=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.shuffle_transformer.PatchMerging(dim, out_dim, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(dim, out_dim, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- extra_repr() → str[source]¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- training: bool¶
- class easycv.models.backbones.shuffle_transformer.StageModule(layers, dim, out_dim, num_heads, window_size=1, shuffle=True, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, relative_pos_embedding=False)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(layers, dim, out_dim, num_heads, window_size=1, shuffle=True, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, relative_pos_embedding=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.shuffle_transformer.PatchEmbedding(inter_channel=32, out_channels=48)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(inter_channel=32, out_channels=48)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.shuffle_transformer.ShuffleTransformer(img_size=224, in_chans=3, num_classes=1000, token_dim=32, embed_dim=96, mlp_ratio=4.0, layers=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], relative_pos_embedding=True, shuffle=True, window_size=7, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, has_pos_embed=False, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(img_size=224, in_chans=3, num_classes=1000, token_dim=32, embed_dim=96, mlp_ratio=4.0, layers=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], relative_pos_embedding=True, shuffle=True, window_size=7, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, has_pos_embed=False, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- easycv.models.backbones.shuffle_transformer.shuffletrans_base_p4_w7_224(pretrained=False, **kwargs)[source]¶
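A hedged usage sketch for the factory function above; it assumes shuffletrans_base_p4_w7_224 returns a ShuffleTransformer configured for 224x224 inputs, with extra keyword arguments forwarded to the constructor documented above.

import torch
from easycv.models.backbones.shuffle_transformer import shuffletrans_base_p4_w7_224

model = shuffletrans_base_p4_w7_224(pretrained=False)  # assumption: no pretrained weights loaded
model.eval()
with torch.no_grad():
    out = model(torch.rand(1, 3, 224, 224))
# The exact output structure (feature maps vs. pooled features) depends on the implementation.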
easycv.models.backbones.swin_transformer_dynamic module¶
Borrow this code from https://github.com/microsoft/esvit/blob/main/models/swin_transformer.py To support dynamic swin-transformer for ssl!
- class easycv.models.backbones.swin_transformer_dynamic.WindowAttention(dim, window_size, num_heads, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]¶
Bases:
torch.nn.modules.module.Module
Window based multi-head self attention (W-MSA) module with relative position bias. It supports both of shifted and non-shifted window.
- Parameters
dim (int) – Number of input channels.
window_size (tuple[int]) – The height and width of the window.
num_heads (int) – Number of attention heads.
qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True
qk_scale (float | None, optional) – Override default qk scale of head_dim ** -0.5 if set
attn_drop (float, optional) – Dropout ratio of attention weight. Default: 0.0
proj_drop (float, optional) – Dropout ratio of output. Default: 0.0
- __init__(dim, window_size, num_heads, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, mask=None)[source]¶
- Parameters
x – input features with shape of (num_windows*B, N, C)
mask – (0/-inf) mask with shape of (num_windows, Wh*Ww, Wh*Ww) or None
- extra_repr() → str[source]¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- training: bool¶
- class easycv.models.backbones.swin_transformer_dynamic.SwinTransformerBlock(dim, input_resolution, num_heads, window_size=7, shift_size=0, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]¶
Bases:
torch.nn.modules.module.Module
Swin Transformer Block.
- Parameters
dim (int) – Number of input channels.
input_resolution (tuple[int]) – Input resolution.
num_heads (int) – Number of attention heads.
window_size (int) – Window size.
shift_size (int) – Shift size for SW-MSA.
mlp_ratio (float) – Ratio of mlp hidden dim to embedding dim.
qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True
qk_scale (float | None, optional) – Override default qk scale of head_dim ** -0.5 if set.
drop (float, optional) – Dropout rate. Default: 0.0
attn_drop (float, optional) – Attention dropout rate. Default: 0.0
drop_path (float, optional) – Stochastic depth rate. Default: 0.0
act_layer (nn.Module, optional) – Activation layer. Default: nn.GELU
norm_layer (nn.Module, optional) – Normalization layer. Default: nn.LayerNorm
- __init__(dim, input_resolution, num_heads, window_size=7, shift_size=0, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- extra_repr() → str[source]¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- training: bool¶
- class easycv.models.backbones.swin_transformer_dynamic.PatchMerging(input_resolution, dim, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]¶
Bases:
torch.nn.modules.module.Module
Patch Merging Layer.
- Parameters
input_resolution (tuple[int]) – Resolution of input feature.
dim (int) – Number of input channels.
norm_layer (nn.Module, optional) – Normalization layer. Default: nn.LayerNorm
- __init__(input_resolution, dim, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Forward function.
- Parameters
x – Input feature, tensor size (B, H*W, C).
H – Spatial resolution of the input feature.
W – Spatial resolution of the input feature.
- extra_repr() → str[source]¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- training: bool¶
- class easycv.models.backbones.swin_transformer_dynamic.BasicLayer(dim, input_resolution, depth, num_heads, window_size, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, downsample=None)[source]¶
Bases:
torch.nn.modules.module.Module
A basic Swin Transformer layer for one stage.
- Parameters
dim (int) – Number of input channels.
input_resolution (tuple[int]) – Input resolution.
depth (int) – Number of blocks.
num_heads (int) – Number of attention heads.
window_size (int) – Window size.
mlp_ratio (float) – Ratio of mlp hidden dim to embedding dim.
qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True
qk_scale (float | None, optional) – Override default qk scale of head_dim ** -0.5 if set.
drop (float, optional) – Dropout rate. Default: 0.0
attn_drop (float, optional) – Attention dropout rate. Default: 0.0
drop_path (float | tuple[float], optional) – Stochastic depth rate. Default: 0.0
norm_layer (nn.Module, optional) – Normalization layer. Default: nn.LayerNorm
downsample (nn.Module | None, optional) – Downsample layer at the end of the layer. Default: None
- __init__(dim, input_resolution, depth, num_heads, window_size, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, downsample=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- extra_repr() → str[source]¶
Set the extra representation of the module
To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.
- training: bool¶
- class easycv.models.backbones.swin_transformer_dynamic.PatchEmbed(img_size=224, patch_size=16, in_chans=3, embed_dim=768, norm_layer=None)[source]¶
Bases:
torch.nn.modules.module.Module
Image to Patch Embedding
- __init__(img_size=224, patch_size=16, in_chans=3, embed_dim=768, norm_layer=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.swin_transformer_dynamic.DynamicSwinTransformer(img_size=224, patch_size=4, in_chans=3, num_classes=1000, embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], window_size=7, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.1, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, ape=False, patch_norm=True, use_dense_prediction=False, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- Swin Transformer
- A PyTorch impl of Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
- Parameters
img_size (int | tuple(int)) – Input image size.
patch_size (int | tuple(int)) – Patch size.
in_chans (int) – Number of input channels.
num_classes (int) – Number of classes for classification head.
embed_dim (int) – Embedding dimension.
depths (tuple(int)) – Depth of Swin Transformer layers.
num_heads (tuple(int)) – Number of attention heads in different layers.
window_size (int) – Window size.
mlp_ratio (float) – Ratio of mlp hidden dim to embedding dim.
qkv_bias (bool) – If True, add a learnable bias to query, key, value. Default: True
qk_scale (float) – Override default qk scale of head_dim ** -0.5 if set.
drop_rate (float) – Dropout rate.
attn_drop_rate (float) – Attention dropout rate.
drop_path_rate (float) – Stochastic depth rate.
norm_layer (nn.Module) – normalization layer.
ape (bool) – If True, add absolute position embedding to the patch embedding.
patch_norm (bool) – If True, add normalization after patch embedding.
- __init__(img_size=224, patch_size=4, in_chans=3, num_classes=1000, embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], window_size=7, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.1, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, ape=False, patch_norm=True, use_dense_prediction=False, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- easycv.models.backbones.swin_transformer_dynamic.dynamic_swin_tiny_p4_w7_224(pretrained=False, **kwargs)[source]¶
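Similarly, a hedged usage sketch for dynamic_swin_tiny_p4_w7_224; it assumes the factory returns a DynamicSwinTransformer and that keyword arguments are forwarded to the constructor documented above (use_dense_prediction is shown only as an example).

import torch
from easycv.models.backbones.swin_transformer_dynamic import dynamic_swin_tiny_p4_w7_224

model = dynamic_swin_tiny_p4_w7_224(pretrained=False, use_dense_prediction=False)
model.eval()
with torch.no_grad():
    out = model(torch.rand(1, 3, 224, 224))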
easycv.models.backbones.vit_transfomer_dynamic module¶
Mostly copy-paste from timm library. https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/vision_transformer.py
dynamic Input support borrow from https://github.com/microsoft/esvit/blob/main/models/vision_transformer.py
- class easycv.models.backbones.vit_transformer_dynamic.DynamicVisionTransformer(use_dense_prediction=False, **kwargs)[source]¶
Bases:
easycv.models.backbones.vision_transformer.VisionTransformer
Dynamic Vision Transformer
- Parameters
use_dense_prediction (bool) – If use_dense_prediction is True, the global pool and norm before the head (if any) will be removed. Default: False
- __init__(use_dense_prediction=False, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- easycv.models.backbones.vit_transformer_dynamic.dynamic_deit_tiny_p16(patch_size=16, **kwargs)[source]¶
- easycv.models.backbones.vit_transformer_dynamic.dynamic_deit_small_p16(patch_size=16, **kwargs)[source]¶
- easycv.models.backbones.vit_transformer_dynamic.dynamic_vit_base_p16(patch_size=16, **kwargs)[source]¶
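A hedged usage sketch for the dynamic ViT factories above; the import path follows the class entries in this section (the module heading above spells it vit_transfomer_dynamic, so use whichever path matches the installed version), and patch_size is the only documented parameter.

import torch
from easycv.models.backbones.vit_transformer_dynamic import dynamic_vit_base_p16

model = dynamic_vit_base_p16(patch_size=16)
model.eval()
with torch.no_grad():
    feat = model(torch.rand(1, 3, 224, 224))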
easycv.models.backbones.xcit_transformer module¶
Implementation of Cross-Covariance Image Transformer (XCiT) Based on timm and DeiT code bases https://github.com/rwightman/pytorch-image-models/tree/master/timm https://github.com/facebookresearch/deit/
XCiT Transformer. Part of the code is borrowed from: https://github.com/facebookresearch/xcit/blob/master/xcit.py
- class easycv.models.backbones.xcit_transformer.PositionalEncodingFourier(hidden_dim=32, dim=768, temperature=10000)[source]¶
Bases:
torch.nn.modules.module.Module
Positional encoding relying on a Fourier kernel matching the one used in the “Attention Is All You Need” paper. The implementation builds on the DETR code https://github.com/facebookresearch/detr/blob/master/models/position_encoding.py
- __init__(hidden_dim=32, dim=768, temperature=10000)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(B, H, W)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- easycv.models.backbones.xcit_transformer.conv3x3(in_planes, out_planes, stride=1)[source]¶
3x3 convolution with padding
- class easycv.models.backbones.xcit_transformer.ConvPatchEmbed(img_size=224, patch_size=16, in_chans=3, embed_dim=768)[source]¶
Bases:
torch.nn.modules.module.Module
Image to Patch Embedding using multiple convolutional layers
- __init__(img_size=224, patch_size=16, in_chans=3, embed_dim=768)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, padding_size=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.xcit_transformer.LPI(in_features, hidden_features=None, out_features=None, act_layer=<class 'torch.nn.modules.activation.GELU'>, drop=0.0, kernel_size=3)[source]¶
Bases:
torch.nn.modules.module.Module
Local Patch Interaction module that allows explicit communication between tokens in 3x3 windows to augment the implicit communication performed by the block diagonal scatter attention. Implemented using 2 layers of separable 3x3 convolutions with GeLU and BatchNorm2d
- __init__(in_features, hidden_features=None, out_features=None, act_layer=<class 'torch.nn.modules.activation.GELU'>, drop=0.0, kernel_size=3)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, H, W)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.xcit_transformer.ClassAttention(dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]¶
Bases:
torch.nn.modules.module.Module
Class Attention Layer as in CaiT https://arxiv.org/abs/2103.17239
- __init__(dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.xcit_transformer.ClassAttentionBlock(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, eta=None, tokens_norm=False)[source]¶
Bases:
torch.nn.modules.module.Module
Class Attention Layer as in CaiT https://arxiv.org/abs/2103.17239
- __init__(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, eta=None, tokens_norm=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, H, W, mask=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.xcit_transformer.XCA(dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]¶
Bases:
torch.nn.modules.module.Module
Cross-Covariance Attention (XCA) operation where the channels are updated using a weighted sum.
The weights are obtained from the (softmax normalized) Cross-covariance matrix (Q^T K in d_h times d_h)
- __init__(dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.xcit_transformer.XCABlock(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, num_tokens=196, eta=None)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, num_tokens=196, eta=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, H, W)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.backbones.xcit_transformer.XCiT(img_size=224, patch_size=16, in_chans=3, num_classes=1000, embed_dim=768, depth=12, num_heads=12, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, norm_layer=None, cls_attn_layers=2, use_pos=True, patch_proj='linear', eta=None, tokens_norm=False)[source]¶
Bases:
torch.nn.modules.module.Module
Based on timm and DeiT code bases https://github.com/rwightman/pytorch-image-models/tree/master/timm https://github.com/facebookresearch/deit/
- __init__(img_size=224, patch_size=16, in_chans=3, num_classes=1000, embed_dim=768, depth=12, num_heads=12, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, norm_layer=None, cls_attn_layers=2, use_pos=True, patch_proj='linear', eta=None, tokens_norm=False)[source]¶
- Parameters
img_size (int, tuple) – input image size
patch_size (int, tuple) – patch size
in_chans (int) – number of input channels
num_classes (int) – number of classes for classification head
embed_dim (int) – embedding dimension
depth (int) – depth of transformer
num_heads (int) – number of attention heads
mlp_ratio (int) – ratio of mlp hidden dim to embedding dim
qkv_bias (bool) – enable bias for qkv if True
qk_scale (float) – override default qk scale of head_dim ** -0.5 if set
drop_rate (float) – dropout rate
attn_drop_rate (float) – attention dropout rate
drop_path_rate (float) – stochastic depth rate
norm_layer – (nn.Module): normalization layer
cls_attn_layers – (int) Depth of Class attention layers
use_pos – (bool) whether to use positional encoding
eta – (float) layerscale initialization value
tokens_norm – (bool) Whether to normalize all tokens or just the cls_token in the CA
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
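A hedged instantiation sketch for the XCiT backbone above; the argument values are illustrative (a small variant), eta is set explicitly because the docstring describes it as the layerscale initialization value, and the snippet is not taken from the EasyCV configs.

import torch
from easycv.models.backbones.xcit_transformer import XCiT

model = XCiT(img_size=224, patch_size=16, embed_dim=384, depth=12,
             num_heads=8, eta=1.0, tokens_norm=True)
model.eval()
with torch.no_grad():
    out = model(torch.rand(1, 3, 224, 224))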
easycv.models.classification package¶
Submodules¶
easycv.models.classification.classification module¶
- class easycv.models.classification.classification.Classification(backbone, train_preprocess=[], with_sobel=False, head=None, neck=None, pretrained=True, mixup_cfg=None)[source]¶
Bases:
easycv.models.base.BaseModel
- Parameters
pretrained – Select one of {str, True, False/None}. If pretrained is a str, load the model from the specified path; if pretrained is True, load the model from the default path; if pretrained is False or None, load from init weights.
- __init__(backbone, train_preprocess=[], with_sobel=False, head=None, neck=None, pretrained=True, mixup_cfg=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward_backbone(img: torch.Tensor) → List[torch.Tensor][source]¶
Forward backbone
- Returns
backbone outputs
- Return type
x (tuple)
- forward_train(img, gt_labels) → Dict[str, torch.Tensor][source]¶
In forward_train, the model forwards backbone + neck / multi-neck to get a list of output tensors, then passes this list to head / multi-head to compute each loss.
- forward_test(img: torch.Tensor) → Dict[str, torch.Tensor][source]¶
forward_test generates prob/class from an image; it only supports one neck + one head.
- forward_test_label(img, gt_labels) → Dict[str, torch.Tensor][source]¶
forward_test_label generates prob/class from an image; it only supports one neck + one head. Note: the head init needs to set the input feature idx.
- training: bool¶
- forward_feature(img) → Dict[str, torch.Tensor][source]¶
- Forward feature forwards backbone + neck/multi-neck and returns a dict of output features.
self.neck_num = 0: only forward backbone, output the backbone feature with avgpool under key neck; self.neck_num > 0: has 1/multi necks, output each neck’s feature under key neck_neckidx_featureidx, such as neck_0_0.
- Returns
feature tensor
- Return type
x (torch.Tensor)
- forward(img: torch.Tensor, mode: str = 'train', gt_labels: Optional[torch.Tensor] = None, img_metas: Optional[torch.Tensor] = None) → Dict[str, torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
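A hedged config-style sketch of how the Classification model above is typically assembled from backbone / neck / head configs; the registry type names and field values are assumptions based on the classes documented in this reference, not a copy of an official EasyCV config.

model = dict(
    type='Classification',
    pretrained=False,
    backbone=dict(type='ResNet', depth=50, out_indices=[4], norm_cfg=dict(type='BN')),
    neck=dict(type='LinearNeck', in_channels=2048, out_channels=512, with_avg_pool=True),
    head=dict(type='ClsHead', with_avg_pool=False, in_channels=512, num_classes=1000))
# forward(img, mode=..., gt_labels=...) then selects between the forward_train /
# forward_test / forward_feature paths described above (the exact mode strings are an assumption).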
easycv.models.classification.necks module¶
- class easycv.models.classification.necks.LinearNeck(in_channels, out_channels, with_avg_pool=True, with_norm=False)[source]¶
Bases:
torch.nn.modules.module.Module
Linear neck: fc only
- __init__(in_channels, out_channels, with_avg_pool=True, with_norm=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.classification.necks.RetrivalNeck(in_channels, out_channels, with_avg_pool=True, cdg_config=['G', 'M'])[source]¶
Bases:
torch.nn.modules.module.Module
- RetrivalNeck: refer to Combination of Multiple Global Descriptors for Image Retrieval.
CGD feature: only uses avg pooling + GeM pooling + max pooling, by pool -> fc -> norm -> concat -> norm.
Avg feature: uses avg pooling, avg pool -> syncbn -> fc.
If len(cgd_config) > 0: return [CGD, Avg]; if len(cgd_config) == 0: return [Avg].
- __init__(in_channels, out_channels, with_avg_pool=True, cdg_config=['G', 'M'])[source]¶
Init RetrivalNeck; this neck doesn’t pool the input feature map and doesn’t support dynamic input.
- Parameters
in_channels – Int - input feature map channels
out_channels – Int - output feature map channels
with_avg_pool – bool do avg pool for BNneck or not
cdg_config – list of ‘G’, ‘M’, ‘S’, to configure the output feature; CGD = [gempooling] + [maxpooling] + [meanpooling]. If len(cgd_config) > 0: return [CGD, Avg]; if len(cgd_config) == 0: return [Avg].
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.classification.necks.FaceIDNeck(in_channels, out_channels, map_shape=1, dropout_ratio=0.4, with_norm=False, bn_type='SyncBN')[source]¶
Bases:
torch.nn.modules.module.Module
FaceID neck: Include BN, dropout, flatten, linear, bn
- __init__(in_channels, out_channels, map_shape=1, dropout_ratio=0.4, with_norm=False, bn_type='SyncBN')[source]¶
Init FaceIDNeck; the faceid neck doesn’t pool the input feature map and doesn’t support dynamic input.
- Parameters
in_channels – Int - input feature map channels
out_channels – Int - output feature map channels
map_shape – Int or list(int,…), input feature map (w,h) or w when w=h,
dropout_ratio – float, drop out ratio
with_norm – normalize output feature or not
bn_type – SyncBN or BN
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.classification.necks.MultiLinearNeck(in_channels, out_channels, num_layers=1, with_avg_pool=True)[source]¶
Bases:
torch.nn.modules.module.Module
MultiLinearNeck neck: MultiFc head
- __init__(in_channels, out_channels, num_layers=1, with_avg_pool=True)[source]¶
- Parameters
in_channels – int or list[int]
out_channels – int or list[int]
num_layers – total fc num
with_avg_pool – input will be avgPool if True
- Returns
None
- Raises
ValueError – if len(in_channels) != len(out_channels)
ValueError – if len(in_channels) does not match num_layers
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.classification.necks.HRFuseScales(in_channels, out_channels=2048, norm_cfg={'momentum': 0.1, 'type': 'BN'})[source]¶
Bases:
torch.nn.modules.module.Module
Fuse feature map of multiple scales in HRNet.
- Parameters
in_channels (list[int]) – The input channels of all scales.
out_channels (int) – The channels of fused feature map. Defaults to 2048.
norm_cfg (dict) – dictionary to construct norm layers. Defaults to dict(type='BN', momentum=0.1).
init_cfg (dict | list[dict], optional) – Initialization config dict. Defaults to dict(type='Normal', layer='Linear', std=0.01).
- __init__(in_channels, out_channels=2048, norm_cfg={'momentum': 0.1, 'type': 'BN'})[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.classification.necks.ReIDNeck(in_channels, dropout, relu=False, norm=True, out_channels=512)[source]¶
Bases:
torch.nn.modules.module.Module
ReID neck: Include Linear, bn, relu, dropout
- __init__(in_channels, dropout, relu=False, norm=True, out_channels=512)[source]¶
Init ReIDNeck; the neck doesn’t pool the input feature map and doesn’t support dynamic input.
- Parameters
in_channels – Int - input feature map channels
out_channels – Int - output feature map channels
map_shape – Int or list(int,…), input feature map (w,h) or w when w=h,
dropout_ratio – float, drop out ratio
with_norm – normalize output feature or not
bn_type – SyncBN or BN
- training: bool¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
easycv.models.detection package¶
Subpackages¶
easycv.models.detection.utils package¶
- easycv.models.detection.utils.boxes.bbox2result(bboxes, labels, num_classes)[source]¶
Convert detection results to a list of numpy arrays.
- Parameters
bboxes (torch.Tensor | np.ndarray) – shape (n, 5)
labels (torch.Tensor | np.ndarray) – shape (n, )
num_classes (int) – class number, including background class
- Returns
bbox results of each class
- Return type
list(ndarray)
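A minimal sketch of the per-class grouping convention that bbox2result describes (one (k, 5) array per class); an illustrative re-implementation, not the EasyCV source.

import numpy as np
import torch

def bbox2result_sketch(bboxes, labels, num_classes):
    # bboxes: (n, 5) [x1, y1, x2, y2, score]; labels: (n,) class indices
    if isinstance(bboxes, torch.Tensor):
        bboxes = bboxes.detach().cpu().numpy()
        labels = labels.detach().cpu().numpy()
    if bboxes.shape[0] == 0:
        return [np.zeros((0, 5), dtype=np.float32) for _ in range(num_classes)]
    return [bboxes[labels == i, :] for i in range(num_classes)]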
- easycv.models.detection.utils.boxes.generalized_box_iou(boxes1, boxes2)[source]¶
Generalized IoU from https://giou.stanford.edu/ The boxes should be in [x0, y0, x1, y1] format Returns a [N, M] pairwise matrix, where N = len(boxes1) and M = len(boxes2)
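A short sketch of the pairwise generalized IoU computation described above for [x0, y0, x1, y1] boxes; illustrative only, not the EasyCV implementation (which should be preferred in practice).

import torch

def generalized_box_iou_sketch(boxes1, boxes2):
    # areas of each set of boxes
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])
    # pairwise intersection
    lt = torch.max(boxes1[:, None, :2], boxes2[None, :, :2])  # [N, M, 2]
    rb = torch.min(boxes1[:, None, 2:], boxes2[None, :, 2:])  # [N, M, 2]
    wh = (rb - lt).clamp(min=0)
    inter = wh[..., 0] * wh[..., 1]
    union = area1[:, None] + area2[None, :] - inter
    iou = inter / union
    # smallest enclosing box
    lt_c = torch.min(boxes1[:, None, :2], boxes2[None, :, :2])
    rb_c = torch.max(boxes1[:, None, 2:], boxes2[None, :, 2:])
    wh_c = (rb_c - lt_c).clamp(min=0)
    area_c = wh_c[..., 0] * wh_c[..., 1]
    return iou - (area_c - union) / area_c  # [N, M] pairwise GIoU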
- easycv.models.detection.utils.boxes.bbox_overlaps(bboxes1, bboxes2, mode='iou', is_aligned=False, eps=1e-06)[source]¶
Calculate overlap between two set of bboxes.
FP16 Contributed by https://github.com/open-mmlab/mmdetection/pull/4889 .. note:
Assume bboxes1 is M x 4 and bboxes2 is N x 4. When mode is 'iou', some new variables are generated when calculating IOU using the bbox_overlaps function:

1) is_aligned is False
   area1: M x 1, area2: N x 1, lt: M x N x 2, rb: M x N x 2, wh: M x N x 2,
   overlap: M x N x 1, union: M x N x 1, ious: M x N x 1
   Total memory: S = (9 x N x M + N + M) * 4 Byte.
   When using FP16, we can reduce: R = (9 x N x M + N + M) * 4 / 2 Byte.
   R larger than (N + M) * 4 * 2 is always true when N and M >= 1.
   Obviously, N + M <= N * M < 3 * N * M when N >= 2 and M >= 2; N + 1 < 3 * N when N or M is 1.
   Given M = 40 (ground truth), N = 400000 (three anchor boxes per grid, FPN, R-CNNs), R = 275 MB (one time).
   A special case (dense detection): M = 512 (ground truth), R = 3516 MB = 3.43 GB.
   When the batch size is B, the reduction is B x R; therefore, CUDA memory runs out frequently.

   Experiments on GeForce RTX 2080Ti (11019 MiB):

   | dtype | M   | N      | Use      | Real     | Ideal    |
   |:-----:|:---:|:------:|:--------:|:--------:|:--------:|
   | FP32  | 512 | 400000 | 8020 MiB | --       | --       |
   | FP16  | 512 | 400000 | 4504 MiB | 3516 MiB | 3516 MiB |
   | FP32  | 40  | 400000 | 1540 MiB | --       | --       |
   | FP16  | 40  | 400000 | 1264 MiB | 276 MiB  | 275 MiB  |

2) is_aligned is True
   area1: N x 1, area2: N x 1, lt: N x 2, rb: N x 2, wh: N x 2,
   overlap: N x 1, union: N x 1, ious: N x 1
   Total memory: S = 11 x N * 4 Byte.
   When using FP16, we can reduce: R = 11 x N * 4 / 2 Byte.

The same applies to 'giou' (larger than 'iou'). Time-wise, FP16 is generally faster than FP32. When gpu_assign_thr is not -1, it takes more time on cpu but does not reduce memory. There, we can reduce half the memory and keep the speed.
If is_aligned is False, then calculate the overlaps between each bbox of bboxes1 and bboxes2, otherwise the overlaps between each aligned pair of bboxes1 and bboxes2.
- Parameters
bboxes1 (Tensor) – shape (B, m, 4) in <x1, y1, x2, y2> format or empty.
bboxes2 (Tensor) – shape (B, n, 4) in <x1, y1, x2, y2> format or empty. B indicates the batch dim, in shape (B1, B2, …, Bn). If is_aligned is True, then m and n must be equal.
mode (str) – “iou” (intersection over union), “iof” (intersection over foreground) or “giou” (generalized intersection over union). Default “iou”.
is_aligned (bool, optional) – If True, then m and n must be equal. Default False.
eps (float, optional) – A value added to the denominator for numerical stability. Default 1e-6.
- Returns
shape (m, n) if is_aligned is False else shape (m,)
Tensor
Example
>>> bboxes1 = torch.FloatTensor([
>>>     [0, 0, 10, 10],
>>>     [10, 10, 20, 20],
>>>     [32, 32, 38, 42],
>>> ])
>>> bboxes2 = torch.FloatTensor([
>>>     [0, 0, 10, 20],
>>>     [0, 10, 10, 19],
>>>     [10, 10, 20, 20],
>>> ])
>>> overlaps = bbox_overlaps(bboxes1, bboxes2)
>>> assert overlaps.shape == (3, 3)
>>> overlaps = bbox_overlaps(bboxes1, bboxes2, is_aligned=True)
>>> assert overlaps.shape == (3, )
Example
>>> empty = torch.empty(0, 4)
>>> nonempty = torch.FloatTensor([[0, 0, 10, 9]])
>>> assert tuple(bbox_overlaps(empty, nonempty).shape) == (0, 1)
>>> assert tuple(bbox_overlaps(nonempty, empty).shape) == (1, 0)
>>> assert tuple(bbox_overlaps(empty, empty).shape) == (0, 0)
- easycv.models.detection.utils.boxes.bbox2distance(points, bbox, max_dis=None, eps=0.1)[source]¶
Decode bounding box based on distances.
- Parameters
points (Tensor) – Shape (n, 2), [x, y].
bbox (Tensor) – Shape (n, 4), “xyxy” format
max_dis (float) – Upper bound of the distance.
eps (float) – a small value to ensure target < max_dis, instead <=
- Returns
Decoded distances.
- Return type
Tensor
- easycv.models.detection.utils.boxes.distance2bbox(points, distance, max_shape=None)[source]¶
Decode distance prediction to bounding box.
- Parameters
points (Tensor) – Shape (B, N, 2) or (N, 2).
distance (Tensor) – Distance from the given point to 4 boundaries (left, top, right, bottom). Shape (B, N, 4) or (N, 4)
max_shape (Sequence[int] or torch.Tensor or Sequence[Sequence[int]], optional) – Maximum bounds for boxes, specifies (H, W, C) or (H, W). If priors shape is (B, N, 4), then the max_shape should be a Sequence[Sequence[int]] and the length of max_shape should also be B.
- Returns
Boxes with shape (N, 4) or (B, N, 4)
- Return type
Tensor
- easycv.models.detection.utils.boxes.batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False)[source]¶
Performs non-maximum suppression in a batched fashion.
Modified from torchvision/ops/boxes.py#L39. In order to perform NMS independently per class, we add an offset to all the boxes. The offset is dependent only on the class idx, and is large enough so that boxes from different classes do not overlap.
Note
In v1.4.1 and later, batched_nms supports skipping the NMS and returns sorted raw results when nms_cfg is None.
- Parameters
boxes (torch.Tensor) – boxes in shape (N, 4).
scores (torch.Tensor) – scores in shape (N, ).
idxs (torch.Tensor) – each index value correspond to a bbox cluster, and NMS will not be applied between elements of different idxs, shape (N, ).
nms_cfg (dict | None) –
Supports skipping the nms when nms_cfg is None, otherwise it should specify nms type and other parameters like iou_thr. Possible keys includes the following.
iou_thr (float): IoU threshold used for NMS.
split_thr (float): threshold number of boxes. In some cases the number of boxes is large (e.g., 200k). To avoid OOM during training, the users could set split_thr to a small value. If the number of boxes is greater than the threshold, it will perform NMS on each group of boxes separately and sequentially. Defaults to 10000.
class_agnostic (bool) – if true, nms is class agnostic, i.e. IoU thresholding happens over all boxes, regardless of the predicted class.
- Returns
kept dets and indice.
boxes (Tensor): Bboxes with score after nms, has shape (num_bboxes, 5). last dimension 5 arrange as (x1, y1, x2, y2, score)
keep (Tensor): The indices of remaining boxes in input boxes.
- Return type
tuple
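A compact sketch of the class-offset trick that batched_nms uses: each box is shifted by an offset that depends only on its class (or idx group), so boxes of different groups can never overlap, and one NMS pass then behaves like per-class NMS. This sketch uses torchvision.ops.nms and omits the nms_cfg / split_thr handling described above.

import torch
from torchvision.ops import nms

def batched_nms_sketch(boxes, scores, idxs, iou_thr=0.5):
    if boxes.numel() == 0:
        return boxes.new_zeros((0, 5)), boxes.new_zeros((0,), dtype=torch.long)
    max_coordinate = boxes.max()
    offsets = idxs.to(boxes) * (max_coordinate + 1)   # one offset per idx group
    keep = nms(boxes + offsets[:, None], scores, iou_thr)
    dets = torch.cat([boxes[keep], scores[keep, None]], dim=1)  # (num_kept, 5)
    return dets, keep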
easycv.models.heads package¶
Submodules¶
easycv.models.heads.cls_head module¶
- class easycv.models.heads.cls_head.ClsHead(with_avg_pool=False, label_smooth=0.0, in_channels=2048, with_fc=True, num_classes=1000, loss_config={'type': 'CrossEntropyLossWithLabelSmooth'}, input_feature_index=[0], init_cfg={'bias': 0.0, 'layer': 'Linear', 'std': 0.01, 'type': 'Normal'}, use_num_classes=True)[source]¶
Bases:
torch.nn.modules.module.Module
Simplest classifier head, with only one fc layer. Note that, by Evtorch module design, the input is always feature_list = [tensor, tensor, …].
- __init__(with_avg_pool=False, label_smooth=0.0, in_channels=2048, with_fc=True, num_classes=1000, loss_config={'type': 'CrossEntropyLossWithLabelSmooth'}, input_feature_index=[0], init_cfg={'bias': 0.0, 'layer': 'Linear', 'std': 0.01, 'type': 'Normal'}, use_num_classes=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x: List[torch.Tensor]) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- loss(cls_score: List[torch.Tensor], labels: torch.Tensor) → Dict[str, torch.Tensor][source]¶
- Parameters
cls_score – [N x num_classes]
labels – if mixup is not used, shape is [N]; else [N x num_classes]
- training: bool¶
easycv.models.heads.contrastive_head module¶
- class easycv.models.heads.contrastive_head.ContrastiveHead(temperature=0.1)[source]¶
Bases:
torch.nn.modules.module.Module
Head for contrastive learning.
- __init__(temperature=0.1)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pos, neg)[source]¶
- Parameters
pos (Tensor) – Nx1 positive similarity
neg (Tensor) – Nxk negative similarity
- training: bool¶
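For reference, a hedged sketch of a temperature-scaled contrastive (InfoNCE-style) loss computed from the Nx1 positive and Nxk negative similarities documented above; the positive is treated as class 0 of a (1 + k)-way softmax. The actual ContrastiveHead implementation may differ in details.

import torch
import torch.nn.functional as F

def contrastive_loss_sketch(pos, neg, temperature=0.1):
    logits = torch.cat([pos, neg], dim=1) / temperature             # (N, 1 + k)
    labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
    return F.cross_entropy(logits, labels)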
- class easycv.models.heads.contrastive_head.DebiasedContrastiveHead(temperature=0.1, tau=0.1)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(temperature=0.1, tau=0.1)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pos, neg)[source]¶
- Parameters
pos (Tensor) – Nx1 positive similarity
neg (Tensor) – Nxk negative similarity
- training: bool¶
easycv.models.heads.latent_pred_head module¶
- class easycv.models.heads.latent_pred_head.LatentPredictHead(predictor, size_average=True)[source]¶
Bases:
torch.nn.modules.module.Module
Head for contrastive learning.
- __init__(predictor, size_average=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input, target)[source]¶
- Parameters
input (Tensor) – NxC input features.
target (Tensor) – NxC target features.
- training: bool¶
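A hedged sketch of a BYOL-style latent prediction loss that matches the NxC input / NxC target interface above (online features already passed through the predictor, target features detached); the exact EasyCV loss and the role of size_average may differ.

import torch.nn.functional as F

def latent_predict_loss_sketch(pred, target):
    pred = F.normalize(pred, dim=1)
    target = F.normalize(target.detach(), dim=1)   # no gradient through the target branch
    return 2 - 2 * (pred * target).sum(dim=1).mean()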
- class easycv.models.heads.latent_pred_head.LatentClsHead(predictor)[source]¶
Bases:
torch.nn.modules.module.Module
Head for contrastive learning.
- __init__(predictor)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(input, target)[source]¶
- Parameters
input (Tensor) – NxC input features.
target (Tensor) – NxC target features.
- training: bool¶
easycv.models.heads.mp_metric_head module¶
- easycv.models.heads.mp_metric_head.EmbeddingExplansion(embs, labels, explanion_rate=4, alpha=1.0)[source]¶
Expand embeddings (refer to the CVPR work at https://github.com/clovaai/embedding-expansion): combine PK-sampled data and mixup anchor-positive pairs to generate more features; always combined with BatchHardminer. Results on SOP and CUB need to be added.
- Parameters
embs – [N , dims] tensor
labels – [N] tensor
explanion_rate – to expand N to explanion_rate * N
alpha – beta distribution parameter for mixup
- Returns
[N*explanion_rate , dims]
- Return type
embs
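A hedged sketch of the anchor-positive mixup idea described above: embeddings of same-label pairs are interpolated with Beta(alpha, alpha) weights to synthesize extra features. Names mirror the documented signature, but this is an illustration, not the EasyCV implementation (which combines PK sampling and a batch-hard miner).

import torch

def embedding_expansion_sketch(embs, labels, expansion_rate=4, alpha=1.0):
    new_embs, new_labels = [embs], [labels]
    beta = torch.distributions.Beta(alpha, alpha)
    for _ in range(expansion_rate - 1):
        perm = torch.randperm(embs.size(0))
        same = labels == labels[perm]                     # keep anchor-positive pairs only
        lam = beta.sample((embs.size(0), 1)).to(embs)
        mixed = lam * embs + (1 - lam) * embs[perm]
        new_embs.append(mixed[same])
        new_labels.append(labels[same])
    return torch.cat(new_embs), torch.cat(new_labels)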
- class easycv.models.heads.mp_metric_head.MpMetrixHead(with_avg_pool=False, in_channels=2048, loss_config=[{'type': 'CircleLoss', 'loss_weight': 1.0, 'norm': True, 'ddp': True, 'm': 0.4, 'gamma': 80}], input_feature_index=[0], input_label_index=0, ignore_label=None)[source]¶
Bases:
torch.nn.modules.module.Module
Simplest classifier head, with only one fc layer.
- __init__(with_avg_pool=False, in_channels=2048, loss_config=[{'type': 'CircleLoss', 'loss_weight': 1.0, 'norm': True, 'ddp': True, 'm': 0.4, 'gamma': 80}], input_feature_index=[0], input_label_index=0, ignore_label=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x: List[torch.Tensor]) → List[torch.Tensor][source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.heads.multi_cls_head module¶
- class easycv.models.heads.multi_cls_head.MultiClsHead(pool_type='adaptive', in_indices=(0), with_last_layer_unpool=False, backbone='resnet50', norm_cfg={'type': 'BN'}, num_classes=1000)[source]¶
Bases:
torch.nn.modules.module.Module
Multiple classifier heads.
- FEAT_CHANNELS = {'resnet50': [64, 256, 512, 1024, 2048]}¶
- FEAT_LAST_UNPOOL = {'resnet50': 100352}¶
- __init__(pool_type='adaptive', in_indices=(0), with_last_layer_unpool=False, backbone='resnet50', norm_cfg={'type': 'BN'}, num_classes=1000)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.loss package¶
- class easycv.models.loss.CrossEntropyLoss(use_sigmoid=False, use_mask=False, reduction='mean', class_weight=None, loss_weight=1.0, loss_name='loss_ce', avg_non_ignore=False, label_ceil=False)[source]¶
Bases:
torch.nn.modules.module.Module
CrossEntropyLoss.
- Parameters
use_sigmoid (bool, optional) – Whether the prediction uses sigmoid or softmax. Defaults to False.
use_mask (bool, optional) – Whether to use mask cross entropy loss. Defaults to False.
reduction (str, optional) – . Defaults to ‘mean’. Options are “none”, “mean” and “sum”.
class_weight (list[float] | str, optional) – Weight of each class. If in str format, read them from a file. Defaults to None.
loss_weight (float, optional) – Weight of the loss. Defaults to 1.0.
loss_name (str, optional) – Name of the loss item. If you want this loss item to be included into the backward graph, loss_ must be the prefix of the name. Defaults to ‘loss_ce’.
avg_non_ignore (bool) – The flag decides whether the loss is only averaged over non-ignored targets. Default: False. New in version 0.23.0.
label_ceil (bool) – When use bce and set label_ceil=True, it will make elements belong to (0, 1] in label change to 1. Default: False.
- __init__(use_sigmoid=False, use_mask=False, reduction='mean', class_weight=None, loss_weight=1.0, loss_name='loss_ce', avg_non_ignore=False, label_ceil=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(cls_score, label, weight=None, avg_factor=None, reduction_override=None, ignore_index=- 100, **kwargs)[source]¶
Forward function.
- property loss_name¶
Loss Name.
This function must be implemented and will return the name of this loss function. This name will be used to combine different loss items by simple sum operation. In addition, if you want this loss item to be included into the backward graph, loss_ must be the prefix of the name.
- Returns
The name of this loss item.
- Return type
str
- training: bool¶
- class easycv.models.loss.FacePoseLoss(pose_weight=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(pose_weight=1.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.WingLossWithPose(num_points=106, left_eye_left_corner_index=66, right_eye_right_corner_index=79, points_weight=1.0, contour_weight=1.5, eyebrow_weight=1.5, eye_weight=1.7, nose_weight=1.3, lip_weight=1.7, omega=10, epsilon=2)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(num_points=106, left_eye_left_corner_index=66, right_eye_right_corner_index=79, points_weight=1.0, contour_weight=1.5, eyebrow_weight=1.5, eye_weight=1.7, nose_weight=1.3, lip_weight=1.7, omega=10, epsilon=2)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target, pose)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.FocalLoss(use_sigmoid=True, gamma=2.0, alpha=0.25, reduction='mean', loss_weight=1.0, activated=False)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(use_sigmoid=True, gamma=2.0, alpha=0.25, reduction='mean', loss_weight=1.0, activated=False)[source]¶
- Parameters
use_sigmoid (bool, optional) – Whether the prediction uses sigmoid instead of softmax. Defaults to True.
gamma (float, optional) – The gamma for calculating the modulating factor. Defaults to 2.0.
alpha (float, optional) – A balanced form for Focal Loss. Defaults to 0.25.
reduction (str, optional) – The method used to reduce the loss into a scalar. Defaults to ‘mean’. Options are “none”, “mean” and “sum”.
loss_weight (float, optional) – Weight of loss. Defaults to 1.0.
activated (bool, optional) – Whether the input is activated. If True, it means the input has been activated and can be treated as probabilities. Else, it should be treated as logits. Defaults to False.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning label of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”.
- Returns
The calculated loss
- Return type
torch.Tensor
- training: bool¶
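A usage sketch for FocalLoss in its default sigmoid form; the mmdetection-style input layout (logits [N, num_classes], integer class targets [N]) is an assumption.
import torch
from easycv.models.loss import FocalLoss

focal = FocalLoss(use_sigmoid=True, gamma=2.0, alpha=0.25, reduction='mean')

pred = torch.randn(8, 80)              # classification logits (not yet activated)
target = torch.randint(0, 80, (8,))    # ground-truth class indices
loss = focal(pred, target)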
- class easycv.models.loss.VarifocalLoss(use_sigmoid=True, alpha=0.75, gamma=2.0, iou_weighted=True, reduction='mean', loss_weight=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(use_sigmoid=True, alpha=0.75, gamma=2.0, iou_weighted=True, reduction='mean', loss_weight=1.0)[source]¶
Varifocal Loss.
- Parameters
use_sigmoid (bool, optional) – Whether the prediction uses sigmoid instead of softmax. Defaults to True.
alpha (float, optional) – A balance factor for the negative part of Varifocal Loss, which is different from the alpha of Focal Loss. Defaults to 0.75.
gamma (float, optional) – The gamma for calculating the modulating factor. Defaults to 2.0.
iou_weighted (bool, optional) – Whether to weight the loss of the positive examples with the iou target. Defaults to True.
reduction (str, optional) – The method used to reduce the loss into a scalar. Defaults to ‘mean’. Options are “none”, “mean” and “sum”.
loss_weight (float, optional) – Weight of loss. Defaults to 1.0.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning target of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”.
- Returns
The calculated loss
- Return type
torch.Tensor
- training: bool¶
- class easycv.models.loss.GIoULoss(eps=1e-06, reduction='mean', loss_weight=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(eps=1e-06, reduction='mean', loss_weight=1.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.IoULoss(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]¶
Bases:
torch.nn.modules.module.Module
IoULoss.
Computing the IoU loss between a set of predicted bboxes and target bboxes.
- Parameters
linear (bool) – If True, use linear scale of loss else determined by mode. Default: False.
eps (float) – Eps to avoid log(0).
reduction (str) – Options are “none”, “mean” and “sum”.
loss_weight (float) – Weight of loss.
mode (str) – Loss scaling mode, including “linear”, “square”, and “log”. Default: ‘log’
- __init__(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning target of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None. Options are “none”, “mean” and “sum”.
- training: bool¶
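A sketch comparing IoULoss and GIoULoss on a pair of boxes; the (x1, y1, x2, y2) box format is an assumption for illustration.
import torch
from easycv.models.loss import GIoULoss, IoULoss

iou_loss = IoULoss(mode='log', reduction='mean', loss_weight=1.0)
giou_loss = GIoULoss(eps=1e-06, reduction='mean')

pred = torch.tensor([[10., 10., 50., 50.], [20., 20., 60., 80.]])     # predicted boxes
target = torch.tensor([[12., 12., 48., 52.], [25., 25., 55., 75.]])   # ground-truth boxes
print(iou_loss(pred, target), giou_loss(pred, target))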
- class easycv.models.loss.YOLOX_IOULoss(reduction='none', loss_type='iou')[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(reduction='none', loss_type='iou')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.JointsMSELoss(use_target_weight=False, loss_weight=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
MSE loss for heatmaps.
- Parameters
use_target_weight (bool) – Option to use weighted MSE loss. Different joint types may have different target weights.
loss_weight (float) – Weight of the loss. Default: 1.0.
- __init__(use_target_weight=False, loss_weight=1.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
- class easycv.models.loss.FocalLoss2d(gamma=2, weight=None, size_average=None, reduce=None, reduction='mean', num_classes=2)[source]¶
Bases:
torch.nn.modules.loss._WeightedLoss
- __init__(gamma=2, weight=None, size_average=None, reduce=None, reduction='mean', num_classes=2)[source]¶
FocalLoss2d: a loss for addressing class imbalance in 2-class classification.
- Parameters
gamma – focal loss param Gamma
weight – weight same as loss._WeightedLoss
size_average – size_average same as loss._WeightedLoss
reduce – reduce same as loss._WeightedLoss
reduction – reduction same as loss._WeightedLoss
num_classes – fixed to 2
- Returns
Focalloss nn.module.loss object
- reduction: str¶
- class easycv.models.loss.DistributeMSELoss[source]¶
Bases:
torch.nn.modules.module.Module
- forward(input, target)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.CrossEntropyLossWithLabelSmooth(label_smooth=0.1, temperature=1.0, with_cls=False, embedding_size=512, num_classes=10000)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(label_smooth=0.1, temperature=1.0, with_cls=False, embedding_size=512, num_classes=10000)[source]¶
A softmax loss with label smoothing and an optional fc layer (to fit the pytorch metric learning interface).
- Parameters
label_smooth – label smoothing factor, default=0.1
with_cls – if True, a nn.Linear will be generated to transform the input embedding from embedding_size to num_classes
embedding_size – if the input is a feature rather than logits, this indicates the embedding shape
num_classes – if the input is a feature rather than logits, this indicates the number of classes
- Returns
None
- forward(input, target)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.AMSoftmaxLoss(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]¶
AMSoftmax loss, with an fc layer (to fit the pytorch metric learning interface). Paper: https://arxiv.org/pdf/1801.05599.pdf
- Parameters
embedding_size – forward input shape is [N, embedding_size]
num_classes – number of classification classes
margin – AMSoftmax margin parameter
scale – AMSoftmax scale parameter; it should increase as num_classes increases
- forward(x, lb)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
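A metric-learning sketch for the AMSoftmaxLoss above: embeddings of shape [N, embedding_size] and integer identity labels. The loss owns the fc layer mapping embeddings to classes, so only raw embeddings are passed in; batch size and class count are illustrative.
import torch
from easycv.models.loss import AMSoftmaxLoss

am_loss = AMSoftmaxLoss(embedding_size=512, num_classes=1000, margin=0.35, scale=30)

x = torch.randn(16, 512)             # backbone embeddings, [N, embedding_size]
lb = torch.randint(0, 1000, (16,))   # identity labels
loss = am_loss(x, lb)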
- class easycv.models.loss.ModelParallelSoftmaxLoss(embedding_size=512, num_classes=100000, scale=None, margin=None, bias=True)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(embedding_size=512, num_classes=100000, scale=None, margin=None, bias=True)[source]¶
ModelParallel Softmax by sailfish.
- Parameters
embedding_size – forward input shape is [N, embedding_size]
num_classes – number of classification classes
- forward(x, lb)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.ModelParallelAMSoftmaxLoss(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]¶
ModelParallel AMSoftmax by sailfish.
- Parameters
embedding_size – forward input shape is [N, embedding_size]
num_classes – number of classification classes
- forward(x, lb)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.SoftTargetCrossEntropy(num_classes=1000, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(num_classes=1000, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x: torch.Tensor, target: torch.Tensor) → torch.Tensor[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
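A sketch for SoftTargetCrossEntropy with soft (e.g. mixup or label-smoothed) targets; the [N, num_classes] target layout is an assumption, and one-hot targets are used here only to keep the example self-contained.
import torch
import torch.nn.functional as F
from easycv.models.loss import SoftTargetCrossEntropy

criterion = SoftTargetCrossEntropy(num_classes=1000)

x = torch.randn(4, 1000)                                          # logits
target = F.one_hot(torch.randint(0, 1000, (4,)), 1000).float()    # degenerate soft targets
loss = criterion(x, target)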
- class easycv.models.loss.CDNCriterion(num_classes, matcher, weight_dict, losses, eos_coef=None, loss_class_type='ce')[source]¶
Bases:
easycv.models.loss.set_criterion.set_criterion.SetCriterion
This class computes the loss for Conditional DETR. The process happens in two steps:
we compute hungarian assignment between ground truth boxes and the outputs of the model
we supervise each pair of matched ground-truth / prediction (supervise class and box)
- __init__(num_classes, matcher, weight_dict, losses, eos_coef=None, loss_class_type='ce')[source]¶
Create the criterion.
- Parameters
num_classes – number of object categories, omitting the special no-object category
matcher – module able to compute a matching between targets and proposals
weight_dict – dict containing as key the names of the losses and as values their relative weight
losses – list of all the losses to be applied. See get_loss for the list of available losses.
- forward(outputs, targets, aux_num, num_boxes)[source]¶
This performs the loss computation.
- Parameters
outputs – dict of tensors, see the output specification of the model for the format
targets – list of dicts, such that len(targets) == batch_size. The expected keys in each dict depend on the losses applied, see each loss’ doc.
return_indices – used for visualization. If True, the layer0-5 indices will be returned as well.
- training: bool¶
- class easycv.models.loss.DNCriterion(weight_dict)[source]¶
Bases:
torch.nn.modules.module.Module
This class computes the loss for Conditional DETR. The process happens in two steps:
we compute hungarian assignment between ground truth boxes and the outputs of the model
we supervise each pair of matched ground-truth / prediction (supervise class and box)
- __init__(weight_dict)[source]¶
Create the criterion.
- Parameters
num_classes – number of object categories, omitting the special no-object category
matcher – module able to compute a matching between targets and proposals
weight_dict – dict containing as key the names of the losses and as values their relative weight
losses – list of all the losses to be applied. See get_loss for the list of available losses.
- prepare_for_loss(mask_dict)[source]¶
Prepare DN components to calculate loss.
- Parameters
mask_dict – a dict that contains DN information
- tgt_loss_boxes(src_boxes, tgt_boxes, num_tgt)[source]¶
Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss. Targets dicts must contain the key “boxes” containing a tensor of dim [nb_target_boxes, 4]. The target boxes are expected in format (center_x, center_y, w, h), normalized by the image size.
- tgt_loss_labels(src_logits_, tgt_labels_, num_tgt, focal_alpha, log=False)[source]¶
Classification loss (NLL). Targets dicts must contain the key “labels” containing a tensor of dim [nb_target_boxes].
- forward(mask_dict, aux_num)[source]¶
Compute DN loss in criterion.
- Parameters
mask_dict – a dict for DN information
training – training or inference flag
aux_num – aux loss number
- training: bool¶
- class easycv.models.loss.DBLoss(balance_loss=True, main_loss_type='DiceLoss', alpha=5, beta=10, ohem_ratio=3, eps=1e-06, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
Differentiable Binarization (DB) Loss Function.
- Parameters
parm (dict) – the hyper-parameters for DB Loss
- __init__(balance_loss=True, main_loss_type='DiceLoss', alpha=5, beta=10, ohem_ratio=3, eps=1e-06, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
- forward(predicts, labels)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class easycv.models.loss.HungarianMatcher(cost_dict, cost_class_type='ce_cost')[source]¶
Bases:
torch.nn.modules.module.Module
This class computes an assignment between the targets and the predictions of the network For efficiency reasons, the targets don’t include the no_object. Because of this, in general, there are more predictions than targets. In this case, we do a 1-to-1 matching of the best predictions, while the others are un-matched (and thus treated as non-objects).
- __init__(cost_dict, cost_class_type='ce_cost')[source]¶
Creates the matcher.
- Parameters
cost_class – the relative weight of the classification error in the matching cost
cost_bbox – the relative weight of the L1 error of the bounding box coordinates in the matching cost
cost_giou – the relative weight of the GIoU loss of the bounding box in the matching cost
- forward(outputs, targets)[source]¶
Performs the matching.
- Parameters
outputs – a dict that contains at least these entries:
“pred_logits”: Tensor of dim [batch_size, num_queries, num_classes] with the classification logits
“pred_boxes”: Tensor of dim [batch_size, num_queries, 4] with the predicted box coordinates
targets – a list of targets (len(targets) = batch_size), where each target is a dict containing:
“labels”: Tensor of dim [num_target_boxes] (where num_target_boxes is the number of ground-truth objects in the target) containing the class labels
“boxes”: Tensor of dim [num_target_boxes, 4] containing the target box coordinates
- Returns
A list of size batch_size, containing tuples of (index_i, index_j), where index_i is the indices of the selected predictions (in order) and index_j is the indices of the corresponding selected targets (in order). For each batch element, it holds: len(index_i) = len(index_j) = min(num_queries, num_target_boxes).
- Return type
list
- training: bool¶
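A sketch of the input format described above for HungarianMatcher; the cost_dict keys shown are assumptions (the documentation only states that it holds the relative cost weights).
import torch
from easycv.models.loss import HungarianMatcher

matcher = HungarianMatcher(
    cost_dict={'cost_class': 1.0, 'cost_bbox': 5.0, 'cost_giou': 2.0},  # assumed keys
    cost_class_type='ce_cost')

outputs = {
    'pred_logits': torch.randn(2, 100, 91),  # [batch_size, num_queries, num_classes]
    'pred_boxes': torch.rand(2, 100, 4),     # normalized (cx, cy, w, h)
}
targets = [
    {'labels': torch.tensor([3, 17]), 'boxes': torch.rand(2, 4)},
    {'labels': torch.tensor([5]), 'boxes': torch.rand(1, 4)},
]
indices = matcher(outputs, targets)  # one (index_i, index_j) tuple per image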
- class easycv.models.loss.SetCriterion(num_classes, matcher, weight_dict, losses, eos_coef=None, loss_class_type='ce')[source]¶
Bases:
torch.nn.modules.module.Module
This class computes the loss for Conditional DETR. The process happens in two steps:
we compute hungarian assignment between ground truth boxes and the outputs of the model
we supervise each pair of matched ground-truth / prediction (supervise class and box)
- __init__(num_classes, matcher, weight_dict, losses, eos_coef=None, loss_class_type='ce')[source]¶
Create the criterion.
- Parameters
num_classes – number of object categories, omitting the special no-object category
matcher – module able to compute a matching between targets and proposals
weight_dict – dict containing as key the names of the losses and as values their relative weight
losses – list of all the losses to be applied. See get_loss for the list of available losses.
- loss_labels(outputs, targets, indices, num_boxes, log=True)[source]¶
Classification loss (binary focal loss). Targets dicts must contain the key “labels” containing a tensor of dim [nb_target_boxes].
- loss_cardinality(outputs, targets, indices, num_boxes)[source]¶
Compute the cardinality error, i.e. the absolute error in the number of predicted non-empty boxes. This is not really a loss; it is intended for logging purposes only and does not propagate gradients.
- loss_boxes(outputs, targets, indices, num_boxes)[source]¶
Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss. Targets dicts must contain the key “boxes” containing a tensor of dim [nb_target_boxes, 4]. The target boxes are expected in format (center_x, center_y, w, h), normalized by the image size.
- forward(outputs, targets, num_boxes=None, return_indices=False)[source]¶
This performs the loss computation.
- Parameters
outputs – dict of tensors, see the output specification of the model for the format
targets – list of dicts, such that len(targets) == batch_size. The expected keys in each dict depend on the losses applied, see each loss’ doc.
return_indices – used for visualization. If True, the layer0-5 indices will be returned as well.
- training: bool¶
- class easycv.models.loss.L1Loss(reduction='mean', loss_weight=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
L1 loss.
- Parameters
reduction (str, optional) – The method to reduce the loss. Options are “none”, “mean” and “sum”.
loss_weight (float, optional) – The weight of loss.
- __init__(reduction='mean', loss_weight=1.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning target of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
- training: bool¶
- class easycv.models.loss.MultiLoss(loss_config_list, weight_1=1.0, weight_2=1.0, gtc_loss='sar', **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(loss_config_list, weight_1=1.0, weight_2=1.0, gtc_loss='sar', **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(predicts, label_ctc=None, label_sar=None, length=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.SmoothL1Loss(beta=1.0, reduction='mean', loss_weight=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
Smooth L1 loss.
- Parameters
beta – The threshold in the piecewise function. Defaults to 1.0.
reduction (str, optional) – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.
loss_weight (float, optional) – The weight of loss.
- __init__(beta=1.0, reduction='mean', loss_weight=1.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning target of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.
- training: bool¶
- class easycv.models.loss.DiceLoss(smooth=1, exponent=2, reduction='mean', class_weight=None, loss_weight=1.0, ignore_index=255, loss_name='loss_dice', **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
DiceLoss.
This loss is proposed in V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation.
- Parameters
smooth (float) – A float number to smooth loss, and avoid NaN error. Default: 1
exponent (float) – A float number used to calculate the denominator value: sum{x^exponent} + sum{y^exponent}. Default: 2.
reduction (str, optional) – The method used to reduce the loss. Options are “none”, “mean” and “sum”. This parameter only works when per_image is True. Default: ‘mean’.
class_weight (list[float] | str, optional) – Weight of each class. If in str format, read them from a file. Defaults to None.
loss_weight (float, optional) – Weight of the loss. Default to 1.0.
ignore_index (int | None) – The label index to be ignored. Default: 255.
loss_name (str, optional) – Name of the loss item. If you want this loss item to be included into the backward graph, loss_ must be the prefix of the name. Defaults to ‘loss_dice’.
- __init__(smooth=1, exponent=2, reduction='mean', class_weight=None, loss_weight=1.0, ignore_index=255, loss_name='loss_dice', **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- property loss_name¶
Loss Name.
This function must be implemented and will return the name of this loss function. This name will be used to combine different loss items by simple sum operation. In addition, if you want this loss item to be included into the backward graph, loss_ must be the prefix of the name.
- Returns
The name of this loss item.
- Return type
str
- training: bool¶
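A segmentation-style sketch for DiceLoss; the per-pixel logits of shape [N, num_classes, H, W] and the integer label map of shape [N, H, W] are assumed shapes.
import torch
from easycv.models.loss import DiceLoss

dice = DiceLoss(smooth=1, exponent=2, loss_weight=1.0, ignore_index=255)

pred = torch.randn(2, 19, 64, 64)            # per-pixel class logits
target = torch.randint(0, 19, (2, 64, 64))   # per-pixel ground-truth labels
loss = dice(pred, target)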
Submodules¶
easycv.models.loss.iou_loss module¶
- class easycv.models.loss.iou_loss.YOLOX_IOULoss(reduction='none', loss_type='iou')[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(reduction='none', loss_type='iou')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.iou_loss.IoULoss(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]¶
Bases:
torch.nn.modules.module.Module
IoULoss.
Computing the IoU loss between a set of predicted bboxes and target bboxes.
- Parameters
linear (bool) – If True, use linear scale of loss else determined by mode. Default: False.
eps (float) – Eps to avoid log(0).
reduction (str) – Options are “none”, “mean” and “sum”.
loss_weight (float) – Weight of loss.
mode (str) – Loss scaling mode, including “linear”, “square”, and “log”. Default: ‘log’
- __init__(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Forward function.
- Parameters
pred (torch.Tensor) – The prediction.
target (torch.Tensor) – The learning target of the prediction.
weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.
reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None. Options are “none”, “mean” and “sum”.
- training: bool¶
- class easycv.models.loss.iou_loss.GIoULoss(eps=1e-06, reduction='mean', loss_weight=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(eps=1e-06, reduction='mean', loss_weight=1.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.loss.mse_loss module¶
- class easycv.models.loss.mse_loss.JointsMSELoss(use_target_weight=False, loss_weight=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
MSE loss for heatmaps.
- Parameters
use_target_weight (bool) – Option to use weighted MSE loss. Different joint types may have different target weights.
loss_weight (float) – Weight of the loss. Default: 1.0.
- __init__(use_target_weight=False, loss_weight=1.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
easycv.models.loss.pytorch_metric_learning module¶
- class easycv.models.loss.pytorch_metric_learning.FocalLoss2d(gamma=2, weight=None, size_average=None, reduce=None, reduction='mean', num_classes=2)[source]¶
Bases:
torch.nn.modules.loss._WeightedLoss
- __init__(gamma=2, weight=None, size_average=None, reduce=None, reduction='mean', num_classes=2)[source]¶
FocalLoss2d: a loss for addressing class imbalance in 2-class classification.
- Parameters
gamma – focal loss param Gamma
weight – weight same as loss._WeightedLoss
size_average – size_average same as loss._WeightedLoss
reduce – reduce same as loss._WeightedLoss
reduction – reduction same as loss._WeightedLoss
num_classes – fixed to 2
- Returns
Focalloss nn.module.loss object
- weight: Optional[Tensor]¶
- reduction: str¶
- training: bool¶
- class easycv.models.loss.pytorch_metric_learning.DistributeMSELoss[source]¶
Bases:
torch.nn.modules.module.Module
- forward(input, target)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.pytorch_metric_learning.CrossEntropyLossWithLabelSmooth(label_smooth=0.1, temperature=1.0, with_cls=False, embedding_size=512, num_classes=10000)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(label_smooth=0.1, temperature=1.0, with_cls=False, embedding_size=512, num_classes=10000)[source]¶
A softmax loss with label smoothing and an optional fc layer (to fit the pytorch metric learning interface).
- Parameters
label_smooth – label smoothing factor, default=0.1
with_cls – if True, a nn.Linear will be generated to transform the input embedding from embedding_size to num_classes
embedding_size – if the input is a feature rather than logits, this indicates the embedding shape
num_classes – if the input is a feature rather than logits, this indicates the number of classes
- Returns
None
- forward(input, target)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.pytorch_metric_learning.AMSoftmaxLoss(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]¶
AMSoftmax loss, with an fc layer (to fit the pytorch metric learning interface). Paper: https://arxiv.org/pdf/1801.05599.pdf
- Parameters
embedding_size – forward input shape is [N, embedding_size]
num_classes – number of classification classes
margin – AMSoftmax margin parameter
scale – AMSoftmax scale parameter; it should increase as num_classes increases
- forward(x, lb)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.pytorch_metric_learning.ModelParallelSoftmaxLoss(embedding_size=512, num_classes=100000, scale=None, margin=None, bias=True)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(embedding_size=512, num_classes=100000, scale=None, margin=None, bias=True)[source]¶
ModelParallel Softmax by sailfish.
- Parameters
embedding_size – forward input shape is [N, embedding_size]
num_classes – number of classification classes
- forward(x, lb)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.pytorch_metric_learning.ModelParallelAMSoftmaxLoss(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]¶
ModelParallel AMSoftmax by sailfish.
- Parameters
embedding_size – forward input shape is [N, embedding_size]
num_classes – number of classification classes
- forward(x, lb)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.loss.pytorch_metric_learning.SoftTargetCrossEntropy(num_classes=1000, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(num_classes=1000, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x: torch.Tensor, target: torch.Tensor) → torch.Tensor[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.pose package¶
Subpackages¶
easycv.models.pose.heads package¶
- easycv.models.pose.heads.topdown_heatmap_base_head.decode_heatmap(heatmaps, img_metas, test_cfg)[source]¶
- class easycv.models.pose.heads.topdown_heatmap_base_head.TopdownHeatmapBaseHead[source]¶
Bases:
torch.nn.modules.module.Module
Base class for top-down heatmap heads.
All top-down heatmap heads should subclass it. All subclass should overwrite:
Methods: get_loss (calculates the keypoint loss), get_accuracy (calculates accuracy), forward (runs the model forward), inference_model (runs model inference).
- decode(img_metas, output, **kwargs)[source]¶
Decode keypoints from heatmaps.
- Parameters
img_metas (list(dict)) – Information about data augmentation. By default this includes: “image_file”: path to the image file; “center”: center of the bbox; “scale”: scale of the bbox; “rotation”: rotation of the bbox; “bbox_score”: score of the bbox.
output (np.ndarray[N, K, H, W]) – model predicted heatmaps.
- training: bool¶
- class easycv.models.pose.heads.topdown_heatmap_simple_head.TopdownHeatmapSimpleHead(in_channels, out_channels, num_deconv_layers=3, num_deconv_filters=(256, 256, 256), num_deconv_kernels=(4, 4, 4), extra=None, in_index=0, input_transform=None, align_corners=False, loss_keypoint=None, train_cfg=None, test_cfg=None)[source]¶
Bases:
easycv.models.pose.heads.topdown_heatmap_base_head.TopdownHeatmapBaseHead
Top-down heatmap simple head. Paper ref: Bin Xiao et al., Simple Baselines for Human Pose Estimation and Tracking.
TopdownHeatmapSimpleHead consists of (>=0) deconv layers and a simple conv2d layer.
- Parameters
in_channels (int) – Number of input channels
out_channels (int) – Number of output channels
num_deconv_layers (int) – Number of deconv layers. num_deconv_layers should be >= 0. Note that 0 means no deconv layers.
num_deconv_filters (list|tuple) – Number of filters. If num_deconv_layers > 0, the length of num_deconv_filters should match num_deconv_layers.
num_deconv_kernels (list|tuple) – Kernel sizes.
in_index (int|Sequence[int]) – Input feature index. Default: 0
input_transform (str|None) – Transformation type of input features. Options: ‘resize_concat’, ‘multiple_select’, None. Default: None.
- ’resize_concat’: Multiple feature maps will be resized to the same size as the first one and then concatenated together. Usually used in the FCN head of HRNet.
- ’multiple_select’: Multiple feature maps will be bundled into a list and passed into the decode head.
- None: Only one selected feature map is allowed.
align_corners (bool) – align_corners argument of F.interpolate. Default: False.
loss_keypoint (dict) – Config for keypoint loss. Default: None.
- __init__(in_channels, out_channels, num_deconv_layers=3, num_deconv_filters=(256, 256, 256), num_deconv_kernels=(4, 4, 4), extra=None, in_index=0, input_transform=None, align_corners=False, loss_keypoint=None, train_cfg=None, test_cfg=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- get_loss(output, target, target_weight)[source]¶
Calculate top-down keypoint loss.
Note
batch_size: N num_keypoints: K heatmaps height: H heatmaps width: W
- Parameters
output (torch.Tensor[NxKxHxW]) – Output heatmaps.
target (torch.Tensor[NxKxHxW]) – Target heatmaps.
target_weight (torch.Tensor[NxKx1]) – Weights across different joint types.
- get_accuracy(output, target, target_weight)[source]¶
Calculate accuracy for top-down keypoint loss.
Note
batch_size: N num_keypoints: K heatmaps height: H heatmaps width: W
- Parameters
output (torch.Tensor[NxKxHxW]) – Output heatmaps.
target (torch.Tensor[NxKxHxW]) – Target heatmaps.
target_weight (torch.Tensor[NxKx1]) – Weights across different joint types.
- inference_model(x, flip_pairs=None)[source]¶
Inference function.
- Returns
Output heatmaps.
- Return type
output_heatmap (np.ndarray)
- Parameters
x (torch.Tensor[NxKxHxW]) – Input features.
flip_pairs (None | list[tuple()]) – Pairs of keypoints which are mirrored.
- training: bool¶
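A sketch of building the TopdownHeatmapSimpleHead above on 2048-channel backbone features for 17 keypoints; the loss_keypoint config and the input feature shape (a 256x192 input at stride 32) are assumptions for illustration.
import torch
from easycv.models.pose.heads.topdown_heatmap_simple_head import TopdownHeatmapSimpleHead

head = TopdownHeatmapSimpleHead(
    in_channels=2048,
    out_channels=17,
    num_deconv_layers=3,
    num_deconv_filters=(256, 256, 256),
    num_deconv_kernels=(4, 4, 4),
    loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True))  # assumed config

feats = torch.randn(2, 2048, 8, 6)   # backbone output for 256x192 images (assumed)
heatmaps = head(feats)               # expected: [2, 17, 64, 48] after three 2x deconvs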
Submodules¶
easycv.models.pose.top_down module¶
- class easycv.models.pose.top_down.TopDown(backbone, neck=None, keypoint_head=None, train_cfg=None, test_cfg=None, pretrained=None, loss_pose=None)[source]¶
Bases:
easycv.models.base.BaseModel
Top-down pose detectors.
- Parameters
backbone (dict) – Backbone modules to extract feature.
keypoint_head (dict) – Keypoint head to process feature.
train_cfg (dict) – Config for training. Default: None.
test_cfg (dict) – Config for testing. Default: None.
pretrained (str) – Path to the pretrained models.
loss_pose (None) – Deprecated arguments. Please use loss_keypoint for heads instead.
- __init__(backbone, neck=None, keypoint_head=None, train_cfg=None, test_cfg=None, pretrained=None, loss_pose=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property with_neck¶
Check if has neck.
- property with_keypoint¶
Check if has keypoint_head.
- forward_train(img, target, target_weight, img_metas, **kwargs)[source]¶
Defines the computation performed at every call when training.
- forward_test(img, img_metas, return_heatmap=False, **kwargs)[source]¶
Defines the computation performed at every call when testing.
- show_result(img, result, skeleton=None, kpt_score_thr=0.3, bbox_color='green', pose_kpt_color=None, pose_link_color=None, text_color='white', radius=4, thickness=1, font_scale=0.5, bbox_thickness=1, win_name='', show=False, show_keypoint_weight=False, wait_time=0, out_file=None)[source]¶
Draw result over img.
- Parameters
img (str or Tensor) – The image to be displayed.
result (list[dict]) – The results to draw over img (bbox_result, pose_result).
skeleton (list[list]) – The connection of keypoints. skeleton is 0-based indexing.
kpt_score_thr (float, optional) – Minimum score of keypoints to be shown. Default: 0.3.
bbox_color (str or tuple or Color) – Color of bbox lines.
pose_kpt_color (np.array[Nx3]) – Color of N keypoints. If None, do not draw keypoints.
pose_link_color (np.array[Mx3]) – Color of M links. If None, do not draw links.
text_color (str or tuple or Color) – Color of texts.
radius (int) – Radius of circles.
thickness (int) – Thickness of lines.
font_scale (float) – Font scales of texts.
win_name (str) – The window name.
show (bool) – Whether to show the image. Default: False.
show_keypoint_weight (bool) – Whether to change the transparency using the predicted confidence scores of keypoints.
wait_time (int) – Value of waitKey param. Default: 0.
out_file (str or None) – The filename to write the image. Default: None.
- Returns
Visualized img, only if not show or out_file.
- Return type
Tensor
- training: bool¶
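A config-style sketch of assembling the TopDown detector from the pieces documented in this section; the backbone type name and all values are illustrative assumptions, not a verified recipe.
# Hypothetical config dict for a TopDown pose model (values are illustrative).
model = dict(
    type='TopDown',
    backbone=dict(type='ResNet', depth=50),              # assumed backbone config
    keypoint_head=dict(
        type='TopdownHeatmapSimpleHead',
        in_channels=2048,
        out_channels=17,
        loss_keypoint=dict(type='JointsMSELoss', use_target_weight=True)),
    train_cfg=dict(),
    test_cfg=dict(flip_test=True))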
easycv.models.selfsup package¶
Submodules¶
easycv.models.selfsup.byol module¶
- class easycv.models.selfsup.byol.BYOL(backbone, neck=None, head=None, pretrained=None, base_momentum=0.996, **kwargs)[source]¶
Bases:
easycv.models.base.BaseModel
BYOL unofficial implementation. Paper: https://arxiv.org/abs/2006.07733
- __init__(backbone, neck=None, head=None, pretrained=None, base_momentum=0.996, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward_train(img, **kwargs)[source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_test(img, **kwargs)[source]¶
Abstract interface for model forward in testing
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward(img, mode='train', **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.selfsup.dino module¶
- class easycv.models.selfsup.dino.MultiCropWrapper(backbone, head)[source]¶
Bases:
torch.nn.modules.module.Module
Perform forward pass separately on each resolution input. The inputs corresponding to a single resolution are clubbed and single forward is run on the same resolution inputs. Hence we do several forward passes = number of different resolutions used. We then concatenate all the output features and run the head forward on these concatenated features.
- __init__(backbone, head)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.dino.DINOLoss(out_dim, ncrops, warmup_teacher_temp, teacher_temp, warmup_teacher_temp_epochs, nepochs, device, student_temp=0.1, center_momentum=0.9)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(out_dim, ncrops, warmup_teacher_temp, teacher_temp, warmup_teacher_temp_epochs, nepochs, device, student_temp=0.1, center_momentum=0.9)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(student_output, teacher_output, epoch)[source]¶
Cross-entropy between softmax outputs of the teacher and student networks.
- training: bool¶
- class easycv.models.selfsup.dino.DINO(backbone, train_preprocess=[], neck=None, config=None, pretrained=None)[source]¶
Bases:
easycv.models.base.BaseModel
- __init__(backbone, train_preprocess=[], neck=None, config=None, pretrained=None)[source]¶
Init DINO
- Parameters
backbone – backbone config to build the vision backbone
train_preprocess – [gaussBlur, mixUp, solarize]
neck – neck config to build the DINO neck
config – DINO parameter config
- forward_train(inputs)[source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_test(img, **kwargs)[source]¶
Abstract interface for model forward in testing
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_feature(img, **kwargs)[source]¶
Forward backbone
- Returns
feature tensor
- Return type
x (torch.Tensor)
- training: bool¶
- forward(img, gt_label=None, mode='train', extract_list=['neck'], **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
easycv.models.selfsup.mae module¶
- class easycv.models.selfsup.mae.MAE(backbone, neck, mask_ratio=0.75, norm_pix_loss=True, **kwargs)[source]¶
Bases:
easycv.models.base.BaseModel
- __init__(backbone, neck, mask_ratio=0.75, norm_pix_loss=True, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- patchify(imgs)[source]¶
convert image to patch
- Parameters
imgs – (N, 3, H, W)
- Returns
(N, L, patch_size**2 *3)
- Return type
x
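A shape illustration of the patchify contract above, written with plain torch ops rather than the model's own implementation; it assumes patch_size=16 and a 224x224 input, so L = (224/16)^2 = 196 and each patch flattens to 16*16*3 = 768 values.
import torch

imgs = torch.randn(2, 3, 224, 224)
p = 16
# Cut each image into non-overlapping p x p patches and flatten every patch.
patches = imgs.unfold(2, p, p).unfold(3, p, p)        # (N, 3, 14, 14, p, p)
patches = patches.permute(0, 2, 3, 4, 5, 1)           # (N, 14, 14, p, p, 3)
x = patches.reshape(2, 14 * 14, p * p * 3)            # (N, L, patch_size**2 * 3)
print(x.shape)                                        # torch.Size([2, 196, 768])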
- forward_loss(imgs, pred, mask)[source]¶
compute loss
- Parameters
imgs – (N, 3, H, W)
pred – (N, L, p*p*3)
mask – (N, L), 0 is keep, 1 is remove,
- forward_train(img, **kwargs)[source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_test(img, **kwargs)[source]¶
Abstract interface for model forward in testing
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward(img, mode='train', **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.selfsup.mixco module¶
- class easycv.models.selfsup.mixco.MIXCO(backbone, train_preprocess=[], neck=None, head=None, mixco_head=None, pretrained=None, queue_len=65536, feat_dim=128, momentum=0.999, **kwargs)[source]¶
Bases:
easycv.models.selfsup.moco.MOCO
MixCo: a mixup version of MoCo. https://arxiv.org/pdf/2010.06300.pdf
- __init__(backbone, train_preprocess=[], neck=None, head=None, mixco_head=None, pretrained=None, queue_len=65536, feat_dim=128, momentum=0.999, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward_train(img, **kwargs)[source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- training: bool¶
easycv.models.selfsup.moby module¶
- class easycv.models.selfsup.moby.MoBY(backbone, train_preprocess=[], neck=None, head=None, pretrained=None, queue_len=4096, contrast_temperature=0.2, momentum=0.99, online_drop_path_rate=0.2, target_drop_path_rate=0.0, **kwargs)[source]¶
Bases:
easycv.models.base.BaseModel
MoBY. Part of the code is borrowed from: https://github.com/SwinTransformer/Transformer-SSL/blob/main/models/moby.py.
- __init__(backbone, train_preprocess=[], neck=None, head=None, pretrained=None, queue_len=4096, contrast_temperature=0.2, momentum=0.99, online_drop_path_rate=0.2, target_drop_path_rate=0.0, **kwargs)[source]¶
Init Moby
- Parameters
backbone – backbone config to build vision backbone
train_preprocess – [gaussBlur, mixUp, solarize]
neck – neck config to build Moby Neck
head – head config to build Moby Neck
pretrained – pretrained weight for backbone
queue_len – moby queue length
contrast_temperature – contrastive_loss temperature
momentum – ema target weights momentum
online_drop_path_rate – for transformer based backbone, set online model drop_path_rate
target_drop_path_rate – for transformer based backbone, set target model drop_path_rate
- forward_train(img, **kwargs)[source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_test(img, **kwargs)[source]¶
Abstract interface for model forward in testing
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_feature(img, **kwargs)[source]¶
Forward backbone
- Returns
feature tensor
- Return type
x (torch.Tensor)
- forward(img, gt_label=None, mode='train', extract_list=['neck'], **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.selfsup.moco module¶
- class easycv.models.selfsup.moco.MOCO(backbone, train_preprocess=[], neck=None, head=None, pretrained=None, queue_len=65536, feat_dim=128, momentum=0.999, **kwargs)[source]¶
Bases:
easycv.models.base.BaseModel
MOCO. Part of the code is borrowed from: https://github.com/facebookresearch/moco/blob/master/moco/builder.py.
- __init__(backbone, train_preprocess=[], neck=None, head=None, pretrained=None, queue_len=65536, feat_dim=128, momentum=0.999, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward_train(img, **kwargs)[source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_test(img, **kwargs)[source]¶
Abstract interface for model forward in testing
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_feature(img, **kwargs)[source]¶
Forward backbone
- Returns
feature tensor
- Return type
x (torch.Tensor)
- forward(img, gt_label=None, mode='train', extract_list=['neck'], **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
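A config-style sketch for MOCO using the neck classes documented below; the neck and head type names and all numbers are assumptions modeled on common MoCo v2 settings, not a verified EasyCV config.
# Hypothetical MOCO config (values are illustrative).
model = dict(
    type='MOCO',
    backbone=dict(type='ResNet', depth=50),     # assumed backbone config
    neck=dict(
        type='NonLinearNeckV1',                 # fc-relu-fc projection head (see below)
        in_channels=2048,
        hid_channels=2048,
        out_channels=128,
        with_avg_pool=True),
    head=dict(type='ContrastiveHead', temperature=0.2),  # assumed head type
    queue_len=65536,
    feat_dim=128,
    momentum=0.999)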
easycv.models.selfsup.necks module¶
- class easycv.models.selfsup.necks.DINONeck(in_dim, out_dim, use_bn=False, norm_last_layer=True, nlayers=3, hidden_dim=2048, bottleneck_dim=256)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(in_dim, out_dim, use_bn=False, norm_last_layer=True, nlayers=3, hidden_dim=2048, bottleneck_dim=256)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.necks.MoBYMLP(in_channels=256, hid_channels=4096, out_channels=256, num_layers=2, with_avg_pool=True)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(in_channels=256, hid_channels=4096, out_channels=256, num_layers=2, with_avg_pool=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.necks.NonLinearNeckSwav(in_channels, hid_channels, out_channels, with_avg_pool=True, export=False)[source]¶
Bases:
torch.nn.modules.module.Module
The non-linear neck in SwAV: fc-syncbn-relu-fc
- __init__(in_channels, hid_channels, out_channels, with_avg_pool=True, export=False)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.necks.NonLinearNeckV0(in_channels, hid_channels, out_channels, sync_bn=False, with_avg_pool=True)[source]¶
Bases:
torch.nn.modules.module.Module
The non-linear neck in ODC, fc-bn-relu-dropout-fc-relu
- __init__(in_channels, hid_channels, out_channels, sync_bn=False, with_avg_pool=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.necks.NonLinearNeckV1(in_channels, hid_channels, out_channels, with_avg_pool=True)[source]¶
Bases:
torch.nn.modules.module.Module
The non-linear neck in MoCO v2: fc-relu-fc
- __init__(in_channels, hid_channels, out_channels, with_avg_pool=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.necks.NonLinearNeckV2(in_channels, hid_channels, out_channels, with_avg_pool=True)[source]¶
Bases:
torch.nn.modules.module.Module
The non-linear neck in byol: fc-bn-relu-fc
- __init__(in_channels, hid_channels, out_channels, with_avg_pool=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.necks.NonLinearNeckSimCLR(in_channels, hid_channels, out_channels, num_layers=2, with_avg_pool=True)[source]¶
Bases:
torch.nn.modules.module.Module
SimCLR non-linear neck.
- Structure: fc(no_bias)-bn(has_bias)-[relu-fc(no_bias)-bn(no_bias)].
The substructures in [] can be repeated. In the SimCLR default setting, they are repeated once.
- However, PyTorch does not support specifying (weight=True, bias=False). It only supports "affine", which covers both the weight and the bias. Hence the second BatchNorm has a bias in this implementation, which differs from the official SimCLR implementation.
- Since SyncBatchNorm in pytorch<1.4.0 does not support 2D input, the input is expanded to 4D with shape (N, C, 1, 1). This workaround may still have issues; see the pull request here: https://github.com/pytorch/pytorch/pull/29626
- Parameters
in_channels – input channel number
hid_channels – hidden channels
out_channels – output channel number
num_layers (int) – number of fc layers, it is 2 in the SimCLR default setting.
with_avg_pool – output with average pooling
- __init__(in_channels, hid_channels, out_channels, num_layers=2, with_avg_pool=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.necks.RelativeLocNeck(in_channels, out_channels, sync_bn=False, with_avg_pool=True)[source]¶
Bases:
torch.nn.modules.module.Module
Relative patch location neck: fc-bn-relu-dropout
- __init__(in_channels, out_channels, sync_bn=False, with_avg_pool=True)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.necks.MAENeck(num_patches, embed_dim=768, patch_size=16, in_chans=3, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]¶
Bases:
torch.nn.modules.module.Module
MAE decoder
- Parameters
num_patches (int) – number of patches from encoder
embed_dim (int) – encoder embedding dimension
patch_size (int) – encoder patch size
in_chans (int) – input image channels
decoder_embed_dim (int) – decoder embedding dimension
decoder_depth (int) – number of decoder layers
decoder_num_heads (int) – Parallel attention heads
mlp_ratio (float) – mlp ratio
norm_layer – type of normalization layer
- __init__(num_patches, embed_dim=768, patch_size=16, in_chans=3, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
- forward(x, ids_restore)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- class easycv.models.selfsup.necks.FastConvMAENeck(num_patches, embed_dim=768, patch_size=16, in_channels=3, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]¶
Bases:
easycv.models.selfsup.necks.MAENeck
Fast ConvMAE decoder, refer to: https://github.com/Alpha-VL/FastConvMAE
- Parameters
num_patches (int) – number of patches from encoder
embed_dim (int) – encoder embedding dimension
patch_size (int) – encoder patch size
in_channels (int) – input image channels
decoder_embed_dim (int) – decoder embedding dimension
decoder_depth (int) – number of decoder layers
decoder_num_heads (int) – Parallel attention heads
mlp_ratio (float) – mlp ratio
norm_layer – type of normalization layer
- training: bool¶
- __init__(num_patches, embed_dim=768, patch_size=16, in_channels=3, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x, ids_restore)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
easycv.models.selfsup.simclr module¶
- class easycv.models.selfsup.simclr.SimCLR(backbone, train_preprocess=[], neck=None, head=None, pretrained=None)[source]¶
Bases:
easycv.models.base.BaseModel
- __init__(backbone, train_preprocess=[], neck=None, head=None, pretrained=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward_train(img, **kwargs)[source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_test(img, **kwargs)[source]¶
Abstract interface for model forward in testing
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward(img, mode='train', **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
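Example
A hedged, config-style sketch of how the constructor arguments above (backbone, neck, head) are typically filled in an EasyCV model config; the nested type names and values are illustrative assumptions, not copied from a shipped config:
model = dict(
    type='SimCLR',
    backbone=dict(type='ResNet', depth=50),
    neck=dict(
        type='NonLinearNeckSimCLR',  # the SimCLR projection neck documented above
        in_channels=2048,
        hid_channels=2048,
        out_channels=128,
        num_layers=2,
        with_avg_pool=True),
    head=dict(type='ContrastiveHead', temperature=0.1))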
easycv.models.selfsup.swav module¶
- class easycv.models.selfsup.swav.SWAV(backbone, train_preprocess=[], neck=None, config=None, pretrained=None)[source]¶
Bases:
easycv.models.base.BaseModel
- __init__(backbone, train_preprocess=[], neck=None, config=None, pretrained=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward_train(inputs)[source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_test(img, **kwargs)[source]¶
Abstract interface for model forward in testing
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_feature(img, **kwargs)[source]¶
Forward backbone
- Returns
feature tensor
- Return type
x (torch.Tensor)
- forward(img, gt_label=None, mode='train', extract_list=['neck'], **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.selfsup.swav.MultiPrototypes(output_dim, nmb_prototypes)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(output_dim, nmb_prototypes)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.utils package¶
Submodules¶
easycv.models.utils.accuracy module¶
easycv.models.utils.activation module¶
- class easycv.models.utils.activation.FReLU(in_channel)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(in_channel)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- easycv.models.utils.activation.build_activation_layer(cfg)[source]¶
Build activation layer.
- Parameters
cfg (dict) – The activation layer config, which should contain: - type (str): Layer type. - layer args: Args needed to instantiate an activation layer.
- Returns
Created activation layer.
- Return type
nn.Module
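Example
A minimal sketch, assuming common activations such as ReLU are registered under their class names (as in mmcv-style activation registries):
from easycv.models.utils.activation import build_activation_layer

# extra keys in the config are forwarded to the layer constructor
act = build_activation_layer(dict(type='ReLU', inplace=True))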
easycv.models.utils.conv_module module¶
- easycv.models.utils.conv_module.build_conv_layer(cfg, *args, **kwargs)[source]¶
Build convolution layer
- Parameters
cfg (None or dict) – cfg should contain: type (str): identify conv layer type. layer args: args needed to instantiate a conv layer.
- Returns
created conv layer
- Return type
layer (nn.Module)
- class easycv.models.utils.conv_module.ConvModule(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias='auto', conv_cfg=None, norm_cfg=None, act_cfg={'type': 'ReLU'}, inplace=True, order=('conv', 'norm', 'act'))[source]¶
Bases:
torch.nn.modules.module.Module
A conv block that contains conv/norm/activation layers.
- Parameters
in_channels (int) – Same as nn.Conv2d.
out_channels (int) – Same as nn.Conv2d.
kernel_size (int or tuple[int]) – Same as nn.Conv2d.
stride (int or tuple[int]) – Same as nn.Conv2d.
padding (int or tuple[int]) – Same as nn.Conv2d.
dilation (int or tuple[int]) – Same as nn.Conv2d.
groups (int) – Same as nn.Conv2d.
bias (bool or str) – If specified as auto, it will be decided by the norm_cfg. Bias will be set as True if norm_cfg is None, otherwise False.
conv_cfg (dict) – Config dict for convolution layer.
norm_cfg (dict) – Config dict for normalization layer.
act_cfg (dict) – Config of activation layers. Default: dict(type=’ReLU’)
inplace (bool) – Whether to use inplace mode for activation.
order (tuple[str]) – The order of conv/norm/activation layers. It is a sequence of “conv”, “norm” and “act”. Examples are (“conv”, “norm”, “act”) and (“act”, “conv”, “norm”).
- __init__(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias='auto', conv_cfg=None, norm_cfg=None, act_cfg={'type': 'ReLU'}, inplace=True, order=('conv', 'norm', 'act'))[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property norm¶
- forward(x, activate=True, norm=True)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
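Example
A small sketch of the documented arguments; with a norm_cfg set, bias='auto' disables the conv bias:
import torch
from easycv.models.utils.conv_module import ConvModule

# conv + BN + ReLU in the default ('conv', 'norm', 'act') order
block = ConvModule(3, 16, kernel_size=3, padding=1,
                   norm_cfg=dict(type='BN'), act_cfg=dict(type='ReLU'))
out = block(torch.randn(2, 3, 32, 32))  # -> (2, 16, 32, 32)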
easycv.models.utils.conv_ws module¶
- easycv.models.utils.conv_ws.conv_ws_2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1, eps=1e-05)[source]¶
- class easycv.models.utils.conv_ws.ConvWS2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, eps=1e-05)[source]¶
Bases:
torch.nn.modules.conv.Conv2d
- __init__(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, eps=1e-05)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- bias: Optional[torch.Tensor]¶
- in_channels: int¶
- out_channels: int¶
- kernel_size: Tuple[int, ...]¶
- stride: Tuple[int, ...]¶
- padding: Union[str, Tuple[int, ...]]¶
- dilation: Tuple[int, ...]¶
- transposed: bool¶
- output_padding: Tuple[int, ...]¶
- groups: int¶
- padding_mode: str¶
- weight: torch.Tensor¶
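Example
A minimal sketch; ConvWS2d is used like nn.Conv2d but standardizes its weight before convolving:
import torch
from easycv.models.utils.conv_ws import ConvWS2d

conv = ConvWS2d(3, 16, kernel_size=3, padding=1)
out = conv(torch.randn(2, 3, 32, 32))  # -> (2, 16, 32, 32)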
easycv.models.utils.dist_utils module¶
- class easycv.models.utils.dist_utils.DistributedLossWrapper(loss, **kwargs)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(loss, **kwargs)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(embeddings, labels, *args, **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.utils.dist_utils.DistributedMinerWrapper(miner)[source]¶
Bases:
torch.nn.modules.module.Module
- __init__(miner)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(embeddings, labels, ref_emb=None, ref_labels=None)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.utils.gather_layer module¶
- class easycv.models.utils.gather_layer.GatherLayer(*args, **kwargs)[source]¶
Bases:
torch.autograd.function.Function
Gather tensors from all processes, supporting backward propagation.
- static forward(ctx, input)[source]¶
Performs the operation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).
The context can be used to store arbitrary data that can then be retrieved during the backward pass. Tensors should not be stored directly on ctx (though this is not currently enforced for backward compatibility). Instead, tensors should be saved either with ctx.save_for_backward() if they are intended to be used in backward (equivalently, vjp) or ctx.save_for_forward() if they are intended to be used in jvp.
- static backward(ctx, *grads)[source]¶
Defines a formula for differentiating the operation with backward mode automatic differentiation (alias to the vjp function).
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as forward() returned (None will be passed in for non-tensor outputs of the forward function), and it should return as many tensors as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor, or is a Tensor not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs the gradient computed w.r.t. the output.
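Example
A sketch of typical GatherLayer usage in a contrastive loss; it assumes an already initialized torch.distributed process group and a CUDA device:
import torch
from easycv.models.utils.gather_layer import GatherLayer

# GatherLayer.apply returns one tensor per rank; gradients still flow to the local slice
embeddings = torch.randn(8, 128, device='cuda', requires_grad=True)
all_embeddings = torch.cat(GatherLayer.apply(embeddings), dim=0)  # (8 * world_size, 128)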
easycv.models.utils.init_weights module¶
easycv.models.utils.multi_pooling module¶
- class easycv.models.utils.multi_pooling.GeMPooling(p=3, eps=1e-06)[source]¶
Bases:
torch.nn.modules.module.Module
GeM pooling, used for image retrieval.
p = 1: average pooling.
p > 1: increases the contrast of the pooled feature map and focuses on the salient features of the image.
p = infinity: spatial max-pooling layer.
- __init__(p=3, eps=1e-06)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.utils.multi_pooling.MultiPooling(pool_type='adaptive', in_indices=(0), backbone='resnet50')[source]¶
Bases:
torch.nn.modules.module.Module
Pooling layers for features from multiple depths.
- POOL_PARAMS = {'resnet50': [{'kernel_size': 10, 'stride': 10, 'padding': 4}, {'kernel_size': 16, 'stride': 8, 'padding': 0}, {'kernel_size': 13, 'stride': 5, 'padding': 0}, {'kernel_size': 8, 'stride': 3, 'padding': 0}, {'kernel_size': 6, 'stride': 1, 'padding': 0}]}¶
- POOL_SIZES = {'resnet50': [12, 6, 4, 3, 2]}¶
- POOL_DIMS = {'resnet50': [9216, 9216, 8192, 9216, 8192]}¶
- __init__(pool_type='adaptive', in_indices=(0), backbone='resnet50')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
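Example
A hedged sketch: pool the first three ResNet-50 stage outputs with adaptive pooling. The list-style input is an assumption based on the "features from multiple depths" description:
import torch
from easycv.models.utils.multi_pooling import MultiPooling

pool = MultiPooling(pool_type='adaptive', in_indices=(0, 1, 2), backbone='resnet50')
feats = [torch.randn(2, 256, 56, 56),
         torch.randn(2, 512, 28, 28),
         torch.randn(2, 1024, 14, 14)]
outs = pool(feats)  # one pooled tensor per selected stage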
- class easycv.models.utils.multi_pooling.MultiAvgPooling(pool_type='adaptive', in_indices=(0), backbone='resnet50')[source]¶
Bases:
torch.nn.modules.module.Module
Pooling layers for features from multiple depths.
- POOL_PARAMS = {'resnet50': [{'kernel_size': 10, 'stride': 10, 'padding': 4}, {'kernel_size': 16, 'stride': 8, 'padding': 0}, {'kernel_size': 13, 'stride': 5, 'padding': 0}, {'kernel_size': 8, 'stride': 3, 'padding': 0}, {'kernel_size': 7, 'stride': 1, 'padding': 0}]}¶
- POOL_SIZES = {'resnet50': [12, 6, 4, 3, 1]}¶
- POOL_DIMS = {'resnet50': [9216, 9216, 8192, 9216, 2048]}¶
- __init__(pool_type='adaptive', in_indices=(0), backbone='resnet50')[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- training: bool¶
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
easycv.models.utils.norm module¶
- class easycv.models.utils.norm.SyncIBN(planes, ratio=0.5, eps=1e-05)[source]¶
Bases:
torch.nn.modules.module.Module
Instance-Batch Normalization layer from "Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net" <https://arxiv.org/pdf/1807.09441.pdf>
- Parameters
planes (int) – Number of channels for the input tensor
ratio (float) – Ratio of instance normalization in the IBN layer
- __init__(planes, ratio=0.5, eps=1e-05)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- class easycv.models.utils.norm.IBN(planes, ratio=0.5, eps=1e-05)[source]¶
Bases:
torch.nn.modules.module.Module
Instance-Batch Normalization layer from "Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net" <https://arxiv.org/pdf/1807.09441.pdf>
- Parameters
planes (int) – Number of channels for the input tensor
ratio (float) – Ratio of instance normalization in the IBN layer
- __init__(planes, ratio=0.5, eps=1e-05)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
- easycv.models.utils.norm.build_norm_layer(cfg, num_features, postfix='')[source]¶
Build normalization layer
- Parameters
cfg (dict) – cfg should contain: type (str): identify norm layer type. layer args: args needed to instantiate a norm layer. requires_grad (bool): [optional] whether stop gradient updates
num_features (int) – number of channels from input.
postfix (int, str) – appended into norm abbreviation to create named layer.
- Returns
abbreviation + postfix layer (nn.Module): created norm layer
- Return type
name (str)
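Example
A minimal sketch of the documented return value, a (name, layer) pair; the exact abbreviation scheme (e.g. 'bn1') is an assumption:
from easycv.models.utils.norm import build_norm_layer

name, bn = build_norm_layer(dict(type='BN', requires_grad=True), num_features=64, postfix=1)
# name is the norm abbreviation plus the postfix, bn is the created nn.Module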
easycv.models.utils.ops module¶
- easycv.models.utils.ops.resize_tensor(input, size=None, scale_factor=None, mode='nearest', align_corners=None, warning=True)[source]¶
Resize tensor with F.interpolate.
- Parameters
input (Tensor) – the input tensor.
size (Tuple[int, int]) – output spatial size.
scale_factor (float or Tuple[float]) – multiplier for spatial size. If scale_factor is a tuple, its length has to match input.dim().
mode (str) – algorithm used for upsampling: ‘nearest’ | ‘linear’ | ‘bilinear’ | ‘bicubic’ | ‘trilinear’ | ‘area’. Default: ‘nearest’
align_corners (bool) –
Geometrically, we consider the pixels of the input and output as squares rather than points. If set to True, the input and output tensors are aligned by the center points of their corner pixels, preserving the values at the corner pixels.
If set to False, the input and output tensors are aligned by the corner points of their corner pixels, and the interpolation uses edge value padding for out-of-boundary values, making this operation independent of input size when scale_factor is kept the same. This only has an effect when mode is ‘linear’, ‘bilinear’, ‘bicubic’ or ‘trilinear’.
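Example
A small sketch of the documented arguments:
import torch
from easycv.models.utils.ops import resize_tensor

x = torch.randn(1, 3, 32, 32)
y = resize_tensor(x, size=(64, 64), mode='bilinear', align_corners=False)  # -> (1, 3, 64, 64)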
easycv.models.utils.pos_embed module¶
- easycv.models.utils.pos_embed.get_2d_sincos_pos_embed(embed_dim, grid_size, cls_token=False)[source]¶
- Parameters
grid_size (int) – the grid height and width
- Returns
pos_embed – [grid_size*grid_size, embed_dim] or [1+grid_size*grid_size, embed_dim] (with or without cls_token)
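Example
A minimal sketch following the shapes in the docstring:
from easycv.models.utils.pos_embed import get_2d_sincos_pos_embed

# a 14x14 patch grid with a cls token -> shape (1 + 14*14, 768)
pos_embed = get_2d_sincos_pos_embed(768, 14, cls_token=True)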
easycv.models.utils.res_layer module¶
- class easycv.models.utils.res_layer.ResLayer(block, num_blocks, in_channels, out_channels, expansion=None, stride=1, avg_down=False, conv_cfg=None, norm_cfg={'type': 'BN'}, **kwargs)[source]¶
Bases:
torch.nn.modules.container.Sequential
ResLayer to build a ResNet-style backbone.
- Parameters
block (nn.Module) – Residual block used to build the ResLayer.
num_blocks (int) – Number of blocks.
in_channels (int) – Input channels of this block.
out_channels (int) – Output channels of this block.
expansion (int, optional) – The expansion for BasicBlock/Bottleneck. If not specified, it will first be obtained via block.expansion. If the block has no attribute "expansion", the following default values will be used: 1 for BasicBlock and 4 for Bottleneck. Default: None.
stride (int) – stride of the first block. Default: 1.
avg_down (bool) – Use AvgPool instead of stride conv when downsampling in the bottleneck. Default: False
conv_cfg (dict, optional) – dictionary to construct and config conv layer. Default: None
norm_cfg (dict) – dictionary to construct and config norm layer. Default: dict(type='BN')
easycv.models.utils.scale module¶
- class easycv.models.utils.scale.Scale(scale=1.0)[source]¶
Bases:
torch.nn.modules.module.Module
A learnable scale parameter
- __init__(scale=1.0)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
easycv.models.utils.sobel module¶
- class easycv.models.utils.sobel.Sobel[source]¶
Bases:
torch.nn.modules.module.Module
- forward(x)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool¶
Submodules¶
easycv.models.base module¶
- class easycv.models.base.BaseModel(init_cfg=None)[source]¶
Bases:
torch.nn.modules.module.Module
base class for model.
- __init__(init_cfg=None)[source]¶
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- property is_init: bool¶
- abstract forward_train(img: torch.Tensor, **kwargs) → Dict[str, torch.Tensor][source]¶
Abstract interface for model forward in training
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward_test(img: torch.Tensor, **kwargs) → Dict[str, torch.Tensor][source]¶
Abstract interface for model forward in testing
- Parameters
img (Tensor) – image tensor
kwargs (keyword arguments) – Specific to concrete implementation.
- forward(mode='train', *args, **kwargs)[source]¶
Defines the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the
Module
instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- train_step(data, optimizer)[source]¶
The iteration step during training.
This method defines an iteration step during training, except for the back propagation and optimizer updating, which are done in an optimizer hook. Note that in some complicated cases or models, the whole process including back propagation and optimizer updating is also defined in this method, such as GAN.
- Parameters
data (dict) – The output of dataloader.
optimizer (torch.optim.Optimizer | dict) – The optimizer of the runner is passed to train_step(). This argument is unused and reserved.
- Returns
It should contain at least 3 keys: loss, log_vars, num_samples. loss is a tensor for back propagation, which can be a weighted sum of multiple losses. log_vars contains all the variables to be sent to the logger. num_samples indicates the batch size (when the model is DDP, it means the batch size on each GPU), which is used for averaging the logs.
- Return type
dict
- val_step(data, optimizer)[source]¶
The iteration step during validation.
This method shares the same signature as train_step(), but is used during val epochs. Note that the evaluation after training epochs is not implemented with this method, but with an evaluation hook.
- training: bool¶
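Example
A minimal sketch of subclassing BaseModel; the layer and loss are placeholders, and dispatching through mode='train' is assumed to call forward_train as described above:
import torch
from easycv.models.base import BaseModel

class ToyModel(BaseModel):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(8, 2)

    def forward_train(self, img, **kwargs):
        # training must return a dict of loss tensors
        return {'loss': self.fc(img).mean()}

    def forward_test(self, img, **kwargs):
        return {'prob': self.fc(img).softmax(dim=-1)}

model = ToyModel()
losses = model(mode='train', img=torch.randn(4, 8))  # dispatched to forward_train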
easycv.models.builder module¶
- easycv.models.builder.build_positional_encoding(cfg, default_args=None)[source]¶
Builder for Position Encoding.
- easycv.models.builder.build_feedforward_network(cfg, default_args=None)[source]¶
Builder for feed-forward network (FFN).
easycv.models.modelzoo module¶
easycv.models.registry module¶
easycv.utils package¶
Submodules¶
easycv.utils.alias_multinomial module¶
easycv.utils.bbox_util module¶
easycv.utils.checkpoint module¶
- easycv.utils.checkpoint.load_checkpoint(model, filename, map_location='cpu', strict=False, logger=None, revise_keys=[('^module\\.', '')])[source]¶
Load checkpoint from a file or URI.
- Parameters
model (Module) – Module to load checkpoint.
filename (str) – Accepts a local filepath, URL, torchvision://xxx, open-mmlab://xxx. Please refer to docs/model_zoo.md for details.
map_location (str) – Same as torch.load().
strict (bool) – Whether to allow different params for the model and checkpoint.
logger (logging.Logger or None) – The logger for error messages.
revise_keys (list) – A list of customized keywords to modify the state_dict in the checkpoint. Each item is a (pattern, replacement) pair of regular expression operations. Default: strip the prefix 'module.' by [(r'^module\.', '')].
- Returns
The loaded checkpoint.
- Return type
dict or OrderedDict
- easycv.utils.checkpoint.save_checkpoint(model, filename, optimizer=None, meta=None)[source]¶
Save checkpoint to file.
The checkpoint will have 3 fields: meta, state_dict and optimizer. By default meta will contain version and time info.
- Parameters
model (Module) – Module whose params are to be saved.
filename (str) – Checkpoint filename.
optimizer (Optimizer, optional) – Optimizer to be saved.
meta (dict, optional) – Metadata to be saved in checkpoint.
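Example
A small sketch of the documented pair of helpers; the filename is illustrative:
import torch
from easycv.utils.checkpoint import load_checkpoint, save_checkpoint

model = torch.nn.Linear(8, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# the saved file contains the meta, state_dict and optimizer fields
save_checkpoint(model, 'epoch_1.pth', optimizer=optimizer)

# reload; the default revise_keys strip a leading 'module.' prefix from DDP checkpoints
checkpoint = load_checkpoint(model, 'epoch_1.pth', map_location='cpu', strict=False)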
easycv.utils.collect module¶
- easycv.utils.collect.nondist_forward_collect(func, data_loader, length)[source]¶
Forward and collect network outputs.
This function performs forward propagation and collects outputs. It can be used to collect results, features, losses, etc.
- Parameters
func (function) – The function to process data. The output must be a dictionary of CPU tensors.
length (int) – Expected length of output arrays.
- Returns
The concatenated outputs.
- Return type
results_all (dict(np.ndarray))
- easycv.utils.collect.dist_forward_collect(func, data_loader, rank, length, ret_rank=- 1)[source]¶
Forward and collect network outputs in a distributed manner.
This function performs forward propagation and collects outputs. It can be used to collect results, features, losses, etc.
- Parameters
func (function) – The function to process data. The output must be a dictionary of CPU tensors.
rank (int) – This process id.
length (int) – Expected length of output arrays.
ret_rank (int) – The process that returns. Other processes will return None.
- Returns
The concatenated outputs.
- Return type
results_all (dict(np.ndarray))
easycv.utils.config_tools module¶
- class easycv.utils.config_tools.WrapperConfig(cfg_dict=None, cfg_text=None, filename=None)[source]¶
Bases:
mmcv.utils.config.Config
A facility for config and config files. It supports common file formats as configs: python/json/yaml. The interface is the same as a dict object and also allows accessing config values as attributes.
Example
>>> cfg = Config(dict(a=1, b=dict(b1=[0, 1])))
>>> cfg.a
1
>>> cfg.b
{'b1': [0, 1]}
>>> cfg.b.b1
[0, 1]
>>> cfg = Config.fromfile('tests/data/config/a.py')
>>> cfg.filename
"/home/kchen/projects/mmcv/tests/data/config/a.py"
>>> cfg.item4
'test'
>>> cfg
"Config [path: /home/kchen/projects/mmcv/tests/data/config/a.py]: "
"{'item1': [1, 2], 'item2': {'a': 0}, 'item3': True, 'item4': 'test'}"
- easycv.utils.config_tools.check_base_cfg_path(base_cfg_name='configs/base.py', father_cfg_name=None, easycv_root=None)[source]¶
Concatenate paths by parsing path rules. For example (pseudo-code):
1. 'configs' in base_cfg_name or 'benchmarks' in base_cfg_name: base_cfg_name = easycv_root + base_cfg_name
2. 'configs' not in base_cfg_name and 'benchmarks' not in base_cfg_name: base_cfg_name = father_cfg_name + base_cfg_name
- easycv.utils.config_tools.mmcv_file2dict_base(ori_filename, first_order_params=None, easycv_root=None)[source]¶
- easycv.utils.config_tools.adapt_pai_params(cfg_dict)[source]¶
- Parameters
cfg_dict (dict) – All parameters of cfg.
- Returns
Add the cfg of export and oss.
- Return type
cfg_dict (dict)
- easycv.utils.config_tools.pai_config_fromfile(ori_filename, user_config_params=None, model_type=None)[source]¶
- easycv.utils.config_tools.config_dict_edit(ori_cfg_dict, cfg_dict, reg, dict_mem_helper)[source]¶
Edit ${configs.variables} in a config dict to resolve dependencies within the config.
ori_cfg_dict: used to find the true value of ${configs.variables}
cfg_dict: dict whose leaves are traversed recursively
reg: regular expression pattern used to find all ${configs.variables} in the leaves of the dict
dict_mem_helper: stores the true values of ${configs.variables} that have already been found
easycv.utils.constant module¶
easycv.utils.dist_utils module¶
- easycv.utils.dist_utils.obj2tensor(pyobj, device='cuda')[source]¶
Serialize picklable python object to tensor.
- easycv.utils.dist_utils.all_reduce_dict(py_dict, op='sum', group=None, to_float=True)[source]¶
Apply all reduce function for python dict object.
The code is modified from https://github.com/Megvii-BaseDetection/YOLOX/blob/main/yolox/utils/allreduce_norm.py.
NOTE: make sure that py_dict in different ranks has the same keys and the values should be in the same shape.
- Parameters
py_dict (dict) – Dict to be applied all reduce op.
op (str) – Operator, could be ‘sum’ or ‘mean’. Default: ‘sum’
group (
torch.distributed.group
, optional) – Distributed group, Default: None.to_float (bool) – Whether to convert all values of dict to float. Default: True.
- Returns
reduced python dict object.
- Return type
OrderedDict
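Example
A sketch that assumes an initialized torch.distributed process group and CUDA tensors; every rank must pass the same keys with same-shaped values:
import torch
from easycv.utils.dist_utils import all_reduce_dict

stats = {'loss': torch.tensor(0.5, device='cuda'), 'acc': torch.tensor(0.9, device='cuda')}
reduced = all_reduce_dict(stats, op='mean')  # OrderedDict averaged over all ranks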
- easycv.utils.dist_utils.sync_random_seed(seed=None, device='cuda')[source]¶
Make sure different ranks share the same seed. All workers must call this function, otherwise it will deadlock. This method is generally used in DistributedSampler, because the seed should be identical across all processes in the distributed group. In distributed sampling, different ranks should sample non-overlapping data from the dataset. Therefore, this function is used to make sure that each rank shuffles the data indices in the same order based on the same seed. Then different ranks can use different indices to select non-overlapping data from the same data list.
- Parameters
seed (int, optional) – The seed. Default to None.
device – The device where the seed will be put on. Default to 'cuda'.
- Returns
Seed to be used.
- Return type
int
easycv.utils.eval_utils module¶
- easycv.utils.eval_utils.generate_best_metric_name(evaluate_type, dataset_name, metric_names)[source]¶
Generate the best-metric name for different evaluators / datasets / metric_names.
evaluate_type: str
dataset_name: None or str
metric_names: None, str, list[str] or tuple(str)
- Returns
list[str]
easycv.utils.flops_counter module¶
- easycv.utils.flops_counter.get_model_info(model, input_size, model_config, logger)[source]¶
get_model_info, check model parameters and Gflops
- easycv.utils.flops_counter.get_model_complexity_info(model, input_res, print_per_layer_stat=True, as_strings=True, input_constructor=None, ost=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]¶
- easycv.utils.flops_counter.params_to_string(params_num)[source]¶
Convert a number to a human-readable string.
- Parameters
params_num (float) – number
- Returns
str – the formatted number
>>> params_to_string(1e9) '1000.0 M' >>> params_to_string(2e5) '200.0 k' >>> params_to_string(3e-9) '3e-09'
- easycv.utils.flops_counter.print_model_with_flops(model, units='GMac', precision=3, ost=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]¶
- easycv.utils.flops_counter.compute_average_flops_cost(self)[source]¶
A method that will be available after add_flops_counting_methods() is called on a desired net object. Returns current mean flops consumption per image.
- easycv.utils.flops_counter.start_flops_count(self)[source]¶
A method that will be available after add_flops_counting_methods() is called on a desired net object. Activates the computation of mean flops consumption per image. Call it before you run the network.
- easycv.utils.flops_counter.stop_flops_count(self)[source]¶
A method that will be available after add_flops_counting_methods() is called on a desired net object. Stops computing the mean flops consumption per image. Call whenever you want to pause the computation.
easycv.utils.gather module¶
easycv.utils.json_utils module¶
Utilities for dealing with writing json strings.
json_utils wraps json.dump and json.dumps so that they can be used to safely control the precision of floats when writing to json strings or files.
- class easycv.utils.json_utils.MyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]¶
Bases:
json.encoder.JSONEncoder
- default(o)[source]¶
Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).
For example, to support arbitrary iterators, you could implement default like this:
def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
- easycv.utils.json_utils.dump(obj, fid, float_digits=- 1, **params)[source]¶
Wrapper of json.dump that allows specifying the float precision used.
- Parameters
obj – The object to dump.
fid – The file id to write to.
float_digits – The number of digits of precision when writing floats out.
**params – Additional parameters to pass to json.dumps.
- easycv.utils.json_utils.dumps(obj, float_digits=- 1, **params)[source]¶
Wrapper of json.dumps that allows specifying the float precision used.
- Parameters
obj – The object to dump.
float_digits – The number of digits of precision when writing floats out.
**params – Additional parameters to pass to json.dumps.
- Returns
JSON string representation of obj.
- Return type
output
- easycv.utils.json_utils.compat_dumps(data, float_digits=- 1)[source]¶
Handle json dumps of Chinese text and numpy data.
- Parameters
data – python data structure
float_digits – The number of digits of precision when writing floats out.
- Returns
json str; in python2 the str is encoded with utf8, in python3 the str is unicode (python3 str)
- easycv.utils.json_utils.PrettyParams(**params)[source]¶
Returns parameters for use with Dump and Dumps to output pretty json.
- Example usage:
json_str = json_utils.Dumps(obj, **json_utils.PrettyParams())
json_str = json_utils.Dumps(obj, **json_utils.PrettyParams(allow_nans=False))
- Parameters
**params – Additional params to pass to json.dump or json.dumps.
- Returns
- Parameters that are compatible with json_utils.Dump and
json_utils.Dumps.
- Return type
params
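Example
A small sketch of the documented wrappers; the dict content is illustrative:
from easycv.utils import json_utils

record = {'bbox': [12.3456789, 48.0], 'score': 0.987654321}
compact = json_utils.dumps(record, float_digits=3)              # limit float precision
pretty = json_utils.dumps(record, **json_utils.PrettyParams())  # pretty-printed output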
easycv.utils.logger module¶
- easycv.utils.logger.get_root_logger(log_file=None, log_level=20)[source]¶
Get the root logger.
The logger will be initialized if it has not been initialized. By default a StreamHandler will be added. If log_file is specified, a FileHandler will also be added. The name of the root logger is the top-level package name, e.g., “easycv”.
- Parameters
log_file (str | None) – The log filename. If specified, a FileHandler will be added to the root logger.
log_level (int) – The root logger level. Note that only the process of rank 0 is affected, while other processes will set the level to “Error” and be silent most of the time.
- Returns
The root logger.
- Return type
logging.Logger
- easycv.utils.logger.print_log(msg, logger=None, level=20)[source]¶
Print a log message.
- Parameters
msg (str) – The message to be logged.
logger (logging.Logger | str | None) – The logger to be used. Some special loggers are: - “root”: the root logger obtained with get_root_logger(). - “silent”: no message will be printed. - None: The print() method will be used to print log messages.
level (int) – Logging level. Only available when logger is a Logger object or “root”.
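Example
A minimal sketch of the two helpers; the log file path is illustrative:
import logging
from easycv.utils.logger import get_root_logger, print_log

logger = get_root_logger(log_file='train.log', log_level=logging.INFO)
print_log('start training', logger='root')   # routed through the root logger
print_log('quiet message', logger='silent')  # dropped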
easycv.utils.metric_distance module¶
easycv.utils.misc module¶
- easycv.utils.misc.unmap(data, count, inds, fill=0)[source]¶
Unmap a subset of items (data) back to the original set of items (of size count).
- easycv.utils.misc.add_prefix(inputs, prefix)[source]¶
Add prefix for dict key.
- Parameters
inputs (dict) – The input dict with str keys.
prefix (str) – The prefix add to key name.
- Returns
The dict with keys wrapped with prefix.
dict
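Example
A small sketch; the '.' separator between prefix and key is an assumption:
from easycv.utils.misc import add_prefix

losses = {'loss_cls': 0.3, 'loss_bbox': 0.1}
wrapped = add_prefix(losses, 'det')  # e.g. {'det.loss_cls': 0.3, 'det.loss_bbox': 0.1}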
- easycv.utils.misc.reparameterize_models(model)[source]¶
- Reparameterize the model for inference, especially for rep conv blocks: merge the 3x3 and 1x1 weights.
Calls module switch_to_deploy recursively.
- Parameters
model – nn.Module
easycv.utils.preprocess_function module¶
- easycv.utils.preprocess_function.bninceptionPre(image, mean=[104, 117, 128], std=[1, 1, 1])[source]¶
- Parameters
image – pytorch Image tensor from PIL (range 0~1), bgr format
mean – norm mean
std – norm val
- Returns
An image normalized to 0~255, RGB format
- easycv.utils.preprocess_function.randomErasing(image, probability=0.5, sl=0.02, sh=0.2, r1=0.3, mean=[0.4914, 0.4822, 0.4465])[source]¶
easycv.utils.profiling module¶
easycv.utils.py_util module¶
easycv.utils.registry module¶
- class easycv.utils.registry.Registry(name)[source]¶
Bases:
object
- property name¶
- property module_dict¶
- easycv.utils.registry.build_from_cfg(cfg, registry, default_args=None)[source]¶
Build a module from config dict.
- Parameters
cfg (dict) – Config dict. It should at least contain the key “type”.
registry (
Registry
) – The registry to search the type from.default_args (dict, optional) – Default initialization arguments.
- Returns
The constructed object.
- Return type
obj
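Example
A minimal sketch; the register_module decorator is assumed to behave like mmcv-style registries, and DummyHead is a hypothetical class used only for illustration:
from easycv.utils.registry import Registry, build_from_cfg

HEADS = Registry('head')

@HEADS.register_module
class DummyHead:
    def __init__(self, num_classes=10):
        self.num_classes = num_classes

# 'type' selects the registered class; the remaining keys are constructor kwargs
head = build_from_cfg(dict(type='DummyHead', num_classes=100), HEADS)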
easycv.utils.test_util module¶
Contains functions which are convenient for unit testing.
- easycv.utils.test_util.replace_data_for_test(cfg)[source]¶
replace real data with test data
- Parameters
cfg – Config object
- easycv.utils.test_util.dist_exec_wrapper(cmd, nproc_per_node, node_rank=0, nnodes=1, port='29527', addr='127.0.0.1', python_path=None)[source]¶
Do not forget to init dist in the function or script run by cmd:
from mmcv.runner import init_dist
init_dist(launcher='pytorch')
- easycv.utils.test_util.computeStats(backend, timings, batch_size=1, model_name='default')[source]¶
compute the statistical metric of time and speed
- easycv.utils.test_util.benchmark(predictor, input_data_list, backend='BACKEND', batch_size=1, model_name='default', num=200)[source]¶
evaluate the time and speed of different models
- class easycv.utils.test_util.DistributedTestCase(methodName='runTest')[source]¶
Bases:
unittest.case.TestCase
Distributed TestCase for testing a function in distributed mode.
Examples
import torch
from mmcv.runner import init_dist
from torch import distributed as dist

def _test_func(*args, **kwargs):
    init_dist(launcher='pytorch')
    rank = dist.get_rank()
    if rank == 0:
        value = torch.tensor(1.0).cuda()
    else:
        value = torch.tensor(2.0).cuda()
    dist.all_reduce(value)
    return value.cpu().numpy()

class DistTest(DistributedTestCase):
    def test_function_dist(self):
        args = ()    # args should be python builtin type
        kwargs = {}  # kwargs should be python builtin type
        self.start_with_torch(
        )
- start_with_torch(func, num_gpus, assert_callback=None, save_all_ranks=False, *args, **kwargs)[source]¶
easycv package¶
Subpackages¶
easycv.file package¶
Submodules¶
easycv.file.base module¶
- class easycv.file.base.IOLocal[source]¶
Bases:
easycv.file.base.IOBase
easycv.file.file_io module¶
- easycv.file.file_io.set_oss_env(ak_id: str, ak_secret: str, hosts: Union[str, List[str]], buckets: Union[str, List[str]])[source]¶
- class easycv.file.file_io.IO(max_retry=10, retry_wait=0.1, max_retry_wait=30)[source]¶
Bases:
easycv.file.base.IOLocal
IO module to support both local and oss io. If accessing oss files, you need to authorize OSS; please refer to IO.access_oss.
- __init__(max_retry=10, retry_wait=0.1, max_retry_wait=30)[source]¶
Initialize self. See help(type(self)) for accurate signature.
- access_oss(ak_id: str = '', ak_secret: str = '', hosts: Union[str, List[str]] = '', buckets: Union[str, List[str]] = '')[source]¶
If accessing oss files, you need to authorize OSS as follows:
- Method 1:
from easycv.file import io
io.access_oss(
    ak_id='your_accesskey_id',
    ak_secret='your_accesskey_secret',
    hosts='your endpoint' or ['your endpoint1', 'your endpoint2'],
    buckets='your bucket' or ['your bucket1', 'your bucket2'])
- Method 2:
Add the oss config to your local file ~/.ossutilconfig, as follows (for more oss config information, please refer to: https://help.aliyun.com/document_detail/120072.html):
[Credentials]
language = CH
endpoint = your endpoint
accessKeyID = your_accesskey_id
accessKeySecret = your_accesskey_secret
[Bucket-Endpoint]
bucket1 = endpoint1
bucket2 = endpoint2
Then run the following command; the config file will be read by default to authorize oss.
from easycv.file import io
io.access_oss()
- open(full_path, mode='r')[source]¶
Same usage as the python built-in open. Supports local paths and oss paths.
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
# Write something to an oss file.
with io.open('oss://bucket_name/demo.txt', 'w') as f:
    f.write("test")
# Read from an oss file.
with io.open('oss://bucket_name/demo.txt', 'r') as f:
    print(f.read())
- Parameters
full_path – absolute oss path
- exists(path)[source]¶
Whether the file exists, same usage as os.path.exists. Support local path and oss path.
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
ret = io.exists('oss://bucket_name/dir')
print(ret)
- Parameters
path – oss path or local path
- move(src, dst)[source]¶
Move src to dst, same usage as shutil.move. Support local path and oss path.
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
# move oss file to local
io.move('oss://bucket_name/file.txt', '/your/local/path/file.txt')
# move oss file to oss
io.move('oss://bucket_name/file.txt', 'oss://bucket_name/file.txt')
# move local file to oss
io.move('/your/local/file.txt', 'oss://bucket_name/file.txt')
# move directory
io.move('oss://bucket_name/dir1', 'oss://bucket_name/dir2')
- Parameters
src – oss path or local path
dst – oss path or local path
- copy(src, dst)[source]¶
Copy a file from src to dst. Same usage as shutil.copyfile. If you want to copy a directory, please use easycv.io.copytree
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
# Copy a file from local to oss:
io.copy('/your/local/file.txt', 'oss://bucket/dir/file.txt')
# Copy an oss file to local:
io.copy('oss://bucket/dir/file.txt', '/your/local/file.txt')
# Copy a file from oss to oss:
io.copy('oss://bucket/dir/file.txt', 'oss://bucket/dir/file2.txt')
- Parameters
src – oss path or local path
dst – oss path or local path
- copytree(src, dst)[source]¶
Copy files recursively from src to dst. Same usage as shutil.copytree. If you want to copy a file, please use easycv.io.copy.
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
# copy files from local to oss
io.copytree(src='/your/local/dir1', dst='oss://bucket_name/dir2')
# copy files from oss to local
io.copytree(src='oss://bucket_name/dir2', dst='/your/local/dir1')
# copy files from oss to oss
io.copytree(src='oss://bucket_name/dir1', dst='oss://bucket_name/dir2')
- Parameters
src – oss path or local path
dst – oss path or local path
- listdir(path, recursive=False, full_path=False, contains: Optional[Union[List[str], str]] = None)[source]¶
List all objects in path. Same usage as os.listdir.
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
ret = io.listdir('oss://bucket/dir', recursive=True)
print(ret)
- Parameters
path – local file path or oss path.
recursive – If False, only list the top level objects. If True, recursively list all objects.
full_path – if full path, return files with path prefix.
contains – substr to filter list files.
Return: A list of paths.
- remove(path)[source]¶
Remove a file or a directory recursively. Same usage as os.remove or shutil.rmtree.
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
# Remove an oss file
io.remove('oss://bucket_name/file.txt')
# Remove an oss directory
io.remove('oss://bucket_name/dir/')
- Parameters
path – local or oss path, file or directory
- rmtree(path)[source]¶
Remove directory recursively, same usage as shutil.rmtree
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
io.remove('oss://bucket_name/dir_name')
# Or
io.remove('oss://bucket_name/dir_name/')
- Parameters
path – oss path
- makedirs(path, exist_ok=True)[source]¶
Create directories recursively, same usage as os.makedirs
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
io.makedirs('oss://bucket/new_dir/')
- Parameters
path – local or oss dir path
- isdir(path)[source]¶
Return whether a path is directory, same usage as os.path.isdir
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
io.isdir('oss://bucket/dir/')
- Parameters
path – local or oss path
Return: bool, True or False.
- isfile(path)[source]¶
Return whether a path is file object, same usage as os.path.isfile
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
io.isfile('oss://bucket/file.txt')
- Parameters
path – local or oss path
Return: bool, True or False.
- glob(file_path)[source]¶
Return a list of paths matching a pathname pattern.
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
io.glob('oss://bucket/dir/*.txt')
- Parameters
path – local or oss file pattern
Return: list, a list of paths.
- size(path: str) → int[source]¶
Get the size of file path, same usage as os.path.getsize
Example
from easycv.file import io
io.access_oss(your oss config)  # only oss files need this, refer to IO.access_oss
size = io.size('oss://bucket/file.txt')
print(size)
- Parameters
path – local or oss path.
Return: size of file in bytes
- class easycv.file.file_io.OSSFile(bucket, path, position=0)[source]¶
Bases:
object
easycv.runner package¶
Submodules¶
easycv.runner.ev_runner module¶
- class easycv.runner.ev_runner.EVRunner(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, fp16_enable=False)[source]¶
Bases:
mmcv.runner.epoch_based_runner.EpochBasedRunner
- __init__(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, fp16_enable=False)[source]¶
Epoch Runner for easycv, adding support for oss IO and file sync.
- Parameters
model (torch.nn.Module) – The model to be run.
batch_processor (callable) – A callable method that processes a data batch. The interface of this method should be batch_processor(model, data, train_mode) -> dict
optimizer (dict or torch.optim.Optimizer) – It can be either an optimizer (in most cases) or a dict of optimizers (in models that require more than one optimizer, e.g., GAN).
work_dir (str, optional) – The working directory to save checkpoints and logs. Defaults to None.
logger (logging.Logger) – Logger used during training. Defaults to None. (The default value is just for backward compatibility.)
meta (dict | None) – A dict that records some important information such as environment info and seed, which will be logged in the logger hook. Defaults to None.
fp16_enable (bool) – whether to use fp16
process for each iteration.
- Parameters
data_batch – Batch of dict of data.
train_mode (bool) – If set True, run the training step, else the validation step.
- train(data_loader, **kwargs)[source]¶
Training process for one epoch which will iterate through all training data and call hooks at different stages.
- Parameters
data_loader – data loader object for training
- val(data_loader, **kwargs)[source]¶
Validation step. Deprecated, use the evaluation hook instead.
- save_checkpoint(out_dir, filename_tmpl='epoch_{}.pth', save_optimizer=True, meta=None, create_symlink=True)[source]¶
Save checkpoint to file.
- Parameters
out_dir – Directory where checkpoint files are to be saved.
filename_tmpl (str, optional) – Checkpoint filename pattern.
save_optimizer (bool, optional) – save optimizer state.
meta (dict, optional) – Metadata to be saved in checkpoint.
- current_lr()[source]¶
Get current learning rates.
- Returns
Current learning rates of all param groups. If the runner has a dict of optimizers, this method will return a dict.
- Return type
list[float] | dict[str, list[float]]
- load_checkpoint(filename, map_location=device(type='cpu'), strict=False, logger=None)[source]¶
Load checkpoint from a file or URL.
- Parameters
filename (str) – Accepts a local filepath, URL, torchvision://xxx, open-mmlab://xxx, oss://xxx. Please refer to docs/source/model_zoo.md for details.
map_location (str) – Same as torch.load().
strict (bool) – Whether to allow different params for the model and checkpoint.
logger (logging.Logger or None) – The logger for error messages.
- Returns
The loaded checkpoint.
- Return type
dict or OrderedDict