EasyCV DOCUMENTATION

EasyCV is an all-in-one computer vision toolbox based on PyTorch, mainly focusing on self-supervised learning, image classification, metric learning, object detection, and so on.

Prepare Datasets

EasyCV provides various datasets for multiple tasks. Please refer to the following guide for data preparation and keep the same data structure.

Image Classification

Cifar10

The CIFAR-10 dataset is a labeled subset of the 80 million tiny images dataset. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

It consists of 60000 32x32 colour images in 10 classes, with 6000 images per class.

There are 50000 training images and 10000 test images.

Here is the list of classes in the CIFAR-10: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, truck.

For more detailed information, please refer to CIFAR.

Download

Download data from cifar-10-python.tar.gz (163MB) and uncompress the files to data/cifar10.

Directory structure is as follows:

data/cifar10
└── cifar-10-batches-py
    ├── batches.meta
    ├── data_batch_1
    ├── data_batch_2
    ├── data_batch_3
    ├── data_batch_4
    ├── data_batch_5
    ├── readme.html
    ├── read.py
    └── test_batch

Cifar100

The CIFAR-100 dataset is a labeled subset of the 80 million tiny images dataset. It was collected by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton.

This dataset is just like the CIFAR-10, except it has 100 classes containing 600 images each.

There are 500 training images and 100 testing images per class.

The 100 classes in the CIFAR-100 are grouped into 20 superclasses. Each image comes with a “fine” label (the class to which it belongs) and a “coarse” label (the superclass to which it belongs).

For more detailed information, please refer to CIFAR.

Download

Download data from cifar-100-python.tar.gz (161MB) and uncompress the files to data/cifar100.

Directory structure should be as follows:

data/cifar100
└── cifar-100-python
    ├── file.txt~
    ├── meta
    ├── test
    ├── train

Imagenet-1k

ImageNet is an image database organized according to the WordNet hierarchy (currently only the nouns).

It is used in the ImageNet Large Scale Visual Recognition Challenge(ILSVRC) and is a benchmark for image classification.

For more detailed information, please refer to ImageNet.

Download

ILSVRC2012 is widely used; download it as follows:

  1. Go to the download-url, register an account, and log in.

  2. ILSVRC2012 is recommended; download the following files:

    • Training images (Task 1 & 2). 138GB.

    • Validation images (all tasks). 6.3GB.

  3. Unzip the downloaded files.

  4. Use this script to get the data meta files (a minimal sketch is shown after the directory structure below).

Directory structure should be as follows:

data/imagenet
└── train
    └── n01440764
    └── n01443537
    └── ...
└── val
    └── n01440764
    └── n01443537
    └── ...
└── meta
    ├── train.txt
    ├── val.txt
    ├── ...
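If you prefer not to use the provided script, a minimal sketch of building meta/train.txt from the class folders could look like the following; the exact "relative_path label" layout produced by the official script may differ, so treat this as an illustration only.

```python
# Hypothetical sketch: write "class_dir/image_name label_index" lines for every training image.
import os

root = 'data/imagenet/train'
classes = sorted(os.listdir(root))        # e.g. n01440764, n01443537, ...
os.makedirs('data/imagenet/meta', exist_ok=True)
with open('data/imagenet/meta/train.txt', 'w') as f:
    for label, cls in enumerate(classes):
        for name in sorted(os.listdir(os.path.join(root, cls))):
            f.write(f'{cls}/{name} {label}\n')
```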

Imagenet-1k-TFrecords

The original ImageNet raw images packed in TFRecord format.

For more detailed information about Imagenet dataset, please refer to ImageNet.

Download
  1. Go to the download-url, register an account, and log in.

  2. The dataset is divided into two parts, part0 (79GB) and part1 (75GB); you need to download both of them.

Directory structure should be as follows; put the image files and the idx files in the same folder:

data/imagenet
└── train
    ├── train-00000-of-01024
    ├── train-00000-of-01024.idx
    ├── train-00001-of-01024
    ├── train-00001-of-01024.idx
    ├── ...
└── validation
    ├── validation-00000-of-00128
    ├── validation-00000-of-00128.idx
    ├── validation-00001-of-00128
    ├── validation-00001-of-00128.idx
    ├── ...

Object Detection

PAI-iTAG detection

PAI-iTAG is a platform for intelligent data annotation, which supports the annotation of various data types such as images, texts, videos, and audios, as well as multi-modal mixed annotation.

Please refer to 智能标注iTAG (iTAG intelligent annotation) for the file format and data annotation.

Download

Download the SmallCOCO dataset to data/demo_itag_coco. The directory structure should be as follows:

data/demo_itag_coco/
├── train2017
├── train2017_20_local.manifest
├── val2017
└── val2017_20_local.manifest

COCO2017

The COCO dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.

The COCO dataset has been updated over several editions, and coco2017 is widely used. In 2017, the training/validation split was 118K/5K, and the test set is a subset of 41K images from the 2015 test set.

For more detailed information, please refer to COCO.

Download

Download train2017.zip (18G), val2017.zip (1G) and annotations_trainval2017.zip (241MB), and uncompress the files to data/coco2017.

Directory structure is as follows:

data/coco2017
└── annotations
    ├── instances_train2017.json
    ├── instances_val2017.json
└── train2017
    ├── 000000000009.jpg
    ├── 000000000025.jpg
    ├── ...
└── val2017
    ├── 000000000139.jpg
    ├── 000000000285.jpg
    ├── ...

VOC2007

PASCAL VOC 2007 is a dataset for image recognition. The twenty object classes that have been selected are:

  • Person: person

  • Animal: bird, cat, cow, dog, horse, sheep

  • Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train

  • Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

Each image in this dataset has pixel-level segmentation annotations, bounding box annotations, and object class annotations.

For more detailed information, please refer to voc2007.

Download

Download VOCtrainval_06-Nov-2007.tar (439MB) and uncompress the files to data/VOCdevkit.

Directory structure is as follows:

data/VOCdevkit
└── VOC2007
    └── Annotations
        ├── 000005.xml
        ├── 001010.xml
    	├── ...
    └── JPEGImages
        ├── 000005.jpg
        ├── 001010.jpg
        ├── ...
    └── SegmentationClass
        ├── 000005.png
        ├── 001010.png
        ├── ...
    └── SegmentationObject
        ├── 000005.png
        ├── 001010.png
        ├── ...
    └── ImageSets
        └── Layout
            ├── train.txt
            ├── trainval.txt
            ├── val.txt
        └── Main
            ├── train.txt
            ├── val.txt
            ├── ...
        └── Segmentation
            ├── train.txt
            ├── trainval.txt
            ├── val.txt

VOC2012

The PASCAL VOC 2012 dataset contains 20 object categories including:

  • Person: person

  • Animal: bird, cat, cow, dog, horse, sheep

  • Vehicle: aeroplane, bicycle, boat, bus, car, motorbike, train

  • Indoor: bottle, chair, dining table, potted plant, sofa, tv/monitor

Each image in this dataset has pixel-level segmentation annotations, bounding box annotations, and object class annotations.

For more detailed information, please refer to voc2012.

Download

Download VOCtrainval_11-May-2012.tar (2G) and uncompress the files to data/VOCdevkit.

Directory structure is as follows:

data/VOCdevkit
└── VOC2012
    └── Annotations
        ├── 000005.xml
        ├── 001010.xml
    	├── ...
    └── JPEGImages
        ├── 000005.jpg
        ├── 001010.jpg
        ├── ...
    └── SegmentationClass
        ├── 000005.png
        ├── 001010.png
        ├── ...
    └── SegmentationObject
        ├── 000005.png
        ├── 001010.png
        ├── ...
    └── ImageSets
        └── Layout
            ├── train.txt
            ├── trainval.txt
            ├── val.txt
        └── Main
            ├── train.txt
            ├── val.txt
            ├── ...
        └── Segmentation
            ├── train.txt
            ├── trainval.txt
            ├── val.txt

Self-Supervised Learning

Imagenet-1k

Refer to Image Classification: Imagenet-1k.

Imagenet-1k-TFrecords

Refer to Image Classification: Imagenet-1k-TFrecords.

Pose

COCO2017

The COCO dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.

The COCO dataset has been updated over several editions, and coco2017 is widely used. In 2017, the training/validation split was 118K/5K, and the test set is a subset of 41K images from the 2015 test set.

For more detailed information, please refer to COCO.

Download

Download it as follows:

  1. Download data: train2017.zip (18G) , val2017.zip (1G)

  2. Download annotations: annotations_trainval2017.zip (241MB)

  3. Download person detection results: HRNet-Human-Pose-Estimation provides person detection result of COCO val2017 to reproduce our multi-person pose estimation results. Please download from OneDrive or GoogleDrive (26.2MB).

Then uncompress the files to data/coco2017. The directory structure is as follows:

data/coco2017
└── annotations
    ├── person_keypoints_train2017.json
    ├── person_keypoints_val2017.json
└── person_detection_results
    ├── COCO_val2017_detections_AP_H_56_person.json
    ├── COCO_test-dev2017_detections_AP_H_609_person.json
└── train2017
    ├── 000000000009.jpg
    ├── 000000000025.jpg
    ├── ...
└── val2017
    ├── 000000000139.jpg
    ├── 000000000285.jpg
    ├── ...

Image Segmentation

COCO Stuff 164k

For the COCO Stuff 164k dataset, please run the following commands to download and convert the augmented dataset.

# download
mkdir coco_stuff164k && cd coco_stuff164k
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
wget http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip

# unzip
unzip train2017.zip -d images/
unzip val2017.zip -d images/
unzip stuffthingmaps_trainval2017.zip -d annotations/

# --nproc means 8 processes for conversion; it can be omitted as well.
python tools/prepare_data/coco_stuff164k.py /path/to/coco_stuff164k --nproc 8

By convention, mask labels in /path/to/coco_stuff164k/annotations/*2017/*_labelTrainIds.png are used for COCO Stuff 164k training and testing.

The details of this dataset could be found at here.

Object Detection 3D

NuScenes

Download the nuScenes V1.0 full dataset and the CAN bus expansion data HERE. Prepare the nuscenes data by running:

python tools/prepare_data/prepare_nuscenes.py \
--root_path=./data/nuscenes \
--canbus_root_path=./data/canbus \
--out_dir=./data/nuscenes \
--version=v1.0

It will generate nuscenes_infos_temporal_{train,val}.pkl files.

The data structure is as follows:

data/nuscenes
 ├── can_bus
 ├── nuscenes-v1.0
 │   ├── maps
 │   ├── samples
 │   ├── sweeps
 │   ├── v1.0-test
 |   ├── v1.0-trainval
 |   ├── nuscenes_infos_temporal_train.pkl
 |   ├── nuscenes_infos_temporal_val.pkl
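To sanity-check the generated info files, a quick load with pickle is enough; the internal layout of the info records is an implementation detail, and the path below should be adjusted to wherever the .pkl files were written (see the structure above):

```python
# Quick check that a generated info file exists and can be deserialized.
import pickle

with open('data/nuscenes/nuscenes_infos_temporal_train.pkl', 'rb') as f:
    infos = pickle.load(f)
print(type(infos))
```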

Quick Start

Prerequisites

  • python >= 3.6

  • PyTorch >= 1.5

  • mmcv >= 1.2.0

  • nvidia-dali == 0.25.0

Installation

Prepare environment

  1. Create a conda virtual environment and activate it.

    conda create -n ev python=3.6 -y
    conda activate ev
    
  2. Install PyTorch and torchvision

    The master branch works with PyTorch 1.5.1 or higher.

    conda install pytorch==1.7.0 torchvision==0.8.0 -c pytorch
    
  3. Install some python dependencies

    replace {cu_version} and {torch_version} with the versions used in your environment

    # install mmcv
    pip install mmcv-full==1.4.4 -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
    # for example, install mmcv-full for cuda10.1 and pytorch 1.7.0
    pip install mmcv-full==1.4.4 -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html
    
    # install nvidia-dali
    pip install http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/third_party/nvidia_dali_cuda100-0.25.0-1535750-py3-none-manylinux2014_x86_64.whl
    
    # install common_io for MaxCompute table read (optional)
    pip install https://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/third_party/common_io-0.3.0-cp36-cp36m-linux_x86_64.whl
    
  4. Install EasyCV

    You can simply install easycv with the following command:

    pip install pai-easycv
    

    or clone the repository and then install it:

    git clone https://github.com/Alibaba/EasyCV.git
    cd EasyCV
    pip install -r requirements.txt
    pip install -v -e .  # or "python setup.py develop"
    
  5. Install pai_nni and blade_compression

    When you want to use model quantization and pruning, you need to install pai_nni and blade_compression with the following commands:

    # install torch >= 1.8.0
    pip install torch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0
    
    # install mmcv >= 1.3.0 (torch version >= 1.8.0 does not support mmcv version < 1.3.0)
    pip install mmcv-full==1.4.4 -f https://download.openmmlab.com/mmcv/dist/{cu_version}/{torch_version}/index.html
    
    # install onnx and pai_nni
    pip install onnx
    pip install https://pai-nni.oss-cn-zhangjiakou.aliyuncs.com/release/2.5/pai_nni-2.5-py3-none-manylinux1_x86_64.whl
    
    # install blade_compression
    pip install http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/third_party/blade_compression-0.0.1-py3-none-any.whl
    
  6. If you want to use MSDeformAttn, you need to compile the CUDA operators

    cd easycv/thirdparty/deformable_attention/
    python setup.py build install
    # unit test (you should see that all checks are True)
    python test.py
    cd ../../..
    

Verification

Simple verification

from easycv.apis import *
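If the import succeeds, a slightly fuller smoke test could look like this (easycv.__version__ is assumed to be present in released packages, hence the guarded lookup):

```python
# Minimal installation smoke test.
import easycv
from easycv.apis import *

print('EasyCV version:', getattr(easycv, '__version__', 'unknown'))
```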

You can also verify your installation using the following quick-start examples.

Self-supervised Learning Model Zoo

Pretrained models

MAE

Pretrained on ImageNet dataset.

| Config | Backbone | Params (backbone/total) | Train memory (GB) | Flops | Inference time (V100, ms/img) | Epochs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| mae_vit_base_patch16_8xb64_400e | ViT-B/16 | 85M/111M | 9.5 | 9.8G | 8.03 | 400 | model |
| mae_vit_base_patch16_8xb64_1600e | ViT-B/16 | 85M/111M | 9.5 | 9.8G | 8.03 | 1600 | model |
| mae_vit_large_patch16_8xb32_1600e | ViT-L/16 | 303M/329M | 11.3 | 20.8G | 16.30 | 1600 | model |

Fast ConvMAE

Pretrained on ImageNet dataset.

| Config | Backbone | Params (backbone/total) | Train memory (GB) | Flops | Inference time (V100, ms/img) | Total train time | Epochs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| fast_convmae_vit_base_patch16_8xb64_50e | ConvViT-B/16 | 88M/115M | 30.3 | 45.1G | 6.88 | 20h (8*A100) | 50 | model - log |

The FLOPs of Fast ConvMAE are about four times those of MAE, because MAE keeps only 25% of the tokens in each forward pass, while Fast ConvMAE adopts a complementary masking strategy: the mask is divided into four complementary parts, each covering 25% of the tokens. This is equivalent to learning from four samples in each forward pass, achieving roughly four times the learning effect.
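A toy illustration of the complementary masking idea, independent of EasyCV's actual implementation: the token indices are shuffled once and split into four disjoint 25% groups, so every token is visible in exactly one of the four MAE-style forward passes.

```python
# Toy sketch: four disjoint visible-token sets that together cover all tokens.
import torch

num_tokens = 196                  # e.g. 14x14 patches of a 224x224 image with 16x16 patches
perm = torch.randperm(num_tokens)
visible_sets = perm.chunk(4)      # four disjoint groups, each ~25% of the tokens

# Each group plays the role of the visible tokens in one MAE-style forward pass.
covered = torch.cat(visible_sets).sort().values
assert torch.equal(covered, torch.arange(num_tokens))  # together they cover every token
```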

DINO

Pretrained on ImageNet dataset.

| Config | Backbone | Params (backbone/total) | Train memory (GB) | Inference time (V100, ms/img) | Epochs | Download |
| --- | --- | --- | --- | --- | --- | --- |
| dino_deit_small_p16_8xb32_100e | DeiT-S/16 | 21M/88M | 10.5 | 6.17 | 100 | model |

MoBY

Pretrained on ImageNet dataset.

| Config | Backbone | Params (backbone/total) | Flops | Train memory (GB) | Inference time (V100, ms/img) | Epochs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| moby_deit_small_p16_4xb128_300e | DeiT-S/16 | 21M/26M | 18.6G | 21.4 | 6.17 | 300 | model - log |
| moby_swin_tiny_8xb64_300e | Swin-T | 27M/33M | 18.1G | 16.1 | 9.74 | 300 | model - log |

MoCo V2

Pretrained on ImageNet dataset.

| Config | Backbone | Params (backbone/total) | Flops | Train memory (GB) | Inference time (V100, ms/img) | Epochs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| mocov2_resnet50_8xb32_200e | ResNet50 | 23M/28M | 8.2G | 5.4 | 8.59 | 200 | model |

SwAV

Pretrained on ImageNet dataset.

| Config | Backbone | Params (backbone/total) | Flops | Train memory (GB) | Inference time (V100, ms/img) | Epochs | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| swav_resnet50_8xb32_200e | ResNet50 | 23M/28M | 12.9G | 11.3 | 8.59 | 200 | model - log |

Benchmarks

For detailed usage of benchmark tools, please refer to benchmark README.md.

COCO2017 Object Detection

| Algorithm | Eval Config | Pretrained Config | mAP (Box) | mAP (Mask) | Download |
| --- | --- | --- | --- | --- | --- |
| Fast ConvMAE | mask_rcnn_conv_vitdet_50e_coco | fast_convmae_vit_base_patch16_8xb64_50e | 51.3 | 45.6 | eval model |
| SwAV | mask_rcnn_r50_fpn_1x_coco | swav_resnet50_8xb32_200e | 40.38 | 36.48 | eval model - log |
| MoCo-v2 | mask_rcnn_r50_fpn_1x_coco | mocov2_resnet50_8xb32_200e | 39.9 | 35.8 | eval model - log |
| MoBY | mask_rcnn_swin_tiny_1x_coco | moby_swin_tiny_8xb64_300e | 43.11 | 39.37 | eval model - log |

VOC2012 Aug Semantic Segmentation

| Algorithm | Eval Config | Pretrained Config | mIOU | Download |
| --- | --- | --- | --- | --- |
| SwAV | fcn_r50-d8_512x512_60e_voc12aug | swav_resnet50_8xb32_200e | 63.91 | eval model - log |
| MoCo-v2 | fcn_r50-d8_512x512_60e_voc12aug | mocov2_resnet50_8xb32_200e | 68.49 | eval model - log |

Detection Model Zoo

Inference by default uses a V100 16G GPU.

YOLOX-PAI

Pretrained on the COCO2017 dataset. (The results have been optimized with PAI-Blade, and only the model inference time is measured. For end-to-end inference time, refer to export.md.)

| Algorithm | Config | Params | Speed V100 (fp16, b32) | mAP val 0.5:0.95 | AP val 50 | Download |
| --- | --- | --- | --- | --- | --- | --- |
| YOLOX-s | yolox_s_8xb16_300e_coco | 9M | 0.68ms | 40.0 | 58.9 | model - log |
| PAI-YOLOXs | yoloxs_pai_8xb16_300e_coco | 16M | 0.71ms | 41.4 | 60.0 | model - log |
| PAI-YOLOXs-ASFF | yoloxs_pai_asff_8xb16_300e_coco | 21M | 0.87ms | 42.8 | 61.8 | model - log |
| PAI-YOLOXs-ASFF-TOOD3 | yoloxs_pai_asff_tood3_8xb16_300e_coco | 24M | 1.15ms | 43.9 | 62.1 | model - log |
| YOLOX-m | yolox_m_8xb16_300e_coco | 25M | 1.52ms | 46.3 | 64.9 | model - log |
| YOLOX-l | yolox_l_8xb8_300e_coco | 54M | 2.47ms | 48.9 | 67.5 | model - log |
| YOLOX-x | yolox_x_8xb8_300e_coco | 99M | 4.74ms | 50.9 | 69.2 | model - log |
| YOLOX-tiny | yolox_tiny_8xb16_300e_coco | 5M | 0.28ms | 31.5 | 49.2 | model - log |
| YOLOX-nano | yolox_nano_8xb16_300e_coco | 2.2M | 0.19ms | 26.5 | 42.6 | model - log |

ViTDet

| Algorithm | Config | Params (backbone/total) | Train memory (GB) | Inference time (V100, ms/img) | bbox_mAP val 0.5:0.95 | mask_mAP val 0.5:0.95 | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ViTDet_MaskRCNN | vitdet_maskrcnn | 86M/111M | 13.3 (fp16) | 138ms | 50.65 | 45.41 | model - log |

FCOS

| Algorithm | Config | Params (backbone/total) | Train memory (GB) | Inference time (V100, ms/img) | mAP val 0.5:0.95 | AP val 50 | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FCOS-r50 (caffe) | fcos-r50 | 23M/32M | 5.0 | 85.8ms | 38.58 | 57.18 | model - log |
| FCOS-r50 (torch) | fcos-r50 | 23M/32M | 4.0 (fp16) | 105.3ms | 38.88 | 58.01 | model - log |

DETR

| Algorithm | Config | Params (backbone/total) | Train memory (GB) | Inference time (V100, ms/img) | bbox_mAP val 0.5:0.95 | AP val 50 | Download |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DETR-r50 | detr-r50 | 23M/41M | 8.5 | 48.5ms | 39.92 | 60.52 | model - log |
| DAB-DETR-r50 | dab-detr-r50 | 23M/43M | 2.6 | 58.5ms | 42.52 | 63.03 | model - log |
| DN-DETR-r50 | dab-detr-r50 | 23M/43M | 7.8 | 58.5ms | 44.39 | 64.66 | model - log |

DINO

| Algorithm | Config | Params (backbone/total) | Inference time (V100, ms/img) | bbox_mAP val 0.5:0.95 | AP val 50 | Download | Comment |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DINO_4sc_r50_12e | DINO_4sc_r50_12e | 23M/47M | 184ms | 48.71 | 66.27 | model - log | Inference uses V100 32G |
| DINO_4sc_r50_36e | DINO_4sc_r50_36e | 23M/47M | 184ms | 50.69 | 68.60 | model - log | Inference uses V100 32G |
| DINO_4sc_swinl_12e | DINO_4sc_swinl_12e | 195M/217M | 155ms | 56.86 | 75.61 | model - log | Inference uses V100 32G |
| DINO_4sc_swinl_36e | DINO_4sc_swinl_36e | 195M/217M | 155ms | 58.04 | 76.76 | model - log | Inference uses V100 32G |
| DINO_5sc_swinl_36e | DINO_5sc_swinl_36e | 195M/217M | 235ms | 58.47 | 77.10 | model - log | Inference uses V100 32G |
| DINO++_5sc_swinl_18e | DINO++_5sc_swinl_18e | 195M/218M | 325ms | 63.39 | 80.25 | model - log | Inference uses A100 80G |

Develop

1. Code Style

We adopt PEP8 as the preferred code style.

We use the following tools for linting and formatting:

  • flake8: linter

  • yapf: formatter

  • isort and seed-isort-config: sort imports

Style configurations of yapf and isort can be found in setup.cfg. We use a pre-commit hook that runs flake8, yapf, seed-isort-config and isort, trims trailing whitespace, fixes end-of-files, and sorts requirements.txt automatically on every commit. The config for the pre-commit hook is stored in .pre-commit-config. After you clone the repository, you will need to install and initialize the pre-commit hook.

pip install -r requirements/tests.txt

From the repository folder

pre-commit install

After this, the code linters and formatter will be enforced on every commit.

If you want to use pre-commit to check all the files, you can run

pre-commit run --all-files

If you only want to format and lint your code, you can run

sh scripts/linter.sh

2. Test

2.1 Unit test

bash scripts/ci_test.sh

2.2 Test data storage

As we need a lot of data for testing, including images and models, we use git lfs to store those large files.

  1. install git-lfs(version>=2.5.0)

for mac

brew install git-lfs
git lfs install

for centos, please download rpm from git-lfs github release website

wget http://101374-public.oss-cn-hangzhou-zmf.aliyuncs.com/git-lfs-3.2.0-1.el7.x86_64.rpm
sudo rpm -ivh git-lfs-3.2.0-1.el7.x86_64.rpm
git lfs install

for ubuntu

curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install
  2. track your data type using git lfs, for example, to track png files

git lfs track "*.png"
  3. add your test files to data/test/ folder, you can make directories if you need.

git add data/test/test.png
  4. commit your test data to remote branch

git commit -m "xxx"

To pull data from remote repo, just as the same way you pull git files.

git pull origin branch_name

3. Build pip package

python setup.py sdist bdist_wheel

self-supervised learning tutorial

Data Preparation

To download the dataset, please refer to prepare_data.md.

Self-supervised learning supports imagenet (raw and tfrecord) format data.

Imagenet format

You can download Imagenet data or use your own unlabeled image data. You should provide a directory which contains the images for self-supervised training and a filelist which contains the image paths relative to the root directory. For example, the image directory is as follows:

images/
├── 0001.jpg
├── 0002.jpg
├── 0003.jpg
|...
└── 9999.jpg

the content of the filelist is:

0001.jpg
0002.jpg
0003.jpg
...
9999.jpg
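A minimal sketch of generating such a filelist from the image directory (the names images/ and filelist.txt are only examples):

```python
# Build a filelist of image names relative to the image root directory.
import os

image_root = 'images'
with open('filelist.txt', 'w') as f:
    for name in sorted(os.listdir(image_root)):
        if name.lower().endswith(('.jpg', '.jpeg', '.png')):
            f.write(name + '\n')
```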

Local & PAI-DSW

We use configs/selfsup/mocov2/mocov2_rn50_8xb32_200e_jpg.py as an example config, in which two config variables should be modified:

data_train_list = 'filelist.txt'
data_train_root = 'images'

Training

Single gpu:

python tools/train.py \
		${CONFIG_PATH} \
		--work_dir ${WORK_DIR}

Multi gpus:

bash tools/dist_train.sh \
		${NUM_GPUS} \
		${CONFIG_PATH} \
		--work_dir ${WORK_DIR}
Arguments
  • NUM_GPUS: number of gpus

  • CONFIG_PATH: the config file path of a selfsup method

  • WORK_DIR: your path to save models and logs

Examples:

Edit the data_root path in ${CONFIG_PATH} to your own data path.

GPUS=8
bash tools/dist_train.sh configs/selfsup/mocov2/mocov2_rn50_8xb32_200e_jpg.py $GPUS

Export model

python tools/export.py \
		${CONFIG_PATH} \
		${CHECKPOINT} \
		${EXPORT_PATH}
Arguments
  • CONFIG_PATH: the config file path of a selfsup method

  • CHECKPOINT: your checkpoint file of a selfsup method, named as epoch_*.pth

  • EXPORT_PATH: your path to save export model

Examples:

python tools/export.py configs/selfsup/mocov2/mocov2_rn50_8xb32_200e_jpg.py \
    work_dirs/selfsup/mocov2/epoch_200.pth \
    work_dirs/selfsup/mocov2/epoch_200_export.pth

Feature extract

Download test_image

import cv2
from easycv.predictors.feature_extractor import TorchFeatureExtractor

output_ckpt = 'work_dirs/selfsup/mocov2/epoch_200_export.pth'
fe = TorchFeatureExtractor(output_ckpt)

img = cv2.imread('248347732153_1040.jpg')
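# convert from OpenCV's default BGR order to RGB before extracting features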
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
feature = fe.predict([img])
print(feature[0]['feature'].shape)

YOLOX-PAI Tutorial

Introduction

Welcome to YOLOX-PAI! YOLOX-PAI is an incremental work on YOLOX based on PAI-EasyCV. We use various existing detection methods and PAI-Blade to boost the performance. We also provide an efficient way for end-to-end object detection.

In brief, our main contributions are:

  • Investigate various detection methods upon YOLOX to achieve SOTA object detection results.

  • Provide an easy way to use PAI-Blade to accelerate the inference process.

  • Provide a convenient way to train/evaluate/export YOLOX-PAI model and conduct end2end object detection.

To learn more details of YOLOX-PAI, you can refer to our technical report or arxiv paper.


Data preparation

To download the dataset, please refer to prepare_data.md.

YOLOX supports both COCO format and PAI-Itag detection format.

COCO format

To use coco data to train detection, you can refer to configs/detection/yolox/yolox_s_8xb16_300e_coco.py for more configuration details.

PAI-Itag detection format

To use pai-itag detection format data to train detection, you can refer to configs/detection/yolox/yolox_s_8xb16_300e_coco_pai.py for more configuration details.

Quick Start

To use COCO format data, use config file configs/detection/yolox/yolox_s_8xb16_300e_coco.py

To use PAI-Itag format data, use config file configs/detection/yolox/yolox_s_8xb16_300e_coco_pai.py

You can use quick_start.md for local installation or use our provided docker images (for both training and inference).

Pull Docker

sudo docker pull registry.cn-shanghai.aliyuncs.com/pai-ai-test/pai-easycv:yolox-pai

Start Container

sudo nvidia-docker run -it -v path:path --name easycv_yolox_pai --shm-size=10g --network=host registry.cn-shanghai.aliyuncs.com/pai-ai-test/pai-easycv:yolox-pai

Train

Single gpu:

python tools/train.py \
		${CONFIG_PATH} \
		--work_dir ${WORK_DIR}

Multi gpus:

bash tools/dist_train.sh \
		${NUM_GPUS} \
		${CONFIG_PATH} \
		--work_dir ${WORK_DIR}
Arguments
  • NUM_GPUS: number of gpus

  • CONFIG_PATH: the config file path of a detection method

  • WORK_DIR: your path to save models and logs

Examples:

Edit the data_root path in ${CONFIG_PATH} to your own data path.

GPUS=8
bash tools/dist_train.sh configs/detection/yolox/yolox_s_8xb16_300e_coco.py $GPUS

Evaluation

The pretrained model of YOLOX-PAI can be found here.

Single gpu:

python tools/eval.py \
		${CONFIG_PATH} \
		${CHECKPOINT} \
		--eval

Multi gpus:

bash tools/dist_test.sh \
		${CONFIG_PATH} \
		${NUM_GPUS} \
		${CHECKPOINT} \
		--eval
Arguments
  • CONFIG_PATH: the config file path of a detection method

  • NUM_GPUS: number of gpus

  • CHECKPOINT: the checkpoint file named as epoch_*.pth.

Examples:

GPUS=8
bash tools/dist_test.sh configs/detection/yolox/yolox_s_8xb16_300e_coco.py $GPUS work_dirs/detection/yolox/epoch_300.pth --eval

Export model

python tools/export.py \
		${CONFIG_PATH} \
		${CHECKPOINT} \
		${EXPORT_PATH}

For more details of the export process, you can refer to export.md.

Arguments
  • CONFIG_PATH: the config file path of a detection method

  • CHECKPOINT: your checkpoint file of a detection method, named as epoch_*.pth.

  • EXPORT_PATH: your path to save export model

Examples:

python tools/export.py configs/detection/yolox/yolox_s_8xb16_300e_coco.py \
        work_dirs/detection/yolox/epoch_300.pth \
        work_dirs/detection/yolox/epoch_300_export.pth

Inference

Download the exported models (preprocess, model, meta) or export your own model. Put them in the following format:

export_blade/
epoch_300_pre_notrt.pt.blade
epoch_300_pre_notrt.pt.blade.config.json
epoch_300_pre_notrt.pt.preprocess

Download test_image

import cv2
from easycv.predictors import TorchYoloXPredictor

output_ckpt = 'export_blade/epoch_300_pre_notrt.pt.blade'
detector = TorchYoloXPredictor(output_ckpt, use_trt_efficientnms=False)

img = cv2.imread('000000017627.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
output = detector.predict([img])
print(output)

# visualize image
image = img.copy()
for box, cls_name in zip(output[0]['detection_boxes'], output[0]['detection_class_names']):
    # box is [x1,y1,x2,y2]
    box = [int(b) for b in box]
    image = cv2.rectangle(image, tuple(box[:2]), tuple(box[2:4]), (0,255,0), 2)
    cv2.putText(image, cls_name, (box[0], box[1]-5), cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0,0,255), 2)

cv2.imwrite('result.jpg',image)

image classification tutorial

Data Preparation

To download the dataset, please refer to prepare_data.md.

Image classification supports cifar and imagenet (raw and tfrecord) format data.

Cifar

To use Cifar data to train classification, you can refer to configs/classification/cifar10/swintiny_b64_5e_jpg.py for more configuration details.

Imagenet format

You can also use your own self-defined data that follows the imagenet format. You should provide a root directory which contains the images for classification training and a filelist which contains the image paths relative to the root directory. For example, the image root directory is as follows:

images/
├── 0001.jpg
├── 0002.jpg
├── 0003.jpg
|...
└── 9999.jpg

Each line of the filelist consists of two parts separated by a space: the subpath to the image file relative to the image root directory, and the class label string for the corresponding image:

0001.jpg label1
0002.jpg label2
0003.jpg label3
...
9999.jpg label9999
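A minimal sketch of parsing the filelist format described above (filelist.txt is an example name):

```python
# Parse "subpath label" records from the classification filelist.
samples = []
with open('filelist.txt') as f:
    for line in f:
        path, label = line.strip().split(' ', 1)
        samples.append((path, label))
print(samples[:3])
```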

To use Imagenet format data to train classification, you can refer to configs/classification/imagenet/imagenet_rn50_jpg.py for more configuration details.

Local & PAI-DSW

Training

Single gpu:

python tools/train.py \
		${CONFIG_PATH} \
		--work_dir ${WORK_DIR}

Multi gpus:

bash tools/dist_train.sh \
		${NUM_GPUS} \
		${CONFIG_PATH} \
		--work_dir ${WORK_DIR}
Arguments
  • NUM_GPUS: number of gpus

  • CONFIG_PATH: the config file path of an image classification method

  • WORK_DIR: your path to save models and logs

Examples:

Edit the data_root path in ${CONFIG_PATH} to your own data path.

single gpu training:
```shell
python tools/train.py  configs/classification/cifar10/swintiny_b64_5e_jpg.py --work_dir work_dirs/classification/cifar10/swintiny  --fp16
```

multi gpu training
```shell
GPUS=8
bash tools/dist_train.sh configs/classification/cifar10/swintiny_b64_5e_jpg.py $GPUS --fp16
```

training using python api
```python
import easycv.tools

import os
# config_path can be a local file or http url
config_path = 'configs/classification/cifar10/swintiny_b64_5e_jpg.py'
easycv.tools.train(config_path, gpus=8, fp16=False, master_port=29527)
```

Evaluation

Single gpu:

python tools/eval.py \
		${CONFIG_PATH} \
		${CHECKPOINT} \
		--eval

Multi gpus:

bash tools/dist_test.sh \
		${CONFIG_PATH} \
		${NUM_GPUS} \
		${CHECKPOINT} \
		--eval
Arguments
  • CONFIG_PATH: the config file path of an image classification method

  • NUM_GPUS: number of gpus

  • CHECKPOINT: the checkpoint file named as epoch_*.pth

Examples:

single gpu evaluation
```shell
python tools/eval.py configs/classification/cifar10/swintiny_b64_5e_jpg.py work_dirs/classification/cifar10/swintiny/epoch_350.pth --eval --fp16
```

multi-gpu evaluation

```shell
GPUS=8
bash tools/dist_test.sh configs/classification/cifar10/swintiny_b64_5e_jpg.py $GPUS work_dirs/classification/cifar10/swintiny/epoch_350.pth --eval --fp16
```

evaluation using python api
```python
import easycv.tools

import os
os.environ['CUDA_VISIBLE_DEVICES']='3,4,5,6'
config_path = 'configs/classification/cifar10/swintiny_b64_5e_jpg.py'
checkpoint_path = 'work_dirs/classification/cifar10/swintiny/epoch_350.pth'
easycv.tools.eval(config_path, checkpoint_path, gpus=8)
```

Export model for inference

If SyncBN is configured, we should replace it with BN in the config file:
```python
# imagenet_rn50.py
model = dict(
    ...
    backbone=dict(
        ...
        norm_cfg=dict(type='BN')), # SyncBN --> BN
    ...)
```

```shell
python tools/export.py configs/classification/cifar10/swintiny_b64_5e_jpg.py \
    work_dirs/classification/cifar10/swintiny/epoch_350.pth \
    work_dirs/classification/cifar10/swintiny/epoch_350_export.pth
```

or using python api
```python
import easycv.tools

config_path = './imagenet_rn50.py'
checkpoint_path = 'oss://pai-vision-data-hz/pretrained_models/easycv/resnet/resnet50.pth'
export_path = './resnet50_export.pt'
easycv.tools.export(config_path, checkpoint_path, export_path)
```

Inference

Download [test_image](http://pai-vision-data-hz.oss-cn-zhangjiakou.aliyuncs.com/data/cifar10/qince_data/predict/aeroplane_s_000004.png)


```python
import cv2
from easycv.predictors.classifier import TorchClassifier

output_ckpt = 'work_dirs/classification/cifar10/swintiny/epoch_350_export.pth'
tcls = TorchClassifier(output_ckpt)

img = cv2.imread('aeroplane_s_000004.png')
# input image should be RGB order
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
output = tcls.predict([img])
print(output)
```

file tutorial

The file module of easycv supports operations on both local and oss files. For an introduction to oss, please refer to: https://www.aliyun.com/product/oss .

If you operate on oss files, you need to refer to access_oss to authorize oss first.

Support operations

access_oss

Authorize oss.

Method1:

from easycv.file import io
io.access_oss(
    ak_id='your_accesskey_id',
    ak_secret='your_accesskey_secret',
    hosts='your endpoint' or ['your endpoint1', 'your endpoint2'],
    buckets='your bucket' or ['your bucket1', 'your bucket2'])

Method2:

Add the oss config to your local file ~/.ossutilconfig, as follows. For more oss config information, please refer to: https://help.aliyun.com/document_detail/120072.html

[Credentials]
language = CH
endpoint = your endpoint
accessKeyID = your_accesskey_id
accessKeySecret = your_accesskey_secret
[Bucket-Endpoint]
bucket1 = endpoint1
bucket2 = endpoint2

If you want to modify the path of the default oss config file (~/.ossutilconfig), you can do as follows:

$ export OSS_CONFIG_FILE='your oss config file path'

Then run the following command, the config file will be read by default to authorize oss.

from easycv.file import io
io.access_oss()

Method3:

Set the environment variables as follows; EasyCV will automatically parse the environment variables for authorization:

import os

os.environ['OSS_ACCESS_KEY_ID'] = 'your_accesskey_id'
os.environ['OSS_ACCESS_KEY_SECRET'] = 'your_accesskey_secret'
os.environ['OSS_ENDPOINTS'] = 'your endpoint1,your endpoint2'  # split with ","
os.environ['OSS_BUCKETS'] = 'your bucket1,your bucket2'  # split with ","

open

Supports w, wb, a, r, rb modes on oss paths. For local paths the usage is the same as the Python built-in open.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

# Write something to an oss file.
with io.open('oss://bucket_name/demo.txt', 'w') as f:
    f.write("test")

# Read from an oss file.
with io.open('oss://bucket_name/demo.txt', 'r') as f:
    print(f.read())

Example for local:

from easycv.file import io

# Write something to a local file.
with io.open('/your/local/path/demo.txt', 'w') as f:
    f.write("test")

# Read from a local file.
with io.open('/your/local/path/demo.txt', 'r') as f:
    print(f.read())

exists

Whether the file exists, same usage as os.path.exists. Support local path and oss path.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

ret = io.exists('oss://bucket_name/dir')
print(ret)

Example for Local:

from easycv.file import io

ret = io.exists('/your/local/dir')
print(ret)

move

Move src to dst, same usage as shutil.move. Support local path and oss path.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

# move oss file to local
io.move('oss://bucket_name/file.txt', '/your/local/path/file.txt')
# move oss file to oss
io.move('oss://bucket_name/dir1/file.txt', 'oss://bucket_name/dir2/file.txt')
# move local file to oss
io.move('/your/local/file.txt', 'oss://bucket_name/file.txt')
# move directory
io.move('oss://bucket_name/dir1/', 'oss://bucket_name/dir2/')

Example for local:

from easycv.file import io

# move local file to local
io.move('/your/local/path1/file.txt', '/your/local/path2/file.txt')
# move local dir to local
io.move('/your/local/dir1', '/your/local/dir2')

copy

Copy a file from src to dst, same usage as shutil.copyfile. If you want to copy a directory, please refer to [copytree](# copytree).

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

# Copy a file from local to oss:
io.copy('/your/local/file.txt', 'oss://bucket/dir/file.txt')
# Copy a oss file to local:
io.copy('oss://bucket/dir/file.txt', '/your/local/file.txt')
# Copy a file from oss to oss:
io.copy('oss://bucket/dir/file.txt', 'oss://bucket/dir/file2.txt')

Example for local:

from easycv.file import io

# Copy a file from local to local:
io.copy('/your/local/path1/file.txt', '/your/local/path2/file.txt')

copytree

Copy files recursively from src to dst. Same usage as shutil.copytree.

If you want to copy a file, please use [copy](# copy).

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

# copy files from local to oss
io.copytree(src='/your/local/dir1', dst='oss://bucket_name/dir2')
# copy files from oss to local
io.copytree(src='oss://bucket_name/dir2', dst='/your/local/dir1')
# copy files from oss to oss
io.copytree(src='oss://bucket_name/dir1', dst='oss://bucket_name/dir2')

Example for local:

from easycv.file import io

# copy files from local to local
io.copytree(src='/your/local/dir1', dst='/your/local/dir2')

listdir

List all objects in path. Same usage as os.listdir.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

ret = io.listdir('oss://bucket/dir', recursive=True)
print(ret)

Example for local:

from easycv.file import io

ret = io.listdir('/your/local/dir', recursive=True)
print(ret)

remove

Remove a file or a directory recursively. Same usage as os.remove or shutil.rmtree.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

# Remove a oss file
io.remove('oss://bucket_name/file.txt')
# Remove a oss directory
io.remove('oss://bucket_name/dir/')

Example for local:

from easycv.file import io

# Remove a local file
io.remove('/your/local/path/file.txt')
# Remove a local directory
io.remove('/your/local/dir/')

rmtree

Remove directory recursively, same usage as shutil.rmtree.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

io.rmtree('oss://bucket_name/dir_name/')

Example for local:

from easycv.file import io

io.rmtree('/your/local/dir/')

makedirs

Create directories recursively, same usage as os.makedirs.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

io.makedirs('oss://bucket/new_dir/')

Example for local:

from easycv.file import io

io.makedirs('/your/local/new_dir/')

isdir

Return whether a path is directory, same usage as os.path.isdir.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config') # only oss file need, refer to `IO.access_oss`
ret = io.isdir('oss://bucket/dir/')
print(ret)

Example for local:

from easycv.file import io

ret = io.isdir('your/local/dir/')
print(ret)

isfile

Return whether a path is file object, same usage as os.path.isfile.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')
ret = io.isfile('oss://bucket/file.txt')
print(ret)

Example for local:

from easycv.file import io

ret = io.isfile('/your/local/path/file.txt')
print(ret)

glob

Return a list of paths matching a pathname pattern.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

ret = io.glob('oss://bucket/dir/*.txt')
print(ret)

Example for local:

from easycv.file import io

ret = io.glob('/your/local/dir/*.txt')
print(ret)

size

Get the size of file path, same usage as os.path.getsize.

Example for oss:

io.access_oss please refer to [access_oss](# access_oss).

from easycv.file import io

io.access_oss('your oss config')

size = io.size('oss://bucket/file.txt')
print(size)

Example for local:

from easycv.file import io

size = io.size('/your/local/path/file.txt')
print(size)

v 0.11.0 (09/05/2023)

Highlights

  • Support EasyCV as a plug-in for [modelscope](https://github.com/modelscope/modelscope).

v 0.10.0 (06/03/2023)

Highlights

  • Support STDC, STGCN, ReID and Multi-lens MOT.

  • Support multi processes for predictor data preprocessing. For models that spend more time in data preprocessing, the speedup can exceed 50%.

New Features

  • Support multi processes for predictor data preprocessing. (#272)

  • Support STDC model. (#284) (#286)

  • Support ReID and Multi-lens MOT. (#285) (#295)

  • Support STGCN model, and support export blade model. (#293) (#299)

  • Add pose model zoo and support export torch jit and blade model for pose models. (#294)

  • Support training on the motchallenge and crowdhuman datasets for detection models. (#265)

Improvements

  • Speed up inference for face detector when using mtcnn. (#273)

  • Add mobilenet config for itag and imagenet dataset, and optimize ClsSourceImageList api to support string label. (#276) (#283)

  • Support multi-rows replacement for first order parameter. (#282)

  • Add a tool to convert itag dataset to raw dataset. (#290)

  • Add PoseTopDownPredictor to replace TorchPoseTopDownPredictorWithDetector (#296)

Bug Fixes

  • Remove git lfs dependencies. (#278)

  • Fix wholebody keypoints evaluation. (#287)

  • Fix DetSourceRaw while label file and image file not match. (#289)

v 0.9.0 (17/01/2023)

Highlights

  • Support Single-lens MOT (#258)

  • Support video recognition (X3D, SWIN-video) (#256)

New Features

  • Add inception config and voc config for FCN and UperNet (#261)

  • Add inference time under V100 for the benchmark of deitiii and hydra attention (#251)

  • Add bev-blancehybrid benchmark (#249)

Improvements

  • Optimize data source apis (#254)

  • Update predict.py to support input model directory (#252)

Bug Fixes

  • Fix MAE arg error after timm upgrade (#255)

  • Fix export SSL models bug, avoid loading default pretrained backbone model (#257)

  • Fix bug can’t find config files while easycv is installed (#253)

v 0.8.0 (5/12/2022)

Highlights

  • Add BEVFormer and improve the performance of BEVFormer (#224)

  • Add DINO++ and support objects365 pretrain (#242)

New Features

  • Add DeiT of Hydra Attention version (#220)

  • Add EdgeViTv3 (#214)

  • Add BEVFormer and improve the performance of BEVFormer (#224)

  • Add DINO++ and support objects365 pretrain (#242)

Improvements

  • Unify the parsing method of config scripts, and support both local and pai platform products (#235)

  • Add more data source apis for open source datasets, involving classification, detection, segmentation and keypoints tasks. And part of the data source apis support automatic download. For more information, please refer to data_hub (#206 #229)

  • Add confusion matrix metric for Classification models (#241)

  • Add prediction script (#239)

Bug Fixes

  • Sync the predict config in the config file for predictor (#238)

  • Fix index of image_scale with y2 with bottom_left implemented in _mosaic_combine (#231)

  • Add bevformer benchmark and fix classification predict bug (#240)

v 0.7.0 (3/11/2022)

Highlights

  • Support auto hyperparameter optimization of NNI (#211)

  • Add DeiT III (#171)

  • Add semantic segmentation model SegFormer (#191)

  • Add 3d detection model BEVFormer (#203)

New Features

  • Support semantic mask2former (#199)

  • Support face 2d keypoint detection (#191)

  • Support hand keypoints detection (#191)

  • Support wholebody keypoint detection (#207)

  • Support auto hyperparameter optimization of NNI (#211)

  • Add DeiT III (#171)

  • Add semantic segmentation model SegFormer (#191)

  • Add 3d detection model BEVFormer (#203)

Improvements

  • Optimize predictor apis, support cpu and batch inference (#195)

  • Speed up ViTDet model (#177)

  • Support export jit model end2end for yolox (#215)

Bug Fixes

  • Fix the bug of io.copytree copying multiple directories (#193)

  • Move thirdparty into easycv (#216)

v 0.6.1 (06/09/2022)

Bug Fixes

  • Fix missing utils (#183)

v 0.6.0 (31/08/2022)

Highlights

  • Release YOLOX-PAI which achieves SOTA results within 40~50 mAP (less than 1ms) (#154 #172 #174 )

  • Add detection algo DINO (#144)

  • Add mask2former algo (#115)

  • Releases imagenet1k, imagenet22k, coco, lvis, voc2012 data with BaiduDisk to accelerate downloading (#145 )

New Features

  • Add detection predictor which support model inference without exporting models(#158 )

  • Add VitDet support for faster-rcnn (#155 )

  • Release YOLOX-PAI which achieves SOTA results within 40~50 mAP (less than 1ms) (#154 #172 #174 )

  • Support DINO algo (#144)

  • Add mask2former algo (#115)

Improvements

  • FCOS update torch_style (#170)

  • Add algo tables to describe which algo EasyCV support (#157 )

  • Refactor datasources api (#156 #140 )

  • Add PR and Issue template (#150)

  • Update Fast ConvMAE doc (#151)

Bug Fixes

  • Fix YOLOXLrUpdaterHook conflict with mmdet ( #169 )

  • Fix datasource cache problem(#153)

v 0.5.0 (28/07/2022)

Highlights

  • Self-Supervised support ConvMAE algorithm ((#101) (#121))

  • Classification support EfficientFormer algorithm (#128)

  • Detection support FCOS、DETR、DAB-DETR and DN-DETR algorithm ((#100) (#104) (#119))

  • Segmentation support UperNet algorithm (#118)

  • Support use torchacc to speed up training (#105)

New Features

  • Support use analyze tools (#133)

Bug Fixes

  • Update yolox config template and fix bugs (#134)

  • Fix yolox detector prediction export error (#125)

  • Fix common_io url error (#126)

Improvements

  • Add ViTDet visualization (#102)

  • Refactor detection pipeline (#104)

v 0.4.0 (23/06/2022)

Highlights

  • Add semantic segmentation modules, support FCN algorithm (#71)

  • Expand classification model zoo (#55)

  • Support export model with blade for yolox (#66)

  • Support ViTDet algorithm (#35)

  • Add sailfish for extensible fully sharded data parallel training (#97)

  • Support run with mmdetection models (#25)

New Features

  • Set multiprocess env for speedup (#77)

  • Add data hub, summarized various datasets in different fields (#70)

Bug Fixes

  • Fix the inaccurate accuracy caused by missing the groundtruth_is_crowd field in CocoMaskEvaluator (#61)

  • Unified the usage of the pretrained parameter and fixed load bugs ((#79) (#85) (#95))

Improvements

  • Update MAE pretrained models and benchmark (#50)

  • Add detection benchmark for SwAV and MoCo-v2 (#58)

  • Add moby swin-tiny pretrained model and benchmark (#72)

  • Update prepare_data.md, add more details (#69)

  • Optimize quantize code and support to export MNN model (#44)

v 0.3.0 (05/05/2022)

Highlights

  • Support image visualization for tensorboard and wandb (#15)

New Features

  • Update moby pretrained model to deit small (#10)

  • Support image visualization for tensorboard and wandb (#15)

  • Add mae vit-large benchmark and pretrained models (#24)

Bug Fixes

  • Fix extract.py for benchmarks (#7)

  • Fix inference error of classifier (#19)

  • Fix multi-process reading of detection datasource and accelerate data preprocessing (#23)

  • Fix torchvision transforms wrapper (#31)

Improvements

  • Add chinese readme (#39)

  • Add model compression tutorial (#20)

  • Add notebook tutorials (#22)

  • Uniform input and output format for transforms (#6)

  • Update model zoo link (#8)

  • Support readthedocs (#29)

  • refine autorelease gitworkflow (#13)

v 0.2.2 (07/04/2022)

  • initial commit & first release

  1. SOTA SSL Algorithms

EasyCV provides state-of-the-art algorithms in self-supervised learning based on contrastive learning, such as SimCLR, MoCo V2, SwAV and DINO, and also MAE based on masked image modeling. We also provide standard benchmark tools for ssl model evaluation.

  2. Vision Transformers

EasyCV aims to provide plenty of vision transformer models trained either using supervised learning or self-supervised learning, such as ViT, Swin-Transformer and XCiT. More models will be added in the future.

  3. Functionality & Extensibility

In addition to SSL, EasyCV also supports image classification, object detection, metric learning, and more areas will be supported in the future. Although covering different areas, EasyCV decomposes the framework into different components such as dataset, model and running hook, making it easy to add new components and combine them with existing modules. EasyCV provides a simple and comprehensive interface for inference. Additionally, all models are supported on PAI-EAS, which can be easily deployed as an online service and supports automatic scaling and service monitoring.

  4. Efficiency

EasyCV supports multi-gpu and multi-worker training. EasyCV uses DALI to accelerate data io and preprocessing, and uses fp16 to accelerate the training process. For inference optimization, EasyCV exports models using jit script, which can be optimized by PAI-Blade.

easycv.apis package

Submodules

easycv.apis.export module

easycv.apis.export.export(cfg, ckpt_path, filename, model=None, **kwargs)[source]

export model for inference

Parameters
  • cfg – Config object

  • ckpt_path (str) – path to checkpoint file

  • filename (str) – filename to save exported models

  • model (nn.module) – model instance
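A minimal usage sketch based on the signature above; the config and checkpoint paths are placeholders, and the config is loaded here with mmcv's standard Config helper:

```python
# Hypothetical paths; export() loads the checkpoint described by the config and writes the exported model.
from mmcv import Config

from easycv.apis.export import export

cfg = Config.fromfile('configs/classification/imagenet/imagenet_rn50_jpg.py')
export(cfg, ckpt_path='work_dirs/epoch_100.pth', filename='model_export.pth')
```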

class easycv.apis.export.PreProcess(target_size: Tuple[int, int] = (640, 640), keep_ratio: bool = True)[source]

Bases: object

Process the data input to model.

Parameters
  • target_size (Tuple[int, int]) – output spatial size.

  • keep_ratio (bool) – Whether to keep the aspect ratio when resizing the image.

__init__(target_size: Tuple[int, int] = (640, 640), keep_ratio: bool = True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.apis.export.ModelExportWrapper(model, example_inputs, trace_model: bool = True)[source]

Bases: torch.nn.modules.module.Module

__init__(model, example_inputs, trace_model: bool = True) → None[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

trace_module(**kwargs)[source]
forward(image)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.apis.export.ProcessExportWrapper(example_inputs, process_fn: Optional[Callable] = None)[source]

Bases: torch.nn.modules.module.Module

Splits out the preprocess step so that it can be wrapped as a preprocess jit model; the preprocess procedure cannot be optimized in an end2end blade model due to dynamic shape problems.

__init__(example_inputs, process_fn: Optional[Callable] = None) → None[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(image)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool

easycv.apis.test module

easycv.apis.test.single_cpu_test(model, data_loader, mode='test', show=False, out_dir=None, show_score_thr=0.3, **kwargs)[source]
easycv.apis.test.single_gpu_test(model, data_loader, mode='test', use_fp16=False, **kwargs)[source]

Test model with a single gpu.

This method tests the model with a single gpu.

Parameters
  • model (str) – Model to be tested.

  • data_loader (nn.Dataloader) – Pytorch data loader.

  • mode – mode for model to forward

  • use_fp16 – Use fp16 inference

Returns

The prediction results.

Return type

list

easycv.apis.test.multi_gpu_test(model, data_loader, mode='test', tmpdir=None, gpu_collect=False, use_fp16=False, **kwargs)[source]

Test model with multiple gpus.

This method tests model with multiple gpus and collects the results under two different modes: gpu and cpu modes. By setting ‘gpu_collect=True’ it encodes results to gpu tensors and use gpu communication for results collection. On cpu mode it saves the results on different gpus to ‘tmpdir’ and collects them by the rank 0 worker.

Parameters
  • model (str) – Model to be tested.

  • data_loader (nn.Dataloader) – Pytorch data loader.

  • mode – mode for model to forward

  • tmpdir (str) – Path of directory to save the temporary results from different gpus under cpu mode.

  • gpu_collect (bool) – Option to use either gpu or cpu to collect results.

  • use_fp16 – Use fp16 inference

Returns

The prediction results.

Return type

list

easycv.apis.test.collect_results_cpu(result_part, size, tmpdir=None)[source]
easycv.apis.test.serialize_tensor(tensor_collection)[source]
easycv.apis.test.collect_results_gpu(result_part, size)[source]

easycv.apis.train module

easycv.apis.train.init_random_seed(seed=None, device='cuda')[source]

Initialize random seed. If the seed is not set, it will be automatically randomized and then broadcast to all processes to prevent some potential bugs.

Parameters
  • seed (int, Optional) – The seed. Default to None.

  • device – The device where the seed will be put on. Default to ‘cuda’.

Returns

Seed to be used.

Return type

int

easycv.apis.train.set_random_seed(seed, deterministic=False)[source]

Set random seed.

Parameters
  • seed (int) – Seed to be used.

  • deterministic (bool) – Whether to set the deterministic option for CUDNN backend, i.e., set torch.backends.cudnn.deterministic to True and torch.backends.cudnn.benchmark to False. Default: False.
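A small usage sketch based on the two functions above:

```python
from easycv.apis.train import init_random_seed, set_random_seed

seed = init_random_seed(None, device='cuda')  # draw a seed and broadcast it to all processes
set_random_seed(seed, deterministic=False)    # apply it; deterministic=True also configures CUDNN as described above
```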

easycv.apis.train.train_model(model, data_loaders, cfg, distributed=False, timestamp=None, meta=None, use_fp16=False, validate=True, gpu_collect=True)[source]

Training API.

Parameters
  • model (nn.Module) – user defined model

  • data_loaders – a list of dataloader for training data

  • cfg – config object

  • distributed – distributed training or not

  • timestamp – time str formatted as ‘%Y%m%d_%H%M%S’

  • meta – a dict containing meta data info, such as env_info, seed, iter, epoch

  • use_fp16 – use fp16 training or not

  • validate – do evaluation while training

  • gpu_collect – use gpu collect or cpu collect for tensor gathering

easycv.apis.train.get_skip_list_keywords(model)[source]
easycv.apis.train.build_optimizer(model, optimizer_cfg)[source]

Build optimizer from configs.

Parameters
  • model (nn.Module) – The model with parameters to be optimized.

  • optimizer_cfg (dict) –

    The config dict of the optimizer.

    Positional fields are:
    • type: class name of the optimizer.

    • lr: base learning rate.

    Optional fields are:
    • any arguments of the corresponding optimizer type, e.g., weight_decay, momentum, etc.

    • paramwise_options: a dict with regular expression as keys to match parameter names and a dict containing options as values. Options include 6 fields: lr, lr_mult, momentum, momentum_mult, weight_decay, weight_decay_mult.

Returns

The initialized optimizer.

Return type

torch.optim.Optimizer

Example

>>> model = torch.nn.modules.Conv1d(1, 1, 1)
>>> paramwise_options = {
>>>     '(bn|gn)(\d+)?.(weight|bias)': dict(weight_decay_mult=0.1),
>>>     '\Ahead.': dict(lr_mult=10, momentum=0)}
>>> optimizer_cfg = dict(type='SGD', lr=0.01, momentum=0.9,
>>>                      weight_decay=0.0001,
>>>                      paramwise_options=paramwise_options)
>>> optimizer = build_optimizer(model, optimizer_cfg)

easycv.apis.train_misc module

easycv.apis.train_misc.build_yolo_optimizer(model, optimizer_cfg)[source]

build optimizer for yolo.

easycv.datasets package

Subpackages

easycv.datasets.classification package

class easycv.datasets.classification.ClsDataset(data_source, pipeline)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Dataset for classification

Parameters
  • data_source – data source to parse input data

  • pipeline – transforms list

__init__(data_source, pipeline)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None, topk=(1, 5))[source]

evaluate classification task

Parameters
  • results – a dict of list of tensor, including prediction and groundtruth info, where prediction tensor is NxC,and the same with groundtruth labels.

  • evaluators – a list of evaluator

Returns

a dict of float, different metric values

Return type

eval_result

visualize(results, vis_num=10, **kwargs)[source]

Visualize the model output on validation data.

Parameters
  • results – A dictionary containing:

    class: list of length number of test images.

    img_metas: list of length number of test images; dict of image meta info, containing filename, img_shape, origin_img_shape and so on.

  • vis_num – number of images visualized

Returns: A dictionary containing

  images: visualized images, list of np.ndarray.

  img_metas: list of length number of test images; dict of image meta info, containing filename, img_shape, origin_img_shape and so on.
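Example (an illustrative config sketch: a ClsDataset is usually described by a data_source dict plus a pipeline list; the data_source arguments follow ClsSourceImageList below, while the pipeline entry names are placeholders that must match transforms registered in your EasyCV version):

>>> train_dataset = dict(
>>>     type='ClsDataset',
>>>     data_source=dict(
>>>         type='ClsSourceImageList',
>>>         list_file='data/train_list.txt',
>>>         root='data/'),
>>>     pipeline=[
>>>         dict(type='RandomResizedCrop', size=224),
>>>         dict(type='RandomHorizontalFlip'),
>>>         dict(type='ToTensor'),
>>>         dict(type='Collect', keys=['img', 'gt_labels'])
>>>     ])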

class easycv.datasets.classification.ClsOdpsDataset(data_source, pipeline, image_key='url_image', label_key='label', **kwargs)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Dataset for classification with ODPS data source

__init__(data_source, pipeline, image_key='url_image', label_key='label', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]
Subpackages
easycv.datasets.classification.data_sources package
class easycv.datasets.classification.data_sources.ClsSourceCifar10(root, split, download=True)[source]

Bases: object

CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
__init__(root, split, download=True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.data_sources.ClsSourceCifar100(root, split, download=True)[source]

Bases: object

CLASSES = None
__init__(root, split, download=True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.data_sources.ClsSourceImageListByClass(root, list_file, m_per_class=2, delimeter=' ', split_huge_listfile_byrank=False, cache_path='data/', max_try=20)[source]

Bases: object

Get the same m_per_class samples by the label idx.

Parameters
  • list_file – str / list(str); a str is the path of an input image list file, where each line is a record of the form image_path label; a list(str) gives multiple image list files, each with records of the same form.

  • root – str / list(str), root path for image_path; each list_file needs its own root.

  • m_per_class – number of samples for each class.

  • delimeter – str, delimiter of each line in the list_file

  • split_huge_listfile_byrank – Adapts to the situation where memory cannot fully load a huge data list. If set, the data list will be split across ranks.

  • cache_path – if split_huge_listfile_byrank is True, the split list_file cache will be saved to cache_path.

  • max_try – int, maximum number of attempts to read an image

__init__(root, list_file, m_per_class=2, delimeter=' ', split_huge_listfile_byrank=False, cache_path='data/', max_try=20)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.data_sources.ClsSourceImageList(list_file, root='', delimeter=' ', split_huge_listfile_byrank=False, split_label_balance=False, cache_path='data/', class_list=None)[source]

Bases: object

Data source for classification.

Parameters
  • list_file – str / list(str); a str is the path of an input image list file, where each line is a record of the form image_path label; a list(str) gives multiple image list files, each with records of the same form.

  • root – str / list(str), root path for image_path; each list_file needs its own root. If len(root) < len(list_file), root[-1] is used to fill the root list.

  • delimeter – str, delimiter of each line in the list_file

  • split_huge_listfile_byrank – Adapts to the situation where memory cannot fully load a huge data list. If set, the data list will be split across ranks.

  • split_label_balance – if split_huge_listfile_byrank is True, whether to split with label balance

  • cache_path – if split_huge_listfile_byrank is True, the split list_file cache will be saved to cache_path.

__init__(list_file, root='', delimeter=' ', split_huge_listfile_byrank=False, split_label_balance=False, cache_path='data/', class_list=None)[source]

Initialize self. See help(type(self)) for accurate signature.

static parse_list_file(list_file, root, delimeter, label_dict={})[source]
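Example (illustrative: a list_file is a plain text file with one image_path label record per line, split by the delimeter, here the default space):

>>> # data/train_list.txt
>>> #   images/0001.jpg 0
>>> #   images/0002.jpg 1
>>> data_source = ClsSourceImageList(
>>>     list_file='data/train_list.txt',
>>>     root='data/',
>>>     delimeter=' ')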
class easycv.datasets.classification.data_sources.ClsSourceItag(list_file, root='', class_list=None)[source]

Bases: easycv.datasets.classification.data_sources.image_list.ClsSourceImageList

ITAG data source for classification.

Parameters
  • list_file – str / list(str); a str is the path of an input image list file, where each line is a record of the form image_path label; a list(str) gives multiple image list files, each with records of the same form.

__init__(list_file, root='', class_list=None)[source]

Initialize self. See help(type(self)) for accurate signature.

static parse_list_file(list_file, label_dict, auto_collect_labels=True)[source]
class easycv.datasets.classification.data_sources.ClsSourceImageNetTFRecord(list_file='', root='', file_pattern=None, cache_path='data/cache/', max_try=10)[source]

Bases: object

data source for imagenet tfrecord.

__init__(list_file='', root='', file_pattern=None, cache_path='data/cache/', max_try=10)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.data_sources.ClsSourceCUB(*args, ann_file, image_class_labels_file, train_test_split_file, test_mode, data_prefix, **kwargs)[source]

Bases: object

The CUB-200-2011 Dataset. Compared with the CUB-200 Dataset, CUB-200-2011 contains many more images.

Parameters
  • ann_file (str) – the annotation file, images.txt in CUB.

  • image_class_labels_file (str) – the label file, image_class_labels.txt in CUB.

  • train_test_split_file (str) – the split file, train_test_split.txt in CUB.

CLASSES = ['Black_footed_Albatross', 'Laysan_Albatross', 'Sooty_Albatross', 'Groove_billed_Ani', 'Crested_Auklet', 'Least_Auklet', 'Parakeet_Auklet', 'Rhinoceros_Auklet', 'Brewer_Blackbird', 'Red_winged_Blackbird', 'Rusty_Blackbird', 'Yellow_headed_Blackbird', 'Bobolink', 'Indigo_Bunting', 'Lazuli_Bunting', 'Painted_Bunting', 'Cardinal', 'Spotted_Catbird', 'Gray_Catbird', 'Yellow_breasted_Chat', 'Eastern_Towhee', 'Chuck_will_Widow', 'Brandt_Cormorant', 'Red_faced_Cormorant', 'Pelagic_Cormorant', 'Bronzed_Cowbird', 'Shiny_Cowbird', 'Brown_Creeper', 'American_Crow', 'Fish_Crow', 'Black_billed_Cuckoo', 'Mangrove_Cuckoo', 'Yellow_billed_Cuckoo', 'Gray_crowned_Rosy_Finch', 'Purple_Finch', 'Northern_Flicker', 'Acadian_Flycatcher', 'Great_Crested_Flycatcher', 'Least_Flycatcher', 'Olive_sided_Flycatcher', 'Scissor_tailed_Flycatcher', 'Vermilion_Flycatcher', 'Yellow_bellied_Flycatcher', 'Frigatebird', 'Northern_Fulmar', 'Gadwall', 'American_Goldfinch', 'European_Goldfinch', 'Boat_tailed_Grackle', 'Eared_Grebe', 'Horned_Grebe', 'Pied_billed_Grebe', 'Western_Grebe', 'Blue_Grosbeak', 'Evening_Grosbeak', 'Pine_Grosbeak', 'Rose_breasted_Grosbeak', 'Pigeon_Guillemot', 'California_Gull', 'Glaucous_winged_Gull', 'Heermann_Gull', 'Herring_Gull', 'Ivory_Gull', 'Ring_billed_Gull', 'Slaty_backed_Gull', 'Western_Gull', 'Anna_Hummingbird', 'Ruby_throated_Hummingbird', 'Rufous_Hummingbird', 'Green_Violetear', 'Long_tailed_Jaeger', 'Pomarine_Jaeger', 'Blue_Jay', 'Florida_Jay', 'Green_Jay', 'Dark_eyed_Junco', 'Tropical_Kingbird', 'Gray_Kingbird', 'Belted_Kingfisher', 'Green_Kingfisher', 'Pied_Kingfisher', 'Ringed_Kingfisher', 'White_breasted_Kingfisher', 'Red_legged_Kittiwake', 'Horned_Lark', 'Pacific_Loon', 'Mallard', 'Western_Meadowlark', 'Hooded_Merganser', 'Red_breasted_Merganser', 'Mockingbird', 'Nighthawk', 'Clark_Nutcracker', 'White_breasted_Nuthatch', 'Baltimore_Oriole', 'Hooded_Oriole', 'Orchard_Oriole', 'Scott_Oriole', 'Ovenbird', 'Brown_Pelican', 'White_Pelican', 'Western_Wood_Pewee', 'Sayornis', 'American_Pipit', 'Whip_poor_Will', 'Horned_Puffin', 'Common_Raven', 'White_necked_Raven', 'American_Redstart', 'Geococcyx', 'Loggerhead_Shrike', 'Great_Grey_Shrike', 'Baird_Sparrow', 'Black_throated_Sparrow', 'Brewer_Sparrow', 'Chipping_Sparrow', 'Clay_colored_Sparrow', 'House_Sparrow', 'Field_Sparrow', 'Fox_Sparrow', 'Grasshopper_Sparrow', 'Harris_Sparrow', 'Henslow_Sparrow', 'Le_Conte_Sparrow', 'Lincoln_Sparrow', 'Nelson_Sharp_tailed_Sparrow', 'Savannah_Sparrow', 'Seaside_Sparrow', 'Song_Sparrow', 'Tree_Sparrow', 'Vesper_Sparrow', 'White_crowned_Sparrow', 'White_throated_Sparrow', 'Cape_Glossy_Starling', 'Bank_Swallow', 'Barn_Swallow', 'Cliff_Swallow', 'Tree_Swallow', 'Scarlet_Tanager', 'Summer_Tanager', 'Artic_Tern', 'Black_Tern', 'Caspian_Tern', 'Common_Tern', 'Elegant_Tern', 'Forsters_Tern', 'Least_Tern', 'Green_tailed_Towhee', 'Brown_Thrasher', 'Sage_Thrasher', 'Black_capped_Vireo', 'Blue_headed_Vireo', 'Philadelphia_Vireo', 'Red_eyed_Vireo', 'Warbling_Vireo', 'White_eyed_Vireo', 'Yellow_throated_Vireo', 'Bay_breasted_Warbler', 'Black_and_white_Warbler', 'Black_throated_Blue_Warbler', 'Blue_winged_Warbler', 'Canada_Warbler', 'Cape_May_Warbler', 'Cerulean_Warbler', 'Chestnut_sided_Warbler', 'Golden_winged_Warbler', 'Hooded_Warbler', 'Kentucky_Warbler', 'Magnolia_Warbler', 'Mourning_Warbler', 'Myrtle_Warbler', 'Nashville_Warbler', 'Orange_crowned_Warbler', 'Palm_Warbler', 'Pine_Warbler', 'Prairie_Warbler', 'Prothonotary_Warbler', 'Swainson_Warbler', 'Tennessee_Warbler', 'Wilson_Warbler', 
'Worm_eating_Warbler', 'Yellow_Warbler', 'Northern_Waterthrush', 'Louisiana_Waterthrush', 'Bohemian_Waxwing', 'Cedar_Waxwing', 'American_Three_toed_Woodpecker', 'Pileated_Woodpecker', 'Red_bellied_Woodpecker', 'Red_cockaded_Woodpecker', 'Red_headed_Woodpecker', 'Downy_Woodpecker', 'Bewick_Wren', 'Cactus_Wren', 'Carolina_Wren', 'House_Wren', 'Marsh_Wren', 'Rock_Wren', 'Winter_Wren', 'Common_Yellowthroat']
__init__(*args, ann_file, image_class_labels_file, train_test_split_file, test_mode, data_prefix, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

load_annotations()[source]
class easycv.datasets.classification.data_sources.ClsSourceImageNet1k(root, split)[source]

Bases: object

__init__(root, split)[source]
Parameters
  • root – The root directory of the data. For example, if the data is organized as

    data/imagenet
    └── train
        └── n01440764
        └── n01443537
        └── ...
    └── val
        └── n01440764
        └── n01443537
        └── ...
    └── meta
        ├── train.txt
        ├── val.txt
        ├── ...

    then the input root is data/imagenet.

  • split – train or val

read_data(image_path)[source]
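Example (illustrative, using the directory layout shown above):

>>> source = ClsSourceImageNet1k(root='data/imagenet', split='train')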
class easycv.datasets.classification.data_sources.ClsSourceCaltech101(root, download=True)[source]

Bases: object

__init__(root, download=True)[source]

Initialize self. See help(type(self)) for accurate signature.

download(root)[source]
downloaded_exists(root)[source]
normalized_path(root)[source]
class easycv.datasets.classification.data_sources.ClsSourceCaltech256(root, download=True)[source]

Bases: object

__init__(root, download=True)[source]

Initialize self. See help(type(self)) for accurate signature.

download(root)[source]
class easycv.datasets.classification.data_sources.ClsSourceFlowers102(root, split, download=False)[source]

Bases: object

__init__(root, split, download=False)None[source]

Initialize self. See help(type(self)) for accurate signature.

download()[source]
class easycv.datasets.classification.data_sources.ClsSourceMnist(root, split, download=True)[source]

Bases: object

__init__(root, split, download=True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.data_sources.ClsSourceFashionMnist(root, split, download=True)[source]

Bases: object

__init__(root, split, download=True)[source]

Initialize self. See help(type(self)) for accurate signature.

Submodules
easycv.datasets.classification.data_sources.cifar module
class easycv.datasets.classification.data_sources.cifar.ClsSourceCifar10(root, split, download=True)[source]

Bases: object

CLASSES = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
__init__(root, split, download=True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.data_sources.cifar.ClsSourceCifar100(root, split, download=True)[source]

Bases: object

CLASSES = None
__init__(root, split, download=True)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.classification.data_sources.class_list module
class easycv.datasets.classification.data_sources.class_list.ClsSourceImageListByClass(root, list_file, m_per_class=2, delimeter=' ', split_huge_listfile_byrank=False, cache_path='data/', max_try=20)[source]

Bases: object

Get the same m_per_class samples by the label idx.

Parameters
  • list_file – str / list(str); a str is the path of an input image list file, where each line is a record of the form image_path label; a list(str) gives multiple image list files, each with records of the same form.

  • root – str / list(str), root path for image_path; each list_file needs its own root.

  • m_per_class – number of samples for each class.

  • delimeter – str, delimiter of each line in the list_file

  • split_huge_listfile_byrank – Adapts to the situation where memory cannot fully load a huge data list. If set, the data list will be split across ranks.

  • cache_path – if split_huge_listfile_byrank is True, the split list_file cache will be saved to cache_path.

  • max_try – int, maximum number of attempts to read an image

__init__(root, list_file, m_per_class=2, delimeter=' ', split_huge_listfile_byrank=False, cache_path='data/', max_try=20)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.classification.data_sources.fashiongen_h5 module
class easycv.datasets.classification.data_sources.fashiongen_h5.FashionGenH5(h5file_path, return_label=True, cache_path='data/fashionGenH5')[source]

Bases: object

__init__(h5file_path, return_label=True, cache_path='data/fashionGenH5')[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.classification.data_sources.image_list module
class easycv.datasets.classification.data_sources.image_list.ClsSourceImageList(list_file, root='', delimeter=' ', split_huge_listfile_byrank=False, split_label_balance=False, cache_path='data/', class_list=None)[source]

Bases: object

Data source for classification.

Parameters
  • list_file – str / list(str); a str is the path of an input image list file, where each line is a record of the form image_path label; a list(str) gives multiple image list files, each with records of the same form.

  • root – str / list(str), root path for image_path; each list_file needs its own root. If len(root) < len(list_file), root[-1] is used to fill the root list.

  • delimeter – str, delimiter of each line in the list_file

  • split_huge_listfile_byrank – Adapts to the situation where memory cannot fully load a huge data list. If set, the data list will be split across ranks.

  • split_label_balance – if split_huge_listfile_byrank is True, whether to split with label balance

  • cache_path – if split_huge_listfile_byrank is True, the split list_file cache will be saved to cache_path.

__init__(list_file, root='', delimeter=' ', split_huge_listfile_byrank=False, split_label_balance=False, cache_path='data/', class_list=None)[source]

Initialize self. See help(type(self)) for accurate signature.

static parse_list_file(list_file, root, delimeter, label_dict={})[source]
class easycv.datasets.classification.data_sources.image_list.ClsSourceItag(list_file, root='', class_list=None)[source]

Bases: easycv.datasets.classification.data_sources.image_list.ClsSourceImageList

ITAG data source for classification.

Parameters
  • list_file – str / list(str); a str is the path of an input image list file, where each line is a record of the form image_path label; a list(str) gives multiple image list files, each with records of the same form.

__init__(list_file, root='', class_list=None)[source]

Initialize self. See help(type(self)) for accurate signature.

static parse_list_file(list_file, label_dict, auto_collect_labels=True)[source]
easycv.datasets.classification.data_sources.imagenet_tfrecord module
class easycv.datasets.classification.data_sources.imagenet_tfrecord.ClsSourceImageNetTFRecord(list_file='', root='', file_pattern=None, cache_path='data/cache/', max_try=10)[source]

Bases: object

data source for imagenet tfrecord.

__init__(list_file='', root='', file_pattern=None, cache_path='data/cache/', max_try=10)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.classification.data_sources.utils module
easycv.datasets.classification.data_sources.utils.split_listfile_byrank(list_file, label_balance, save_path='data/', delimeter=' ')[source]
easycv.datasets.classification.pipelines package
class easycv.datasets.classification.pipelines.MMAutoAugment(policies=[[{'type': 'Posterize', 'bits': 4, 'prob': 0.4}, {'type': 'Rotate', 'angle': 30.0, 'prob': 0.6}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 5, 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}], [{'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 6, 'prob': 0.8}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'Rotate', 'angle': 10.0, 'prob': 0.2}, {'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 30.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.0}, {'type': 'Equalize', 'prob': 0.8}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8}, {'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8}], [{'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4}, {'type': 'Invert', 'prob': 0.6}], [{'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal'}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}]], hparams={'pad_val': 128})[source]

Bases: object

Auto augmentation. This data augmentation is proposed in AutoAugment: Learning Augmentation Policies from Data.

Parameters
  • policies – The policies of auto augmentation. Each policy in policies is a specific augmentation policy composed of several augmentations (dict). When AutoAugment is called, a random policy from policies is selected to augment the image.

  • hparams (dict) – Configs of hyperparameters. Hyperparameters will be used in policies that require these arguments if they are not set in the policy dicts. Defaults to _HPARAMS_DEFAULT.

__init__(policies=[[{'type': 'Posterize', 'bits': 4, 'prob': 0.4}, {'type': 'Rotate', 'angle': 30.0, 'prob': 0.6}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 5, 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}], [{'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 6, 'prob': 0.8}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'Rotate', 'angle': 10.0, 'prob': 0.2}, {'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 30.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.0}, {'type': 'Equalize', 'prob': 0.8}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8}, {'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8}], [{'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4}, {'type': 'Invert', 'prob': 0.6}], [{'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal'}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}]], hparams={'pad_val': 128})[source]

Initialize self. See help(type(self)) for accurate signature.
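Example (an illustrative pipeline sketch: MMAutoAugment is typically used as one step of a classification training pipeline with its default ImageNet policies; the surrounding entry names are placeholders):

>>> train_pipeline = [
>>>     dict(type='RandomResizedCrop', size=224),
>>>     dict(type='MMAutoAugment'),      # default ImageNet policies
>>>     dict(type='ToTensor'),
>>>     dict(type='Collect', keys=['img', 'gt_labels'])
>>> ]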

class easycv.datasets.classification.pipelines.MMRandAugment(num_policies, magnitude_level, magnitude_std=0.0, total_level=30, policies=[{'type': 'AutoContrast'}, {'type': 'Equalize'}, {'type': 'Invert'}, {'type': 'Rotate', 'magnitude_key': 'angle', 'magnitude_range': (0, 30)}, {'type': 'Posterize', 'magnitude_key': 'bits', 'magnitude_range': (4, 0)}, {'type': 'Solarize', 'magnitude_key': 'thr', 'magnitude_range': (256, 0)}, {'type': 'SolarizeAdd', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 110)}, {'type': 'ColorTransform', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Contrast', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Brightness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Sharpness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'horizontal'}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'vertical'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'horizontal'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'vertical'}], hparams={'pad_val': 128})[source]

Bases: object

Random augmentation. This data augmentation is proposed in RandAugment: Practical automated data augmentation with a reduced search space.

Parameters
  • policies – The policies of random augmentation. Each policy in policies is one specific augmentation policy (dict). The policy must at least have the key type, indicating the type of augmentation. For augmentations that take a magnitude (the argument is named differently across augmentations), magnitude_key and magnitude_range give the magnitude argument name (str) and the range of magnitude (a tuple (val1, val2)), respectively. Note that val1 is not necessarily less than val2.

  • num_policies (int) – Number of policies to select from policies each time.

  • magnitude_level (int | float) – Magnitude level for all the augmentation selected.

  • total_level (int | float) – Total level for the magnitude. Defaults to 30.

  • magnitude_std (Number | str) –

    Deviation of magnitude noise applied.
    • If a positive number, magnitude is sampled from a normal distribution (mean=magnitude, std=magnitude_std).

    • If 0 or a negative number, magnitude remains unchanged.

    • If the str “inf”, magnitude is sampled from a uniform distribution (range=[min, magnitude]).

  • hparams (dict) – Configs of hyperparameters. Hyperparameters will be used in policies that require these arguments if these arguments are not set in policy dicts. Defaults to use _HPARAMS_DEFAULT.

Note

magnitude_std will introduce some randomness to the policy, modified from https://github.com/rwightman/pytorch-image-models. When magnitude_std=0, the magnitude is calculated as:

\text{magnitude} = \frac{\text{magnitude\_level}}{\text{total\_level}} \times (\text{val2} - \text{val1}) + \text{val1}
__init__(num_policies, magnitude_level, magnitude_std=0.0, total_level=30, policies=[{'type': 'AutoContrast'}, {'type': 'Equalize'}, {'type': 'Invert'}, {'type': 'Rotate', 'magnitude_key': 'angle', 'magnitude_range': (0, 30)}, {'type': 'Posterize', 'magnitude_key': 'bits', 'magnitude_range': (4, 0)}, {'type': 'Solarize', 'magnitude_key': 'thr', 'magnitude_range': (256, 0)}, {'type': 'SolarizeAdd', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 110)}, {'type': 'ColorTransform', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Contrast', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Brightness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Sharpness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'horizontal'}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'vertical'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'horizontal'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'vertical'}], hparams={'pad_val': 128})[source]

Initialize self. See help(type(self)) for accurate signature.
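Example (a hedged sketch: with the default policies, num_policies=2, magnitude_level=9, total_level=30 and magnitude_std=0, a Rotate policy with magnitude_range=(0, 30) uses an angle of 9/30 × (30 − 0) + 0 = 9 degrees):

>>> rand_aug = dict(
>>>     type='MMRandAugment',
>>>     num_policies=2,        # apply 2 randomly chosen policies per image
>>>     magnitude_level=9,     # out of total_level=30
>>>     magnitude_std=0.5,
>>>     hparams=dict(pad_val=128))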

class easycv.datasets.classification.pipelines.MMRandomErasing(erase_prob=0.5, min_area_ratio=0.02, max_area_ratio=0.4, aspect_range=(0.3, 3.3333333333333335), mode='const', fill_color=(128, 128, 128), fill_std=None)[source]

Bases: object

Randomly selects a rectangular region in an image and erases its pixels.

Parameters
  • erase_prob (float) – Probability that the image will be randomly erased. Default: 0.5

  • min_area_ratio (float) – Minimum erased area / input image area Default: 0.02

  • max_area_ratio (float) – Maximum erased area / input image area Default: 0.4

  • aspect_range (sequence | float) – Aspect ratio range of erased area. if float, it will be converted to (aspect_ratio, 1/aspect_ratio) Default: (3/10, 10/3)

  • mode (str) – Fill method for the erased area, can be: const (default): all pixels are assigned the same value; rand: each pixel is assigned a random value in [0, 255].

  • fill_color (sequence | Number) – Base color filled in erased area. Defaults to (128, 128, 128).

  • fill_std (sequence | Number, optional) – If set and mode is ‘rand’, fill erased area with random color from normal distribution (mean=fill_color, std=fill_std); If not set, fill erased area with random color from uniform distribution (0~255). Defaults to None.

Note

See Random Erasing Data Augmentation. This paper provides 4 modes: RE-R, RE-M, RE-0, RE-255, and uses RE-M by default. The configs of these 4 modes are: RE-R: RandomErasing(mode=’rand’); RE-M: RandomErasing(mode=’const’, fill_color=(123.67, 116.3, 103.5)); RE-0: RandomErasing(mode=’const’, fill_color=0); RE-255: RandomErasing(mode=’const’, fill_color=255).

__init__(erase_prob=0.5, min_area_ratio=0.02, max_area_ratio=0.4, aspect_range=(0.3, 3.3333333333333335), mode='const', fill_color=(128, 128, 128), fill_std=None)[source]

Initialize self. See help(type(self)) for accurate signature.
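Example (illustrative: the RE-M setting from the Note above expressed as a pipeline entry):

>>> dict(type='MMRandomErasing',
>>>      erase_prob=0.25,
>>>      mode='const',
>>>      fill_color=(123.67, 116.3, 103.5))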

Submodules
easycv.datasets.classification.pipelines.auto_augment module
easycv.datasets.classification.pipelines.auto_augment.random_negative(value, random_negative_prob)[source]

Randomly negate value based on random_negative_prob.

easycv.datasets.classification.pipelines.auto_augment.merge_hparams(policy: dict, hparams: dict)[source]

Merge hyperparameters into a policy config. Only the hyperparameters required by the policy are merged.

Parameters
  • policy (dict) – Original policy config dict.

  • hparams (dict) – Hyperparameters to be merged.

Returns

Policy config dict after adding hparams.

Return type

dict
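Example (a sketch of the intended behaviour, assuming Shear accepts pad_val while Contrast does not; the required hyperparameters are determined by each transform's arguments):

>>> merge_hparams(dict(type='Shear', magnitude=0.3), dict(pad_val=128))
{'type': 'Shear', 'magnitude': 0.3, 'pad_val': 128}
>>> merge_hparams(dict(type='Contrast', magnitude=0.3), dict(pad_val=128))
{'type': 'Contrast', 'magnitude': 0.3}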

class easycv.datasets.classification.pipelines.auto_augment.MMAutoAugment(policies=[[{'type': 'Posterize', 'bits': 4, 'prob': 0.4}, {'type': 'Rotate', 'angle': 30.0, 'prob': 0.6}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 5, 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}], [{'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 6, 'prob': 0.8}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'Rotate', 'angle': 10.0, 'prob': 0.2}, {'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 30.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.0}, {'type': 'Equalize', 'prob': 0.8}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8}, {'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8}], [{'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4}, {'type': 'Invert', 'prob': 0.6}], [{'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal'}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}]], hparams={'pad_val': 128})[source]

Bases: object

Auto augmentation. This data augmentation is proposed in AutoAugment: Learning Augmentation Policies from Data.

Parameters
  • policies – The policies of auto augmentation. Each policy in policies is a specific augmentation policy composed of several augmentations (dict). When AutoAugment is called, a random policy from policies is selected to augment the image.

  • hparams (dict) – Configs of hyperparameters. Hyperparameters will be used in policies that require these arguments if they are not set in the policy dicts. Defaults to _HPARAMS_DEFAULT.

__init__(policies=[[{'type': 'Posterize', 'bits': 4, 'prob': 0.4}, {'type': 'Rotate', 'angle': 30.0, 'prob': 0.6}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 5, 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}], [{'type': 'Solarize', 'thr': 170.66666666666666, 'prob': 0.6}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Posterize', 'bits': 6, 'prob': 0.8}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'Rotate', 'angle': 10.0, 'prob': 0.2}, {'type': 'Solarize', 'thr': 28.444444444444443, 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.6}, {'type': 'Posterize', 'bits': 5, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}], [{'type': 'Rotate', 'angle': 30.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.0}, {'type': 'Equalize', 'prob': 0.8}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Rotate', 'angle': 26.666666666666668, 'prob': 0.8}, {'type': 'ColorTransform', 'magnitude': 0.2, 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.8, 'prob': 0.8}, {'type': 'Solarize', 'thr': 56.888888888888886, 'prob': 0.8}], [{'type': 'Sharpness', 'magnitude': 0.7, 'prob': 0.4}, {'type': 'Invert', 'prob': 0.6}], [{'type': 'Shear', 'magnitude': 0.16666666666666666, 'prob': 0.6, 'direction': 'horizontal'}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.0, 'prob': 0.4}, {'type': 'Equalize', 'prob': 0.6}], [{'type': 'Equalize', 'prob': 0.4}, {'type': 'Solarize', 'thr': 142.22222222222223, 'prob': 0.2}], [{'type': 'Solarize', 'thr': 113.77777777777777, 'prob': 0.6}, {'type': 'AutoContrast', 'prob': 0.6}], [{'type': 'Invert', 'prob': 0.6}, {'type': 'Equalize', 'prob': 1.0}], [{'type': 'ColorTransform', 'magnitude': 0.4, 'prob': 0.6}, {'type': 'Contrast', 'magnitude': 0.8, 'prob': 1.0}], [{'type': 'Equalize', 'prob': 0.8}, {'type': 'Equalize', 'prob': 0.6}]], hparams={'pad_val': 128})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.MMRandAugment(num_policies, magnitude_level, magnitude_std=0.0, total_level=30, policies=[{'type': 'AutoContrast'}, {'type': 'Equalize'}, {'type': 'Invert'}, {'type': 'Rotate', 'magnitude_key': 'angle', 'magnitude_range': (0, 30)}, {'type': 'Posterize', 'magnitude_key': 'bits', 'magnitude_range': (4, 0)}, {'type': 'Solarize', 'magnitude_key': 'thr', 'magnitude_range': (256, 0)}, {'type': 'SolarizeAdd', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 110)}, {'type': 'ColorTransform', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Contrast', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Brightness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Sharpness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'horizontal'}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'vertical'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'horizontal'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'vertical'}], hparams={'pad_val': 128})[source]

Bases: object

Random augmentation. This data augmentation is proposed in RandAugment: Practical automated data augmentation with a reduced search space.

Parameters
  • policies – The policies of random augmentation. Each policy in policies is one specific augmentation policy (dict). The policy must at least have the key type, indicating the type of augmentation. For augmentations that take a magnitude (the argument is named differently across augmentations), magnitude_key and magnitude_range give the magnitude argument name (str) and the range of magnitude (a tuple (val1, val2)), respectively. Note that val1 is not necessarily less than val2.

  • num_policies (int) – Number of policies to select from policies each time.

  • magnitude_level (int | float) – Magnitude level for all the augmentation selected.

  • total_level (int | float) – Total level for the magnitude. Defaults to 30.

  • magnitude_std (Number | str) –

    Deviation of magnitude noise applied.
    • If a positive number, magnitude is sampled from a normal distribution (mean=magnitude, std=magnitude_std).

    • If 0 or a negative number, magnitude remains unchanged.

    • If the str “inf”, magnitude is sampled from a uniform distribution (range=[min, magnitude]).

  • hparams (dict) – Configs of hyperparameters. Hyperparameters will be used in policies that require these arguments if these arguments are not set in policy dicts. Defaults to use _HPARAMS_DEFAULT.

Note

magnitude_std will introduce some randomness to the policy, modified from https://github.com/rwightman/pytorch-image-models. When magnitude_std=0, the magnitude is calculated as:

\text{magnitude} = \frac{\text{magnitude\_level}}{\text{total\_level}} \times (\text{val2} - \text{val1}) + \text{val1}
__init__(num_policies, magnitude_level, magnitude_std=0.0, total_level=30, policies=[{'type': 'AutoContrast'}, {'type': 'Equalize'}, {'type': 'Invert'}, {'type': 'Rotate', 'magnitude_key': 'angle', 'magnitude_range': (0, 30)}, {'type': 'Posterize', 'magnitude_key': 'bits', 'magnitude_range': (4, 0)}, {'type': 'Solarize', 'magnitude_key': 'thr', 'magnitude_range': (256, 0)}, {'type': 'SolarizeAdd', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 110)}, {'type': 'ColorTransform', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Contrast', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Brightness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Sharpness', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.9)}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'horizontal'}, {'type': 'Shear', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.3), 'direction': 'vertical'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'horizontal'}, {'type': 'Translate', 'magnitude_key': 'magnitude', 'magnitude_range': (0, 0.45), 'direction': 'vertical'}], hparams={'pad_val': 128})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Shear(magnitude, pad_val=128, prob=0.5, direction='horizontal', random_negative_prob=0.5, interpolation='bicubic')[source]

Bases: object

Shear images.

Parameters
  • magnitude (int | float) – The magnitude used for shear.

  • pad_val (int, Sequence[int]) – Pixel pad_val value for constant fill. If a sequence of length 3, it is used to pad the R, G, B channels respectively. Defaults to 128.

  • prob (float) – The probability for performing Shear therefore should be in range [0, 1]. Defaults to 0.5.

  • direction (str) – The shearing direction. Options are ‘horizontal’ and ‘vertical’. Defaults to ‘horizontal’.

  • random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.

  • interpolation (str) – Interpolation method. Options are ‘nearest’, ‘bilinear’, ‘bicubic’, ‘area’, ‘lanczos’. Defaults to ‘bicubic’.

__init__(magnitude, pad_val=128, prob=0.5, direction='horizontal', random_negative_prob=0.5, interpolation='bicubic')[source]

Initialize self. See help(type(self)) for accurate signature.
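Example (illustrative values: Shear can be instantiated directly or referenced as a policy dict inside MMAutoAugment / MMRandAugment policies):

>>> shear = Shear(magnitude=0.3, direction='horizontal', prob=0.5)
>>> policy = dict(type='Shear', magnitude=0.17, prob=0.6, direction='horizontal')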

class easycv.datasets.classification.pipelines.auto_augment.Translate(magnitude, pad_val=128, prob=0.5, direction='horizontal', random_negative_prob=0.5, interpolation='nearest')[source]

Bases: object

Translate images.

Parameters
  • magnitude – The magnitude used for translate. Note that the offset is calculated by magnitude * size in the corresponding direction. With a magnitude of 1, the whole image will be moved out of the range.

  • pad_val (int, Sequence[int]) – Pixel pad_val value for constant fill. If a sequence of length 3, it is used to pad_val R, G, B channels respectively. Defaults to 128.

  • prob (float) – The probability for performing translate therefore should be in range [0, 1]. Defaults to 0.5.

  • direction (str) – The translating direction. Options are ‘horizontal’ and ‘vertical’. Defaults to ‘horizontal’.

  • random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.

  • interpolation (str) – Interpolation method. Options are ‘nearest’, ‘bilinear’, ‘bicubic’, ‘area’, ‘lanczos’. Defaults to ‘nearest’.

__init__(magnitude, pad_val=128, prob=0.5, direction='horizontal', random_negative_prob=0.5, interpolation='nearest')[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Rotate(angle, center=None, scale=1.0, pad_val=128, prob=0.5, random_negative_prob=0.5, interpolation='nearest')[source]

Bases: object

Rotate images.

Parameters
  • angle – The angle used for rotate. Positive values stand for clockwise rotation.

  • center (tuple[float], optional) – Center point (w, h) of the rotation in the source image. If None, the center of the image will be used. Defaults to None.

  • scale (float) – Isotropic scale factor. Defaults to 1.0.

  • pad_val (int, Sequence[int]) – Pixel pad_val value for constant fill. If a sequence of length 3, it is used to pad_val R, G, B channels respectively. Defaults to 128.

  • prob (float) – The probability for performing Rotate therefore should be in range [0, 1]. Defaults to 0.5.

  • random_negative_prob (float) – The probability that turns the angle negative, which should be in range [0,1]. Defaults to 0.5.

  • interpolation (str) – Interpolation method. Options are ‘nearest’, ‘bilinear’, ‘bicubic’, ‘area’, ‘lanczos’. Defaults to ‘nearest’.

__init__(angle, center=None, scale=1.0, pad_val=128, prob=0.5, random_negative_prob=0.5, interpolation='nearest')[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.AutoContrast(prob=0.5)[source]

Bases: object

Auto adjust image contrast.

Parameters
  • prob (float) – The probability of performing auto contrast, which should be in range [0, 1]. Defaults to 0.5.

__init__(prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Invert(prob=0.5)[source]

Bases: object

Invert images.

Parameters
  • prob (float) – The probability of performing invert, which should be in range [0, 1]. Defaults to 0.5.

__init__(prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Equalize(prob=0.5)[source]

Bases: object

Equalize the image histogram.

Parameters
  • prob (float) – The probability of performing equalize, which should be in range [0, 1]. Defaults to 0.5.

__init__(prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Solarize(thr, prob=0.5)[source]

Bases: object

Solarize images (invert all pixel values above a threshold).

Parameters
  • thr – The threshold above which the pixel values will be inverted.

  • prob (float) – The probability of solarizing, which should be in range [0, 1]. Defaults to 0.5.

__init__(thr, prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.SolarizeAdd(magnitude, thr=128, prob=0.5)[source]

Bases: object

SolarizeAdd images (add a certain value to pixels below a threshold).

Parameters
  • magnitude (int | float) – The value to be added to pixels below the thr.

  • thr – The threshold below which the pixel values will be adjusted.

  • prob (float) – The probability of solarizing, which should be in range [0, 1]. Defaults to 0.5.

__init__(magnitude, thr=128, prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Posterize(bits, prob=0.5)[source]

Bases: object

Posterize images (reduce the number of bits for each color channel).

Parameters
  • bits – Number of bits for each pixel in the output img, which should be less than or equal to 8.

  • prob (float) – The probability of posterizing, which should be in range [0, 1]. Defaults to 0.5.

__init__(bits, prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Contrast(magnitude, prob=0.5, random_negative_prob=0.5)[source]

Bases: object

Adjust image contrast.

Parameters
  • magnitude – The magnitude used for adjusting contrast. A positive magnitude enhances the contrast and a negative magnitude makes the image grayer. A magnitude of 0 gives the original image.

  • prob (float) – The probability for performing contrast adjusting therefore should be in range [0, 1]. Defaults to 0.5.

  • random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.

__init__(magnitude, prob=0.5, random_negative_prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.ColorTransform(magnitude, prob=0.5, random_negative_prob=0.5)[source]

Bases: object

Adjust image color balance.

Parameters
  • magnitude – The magnitude used for color transform. A positive magnitude enhances the color and a negative magnitude makes the image grayer. A magnitude of 0 gives the original image.

  • prob (float) – The probability for performing ColorTransform therefore should be in range [0, 1]. Defaults to 0.5.

  • random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.

__init__(magnitude, prob=0.5, random_negative_prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Brightness(magnitude, prob=0.5, random_negative_prob=0.5)[source]

Bases: object

Adjust image brightness.

Parameters
  • magnitude – The magnitude used for adjusting brightness. A positive magnitude enhances the brightness and a negative magnitude makes the image darker. A magnitude of 0 gives the original image.

  • prob (float) – The probability of adjusting brightness, which should be in range [0, 1]. Defaults to 0.5.

  • random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.

__init__(magnitude, prob=0.5, random_negative_prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Sharpness(magnitude, prob=0.5, random_negative_prob=0.5)[source]

Bases: object

Adjust image sharpness.

Parameters
  • magnitude – The magnitude used for adjusting sharpness. A positive magnitude enhances the sharpness and a negative magnitude makes the image blurrier. A magnitude of 0 gives the original image.

  • prob (float) – The probability of adjusting sharpness, which should be in range [0, 1]. Defaults to 0.5.

  • random_negative_prob (float) – The probability that turns the magnitude negative, which should be in range [0,1]. Defaults to 0.5.

__init__(magnitude, prob=0.5, random_negative_prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.Cutout(shape, pad_val=128, prob=0.5)[source]

Bases: object

Cutout images.

Parameters
  • shape – Expected cutout shape (h, w). If given as a single value, the value will be used for both h and w.

  • pad_val (int, Sequence[int]) – Pixel pad_val value for constant fill. If it is a sequence, it must have the same length with the image channels. Defaults to 128.

  • prob (float) – The probability for performing cutout therefore should be in range [0, 1]. Defaults to 0.5.

__init__(shape, pad_val=128, prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.classification.pipelines.auto_augment.PILGaussianBlur(prob=0.1, radius_min=0.1, radius_max=2.0)[source]

Bases: object

__init__(prob=0.1, radius_min=0.1, radius_max=2.0)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.classification.pipelines.transform module
class easycv.datasets.classification.pipelines.transform.MMRandomErasing(erase_prob=0.5, min_area_ratio=0.02, max_area_ratio=0.4, aspect_range=(0.3, 3.3333333333333335), mode='const', fill_color=(128, 128, 128), fill_std=None)[source]

Bases: object

Randomly selects a rectangular region in an image and erases its pixels.

Parameters
  • erase_prob (float) – Probability that the image will be randomly erased. Default: 0.5

  • min_area_ratio (float) – Minimum erased area / input image area Default: 0.02

  • max_area_ratio (float) – Maximum erased area / input image area Default: 0.4

  • aspect_range (sequence | float) – Aspect ratio range of erased area. if float, it will be converted to (aspect_ratio, 1/aspect_ratio) Default: (3/10, 10/3)

  • mode (str) – Fill method for the erased area, can be: const (default): all pixels are assigned the same value; rand: each pixel is assigned a random value in [0, 255].

  • fill_color (sequence | Number) – Base color filled in erased area. Defaults to (128, 128, 128).

  • fill_std (sequence | Number, optional) – If set and mode is ‘rand’, fill erased area with random color from normal distribution (mean=fill_color, std=fill_std); If not set, fill erased area with random color from uniform distribution (0~255). Defaults to None.

Note

See Random Erasing Data Augmentation. This paper provides 4 modes: RE-R, RE-M, RE-0, RE-255, and uses RE-M by default. The configs of these 4 modes are: RE-R: RandomErasing(mode=’rand’); RE-M: RandomErasing(mode=’const’, fill_color=(123.67, 116.3, 103.5)); RE-0: RandomErasing(mode=’const’, fill_color=0); RE-255: RandomErasing(mode=’const’, fill_color=255).

__init__(erase_prob=0.5, min_area_ratio=0.02, max_area_ratio=0.4, aspect_range=(0.3, 3.3333333333333335), mode='const', fill_color=(128, 128, 128), fill_std=None)[source]

Initialize self. See help(type(self)) for accurate signature.

Submodules
easycv.datasets.classification.odps module
class easycv.datasets.classification.odps.ClsOdpsDataset(data_source, pipeline, image_key='url_image', label_key='label', **kwargs)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Dataset for classification with ODPS data source

__init__(data_source, pipeline, image_key='url_image', label_key='label', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]
easycv.datasets.classification.raw module
class easycv.datasets.classification.raw.ClsDataset(data_source, pipeline)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Dataset for classification

Parameters
  • data_source – data source to parse input data

  • pipeline – transforms list

__init__(data_source, pipeline)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None, topk=(1, 5))[source]

evaluate classification task

Parameters
  • results – a dict of lists of tensors, including prediction and ground-truth info; the prediction tensor is NxC, and the same holds for the ground-truth labels.

  • evaluators – a list of evaluator

Returns

a dict of float, different metric values

Return type

eval_result

visualize(results, vis_num=10, **kwargs)[source]

Visualize the model output on validation data.

Parameters
  • results – A dictionary containing:

    class: list of length number of test images.

    img_metas: list of length number of test images; dict of image meta info, containing filename, img_shape, origin_img_shape and so on.

  • vis_num – number of images visualized

Returns: A dictionary containing

  images: visualized images, list of np.ndarray.

  img_metas: list of length number of test images; dict of image meta info, containing filename, img_shape, origin_img_shape and so on.

easycv.datasets.detection package

class easycv.datasets.detection.DetDataset(data_source, pipeline, profiling=False, classes=None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Dataset for Detection

__init__(data_source, pipeline, profiling=False, classes=None)[source]
Parameters
  • data_source – Data_source config dict

  • pipeline – Pipeline config list

  • profiling – If set True, will print pipeline time

  • classes – A list of class names, used in evaluation for result and groundtruth visualization

evaluate(results, evaluators=None, logger=None)[source]

Evaluates the detection boxes.

Parameters
  • results – A dictionary containing:

    detection_boxes: list of length number of test images; float32 numpy array of shape [num_boxes, 4] in format [ymin, xmin, ymax, xmax] in absolute image coordinates.

    detection_scores: list of length number of test images; detection scores for the boxes, float32 numpy array of shape [num_boxes].

    detection_classes: list of length number of test images; integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.

    img_metas: list of length number of test images; dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.

  • evaluators – evaluators to calculate metrics with results and groundtruth_dict

visualize(results, vis_num=10, score_thr=0.3, **kwargs)[source]

Visualize the model output on validation data.

Parameters
  • results – A dictionary containing:

    detection_boxes: list of length number of test images; float32 numpy array of shape [num_boxes, 4] in format [ymin, xmin, ymax, xmax] in absolute image coordinates.

    detection_scores: list of length number of test images; detection scores for the boxes, float32 numpy array of shape [num_boxes].

    detection_classes: list of length number of test images; integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.

    img_metas: list of length number of test images; dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.

  • vis_num – number of images visualized

  • score_thr – the threshold to filter boxes; boxes with scores greater than score_thr will be kept.

Returns: A dictionary containing

  images: visualized images.

  img_metas: list of length number of test images; dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.
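Example (an illustrative sketch of the results layout with a single test image and two boxes; dataset stands for an already built DetDataset instance):

>>> import numpy as np
>>> results = dict(
>>>     detection_boxes=[np.array([[10., 20., 110., 220.],
>>>                                [15., 30., 95., 160.]], dtype=np.float32)],
>>>     detection_scores=[np.array([0.92, 0.41], dtype=np.float32)],
>>>     detection_classes=[np.array([1, 3], dtype=np.int32)],
>>>     img_metas=[dict(filename='demo.jpg', img_shape=(640, 480, 3))])
>>> vis = dataset.visualize(results, vis_num=1, score_thr=0.3)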

class easycv.datasets.detection.DetImagesMixDataset(data_source, pipeline, dynamic_scale=None, skip_type_keys=None, profiling=False, classes=None, yolo_format=True, label_padding=True)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

A wrapper of multiple images mixed dataset.

Suitable for training on multiple images mixed data augmentation like mosaic and mixup. For the augmentation pipeline of mixed image data, the get_indexes method needs to be provided to obtain the image indexes, and you can set skip_flags to change the pipeline running process. At the same time, we provide the dynamic_scale parameter to dynamically change the output image size.

output boxes format: cx, cy, w, h

Parameters
  • data_source (DetSourceCoco) – The dataset to be mixed.

  • pipeline (Sequence[dict]) – Sequence of transform object or config dict to be composed.

  • dynamic_scale (tuple[int], optional) – The image scale can be changed dynamically. Default to None.

  • skip_type_keys (list[str], optional) – Sequence of type string to be skip pipeline. Default to None.

  • label_padding – whether to pad the output labels to a fixed shape [N, 120, 5]

__init__(data_source, pipeline, dynamic_scale=None, skip_type_keys=None, profiling=False, classes=None, yolo_format=True, label_padding=True)[source]

Parameters
  • data_source – Data_source config dict

  • pipeline – Pipeline config list

  • profiling – If set True, will print pipeline time

  • classes – A list of class names, used in evaluation for result and groundtruth visualization
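Example (a hedged config sketch: the data_source follows DetSourceCoco documented below, while the mosaic/affine pipeline entry names are placeholders that must match transforms registered in your EasyCV version):

>>> train_dataset = dict(
>>>     type='DetImagesMixDataset',
>>>     data_source=dict(
>>>         type='DetSourceCoco',
>>>         ann_file='data/coco/annotations/instances_train2017.json',
>>>         img_prefix='data/coco/train2017/',
>>>         pipeline=[
>>>             dict(type='LoadImageFromFile'),
>>>             dict(type='LoadAnnotations', with_bbox=True)
>>>         ]),
>>>     pipeline=[
>>>         dict(type='MMMosaic', img_scale=(640, 640), pad_val=114.0),
>>>         dict(type='MMRandomAffine')
>>>     ],
>>>     dynamic_scale=(640, 640),
>>>     label_padding=True)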

update_skip_type_keys(skip_type_keys)[source]

Update skip_type_keys. It is called by an external hook.

Parameters

skip_type_keys (list[str], optional) – Sequence of pipeline type strings to be skipped.

update_dynamic_scale(dynamic_scale)[source]

Update dynamic_scale. It is called by an external hook.

Parameters

dynamic_scale (tuple[int]) – The image scale can be changed dynamically.

results2json(results, outfile_prefix)[source]

Dump the detection results to a COCO style json file.

There are 3 types of results: proposals, bbox predictions, mask predictions, and they have different data types. This method will automatically recognize the type, and dump them to json files.

Parameters
  • results (list[list | tuple | ndarray]) – Testing results of the dataset.

  • outfile_prefix (str) – The filename prefix of the json files. If the prefix is “somepath/xxx”, the json files will be named “somepath/xxx.bbox.json”, “somepath/xxx.segm.json”, “somepath/xxx.proposal.json”.

Returns

Possible keys are “bbox”, “segm”, “proposal”, and values are corresponding filenames.

Return type

dict[str, str]

format_results(results, jsonfile_prefix=None, **kwargs)[source]

Format the results to json (standard format for COCO evaluation).

Parameters
  • results (list[tuple | numpy.ndarray]) – Testing results of the dataset.

  • jsonfile_prefix (str | None) – The prefix of json files. It includes the file path and the prefix of filename, e.g., “a/b/prefix”. If not specified, a temp file will be created. Default: None.

Returns

(result_files, tmp_dir), where result_files is a dict containing the json filepaths, and tmp_dir is the temporary directory created for saving json files when jsonfile_prefix is not specified.

Return type

tuple
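A usage sketch (the dataset variable and file prefix are illustrative):

# Dump COCO-style json files with a fixed prefix; when jsonfile_prefix is None,
# a temporary directory is created instead and returned as tmp_dir.
result_files, tmp_dir = dataset.format_results(results, jsonfile_prefix='work_dirs/eval/dets')
# e.g. result_files['bbox'] == 'work_dirs/eval/dets.bbox.json'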

Subpackages
easycv.datasets.detection.data_sources package
class easycv.datasets.detection.data_sources.DetSourceCoco(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]

Bases: object

coco data source

__init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]
Parameters
  • ann_file – Path of annotation file.

  • img_prefix – coco path prefix

  • test_mode (bool, optional) – If set True, self._filter_imgs will not work.

  • filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.

  • iscrowd – typically set to False for training and True for validation
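A minimal instantiation sketch (paths are illustrative); the loading pipeline here only reads the image and its box annotations:

from easycv.datasets.detection.data_sources import DetSourceCoco

data_source = DetSourceCoco(
    ann_file='data/coco/annotations/instances_train2017.json',
    img_prefix='data/coco/train2017/',
    pipeline=[
        dict(type='LoadImageFromFile', to_float32=True),
        dict(type='LoadAnnotations', with_bbox=True),
    ],
    filter_empty_gt=False,
    iscrowd=False)

ann = data_source.get_ann_info(0)   # annotation dict of the first image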

load_annotations(ann_file)[source]

Load annotation from COCO style annotation file. :param ann_file: Path of annotation file. :type ann_file: str

Returns

Annotation info from COCO api.

Return type

list[dict]

get_ann_info(idx)[source]

Get COCO annotation by index. :param idx: Index of data. :type idx: int

Returns

Annotation info of specified index.

Return type

dict

get_cat_ids(idx)[source]

Get COCO category ids by index. :param idx: Index of data. :type idx: int

Returns

All categories in the image of specified index.

Return type

list[int]

xyxy2xywh(bbox)[source]

Convert xyxy style bounding boxes to xywh style for COCO evaluation.

Parameters

bbox – The bounding boxes, shape (4, ), in xyxy order.

Returns

The converted bounding boxes, in xywh order.

Return type

list[float]
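For example, assuming the usual COCO convention (width = x2 - x1, height = y2 - y1), the xyxy box [50, 60, 150, 260] converts to the xywh box [50, 60, 100, 200].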

pre_pipeline(results)[source]

Prepare results dict for pipeline.

prepare_train_img(idx)[source]

Get training data and annotations after pipeline. :param idx: Index of data. :type idx: int

Returns

Training data and annotation after pipeline with new keys introduced by pipeline.

Return type

dict

class easycv.datasets.detection.data_sources.DetSourceCocoPanoptic(ann_file, pan_ann_file, img_prefix, seg_prefix, pipeline, outfile_prefix='test/test_pan', test_mode=False, filter_empty_gt=False, thing_classes=None, stuff_classes=None, iscrowd=False)[source]

Bases: easycv.datasets.detection.data_sources.coco.DetSourceCoco

cocopanoptic data source

__init__(ann_file, pan_ann_file, img_prefix, seg_prefix, pipeline, outfile_prefix='test/test_pan', test_mode=False, filter_empty_gt=False, thing_classes=None, stuff_classes=None, iscrowd=False)[source]
Parameters
  • ann_file (str) – Path of coco detection annotation file

  • pan_ann_file (str) – Path of coco panoptic annotation file

  • img_prefix (str) – Path of image file

  • seg_prefix (str) – Path of semantic image file

  • pipeline (list[dict]) – list of data augmentation operations

  • outfile_prefix (str, optional) – The filename prefix of the output files. If the prefix is “somepath/xxx”, the json files will be named “somepath/xxx.panoptic.json”, “somepath/xxx.bbox.json”, “somepath/xxx.segm.json”

  • test_mode (bool, optional) – If set True, self._filter_imgs will not work.

  • filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.

  • thing_classes (list[str], optional) – list of thing classes. Defaults to None.

  • stuff_classes (list[str], optional) – list of stuff classes. Defaults to None.

  • iscrowd (bool, optional) – typically set to False for training and True for validation. Defaults to False.
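A minimal instantiation sketch (all paths are illustrative):

from easycv.datasets.detection.data_sources import DetSourceCocoPanoptic

data_source = DetSourceCocoPanoptic(
    ann_file='data/coco/annotations/instances_val2017.json',
    pan_ann_file='data/coco/annotations/panoptic_val2017.json',
    img_prefix='data/coco/val2017/',
    seg_prefix='data/coco/panoptic_val2017/',
    pipeline=[
        dict(type='LoadImageFromFile', to_float32=True),
        dict(type='LoadAnnotations', with_bbox=True),
    ],
    outfile_prefix='work_dirs/panoptic_eval/pred',
    iscrowd=True)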

load_annotations_pan(ann_file)[source]

Load annotation from COCO Panoptic style annotation file.

Parameters

ann_file (str) – Path of annotation file.

Returns

Annotation info from COCO api.

Return type

list[dict]

get_ann_info_pan(idx)[source]

Get COCO annotation by index.

Parameters

idx (int) – Index of data.

Returns

Annotation info of specified index.

Return type

dict

pre_pipeline(results)[source]

Prepare results dict for pipeline.

prepare_train_img(idx)[source]

Get training data and annotations after pipeline.

Parameters

idx (int) – Index of data.

Returns

Training data and annotation after pipeline with new keys introduced by pipeline.

Return type

dict

results2json(results)[source]

Dump the results to a COCO style json file.

There are 4 types of results: proposals, bbox predictions, mask predictions, panoptic segmentation predictions, and they have different data types. This method will automatically recognize the type, and dump them to json files.

[
    {
        'pan_results': np.array, # shape (h, w)
        # ins_results which includes bboxes and RLE encoded masks
        # is optional.
        'ins_results': (list[np.array], list[list[str]])
    },
    ...
]
Parameters

results (list[dict]) – Testing results of the dataset.

Returns

Possible keys are “panoptic”, “bbox”, “segm”, “proposal”, and values are corresponding filenames.

Return type

dict[str, str]

get_gt_json(result_files)[source]

Get input for COCO panoptic evaluation.

Parameters

result_files (dict) – path of predict result

Returns

gt_json (dict) – ground truth label info

gt_folder (str) – path of the ground truth files

pred_json (dict) – prediction results

pred_folder (str) – path of the prediction files

categories (dict) – panoptic categories

class easycv.datasets.detection.data_sources.DetSourceObjects365(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=[], iscrowd=False)[source]

Bases: easycv.datasets.detection.data_sources.coco.DetSourceCoco

objects365 data source. The objects365 dataset folder is organized as follows:

|- objects365
    |- annotation
        |- zhiyuan_objv2_train.json
        |- zhiyuan_objv2_val.json
    |- train
        |- patch0
            |- *****(imageID)
        |- patch1
            |- *****(imageID)
        |- ...
        |- patch50
            |- *****(imageID)
    |- val
        |- patch0
            |- *****(imageID)
        |- patch1
            |- *****(imageID)
        |- ...
        |- patch43
            |- *****(imageID)
__init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=[], iscrowd=False)[source]
Parameters
  • ann_file – Path of annotation file.

  • img_prefix – coco path prefix

  • test_mode (bool, optional) – If set True, self._filter_imgs will not work.

  • filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.

  • iscrowd – typically set to False for training and True for validation

load_annotations(ann_file)[source]

Load annotation from COCO style annotation file. :param ann_file: Path of annotation file. :type ann_file: str

Returns

Annotation info from COCO api.

Return type

list[dict]

class easycv.datasets.detection.data_sources.DetSourcePAI(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

data format please refer to: https://help.aliyun.com/document_detail/311173.html

__init__(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]
Parameters
  • path – Path of the manifest file with PAI label format

  • classes – classes list

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.
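A minimal instantiation sketch (the manifest path and class names are illustrative):

from easycv.datasets.detection.data_sources import DetSourcePAI

data_source = DetSourcePAI(
    path='/your/data/train.manifest',   # manifest file in PAI/iTAG label format
    classes=['person', 'car'],
    cache_at_init=False,
    cache_on_the_fly=False)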

class easycv.datasets.detection.data_sources.DetSourceRaw(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

data dir is as follows:

|- data_dir
    |- images
        |- 1.jpg
        |- ...
    |- labels
        |- 1.txt
        |- ...

Label txt file is as follows: the first column is the label id, and columns 2 to 5 are coordinates relative to the image width and height [x_center, y_center, bbox_w, bbox_h].

15 0.519398 0.544087 0.476359 0.572061
2 0.501859 0.820726 0.996281 0.332178
...

Example:

data_source = DetSourceRaw(
    img_root_path='/your/data_dir/images',
    label_root_path='/your/data_dir/labels',
)

__init__(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]
Parameters
  • img_root_path – images dir path

  • label_root_path – labels dir path

  • classes (list, optional) – classes list

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • delimeter – delimiter of the txt file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

post_process_fn(result_dict)[source]
get_ann_info(idx)[source]

Get raw annotation info, including bounding boxes, labels and so on. The bboxes format is [x1, y1, x2, y2] without normalization.

class easycv.datasets.detection.data_sources.DetSourceVOC(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

data dir is as follows:

|- voc_data
    |- ImageSets
        |- Main
            |- train.txt
            |- ...
    |- JPEGImages
        |- 00001.jpg
        |- ...
    |- Annotations
        |- 00001.xml
        |- ...

Example 1:

data_source = DetSourceVOC(
    path='/your/voc_data/ImageSets/Main/train.txt',
    classes=${VOC_CLASSES},
)

Example 2:

data_source = DetSourceVOC(
    path='/your/voc_data/train.txt',
    classes=${VOC_CLASSES},
    img_root_path='/your/voc_data/images',
    label_root_path='/your/voc_data/annotations',
)

__init__(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – path of img id list file in ImageSets/Main/

  • classes – classes list

  • img_root_path – image dir path; if None, the image dir is inferred from the relative path of path according to the VOC data format.

  • label_root_path – label dir path; if None, the label dir is inferred from the relative path of path according to the VOC data format.

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

class easycv.datasets.detection.data_sources.DetSourceVOC2007(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.voc.DetSourceVOC

__init__(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading

  • download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used

  • split – train or val

  • classes – classes list

  • img_root_path – image dir path; if None, the image dir is inferred from the relative path of path according to the VOC data format.

  • label_root_path – label dir path; if None, the label dir is inferred from the relative path of path according to the VOC data format.

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples
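A minimal sketch of the auto-download usage (the local path is illustrative; the class list is the standard 20 VOC classes):

from easycv.datasets.detection.data_sources import DetSourceVOC2007

VOC_CLASSES = [
    'aeroplane', 'bicycle', 'bird', 'boat', 'bottle', 'bus', 'car', 'cat',
    'chair', 'cow', 'diningtable', 'dog', 'horse', 'motorbike', 'person',
    'pottedplant', 'sheep', 'sofa', 'train', 'tvmonitor']

data_source = DetSourceVOC2007(
    path='data/voc2007/',   # optional: a temporary directory is used when omitted
    download=True,
    split='train',
    classes=VOC_CLASSES)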

class easycv.datasets.detection.data_sources.DetSourceVOC2012(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.voc.DetSourceVOC

__init__(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading

  • download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used

  • split – train or val

  • classes – classes list

  • img_root_path – image dir path; if None, the image dir is inferred from the relative path of path according to the VOC data format.

  • label_root_path – label dir path; if None, the label dir is inferred from the relative path of path according to the VOC data format.

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

class easycv.datasets.detection.data_sources.DetSourceCoco2017(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]

Bases: easycv.datasets.detection.data_sources.coco.DetSourceCoco

coco2017 data source

__init__(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]
Parameters
  • path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading

  • download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used

  • split – train or val

  • test_mode (bool, optional) – If set True, self._filter_imgs will not work.

  • filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.

  • iscrowd – typically set to False for training and True for validation
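A minimal sketch of the auto-download usage (the local path and pipeline are illustrative):

from easycv.datasets.detection.data_sources import DetSourceCoco2017

data_source = DetSourceCoco2017(
    pipeline=[
        dict(type='LoadImageFromFile', to_float32=True),
        dict(type='LoadAnnotations', with_bbox=True),
    ],
    path='data/coco2017/',   # optional: a temporary directory is used when omitted
    download=True,
    split='train')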

class easycv.datasets.detection.data_sources.DetSourceLvis(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.coco.DetSourceCoco

lvis data source

__init__(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False, **kwargs)[source]
Parameters
  • path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading

  • download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used

  • split – train or val

  • test_mode (bool, optional) – If set True, self._filter_imgs will not work.

  • filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.

  • iscrowd – typically set to False for training and True for validation

cfg = {'dataset': 'images', 'links': ['https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_train.json.zip', 'https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_val.json.zip', 'http://images.cocodataset.org/zips/train2017.zip', 'http://images.cocodataset.org/zips/val2017.zip'], 'train': 'lvis_v1_train.json', 'val': 'lvis_v1_val.json'}
load_annotations(ann_file)[source]

Load annotation from COCO style annotation file. :param ann_file: Path of annotation file. :type ann_file: str

Returns

Annotation info from COCO api.

Return type

list[dict]

download()[source]
merge_images_folder()[source]
class easycv.datasets.detection.data_sources.DetSourceWiderPerson(path, classes=['pedestrians', 'riders', 'partially-visible persons', 'ignore regions', 'crowd'], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.txt', parse_fn=<function parse_txt>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

CLASSES = ['pedestrians', 'riders', 'partially-visible persons', 'ignore regions', 'crowd']

dataset_name = 'Wider Person'

Citation: @article{zhang2019widerperson, Author = {Zhang, Shifeng and Xie, Yiliang and Wan, Jun and Xia, Hansheng and Li, Stan Z. and Guo, Guodong}, journal = {IEEE Transactions on Multimedia (TMM)}, Title = {WiderPerson: A Diverse Dataset for Dense Pedestrian Detection in the Wild}, Year = {2019}}

__init__(path, classes=['pedestrians', 'riders', 'partially-visible persons', 'ignore regions', 'crowd'], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.txt', parse_fn=<function parse_txt>, num_processes=1, **kwargs)None[source]
Parameters
  • path – path of img id list file in root

  • classes – classes list

  • img_root_path – image dir path; if None, the image dir is inferred from the relative path of path according to the WiderPerson data format.

  • label_root_path – label dir path; if None, the label dir is inferred from the relative path of path according to the WiderPerson data format.

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

class easycv.datasets.detection.data_sources.DetSourceAfricanWildlife(path, classes=['buffalo', 'elephant', 'rhino', 'zebra'], cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.txt', parse_fn=<function parse_txt>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

data dir is as follows:

|- data
    |- buffalo
        |- 001.jpg
        |- 001.txt
        |- ...
    |- elephant
        |- 001.jpg
        |- 001.txt
        |- ...
    |- rhino
        |- 001.jpg
        |- 001.txt
        |- ...
    |- ...

Example:

data_source = DetSourceAfricanWildlife(
    path='/your/data/',
    classes=${CLASSES},
)

CLASSES = ['buffalo', 'elephant', 'rhino', 'zebra']
__init__(path, classes=['buffalo', 'elephant', 'rhino', 'zebra'], cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.txt', parse_fn=<function parse_txt>, num_processes=1, **kwargs)None[source]
Parameters
  • path – path of img id list file in root

  • classes – classes list

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

class easycv.datasets.detection.data_sources.DetSourcePet(path, classes_id=1, img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

data dir is as follows:

|- data
    |- annotations
        |- annotations
            |- list.txt
            |- test.txt
            |- trainval.txt
            |- xmls
                |- Abyssinian_6.xml
                |- ...
    |- images
        |- images
            |- Abyssinian_6.jpg
            |- ...

Example 0:

data_source = DetSourcePet(
    path='/your/data/annotations/annotations/trainval.txt',
    classes_id=1,   # or 2 or 3
)

Example 1:

data_source = DetSourcePet(
    path='/your/data/annotations/annotations/trainval.txt',
    classes_id=1,   # or 2 or 3
    img_root_path='/your/data/images',
    label_root_path='/your/data/annotations/annotations/xmls',
)

CLASSES_CFG = {1: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37], 2: [1, 2], 3: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]}
__init__(path, classes_id=1, img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – path of img id list file in pet format

  • classes_id – selects which class set from CLASSES_CFG to use: 1 = class ids 1-37 (breeds), 2 = class ids 1-2 (1: Cat, 2: Dog), 3 = class ids 1-25

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

class easycv.datasets.detection.data_sources.DetSourceWiderFace(ann_file, img_prefix, classes='blur', cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parse_load>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

CLASSES = {'blur': ['clear', 'normal blur', 'heavy blur'], 'expression': ['typical expression', 'exaggerate expression'], 'illumination': ['normal illumination', 'extreme illumination'], 'invalid': ['false (valid image)', 'true (invalid image)'], 'occlusion': ['no occlusion', 'partial occlusion', 'heavy occlusion'], 'pose': ['typical pose', 'atypical pose']}

Citation: @inproceedings{yang2016wider, Author = {Yang, Shuo and Luo, Ping and Loy, Chen Change and Tang, Xiaoou}, Booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}, Title = {WIDER FACE: A Face Detection Benchmark}, Year = {2016}}

__init__(ann_file, img_prefix, classes='blur', cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parse_load>, num_processes=1, **kwargs)None[source]
Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held.

  • classes (str) – which annotation attribute to use as classes, default 'blur'

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

class easycv.datasets.detection.data_sources.DetSourceCrowdHuman(ann_file, img_prefix, gt_op='vbox', classes=['mask', 'person'], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parse_load>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

CLASSES = ['mask', 'person']
Citation: @article{shao2018crowdhuman, title={CrowdHuman: A Benchmark for Detecting Human in a Crowd}, author={Shao, Shuai and Zhao, Zijian and Li, Boxun and Xiao, Tete and Yu, Gang and Zhang, Xiangyu and Sun, Jian}, journal={arXiv preprint arXiv:1805.00123}, year={2018}}

__init__(ann_file, img_prefix, gt_op='vbox', classes=['mask', 'person'], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parse_load>, num_processes=1, **kwargs)None[source]
Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held.

  • gt_op (str) – which box annotation to use: vbox (visible box), fbox (full box) or hbox (head box); default vbox

  • classes (list) – classes, default ['mask', 'person']

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

Submodules
easycv.datasets.detection.data_sources.coco module
class easycv.datasets.detection.data_sources.coco.DetSourceCoco(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]

Bases: object

coco data source

__init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]
Parameters
  • ann_file – Path of annotation file.

  • img_prefix – coco path prefix

  • test_mode (bool, optional) – If set True, self._filter_imgs will not work.

  • filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.

  • iscrowd – typically set to False for training and True for validation

load_annotations(ann_file)[source]

Load annotation from COCO style annotation file. :param ann_file: Path of annotation file. :type ann_file: str

Returns

Annotation info from COCO api.

Return type

list[dict]

get_ann_info(idx)[source]

Get COCO annotation by index. :param idx: Index of data. :type idx: int

Returns

Annotation info of specified index.

Return type

dict

get_cat_ids(idx)[source]

Get COCO category ids by index. :param idx: Index of data. :type idx: int

Returns

All categories in the image of specified index.

Return type

list[int]

xyxy2xywh(bbox)[source]

Convert xyxy style bounding boxes to xywh style for COCO evaluation.

Parameters

bbox – The bounding boxes, shape (4, ), in xyxy order.

Returns

The converted bounding boxes, in xywh order.

Return type

list[float]

pre_pipeline(results)[source]

Prepare results dict for pipeline.

prepare_train_img(idx)[source]

Get training data and annotations after pipeline. :param idx: Index of data. :type idx: int

Returns

Training data and annotation after pipeline with new keys introduced by pipeline.

Return type

dict

class easycv.datasets.detection.data_sources.coco.DetSourceCoco2017(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]

Bases: easycv.datasets.detection.data_sources.coco.DetSourceCoco

coco2017 data source

__init__(pipeline, path=None, download=True, split='train', test_mode=False, filter_empty_gt=False, classes=None, iscrowd=False)[source]
Parameters
  • path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading

  • download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used

  • split – train or val

  • test_mode (bool, optional) – If set True, self._filter_imgs will not work.

  • filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.

  • iscrowd – typically set to False for training and True for validation

class easycv.datasets.detection.data_sources.coco.DetSourceTinyPerson(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=['sea_person', 'earth_person'], iscrowd=False)[source]

Bases: easycv.datasets.detection.data_sources.coco.DetSourceCoco

TINY PERSON data source

CLASSES = ['sea_person', 'earth_person']
__init__(ann_file, img_prefix, pipeline, test_mode=False, filter_empty_gt=False, classes=['sea_person', 'earth_person'], iscrowd=False)[source]
Parameters
  • ann_file – Path of annotation file.

  • img_prefix – coco path prefix

  • test_mode (bool, optional) – If set True, self._filter_imgs will not work.

  • filter_empty_gt (bool, optional) – If set true, images without bounding boxes of the dataset’s classes will be filtered out. This option only works when test_mode=False, i.e., we never filter images during tests.

  • iscrowd – typically set to False for training and True for validation

easycv.datasets.detection.data_sources.pai_format module
easycv.datasets.detection.data_sources.pai_format.get_prior_task_id(keys)[source]

The task id ending with “check” has the highest priority.

easycv.datasets.detection.data_sources.pai_format.is_itag_v2(row)[source]

The keyword of the data source is picUrl in v1, but is source in v2

easycv.datasets.detection.data_sources.pai_format.parser_manifest_row_str(row_str, classes)[source]
class easycv.datasets.detection.data_sources.pai_format.DetSourcePAI(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

data format please refer to: https://help.aliyun.com/document_detail/311173.html

__init__(path, classes=[], cache_at_init=False, cache_on_the_fly=False, parse_fn=<function parser_manifest_row_str>, num_processes=1, **kwargs)[source]
Parameters
  • path – Path of the manifest file with PAI label format

  • classes – classes list

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

easycv.datasets.detection.data_sources.raw module
easycv.datasets.detection.data_sources.raw.parse_raw(source_iter, classes=None, delimeter=' ')[source]
class easycv.datasets.detection.data_sources.raw.DetSourceRaw(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

data dir is as follows:

|- data_dir
    |- images
        |- 1.jpg
        |- ...
    |- labels
        |- 1.txt
        |- ...

Label txt file is as follows: the first column is the label id, and columns 2 to 5 are coordinates relative to the image width and height [x_center, y_center, bbox_w, bbox_h].

15 0.519398 0.544087 0.476359 0.572061
2 0.501859 0.820726 0.996281 0.332178
...

Example:

data_source = DetSourceRaw(
    img_root_path='/your/data_dir/images',
    label_root_path='/your/data_dir/labels',
)

__init__(img_root_path, label_root_path, classes=[], cache_at_init=False, cache_on_the_fly=False, delimeter=' ', parse_fn=<function parse_raw>, num_processes=1, **kwargs)[source]
Parameters
  • img_root_path – images dir path

  • label_root_path – labels dir path

  • classes (list, optional) – classes list

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • delimeter – delimiter of the txt file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

post_process_fn(result_dict)[source]
get_ann_info(idx)[source]

Get raw annotation info, including bounding boxes, labels and so on. The bboxes format is [x1, y1, x2, y2] without normalization.

easycv.datasets.detection.data_sources.utils module
easycv.datasets.detection.data_sources.utils.exif_size(img)[source]
easycv.datasets.detection.data_sources.voc module
easycv.datasets.detection.data_sources.voc.parse_xml(source_item, classes)[source]
class easycv.datasets.detection.data_sources.voc.DetSourceVOC(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.base.DetSourceBase

data dir is as follows:

|- voc_data
    |- ImageSets
        |- Main
            |- train.txt
            |- ...
    |- JPEGImages
        |- 00001.jpg
        |- ...
    |- Annotations
        |- 00001.xml
        |- ...

Example 1:

data_source = DetSourceVOC(
    path='/your/voc_data/ImageSets/Main/train.txt',
    classes=${VOC_CLASSES},
)

Example 2:

data_source = DetSourceVOC(
    path='/your/voc_data/train.txt',
    classes=${VOC_CLASSES},
    img_root_path='/your/voc_data/images',
    label_root_path='/your/voc_data/annotations',
)

__init__(path, classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – path of img id list file in ImageSets/Main/

  • classes – classes list

  • img_root_path – image dir path; if None, the image dir is inferred from the relative path of path according to the VOC data format.

  • label_root_path – label dir path; if None, the label dir is inferred from the relative path of path according to the VOC data format.

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

get_source_iterator()[source]

Return the data list iterator. The source iterator will be passed to parse_fn, and parse_fn will receive each item of the source iterator together with classes for parsing; the source iterator should therefore yield whatever parse_fn needs.

class easycv.datasets.detection.data_sources.voc.DetSourceVOC2012(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.voc.DetSourceVOC

__init__(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading

  • download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used

  • split – train or val

  • classes – classes list

  • img_root_path – image dir path; if None, the image dir is inferred from the relative path of path according to the VOC data format.

  • label_root_path – label dir path; if None, the label dir is inferred from the relative path of path according to the VOC data format.

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

class easycv.datasets.detection.data_sources.voc.DetSourceVOC2007(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]

Bases: easycv.datasets.detection.data_sources.voc.DetSourceVOC

__init__(path=None, download=True, split='train', classes=[], img_root_path=None, label_root_path=None, cache_at_init=False, cache_on_the_fly=False, img_suffix='.jpg', label_suffix='.xml', parse_fn=<function parse_xml>, num_processes=1, **kwargs)[source]
Parameters
  • path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading

  • download – If the value is True, the file is automatically downloaded to the path directory. If False, automatic download is not supported and data in the path is used

  • split – train or val

  • classes – classes list

  • img_root_path – image dir path; if None, the image dir is inferred from the relative path of path according to the VOC data format.

  • label_root_path – label dir path; if None, the label dir is inferred from the relative path of path according to the VOC data format.

  • cache_at_init – if set True, will cache in memory in __init__ for faster training

  • cache_on_the_fly – if set True, will cache in memory during training

  • img_suffix – suffix of image file

  • label_suffix – suffix of label file

  • parse_fn – parse function to parse item of source iterator

  • num_processes – number of processes to parse samples

easycv.datasets.detection.pipelines package
class easycv.datasets.detection.pipelines.MMToTensor[source]

Bases: object

Transform image to Tensor. Required key: ‘img’. Modifies key: ‘img’. :param results: contain all information about training. :type results: dict

class easycv.datasets.detection.pipelines.NormalizeTensor(mean, std)[source]

Bases: object

Normalize the Tensor image (CxHxW), with mean and std. Required key: ‘img’. Modifies key: ‘img’. :param mean: Mean values of 3 channels. :type mean: list[float] :param std: Std values of 3 channels. :type std: list[float]

__init__(mean, std)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.MMMosaic(img_scale=(640, 640), center_ratio_range=(0.5, 1.5), pad_val=114)[source]

Bases: object

Mosaic augmentation. Given 4 images, the mosaic transform combines them into one output image. The output image is composed of parts from each sub-image: the 4 sub-images are placed around a randomly chosen mosaic center (center_x, center_y), and regions not covered by the original images are padded.

The mosaic transform steps are as follows:
  1. Choose the mosaic center as the intersections of 4 images

  2. Get the left top image according to the index, and randomly sample another 3 images from the custom dataset.

  3. Sub-images will be cropped if they are larger than the mosaic patch

Parameters
  • img_scale (Sequence[int]) – Image size after mosaic pipeline of single image. Default to (640, 640).

  • center_ratio_range (Sequence[float]) – Center ratio range of mosaic output. Default to (0.5, 1.5).

  • pad_val (int) – Pad value. Default to 114.

__init__(img_scale=(640, 640), center_ratio_range=(0.5, 1.5), pad_val=114)[source]

Initialize self. See help(type(self)) for accurate signature.

get_indexes(dataset)[source]

Call function to collect indexes. :param dataset: The dataset. :type dataset: DetImagesMixDataset

Returns

indexes.

Return type

list

class easycv.datasets.detection.pipelines.MMMixUp(img_scale=(640, 640), ratio_range=(0.5, 1.5), flip_ratio=0.5, pad_val=114, max_iters=15, min_bbox_size=5, min_area_ratio=0.2, max_aspect_ratio=20)[source]

Bases: object

MixUp data augmentation: a second image is padded, resized and blended with the original image to form the mixup image.

The mixup transform steps are as follows:
  1. Another random image is picked by the dataset and embedded in the top left patch (after padding and resizing)

  2. The target of mixup transform is the weighted average of mixup image and origin image.

Parameters
  • img_scale (Sequence[int]) – Image output size after mixup pipeline. Default: (640, 640).

  • ratio_range (Sequence[float]) – Scale ratio of mixup image. Default: (0.5, 1.5).

  • flip_ratio (float) – Horizontal flip ratio of mixup image. Default: 0.5.

  • pad_val (int) – Pad value. Default: 114.

  • max_iters (int) – The maximum number of iterations. If the number of iterations is greater than max_iters, but gt_bbox is still empty, then the iteration is terminated. Default: 15.

  • min_bbox_size (float) – Width and height threshold to filter bboxes. If the height or width of a box is smaller than this value, it will be removed. Default: 5.

  • min_area_ratio (float) – Threshold of area ratio between original bboxes and wrapped bboxes. If smaller than this value, the box will be removed. Default: 0.2.

  • max_aspect_ratio (float) – Aspect ratio of width and height threshold to filter bboxes. If max(h/w, w/h) larger than this value, the box will be removed. Default: 20.

__init__(img_scale=(640, 640), ratio_range=(0.5, 1.5), flip_ratio=0.5, pad_val=114, max_iters=15, min_bbox_size=5, min_area_ratio=0.2, max_aspect_ratio=20)[source]

Initialize self. See help(type(self)) for accurate signature.

get_indexes(dataset)[source]

Call function to collect indexes. :param dataset: The dataset. :type dataset: DetImagesMixDataset

Returns

indexes.

Return type

list

class easycv.datasets.detection.pipelines.MMRandomAffine(max_rotate_degree=10.0, max_translate_ratio=0.1, scaling_ratio_range=(0.5, 1.5), max_shear_degree=2.0, border=(0, 0), border_val=(114, 114, 114), min_bbox_size=2, min_area_ratio=0.2, max_aspect_ratio=20)[source]

Bases: object

Random affine transform data augmentation (for YOLOX). This operation randomly generates an affine transform matrix including rotation, translation, shear and scaling transforms.

Parameters
  • max_rotate_degree (float) – Maximum degrees of rotation transform. Default: 10.

  • max_translate_ratio (float) – Maximum ratio of translation. Default: 0.1.

  • scaling_ratio_range (tuple[float]) – Min and max ratio of scaling transform. Default: (0.5, 1.5).

  • max_shear_degree (float) – Maximum degrees of shear transform. Default: 2.

  • border (tuple[int]) – Distance from height and width sides of input image to adjust output shape. Only used in mosaic dataset. Default: (0, 0).

  • border_val (tuple[int]) – Border padding values of 3 channels. Default: (114, 114, 114).

  • min_bbox_size (float) – Width and height threshold to filter bboxes. If the height or width of a box is smaller than this value, it will be removed. Default: 2.

  • min_area_ratio (float) – Threshold of area ratio between original bboxes and wrapped bboxes. If smaller than this value, the box will be removed. Default: 0.2.

  • max_aspect_ratio (float) – Aspect ratio of width and height threshold to filter bboxes. If max(h/w, w/h) larger than this value, the box will be removed.

__init__(max_rotate_degree=10.0, max_translate_ratio=0.1, scaling_ratio_range=(0.5, 1.5), max_shear_degree=2.0, border=(0, 0), border_val=(114, 114, 114), min_bbox_size=2, min_area_ratio=0.2, max_aspect_ratio=20)[source]

Initialize self. See help(type(self)) for accurate signature.

filter_gt_bboxes(origin_bboxes, wrapped_bboxes)[source]
class easycv.datasets.detection.pipelines.MMPhotoMetricDistortion(brightness_delta=32, contrast_range=(0.5, 1.5), saturation_range=(0.5, 1.5), hue_delta=18)[source]

Bases: object

Apply photometric distortion to the image sequentially; every transformation is applied with a probability of 0.5. The position of random contrast is second or second-to-last.

1. random brightness
2. random contrast (mode 0)
3. convert color from BGR to HSV
4. random saturation
5. random hue
6. convert color from HSV to BGR
7. random contrast (mode 1)
8. randomly swap channels

Parameters
  • brightness_delta (int) – delta of brightness.

  • contrast_range (tuple) – range of contrast.

  • saturation_range (tuple) – range of saturation.

  • hue_delta (int) – delta of hue.

__init__(brightness_delta=32, contrast_range=(0.5, 1.5), saturation_range=(0.5, 1.5), hue_delta=18)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.MMResize(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', override=False)[source]

Bases: object

Resize images & bbox & mask. This transform resizes the input image to some scale. Bboxes and masks are then resized with the same scale factor. If the input dict contains the key “scale”, then the scale in the input dict is used, otherwise the specified scale in the init method is used. If the input dict contains the key “scale_factor” (if MultiScaleFlipAug does not give img_scale but scale_factor), the actual scale will be computed by image shape and scale_factor. img_scale can either be a tuple (single-scale) or a list of tuples (multi-scale). There are 3 multiscale modes:

  • ratio_range is not None: randomly sample a ratio from the ratio range and multiply it with the image scale.

  • ratio_range is None and multiscale_mode == "range": randomly sample a scale from the multiscale range.

  • ratio_range is None and multiscale_mode == "value": randomly sample a scale from multiple scales.

Parameters
  • img_scale (tuple or list[tuple]) – Image scales for resizing.

  • multiscale_mode (str) – Either “range” or “value”.

  • ratio_range (tuple[float]) – (min_ratio, max_ratio)

  • keep_ratio (bool) – Whether to keep the aspect ratio when resizing the image.

  • bbox_clip_border (bool, optional) – Whether clip the objects outside the border of the image. Defaults to True.

  • backend (str) – Image resize backend, choices are ‘cv2’ and ‘pillow’. These two backends generates slightly different results. Defaults to ‘cv2’.

  • override (bool, optional) – Whether to override scale and scale_factor so as to call resize twice. Default False. If True, after the first resizing, the existed scale and scale_factor will be ignored so the second resizing can be allowed. This option is a work-around for multiple times of resize in DETR. Defaults to False.
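Illustrative configs (scale values are examples, not defaults) for the three multiscale modes described above:

# 1) ratio_range is not None: sample a ratio and multiply it with img_scale
dict(type='MMResize', img_scale=(1333, 800), ratio_range=(0.8, 1.2), keep_ratio=True)

# 2) multiscale_mode='range': sample a scale between the two given scales
dict(type='MMResize', img_scale=[(1333, 640), (1333, 800)], multiscale_mode='range', keep_ratio=True)

# 3) multiscale_mode='value': pick one of the listed scales at random
dict(type='MMResize', img_scale=[(1333, 640), (1333, 672), (1333, 704)], multiscale_mode='value', keep_ratio=True)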

__init__(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', override=False)[source]

Initialize self. See help(type(self)) for accurate signature.

static random_select(img_scales)[source]

Randomly select an img_scale from given candidates. :param img_scales: Images scales for selection. :type img_scales: list[tuple]

Returns

Returns a tuple (img_scale, scale_idx), where img_scale is the selected image scale and scale_idx is the selected index in the given candidates.

Return type

(tuple, int)

static random_sample(img_scales)[source]

Randomly sample an img_scale when multiscale_mode=='range'.

Parameters

img_scales (list[tuple]) – Image scale range for sampling. There must be two tuples in img_scales, which specify the lower and upper bound of image scales.

Returns

Returns a tuple (img_scale, None), where img_scale is sampled scale and None is just a placeholder to be consistent with random_select().

Return type

(tuple, None)

static random_sample_ratio(img_scale, ratio_range)[source]

Randomly sample an img_scale when ratio_range is specified. A ratio will be randomly sampled from the range specified by ratio_range, then it is multiplied with img_scale to generate the sampled scale.

Parameters
  • img_scale (tuple) – Image scale base to multiply with ratio.

  • ratio_range (tuple[float]) – The minimum and maximum ratio to scale the img_scale.

Returns

Returns a tuple (scale, None), where scale is sampled ratio multiplied with img_scale and None is just a placeholder to be consistent with random_select().

Return type

(tuple, None)

class easycv.datasets.detection.pipelines.MMRandomFlip(flip_ratio=None, direction='horizontal')[source]

Bases: object

Flip the image & bbox & mask. If the input dict contains the key “flip”, then the flag will be used; otherwise it will be randomly decided by a ratio specified in the init method. When random flip is enabled, flip_ratio/direction can either be a float/string or a tuple of float/string. There are 3 flip modes:

  • flip_ratio is float, direction is string: the image will be flipped in direction with probability flip_ratio. E.g., flip_ratio=0.5, direction='horizontal', then the image will be horizontally flipped with probability 0.5.

  • flip_ratio is float, direction is list of string: the image will be flipped in direction[i] with probability flip_ratio/len(direction). E.g., flip_ratio=0.5, direction=['horizontal', 'vertical'], then the image will be horizontally flipped with probability 0.25 and vertically with probability 0.25.

  • flip_ratio is list of float, direction is list of string: given len(flip_ratio) == len(direction), the image will be flipped in direction[i] with probability flip_ratio[i]. E.g., flip_ratio=[0.3, 0.5], direction=['horizontal', 'vertical'], then the image will be horizontally flipped with probability 0.3 and vertically with probability 0.5.

Parameters
  • flip_ratio (float | list[float], optional) – The flipping probability. Default: None.

  • direction (str | list[str], optional) – The flipping direction. Options are ‘horizontal’, ‘vertical’, ‘diagonal’. Default: ‘horizontal’. If input is a list, the length must equal flip_ratio. Each element in flip_ratio indicates the flip probability of corresponding direction.
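Illustrative configs for the three flip modes described above (the probabilities are examples):

# float flip_ratio + single direction: horizontal flip with probability 0.5
dict(type='MMRandomFlip', flip_ratio=0.5, direction='horizontal')

# float flip_ratio + list of directions: 0.25 horizontal, 0.25 vertical
dict(type='MMRandomFlip', flip_ratio=0.5, direction=['horizontal', 'vertical'])

# list flip_ratio + list of directions: 0.3 horizontal, 0.5 vertical
dict(type='MMRandomFlip', flip_ratio=[0.3, 0.5], direction=['horizontal', 'vertical'])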

__init__(flip_ratio=None, direction='horizontal')[source]

Initialize self. See help(type(self)) for accurate signature.

bbox_flip(bboxes, img_shape, direction)[source]

Flip bboxes horizontally.

Parameters
  • bboxes (numpy.ndarray) – Bounding boxes, shape (…, 4*k)

  • img_shape (tuple[int]) – Image shape (height, width)

  • direction (str) – Flip direction. Options are ‘horizontal’, ‘vertical’.

Returns

Flipped bounding boxes.

Return type

numpy.ndarray

class easycv.datasets.detection.pipelines.MMPad(size=None, size_divisor=None, pad_to_square=False, pad_val={'img': 0, 'masks': 0, 'seg': 255})[source]

Bases: object

Pad the image & mask. There are two padding modes: (1) pad to a fixed size and (2) pad to the minimum size that is divisible by some number. Added keys are “pad_shape”, “pad_fixed_size”, “pad_size_divisor”.

Parameters
  • size (tuple, optional) – Fixed padding size.

  • size_divisor (int, optional) – The divisor of padded size.

  • pad_to_square (bool) – Whether to pad the image into a square. Currently only used for YOLOX. Default: False.

  • pad_val (dict, optional) – A dict for padding value, the default value is dict(img=0, masks=0, seg=255).

__init__(size=None, size_divisor=None, pad_to_square=False, pad_val={'img': 0, 'masks': 0, 'seg': 255})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.MMNormalize(mean, std, to_rgb=True)[source]

Bases: object

Normalize the image. Added key is “img_norm_cfg”.

Parameters
  • mean (sequence) – Mean values of 3 channels.

  • std (sequence) – Std values of 3 channels.

  • to_rgb (bool) – Whether to convert the image from BGR to RGB, default is true.

__init__(mean, std, to_rgb=True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.LoadImageFromFile(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]

Bases: object

Load an image from file. Required keys are “img_prefix” and “img_info” (a dict that must contain the key “filename”). Added or updated keys are “filename”, “img”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0) and “img_norm_cfg” (means=0 and stds=1).

Parameters
  • to_float32 (bool) – Whether to convert the loaded image to a float32 numpy array. If set to False, the loaded image is a uint8 array. Defaults to False.

  • color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').

__init__(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]

Initialize self. See help(type(self)) for accurate signature.
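A minimal usage sketch of the results dict this transform expects (paths are hypothetical; it assumes the transform instance is applied by calling it on the dict, as in mmdet-style pipelines):

from easycv.datasets.detection.pipelines import LoadImageFromFile

load = LoadImageFromFile(to_float32=True)
results = dict(
    img_prefix='data/coco/train2017/',           # directory that holds the image
    img_info=dict(filename='000000000001.jpg'))  # must contain the key 'filename'
results = load(results)  # adds 'img', 'img_shape', 'ori_shape', 'scale_factor', ...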

class easycv.datasets.detection.pipelines.LoadImageFromWebcam(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]

Bases: easycv.datasets.detection.pipelines.mm_transforms.LoadImageFromFile

Load an image from webcam.

Similar with LoadImageFromFile, but the image read from webcam is in results['img'].

class easycv.datasets.detection.pipelines.LoadMultiChannelImageFromFiles(to_float32=False, color_type='unchanged', file_client_args={'backend': 'disk'})[source]

Bases: object

Load multi-channel images from a list of separate channel files. Required keys are “img_prefix” and “img_info” (a dict that must contain the key “filename”, which is expected to be a list of filenames). Added or updated keys are “filename”, “img”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0) and “img_norm_cfg” (means=0 and stds=1). :param to_float32: Whether to convert the loaded image to a float32

numpy array. If set to False, the loaded image is an uint8 array. Defaults to False.

Parameters
  • color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').

__init__(to_float32=False, color_type='unchanged', file_client_args={'backend': 'disk'})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.LoadAnnotations(with_bbox=True, with_label=True, with_mask=False, with_seg=False, poly2mask=True, file_client_args={'backend': 'disk'})[source]

Bases: object

Load multiple types of annotations. :param with_bbox: Whether to parse and load the bbox annotation.

Default: True.

Parameters
  • with_label (bool) – Whether to parse and load the label annotation. Default: True.

  • with_mask (bool) – Whether to parse and load the mask annotation. Default: False.

  • with_seg (bool) – Whether to parse and load the semantic segmentation annotation. Default: False.

  • poly2mask (bool) – Whether to convert the instance masks from polygons to bitmaps. Default: True.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').

__init__(with_bbox=True, with_label=True, with_mask=False, with_seg=False, poly2mask=True, file_client_args={'backend': 'disk'})[source]

Initialize self. See help(type(self)) for accurate signature.

process_polygons(polygons)[source]

Convert polygons to list of ndarray and filter invalid polygons. :param polygons: Polygons of one instance. :type polygons: list[list]

Returns

Processed polygons.

Return type

list[numpy.ndarray]
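A typical loading sub-pipeline sketch combining the loaders above for an instance-segmentation style task (hypothetical config; type strings assumed to be the class names):

load_pipeline = [
    dict(type='LoadImageFromFile', to_float32=False),
    dict(type='LoadAnnotations',
         with_bbox=True, with_label=True, with_mask=True, poly2mask=True),
]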

class easycv.datasets.detection.pipelines.MMMultiScaleFlipAug(transforms, img_scale=None, scale_factor=None, flip=False, flip_direction='horizontal')[source]

Bases: object

Test-time augmentation with multiple scales and flipping. An example configuration is as follows:

img_scale=[(1333, 400), (1333, 800)],
flip=True,
transforms=[
    dict(type='Resize', keep_ratio=True),
    dict(type='RandomFlip'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='Collect', keys=['img']),
]

After MMMultiScaleFlipAug with the above configuration, the results are wrapped into lists of the same length, as follows:

dict(
    img=[...],
    img_shape=[...],
    scale=[(1333, 400), (1333, 400), (1333, 800), (1333, 800)],
    flip=[False, True, False, True],
    ...
)
Parameters
  • transforms (list[dict]) – Transforms to apply in each augmentation.

  • img_scale (tuple | list[tuple] | None) – Images scales for resizing.

  • scale_factor (float | list[float] | None) – Scale factors for resizing.

  • flip (bool) – Whether apply flip augmentation. Default: False.

  • flip_direction (str | list[str]) – Flip augmentation directions, options are “horizontal”, “vertical” and “diagonal”. If flip_direction is a list, multiple flip augmentations will be applied. It has no effect when flip == False. Default: “horizontal”.

__init__(transforms, img_scale=None, scale_factor=None, flip=False, flip_direction='horizontal')[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.MMRandomCrop(crop_size, crop_type='absolute', allow_negative_crop=False, recompute_bbox=False, bbox_clip_border=True)[source]

Bases: object

Random crop the image & bboxes & masks.

The absolute crop_size is sampled based on crop_type and image_size, then the cropped results are generated.

Parameters
  • crop_size (tuple) – The relative ratio or absolute pixels of height and width.

  • crop_type (str, optional) – one of “relative_range”, “relative”, “absolute”, “absolute_range”. “relative” randomly crops (h * crop_size[0], w * crop_size[1]) part from an input of size (h, w). “relative_range” uniformly samples relative crop size from range [crop_size[0], 1] and [crop_size[1], 1] for height and width respectively. “absolute” crops from an input with absolute size (crop_size[0], crop_size[1]). “absolute_range” uniformly samples crop_h in range [crop_size[0], min(h, crop_size[1])] and crop_w in range [crop_size[0], min(w, crop_size[1])]. Default “absolute”.

  • allow_negative_crop (bool, optional) – Whether to allow a crop that does not contain any bbox area. Default False.

  • recompute_bbox (bool, optional) – Whether to re-compute the boxes based on cropped instance masks. Default False.

  • bbox_clip_border (bool, optional) – Whether clip the objects outside the border of the image. Defaults to True.

Note

  • If the image is smaller than the absolute crop size, return the original image.

  • The keys for bboxes, labels and masks must be aligned. That is, gt_bboxes corresponds to gt_labels and gt_masks, and gt_bboxes_ignore corresponds to gt_labels_ignore and gt_masks_ignore.

  • If the crop does not contain any gt-bbox region and allow_negative_crop is set to False, skip this image.

__init__(crop_size, crop_type='absolute', allow_negative_crop=False, recompute_bbox=False, bbox_clip_border=True)[source]

Initialize self. See help(type(self)) for accurate signature.
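A sketch of how crop_type interprets crop_size, using a hypothetical 480x640 (h, w) input (type string assumed to be the class name):

# 'absolute': crop exactly 300x400 (h, w) pixels
crop_abs = dict(type='MMRandomCrop', crop_size=(300, 400), crop_type='absolute')

# 'relative': crop (0.6 * 480, 0.8 * 640) = 288x512 pixels
crop_rel = dict(type='MMRandomCrop', crop_size=(0.6, 0.8), crop_type='relative')

# 'absolute_range': sample crop_h from [300, min(480, 500)] = [300, 480]
#                   and crop_w from [300, min(640, 500)] = [300, 500]
crop_rng = dict(type='MMRandomCrop', crop_size=(300, 500), crop_type='absolute_range')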

class easycv.datasets.detection.pipelines.MMFilterAnnotations(min_gt_bbox_wh=(1.0, 1.0), min_gt_mask_area=1, by_box=True, by_mask=False, keep_empty=True)[source]

Bases: object

Filter invalid annotations. :param min_gt_bbox_wh: Minimum width and height of ground truth

boxes. Default: (1., 1.)

Parameters
  • min_gt_mask_area (int) – Minimum foreground area of ground truth masks. Default: 1

  • by_box (bool) – Filter instances with bounding boxes not meeting the min_gt_bbox_wh threshold. Default: True

  • by_mask (bool) – Filter instances with masks not meeting min_gt_mask_area threshold. Default: False

  • keep_empty (bool) – Whether to return None when it becomes an empty bbox after filtering. Default: True

__init__(min_gt_bbox_wh=(1.0, 1.0), min_gt_mask_area=1, by_box=True, by_mask=False, keep_empty=True)[source]

Initialize self. See help(type(self)) for accurate signature.
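A small config sketch (hypothetical thresholds) that drops degenerate boxes after heavy augmentation:

filter_cfg = dict(
    type='MMFilterAnnotations',
    min_gt_bbox_wh=(2.0, 2.0),  # drop boxes narrower or shorter than 2 px
    by_box=True,
    keep_empty=False)           # do not return None even if all boxes are filtered out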

Submodules
easycv.datasets.detection.pipelines.mm_transforms module
class easycv.datasets.detection.pipelines.mm_transforms.MMToTensor[source]

Bases: object

Transform image to Tensor. Required key: ‘img’. Modifies key: ‘img’. :param results: contain all information about training. :type results: dict

class easycv.datasets.detection.pipelines.mm_transforms.NormalizeTensor(mean, std)[source]

Bases: object

Normalize the Tensor image (CxHxW), with mean and std. Required key: ‘img’. Modifies key: ‘img’. :param mean: Mean values of 3 channels. :type mean: list[float] :param std: Std values of 3 channels. :type std: list[float]

__init__(mean, std)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.MMMosaic(img_scale=(640, 640), center_ratio_range=(0.5, 1.5), pad_val=114)[source]

Bases: object

Mosaic augmentation. Given 4 images, the mosaic transform combines them into one output image composed of parts from each sub-image.

(The source docstring includes an ASCII diagram of the mosaic layout: the output canvas is divided around a randomly chosen center point (center_x, center_y) into four regions holding the four sub-images, with padding or cropping where a sub-image does not fill its region.)
The mosaic transform steps are as follows:
  1. Choose the mosaic center as the intersection point of the 4 images.

  2. Get the top-left image according to the index, and randomly sample another 3 images from the custom dataset.

  3. A sub-image is cropped if it is larger than its mosaic patch.

Parameters
  • img_scale (Sequence[int]) – Image size after mosaic pipeline of single image. Default to (640, 640).

  • center_ratio_range (Sequence[float]) – Center ratio range of mosaic output. Default to (0.5, 1.5).

  • pad_val (int) – Pad value. Default to 114.

__init__(img_scale=(640, 640), center_ratio_range=(0.5, 1.5), pad_val=114)[source]

Initialize self. See help(type(self)) for accurate signature.

get_indexes(dataset)[source]

Call function to collect indexes. :param dataset: The dataset. :type dataset: DetImagesMixDataset

Returns

indexes.

Return type

list

class easycv.datasets.detection.pipelines.mm_transforms.MMMixUp(img_scale=(640, 640), ratio_range=(0.5, 1.5), flip_ratio=0.5, pad_val=114, max_iters=15, min_bbox_size=5, min_area_ratio=0.2, max_aspect_ratio=20)[source]

Bases: object

MixUp data augmentation.

(The source docstring includes an ASCII diagram showing the randomly picked mixup image, after padding and resizing, embedded in the top-left patch of the output image.)

The mixup transform steps are as follows:
  1. Another random image is picked from the dataset and embedded in the top-left patch (after padding and resizing).

  2. The target of the mixup transform is the weighted average of the mixup image and the original image.

Parameters
  • img_scale (Sequence[int]) – Image output size after mixup pipeline. Default: (640, 640).

  • ratio_range (Sequence[float]) – Scale ratio of mixup image. Default: (0.5, 1.5).

  • flip_ratio (float) – Horizontal flip ratio of mixup image. Default: 0.5.

  • pad_val (int) – Pad value. Default: 114.

  • max_iters (int) – The maximum number of iterations. If the number of iterations is greater than max_iters, but gt_bbox is still empty, then the iteration is terminated. Default: 15.

  • min_bbox_size (float) – Width and height threshold to filter bboxes. If the height or width of a box is smaller than this value, it will be removed. Default: 5.

  • min_area_ratio (float) – Threshold of area ratio between original bboxes and wrapped bboxes. If smaller than this value, the box will be removed. Default: 0.2.

  • max_aspect_ratio (float) – Aspect ratio of width and height threshold to filter bboxes. If max(h/w, w/h) larger than this value, the box will be removed. Default: 20.

__init__(img_scale=(640, 640), ratio_range=(0.5, 1.5), flip_ratio=0.5, pad_val=114, max_iters=15, min_bbox_size=5, min_area_ratio=0.2, max_aspect_ratio=20)[source]

Initialize self. See help(type(self)) for accurate signature.

get_indexes(dataset)[source]

Call function to collect indexes. :param dataset: The dataset. :type dataset: DetImagesMixDataset

Returns

indexes.

Return type

list

class easycv.datasets.detection.pipelines.mm_transforms.MMRandomAffine(max_rotate_degree=10.0, max_translate_ratio=0.1, scaling_ratio_range=(0.5, 1.5), max_shear_degree=2.0, border=(0, 0), border_val=(114, 114, 114), min_bbox_size=2, min_area_ratio=0.2, max_aspect_ratio=20)[source]

Bases: object

Random affine transform data augmentation (used for YOLOX). This operation randomly generates an affine transform matrix that includes rotation, translation, shear and scaling transforms. :param max_rotate_degree: Maximum degrees of rotation transform.

Default: 10.

Parameters
  • max_translate_ratio (float) – Maximum ratio of translation. Default: 0.1.

  • scaling_ratio_range (tuple[float]) – Min and max ratio of scaling transform. Default: (0.5, 1.5).

  • max_shear_degree (float) – Maximum degrees of shear transform. Default: 2.

  • border (tuple[int]) – Distance from height and width sides of input image to adjust output shape. Only used in mosaic dataset. Default: (0, 0).

  • border_val (tuple[int]) – Border padding values of 3 channels. Default: (114, 114, 114).

  • min_bbox_size (float) – Width and height threshold to filter bboxes. If the height or width of a box is smaller than this value, it will be removed. Default: 2.

  • min_area_ratio (float) – Threshold of area ratio between original bboxes and wrapped bboxes. If smaller than this value, the box will be removed. Default: 0.2.

  • max_aspect_ratio (float) – Aspect ratio of width and height threshold to filter bboxes. If max(h/w, w/h) larger than this value, the box will be removed.

__init__(max_rotate_degree=10.0, max_translate_ratio=0.1, scaling_ratio_range=(0.5, 1.5), max_shear_degree=2.0, border=(0, 0), border_val=(114, 114, 114), min_bbox_size=2, min_area_ratio=0.2, max_aspect_ratio=20)[source]

Initialize self. See help(type(self)) for accurate signature.

filter_gt_bboxes(origin_bboxes, wrapped_bboxes)[source]
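The mixed-image transforms above (MMMosaic, MMMixUp, MMRandomAffine) are typically chained in a YOLOX-style training pipeline; a condensed sketch with hypothetical values (type strings assumed to be the class names):

img_scale = (640, 640)
train_pipeline = [
    dict(type='MMMosaic', img_scale=img_scale, pad_val=114.0),
    dict(type='MMRandomAffine',
         scaling_ratio_range=(0.1, 2.0),
         border=(-img_scale[0] // 2, -img_scale[1] // 2)),
    dict(type='MMMixUp', img_scale=img_scale, ratio_range=(0.8, 1.6), pad_val=114.0),
]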
class easycv.datasets.detection.pipelines.mm_transforms.MMPhotoMetricDistortion(brightness_delta=32, contrast_range=(0.5, 1.5), saturation_range=(0.5, 1.5), hue_delta=18)[source]

Bases: object

Apply photometric distortion to the image sequentially; every transformation is applied with a probability of 0.5. Random contrast is applied either second or second-to-last:

  1. random brightness

  2. random contrast (mode 0)

  3. convert color from BGR to HSV

  4. random saturation

  5. random hue

  6. convert color from HSV to BGR

  7. random contrast (mode 1)

  8. randomly swap channels

:param brightness_delta: delta of brightness. :type brightness_delta: int :param contrast_range: range of contrast. :type contrast_range: tuple :param saturation_range: range of saturation. :type saturation_range: tuple :param hue_delta: delta of hue. :type hue_delta: int

__init__(brightness_delta=32, contrast_range=(0.5, 1.5), saturation_range=(0.5, 1.5), hue_delta=18)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.MMResize(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', override=False)[source]

Bases: object

Resize images & bbox & mask. This transform resizes the input image to some scale. Bboxes and masks are then resized with the same scale factor. If the input dict contains the key “scale”, then the scale in the input dict is used, otherwise the specified scale in the init method is used. If the input dict contains the key “scale_factor” (if MultiScaleFlipAug does not give img_scale but scale_factor), the actual scale will be computed by image shape and scale_factor. img_scale can either be a tuple (single-scale) or a list of tuples (multi-scale). There are 3 multiscale modes:

  • ratio_range is not None: randomly sample a ratio from the ratio range and multiply it with the image scale.

  • ratio_range is None and multiscale_mode == "range": randomly sample a scale from the multiscale range.

  • ratio_range is None and multiscale_mode == "value": randomly sample a scale from multiple scales.

:param img_scale: Images scales for resizing. :type img_scale: tuple or list[tuple] :param multiscale_mode: Either “range” or “value”. :type multiscale_mode: str :param ratio_range: (min_ratio, max_ratio) :type ratio_range: tuple[float] :param keep_ratio: Whether to keep the aspect ratio when resizing the image.

Parameters
  • bbox_clip_border (bool, optional) – Whether clip the objects outside the border of the image. Defaults to True.

  • backend (str) – Image resize backend, choices are ‘cv2’ and ‘pillow’. These two backends generates slightly different results. Defaults to ‘cv2’.

  • override (bool, optional) – Whether to override scale and scale_factor so as to call resize twice. Default False. If True, after the first resizing, the existed scale and scale_factor will be ignored so the second resizing can be allowed. This option is a work-around for multiple times of resize in DETR. Defaults to False.

__init__(img_scale=None, multiscale_mode='range', ratio_range=None, keep_ratio=True, bbox_clip_border=True, backend='cv2', override=False)[source]

Initialize self. See help(type(self)) for accurate signature.

static random_select(img_scales)[source]

Randomly select an img_scale from given candidates. :param img_scales: Images scales for selection. :type img_scales: list[tuple]

Returns

Returns a tuple (img_scale, scale_idx), where img_scale is the selected image scale and scale_idx is the selected index in the given candidates.

Return type

(tuple, int)

static random_sample(img_scales)[source]

Randomly sample an img_scale when multiscale_mode=='range'. :param img_scales: Images scale range for sampling.

There must be two tuples in img_scales, which specify the lower and upper bound of image scales.

Returns

Returns a tuple (img_scale, None), where img_scale is sampled scale and None is just a placeholder to be consistent with random_select().

Return type

(tuple, None)

static random_sample_ratio(img_scale, ratio_range)[source]

Randomly sample an img_scale when ratio_range is specified. A ratio will be randomly sampled from the range specified by ratio_range. Then it would be multiplied with img_scale to generate sampled scale. :param img_scale: Images scale base to multiply with ratio. :type img_scale: tuple :param ratio_range: The minimum and maximum ratio to scale

the img_scale.

Returns

Returns a tuple (scale, None), where scale is sampled ratio multiplied with img_scale and None is just a placeholder to be consistent with random_select().

Return type

(tuple, None)
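A sketch (hypothetical scales) of the three multiscale modes described for MMResize, with the type string assumed to be the class name:

# ratio_range is None, multiscale_mode='range': sample a scale between the two bounds
resize_range = dict(type='MMResize', img_scale=[(1333, 640), (1333, 800)],
                    multiscale_mode='range', keep_ratio=True)

# ratio_range is None, multiscale_mode='value': pick one scale from the candidates
resize_value = dict(type='MMResize', img_scale=[(1333, 480), (1333, 800)],
                    multiscale_mode='value', keep_ratio=True)

# ratio_range is not None: multiply the base scale by a ratio sampled from (0.8, 1.2)
resize_ratio = dict(type='MMResize', img_scale=(1333, 800),
                    ratio_range=(0.8, 1.2), keep_ratio=True)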

class easycv.datasets.detection.pipelines.mm_transforms.MMRandomFlip(flip_ratio=None, direction='horizontal')[source]

Bases: object

Flip the image & bbox & mask. If the input dict contains the key “flip”, then the flag will be used, otherwise it will be randomly decided by a ratio specified in the init method. When random flip is enabled, flip_ratio/direction can either be a float/string or tuple of float/string. There are 3 flip modes: - flip_ratio is float, direction is string: the image will be

flipped along direction with probability flip_ratio. E.g., flip_ratio=0.5, direction='horizontal', then the image will be horizontally flipped with probability 0.5.

  • flip_ratio is float, direction is list of string: the image will

    be flipped along direction[i] with probability flip_ratio/len(direction). E.g., flip_ratio=0.5, direction=['horizontal', 'vertical'], then the image will be horizontally flipped with probability 0.25 and vertically flipped with probability 0.25.

  • flip_ratio is list of float, direction is list of string:

    given len(flip_ratio) == len(direction), the image will be flipped along direction[i] with probability flip_ratio[i]. E.g., flip_ratio=[0.3, 0.5], direction=['horizontal', 'vertical'], then the image will be horizontally flipped with probability 0.3 and vertically flipped with probability 0.5.

Parameters
  • flip_ratio (float | list[float], optional) – The flipping probability. Default: None.

  • direction (str | list[str], optional) – The flipping direction. Options are ‘horizontal’, ‘vertical’, ‘diagonal’. Default: ‘horizontal’. If input is a list, the length must equal flip_ratio. Each element in flip_ratio indicates the flip probability of corresponding direction.

__init__(flip_ratio=None, direction='horizontal')[source]

Initialize self. See help(type(self)) for accurate signature.

bbox_flip(bboxes, img_shape, direction)[source]

Flip bboxes according to the given direction. :param bboxes: Bounding boxes, shape (…, 4*k) :type bboxes: numpy.ndarray :param img_shape: Image shape (height, width) :type img_shape: tuple[int] :param direction: Flip direction. Options are ‘horizontal’, ‘vertical’.

Returns

Flipped bounding boxes.

Return type

numpy.ndarray

class easycv.datasets.detection.pipelines.mm_transforms.MMRandomCrop(crop_size, crop_type='absolute', allow_negative_crop=False, recompute_bbox=False, bbox_clip_border=True)[source]

Bases: object

Random crop the image & bboxes & masks.

The absolute crop_size is sampled based on crop_type and image_size, then the cropped results are generated.

Parameters
  • crop_size (tuple) – The relative ratio or absolute pixels of height and width.

  • crop_type (str, optional) – one of “relative_range”, “relative”, “absolute”, “absolute_range”. “relative” randomly crops (h * crop_size[0], w * crop_size[1]) part from an input of size (h, w). “relative_range” uniformly samples relative crop size from range [crop_size[0], 1] and [crop_size[1], 1] for height and width respectively. “absolute” crops from an input with absolute size (crop_size[0], crop_size[1]). “absolute_range” uniformly samples crop_h in range [crop_size[0], min(h, crop_size[1])] and crop_w in range [crop_size[0], min(w, crop_size[1])]. Default “absolute”.

  • allow_negative_crop (bool, optional) – Whether to allow a crop that does not contain any bbox area. Default False.

  • recompute_bbox (bool, optional) – Whether to re-compute the boxes based on cropped instance masks. Default False.

  • bbox_clip_border (bool, optional) – Whether clip the objects outside the border of the image. Defaults to True.

Note

  • If the image is smaller than the absolute crop size, return the original image.

  • The keys for bboxes, labels and masks must be aligned. That is, gt_bboxes corresponds to gt_labels and gt_masks, and gt_bboxes_ignore corresponds to gt_labels_ignore and gt_masks_ignore.

  • If the crop does not contain any gt-bbox region and allow_negative_crop is set to False, skip this image.

__init__(crop_size, crop_type='absolute', allow_negative_crop=False, recompute_bbox=False, bbox_clip_border=True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.MMPad(size=None, size_divisor=None, pad_to_square=False, pad_val={'img': 0, 'masks': 0, 'seg': 255})[source]

Bases: object

Pad the image & mask. There are two padding modes: (1) pad to a fixed size and (2) pad to the minimum size that is divisible by some number. Added keys are “pad_shape”, “pad_fixed_size”, “pad_size_divisor”, :param size: Fixed padding size. :type size: tuple, optional :param size_divisor: The divisor of padded size. :type size_divisor: int, optional :param pad_to_square: Whether to pad the image into a square.

Currently only used for YOLOX. Default: False.

Parameters

pad_val (dict, optional) – A dict for padding value, the default value is dict(img=0, masks=0, seg=255).

__init__(size=None, size_divisor=None, pad_to_square=False, pad_val={'img': 0, 'masks': 0, 'seg': 255})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.MMNormalize(mean, std, to_rgb=True)[source]

Bases: object

Normalize the image. Added key is “img_norm_cfg”. :param mean: Mean values of 3 channels. :type mean: sequence :param std: Std values of 3 channels. :type std: sequence :param to_rgb: Whether to convert the image from BGR to RGB,

default is true.

__init__(mean, std, to_rgb=True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.LoadImageFromFile(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]

Bases: object

Load an image from file. Required keys are “img_prefix” and “img_info” (a dict that must contain the key “filename”). Added or updated keys are “filename”, “img”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0) and “img_norm_cfg” (means=0 and stds=1). :param to_float32: Whether to convert the loaded image to a float32

numpy array. If set to False, the loaded image is an uint8 array. Defaults to False.

Parameters
  • color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').

__init__(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.LoadImageFromWebcam(to_float32=False, color_type='color', file_client_args={'backend': 'disk'})[source]

Bases: easycv.datasets.detection.pipelines.mm_transforms.LoadImageFromFile

Load an image from webcam.

Similar with LoadImageFromFile, but the image read from webcam is in results['img'].

class easycv.datasets.detection.pipelines.mm_transforms.LoadMultiChannelImageFromFiles(to_float32=False, color_type='unchanged', file_client_args={'backend': 'disk'})[source]

Bases: object

Load multi-channel images from a list of separate channel files. Required keys are “img_prefix” and “img_info” (a dict that must contain the key “filename”, which is expected to be a list of filenames). Added or updated keys are “filename”, “img”, “img_shape”, “ori_shape” (same as img_shape), “pad_shape” (same as img_shape), “scale_factor” (1.0) and “img_norm_cfg” (means=0 and stds=1). :param to_float32: Whether to convert the loaded image to a float32

numpy array. If set to False, the loaded image is an uint8 array. Defaults to False.

Parameters
  • color_type (str) – The flag argument for mmcv.imfrombytes(). Defaults to ‘color’.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').

__init__(to_float32=False, color_type='unchanged', file_client_args={'backend': 'disk'})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.LoadAnnotations(with_bbox=True, with_label=True, with_mask=False, with_seg=False, poly2mask=True, file_client_args={'backend': 'disk'})[source]

Bases: object

Load multiple types of annotations. :param with_bbox: Whether to parse and load the bbox annotation.

Default: True.

Parameters
  • with_label (bool) – Whether to parse and load the label annotation. Default: True.

  • with_mask (bool) – Whether to parse and load the mask annotation. Default: False.

  • with_seg (bool) – Whether to parse and load the semantic segmentation annotation. Default: False.

  • poly2mask (bool) – Whether to convert the instance masks from polygons to bitmaps. Default: True.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').

__init__(with_bbox=True, with_label=True, with_mask=False, with_seg=False, poly2mask=True, file_client_args={'backend': 'disk'})[source]

Initialize self. See help(type(self)) for accurate signature.

process_polygons(polygons)[source]

Convert polygons to list of ndarray and filter invalid polygons. :param polygons: Polygons of one instance. :type polygons: list[list]

Returns

Processed polygons.

Return type

list[numpy.ndarray]

class easycv.datasets.detection.pipelines.mm_transforms.LoadPanopticAnnotations(with_bbox=True, with_label=True, with_mask=True, with_seg=True, file_client_args={'backend': 'disk'})[source]

Bases: easycv.datasets.detection.pipelines.mm_transforms.LoadAnnotations

Load multiple types of panoptic annotations.

Parameters
  • with_bbox (bool) – Whether to parse and load the bbox annotation. Default: True.

  • with_label (bool) – Whether to parse and load the label annotation. Default: True.

  • with_mask (bool) – Whether to parse and load the mask annotation. Default: True.

  • with_seg (bool) – Whether to parse and load the semantic segmentation annotation. Default: True.

  • file_client_args (dict) – Arguments to instantiate a FileClient. See mmcv.fileio.FileClient for details. Defaults to dict(backend='disk').

__init__(with_bbox=True, with_label=True, with_mask=True, with_seg=True, file_client_args={'backend': 'disk'})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.MMMultiScaleFlipAug(transforms, img_scale=None, scale_factor=None, flip=False, flip_direction='horizontal')[source]

Bases: object

Test-time augmentation with multiple scales and flipping. An example configuration is as follows:

img_scale=[(1333, 400), (1333, 800)],
flip=True,
transforms=[
    dict(type='Resize', keep_ratio=True),
    dict(type='RandomFlip'),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='ImageToTensor', keys=['img']),
    dict(type='Collect', keys=['img']),
]

After MMMultiScaleFlipAug with the above configuration, the results are wrapped into lists of the same length, as follows:

dict(
    img=[...],
    img_shape=[...],
    scale=[(1333, 400), (1333, 400), (1333, 800), (1333, 800)],
    flip=[False, True, False, True],
    ...
)
Parameters
  • transforms (list[dict]) – Transforms to apply in each augmentation.

  • img_scale (tuple | list[tuple] | None) – Images scales for resizing.

  • scale_factor (float | list[float] | None) – Scale factors for resizing.

  • flip (bool) – Whether apply flip augmentation. Default: False.

  • flip_direction (str | list[str]) – Flip augmentation directions, options are “horizontal”, “vertical” and “diagonal”. If flip_direction is a list, multiple flip augmentations will be applied. It has no effect when flip == False. Default: “horizontal”.

__init__(transforms, img_scale=None, scale_factor=None, flip=False, flip_direction='horizontal')[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.detection.pipelines.mm_transforms.MMFilterAnnotations(min_gt_bbox_wh=(1.0, 1.0), min_gt_mask_area=1, by_box=True, by_mask=False, keep_empty=True)[source]

Bases: object

Filter invalid annotations. :param min_gt_bbox_wh: Minimum width and height of ground truth

boxes. Default: (1., 1.)

Parameters
  • min_gt_mask_area (int) – Minimum foreground area of ground truth masks. Default: 1

  • by_box (bool) – Filter instances with bounding boxes not meeting the min_gt_bbox_wh threshold. Default: True

  • by_mask (bool) – Filter instances with masks not meeting min_gt_mask_area threshold. Default: False

  • keep_empty (bool) – Whether to return None when it becomes an empty bbox after filtering. Default: True

__init__(min_gt_bbox_wh=(1.0, 1.0), min_gt_mask_area=1, by_box=True, by_mask=False, keep_empty=True)[source]

Initialize self. See help(type(self)) for accurate signature.

Submodules
easycv.datasets.detection.mix module
class easycv.datasets.detection.mix.DetImagesMixDataset(data_source, pipeline, dynamic_scale=None, skip_type_keys=None, profiling=False, classes=None, yolo_format=True, label_padding=True)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

A wrapper of multiple images mixed dataset.

Suitable for training on multiple images mixed data augmentation like mosaic and mixup. For the augmentation pipeline of mixed image data, the get_indexes method needs to be provided to obtain the image indexes, and you can set skip_flags to change the pipeline running process. At the same time, we provide the dynamic_scale parameter to dynamically change the output image size.

output boxes format: cx, cy, w, h

Parameters
  • data_source (DetSourceCoco) – The dataset to be mixed.

  • pipeline (Sequence[dict]) – Sequence of transform object or config dict to be composed.

  • dynamic_scale (tuple[int], optional) – The image scale can be changed dynamically. Default to None.

  • skip_type_keys (list[str], optional) – Sequence of type string to be skip pipeline. Default to None.

  • label_padding – out labeling padding [N, 120, 5]

__init__(data_source, pipeline, dynamic_scale=None, skip_type_keys=None, profiling=False, classes=None, yolo_format=True, label_padding=True)[source]

Parameters
  • data_source – Data_source config dict

  • pipeline – Pipeline config list

  • profiling – If set True, will print pipeline time

  • classes – A list of class names, used in evaluation for result and groundtruth visualization

update_skip_type_keys(skip_type_keys)[source]

Update skip_type_keys. It is called by an external hook.

Parameters

skip_type_keys (list[str], optional) – Sequence of type string to be skip pipeline.

update_dynamic_scale(dynamic_scale)[source]

Update dynamic_scale. It is called by an external hook.

Parameters

dynamic_scale (tuple[int]) – The image scale can be changed dynamically.

results2json(results, outfile_prefix)[source]

Dump the detection results to a COCO style json file.

There are 3 types of results: proposals, bbox predictions, mask predictions, and they have different data types. This method will automatically recognize the type, and dump them to json files.

Parameters
  • results (list[list | tuple | ndarray]) – Testing results of the dataset.

  • outfile_prefix (str) – The filename prefix of the json files. If the prefix is “somepath/xxx”, the json files will be named “somepath/xxx.bbox.json”, “somepath/xxx.segm.json”, “somepath/xxx.proposal.json”.

Returns

Possible keys are “bbox”, “segm”, “proposal”, and the values are the corresponding filenames.

Return type

dict[str, str]

format_results(results, jsonfile_prefix=None, **kwargs)[source]

Format the results to json (standard format for COCO evaluation).

Parameters
  • results (list[tuple | numpy.ndarray]) – Testing results of the dataset.

  • jsonfile_prefix (str | None) – The prefix of json files. It includes the file path and the prefix of filename, e.g., “a/b/prefix”. If not specified, a temp file will be created. Default: None.

Returns

(result_files, tmp_dir), result_files is a dict containing the json filepaths, tmp_dir is the temporal directory created for saving json files when jsonfile_prefix is not specified.

Return type

tuple
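A condensed config sketch wiring a COCO data source and a mosaic pipeline into DetImagesMixDataset (paths, values and the DetSourceCoco arguments shown here are hypothetical illustrations, not the definitive config):

train_dataset = dict(
    type='DetImagesMixDataset',
    data_source=dict(
        type='DetSourceCoco',
        ann_file='data/coco/annotations/instances_train2017.json',
        img_prefix='data/coco/train2017/',
        pipeline=[
            dict(type='LoadImageFromFile', to_float32=True),
            dict(type='LoadAnnotations', with_bbox=True),
        ]),
    pipeline=[
        dict(type='MMMosaic', img_scale=(640, 640), pad_val=114.0),
        dict(type='MMRandomFlip', flip_ratio=0.5),
        dict(type='MMPad', pad_to_square=True, pad_val=dict(img=114)),
    ],
    dynamic_scale=(640, 640))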

easycv.datasets.detection.raw module
class easycv.datasets.detection.raw.DetDataset(data_source, pipeline, profiling=False, classes=None)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Dataset for Detection

__init__(data_source, pipeline, profiling=False, classes=None)[source]
Parameters
  • data_source – Data_source config dict

  • pipeline – Pipeline config list

  • profiling – If set True, will print pipeline time

  • classes – A list of class names, used in evaluation for result and groundtruth visualization

evaluate(results, evaluators=None, logger=None)[source]

Evaluates the detection boxes. :param results: A dictionary containing:

detection_boxes: List of length number of test images. Float32 numpy array of shape [num_boxes, 4] and format [ymin, xmin, ymax, xmax] in absolute image coordinates.

detection_scores: List of length number of test images, detection scores for the boxes, float32 numpy array of shape [num_boxes].

detection_classes: List of length number of test images, integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.

img_metas: List of length number of test images, dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.

Parameters

evaluators – evaluators to calculate metric with results and groundtruth_dict

visualize(results, vis_num=10, score_thr=0.3, **kwargs)[source]

Visualize the model output on validation data. :param results: A dictionary containing:

detection_boxes: List of length number of test images. Float32 numpy array of shape [num_boxes, 4] and format [ymin, xmin, ymax, xmax] in absolute image coordinates.

detection_scores: List of length number of test images, detection scores for the boxes, float32 numpy array of shape [num_boxes].

detection_classes: List of length number of test images, integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.

img_metas: List of length number of test images, dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.

Parameters
  • vis_num – number of images visualized

  • score_thr – The threshold to filter box, boxes with scores greater than score_thr will be kept.

Returns: A dictionary containing

images: Visualized images.

img_metas: List of length number of test images, dict of image meta info, containing filename, img_shape, origin_img_shape, scale_factor and so on.
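A sketch of the results structure that evaluate/visualize consume, shown for two hypothetical test images:

import numpy as np

results = dict(
    detection_boxes=[np.zeros((3, 4), dtype=np.float32),    # [ymin, xmin, ymax, xmax] per box
                     np.zeros((1, 4), dtype=np.float32)],
    detection_scores=[np.array([0.9, 0.7, 0.4], dtype=np.float32),
                      np.array([0.8], dtype=np.float32)],
    detection_classes=[np.array([1, 1, 3], dtype=np.int32),  # 1-indexed class ids
                       np.array([2], dtype=np.int32)],
    img_metas=[dict(filename='a.jpg', img_shape=(640, 640, 3), scale_factor=1.0),
               dict(filename='b.jpg', img_shape=(640, 640, 3), scale_factor=1.0)])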

easycv.datasets.loader package

class easycv.datasets.loader.GroupSampler(dataset, samples_per_gpu=1)[source]

Bases: Generic[torch.utils.data.sampler.T_co]

__init__(dataset, samples_per_gpu=1)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.loader.DistributedGroupSampler(dataset, samples_per_gpu=1, seed=0, num_replicas=None, rank=None)[source]

Bases: Generic[torch.utils.data.sampler.T_co]

Sampler that restricts data loading to a subset of the dataset. It is especially useful in conjunction with torch.nn.parallel.DistributedDataParallel. In such case, each process can pass a DistributedSampler instance as a DataLoader sampler, and load a subset of the original dataset that is exclusive to it.

Note: Dataset is assumed to be of constant size.
Parameters
  • dataset – Dataset used for sampling.

  • seed (int, Optional) – The seed. Default to 0.

  • num_replicas (optional) – Number of processes participating in distributed training.

  • rank (optional) – Rank of the current process within num_replicas.

__init__(dataset, samples_per_gpu=1, seed=0, num_replicas=None, rank=None)[source]

Initialize self. See help(type(self)) for accurate signature.

set_epoch(epoch)[source]
easycv.datasets.loader.build_dataloader(dataset, imgs_per_gpu, workers_per_gpu, num_gpus=1, dist=True, shuffle=True, replace=False, seed=None, reuse_worker_cache=False, odps_config=None, persistent_workers=False, collate_hooks=None, use_repeated_augment_sampler=False, sampler=None, pin_memory=False, **kwargs)[source]

Build PyTorch DataLoader. In distributed training, each GPU/process has a dataloader. In non-distributed training, there is only one dataloader for all GPUs. :param dataset: A PyTorch dataset. :type dataset: Dataset :param imgs_per_gpu: Number of images on each GPU, i.e., batch size of each GPU.

Parameters
  • workers_per_gpu (int) – How many subprocesses to use for data loading for each GPU.

  • num_gpus (int) – Number of GPUs. Only used in non-distributed training.

  • dist (bool) – Distributed training/test or not. Default: True.

  • shuffle (bool) – Whether to shuffle the data at every epoch. Default: True.

  • replace (bool) – Whether to sample with replacement in random shuffle. It only takes effect when shuffle is True.

  • seed (int, Optional) – The seed. Default to None.

  • reuse_worker_cache (bool) – If set true, will reuse worker process so that cached data in worker process can be reused.

  • persistent_workers (bool) – Available since PyTorch 1.7. If True, the DataLoader keeps its worker processes alive between epochs instead of recreating them, which speeds up the start of each epoch.

  • use_repeated_augment_sampler (bool) – If set true, it will use RASampler. Default: False.

  • kwargs – any keyword argument to be used to initialize DataLoader

Returns

A PyTorch dataloader.

Return type

DataLoader
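A minimal non-distributed usage sketch (hypothetical values; dataset is any EasyCV dataset built from a config as above):

from easycv.datasets.loader import build_dataloader

data_loader = build_dataloader(
    dataset,
    imgs_per_gpu=16,     # batch size per GPU
    workers_per_gpu=4,   # dataloader worker processes per GPU
    num_gpus=1,
    dist=False,
    shuffle=True,
    seed=42)

for batch in data_loader:
    pass  # feed the batch to the model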

class easycv.datasets.loader.DistributedGivenIterationSampler(dataset, total_iter, batch_size, num_replicas=None, rank=None, last_iter=- 1)[source]

Bases: Generic[torch.utils.data.sampler.T_co]

__init__(dataset, total_iter, batch_size, num_replicas=None, rank=None, last_iter=- 1)[source]

Initialize self. See help(type(self)) for accurate signature.

set_uniform_indices(labels, num_classes)[source]
gen_new_list()[source]
set_epoch(epoch)[source]
class easycv.datasets.loader.DistributedMPSampler(dataset, num_replicas=None, rank=None, shuffle=True, split_huge_listfile_byrank=False, **kwargs)[source]

Bases: torch.utils.data.sampler.Sampler[torch.utils.data.distributed.T_co]

__init__(dataset, num_replicas=None, rank=None, shuffle=True, split_huge_listfile_byrank=False, **kwargs)[source]

A distributed sampler which supports sampling m instances from one class at a time for classification datasets.

dataset: pytorch dataset object
num_replicas (optional): Number of processes participating in distributed training.
rank (optional): Rank of the current process within num_replicas.
shuffle (optional): If true (default), the sampler will shuffle the indices.
split_huge_listfile_byrank: If the list file has been split per rank before building the dataset in distributed training, return all indices for each rank.

generate_indice()[source]
get_label_dict()[source]
calculate_this_label_list()[source]
class easycv.datasets.loader.RASampler(dataset, num_replicas=None, rank=None, shuffle=True, num_repeats: int = 3, **kwargs)[source]

Bases: Generic[torch.utils.data.sampler.T_co]

Sampler that restricts data loading to a subset of the dataset for distributed training, with repeated augmentation. It ensures that each augmented version of a sample is visible to a different process (GPU). Heavily based on torch.utils.data.DistributedSampler.

__init__(dataset, num_replicas=None, rank=None, shuffle=True, num_repeats: int = 3, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

set_epoch(epoch)[source]
class easycv.datasets.loader.DistributedSampler(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, replace=False, split_huge_listfile_byrank=False)[source]

Bases: torch.utils.data.sampler.Sampler[torch.utils.data.distributed.T_co]

__init__(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, replace=False, split_huge_listfile_byrank=False)[source]

A distributed sampler which supports sampling m instances from one class at a time for classification datasets. :param dataset: pytorch dataset object :param num_replicas: Number of processes participating in distributed training.

Parameters
  • rank (optional) – Rank of the current process within num_replicas.

  • shuffle (optional) – If true (default), sampler will shuffle the indices

  • seed (int, Optional) – The seed. Default to 0.

  • split_huge_listfile_byrank – If the list file has been split per rank before building the dataset in distributed training, return all indices for each rank.

generate_new_list()[source]
set_uniform_indices(labels, num_classes)[source]
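A usage sketch (hypothetical setup) pairing DistributedSampler with a vanilla DataLoader; in practice build_dataloader performs this wiring when dist=True:

from torch.utils.data import DataLoader
from easycv.datasets.loader import DistributedSampler

# assumes torch.distributed has already been initialized and `dataset` exists
sampler = DistributedSampler(dataset, shuffle=True, seed=0)
loader = DataLoader(dataset, batch_size=16, sampler=sampler, num_workers=4)

for epoch in range(10):
    if hasattr(sampler, 'set_epoch'):
        sampler.set_epoch(epoch)  # keep shuffling consistent across ranks per epoch
    for batch in loader:
        pass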
Submodules
easycv.datasets.loader.build_loader module
easycv.datasets.loader.build_loader.build_dataloader(dataset, imgs_per_gpu, workers_per_gpu, num_gpus=1, dist=True, shuffle=True, replace=False, seed=None, reuse_worker_cache=False, odps_config=None, persistent_workers=False, collate_hooks=None, use_repeated_augment_sampler=False, sampler=None, pin_memory=False, **kwargs)[source]

Build PyTorch DataLoader. In distributed training, each GPU/process has a dataloader. In non-distributed training, there is only one dataloader for all GPUs. :param dataset: A PyTorch dataset. :type dataset: Dataset :param imgs_per_gpu: Number of images on each GPU, i.e., batch size of each GPU.

Parameters
  • workers_per_gpu (int) – How many subprocesses to use for data loading for each GPU.

  • num_gpus (int) – Number of GPUs. Only used in non-distributed training.

  • dist (bool) – Distributed training/test or not. Default: True.

  • shuffle (bool) – Whether to shuffle the data at every epoch. Default: True.

  • replace (bool) – Whether to sample with replacement in random shuffle. It only takes effect when shuffle is True.

  • seed (int, Optional) – The seed. Default to None.

  • reuse_worker_cache (bool) – If set true, will reuse worker process so that cached data in worker process can be reused.

  • persistent_workers (bool) – Available since PyTorch 1.7. If True, the DataLoader keeps its worker processes alive between epochs instead of recreating them, which speeds up the start of each epoch.

  • use_repeated_augment_sampler (bool) – If set true, it will use RASampler. Default: False.

  • kwargs – any keyword argument to be used to initialize DataLoader

Returns

A PyTorch dataloader.

Return type

DataLoader

easycv.datasets.loader.build_loader.worker_init_fn(worker_id, num_workers, rank, seed, odps_config=None)[source]
class easycv.datasets.loader.build_loader.InfiniteDataLoader(*args, **kwargs)[source]

Bases: Generic[torch.utils.data.dataloader.T_co]

Dataloader that reuses workers. https://github.com/pytorch/pytorch/issues/15849 Uses same syntax as vanilla DataLoader.

__init__(*args, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

dataset: torch.utils.data.dataset.Dataset[torch.utils.data.dataloader.T_co]
batch_size: Optional[int]
num_workers: int
pin_memory: bool
drop_last: bool
timeout: float
sampler: Union[torch.utils.data.sampler.Sampler, Iterable]
pin_memory_device: str
prefetch_factor: int
easycv.datasets.loader.sampler module
class easycv.datasets.loader.sampler.DistributedMPSampler(dataset, num_replicas=None, rank=None, shuffle=True, split_huge_listfile_byrank=False, **kwargs)[source]

Bases: torch.utils.data.sampler.Sampler[torch.utils.data.distributed.T_co]

__init__(dataset, num_replicas=None, rank=None, shuffle=True, split_huge_listfile_byrank=False, **kwargs)[source]

A distributed sampler which supports sampling m instances from one class at a time for classification datasets.

dataset: pytorch dataset object
num_replicas (optional): Number of processes participating in distributed training.
rank (optional): Rank of the current process within num_replicas.
shuffle (optional): If true (default), the sampler will shuffle the indices.
split_huge_listfile_byrank: If the list file has been split per rank before building the dataset in distributed training, return all indices for each rank.

generate_indice()[source]
get_label_dict()[source]
calculate_this_label_list()[source]
class easycv.datasets.loader.sampler.DistributedSampler(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, replace=False, split_huge_listfile_byrank=False)[source]

Bases: torch.utils.data.sampler.Sampler[torch.utils.data.distributed.T_co]

__init__(dataset, num_replicas=None, rank=None, shuffle=True, seed=0, replace=False, split_huge_listfile_byrank=False)[source]

A distributed sampler which supports sampling m instances from one class at a time for classification datasets. :param dataset: pytorch dataset object :param num_replicas: Number of processes participating in distributed training.

Parameters
  • rank (optional) – Rank of the current process within num_replicas.

  • shuffle (optional) – If true (default), sampler will shuffle the indices

  • seed (int, Optional) – The seed. Default to 0.

  • split_huge_listfile_byrank – If the list file has been split per rank before building the dataset in distributed training, return all indices for each rank.

generate_new_list()[source]
set_uniform_indices(labels, num_classes)[source]
class easycv.datasets.loader.sampler.GroupSampler(dataset, samples_per_gpu=1)[source]

Bases: Generic[torch.utils.data.sampler.T_co]

__init__(dataset, samples_per_gpu=1)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.loader.sampler.DistributedGroupSampler(dataset, samples_per_gpu=1, seed=0, num_replicas=None, rank=None)[source]

Bases: Generic[torch.utils.data.sampler.T_co]

Sampler that restricts data loading to a subset of the dataset. It is especially useful in conjunction with torch.nn.parallel.DistributedDataParallel. In such case, each process can pass a DistributedSampler instance as a DataLoader sampler, and load a subset of the original dataset that is exclusive to it.

Note: Dataset is assumed to be of constant size.
Parameters
  • dataset – Dataset used for sampling.

  • seed (int, Optional) – The seed. Default to 0.

  • num_replicas (optional) – Number of processes participating in distributed training.

  • rank (optional) – Rank of the current process within num_replicas.

__init__(dataset, samples_per_gpu=1, seed=0, num_replicas=None, rank=None)[source]

Initialize self. See help(type(self)) for accurate signature.

set_epoch(epoch)[source]
class easycv.datasets.loader.sampler.DistributedGivenIterationSampler(dataset, total_iter, batch_size, num_replicas=None, rank=None, last_iter=- 1)[source]

Bases: Generic[torch.utils.data.sampler.T_co]

__init__(dataset, total_iter, batch_size, num_replicas=None, rank=None, last_iter=- 1)[source]

Initialize self. See help(type(self)) for accurate signature.

set_uniform_indices(labels, num_classes)[source]
gen_new_list()[source]
set_epoch(epoch)[source]
class easycv.datasets.loader.sampler.RASampler(dataset, num_replicas=None, rank=None, shuffle=True, num_repeats: int = 3, **kwargs)[source]

Bases: Generic[torch.utils.data.sampler.T_co]

Sampler that restricts data loading to a subset of the dataset for distributed training, with repeated augmentation. It ensures that each augmented version of a sample is visible to a different process (GPU). Heavily based on torch.utils.data.DistributedSampler.

__init__(dataset, num_replicas=None, rank=None, shuffle=True, num_repeats: int = 3, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

set_epoch(epoch)[source]

easycv.datasets.pose package

class easycv.datasets.pose.PoseTopDownDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

PoseTopDownDataset dataset for top-down pose estimation. The dataset loads raw features and applies the specified transforms to return a dict containing the image tensors and other information.

Parameters
  • data_source – Data_source config dict

  • pipeline – Pipeline config list

  • profiling – If set True, will print pipeline time

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(outputs, evaluators, **kwargs)[source]
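A condensed config sketch (hypothetical paths; the data_cfg keys shown are only illustrative) of how a top-down COCO source plugs into PoseTopDownDataset:

data_cfg = dict(image_size=[192, 256], heatmap_size=[48, 64], num_joints=17)
train = dict(
    type='PoseTopDownDataset',
    data_source=dict(
        type='PoseTopDownSourceCoco',
        ann_file='data/coco/annotations/person_keypoints_train2017.json',
        img_prefix='data/coco/train2017/',
        data_cfg=data_cfg),
    pipeline=train_pipeline)  # pose transform config list, defined elsewhere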
class easycv.datasets.pose.HandCocoWholeBodyDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

CocoWholeBodyDataset for top-down hand pose estimation.

Parameters
  • data_source – Data_source config dict

  • pipeline – Pipeline config list

  • profiling – If set True, will print pipeline time

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(outputs, evaluators, **kwargs)[source]
class easycv.datasets.pose.WholeBodyCocoTopDownDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

CocoWholeBodyDataset dataset for top-down pose estimation.

Parameters
  • data_source – Data_source config dict

  • pipeline – Pipeline config list

  • profiling – If set True, will print pipeline time

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(outputs, evaluators, **kwargs)[source]
Subpackages
easycv.datasets.pose.data_sources package
class easycv.datasets.pose.data_sources.PoseTopDownSourceCoco(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]

Bases: easycv.datasets.pose.data_sources.top_down.PoseTopDownSource

CocoSource for top-down pose estimation.

‘Microsoft COCO: Common Objects in Context’, ECCV 2014. More details can be found in the paper.

The source loads raw features to build a data meta object containing the image info, annotation info and others.

COCO keypoint indexes:

0: 'nose',
1: 'left_eye',
2: 'right_eye',
3: 'left_ear',
4: 'right_ear',
5: 'left_shoulder',
6: 'right_shoulder',
7: 'left_elbow',
8: 'right_elbow',
9: 'left_wrist',
10: 'right_wrist',
11: 'left_hip',
12: 'right_hip',
13: 'left_knee',
14: 'right_knee',
15: 'left_ankle',
16: 'right_ankle'
Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held. Default: None.

  • data_cfg (dict) – config

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.data_sources.PoseTopDownSource(ann_file, img_prefix, data_cfg, dataset_info, coco_style=True, test_mode=False)[source]

Bases: object

Class for keypoint 2D top-down pose estimation with single-view RGB image as the data source.

Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held. Default: None.

  • data_cfg (dict) – config

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • coco_style (bool) – Whether the annotation json is coco-style. Default: True

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(ann_file, img_prefix, data_cfg, dataset_info, coco_style=True, test_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

load_image(image_file)[source]
class easycv.datasets.pose.data_sources.HandCocoPoseTopDownSource(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]

Bases: easycv.datasets.pose.data_sources.top_down.PoseTopDownSource

Coco Whole-Body-Hand Source for top-down hand pose estimation.

“Whole-Body Human Pose Estimation in the Wild”, ECCV’2020. More details can be found in the paper .

The dataset loads raw features and applies the specified transforms to return a dict containing the image tensors and other information.

COCO-WholeBody Hand keypoint indexes:

0: 'wrist',
1: 'thumb1',
2: 'thumb2',
3: 'thumb3',
4: 'thumb4',
5: 'forefinger1',
6: 'forefinger2',
7: 'forefinger3',
8: 'forefinger4',
9: 'middle_finger1',
10: 'middle_finger2',
11: 'middle_finger3',
12: 'middle_finger4',
13: 'ring_finger1',
14: 'ring_finger2',
15: 'ring_finger3',
16: 'ring_finger4',
17: 'pinky_finger1',
18: 'pinky_finger2',
19: 'pinky_finger3',
20: 'pinky_finger4'
Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held. Default: None.

  • data_cfg (dict) – config

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.data_sources.WholeBodyCocoTopDownSource(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]

Bases: easycv.datasets.pose.data_sources.top_down.PoseTopDownSource

CocoWholeBodyDataset dataset for top-down pose estimation.

“Whole-Body Human Pose Estimation in the Wild”, ECCV’2020. More details can be found in the paper .

The dataset loads raw features and applies the specified transforms to return a dict containing the image tensors and other information.

COCO-WholeBody keypoint indexes:

0-16: 17 body keypoints,
17-22: 6 foot keypoints,
23-90: 68 face keypoints,
91-132: 42 hand keypoints

In total, we have 133 keypoints for wholebody pose estimation.
Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held. Default: None.

  • data_cfg (dict) – config

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.data_sources.PoseTopDownSourceCoco2017(data_cfg, path='', download=True, split='train', dataset_info=None, test_mode=False)[source]

Bases: easycv.datasets.pose.data_sources.coco.PoseTopDownSourceCoco

Parameters
  • path – target dir

  • download – whether download

  • split – train or val

  • data_cfg (dict) – config

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(data_cfg, path='', download=True, split='train', dataset_info=None, test_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.data_sources.PoseTopDownSourceCrowdPose(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False, **kwargs)[source]

Bases: easycv.datasets.pose.data_sources.top_down.PoseTopDownSource

CrowdPose keypoint indexes:

0: 'left_shoulder',
1: 'right_shoulder',
2: 'left_elbow',
3: 'right_elbow',
4: 'left_wrist',
5: 'right_wrist',
6: 'left_hip',
7: 'right_hip',
8: 'left_knee',
9: 'right_knee',
10: 'left_ankle',
11: 'right_ankle',
12: 'head',
13: 'neck'

Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held. Default: None.

  • data_cfg (dict) – config

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.data_sources.PoseTopDownSourceChHuman(ann_file, img_prefix, data_cfg, subset=None, dataset_info=None, test_mode=False, **kwargs)[source]

Bases: easycv.datasets.pose.data_sources.top_down.PoseTopDownSource

OCHuman Source for top-down pose estimation.

“Pose2Seg: Detection Free Human Instance Segmentation”, CVPR’2019. More details can be found in the paper.

The source loads raw features to build a data meta object containing the image info, annotation info and others.

OCHuman keypoint indexes:

0: 'nose',
1: 'left_eye',
2: 'right_eye',
3: 'left_ear',
4: 'right_ear',
5: 'left_shoulder',
6: 'right_shoulder',
7: 'left_elbow',
8: 'right_elbow',
9: 'left_wrist',
10: 'right_wrist',
11: 'left_hip',
12: 'right_hip',
13: 'left_knee',
14: 'right_knee',
15: 'left_ankle',
16: 'right_ankle'
Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held. Default: None.

  • data_cfg (dict) – config

  • subset – Distinguishes annotation styles: set subset to train, val or test for non-coco-style datasets; leave subset as None for coco-style datasets.

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(ann_file, img_prefix, data_cfg, subset=None, dataset_info=None, test_mode=False, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.data_sources.PoseTopDownSourceMpii(data_cfg, path='/home/docs/.cache/easycv/', download=False, dataset_info=None, test_mode=False, **kwargs)[source]

Bases: easycv.datasets.pose.data_sources.top_down.PoseTopDownSource

MPII Source for top-down pose estimation.

“2D Human Pose Estimation: New Benchmark and State of the Art Analysis”, CVPR’2014. More details can be found in the paper.

The source loads raw features to build a data meta object containing the image info, annotation info and others.

MPII keypoint indexes:

0: 'right_ankle',
1: 'right_knee',
2: 'right_hip',
3: 'left_hip',
4: 'left_knee',
5: 'left_ankle',
6: 'pelvis',
7: 'thorax',
8: 'neck',
9: 'head',
10: 'right_wrist',
11: 'right_elbow',
12: 'right_shoulder',
13: 'left_shoulder',
14: 'left_elbow',
15: 'left_wrist'
Parameters
  • data_cfg (dict) – config

  • path – This parameter is optional. If download is True and path is not provided, a temporary directory is automatically created for downloading.

  • download – If True, the file is automatically downloaded to the path directory. If False, automatic download is not performed and the data already present in path is used.

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(data_cfg, path='/home/docs/.cache/easycv/', download=False, dataset_info=None, test_mode=False, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

load_points_bbox(scale, objpos, points)[source]
deal_annolist(num_list, char)[source]
download()[source]
Submodules
easycv.datasets.pose.data_sources.coco module
class easycv.datasets.pose.data_sources.coco.PoseTopDownSourceCoco(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]

Bases: easycv.datasets.pose.data_sources.top_down.PoseTopDownSource

CocoSource for top-down pose estimation.

“Microsoft COCO: Common Objects in Context”, ECCV’2014. More details can be found in the paper.

The source loads raw features to build a data meta object containing the image info, annotation info and others.

COCO keypoint indexes:

0: 'nose',
1: 'left_eye',
2: 'right_eye',
3: 'left_ear',
4: 'right_ear',
5: 'left_shoulder',
6: 'right_shoulder',
7: 'left_elbow',
8: 'right_elbow',
9: 'left_wrist',
10: 'right_wrist',
11: 'left_hip',
12: 'right_hip',
13: 'left_knee',
14: 'right_knee',
15: 'left_ankle',
16: 'right_ankle'
Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held. Default: None.

  • data_cfg (dict) – config

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(ann_file, img_prefix, data_cfg, dataset_info=None, test_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.data_sources.coco.PoseTopDownSourceCoco2017(data_cfg, path='', download=True, split='train', dataset_info=None, test_mode=False)[source]

Bases: easycv.datasets.pose.data_sources.coco.PoseTopDownSourceCoco

Parameters
  • path – target dir

  • download – whether download

  • split – train or val

  • data_cfg (dict) – config

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(data_cfg, path='', download=True, split='train', dataset_info=None, test_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.pose.data_sources.top_down module
class easycv.datasets.pose.data_sources.top_down.DatasetInfo(dataset_info)[source]

Bases: object

__init__(dataset_info)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.data_sources.top_down.PoseTopDownSource(ann_file, img_prefix, data_cfg, dataset_info, coco_style=True, test_mode=False)[source]

Bases: object

Class for keypoint 2D top-down pose estimation with single-view RGB image as the data source.

Parameters
  • ann_file (str) – Path to the annotation file.

  • img_prefix (str) – Path to a directory where images are held. Default: None.

  • data_cfg (dict) – config

  • dataset_info (DatasetInfo) – A class containing all dataset info.

  • coco_style (bool) – Whether the annotation json is coco-style. Default: True

  • test_mode (bool) – Store True when building test or validation dataset. Default: False.

__init__(ann_file, img_prefix, data_cfg, dataset_info, coco_style=True, test_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

load_image(image_file)[source]
easycv.datasets.pose.pipelines package
class easycv.datasets.pose.pipelines.PoseCollect(keys, meta_keys, meta_name='img_metas')[source]

Bases: object

Collect data from the loader relevant to the specific task.

This keeps the items in keys as they are, and collects items in meta_keys into a meta item called meta_name. This is usually the last stage of the data loader pipeline. For example, when keys=’imgs’, meta_keys=(‘filename’, ‘label’, ‘original_shape’), meta_name=’img_metas’, the results will be a dict with keys ‘imgs’ and ‘img_metas’, where ‘img_metas’ is a DataContainer of another dict with keys ‘filename’, ‘label’, ‘original_shape’.

Parameters
  • keys (Sequence[str|tuple]) – Required keys to be collected. If a tuple (key, key_new) is given as an element, the item retrieved by key will be renamed as key_new in collected data.

  • meta_name (str) – The name of the key that contains meta information. This key is always populated. Default: “img_metas”.

  • meta_keys (Sequence[str|tuple]) – Keys that are collected under meta_name. The contents of the meta_name dictionary depends on meta_keys.

__init__(keys, meta_keys, meta_name='img_metas')[source]

Initialize self. See help(type(self)) for accurate signature.
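A hedged example of a PoseCollect step as it might appear at the end of a pose pipeline; the exact keys and meta_keys depend on the preceding transforms and are only illustrative here.

pose_collect = dict(
    type='PoseCollect',
    keys=['img', 'target', 'target_weight'],
    meta_keys=[
        'image_file', 'joints_3d', 'joints_3d_visible',
        'center', 'scale', 'rotation', 'flip_pairs',
    ],
)
# The collected result is a dict with keys 'img', 'target', 'target_weight' and
# 'img_metas', where 'img_metas' is a DataContainer holding the meta_keys above.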

class easycv.datasets.pose.pipelines.TopDownRandomFlip(flip_prob=0.5)[source]

Bases: object

Data augmentation with random image flip.

Required keys: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘center’ and ‘ann_info’. Modifies key: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘center’ and ‘flipped’.

Parameters
  • flip (bool) – Option to perform random flip.

  • flip_prob (float) – Probability of flip.

__init__(flip_prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.TopDownHalfBodyTransform(num_joints_half_body=8, prob_half_body=0.3)[source]

Bases: object

Data augmentation with half-body transform. Keep only the upper body or the lower body at random.

Required keys: ‘joints_3d’, ‘joints_3d_visible’, and ‘ann_info’. Modifies key: ‘scale’ and ‘center’.

Parameters
  • num_joints_half_body (int) – Threshold of performing half-body transform. If the body has fewer number of joints (< num_joints_half_body), ignore this step.

  • prob_half_body (float) – Probability of half-body transform.

__init__(num_joints_half_body=8, prob_half_body=0.3)[source]

Initialize self. See help(type(self)) for accurate signature.

static half_body_transform(cfg, joints_3d, joints_3d_visible)[source]

Get center&scale for half-body transform.

class easycv.datasets.pose.pipelines.TopDownGetRandomScaleRotation(rot_factor=40, scale_factor=0.5, rot_prob=0.6)[source]

Bases: object

Data augmentation with random scaling & rotating.

Required key: ‘scale’. Modifies key: ‘scale’ and ‘rotation’.

Parameters
  • rot_factor (int) – Rotating to [-2*rot_factor, 2*rot_factor].

  • scale_factor (float) – Scaling to [1-scale_factor, 1+scale_factor].

  • rot_prob (float) – Probability of random rotation.

__init__(rot_factor=40, scale_factor=0.5, rot_prob=0.6)[source]

Initialize self. See help(type(self)) for accurate signature.
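For intuition, a minimal numpy sketch of the sampling described above: the scale factor is drawn from [1 - scale_factor, 1 + scale_factor] and the rotation from [-2*rot_factor, 2*rot_factor] with probability rot_prob. It assumes the common clipped-Gaussian sampling used by MSRA-style pipelines and is not a verbatim copy of the EasyCV implementation.

import numpy as np

def sample_scale_rotation(scale, rot_factor=40, scale_factor=0.5, rot_prob=0.6):
    # scale: bbox scale produced by e.g. TopDownGetBboxCenterScale
    s = scale * np.clip(np.random.randn() * scale_factor + 1,
                        1 - scale_factor, 1 + scale_factor)
    if np.random.rand() <= rot_prob:
        r = np.clip(np.random.randn() * rot_factor, -2 * rot_factor, 2 * rot_factor)
    else:
        r = 0.0
    return s, r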

class easycv.datasets.pose.pipelines.TopDownAffine(use_udp=False)[source]

Bases: object

Affine transform the image to make input.

Required keys:’img’, ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’,’scale’, ‘rotation’ and ‘center’. Modified keys:’img’, ‘joints_3d’, and ‘joints_3d_visible’.

Parameters

use_udp (bool) – To use unbiased data processing. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

__init__(use_udp=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.TopDownGenerateTarget(sigma=2, kernel=(11, 11), valid_radius_factor=0.0546875, target_type='GaussianHeatmap', encoding='MSRA', unbiased_encoding=False)[source]

Bases: object

Generate the target heatmap.

Required keys: ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’. Modified keys: ‘target’, and ‘target_weight’.

Parameters
  • sigma – Sigma of heatmap gaussian for ‘MSRA’ approach.

  • kernel – Kernel of heatmap gaussian for ‘Megvii’ approach.

  • encoding (str) – Approach to generate target heatmaps. Currently supported approaches: ‘MSRA’, ‘Megvii’, ‘UDP’. Default:’MSRA’

  • unbiased_encoding (bool) – Option to use unbiased encoding methods. Paper ref: Zhang et al. Distribution-Aware Coordinate Representation for Human Pose Estimation (CVPR 2020).

  • keypoint_pose_distance – Keypoint pose distance for UDP. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

  • target_type (str) – supported targets: ‘GaussianHeatmap’, ‘CombinedTarget’. Default:’GaussianHeatmap’ CombinedTarget: The combination of classification target (response map) and regression target (offset map). Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

__init__(sigma=2, kernel=(11, 11), valid_radius_factor=0.0546875, target_type='GaussianHeatmap', encoding='MSRA', unbiased_encoding=False)[source]

Initialize self. See help(type(self)) for accurate signature.
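For intuition, a minimal numpy sketch of the ‘MSRA’ GaussianHeatmap encoding for a single visible keypoint already expressed in heatmap coordinates; it only illustrates the idea and omits the joint-weight and visibility handling of the real transform.

import numpy as np

def gaussian_heatmap(joint_xy, heatmap_size=(48, 64), sigma=2):
    # joint_xy: keypoint location (x, y) in heatmap coordinates
    w, h = heatmap_size
    xs = np.arange(w, dtype=np.float32)[None, :]
    ys = np.arange(h, dtype=np.float32)[:, None]
    x0, y0 = joint_xy
    # 2D Gaussian with peak value 1.0 at the keypoint location
    return np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2 * sigma ** 2))

target = gaussian_heatmap((20, 30))  # shape (64, 48); target[30, 20] == 1.0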

class easycv.datasets.pose.pipelines.TopDownGenerateTargetRegression[source]

Bases: object

Generate the target regression vector (coordinates).

Required keys: ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’. Modified keys: ‘target’, and ‘target_weight’.

__init__()[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.TopDownRandomTranslation(trans_factor=0.15, trans_prob=1.0)[source]

Bases: object

Data augmentation with random translation.

Required key: ‘scale’ and ‘center’. Modifies key: ‘center’.

Notes

bbox height: H bbox width: W

Parameters
  • trans_factor (float) – Translating center to [-trans_factor, trans_factor] * [W, H] + center.

  • trans_prob (float) – Probability of random translation.

__init__(trans_factor=0.15, trans_prob=1.0)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.TopDownRandomShiftBboxCenter(shift_factor: float = 0.16, prob: float = 0.3)[source]

Bases: object

Random shift the bbox center.

Required key: ‘center’, ‘scale’

Modifies key: ‘center’

Parameters
  • shift_factor (float) – The factor to control the shift range, which is scale*pixel_std*scale_factor. Default: 0.16

  • prob (float) – Probability of applying random shift. Default: 0.3

pixel_std: float = 200.0
__init__(shift_factor: float = 0.16, prob: float = 0.3)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.TopDownGetBboxCenterScale(padding: float = 1.25)[source]

Bases: object

Convert bbox from [x, y, w, h] to center and scale.

The center is the coordinates of the bbox center, and the scale is the bbox width and height normalized by a scale factor.

Required key: ‘bbox’, ‘ann_info’

Modifies key: ‘center’, ‘scale’

Parameters

padding (float) – bbox padding scale that will be multiplied to scale. Default: 1.25

pixel_std: float = 200.0
__init__(padding: float = 1.25)[source]

Initialize self. See help(type(self)) for accurate signature.

Submodules
easycv.datasets.pose.pipelines.transforms module
class easycv.datasets.pose.pipelines.transforms.PoseCollect(keys, meta_keys, meta_name='img_metas')[source]

Bases: object

Collect data from the loader relevant to the specific task.

This keeps the items in keys as they are, and collects items in meta_keys into a meta item called meta_name. This is usually the last stage of the data loader pipeline. For example, when keys=’imgs’, meta_keys=(‘filename’, ‘label’, ‘original_shape’), meta_name=’img_metas’, the results will be a dict with keys ‘imgs’ and ‘img_metas’, where ‘img_metas’ is a DataContainer of another dict with keys ‘filename’, ‘label’, ‘original_shape’.

Parameters
  • keys (Sequence[str|tuple]) – Required keys to be collected. If a tuple (key, key_new) is given as an element, the item retrieved by key will be renamed as key_new in collected data.

  • meta_name (str) – The name of the key that contains meta information. This key is always populated. Default: “img_metas”.

  • meta_keys (Sequence[str|tuple]) – Keys that are collected under meta_name. The contents of the meta_name dictionary depends on meta_keys.

__init__(keys, meta_keys, meta_name='img_metas')[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.transforms.TopDownRandomFlip(flip_prob=0.5)[source]

Bases: object

Data augmentation with random image flip.

Required keys: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘center’ and ‘ann_info’. Modifies key: ‘img’, ‘joints_3d’, ‘joints_3d_visible’, ‘center’ and ‘flipped’.

Parameters
  • flip (bool) – Option to perform random flip.

  • flip_prob (float) – Probability of flip.

__init__(flip_prob=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.transforms.TopDownHalfBodyTransform(num_joints_half_body=8, prob_half_body=0.3)[source]

Bases: object

Data augmentation with half-body transform. Keep only the upper body or the lower body at random.

Required keys: ‘joints_3d’, ‘joints_3d_visible’, and ‘ann_info’. Modifies key: ‘scale’ and ‘center’.

Parameters
  • num_joints_half_body (int) – Threshold of performing half-body transform. If the body has fewer number of joints (< num_joints_half_body), ignore this step.

  • prob_half_body (float) – Probability of half-body transform.

__init__(num_joints_half_body=8, prob_half_body=0.3)[source]

Initialize self. See help(type(self)) for accurate signature.

static half_body_transform(cfg, joints_3d, joints_3d_visible)[source]

Get center&scale for half-body transform.

class easycv.datasets.pose.pipelines.transforms.TopDownGetRandomScaleRotation(rot_factor=40, scale_factor=0.5, rot_prob=0.6)[source]

Bases: object

Data augmentation with random scaling & rotating.

Required key: ‘scale’. Modifies key: ‘scale’ and ‘rotation’.

Parameters
  • rot_factor (int) – Rotating to [-2*rot_factor, 2*rot_factor].

  • scale_factor (float) – Scaling to [1-scale_factor, 1+scale_factor].

  • rot_prob (float) – Probability of random rotation.

__init__(rot_factor=40, scale_factor=0.5, rot_prob=0.6)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.transforms.TopDownAffine(use_udp=False)[source]

Bases: object

Affine transform the image to make input.

Required keys:’img’, ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’,’scale’, ‘rotation’ and ‘center’. Modified keys:’img’, ‘joints_3d’, and ‘joints_3d_visible’.

Parameters

use_udp (bool) – To use unbiased data processing. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

__init__(use_udp=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.transforms.TopDownGenerateTarget(sigma=2, kernel=(11, 11), valid_radius_factor=0.0546875, target_type='GaussianHeatmap', encoding='MSRA', unbiased_encoding=False)[source]

Bases: object

Generate the target heatmap.

Required keys: ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’. Modified keys: ‘target’, and ‘target_weight’.

Parameters
  • sigma – Sigma of heatmap gaussian for ‘MSRA’ approach.

  • kernel – Kernel of heatmap gaussian for ‘Megvii’ approach.

  • encoding (str) – Approach to generate target heatmaps. Currently supported approaches: ‘MSRA’, ‘Megvii’, ‘UDP’. Default:’MSRA’

  • unbiased_encoding (bool) – Option to use unbiased encoding methods. Paper ref: Zhang et al. Distribution-Aware Coordinate Representation for Human Pose Estimation (CVPR 2020).

  • keypoint_pose_distance – Keypoint pose distance for UDP. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

  • target_type (str) – supported targets: ‘GaussianHeatmap’, ‘CombinedTarget’. Default:’GaussianHeatmap’ CombinedTarget: The combination of classification target (response map) and regression target (offset map). Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

__init__(sigma=2, kernel=(11, 11), valid_radius_factor=0.0546875, target_type='GaussianHeatmap', encoding='MSRA', unbiased_encoding=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.transforms.TopDownGenerateTargetRegression[source]

Bases: object

Generate the target regression vector (coordinates).

Required keys: ‘joints_3d’, ‘joints_3d_visible’, ‘ann_info’. Modified keys: ‘target’, and ‘target_weight’.

__init__()[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.transforms.TopDownRandomTranslation(trans_factor=0.15, trans_prob=1.0)[source]

Bases: object

Data augmentation with random translation.

Required key: ‘scale’ and ‘center’. Modifies key: ‘center’.

Notes

bbox height: H bbox width: W

Parameters
  • trans_factor (float) – Translating center to [-trans_factor, trans_factor] * [W, H] + center.

  • trans_prob (float) – Probability of random translation.

__init__(trans_factor=0.15, trans_prob=1.0)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.pose.pipelines.transforms.bbox_xywh2cs(bbox, aspect_ratio, padding=1.0, pixel_std=200.0)[source]

Transform the bbox format from (x,y,w,h) into (center, scale)

Parameters
  • bbox (ndarray) – Single bbox in (x, y, w, h)

  • aspect_ratio (float) – The expected bbox aspect ratio (w over h)

  • padding (float) – Bbox padding factor that will be multiplied to scale. Default: 1.0

  • pixel_std (float) – The scale normalization factor. Default: 200.0

Returns

A tuple containing center and scale.

  • np.ndarray[float32](2,): Center of the bbox (x, y).

  • np.ndarray[float32](2,): Scale of the bbox w & h.

Return type

tuple
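A sketch of the conversion under the usual top-down convention: pad the shorter bbox side to match the expected aspect ratio, then normalize by pixel_std and multiply by padding. The exact tie-breaking in EasyCV may differ slightly.

import numpy as np

def xywh_to_center_scale(bbox, aspect_ratio, padding=1.0, pixel_std=200.0):
    x, y, w, h = bbox
    center = np.array([x + w * 0.5, y + h * 0.5], dtype=np.float32)
    # expand the shorter side so the box matches the expected aspect ratio (w over h)
    if w > aspect_ratio * h:
        h = w / aspect_ratio
    else:
        w = h * aspect_ratio
    scale = np.array([w, h], dtype=np.float32) / pixel_std * padding
    return center, scale

center, scale = xywh_to_center_scale((50, 80, 100, 300), aspect_ratio=0.75)
# center -> [100., 230.], scale -> [1.125, 1.5]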

easycv.datasets.pose.pipelines.transforms.bbox_cs2xyxy(center, scale, padding=1.0, pixel_std=200.0)[source]
class easycv.datasets.pose.pipelines.transforms.TopDownGetBboxCenterScale(padding: float = 1.25)[source]

Bases: object

Convert bbox from [x, y, w, h] to center and scale.

The center is the coordinates of the bbox center, and the scale is the bbox width and height normalized by a scale factor.

Required key: ‘bbox’, ‘ann_info’

Modifies key: ‘center’, ‘scale’

Parameters

padding (float) – bbox padding scale that will be multiplied to scale. Default: 1.25

pixel_std: float = 200.0
__init__(padding: float = 1.25)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.pose.pipelines.transforms.TopDownRandomShiftBboxCenter(shift_factor: float = 0.16, prob: float = 0.3)[source]

Bases: object

Random shift the bbox center.

Required key: ‘center’, ‘scale’

Modifies key: ‘center’

Parameters
  • shift_factor (float) – The factor to control the shift range, which is scale*pixel_std*scale_factor. Default: 0.16

  • prob (float) – Probability of applying random shift. Default: 0.3

pixel_std: float = 200.0
__init__(shift_factor: float = 0.16, prob: float = 0.3)[source]

Initialize self. See help(type(self)) for accurate signature.

Submodules
easycv.datasets.pose.top_down module
class easycv.datasets.pose.top_down.PoseTopDownDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

PoseTopDownDataset dataset for top-down pose estimation. The dataset loads raw features and applies specified transforms to return a dict containing the image tensors and other information. A configuration sketch follows this class entry.

Parameters
  • data_source – Data_source config dict

  • pipeline – Pipeline config list

  • profiling – If set True, will print pipeline time

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(outputs, evaluators, **kwargs)[source]
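A hedged configuration sketch showing how a PoseTopDownDataset could be assembled from a data source and a pipeline; the paths, the data_cfg fields and the pipeline choice are illustrative, and the classes are assumed to be registered under their class names.

train_pipeline = [
    dict(type='TopDownRandomFlip', flip_prob=0.5),
    dict(type='TopDownGetRandomScaleRotation', rot_factor=40, scale_factor=0.5),
    dict(type='TopDownAffine'),
    dict(type='TopDownGenerateTarget', sigma=2),
    dict(type='PoseCollect',
         keys=['img', 'target', 'target_weight'],
         meta_keys=['image_file', 'center', 'scale', 'rotation']),
]

train_dataset_cfg = dict(
    type='PoseTopDownDataset',
    data_source=dict(
        type='PoseTopDownSourceCoco',
        ann_file='data/coco/annotations/person_keypoints_train2017.json',  # illustrative
        img_prefix='data/coco/train2017/',
        data_cfg=dict(image_size=[192, 256], heatmap_size=[48, 64], num_joints=17),  # assumed fields
    ),
    pipeline=train_pipeline,
)
# dataset = easycv.datasets.builder.build_dataset(train_dataset_cfg)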

easycv.datasets.selfsup package

Subpackages
easycv.datasets.selfsup.data_sources package
class easycv.datasets.selfsup.data_sources.SSLSourceImageList(list_file, root='', max_try=20)[source]

Bases: object

datasource for classification

Parameters
  • list_file – str / list(str). A str is the path of an image list file, where each line is a record of the form image_path label. A list(str) gives multiple image list files, each containing records of the same form.

  • root – str / list(str), root path prepended to image_path. Each list_file needs a root; if len(root) < len(list_file), root[-1] is used to fill the remaining entries of the root list.

  • max_try – int, maximum number of attempts when reading an image.

__init__(list_file, root='', max_try=20)[source]

Initialize self. See help(type(self)) for accurate signature.

static parse_list_file(list_file, root)[source]
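An illustrative list file and data-source config; the file contents and paths are hypothetical and only show the image_path label record format described above.

# contents of a hypothetical list file, one "image_path label" record per line:
#   images/0001.jpg 0
#   images/0002.jpg 1
source_cfg = dict(
    type='SSLSourceImageList',
    list_file='data/train_list.txt',  # hypothetical path
    root='data/',                     # prepended to each image_path
    max_try=20,
)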
class easycv.datasets.selfsup.data_sources.SSLSourceImageNetFeature(root_path, training=True, data_keyword='feat1', label_keyword='label', dynamic_load=True)[source]

Bases: object

__init__(root_path, training=True, data_keyword='feat1', label_keyword='label', dynamic_load=True)[source]

Initialize self. See help(type(self)) for accurate signature.

Submodules
easycv.datasets.selfsup.data_sources.image_list module
class easycv.datasets.selfsup.data_sources.image_list.SSLSourceImageList(list_file, root='', max_try=20)[source]

Bases: object

datasource for classification

Parameters
  • list_file – str / list(str). A str is the path of an image list file, where each line is a record of the form image_path label. A list(str) gives multiple image list files, each containing records of the same form.

  • root – str / list(str), root path prepended to image_path. Each list_file needs a root; if len(root) < len(list_file), root[-1] is used to fill the remaining entries of the root list.

  • max_try – int, maximum number of attempts when reading an image.

__init__(list_file, root='', max_try=20)[source]

Initialize self. See help(type(self)) for accurate signature.

static parse_list_file(list_file, root)[source]
easycv.datasets.selfsup.data_sources.imagenet_feature module
class easycv.datasets.selfsup.data_sources.imagenet_feature.SSLSourceImageNetFeature(root_path, training=True, data_keyword='feat1', label_keyword='label', dynamic_load=True)[source]

Bases: object

__init__(root_path, training=True, data_keyword='feat1', label_keyword='label', dynamic_load=True)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.selfsup.pipelines package
class easycv.datasets.selfsup.pipelines.RandomAppliedTrans(transforms, p=0.5)[source]

Bases: object

Randomly applied transformations.

Parameters

transforms (List[Dict]) – List of transformations in dictionaries.

__init__(transforms, p=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.selfsup.pipelines.Lighting[source]

Bases: object

Lighting noise (AlexNet-style PCA-based noise)

__init__()[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.selfsup.pipelines.Solarization(threshold=128)[source]

Bases: object

__init__(threshold=128)[source]

Initialize self. See help(type(self)) for accurate signature.

Submodules
easycv.datasets.selfsup.pipelines.transforms module
class easycv.datasets.selfsup.pipelines.transforms.MAEFtAugment(input_size=None, color_jitter=None, auto_augment=None, interpolation=None, re_prob=None, re_mode=None, re_count=None, mean=None, std=None, is_train=True)[source]

Bases: object

RandAugment data augmentation method based on “RandAugment: Practical automated data augmentation with a reduced search space”. This code is borrowed from <https://github.com/pengzhiliang/MAE-pytorch>.

Parameters
  • input_size (int) – images input size

  • color_jitter (float) – Color jitter factor

  • auto_augment – Use AutoAugment policy

  • interpolation – Training interpolation

  • re_prob – Random erase prob

  • re_mode – Random erase mode

  • re_count – Random erase count

  • mean – mean used for normalization

  • std – std used for normalization

  • is_train – If True, use all augmentation strategies

__init__(input_size=None, color_jitter=None, auto_augment=None, interpolation=None, re_prob=None, re_mode=None, re_count=None, mean=None, std=None, is_train=True)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.selfsup.pipelines.transforms.RandomAppliedTrans(transforms, p=0.5)[source]

Bases: object

Randomly applied transformations.

Parameters

transforms (List[Dict]) – List of transformations in dictionaries.

__init__(transforms, p=0.5)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.selfsup.pipelines.transforms.Lighting[source]

Bases: object

Lighting noise (AlexNet-style PCA-based noise)

__init__()[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.selfsup.pipelines.transforms.Solarization(threshold=128)[source]

Bases: object

__init__(threshold=128)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.shared package

class easycv.datasets.shared.ConcatDataset(datasets)[source]

Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]

A wrapper of concatenated dataset.

Same as torch.utils.data.dataset.ConcatDataset, but concat the group flag for image aspect ratio.

Parameters

datasets (list[Dataset]) – A list of datasets.

__init__(datasets)[source]

Initialize self. See help(type(self)) for accurate signature.

datasets: List[torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]]
cumulative_sizes: List[int]
class easycv.datasets.shared.RepeatDataset(dataset, times)[source]

Bases: object

A wrapper of repeated dataset.

The length of repeated dataset will be times larger than the original dataset. This is useful when the data loading time is long but the dataset is small. Using RepeatDataset can reduce the data loading time between epochs.

Parameters
  • dataset (Dataset) – The dataset to be repeated.

  • times (int) – Repeat times.

__init__(dataset, times)[source]

Initialize self. See help(type(self)) for accurate signature.
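A hedged sketch of wrapping an existing dataset config with RepeatDataset, assuming the mmcv-style convention in which the nested dataset is itself given as a config dict and resolved by the dataset builder; the base config is purely illustrative.

base_dataset = dict(
    type='RawDataset',
    data_source=dict(type='SSLSourceImageList', list_file='data/train_list.txt'),  # illustrative
    pipeline=[],
)

repeated_dataset = dict(type='RepeatDataset', dataset=base_dataset, times=10)
# With times=10, one epoch over repeated_dataset iterates the base dataset ten times.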

class easycv.datasets.shared.OdpsReader(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Bases: object

__init__(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Init odps reader and datasource set to load data from odps table

Parameters
  • table_name (str) – odps table to load

  • selected_cols (list(str)) – select column

  • excluded_cols (list(str)) – exclude column

  • random_start (bool) – random start for odps table

  • odps_io_config (dict) – odps config contains access_id, access_key, endpoint

  • image_col (list(str)) – image column names

  • image_type (list(str)) – image column types, supporting url/base64; must be the same length as image_col, or 0 (empty).

Returns

None

reset_reader(dataloader_workid, dataloader_worknum)[source]
b64_decode()[source]
class easycv.datasets.shared.RawDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(scores, keyword, logger=None)[source]
class easycv.datasets.shared.BaseDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Base Dataset

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

abstract evaluate(results, evaluators, logger=None, **kwargs)[source]
visualize(results, **kwargs)[source]

Visualize the model output results on validation data.

Returns: A dictionary. If image visualization is added, the returned dict contains:

images: List of visualized images.

img_metas: List with length equal to the number of test images, each a dict of image meta info containing filename, img_shape, origin_img_shape, scale_factor and so on.

class easycv.datasets.shared.MultiViewDataset(data_source, num_views, pipelines)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

The dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or multiple pipelines (see the configuration sketch after this class entry).

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]
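A configuration sketch producing two augmented views per image (e.g. for contrastive self-supervised training); the transforms, the num_views value and the pairing of each pipelines entry with the corresponding num_views entry are illustrative assumptions.

view_pipeline = [
    dict(type='RandomResizedCrop', size=224),
    dict(type='RandomHorizontalFlip'),
]

multi_view_cfg = dict(
    type='MultiViewDataset',
    data_source=dict(type='SSLSourceImageList', list_file='data/train_list.txt'),  # illustrative
    num_views=[2],              # two views produced by the single pipeline below
    pipelines=[view_pipeline],  # one pipeline per entry in num_views
)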
Subpackages
easycv.datasets.shared.data_sources package
class easycv.datasets.shared.data_sources.ImageNpy(image_file, label_file=None, cache_root='data_cache/')[source]

Bases: object

__init__(image_file, label_file=None, cache_root='data_cache/')[source]

image_file: (local or oss) image data saved in a single .npy file as a list [cv2.img, cv2.img, …]. label_file: (local or oss) label data saved in a single .npy file.

class easycv.datasets.shared.data_sources.SourceConcat(data_source_list)[source]

Bases: object

Concat multi data source config.

__init__(data_source_list)[source]

Initialize self. See help(type(self)) for accurate signature.

cumsum_length()[source]
Submodules
easycv.datasets.shared.data_sources.concat module
class easycv.datasets.shared.data_sources.concat.SourceConcat(data_source_list)[source]

Bases: object

Concat multi data source config.

__init__(data_source_list)[source]

Initialize self. See help(type(self)) for accurate signature.

cumsum_length()[source]
easycv.datasets.shared.data_sources.image_npy module
class easycv.datasets.shared.data_sources.image_npy.ImageNpy(image_file, label_file=None, cache_root='data_cache/')[source]

Bases: object

__init__(image_file, label_file=None, cache_root='data_cache/')[source]

image_file: (local or oss) image data saved in a single .npy file as a list [cv2.img, cv2.img, …]. label_file: (local or oss) label data saved in a single .npy file.

easycv.datasets.shared.pipelines package
Submodules
easycv.datasets.shared.pipelines.dali_transforms module
class easycv.datasets.shared.pipelines.dali_transforms.DaliImageDecoder(device='mixed', **kwargs)[source]

Bases: object

refer to: https://docs.nvidia.com/deeplearning/dali/archives/dali_0250/user-guide/docs/supported_ops.html#nvidia.dali.ops.ImageDecoder

__init__(device='mixed', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.pipelines.dali_transforms.DaliRandomResizedCrop(size, random_area, device='gpu', **kwargs)[source]

Bases: object

refer to: https://docs.nvidia.com/deeplearning/dali/archives/dali_0250/user-guide/docs/supported_ops.html#nvidia.dali.ops.RandomResizedCrop

__init__(size, random_area, device='gpu', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.pipelines.dali_transforms.DaliResize(resize_shorter, device='gpu', **kwargs)[source]

Bases: object

refer to: https://docs.nvidia.com/deeplearning/dali/archives/dali_0250/user-guide/docs/supported_ops.html#nvidia.dali.ops.Resize

__init__(resize_shorter, device='gpu', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.pipelines.dali_transforms.DaliColorTwist(prob, saturation, contrast, brightness, hue, device='gpu', center=1)[source]

Bases: object

refer to: https://docs.nvidia.com/deeplearning/dali/archives/dali_0250/user-guide/docs/supported_ops.html#nvidia.dali.ops.ColorTwist

__init__(prob, saturation, contrast, brightness, hue, device='gpu', center=1)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.pipelines.dali_transforms.DaliRandomGrayscale(prob, device='gpu')[source]

Bases: object

refer to: https://docs.nvidia.com/deeplearning/dali/archives/dali_0250/user-guide/docs/supported_ops.html#nvidia.dali.ops.Hsv Create a RandomGrayscale op with ops.Hsv; when saturation=0, the output is a grayscale image.

__init__(prob, device='gpu')[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.pipelines.dali_transforms.DaliCropMirrorNormalize(crop, mean, std, prob=0.0, device='gpu', crop_pos_x=0.5, crop_pos_y=0.5, **kwargs)[source]

Bases: object

refer to: https://docs.nvidia.com/deeplearning/dali/archives/dali_0250/user-guide/docs/supported_ops.html#nvidia.dali.ops.CropMirrorNormalize

__init__(crop, mean, std, prob=0.0, device='gpu', crop_pos_x=0.5, crop_pos_y=0.5, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.shared.pipelines.format module
easycv.datasets.shared.pipelines.format.to_tensor(data)[source]

Convert objects of various python types to torch.Tensor.

Supported types are: numpy.ndarray, torch.Tensor, Sequence, int and float.

Parameters

data (torch.Tensor | numpy.ndarray | Sequence | int | float) – Data to be converted.

class easycv.datasets.shared.pipelines.format.ImageToTensor(keys)[source]

Bases: object

Convert image to torch.Tensor by given keys.

The dimension order of the input image is (H, W, C). The pipeline will convert it to (C, H, W). If only 2 dimensions (H, W) are given, the output will be (1, H, W).

Parameters

keys (Sequence[str]) – Key of images to be converted to Tensor.

__init__(keys)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.pipelines.format.Collect(keys, meta_keys=('filename', 'ori_filename', 'ori_img_shape', 'img_shape', 'scale_factor', 'pad', 'flip', 'flip_direction', 'img_norm_cfg'))[source]

Bases: object

Collect data from the loader relevant to the specific task.

This is usually the last stage of the data loader pipeline. Typically keys is set to some subset of “img”, “proposals”, “gt_bboxes”, “gt_bboxes_ignore”, “gt_labels”, and/or “gt_masks”.

The “img_meta” item is always populated. The contents of the “img_meta” dictionary depends on “meta_keys”. By default this includes:

  • “img_shape”: shape of the image input to the network as a tuple (h, w). Note that images may be zero padded on the bottom/right if the batch tensor is larger than this shape.

  • “scale_factor”: a float indicating the preprocessing scale

  • “flip”: a boolean indicating if image flip transform was used

  • “filename”: path to the image file

  • “ori_img_shape”: original shape of the image as a tuple (h, w, c)

  • “img_norm_cfg”: a dict of normalization information:

    • mean - per channel mean subtraction

    • std - per channel std divisor

    • to_rgb - bool indicating if bgr was converted to rgb

Parameters
  • keys (Sequence[str]) – Keys of results to be collected in data.

  • meta_keys (Sequence[str], optional) – Meta keys to be converted to mmcv.DataContainer and collected in data[img_metas]. Default: (‘filename’, ‘ori_filename’, ‘ori_img_shape’, ‘img_shape’, ‘scale_factor’, ‘flip’, ‘flip_direction’, ‘img_norm_cfg’).

__init__(keys, meta_keys=('filename', 'ori_filename', 'ori_img_shape', 'img_shape', 'scale_factor', 'pad', 'flip', 'flip_direction', 'img_norm_cfg'))[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.pipelines.format.DefaultFormatBundle[source]

Bases: object

Default formatting bundle. It simplifies the pipeline of formatting common fields, including “img”, “proposals”, “gt_bboxes”, “gt_labels”, “gt_masks” and “gt_semantic_seg”. These fields are formatted as follows.

  • img: (1) transpose, (2) to tensor, (3) to DataContainer (stack=True)

  • proposals: (1) to tensor, (2) to DataContainer

  • gt_bboxes: (1) to tensor, (2) to DataContainer

  • gt_bboxes_ignore: (1) to tensor, (2) to DataContainer

  • gt_labels: (1) to tensor, (2) to DataContainer

  • gt_masks: (1) to tensor, (2) to DataContainer (cpu_only=True)

  • gt_semantic_seg: (1) unsqueeze dim-0, (2) to tensor, (3) to DataContainer (stack=True)
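A short illustrative formatting tail combining the ops above; the chosen keys and meta_keys are examples only, and the transforms are assumed to be registered under their class names.

format_tail = [
    dict(type='ImageToTensor', keys=['img']),
    dict(type='Collect',
         keys=['img'],
         meta_keys=('filename', 'ori_img_shape', 'img_shape', 'img_norm_cfg')),
]
# For detection-style fields, dict(type='DefaultFormatBundle') can stand in for
# the explicit to-tensor step and format img/gt_bboxes/gt_labels in one go.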

easycv.datasets.shared.pipelines.third_transforms_wrapper module
easycv.datasets.shared.pipelines.third_transforms_wrapper.is_child_of(obj, cls)[source]
easycv.datasets.shared.pipelines.third_transforms_wrapper.get_args(obj)[source]
easycv.datasets.shared.pipelines.third_transforms_wrapper.wrap_torchvision_transforms(transform_obj)[source]
class easycv.datasets.shared.pipelines.third_transforms_wrapper.AugMix(severity: int = 3, mixture_width: int = 3, chain_depth: int = -1, alpha: float = 1.0, all_ops: bool = True, interpolation: torchvision.transforms.functional.InterpolationMode = <InterpolationMode.BILINEAR: 'bilinear'>, fill: Optional[List[float]] = None)

Bases: torchvision.transforms.autoaugment.AugMix

forward(results)

img (PIL Image or Tensor): Image to be transformed.

Returns

Transformed image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.AutoAugment(policy: torchvision.transforms.autoaugment.AutoAugmentPolicy = <AutoAugmentPolicy.IMAGENET: 'imagenet'>, interpolation: torchvision.transforms.functional.InterpolationMode = <InterpolationMode.NEAREST: 'nearest'>, fill: Optional[List[float]] = None)

Bases: torchvision.transforms.autoaugment.AutoAugment

forward(results)

img (PIL Image or Tensor): Image to be transformed.

Returns

AutoAugmented image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.CenterCrop(size)

Bases: torchvision.transforms.transforms.CenterCrop

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be cropped.

Returns

Cropped image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0)

Bases: torchvision.transforms.transforms.ColorJitter

forward(results)
Parameters

img (PIL Image or Tensor) – Input image.

Returns

Color jittered image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.ConvertImageDtype(dtype: torch.dtype)

Bases: torchvision.transforms.transforms.ConvertImageDtype

forward(results)

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.ElasticTransform(alpha=50.0, sigma=5.0, interpolation=<InterpolationMode.BILINEAR: 'bilinear'>, fill=0)

Bases: torchvision.transforms.transforms.ElasticTransform

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be transformed.

Returns

Transformed image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.FiveCrop(size)

Bases: torchvision.transforms.transforms.FiveCrop

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be cropped.

Returns

tuple of 5 images. Image can be PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.GaussianBlur(kernel_size, sigma=(0.1, 2.0))

Bases: torchvision.transforms.transforms.GaussianBlur

forward(results)
Parameters

img (PIL Image or Tensor) – image to be blurred.

Returns

Gaussian blurred image

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.Grayscale(num_output_channels=1)

Bases: torchvision.transforms.transforms.Grayscale

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be converted to grayscale.

Returns

Grayscaled image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.Lambda(lambd)

Bases: torchvision.transforms.transforms.Lambda

class easycv.datasets.shared.pipelines.third_transforms_wrapper.LinearTransformation(transformation_matrix, mean_vector)

Bases: torchvision.transforms.transforms.LinearTransformation

forward(results)
Parameters

tensor (Tensor) – Tensor image to be whitened.

Returns

Transformed image.

Return type

Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.Normalize(mean, std, inplace=False)

Bases: torchvision.transforms.transforms.Normalize

forward(results)
Parameters

tensor (Tensor) – Tensor image to be normalized.

Returns

Normalized Tensor image.

Return type

Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.PILToTensor

Bases: torchvision.transforms.transforms.PILToTensor

class easycv.datasets.shared.pipelines.third_transforms_wrapper.Pad(padding, fill=0, padding_mode='constant')

Bases: torchvision.transforms.transforms.Pad

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be padded.

Returns

Padded image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandAugment(num_ops: int = 2, magnitude: int = 9, num_magnitude_bins: int = 31, interpolation: torchvision.transforms.functional.InterpolationMode = <InterpolationMode.NEAREST: 'nearest'>, fill: Optional[List[float]] = None)

Bases: torchvision.transforms.autoaugment.RandAugment

forward(results)

img (PIL Image or Tensor): Image to be transformed.

Returns

Transformed image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomAdjustSharpness(sharpness_factor, p=0.5)

Bases: torchvision.transforms.transforms.RandomAdjustSharpness

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be sharpened.

Returns

Randomly sharpened image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomAffine(degrees, translate=None, scale=None, shear=None, interpolation=<InterpolationMode.NEAREST: 'nearest'>, fill=0, center=None)

Bases: torchvision.transforms.transforms.RandomAffine

forward(results)

img (PIL Image or Tensor): Image to be transformed.

Returns

Affine transformed image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomAutocontrast(p=0.5)

Bases: torchvision.transforms.transforms.RandomAutocontrast

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be autocontrasted.

Returns

Randomly autocontrasted image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomChoice(transforms, p=None)

Bases: torchvision.transforms.transforms.RandomChoice

class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomCrop(size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant')

Bases: torchvision.transforms.transforms.RandomCrop

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be cropped.

Returns

Cropped image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomEqualize(p=0.5)

Bases: torchvision.transforms.transforms.RandomEqualize

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be equalized.

Returns

Randomly equalized image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False)

Bases: torchvision.transforms.transforms.RandomErasing

forward(results)
Parameters

img (Tensor) – Tensor image to be erased.

Returns

Erased Tensor image.

Return type

img (Tensor)

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomGrayscale(p=0.1)

Bases: torchvision.transforms.transforms.RandomGrayscale

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be converted to grayscale.

Returns

Randomly grayscaled image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomHorizontalFlip(p=0.5)

Bases: torchvision.transforms.transforms.RandomHorizontalFlip

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be flipped.

Returns

Randomly flipped image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomInvert(p=0.5)

Bases: torchvision.transforms.transforms.RandomInvert

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be inverted.

Returns

Randomly color inverted image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomOrder(transforms)

Bases: torchvision.transforms.transforms.RandomOrder

class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomPerspective(distortion_scale=0.5, p=0.5, interpolation=<InterpolationMode.BILINEAR: 'bilinear'>, fill=0)

Bases: torchvision.transforms.transforms.RandomPerspective

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be Perspectively transformed.

Returns

Randomly transformed image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomPosterize(bits, p=0.5)

Bases: torchvision.transforms.transforms.RandomPosterize

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be posterized.

Returns

Randomly posterized image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=<InterpolationMode.BILINEAR: 'bilinear'>, antialias: Optional[bool] = None)

Bases: torchvision.transforms.transforms.RandomResizedCrop

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be cropped and resized.

Returns

Randomly cropped and resized image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomRotation(degrees, interpolation=<InterpolationMode.NEAREST: 'nearest'>, expand=False, center=None, fill=0)

Bases: torchvision.transforms.transforms.RandomRotation

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be rotated.

Returns

Rotated image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomSolarize(threshold, p=0.5)

Bases: torchvision.transforms.transforms.RandomSolarize

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be solarized.

Returns

Randomly solarized image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.RandomVerticalFlip(p=0.5)

Bases: torchvision.transforms.transforms.RandomVerticalFlip

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be flipped.

Returns

Randomly flipped image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.Resize(size, interpolation=<InterpolationMode.BILINEAR: 'bilinear'>, max_size=None, antialias=None)

Bases: torchvision.transforms.transforms.Resize

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be scaled.

Returns

Rescaled image.

Return type

PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.TenCrop(size, vertical_flip=False)

Bases: torchvision.transforms.transforms.TenCrop

forward(results)
Parameters

img (PIL Image or Tensor) – Image to be cropped.

Returns

tuple of 10 images. Image can be PIL Image or Tensor

training: bool
class easycv.datasets.shared.pipelines.third_transforms_wrapper.ToPILImage(mode=None)

Bases: torchvision.transforms.transforms.ToPILImage

class easycv.datasets.shared.pipelines.third_transforms_wrapper.ToTensor

Bases: torchvision.transforms.transforms.ToTensor

class easycv.datasets.shared.pipelines.third_transforms_wrapper.TrivialAugmentWide(num_magnitude_bins: int = 31, interpolation: torchvision.transforms.functional.InterpolationMode = <InterpolationMode.NEAREST: 'nearest'>, fill: Optional[List[float]] = None)

Bases: torchvision.transforms.autoaugment.TrivialAugmentWide

forward(results)

img (PIL Image or Tensor): Image to be transformed.

Returns

Transformed image.

Return type

PIL Image or Tensor

training: bool
easycv.datasets.shared.pipelines.transforms module
class easycv.datasets.shared.pipelines.transforms.Compose(transforms, profiling=False)[source]

Bases: object

Compose a data pipeline with a sequence of transforms.

Parameters

transforms (list[dict | callable]) – Either config dicts of transforms or transform objects.

__init__(transforms, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.pipelines.transforms.LoadImage(to_float32=False, mode='bgr')[source]

Bases: object

Load an image from a file, numpy array or PIL object.

Parameters

to_float32 (bool) – Whether to convert the loaded image to a float32 numpy array. If set to False, the loaded image is an uint8 array. Defaults to False.

__init__(to_float32=False, mode='bgr')[source]

Initialize self. See help(type(self)) for accurate signature.
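A hedged usage sketch: Compose accepts either transform objects or config dicts, and calling the composed pipeline applies each transform in order to a results dict. The ImageToTensor entry assumes that transform is registered under its class name.

from easycv.datasets.shared.pipelines.transforms import Compose, LoadImage

pipeline = Compose([
    LoadImage(to_float32=True),                # a transform object
    dict(type='ImageToTensor', keys=['img']),  # or an equivalent config dict
])
# results = pipeline(results_dict)  # applies each transform in order to the results dict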

Submodules
easycv.datasets.shared.base module
class easycv.datasets.shared.base.BaseDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

Base Dataset

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

abstract evaluate(results, evaluators, logger=None, **kwargs)[source]
visualize(results, **kwargs)[source]

Visualize the model output results on validation data.

Returns: A dictionary. If image visualization is added, the returned dict contains:

images: List of visualized images.

img_metas: List with length equal to the number of test images, each a dict of image meta info containing filename, img_shape, origin_img_shape, scale_factor and so on.

easycv.datasets.shared.dali_tfrecord_imagenet module
class easycv.datasets.shared.dali_tfrecord_imagenet.DaliLoaderWrapper(dali_loader, batch_size, label_offset=0)[source]

Bases: object

__init__(dali_loader, batch_size, label_offset=0)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]

evaluate classification task

Parameters
  • results – a dict of lists of tensors, including prediction and groundtruth info; the prediction tensor is NxC, and the groundtruth labels have the same length N.

  • evaluators – a list of evaluator

Returns

a dict of float, different metric values

Return type

eval_result

class easycv.datasets.shared.dali_tfrecord_imagenet.DaliImageNetTFRecordDataSet(data_source, pipeline, distributed, batch_size, label_offset=0, random_shuffle=True, workers_per_gpu=2)[source]

Bases: object

__init__(data_source, pipeline, distributed, batch_size, label_offset=0, random_shuffle=True, workers_per_gpu=2)[source]

Initialize self. See help(type(self)) for accurate signature.

get_dataloader()[source]
easycv.datasets.shared.dali_tfrecord_multi_view module
class easycv.datasets.shared.dali_tfrecord_multi_view.DaliLoaderWrapper(dali_loader, batch_size, return_list)[source]

Bases: object

__init__(dali_loader, batch_size, return_list)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.datasets.shared.dali_tfrecord_multi_view.DaliTFRecordMultiViewDataset(data_source, num_views, pipelines, distributed, batch_size, random_shuffle=True, workers_per_gpu=2)[source]

Bases: object

Adapted to DALI, the dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or multiple pipelines.

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines, distributed, batch_size, random_shuffle=True, workers_per_gpu=2)[source]

Initialize self. See help(type(self)) for accurate signature.

get_dataloader()[source]
easycv.datasets.shared.dataset_wrappers module
class easycv.datasets.shared.dataset_wrappers.ConcatDataset(datasets)[source]

Bases: torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]

A wrapper of concatenated dataset.

Same as torch.utils.data.dataset.ConcatDataset, but concat the group flag for image aspect ratio.

Parameters

datasets (list[Dataset]) – A list of datasets.

__init__(datasets)[source]

Initialize self. See help(type(self)) for accurate signature.

datasets: List[torch.utils.data.dataset.Dataset[torch.utils.data.dataset.T_co]]
cumulative_sizes: List[int]
class easycv.datasets.shared.dataset_wrappers.RepeatDataset(dataset, times)[source]

Bases: object

A wrapper of repeated dataset.

The length of repeated dataset will be times larger than the original dataset. This is useful when the data loading time is long but the dataset is small. Using RepeatDataset can reduce the data loading time between epochs.

Parameters
  • dataset (Dataset) – The dataset to be repeated.

  • times (int) – Repeat times.

__init__(dataset, times)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.datasets.shared.multi_view module
class easycv.datasets.shared.multi_view.MultiViewDataset(data_source, num_views, pipelines)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

The dataset outputs multiple views of an image. The number of views in the output dict depends on num_views. The image can be processed by one pipeline or multiple pipelines.

Parameters
  • num_views (list) – The number of different views.

  • pipelines (list[list[dict]]) – A list of pipelines.

__init__(data_source, num_views, pipelines)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(results, evaluators, logger=None)[source]
easycv.datasets.shared.odps_reader module
easycv.datasets.shared.odps_reader.set_dataloader_workid(value)[source]
easycv.datasets.shared.odps_reader.set_dataloader_worknum(value)[source]
easycv.datasets.shared.odps_reader.get_dist_image(img_url, max_try=10)[source]
class easycv.datasets.shared.odps_reader.OdpsReader(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Bases: object

__init__(table_name, selected_cols=[], excluded_cols=[], random_start=False, odps_io_config=None, image_col=['url_image'], image_type=['url'])[source]

Init the odps reader and set the datasource to load data from an odps table.

Parameters
  • table_name (str) – odps table to load

  • selected_cols (list(str)) – columns to select

  • excluded_cols (list(str)) – columns to exclude

  • random_start (bool) – random start for the odps table

  • odps_io_config (dict) – odps config containing access_id, access_key and endpoint

  • image_col (list(str)) – image column names

  • image_type (list(str)) – image column types, supporting url/base64; must have the same length as image_col, or be empty

Returns

None
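A hedged construction sketch using only the arguments documented above; the table name, column names and credentials are placeholders:

# odps_io_config keys follow the docstring above (access_id, access_key, endpoint).
odps_io_config = dict(
    access_id='your_access_id',
    access_key='your_access_key',
    endpoint='your_odps_endpoint',
)
# reader = OdpsReader(
#     table_name='your_project.your_table',
#     selected_cols=['url_image', 'label'],
#     image_col=['url_image'],
#     image_type=['url'],
#     odps_io_config=odps_io_config,
# )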

reset_reader(dataloader_workid, dataloader_worknum)[source]
b64_decode()[source]
easycv.datasets.shared.raw module
class easycv.datasets.shared.raw.RawDataset(data_source, pipeline, profiling=False)[source]

Bases: Generic[torch.utils.data.dataset.T_co]

__init__(data_source, pipeline, profiling=False)[source]

Initialize self. See help(type(self)) for accurate signature.

evaluate(scores, keyword, logger=None)[source]

easycv.datasets.utils package

Submodules
easycv.datasets.utils.tfrecord_util module
easycv.datasets.utils.tfrecord_util.get_imagenet_dali_tfrecord_feature()[source]
easycv.datasets.utils.tfrecord_util.get_path_and_index(file_list_or_path)[source]
easycv.datasets.utils.tfrecord_util.download_tfrecord(file_list_or_path, target_path, slice_count=1, slice_id=0, force=False)[source]

Download data from oss. Use the processes on the gpus to slice the download: each gpu process downloads part of the data, and the number of slices equals the number of gpu processes. Supports tfrecord of ImageNet style.

tfrecord_dir
├── train1
├── train1.idx
├── train2
├── train2.idx
├── ...

Parameters
  • file_list_or_path – A list of absolute data paths, or a path str; if type(file_list_or_path) == list, it is used directly as the file list, and if type(file_list_or_path) == str, the file list is read via open(file_list_or_path).readlines()

  • target_path – A str, download path

  • slice_count – Download worker num

  • slice_id – Download worker ID

  • force – If false, skip download if the file already exists in the target path. If true, recopy and replace the original file.

Returns

path: list of str, downloaded tfrecord paths; index_path: list of str, downloaded tfrecord idx paths

Return type

path
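A hedged usage sketch; the oss file list path is a placeholder and the commented call assumes an oss environment is already configured:

import torch.distributed as dist

# from easycv.datasets.utils.tfrecord_util import download_tfrecord

# slice_count / slice_id usually follow the distributed environment:
# each gpu process downloads only its own slice of the file list.
slice_count = dist.get_world_size() if dist.is_initialized() else 1
slice_id = dist.get_rank() if dist.is_initialized() else 0

# paths, index_paths = download_tfrecord(
#     file_list_or_path='oss://your-bucket/imagenet_tfrecord/train_list.txt',
#     target_path='data/imagenet_tfrecord/',
#     slice_count=slice_count,
#     slice_id=slice_id,
# )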

easycv.datasets.utils.type_util module
easycv.datasets.utils.type_util.is_dali_dataset_type(type_name)[source]

Submodules

easycv.datasets.builder module

easycv.datasets.builder.build_dataset(cfg, default_args=None)[source]
easycv.datasets.builder.build_dali_dataset(cfg, default_args=None)[source]
easycv.datasets.builder.build_datasource(cfg)[source]
easycv.datasets.builder.build_sampler(cfg, default_args=None)[source]

easycv.datasets.registry module

easycv.hooks package

class easycv.hooks.BestCkptSaverHook(by_epoch=True, save_optimizer=True, best_metric_name=[], best_metric_type=[], **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Save checkpoints periodically.

Parameters
  • by_epoch (bool) – Saving checkpoints by epoch or by iteration. Default: True.

  • save_optimizer (bool) – Whether to save optimizer state_dict in the checkpoint. It is usually used for resuming experiments. Default: True.

  • best_metric_name (List(str)) – metric names used to select the best checkpoints to save, such as “neck_top1”. Default: [], which saves nothing.

  • best_metric_type (List(str)) – metric types that define “best”, each should be “max” or “min”; if len(best_metric_type) < len(best_metric_name), the list is padded with “max”.
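A hypothetical hook config sketch showing how best_metric_name and best_metric_type pair up; the metric key is a placeholder and must match a key produced by your evaluator:

# Keep the checkpoint with the highest 'neck_top1' (placeholder metric key).
best_ckpt_hook = dict(
    type='BestCkptSaverHook',
    by_epoch=True,
    save_optimizer=True,
    best_metric_name=['neck_top1'],
    best_metric_type=['max'],
)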

__init__(by_epoch=True, save_optimizer=True, best_metric_name=[], best_metric_type=[], **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_run(runner)[source]
after_train_epoch(runner)[source]
easycv.hooks.build_hook(cfg, default_args=None)[source]
class easycv.hooks.BYOLHook(end_momentum=1.0, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Hook in BYOL

This hook includes the momentum adjustment in BYOL following:

m = 1 - (1 - m_0) * (cos(pi * k / K) + 1) / 2, where k is the current step and K is the total number of steps.
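The schedule is easy to express directly; a minimal sketch of the formula above, generalized to the hook's end_momentum argument:

import math

def byol_momentum(base_momentum, step, total_steps, end_momentum=1.0):
    # With end_momentum == 1.0 this matches
    # m = 1 - (1 - m_0) * (cos(pi * k / K) + 1) / 2.
    return end_momentum - (end_momentum - base_momentum) * (
        math.cos(math.pi * step / total_steps) + 1) / 2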

__init__(end_momentum=1.0, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_iter(runner)[source]
class easycv.hooks.DINOHook(momentum_teacher=0.996, weight_decay=0.04, weight_decay_end=0.4, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Hook in DINO

__init__(momentum_teacher=0.996, weight_decay=0.04, weight_decay_end=0.4, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_run(runner)[source]
before_train_iter(runner)[source]
after_train_iter(runner)[source]
before_train_epoch(runner)[source]
class easycv.hooks.EMAHook(decay=0.9999, copy_model_attr=())[source]

Bases: mmcv.runner.hooks.hook.Hook

Hook to carry out Exponential Moving Average

__init__(decay=0.9999, copy_model_attr=())[source]
Parameters
  • decay – decay rate for exponential moving average

  • copy_model_attr – attribute to copy from origin model to ema model

before_run(runner)[source]
before_train_epoch(runner)[source]
after_train_iter(runner)[source]
class easycv.hooks.DistEvalHook(dataloader, interval=1, mode='test', initial=False, gpu_collect=False, flush_buffer=True, broadcast_bn_buffer=True, **eval_kwargs)[source]

Bases: easycv.hooks.eval_hook.EvalHook

Distributed evaluation hook.

dataloader

A PyTorch dataloader.

Type

DataLoader

interval

Evaluation interval (by epochs). Default: 1.

Type

int

mode

model forward mode

Type

str

tmpdir

Temporary directory to save the results of all processes. Default: None.

Type

str | None

gpu_collect

Whether to use gpu or cpu to collect results. Default: False.

Type

bool

broadcast_bn_buffer

Whether to broadcast the buffers (running_mean and running_var) of rank 0 to the other ranks before evaluation. Default: True.

Type

bool

__init__(dataloader, interval=1, mode='test', initial=False, gpu_collect=False, flush_buffer=True, broadcast_bn_buffer=True, **eval_kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

after_train_epoch(runner)[source]
class easycv.hooks.EvalHook(dataloader, initial=False, interval=1, mode='test', flush_buffer=True, **eval_kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Evaluation hook.

dataloader

A PyTorch dataloader.

Type

DataLoader

interval

Evaluation interval (by epochs). Default: 1.

Type

int

mode

model forward mode

Type

str

flush_buffer

flush log buffer

Type

bool

__init__(dataloader, initial=False, interval=1, mode='test', flush_buffer=True, **eval_kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_run(runner)[source]
after_train_epoch(runner)[source]
add_visualization_info(runner, results)[source]
evaluate(runner, results)[source]
class easycv.hooks.ExportHook(cfg, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', export_after_each_ckpt=False)[source]

Bases: mmcv.runner.hooks.hook.Hook

export model when training on pai

__init__(cfg, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', export_after_each_ckpt=False)[source]
Parameters
  • cfg – config dict

  • ckpt_filename_tmpl – checkpoint filename template

export_model(runner, epoch)[source]
after_train_iter(runner)[source]
after_train_epoch(runner)[source]
after_run(runner)[source]
class easycv.hooks.Extractor(dataset, imgs_per_gpu, workers_per_gpu, dist_mode=False)[source]

Bases: object

__init__(dataset, imgs_per_gpu, workers_per_gpu, dist_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.hooks.OptimizerHook(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], multiply_key=[], multiply_rate=[])[source]

Bases: mmcv.runner.hooks.optimizer.OptimizerHook

__init__(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], multiply_key=[], multiply_rate=[])[source]

Parameters
  • ignore_key ([str, …]) – names of parameters whose gradients are set to zero before every optimizer step while epoch < ignore_key_epoch[i].

  • ignore_key_epoch ([int, …]) – while epoch < ignore_key_epoch[i], the gradient of ignore_key[i] is set to zero.

  • multiply_key ([str, …]) – names of parameters that use a different learning rate ratio, given by multiply_rate.

  • multiply_rate ([float, …]) – multiply_rate[i] is the learning rate ratio applied to multiply_key[i].
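A hypothetical config sketch of how these arguments work together (the parameter-name substrings are placeholders):

# Zero the gradients of 'head.fc' parameters until epoch 5 and scale the
# learning rate of 'backbone' parameters by 0.1 (all names are placeholders).
optimizer_config = dict(
    type='OptimizerHook',
    update_interval=1,              # accumulate gradients over this many iterations
    grad_clip=dict(max_norm=10.0),  # optional gradient clipping
    ignore_key=['head.fc'],
    ignore_key_epoch=[5],
    multiply_key=['backbone'],
    multiply_rate=[0.1],
)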

skip_ignore_key(runner)[source]
multiply_grad(runner)[source]
adapt_torchacc(runner)[source]
after_train_iter(runner)[source]
class easycv.hooks.OSSSyncHook(work_dir, oss_work_dir, interval=1, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', other_file_list=[], iter_interval=None)[source]

Bases: mmcv.runner.hooks.hook.Hook

upload log files and checkpoints to oss when training on pai

__init__(work_dir, oss_work_dir, interval=1, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', other_file_list=[], iter_interval=None)[source]
Parameters
  • work_dir – work_dir in cfg

  • oss_work_dir – oss directory where to upload local files in work_dir

  • interval – upload frequency

  • ckpt_filename_tmpl – checkpoint filename template

  • other_file_list – other files that need to be uploaded to oss

  • iter_interval – upload frequency by iteration interval; defaults to None, meaning iteration-based uploading is only done when explicitly set

upload_file(runner)[source]
after_train_iter(runner)[source]
after_train_epoch(runner)[source]
after_run(runner)[source]
class easycv.hooks.TIMEHook(end_momentum=1.0, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

This hook shows the time of the runner's running process.

__init__(end_momentum=1.0, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_iter(runner)[source]
after_train_iter(runner)[source]
class easycv.hooks.SWAVHook(gpu_batch_size=32, dump_path='data/', **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Hook in SWAV

__init__(gpu_batch_size=32, dump_path='data/', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_run(runner)[source]
before_train_epoch(runner)[source]
after_train_epoch(runner)[source]
class easycv.hooks.SyncNormHook(no_aug_epochs=15, interval=1, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Synchronize Norm states after training epoch, currently used in YOLOX.

Parameters
  • no_aug_epochs (int) – The number of epochs at the end of training during which norm states are synchronized at the given interval. Default: 15.

  • interval (int) – Synchronizing norm interval. Default: 1.

__init__(no_aug_epochs=15, interval=1, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_epoch(runner)[source]
after_train_epoch(runner)[source]

Synchronizing norm.

class easycv.hooks.SyncRandomSizeHook(ratio_range=(14, 26), img_scale=(640, 640), interval=10, device='cuda', **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Change and synchronize the random image size across ranks, currently used in YOLOX.

Parameters
  • ratio_range (tuple[int]) – Random ratio range. It will be multiplied by 32, and then change the dataset output image size. Default: (14, 26).

  • img_scale (tuple[int]) – Size of input image. Default: (640, 640).

  • interval (int) – The interval of change image size. Default: 10.

  • device (torch.device | str) – device for returned tensors. Default: ‘cuda’.
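A tiny arithmetic sketch of the default ratio_range: the sampled ratio is multiplied by 32 to obtain the output image size.

# Default ratio_range (14, 26): candidate sizes range from 14 * 32 to 26 * 32 pixels.
ratio_range = (14, 26)
candidate_sizes = [r * 32 for r in range(ratio_range[0], ratio_range[1] + 1)]
print(candidate_sizes[0], candidate_sizes[-1])  # 448 832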

__init__(ratio_range=(14, 26), img_scale=(640, 640), interval=10, device='cuda', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

after_train_iter(runner)[source]

Change the dataset output image size.

class easycv.hooks.TensorboardLoggerHookV2(log_dir=None, interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[source]

Bases: mmcv.runner.hooks.logger.tensorboard.TensorboardLoggerHook

visualization_log(runner)[source]

Image visualization. visualization_buffer is a dictionary containing:

images (list): list of visualized images. img_metas (list of dict, optional): dicts containing ori_filename and so on.

ori_filename will be displayed as the tag of the image by default.

log(runner)[source]
after_train_iter(runner)[source]
class easycv.hooks.WandbLoggerHookV2(init_kwargs=None, interval=10, ignore_last=True, reset_flag=False, commit=True, by_epoch=True, with_step=True)[source]

Bases: mmcv.runner.hooks.logger.wandb.WandbLoggerHook

visualization_log(runner)[source]

Image visualization. visualization_buffer is a dictionary containing:

images (list): list of visualized images. img_metas (list of dict, optional): dicts containing ori_filename and so on.

ori_filename will be displayed as the tag of the image by default.

log(runner)[source]
after_train_iter(runner)[source]
class easycv.hooks.YOLOXLrUpdaterHook(num_last_epochs, **kwargs)[source]

Bases: mmcv.runner.hooks.lr_updater.CosineAnnealingLrUpdaterHook

YOLOX learning rate scheme.

There are two main differences between YOLOXLrUpdaterHook and CosineAnnealingLrUpdaterHook.

  1. When the current running epoch is greater than

    max_epoch-last_epoch, a fixed learning rate will be used

  2. The exp warmup scheme is different from the LrUpdaterHook in MMCV

Parameters

num_last_epochs (int) – The number of epochs with a fixed learning rate before the end of the training.
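A rough sketch of the resulting schedule (illustrative only; min_lr_ratio is an assumed knob and the real hook additionally applies its own exp warmup at the start of training):

import math

def yolox_lr_sketch(base_lr, epoch, max_epochs, num_last_epochs, min_lr_ratio=0.05):
    min_lr = base_lr * min_lr_ratio
    cosine_epochs = max_epochs - num_last_epochs
    if epoch >= cosine_epochs:
        # fixed learning rate for the last num_last_epochs epochs
        return min_lr
    progress = epoch / cosine_epochs
    return min_lr + 0.5 * (base_lr - min_lr) * (1 + math.cos(math.pi * progress))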

__init__(num_last_epochs, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

get_warmup_lr(cur_iters)[source]
get_lr(runner, base_lr)[source]
class easycv.hooks.YOLOXModeSwitchHook(no_aug_epochs=15, skip_type_keys=('MMMosaic', 'MMRandomAffine', 'MMMixUp'), **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Switch the mode of YOLOX during training.

This hook turns off the mosaic and mixup data augmentation and switches to use L1 loss in bbox_head.

Parameters

no_aug_epochs – The number of epochs at the end of training during which the data augmentation is turned off and L1 loss is used. Default: 15.

__init__(no_aug_epochs=15, skip_type_keys=('MMMosaic', 'MMRandomAffine', 'MMMixUp'), **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_epoch(runner)[source]

Close mosaic and mixup augmentation and switch to L1 loss.

class easycv.hooks.MixupCollateHook(**kwargs)[source]

Bases: easycv.hooks.collate_hook.BaseCollateHook

Mix up the data batch; should be used after merging a list of samples to form a mini-batch of Tensor(s).

__init__(**kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

after_collate(results)[source]
class easycv.hooks.PreLoggerHook(interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[source]

Bases: mmcv.runner.hooks.logger.base.LoggerHook

fetch_tensor(runner, n=0)[source]

Fetch the latest n values (or all values), process tensor types and convert them to numpy for dumping logs.

after_train_iter(runner)[source]
after_val_epoch(runner)[source]
class easycv.hooks.StepFixCosineAnnealingLrUpdaterHook(min_lr=None, min_lr_ratio=None, **kwargs)[source]

Bases: mmcv.runner.hooks.lr_updater.CosineAnnealingLrUpdaterHook

get_warmup_lr(cur_iters)[source]
get_lr(runner, base_lr)[source]
class easycv.hooks.CosineAnnealingWarmupByEpochLrUpdaterHook(min_lr=None, min_lr_ratio=None, **kwargs)[source]

Bases: mmcv.runner.hooks.lr_updater.CosineAnnealingLrUpdaterHook

before_train_iter(runner: mmcv.runner.base_runner.BaseRunner)[source]
class easycv.hooks.ThroughputHook(warmup_iters=0, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Count the per-second throughput over all steps in the history. warmup_iters can be set to skip the first few steps in the calculation if their initialization is slow.

__init__(warmup_iters=0, **kwargs) → None[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_epoch(runner)[source]

reset per epoch

before_train_iter(runner)[source]
after_train_iter(runner)[source]
class easycv.hooks.AMPFP16OptimizerHook(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], loss_scale={})[source]

Bases: easycv.hooks.optimizer_hook.OptimizerHook

__init__(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], loss_scale={})[source]

Parameters
  • ignore_key ([str, …]) – names of parameters whose gradients are set to zero before every optimizer step while epoch < ignore_key_epoch[i].

  • ignore_key_epoch ([int, …]) – while epoch < ignore_key_epoch[i], the gradient of ignore_key[i] is set to zero.

  • loss_scale (float | dict) – grad scale config. If loss_scale is a float, static loss scaling will be used with the specified scale. It can also be a dict containing the arguments of GradScaler; for PyTorch >= 1.6, the official torch.cuda.amp.GradScaler is used, please refer to https://pytorch.org/docs/stable/amp.html#torch.cuda.amp.GradScaler for the parameters.
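A hypothetical sketch of the two loss_scale forms mentioned above:

# Static loss scaling with a fixed scale value.
optimizer_config = dict(type='AMPFP16OptimizerHook', loss_scale=512.0)

# Dynamic loss scaling: the dict is forwarded to torch.cuda.amp.GradScaler.
optimizer_config = dict(
    type='AMPFP16OptimizerHook',
    loss_scale=dict(init_scale=65536.0, growth_interval=2000),
)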

before_run(runner)[source]
after_train_iter(runner)[source]

Submodules

easycv.hooks.best_ckpt_saver_hook module

class easycv.hooks.best_ckpt_saver_hook.BestCkptSaverHook(by_epoch=True, save_optimizer=True, best_metric_name=[], best_metric_type=[], **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Save checkpoints periodically.

Parameters
  • by_epoch (bool) – Saving checkpoints by epoch or by iteration. Default: True.

  • save_optimizer (bool) – Whether to save optimizer state_dict in the checkpoint. It is usually used for resuming experiments. Default: True.

  • best_metric_name (List(str)) – metric names used to select the best checkpoints to save, such as “neck_top1”. Default: [], which saves nothing.

  • best_metric_type (List(str)) – metric types that define “best”, each should be “max” or “min”; if len(best_metric_type) < len(best_metric_name), the list is padded with “max”.

__init__(by_epoch=True, save_optimizer=True, best_metric_name=[], best_metric_type=[], **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_run(runner)[source]
after_train_epoch(runner)[source]

easycv.hooks.builder module

easycv.hooks.builder.build_hook(cfg, default_args=None)[source]

easycv.hooks.byol_hook module

class easycv.hooks.byol_hook.BYOLHook(end_momentum=1.0, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Hook in BYOL

This hook includes the momentum adjustment in BYOL following:

m = 1 - (1 - m_0) * (cos(pi * k / K) + 1) / 2, where k is the current step and K is the total number of steps.

__init__(end_momentum=1.0, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_iter(runner)[source]

easycv.hooks.dino_hook module

easycv.hooks.dino_hook.cosine_scheduler(base_value, final_value, epochs, niter_per_ep, warmup_epochs=0, start_warmup_value=0)[source]
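A sketch in the spirit of the DINO-style scheduler this function implements: a linear warmup followed by cosine decay, returning one value per training iteration (assumed behavior, not a verbatim copy):

import numpy as np

def cosine_scheduler_sketch(base_value, final_value, epochs, niter_per_ep,
                            warmup_epochs=0, start_warmup_value=0):
    warmup_iters = warmup_epochs * niter_per_ep
    warmup = (np.linspace(start_warmup_value, base_value, warmup_iters)
              if warmup_iters > 0 else np.array([]))
    iters = np.arange(epochs * niter_per_ep - warmup_iters)
    cosine = final_value + 0.5 * (base_value - final_value) * (
        1 + np.cos(np.pi * iters / len(iters)))
    return np.concatenate((warmup, cosine))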
class easycv.hooks.dino_hook.DINOHook(momentum_teacher=0.996, weight_decay=0.04, weight_decay_end=0.4, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Hook in DINO

__init__(momentum_teacher=0.996, weight_decay=0.04, weight_decay_end=0.4, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_run(runner)[source]
before_train_iter(runner)[source]
after_train_iter(runner)[source]
before_train_epoch(runner)[source]

easycv.hooks.ema_hook module

class easycv.hooks.ema_hook.ModelEMA(model, decay=0.9999, updates=0)[source]

Bases: object

Model Exponential Moving Average, from https://github.com/rwightman/pytorch-image-models. Keeps a moving average of everything in the model state_dict (parameters and buffers). This is intended to provide functionality like https://www.tensorflow.org/api_docs/python/tf/train/ExponentialMovingAverage. A smoothed version of the weights is necessary for some training schemes to perform well. This class is sensitive to where it is initialized in the sequence of model init, GPU assignment and distributed training wrappers.

In YOLOv5s, EMA helps increase mAP from 0.27 to 0.353.
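A minimal sketch of the EMA update rule such a helper applies after each optimizer step (illustrative; the real class also tracks an updates counter to ramp the decay up):

import torch

def ema_update_sketch(ema_model, model, decay=0.9999):
    # w_ema = decay * w_ema + (1 - decay) * w for every floating-point entry
    # of the state_dict (parameters and buffers).
    with torch.no_grad():
        msd = model.state_dict()
        for name, ema_val in ema_model.state_dict().items():
            if ema_val.dtype.is_floating_point:
                ema_val.mul_(decay).add_(msd[name].detach(), alpha=1.0 - decay)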

__init__(model, decay=0.9999, updates=0)[source]

Initialize self. See help(type(self)) for accurate signature.

update(model)[source]
update_attr(model, include=(), exclude=('process_group', 'reducer'))[source]
class easycv.hooks.ema_hook.EMAHook(decay=0.9999, copy_model_attr=())[source]

Bases: mmcv.runner.hooks.hook.Hook

Hook to carry out Exponential Moving Average

__init__(decay=0.9999, copy_model_attr=())[source]
Parameters
  • decay – decay rate for exponential moving average

  • copy_model_attr – attribute to copy from origin model to ema model

before_run(runner)[source]
before_train_epoch(runner)[source]
after_train_iter(runner)[source]

easycv.hooks.eval_hook module

class easycv.hooks.eval_hook.EvalHook(dataloader, initial=False, interval=1, mode='test', flush_buffer=True, **eval_kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Evaluation hook.

dataloader

A PyTorch dataloader.

Type

DataLoader

interval

Evaluation interval (by epochs). Default: 1.

Type

int

mode

model forward mode

Type

str

flush_buffer

flush log buffer

Type

bool

__init__(dataloader, initial=False, interval=1, mode='test', flush_buffer=True, **eval_kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_run(runner)[source]
after_train_epoch(runner)[source]
add_visualization_info(runner, results)[source]
evaluate(runner, results)[source]
class easycv.hooks.eval_hook.DistEvalHook(dataloader, interval=1, mode='test', initial=False, gpu_collect=False, flush_buffer=True, broadcast_bn_buffer=True, **eval_kwargs)[source]

Bases: easycv.hooks.eval_hook.EvalHook

Distributed evaluation hook.

dataloader

A PyTorch dataloader.

Type

DataLoader

interval

Evaluation interval (by epochs). Default: 1.

Type

int

mode

model forward mode

Type

str

tmpdir

Temporary directory to save the results of all processes. Default: None.

Type

str | None

gpu_collect

Whether to use gpu or cpu to collect results. Default: False.

Type

bool

broadcast_bn_buffer

Whether to broadcast the buffers (running_mean and running_var) of rank 0 to the other ranks before evaluation. Default: True.

Type

bool

__init__(dataloader, interval=1, mode='test', initial=False, gpu_collect=False, flush_buffer=True, broadcast_bn_buffer=True, **eval_kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

after_train_epoch(runner)[source]

easycv.hooks.export_hook module

class easycv.hooks.export_hook.ExportHook(cfg, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', export_after_each_ckpt=False)[source]

Bases: mmcv.runner.hooks.hook.Hook

export model when training on pai

__init__(cfg, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', export_after_each_ckpt=False)[source]
Parameters
  • cfg – config dict

  • ckpt_filename_tmpl – checkpoint filename template

export_model(runner, epoch)[source]
after_train_iter(runner)[source]
after_train_epoch(runner)[source]
after_run(runner)[source]

easycv.hooks.extractor module

class easycv.hooks.extractor.Extractor(dataset, imgs_per_gpu, workers_per_gpu, dist_mode=False)[source]

Bases: object

__init__(dataset, imgs_per_gpu, workers_per_gpu, dist_mode=False)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.hooks.optimizer_hook module

class easycv.hooks.optimizer_hook.OptimizerHook(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], multiply_key=[], multiply_rate=[])[source]

Bases: mmcv.runner.hooks.optimizer.OptimizerHook

__init__(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], multiply_key=[], multiply_rate=[])[source]

Parameters
  • ignore_key ([str, …]) – names of parameters whose gradients are set to zero before every optimizer step while epoch < ignore_key_epoch[i].

  • ignore_key_epoch ([int, …]) – while epoch < ignore_key_epoch[i], the gradient of ignore_key[i] is set to zero.

  • multiply_key ([str, …]) – names of parameters that use a different learning rate ratio, given by multiply_rate.

  • multiply_rate ([float, …]) – multiply_rate[i] is the learning rate ratio applied to multiply_key[i].

skip_ignore_key(runner)[source]
multiply_grad(runner)[source]
adapt_torchacc(runner)[source]
after_train_iter(runner)[source]
class easycv.hooks.optimizer_hook.AMPFP16OptimizerHook(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], loss_scale={})[source]

Bases: easycv.hooks.optimizer_hook.OptimizerHook

__init__(update_interval=1, grad_clip=None, coalesce=True, bucket_size_mb=- 1, ignore_key=[], ignore_key_epoch=[], loss_scale={})[source]

Parameters
  • ignore_key ([str, …]) – names of parameters whose gradients are set to zero before every optimizer step while epoch < ignore_key_epoch[i].

  • ignore_key_epoch ([int, …]) – while epoch < ignore_key_epoch[i], the gradient of ignore_key[i] is set to zero.

  • loss_scale (float | dict) – grad scale config. If loss_scale is a float, static loss scaling will be used with the specified scale. It can also be a dict containing the arguments of GradScaler; for PyTorch >= 1.6, the official torch.cuda.amp.GradScaler is used, please refer to https://pytorch.org/docs/stable/amp.html#torch.cuda.amp.GradScaler for the parameters.

before_run(runner)[source]
after_train_iter(runner)[source]

easycv.hooks.oss_sync_hook module

class easycv.hooks.oss_sync_hook.OSSSyncHook(work_dir, oss_work_dir, interval=1, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', other_file_list=[], iter_interval=None)[source]

Bases: mmcv.runner.hooks.hook.Hook

upload log files and checkpoints to oss when training on pai

__init__(work_dir, oss_work_dir, interval=1, ckpt_filename_tmpl='epoch_{}.pth', export_ckpt_filename_tmpl='epoch_{}_export.pt', other_file_list=[], iter_interval=None)[source]
Parameters
  • work_dir – work_dir in cfg

  • oss_work_dir – oss directory where to upload local files in work_dir

  • interval – upload frequency

  • ckpt_filename_tmpl – checkpoint filename template

  • other_file_list – other files that need to be uploaded to oss

  • iter_interval – upload frequency by iteration interval; defaults to None, meaning iteration-based uploading is only done when explicitly set

upload_file(runner)[source]
after_train_iter(runner)[source]
after_train_epoch(runner)[source]
after_run(runner)[source]

easycv.hooks.registry module

easycv.hooks.show_time_hook module

class easycv.hooks.show_time_hook.TIMEHook(end_momentum=1.0, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

This hook shows the time of the runner's running process.

__init__(end_momentum=1.0, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_iter(runner)[source]
after_train_iter(runner)[source]

easycv.hooks.swav_hook module

class easycv.hooks.swav_hook.SWAVHook(gpu_batch_size=32, dump_path='data/', **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Hook in SWAV

__init__(gpu_batch_size=32, dump_path='data/', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_run(runner)[source]
before_train_epoch(runner)[source]
after_train_epoch(runner)[source]

easycv.hooks.sync_norm_hook module

easycv.hooks.sync_norm_hook.get_norm_states(module)[source]
class easycv.hooks.sync_norm_hook.SyncNormHook(no_aug_epochs=15, interval=1, **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Synchronize Norm states after training epoch, currently used in YOLOX.

Parameters
  • no_aug_epochs (int) – The number of epochs at the end of training during which norm states are synchronized at the given interval. Default: 15.

  • interval (int) – Synchronizing norm interval. Default: 1.

__init__(no_aug_epochs=15, interval=1, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_epoch(runner)[source]
after_train_epoch(runner)[source]

Synchronizing norm.

easycv.hooks.sync_random_size_hook module

class easycv.hooks.sync_random_size_hook.SyncRandomSizeHook(ratio_range=(14, 26), img_scale=(640, 640), interval=10, device='cuda', **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Change and synchronize the random image size across ranks, currently used in YOLOX.

Parameters
  • ratio_range (tuple[int]) – Random ratio range. It will be multiplied by 32, and then change the dataset output image size. Default: (14, 26).

  • img_scale (tuple[int]) – Size of input image. Default: (640, 640).

  • interval (int) – The interval of change image size. Default: 10.

  • device (torch.device | str) – device for returned tensors. Default: ‘cuda’.

__init__(ratio_range=(14, 26), img_scale=(640, 640), interval=10, device='cuda', **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

after_train_iter(runner)[source]

Change the dataset output image size.

easycv.hooks.tensorboard module

class easycv.hooks.tensorboard.TensorboardLoggerHookV2(log_dir=None, interval=10, ignore_last=True, reset_flag=False, by_epoch=True)[source]

Bases: mmcv.runner.hooks.logger.tensorboard.TensorboardLoggerHook

visualization_log(runner)[source]

Image visualization. visualization_buffer is a dictionary containing:

images (list): list of visualized images. img_metas (list of dict, optional): dicts containing ori_filename and so on.

ori_filename will be displayed as the tag of the image by default.

log(runner)[source]
after_train_iter(runner)[source]

easycv.hooks.wandb module

class easycv.hooks.wandb.WandbLoggerHookV2(init_kwargs=None, interval=10, ignore_last=True, reset_flag=False, commit=True, by_epoch=True, with_step=True)[source]

Bases: mmcv.runner.hooks.logger.wandb.WandbLoggerHook

visualization_log(runner)[source]

Image visualization. visualization_buffer is a dictionary containing:

images (list): list of visualized images. img_metas (list of dict, optional): dicts containing ori_filename and so on.

ori_filename will be displayed as the tag of the image by default.

log(runner)[source]
after_train_iter(runner)[source]

easycv.hooks.yolox_lr_hook module

class easycv.hooks.yolox_lr_hook.YOLOXLrUpdaterHook(num_last_epochs, **kwargs)[source]

Bases: mmcv.runner.hooks.lr_updater.CosineAnnealingLrUpdaterHook

YOLOX learning rate scheme.

There are two main differences between YOLOXLrUpdaterHook and CosineAnnealingLrUpdaterHook.

  1. When the current running epoch is greater than

    max_epoch-last_epoch, a fixed learning rate will be used

  2. The exp warmup scheme is different from the LrUpdaterHook in MMCV

Parameters

num_last_epochs (int) – The number of epochs with a fixed learning rate before the end of the training.

__init__(num_last_epochs, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

get_warmup_lr(cur_iters)[source]
get_lr(runner, base_lr)[source]

easycv.hooks.yolox_mode_switch_hook module

class easycv.hooks.yolox_mode_switch_hook.YOLOXModeSwitchHook(no_aug_epochs=15, skip_type_keys=('MMMosaic', 'MMRandomAffine', 'MMMixUp'), **kwargs)[source]

Bases: mmcv.runner.hooks.hook.Hook

Switch the mode of YOLOX during training.

This hook turns off the mosaic and mixup data augmentation and switches to use L1 loss in bbox_head.

Parameters

no_aug_epochs – The number of epochs at the end of training during which the data augmentation is turned off and L1 loss is used. Default: 15.

__init__(no_aug_epochs=15, skip_type_keys=('MMMosaic', 'MMRandomAffine', 'MMMixUp'), **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

before_train_epoch(runner)[source]

Close mosaic and mixup augmentation and switch to L1 loss.

easycv.predictors package

Submodules

easycv.predictors.base module

class easycv.predictors.base.NumpyToPIL[source]

Bases: object

class easycv.predictors.base.Predictor(model_path, numpy_to_pil=True)[source]

Bases: object

__init__(model_path, numpy_to_pil=True)[source]

Initialize self. See help(type(self)) for accurate signature.

preprocess(image_list)[source]
predict_batch(image_batch, **forward_kwargs)[source]

Predict using batched data.

Parameters
  • image_batch (torch.Tensor) – tensor with shape [N, 3, H, W]

  • forward_kwargs – kwargs for additional parameters

Returns

the output of model.forward, list or tuple

Return type

output

class easycv.predictors.base.InputProcessor(cfg, pipelines=None, batch_size=1, threads=8, mode='BGR')[source]

Bases: object

Base input processor for processing input samples.

Parameters
  • cfg (Config) – Config instance.

  • pipelines (list[dict]) – Data pipeline configs.

  • batch_size (int) – batch size for forward.

  • threads (int) – Number of processes to process inputs.

  • mode (str) – The image mode into the model.

__init__(cfg, pipelines=None, batch_size=1, threads=8, mode='BGR')[source]

Initialize self. See help(type(self)) for accurate signature.

build_processor()[source]

Build processor to process loaded input. If you need custom preprocessing ops, you need to reimplement it.

process_single(input)[source]

Process single input sample. If you need custom ops to load or process a single input sample, you need to reimplement it.

class easycv.predictors.base.OutputProcessor[source]

Bases: object

Base output processor for processing model outputs.

__init__()[source]

Initialize self. See help(type(self)) for accurate signature.

process_single(inputs)[source]

Process the outputs of a single sample. If you need to add some processing ops, you need to reimplement it.

class easycv.predictors.base.PredictorV2(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, input_processor_threads=8, mode='BGR')[source]

Bases: object

Base predict pipeline.

Parameters
  • model_path (str) – Path of the model file.

  • config_file (Optional[str]) – config file path for model and processor to init. Defaults to None.

  • batch_size (int) – batch size for forward.

  • device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device; if None, detect device automatically.

  • save_results (bool) – Whether to save predict results.

  • save_path (str) – File path for saving results, only valid when save_results is True.

  • pipelines (list[dict]) – Data pipeline configs.

  • input_processor_threads (int) – Number of processes to process inputs.

  • mode (str) – The image mode into the model.

__init__(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, input_processor_threads=8, mode='BGR')[source]

Initialize self. See help(type(self)) for accurate signature.

get_input_processor()[source]
get_output_processor()[source]
prepare_model()[source]

Build model from config file by default. If the model is not loaded from a configuration file, e.g. torch jit model, you need to reimplement it.

model_forward(inputs)[source]

Model forward. If you need refactor model forward, you need to reimplement it.

dump(obj, save_path, mode='wb')[source]

easycv.predictors.builder module

easycv.predictors.builder.build_predictor(cfg, default_args=None)[source]

easycv.predictors.classifier module

class easycv.predictors.classifier.ClsInputProcessor(cfg, pipelines=None, batch_size=1, pil_input=True, threads=8, mode='BGR')[source]

Bases: easycv.predictors.base.InputProcessor

Process inputs for classification models.

Parameters
  • cfg (Config) – Config instance.

  • pipelines (list[dict]) – Data pipeline configs.

  • batch_size (int) – batch size for forward.

  • pil_input (bool) – Whether to use PIL images. Set to True if the processor needs PIL input. Default: True.

  • threads (int) – Number of processes to process inputs.

  • mode (str) – The image mode into the model.

__init__(cfg, pipelines=None, batch_size=1, pil_input=True, threads=8, mode='BGR')[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.predictors.classifier.ClsOutputProcessor(topk=1, label_map={})[source]

Bases: easycv.predictors.base.OutputProcessor

Output processor for processing classification model outputs.

Parameters
  • topk (int) – Return top-k results. Default: 1.

  • label_map (dict) – Dict of class id to class name.

__init__(topk=1, label_map={})[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.predictors.classifier.ClassificationPredictor(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, topk=1, pil_input=True, label_map_path=None, input_processor_threads=8, mode='BGR', *args, **kwargs)[source]

Bases: easycv.predictors.base.PredictorV2

Predictor for classification.

Parameters
  • model_path (str) – Path of the model file.

  • config_file (Optional[str]) – config file path for model and processor to init. Defaults to None.

  • batch_size (int) – batch size for forward.

  • device (str) – Support ‘cuda’ or ‘cpu’; if None, detect device automatically.

  • save_results (bool) – Whether to save predict results.

  • save_path (str) – File path for saving results, only valid when save_results is True.

  • pipelines (list[dict]) – Data pipeline configs.

  • topk (int) – Return top-k results. Default: 1.

  • pil_input (bool) – Whether to use PIL images. Set to True if the processor needs PIL input. Default: True.

  • label_map_path (str) – File path of the saved labels list.

  • input_processor_threads (int) – Number of processes to process inputs.

  • mode (str) – The image mode into the model.

__init__(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, topk=1, pil_input=True, label_map_path=None, input_processor_threads=8, mode='BGR', *args, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

get_input_processor()[source]
get_output_processor()[source]
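A hedged usage sketch; the file paths are placeholders and the call assumes the predictor instance is callable on a list of image files, as in the EasyCV prediction tutorials (check the tutorial for the exact entry point of your version):

# from easycv.predictors.classifier import ClassificationPredictor
#
# predictor = ClassificationPredictor(
#     model_path='work_dirs/cls/epoch_100_export.pt',  # placeholder export file
#     label_map_path='data/label_map.txt',             # placeholder label list
#     topk=1,
# )
# results = predictor(['demo/cat.jpg'])  # assumed callable interface
# print(results[0])                      # e.g. top-1 class name and score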

easycv.predictors.detector module

easycv.predictors.detector.onnx_to_numpy(tensor)[source]
class easycv.predictors.detector.DetInputProcessor(cfg, pipelines=None, batch_size=1, threads=8, mode='BGR')[source]

Bases: easycv.predictors.base.InputProcessor

build_processor()[source]

Build processor to process loaded input. If you need custom preprocessing ops, you need to reimplement it.

class easycv.predictors.detector.DetOutputProcessor(score_thresh, classes=None)[source]

Bases: easycv.predictors.base.OutputProcessor

__init__(score_thresh, classes=None)[source]

Initialize self. See help(type(self)) for accurate signature.

process_single(inputs)[source]

Process the outputs of a single sample. If you need to add some processing ops, you need to reimplement it.

class easycv.predictors.detector.DetectionPredictor(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, score_threshold=0.5, input_processor_threads=8, mode='BGR', *arg, **kwargs)[source]

Bases: easycv.predictors.base.PredictorV2

Generic Detection Predictor; it will filter bbox results by score_threshold.

Parameters
  • model_path (str) – Path of the model file.

  • config_file (Optional[str]) – config file path for model and processor to init. Defaults to None.

  • batch_size (int) – batch size for forward.

  • device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device, if is None, detect device automatically.

  • save_results (bool) – Whether to save predict results.

  • save_path (str) – File path for saving results, only valid when save_results is True.

  • pipelines (list[dict]) – Data pipeline configs.

  • input_processor_threads (int) – Number of processes to process inputs.

  • mode (str) – The image mode into the model.

__init__(model_path, config_file=None, batch_size=1, device=None, save_results=False, save_path=None, pipelines=None, score_threshold=0.5, input_processor_threads=8, mode='BGR', *arg, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

get_input_processor()[source]
get_output_processor()[source]
visualize(img, results, show=False, out_file=None)[source]

Only supports showing one sample for now.
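A hedged usage sketch along the same lines; paths are placeholders and the callable interface is assumed:

# from easycv.predictors.detector import DetectionPredictor
#
# predictor = DetectionPredictor(
#     model_path='work_dirs/det/epoch_300_export.pt',  # placeholder export file
#     score_threshold=0.5,
# )
# results = predictor(['demo/street.jpg'])             # assumed callable interface
# predictor.visualize('demo/street.jpg', results[0], out_file='vis.jpg')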

class easycv.predictors.detector.YoloXInputProcessor(cfg, pipelines=None, batch_size=1, model_type='raw', jit_processor_path=None, device=None, threads=8, mode='BGR')[source]

Bases: easycv.predictors.detector.DetInputProcessor

Input processor for yolox.

Parameters
  • cfg (Config) – Config instance.

  • pipelines (list[dict]) – Data pipeline configs.

  • batch_size (int) – batch size for forward.

  • model_type (str) – “raw” or “jit” or “blade”

  • jit_processor_path (str) – File of the saved processing operator of torch jit type.

  • device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device, if is None, detect device automatically.

  • threads (int) – Number of processes to process inputs.

  • mode (str) – The image mode into the model.

__init__(cfg, pipelines=None, batch_size=1, model_type='raw', jit_processor_path=None, device=None, threads=8, mode='BGR')[source]

Initialize self. See help(type(self)) for accurate signature.

build_processor()[source]

Build processor to process loaded input. If you need custom preprocessing ops, you need to reimplement it.

class easycv.predictors.detector.YoloXOutputProcessor(score_thresh=0.5, model_type='raw', test_conf=0.01, nms_thre=0.65, use_trt_efficientnms=False, classes=None)[source]

Bases: easycv.predictors.detector.DetOutputProcessor

__init__(score_thresh=0.5, model_type='raw', test_conf=0.01, nms_thre=0.65, use_trt_efficientnms=False, classes=None)[source]

Initialize self. See help(type(self)) for accurate signature.

post_assign(outputs, img_metas)[source]
process_single(inputs)[source]

Process the outputs of a single sample. If you need to add some processing ops, you need to reimplement it.

class easycv.predictors.detector.YoloXPredictor(model_path, config_file=None, batch_size=1, use_trt_efficientnms=False, device=None, save_results=False, save_path=None, pipelines=None, max_det=100, score_thresh=0.5, nms_thresh=None, test_conf=None, input_processor_threads=8, mode='BGR', model_type=None)[source]

Bases: easycv.predictors.detector.DetectionPredictor

Detection predictor for Yolox.

Parameters
  • model_path (str) – Path of the model file.

  • config_file (Optional[str]) – config file path for model and processor to init. Defaults to None.

  • batch_size (int) – batch size for forward.

  • use_trt_efficientnms (bool) – Whether the saved model uses the tensorrt efficient nms operation.

  • device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device, if is None, detect device automatically.

  • save_results (bool) – Whether to save predict results.

  • save_path (str) – File path for saving results, only valid when save_results is True.

  • pipelines (list[dict]) – Data pipeline configs.

  • max_det (int) – Maximum number of detection output boxes.

  • score_thresh (float) – Score threshold to filter box.

  • nms_thresh (float) – Nms threshold to filter box.

  • input_processor_threads (int) – Number of processes to process inputs.

  • mode (str) – The image mode into the model.

__init__(model_path, config_file=None, batch_size=1, use_trt_efficientnms=False, device=None, save_results=False, save_path=None, pipelines=None, max_det=100, score_thresh=0.5, nms_thresh=None, test_conf=None, input_processor_threads=8, mode='BGR', model_type=None)[source]

Initialize self. See help(type(self)) for accurate signature.

prepare_model()[source]

Build model from config file by default. If the model is not loaded from a configuration file, e.g. torch jit model, you need to reimplement it.

model_forward(inputs)[source]

Model forward. If you need refactor model forward, you need to reimplement it.

get_input_processor()[source]
get_output_processor()[source]
class easycv.predictors.detector.TorchFaceDetector(model_path=None, model_config=None)[source]

Bases: easycv.predictors.interface.PredictorInterface

__init__(model_path=None, model_config=None)[source]

Init the model; adds face detection and alignment for image input.

Parameters
  • model_path – model file path

  • model_config – config string for model to init, in json format

get_output_type()[source]

In this function the user should return a type dict indicating the type each output of the predictor should be converted to:

  • type json – data will be serialized to a json str

  • type image – data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} is empty

  • type video – data will be converted to an encoded video binary and written to an oss file

For example, returning {‘image’: ‘image’, ‘feature’: ‘json’} indicates that the image data in the output dict will be saved to an image file and the feature will be converted to json.

batch(image_tensor_list)[source]
predict(input_data_list, batch_size=- 1, threshold=0.95)[source]

Use the session to run prediction over a number of samples using batch_size.

Parameters
  • input_data_list – a list of numpy array, each array is a sample to be predicted

  • batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime

Returns

a list of dict, each dict is the prediction result of one sample

e.g. {“output1”: value1, “output2”: value2}; the value type can be python int, str, float or numpy array

Return type

result

Raises

if the number of detected faces in an image is not 1, nothing is done for this image

class easycv.predictors.detector.TorchYoloXClassifierPredictor(models_root_dir, max_det=100, cls_score_thresh=0.01, det_model_config=None, cls_model_config=None)[source]

Bases: easycv.predictors.interface.PredictorInterface

__init__(models_root_dir, max_det=100, cls_score_thresh=0.01, det_model_config=None, cls_model_config=None)[source]

Init the model; adds a yolox predictor and a classification predictor for image input.

Parameters
  • models_root_dir – root directory containing the detection model (models_root_dir/detection/*.pth) and the classification model (models_root_dir/classification/*.pth)

  • det_model_config – config string for detection model to init, in json format

  • cls_model_config – config string for classification model to init, in json format

predict(input_data_list, batch_size=- 1)[source]

Use the session to run prediction over a number of samples using batch_size.

Parameters
  • input_data_list – a list of numpy array(in rgb order), each array is a sample to be predicted

  • batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime

Returns

a list of dict, each dict is the prediction result of one sample

e.g. {“output1”: value1, “output2”: value2}; the value type can be python int, str, float or numpy array

Return type

result

easycv.predictors.feature_extractor module

class easycv.predictors.feature_extractor.TorchFeatureExtractor(model_path, model_config=None)[source]

Bases: easycv.predictors.interface.PredictorInterface

__init__(model_path, model_config=None)[source]

init model

Parameters
  • model_path – model file path

  • model_config – config string for model to init, in json format

get_output_type()[source]

In this function the user should return a type dict indicating the type each output of the predictor should be converted to:

  • type json – data will be serialized to a json str

  • type image – data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} is empty

  • type video – data will be converted to an encoded video binary and written to an oss file

For example, returning {‘image’: ‘image’, ‘feature’: ‘json’} indicates that the image data in the output dict will be saved to an image file and the feature will be converted to json.

batch(image_tensor_list)[source]
predict(input_data_list, batch_size=- 1)[source]

Use the session to run prediction over a number of samples using batch_size.

Parameters
  • input_data_list – a list of numpy array, each array is a sample to be predicted

  • batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime

Returns

a list of dict, each dict is the prediction result of one sample

e.g. {“output1”: value1, “output2”: value2}; the value type can be python int, str, float or numpy array

Return type

result

class easycv.predictors.feature_extractor.TorchFaceFeatureExtractor(model_path, model_config=None)[source]

Bases: easycv.predictors.interface.PredictorInterface

__init__(model_path, model_config=None)[source]

Init the model; adds face detection and alignment for image input.

Parameters
  • model_path – model file path

  • model_config – config string for model to init, in json format

get_output_type()[source]

In this function the user should return a type dict indicating the type each output of the predictor should be converted to:

  • type json – data will be serialized to a json str

  • type image – data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} is empty

  • type video – data will be converted to an encoded video binary and written to an oss file

For example, returning {‘image’: ‘image’, ‘feature’: ‘json’} indicates that the image data in the output dict will be saved to an image file and the feature will be converted to json.

batch(image_tensor_list)[source]
predict(input_data_list, batch_size=- 1, detect_and_align=True)[source]

Use the session to run prediction over a number of samples using batch_size.

Parameters
  • input_data_list – a list of numpy array or PIL.Image, each array is a sample to be predicted

  • batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime

  • detect_and_align – True to detect and align before feature extractor

Returns

a list of dict, each dict is the prediction result of one sample

e.g. {“output1”: value1, “output2”: value2}; the value type can be python int, str, float or numpy array

Return type

result

Raises

if the number of detected faces in an image is not 1, nothing is done for this image

class easycv.predictors.feature_extractor.TorchMultiFaceFeatureExtractor(model_path, model_config=None)[source]

Bases: easycv.predictors.interface.PredictorInterface

__init__(model_path, model_config=None)[source]

Init the model; adds face detection and alignment for image input.

Parameters
  • model_path – model file path

  • model_config – config string for model to init, in json format

get_output_type()[source]

In this function the user should return a type dict indicating the type each output of the predictor should be converted to:

  • type json – data will be serialized to a json str

  • type image – data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} is empty

  • type video – data will be converted to an encoded video binary and written to an oss file

For example, returning {‘image’: ‘image’, ‘feature’: ‘json’} indicates that the image data in the output dict will be saved to an image file and the feature will be converted to json.

batch(image_tensor_list)[source]
predict(input_data_list, batch_size=- 1, detect_and_align=True)[source]

Use the session to run prediction over a number of samples using batch_size.

Parameters
  • input_data_list – a list of numpy array or PIL.Image, each array is a sample to be predicted

  • batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime

  • detect_and_align – True to detect and align before feature extractor

Returns

a list of dict, each dict is the prediction result of one sample

e.g. {“output1”: value1, “output2”: value2}; the value type can be python int, str, float or numpy array

Return type

result

Raises

if the number of detected faces in an image is not 1, nothing is done for this image

class easycv.predictors.feature_extractor.TorchFaceAttrExtractor(model_path, model_config=None, face_threshold=0.95, attr_method=['distribute_sum', 'softmax', 'softmax'], attr_name=['age', 'gender', 'emo'])[source]

Bases: easycv.predictors.interface.PredictorInterface

__init__(model_path, model_config=None, face_threshold=0.95, attr_method=['distribute_sum', 'softmax', 'softmax'], attr_name=['age', 'gender', 'emo'])[source]

init model

Parameters
  • model_path – model file path

  • model_config – config string for model to init, in json format

  • attr_method

    • softmax: do softmax for feature_dim 1

    • distribute_sum: do softmax and prob sum

get_output_type()[source]

In this function the user should return a type dict indicating the type each output of the predictor should be converted to:

  • type json – data will be serialized to a json str

  • type image – data will be converted to an encoded image binary and written to an oss file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the url and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} is empty

  • type video – data will be converted to an encoded video binary and written to an oss file

For example, returning {‘image’: ‘image’, ‘feature’: ‘json’} indicates that the image data in the output dict will be saved to an image file and the feature will be converted to json.

batch(image_tensor_list)[source]
predict(input_data_list, batch_size=- 1)[source]

Use the session to run prediction over a number of samples using batch_size.

Parameters
  • input_data_list – a list of numpy array, each array is a sample to be predicted

  • batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime

Returns

a list of dict, each dict is the prediction result of one sample

e.g. {“output1”: value1, “output2”: value2}; the value type can be python int, str, float or numpy array

Return type

result

easycv.predictors.interface module

class easycv.predictors.interface.PredictorInterface(model_path, model_config=None)[source]

Bases: object

version = 1
__init__(model_path, model_config=None)[source]

init model

Parameters
  • model_path – init model from this directory

  • model_config – config string for model to init, in json format

abstract predict(input_data, batch_size)[source]

Use the session to run prediction over a number of samples using batch_size.

Parameters
  • input_data – a list of numpy array, each array is a sample to be predicted

  • batch_size – batch_size passed by the caller, you can also ignore this param and use a fixed number if you do not want to adjust batch_size in runtime

Returns

a list of dict, each dict is the prediction result of one sample

e.g. {“output1”: value1, “output2”: value2}; the value type can be python int, str, float or numpy array

Return type

result

get_output_type()[source]

In this function the user should return a type dict indicating what type each output of the predictor should be converted to:

  • type json: data will be serialized to a JSON string

  • type image: data will be encoded as image binary and written to an OSS file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the URL and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} is empty

  • type video: data will be encoded as video binary and written to an OSS file

For example, returning {‘image’: ‘image’, ‘feature’: ‘json’} indicates that the image data in the output dict will be saved to an image file and the feature will be converted to JSON.
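
A minimal sketch of a custom predictor built on this interface; the model loading and inference logic are placeholders, and only the method signatures follow the documentation above:

    import numpy as np
    from easycv.predictors.interface import PredictorInterface

    class MyFeaturePredictor(PredictorInterface):

        def __init__(self, model_path, model_config=None):
            super().__init__(model_path, model_config)
            # load the model from model_path here (placeholder)

        def predict(self, input_data, batch_size=-1):
            # return one result dict per input sample
            return [{'feature': np.zeros(128, dtype=np.float32)} for _ in input_data]

        def get_output_type(self):
            # 'feature' outputs are serialized to JSON when results are written out
            return {'feature': 'json'}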

class easycv.predictors.interface.PredictorInterfaceV2(model_path, model_config=None)[source]

Bases: easycv.predictors.interface.PredictorInterface

version = 2
__init__(model_path, model_config=None)[source]

init model

Parameters
  • model_path – init model from this directory

  • model_config – config string for model to init, in json format

get_output_type()[source]

In this function the user should return a type dict indicating what type each output of the predictor should be converted to:

  • type json: data will be serialized to a JSON string

  • type image: data will be encoded as image binary and written to an OSS file named output_dir/${key}/${input_filename}_${idx}.jpg, where input_filename is the base filename extracted from the URL and key corresponds to the key in the output_type dict; if the data indexed by key is a list, idx is the index of the element in the list, otherwise ${idx} is empty

  • type video: data will be encoded as video binary and written to an OSS file

For example, returning {‘image’: ‘image’, ‘feature’: ‘json’} indicates that the image data in the output dict will be saved to an image file and the feature will be converted to JSON.

abstract predict(input_data_dict_list, batch_size)[source]

Run the model to predict a number of samples, batched by batch_size.

Parameters
  • input_data_dict_list – a list of dict, each dict is a sample data to be predicted

  • batch_size – batch size passed by the caller; you can also ignore this parameter and use a fixed number if you do not want to adjust the batch size at runtime

Returns

a list of dict, each dict is the prediction result of one sample

e.g. {“output1”: value1, “output2”: value2}; the value type can be Python int, str, float or numpy array

Return type

result

easycv.predictors.pose_predictor module

easycv.predictors.pose_predictor.vis_pose_result(model, img, result, radius=4, thickness=1, kpt_score_thr=0.3, bbox_color='green', dataset_info=None, out_file=None, pose_kpt_color=None, pose_link_color=None, text_color='white', font_scale=0.5, bbox_thickness=1, win_name='', show=False, wait_time=0)[source]

Visualize the detection results on the image.

Parameters
  • model (nn.Module) – The loaded detector.

  • img (str | np.ndarray) – Image filename or loaded image.

  • result (list[dict]) – The results to draw over img (bbox_result, pose_result).

  • radius (int) – Radius of circles.

  • thickness (int) – Thickness of lines.

  • kpt_score_thr (float) – The threshold to visualize the keypoints.

  • skeleton (list[tuple()]) – Default None.

  • out_file (str or None) – The filename of the output visualization image. Default: None.

  • show (bool) – Whether to show the image. Default: False.

  • wait_time (int) – Value of waitKey param. Default: 0.


class easycv.predictors.pose_predictor.PoseTopDownInputProcessor(cfg, dataset_info, detection_predictor_config, bbox_thr=None, pipelines=None, batch_size=1, cat_id=None, mode='BGR')[source]

Bases: easycv.predictors.base.InputProcessor

__init__(cfg, dataset_info, detection_predictor_config, bbox_thr=None, pipelines=None, batch_size=1, cat_id=None, mode='BGR')[source]

Initialize self. See help(type(self)) for accurate signature.

get_detection_outputs(input, cat_id=None)[source]
process_single(input)[source]

Process single input sample. If you need custom ops to load or process a single input sample, you need to reimplement it.

class easycv.predictors.pose_predictor.PoseTopDownOutputProcessor[source]

Bases: easycv.predictors.base.OutputProcessor

class easycv.predictors.pose_predictor.PoseTopDownPredictor(model_path, config_file=None, detection_predictor_config=None, batch_size=1, bbox_thr=None, cat_id=None, device=None, pipelines=None, save_results=False, save_path=None, mode='BGR', model_type=None, *args, **kwargs)[source]

Bases: easycv.predictors.base.PredictorV2

Pose topdown predictor.

Parameters
  • model_path (str) – Path of the model file.

  • config_file (Optional[str]) – Config file path for the model and processor to init. Defaults to None.

  • detection_predictor_config (dict) – Dict of the person detection model predictor config, e.g. dict(type="", model_path="", config_file="", ...).

  • batch_size (int) – Batch size for forward.

  • bbox_thr (float) – Bounding box threshold to filter output results of detection model

  • cat_id (int | str) – Category id or name to filter target objects.

  • device (str | torch.device) – Support str(‘cuda’ or ‘cpu’) or torch.device, if is None, detect device automatically.

  • save_results (bool) – Whether to save predict results.

  • save_path (str) – File path for saving results, only valid when save_results is True.

  • pipelines (list[dict]) – Data pipeline configs.

  • mode (str) – The image mode into the model.

__init__(model_path, config_file=None, detection_predictor_config=None, batch_size=1, bbox_thr=None, cat_id=None, device=None, pipelines=None, save_results=False, save_path=None, mode='BGR', model_type=None, *args, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

prepare_model()[source]

Build model from config file by default. If the model is not loaded from a configuration file, e.g. torch jit model, you need to reimplement it.

model_forward(inputs, return_heatmap=False)[source]

Model forward. If you need refactor model forward, you need to reimplement it.

get_input_processor()[source]
get_output_processor()[source]
show_result(image, keypoints, radius=4, thickness=3, kpt_score_thr=0.3, bbox_color='green', show=False, save_path=None)[source]
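
A hypothetical end-to-end sketch for the top-down pose predictor; the model paths and the detector predictor config values are assumptions, not shipped files:

    from easycv.predictors.pose_predictor import PoseTopDownPredictor

    detection_cfg = dict(
        type='DetectionPredictor',           # assumed person-detector predictor type
        model_path='person_det/model.pth',   # assumed path
        config_file='person_det/config.py')  # assumed path

    pose_predictor = PoseTopDownPredictor(
        model_path='pose/model.pth',         # assumed path
        config_file='pose/config.py',        # assumed path
        detection_predictor_config=detection_cfg,
        bbox_thr=0.3)

    # PredictorV2-style predictors are typically called on a list of image paths or arrays
    results = pose_predictor(['demo.jpg'])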

easycv.core package

Subpackages

easycv.core.evaluation package

Subpackages
easycv.core.evaluation.custom_cocotools package
Submodules
easycv.core.evaluation.custom_cocotools.cocoeval module
class easycv.core.evaluation.custom_cocotools.cocoeval.COCOeval(cocoGt=None, cocoDt=None, iouType='segm', sigmas=None)[source]

Bases: object

__init__(cocoGt=None, cocoDt=None, iouType='segm', sigmas=None)[source]

Initialize CocoEval using coco APIs for gt and dt.

Parameters
  • cocoGt – coco object with ground truth annotations

  • cocoDt – coco object with detection results

  • iouType – type of iou to be computed, bbox for detection task, segm for segmentation task

  • sigmas – keypoint labelling sigmas

Returns

None

evaluate()[source]

Run per-image evaluation on the given images and store the results (a list of dicts) in self.evalImgs. Returns None.

computeIoU(imgId, catId)[source]
computeOks(imgId, catId)[source]
evaluateImg(imgId, catId, aRng, maxDet)[source]

Perform evaluation for a single category and image.

Parameters
  • imgId – image id, string

  • catId – category id, string

  • aRng – area range, tuple

  • maxDet – maximum detection number

Returns

dict (single image results)

accumulate(p=None)[source]

Accumulate per-image evaluation results and store the result in self.eval.

Parameters
  • p – input params for evaluation

Returns

None

summarize()[source]

Compute and display summary metrics for evaluation results. Note that this function can only be applied with the default parameter settings.

summarize_per_category()[source]

Compute and display summary metrics for evaluation results per category. Note that this function can only be applied with the default parameter settings.

filter_annotations(annotations, catIds)[source]
makeplot(recThrs, precisions, name, save_dir=None)[source]
analyze()[source]

Analyze errors
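
The methods above follow the standard pycocotools workflow; a sketch with assumed annotation and result file names:

    from xtcocotools.coco import COCO
    from easycv.core.evaluation.custom_cocotools.cocoeval import COCOeval

    coco_gt = COCO('annotations/instances_val2017.json')   # assumed ground-truth file
    coco_dt = coco_gt.loadRes('detections_val2017.json')   # assumed detection result file

    coco_eval = COCOeval(cocoGt=coco_gt, cocoDt=coco_dt, iouType='bbox')
    coco_eval.evaluate()    # per-image evaluation, stored in coco_eval.evalImgs
    coco_eval.accumulate()  # accumulate per-image results into coco_eval.eval
    coco_eval.summarize()   # print the summary metrics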

class easycv.core.evaluation.custom_cocotools.cocoeval.Params(iouType='segm')[source]

Bases: object

Params for coco evaluation api

setDetParams()[source]
setKpParams()[source]
__init__(iouType='segm')[source]

Initialize self. See help(type(self)) for accurate signature.

Submodules
easycv.core.evaluation.ap module
easycv.core.evaluation.auc_eval module
class easycv.core.evaluation.auc_eval.AucEvaluator(dataset_name=None, metric_names=['neck_auc'], neck_num=None)[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

AUC evaluator for binary classification only.

__init__(dataset_name=None, metric_names=['neck_auc'], neck_num=None)[source]
Parameters
  • dataset_name – eval dataset name

  • metric_names – eval metrics name

  • neck_num – some models contain multiple necks to support multitask learning; neck_num specifies which neck output of the model to use for evaluation

easycv.core.evaluation.base_evaluator module
class easycv.core.evaluation.base_evaluator.Evaluator(dataset_name=None, metric_names=[])[source]

Bases: object

Evaluator interface

__init__(dataset_name=None, metric_names=[])[source]

Construct eval ops from tensor

Parameters
  • dataset_name (str) – dataset name to be evaluated

  • metric_names (List[str]) – metric names this evaluator will return

evaluate(prediction_dict, groundtruth_dict, **kwargs)[source]
property metric_names
easycv.core.evaluation.builder module
easycv.core.evaluation.builder.build_evaluator(evaluator_cfg_list)[source]

build evaluator according to metric name

Parameters

evaluator_cfg_list – list of evaluator config dict

Returns

return a list of evaluator
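
A sketch of building evaluators from a config list; the config key 'type' follows the usual EasyCV registry convention of naming the evaluator class, and the remaining values are illustrative:

    from easycv.core.evaluation.builder import build_evaluator

    evaluator_cfg_list = [
        dict(type='ClsEvaluator', topk=(1, 5), dataset_name='imagenet'),
    ]
    evaluators = build_evaluator(evaluator_cfg_list)  # returns a list of evaluator objects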

easycv.core.evaluation.classification_eval module
class easycv.core.evaluation.classification_eval.ClsEvaluator(topk=(1, 5), dataset_name=None, metric_names=['neck_top1'], neck_num=None, class_list=None)[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

Classification evaluator.

__init__(topk=(1, 5), dataset_name=None, metric_names=['neck_top1'], neck_num=None, class_list=None)[source]
Parameters
  • top_k (int, tuple) – int or tuple of int, evaluate top_k acc

  • dataset_name – eval dataset name

  • metric_names – eval metrics name

  • neck_num – some models contain multiple necks to support multitask learning; neck_num specifies which neck output of the model to use for evaluation

class easycv.core.evaluation.classification_eval.MultiLabelEvaluator(dataset_name=None, metric_names=['mAP'])[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

Multilabel Classification evaluator.

__init__(dataset_name=None, metric_names=['mAP'])[source]
Parameters
  • dataset_name – eval dataset name

  • metric_names – eval metrics name

mAP(pred, target)[source]

Calculate the mean average precision with respect to classes.

Parameters
  • pred – The model prediction with shape (N, C), where C is the number of classes.

  • target (torch.Tensor | np.ndarray) – The target of each prediction with shape (N, C), where C is the number of classes. 1 stands for positive examples, 0 stands for negative examples and -1 stands for difficult examples.

Returns

A single float as mAP value.

Return type

float

average_precision(pred, target)[source]

Calculate the average precision for a single class. AP summarizes a precision-recall curve as the weighted mean of maximum precisions obtained for any r’ > r, where r is the recall:

    AP = \sum_n (R_n - R_{n-1}) P_n

Note that no approximation is involved since the curve is piecewise constant.

Parameters
  • pred (np.ndarray) – The model prediction with shape (N, ).

  • target (np.ndarray) – The target of each prediction with shape (N, ).

Returns

a single float as average precision value.

Return type

float
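
For reference, the formula above can be sketched in a few lines of numpy; this is an illustrative reimplementation, not the library code:

    import numpy as np

    def average_precision_sketch(pred, target):
        """pred: (N,) scores; target: (N,) binary labels. AP = sum_n (R_n - R_{n-1}) * P_n."""
        order = np.argsort(-pred)                         # sort samples by descending score
        hits = target[order] == 1
        cum_tp = np.cumsum(hits)
        precision = cum_tp / np.arange(1, len(hits) + 1)  # P_n
        recall = cum_tp / max(hits.sum(), 1)              # R_n
        # weighted sum of precisions by recall increments
        return float(np.sum(np.diff(np.r_[0.0, recall]) * precision))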

easycv.core.evaluation.coco_evaluation module

Class for evaluating object detections with COCO metrics.

class easycv.core.evaluation.coco_evaluation.CocoDetectionEvaluator(classes, include_metrics_per_category=False, all_metrics_per_category=False, coco_analyze=False, dataset_name=None, metric_names=['DetectionBoxes_Precision/mAP'])[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

Class to evaluate COCO detection metrics.

__init__(classes, include_metrics_per_category=False, all_metrics_per_category=False, coco_analyze=False, dataset_name=None, metric_names=['DetectionBoxes_Precision/mAP'])[source]

Constructor.

Parameters
  • classes – a list of class name

  • include_metrics_per_category – If True, include metrics for each category.

  • all_metrics_per_category – Whether to include all the summary metrics for each category in per_category_ap. Be careful with setting it to true if you have more than a handful of categories, because it will pollute your mldash.

  • coco_analyze – If True, will analyze the detection result using coco analysis.

  • dataset_name – If not None, dataset_name will be inserted to each metric name.

clear()[source]

Clears the state to prepare for a fresh evaluation.

add_single_ground_truth_image_info(image_id, groundtruth_dict)[source]

Adds groundtruth for a single image to be used for evaluation.

If the image has already been added, a warning is logged, and groundtruth is ignored.

Parameters
  • image_id – A unique string/integer identifier for the image.

  • groundtruth_dict – A dictionary containing:

    InputDataFields.groundtruth_boxes: float32 numpy array of shape [num_boxes, 4] containing num_boxes groundtruth boxes of the format [ymin, xmin, ymax, xmax] in absolute image coordinates.

    InputDataFields.groundtruth_classes: integer numpy array of shape [num_boxes] containing 1-indexed groundtruth classes for the boxes.

    InputDataFields.groundtruth_is_crowd (optional): integer numpy array of shape [num_boxes] containing iscrowd flags for the groundtruth boxes.

add_single_detected_image_info(image_id, detections_dict)[source]

Adds detections for a single image to be used for evaluation.

If a detection has already been added for this image id, a warning is logged, and the detection is skipped.

Parameters
  • image_id – A unique string/integer identifier for the image.

  • detections_dict – A dictionary containing:

    DetectionResultFields.detection_boxes: float32 numpy array of shape [num_boxes, 4] containing num_boxes detection boxes of the format [ymin, xmin, ymax, xmax] in absolute image coordinates.

    DetectionResultFields.detection_scores: float32 numpy array of shape [num_boxes] containing detection scores for the boxes.

    DetectionResultFields.detection_classes: integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.

Raises

ValueError – If groundtruth for the image_id is not available.

class easycv.core.evaluation.coco_evaluation.CocoMaskEvaluator(classes, include_metrics_per_category=False, dataset_name=None, metric_names=['DetectionMasks_Precision/mAP'])[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

Class to evaluate COCO detection metrics.

__init__(classes, include_metrics_per_category=False, dataset_name=None, metric_names=['DetectionMasks_Precision/mAP'])[source]

Constructor.

Parameters
  • categories – A list of dicts, each of which has the following keys: ‘id’ (required), an integer id uniquely identifying this category; ‘name’ (required), a string representing the category name, e.g. ‘cat’, ‘dog’.

  • include_metrics_per_category – If True, include metrics for each category.

clear()[source]

Clears the state to prepare for a fresh evaluation.

add_single_ground_truth_image_info(image_id, groundtruth_dict)[source]

Adds groundtruth for a single image to be used for evaluation.

If the image has already been added, a warning is logged, and groundtruth is ignored.

Parameters
  • image_id – A unique string/integer identifier for the image.

  • groundtruth_dict – A dictionary containing:

    InputDataFields.groundtruth_boxes: float32 numpy array of shape [num_boxes, 4] containing num_boxes groundtruth boxes of the format [ymin, xmin, ymax, xmax] in absolute image coordinates.

    InputDataFields.groundtruth_classes: integer numpy array of shape [num_boxes] containing 1-indexed groundtruth classes for the boxes.

    InputDataFields.groundtruth_instance_masks: uint8 numpy array of shape [num_boxes, image_height, image_width] containing groundtruth masks corresponding to the boxes. The elements of the array must be in {0, 1}.

add_single_detected_image_info(image_id, detections_dict)[source]

Adds detections for a single image to be used for evaluation.

If a detection has already been added for this image id, a warning is logged, and the detection is skipped.

Parameters
  • image_id – A unique string/integer identifier for the image.

  • detections_dict – A dictionary containing:

    DetectionResultFields.detection_scores: float32 numpy array of shape [num_boxes] containing detection scores for the boxes.

    DetectionResultFields.detection_classes: integer numpy array of shape [num_boxes] containing 1-indexed detection classes for the boxes.

    DetectionResultFields.detection_masks: optional uint8 numpy array of shape [num_boxes, image_height, image_width] containing instance masks corresponding to the boxes. The elements of the array must be in {0, 1}.

Raises

ValueError – If groundtruth for the image_id is not available or if spatial shapes of groundtruth_instance_masks and detection_masks are incompatible.

class easycv.core.evaluation.coco_evaluation.CoCoPoseTopDownEvaluator(dataset_name=None, metric_names=['AP'], **kwargs)[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

Class to evaluate COCO keypoint topdown metrics.

__init__(dataset_name=None, metric_names=['AP'], **kwargs)[source]

Construct eval ops from tensor

Parameters
  • dataset_name (str) – dataset name to be evaluated

  • metric_names (List[str]) – metric names this evaluator will return

class easycv.core.evaluation.coco_evaluation.CocoPanopticEvaluator(dataset_name=None, metric_names=['PQ'], classes=None, file_client_args={'backend': 'disk'}, **kwargs)[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

Class to evaluate COCO panoptic metrics.

__init__(dataset_name=None, metric_names=['PQ'], classes=None, file_client_args={'backend': 'disk'}, **kwargs)[source]

Construct eval ops from tensor

Parameters
  • dataset_name (str) – dataset name to be evaluated

  • metric_names (List[str]) – metric names this evaluator will return

evaluate(gt_json, gt_folder, pred_json, pred_folder, categories, nproc=32, classwise=False, **kwargs)[source]
parse_pq_results(pq_results)[source]

Parse the Panoptic Quality results.

easycv.core.evaluation.coco_evaluation.pq_compute_single_core(proc_id, annotation_set, gt_folder, pred_folder, categories, file_client=None, print_log=False)[source]

The single core function to evaluate the metric of Panoptic Segmentation.

Same as the function with the same name in panopticapi. Only the function to load the images is changed to use the file client.

Parameters
  • proc_id (int) – The id of the mini process.

  • gt_folder (str) – The path of the ground truth images.

  • pred_folder (str) – The path of the prediction images.

  • categories (str) – The categories of the dataset.

  • file_client (object) – The file client of the dataset. If None, the backend will be set to disk.

  • print_log (bool) – Whether to print the log. Defaults to False.

easycv.core.evaluation.coco_evaluation.pq_compute_multi_core(matched_annotations_list, gt_folder, pred_folder, categories, file_client=None, nproc=32)[source]

Evaluate the metrics of Panoptic Segmentation with multithreading.

Same as the function with the same name in panopticapi.

Parameters
  • matched_annotations_list (list) – The matched annotation list. Each element is a tuple of annotations of the same image with the format (gt_anns, pred_anns).

  • gt_folder (str) – The path of the ground truth images.

  • pred_folder (str) – The path of the prediction images.

  • categories (str) – The categories of the dataset.

  • file_client (object) – The file client of the dataset. If None, the backend will be set to disk.

  • nproc (int) – Number of processes for panoptic quality computing. Defaults to 32. When nproc exceeds the number of cpu cores, the number of cpu cores is used.

easycv.core.evaluation.coco_tools module

Wrappers for third party pycocotools to be used within object_detection.

Note that nothing in this file is tensorflow related and thus cannot be called directly as a slim metric, for example.

TODO(jonathanhuang): wrap as a slim metric in metrics.py

Usage example: given a set of images with ids in the list image_ids and corresponding lists of numpy arrays encoding groundtruth (boxes and classes) and detections (boxes, scores and classes), where elements of each list correspond to detections/annotations of a single image, then evaluation (in multi-class mode) can be invoked as follows:

    groundtruth_dict = coco_tools.ExportGroundtruthToCOCO(
        image_ids, groundtruth_boxes_list, groundtruth_classes_list,
        max_num_classes, output_path=None)

    detections_list = coco_tools.ExportDetectionsToCOCO(
        image_ids, detection_boxes_list, detection_scores_list,
        detection_classes_list, output_path=None)

    groundtruth = coco_tools.COCOWrapper(groundtruth_dict)
    detections = groundtruth.LoadAnnotations(detections_list)
    evaluator = coco_tools.COCOEvalWrapper(groundtruth, detections,
                                           agnostic_mode=False)
    metrics = evaluator.ComputeMetrics()

class easycv.core.evaluation.coco_tools.COCOWrapper(dataset, detection_type='bbox')[source]

Bases: xtcocotools.coco.COCO

Wrapper for the pycocotools COCO class.

__init__(dataset, detection_type='bbox')[source]

COCOWrapper constructor.

See http://mscoco.org/dataset/#format for a description of the format. By default, the coco.COCO class constructor reads from a JSON file. This function duplicates the same behavior but loads from a dictionary, allowing us to perform evaluation without writing to external storage.

Parameters
  • dataset – a dictionary holding bounding box annotations in the COCO format.

  • detection_type – type of detections being wrapped. Can be one of [‘bbox’, ‘segmentation’]

Raises

ValueError – if detection_type is unsupported.

LoadAnnotations(annotations)[source]

Load annotations dictionary into COCO datastructure.

See http://mscoco.org/dataset/#format for a description of the annotations format. As above, this function replicates the default behavior of the API but does not require writing to external storage.

Parameters

annotations – python list holding object detection results where each detection is encoded as a dict with required keys [‘image_id’, ‘category_id’, ‘score’] and one of [‘bbox’, ‘segmentation’] based on detection_type.

Returns

a coco.COCO datastructure holding object detection annotations results

Raises
  • ValueError – if annotations is not a list

  • ValueError – if annotations do not correspond to the images contained in self.

class easycv.core.evaluation.coco_tools.COCOEvalWrapper(groundtruth=None, detections=None, agnostic_mode=False, iou_type='bbox')[source]

Bases: easycv.core.evaluation.custom_cocotools.cocoeval.COCOeval

Wrapper for the pycocotools COCOeval class.

To evaluate, create two objects (groundtruth_dict and detections_list) using the conventions listed at http://mscoco.org/dataset/#format. Then call evaluation as follows:

    groundtruth = coco_tools.COCOWrapper(groundtruth_dict)
    detections = groundtruth.LoadAnnotations(detections_list)
    evaluator = coco_tools.COCOEvalWrapper(groundtruth, detections,
                                           agnostic_mode=False)
    metrics = evaluator.ComputeMetrics()

__init__(groundtruth=None, detections=None, agnostic_mode=False, iou_type='bbox')[source]

COCOEvalWrapper constructor.

Note that for the area-based metrics to be meaningful, detection and groundtruth boxes must be in image coordinates measured in pixels.

Parameters
  • groundtruth – a coco.COCO (or coco_tools.COCOWrapper) object holding groundtruth annotations

  • detections – a coco.COCO (or coco_tools.COCOWrapper) object holding detections

  • agnostic_mode – boolean (default: False). If True, evaluation ignores class labels, treating all detections as proposals.

  • iou_type – IOU type to use for evaluation. Supports bbox or segm.

GetCategory(category_id)[source]

Fetches dictionary holding category information given category id.

Parameters

category_id – integer id

Returns

dictionary holding ‘id’, ‘name’.

GetAgnosticMode()[source]

Returns true if COCO Eval is configured to evaluate in agnostic mode.

GetCategoryIdList()[source]

Returns list of valid category ids.

ComputeMetrics(include_metrics_per_category=False, all_metrics_per_category=False)[source]

Computes detection metrics.

Parameters
  • include_metrics_per_category – If True, will include metrics per category.

  • all_metrics_per_category – If true, include all the summary metrics for each category in per_category_ap. Be careful with setting it to true if you have more than a handful of categories, because it will pollute your mldash.

Returns

A tuple containing:

  1. summary_metrics: a dictionary holding:

    ‘Precision/mAP’: mean average precision over classes, averaged over IOU thresholds ranging from .5 to .95 with .05 increments

    ‘Precision/mAP@.50IOU’: mean average precision at 50% IOU

    ‘Precision/mAP@.75IOU’: mean average precision at 75% IOU

    ‘Precision/mAP (small)’: mean average precision for small objects (area < 32^2 pixels)

    ‘Precision/mAP (medium)’: mean average precision for medium sized objects (32^2 pixels < area < 96^2 pixels)

    ‘Precision/mAP (large)’: mean average precision for large objects (96^2 pixels < area < 10000^2 pixels)

    ‘Recall/AR@1’: average recall with 1 detection

    ‘Recall/AR@10’: average recall with 10 detections

    ‘Recall/AR@100’: average recall with 100 detections

    ‘Recall/AR@100 (small)’: average recall for small objects with 100 detections

    ‘Recall/AR@100 (medium)’: average recall for medium objects with 100 detections

    ‘Recall/AR@100 (large)’: average recall for large objects with 100 detections

  2. per_category_ap: a dictionary holding category specific results with keys of the form ‘Precision mAP ByCategory/category’ (without the supercategory part if no supercategories exist). For backward compatibility, ‘PerformanceByCategory’ is included in the output regardless of all_metrics_per_category. If evaluating in class-agnostic mode, per_category_ap is an empty dictionary.

Raises

ValueError – If category_stats does not exist.

Analyze()[source]

Analyze detection results.

Returns

A dictionary of analysis result images, where the key is the image name and the value is a [H, W, 3] numpy array representing the image content. You can refer to http://cocodataset.org/#detection-eval, section 4 “Analysis code”.

easycv.core.evaluation.coco_tools.ExportSingleImageGroundtruthToCoco(image_id, next_annotation_id, category_id_set, groundtruth_boxes, groundtruth_classes, groundtruth_masks=None, groundtruth_is_crowd=None, super_categories=None)[source]

Export groundtruth of a single image to COCO format.

This function converts groundtruth detection annotations represented as numpy arrays to dictionaries that can be ingested by the COCO evaluation API. Note that the image_ids provided here must match the ones given to ExportSingleImageDetectionsToCoco. We assume that boxes and classes are in correspondence - that is: groundtruth_boxes[i, :], and groundtruth_classes[i] are associated with the same groundtruth annotation.

In the exported result, “area” fields are always set to the area of the groundtruth bounding box.

Parameters
  • image_id – a unique image identifier either of type integer or string.

  • next_annotation_id – integer specifying the first id to use for the groundtruth annotations. All annotations are assigned a continuous integer id starting from this value.

  • category_id_set – A set of valid class ids. Groundtruth with classes not in category_id_set are dropped.

  • groundtruth_boxes – numpy array (float32) with shape [num_gt_boxes, 4]

  • groundtruth_classes – numpy array (int) with shape [num_gt_boxes]

  • groundtruth_masks – optional uint8 numpy array of shape [num_detections, image_height, image_width] containing detection_masks.

  • groundtruth_is_crowd – optional numpy array (int) with shape [num_gt_boxes] indicating whether groundtruth boxes are crowd.

  • super_categories – optional list of str indicating each box super category

Returns

a list of groundtruth annotations for a single image in the COCO format.

Raises

ValueError – if (1) groundtruth_boxes and groundtruth_classes do not have the right lengths or (2) if each of the elements inside these lists do not have the correct shapes or (3) if image_ids are not integers

easycv.core.evaluation.coco_tools.ExportGroundtruthToCOCO(image_ids, groundtruth_boxes, groundtruth_classes, categories, output_path=None)[source]

Export groundtruth detection annotations in numpy arrays to COCO API.

This function converts a set of groundtruth detection annotations represented as numpy arrays to dictionaries that can be ingested by the COCO API. Inputs to this function are three lists: image ids for each groundtruth image, groundtruth boxes for each image and groundtruth classes respectively. Note that the image_ids provided here must match the ones given to the ExportDetectionsToCOCO function in order for evaluation to work properly. We assume that for each image, boxes, scores and classes are in correspondence — that is: image_id[i], groundtruth_boxes[i, :] and groundtruth_classes[i] are associated with the same groundtruth annotation.

In the exported result, “area” fields are always set to the area of the groundtruth bounding box and “iscrowd” fields are always set to 0. TODO(jonathanhuang): pass in “iscrowd” array for evaluating on COCO dataset.

Parameters
  • image_ids – a list of unique image identifier either of type integer or string.

  • groundtruth_boxes – list of numpy arrays with shape [num_gt_boxes, 4] (note that num_gt_boxes can be different for each entry in the list)

  • groundtruth_classes – list of numpy arrays (int) with shape [num_gt_boxes] (note that num_gt_boxes can be different for each entry in the list)

  • categories

    a list of dictionaries representing all possible categories. Each dict in this list has the following keys:

    ’id’: (required) an integer id uniquely identifying this category ‘name’: (required) string representing category name

    e.g., ‘cat’, ‘dog’, ‘pizza’

    ’supercategory’: (optional) string representing the supercategory

    e.g., ‘animal’, ‘vehicle’, ‘food’, etc

  • output_path – (optional) path for exporting result to JSON

Returns

dictionary that can be read by COCO API

Raises

ValueError – if (1) groundtruth_boxes and groundtruth_classes do not have the right lengths or (2) if each of the elements inside these lists do not have the correct shapes or (3) if image_ids are not integers

easycv.core.evaluation.coco_tools.ExportSingleImageDetectionBoxesToCoco(image_id, category_id_set, detection_boxes, detection_scores, detection_classes)[source]

Export detections of a single image to COCO format.

This function converts detections represented as numpy arrays to dictionaries that can be ingested by the COCO evaluation API. Note that the image_ids provided here must match the ones given to ExportSingleImageDetectionBoxesToCoco. We assume that boxes and classes are in correspondence - that is: boxes[i, :] and classes[i] are associated with the same detection.

Parameters
  • image_id – unique image identifier either of type integer or string.

  • category_id_set – A set of valid class ids. Detections with classes not in category_id_set are dropped.

  • detection_boxes – float numpy array of shape [num_detections, 4] containing detection boxes.

  • detection_scores – float numpy array of shape [num_detections] containing scores for the detection boxes.

  • detection_classes – integer numpy array of shape [num_detections] containing the classes for detection boxes.

Returns

a list of detection annotations for a single image in the COCO format.

Raises

ValueError – if (1) detection_boxes, detection_scores and detection_classes do not have the right lengths or (2) if each of the elements inside these lists do not have the correct shapes or (3) if image_ids are not integers.

easycv.core.evaluation.coco_tools.ExportSingleImageDetectionMasksToCoco(image_id, category_id_set, detection_masks, detection_scores, detection_classes)[source]

Export detection masks of a single image to COCO format.

This function converts detections represented as numpy arrays to dictionaries that can be ingested by the COCO evaluation API. We assume that detection_masks, detection_scores, and detection_classes are in correspondence - that is: detection_masks[i, :], detection_classes[i] and detection_scores[i] are associated with the same annotation.

Parameters
  • image_id – unique image identifier either of type integer or string.

  • category_id_set – A set of valid class ids. Detections with classes not in category_id_set are dropped.

  • detection_masks – uint8 numpy array of shape [num_detections, image_height, image_width] containing detection_masks.

  • detection_scores – float numpy array of shape [num_detections] containing scores for detection masks.

  • detection_classes – integer numpy array of shape [num_detections] containing the classes for detection masks.

Returns

a list of detection mask annotations for a single image in the COCO format.

Raises

ValueError – if (1) detection_masks, detection_scores and detection_classes do not have the right lengths or (2) if each of the elements inside these lists do not have the correct shapes or (3) if image_ids are not integers.

easycv.core.evaluation.coco_tools.ExportDetectionsToCOCO(image_ids, detection_boxes, detection_scores, detection_classes, categories, output_path=None)[source]

Export detection annotations in numpy arrays to COCO API.

This function converts a set of predicted detections represented as numpy arrays to dictionaries that can be ingested by the COCO API. Inputs to this function are lists, consisting of boxes, scores and classes, respectively, corresponding to each image for which detections have been produced. Note that the image_ids provided here must match the ones given to the ExportGroundtruthToCOCO function in order for evaluation to work properly.

We assume that for each image, boxes, scores and classes are in correspondence — that is: detection_boxes[i, :], detection_scores[i] and detection_classes[i] are associated with the same detection.

Parameters
  • image_ids – a list of unique image identifier either of type integer or string.

  • detection_boxes – list of numpy arrays with shape [num_detection_boxes, 4]

  • detection_scores – list of numpy arrays (float) with shape [num_detection_boxes]. Note that num_detection_boxes can be different for each entry in the list.

  • detection_classes – list of numpy arrays (int) with shape [num_detection_boxes]. Note that num_detection_boxes can be different for each entry in the list.

  • categories – a list of dictionaries representing all possible categories. Each dict in this list must have an integer ‘id’ key uniquely identifying this category.

  • output_path – (optional) path for exporting result to JSON

Returns

list of dictionaries that can be read by COCO API, where each entry corresponds to a single detection and has keys from: [‘image_id’, ‘category_id’, ‘bbox’, ‘score’].

Raises

ValueError – if (1) detection_boxes and detection_classes do not have the right lengths or (2) if each of the elements inside these lists do not have the correct shapes or (3) if image_ids are not integers.

easycv.core.evaluation.coco_tools.ExportSegmentsToCOCO(image_ids, detection_masks, detection_scores, detection_classes, categories, output_path=None)[source]

Export segmentation masks in numpy arrays to COCO API.

This function converts a set of predicted instance masks represented as numpy arrays to dictionaries that can be ingested by the COCO API. Inputs to this function are lists, consisting of segments, scores and classes, respectively, corresponding to each image for which detections have been produced.

Note this function is recommended to use for small dataset. For large dataset, it should be used with a merge function (e.g. in map reduce), otherwise the memory consumption is large.

We assume that for each image, masks, scores and classes are in correspondence — that is: detection_masks[i, :, :, :], detection_scores[i] and detection_classes[i] are associated with the same detection.

Parameters
  • image_ids – list of image ids (typically ints or strings)

  • detection_masks – list of numpy arrays with shape [num_detection, h, w, 1] and type uint8. The height and width should match the shape of corresponding image.

  • detection_scores – list of numpy arrays (float) with shape [num_detection]. Note that num_detection can be different for each entry in the list.

  • detection_classes – list of numpy arrays (int) with shape [num_detection]. Note that num_detection can be different for each entry in the list.

  • categories – a list of dictionaries representing all possible categories. Each dict in this list must have an integer ‘id’ key uniquely identifying this category.

  • output_path – (optional) path for exporting result to JSON

Returns

list of dictionaries that can be read by COCO API, where each entry corresponds to a single detection and has keys from: [‘image_id’, ‘category_id’, ‘segmentation’, ‘score’].

Raises

ValueError – if detection_masks and detection_classes do not have the right lengths or if each of the elements inside these lists do not have the correct shapes.

easycv.core.evaluation.coco_tools.ExportKeypointsToCOCO(image_ids, detection_keypoints, detection_scores, detection_classes, categories, output_path=None)[source]

Exports keypoints in numpy arrays to COCO API.

This function converts a set of predicted keypoints represented as numpy arrays to dictionaries that can be ingested by the COCO API. Inputs to this function are lists, consisting of keypoints, scores and classes, respectively, corresponding to each image for which detections have been produced.

We assume that for each image, keypoints, scores and classes are in correspondence — that is: detection_keypoints[i, :, :, :], detection_scores[i] and detection_classes[i] are associated with the same detection.

Parameters
  • image_ids – list of image ids (typically ints or strings)

  • detection_keypoints – list of numpy arrays with shape [num_detection, num_keypoints, 2] and type float32 in absolute x-y coordinates.

  • detection_scores – list of numpy arrays (float) with shape [num_detection]. Note that num_detection can be different for each entry in the list.

  • detection_classes – list of numpy arrays (int) with shape [num_detection]. Note that num_detection can be different for each entry in the list.

  • categories – a list of dictionaries representing all possible categories. Each dict in this list must have an integer ‘id’ key uniquely identifying this category and an integer ‘num_keypoints’ key specifying the number of keypoints the category has.

  • output_path – (optional) path for exporting result to JSON

Returns

list of dictionaries that can be read by COCO API, where each entry corresponds to a single detection and has keys from: [‘image_id’, ‘category_id’, ‘keypoints’, ‘score’].

Raises

ValueError – if detection_keypoints and detection_classes do not have the right lengths or if each of the elements inside these lists do not have the correct shapes.

easycv.core.evaluation.faceid_pair_eval module
easycv.core.evaluation.faceid_pair_eval.calculate_roc(thresholds, embeddings1, embeddings2, actual_issame, nrof_folds=10, pca=0)[source]
easycv.core.evaluation.faceid_pair_eval.calculate_accuracy(threshold, dist, actual_issame)[source]
easycv.core.evaluation.faceid_pair_eval.calculate_val(thresholds, embeddings1, embeddings2, actual_issame, far_target, nrof_folds=10)[source]
easycv.core.evaluation.faceid_pair_eval.calculate_val_far(threshold, dist, actual_issame)[source]
easycv.core.evaluation.faceid_pair_eval.faceid_evaluate(embeddings, actual_issame, nrof_folds=10, pca=0)[source]

Do a KFold (k=nrof_folds) faceid pair-match test for embeddings.

Parameters
  • embeddings – [N x C] input embeddings of the whole dataset

  • actual_issame – [N/2, 1] labels indicating whether each pair matches

  • nrof_folds – KFold number

  • pca – if > 0, apply PCA and transform the embeddings to [N, pca] features

Returns

KFold average best accuracy and best threshold
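
A toy sketch of calling faceid_evaluate on random data; the embeddings are placeholders and the pairing convention (consecutive rows form a pair) is an assumption:

    import numpy as np
    from easycv.core.evaluation.faceid_pair_eval import faceid_evaluate

    embeddings = np.random.rand(200, 512)      # [N x C] embeddings; rows 2i and 2i+1 form pair i (assumed)
    actual_issame = np.random.rand(100) > 0.5  # [N/2] labels: True if the pair is the same identity
    result = faceid_evaluate(embeddings, actual_issame, nrof_folds=10)
    # result holds the KFold average best accuracy and best threshold (see Returns above)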

class easycv.core.evaluation.faceid_pair_eval.FaceIDPairEvaluator(dataset_name=None, metric_names=['acc'], kfold=10, pca=0)[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

FaceIDPairEvaluator evaluator. Takes N x 2 pairs and labels as input, performs a k-fold threshold search and returns the average best accuracy.

__init__(dataset_name=None, metric_names=['acc'], kfold=10, pca=0)[source]

Faceid small dataset evaluator for pair-match validation.

Parameters
  • dataset_name – faceid small validation set name, one of [lfw, agedb_30, cfp_ff, cfp_fw, calfw]

  • kfold – KFold number for the train/val split

  • pca – PCA dimensions; if > 0, apply PCA to the input features and transform them to [n, pca]

Returns

None

easycv.core.evaluation.metric_registry module
class easycv.core.evaluation.metric_registry.MetricRegistry[source]

Bases: object

__init__()[source]

Initialize self. See help(type(self)) for accurate signature.

get(evaluator_type)[source]
register_default_best_metric(cls, metric_name, metric_cmp_op='max')[source]

Register default best metric for each evaluator

Parameters
  • cls (object) – class object

  • metric_name (str or List[str]) – default best metric name

  • metric_cmp_op (str or List[str]) – metric compare operation, should be one of [“max”, “min”]
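
An illustrative sketch using the documented MetricRegistry API; the evaluator class and metric name here are hypothetical:

    from easycv.core.evaluation.base_evaluator import Evaluator
    from easycv.core.evaluation.metric_registry import MetricRegistry

    class MyEvaluator(Evaluator):
        pass

    registry = MetricRegistry()
    # when a hook later asks which metric to track for MyEvaluator, 'my_top1' (maximized) is used
    registry.register_default_best_metric(MyEvaluator, metric_name='my_top1', metric_cmp_op='max')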

easycv.core.evaluation.mse_eval module
class easycv.core.evaluation.mse_eval.MSEEvaluator(dataset_name=None, metric_names=['avg_mse'], neck_num=None)[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

MSEEvaluator evaluator.

__init__(dataset_name=None, metric_names=['avg_mse'], neck_num=None)[source]
easycv.core.evaluation.retrival_topk_eval module
class easycv.core.evaluation.retrival_topk_eval.RetrivalTopKEvaluator(topk=(1, 2, 4, 8), norm=0, metric='cos', pca=0, dataset_name=None, metric_names=['R@K=1'], save_results=False, save_results_dir='', feature_keyword=['neck'])[source]

Bases: easycv.core.evaluation.base_evaluator.Evaluator

RetrivalTopK evaluator. The retrieval evaluation performs top-K retrieval by measuring the distance of each sample against all others, takes the top-K nearest neighbours and counts ID matches (Retrieval = 1 on a match, Miss = 0 otherwise); finally all retrieval rates are averaged.

__init__(topk=(1, 2, 4, 8), norm=0, metric='cos', pca=0, dataset_name=None, metric_names=['R@K=1'], save_results=False, save_results_dir='', feature_keyword=['neck'])[source]
Parameters

top_k – tuple of int, evaluate top_k acc

easycv.core.evaluation.top_down_eval module
easycv.core.evaluation.top_down_eval.pose_pck_accuracy(output, target, mask, thr=0.05, normalize=None)[source]

Calculate the pose accuracy of PCK for each individual keypoint and the averaged accuracy across all keypoints from heatmaps.

Note

PCK metric measures accuracy of the localization of the body joints. The distances between predicted positions and the ground-truth ones are typically normalized by the bounding box size. The threshold (thr) of the normalized distance is commonly set as 0.05, 0.1 or 0.2 etc.

  • batch_size: N

  • num_keypoints: K

  • heatmap height: H

  • heatmap width: W

Parameters
  • output (np.ndarray[N, K, H, W]) – Model output heatmaps.

  • target (np.ndarray[N, K, H, W]) – Groundtruth heatmaps.

  • mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.

  • thr (float) – Threshold of PCK calculation. Default 0.05.

  • normalize (np.ndarray[N, 2]) – Normalization factor for H&W.

Returns

A tuple containing keypoint accuracy.

  • np.ndarray[K]: Accuracy of each keypoint.

  • float: Averaged accuracy across all keypoints.

  • int: Number of valid keypoints.

Return type

tuple

easycv.core.evaluation.top_down_eval.keypoint_pck_accuracy(pred, gt, mask, thr, normalize)[source]

Calculate the pose accuracy of PCK for each individual keypoint and the averaged accuracy across all keypoints for coordinates.

Note

PCK metric measures accuracy of the localization of the body joints. The distances between predicted positions and the ground-truth ones are typically normalized by the bounding box size. The threshold (thr) of the normalized distance is commonly set as 0.05, 0.1 or 0.2 etc.

  • batch_size: N

  • num_keypoints: K

Parameters
  • pred (np.ndarray[N, K, 2]) – Predicted keypoint location.

  • gt (np.ndarray[N, K, 2]) – Groundtruth keypoint location.

  • mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.

  • thr (float) – Threshold of PCK calculation.

  • normalize (np.ndarray[N, 2]) – Normalization factor for H&W.

Returns

A tuple containing keypoint accuracy.

  • acc (np.ndarray[K]): Accuracy of each keypoint.

  • avg_acc (float): Averaged accuracy across all keypoints.

  • cnt (int): Number of valid keypoints.

Return type

tuple
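
A toy call of keypoint_pck_accuracy on random data, with shapes taken from the docstring above:

    import numpy as np
    from easycv.core.evaluation.top_down_eval import keypoint_pck_accuracy

    N, K = 4, 17
    pred = np.random.rand(N, K, 2)
    gt = np.random.rand(N, K, 2)
    mask = np.ones((N, K), dtype=bool)  # all keypoints visible
    normalize = np.ones((N, 2))         # e.g. bounding-box size per sample

    acc, avg_acc, cnt = keypoint_pck_accuracy(pred, gt, mask, thr=0.05, normalize=normalize)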

easycv.core.evaluation.top_down_eval.keypoint_auc(pred, gt, mask, normalize, num_step=20)[source]

Calculate the pose accuracy of PCK for each individual keypoint and the averaged accuracy across all keypoints for coordinates.

Note

  • batch_size: N

  • num_keypoints: K

Parameters
  • pred (np.ndarray[N, K, 2]) – Predicted keypoint location.

  • gt (np.ndarray[N, K, 2]) – Groundtruth keypoint location.

  • mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.

  • normalize (float) – Normalization factor.

Returns

Area under curve.

Return type

float

easycv.core.evaluation.top_down_eval.keypoint_nme(pred, gt, mask, normalize_factor)[source]

Calculate the normalized mean error (NME).

Note

  • batch_size: N

  • num_keypoints: K

Parameters
  • pred (np.ndarray[N, K, 2]) – Predicted keypoint location.

  • gt (np.ndarray[N, K, 2]) – Groundtruth keypoint location.

  • mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.

  • normalize_factor (np.ndarray[N, 2]) – Normalization factor.

Returns

normalized mean error

Return type

float

easycv.core.evaluation.top_down_eval.keypoint_epe(pred, gt, mask)[source]

Calculate the end-point error.

Note

  • batch_size: N

  • num_keypoints: K

Parameters
  • pred (np.ndarray[N, K, 2]) – Predicted keypoint location.

  • gt (np.ndarray[N, K, 2]) – Groundtruth keypoint location.

  • mask (np.ndarray[N, K]) – Visibility of the target. False for invisible joints, and True for visible. Invisible joints will be ignored for accuracy calculation.

Returns

Average end-point error.

Return type

float

easycv.core.evaluation.top_down_eval.post_dark_udp(coords, batch_heatmaps, kernel=3)[source]

DARK post-processing. Implemented by UDP. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020). Zhang et al. Distribution-Aware Coordinate Representation for Human Pose Estimation (CVPR 2020).

Note

  • batch size: B

  • num keypoints: K

  • num persons: N

  • height of heatmaps: H

  • width of heatmaps: W

B=1 for bottom_up paradigm where all persons share the same heatmap. B=N for top_down paradigm where each person has its own heatmaps.

Parameters
  • coords (np.ndarray[N, K, 2]) – Initial coordinates of human pose.

  • batch_heatmaps (np.ndarray[B, K, H, W]) – batch_heatmaps

  • kernel (int) – Gaussian kernel size (K) for modulation.

Returns

Refined coordinates.

Return type

res (np.ndarray[N, K, 2])

easycv.core.evaluation.top_down_eval.keypoints_from_heatmaps(heatmaps, center, scale, unbiased=False, post_process='default', kernel=11, valid_radius_factor=0.0546875, use_udp=False, target_type='GaussianHeatmap')[source]

Get final keypoint predictions from heatmaps and transform them back to the image.

Note

  • batch size: N

  • num keypoints: K

  • heatmap height: H

  • heatmap width: W

Parameters
  • heatmaps (np.ndarray[N, K, H, W], dtype=float32) – model predicted heatmaps.

  • center (np.ndarray[N, 2]) – Center of the bounding box (x, y).

  • scale (np.ndarray[N, 2]) – Scale of the bounding box wrt height/width.

  • post_process (str/None) – Choice of methods to post-process heatmaps. Currently supported: None, ‘default’, ‘unbiased’, ‘megvii’.

  • unbiased (bool) – Option to use unbiased decoding. Mutually exclusive with megvii. Note: this arg is deprecated and unbiased=True can be replaced by post_process=’unbiased’ Paper ref: Zhang et al. Distribution-Aware Coordinate Representation for Human Pose Estimation (CVPR 2020).

  • kernel (int) – Gaussian kernel size (K) for modulation, which should match the heatmap gaussian sigma when training. K=17 for sigma=3 and k=11 for sigma=2.

  • valid_radius_factor (float) – The radius factor of the positive area in classification heatmap for UDP.

  • use_udp (bool) – Use unbiased data processing.

  • target_type (str) – ‘GaussianHeatmap’ or ‘CombinedTarget’. GaussianHeatmap: Classification target with gaussian distribution. CombinedTarget: The combination of classification target (response map) and regression target (offset map). Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

Returns

A tuple containing keypoint predictions and scores.

  • preds (np.ndarray[N, K, 2]): Predicted keypoint location in images.

  • maxvals (np.ndarray[N, K, 1]): Scores (confidence) of the keypoints.

Return type

tuple
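
A sketch of decoding heatmaps back to image-space keypoints, with shapes following the docstring above; the heatmaps here are random placeholders:

    import numpy as np
    from easycv.core.evaluation.top_down_eval import keypoints_from_heatmaps

    N, K, H, W = 2, 17, 64, 48
    heatmaps = np.random.rand(N, K, H, W).astype(np.float32)
    center = np.array([[128.0, 128.0]] * N)  # bbox centers (x, y)
    scale = np.array([[1.0, 1.5]] * N)       # bbox scales wrt height/width

    preds, maxvals = keypoints_from_heatmaps(heatmaps, center, scale, post_process='default')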

easycv.core.optimizer package

Submodules
easycv.core.optimizer.lars module
class easycv.core.optimizer.lars.LARS(params, lr=<required parameter>, momentum=0, dampening=0, weight_decay=0, eta=0.001, nesterov=False)[source]

Bases: torch.optim.optimizer.Optimizer

Implements layer-wise adaptive rate scaling for SGD.

Parameters
  • params (iterable) – iterable of parameters to optimize or dicts defining parameter groups

  • lr (float) – base learning rate (gamma_0)

  • momentum (float, optional) – momentum factor (default: 0) (“m”)

  • weight_decay (float, optional) – weight decay (L2 penalty) (default: 0) (“beta”)

  • dampening (float, optional) – dampening for momentum (default: 0)

  • eta (float, optional) – LARS coefficient

  • nesterov (bool, optional) – enables Nesterov momentum (default: False)

Based on Algorithm 1 of the paper by You, Gitman, and Ginsburg, “Large Batch Training of Convolutional Networks”.

Example

>>> optimizer = LARS(model.parameters(), lr=0.1, momentum=0.9,
>>>                  weight_decay=1e-4, eta=1e-3)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()
__init__(params, lr=<required parameter>, momentum=0, dampening=0, weight_decay=0, eta=0.001, nesterov=False)[source]

Initialize self. See help(type(self)) for accurate signature.

step(closure=None)[source]

Performs a single optimization step.

Parameters

closure (callable, optional) – A closure that reevaluates the model and returns the loss.

easycv.core.optimizer.ranger module
easycv.core.optimizer.ranger.centralized_gradient(x, use_gc=True, gc_conv_only=False)[source]

credit - https://github.com/Yonghongwei/Gradient-Centralization

class easycv.core.optimizer.ranger.Ranger(params, lr=0.001, alpha=0.5, k=6, N_sma_threshhold=5, betas=(0.95, 0.999), eps=1e-05, weight_decay=0, use_gc=True, gc_conv_only=False, gc_loc=True)[source]

Bases: torch.optim.optimizer.Optimizer

Adam+LookAhead: refer to https://github.com/lessw2020/Ranger-Deep-Learning-Optimizer

__init__(params, lr=0.001, alpha=0.5, k=6, N_sma_threshhold=5, betas=(0.95, 0.999), eps=1e-05, weight_decay=0, use_gc=True, gc_conv_only=False, gc_loc=True)[source]

Initialize self. See help(type(self)) for accurate signature.

step(closure=None)[source]

Performs a single optimization step (parameter update).

Parameters

closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.

Note

Unless otherwise specified, this function should not modify the .grad field of the parameters.
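
A usage sketch mirroring the LARS example above; model, loss_fn, input and target are assumed to be defined as in that example.

>>> optimizer = Ranger(model.parameters(), lr=1e-3, weight_decay=1e-4)
>>> optimizer.zero_grad()
>>> loss_fn(model(input), target).backward()
>>> optimizer.step()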

easycv.core.post_processing package

easycv.core.post_processing.affine_transform(pt, trans_mat)[source]

Apply an affine transformation to the points.

Parameters
  • pt (np.ndarray) – a 2 dimensional point to be transformed

  • trans_mat (np.ndarray) – 2x3 matrix of an affine transform

Returns

Transformed points.

Return type

np.ndarray
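
A tiny worked example of applying a 2x3 affine matrix with affine_transform; the matrix here is a pure translation chosen for illustration:

    import numpy as np
    from easycv.core.post_processing import affine_transform

    pt = np.array([10.0, 20.0])
    trans_mat = np.array([[1.0, 0.0, 5.0],    # x' = x + 5
                          [0.0, 1.0, -3.0]])  # y' = y - 3
    new_pt = affine_transform(pt, trans_mat)  # -> array([15., 17.])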

easycv.core.post_processing.flip_back(output_flipped, flip_pairs, target_type='GaussianHeatmap')[source]

Flip the flipped heatmaps back to the original form.

Note

  • batch_size: N

  • num_keypoints: K

  • heatmap height: H

  • heatmap width: W

Parameters
  • output_flipped (np.ndarray[N, K, H, W]) – The output heatmaps obtained from the flipped images.

  • flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).

  • target_type (str) – GaussianHeatmap or CombinedTarget

Returns

heatmaps that flipped back to the original image

Return type

np.ndarray

easycv.core.post_processing.fliplr_joints(joints_3d, joints_3d_visible, img_width, flip_pairs)[source]

Flip human joints horizontally.

Note

num_keypoints: K

Parameters
  • joints_3d (np.ndarray([K, 3])) – Coordinates of keypoints.

  • joints_3d_visible (np.ndarray([K, 1])) – Visibility of keypoints.

  • img_width (int) – Image width.

  • flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).

Returns

Flipped human joints.

  • joints_3d_flipped (np.ndarray([K, 3])): Flipped joints.

  • joints_3d_visible_flipped (np.ndarray([K, 1])): Joint visibility.

Return type

tuple

easycv.core.post_processing.fliplr_regression(regression, flip_pairs, center_mode='static', center_x=0.5, center_index=0)[source]

Flip human joints horizontally.

Note

  • batch_size: N

  • num_keypoint: K

Parameters
  • regression (np.ndarray([..., K, C])) – Coordinates of keypoints, where K is the joint number and C is the dimension. Example shapes are:

    • [N, K, C]: a batch of keypoints, where N is the batch size.

    • [N, T, K, C]: a batch of pose sequences, where T is the frame number.

  • flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).

  • center_mode (str) – The mode to set the center location on the x-axis to flip around. Options are:

    • static: use a static x value (see center_x also)

    • root: use a root joint (see center_index also)

  • center_x (float) – Set the x-axis location of the flip center. Only used when center_mode=static.

  • center_index (int) – Set the index of the root joint, whose x location will be used as the flip center. Only used when center_mode=root.

Returns

Flipped human joints.

  • regression_flipped (np.ndarray([…, K, C])): Flipped joints.

Return type

tuple

easycv.core.post_processing.get_affine_transform(center, scale, rot, output_size, shift=(0.0, 0.0), inv=False)[source]

Get the affine transform matrix, given the center/scale/rot/output_size.

Parameters
  • center (np.ndarray[2, ]) – Center of the bounding box (x, y).

  • scale (np.ndarray[2, ]) – Scale of the bounding box wrt [width, height].

  • rot (float) – Rotation angle (degree).

  • output_size (np.ndarray[2, ] | list(2,)) – Size of the destination heatmaps.

  • shift (0-100%) – Shift translation ratio wrt the width/height. Default (0., 0.).

  • inv (bool) – Option to inverse the affine transform direction. (inv=False: src->dst or inv=True: dst->src)

Returns

The transform matrix.

Return type

np.ndarray
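
A sketch that builds a crop matrix with get_affine_transform and warps the image with OpenCV; the center/scale values and image path are illustrative:

    import cv2
    import numpy as np
    from easycv.core.post_processing import get_affine_transform

    center = np.array([320.0, 240.0])  # bbox center (x, y)
    scale = np.array([1.0, 1.2])       # bbox scale wrt [width, height]
    output_size = [192, 256]           # destination size (w, h)

    trans = get_affine_transform(center, scale, rot=0.0, output_size=output_size)
    img = cv2.imread('demo.jpg')
    patch = cv2.warpAffine(img, trans, (output_size[0], output_size[1]))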

easycv.core.post_processing.get_warp_matrix(theta, size_input, size_dst, size_target)[source]

Calculate the transformation matrix under the constraint of unbiased. Paper ref: Huang et al. The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

Parameters
  • theta (float) – Rotation angle in degrees.

  • size_input (np.ndarray) – Size of input image [w, h].

  • size_dst (np.ndarray) – Size of output image [w, h].

  • size_target (np.ndarray) – Size of ROI in input plane [w, h].

Returns

A matrix for transformation.

Return type

matrix (np.ndarray)
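
A minimal sketch with illustrative sizes; the returned 2x3 matrix can be used with cv2.warpAffine for images or with warp_affine_joints for keypoints:

>>> import numpy as np
>>> from easycv.core.post_processing import get_warp_matrix
>>> mat = get_warp_matrix(theta=30., size_input=np.array([640., 480.]),
...                       size_dst=np.array([192., 256.]), size_target=np.array([200., 300.]))
>>> mat.shape
(2, 3)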

easycv.core.post_processing.rotate_point(pt, angle_rad)[source]

Rotate a point by an angle.

Parameters
  • pt (list[float]) – 2 dimensional point to be rotated

  • angle_rad (float) – rotation angle in radians

Returns

Rotated point.

Return type

list[float]
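
A tiny sketch (the point and angle are arbitrary):

>>> import math
>>> from easycv.core.post_processing import rotate_point
>>> rotated = rotate_point([1., 0.], math.pi / 2)  # rotate by 90 degrees
>>> # rotated is approximately [0.0, 1.0]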

easycv.core.post_processing.transform_preds(coords, center, scale, output_size, use_udp=False)[source]

Get final keypoint predictions from heatmaps and apply scaling and translation to map them back to the image.

Note

num_keypoints: K

Parameters
  • coords (np.ndarray[K, ndims]) –

    • If ndims=2, coords are predicted keypoint locations.

    • If ndims=4, coords are composed of (x, y, scores, tags).

    • If ndims=5, coords are composed of (x, y, scores, tags, flipped_tags).

  • center (np.ndarray[2, ]) – Center of the bounding box (x, y).

  • scale (np.ndarray[2, ]) – Scale of the bounding box wrt [width, height].

  • output_size (np.ndarray[2, ] | list(2,)) – Size of the destination heatmaps.

  • use_udp (bool) – Use unbiased data processing

Returns

Predicted coordinates in the images.

Return type

np.ndarray
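
A short sketch of mapping heatmap-space predictions back to the original image; the box center, scale and heatmap size are illustrative values:

>>> import numpy as np
>>> from easycv.core.post_processing import transform_preds
>>> coords = np.array([[24., 32.], [10., 12.]])   # [K, 2] heatmap-space predictions
>>> preds = transform_preds(coords, center=np.array([320., 240.]),
...                         scale=np.array([1.0, 1.5]), output_size=[48, 64])
>>> preds.shape
(2, 2)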

easycv.core.post_processing.warp_affine_joints(joints, mat)[source]

Apply affine transformation defined by the transform matrix on the joints.

Parameters
  • joints (np.ndarray[..., 2]) – Origin coordinate of joints.

  • mat (np.ndarray[3, 2]) – The affine matrix.

Returns

Result coordinate of joints.

Return type

matrix (np.ndarray[…, 2])
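
A minimal sketch that reuses the matrix from get_affine_transform to move joints into the cropped coordinate frame (all values illustrative):

>>> import numpy as np
>>> from easycv.core.post_processing import get_affine_transform, warp_affine_joints
>>> trans = get_affine_transform(np.array([320., 240.]), np.array([1.0, 1.5]),
...                              0., np.array([192, 256]))
>>> joints = np.array([[300., 200.], [340., 280.]])   # [..., 2]
>>> warp_affine_joints(joints, trans).shape
(2, 2)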

easycv.core.post_processing.oks_nms(kpts_db, thr, sigmas=None, vis_thr=None)[source]

OKS NMS implementations.

Parameters
  • kpts_db – keypoints.

  • thr – Retain overlap < thr.

  • sigmas – standard deviation of keypoint labelling.

  • vis_thr – threshold of the keypoint visibility.

Returns

indexes to keep.

Return type

np.ndarray
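
A small sketch of suppressing two nearly identical poses. The entry layout (dicts with 'keypoints' as a [K, 3] array plus 'score' and 'area') is an assumption based on common OKS-NMS implementations; with the default sigmas, K is expected to be 17:

>>> import numpy as np
>>> from easycv.core.post_processing import oks_nms
>>> kpts = np.concatenate([np.random.rand(17, 2) * 100, np.ones((17, 1))], axis=1)
>>> kpts_db = [
...     {'keypoints': kpts, 'score': 0.9, 'area': 10000.},
...     {'keypoints': kpts + 1.0, 'score': 0.8, 'area': 10000.},
... ]
>>> keep = oks_nms(kpts_db, thr=0.9)  # indexes of the retained detections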

easycv.core.post_processing.soft_oks_nms(kpts_db, thr, max_dets=20, sigmas=None, vis_thr=None)[source]

Soft OKS NMS implementations.

Parameters
  • kpts_db

  • thr – retain oks overlap < thr.

  • max_dets – max number of detections to keep.

  • sigmas – Keypoint labelling uncertainty.

Returns

indexes to keep.

Return type

np.ndarray

Submodules
easycv.core.post_processing.nms module
easycv.core.post_processing.nms.oks_iou(g, d, a_g, a_d, sigmas=None, vis_thr=None)[source]

Calculate oks ious.

Parameters
  • g – Ground truth keypoints.

  • d – Detected keypoints.

  • a_g – Area of the ground truth object.

  • a_d – Area of the detected object.

  • sigmas – standard deviation of keypoint labelling.

  • vis_thr – threshold of the keypoint visibility.

Returns

The oks ious.

Return type

list
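
A small sketch comparing one ground-truth pose against two detections; keypoints are assumed to be flattened (x, y, v) triples and, with the default sigmas, 17 keypoints are expected:

>>> import numpy as np
>>> from easycv.core.post_processing.nms import oks_iou
>>> g = np.random.rand(17 * 3)                   # one ground-truth pose, flattened
>>> d = np.stack([g, np.random.rand(17 * 3)])    # two detections, shape (2, 51)
>>> ious = oks_iou(g, d, 10000., np.array([10000., 8100.]))
>>> # ious[0] is close to 1.0 because the first detection equals the ground truth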

easycv.core.post_processing.nms.oks_nms(kpts_db, thr, sigmas=None, vis_thr=None)[source]

OKS NMS implementations.

Parameters
  • kpts_db – keypoints.

  • thr – Retain overlap < thr.

  • sigmas – standard deviation of keypoint labelling.

  • vis_thr – threshold of the keypoint visibility.

Returns

indexes to keep.

Return type

np.ndarray

easycv.core.post_processing.nms.soft_oks_nms(kpts_db, thr, max_dets=20, sigmas=None, vis_thr=None)[source]

Soft OKS NMS implementations.

Parameters
  • kpts_db

  • thr – retain oks overlap < thr.

  • max_dets – max number of detections to keep.

  • sigmas – Keypoint labelling uncertainty.

Returns

indexes to keep.

Return type

np.ndarray

easycv.core.post_processing.pose_transforms module
easycv.core.post_processing.pose_transforms.fliplr_joints(joints_3d, joints_3d_visible, img_width, flip_pairs)[source]

Flip human joints horizontally.

Note

num_keypoints: K

Parameters
  • joints_3d (np.ndarray([K, 3])) – Coordinates of keypoints.

  • joints_3d_visible (np.ndarray([K, 1])) – Visibility of keypoints.

  • img_width (int) – Image width.

  • flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).

Returns

Flipped human joints.

  • joints_3d_flipped (np.ndarray([K, 3])): Flipped joints.

  • joints_3d_visible_flipped (np.ndarray([K, 1])): Joint visibility.

Return type

tuple

easycv.core.post_processing.pose_transforms.fliplr_regression(regression, flip_pairs, center_mode='static', center_x=0.5, center_index=0)[source]

Flip human joints horizontally.

Note

batch_size: N; num_keypoints: K

Parameters
  • regression (np.ndarray([..., K, C])) –

    Coordinates of keypoints, where K is the joint number and C is the dimension. Example shapes are:

    • [N, K, C]: a batch of keypoints, where N is the batch size.

    • [N, T, K, C]: a batch of pose sequences, where T is the frame number.

  • flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).

  • center_mode (str) – The mode to set the center location on the x-axis to flip around. Options are: static (use a static x value, see also center_x) or root (use a root joint, see also center_index).

  • center_x (float) – Set the x-axis location of the flip center. Only used when center_mode=static.

  • center_index (int) – Set the index of the root joint, whose x location will be used as the flip center. Only used when center_mode=root.

Returns

Flipped human joints.

  • regression_flipped (np.ndarray([…, K, C])): Flipped joints.

Return type

tuple

easycv.core.post_processing.pose_transforms.flip_back(output_flipped, flip_pairs, target_type='GaussianHeatmap')[source]

Flip the flipped heatmaps back to the original form.

Note

batch_size: N; num_keypoints: K; heatmap height: H; heatmap width: W

Parameters
  • output_flipped (np.ndarray[N, K, H, W]) – The output heatmaps obtained from the flipped images.

  • flip_pairs (list[tuple()]) – Pairs of keypoints which are mirrored (for example, left ear – right ear).

  • target_type (str) – GaussianHeatmap or CombinedTarget

Returns

heatmaps that have been flipped back to match the original images

Return type

np.ndarray

easycv.core.post_processing.pose_transforms.transform_preds(coords, center, scale, output_size, use_udp=False)[source]

Get final keypoint predictions from heatmaps and apply scaling and translation to map them back to the image.

Note

num_keypoints: K

Parameters
  • coords (np.ndarray[K, ndims]) –

    • If ndims=2, coords are predicted keypoint locations.

    • If ndims=4, coords are composed of (x, y, scores, tags).

    • If ndims=5, coords are composed of (x, y, scores, tags, flipped_tags).

  • center (np.ndarray[2, ]) – Center of the bounding box (x, y).

  • scale (np.ndarray[2, ]) – Scale of the bounding box wrt [width, height].

  • output_size (np.ndarray[2, ] | list(2,)) – Size of the destination heatmaps.

  • use_udp (bool) – Use unbiased data processing

Returns

Predicted coordinates in the images.

Return type

np.ndarray

easycv.core.post_processing.pose_transforms.get_affine_transform(center, scale, rot, output_size, shift=(0.0, 0.0), inv=False)[source]

Get the affine transform matrix, given the center/scale/rot/output_size.

Parameters
  • center (np.ndarray[2, ]) – Center of the bounding box (x, y).

  • scale (np.ndarray[2, ]) – Scale of the bounding box wrt [width, height].

  • rot (float) – Rotation angle (degree).

  • output_size (np.ndarray[2, ] | list(2,)) – Size of the destination heatmaps.

  • shift (tuple[float]) – Shift translation ratio wrt the width/height (0–100%). Default: (0., 0.).

  • inv (bool) – Option to inverse the affine transform direction. (inv=False: src->dst or inv=True: dst->src)

Returns

The transform matrix.

Return type

np.ndarray

easycv.core.post_processing.pose_transforms.affine_transform(pt, trans_mat)[source]

Apply an affine transformation to the points.

Parameters
  • pt (np.ndarray) – a 2 dimensional point to be transformed

  • trans_mat (np.ndarray) – 2x3 matrix of an affine transform

Returns

Transformed points.

Return type

np.ndarray

easycv.core.post_processing.pose_transforms.rotate_point(pt, angle_rad)[source]

Rotate a point by an angle.

Parameters
  • pt (list[float]) – 2 dimensional point to be rotated

  • angle_rad (float) – rotation angle in radians

Returns

Rotated point.

Return type

list[float]

easycv.core.post_processing.pose_transforms.get_warp_matrix(theta, size_input, size_dst, size_target)[source]

Calculate the transformation matrix under the unbiased constraint (UDP). Paper ref: Huang et al., The Devil is in the Details: Delving into Unbiased Data Processing for Human Pose Estimation (CVPR 2020).

Parameters
  • theta (float) – Rotation angle in degrees.

  • size_input (np.ndarray) – Size of input image [w, h].

  • size_dst (np.ndarray) – Size of output image [w, h].

  • size_target (np.ndarray) – Size of ROI in input plane [w, h].

Returns

A matrix for transformation.

Return type

matrix (np.ndarray)

easycv.core.post_processing.pose_transforms.warp_affine_joints(joints, mat)[source]

Apply affine transformation defined by the transform matrix on the joints.

Parameters
  • joints (np.ndarray[..., 2]) – Origin coordinate of joints.

  • mat (np.ndarray[3, 2]) – The affine matrix.

Returns

Result coordinate of joints.

Return type

matrix (np.ndarray[…, 2])

easycv.core.visualization package

easycv.core.visualization.imshow_bboxes(img, bboxes, labels=None, colors='green', text_color='white', font_size=20, thickness=1, font_scale=0.5, show=True, win_name='', wait_time=0, out_file=None)[source]

Draw bboxes with labels (optional) on an image. This is a wrapper of mmcv.imshow_bboxes.

Parameters
  • img (str or ndarray) – The image to be displayed.

  • bboxes (ndarray) – ndarray of shape (k, 4), each row is a bbox in format [x1, y1, x2, y2].

  • labels (str or list[str], optional) – labels of each bbox.

  • colors (list[str or tuple or Color]) – A list of colors.

  • text_color (str or tuple or Color) – Color of texts.

  • font_size (int) – Size of font.

  • thickness (int) – Thickness of lines.

  • font_scale (float) – Font scales of texts.

  • show (bool) – Whether to show the image.

  • win_name (str) – The window name.

  • wait_time (int) – Value of waitKey param.

  • out_file (str, optional) – The filename to write the image.

Returns

The image with bboxes drawn on it.

Return type

ndarray
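
A minimal drawing sketch (the boxes, labels and blank image are placeholders):

>>> import numpy as np
>>> from easycv.core.visualization import imshow_bboxes
>>> img = np.zeros((480, 640, 3), dtype=np.uint8)
>>> bboxes = np.array([[50., 60., 200., 220.], [300., 100., 450., 380.]])
>>> vis = imshow_bboxes(img, bboxes, labels=['person', 'dog'], show=False)
>>> # vis is the image with the two labeled boxes drawn on it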

easycv.core.visualization.imshow_keypoints(img, pose_result, skeleton=None, kpt_score_thr=0.3, pose_kpt_color=None, pose_link_color=None, radius=4, thickness=1, show_keypoint_weight=False)[source]

Draw keypoints and links on an image.

Parameters
  • img (str or Tensor) – The image to draw poses on. If an image array is given, it will be modified in-place.

  • pose_result (list[kpts]) – The poses to draw. Each element kpts is a set of K keypoints as a Kx3 numpy.ndarray, where each keypoint is represented as (x, y, score).

  • kpt_score_thr (float, optional) – Minimum score of keypoints to be shown. Default: 0.3.

  • pose_kpt_color (np.array[Nx3]) – Color of N keypoints. If None, the keypoint will not be drawn.

  • pose_link_color (np.array[Mx3]) – Color of M links. If None, the links will not be drawn.

  • thickness (int) – Thickness of lines.
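
A minimal rendering sketch drawing keypoints only (no skeleton links); the colors and coordinates are placeholders:

>>> import numpy as np
>>> from easycv.core.visualization import imshow_keypoints
>>> img = np.zeros((256, 256, 3), dtype=np.uint8)
>>> kpts = np.array([[64., 64., 0.9], [128., 96., 0.8], [192., 64., 0.2]])  # [K, 3]
>>> pose_kpt_color = np.array([[255, 0, 0]] * 3)   # one color per keypoint
>>> imshow_keypoints(img, [kpts], skeleton=None, kpt_score_thr=0.3,
...                  pose_kpt_color=pose_kpt_color, radius=4)
>>> # img is modified in-place; the third keypoint is below kpt_score_thr and is skipped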

easycv.core.visualization.imshow_label(img, labels, text_color='blue', font_size=20, thickness=1, font_scale=0.5, intervel=5, show=True, win_name='', wait_time=0, out_file=None)[source]

Draw labels on an image.

Parameters
  • img (str or ndarray) – The image to be displayed.

  • labels (str or list[str]) – labels of each image.

  • text_color (str or tuple or Color) – Color of texts.

  • font_size (int) – Size of font.

  • thickness (int) – Thickness of lines.

  • font_scale (float) – Font scales of texts.

  • intervel (int) – interval in pixels between multiple labels

  • show (bool) – Whether to show the image.

  • win_name (str) – The window name.

  • wait_time (int) – Value of waitKey param.

  • out_file (str, optional) – The filename to write the image.

Returns

The image with labels drawn on it.

Return type

ndarray

Submodules
easycv.core.visualization.image module
easycv.core.visualization.image.get_font_path()[source]
easycv.core.visualization.image.put_text(img, xy, text, fill, size=20)[source]

support chinese text

easycv.core.visualization.image.imshow_label(img, labels, text_color='blue', font_size=20, thickness=1, font_scale=0.5, intervel=5, show=True, win_name='', wait_time=0, out_file=None)[source]

Draw labels on an image.

Parameters
  • img (str or ndarray) – The image to be displayed.

  • labels (str or list[str]) – labels of each image.

  • text_color (str or tuple or Color) – Color of texts.

  • font_size (int) – Size of font.

  • thickness (int) – Thickness of lines.

  • font_scale (float) – Font scales of texts.

  • intervel (int) – interval in pixels between multiple labels

  • show (bool) – Whether to show the image.

  • win_name (str) – The window name.

  • wait_time (int) – Value of waitKey param.

  • out_file (str, optional) – The filename to write the image.

Returns

The image with labels drawn on it.

Return type

ndarray

easycv.core.visualization.image.imshow_bboxes(img, bboxes, labels=None, colors='green', text_color='white', font_size=20, thickness=1, font_scale=0.5, show=True, win_name='', wait_time=0, out_file=None)[source]

Draw bboxes with labels (optional) on an image. This is a wrapper of mmcv.imshow_bboxes.

Parameters
  • img (str or ndarray) – The image to be displayed.

  • bboxes (ndarray) – ndarray of shape (k, 4), each row is a bbox in format [x1, y1, x2, y2].

  • labels (str or list[str], optional) – labels of each bbox.

  • colors (list[str or tuple or Color]) – A list of colors.

  • text_color (str or tuple or Color) – Color of texts.

  • font_size (int) – Size of font.

  • thickness (int) – Thickness of lines.

  • font_scale (float) – Font scales of texts.

  • show (bool) – Whether to show the image.

  • win_name (str) – The window name.

  • wait_time (int) – Value of waitKey param.

  • out_file (str, optional) – The filename to write the image.

Returns

The image with bboxes drawn on it.

Return type

ndarray

easycv.core.visualization.image.imshow_keypoints(img, pose_result, skeleton=None, kpt_score_thr=0.3, pose_kpt_color=None, pose_link_color=None, radius=4, thickness=1, show_keypoint_weight=False)[source]

Draw keypoints and links on an image.

Parameters
  • img (str or Tensor) – The image to draw poses on. If an image array is given, it will be modified in-place.

  • pose_result (list[kpts]) – The poses to draw. Each element kpts is a set of K keypoints as a Kx3 numpy.ndarray, where each keypoint is represented as (x, y, score).

  • kpt_score_thr (float, optional) – Minimum score of keypoints to be shown. Default: 0.3.

  • pose_kpt_color (np.array[Nx3]) – Color of N keypoints. If None, the keypoint will not be drawn.

  • pose_link_color (np.array[Mx3]) – Color of M links. If None, the links will not be drawn.

  • thickness (int) – Thickness of lines.

Submodules

easycv.core.standard_fields module

Contains classes specifying naming conventions used for object detection.

Specifies:

  • InputDataFields: standard fields used by reader/preprocessor/batcher.

  • DetectionResultFields: standard fields returned by object detector.

  • BoxListFields: standard fields used by BoxList.

  • TfExampleFields: standard fields for tf-example data format (go/tf-example).

class easycv.core.standard_fields.InputDataFields[source]

Bases: object

Names for the input tensors.

Holds the standard data field names to use for identifying input tensors. This should be used by the decoder to identify keys for the returned tensor_dict containing input tensors. And it should be used by the model to identify the tensors it needs.

image

image.

original_image

image in the original input size.

key

unique key corresponding to image.

source_id

source of the original image.

filename

original filename of the dataset (without common path).

groundtruth_image_classes

image-level class labels.

groundtruth_boxes

coordinates of the ground truth boxes in the image.

groundtruth_classes

box-level class labels.

groundtruth_label_types

box-level label types (e.g. explicit negative).

groundtruth_is_crowd

[DEPRECATED, use groundtruth_group_of instead] is the groundtruth a single object or a crowd.

groundtruth_area

area of a groundtruth segment.

groundtruth_difficult

is a difficult object

groundtruth_group_of

is a group_of objects, e.g. multiple objects of the same class, forming a connected group, where instances are heavily occluding each other.

proposal_boxes

coordinates of object proposal boxes.

proposal_objectness

objectness score of each proposal.

groundtruth_instance_masks

ground truth instance masks.

groundtruth_instance_boundaries

ground truth instance boundaries.

groundtruth_instance_classes

instance mask-level class labels.

groundtruth_keypoints

ground truth keypoints.

groundtruth_keypoint_visibilities

ground truth keypoint visibilities.

groundtruth_label_scores

groundtruth label scores.

groundtruth_weights

groundtruth weight factor for bounding boxes.

num_groundtruth_boxes

number of groundtruth boxes.

true_image_shapes

true shapes of the resized images, as resized images can be padded with zeros.

image = 'image'
mask = 'mask'
width = 'width'
height = 'height'
original_image = 'original_image'
optical_flow = 'optical_flow'
key = 'key'
source_id = 'source_id'
filename = 'filename'
dataset_name = 'dataset_name'
groundtruth_image_classes = 'groundtruth_image_classes'
groundtruth_image_classes_num = 'groundtruth_image_classes_num'
groundtruth_boxes = 'groundtruth_boxes'
groundtruth_classes = 'groundtruth_classes'
groundtruth_label_types = 'groundtruth_label_types'
groundtruth_is_crowd = 'groundtruth_is_crowd'
groundtruth_area = 'groundtruth_area'
groundtruth_difficult = 'groundtruth_difficult'
groundtruth_group_of = 'groundtruth_group_of'
proposal_boxes = 'proposal_boxes'
proposal_objectness = 'proposal_objectness'
groundtruth_instance_masks = 'groundtruth_instance_masks'
groundtruth_instance_boundaries = 'groundtruth_instance_boundaries'
groundtruth_instance_classes = 'groundtruth_instance_classes'
groundtruth_keypoints = 'groundtruth_keypoints'
groundtruth_keypoint_visibilities = 'groundtruth_keypoint_visibilities'
groundtruth_label_scores = 'groundtruth_label_scores'
groundtruth_weights = 'groundtruth_weights'
num_groundtruth_boxes = 'num_groundtruth_boxes'
true_image_shape = 'true_image_shape'
original_image_shape = 'original_image_shape'
original_instance_masks = 'original_instance_masks'
groundtruth_boxes_absolute = 'groundtruth_boxes_absolute'
groundtruth_keypoints_absolute = 'groundtruth_keypoints_absolute'
label_map = 'label_map'
char_dict = 'char_dict'
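
A small sketch of keying a sample dict with these standard field names (the tensors and file name are placeholders):

>>> import numpy as np
>>> from easycv.core.standard_fields import InputDataFields as fields
>>> sample = {
...     fields.image: np.zeros((480, 640, 3), dtype=np.uint8),
...     fields.groundtruth_boxes: np.array([[0.1, 0.2, 0.5, 0.6]]),
...     fields.groundtruth_classes: np.array([3]),
...     fields.filename: '000001.jpg',
... }
>>> sample[fields.groundtruth_boxes].shape
(1, 4)
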
class easycv.core.standard_fields.DetectionResultFields[source]

Bases: object

Naming conventions for storing the output of the detector.

source_id

source of the original image.

key

unique key corresponding to image.

detection_boxes

coordinates of the detection boxes in the image.

detection_scores

detection scores for the detection boxes in the image.

detection_classes

detection-level class labels.

detection_masks

contains a segmentation mask for each detection box.

detection_boundaries

contains an object boundary for each detection box.

detection_keypoints

contains detection keypoints for each detection box.

num_detections

number of detections in the batch.

source_id = 'source_id'
key = 'key'
detection_boxes = 'detection_boxes'
detection_scores = 'detection_scores'
detection_classes = 'detection_classes'
detection_masks = 'detection_masks'
detection_boundaries = 'detection_boundaries'
detection_keypoints = 'detection_keypoints'
num_detections = 'num_detections'
class easycv.core.standard_fields.TfExampleFields[source]

Bases: object

TF-example proto feature names for object detection.

Holds the standard feature names to load from an Example proto for object detection.

image_encoded

JPEG encoded string

image_format

image format, e.g. “JPEG”

filename

filename

channels

number of channels of image

colorspace

colorspace, e.g. “RGB”

height

height of image in pixels, e.g. 462

width

width of image in pixels, e.g. 581

source_id

original source of the image

object_class_text

labels in text format, e.g. [“person”, “cat”]

object_class_label

labels in numbers, e.g. [16, 8]

object_bbox_xmin

xmin coordinates of groundtruth box, e.g. 10, 30

object_bbox_xmax

xmax coordinates of groundtruth box, e.g. 50, 40

object_bbox_ymin

ymin coordinates of groundtruth box, e.g. 40, 50

object_bbox_ymax

ymax coordinates of groundtruth box, e.g. 80, 70

object_view

viewpoint of object, e.g. [“frontal”, “left”]

object_truncated

is object truncated, e.g. [true, false]

object_occluded

is object occluded, e.g. [true, false]

object_difficult

is object difficult, e.g. [true, false]

object_group_of

is object a single object or a group of objects

object_depiction

is object a depiction

object_is_crowd

[DEPRECATED, use object_group_of instead] is the object a single object or a crowd

object_segment_area

the area of the segment.

object_weight

a weight factor for the object’s bounding box.

instance_masks

instance segmentation masks.

instance_boundaries

instance boundaries.

instance_classes

Classes for each instance segmentation mask.

detection_class_label

class label in numbers.

detection_bbox_ymin

ymin coordinates of a detection box.

detection_bbox_xmin

xmin coordinates of a detection box.

detection_bbox_ymax

ymax coordinates of a detection box.

detection_bbox_xmax

xmax coordinates of a detection box.

detection_score

detection score for the class label and box.

image_encoded = 'image/encoded'
image_format = 'image/format'
filename = 'image/filename'
channels = 'image/channels'
colorspace = 'image/colorspace'
height = 'image/height'
width = 'image/width'
source_id = 'image/source_id'
object_class_text = 'image/object/class/text'
object_class_label = 'image/object/class/label'
object_bbox_ymin = 'image/object/bbox/ymin'
object_bbox_xmin = 'image/object/bbox/xmin'
object_bbox_ymax = 'image/object/bbox/ymax'
object_bbox_xmax = 'image/object/bbox/xmax'
object_view = 'image/object/view'
object_truncated = 'image/object/truncated'
object_occluded = 'image/object/occluded'
object_difficult = 'image/object/difficult'
object_group_of = 'image/object/group_of'
object_depiction = 'image/object/depiction'
object_is_crowd = 'image/object/is_crowd'
object_segment_area = 'image/object/segment/area'
object_weight = 'image/object/weight'
instance_masks = 'image/segmentation/object'
instance_boundaries = 'image/boundaries/object'
instance_classes = 'image/segmentation/object/class'
detection_class_label = 'image/detection/label'
detection_bbox_ymin = 'image/detection/bbox/ymin'
detection_bbox_xmin = 'image/detection/bbox/xmin'
detection_bbox_ymax = 'image/detection/bbox/ymax'
detection_bbox_xmax = 'image/detection/bbox/xmax'
detection_score = 'image/detection/score'

easycv.models package

Subpackages

easycv.models.backbones package

Submodules
easycv.models.backbones.benchmark_mlp module
class easycv.models.backbones.benchmark_mlp.BenchMarkMLP(feature_num, num_classes=1000, avg_pool=False, **kwargs)[source]

Bases: torch.nn.modules.module.Module

__init__(feature_num, num_classes=1000, avg_pool=False, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.bninception module

This model is taken from the official PyTorch model zoo. - torchvision.models.mobilenet.py on 31st Aug, 2019

class easycv.models.backbones.bninception.BNInception(num_classes=0)[source]

Bases: torch.nn.modules.module.Module

__init__(num_classes=0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
features(input)[source]
logits(features)[source]
forward(input)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.darknet module
class easycv.models.backbones.darknet.Darknet(depth, in_channels=3, stem_out_channels=32, out_features=('dark3', 'dark4', 'dark5'))[source]

Bases: torch.nn.modules.module.Module

depth2blocks = {21: [1, 2, 2, 1], 53: [2, 8, 8, 4]}
__init__(depth, in_channels=3, stem_out_channels=32, out_features=('dark3', 'dark4', 'dark5'))[source]
Parameters
  • depth (int) – depth of darknet used in model, usually use [21, 53] for this param.

  • in_channels (int) – number of input channels, for example, use 3 for RGB image.

  • stem_out_channels (int) – number of output channels of darknet stem. It decides channels of darknet layer2 to layer5.

  • out_features (Tuple[str]) – desired output layer names.

make_group_layer(in_channels: int, num_blocks: int, stride: int = 1)[source]

Starts with a conv layer, followed by num_blocks ResLayer blocks.

make_spp_block(filters_list, in_filters)[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.darknet.CSPDarknet(dep_mul, wid_mul, out_features=('dark3', 'dark4', 'dark5'), depthwise=False, act='silu', spp_type='spp')[source]

Bases: torch.nn.modules.module.Module

__init__(dep_mul, wid_mul, out_features=('dark3', 'dark4', 'dark5'), depthwise=False, act='silu', spp_type='spp')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.genet module
easycv.models.backbones.genet.remove_bn_in_superblock(super_block)[source]
easycv.models.backbones.genet.fuse_bn(model)[source]
class easycv.models.backbones.genet.PlainNetBasicBlockClass(in_channels=0, out_channels=0, stride=1, no_create=False, block_name=None, **kwargs)[source]

Bases: torch.nn.modules.module.Module

__init__(in_channels=0, out_channels=0, stride=1, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.AdaptiveAvgPool(out_channels, output_size, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(out_channels, output_size, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.BN(out_channels=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(out_channels=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.ConvDW(out_channels=None, kernel_size=None, stride=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(out_channels=None, kernel_size=None, stride=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.ConvKX(in_channels=None, out_channels=None, kernel_size=None, stride=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(in_channels=None, out_channels=None, kernel_size=None, stride=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.Flatten(out_channels, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(out_channels, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.Linear(in_channels=None, out_channels=None, bias=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(in_channels=None, out_channels=None, bias=None, copy_from=None, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.MaxPool(out_channels, kernel_size, stride, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(out_channels, kernel_size, stride, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.MultiSumBlock(inner_block_list, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(inner_block_list, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.RELU(out_channels, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(out_channels, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.ResBlock(inner_block_list, in_channels=None, stride=None, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

ResBlock(in_channels, inner_blocks_str). If in_channels is missing, use inner_block_list[0].in_channels as in_channels.

__init__(inner_block_list, in_channels=None, stride=None, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.Sequential(inner_block_list, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(inner_block_list, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.SuperResKXKX(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.SuperResK1KX(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.SuperResK1KXK1(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.SuperResK1DWK1(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
training: bool
class easycv.models.backbones.genet.SuperResK1DW(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Bases: easycv.models.backbones.genet.PlainNetBasicBlockClass

__init__(in_channels=0, out_channels=0, kernel_size=3, stride=1, expansion=1.0, sublayers=1, no_create=False, block_name=None, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

static create_from_str(s, no_create=False)[source]
static is_instance_from_str(s)[source]
class easycv.models.backbones.genet.PlainNet(plainnet_struct_idx=None, num_classes=0, no_create=False, **kwargs)[source]

Bases: torch.nn.modules.module.Module

training: bool
__init__(plainnet_struct_idx=None, num_classes=0, no_create=False, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

easycv.models.backbones.hrnet module
easycv.models.backbones.hrnet.get_expansion(block, expansion=None)[source]

Get the expansion of a residual block.

The block expansion will be obtained by the following order:

  1. If expansion is given, just return it.

  2. If block has the attribute expansion, then return block.expansion.

  3. Return the default value according to the block type: 1 for BasicBlock and 4 for Bottleneck.

Parameters
  • block (class) – The block class.

  • expansion (int | None) – The given expansion ratio.

Returns

The expansion of the block.

Return type

int
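
A minimal sketch of the lookup rules described above, using the Bottleneck block from this module:

>>> from easycv.models.backbones.hrnet import Bottleneck, get_expansion
>>> get_expansion(Bottleneck)      # default expansion for Bottleneck
4
>>> get_expansion(Bottleneck, 2)   # an explicitly given expansion wins
2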

class easycv.models.backbones.hrnet.Bottleneck(in_channels, out_channels, expansion=4, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]

Bases: torch.nn.modules.module.Module

Bottleneck block for ResNet.

Parameters
  • in_channels (int) – Input channels of this block.

  • out_channels (int) – Output channels of this block.

  • expansion (int) – The ratio of out_channels/mid_channels where mid_channels is the input/output channels of conv2. Default: 4.

  • stride (int) – stride of the block. Default: 1

  • dilation (int) – dilation of convolution. Default: 1

  • downsample (nn.Module) – downsample operation on identity branch. Default: None.

  • style (str) – "pytorch" or "caffe". If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer. Default: “pytorch”.

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.

  • conv_cfg (dict) – dictionary to construct and config conv layer. Default: None

  • norm_cfg (dict) – dictionary to construct and config norm layer. Default: dict(type=’BN’)

__init__(in_channels, out_channels, expansion=4, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property norm1

the normalization layer named “norm1”

Type

nn.Module

property norm2

the normalization layer named “norm2”

Type

nn.Module

property norm3

the normalization layer named “norm3”

Type

nn.Module

forward(x)[source]

Forward function.

training: bool
class easycv.models.backbones.hrnet.HRModule(num_branches, blocks, num_blocks, in_channels, num_channels, multiscale_output=False, with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, upsample_cfg={'align_corners': None, 'mode': 'nearest'})[source]

Bases: torch.nn.modules.module.Module

High-Resolution Module for HRNet.

In this module, every branch has 4 BasicBlocks/Bottlenecks. Fusion/Exchange is in this module.

__init__(num_branches, blocks, num_blocks, in_channels, num_channels, multiscale_output=False, with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, upsample_cfg={'align_corners': None, 'mode': 'nearest'})[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Forward function.

training: bool
class easycv.models.backbones.hrnet.HRNet(arch='w32', extra=None, in_channels=3, conv_cfg=None, norm_cfg={'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=False, multi_scale_output=False)[source]

Bases: torch.nn.modules.module.Module

HRNet backbone.

High-Resolution Representations for Labeling Pixels and Regions

Parameters
  • extra (dict) – detailed configuration for each stage of HRNet.

  • in_channels (int) – Number of input image channels. Default: 3.

  • conv_cfg (dict) – dictionary to construct and config conv layer.

  • norm_cfg (dict) – dictionary to construct and config norm layer.

  • norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only. Default: False

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.

  • zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.

Example

>>> from mmpose.models import HRNet
>>> import torch
>>> extra = dict(
>>>     stage1=dict(
>>>         num_modules=1,
>>>         num_branches=1,
>>>         block='BOTTLENECK',
>>>         num_blocks=(4, ),
>>>         num_channels=(64, )),
>>>     stage2=dict(
>>>         num_modules=1,
>>>         num_branches=2,
>>>         block='BASIC',
>>>         num_blocks=(4, 4),
>>>         num_channels=(32, 64)),
>>>     stage3=dict(
>>>         num_modules=4,
>>>         num_branches=3,
>>>         block='BASIC',
>>>         num_blocks=(4, 4, 4),
>>>         num_channels=(32, 64, 128)),
>>>     stage4=dict(
>>>         num_modules=3,
>>>         num_branches=4,
>>>         block='BASIC',
>>>         num_blocks=(4, 4, 4, 4),
>>>         num_channels=(32, 64, 128, 256)))
>>> self = HRNet(extra, in_channels=1)
>>> self.eval()
>>> inputs = torch.rand(1, 1, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 32, 8, 8)
blocks_dict = {'BASIC': <class 'easycv.models.backbones.resnet.BasicBlock'>, 'BOTTLENECK': <class 'easycv.models.backbones.hrnet.Bottleneck'>}
arch_zoo = {'w18': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (18, 36)], [4, 3, 'BASIC', (4, 4, 4), (18, 36, 72)], [3, 4, 'BASIC', (4, 4, 4, 4), (18, 36, 72, 144)]], 'w30': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (30, 60)], [4, 3, 'BASIC', (4, 4, 4), (30, 60, 120)], [3, 4, 'BASIC', (4, 4, 4, 4), (30, 60, 120, 240)]], 'w32': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (32, 64)], [4, 3, 'BASIC', (4, 4, 4), (32, 64, 128)], [3, 4, 'BASIC', (4, 4, 4, 4), (32, 64, 128, 256)]], 'w40': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (40, 80)], [4, 3, 'BASIC', (4, 4, 4), (40, 80, 160)], [3, 4, 'BASIC', (4, 4, 4, 4), (40, 80, 160, 320)]], 'w44': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (44, 88)], [4, 3, 'BASIC', (4, 4, 4), (44, 88, 176)], [3, 4, 'BASIC', (4, 4, 4, 4), (44, 88, 176, 352)]], 'w48': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (48, 96)], [4, 3, 'BASIC', (4, 4, 4), (48, 96, 192)], [3, 4, 'BASIC', (4, 4, 4, 4), (48, 96, 192, 384)]], 'w64': [[1, 1, 'BOTTLENECK', (4,), (64,)], [1, 2, 'BASIC', (4, 4), (64, 128)], [4, 3, 'BASIC', (4, 4, 4), (64, 128, 256)], [3, 4, 'BASIC', (4, 4, 4, 4), (64, 128, 256, 512)]]}
__init__(arch='w32', extra=None, in_channels=3, conv_cfg=None, norm_cfg={'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=False, multi_scale_output=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property norm1

the normalization layer named “norm1”

Type

nn.Module

property norm2

the normalization layer named “norm2”

Type

nn.Module

init_weights()[source]
forward(x)[source]

Forward function.

train(mode=True)[source]

Convert the model into training mode.

training: bool
parse_arch(arch, extra=None)[source]
easycv.models.backbones.inceptionv3 module

This model is taken from the official PyTorch model zoo. - torchvision.models.inception.py on 31st Aug, 2019

class easycv.models.backbones.inceptionv3.Inception3(num_classes: int = 0, aux_logits: bool = True, transform_input: bool = False)[source]

Bases: torch.nn.modules.module.Module

__init__(num_classes: int = 0, aux_logits: bool = True, transform_input: bool = False) → None[source]
Parameters
  • num_classes – number of classes based on dataset.

  • aux_logits – If True, adds two auxiliary branches that can improve training. Default: False when pretrained is True, otherwise True

  • transform_input – If True, preprocesses the input according to the method with which it was trained on ImageNet. Default: False

init_weights()[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.lighthrnet module
easycv.models.backbones.lighthrnet.channel_shuffle(x, groups)[source]

Channel Shuffle operation.

This function enables cross-group information flow for multiple groups convolution layers.

Parameters
  • x (Tensor) – The input tensor.

  • groups (int) – The number of groups to divide the input tensor in the channel dimension.

Returns

The output tensor after channel shuffle operation.

Return type

Tensor
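
A tiny sketch on a dummy feature map (shapes are illustrative):

>>> import torch
>>> from easycv.models.backbones.lighthrnet import channel_shuffle
>>> x = torch.arange(16, dtype=torch.float32).reshape(1, 4, 2, 2)
>>> y = channel_shuffle(x, groups=2)   # channel order [0, 1, 2, 3] becomes [0, 2, 1, 3]
>>> y.shape
torch.Size([1, 4, 2, 2])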

class easycv.models.backbones.lighthrnet.SpatialWeighting(channels, ratio=16, conv_cfg=None, norm_cfg=None, act_cfg=({'type': 'ReLU'}, {'type': 'Sigmoid'}))[source]

Bases: torch.nn.modules.module.Module

Spatial weighting module.

Parameters
  • channels (int) – The channels of the module.

  • ratio (int) – channel reduction ratio.

  • conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.

  • norm_cfg (dict) – Config dict for normalization layer. Default: None.

  • act_cfg (dict) – Config dict for activation layer. Default: (dict(type=’ReLU’), dict(type=’Sigmoid’)). The last ConvModule uses Sigmoid by default.

__init__(channels, ratio=16, conv_cfg=None, norm_cfg=None, act_cfg=({'type': 'ReLU'}, {'type': 'Sigmoid'}))[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.lighthrnet.CrossResolutionWeighting(channels, ratio=16, conv_cfg=None, norm_cfg=None, act_cfg=({'type': 'ReLU'}, {'type': 'Sigmoid'}))[source]

Bases: torch.nn.modules.module.Module

Cross-resolution channel weighting module.

Parameters
  • channels (int) – The channels of the module.

  • ratio (int) – channel reduction ratio.

  • conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.

  • norm_cfg (dict) – Config dict for normalization layer. Default: None.

  • act_cfg (dict) – Config dict for activation layer. Default: (dict(type=’ReLU’), dict(type=’Sigmoid’)). The last ConvModule uses Sigmoid by default.

__init__(channels, ratio=16, conv_cfg=None, norm_cfg=None, act_cfg=({'type': 'ReLU'}, {'type': 'Sigmoid'}))[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.lighthrnet.ConditionalChannelWeighting(in_channels, stride, reduce_ratio, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]

Bases: torch.nn.modules.module.Module

Conditional channel weighting block.

Parameters
  • in_channels (int) – The input channels of the block.

  • stride (int) – Stride of the 3x3 convolution layer.

  • reduce_ratio (int) – channel reduction ratio.

  • conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.

  • norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.

__init__(in_channels, stride, reduce_ratio, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.lighthrnet.Stem(in_channels, stem_channels, out_channels, expand_ratio, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]

Bases: torch.nn.modules.module.Module

Stem network block.

Parameters
  • in_channels (int) – The input channels of the block.

  • stem_channels (int) – Output channels of the stem layer.

  • out_channels (int) – The output channels of the block.

  • expand_ratio (int) – adjusts number of channels of the hidden layer in InvertedResidual by this amount.

  • conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.

  • norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.

__init__(in_channels, stem_channels, out_channels, expand_ratio, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.lighthrnet.IterativeHead(in_channels, norm_cfg={'type': 'BN'})[source]

Bases: torch.nn.modules.module.Module

Extra iterative head for feature learning.

Parameters
  • in_channels (int) – The input channels of the block.

  • norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).

__init__(in_channels, norm_cfg={'type': 'BN'})[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.lighthrnet.ShuffleUnit(in_channels, out_channels, stride=1, conv_cfg=None, norm_cfg={'type': 'BN'}, act_cfg={'type': 'ReLU'}, with_cp=False)[source]

Bases: torch.nn.modules.module.Module

InvertedResidual block for ShuffleNetV2 backbone.

Parameters
  • in_channels (int) – The input channels of the block.

  • out_channels (int) – The output channels of the block.

  • stride (int) – Stride of the 3x3 convolution layer. Default: 1

  • conv_cfg (dict) – Config dict for convolution layer. Default: None, which means using conv2d.

  • norm_cfg (dict) – Config dict for normalization layer. Default: dict(type=’BN’).

  • act_cfg (dict) – Config dict for activation layer. Default: dict(type=’ReLU’).

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed. Default: False.

__init__(in_channels, out_channels, stride=1, conv_cfg=None, norm_cfg={'type': 'BN'}, act_cfg={'type': 'ReLU'}, with_cp=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.lighthrnet.LiteHRModule(num_branches, num_blocks, in_channels, reduce_ratio, module_type, multiscale_output=False, with_fuse=True, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]

Bases: torch.nn.modules.module.Module

High-Resolution Module for LiteHRNet.

It contains conditional channel weighting blocks and shuffle blocks.

Parameters
  • num_branches (int) – Number of branches in the module.

  • num_blocks (int) – Number of blocks in the module.

  • in_channels (list(int)) – Number of input image channels.

  • reduce_ratio (int) – Channel reduction ratio.

  • module_type (str) – ‘LITE’ or ‘NAIVE’

  • multiscale_output (bool) – Whether to output multi-scale features.

  • with_fuse (bool) – Whether to use fuse layers.

  • conv_cfg (dict) – dictionary to construct and config conv layer.

  • norm_cfg (dict) – dictionary to construct and config norm layer.

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.

__init__(num_branches, num_blocks, in_channels, reduce_ratio, module_type, multiscale_output=False, with_fuse=True, conv_cfg=None, norm_cfg={'type': 'BN'}, with_cp=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Forward function.

training: bool
class easycv.models.backbones.lighthrnet.LiteHRNet(extra, in_channels=3, conv_cfg=None, norm_cfg={'type': 'BN'}, norm_eval=False, with_cp=False)[source]

Bases: torch.nn.modules.module.Module

Lite-HRNet backbone.

Lite-HRNet: A Lightweight High-Resolution Network

Code adapted from https://github.com/HRNet/Lite-HRNet/blob/hrnet/models/backbones/litehrnet.py

Parameters
  • extra (dict) – detailed configuration for each stage of HRNet.

  • in_channels (int) – Number of input image channels. Default: 3.

  • conv_cfg (dict) – dictionary to construct and config conv layer.

  • norm_cfg (dict) – dictionary to construct and config norm layer.

  • norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only. Default: False

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.

Example

>>> from mmpose.models import LiteHRNet
>>> import torch
>>> extra=dict(
>>>    stem=dict(stem_channels=32, out_channels=32, expand_ratio=1),
>>>    num_stages=3,
>>>    stages_spec=dict(
>>>        num_modules=(2, 4, 2),
>>>        num_branches=(2, 3, 4),
>>>        num_blocks=(2, 2, 2),
>>>        module_type=('LITE', 'LITE', 'LITE'),
>>>        with_fuse=(True, True, True),
>>>        reduce_ratios=(8, 8, 8),
>>>        num_channels=(
>>>            (40, 80),
>>>            (40, 80, 160),
>>>            (40, 80, 160, 320),
>>>        )),
>>>    with_head=False)
>>> self = LiteHRNet(extra, in_channels=1)
>>> self.eval()
>>> inputs = torch.rand(1, 1, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 40, 8, 8)
__init__(extra, in_channels=3, conv_cfg=None, norm_cfg={'type': 'BN'}, norm_eval=False, with_cp=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]

Initialize the weights in backbone.

Parameters

pretrained (str, optional) – Path to pre-trained weights. Defaults to None.

forward(x)[source]

Forward function.

train(mode=True)[source]

Convert the model into training mode.

training: bool
easycv.models.backbones.mae_vit_transformer module

Mostly copy-paste from https://github.com/facebookresearch/mae/blob/main/models_mae.py

class easycv.models.backbones.mae_vit_transformer.MaskedAutoencoderViT(img_size=224, patch_size=16, in_chans=3, embed_dim=1024, depth=24, num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]

Bases: torch.nn.modules.module.Module

Masked Autoencoder with VisionTransformer backbone.

MaskedAutoencoderViT is mostly the same as vit_tranformer_dynamic, but adds a random_masking function. A MaskedAutoencoderViT model can be loaded by vit_tranformer_dynamic.

Parameters
  • img_size (int) – input image size

  • patch_size (int) – patch size

  • in_chans (int) – input image channels

  • embed_dim (int) – feature dimensions

  • depth (int) – number of encoder layers

  • num_heads (int) – Parallel attention heads

  • mlp_ratio (float) – mlp ratio

  • norm_layer – type of normalization layer
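
Example

A minimal instantiation sketch (not from the original docstring; the smaller embed_dim/depth/num_heads values are illustrative, and the structure of the returned value depends on the implementation):

>>> from easycv.models.backbones.mae_vit_transformer import MaskedAutoencoderViT
>>> import torch
>>> model = MaskedAutoencoderViT(
...     img_size=224, patch_size=16, embed_dim=768, depth=12, num_heads=12)
>>> model.eval()
>>> imgs = torch.rand(2, 3, 224, 224)
>>> out = model(imgs, mask_ratio=0.75)  # masks 75% of the patches before encoding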

__init__(img_size=224, patch_size=16, in_chans=3, embed_dim=1024, depth=24, num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
random_masking(x, mask_ratio)[source]

Perform per-sample random masking by per-sample shuffling. Per-sample shuffling is done by argsort of random noise. x: [N, L, D], sequence
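
A minimal standalone sketch of the argsort-of-noise masking idea described above (illustrative, not the exact implementation):

>>> import torch
>>> N, L, D, mask_ratio = 2, 16, 8, 0.75
>>> x = torch.rand(N, L, D)                    # [N, L, D] sequence of patch tokens
>>> len_keep = int(L * (1 - mask_ratio))
>>> noise = torch.rand(N, L)                   # per-sample random noise
>>> ids_shuffle = torch.argsort(noise, dim=1)  # ascending: tokens with small noise are kept
>>> ids_keep = ids_shuffle[:, :len_keep]
>>> x_masked = torch.gather(x, 1, ids_keep.unsqueeze(-1).repeat(1, 1, D))
>>> tuple(x_masked.shape)
(2, 4, 8)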

forward(x, mask_ratio)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.mnasnet module

This model is taken from the official PyTorch model zoo - torchvision.models.mnasnet.py, as of 31st Aug, 2019.

class easycv.models.backbones.mnasnet.MNASNet(alpha, num_classes=0, dropout=0.2)[source]

Bases: torch.nn.modules.module.Module

MNASNet, as described in https://arxiv.org/pdf/1807.11626.pdf.

>>> model = MNASNet(1000, 1.0)
>>> x = torch.rand(1, 3, 224, 224)
>>> y = model(x)
>>> y.dim()
1
>>> y.nelement()
1000

__init__(alpha, num_classes=0, dropout=0.2)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights()[source]
training: bool
easycv.models.backbones.mobilenetv2 module

This model is taken from the official PyTorch model zoo - torchvision.models.mobilenet.py, as of 31st Aug, 2019.

class easycv.models.backbones.mobilenetv2.MobileNetV2(num_classes=0, width_multi=1.0, inverted_residual_setting=None, round_nearest=8)[source]

Bases: torch.nn.modules.module.Module

__init__(num_classes=0, width_multi=1.0, inverted_residual_setting=None, round_nearest=8)[source]

MobileNet V2 main class.

Parameters
  • num_classes (int) – Number of classes

  • width_multi (float) – Width multiplier - adjusts number of channels in each layer by this amount

  • inverted_residual_setting – Network structure

  • round_nearest (int) – Round the number of channels in each layer to be a multiple of this number. Set to 1 to turn off rounding
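
Example (an illustrative sketch; with the default num_classes=0 the model acts as a feature extractor, and the structure of the output depends on the implementation):

>>> from easycv.models.backbones.mobilenetv2 import MobileNetV2
>>> import torch
>>> model = MobileNetV2(width_multi=1.0)
>>> model.eval()
>>> out = model(torch.rand(1, 3, 224, 224))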

init_weights()[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.network_blocks module
class easycv.models.backbones.network_blocks.SiLU(inplace=True)[source]

Bases: torch.nn.modules.module.Module

Export-friendly inplace version of nn.SiLU().
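
A rough functional equivalent, assuming the block computes x * sigmoid(x) with plain tensor ops so that export does not depend on the nn.SiLU operator:

>>> import torch
>>> x = torch.randn(4)
>>> y = x * torch.sigmoid(x)  # same values as nn.SiLU()(x)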

__init__(inplace=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

static forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.HSiLU(inplace=True)[source]

Bases: torch.nn.modules.module.Module

Export-friendly inplace version of nn.SiLU(); hardsigmoid performs better than sigmoid when used in edge models.
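
A rough functional equivalent, assuming the block computes x * hardsigmoid(x) (the hard-sigmoid gate replacing the sigmoid used by SiLU):

>>> import torch
>>> import torch.nn.functional as F
>>> x = torch.randn(4)
>>> y = x * F.hardsigmoid(x)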

__init__(inplace=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

static forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.network_blocks.get_activation(name='silu', inplace=True)[source]
class easycv.models.backbones.network_blocks.BaseConv(in_channels, out_channels, ksize, stride, groups=1, bias=False, act='silu')[source]

Bases: torch.nn.modules.module.Module

A Conv2d -> Batchnorm -> silu/leaky relu block
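
Example (an illustrative sketch; it assumes the usual "same"-style padding of ksize // 2, so a stride of 2 halves the spatial size):

>>> from easycv.models.backbones.network_blocks import BaseConv
>>> import torch
>>> conv = BaseConv(in_channels=3, out_channels=16, ksize=3, stride=2, act='silu')
>>> tuple(conv(torch.rand(1, 3, 64, 64)).shape)
(1, 16, 32, 32)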

__init__(in_channels, out_channels, ksize, stride, groups=1, bias=False, act='silu')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

fuseforward(x)[source]
training: bool
class easycv.models.backbones.network_blocks.DWConv(in_channels, out_channels, ksize, stride=1, act='silu')[source]

Bases: torch.nn.modules.module.Module

Depthwise Conv + Conv

__init__(in_channels, out_channels, ksize, stride=1, act='silu')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.Bottleneck(in_channels, out_channels, shortcut=True, expansion=0.5, depthwise=False, act='silu')[source]

Bases: torch.nn.modules.module.Module

__init__(in_channels, out_channels, shortcut=True, expansion=0.5, depthwise=False, act='silu')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.ResLayer(in_channels: int)[source]

Bases: torch.nn.modules.module.Module

Residual layer with in_channels inputs.

__init__(in_channels: int)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.SPPFBottleneck(in_channels, out_channels, kernel_size=5, activation='silu')[source]

Bases: torch.nn.modules.module.Module

Spatial pyramid pooling layer used in YOLOv3-SPP

__init__(in_channels, out_channels, kernel_size=5, activation='silu')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.SPPBottleneck(in_channels, out_channels, kernel_sizes=(5, 9, 13), activation='silu')[source]

Bases: torch.nn.modules.module.Module

Spatial pyramid pooling layer used in YOLOv3-SPP

__init__(in_channels, out_channels, kernel_sizes=(5, 9, 13), activation='silu')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.CSPLayer(in_channels, out_channels, n=1, shortcut=True, expansion=0.5, depthwise=False, act='silu')[source]

Bases: torch.nn.modules.module.Module

CSP Bottleneck with 3 convolutions

__init__(in_channels, out_channels, n=1, shortcut=True, expansion=0.5, depthwise=False, act='silu')[source]
Parameters
  • in_channels (int) – input channels.

  • out_channels (int) – output channels.

  • n (int) – number of Bottlenecks. Default value: 1.
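
Example (an illustrative sketch; the spatial size is assumed to be preserved, with only the channel count changing):

>>> from easycv.models.backbones.network_blocks import CSPLayer
>>> import torch
>>> layer = CSPLayer(in_channels=64, out_channels=128, n=1)
>>> tuple(layer(torch.rand(1, 64, 32, 32)).shape)
(1, 128, 32, 32)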

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.Focus(in_channels, out_channels, ksize=1, stride=1, act='silu')[source]

Bases: torch.nn.modules.module.Module

Focus width and height information into channel space.
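
The space-to-channel idea behind this block, as a standalone sketch (the actual module is assumed to additionally apply a BaseConv to the concatenated patches):

>>> import torch
>>> x = torch.rand(1, 3, 224, 224)
>>> patches = torch.cat([x[..., ::2, ::2], x[..., 1::2, ::2],
...                      x[..., ::2, 1::2], x[..., 1::2, 1::2]], dim=1)
>>> tuple(patches.shape)  # 4x channels, half the spatial resolution
(1, 12, 112, 112)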

__init__(in_channels, out_channels, ksize=1, stride=1, act='silu')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.GSConv(c1, c2, k=1, s=1, g=1, act='silu')[source]

Bases: torch.nn.modules.module.Module

GSConv is used to merge the channel information of DSConv and BaseConv. You can refer to https://github.com/AlanLi1997/slim-neck-by-gsconv for more details.

__init__(c1, c2, k=1, s=1, g=1, act='silu')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.GSBottleneck(c1, c2, k=3, s=1)[source]

Bases: torch.nn.modules.module.Module

GSBottleneck stacks GSConv layers. You can refer to https://github.com/AlanLi1997/slim-neck-by-gsconv for more details.

__init__(c1, c2, k=3, s=1)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.network_blocks.VoVGSCSP(c1, c2, n=1, shortcut=True, g=1, e=0.5)[source]

Bases: torch.nn.modules.module.Module

VoVGSCSP is a new neck structure used in CSPNet. You can refer to https://github.com/AlanLi1997/slim-neck-by-gsconv for more details.

__init__(c1, c2, n=1, shortcut=True, g=1, e=0.5)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.pytorch_image_models_wrapper module
class easycv.models.backbones.pytorch_image_models_wrapper.PytorchImageModelWrapper(model_name='resnet50', scriptable=None, exportable=None, no_jit=None, **kwargs)[source]

Bases: torch.nn.modules.module.Module

Support backbones from pytorch-image-models. The PyTorch community has lots of awesome contributions for image models. PyTorch Image Models (timm) is a collection of image models that aims to pull together a wide variety of SOTA models with the ability to reproduce ImageNet training results. Model pages can be found at https://rwightman.github.io/pytorch-image-models/models/. References: https://github.com/rwightman/pytorch-image-models
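
Example (an illustrative sketch; the model name is assumed to be any name accepted by timm.create_model, and the structure of the returned features depends on the implementation):

>>> from easycv.models.backbones.pytorch_image_models_wrapper import PytorchImageModelWrapper
>>> import torch
>>> model = PytorchImageModelWrapper(model_name='resnet18')
>>> model.eval()
>>> feats = model(torch.rand(1, 3, 224, 224))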

__init__(model_name='resnet50', scriptable=None, exportable=None, no_jit=None, **kwargs)[source]

Inits PytorchImageModelWrapper by timm.create_models.

Parameters
  • model_name (str) – name of model to instantiate

  • scriptable (bool) – set layer config so that model is jit scriptable (not working for all models yet)

  • exportable (bool) – set layer config so that model is traceable / ONNX exportable (not fully impl/obeyed yet)

  • no_jit (bool) – set layer config so that model doesn’t utilize jit scripted layers (so far activations only)

init_weights(pretrained=None)[source]
Parameters

pretrained –
  if pretrained == True, load model from default path;
  if pretrained == False or None, load from init weights.
  if model_name in timm_model_names, load model from timm default path;
  if model_name in _MODEL_MAP, load model from easycv default path.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.resnest module

ResNet variants

class easycv.models.backbones.resnest.SplAtConv2d(in_channels, channels, kernel_size, stride=(1, 1), padding=(0, 0), dilation=(1, 1), groups=1, bias=True, radix=2, reduction_factor=4, rectify=False, rectify_avg=False, norm_layer=None, dropblock_prob=0.0, **kwargs)[source]

Bases: torch.nn.modules.module.Module

Split-Attention Conv2d

__init__(in_channels, channels, kernel_size, stride=(1, 1), padding=(0, 0), dilation=(1, 1), groups=1, bias=True, radix=2, reduction_factor=4, rectify=False, rectify_avg=False, norm_layer=None, dropblock_prob=0.0, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.resnest.rSoftMax(radix, cardinality)[source]

Bases: torch.nn.modules.module.Module

__init__(radix, cardinality)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.resnest.DropBlock2D(*args, **kwargs)[source]

Bases: object

__init__(*args, **kwargs)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.models.backbones.resnest.GlobalAvgPool2d[source]

Bases: torch.nn.modules.module.Module

__init__()[source]

Global average pooling over the input’s spatial dimensions

forward(inputs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.resnest.Bottleneck(inplanes, planes, stride=1, downsample=None, radix=1, cardinality=1, bottleneck_width=64, avd=False, avd_first=False, dilation=1, is_first=False, rectified_conv=False, rectify_avg=False, norm_layer=None, dropblock_prob=0.0, last_gamma=False)[source]

Bases: torch.nn.modules.module.Module

ResNet Bottleneck

expansion = 4
__init__(inplanes, planes, stride=1, downsample=None, radix=1, cardinality=1, bottleneck_width=64, avd=False, avd_first=False, dilation=1, is_first=False, rectified_conv=False, rectify_avg=False, norm_layer=None, dropblock_prob=0.0, last_gamma=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.resnest.ResNeSt(depth=None, block=<class 'easycv.models.backbones.resnest.Bottleneck'>, layers=[3, 4, 6, 3], radix=2, groups=1, bottleneck_width=64, num_classes=0, dilated=False, dilation=1, deep_stem=True, stem_width=32, avg_down=True, rectified_conv=False, rectify_avg=False, avd=False, avd_first=False, final_drop=0.0, dropblock_prob=0, last_gamma=False, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>)[source]

Bases: torch.nn.modules.module.Module

ResNet Variants

Parameters
  • block (Block) – Class for the residual block. Options are BasicBlockV1, BottleneckV1.

  • layers (list of int) – Numbers of layers in each block

  • classes (int, default 1000) – Number of classification classes.

  • dilated (bool, default False) – Applying dilation strategy to pretrained ResNet yielding a stride-8 model, typically used in Semantic Segmentation.

  • norm_layer (object) – Normalization layer used in backbone network (default: mxnet.gluon.nn.BatchNorm; for Synchronized Cross-GPU BatchNormalization).

  • Reference

    • He, Kaiming, et al. “Deep residual learning for image recognition.” Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.

    • Yu, Fisher, and Vladlen Koltun. “Multi-scale context aggregation by dilated convolutions.”
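
Example (an illustrative sketch following the arch_settings below; the structure of the output depends on the implementation):

>>> from easycv.models.backbones.resnest import ResNeSt
>>> import torch
>>> self = ResNeSt(depth=50)
>>> self.eval()
>>> outputs = self.forward(torch.rand(1, 3, 224, 224))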

arch_settings = {50: ((3, 4, 6, 3), 32), 101: ((3, 4, 23, 3), 64), 200: ((3, 24, 36, 3), 64), 269: ((3, 30, 48, 8), 64)}
__init__(depth=None, block=<class 'easycv.models.backbones.resnest.Bottleneck'>, layers=[3, 4, 6, 3], radix=2, groups=1, bottleneck_width=64, num_classes=0, dilated=False, dilation=1, deep_stem=True, stem_width=32, avg_down=True, rectified_conv=False, rectify_avg=False, avd=False, avd_first=False, final_drop=0.0, dropblock_prob=0, last_gamma=False, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool
init_weights()[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

easycv.models.backbones.resnet module
class easycv.models.backbones.resnet.BasicBlock(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, frelu=False, dcn=None)[source]

Bases: torch.nn.modules.module.Module

expansion = 1
__init__(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, frelu=False, dcn=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property norm1
property norm2
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.resnet.Bottleneck(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, frelu=False, dcn=None)[source]

Bases: torch.nn.modules.module.Module

expansion = 4
__init__(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, frelu=False, dcn=None)[source]

Bottleneck block for ResNet. If style is “pytorch”, the stride-two layer is the 3x3 conv layer, if it is “caffe”, the stride-two layer is the first 1x1 conv layer.

property norm1
property norm2
property norm3
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.resnet.make_res_layer(block, inplanes, planes, blocks, stride=1, dilation=1, style='pytorch', avg_down=False, with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'}, dcn=None, frelu=False, multi_grid=None, contract_dilation=False)[source]
class easycv.models.backbones.resnet.ResNet(depth, in_channels=3, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), style='pytorch', deep_stem=False, avg_down=False, num_classes=0, frozen_stages=- 1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, dcn=None, stage_with_dcn=(False, False, False, False), with_cp=False, frelu=False, original_inplanes=64, stem_channels=64, zero_init_residual=False, multi_grid=None, contract_dilation=False)[source]

Bases: torch.nn.modules.module.Module

ResNet backbone.

Parameters
  • depth (int) – Depth of resnet, from {18, 34, 50, 101, 152}.

  • in_channels (int) – Number of input image channels. Normally 3.

  • num_stages (int) – Resnet stages, normally 4.

  • strides (Sequence[int]) – Strides of the first block of each stage.

  • dilations (Sequence[int]) – Dilation of each stage.

  • out_indices (Sequence[int]) – Output from which stages.

  • style (str) – pytorch or caffe. If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer.

  • deep_stem (bool) – Replace 7x7 conv in input stem with 3 3x3 conv. Default: False.

  • avg_down (bool) – Use AvgPool instead of stride conv when downsampling in the bottleneck. Default: False.

  • frozen_stages (int) – Stages to be frozen (stop grad and set eval mode). -1 means not freezing any parameters.

  • norm_cfg (dict) – dictionary to construct and config norm layer.

  • norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only.

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.

  • original_inplanes – start channel for first block, default=64

  • stem_channels (int) – Number of stem channels. Default: 64.

  • zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.

  • multi_grid (Sequence[int]|None) – Multi grid dilation rates of last stage. Default: None.

  • contract_dilation (bool) – Whether contract first dilation of each layer Default: False.

Example

>>> from easycv.models import ResNet
>>> import torch
>>> self = ResNet(depth=18)
>>> self.eval()
>>> inputs = torch.rand(1, 3, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 64, 8, 8)
(1, 128, 4, 4)
(1, 256, 2, 2)
(1, 512, 1, 1)
arch_settings = {10: (<class 'easycv.models.backbones.resnet.BasicBlock'>, (1, 1, 1, 1)), 18: (<class 'easycv.models.backbones.resnet.BasicBlock'>, (2, 2, 2, 2)), 34: (<class 'easycv.models.backbones.resnet.BasicBlock'>, (3, 4, 6, 3)), 50: (<class 'easycv.models.backbones.resnet.Bottleneck'>, (3, 4, 6, 3)), 101: (<class 'easycv.models.backbones.resnet.Bottleneck'>, (3, 4, 23, 3)), 152: (<class 'easycv.models.backbones.resnet.Bottleneck'>, (3, 8, 36, 3))}
__init__(depth, in_channels=3, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), style='pytorch', deep_stem=False, avg_down=False, num_classes=0, frozen_stages=- 1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, dcn=None, stage_with_dcn=(False, False, False, False), with_cp=False, frelu=False, original_inplanes=64, stem_channels=64, zero_init_residual=False, multi_grid=None, contract_dilation=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property norm1
init_weights()[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

train(mode=True)[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns

self

Return type

Module

training: bool
class easycv.models.backbones.resnet.ResNetV1c(**kwargs)[source]

Bases: easycv.models.backbones.resnet.ResNet

Compared to ResNet, ResNetV1c replaces the 7x7 conv in the input stem with three 3x3 convs. For more details please refer to <https://arxiv.org/abs/1812.01187>.

training: bool
__init__(**kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

class easycv.models.backbones.resnet.ResNetV1d(**kwargs)[source]

Bases: easycv.models.backbones.resnet.ResNet

Compared to ResNet, ResNetV1d replaces the 7x7 conv in the input stem with three 3x3 convs. And in the downsampling block, a 2x2 avg_pool with stride 2 is added before conv, whose stride is changed to 1.

training: bool
__init__(**kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

easycv.models.backbones.resnet_jit module
class easycv.models.backbones.resnet_jit.BasicBlock(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]

Bases: torch.nn.modules.module.Module

expansion = 1
__init__(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property norm1
property norm2
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.resnet_jit.Bottleneck(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]

Bases: torch.nn.modules.module.Module

expansion = 4
__init__(inplanes, planes, stride=1, dilation=1, downsample=None, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]

Bottleneck block for ResNet. If style is “pytorch”, the stride-two layer is the 3x3 conv layer, if it is “caffe”, the stride-two layer is the first 1x1 conv layer.

property norm1
property norm2
property norm3
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.resnet_jit.make_res_layer(block, inplanes, planes, blocks, stride=1, dilation=1, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]
class easycv.models.backbones.resnet_jit.ResNetJIT(depth, in_channels=3, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), style='pytorch', frozen_stages=- 1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=False)[source]

Bases: torch.nn.modules.module.Module

ResNet backbone.

Parameters
  • depth (int) – Depth of resnet, from {18, 34, 50, 101, 152}.

  • in_channels (int) – Number of input image channels. Normally 3.

  • num_stages (int) – Resnet stages, normally 4.

  • strides (Sequence[int]) – Strides of the first block of each stage.

  • dilations (Sequence[int]) – Dilation of each stage.

  • out_indices (Sequence[int]) – Output from which stages.

  • style (str) – pytorch or caffe. If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer.

  • frozen_stages (int) – Stages to be frozen (stop grad and set eval mode). -1 means not freezing any parameters.

  • norm_cfg (dict) – dictionary to construct and config norm layer.

  • norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only.

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.

  • zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.

Example

>>> from easycv.models import ResNet
>>> import torch
>>> self = ResNet(depth=18)
>>> self.eval()
>>> inputs = torch.rand(1, 3, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 64, 8, 8)
(1, 128, 4, 4)
(1, 256, 2, 2)
(1, 512, 1, 1)
arch_settings = {18: (<class 'easycv.models.backbones.resnet_jit.BasicBlock'>, (2, 2, 2, 2)), 34: (<class 'easycv.models.backbones.resnet_jit.BasicBlock'>, (3, 4, 6, 3)), 50: (<class 'easycv.models.backbones.resnet_jit.Bottleneck'>, (3, 4, 6, 3)), 101: (<class 'easycv.models.backbones.resnet_jit.Bottleneck'>, (3, 4, 23, 3)), 152: (<class 'easycv.models.backbones.resnet_jit.Bottleneck'>, (3, 8, 36, 3))}
__init__(depth, in_channels=3, num_stages=4, strides=(1, 2, 2, 2), dilations=(1, 1, 1, 1), out_indices=(0, 1, 2, 3, 4), style='pytorch', frozen_stages=- 1, conv_cfg=None, norm_cfg={'requires_grad': True, 'type': 'BN'}, norm_eval=False, with_cp=False, zero_init_residual=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property norm1
init_weights()[source]
training: bool
forward(x: torch.Tensor) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

train(mode=True)[source]

Sets the module in training mode.

This has any effect only on certain modules. See documentations of particular modules for details of their behaviors in training/evaluation mode, if they are affected, e.g. Dropout, BatchNorm, etc.

Parameters

mode (bool) – whether to set training mode (True) or evaluation mode (False). Default: True.

Returns

self

Return type

Module

easycv.models.backbones.resnext module
class easycv.models.backbones.resnext.Bottleneck(inplanes, planes, groups=1, base_width=4, **kwargs)[source]

Bases: easycv.models.backbones.resnet.Bottleneck

__init__(inplanes, planes, groups=1, base_width=4, **kwargs)[source]

Bottleneck block for ResNeXt. If style is “pytorch”, the stride-two layer is the 3x3 conv layer, if it is “caffe”, the stride-two layer is the first 1x1 conv layer.

training: bool
easycv.models.backbones.resnext.make_res_layer(block, inplanes, planes, blocks, stride=1, dilation=1, groups=1, base_width=4, style='pytorch', with_cp=False, conv_cfg=None, norm_cfg={'type': 'BN'})[source]
class easycv.models.backbones.resnext.ResNeXt(groups=1, base_width=4, **kwargs)[source]

Bases: easycv.models.backbones.resnet.ResNet

ResNeXt backbone.

Parameters
  • depth (int) – Depth of resnet, from {18, 34, 50, 101, 152}.

  • in_channels (int) – Number of input image channels. Normally 3.

  • num_stages (int) – Resnet stages, normally 4.

  • groups (int) – Group of resnext.

  • base_width (int) – Base width of resnext.

  • strides (Sequence[int]) – Strides of the first block of each stage.

  • dilations (Sequence[int]) – Dilation of each stage.

  • out_indices (Sequence[int]) – Output from which stages.

  • style (str) – pytorch or caffe. If set to “pytorch”, the stride-two layer is the 3x3 conv layer, otherwise the stride-two layer is the first 1x1 conv layer.

  • frozen_stages (int) – Stages to be frozen (all param fixed). -1 means not freezing any parameters.

  • norm_cfg (dict) – dictionary to construct and config norm layer.

  • norm_eval (bool) – Whether to set norm layers to eval mode, namely, freeze running stats (mean and var). Note: Effect on Batch Norm and its variants only.

  • with_cp (bool) – Use checkpoint or not. Using checkpoint will save some memory while slowing down the training speed.

  • zero_init_residual (bool) – whether to use zero init for last norm layer in resblocks to let them behave as identity.

Example

>>> from easycv.models import ResNeXt
>>> import torch
>>> self = ResNeXt(depth=50)
>>> self.eval()
>>> inputs = torch.rand(1, 3, 32, 32)
>>> level_outputs = self.forward(inputs)
>>> for level_out in level_outputs:
...     print(tuple(level_out.shape))
(1, 256, 8, 8)
(1, 512, 4, 4)
(1, 1024, 2, 2)
(1, 2048, 1, 1)
arch_settings = {50: (<class 'easycv.models.backbones.resnext.Bottleneck'>, (3, 4, 6, 3)), 101: (<class 'easycv.models.backbones.resnext.Bottleneck'>, (3, 4, 23, 3)), 152: (<class 'easycv.models.backbones.resnext.Bottleneck'>, (3, 8, 36, 3))}
__init__(groups=1, base_width=4, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool
easycv.models.backbones.shuffle_transformer module
class easycv.models.backbones.shuffle_transformer.Mlp(in_features, hidden_features=None, out_features=None, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, drop=0.0, stride=False)[source]

Bases: torch.nn.modules.module.Module

__init__(in_features, hidden_features=None, out_features=None, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, drop=0.0, stride=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.shuffle_transformer.Attention(dim, num_heads, window_size=1, shuffle=False, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0, relative_pos_embedding=False)[source]

Bases: torch.nn.modules.module.Module

__init__(dim, num_heads, window_size=1, shuffle=False, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0, relative_pos_embedding=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.shuffle_transformer.Block(dim, out_dim, num_heads, window_size=1, shuffle=False, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, stride=False, relative_pos_embedding=False)[source]

Bases: torch.nn.modules.module.Module

__init__(dim, out_dim, num_heads, window_size=1, shuffle=False, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, stride=False, relative_pos_embedding=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.shuffle_transformer.PatchMerging(dim, out_dim, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>)[source]

Bases: torch.nn.modules.module.Module

__init__(dim, out_dim, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

extra_repr() → str[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

training: bool
class easycv.models.backbones.shuffle_transformer.StageModule(layers, dim, out_dim, num_heads, window_size=1, shuffle=True, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, relative_pos_embedding=False)[source]

Bases: torch.nn.modules.module.Module

__init__(layers, dim, out_dim, num_heads, window_size=1, shuffle=True, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.ReLU6'>, norm_layer=<class 'torch.nn.modules.batchnorm.BatchNorm2d'>, relative_pos_embedding=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.shuffle_transformer.PatchEmbedding(inter_channel=32, out_channels=48)[source]

Bases: torch.nn.modules.module.Module

__init__(inter_channel=32, out_channels=48)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.shuffle_transformer.ShuffleTransformer(img_size=224, in_chans=3, num_classes=1000, token_dim=32, embed_dim=96, mlp_ratio=4.0, layers=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], relative_pos_embedding=True, shuffle=True, window_size=7, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, has_pos_embed=False, **kwargs)[source]

Bases: torch.nn.modules.module.Module

__init__(img_size=224, in_chans=3, num_classes=1000, token_dim=32, embed_dim=96, mlp_ratio=4.0, layers=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], relative_pos_embedding=True, shuffle=True, window_size=7, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, has_pos_embed=False, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
no_weight_decay()[source]
no_weight_decay_keywords()[source]
get_classifier()[source]
reset_classifier(num_classes, global_pool='')[source]
forward_features(x)[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.shuffle_transformer.shuffletrans_base_p4_w7_224(pretrained=False, **kwargs)[source]
easycv.models.backbones.shuffle_transformer.shuffletrans_small_p4_w7_224(pretrained=False, **kwargs)[source]
easycv.models.backbones.shuffle_transformer.shuffletrans_tiny_p4_w7_224(pretrained=False, **kwargs)[source]
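
Example (an illustrative sketch of the factory functions above; pretrained=False is assumed to build the model without loading pretrained weights):

>>> from easycv.models.backbones.shuffle_transformer import shuffletrans_tiny_p4_w7_224
>>> import torch
>>> model = shuffletrans_tiny_p4_w7_224(pretrained=False)
>>> model.eval()
>>> out = model(torch.rand(1, 3, 224, 224))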
easycv.models.backbones.swin_transformer_dynamic module

This code is borrowed from https://github.com/microsoft/esvit/blob/main/models/swin_transformer.py to support dynamic swin-transformer for SSL.

class easycv.models.backbones.swin_transformer_dynamic.WindowAttention(dim, window_size, num_heads, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]

Bases: torch.nn.modules.module.Module

Window based multi-head self attention (W-MSA) module with relative position bias. It supports both of shifted and non-shifted window.

Parameters
  • dim (int) – Number of input channels.

  • window_size (tuple[int]) – The height and width of the window.

  • num_heads (int) – Number of attention heads.

  • qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True

  • qk_scale (float | None, optional) – Override default qk scale of head_dim ** -0.5 if set

  • attn_drop (float, optional) – Dropout ratio of attention weight. Default: 0.0

  • proj_drop (float, optional) – Dropout ratio of output. Default: 0.0

__init__(dim, window_size, num_heads, qkv_bias=True, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, mask=None)[source]
Parameters
  • x – input features with shape of (num_windows*B, N, C)

  • mask – (0/-inf) mask with shape of (num_windows, Wh*Ww, Wh*Ww) or None

extra_repr() → str[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

flops(N)[source]
static compute_macs(module, input, output)[source]
training: bool
class easycv.models.backbones.swin_transformer_dynamic.SwinTransformerBlock(dim, input_resolution, num_heads, window_size=7, shift_size=0, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]

Bases: torch.nn.modules.module.Module

Swin Transformer Block.

Parameters
  • dim (int) – Number of input channels.

  • input_resolution (tuple[int]) – Input resolution.

  • num_heads (int) – Number of attention heads.

  • window_size (int) – Window size.

  • shift_size (int) – Shift size for SW-MSA.

  • mlp_ratio (float) – Ratio of mlp hidden dim to embedding dim.

  • qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True

  • qk_scale (float | None, optional) – Override default qk scale of head_dim ** -0.5 if set.

  • drop (float, optional) – Dropout rate. Default: 0.0

  • attn_drop (float, optional) – Attention dropout rate. Default: 0.0

  • drop_path (float, optional) – Stochastic depth rate. Default: 0.0

  • act_layer (nn.Module, optional) – Activation layer. Default: nn.GELU

  • norm_layer (nn.Module, optional) – Normalization layer. Default: nn.LayerNorm

__init__(dim, input_resolution, num_heads, window_size=7, shift_size=0, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

create_attn_mask(H, W)[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

extra_repr() → str[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

flops()[source]
training: bool
class easycv.models.backbones.swin_transformer_dynamic.PatchMerging(input_resolution, dim, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]

Bases: torch.nn.modules.module.Module

Patch Merging Layer.

Parameters
  • input_resolution (tuple[int]) – Resolution of input feature.

  • dim (int) – Number of input channels.

  • norm_layer (nn.Module, optional) – Normalization layer. Default: nn.LayerNorm

__init__(input_resolution, dim, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Forward function.

Parameters
  • x – Input feature, tensor size (B, H*W, C).

  • H – Spatial resolution of the input feature.

  • W – Spatial resolution of the input feature.

extra_repr() → str[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

flops()[source]
training: bool
class easycv.models.backbones.swin_transformer_dynamic.BasicLayer(dim, input_resolution, depth, num_heads, window_size, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, downsample=None)[source]

Bases: torch.nn.modules.module.Module

A basic Swin Transformer layer for one stage.

Parameters
  • dim (int) – Number of input channels.

  • input_resolution (tuple[int]) – Input resolution.

  • depth (int) – Number of blocks.

  • num_heads (int) – Number of attention heads.

  • window_size (int) – Window size.

  • mlp_ratio (float) – Ratio of mlp hidden dim to embedding dim.

  • qkv_bias (bool, optional) – If True, add a learnable bias to query, key, value. Default: True

  • qk_scale (float | None, optional) – Override default qk scale of head_dim ** -0.5 if set.

  • drop (float, optional) – Dropout rate. Default: 0.0

  • attn_drop (float, optional) – Attention dropout rate. Default: 0.0

  • drop_path (float | tuple[float], optional) – Stochastic depth rate. Default: 0.0

  • norm_layer (nn.Module, optional) – Normalization layer. Default: nn.LayerNorm

  • downsample (nn.Module | None, optional) – Downsample layer at the end of the layer. Default: None

__init__(dim, input_resolution, depth, num_heads, window_size, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, downsample=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_with_features(x)[source]
forward_with_attention(x)[source]
extra_repr() → str[source]

Set the extra representation of the module

To print customized extra information, you should re-implement this method in your own modules. Both single-line and multi-line strings are acceptable.

flops()[source]
training: bool
class easycv.models.backbones.swin_transformer_dynamic.PatchEmbed(img_size=224, patch_size=16, in_chans=3, embed_dim=768, norm_layer=None)[source]

Bases: torch.nn.modules.module.Module

Image to Patch Embedding

__init__(img_size=224, patch_size=16, in_chans=3, embed_dim=768, norm_layer=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

flops()[source]
training: bool
class easycv.models.backbones.swin_transformer_dynamic.DynamicSwinTransformer(img_size=224, patch_size=4, in_chans=3, num_classes=1000, embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], window_size=7, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.1, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, ape=False, patch_norm=True, use_dense_prediction=False, **kwargs)[source]

Bases: torch.nn.modules.module.Module

Swin Transformer
A PyTorch impl of Swin Transformer: Hierarchical Vision Transformer using Shifted Windows -

https://arxiv.org/pdf/2103.14030

Parameters
  • img_size (int | tuple(int)) – Input image size.

  • patch_size (int | tuple(int)) – Patch size.

  • in_chans (int) – Number of input channels.

  • num_classes (int) – Number of classes for classification head.

  • embed_dim (int) – Embedding dimension.

  • depths (tuple(int)) – Depth of Swin Transformer layers.

  • num_heads (tuple(int)) – Number of attention heads in different layers.

  • window_size (int) – Window size.

  • mlp_ratio (float) – Ratio of mlp hidden dim to embedding dim.

  • qkv_bias (bool) – If True, add a learnable bias to query, key, value. Default: True

  • qk_scale (float) – Override default qk scale of head_dim ** -0.5 if set.

  • drop_rate (float) – Dropout rate.

  • attn_drop_rate (float) – Attention dropout rate.

  • drop_path_rate (float) – Stochastic depth rate.

  • norm_layer (nn.Module) – normalization layer.

  • ape (bool) – If True, add absolute position embedding to the patch embedding.

  • patch_norm (bool) – If True, add normalization after patch embedding.

__init__(img_size=224, patch_size=4, in_chans=3, num_classes=1000, embed_dim=96, depths=[2, 2, 6, 2], num_heads=[3, 6, 12, 24], window_size=7, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.1, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, ape=False, patch_norm=True, use_dense_prediction=False, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
no_weight_decay()[source]
no_weight_decay_keywords()[source]
forward_features(x)[source]
forward_feature_maps(x)[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_selfattention(x, n=1)[source]
forward_last_selfattention(x)[source]
forward_all_selfattention(x)[source]
forward_return_n_last_blocks(x, n=1, return_patch_avgpool=False, depth=[])[source]
flops()[source]
freeze_pretrained_layers(frozen_layers=[])[source]
training: bool
easycv.models.backbones.swin_transformer_dynamic.dynamic_swin_tiny_p4_w7_224(pretrained=False, **kwargs)[source]
easycv.models.backbones.swin_transformer_dynamic.dynamic_swin_small_p4_w7_224(pretrained=False, **kwargs)[source]
easycv.models.backbones.swin_transformer_dynamic.dynamic_swin_base_p4_w7_224(pretrained=False, **kwargs)[source]
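
A hedged usage sketch of these factory functions as standalone backbones. The import path and the factory signature come from the listing above; the input size is arbitrary, and the exact layout of the returned features (plain tensor vs. list, pooled vs. dense) depends on use_dense_prediction and is not asserted here.

import torch
from easycv.models.backbones.swin_transformer_dynamic import dynamic_swin_tiny_p4_w7_224

model = dynamic_swin_tiny_p4_w7_224(pretrained=False)   # Swin-T configuration
model.eval()

x = torch.randn(2, 3, 224, 224)        # (B, C, H, W)
with torch.no_grad():
    out = model(x)                     # pooled features or logits, depending on configuration
print(type(out))
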
easycv.models.backbones.vit_transformer_dynamic module

Mostly copy-paste from timm library. https://github.com/rwightman/pytorch-image-models/blob/master/timm/models/vision_transformer.py

Dynamic input support borrowed from https://github.com/microsoft/esvit/blob/main/models/vision_transformer.py

class easycv.models.backbones.vit_transformer_dynamic.DynamicVisionTransformer(use_dense_prediction=False, **kwargs)[source]

Bases: easycv.models.backbones.vision_transformer.VisionTransformer

Dynamic Vision Transformer

Parameters

use_dense_prediction (bool) – If True, the global pooling and the norm layer before the head (if any) will be removed. Default: False

__init__(use_dense_prediction=False, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

forward_features(x)[source]
forward_feature_maps(x)[source]
interpolate_pos_encoding(x, pos_embed)[source]
forward_selfattention(x, n=1)[source]
forward_last_selfattention(x)[source]
forward_all_selfattention(x)[source]
forward_return_n_last_blocks(x, n=1, return_patch_avgpool=False, depths=[])[source]
training: bool
easycv.models.backbones.vit_transformer_dynamic.dynamic_deit_tiny_p16(patch_size=16, **kwargs)[source]
easycv.models.backbones.vit_transformer_dynamic.dynamic_deit_small_p16(patch_size=16, **kwargs)[source]
easycv.models.backbones.vit_transformer_dynamic.dynamic_vit_base_p16(patch_size=16, **kwargs)[source]
easycv.models.backbones.vit_transformer_dynamic.dynamic_vit_large_p16(patch_size=16, **kwargs)[source]
easycv.models.backbones.vit_transformer_dynamic.dynamic_vit_huge_p14(patch_size=14, **kwargs)[source]
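
The "dynamic" in these factories refers to the variable-resolution support provided by interpolate_pos_encoding. A hedged sketch follows (import path and factory names from the listing above; note these factories take no pretrained argument, and the shape of the returned feature is not asserted):

import torch
from easycv.models.backbones.vit_transformer_dynamic import dynamic_vit_base_p16

model = dynamic_vit_base_p16()
model.eval()

# thanks to interpolate_pos_encoding, inputs of different spatial sizes can be fed
for size in (224, 288):
    x = torch.randn(1, 3, size, size)
    with torch.no_grad():
        feat = model(x)
    print(size, getattr(feat, 'shape', type(feat)))
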
easycv.models.backbones.xcit_transformer module

Implementation of Cross-Covariance Image Transformer (XCiT). Based on the timm and DeiT code bases: https://github.com/rwightman/pytorch-image-models/tree/master/timm and https://github.com/facebookresearch/deit/

XCiT Transformer. Part of the code is borrowed from: https://github.com/facebookresearch/xcit/blob/master/xcit.py

class easycv.models.backbones.xcit_transformer.PositionalEncodingFourier(hidden_dim=32, dim=768, temperature=10000)[source]

Bases: torch.nn.modules.module.Module

Positional encoding relying on a Fourier kernel matching the one used in the “Attention Is All You Need” paper. The implementation builds on the DETR code https://github.com/facebookresearch/detr/blob/master/models/position_encoding.py

__init__(hidden_dim=32, dim=768, temperature=10000)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(B, H, W)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.xcit_transformer.conv3x3(in_planes, out_planes, stride=1)[source]

3x3 convolution with padding

class easycv.models.backbones.xcit_transformer.ConvPatchEmbed(img_size=224, patch_size=16, in_chans=3, embed_dim=768)[source]

Bases: torch.nn.modules.module.Module

Image to Patch Embedding using multiple convolutional layers

__init__(img_size=224, patch_size=16, in_chans=3, embed_dim=768)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, padding_size=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.xcit_transformer.LPI(in_features, hidden_features=None, out_features=None, act_layer=<class 'torch.nn.modules.activation.GELU'>, drop=0.0, kernel_size=3)[source]

Bases: torch.nn.modules.module.Module

Local Patch Interaction module that allows explicit communication between tokens in 3x3 windows to augment the implicit communication performed by the block-diagonal scatter attention. Implemented using 2 layers of separable 3x3 convolutions with GELU and BatchNorm2d.

__init__(in_features, hidden_features=None, out_features=None, act_layer=<class 'torch.nn.modules.activation.GELU'>, drop=0.0, kernel_size=3)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, H, W)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.xcit_transformer.ClassAttention(dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]

Bases: torch.nn.modules.module.Module

Class Attention Layer as in CaiT https://arxiv.org/abs/2103.17239

__init__(dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.xcit_transformer.ClassAttentionBlock(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, eta=None, tokens_norm=False)[source]

Bases: torch.nn.modules.module.Module

Class Attention Layer as in CaiT https://arxiv.org/abs/2103.17239

__init__(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, eta=None, tokens_norm=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, H, W, mask=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.xcit_transformer.XCA(dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]

Bases: torch.nn.modules.module.Module

Cross-Covariance Attention (XCA) operation where the channels are updated using a weighted sum.

The weights are obtained from the (softmax normalized) Cross-covariance matrix (Q^T K in d_h times d_h)

__init__(dim, num_heads=8, qkv_bias=False, qk_scale=None, attn_drop=0.0, proj_drop=0.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

no_weight_decay()[source]
training: bool
class easycv.models.backbones.xcit_transformer.XCABlock(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, num_tokens=196, eta=None)[source]

Bases: torch.nn.modules.module.Module

__init__(dim, num_heads, mlp_ratio=4.0, qkv_bias=False, qk_scale=None, drop=0.0, attn_drop=0.0, drop_path=0.0, act_layer=<class 'torch.nn.modules.activation.GELU'>, norm_layer=<class 'torch.nn.modules.normalization.LayerNorm'>, num_tokens=196, eta=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x, H, W)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.backbones.xcit_transformer.XCiT(img_size=224, patch_size=16, in_chans=3, num_classes=1000, embed_dim=768, depth=12, num_heads=12, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, norm_layer=None, cls_attn_layers=2, use_pos=True, patch_proj='linear', eta=None, tokens_norm=False)[source]

Bases: torch.nn.modules.module.Module

Based on timm and DeiT code bases https://github.com/rwightman/pytorch-image-models/tree/master/timm https://github.com/facebookresearch/deit/

__init__(img_size=224, patch_size=16, in_chans=3, num_classes=1000, embed_dim=768, depth=12, num_heads=12, mlp_ratio=4.0, qkv_bias=True, qk_scale=None, drop_rate=0.0, attn_drop_rate=0.0, drop_path_rate=0.0, norm_layer=None, cls_attn_layers=2, use_pos=True, patch_proj='linear', eta=None, tokens_norm=False)[source]
Parameters
  • img_size (int, tuple) – input image size

  • patch_size (int, tuple) – patch size

  • in_chans (int) – number of input channels

  • num_classes (int) – number of classes for classification head

  • embed_dim (int) – embedding dimension

  • depth (int) – depth of transformer

  • num_heads (int) – number of attention heads

  • mlp_ratio (int) – ratio of mlp hidden dim to embedding dim

  • qkv_bias (bool) – enable bias for qkv if True

  • qk_scale (float) – override default qk scale of head_dim ** -0.5 if set

  • drop_rate (float) – dropout rate

  • attn_drop_rate (float) – attention dropout rate

  • drop_path_rate (float) – stochastic depth rate

  • norm_layer – (nn.Module): normalization layer

  • cls_attn_layers – (int) Depth of Class attention layers

  • use_pos – (bool) whether to use positional encoding

  • eta – (float) layerscale initialization value

  • tokens_norm – (bool) Whether to normalize all tokens or just the cls_token in the CA

init_weights()[source]
no_weight_decay()[source]
forward_features(x)[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.backbones.xcit_transformer.xcit_small_12_p16(pretrained=False, **kwargs)[source]
easycv.models.backbones.xcit_transformer.xcit_small_24_p16(pretrained=False, **kwargs)[source]
easycv.models.backbones.xcit_transformer.xcit_medium_24_p16(pretrained=False, **kwargs)[source]
easycv.models.backbones.xcit_transformer.xcit_small_12_p8(pretrained=False, **kwargs)[source]
easycv.models.backbones.xcit_transformer.xcit_small_24_p8(pretrained=False, **kwargs)[source]
easycv.models.backbones.xcit_transformer.xcit_medium_24_p8(pretrained=False, **kwargs)[source]
easycv.models.backbones.xcit_transformer.xcit_large_24_p8(pretrained=False, **kwargs)[source]
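
A hedged sketch of the XCiT factories above used as a backbone (factory name and signature from the listing; the layout of the forward output is not asserted):

import torch
from easycv.models.backbones.xcit_transformer import xcit_small_12_p16

model = xcit_small_12_p16(pretrained=False)
model.eval()

x = torch.randn(1, 3, 224, 224)
with torch.no_grad():
    out = model(x)                     # features or logits, depending on the configured head
print(type(out))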

easycv.models.classification package

Submodules
easycv.models.classification.classification module
class easycv.models.classification.classification.Classification(backbone, train_preprocess=[], with_sobel=False, head=None, neck=None, pretrained=True, mixup_cfg=None)[source]

Bases: easycv.models.base.BaseModel

Parameters
  • pretrained – Select one of {str, True, False/None}. If pretrained is a str, load the model from the specified path; if pretrained == True, load the model from the default path; if pretrained == False or None, load from init weights.

__init__(backbone, train_preprocess=[], with_sobel=False, head=None, neck=None, pretrained=True, mixup_cfg=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward_backbone(img: torch.Tensor) → List[torch.Tensor][source]

Forward backbone

Returns

backbone outputs

Return type

x (tuple)

forward_train(img, gt_labels) → Dict[str, torch.Tensor][source]

In forward_train, the model forwards backbone + neck / multi-neck to get a list of output tensors, then passes this list to the head / multi-head to compute each loss.

forward_test(img: torch.Tensor) → Dict[str, torch.Tensor][source]

forward_test generates prob/class predictions from an image; only one neck + one head is supported.

forward_test_label(img, gt_labels) → Dict[str, torch.Tensor][source]

forward_test_label generates prob/class predictions from an image; only one neck + one head is supported. Note: the head init needs to set the input feature idx.

training: bool
aug_test(imgs)[source]
forward_feature(img) → Dict[str, torch.Tensor][source]

forward_feature forwards backbone + neck / multi-neck and returns a dict of output features.

self.neck_num == 0: only the backbone is forwarded, and the avg-pooled backbone feature is returned with key neck; self.neck_num > 0: with one or multiple necks, each neck's feature is returned with key neck_{neckidx}_{featureidx}, such as neck_0_0.

Returns

feature tensor

Return type

x (torch.Tensor)

update_extract_list(key)[source]
forward(img: torch.Tensor, mode: str = 'train', gt_labels: Optional[torch.Tensor] = None, img_metas: Optional[torch.Tensor] = None) → Dict[str, torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
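
To make the mode dispatch concrete, here is a hedged construction-and-call sketch. The config-dict style and the 'ResNet' / 'ClsHead' type names follow common EasyCV configs but are assumptions here, not taken from this section; only the forward signature above is.

import torch
from easycv.models.classification.classification import Classification

# hypothetical backbone/head configs in the usual EasyCV registry style (assumed)
model = Classification(
    backbone=dict(type='ResNet', depth=50, out_indices=[4], norm_cfg=dict(type='BN')),
    head=dict(type='ClsHead', with_avg_pool=True, in_channels=2048, num_classes=1000),
    pretrained=False)

img = torch.randn(2, 3, 224, 224)
gt_labels = torch.randint(0, 1000, (2,))

train_out = model(img, mode='train', gt_labels=gt_labels)   # dict of losses
model.eval()
with torch.no_grad():
    test_out = model(img, mode='test')                      # dict of prob/class predictions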

easycv.models.classification.necks module
class easycv.models.classification.necks.LinearNeck(in_channels, out_channels, with_avg_pool=True, with_norm=False)[source]

Bases: torch.nn.modules.module.Module

Linear neck: fc only

__init__(in_channels, out_channels, with_avg_pool=True, with_norm=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
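
A hedged sketch of the neck in isolation. The list-in / list-out convention is an assumption based on the feature_list design used by the EasyCV heads; the output dimensionality follows the in/out channels given to the constructor.

import torch
from easycv.models.classification.necks import LinearNeck

neck = LinearNeck(in_channels=2048, out_channels=512, with_avg_pool=True)
neck.init_weights()

feats = [torch.randn(4, 2048, 7, 7)]   # backbone feature map(s), list-in (assumed)
out = neck(feats)                      # projected feature(s), list-out (assumed)
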
class easycv.models.classification.necks.RetrivalNeck(in_channels, out_channels, with_avg_pool=True, cdg_config=['G', 'M'])[source]

Bases: torch.nn.modules.module.Module

RetrivalNeck: refers to Combination of Multiple Global Descriptors for Image Retrieval

https://arxiv.org/pdf/1903.10663.pdf

CGD feature: uses avg pooling + GeM pooling + max pooling, via pool -> fc -> norm -> concat -> norm.
Avg feature: uses avg pooling, via avg pool -> syncbn -> fc.

If len(cgd_config) > 0: return [CGD, Avg]; if len(cgd_config) == 0: return [Avg].

__init__(in_channels, out_channels, with_avg_pool=True, cdg_config=['G', 'M'])[source]

Init RetrivalNeck; this neck doesn't pool the input feature map and doesn't support dynamic input.

Parameters
  • in_channels – Int - input feature map channels

  • out_channels – Int - output feature map channels

  • with_avg_pool – bool, do avg pool for BNNeck or not

  • cdg_config – list('G', 'M', 'S') to configure the output feature; CGD = [gempooling] + [maxpooling] + [meanpooling]. If len(cdg_config) > 0: return [CGD, Avg]; if len(cdg_config) == 0: return [Avg]

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.classification.necks.FaceIDNeck(in_channels, out_channels, map_shape=1, dropout_ratio=0.4, with_norm=False, bn_type='SyncBN')[source]

Bases: torch.nn.modules.module.Module

FaceID neck: Include BN, dropout, flatten, linear, bn

__init__(in_channels, out_channels, map_shape=1, dropout_ratio=0.4, with_norm=False, bn_type='SyncBN')[source]

Init FaceIDNeck; the FaceID neck doesn't pool the input feature map and doesn't support dynamic input.

Parameters
  • in_channels – Int - input feature map channels

  • out_channels – Int - output feature map channels

  • map_shape – Int or list(int,…), input feature map (w,h) or w when w=h,

  • dropout_ratio – float, drop out ratio

  • with_norm – normalize output feature or not

  • bn_type – SyncBN or BN

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.classification.necks.MultiLinearNeck(in_channels, out_channels, num_layers=1, with_avg_pool=True)[source]

Bases: torch.nn.modules.module.Module

MultiLinearNeck: multi-fc neck

__init__(in_channels, out_channels, num_layers=1, with_avg_pool=True)[source]
Parameters
  • in_channels – int or list[int]

  • out_channels – int or list[int]

  • num_layers – total fc num

  • with_avg_pool – input will be avgPool if True

Returns

None

Raises
  • if len(in_channels) != len(out_channels)

  • if len(in_channels) != num_layers

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.classification.necks.HRFuseScales(in_channels, out_channels=2048, norm_cfg={'momentum': 0.1, 'type': 'BN'})[source]

Bases: torch.nn.modules.module.Module

Fuse feature maps of multiple scales in HRNet.

Parameters
  • in_channels (list[int]) – The input channels of all scales.

  • out_channels (int) – The channels of the fused feature map. Defaults to 2048.

  • norm_cfg (dict) – dictionary to construct norm layers. Defaults to dict(type='BN', momentum=0.1).

  • init_cfg (dict | list[dict], optional) – Initialization config dict. Defaults to dict(type='Normal', layer='Linear', std=0.01).

__init__(in_channels, out_channels=2048, norm_cfg={'momentum': 0.1, 'type': 'BN'})[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.classification.necks.ReIDNeck(in_channels, dropout, relu=False, norm=True, out_channels=512)[source]

Bases: torch.nn.modules.module.Module

ReID neck: Include Linear, bn, relu, dropout

__init__(in_channels, dropout, relu=False, norm=True, out_channels=512)[source]

Init ReIDNeck; this neck doesn't pool the input feature map and doesn't support dynamic input.

Parameters
  • in_channels – Int - input feature map channels

  • out_channels – Int - output feature map channels

  • dropout – float, dropout ratio

  • relu – bool, whether to apply a ReLU activation

  • norm – bool, whether to normalize the output feature

training: bool
init_weights(init_linear='kaiming')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

easycv.models.detection package

Subpackages
easycv.models.detection.utils package
Submodules
easycv.models.detection.utils.boxes module
easycv.models.detection.utils.boxes.bboxes_iou(bboxes_a, bboxes_b, xyxy=True)[source]
easycv.models.detection.utils.boxes.bbox2result(bboxes, labels, num_classes)[source]

Convert detection results to a list of numpy arrays.

Parameters
  • bboxes (torch.Tensor | np.ndarray) – shape (n, 5)

  • labels (torch.Tensor | np.ndarray) – shape (n, )

  • num_classes (int) – class number, including background class

Returns

bbox results of each class

Return type

list(ndarray)
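
A hedged example of the conversion; the grouping of boxes by label into per-class arrays follows the common mmdetection-style convention and is assumed here.

import torch
from easycv.models.detection.utils.boxes import bbox2result

bboxes = torch.tensor([[0., 0., 10., 10., 0.9],
                       [5., 5., 20., 20., 0.6]])   # (n, 5): x1, y1, x2, y2, score
labels = torch.tensor([0, 2])                      # (n,)
result = bbox2result(bboxes, labels, num_classes=4)
# result is a list of (k, 5) numpy arrays, one per class, with boxes grouped by their label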

easycv.models.detection.utils.boxes.box_cxcywh_to_xyxy(x)[source]
easycv.models.detection.utils.boxes.box_xyxy_to_cxcywh(x)[source]
easycv.models.detection.utils.boxes.box_iou(boxes1, boxes2)[source]
easycv.models.detection.utils.boxes.generalized_box_iou(boxes1, boxes2)[source]

Generalized IoU from https://giou.stanford.edu/ The boxes should be in [x0, y0, x1, y1] format Returns a [N, M] pairwise matrix, where N = len(boxes1) and M = len(boxes2)
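
A short hedged sketch combining the two conversion helpers with generalized_box_iou (function names from the listing above; the usual DETR-style semantics are assumed):

import torch
from easycv.models.detection.utils.boxes import box_cxcywh_to_xyxy, generalized_box_iou

# two boxes given as (cx, cy, w, h), converted to (x0, y0, x1, y1) before the IoU call
a = box_cxcywh_to_xyxy(torch.tensor([[5., 5., 10., 10.]]))
b = box_cxcywh_to_xyxy(torch.tensor([[10., 10., 10., 10.]]))
giou = generalized_box_iou(a, b)       # [1, 1] pairwise matrix, values in [-1, 1]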

easycv.models.detection.utils.boxes.bbox_overlaps(bboxes1, bboxes2, mode='iou', is_aligned=False, eps=1e-06)[source]

Calculate overlap between two set of bboxes.

FP16 Contributed by https://github.com/open-mmlab/mmdetection/pull/4889 .. note:

Assume bboxes1 is M x 4, bboxes2 is N x 4, when mode is 'iou',
there are some new generated variable when calculating IOU
using bbox_overlaps function:

1) is_aligned is False
    area1: M x 1
    area2: N x 1
    lt: M x N x 2
    rb: M x N x 2
    wh: M x N x 2
    overlap: M x N x 1
    union: M x N x 1
    ious: M x N x 1

    Total memory:
        S = (9 x N x M + N + M) * 4 Byte,

    When using FP16, we can reduce:
        R = (9 x N x M + N + M) * 4 / 2 Byte
        R larger than (N + M) * 4 * 2 is always true when N and M >= 1.
        Obviously, N + M <= N * M < 3 * N * M, when N >=2 and M >=2,
                   N + 1 < 3 * N, when N or M is 1.

    Given M = 40 (ground truth), N = 400000 (three anchor boxes
    in per grid, FPN, R-CNNs),
        R = 275 MB (one time)

    A special case (dense detection), M = 512 (ground truth),
        R = 3516 MB = 3.43 GB

    When the batch size is B, reduce:
        B x R

    Therefore, CUDA memory runs out frequently.

    Experiments on GeForce RTX 2080Ti (11019 MiB):

    |   dtype   |   M   |   N   |   Use    |   Real   |   Ideal   |
    |:----:|:----:|:----:|:----:|:----:|:----:|
    |   FP32   |   512 | 400000 | 8020 MiB |   --   |   --   |
    |   FP16   |   512 | 400000 |   4504 MiB | 3516 MiB | 3516 MiB |
    |   FP32   |   40 | 400000 |   1540 MiB |   --   |   --   |
    |   FP16   |   40 | 400000 |   1264 MiB |   276MiB   | 275 MiB |

2) is_aligned is True
    area1: N x 1
    area2: N x 1
    lt: N x 2
    rb: N x 2
    wh: N x 2
    overlap: N x 1
    union: N x 1
    ious: N x 1

    Total memory:
        S = 11 x N * 4 Byte

    When using FP16, we can reduce:
        R = 11 x N * 4 / 2 Byte

So does 'giou' (larger than 'iou').

Time-wise, FP16 is generally faster than FP32.

When gpu_assign_thr is not -1, it takes more time on cpu
but does not reduce memory.
Therefore, we can reduce half the memory and keep the speed.

If is_aligned is False, then calculate the overlaps between each bbox of bboxes1 and bboxes2, otherwise the overlaps between each aligned pair of bboxes1 and bboxes2.

Parameters
  • bboxes1 (Tensor) – shape (B, m, 4) in <x1, y1, x2, y2> format or empty.

  • bboxes2 (Tensor) – shape (B, n, 4) in <x1, y1, x2, y2> format or empty. B indicates the batch dim, in shape (B1, B2, …, Bn). If is_aligned is True, then m and n must be equal.

  • mode (str) – “iou” (intersection over union), “iof” (intersection over foreground) or “giou” (generalized intersection over union). Default “iou”.

  • is_aligned (bool, optional) – If True, then m and n must be equal. Default False.

  • eps (float, optional) – A value added to the denominator for numerical stability. Default 1e-6.

Returns

shape (m, n) if is_aligned is False else shape (m,)

Return type

Tensor

Example

>>> bboxes1 = torch.FloatTensor([
>>>     [0, 0, 10, 10],
>>>     [10, 10, 20, 20],
>>>     [32, 32, 38, 42],
>>> ])
>>> bboxes2 = torch.FloatTensor([
>>>     [0, 0, 10, 20],
>>>     [0, 10, 10, 19],
>>>     [10, 10, 20, 20],
>>> ])
>>> overlaps = bbox_overlaps(bboxes1, bboxes2)
>>> assert overlaps.shape == (3, 3)
>>> overlaps = bbox_overlaps(bboxes1, bboxes2, is_aligned=True)
>>> assert overlaps.shape == (3, )

Example

>>> empty = torch.empty(0, 4)
>>> nonempty = torch.FloatTensor([[0, 0, 10, 9]])
>>> assert tuple(bbox_overlaps(empty, nonempty).shape) == (0, 1)
>>> assert tuple(bbox_overlaps(nonempty, empty).shape) == (1, 0)
>>> assert tuple(bbox_overlaps(empty, empty).shape) == (0, 0)
easycv.models.detection.utils.boxes.bbox2distance(points, bbox, max_dis=None, eps=0.1)[source]

Compute the distances from the given points to the four bounding-box boundaries (the inverse of distance2bbox).

Parameters
  • points (Tensor) – Shape (n, 2), [x, y].

  • bbox (Tensor) – Shape (n, 4), “xyxy” format

  • max_dis (float) – Upper bound of the distance.

  • eps (float) – a small value to ensure target < max_dis, instead <=

Returns

Distances from the points to the four boundaries (left, top, right, bottom).

Return type

Tensor

easycv.models.detection.utils.boxes.distance2bbox(points, distance, max_shape=None)[source]

Decode distance prediction to bounding box.

Parameters
  • points (Tensor) – Shape (B, N, 2) or (N, 2).

  • distance (Tensor) – Distance from the given point to 4 boundaries (left, top, right, bottom). Shape (B, N, 4) or (N, 4)

  • max_shape (Sequence[int] or torch.Tensor or Sequence[Sequence[int]], optional) – Maximum bounds for boxes, specifies (H, W, C) or (H, W). If priors shape is (B, N, 4), then max_shape should be a Sequence[Sequence[int]] and the length of max_shape should also be B.

Returns

Boxes with shape (N, 4) or (B, N, 4)

Return type

Tensor
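
To illustrate how bbox2distance and distance2bbox relate, a hedged round-trip sketch (assuming the standard point-to-boundary encoding; with max_dis=None no clamping is applied, so the round trip should be exact for points inside their boxes):

import torch
from easycv.models.detection.utils.boxes import bbox2distance, distance2bbox

points = torch.tensor([[15., 15.], [30., 30.]])    # (n, 2)
bboxes = torch.tensor([[10., 10., 20., 20.],
                       [25., 25., 40., 35.]])      # (n, 4), xyxy

dist = bbox2distance(points, bboxes)   # (n, 4): left, top, right, bottom
recon = distance2bbox(points, dist)    # back to xyxy boxes
assert torch.allclose(recon, bboxes)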

easycv.models.detection.utils.boxes.batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False)[source]

Performs non-maximum suppression in a batched fashion.

Modified from torchvision/ops/boxes.py#L39. In order to perform NMS independently per class, we add an offset to all the boxes. The offset is dependent only on the class idx, and is large enough so that boxes from different classes do not overlap.

Note

In v1.4.1 and later, batched_nms supports skipping the NMS and returns sorted raw results when nms_cfg is None.

Parameters
  • boxes (torch.Tensor) – boxes in shape (N, 4).

  • scores (torch.Tensor) – scores in shape (N, ).

  • idxs (torch.Tensor) – each index value corresponds to a bbox cluster, and NMS will not be applied between elements of different idxs, shape (N, ).

  • nms_cfg (dict | None) –

    Supports skipping the nms when nms_cfg is None, otherwise it should specify nms type and other parameters like iou_thr. Possible keys include the following.

    • iou_thr (float): IoU threshold used for NMS.

    • split_thr (float): threshold number of boxes. In some cases the number of boxes is large (e.g., 200k). To avoid OOM during training, the users could set split_thr to a small value. If the number of boxes is greater than the threshold, it will perform NMS on each group of boxes separately and sequentially. Defaults to 10000.

  • class_agnostic (bool) – if true, nms is class agnostic, i.e. IoU thresholding happens over all boxes, regardless of the predicted class.

Returns

kept dets and indice.

  • boxes (Tensor): Bboxes with score after nms, has shape (num_bboxes, 5). last dimension 5 arrange as (x1, y1, x2, y2, score)

  • keep (Tensor): The indices of remaining boxes in input boxes.

Return type

tuple

easycv.models.detection.yolox package
Submodules
easycv.models.detection.yolox.yolo_head module
easycv.models.detection.yolox.yolo_pafpn module
easycv.models.detection.yolox.yolox module
easycv.models.detection.yolox_edge package
Submodules
easycv.models.detection.yolox_edge.yolox_edge module

easycv.models.heads package

Submodules
easycv.models.heads.cls_head module
class easycv.models.heads.cls_head.ClsHead(with_avg_pool=False, label_smooth=0.0, in_channels=2048, with_fc=True, num_classes=1000, loss_config={'type': 'CrossEntropyLossWithLabelSmooth'}, input_feature_index=[0], init_cfg={'bias': 0.0, 'layer': 'Linear', 'std': 0.01, 'type': 'Normal'}, use_num_classes=True)[source]

Bases: torch.nn.modules.module.Module

Simplest classifier head, with only one fc layer. Note that by EVTorch module design the input is always a feature list: feature_list = [tensor, tensor, …]

__init__(with_avg_pool=False, label_smooth=0.0, in_channels=2048, with_fc=True, num_classes=1000, loss_config={'type': 'CrossEntropyLossWithLabelSmooth'}, input_feature_index=[0], init_cfg={'bias': 0.0, 'layer': 'Linear', 'std': 0.01, 'type': 'Normal'}, use_num_classes=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward(x: List[torch.Tensor]) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss(cls_score: List[torch.Tensor], labels: torch.Tensor) → Dict[str, torch.Tensor][source]
Parameters
  • cls_score – [N x num_classes]

  • labels – if mixup is not used, shape is [N]; otherwise [N x num_classes]

mixup_loss(cls_score, labels_1, labels_2, lam) → Dict[str, torch.Tensor][source]
training: bool
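
A hedged sketch of using the head on its own. The feature_list convention comes from the class description above; it is assumed that the default loss_config builds without extra arguments, and the exact keys of the returned loss dict are not guaranteed here.

import torch
from easycv.models.heads.cls_head import ClsHead

head = ClsHead(with_avg_pool=False, in_channels=2048, num_classes=1000)
head.init_weights()

feats = [torch.randn(4, 2048)]             # feature_list = [tensor, ...]
cls_score = head(feats)                    # list of [N x num_classes] score tensors
labels = torch.randint(0, 1000, (4,))
losses = head.loss(cls_score, labels)      # dict of loss tensors
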
easycv.models.heads.contrastive_head module
class easycv.models.heads.contrastive_head.ContrastiveHead(temperature=0.1)[source]

Bases: torch.nn.modules.module.Module

Head for contrastive learning.

__init__(temperature=0.1)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pos, neg)[source]
Parameters
  • pos (Tensor) – Nx1 positive similarity

  • neg (Tensor) – Nxk negative similarity

training: bool
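
A hedged sketch of the head's expected inputs (shapes from the parameter list above; the key of the returned dict is an assumption):

import torch
from easycv.models.heads.contrastive_head import ContrastiveHead

head = ContrastiveHead(temperature=0.1)
pos = torch.randn(8, 1)        # N x 1 positive similarity
neg = torch.randn(8, 4096)     # N x k negative similarity
out = head(pos, neg)           # dict holding the contrastive loss, e.g. {'loss': ...} (assumed key)
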
class easycv.models.heads.contrastive_head.DebiasedContrastiveHead(temperature=0.1, tau=0.1)[source]

Bases: torch.nn.modules.module.Module

__init__(temperature=0.1, tau=0.1)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pos, neg)[source]
Parameters
  • pos (Tensor) – Nx1 positive similarity

  • neg (Tensor) – Nxk negative similarity

training: bool
easycv.models.heads.latent_pred_head module
class easycv.models.heads.latent_pred_head.LatentPredictHead(predictor, size_average=True)[source]

Bases: torch.nn.modules.module.Module

Head for contrastive learning.

__init__(predictor, size_average=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(input, target)[source]
Parameters
  • input (Tensor) – NxC input features.

  • target (Tensor) – NxC target features.

training: bool
class easycv.models.heads.latent_pred_head.LatentClsHead(predictor)[source]

Bases: torch.nn.modules.module.Module

Head for contrastive learning.

__init__(predictor)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(input, target)[source]
Parameters
  • input (Tensor) – NxC input features.

  • target (Tensor) – NxC target features.

training: bool
easycv.models.heads.mp_metric_head module
easycv.models.heads.mp_metric_head.EmbeddingExplansion(embs, labels, explanion_rate=4, alpha=1.0)[source]

Expand embeddings (CVPR, refer to https://github.com/clovaai/embedding-expansion): combine PK-sampled data and mixup anchor-positive pairs to generate more features; always combined with BatchHardMiner. Results on SOP and CUB still need to be added.

Parameters
  • embs – [N , dims] tensor

  • labels – [N] tensor

  • explanion_rate – to expand N to explanion_rate * N

  • alpha – beta distribution parameter for mixup

Returns

[N*explanion_rate , dims]

Return type

embs
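
A hedged call sketch (argument shapes from the parameter list above; labels are laid out PK-style with P=8 classes and K=4 samples per class, and the return value is kept opaque since only the expanded-embedding shape is documented):

import torch
from easycv.models.heads.mp_metric_head import EmbeddingExplansion

embs = torch.randn(32, 128)                        # [N, dims] PK-sampled embeddings
labels = torch.arange(8).repeat_interleave(4)      # [N], 4 samples per class
out = EmbeddingExplansion(embs, labels, explanion_rate=4, alpha=1.0)
# per the docstring, the expanded embeddings have shape [N * explanion_rate, dims]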

class easycv.models.heads.mp_metric_head.MpMetrixHead(with_avg_pool=False, in_channels=2048, loss_config=[{'type': 'CircleLoss', 'loss_weight': 1.0, 'norm': True, 'ddp': True, 'm': 0.4, 'gamma': 80}], input_feature_index=[0], input_label_index=0, ignore_label=None)[source]

Bases: torch.nn.modules.module.Module

Simplest classifier head, with only one fc layer.

__init__(with_avg_pool=False, in_channels=2048, loss_config=[{'type': 'CircleLoss', 'loss_weight': 1.0, 'norm': True, 'ddp': True, 'm': 0.4, 'gamma': 80}], input_feature_index=[0], input_label_index=0, ignore_label=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(pretrained=None, init_linear='normal', std=0.01, bias=0.0)[source]
forward(x: List[torch.Tensor]) → List[torch.Tensor][source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss(cls_score, labels) → Dict[str, torch.Tensor][source]
training: bool
easycv.models.heads.multi_cls_head module
class easycv.models.heads.multi_cls_head.MultiClsHead(pool_type='adaptive', in_indices=(0), with_last_layer_unpool=False, backbone='resnet50', norm_cfg={'type': 'BN'}, num_classes=1000)[source]

Bases: torch.nn.modules.module.Module

Multiple classifier heads.

FEAT_CHANNELS = {'resnet50': [64, 256, 512, 1024, 2048]}
FEAT_LAST_UNPOOL = {'resnet50': 100352}
__init__(pool_type='adaptive', in_indices=(0), with_last_layer_unpool=False, backbone='resnet50', norm_cfg={'type': 'BN'}, num_classes=1000)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

loss(cls_score, labels)[source]
training: bool

easycv.models.loss package

class easycv.models.loss.CrossEntropyLoss(use_sigmoid=False, use_mask=False, reduction='mean', class_weight=None, loss_weight=1.0, loss_name='loss_ce', avg_non_ignore=False, label_ceil=False)[source]

Bases: torch.nn.modules.module.Module

CrossEntropyLoss.

Parameters
  • use_sigmoid (bool, optional) – Whether the prediction uses sigmoid or softmax. Defaults to False.

  • use_mask (bool, optional) – Whether to use mask cross entropy loss. Defaults to False.

  • reduction (str, optional) – The method used to reduce the loss. Defaults to ‘mean’. Options are “none”, “mean” and “sum”.

  • class_weight (list[float] | str, optional) – Weight of each class. If in str format, read them from a file. Defaults to None.

  • loss_weight (float, optional) – Weight of the loss. Defaults to 1.0.

  • loss_name (str, optional) – Name of the loss item. If you want this loss item to be included into the backward graph, loss_ must be the prefix of the name. Defaults to ‘loss_ce’.

  • avg_non_ignore (bool) – The flag decides whether the loss is only averaged over non-ignored targets. Default: False. New in version 0.23.0.

  • label_ceil (bool) – When using BCE with label_ceil=True, label elements in (0, 1] will be set to 1. Default: False.

__init__(use_sigmoid=False, use_mask=False, reduction='mean', class_weight=None, loss_weight=1.0, loss_name='loss_ce', avg_non_ignore=False, label_ceil=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

extra_repr()[source]

Extra repr.

forward(cls_score, label, weight=None, avg_factor=None, reduction_override=None, ignore_index=- 100, **kwargs)[source]

Forward function.

property loss_name

Loss Name.

This function must be implemented and will return the name of this loss function. This name will be used to combine different loss items by simple sum operation. In addition, if you want this loss item to be included into the backward graph, loss_ must be the prefix of the name.

Returns

The name of this loss item.

Return type

str

training: bool
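
A minimal hedged usage sketch (default softmax path; shapes and the default ‘mean’ reduction follow the parameter descriptions above):

import torch
from easycv.models.loss import CrossEntropyLoss

criterion = CrossEntropyLoss(use_sigmoid=False, loss_weight=1.0)
cls_score = torch.randn(4, 10)             # raw logits
label = torch.randint(0, 10, (4,))
loss = criterion(cls_score, label)         # scalar tensor under the default 'mean' reduction
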
class easycv.models.loss.FacePoseLoss(pose_weight=1.0)[source]

Bases: torch.nn.modules.module.Module

__init__(pose_weight=1.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.WingLossWithPose(num_points=106, left_eye_left_corner_index=66, right_eye_right_corner_index=79, points_weight=1.0, contour_weight=1.5, eyebrow_weight=1.5, eye_weight=1.7, nose_weight=1.3, lip_weight=1.7, omega=10, epsilon=2)[source]

Bases: torch.nn.modules.module.Module

__init__(num_points=106, left_eye_left_corner_index=66, right_eye_right_corner_index=79, points_weight=1.0, contour_weight=1.5, eyebrow_weight=1.5, eye_weight=1.7, nose_weight=1.3, lip_weight=1.7, omega=10, epsilon=2)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target, pose)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.FocalLoss(use_sigmoid=True, gamma=2.0, alpha=0.25, reduction='mean', loss_weight=1.0, activated=False)[source]

Bases: torch.nn.modules.module.Module

__init__(use_sigmoid=True, gamma=2.0, alpha=0.25, reduction='mean', loss_weight=1.0, activated=False)[source]

Focal Loss

Parameters
  • use_sigmoid (bool, optional) – Whether the prediction is used for sigmoid or softmax. Defaults to True.

  • gamma (float, optional) – The gamma for calculating the modulating factor. Defaults to 2.0.

  • alpha (float, optional) – A balanced form for Focal Loss. Defaults to 0.25.

  • reduction (str, optional) – The method used to reduce the loss into a scalar. Defaults to ‘mean’. Options are “none”, “mean” and “sum”.

  • loss_weight (float, optional) – Weight of loss. Defaults to 1.0.

  • activated (bool, optional) – Whether the input is activated. If True, it means the input has been activated and can be treated as probabilities. Else, it should be treated as logits. Defaults to False.

forward(pred, target, weight=None, avg_factor=None, reduction_override=None)[source]

Forward function.

Parameters
  • pred (torch.Tensor) – The prediction.

  • target (torch.Tensor) – The learning label of the prediction.

  • weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.

  • avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.

  • reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”.

Returns

The calculated loss

Return type

torch.Tensor

training: bool
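
A hedged sketch of the sigmoid variant; integer class targets are assumed to be accepted, as in the common mmdetection-style focal loss.

import torch
from easycv.models.loss import FocalLoss

criterion = FocalLoss(use_sigmoid=True, gamma=2.0, alpha=0.25)
pred = torch.randn(8, 80)                  # logits for 80 classes
target = torch.randint(0, 80, (8,))        # integer class targets (assumed format)
loss = criterion(pred, target)
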
class easycv.models.loss.VarifocalLoss(use_sigmoid=True, alpha=0.75, gamma=2.0, iou_weighted=True, reduction='mean', loss_weight=1.0)[source]

Bases: torch.nn.modules.module.Module

__init__(use_sigmoid=True, alpha=0.75, gamma=2.0, iou_weighted=True, reduction='mean', loss_weight=1.0)[source]

Varifocal Loss

Parameters
  • use_sigmoid (bool, optional) – Whether the prediction is used for sigmoid or softmax. Defaults to True.
  • alpha (float, optional) – A balance factor for the negative part of Varifocal Loss, which is different from the alpha of Focal Loss. Defaults to 0.75.

  • gamma (float, optional) – The gamma for calculating the modulating factor. Defaults to 2.0.

  • iou_weighted (bool, optional) – Whether to weight the loss of the positive examples with the iou target. Defaults to True.

  • reduction (str, optional) – The method used to reduce the loss into a scalar. Defaults to ‘mean’. Options are “none”, “mean” and “sum”.

  • loss_weight (float, optional) – Weight of loss. Defaults to 1.0.

forward(pred, target, weight=None, avg_factor=None, reduction_override=None)[source]

Forward function.

Parameters
  • pred (torch.Tensor) – The prediction.

  • target (torch.Tensor) – The learning target of the prediction.

  • weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.
  • avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.

  • reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Options are “none”, “mean” and “sum”.

Returns

The calculated loss

Return type

torch.Tensor

training: bool
class easycv.models.loss.GIoULoss(eps=1e-06, reduction='mean', loss_weight=1.0)[source]

Bases: torch.nn.modules.module.Module

__init__(eps=1e-06, reduction='mean', loss_weight=1.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.IoULoss(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]

Bases: torch.nn.modules.module.Module

IoULoss.

Computing the IoU loss between a set of predicted bboxes and target bboxes.

Parameters
  • linear (bool) – If True, use linear scale of loss else determined by mode. Default: False.

  • eps (float) – Eps to avoid log(0).

  • reduction (str) – Options are “none”, “mean” and “sum”.

  • loss_weight (float) – Weight of loss.

  • mode (str) – Loss scaling mode, including “linear”, “square”, and “log”. Default: ‘log’

__init__(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]

Forward function.

Parameters
  • pred (torch.Tensor) – The prediction.

  • target (torch.Tensor) – The learning target of the prediction.

  • weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.

  • avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.

  • reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None. Options are “none”, “mean” and “sum”.

training: bool
class easycv.models.loss.YOLOX_IOULoss(reduction='none', loss_type='iou')[source]

Bases: torch.nn.modules.module.Module

__init__(reduction='none', loss_type='iou')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.JointsMSELoss(use_target_weight=False, loss_weight=1.0)[source]

Bases: torch.nn.modules.module.Module

MSE loss for heatmaps.

Parameters
  • use_target_weight (bool) – Option to use weighted MSE loss. Different joint types may have different target weights.

  • loss_weight (float) – Weight of the loss. Default: 1.0.

__init__(use_target_weight=False, loss_weight=1.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(output, target, target_weight)[source]

Forward function.

training: bool
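
A hedged heatmap-regression sketch; the (B, K, H, W) heatmap layout and the (B, K, 1) target-weight shape follow the usual keypoint convention and are assumptions here.

import torch
from easycv.models.loss import JointsMSELoss

criterion = JointsMSELoss(use_target_weight=True)
output = torch.rand(2, 17, 64, 48)          # predicted heatmaps (B, K, H, W) (assumed layout)
target = torch.rand(2, 17, 64, 48)          # ground-truth heatmaps
target_weight = torch.ones(2, 17, 1)        # per-joint weights (assumed shape)
loss = criterion(output, target, target_weight)
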
class easycv.models.loss.FocalLoss2d(gamma=2, weight=None, size_average=None, reduce=None, reduction='mean', num_classes=2)[source]

Bases: torch.nn.modules.loss._WeightedLoss

__init__(gamma=2, weight=None, size_average=None, reduce=None, reduction='mean', num_classes=2)[source]

FocalLoss2d: a loss that addresses the class-imbalance problem in 2-class classification.

Parameters
  • gamma – focal loss param Gamma

  • weight – weight same as loss._WeightedLoss

  • size_average – size_average same as loss._WeightedLoss

  • reduce – reduce same as loss._WeightedLoss

  • reduction – reduction, same as loss._WeightedLoss

  • num_classes – fixed to 2

Returns

Focalloss nn.module.loss object

forward(input, target)[source]

input: [N x num_classes], target: [N x num_classes] one-hot

reduction: str
class easycv.models.loss.DistributeMSELoss[source]

Bases: torch.nn.modules.module.Module

__init__()[source]

DistributeMSELoss: for FaceID age and score prediction (regression by softmax)

forward(input, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.CrossEntropyLossWithLabelSmooth(label_smooth=0.1, temperature=1.0, with_cls=False, embedding_size=512, num_classes=10000)[source]

Bases: torch.nn.modules.module.Module

__init__(label_smooth=0.1, temperature=1.0, with_cls=False, embedding_size=512, num_classes=10000)[source]

A softmax loss, with label_smooth and an fc layer (to fit the pytorch-metric-learning interface).

Parameters
  • label_smooth – label smoothing value, default=0.1

  • with_cls – if True, generates an nn.Linear to transform the input embedding from embedding_size to num_classes

  • embedding_size – if the input is a feature rather than logits, this indicates the embedding shape

  • num_classes – if the input is a feature rather than logits, this indicates the classification num_classes

Returns

None


forward(input, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.AMSoftmaxLoss(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]

Bases: torch.nn.modules.module.Module

__init__(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]

AMSoftmax loss, with an fc layer (to fit the pytorch-metric-learning interface); paper: https://arxiv.org/pdf/1801.05599.pdf

Parameters
  • embedding_size – forward input [N, embedding_size]

  • num_classes – classification num_classes

  • margin – AMSoftmax param

  • scale – AMSoftmax param, should increase with num_classes

forward(x, lb)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
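
A hedged sketch of the metric-learning style call (shapes from the parameter descriptions above):

import torch
from easycv.models.loss import AMSoftmaxLoss

criterion = AMSoftmaxLoss(embedding_size=512, num_classes=1000, margin=0.35, scale=30)
x = torch.randn(8, 512)                    # embeddings, [N, embedding_size]
lb = torch.randint(0, 1000, (8,))          # integer class labels
loss = criterion(x, lb)
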
class easycv.models.loss.ModelParallelSoftmaxLoss(embedding_size=512, num_classes=100000, scale=None, margin=None, bias=True)[source]

Bases: torch.nn.modules.module.Module

__init__(embedding_size=512, num_classes=100000, scale=None, margin=None, bias=True)[source]

ModelParallel Softmax by sailfish.

Parameters
  • embedding_size – forward input [N, embedding_size]

  • num_classes – classification num_classes

forward(x, lb)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.ModelParallelAMSoftmaxLoss(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]

Bases: torch.nn.modules.module.Module

__init__(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]

ModelParallel AMSoftmax by sailfish.

Parameters
  • embedding_size – forward input [N, embedding_size]

  • num_classes – classification num_classes

forward(x, lb)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.SoftTargetCrossEntropy(num_classes=1000, **kwargs)[source]

Bases: torch.nn.modules.module.Module

__init__(num_classes=1000, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x: torch.Tensor, target: torch.Tensor) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.CDNCriterion(num_classes, matcher, weight_dict, losses, eos_coef=None, loss_class_type='ce')[source]

Bases: easycv.models.loss.set_criterion.set_criterion.SetCriterion

This class computes the loss for Conditional DETR. The process happens in two steps:

  1. we compute hungarian assignment between ground truth boxes and the outputs of the model

  2. we supervise each pair of matched ground-truth / prediction (supervise class and box)

__init__(num_classes, matcher, weight_dict, losses, eos_coef=None, loss_class_type='ce')[source]

Create the criterion.

Parameters
  • num_classes – number of object categories, omitting the special no-object category

  • matcher – module able to compute a matching between targets and proposals

  • weight_dict – dict containing as key the names of the losses and as values their relative weight

  • losses – list of all the losses to be applied. See get_loss for the list of available losses.

prep_for_dn(dn_meta)[source]
forward(outputs, targets, aux_num, num_boxes)[source]

This performs the loss computation.

Parameters
  • outputs – dict of tensors, see the output specification of the model for the format

  • targets – list of dicts, such that len(targets) == batch_size. The expected keys in each dict depend on the losses applied; see each loss’ doc.

  • return_indices – used for visualization. If True, the layer 0-5 indices will be returned as well.

training: bool
class easycv.models.loss.DNCriterion(weight_dict)[source]

Bases: torch.nn.modules.module.Module

This class computes the loss for Conditional DETR. The process happens in two steps:

  1. we compute hungarian assignment between ground truth boxes and the outputs of the model

  2. we supervise each pair of matched ground-truth / prediction (supervise class and box)

__init__(weight_dict)[source]

Create the criterion.

Parameters
  • num_classes – number of object categories, omitting the special no-object category

  • matcher – module able to compute a matching between targets and proposals

  • weight_dict – dict containing as key the names of the losses and as values their relative weight

  • losses – list of all the losses to be applied. See get_loss for the list of available losses.

prepare_for_loss(mask_dict)[source]

Prepare dn components to calculate the loss.

Parameters
  • mask_dict – a dict that contains dn information

tgt_loss_boxes(src_boxes, tgt_boxes, num_tgt)[source]

Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss. Targets dicts must contain the key “boxes” containing a tensor of dim [nb_target_boxes, 4]. The target boxes are expected in format (center_x, center_y, w, h), normalized by the image size.

tgt_loss_labels(src_logits_, tgt_labels_, num_tgt, focal_alpha, log=False)[source]

Classification loss (NLL). Targets dicts must contain the key “labels” containing a tensor of dim [nb_target_boxes].

forward(mask_dict, aux_num)[source]

Compute dn loss in criterion.

Parameters
  • mask_dict – a dict for dn information

  • training – training or inference flag

  • aux_num – aux loss number

training: bool
class easycv.models.loss.DBLoss(balance_loss=True, main_loss_type='DiceLoss', alpha=5, beta=10, ohem_ratio=3, eps=1e-06, **kwargs)[source]

Bases: torch.nn.modules.module.Module

Differentiable Binarization (DB) loss function.

Parameters
  • params (dict) – the hyper-parameters for the DB loss

__init__(balance_loss=True, main_loss_type='DiceLoss', alpha=5, beta=10, ohem_ratio=3, eps=1e-06, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool
forward(predicts, labels)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class easycv.models.loss.HungarianMatcher(cost_dict, cost_class_type='ce_cost')[source]

Bases: torch.nn.modules.module.Module

This class computes an assignment between the targets and the predictions of the network For efficiency reasons, the targets don’t include the no_object. Because of this, in general, there are more predictions than targets. In this case, we do a 1-to-1 matching of the best predictions, while the others are un-matched (and thus treated as non-objects).

__init__(cost_dict, cost_class_type='ce_cost')[source]

Creates the matcher.

Parameters
  • cost_class – the relative weight of the classification error in the matching cost

  • cost_bbox – the relative weight of the L1 error of the bounding box coordinates in the matching cost

  • cost_giou – the relative weight of the GIoU loss of the bounding box in the matching cost

forward(outputs, targets)[source]

Performs the matching.

Parameters
  • outputs – a dict that contains at least these entries:

    • “pred_logits”: Tensor of dim [batch_size, num_queries, num_classes] with the classification logits

    • “pred_boxes”: Tensor of dim [batch_size, num_queries, 4] with the predicted box coordinates

  • targets – a list of targets (len(targets) = batch_size), where each target is a dict containing:

    • “labels”: Tensor of dim [num_target_boxes] (where num_target_boxes is the number of ground-truth objects in the target) containing the class labels

    • “boxes”: Tensor of dim [num_target_boxes, 4] containing the target box coordinates

Returns

A list of size batch_size, containing tuples of (index_i, index_j) where:

  • index_i is the indices of the selected predictions (in order)

  • index_j is the indices of the corresponding selected targets (in order)

For each batch element, it holds: len(index_i) = len(index_j) = min(num_queries, num_target_boxes)

training: bool
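A usage sketch (the exact cost_dict keys are an assumption based on the weight names in the docstring above):

import torch
from easycv.models.loss import HungarianMatcher

matcher = HungarianMatcher(cost_dict={'cost_class': 1.0, 'cost_bbox': 5.0, 'cost_giou': 2.0})
outputs = {
    'pred_logits': torch.randn(2, 100, 91),   # [batch_size, num_queries, num_classes]
    'pred_boxes': torch.rand(2, 100, 4),      # normalized (cx, cy, w, h)
}
targets = [
    {'labels': torch.tensor([3, 17]), 'boxes': torch.rand(2, 4)},
    {'labels': torch.tensor([5]), 'boxes': torch.rand(1, 4)},
]
indices = matcher(outputs, targets)   # one (index_i, index_j) tuple per batch element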
class easycv.models.loss.SetCriterion(num_classes, matcher, weight_dict, losses, eos_coef=None, loss_class_type='ce')[source]

Bases: torch.nn.modules.module.Module

This class computes the loss for Conditional DETR. The process happens in two steps:

  1. we compute hungarian assignment between ground truth boxes and the outputs of the model

  2. we supervise each pair of matched ground-truth / prediction (supervise class and box)

__init__(num_classes, matcher, weight_dict, losses, eos_coef=None, loss_class_type='ce')[source]

Create the criterion.

Parameters
  • num_classes – number of object categories, omitting the special no-object category

  • matcher – module able to compute a matching between targets and proposals

  • weight_dict – dict containing as key the names of the losses and as values their relative weight

  • losses – list of all the losses to be applied. See get_loss for the list of available losses.

loss_labels(outputs, targets, indices, num_boxes, log=True)[source]

Classification loss (binary focal loss). Targets dicts must contain the key “labels” containing a tensor of dim [nb_target_boxes].

loss_cardinality(outputs, targets, indices, num_boxes)[source]

Compute the cardinality error, i.e. the absolute error in the number of predicted non-empty boxes. This is not really a loss; it is intended for logging purposes only and does not propagate gradients.

loss_boxes(outputs, targets, indices, num_boxes)[source]

Compute the losses related to the bounding boxes: the L1 regression loss and the GIoU loss. Targets dicts must contain the key “boxes” containing a tensor of dim [nb_target_boxes, 4]. The target boxes are expected in format (center_x, center_y, w, h), normalized by the image size.

loss_centerness(outputs, targets, indices, num_boxes)[source]
loss_iouaware(outputs, targets, indices, num_boxes)[source]
get_loss(loss, outputs, targets, indices, num_boxes, **kwargs)[source]
forward(outputs, targets, num_boxes=None, return_indices=False)[source]

This performs the loss computation.

Parameters
  • outputs – dict of tensors, see the output specification of the model for the format

  • targets – list of dicts, such that len(targets) == batch_size. The expected keys in each dict depend on the losses applied; see each loss’ doc.

  • return_indices – used for visualization. If True, the layer 0-5 indices will be returned as well.

training: bool
class easycv.models.loss.L1Loss(reduction='mean', loss_weight=1.0)[source]

Bases: torch.nn.modules.module.Module

L1 loss.

Parameters
  • reduction (str, optional) – The method to reduce the loss. Options are “none”, “mean” and “sum”.

  • loss_weight (float, optional) – The weight of loss.

__init__(reduction='mean', loss_weight=1.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target, weight=None, avg_factor=None, reduction_override=None)[source]

Forward function.

Parameters
  • pred (torch.Tensor) – The prediction.

  • target (torch.Tensor) – The learning target of the prediction.

  • weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.

  • avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.

  • reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.

training: bool
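A minimal usage sketch, following the forward signature above:

import torch
from easycv.models.loss import L1Loss

criterion = L1Loss(reduction='mean', loss_weight=1.0)
pred = torch.rand(16, 4)        # e.g. predicted box deltas
target = torch.rand(16, 4)
loss = criterion(pred, target)                                  # mean |pred - target| * loss_weight
loss_sum = criterion(pred, target, reduction_override='sum')    # per-call reduction override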
class easycv.models.loss.MultiLoss(loss_config_list, weight_1=1.0, weight_2=1.0, gtc_loss='sar', **kwargs)[source]

Bases: torch.nn.modules.module.Module

__init__(loss_config_list, weight_1=1.0, weight_2=1.0, gtc_loss='sar', **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(predicts, label_ctc=None, label_sar=None, length=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.SmoothL1Loss(beta=1.0, reduction='mean', loss_weight=1.0)[source]

Bases: torch.nn.modules.module.Module

Smooth L1 loss.

Parameters
  • beta (float, optional) – The threshold in the piecewise function. Defaults to 1.0.

  • reduction (str, optional) – The method to reduce the loss. Options are “none”, “mean” and “sum”. Defaults to “mean”.

  • loss_weight (float, optional) – The weight of loss.

__init__(beta=1.0, reduction='mean', loss_weight=1.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]

Forward function.

Parameters
  • pred (torch.Tensor) – The prediction.

  • target (torch.Tensor) – The learning target of the prediction.

  • weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.

  • avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.

  • reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None.

training: bool
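As a standalone sketch of the piecewise function commonly used for this form of smooth L1 (beta is the threshold parameter above; this is an illustration, not the library implementation itself):

import torch

def smooth_l1(pred, target, beta=1.0):
    # 0.5 * diff**2 / beta   where |diff| < beta
    # |diff| - 0.5 * beta    otherwise
    diff = (pred - target).abs()
    return torch.where(diff < beta, 0.5 * diff ** 2 / beta, diff - 0.5 * beta)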
class easycv.models.loss.DiceLoss(smooth=1, exponent=2, reduction='mean', class_weight=None, loss_weight=1.0, ignore_index=255, loss_name='loss_dice', **kwargs)[source]

Bases: torch.nn.modules.module.Module

DiceLoss.

This loss is proposed in V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation.

Parameters
  • smooth (float) – A float number to smooth loss, and avoid NaN error. Default: 1

  • exponent (float) – A float number to calculate the denominator value: sum{x^exponent} + sum{y^exponent}. Default: 2.

  • reduction (str, optional) – The method used to reduce the loss. Options are “none”, “mean” and “sum”. This parameter only works when per_image is True. Default: ‘mean’.

  • class_weight (list[float] | str, optional) – Weight of each class. If in str format, read them from a file. Defaults to None.

  • loss_weight (float, optional) – Weight of the loss. Default to 1.0.

  • ignore_index (int | None) – The label index to be ignored. Default: 255.

  • loss_name (str, optional) – Name of the loss item. If you want this loss item to be included into the backward graph, loss_ must be the prefix of the name. Defaults to ‘loss_dice’.

__init__(smooth=1, exponent=2, reduction='mean', class_weight=None, loss_weight=1.0, ignore_index=255, loss_name='loss_dice', **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target, avg_factor=None, reduction_override=None, **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

property loss_name

Loss Name.

This function must be implemented and will return the name of this loss function. This name will be used to combine different loss items by a simple sum operation. In addition, if you want this loss item to be included into the backward graph, loss_ must be the prefix of the name.

Returns

The name of this loss item.

Return type

str

training: bool
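A minimal usage sketch for semantic segmentation (the [N, num_classes, H, W] prediction and [N, H, W] label shapes are assumptions):

import torch
from easycv.models.loss import DiceLoss

criterion = DiceLoss(smooth=1, exponent=2, loss_weight=1.0, ignore_index=255)
pred = torch.randn(2, 19, 64, 64)             # segmentation logits [N, num_classes, H, W]
target = torch.randint(0, 19, (2, 64, 64))    # ground-truth labels [N, H, W]
loss = criterion(pred, target)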
Submodules
easycv.models.loss.iou_loss module
class easycv.models.loss.iou_loss.YOLOX_IOULoss(reduction='none', loss_type='iou')[source]

Bases: torch.nn.modules.module.Module

__init__(reduction='none', loss_type='iou')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.iou_loss.IoULoss(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]

Bases: torch.nn.modules.module.Module

IoULoss.

Computing the IoU loss between a set of predicted bboxes and target bboxes.

Parameters
  • linear (bool) – If True, use linear scale of loss else determined by mode. Default: False.

  • eps (float) – Eps to avoid log(0).

  • reduction (str) – Options are “none”, “mean” and “sum”.

  • loss_weight (float) – Weight of loss.

  • mode (str) – Loss scaling mode, including “linear”, “square”, and “log”. Default: ‘log’

__init__(linear=False, eps=1e-06, reduction='mean', loss_weight=1.0, mode='log')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]

Forward function.

Parameters
  • pred (torch.Tensor) – The prediction.

  • target (torch.Tensor) – The learning target of the prediction.

  • weight (torch.Tensor, optional) – The weight of loss for each prediction. Defaults to None.

  • avg_factor (int, optional) – Average factor that is used to average the loss. Defaults to None.

  • reduction_override (str, optional) – The reduction method used to override the original reduction method of the loss. Defaults to None. Options are “none”, “mean” and “sum”.

training: bool
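A minimal usage sketch (the (x1, y1, x2, y2) box format is an assumption):

import torch
from easycv.models.loss.iou_loss import IoULoss

criterion = IoULoss(mode='log', loss_weight=1.0)   # mode is one of 'linear', 'square', 'log'
pred = torch.tensor([[10., 10., 50., 50.],
                     [20., 20., 60., 60.]])
target = torch.tensor([[12., 12., 48., 52.],
                       [20., 20., 60., 60.]])
loss = criterion(pred, target)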
class easycv.models.loss.iou_loss.GIoULoss(eps=1e-06, reduction='mean', loss_weight=1.0)[source]

Bases: torch.nn.modules.module.Module

__init__(eps=1e-06, reduction='mean', loss_weight=1.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(pred, target, weight=None, avg_factor=None, reduction_override=None, **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.loss.mse_loss module
class easycv.models.loss.mse_loss.JointsMSELoss(use_target_weight=False, loss_weight=1.0)[source]

Bases: torch.nn.modules.module.Module

MSE loss for heatmaps.

Parameters
  • use_target_weight (bool) – Option to use weighted MSE loss. Different joint types may have different target weights.

  • loss_weight (float) – Weight of the loss. Default: 1.0.

__init__(use_target_weight=False, loss_weight=1.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(output, target, target_weight)[source]

Forward function.

training: bool
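A minimal usage sketch with 17 COCO-style keypoints (shapes follow the NxKxHxW convention used elsewhere in these docs):

import torch
from easycv.models.loss.mse_loss import JointsMSELoss

criterion = JointsMSELoss(use_target_weight=True)
output = torch.rand(2, 17, 64, 48)      # predicted heatmaps [N, K, H, W]
target = torch.rand(2, 17, 64, 48)      # ground-truth heatmaps [N, K, H, W]
target_weight = torch.ones(2, 17, 1)    # per-joint weights [N, K, 1]
loss = criterion(output, target, target_weight)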
easycv.models.loss.pytorch_metric_learning module
class easycv.models.loss.pytorch_metric_learning.FocalLoss2d(gamma=2, weight=None, size_average=None, reduce=None, reduction='mean', num_classes=2)[source]

Bases: torch.nn.modules.loss._WeightedLoss

__init__(gamma=2, weight=None, size_average=None, reduce=None, reduction='mean', num_classes=2)[source]

FocalLoss2d, a loss designed to address the class-imbalance problem in 2-class classification.

Parameters
  • gamma – focal loss param Gamma

  • weight – weight same as loss._WeightedLoss

  • size_average – size_average same as loss._WeightedLoss

  • reduce – reduce same as loss._WeightedLoss

  • reduction – reduction same as loss._WeightedLoss

  • num_classes – fixed to 2

Returns

A FocalLoss2d nn.Module loss object

weight: Optional[Tensor]
forward(input, target)[source]

input: [N, num_classes] logits; target: [N, num_classes] one-hot

reduction: str
training: bool
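A minimal usage sketch with one-hot targets, matching the forward docstring above:

import torch
import torch.nn.functional as F
from easycv.models.loss.pytorch_metric_learning import FocalLoss2d

criterion = FocalLoss2d(gamma=2, num_classes=2)
logits = torch.randn(8, 2)                                   # [N, num_classes]
target = F.one_hot(torch.randint(0, 2, (8,)), 2).float()     # one-hot targets [N, num_classes]
loss = criterion(logits, target)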
class easycv.models.loss.pytorch_metric_learning.DistributeMSELoss[source]

Bases: torch.nn.modules.module.Module

__init__()[source]

DistributeMSELoss: for face-ID age and score prediction (regression by softmax)

forward(input, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.pytorch_metric_learning.CrossEntropyLossWithLabelSmooth(label_smooth=0.1, temperature=1.0, with_cls=False, embedding_size=512, num_classes=10000)[source]

Bases: torch.nn.modules.module.Module

__init__(label_smooth=0.1, temperature=1.0, with_cls=False, embedding_size=512, num_classes=10000)[source]

A softmax loss with label_smooth and fc (to fit the pytorch metric learning interface).

Parameters
  • label_smooth – label smoothing value, default=0.1

  • with_cls – if True, an nn.Linear is generated to transform the input embedding from embedding_size to num_classes

  • embedding_size – if the input is a feature rather than logits, this indicates the embedding shape

  • num_classes – if the input is a feature rather than logits, this indicates the classification num_classes

Returns

None

forward(input, target)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
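A minimal usage sketch with with_cls=False, i.e. the input is already logits (shapes and the integer-label target format are assumptions):

import torch
from easycv.models.loss.pytorch_metric_learning import CrossEntropyLossWithLabelSmooth

criterion = CrossEntropyLossWithLabelSmooth(label_smooth=0.1, with_cls=False, num_classes=1000)
logits = torch.randn(8, 1000)            # [N, num_classes]
labels = torch.randint(0, 1000, (8,))    # integer class labels
loss = criterion(logits, labels)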
class easycv.models.loss.pytorch_metric_learning.AMSoftmaxLoss(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]

Bases: torch.nn.modules.module.Module

__init__(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]

AMSoftmax loss, with fc (to fit the pytorch metric learning interface). Paper: https://arxiv.org/pdf/1801.05599.pdf

Parameters
  • embedding_size – forward input [N, embedding_size]

  • num_classes – classification num_classes

  • margin – AMSoftmax param

  • scale – AMSoftmax param, should increase with num_classes

forward(x, lb)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.pytorch_metric_learning.ModelParallelSoftmaxLoss(embedding_size=512, num_classes=100000, scale=None, margin=None, bias=True)[source]

Bases: torch.nn.modules.module.Module

__init__(embedding_size=512, num_classes=100000, scale=None, margin=None, bias=True)[source]

ModelParallel Softmax by sailfish.

Parameters
  • embedding_size – forward input [N, embedding_size]

  • num_classes – classification num_classes

forward(x, lb)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.pytorch_metric_learning.ModelParallelAMSoftmaxLoss(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]

Bases: torch.nn.modules.module.Module

__init__(embedding_size=512, num_classes=100000, margin=0.35, scale=30)[source]

ModelParallel AMSoftmax by sailfish.

Parameters
  • embedding_size – forward input [N, embedding_size]

  • num_classes – classification num_classes

forward(x, lb)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.loss.pytorch_metric_learning.SoftTargetCrossEntropy(num_classes=1000, **kwargs)[source]

Bases: torch.nn.modules.module.Module

__init__(num_classes=1000, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x: torch.Tensor, target: torch.Tensor) → torch.Tensor[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool

easycv.models.pose package

Subpackages
easycv.models.pose.heads package
Submodules
easycv.models.pose.heads.topdown_heatmap_base_head module
easycv.models.pose.heads.topdown_heatmap_base_head.decode_heatmap(heatmaps, img_metas, test_cfg)[source]
class easycv.models.pose.heads.topdown_heatmap_base_head.TopdownHeatmapBaseHead[source]

Bases: torch.nn.modules.module.Module

Base class for top-down heatmap heads.

All top-down heatmap heads should subclass it. All subclasses should override: get_loss (calculate the loss), get_accuracy (calculate the accuracy), forward (forward the model), and inference_model (run model inference).

abstract get_loss(**kwargs)[source]

Gets the loss.

abstract get_accuracy(**kwargs)[source]

Gets the accuracy.

abstract forward(**kwargs)[source]

Forward function.

abstract inference_model(**kwargs)[source]

Inference function.

decode(img_metas, output, **kwargs)[source]

Decode keypoints from heatmaps.

Parameters
  • img_metas (list(dict)) – Information about data augmentation. By default this includes: “image_file”: path to the image file; “center”: center of the bbox; “scale”: scale of the bbox; “rotation”: rotation of the bbox; “bbox_score”: score of the bbox.

  • output (np.ndarray[N, K, H, W]) – model predicted heatmaps.

training: bool
easycv.models.pose.heads.topdown_heatmap_simple_head module
class easycv.models.pose.heads.topdown_heatmap_simple_head.TopdownHeatmapSimpleHead(in_channels, out_channels, num_deconv_layers=3, num_deconv_filters=(256, 256, 256), num_deconv_kernels=(4, 4, 4), extra=None, in_index=0, input_transform=None, align_corners=False, loss_keypoint=None, train_cfg=None, test_cfg=None)[source]

Bases: easycv.models.pose.heads.topdown_heatmap_base_head.TopdownHeatmapBaseHead

Top-down heatmap simple head. Paper ref: Bin Xiao et al., Simple Baselines for Human Pose Estimation and Tracking.

TopdownHeatmapSimpleHead consists of (>=0) deconv layers and a simple conv2d layer.

Parameters
  • in_channels (int) – Number of input channels

  • out_channels (int) – Number of output channels

  • num_deconv_layers (int) – Number of deconv layers. num_deconv_layers should be >= 0. Note that 0 means no deconv layers.

  • num_deconv_filters (list|tuple) – Number of filters. If num_deconv_layers > 0, the length of num_deconv_filters should equal num_deconv_layers.

  • num_deconv_kernels (list|tuple) – Kernel sizes.

  • in_index (int|Sequence[int]) – Input feature index. Default: 0

  • input_transform (str|None) –

    Transformation type of input features. Options: ‘resize_concat’, ‘multiple_select’, None. Default: None.

    • ’resize_concat’: Multiple feature maps will be resized to the same size as the first one and then concatenated together. Usually used in the FCN head of HRNet.

    • ’multiple_select’: Multiple feature maps will be bundled into a list and passed into the decode head.

    • None: Only one selected feature map is allowed.

  • align_corners (bool) – align_corners argument of F.interpolate. Default: False.

  • loss_keypoint (dict) – Config for keypoint loss. Default: None.

__init__(in_channels, out_channels, num_deconv_layers=3, num_deconv_filters=(256, 256, 256), num_deconv_kernels=(4, 4, 4), extra=None, in_index=0, input_transform=None, align_corners=False, loss_keypoint=None, train_cfg=None, test_cfg=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

get_loss(output, target, target_weight)[source]

Calculate top-down keypoint loss.

Note

batch_size: N num_keypoints: K heatmaps height: H heatmaps width: W

Parameters
  • output (torch.Tensor[NxKxHxW]) – Output heatmaps.

  • target (torch.Tensor[NxKxHxW]) – Target heatmaps.

  • target_weight (torch.Tensor[NxKx1]) – Weights across different joint types.

get_accuracy(output, target, target_weight)[source]

Calculate accuracy for top-down keypoint loss.

Note

batch_size: N num_keypoints: K heatmaps height: H heatmaps width: W

Parameters
  • output (torch.Tensor[NxKxHxW]) – Output heatmaps.

  • target (torch.Tensor[NxKxHxW]) – Target heatmaps.

  • target_weight (torch.Tensor[NxKx1]) – Weights across different joint types.

forward(x)[source]

Forward function.

inference_model(x, flip_pairs=None)[source]

Inference function.

Returns

Output heatmaps.

Return type

output_heatmap (np.ndarray)

Parameters
  • x (torch.Tensor[NxKxHxW]) – Input features.

  • flip_pairs (None | list[tuple]) – Pairs of keypoints which are mirrored.

training: bool
init_weights()[source]

Initialize model weights.

Submodules
easycv.models.pose.top_down module
class easycv.models.pose.top_down.TopDown(backbone, neck=None, keypoint_head=None, train_cfg=None, test_cfg=None, pretrained=None, loss_pose=None)[source]

Bases: easycv.models.base.BaseModel

Top-down pose detectors.

Parameters
  • backbone (dict) – Backbone modules to extract feature.

  • keypoint_head (dict) – Keypoint head to process feature.

  • train_cfg (dict) – Config for training. Default: None.

  • test_cfg (dict) – Config for testing. Default: None.

  • pretrained (str) – Path to the pretrained models.

  • loss_pose (None) – Deprecated arguments. Please use loss_keypoint for heads instead.

__init__(backbone, neck=None, keypoint_head=None, train_cfg=None, test_cfg=None, pretrained=None, loss_pose=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property with_neck

Check if has neck.

property with_keypoint

Check if has keypoint_head.

init_weights()[source]

Weight initialization for model.

forward_train(img, target, target_weight, img_metas, **kwargs)[source]

Defines the computation performed at every call when training.

forward_test(img, img_metas, return_heatmap=False, **kwargs)[source]

Defines the computation performed at every call when testing.

forward_export(img, img_metas, return_heatmap=False)[source]
show_result(img, result, skeleton=None, kpt_score_thr=0.3, bbox_color='green', pose_kpt_color=None, pose_link_color=None, text_color='white', radius=4, thickness=1, font_scale=0.5, bbox_thickness=1, win_name='', show=False, show_keypoint_weight=False, wait_time=0, out_file=None)[source]

Draw result over img.

Parameters
  • img (str or Tensor) – The image to be displayed.

  • result (list[dict]) – The results to draw over img (bbox_result, pose_result).

  • skeleton (list[list]) – The connection of keypoints. skeleton is 0-based indexing.

  • kpt_score_thr (float, optional) – Minimum score of keypoints to be shown. Default: 0.3.

  • bbox_color (str or tuple or Color) – Color of bbox lines.

  • pose_kpt_color (np.array[Nx3]) – Color of N keypoints. If None, do not draw keypoints.

  • pose_link_color (np.array[Mx3]) – Color of M links. If None, do not draw links.

  • text_color (str or tuple or Color) – Color of texts.

  • radius (int) – Radius of circles.

  • thickness (int) – Thickness of lines.

  • font_scale (float) – Font scales of texts.

  • win_name (str) – The window name.

  • show (bool) – Whether to show the image. Default: False.

  • show_keypoint_weight (bool) – Whether to change the transparency using the predicted confidence scores of keypoints.

  • wait_time (int) – Value of waitKey param. Default: 0.

  • out_file (str or None) – The filename to write the image. Default: None.

Returns

Visualized image. Only returned if show is False and out_file is None.

Return type

Tensor

training: bool

easycv.models.selfsup package

Submodules
easycv.models.selfsup.byol module
class easycv.models.selfsup.byol.BYOL(backbone, neck=None, head=None, pretrained=None, base_momentum=0.996, **kwargs)[source]

Bases: easycv.models.base.BaseModel

BYOL unofficial implementation. Paper: https://arxiv.org/abs/2006.07733

__init__(backbone, neck=None, head=None, pretrained=None, base_momentum=0.996, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward_train(img, **kwargs)[source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_test(img, **kwargs)[source]

Abstract interface for model forward in testing

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward(img, mode='train', **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.selfsup.dino module
class easycv.models.selfsup.dino.MultiCropWrapper(backbone, head)[source]

Bases: torch.nn.modules.module.Module

Perform forward pass separately on each resolution input. The inputs corresponding to a single resolution are grouped, and a single forward pass is run on them; hence the number of forward passes equals the number of different resolutions used. We then concatenate all the output features and run the head forward on these concatenated features.

__init__(backbone, head)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.dino.DINOLoss(out_dim, ncrops, warmup_teacher_temp, teacher_temp, warmup_teacher_temp_epochs, nepochs, device, student_temp=0.1, center_momentum=0.9)[source]

Bases: torch.nn.modules.module.Module

__init__(out_dim, ncrops, warmup_teacher_temp, teacher_temp, warmup_teacher_temp_epochs, nepochs, device, student_temp=0.1, center_momentum=0.9)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(student_output, teacher_output, epoch)[source]

Cross-entropy between softmax outputs of the teacher and student networks.

update_center(teacher_output)[source]

Update center used for teacher output.

training: bool
easycv.models.selfsup.dino.has_batchnorms(model)[source]
easycv.models.selfsup.dino.get_params_groups(model)[source]
class easycv.models.selfsup.dino.DINO(backbone, train_preprocess=[], neck=None, config=None, pretrained=None)[source]

Bases: easycv.models.base.BaseModel

__init__(backbone, train_preprocess=[], neck=None, config=None, pretrained=None)[source]

Init DINO

Parameters
  • backbone – backbone config to build vision backbone

  • train_preprocess – [gaussBlur, mixUp, solarize]

  • neck – neck config to build Moby Neck

  • config – DINO parameter config

get_params_groups()[source]
init_weights(pretrained=None)[source]
init_before_train()[source]
momentum_update_key_encoder(m=0.999)[source]

EMA update for DINO.

forward_train(inputs)[source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_test(img, **kwargs)[source]

Abstract interface for model forward in testing

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_feature(img, **kwargs)[source]

Forward backbone

Returns

feature tensor

Return type

x (torch.Tensor)

training: bool
forward(img, gt_label=None, mode='train', extract_list=['neck'], **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

easycv.models.selfsup.mae module
class easycv.models.selfsup.mae.MAE(backbone, neck, mask_ratio=0.75, norm_pix_loss=True, **kwargs)[source]

Bases: easycv.models.base.BaseModel

__init__(backbone, neck, mask_ratio=0.75, norm_pix_loss=True, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
patchify(imgs)[source]

Convert images to patches.

Parameters

imgs – (N, 3, H, W)

Returns

(N, L, patch_size**2 *3)

Return type

x

forward_loss(imgs, pred, mask)[source]

compute loss

Parameters
  • imgs – (N, 3, H, W)

  • pred – (N, L, p*p*3)

  • mask – (N, L), 0 is keep, 1 is remove

forward_train(img, **kwargs)[source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_test(img, **kwargs)[source]

Abstract interface for model forward in testing

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward(img, mode='train', **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
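A standalone sketch of the shape arithmetic behind patchify (this reimplements the reshaping for illustration and is not the library call itself; 224x224 inputs and patch_size=16 are assumed):

import torch

imgs = torch.randn(2, 3, 224, 224)
patch_size = 16
N, C, H, W = imgs.shape
L = (H // patch_size) * (W // patch_size)    # 196 patches
patch_dim = patch_size * patch_size * C      # 768 values per patch
patches = imgs.reshape(N, C, H // patch_size, patch_size, W // patch_size, patch_size)
patches = patches.permute(0, 2, 4, 3, 5, 1).reshape(N, L, patch_dim)
print(patches.shape)   # torch.Size([2, 196, 768]) == (N, L, patch_size**2 * 3)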
easycv.models.selfsup.mixco module
class easycv.models.selfsup.mixco.MIXCO(backbone, train_preprocess=[], neck=None, head=None, mixco_head=None, pretrained=None, queue_len=65536, feat_dim=128, momentum=0.999, **kwargs)[source]

Bases: easycv.models.selfsup.moco.MOCO

MIXCO.

A mixup version of MOCO: https://arxiv.org/pdf/2010.06300.pdf

__init__(backbone, train_preprocess=[], neck=None, head=None, mixco_head=None, pretrained=None, queue_len=65536, feat_dim=128, momentum=0.999, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward_train(img, **kwargs)[source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

training: bool
easycv.models.selfsup.moby module
class easycv.models.selfsup.moby.MoBY(backbone, train_preprocess=[], neck=None, head=None, pretrained=None, queue_len=4096, contrast_temperature=0.2, momentum=0.99, online_drop_path_rate=0.2, target_drop_path_rate=0.0, **kwargs)[source]

Bases: easycv.models.base.BaseModel

MoBY. Part of the code is borrowed from: https://github.com/SwinTransformer/Transformer-SSL/blob/main/models/moby.py.

__init__(backbone, train_preprocess=[], neck=None, head=None, pretrained=None, queue_len=4096, contrast_temperature=0.2, momentum=0.99, online_drop_path_rate=0.2, target_drop_path_rate=0.0, **kwargs)[source]

Init Moby

Parameters
  • backbone – backbone config to build vision backbone

  • train_preprocess – [gaussBlur, mixUp, solarize]

  • neck – neck config to build Moby Neck

  • head – head config to build Moby head

  • pretrained – pretrained weight for backbone

  • queue_len – moby queue length

  • contrast_temperature – contrastive_loss temperature

  • momentum – ema target weights momentum

  • online_drop_path_rate – for transformer based backbone, set online model drop_path_rate

  • target_drop_path_rate – for transformer based backbone, set target model drop_path_rate

init_weights()[source]
forward_backbone(img)[source]
contrastive_loss(q, k, queue)[source]
forward_train(img, **kwargs)[source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_test(img, **kwargs)[source]

Abstract interface for model forward in testing

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_feature(img, **kwargs)[source]

Forward backbone

Returns

feature tensor

Return type

x (torch.Tensor)

forward(img, gt_label=None, mode='train', extract_list=['neck'], **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.selfsup.moby.concat_all_gather(tensor)[source]

Performs all_gather operation on the provided tensors. Warning: torch.distributed.all_gather has no gradient.

easycv.models.selfsup.moco module
class easycv.models.selfsup.moco.MOCO(backbone, train_preprocess=[], neck=None, head=None, pretrained=None, queue_len=65536, feat_dim=128, momentum=0.999, **kwargs)[source]

Bases: easycv.models.base.BaseModel

MOCO. Part of the code is borrowed from: https://github.com/facebookresearch/moco/blob/master/moco/builder.py.

__init__(backbone, train_preprocess=[], neck=None, head=None, pretrained=None, queue_len=65536, feat_dim=128, momentum=0.999, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward_backbone(img)[source]
forward_train(img, **kwargs)[source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_test(img, **kwargs)[source]

Abstract interface for model forward in testing

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_feature(img, **kwargs)[source]

Forward backbone

Returns

feature tensor

Return type

x (torch.Tensor)

forward(img, gt_label=None, mode='train', extract_list=['neck'], **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.selfsup.moco.concat_all_gather(tensor)[source]

Performs all_gather operation on the provided tensors. Warning: torch.distributed.all_gather has no gradient.

easycv.models.selfsup.necks module
class easycv.models.selfsup.necks.DINONeck(in_dim, out_dim, use_bn=False, norm_last_layer=True, nlayers=3, hidden_dim=2048, bottleneck_dim=256)[source]

Bases: torch.nn.modules.module.Module

__init__(in_dim, out_dim, use_bn=False, norm_last_layer=True, nlayers=3, hidden_dim=2048, bottleneck_dim=256)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.necks.MoBYMLP(in_channels=256, hid_channels=4096, out_channels=256, num_layers=2, with_avg_pool=True)[source]

Bases: torch.nn.modules.module.Module

__init__(in_channels=256, hid_channels=4096, out_channels=256, num_layers=2, with_avg_pool=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

init_weights(init_linear='normal')[source]
training: bool
class easycv.models.selfsup.necks.NonLinearNeckSwav(in_channels, hid_channels, out_channels, with_avg_pool=True, export=False)[source]

Bases: torch.nn.modules.module.Module

The non-linear neck in SwAV: fc-syncbn-relu-fc

__init__(in_channels, hid_channels, out_channels, with_avg_pool=True, export=False)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.necks.NonLinearNeckV0(in_channels, hid_channels, out_channels, sync_bn=False, with_avg_pool=True)[source]

Bases: torch.nn.modules.module.Module

The non-linear neck in ODC, fc-bn-relu-dropout-fc-relu

__init__(in_channels, hid_channels, out_channels, sync_bn=False, with_avg_pool=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.necks.NonLinearNeckV1(in_channels, hid_channels, out_channels, with_avg_pool=True)[source]

Bases: torch.nn.modules.module.Module

The non-linear neck in MoCO v2: fc-relu-fc

__init__(in_channels, hid_channels, out_channels, with_avg_pool=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.necks.NonLinearNeckV2(in_channels, hid_channels, out_channels, with_avg_pool=True)[source]

Bases: torch.nn.modules.module.Module

The non-linear neck in byol: fc-bn-relu-fc

__init__(in_channels, hid_channels, out_channels, with_avg_pool=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.necks.NonLinearNeckSimCLR(in_channels, hid_channels, out_channels, num_layers=2, with_avg_pool=True)[source]

Bases: torch.nn.modules.module.Module

SimCLR non-linear neck.

Structure: fc(no_bias)-bn(has_bias)-[relu-fc(no_bias)-bn(no_bias)].

The substructures in [] can be repeated. For the SimCLR default setting, the repeat time is 1.

However, PyTorch does not support specifying (weight=True, bias=False). It only supports “affine”, which includes both the weight and the bias. Hence, the second BatchNorm has a bias in this implementation. This is different from the official implementation of SimCLR.

Since SyncBatchNorm in pytorch<1.4.0 does not support 2D input, the input is expanded to 4D with shape (N,C,1,1). I am not sure if this workaround has no bugs. See the pull request here: https://github.com/pytorch/pytorch/pull/29626

Parameters
  • in_channels – input channel number

  • hid_channels – hidden channels

  • out_channels – output channel number

  • num_layers (int) – number of fc layers, it is 2 in the SimCLR default setting.

  • with_avg_pool – output with average pooling

__init__(in_channels, hid_channels, out_channels, num_layers=2, with_avg_pool=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.necks.RelativeLocNeck(in_channels, out_channels, sync_bn=False, with_avg_pool=True)[source]

Bases: torch.nn.modules.module.Module

Relative patch location neck: fc-bn-relu-dropout

__init__(in_channels, out_channels, sync_bn=False, with_avg_pool=True)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights(init_linear='normal')[source]
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.necks.MAENeck(num_patches, embed_dim=768, patch_size=16, in_chans=3, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]

Bases: torch.nn.modules.module.Module

MAE decoder

Parameters
  • num_patches (int) – number of patches from encoder

  • embed_dim (int) – encoder embedding dimension

  • patch_size (int) – encoder patch size

  • in_chans (int) – input image channels

  • decoder_embed_dim (int) – decoder embedding dimension

  • decoder_depth (int) – number of decoder layers

  • decoder_num_heads (int) – Parallel attention heads

  • mlp_ratio (float) – mlp ratio

  • norm_layer – type of normalization layer

__init__(num_patches, embed_dim=768, patch_size=16, in_chans=3, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
training: bool
forward(x, ids_restore)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

class easycv.models.selfsup.necks.FastConvMAENeck(num_patches, embed_dim=768, patch_size=16, in_channels=3, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]

Bases: easycv.models.selfsup.necks.MAENeck

Fast ConvMAE decoder, refer to: https://github.com/Alpha-VL/FastConvMAE

Parameters
  • num_patches (int) – number of patches from encoder

  • embed_dim (int) – encoder embedding dimension

  • patch_size (int) – encoder patch size

  • in_channels (int) – input image channels

  • decoder_embed_dim (int) – decoder embedding dimension

  • decoder_depth (int) – number of decoder layers

  • decoder_num_heads (int) – Parallel attention heads

  • mlp_ratio (float) – mlp ratio

  • norm_layer – type of normalization layer

training: bool
__init__(num_patches, embed_dim=768, patch_size=16, in_channels=3, decoder_embed_dim=512, decoder_depth=8, decoder_num_heads=16, mlp_ratio=4.0, norm_layer=functools.partial(<class 'torch.nn.modules.normalization.LayerNorm'>, eps=1e-06))[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward(x, ids_restore)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

easycv.models.selfsup.simclr module
class easycv.models.selfsup.simclr.SimCLR(backbone, train_preprocess=[], neck=None, head=None, pretrained=None)[source]

Bases: easycv.models.base.BaseModel

__init__(backbone, train_preprocess=[], neck=None, head=None, pretrained=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward_backbone(img)[source]

Forward backbone

Returns

backbone outputs

Return type

x (tuple)

forward_train(img, **kwargs)[source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_test(img, **kwargs)[source]

Abstract interface for model forward in testing

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward(img, mode='train', **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.selfsup.swav module
class easycv.models.selfsup.swav.SWAV(backbone, train_preprocess=[], neck=None, config=None, pretrained=None)[source]

Bases: easycv.models.base.BaseModel

__init__(backbone, train_preprocess=[], neck=None, config=None, pretrained=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

init_weights()[source]
forward_backbone(img)[source]
forward_train_model(inputs)[source]
forward_train(inputs)[source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_test(img, **kwargs)[source]

Abstract interface for model forward in testing

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_feature(img, **kwargs)[source]

Forward backbone

Returns

feature tensor

Return type

x (torch.Tensor)

forward(img, gt_label=None, mode='train', extract_list=['neck'], **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.selfsup.swav.MultiPrototypes(output_dim, nmb_prototypes)[source]

Bases: torch.nn.modules.module.Module

__init__(output_dim, nmb_prototypes)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.selfsup.swav.distributed_sinkhorn(Q, nmb_iters)[source]

easycv.models.utils package

Submodules
easycv.models.utils.accuracy module
easycv.models.utils.activation module
class easycv.models.utils.activation.FReLU(in_channel)[source]

Bases: torch.nn.modules.module.Module

__init__(in_channel)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.utils.activation.build_activation_layer(cfg)[source]

Build activation layer.

Parameters

cfg (dict) – The activation layer config, which should contain: - type (str): Layer type. - layer args: Args needed to instantiate an activation layer.

Returns

Created activation layer.

Return type

nn.Module
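A usage sketch; the registered type names are assumptions based on the classes documented in this module:

from easycv.models.utils.activation import build_activation_layer

relu = build_activation_layer(dict(type='ReLU', inplace=True))
frelu = build_activation_layer(dict(type='FReLU', in_channel=64))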

easycv.models.utils.conv_module module
easycv.models.utils.conv_module.build_conv_layer(cfg, *args, **kwargs)[source]

Build convolution layer

Parameters

cfg (None or dict) – cfg should contain: type (str): identify conv layer type. layer args: args needed to instantiate a conv layer.

Returns

created conv layer

Return type

layer (nn.Module)

class easycv.models.utils.conv_module.ConvModule(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias='auto', conv_cfg=None, norm_cfg=None, act_cfg={'type': 'ReLU'}, inplace=True, order=('conv', 'norm', 'act'))[source]

Bases: torch.nn.modules.module.Module

A conv block that contains conv/norm/activation layers.

Parameters
  • in_channels (int) – Same as nn.Conv2d.

  • out_channels (int) – Same as nn.Conv2d.

  • kernel_size (int or tuple[int]) – Same as nn.Conv2d.

  • stride (int or tuple[int]) – Same as nn.Conv2d.

  • padding (int or tuple[int]) – Same as nn.Conv2d.

  • dilation (int or tuple[int]) – Same as nn.Conv2d.

  • groups (int) – Same as nn.Conv2d.

  • bias (bool or str) – If specified as auto, it will be decided by the norm_cfg. Bias will be set as True if norm_cfg is None, otherwise False.

  • conv_cfg (dict) – Config dict for convolution layer.

  • norm_cfg (dict) – Config dict for normalization layer.

  • act_cfg (dict) – Config of activation layers. Default: dict(type=’ReLU’)

  • inplace (bool) – Whether to use inplace mode for activation.

  • order (tuple[str]) – The order of conv/norm/activation layers. It is a sequence of “conv”, “norm” and “act”. Examples are (“conv”, “norm”, “act”) and (“act”, “conv”, “norm”).

__init__(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias='auto', conv_cfg=None, norm_cfg=None, act_cfg={'type': 'ReLU'}, inplace=True, order=('conv', 'norm', 'act'))[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property norm
init_weights()[source]
forward(x, activate=True, norm=True)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
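
As a rough usage sketch of the block described above (assuming ‘BN’ and ‘ReLU’ are registered norm/activation types):

import torch
from easycv.models.utils.conv_module import ConvModule

# conv -> norm -> act block; bias is disabled automatically because norm_cfg is set.
block = ConvModule(
    in_channels=3,
    out_channels=16,
    kernel_size=3,
    padding=1,
    norm_cfg=dict(type='BN'),
    act_cfg=dict(type='ReLU'))
out = block(torch.randn(2, 3, 32, 32))  # expected shape: (2, 16, 32, 32)
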
easycv.models.utils.conv_ws module
easycv.models.utils.conv_ws.conv_ws_2d(input, weight, bias=None, stride=1, padding=0, dilation=1, groups=1, eps=1e-05)[source]
class easycv.models.utils.conv_ws.ConvWS2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, eps=1e-05)[source]

Bases: torch.nn.modules.conv.Conv2d

__init__(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, eps=1e-05)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

bias: Optional[torch.Tensor]
in_channels: int
out_channels: int
kernel_size: Tuple[int, ...]
stride: Tuple[int, ...]
padding: Union[str, Tuple[int, ...]]
dilation: Tuple[int, ...]
transposed: bool
output_padding: Tuple[int, ...]
groups: int
padding_mode: str
weight: torch.Tensor
easycv.models.utils.dist_utils module
easycv.models.utils.dist_utils.all_gather(embeddings, labels)[source]
easycv.models.utils.dist_utils.all_gather_embeddings_labels(embeddings, labels)[source]
class easycv.models.utils.dist_utils.DistributedLossWrapper(loss, **kwargs)[source]

Bases: torch.nn.modules.module.Module

__init__(loss, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(embeddings, labels, *args, **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.utils.dist_utils.DistributedMinerWrapper(miner)[source]

Bases: torch.nn.modules.module.Module

__init__(miner)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(embeddings, labels, ref_emb=None, ref_labels=None)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.utils.dist_utils.reduce_mean(tensor)[source]

Obtain the mean of a tensor across different GPUs.

easycv.models.utils.gather_layer module
class easycv.models.utils.gather_layer.GatherLayer(*args, **kwargs)[source]

Bases: torch.autograd.function.Function

Gather tensors from all process, supporting backward propagation.

static forward(ctx, input)[source]

Performs the operation.

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by any number of arguments (tensors or other types).

The context can be used to store arbitrary data that can then be retrieved during the backward pass. Tensors should not be stored directly on ctx (though this is not currently enforced for backward compatibility). Instead, tensors should be saved either with ctx.save_for_backward() if they are intended to be used in backward (equivalently, vjp) or ctx.save_for_forward() if they are intended to be used in jvp.

static backward(ctx, *grads)[source]

Defines a formula for differentiating the operation with backward mode automatic differentiation (alias to the vjp function).

This function is to be overridden by all subclasses.

It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non tensor outputs of the forward function), and it should return as many tensors, as there were inputs to forward(). Each argument is the gradient w.r.t the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Tensor or is a Tensor not requiring grads, you can just pass None as a gradient for that input.

The context can be used to retrieve tensors saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.

easycv.models.utils.init_weights module
easycv.models.utils.init_weights.trunc_normal_(tensor, mean=0.0, std=1.0, a=- 2.0, b=2.0)[source]
easycv.models.utils.multi_pooling module
class easycv.models.utils.multi_pooling.GeMPooling(p=3, eps=1e-06)[source]

Bases: torch.nn.modules.module.Module

GeM pooling, used for image retrieval.
  p = 1: average pooling.
  p > 1: increases the contrast of the pooled feature map and focuses on the salient features of the image.
  p = infinity: spatial max-pooling layer.

__init__(p=3, eps=1e-06)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

gem(x, p=3, eps=1e-06)[source]
training: bool
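
A brief usage sketch (the feature map shape is illustrative; the exact output shape depends on the implementation):

import torch
from easycv.models.utils.multi_pooling import GeMPooling

pool = GeMPooling(p=3)
feat = torch.randn(2, 2048, 7, 7)  # e.g. a ResNet-50 final feature map
pooled = pool(feat)                # spatial dimensions are pooled away
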
class easycv.models.utils.multi_pooling.MultiPooling(pool_type='adaptive', in_indices=(0), backbone='resnet50')[source]

Bases: torch.nn.modules.module.Module

Pooling layers for features from multiple depth.

POOL_PARAMS = {'resnet50': [{'kernel_size': 10, 'stride': 10, 'padding': 4}, {'kernel_size': 16, 'stride': 8, 'padding': 0}, {'kernel_size': 13, 'stride': 5, 'padding': 0}, {'kernel_size': 8, 'stride': 3, 'padding': 0}, {'kernel_size': 6, 'stride': 1, 'padding': 0}]}
POOL_SIZES = {'resnet50': [12, 6, 4, 3, 2]}
POOL_DIMS = {'resnet50': [9216, 9216, 8192, 9216, 8192]}
__init__(pool_type='adaptive', in_indices=(0), backbone='resnet50')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.utils.multi_pooling.MultiAvgPooling(pool_type='adaptive', in_indices=(0), backbone='resnet50')[source]

Bases: torch.nn.modules.module.Module

Pooling layers for features from multiple depth.

POOL_PARAMS = {'resnet50': [{'kernel_size': 10, 'stride': 10, 'padding': 4}, {'kernel_size': 16, 'stride': 8, 'padding': 0}, {'kernel_size': 13, 'stride': 5, 'padding': 0}, {'kernel_size': 8, 'stride': 3, 'padding': 0}, {'kernel_size': 7, 'stride': 1, 'padding': 0}]}
POOL_SIZES = {'resnet50': [12, 6, 4, 3, 1]}
POOL_DIMS = {'resnet50': [9216, 9216, 8192, 9216, 2048]}
__init__(pool_type='adaptive', in_indices=(0), backbone='resnet50')[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

training: bool
forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

easycv.models.utils.norm module
class easycv.models.utils.norm.SyncIBN(planes, ratio=0.5, eps=1e-05)[source]

Bases: torch.nn.modules.module.Module

Instance-Batch Normalization layer from “Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net” (https://arxiv.org/pdf/1807.09441.pdf).

Parameters
  • planes (int) – Number of channels for the input tensor.

  • ratio (float) – Ratio of instance normalization in the IBN layer.

__init__(planes, ratio=0.5, eps=1e-05)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
class easycv.models.utils.norm.IBN(planes, ratio=0.5, eps=1e-05)[source]

Bases: torch.nn.modules.module.Module

Instance-Batch Normalization layer from “Two at Once: Enhancing Learning and Generalization Capacities via IBN-Net” (https://arxiv.org/pdf/1807.09441.pdf).

Parameters
  • planes (int) – Number of channels for the input tensor.

  • ratio (float) – Ratio of instance normalization in the IBN layer.

__init__(planes, ratio=0.5, eps=1e-05)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.utils.norm.build_norm_layer(cfg, num_features, postfix='')[source]

Build normalization layer

Parameters
  • cfg (dict) – The norm layer config, which should contain: type (str): identify norm layer type; layer args: args needed to instantiate a norm layer; requires_grad (bool, optional): whether to stop gradient updates.

  • num_features (int) – number of channels from input.

  • postfix (int, str) – appended to the norm abbreviation to create a named layer.

Returns

(name, layer) – name (str) is the norm abbreviation + postfix; layer (nn.Module) is the created norm layer.

Return type

tuple
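
A small usage sketch, assuming ‘BN’ is a registered norm type as in mmcv-style configs:

from easycv.models.utils.norm import build_norm_layer

# Returns a (name, layer) pair; the name combines the abbreviation and postfix.
name, bn = build_norm_layer(dict(type='BN', requires_grad=True), num_features=64, postfix=1)
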

easycv.models.utils.ops module
easycv.models.utils.ops.resize_tensor(input, size=None, scale_factor=None, mode='nearest', align_corners=None, warning=True)[source]

Resize tensor with F.interpolate.

Parameters
  • input (Tensor) – the input tensor.

  • size (Tuple[int, int]) – output spatial size.

  • scale_factor (float or Tuple[float]) – multiplier for spatial size. If scale_factor is a tuple, its length has to match input.dim().

  • mode (str) – algorithm used for upsampling: ‘nearest’ | ‘linear’ | ‘bilinear’ | ‘bicubic’ | ‘trilinear’ | ‘area’. Default: ‘nearest’

  • align_corners (bool) –

    Geometrically, we consider the pixels of the input and output as squares rather than points. If set to True, the input and output tensors are aligned by the center points of their corner pixels, preserving the values at the corner pixels.

    If set to False, the input and output tensors are aligned by the corner points of their corner pixels, and the interpolation uses edge value padding for out-of-boundary values, making this operation independent of input size when scale_factor is kept the same. This only has an effect when mode is ‘linear’, ‘bilinear’, ‘bicubic’ or ‘trilinear’.
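
For example, upsampling a feature map with bilinear interpolation (a minimal sketch):

import torch
from easycv.models.utils.ops import resize_tensor

feat = torch.randn(1, 256, 32, 32)
# Resize to 64x64; align_corners only matters for 'linear'/'bilinear'/'bicubic'/'trilinear'.
out = resize_tensor(feat, size=(64, 64), mode='bilinear', align_corners=False)
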

easycv.models.utils.ops.make_divisible(x, divisor)[source]
easycv.models.utils.pos_embed module
easycv.models.utils.pos_embed.get_2d_sincos_pos_embed(embed_dim, grid_size, cls_token=False)[source]

grid_size: int of the grid height and width

Returns: pos_embed of shape [grid_size*grid_size, embed_dim] or [1+grid_size*grid_size, embed_dim] (with or without cls_token)

easycv.models.utils.pos_embed.get_2d_sincos_pos_embed_from_grid(embed_dim, grid)[source]
easycv.models.utils.pos_embed.get_1d_sincos_pos_embed_from_grid(embed_dim, pos)[source]

embed_dim: output dimension for each position

pos: a list of positions to be encoded, of size (M,)

Returns: out of shape (M, D)

easycv.models.utils.pos_embed.interpolate_pos_embed(model, checkpoint_model)[source]
easycv.models.utils.res_layer module
class easycv.models.utils.res_layer.ResLayer(block, num_blocks, in_channels, out_channels, expansion=None, stride=1, avg_down=False, conv_cfg=None, norm_cfg={'type': 'BN'}, **kwargs)[source]

Bases: torch.nn.modules.container.Sequential

ResLayer to build a ResNet style backbone.

Parameters
  • block (nn.Module) – Residual block used to build ResLayer.

  • num_blocks (int) – Number of blocks.

  • in_channels (int) – Input channels of this block.

  • out_channels (int) – Output channels of this block.

  • expansion (int, optional) – The expansion for BasicBlock/Bottleneck. If not specified, it will firstly be obtained via block.expansion. If the block has no attribute “expansion”, the following default values will be used: 1 for BasicBlock and 4 for Bottleneck. Default: None.
  • stride (int) – stride of the first block. Default: 1.

  • avg_down (bool) – Use AvgPool instead of stride conv when downsampling in the bottleneck. Default: False

  • conv_cfg (dict, optional) – dictionary to construct and config conv layer. Default: None

  • norm_cfg (dict) – dictionary to construct and config norm layer. Default: dict(type=’BN’)

__init__(block, num_blocks, in_channels, out_channels, expansion=None, stride=1, avg_down=False, conv_cfg=None, norm_cfg={'type': 'BN'}, **kwargs)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

easycv.models.utils.scale module
class easycv.models.utils.scale.Scale(scale=1.0)[source]

Bases: torch.nn.modules.module.Module

A learnable scale parameter

__init__(scale=1.0)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool
easycv.models.utils.sobel module
class easycv.models.utils.sobel.Sobel[source]

Bases: torch.nn.modules.module.Module

__init__()[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

forward(x)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

training: bool

Submodules

easycv.models.base module

class easycv.models.base.BaseModel(init_cfg=None)[source]

Bases: torch.nn.modules.module.Module

base class for model.

__init__(init_cfg=None)[source]

Initializes internal Module state, shared by both nn.Module and ScriptModule.

property is_init: bool
init_weights()[source]
abstract forward_train(img: torch.Tensor, **kwargs) → Dict[str, torch.Tensor][source]

Abstract interface for model forward in training

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward_test(img: torch.Tensor, **kwargs) → Dict[str, torch.Tensor][source]

Abstract interface for model forward in testing

Parameters
  • img (Tensor) – image tensor

  • kwargs (keyword arguments) – Specific to concrete implementation.

forward(mode='train', *args, **kwargs)[source]

Defines the computation performed at every call.

Should be overridden by all subclasses.

Note

Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.

train_step(data, optimizer)[source]

The iteration step during training.

This method defines an iteration step during training, except for the back propagation and optimizer updating, which are done in an optimizer hook. Note that in some complicated cases or models, the whole process including back propagation and optimizer updating is also defined in this method, such as GAN.

Parameters
  • data (dict) – The output of dataloader.

  • optimizer (torch.optim.Optimizer | dict) – The optimizer of runner is passed to train_step(). This argument is unused and reserved.

Returns

It should contain at least 3 keys: loss, log_vars, num_samples.

  • loss is a tensor for back propagation, which can be a weighted sum of multiple losses.

  • log_vars contains all the variables to be sent to the logger.

  • num_samples indicates the batch size (when the model is DDP, it means the batch size on each GPU), which is used for averaging the logs.

Return type

dict

val_step(data, optimizer)[source]

The iteration step during validation.

This method shares the same signature as train_step(), but used during val epochs. Note that the evaluation after training epochs is not implemented with this method, but an evaluation hook.

show_result(**kwargs)[source]

Visualize the results.

training: bool

easycv.models.builder module

easycv.models.builder.build(cfg, registry, default_args=None)[source]
easycv.models.builder.build_backbone(cfg)[source]
easycv.models.builder.build_neck(cfg)[source]
easycv.models.builder.build_head(cfg)[source]
easycv.models.builder.build_loss(cfg)[source]
easycv.models.builder.build_model(cfg)[source]
easycv.models.builder.build_voxel_encoder(cfg)[source]

Build voxel encoder.

easycv.models.builder.build_middle_encoder(cfg)[source]

Build middle level encoder.

easycv.models.builder.build_fusion_layer(cfg)[source]

Build fusion layer.

easycv.models.builder.build_transformer(cfg, default_args=None)[source]

Builder for Transformer.

easycv.models.builder.build_positional_encoding(cfg, default_args=None)[source]

Builder for Position Encoding.

easycv.models.builder.build_attention(cfg, default_args=None)[source]

Builder for attention.

easycv.models.builder.build_feedforward_network(cfg, default_args=None)[source]

Builder for feed-forward network (FFN).

easycv.models.builder.build_transformer_layer(cfg, default_args=None)[source]

Builder for transformer layer.

easycv.models.builder.build_transformer_layer_sequence(cfg, default_args=None)[source]

Builder for transformer encoder and transformer decoder.

easycv.models.modelzoo module

easycv.models.registry module

easycv.utils package

Submodules

easycv.utils.alias_multinomial module

class easycv.utils.alias_multinomial.AliasMethod(probs)[source]

Bases: object

https://hips.seas.harvard.edu/blog/2013/03/03/the-alias-method-efficient-sampling-with-many-discrete-outcomes/

__init__(probs)[source]

Initialize self. See help(type(self)) for accurate signature.

cuda()[source]
draw(N)[source]

Draw N samples from multinomial
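
A minimal sketch of drawing samples with the alias method (the probability vector is illustrative):

import torch
from easycv.utils.alias_multinomial import AliasMethod

probs = torch.tensor([0.1, 0.2, 0.3, 0.4])
sampler = AliasMethod(probs)
samples = sampler.draw(10)  # 10 indices sampled according to probs
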

easycv.utils.bbox_util module

easycv.utils.checkpoint module

easycv.utils.checkpoint.get_checkpoint(filename)[source]
easycv.utils.checkpoint.load_checkpoint(model, filename, map_location='cpu', strict=False, logger=None, revise_keys=[('^module\\.', '')])[source]

Load checkpoint from a file or URI.

Parameters
  • model (Module) – Module to load checkpoint.

  • filename (str) – Accept local filepath, URL, torchvision://xxx, open-mmlab://xxx. Please refer to docs/model_zoo.md for details.

  • map_location (str) – Same as torch.load().

  • strict (bool) – Whether to allow different params for the model and checkpoint.

  • logger (logging.Logger or None) – The logger for error message.

  • revise_keys (list) – A list of customized keywords to modify the state_dict in checkpoint. Each item is a (pattern, replacement) pair of the regular expression operations. Default: strip the prefix ‘module.’ by [(r’^module.’, ‘’)].

Returns

The loaded checkpoint.

Return type

dict or OrderedDict

easycv.utils.checkpoint.save_checkpoint(model, filename, optimizer=None, meta=None)[source]

Save checkpoint to file.

The checkpoint will have 3 fields: meta, state_dict and optimizer. By default meta will contain version and time info.

Parameters
  • model (Module) – Module whose params are to be saved.

  • filename (str) – Checkpoint filename.

  • optimizer (Optimizer, optional) – Optimizer to be saved.

  • meta (dict, optional) – Metadata to be saved in checkpoint.

easycv.utils.collect module

easycv.utils.collect.nondist_forward_collect(func, data_loader, length)[source]

Forward and collect network outputs.

This function performs forward propagation and collects outputs. It can be used to collect results, features, losses, etc.

Parameters
  • func (function) – The function to process data. The output must be a dictionary of CPU tensors.

  • length (int) – Expected length of output arrays.

Returns

The concatenated outputs.

Return type

results_all (dict(np.ndarray))

easycv.utils.collect.dist_forward_collect(func, data_loader, rank, length, ret_rank=- 1)[source]

Forward and collect network outputs in a distributed manner.

This function performs forward propagation and collects outputs. It can be used to collect results, features, losses, etc.

Parameters
  • func (function) – The function to process data. The output must be a dictionary of CPU tensors.

  • rank (int) – This process id.

  • length (int) – Expected length of output arrays.

  • ret_rank (int) – The process that returns. Other processes will return None.

Returns

The concatenated outputs.

Return type

results_all (dict(np.ndarray))

easycv.utils.collect_env module

easycv.utils.collect_env.collect_env()[source]

easycv.utils.config_tools module

easycv.utils.config_tools.traverse_replace(d, key, value)[source]
class easycv.utils.config_tools.WrapperConfig(cfg_dict=None, cfg_text=None, filename=None)[source]

Bases: mmcv.utils.config.Config

A facility for config and config files. It supports common file formats as configs: python/json/yaml. The interface is the same as a dict object and also allows accessing config values as attributes.

Example

>>> cfg = Config(dict(a=1, b=dict(b1=[0, 1])))
>>> cfg.a
1
>>> cfg.b
{'b1': [0, 1]}
>>> cfg.b.b1
[0, 1]
>>> cfg = Config.fromfile('tests/data/config/a.py')
>>> cfg.filename
"/home/kchen/projects/mmcv/tests/data/config/a.py"
>>> cfg.item4
'test'
>>> cfg
"Config [path: /home/kchen/projects/mmcv/tests/data/config/a.py]: "
"{'item1': [1, 2], 'item2': {'a': 0}, 'item3': True, 'item4': 'test'}"
easycv.utils.config_tools.check_base_cfg_path(base_cfg_name='configs/base.py', father_cfg_name=None, easycv_root=None)[source]

Concatenate paths by parsing path rules. For example (pseudo-code):

  1. If ‘configs’ or ‘benchmarks’ is in base_cfg_name: base_cfg_name = easycv_root + base_cfg_name

  2. If neither ‘configs’ nor ‘benchmarks’ is in base_cfg_name: base_cfg_name = father_cfg_name + base_cfg_name

easycv.utils.config_tools.mmcv_file2dict_raw(filename, first_order_params=None)[source]
easycv.utils.config_tools.mmcv_file2dict_base(ori_filename, first_order_params=None, easycv_root=None)[source]
easycv.utils.config_tools.grouping_params(user_config_params)[source]
easycv.utils.config_tools.adapt_pai_params(cfg_dict)[source]
Parameters

cfg_dict (dict) – All parameters of cfg.

Returns

The cfg with export and oss configs added.

Return type

cfg_dict (dict)

easycv.utils.config_tools.init_path(ori_filename)[source]
easycv.utils.config_tools.mmcv_config_fromfile(ori_filename)[source]
easycv.utils.config_tools.pai_config_fromfile(ori_filename, user_config_params=None, model_type=None)[source]
easycv.utils.config_tools.get_config_class_value(cfg_dict, ori_key, dict_mem_helper)[source]
easycv.utils.config_tools.config_dict_edit(ori_cfg_dict, cfg_dict, reg, dict_mem_helper)[source]

Edit ${configs.variables} in the config dict to resolve dependencies within the config.

Parameters
  • ori_cfg_dict – used to find the true value of ${configs.variables}.

  • cfg_dict – dict whose leaves are searched recursively.

  • reg – regular expression pattern used to find all ${configs.variables} in the leaves of the dict.

  • dict_mem_helper – stores the true values of ${configs.variables} that have already been found.

easycv.utils.config_tools.rebuild_config(cfg, user_config_params)[source]

Rebuild the config with user config params: modify the config by the user config params and replace ${configs.variables} with their true values.

Returns: Config

easycv.utils.config_tools.validate_export_config(cfg)[source]
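
A brief sketch of loading a config with these helpers (the config path and the accessed field are illustrative):

from easycv.utils.config_tools import mmcv_config_fromfile

cfg = mmcv_config_fromfile('configs/classification/cifar10/r50.py')
total_epochs = cfg.total_epochs  # config values are accessible as attributes
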

easycv.utils.constant module

easycv.utils.dist_utils module

easycv.utils.dist_utils.is_master()[source]
easycv.utils.dist_utils.local_rank()[source]
easycv.utils.dist_utils.dist_zero_exec(rank=0)[source]
easycv.utils.dist_utils.get_num_gpu_per_node()[source]

Get the number of GPUs per node.

easycv.utils.dist_utils.barrier()[source]
easycv.utils.dist_utils.is_parallel(model)[source]
easycv.utils.dist_utils.obj2tensor(pyobj, device='cuda')[source]

Serialize picklable python object to tensor.

easycv.utils.dist_utils.tensor2obj(tensor)[source]

Deserialize tensor to picklable python object.

easycv.utils.dist_utils.all_reduce_dict(py_dict, op='sum', group=None, to_float=True)[source]

Apply all reduce function for python dict object.

The code is modified from https://github.com/Megvii-BaseDetection/YOLOX/blob/main/yolox/utils/allreduce_norm.py.

NOTE: make sure that py_dict in different ranks has the same keys and the values should be in the same shape.

Parameters
  • py_dict (dict) – Dict to be applied all reduce op.

  • op (str) – Operator, could be ‘sum’ or ‘mean’. Default: ‘sum’

  • group (torch.distributed.group, optional) – Distributed group, Default: None.

  • to_float (bool) – Whether to convert all values of dict to float. Default: True.

Returns

reduced python dict object.

Return type

OrderedDict
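
Sketch of typical usage inside a distributed loop (it assumes torch.distributed has already been initialized and each rank holds its own values):

import torch
from easycv.utils.dist_utils import all_reduce_dict

stats = {'loss': torch.tensor(0.5).cuda(), 'acc': torch.tensor(0.9).cuda()}
# Average the values across ranks; the dict must have identical keys on every rank.
reduced = all_reduce_dict(stats, op='mean')
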

easycv.utils.dist_utils.get_device()[source]

Returns an available device: cpu or cuda.

easycv.utils.dist_utils.sync_random_seed(seed=None, device='cuda')[source]

Make sure different ranks share the same seed. All workers must call this function, otherwise it will deadlock. This method is generally used in DistributedSampler, because the seed should be identical across all processes in the distributed group. In distributed sampling, different ranks should sample non-overlapped data in the dataset. Therefore, this function is used to make sure that each rank shuffles the data indices in the same order based on the same seed. Then different ranks could use different indices to select non-overlapped data from the same data list.

Parameters
  • seed (int, optional) – The seed. Default to None.

  • device (str) – The device where the seed will be put on. Default to ‘cuda’.

Returns

Seed to be used.

Return type

int

easycv.utils.dist_utils.is_dist_available()[source]

easycv.utils.eval_utils module

easycv.utils.eval_utils.generate_best_metric_name(evaluate_type, dataset_name, metric_names)[source]

Generate the best metric name for different evaluators / datasets / metric names.

Parameters
  • evaluate_type (str)

  • dataset_name (None or str)

  • metric_names (None, str, list[str] or tuple[str])

Returns

list[str]

easycv.utils.flops_counter module

easycv.utils.flops_counter.get_model_info(model, input_size, model_config, logger)[source]

get_model_info: check model parameters and GFLOPs.

easycv.utils.flops_counter.get_model_complexity_info(model, input_res, print_per_layer_stat=True, as_strings=True, input_constructor=None, ost=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]
easycv.utils.flops_counter.flops_to_string(flops, units='GMac', precision=2)[source]
easycv.utils.flops_counter.params_to_string(params_num)[source]

Convert a parameter count to a human-readable string.

Parameters

params_num (float) – number of parameters

Returns

str – the formatted number

>>> params_to_string(1e9)
'1000.0 M'
>>> params_to_string(2e5)
'200.0 k'
>>> params_to_string(3e-9)
'3e-09'
easycv.utils.flops_counter.print_model_with_flops(model, units='GMac', precision=3, ost=<_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>)[source]
easycv.utils.flops_counter.get_model_parameters_number(model)[source]
easycv.utils.flops_counter.add_flops_counting_methods(net_main_module)[source]
easycv.utils.flops_counter.compute_average_flops_cost(self)[source]

A method that will be available after add_flops_counting_methods() is called on a desired net object. Returns current mean flops consumption per image.

easycv.utils.flops_counter.start_flops_count(self)[source]

A method that will be available after add_flops_counting_methods() is called on a desired net object. Activates the computation of mean flops consumption per image. Call it before you run the network.

easycv.utils.flops_counter.stop_flops_count(self)[source]

A method that will be available after add_flops_counting_methods() is called on a desired net object. Stops computing the mean flops consumption per image. Call whenever you want to pause the computation.

easycv.utils.flops_counter.reset_flops_count(self)[source]

A method that will be available after add_flops_counting_methods() is called on a desired net object. Resets statistics computed so far.

easycv.utils.flops_counter.add_flops_mask(module, mask)[source]
easycv.utils.flops_counter.remove_flops_mask(module)[source]
easycv.utils.flops_counter.is_supported_instance(module)[source]
easycv.utils.flops_counter.empty_flops_counter_hook(module, input, output)[source]
easycv.utils.flops_counter.upsample_flops_counter_hook(module, input, output)[source]
easycv.utils.flops_counter.relu_flops_counter_hook(module, input, output)[source]
easycv.utils.flops_counter.linear_flops_counter_hook(module, input, output)[source]
easycv.utils.flops_counter.pool_flops_counter_hook(module, input, output)[source]
easycv.utils.flops_counter.bn_flops_counter_hook(module, input, output)[source]
easycv.utils.flops_counter.gn_flops_counter_hook(module, input, output)[source]
easycv.utils.flops_counter.deconv_flops_counter_hook(conv_module, input, output)[source]
easycv.utils.flops_counter.conv_flops_counter_hook(conv_module, input, output)[source]
easycv.utils.flops_counter.batch_counter_hook(module, input, output)[source]
easycv.utils.flops_counter.add_batch_counter_variables_or_reset(module)[source]
easycv.utils.flops_counter.add_batch_counter_hook_function(module)[source]
easycv.utils.flops_counter.remove_batch_counter_hook_function(module)[source]
easycv.utils.flops_counter.add_flops_counter_variable_or_reset(module)[source]
easycv.utils.flops_counter.add_flops_counter_hook_function(module)[source]
easycv.utils.flops_counter.remove_flops_counter_hook_function(module)[source]
easycv.utils.flops_counter.add_flops_mask_variable_or_reset(module)[source]

easycv.utils.gather module

easycv.utils.gather.gather_tensors(input_array)[source]
easycv.utils.gather.gather_tensors_batch(input_array, part_size=100, ret_rank=- 1)[source]

easycv.utils.json_utils module

Utilities for dealing with writing json strings.

json_utils wraps json.dump and json.dumps so that they can be used to safely control the precision of floats when writing to json strings or files.

class easycv.utils.json_utils.MyEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Bases: json.encoder.JSONEncoder

default(o)[source]

Implement this method in a subclass such that it returns a serializable object for o, or calls the base implementation (to raise a TypeError).

For example, to support arbitrary iterators, you could implement default like this:

def default(self, o):
    try:
        iterable = iter(o)
    except TypeError:
        pass
    else:
        return list(iterable)
    # Let the base class default method raise the TypeError
    return JSONEncoder.default(self, o)
iterencode(o, _one_shot=False)[source]

Encode the given object and yield each string representation as available.

For example:

for chunk in JSONEncoder().iterencode(bigobject):
    mysocket.write(chunk)
easycv.utils.json_utils.dump(obj, fid, float_digits=- 1, **params)[source]

Wrapper of json.dump that allows specifying the float precision used.

Parameters
  • obj – The object to dump.

  • fid – The file id to write to.

  • float_digits – The number of digits of precision when writing floats out.

  • **params – Additional parameters to pass to json.dumps.

easycv.utils.json_utils.dumps(obj, float_digits=- 1, **params)[source]

Wrapper of json.dumps that allows specifying the float precision used.

Parameters
  • obj – The object to dump.

  • float_digits – The number of digits of precision when writing floats out.

  • **params – Additional parameters to pass to json.dumps.

Returns

JSON string representation of obj.

Return type

output
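
For example, limiting float precision when serializing (the expected output is indicative):

from easycv.utils import json_utils

s = json_utils.dumps({'score': 0.123456789}, float_digits=3)
# expect something like '{"score": 0.123}'
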

easycv.utils.json_utils.compat_dumps(data, float_digits=- 1)[source]

Handle json dumps of Chinese text and numpy data.

Parameters
  • data – python data structure.

  • float_digits – The number of digits of precision when writing floats out.

Returns

json str; in python2 the str is encoded with utf8, in python3 the str is unicode (python3 str).

easycv.utils.json_utils.PrettyParams(**params)[source]

Returns parameters for use with Dump and Dumps to output pretty json.

Example usage:

json_str = json_utils.Dumps(obj, **json_utils.PrettyParams())
json_str = json_utils.Dumps(obj, **json_utils.PrettyParams(allow_nans=False))

Parameters

**params – Additional params to pass to json.dump or json.dumps.

Returns

Parameters that are compatible with json_utils.Dump and

json_utils.Dumps.

Return type

params

easycv.utils.logger module

easycv.utils.logger.get_root_logger(log_file=None, log_level=20)[source]

Get the root logger.

The logger will be initialized if it has not been initialized. By default a StreamHandler will be added. If log_file is specified, a FileHandler will also be added. The name of the root logger is the top-level package name, e.g., “easycv”.

Parameters
  • log_file (str | None) – The log filename. If specified, a FileHandler will be added to the root logger.

  • log_level (int) – The root logger level. Note that only the process of rank 0 is affected, while other processes will set the level to “Error” and be silent most of the time.

Returns

The root logger.

Return type

logging.Logger

easycv.utils.logger.print_log(msg, logger=None, level=20)[source]

Print a log message.

Parameters
  • msg (str) – The message to be logged.

  • logger (logging.Logger | str | None) – The logger to be used. Some special loggers are: - “root”: the root logger obtained with get_root_logger(). - “silent”: no message will be printed. - None: The print() method will be used to print log messages.

  • level (int) – Logging level. Only available when logger is a Logger object or “root”.
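
A brief sketch tying the two logging helpers together (the log file path is illustrative):

import logging
from easycv.utils.logger import get_root_logger, print_log

logger = get_root_logger(log_file='work_dir/train.log', log_level=logging.INFO)
print_log('start training', logger=logger)
print_log('falls back to the builtin print', logger=None)
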

easycv.utils.metric_distance module

easycv.utils.metric_distance.LpDistance(query_emb, ref_emb, p=2)[source]
Input:

  query_emb: [n, dims] tensor
  ref_emb: [m, dims] tensor
  p: order of the norm

Output:

  distance_matrix: [n, m] tensor

  distance_matrix[i, j] = (sum_k |a_i_k - b_j_k|**p)**(1/p)

easycv.utils.metric_distance.DotproductSimilarity(query_emb, ref_emb)[source]
easycv.utils.metric_distance.CosineSimilarity(query_emb, ref_emb)[source]
Input:

query_emb: [n, dims] tensor
ref_emb: [m, dims] tensor

Output:

distance_matrix: [n, m] tensor
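
Illustrative shapes for the three helpers above (the embeddings are random):

import torch
from easycv.utils.metric_distance import (CosineSimilarity,
                                          DotproductSimilarity, LpDistance)

query = torch.randn(8, 128)   # n = 8
ref = torch.randn(16, 128)    # m = 16
d_lp = LpDistance(query, ref, p=2)        # [8, 16] pairwise L2 distances
d_dot = DotproductSimilarity(query, ref)  # [8, 16]
d_cos = CosineSimilarity(query, ref)      # [8, 16]
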

easycv.utils.misc module

easycv.utils.misc.tensor2imgs(tensor, mean=(0, 0, 0), std=(1, 1, 1), to_rgb=True)[source]
easycv.utils.misc.unmap(data, count, inds, fill=0)[source]

Unmap a subset of items (data) back to the original set of items (of size count).

easycv.utils.misc.add_prefix(inputs, prefix)[source]

Add prefix for dict key.

Parameters
  • inputs (dict) – The input dict with str keys.

  • prefix (str) – The prefix add to key name.

Returns

The dict with keys wrapped with prefix.

Return type

dict
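
For example (assuming the prefix is joined to each key with a dot, as in mmseg-style helpers):

from easycv.utils.misc import add_prefix

losses = {'loss_cls': 0.3, 'loss_bbox': 0.1}
prefixed = add_prefix(losses, 'decode')
# expect something like {'decode.loss_cls': 0.3, 'decode.loss_bbox': 0.1}
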

easycv.utils.misc.reparameterize_models(model)[source]
Reparameterize a model for inference, especially for:
  1. rep conv block: merge 3x3 weights and 1x1 weights

Calls each module’s switch_to_deploy recursively.

Parameters

model – nn.Module

easycv.utils.misc.deprecated(reason)[source]

This is a decorator which can be used to mark functions as deprecated. It will result in a warning being emitted when the function is used.
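
A quick sketch of marking a function as deprecated (the function names are illustrative):

from easycv.utils.misc import deprecated

@deprecated(reason='use new_resize() instead')
def old_resize(img):
    return img

old_resize(None)  # emits a deprecation warning mentioning the reason
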

easycv.utils.misc.encode_str_to_tensor(obj)[source]
easycv.utils.misc.decode_tensor_to_str(obj)[source]

easycv.utils.preprocess_function module

easycv.utils.preprocess_function.bninceptionPre(image, mean=[104, 117, 128], std=[1, 1, 1])[source]
Parameters
  • image – pytorch Image tensor from PIL (range 0~1), bgr format

  • mean – norm mean

  • std – norm val

Returns

An image normalized to 0~255, rgb format

easycv.utils.preprocess_function.randomErasing(image, probability=0.5, sl=0.02, sh=0.2, r1=0.3, mean=[0.4914, 0.4822, 0.4465])[source]
easycv.utils.preprocess_function.solarize(tensor, threshold=0.5, apply_prob=0.2)[source]

tensor : pytorch tensor

easycv.utils.preprocess_function.gaussianBlurDynamic(image, apply_prob=0.5)[source]
easycv.utils.preprocess_function.gaussianBlur(image, kernel_size=22, apply_prob=0.5)[source]
easycv.utils.preprocess_function.randomGrayScale(image, apply_prob=0.2)[source]
easycv.utils.preprocess_function.mixUp(image, alpha=0.2)[source]
easycv.utils.preprocess_function.mixUpCls(data, alpha=0.2)[source]

easycv.utils.profiling module

easycv.utils.profiling.profile_time(trace_name, name, enabled=True, stream=None, end_stream=None)[source]

Print time spent by CPU and GPU.

Useful as a temporary context manager to find sweet spots of code suitable for async implementation.
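
Based on the signature above, it is presumably used as a context manager, e.g. (a sketch; the trace and block names are illustrative):

import torch
from easycv.utils.profiling import profile_time

conv = torch.nn.Conv2d(3, 16, 3)
x = torch.randn(8, 3, 224, 224)
with profile_time('forward_trace', 'conv_forward'):
    y = conv(x)
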

easycv.utils.profiling.benchmark_torch_function(iters, f, *args)[source]
easycv.utils.profiling.time_synchronized()[source]

easycv.utils.py_util module

easycv.utils.py_util.copy_attr(a, b, include=(), exclude=())[source]
easycv.utils.py_util.get_parent_path(path: str)[source]

get parent path, support oss-style path

easycv.utils.registry module

class easycv.utils.registry.Registry(name)[source]

Bases: object

__init__(name)[source]

Initialize self. See help(type(self)) for accurate signature.

property name
property module_dict
get(key)[source]
register_module(cls=None, force=False)[source]
easycv.utils.registry.build_from_cfg(cfg, registry, default_args=None)[source]

Build a module from config dict.

Parameters
  • cfg (dict) – Config dict. It should at least contain the key “type”.

  • registry (Registry) – The registry to search the type from.

  • default_args (dict, optional) – Default initialization arguments.

Returns

The constructed object.

Return type

obj
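
A compact sketch of the register-then-build pattern these two provide (the MODELS registry and MyHead class are illustrative):

from easycv.utils.registry import Registry, build_from_cfg

MODELS = Registry('models')

@MODELS.register_module
class MyHead:
    def __init__(self, num_classes=10):
        self.num_classes = num_classes

head = build_from_cfg(dict(type='MyHead', num_classes=100), MODELS)
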

easycv.utils.test_util module

Contains functions which are convenient for unit testing.

easycv.utils.test_util.get_tmp_dir()[source]
easycv.utils.test_util.clear_all_tmp_dirs()[source]
easycv.utils.test_util.replace_data_for_test(cfg)[source]

replace real data with test data

Parameters

cfg – Config object

easycv.utils.test_util.RunAsSubprocess(f)[source]
easycv.utils.test_util.clean_up(test_dir)[source]
easycv.utils.test_util.run_in_subprocess(cmd)[source]
easycv.utils.test_util.dist_exec_wrapper(cmd, nproc_per_node, node_rank=0, nnodes=1, port='29527', addr='127.0.0.1', python_path=None)[source]

Do not forget to init dist in your function or script of cmd:

from mmcv.runner import init_dist
init_dist(launcher='pytorch')

easycv.utils.test_util.is_port_used(port, host='127.0.0.1')[source]
easycv.utils.test_util.get_random_port()[source]
easycv.utils.test_util.pseudo_dist_init()[source]
easycv.utils.test_util.computeStats(backend, timings, batch_size=1, model_name='default')[source]

Compute statistical metrics of time and speed.

easycv.utils.test_util.benchmark(predictor, input_data_list, backend='BACKEND', batch_size=1, model_name='default', num=200)[source]

evaluate the time and speed of different models

class easycv.utils.test_util.DistributedTestCase(methodName='runTest')[source]

Bases: unittest.case.TestCase

Distributed TestCase for testing functions in distributed mode.

Examples

import torch
from mmcv.runner import init_dist
from torch import distributed as dist

def _test_func(*args, **kwargs):
    init_dist(launcher='pytorch')
    rank = dist.get_rank()
    if rank == 0:
        value = torch.tensor(1.0).cuda()
    else:
        value = torch.tensor(2.0).cuda()
    dist.all_reduce(value)
    return value.cpu().numpy()

class DistTest(DistributedTestCase):
    def test_function_dist(self):
        args = ()    # args should be python builtin type
        kwargs = {}  # kwargs should be python builtin type
        self.start_with_torch(
            _test_func,
            num_gpus=2,
            assert_callback=lambda x: self.assertEqual(x, 3.0),
            *args,
            **kwargs,
        )

start_with_torch(func, num_gpus, assert_callback=None, save_all_ranks=False, *args, **kwargs)[source]
start_with_torchacc(func, num_gpus, assert_callback=None, save_all_ranks=False, *args, **kwargs)[source]
clean_tmp(tmp_file_list)[source]

easycv.utils.user_config_params_utils module

easycv.utils.user_config_params_utils.check_value_type(replacement, original)[source]

Convert replacement’s type to original’s type; supports converting str to int, float, list or tuple.

easycv package

Subpackages

easycv.file package

Submodules
easycv.file.base module
class easycv.file.base.IOBase[source]

Bases: object

static register(options)[source]
open(path: str, mode: str = 'r')[source]
exists(path: str) → bool[source]
move(src: str, dst: str)[source]
copy(src: str, dst: str)[source]
copytree(src: str, dst: str)[source]
makedirs(path: str, exist_ok=True)[source]
remove(path: str)[source]
rmtree(path: str)[source]
listdir(path: str, recursive=False, full_path=False, contains=None)[source]
isdir(path: str) → bool[source]
isfile(path: str) → bool[source]
abspath(path: str) → str[source]
last_modified(path: str) → datetime.datetime[source]
last_modified_str(path: str) → str[source]
size(path: str) → int[source]
md5(path: str) → str[source]
re_remote = re.compile('(oss|https?)://')
islocal(path: str) → bool[source]
is_writable(path)[source]
class easycv.file.base.IOLocal[source]

Bases: easycv.file.base.IOBase

open(path, mode='r')[source]
exists(path)[source]
move(src, dst)[source]
copy(src, dst)[source]
copytree(src, dst)[source]
makedirs(path, exist_ok=True)[source]
remove(path)[source]
rmtree(path)[source]
listdir(path, recursive=False, full_path=False, contains: Optional[Union[List[str], str]] = None)[source]
isdir(path)[source]
isfile(path)[source]
glob(path)[source]
abspath(path)[source]
last_modified(path)[source]
last_modified_str(path)[source]
size(path: str) → int[source]
easycv.file.file_io module
easycv.file.file_io.set_oss_env(ak_id: str, ak_secret: str, hosts: Union[str, List[str]], buckets: Union[str, List[str]])[source]
class easycv.file.file_io.IO(max_retry=10, retry_wait=0.1, max_retry_wait=30)[source]

Bases: easycv.file.base.IOLocal

IO module to support both local and oss io. To access oss files, you need to authorize OSS first; please refer to IO.access_oss.

__init__(max_retry=10, retry_wait=0.1, max_retry_wait=30)[source]

Initialize self. See help(type(self)) for accurate signature.

access_oss(ak_id: str = '', ak_secret: str = '', hosts: Union[str, List[str]] = '', buckets: Union[str, List[str]] = '')[source]

If you access oss files, you need to authorize OSS as follows:

Method 1:

from easycv.file import io
io.access_oss(
    ak_id='your_accesskey_id',
    ak_secret='your_accesskey_secret',
    hosts='your endpoint' or ['your endpoint1', 'your endpoint2'],
    buckets='your bucket' or ['your bucket1', 'your bucket2'])

Method 2:

Add an oss config to your local file ~/.ossutilconfig, as follows. For more oss config information, please refer to: https://help.aliyun.com/document_detail/120072.html

[Credentials]
language = CH
endpoint = your endpoint
accessKeyID = your_accesskey_id
accessKeySecret = your_accesskey_secret

[Bucket-Endpoint]
bucket1 = endpoint1
bucket2 = endpoint2

Then run the following command; the config file will be read by default to authorize oss:

from easycv.file import io
io.access_oss()

open(full_path, mode='r')[source]

Same usage as the python build-in open. Support local path and oss path.

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss

# Write something to a oss file.
with io.open('oss://bucket_name/demo.txt', 'w') as f:
    f.write('test')

# Read from a oss file.
with io.open('oss://bucket_name/demo.txt', 'r') as f:
    print(f.read())

Parameters

full_path – absolute oss path

exists(path)[source]

Whether the file exists, same usage as os.path.exists. Support local path and oss path.

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
ret = io.exists('oss://bucket_name/dir')
print(ret)

Parameters

path – oss path or local path

move(src, dst)[source]

Move src to dst, same usage as shutil.move. Support local path and oss path.

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
# move oss file to local
io.move('oss://bucket_name/file.txt', '/your/local/path/file.txt')
# move oss file to oss
io.move('oss://bucket_name/file.txt', 'oss://bucket_name/file.txt')
# move local file to oss
io.move('/your/local/file.txt', 'oss://bucket_name/file.txt')
# move directory
io.move('oss://bucket_name/dir1', 'oss://bucket_name/dir2')

Parameters
  • src – oss path or local path

  • dst – oss path or local path

safe_copy(src, dst, try_max=3)[source]

OSS operations occasionally fail, so we need safe_copy, which retries the copy up to try_max times.

copy(src, dst)[source]

Copy a file from src to dst. Same usage as shutil.copyfile. If you want to copy a directory, please use easycv.io.copytree

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss

# Copy a file from local to oss:
io.copy('/your/local/file.txt', 'oss://bucket/dir/file.txt')

# Copy a oss file to local:
io.copy('oss://bucket/dir/file.txt', '/your/local/file.txt')

# Copy a file from oss to oss:
io.copy('oss://bucket/dir/file.txt', 'oss://bucket/dir/file2.txt')

Parameters
  • src – oss path or local path

  • dst – oss path or local path

copytree(src, dst)[source]

Copy files recursively from src to dst. Same usage as shutil.copytree. If you want to copy a file, please use easycv.io.copy.

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
# copy files from local to oss
io.copytree(src='/your/local/dir1', dst='oss://bucket_name/dir2')
# copy files from oss to local
io.copytree(src='oss://bucket_name/dir2', dst='/your/local/dir1')
# copy files from oss to oss
io.copytree(src='oss://bucket_name/dir1', dst='oss://bucket_name/dir2')

Parameters
  • src – oss path or local path

  • dst – oss path or local path

listdir(path, recursive=False, full_path=False, contains: Optional[Union[List[str], str]] = None)[source]

List all objects in path. Same usage as os.listdir.

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
ret = io.listdir('oss://bucket/dir', recursive=True)
print(ret)

Parameters
  • path – local file path or oss path.

  • recursive – If False, only list the top level objects. If True, recursively list all objects.

  • full_path – if full path, return files with path prefix.

  • contains – substr to filter list files.

return: A list of paths.

remove(path)[source]

Remove a file or a directory recursively. Same usage as os.remove or shutil.rmtree.

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
# Remove a oss file
io.remove('oss://bucket_name/file.txt')

# Remove a oss directory
io.remove('oss://bucket_name/dir/')

Parameters

path – local or oss path, file or directory

rmtree(path)[source]

Remove directory recursively, same usage as shutil.rmtree

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
io.remove('oss://bucket_name/dir_name')
# Or
io.remove('oss://bucket_name/dir_name/')

Parameters

path – oss path

makedirs(path, exist_ok=True)[source]

Create directories recursively, same usage as os.makedirs

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
io.makedirs('oss://bucket/new_dir/')

Parameters

path – local or oss dir path

isdir(path)[source]

Return whether a path is directory, same usage as os.path.isdir

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
io.isdir('oss://bucket/dir/')

Parameters

path – local or oss path

Return: bool, True or False.

isfile(path)[source]

Return whether a path is file object, same usage as os.path.isfile

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
io.isfile('oss://bucket/file.txt')

Parameters

path – local or oss path

Return: bool, True or False.

glob(file_path)[source]

Return a list of paths matching a pathname pattern.

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
io.glob('oss://bucket/dir/*.txt')

Parameters

path – local or oss file pattern

Return: list, a list of paths.

abspath(path)[source]
authorize(path)[source]
last_modified(path)[source]
last_modified_str(path)[source]
size(path: str) → int[source]

Get the size of file path, same usage as os.path.getsize

Example

from easycv.file import io
io.access_oss(your oss config)  # only oss file need, refer to IO.access_oss
size = io.size('oss://bucket/file.txt')
print(size)

Parameters

path – local or oss path.

Return: size of file in bytes

class easycv.file.file_io.OSSFile(bucket, path, position=0)[source]

Bases: object

__init__(bucket, path, position=0)[source]

Initialize self. See help(type(self)) for accurate signature.

write(content)[source]
flush(retry=0)[source]
close()[source]
seek(position)[source]
class easycv.file.file_io.BinaryOSSFile(bucket, path)[source]

Bases: object

__init__(bucket, path)[source]

Initialize self. See help(type(self)) for accurate signature.

class easycv.file.file_io.NullContextWrapper(obj)[source]

Bases: object

__init__(obj)[source]

Initialize self. See help(type(self)) for accurate signature.

easycv.file.utils module
easycv.file.utils.create_namedtuple(**kwargs)[source]
easycv.file.utils.is_oss_path(s)[source]
easycv.file.utils.is_url_path(s)[source]
easycv.file.utils.url_path_exists(url)[source]
easycv.file.utils.get_oss_config()[source]

Get oss config file from env OSS_CONFIG_FILE, default file is ~/.ossutilconfig.

easycv.file.utils.oss_progress(desc)[source]
easycv.file.utils.ignore_oss_error(msg='')[source]
easycv.file.utils.mute_stderr()[source]

easycv.runner package

Submodules
easycv.runner.ev_runner module
class easycv.runner.ev_runner.EVRunner(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, fp16_enable=False)[source]

Bases: mmcv.runner.epoch_based_runner.EpochBasedRunner

__init__(model, batch_processor=None, optimizer=None, work_dir=None, logger=None, meta=None, fp16_enable=False)[source]

Epoch Runner for easycv, adding support for oss IO and file sync.

Parameters
  • model (torch.nn.Module) – The model to be run.

  • batch_processor (callable) – A callable method that process a data batch. The interface of this method should be batch_processor(model, data, train_mode) -> dict

  • optimizer (dict or torch.optim.Optimizer) – It can be either an optimizer (in most cases) or a dict of optimizers (in models that requires more than one optimizer, e.g., GAN).

  • work_dir (str, optional) – The working directory to save checkpoints and logs. Defaults to None.

  • logger (logging.Logger) – Logger used during training. Defaults to None. (The default value is just for backward compatibility)

  • meta (dict | None) – A dict records some import information such as environment info and seed, which will be logged in logger hook. Defaults to None.

  • fp16_enable (bool) – if use fp16

run_iter(data_batch, train_mode, **kwargs)[source]

process for each iteration.

Parameters
  • data_batch – Batch of dict of data.

  • train_mode (bool) – If set True, run a training step, else a validation step.

train(data_loader, **kwargs)[source]

Training process for one epoch which will iterate through all training data and call hooks at different stages.

Parameters

data_loader – data loader object for training

val(data_loader, **kwargs)[source]

Validation step. Deprecated; use the evaluation hook instead.

save_checkpoint(out_dir, filename_tmpl='epoch_{}.pth', save_optimizer=True, meta=None, create_symlink=True)[source]

Save checkpoint to file.

Parameters
  • out_dir – Directory where checkpoint files are to be saved.

  • filename_tmpl (str, optional) – Checkpoint filename pattern.

  • save_optimizer (bool, optional) – save optimizer state.

  • meta (dict, optional) – Metadata to be saved in checkpoint.

current_lr()[source]

Get current learning rates.

Returns

Current learning rates of all

param groups. If the runner has a dict of optimizers, this method will return a dict.

Return type

list[float] | dict[str, list[float]]

load_checkpoint(filename, map_location=device(type='cpu'), strict=False, logger=None)[source]

Load checkpoint from a file or URL.

Parameters
  • filename (str) – Accept local filepath, URL, torchvision://xxx, open-mmlab://xxx, oss://xxx. Please refer to docs/source/model_zoo.md for details.

  • map_location (str) – Same as torch.load().

  • strict (bool) – Whether to allow different params for the model and checkpoint.

  • logger (logging.Logger or None) – The logger for error message.

Returns

The loaded checkpoint.

Return type

dict or OrderedDict

resume(checkpoint, resume_optimizer=True, map_location='default')[source]

Resume state dict from checkpoint.

Parameters
  • checkpoint – Checkpoint path

  • resume_optimizer – Whether to resume optimizer state

  • map_location (str) – Same as torch.load().

easycv.toolkit package

Subpackages
easycv.toolkit.prune package
Submodules
easycv.toolkit.prune.prune_utils module
easycv.toolkit.quantize package
Submodules
easycv.toolkit.quantize.quantize_utils module

Submodules

easycv.version module

easycv

Indices and tables