easycv.models.detection.utils package

Submodules

easycv.models.detection.utils.boxes module

easycv.models.detection.utils.boxes.bboxes_iou(bboxes_a, bboxes_b, xyxy=True)[source]
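This function carries no docstring. A hedged usage sketch, assuming it returns the pairwise IoU matrix between the two box sets and that xyxy=True means corner format (x1, y1, x2, y2) while xyxy=False means center format (cx, cy, w, h):

>>> import torch
>>> a = torch.FloatTensor([[0, 0, 10, 10]])
>>> b = torch.FloatTensor([[0, 0, 10, 10], [5, 5, 15, 15]])
>>> ious = bboxes_iou(a, b, xyxy=True)  # expected shape: (1, 2)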
easycv.models.detection.utils.boxes.bbox2result(bboxes, labels, num_classes)[source]

Convert detection results to a list of numpy arrays.

Parameters
  • bboxes (torch.Tensor | np.ndarray) – shape (n, 5)

  • labels (torch.Tensor | np.ndarray) – shape (n, )

  • num_classes (int) – class number, including background class

Returns

bbox results of each class

Return type

list(ndarray)
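A hedged usage sketch: per-image detections given as an (n, 5) array of (x1, y1, x2, y2, score) rows plus an (n,) label array are split into one array per class (how the background class affects the length of the output list is implementation-defined and not assumed here):

>>> import numpy as np
>>> bboxes = np.array([[0., 0., 10., 10., 0.9],
>>>                    [5., 5., 20., 20., 0.8]])
>>> labels = np.array([0, 1])
>>> results = bbox2result(bboxes, labels, num_classes=3)
>>> # results: list of per-class (k_i, 5) ndarrays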

easycv.models.detection.utils.boxes.box_cxcywh_to_xyxy(x)[source]
easycv.models.detection.utils.boxes.box_xyxy_to_cxcywh(x)[source]
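Neither converter has a docstring. A sketch of the standard conversion they presumably implement, where (cx, cy, w, h) is the box center plus width and height:

>>> import torch
>>> cxcywh = torch.FloatTensor([[5., 5., 10., 10.]])
>>> xyxy = box_cxcywh_to_xyxy(cxcywh)
>>> # (cx - w/2, cy - h/2, cx + w/2, cy + h/2) -> [[0., 0., 10., 10.]]
>>> assert torch.allclose(box_xyxy_to_cxcywh(xyxy), cxcywh)  # round trip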
easycv.models.detection.utils.boxes.box_iou(boxes1, boxes2)[source]
easycv.models.detection.utils.boxes.generalized_box_iou(boxes1, boxes2)[source]

Generalized IoU from https://giou.stanford.edu/. The boxes should be in [x0, y0, x1, y1] format. Returns an [N, M] pairwise matrix, where N = len(boxes1) and M = len(boxes2).
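For reference, GIoU augments IoU with a penalty based on the smallest enclosing box C of the two boxes: GIoU = IoU - |C \ (A U B)| / |C|, so disjoint boxes get values in [-1, 0). A worked single-pair example (the function itself computes this pairwise over two box sets):

>>> import torch
>>> a = torch.FloatTensor([[0., 0., 10., 10.]])
>>> b = torch.FloatTensor([[20., 0., 30., 10.]])
>>> # IoU = 0, union area = 200, enclosing box [0, 0, 30, 10] has area 300
>>> # GIoU = 0 - (300 - 200) / 300 = -1/3
>>> g = generalized_box_iou(a, b)  # expected: tensor([[-0.3333]])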

easycv.models.detection.utils.boxes.bbox_overlaps(bboxes1, bboxes2, mode='iou', is_aligned=False, eps=1e-06)[source]

Calculate overlap between two set of bboxes.

FP16 support contributed by https://github.com/open-mmlab/mmdetection/pull/4889.

Note

Assume bboxes1 is M x 4 and bboxes2 is N x 4. When mode is 'iou',
the following intermediate variables are created when computing IoU
with the bbox_overlaps function:

1) is_aligned is False
    area1: M x 1
    area2: N x 1
    lt: M x N x 2
    rb: M x N x 2
    wh: M x N x 2
    overlap: M x N x 1
    union: M x N x 1
    ious: M x N x 1

    Total memory:
        S = (9 x N x M + N + M) x 4 bytes

    When using FP16, we can reduce:
        R = (9 x N x M + N + M) x 4 / 2 bytes
        R > (N + M) x 4 x 2 always holds when N >= 1 and M >= 1,
        since N + M <= N x M < 3 x N x M when N >= 2 and M >= 2,
        and N + 1 < 3 x N when N or M is 1.

    Given M = 40 (ground truths) and N = 400000 (three anchor boxes
    per grid cell, FPN, R-CNNs):
        R = 275 MiB (per single pass; see the worked check after this note)

    A special case (dense detection), M = 512 (ground truths):
        R = 3516 MiB = 3.43 GiB

    When the batch size is B, the saving scales to:
        B x R

    Without FP16, therefore, CUDA memory frequently runs out.

    Experiments on GeForce RTX 2080Ti (11019 MiB):

    | dtype |  M  |   N    |   Use    |   Real   |  Ideal   |
    |:-----:|:---:|:------:|:--------:|:--------:|:--------:|
    | FP32  | 512 | 400000 | 8020 MiB |    --    |    --    |
    | FP16  | 512 | 400000 | 4504 MiB | 3516 MiB | 3516 MiB |
    | FP32  |  40 | 400000 | 1540 MiB |    --    |    --    |
    | FP16  |  40 | 400000 | 1264 MiB |  276 MiB |  275 MiB |

2) is_aligned is True
    area1: N x 1
    area2: N x 1
    lt: N x 2
    rb: N x 2
    wh: N x 2
    overlap: N x 1
    union: N x 1
    ious: N x 1

    Total memory:
        S = 11 x N x 4 bytes

    When using FP16, we can reduce:
        R = 11 x N x 4 / 2 bytes

The same analysis holds for 'giou' (which uses even more memory than 'iou').

Time-wise, FP16 is generally faster than FP32.

When gpu_assign_thr is not -1, assignment falls back to the CPU, which
takes more time but does not reduce memory usage. With FP16, by contrast,
we can halve the memory while keeping the speed.
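As a sanity check of the figures above (pure arithmetic, using the formula from this note):

>>> S = (9 * 40 * 400000 + 40 + 400000) * 4   # FP32 bytes, M=40, N=400000
>>> round(S / 2 / 1024 ** 2)                  # FP16 saving in MiB
275
>>> round((9 * 512 * 400000 + 512 + 400000) * 2 / 1024 ** 2)
3516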

If is_aligned is False, then calculate the overlaps between each bbox of bboxes1 and bboxes2, otherwise the overlaps between each aligned pair of bboxes1 and bboxes2.

Parameters
  • bboxes1 (Tensor) – shape (B, m, 4) in <x1, y1, x2, y2> format or empty.

  • bboxes2 (Tensor) – shape (B, n, 4) in <x1, y1, x2, y2> format or empty. B indicates the batch dim, in shape (B1, B2, …, Bn). If is_aligned is True, then m and n must be equal.

  • mode (str) – “iou” (intersection over union), “iof” (intersection over foreground) or “giou” (generalized intersection over union). Default “iou”.

  • is_aligned (bool, optional) – If True, then m and n must be equal. Default False.

  • eps (float, optional) – A value added to the denominator for numerical stability. Default 1e-6.

Returns

shape (m, n) if is_aligned is False else shape (m,)

Return type

Tensor

Example

>>> bboxes1 = torch.FloatTensor([
>>>     [0, 0, 10, 10],
>>>     [10, 10, 20, 20],
>>>     [32, 32, 38, 42],
>>> ])
>>> bboxes2 = torch.FloatTensor([
>>>     [0, 0, 10, 20],
>>>     [0, 10, 10, 19],
>>>     [10, 10, 20, 20],
>>> ])
>>> overlaps = bbox_overlaps(bboxes1, bboxes2)
>>> assert overlaps.shape == (3, 3)
>>> overlaps = bbox_overlaps(bboxes1, bboxes2, is_aligned=True)
>>> assert overlaps.shape == (3, )

Example

>>> empty = torch.empty(0, 4)
>>> nonempty = torch.FloatTensor([[0, 0, 10, 9]])
>>> assert tuple(bbox_overlaps(empty, nonempty).shape) == (0, 1)
>>> assert tuple(bbox_overlaps(nonempty, empty).shape) == (1, 0)
>>> assert tuple(bbox_overlaps(empty, empty).shape) == (0, 0)
easycv.models.detection.utils.boxes.bbox2distance(points, bbox, max_dis=None, eps=0.1)[source]

Encode bounding boxes as distances from the given points to the four box boundaries.

Parameters
  • points (Tensor) – Shape (n, 2), [x, y].

  • bbox (Tensor) – Shape (n, 4), “xyxy” format

  • max_dis (float) – Upper bound of the distance.

  • eps (float) – a small value ensuring target < max_dis rather than target <= max_dis.

Returns

Encoded distances, shape (n, 4).

Return type

Tensor
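A minimal sketch of the encoding, assuming the four distances are measured from each point to the left, top, right and bottom box edges:

>>> import torch
>>> points = torch.FloatTensor([[5., 5.]])
>>> bbox = torch.FloatTensor([[0., 0., 10., 10.]])
>>> d = bbox2distance(points, bbox)
>>> # expected (left, top, right, bottom) = (5., 5., 5., 5.)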

easycv.models.detection.utils.boxes.distance2bbox(points, distance, max_shape=None)[source]

Decode distance prediction to bounding box.

Parameters
  • points (Tensor) – Shape (B, N, 2) or (N, 2).

  • distance (Tensor) – Distance from the given point to 4 boundaries (left, top, right, bottom). Shape (B, N, 4) or (N, 4)

  • max_shape (Sequence[int] or torch.Tensor or Sequence[Sequence[int]], optional) – Maximum bounds for boxes, specifies (H, W, C) or (H, W). If priors shape is (B, N, 4), then max_shape should be a Sequence[Sequence[int]] and its length should be B.

Returns

Boxes with shape (N, 4) or (B, N, 4)

Return type

Tensor
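Since distance2bbox inverts the encoding above, a round trip through both functions should reproduce the original boxes when the points lie inside them (a hedged sketch; max_dis/max_shape clamping is left out):

>>> import torch
>>> points = torch.FloatTensor([[5., 5.], [12., 12.]])
>>> bboxes = torch.FloatTensor([[0., 0., 10., 10.], [10., 10., 20., 20.]])
>>> dists = bbox2distance(points, bboxes)
>>> assert torch.allclose(distance2bbox(points, dists), bboxes)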

easycv.models.detection.utils.boxes.batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False)[source]

Performs non-maximum suppression in a batched fashion.

Modified from torchvision/ops/boxes.py#L39. In order to perform NMS independently per class, we add an offset to all the boxes. The offset is dependent only on the class idx, and is large enough so that boxes from different classes do not overlap.

Note

In v1.4.1 and later, batched_nms supports skipping the NMS and returns sorted raw results when nms_cfg is None.

Parameters
  • boxes (torch.Tensor) – boxes in shape (N, 4).

  • scores (torch.Tensor) – scores in shape (N, ).

  • idxs (torch.Tensor) – each index value corresponds to a bbox cluster; NMS will not be applied between elements of different idxs, shape (N, ).

  • nms_cfg (dict | None) –

    Supports skipping NMS when nms_cfg is None; otherwise it should specify the NMS type and other parameters such as iou_thr. Possible keys include the following.

    • iou_thr (float): IoU threshold used for NMS.

    • split_thr (float): threshold number of boxes. In some cases the number of boxes is large (e.g., 200k). To avoid OOM during training, the users could set split_thr to a small value. If the number of boxes is greater than the threshold, it will perform NMS on each group of boxes separately and sequentially. Defaults to 10000.

  • class_agnostic (bool) – If True, NMS is class-agnostic, i.e. IoU thresholding happens over all boxes regardless of the predicted class.

Returns

Kept detections and indices.

  • boxes (Tensor): Bboxes with scores after NMS, with shape (num_bboxes, 5). The last dimension is arranged as (x1, y1, x2, y2, score).

  • keep (Tensor): The indices of remaining boxes in input boxes.

Return type

tuple
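A hedged usage sketch built from the keys documented above (key names follow this docstring; some NMS backends spell the threshold iou_threshold instead of iou_thr, so check the configured op):

>>> import torch
>>> boxes = torch.FloatTensor([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]])
>>> scores = torch.FloatTensor([0.9, 0.8, 0.7])
>>> idxs = torch.LongTensor([0, 0, 1])  # per-box class index
>>> nms_cfg = dict(type='nms', iou_thr=0.5)
>>> dets, keep = batched_nms(boxes, scores, idxs, nms_cfg)
>>> # dets: (num_kept, 5) as (x1, y1, x2, y2, score); keep: indices into boxes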