easycv.models.detection.utils package¶
Submodules¶
easycv.models.detection.utils.boxes module¶
- easycv.models.detection.utils.boxes.bbox2result(bboxes, labels, num_classes)[source]¶
Convert detection results to a list of numpy arrays. :param bboxes: shape (n, 5) :type bboxes: torch.Tensor | np.ndarray :param labels: shape (n, ) :type labels: torch.Tensor | np.ndarray :param num_classes: class number, including background class :type num_classes: int
- Returns
bbox results of each class
- Return type
list(ndarray)
- easycv.models.detection.utils.boxes.generalized_box_iou(boxes1, boxes2)[source]¶
Generalized IoU from https://giou.stanford.edu/ The boxes should be in [x0, y0, x1, y1] format Returns a [N, M] pairwise matrix, where N = len(boxes1) and M = len(boxes2)
- easycv.models.detection.utils.boxes.bbox_overlaps(bboxes1, bboxes2, mode='iou', is_aligned=False, eps=1e-06)[source]¶
Calculate overlap between two set of bboxes.
FP16 Contributed by https://github.com/open-mmlab/mmdetection/pull/4889 .. note:
Assume bboxes1 is M x 4, bboxes2 is N x 4, when mode is 'iou', there are some new generated variable when calculating IOU using bbox_overlaps function: 1) is_aligned is False area1: M x 1 area2: N x 1 lt: M x N x 2 rb: M x N x 2 wh: M x N x 2 overlap: M x N x 1 union: M x N x 1 ious: M x N x 1 Total memory: S = (9 x N x M + N + M) * 4 Byte, When using FP16, we can reduce: R = (9 x N x M + N + M) * 4 / 2 Byte R large than (N + M) * 4 * 2 is always true when N and M >= 1. Obviously, N + M <= N * M < 3 * N * M, when N >=2 and M >=2, N + 1 < 3 * N, when N or M is 1. Given M = 40 (ground truth), N = 400000 (three anchor boxes in per grid, FPN, R-CNNs), R = 275 MB (one times) A special case (dense detection), M = 512 (ground truth), R = 3516 MB = 3.43 GB When the batch size is B, reduce: B x R Therefore, CUDA memory runs out frequently. Experiments on GeForce RTX 2080Ti (11019 MiB): | dtype | M | N | Use | Real | Ideal | |:----:|:----:|:----:|:----:|:----:|:----:| | FP32 | 512 | 400000 | 8020 MiB | -- | -- | | FP16 | 512 | 400000 | 4504 MiB | 3516 MiB | 3516 MiB | | FP32 | 40 | 400000 | 1540 MiB | -- | -- | | FP16 | 40 | 400000 | 1264 MiB | 276MiB | 275 MiB | 2) is_aligned is True area1: N x 1 area2: N x 1 lt: N x 2 rb: N x 2 wh: N x 2 overlap: N x 1 union: N x 1 ious: N x 1 Total memory: S = 11 x N * 4 Byte When using FP16, we can reduce: R = 11 x N * 4 / 2 Byte So do the 'giou' (large than 'iou'). Time-wise, FP16 is generally faster than FP32. When gpu_assign_thr is not -1, it takes more time on cpu but not reduce memory. There, we can reduce half the memory and keep the speed.
If
is_aligned
isFalse
, then calculate the overlaps between each bbox of bboxes1 and bboxes2, otherwise the overlaps between each aligned pair of bboxes1 and bboxes2.- Parameters
bboxes1 (Tensor) – shape (B, m, 4) in <x1, y1, x2, y2> format or empty.
bboxes2 (Tensor) – shape (B, n, 4) in <x1, y1, x2, y2> format or empty. B indicates the batch dim, in shape (B1, B2, …, Bn). If
is_aligned
isTrue
, then m and n must be equal.mode (str) – “iou” (intersection over union), “iof” (intersection over foreground) or “giou” (generalized intersection over union). Default “iou”.
is_aligned (bool, optional) – If True, then m and n must be equal. Default False.
eps (float, optional) – A value added to the denominator for numerical stability. Default 1e-6.
- Returns
shape (m, n) if
is_aligned
is False else shape (m,)- Return type
Tensor
Example
>>> bboxes1 = torch.FloatTensor([ >>> [0, 0, 10, 10], >>> [10, 10, 20, 20], >>> [32, 32, 38, 42], >>> ]) >>> bboxes2 = torch.FloatTensor([ >>> [0, 0, 10, 20], >>> [0, 10, 10, 19], >>> [10, 10, 20, 20], >>> ]) >>> overlaps = bbox_overlaps(bboxes1, bboxes2) >>> assert overlaps.shape == (3, 3) >>> overlaps = bbox_overlaps(bboxes1, bboxes2, is_aligned=True) >>> assert overlaps.shape == (3, )
Example
>>> empty = torch.empty(0, 4) >>> nonempty = torch.FloatTensor([[0, 0, 10, 9]]) >>> assert tuple(bbox_overlaps(empty, nonempty).shape) == (0, 1) >>> assert tuple(bbox_overlaps(nonempty, empty).shape) == (1, 0) >>> assert tuple(bbox_overlaps(empty, empty).shape) == (0, 0)
- easycv.models.detection.utils.boxes.bbox2distance(points, bbox, max_dis=None, eps=0.1)[source]¶
Decode bounding box based on distances.
- Parameters
points (Tensor) – Shape (n, 2), [x, y].
bbox (Tensor) – Shape (n, 4), “xyxy” format
max_dis (float) – Upper bound of the distance.
eps (float) – a small value to ensure target < max_dis, instead <=
- Returns
Decoded distances.
- Return type
Tensor
- easycv.models.detection.utils.boxes.distance2bbox(points, distance, max_shape=None)[source]¶
Decode distance prediction to bounding box.
- Parameters
points (Tensor) – Shape (B, N, 2) or (N, 2).
distance (Tensor) – Distance from the given point to 4 boundaries (left, top, right, bottom). Shape (B, N, 4) or (N, 4)
(Sequence[int] or torch.Tensor or Sequence[ (max_shape) – Sequence[int]],optional): Maximum bounds for boxes, specifies (H, W, C) or (H, W). If priors shape is (B, N, 4), then the max_shape should be a Sequence[Sequence[int]] and the length of max_shape should also be B.
- Returns
Boxes with shape (N, 4) or (B, N, 4)
- Return type
Tensor
- easycv.models.detection.utils.boxes.batched_nms(boxes, scores, idxs, nms_cfg, class_agnostic=False)[source]¶
Performs non-maximum suppression in a batched fashion.
Modified from torchvision/ops/boxes.py#L39. In order to perform NMS independently per class, we add an offset to all the boxes. The offset is dependent only on the class idx, and is large enough so that boxes from different classes do not overlap.
Note
In v1.4.1 and later,
batched_nms
supports skipping the NMS and returns sorted raw results when nms_cfg is None.- Parameters
boxes (torch.Tensor) – boxes in shape (N, 4).
scores (torch.Tensor) – scores in shape (N, ).
idxs (torch.Tensor) – each index value correspond to a bbox cluster, and NMS will not be applied between elements of different idxs, shape (N, ).
nms_cfg (dict | None) –
Supports skipping the nms when nms_cfg is None, otherwise it should specify nms type and other parameters like iou_thr. Possible keys includes the following.
iou_thr (float): IoU threshold used for NMS.
split_thr (float): threshold number of boxes. In some cases the number of boxes is large (e.g., 200k). To avoid OOM during training, the users could set split_thr to a small value. If the number of boxes is greater than the threshold, it will perform NMS on each group of boxes separately and sequentially. Defaults to 10000.
class_agnostic (bool) – if true, nms is class agnostic, i.e. IoU thresholding happens over all boxes, regardless of the predicted class.
- Returns
kept dets and indice.
boxes (Tensor): Bboxes with score after nms, has shape (num_bboxes, 5). last dimension 5 arrange as (x1, y1, x2, y2, score)
keep (Tensor): The indices of remaining boxes in input boxes.
- Return type
tuple