NMS

在传统非端到端的目标检测算法中,以Faster RCNN为例,由于一对多的标签匹配机制,一个GT可能会被分配给多个正样本。因此预测阶段,可能会有多个重叠框,因此需要非极大值抑制(Non Maximum Suppression, NMS)算法来消除冗余框。

算法步骤

  1. 按照分类概率排序,概率最高的框作为候选框;
  2. 其他所有与该候选框IoU高于一个阈值(人为设定的超参),这些框的概率为设为0;
  3. 然后在剩余的框里寻找概率最大的框,其他所有与这个框IoU的概率置为0;
  4. 重复上述步骤,直到所有框的IoU之间小于阈值,或者概率被置为0;
  5. 剩下的所有概率非0的框就是最终的候选框。

代码实现

github

# --------------------------------------------------------
# Fast R-CNN
# Copyright (c) 2015 Microsoft
# Licensed under The MIT License [see LICENSE for details]
# Written by Ross Girshick
# --------------------------------------------------------

import numpy as np

def nms(dets, thresh):
    x1 = dets[:, 0]
    y1 = dets[:, 1]
    x2 = dets[:, 2]
    y2 = dets[:, 3]
    scores = dets[:, 4]

    # 计算每个候选框的面积
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)
    # 根据分类分数(置信度)进行排序
    order = scores.argsort()[::-1]

    keep = []
    while order.size > 0:
        # 当前置信度最大的框,加入返回列表中
        i = order[0]
        keep.append(i)
        # 计算当前置信度最大的候选框,与其他所有框的IoU
        xx1 = np.maximum(x1[i], x1[order[1:]])
        yy1 = np.maximum(y1[i], y1[order[1:]])
        xx2 = np.minimum(x2[i], x2[order[1:]])
        yy2 = np.minimum(y2[i], y2[order[1:]])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[order[1:]] - inter)

        # 保留IoU小于给定阈值的框
        inds = np.where(ovr <= thresh)[0]
        order = order[inds + 1]

    return keep

Soft-NMS

可以发现上述NMS实现中,当两个框的IoU超出阈值时,会直接将置信度低的框删除。这在单一物体的情况下没有问题,但是如果两个物体有重叠,则无法将两个物体有效检出,因此,Soft-NMS被提出。Soft-NMS的思想是,不直接删除重叠框,而是使用一个基于IoU的衰减函数,降低IoU大于阈值的重叠框的置信度,IoU越大,衰减程度越大。

NMS和Soft-NMS的区别

  • 经典的NMS的衰减函数:
    si={si,iou(M,bi)<Nt,0,iou(M,bi)Nt. s_{i}=\left\{ \begin{aligned} s_i & , & iou(M,b_i) < N_t, \\ 0 & , & iou(M,b_i) \geq N_t. \end{aligned} \right. 这里iou(M,bi)iou(M,b_i)表示该候选框与置信度最大(当前被选择的)的候选的IoU。NtN_t为设定的NMS阈值。

  • Soft-NMS-线性衰减
    si={si,iou(M,bi)<Nt,si(1iou(M,bi)),iou(M,bi)Nt. s_{i}=\left\{ \begin{aligned} s_i & , & iou(M,b_i) < N_t, \\ s_i(1 - iou(M,b_i)) & , & iou(M,b_i) \geq N_t. \end{aligned} \right.

  • Soft-NMS-高斯衰减
    si=sieiou(M,bi)2σ,biD s_{i}=s_ie^{-\frac{iou(M,b_i)^2}{\sigma}},\forall b_i \notin D D是我们最终检测框集合。

代码实现

import numpy as np

def soft_nms(dets,sc, Nt=0.3, sigma=0.5, thresh=0.001, method=2):
    """
    py_cpu_softnms
    :param dets:   boexs 坐标矩阵 format [y1, x1, y2, x2]
    :param sc:     每个 boxes 对应的分数
    :param Nt:     iou 交叠门限
    :param sigma:  使用 gaussian 函数的方差
    :param thresh: 最后的分数门限
    :param method: 使用的方法
    :return:       留下的 boxes 的 index
    """

    # indexes concatenate boxes with the last column
    N = dets.shape[0]
    indexes = np.array([np.arange(N)])
    # 给box添加索引
    dets = np.concatenate((dets, indexes.T), axis=1)

    # 计算每个box的面积
    y1 = dets[:, 0]
    x1 = dets[:, 1]
    y2 = dets[:, 2]
    x2 = dets[:, 3]
    scores = sc
    areas = (x2 - x1 + 1) * (y2 - y1 + 1)

    for i in range(N):
        # intermediate parameters for later parameters exchange
        tBD = dets[i, :].copy()
        tscore = scores[i].copy()
        tarea = areas[i].copy()
        pos = i + 1

        # 找出最大score及其下标
        if i != N-1:
            maxscore = np.max(scores[pos:], axis=0)
            maxpos = np.argmax(scores[pos:], axis=0)
        else:
            maxscore = scores[-1]
            maxpos = 0
        # 如果当前i的得分小于后面的最大score,则与之交换,保证i上的score最大
        if tscore < maxscore:
            dets[i, :] = dets[maxpos + i + 1, :]
            dets[maxpos + i + 1, :] = tBD
            tBD = dets[i, :]

            scores[i] = scores[maxpos + i + 1]
            scores[maxpos + i + 1] = tscore
            tscore = scores[i]

            areas[i] = areas[maxpos + i + 1]
            areas[maxpos + i + 1] = tarea
            tarea = areas[i]

        # 计算IoU
        xx1 = np.maximum(dets[i, 1], dets[pos:, 1])
        yy1 = np.maximum(dets[i, 0], dets[pos:, 0])
        xx2 = np.minimum(dets[i, 3], dets[pos:, 3])
        yy2 = np.minimum(dets[i, 2], dets[pos:, 2])

        w = np.maximum(0.0, xx2 - xx1 + 1)
        h = np.maximum(0.0, yy2 - yy1 + 1)
        inter = w * h
        ovr = inter / (areas[i] + areas[pos:] - inter)

        # Three methods: 1.linear 2.gaussian 3.original NMS
        if method == 1:  # linear
            weight = np.ones(ovr.shape)
            weight[ovr > Nt] = weight[ovr > Nt] - ovr[ovr > Nt]
        elif method == 2:  # gaussian
            weight = np.exp(-(ovr * ovr) / sigma)
        else:  # original NMS
            weight = np.ones(ovr.shape)
            weight[ovr > Nt] = 0

        scores[pos:] = weight * scores[pos:]

    # 返回最终框的索引
    inds = dets[:, 4][scores > thresh]
    keep = inds.astype(int)

    return keep