


Daedalus: Breaking Non-Maximum Suppression in Object Detection via Adversarial Examples

Derui Wang, Chaoran Li, Sheng Wen, Qing-Long Han, Fellow, IEEE, Surya Nepal, Xiangyu Zhang, and Yang Xiang, Fellow, IEEE

Abstract—This paper demonstrates that Non-Maximum Suppression (NMS), which is commonly used in Object Detection (OD) tasks to filter redundant detection results, is no longer secure. Considering that NMS has been an integral part of OD systems, thwarting the functionality of NMS can result in unexpected or even lethal consequences for such systems. In this paper, an adversarial example attack which triggers malfunctioning of NMS in end-to-end OD models is proposed. The attack, namely Daedalus, compresses the dimensions of detection boxes to evade NMS. As a result, the final detection output contains extremely dense false positives. This can be fatal for many OD applications such as autonomous vehicles and surveillance systems. The attack can be generalised to different end-to-end OD models, such that the attack cripples various OD applications. Furthermore, a way to craft robust adversarial examples is developed by using an ensemble of popular detection models as the substitutes. Considering the pervasive nature of model reusing in real-world OD scenarios, Daedalus examples crafted based on an ensemble of substitutes can launch attacks without knowing the parameters of the victim models. Experimental results demonstrate that the attack effectively stops NMS from filtering redundant bounding boxes. As the evaluation results suggest, Daedalus increases the false positive rate in detection results to 99.9% and reduces the mean average precision scores to 0, while maintaining a low cost of distortion on the original inputs. It is also demonstrated that the attack can be practically launched against real-world OD systems via printed posters.

Index Terms—Object detection, adversarial example, cybersecurity.

I. INTRODUCTION

OBJECT Detection (OD) is a fundamental operation in the field of computer vision and is repeatedly used in the areas of autonomous vehicles, robotics, surveillance systems, and biometric authentication. Convolutional Neural Networks (CNNs) are embedded in the core of many state-of-the-art OD models. These models can be classified into two broad categories: single-stage detection models (e.g., SSD [2], RetinaNet [3], and You-Only-Look-Once (YOLO) [4]) and two-stage detection models (e.g., R-CNN [5] and Fast R-CNN [6]). Single-stage models have become successful since they perform end-to-end detection, which has low computational

D. Wang, C. Li, S. Wen, Q.L. Han, and Y. Xiang are with Swinburne University of Technology, Hawthorn, VIC 3122, Australia. D. Wang and C. Li are also with Data61, CSIRO, Australia (E-mail: {deruiwang, chaoranli, swen, yxiang}@swin.edu.au).

S. Nepal is with Data61, CSIRO, Epping, NSW 1710, Australia (E-mail: [email protected]).

Xiangyu Zhang is with the Department of Computer Science, Purdue University, West Lafayette, IN 47907, United States (E-mail: [email protected]).

Fig. 1: A demo of the digital Daedalus attack on YOLO-v3. The upper-left image is an original image taken from the COCO dataset [1]. The OD results are shown in the upper-right image. An adversarial example as well as the OD results of the adversarial example are shown in the second row. The adversarial example significantly increases the number of final detection boxes after NMS and makes the detector non-viable. However, the perturbation on the adversarial example is still imperceptible.

overhead and is competitively accurate. Current end-to-end detection models are mostly built upon Fully Convolutional Network (FCN) backbones. However, FCNs output a large number of redundant detection results that should be filtered before the final results are produced.

To filter the redundant detection boxes, Non-Maximum Suppression (NMS) is performed as a key step in the OD procedure [7], [8]. NMS has been an integral component in many OD algorithms for around 50 years. It relies on non-differentiable, hard-coded rules to discard redundant boxes. Beyond the OD task, NMS and its variants are also being used in applications such as face recognition and keypoint detection [9], [10]. Some recent works tried to use a differentiable model to learn NMS [11]. Another work suggested using soft-NMS to improve the filtering performance [12]. Therefore, NMS is very important to OD systems; once the


NMS component in an OD system does not work properly, the system fails.

CNNs are known to be vulnerable to adversarial example attacks [13]. To a human eye, an adversarial image can be indistinguishable from the original one, but it can cause misclassification in a CNN-based classifier. Recently, adversarial examples have been crafted for OD and segmentation tasks [14]–[19]. Such attacks result in either the misclassification or the appearance/disappearance of the objects. On the other hand, the false positive rate is a critical metric in OD missions such as inventory management [20] and military object recognition [21]. An attack that boosts the false positive rate could be lethal to these OD systems.

In this paper, we propose a novel adversarial example attack on NMS, which can be universally applied to any end-to-end differentiable OD model. A demonstration of the attack is shown in Fig. 1. As can be seen, a properly crafted adversarial example can trigger malfunctioning of the NMS component in YOLO-v3, such that it outputs dense and noisy detection results. We name this attack Daedalus since it creates a maze composed of false positive detection results.

First, Daedalus is evaluated and analysed on how it can reduce OD performance by breaking NMS¹. Furthermore, Daedalus is adapted for launching black-box attacks. Based on a census in Section II-B, we observe two facts: 1) current OD tasks rely on a few types of OD models, even though many models have been proposed; 2) current OD tasks often load trained parameters into a model instead of re-parameterising the model. Therefore, if an attacker can generate a universal adversarial example which works effectively against all the popular models, the example has a high chance of successfully sabotaging a black-box model. Herein, we propose using an ensemble of substitutes (i.e., an ensemble of popular detection models as the substitutes) in the Daedalus attack to break black-box OD models. We exploit the latest version of YOLO (i.e., YOLO-v3 [22]), SSD [2], and RetinaNet [3] as the victims. As suggested by the experiments, universal Daedalus examples generated by using an ensemble of popular OD models can launch attacks in a black-box scenario. To further enhance the viability of the attack, we instantiate the attack as a physical poster that can break NMS of OD models in the real world. In summary, the contributions of this paper are listed as follows:

• A novel adversarial example attack which aims at breaking NMS (unlike the existing state-of-the-art attacks that merely target misclassification) is proposed. As a consequence, the attack creates a large number of false positives in the detection results;

• An ensemble-of-substitutes attack is proposed to generate universal adversarial examples which can make NMS malfunction in various models. In this way, black-box attacks can be partially converted to white-box attacks;

• Physical posters containing Daedalus perturbations are crafted to practically attack real-world OD applications.

¹In this paper, the term breaking NMS means that NMS cannot filter redundant bounding boxes detected from an adversarial example.

The paper is organised as follows: Section II introduces the background knowledge of adversarial examples, object detection, and NMS. Section III describes the design of Daedalus. Section IV compares and selects loss functions from the candidates, and systematically evaluates the performance of the Daedalus attack. Section V discusses the possibility of applying Daedalus adversarial examples to soft-NMS and a group of black-box OD models. Moreover, we propose using a poster of the Daedalus perturbation to attack real-world OD applications. We also discuss the dilemma of defending against Daedalus in this section. Section VI reviews the related work relevant to this study. Section VII concludes the paper and presents our future work.

II. PRELIMINARIES

A. Adversarial examples

Given a DNN model F and a benign data example x, an attacker can craft an adversarial example of x by altering the features of x through a perturbation δ, such that F(x + δ) ≠ F(x). Generating adversarial examples can be interpreted as optimising the example features towards adversarial objectives. Gradient-based optimisation methods are commonly employed to search for adversarial examples. The optimisation objective can generally be written as follows:

argmin_δ ‖δ‖_p + c · L(F(x + δ), y),   s.t. x + δ ∈ [0, 1]^n,    (1)

wherein L is an adversarial objective set by the attacker, and y is either a ground truth output or a desired adversarial output for F. x + δ produces an adversarial example x_adv. ‖δ‖_p is the allowed distortion bounded by the p-norm of δ. A constant c is adopted to balance the adversarial loss and the distortion. This box-constrained non-convex optimisation problem is typically solved by gradient descent methods.
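For illustration, the following is a minimal sketch of the box-constrained search in Equation (1), assuming a differentiable model `model` and a generic adversarial loss `adv_loss`; both names are hypothetical placeholders rather than the implementation used in this paper.

```python
# Sketch of the optimisation in Eq. (1): minimise ||delta||_p + c * L(F(x+delta)).
import torch

def craft_adversarial(model, x, adv_loss, c=1.0, steps=200, lr=0.01, p=2):
    delta = torch.zeros_like(x, requires_grad=True)
    optimiser = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        x_adv = torch.clamp(x + delta, 0.0, 1.0)        # keep pixel values in [0, 1]
        loss = delta.norm(p=p) + c * adv_loss(model(x_adv))
        optimiser.zero_grad()
        loss.backward()
        optimiser.step()
    return torch.clamp(x + delta, 0.0, 1.0).detach()
```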

B. A census of object detection models

To better understand the impact of attacks against OD tasks, we first need to overview the distribution of models used in OD. We conducted a census of OD projects on GitHub. We crawled project names, project descriptions, and readme files from 1,696 OD-related repositories on GitHub. Next, based on the reviewed models in [23], 45 released OD models were selected as keywords. For each repository, we counted the keywords in the readme file, and selected the most frequent one as the model adopted for the repository. Finally, we grouped the repositories by models. The ranking of models that appeared more than 50 times is presented in Table I. Based on the analysis, 1,561 out of 1,696 repositories use models from Table I. The observation suggests that the models and backbones used for OD tasks are highly limited in number. Moreover, we analysed the source code from the repositories to search for operations of loading trained models and parameters. We observed that almost all the source code provides an interface for loading pre-trained parameters into the model backbones. We further investigated the names of the pre-trained models and


TABLE I: Object detection models on GitHub

Rank   Model          Popularity
1      SSD            605
2      YOLO-v3        251
3      Faster R-CNN   236
4      R-CNN          178
5      YOLO-v2        164
6      RetinaNet      75
7      YOLO-v1        52

Popular models / Total: 1,561 / 1,696

parameters. We found that many repositories use the same pre-trained parameters in their models.

Based on the observations, black-box attacks towards OD models can be partially converted to white-box attacks. An attacker can first craft a universal adversarial example which breaks all the popular white-box OD models, and then launch attacks on black-box models using the example. The attack has a high potential to break the black-box models since they may have the same parameters as the white-box models.

C. Object detection

Given an input frame, an end-to-end OD model outputs detected regression boxes and classification probabilities through an FCN. Herein, we adopt the latest version of the You Only Look Once (YOLO-v3) detector [4] as an example to introduce the knowledge of end-to-end detection algorithms. YOLO has achieved state-of-the-art performance on standard detection tasks such as PASCAL VOC [24] and COCO [1]. YOLO has been studied extensively in research on video OD, robotics, and autonomous cars [25]–[27].

The backbone of YOLO-v3 is a 53-layer FCN (i.e., Darknet-53). YOLO-v3 segments the input image using three mesh grids at different scales (i.e., 13×13, 26×26, 52×52). The model outputs detection results at these three scales. Each grid cell contains three anchor boxes, whose dimensions are obtained by K-means clustering on the dimensions of ground truth bounding boxes. Each grid cell outputs three bounding box predictions. Hence, there are in total 10,647 bounding boxes in the output of YOLO-v3. The Darknet-53 backbone outputs a feature map which contains the bounding box parameters, the box confidence, and the object class probabilities. Specifically, for each bounding box, the feature map includes its height t_h, width t_w, centre coordinates (t_x, t_y), as well as the class probabilities p_1, p_2, ..., p_n and the objectness t_0. t_0 is the confidence that a box contains an object. Given the feature map, the position of each detection box is then calculated based on the anchor box dimension priors p_w and p_h, and the centre offsets (c_x, c_y) from the top-left corner of the image. The final box dimensions and position are then:

b_x = c_x + σ(t_x)
b_y = c_y + σ(t_y)                    (2)
b_w = p_w · e^{t_w}
b_h = p_h · e^{t_h}

where σ(·) is the sigmoid function, b_w is the box width, and b_h is the box height. b_x and b_y are the coordinates of the box centre. The box confidence b_0 is calculated as:

b_0 = t_0 · max{p_1, p_2, ..., p_n}    (3)

YOLO-v3 finally outputs the bounding boxes, the box confidences, and the class probabilities. Boxes whose objectness is below a preset threshold are discarded. The outputs are then processed by NMS to generate the final detection results.
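As an illustration of Equations (2) and (3), the following is a minimal sketch of the box decoding for a single grid cell, assuming raw feature-map values as inputs; the variable names mirror the text and are illustrative, not the detector's actual code.

```python
# Decode one bounding box from raw feature-map values, Eqs. (2)-(3).
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def decode_box(t_x, t_y, t_w, t_h, t_0, class_probs, c_x, c_y, p_w, p_h):
    b_x = c_x + sigmoid(t_x)          # box centre, offset from the cell corner
    b_y = c_y + sigmoid(t_y)
    b_w = p_w * np.exp(t_w)           # box size, scaled from the anchor priors
    b_h = p_h * np.exp(t_h)
    b_0 = t_0 * np.max(class_probs)   # box confidence, Eq. (3)
    return b_x, b_y, b_w, b_h, b_0
```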

D. Non-Maximum Suppression (NMS)

NMS is integrated into OD algorithms to filter detection boxes. NMS makes selections based on the Intersection over Union (IoU) among detection boxes. IoU measures the ratio of the overlapped area over the union area between two boxes. NMS works in two steps: 1) for a given object category, all of the detected bounding boxes in this category (i.e., the candidate set) are sorted based on their box confidence scores from high to low; 2) NMS selects the box which has the highest confidence score as the detection result, and then discards the other candidate boxes whose IoU value with the selected box is beyond the threshold. Then, within the remaining boxes, NMS repeats the above two steps until there is no remaining box in the candidate set. Suppose the initial detection boxes of category c are B^c = {b^c_1, b^c_2, ..., b^c_n}, and the corresponding box confidence scores are S^c = {s^c_1, s^c_2, ..., s^c_n}. Given an NMS threshold N_t, we can write the NMS algorithm as Algorithm 1:

Algorithm 1: Non-Maximum Suppression
Input: B^c, S^c, N_t
Initialisation: D^c ← {}
while B^c ≠ ∅ do
    m ← argmax S^c
    M ← b^c_m
    D^c ← D^c ∪ {M}
    B^c ← B^c − {M}
    for b^c_i ∈ B^c do
        if IoU(M, b^c_i) ≥ N_t then
            B^c ← B^c − {b^c_i}
            S^c ← S^c − {s^c_i}
Output: D^c, S^c

NMS recursively discards the redundant detection boxes from the raw detection proposals in all object categories, until all raw proposals have been processed.
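A minimal sketch of Algorithm 1 for one object category is given below, assuming boxes are given as (x1, y1, x2, y2) corners; it is illustrative only, not the paper's implementation.

```python
# Standard NMS: keep the highest-scoring box, drop boxes overlapping it above n_t.
import numpy as np

def iou(box, boxes):
    x1 = np.maximum(box[0], boxes[:, 0]); y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2]); y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = lambda b: (b[..., 2] - b[..., 0]) * (b[..., 3] - b[..., 1])
    return inter / (area(box) + area(boxes) - inter)

def nms(boxes, scores, n_t=0.5):
    keep = []
    order = np.argsort(scores)[::-1]                    # sort by confidence, high to low
    while order.size > 0:
        m = order[0]
        keep.append(m)                                  # highest-scoring box survives
        rest = order[1:]
        order = rest[iou(boxes[m], boxes[rest]) < n_t]  # discard heavily overlapping boxes
    return keep
```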

III. THE DAEDALUS ATTACK

The proposed Daedalus attack is introduced in this section. To clarify the intuition behind the attack, we first analyse the vulnerability that we exploit to break NMS. Then, we introduce the general optimisation techniques used for making adversarial examples. Subsequently, we propose a series of adversarial loss functions to generate Daedalus adversarial examples. Finally, we assemble the algorithms together to present the Daedalus attack, which can force a detection network to output an adversarial feature map that causes the malfunctioning of NMS.


Fig. 2: The schemes we used to attack NMS. The scheme on the left directly minimises the IoUs between all box pairs. The second one minimises the dimension for all boxes, and maximises the Euclidean distance among box centres for all boxes. The third scheme takes a low-cost approximation, which only minimises the dimension for all boxes.

A. Attacking NMS

We first introduce the NMS vulnerability which leads to the attack. NMS filters boxes based on their IoUs. Specifically, after the raw detection boxes are obtained, the boxes with a confidence score below a given threshold are discarded. Next, for a given object class and the box of this class with the highest box confidence score (i.e., the final detection box), NMS discards the other remaining boxes based on their IoUs with the final detection box. This mechanism makes NMS vulnerable to attacks that suppress the IoUs among output bounding boxes. Once the IoUs are suppressed below the threshold required for NMS filtering, NMS can no longer function in the regular way. In this case, most of the redundant bounding boxes will be kept in the final detection results.

Three elemental schemes are considered to break NMS. First, for algorithms such as YOLO-v3, it is necessary to make most of the bounding boxes survive the first round of filtering, which discards boxes based on the box confidences. Hence, we need to maximise the box confidences. Second, to compress the IoUs, we can directly minimise the IoU for each pair of boxes. Alternatively, we can minimise the expectation of the box dimension over all the bounding boxes, and maximise the expectation of the Euclidean distance between box centres over all pairs of boxes. The schemes are illustrated in Fig. 2.

B. Generating adversarial examples

A benign example is denoted as x. An adversarial example of x is denoted as x′. The adversarial perturbation is δ = x′ − x. Therefore, the distortion calculated based on the p-norm of the perturbation is ‖δ‖_p. We can then formulate the attack as the following optimisation problem:

argmin_δ ‖δ‖_p + c · f(x + δ),   s.t. x + δ ∈ [0, 1]^n,    (4)

wherein f is an adversarial loss function. We will propose a set of potential loss functions f in the next section. c is a constant to balance the distortion and the adversarial loss f. δ is the adversarial perturbation. An adversary optimises δ to search for an adversarial example whose pixel values are bounded between 0 and 1. This is a typical box-constrained optimisation problem. To bound the pixel values of the generated adversarial examples between 0 and 1, we adopt the idea of changing variables from [28]. We change the optimisation variables from δ to ω. There are different functions we can use for changing the variables (e.g., the sigmoid function, the arctan function, the hyperbolic tangent function, etc.). Among these functions, the hyperbolic tangent function generally produces higher gradients than the others. Hence, to make the optimisation converge faster, we select the hyperbolic tangent function to change the variables. For each δ_i in δ, we apply a hyperbolic tangent function such that:

δ_i = (1/2)(tanh(ω_i) + 1) − x_i.    (5)

We then optimise the above task based on the new variables ω. We apply binary search to find the best c during the optimisation. Finally, the adversarial perturbation δ can be calculated easily through the inverse of Equation (5).
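A minimal sketch of this change of variables is given below, assuming a benign image tensor x in [0, 1]; optimising the unconstrained variable ω keeps x + δ inside [0, 1] automatically. It is illustrative only.

```python
# Change of variables, Eq. (5): delta_i = 0.5 * (tanh(omega_i) + 1) - x_i.
import torch

x = torch.rand(1, 3, 416, 416)                                  # placeholder benign input
omega = torch.atanh(torch.clamp(2 * x - 1, -0.9999, 0.9999))    # initialise so that delta = 0
omega.requires_grad_(True)

def adversarial_input(omega, x):
    x_adv = 0.5 * (torch.tanh(omega) + 1)                       # always lies in (0, 1)
    delta = x_adv - x                                           # the perturbation, Eq. (5)
    return x_adv, delta
```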

C. Adversarial loss functions

In this section, three loss functions are designed to find adversarial examples which trigger malfunctioning of NMS. Based on the discussion in Section III-A, we formulate three potential adversarial loss functions. We will evaluate each loss function in Section IV. There are in total n detection boxes (e.g., n = 10647 in YOLO-v3). Suppose the object detector can be regarded as a function F. Given an input image x, the outputs are F(x) = {B^x, B^y, B^w, B^h, B^0, P}. Herein, B^x = {b^x_0, b^x_1, ..., b^x_n}, B^y = {b^y_0, b^y_1, ..., b^y_n}, B^w = {b^w_0, b^w_1, ..., b^w_n}, and B^h = {b^h_0, b^h_1, ..., b^h_n}. They are the dimensions and coordinates of the n output bounding boxes. B^0 = {b^0_0, b^0_1, ..., b^0_n} are the objectness scores and P = {p_0, p_1, ..., p_n} are the class probabilities as introduced in Section II-C.

The attack can specify an object category λ to attack. If we want to attack multiple object categories, we can include these categories in a set Λ. Based on the above discussion, we define three loss functions, f1, f2, and f3, as follows:

f1 = (1/‖Λ‖) Σ_{λ∈Λ} E_{i: argmax(p_i)=λ} { [b^0_i · max(p_i) − 1]^2 + E_{j: argmax(p_j)=λ} IoU_{ij} },    (6)

f2 = (1/‖Λ‖) Σ_{λ∈Λ} E_{i: argmax(p_i)=λ} { [b^0_i · max(p_i) − 1]^2 + (b^w_i · b^h_i / (W × H))^2 + E_{j: argmax(p_j)=λ} 1 / [(b^x_i − b^x_j)^2 + (b^y_i − b^y_j)^2] },    (7)

f3 = (1/‖Λ‖) Σ_{λ∈Λ} E_{i: argmax(p_i)=λ} { [b^0_i · max(p_i) − 1]^2 + (b^w_i · b^h_i / (W × H))^2 },    (8)


wherein IoU_{ij} is the IoU between the i-th and the j-th bounding boxes. Box dimensions are scaled between 0 and 1 by dividing b^w and b^h by the input image width W and height H, such that the term is invariant to changes in the input dimension.

In f1, we first minimise the expectation of the mean squared error between the box confidence and 1 for all the boxes in which the detected objects are in the attacked category set Λ. Hence, the boxes will not be discarded due to low box confidence scores. In the second term, f1 minimises the expectation of the IoUs for all pairs of bounding boxes in Λ. When the IoUs fall below the NMS threshold, the boxes can evade the NMS filter. Alternatively, we minimise the expectation of box dimensions and maximise the expectation of box distances in f2. f2 approximates the effect of directly minimising the IoU by compressing box dimensions and distributing boxes more evenly over the detected region.

Pairwise computation for obtaining IoUs and box distances can be expensive if the output feature map contains too many boxes. Therefore, we also propose f3 as a more efficient loss function. f3 only minimises the expectation of box dimensions instead of directly minimising the expectation of the IoUs. Minimising f3 leads to smaller bounding boxes. When the boxes become small enough, there will be no intersection among the boxes. In other words, the IoUs among the boxes will be suppressed to zero. As a consequence, the boxes can avoid being filtered by NMS.
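For illustration, a minimal sketch of the f3 loss in Equation (8) is shown below, assuming the detector outputs per-box objectness b0, class probabilities probs, and box sizes bw, bh as tensors; the names are illustrative placeholders, not the released code.

```python
# Sketch of f3, Eq. (8): push box confidences towards 1 and shrink box areas.
import torch

def f3_loss(b0, probs, bw, bh, W, H, target_classes):
    cls = probs.argmax(dim=1)                        # predicted category per box
    conf = b0 * probs.max(dim=1).values              # box confidence, Eq. (3)
    loss = 0.0
    for lam in target_classes:                       # average over attacked categories
        mask = cls == lam
        if mask.any():
            term_conf = (conf[mask] - 1.0) ** 2              # confidence term
            term_dim = (bw[mask] * bh[mask] / (W * H)) ** 2  # normalised box-area term
            loss = loss + (term_conf + term_dim).mean()
    return loss / len(target_classes)
```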

D. Ensemble of substitutes

In this section, an ensemble method is developed to convert black-box attacks into white-box ones based on the observations of current OD tasks (q.v. Section II-B). The transferability of adversarial examples is usually considered a pathway for launching black-box attacks. The transferability of misclassifying adversarial examples has been investigated in [29]. However, instead of causing misclassification, the Daedalus attack has a different purpose (i.e., disabling NMS). It is difficult to transfer Daedalus examples among OD models with different backbones since the feature maps extracted by the backbones are highly divergent. Herein, we take another pathway to attack black-box models in the field of OD.

To launch a black-box attack using Daedalus, an ensemble of popular OD models is used as the substitute to generate adversarial examples. This heavily relies on the observed fact that current OD tasks tend to reuse a few popular OD models with pre-trained backbones. As a consequence, the generated adversarial examples can be effective against all the popular models. Such a universal adversarial example has a high chance of breaking the NMS of a black-box model. To find such an example, we can optimise a loss function as an expectation over OD models:

f = E_{m∼M} f_m,    (9)

herein, M is a set of OD models with popular backbones. f_m is the loss value of the m-th model in M, calculated based on one of Equations (6), (7), and (8).
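A minimal sketch of the ensemble loss in Equation (9) follows; `per_model_loss` is a hypothetical callable computing f_m (e.g., f3) from one substitute's output, not the actual implementation.

```python
# Sketch of Eq. (9): average the adversarial loss over substitute detectors.
import torch

def ensemble_loss(models, x_adv, per_model_loss):
    losses = [per_model_loss(model(x_adv)) for model in models]
    return torch.stack(losses).mean()      # expectation over the ensemble of substitutes
```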

E. The Daedalus attack against NMS

In this section, the above steps are assembled to form the Daedalus attack. The attack can be applied to all FCN-based detectors. FCN-based detectors output the detected box dimensions, box positions, and classification probabilities in a fully differentiable, end-to-end manner. Therefore, the attack can compute end-to-end adversarial gradients and optimise adversarial examples for FCN-based detectors.

Algorithm 2: The L2 Daedalus attack
Input: x, Λ, γ, binary_steps, η, max_iteration, c_max, c_min
Initialisation: c ← 10; loss_init ← f(x); δ ← 0; x* ← x + δ
for n from 0 to binary_steps do
    for i from 0 to max_iteration do
        select boxes in Λ to calculate the loss f(x*)
        δ ← δ − η ∇_δ [‖δ‖ + c · f(x*)]
        x* ← x* + δ
    x′ ← the best x* found
    if f(x*) ≤ loss_init · (1 − γ) then
        c_max ← min(c, c_max)
        c ← 0.5 · (c_max + c_min)
    else
        c_min ← max(c, c_min)
        c ← 0.5 · (c_max + c_min)
Output: Adversarial example x′

Algorithm 3: The L0 Daedalus attack
Input: x, Λ, γ, binary_steps, η, max_iteration, c_max, c_min
Initialisation: c ← 10; loss_init ← f(x); δ ← 0; x* ← x + δ
for n from 0 to binary_steps do
    for i from 0 to max_iteration do
        select boxes in Λ to calculate the loss f(x*)
        λ ← argmax_{δ_i ∈ δ} ∇_δ [‖δ‖ + c · f(x*)]
        δ ← δ − η ∇_λ [‖δ‖ + c · f(x*)]
        x* ← x* + δ
    x′ ← the best x* found
    if f(x*) ≤ loss_init · (1 − γ) then
        c_max ← min(c, c_max)
        c ← 0.5 · (c_max + c_min)
    else
        c_min ← max(c, c_min)
        c ← 0.5 · (c_max + c_min)
Output: Adversarial example x′

Given an OD task with an image x as input, the number of bounding boxes surviving NMS is maximised. As an example, YOLO-v3 outputs three feature maps at different scales. The three feature maps are concatenated into one final feature map. The details of the YOLO-v3 feature map are discussed in Section II-C. We add a transformation layer after the feature map to obtain the final detection results. The transformation layer applies sigmoid transformations to t_x and t_y in the feature map to get the box centre coordinates b_x and b_y. Exponential transformations are applied to t_w and t_h to obtain the box width b_w and the box height b_h, respectively.


We then calculate the values of the loss functions defined in Section III-C and Section III-D. Next, we minimise the loss value together with the distortion to generate Daedalus adversarial examples. During the optimisation, we introduce a hyper-parameter γ to control the strength of the attack.

The attack is bounded by two distortion metrics, namely the L2-norm and the L0-norm. Herein, an Lp-norm is defined as L_p = (Σ^n_i |δ_i|^p)^{1/p}, where δ_i is the perturbation on the i-th pixel. Therefore, an L2-norm attack limits the perturbations added to all the pixels, while an L0-norm attack limits the total number of pixels to be altered. We develop these two versions of the attack to study the effectiveness of all-pixel perturbation and selective-pixel perturbation while generating Daedalus adversarial examples.

For the L2 Daedalus attack, we can directly minimise Equation (4). However, in the L0 Daedalus attack, since the L0-norm is a non-differentiable metric, we alternatively perturb the pixels that have the highest adversarial gradients in each iteration, and clip the other gradients to 0. The detailed algorithm of the L2 attack is presented in Algorithm 2, and the L0 attack is presented in Algorithm 3. Notice that Daedalus only requires the box parameters to launch the attack. This means that the attack can be applied to detection models that output feature maps containing the geometric information of bounding boxes, even if they use different backbone networks.
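For illustration, a minimal sketch of this selective-pixel update is given below: only the pixels with the largest adversarial gradient magnitude are perturbed in each iteration, while all other gradients are clipped to zero. The number of pixels kept per step, k, is an illustrative parameter.

```python
# One masked gradient step of the L0 variant: update only the top-k gradient pixels.
import torch

def l0_masked_step(delta, grad, eta, k=1000):
    flat = grad.abs().flatten()
    k = min(k, flat.numel())
    topk = torch.topk(flat, k).indices               # pixels with the highest gradients
    mask = torch.zeros_like(flat)
    mask[topk] = 1.0
    masked_grad = (grad.flatten() * mask).view_as(grad)
    return delta - eta * masked_grad                 # perturb only the selected pixels
```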

IV. EVALUATION

Extensive experiments are carried out to evaluate the performance of the Daedalus attack. The experiments are run on a machine with four RTX 2080 Ti GPUs and 128 GB of memory. The code of Daedalus is available on GitHub².

A. Experiment settings

In the experiments, YOLO-v3 is picked as the model to be attacked. The recommended objectness threshold of 0.5 is used in YOLO-v3. We first select the best loss function, and then we quantify the performance of the Daedalus attack on YOLO-v3 with NMS. YOLO-v3 is employed as the substitute to generate 416×416 adversarial perturbations in the evaluations. During the binary search for the best constant c, the maximum number of search steps is set to 5.

L0 and L2 Daedalus examples of images in the COCO 2017val dataset [1] are crafted for the evaluation. For each attack, first, to investigate the effectiveness of different confidence values (γ), we sweep γ from 0.1 to 0.9 and perturb 10 COCO examples under each γ. Second, 100 COCO examples are randomly perturbed under both a low (0.3) and a high (0.7) value of γ. First, to unify the evaluation results, all of the 80 object categories from the COCO dataset are included in the target set Λ to be attacked. Second, the categories of 'person' and 'car' are selected to be attacked. The graphical results are included in the supplementary file of the paper.

To prove that the attack can be applied to other detectors, the results of attacking RetinaNet-ResNet-50 and SSD are also presented. RetinaNet-ResNet-50 uses ResNet-50 [30] as its backbone. SSD employs VGG-16 as its backbone [2]. These backbones differ from that of YOLO-v3, and the output feature maps are different as well. An ensemble of the three OD models is used as the substitute to craft universal adversarial examples. Furthermore, to demonstrate the effectiveness of the attack in the real world, Daedalus perturbations are made into physical posters to launch attacks against real-world OD applications. As a showcase, we show the success of the physical Daedalus attack in an indoor OD scene.

²https://github.com/NeuralSec/Daedalus-attack

B. Loss function selection

The proposed adversarial loss functions (Equations (6), (7), and (8)) are compared in this section. We select ten random images from the COCO dataset. For each function, adversarial examples of these ten images are generated. We record the runtime per example and the volatile GPU utilisation of each loss function. The performance of each function is measured by the success rate of finding adversarial examples at different confidence values. Since the L0 attack essentially has the same optimisation process as the L2 attack, we only run the L2 attack for comparison purposes. The results are reported in Table II.

The recorded runtime is averaged over the runtime of generating all examples under the same loss function. The GPU utilisation is also averaged over all examples. We use only one GPU to generate an adversarial example in this section. It can be observed that f3 (i.e., Equation (8)) has the best success rate of finding adversarial examples under different γ values. Moreover, f3 has the lowest runtime as well as the lowest GPU utilisation compared to f1 and f2.

Based on the comparisons, f3 is selected as the loss function for Daedalus. The Daedalus attack is evaluated in the following sections with f3.

C. Quantitative performance

In this section, the attack is evaluated from two perspectives. First, we evaluate how the Daedalus attack performs under different NMS thresholds. Second, we investigate how the attack confidence level (γ) affects detection results. The performance of the attack is assessed on a YOLO-v3 detector. The attack performance is quantified based on three metrics: the False Positive (FP) rate, the mean Average Precision (mAP), and the distortion of the example.

First, to investigate the effect of the NMS threshold on the attack, we obtain the detection results of each adversarial/benign example under a series of NMS thresholds ranging between 0.5 and 0.95. Second, to assess the effect of the confidence value (γ) on attack performance, adversarial examples are crafted for confidences from 0.1 to 0.9.

False positive rate. The FP rate is defined as the ratio of false positive detection boxes with respect to the total number of detection boxes:

FP rate = N_φ / N,    (10)

wherein N_φ is the number of false positive detection boxes, and N is the total number of bounding boxes found by the model.


TABLE II: Loss function comparison

Loss function   Runtime     GPU-Util   Success rate under γ = 0.1 / 0.2 / 0.3 / 0.4 / 0.5 / 0.6 / 0.7 / 0.8 / 0.9
f1              2375.62 s   79%        100% / 100% / 100% / 90% / 40% / 40% / 20% / 0% / 0%
f2              1548.57 s   77%        100% / 100% / 100% / 100% / 100% / 90% / 40% / 20% / 0%
f3              1337.06 s   73%        100% / 100% / 100% / 100% / 100% / 100% / 70% / 40% / 40%

Fig. 3: The false positive rate at an IoU threshold of 0.5 with respect to the NMS threshold and the attack confidence. The NMS threshold ranges from 0.5 to 0.95. The confidence ranges from 0.1 to 0.9 in the L2 attack, and from 0.1 to 0.8 in the L0 attack.

Fig. 4: The false positive rate at an IoU threshold of 0.75 with respect to the NMS threshold and the attack confidence. The NMS threshold ranges from 0.5 to 0.95. The confidence ranges from 0.1 to 0.9 in the L2 attack, and from 0.1 to 0.8 in the L0 attack.

The FP rate measures the proportion of redundant boxes that are not filtered by NMS. The FP rate is maximally one, which occurs when there is no correct detection result. A higher FP rate indicates a higher success rate for the attack.

The IoU threshold in NMS is varied from 0.5 to 0.95 with steps of 0.05 to obtain detection outputs under different thresholds. The detection results of Daedalus examples made under confidences from 0.1 to 0.9 (0.1 to 0.8 for L0, since it cannot find examples with a confidence of 0.9) are then obtained. This is because the L0 attack cannot further discard perturbations from the L2 examples under a high γ. We report the FP rates based on the IoU thresholds of 0.5 and 0.75 in Fig. 3 and Fig. 4, respectively. Each FP rate is averaged over 10 Daedalus examples. Next, we measure the average FP rate of 100 examples crafted under a confidence value of 0.3/0.7. The trends of the average FP rate at IoU thresholds of 0.5 and 0.75 with respect to the NMS threshold are plotted in Fig. 5 and Fig. 6, respectively.

It can be observed that both L0 and L2 attacks achieve

Fig. 5: The L0 attack false positive rate at an IoU threshold of 0.5 with respect to the NMS threshold. Each FP rate is averaged from the detection results of 100 Daedalus examples.

Fig. 6: The L0 attack false positive rate at an IoU threshold of 0.75 with respect to the NMS threshold. Each FP rate is averaged from the detection results of 100 Daedalus examples.

FP rates of at least 90% under different NMS thresholds, even with the lowest attack confidence of 0.1. Increasing the NMS threshold slightly reduces the FP rate of low-confidence attacks. However, changing the NMS threshold barely has an impact on the performance of high-confidence attacks. The FP rate increases with γ. It can reach as high as 99.9% for a high-confidence Daedalus attack, even with a stricter NMS threshold.

Mean average precision. The mAP is commonly used to evaluate the performance of OD models. It averages the Average Precisions (APs) over all detected object categories. This metric is defined as follows:

mAP = (1/M) Σ^M_{i=1} AP_i,    (11)

in which M is the number of object categories. We adopted the all-point interpolation AP from the Pascal VOC challenge [24] to calculate the mAP.



Fig. 7: mAP_IoU=.50 of Daedalus example detection results with respect to the NMS threshold and the attack confidence. The threshold ranges from 0.5 to 0.95.

Fig. 8: mAP_IoU=.75 of Daedalus example detection results with respect to the NMS threshold and the attack confidence. The threshold ranges from 0.5 to 0.95.

The recommended COCO evaluation metrics mAP_IoU=.50 and mAP_IoU=.75, which are the mAP values calculated at IoU thresholds of 0.5 and 0.75, respectively, are adopted to measure the detection precision on benign examples and Daedalus examples. Fig. 7 and Fig. 8 show the mAP values of the detection results with changing NMS threshold and γ. We also plot the average mAP_IoU=.50 and mAP_IoU=.75 for 100 examples made under confidence values of 0.3 and 0.7 in Fig. 9 and Fig. 10, respectively.

Based on the results, both L0 and L2 attacks decrease the mAP of the detector from 45% ∼ 59% to a value between 0% and 17.8%. The mAP drops to 0% when the confidence of the attack surpasses 0.2.

Distortion of the examples. The distortions of the Daedalus adversarial examples crafted based on the L2-norm and the L0-norm are evaluated in this part. The maximum, minimum, and average distortions of 10 adversarial examples are recorded for each attack confidence value. The results are summarised in Table III. It can be seen that the distortion of the adversarial examples has a positive correlation with γ. The distortion increases slowly before the confidence reaches 0.6. Once the confidence goes beyond 0.6, there is a rapid increase in the distortion. Additionally, compared to the L2 attack, the L0 attack cannot find any adversarial examples at a confidence of 0.9.

Based on the evaluation, both L0 and L2 attacks require moderate distortions to make low-confidence examples. Compared to the L0 attack, the L2 attack introduces less distortion when the confidence is between 0.2 and 0.7, and it is slightly better at finding adversarial examples with a confidence of 0.9.

Fig. 9: mAP_IoU=.50 of the L0 and L2 Daedalus attacks with respect to the NMS threshold. The threshold ranges from 0.5 to 0.95. Each mAP is averaged from the detection results of 100 Daedalus examples.

Fig. 10: mAP_IoU=.75 of the L0 and L2 Daedalus attacks with respect to the NMS threshold. The threshold ranges from 0.5 to 0.95. Each mAP is averaged from the detection results of 100 Daedalus examples.

However, the L0 attack can reduce the distortion when the confidence goes above 0.8. It is known that FCN-based object detectors usually segment the input image into a mesh grid. Each grid cell has its own responsibility for detecting objects. Low-level perturbation based on the L0-norm may leave some of the grid cells unperturbed. Therefore, these cells can still output normal detection results, which can be filtered by NMS. As a result, the L0 attack perturbs most of the pixels to generate a well-qualified adversarial example, which implies high distortions in the example. High-level perturbation is imposed on each pixel when the confidence goes above 0.8. However, the L0 attack can discard the perturbation on some pixels to reduce the overall distortion.


TABLE III: Distortions under different confidences

Attack        Distortion   γ = 0.1   0.2     0.3     0.4      0.5      0.6       0.7      0.8       0.9
L2 Daedalus   Maximum      157.49    259.25  113.24  324.53   598.45   4138.47   5915.90  25859.67  19553.05
              Average      130.59    218.86  96.51   166.27   344.75   1670.35   1835.04  9953.59   14390.76
              Minimum      94.30     175.07  75.24   95.30    185.25   351.55    366.49   210.24    11894.38
L0 Daedalus   Maximum      162.74    513.54  555.11  1200.48  1001.86  42144.96  3891.80  3364.29   N/A
              Average      87.81     346.71  440.21  651.77   736.40   12765.82  1992.30  1501.58   N/A
              Minimum      53.26     165.50  339.44  402.48   436.15   3713.67   729.38   848.36    N/A

V. ANALYSIS OF THE ATTACK

In this section, we first investigate whether the Daedalus attack can be extended to a variation of the NMS algorithm, namely soft-NMS [12]. Second, we demonstrate the universality of the attack by applying it to RetinaNet-ResNet-50. Third, an ensemble of substitutes is proposed to create universal Daedalus examples that can attack multiple object detectors. The proposed ensemble attack makes Daedalus examples transferable so that they can be used in black-box scenarios. Subsequently, posters made of Daedalus perturbations are used to launch attacks against real-world OD systems. Finally, we discuss the problem of defending against the attack.

A. Breaking Soft-NMS

In addition to breaking NMS, the Daedalus attack is tested against soft-NMS, which iteratively decreases the confidence scores of overlapping boxes instead of discarding them [12]. Soft-NMS has linear and Gaussian versions. We attack both versions to investigate whether soft-NMS can handle the attack. We randomly select an image and craft its L0 and L2 Daedalus examples to attack a soft-NMS-equipped YOLO-v3 detector. The detection results for the L0 adversarial examples are displayed in Fig. 11a. For the L2 adversarial examples, the detection results are similarly illustrated in Fig. 11b. From the results, one can find that both L0 and L2 attacks can break linear soft-NMS as well as Gaussian soft-NMS.

The FP rate of the attack against soft-NMS is also measured. The results are shown in Fig. 12. The L0 adversarial examples made with different confidence values result in an FP rate of 100%. The L2 attack also leads to a 100% FP rate when the attack confidence is above 0.2.

B. Universal Daedalus examples

Based on the census in Section II-B, three popular models, namely YOLO-v3, SSD, and RetinaNet, are employed to verify the idea of the ensemble attack. We do not introduce more models as substitutes simply due to the limited GPU memory. According to Table I, a universal Daedalus example for YOLO-v3, SSD, and RetinaNet can already break 54.9% of the 1,696 repositories. We investigate whether a universal Daedalus example can be found to break all three models. In the experiments, YOLO-v3 employs Darknet-53 as its backbone network. RetinaNet and SSD have different backbone networks and use ResNet-50 and VGG-16, respectively. Since the three models are backed by different backbones, they generate divergent feature maps.


Fig. 11: (a) The L0 Daedalus attack against soft-NMS. The top-left image is an L0 Daedalus example crafted under a confidence of 0.3. The top-right one shows the detection results from NMS, as a reference. The bottom-left one shows the results of linear soft-NMS, and the bottom-right one shows the results of Gaussian soft-NMS. (b) The L2 Daedalus attack against soft-NMS. The top-left image is an L2 Daedalus example crafted under a confidence of 0.3. The top-right one shows the detection results from NMS, as a reference. The bottom-left one shows the results of linear soft-NMS, and the bottom-right one shows the results of Gaussian soft-NMS.

Fig. 12: The FP rate of the Daedalus attack against YOLO-v3 with soft-NMS. We evaluated the FP rate at IoU thresholds of 0.5 and 0.75.

The detection results of the adversarial example are displayed in Fig. 13. It can be observed that the adversarial example can trigger NMS malfunctioning for all the models. The detection results for more adversarial examples crafted using the ensemble of substitutes can be found in the supplementary file of this paper. Furthermore, when there is a large number of OD models in the ensemble, an attacker may adopt methods such as [31] to find a proper coefficient for each term in the loss function to enhance the attack performance.


Fig. 13: The detection results of a Daedalus adversarial example made by the ensemble-of-substitutes approach. We detect the adversarial example using YOLO-v3, RetinaNet, and SSD. The first row contains a benign example and its detection results. The second row contains an adversarial example and its detection results.

C. Physical Daedalus examples

To launch the Daedalus attack in the real world, the perturbation found by the attack can be instantiated as a poster. When the poster is captured by the camera of an object detection system, it triggers the NMS malfunctioning. Given a scene to attack, we first film a video of the scene. The video contains a set X of image frames. Next, we can optimise a poster of the perturbation δ as follows:

argmin_δ E_{x∼X} E_{φ∼Φ} f3(x, δ, φ) + SNPS(δ, β)    (12)

Herein, Φ is a set of random transformations (e.g., zooming, rotation, etc.). The detailed transformations are summarised in the supplementary file. SNPS is a Sub-sampled Non-Printability Score of the poster δ. The Non-Printability Score (NPS) measures the error between a printed pixel and its digital counterpart. The sum of the NPSs of all pixels in an image can be used to measure the image's printability [32]. However, it is expensive to compute the NPS of a large image. Therefore, given a sample rate β and the pixels of δ, we sample a subset of the pixels to compute the SNPS. It is found that calculating the NPS on only 0.1% ∼ 1% of the pixels is sufficient for making an effective Daedalus poster. Moreover, it is observed that the poster can be generalised across different scenes.
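For illustration, a minimal sketch of the poster objective in Equation (12) is given below. It averages the adversarial loss over sampled frames and random transformations and adds a sub-sampled printability term. `apply-style` placeholders such as `transforms`, `detector_outputs`, `adv_loss`, and `printable_colours`, as well as the simplified nearest-printable-colour form of the NPS, are assumptions for illustration rather than the implementation described in this paper.

```python
# Sketch of Eq. (12): EoT over frames/transformations plus a sub-sampled NPS term.
import torch

def snps(delta, printable_colours, beta=0.01):
    pixels = delta.reshape(-1, 3)
    idx = torch.randperm(pixels.shape[0])[: max(1, int(beta * pixels.shape[0]))]
    sampled = pixels[idx]                                           # sample beta of the pixels
    dists = torch.cdist(sampled, printable_colours).min(dim=1).values
    return dists.sum()                                              # distance to printable colours

def poster_loss(delta, frames, transforms, detector_outputs, adv_loss, printable_colours):
    loss = 0.0
    for x in frames:                                                # expectation over scene frames
        for t in transforms:                                        # expectation over transformations
            loss = loss + adv_loss(detector_outputs(t(x, delta)))
    return loss / (len(frames) * len(transforms)) + snps(delta, printable_colours)
```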

Next, the robustness of three different-sized (i.e., 400×400, 720×720, and 1280×1280) Daedalus posters is evaluated in an indoor environment. The effectiveness of the posters is recorded in Table IV.

In the scenario of Smart Inventory Management (SIM), an OD model is used to identify the items that are placed into or withdrawn from an inventory. We simulate an attack against the case of depositing a MacBook and a monitor into the inventory. A demonstration of the attack can be found in the supplementary file.

D. Adaptive defences

Daedalus minimises detection boxes to trigger an NMS malfunction. Hence, a straightforward defence might be limiting the minimal allowed detection box size in the NMS process (i.e., the NMS defence).

Fig. 14: Distributions of bounding box dimensions from the detection results of YOLO-v3. The distributions are based on the detection results of 100 COCO images and the corresponding 100 adversarial examples, respectively. In total, there are 812 boxes in the benign results and 286,469 boxes in the adversarial results. As a defence, boxes whose dimension is below 10^3.62 are discarded in the NMS process.


Fig. 15: The FP rate and the mAP of YOLO-v3 with the NMS defence. The defence barely changes the FP rate and the mAP.

To investigate this possible defence, we first study the outputs of YOLO-v3 given 100 benign examples and the corresponding 100 Daedalus examples. The distribution of the detection box dimensions is visualised in Fig. 14. According to Fig. 14, an empirical threshold of 10^3.62 is selected to filter adversarial boxes from the detected boxes. Given this defence, the FP rate and the mAP of YOLO-v3 are again evaluated on 10 Daedalus examples. The evaluation results are plotted in Fig. 15. According to the results, the NMS defence is not effective.
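For illustration, a minimal sketch of this size-based NMS defence is given below: boxes whose area falls below an empirical threshold are dropped before NMS runs. The threshold value and box format are illustrative.

```python
# Size-based NMS defence: discard boxes smaller than an empirical area threshold.
import numpy as np

def filter_small_boxes(boxes, scores, min_area=10 ** 3.62):
    # boxes are given as (x1, y1, x2, y2); keep only boxes larger than the threshold
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    keep = areas >= min_area
    return boxes[keep], scores[keep]
```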

A second adaptive defence might be MagNet [33]. A denoising autoencoder is trained to reform Daedalus examples back to benign ones, and the reconstruction error is measured to detect strong perturbations. Therefore, the evaluation of the defence is twofold: 1) whether the defence can increase the mAP and reduce the FP rate by reforming an adversarial example; 2) whether the reconstruction error can be detected when an adversarial example cannot be turned into a benign one. To answer these two questions, we measure the FP rate, the mAP, and the Mean Squared Error (MSE) between a Daedalus example and a reformed example (i.e., the reconstruction error) for a defended YOLO-v3 model.


TABLE IV: Effectiveness of the Daedalus posters

Metric             Small poster (400×400)   Medium poster (720×720)   Large poster (1280×1280)
Horizontal angle   −16.5° ∼ 16.5°           −18° ∼ 18°                −19.5° ∼ 19.5°
Vertical angle     −11.5° ∼ 11.5°           −12° ∼ 12°                −12° ∼ 12°
Distance           0.12 ∼ 1.2 metres        0.21 ∼ 1.5 metres         0.2 ∼ 1.8 metres


Fig. 16: The FP rate and the mAP of YOLO-v3 with the MagNet defence. Reforming Daedalus examples with an autoencoder can slightly improve the performance of YOLO-v3. However, the FP rate still stays above 0.6 at an IoU threshold of 0.5 and above 0.7 at an IoU threshold of 0.75, while the mAP remains below 0.45 and 0.3 at the two thresholds, respectively. Furthermore, the model performance drops when the confidence of the attack goes above 0.7.

The results are plotted in Fig. 16. According to the evaluation results, first, reforming the adversarial examples can barely increase the performance of the detection model. Second, the reconstruction error increases only after the confidence of the attack reaches 0.5. This means it is difficult to detect attacks with confidence levels below 0.5. Interestingly, it is observed that the detected objects in an image reformed by a denoising autoencoder cannot retain proper positions and categories. The Daedalus attack poses an intimidating threat to OD tasks. As revealed by the attack, NMS is no longer a secure solution for OD.

VI. RELATED WORK

Researchers have extensively studied methods for crafting adversarial examples against machine-learning-based classifiers. The algorithms for generating such examples can be either gradient-based or gradient-free attacks. The former type finds adversarial examples by minimising the cost of an adversarial objective set by the attacker, based on gradient information. The attacks can use either one-step gradient descent [34] or iterative gradient descent [28], [35], [36]. Additionally, attackers can compute a Jacobian saliency map of features and perturb the salient features [37]. Based on these algorithms, there are evolved attacks that use different distortion metrics to make adversarial examples more imperceptible to human eyes [38]–[40]. Recently, the Expectation over Transformation (EoT) algorithm has been proposed to synthesise robust adversarial examples [41], [42]. On the other hand, gradient-free attacks rely on zeroth-order optimisation techniques to search for adversarial examples against black-box models [43]–[46].

Beyond the above attacks, there are attacks targeting domains such as malware detection and object detection [47]. For example, the dense adversarial generation (DAG) algorithm was proposed to create adversarial examples for OD and segmentation [15]. Lu et al. crafted robust adversarial examples that caused misclassification in object detectors [14]. RP2 was proposed to craft adversarial examples of real-world road signs [48]. Subsequently, RP2 was extended to attack YOLO-v2 [16]. There are also other attempts to craft more robust physical examples [17]–[19]. However, current attacks mainly result in misclassification/appearance/disappearance of the detected objects [17], [18]. The Daedalus attack creates adversarial examples that cause malfunctioning of NMS, which is different from all the previous attacks in nature.

VII. CONCLUSION

In this paper, we propose a novel type of adversarial example which aims at disabling the NMS functionality in object detection models. The proposed attack, named Daedalus, can control both the strength of the generated adversarial examples and the object class to be attacked. The attacked model outputs extremely noisy detection results such that it is no longer functional. The attack can reduce the mAP of detection to nearly 0%, while the false positive rate for Daedalus examples can go up to 99.9%. Meanwhile, the distortion required to launch the Daedalus attack is imperceptible. This attack aims at breaking NMS instead of causing misclassifications. Unlike misclassification-targeting adversarial examples, it is difficult to defend against such an attack since the attacked feature map is much more complicated than the classification logits. The transferability of the Daedalus attack is affected by the selected substitutes. We rely on minimising the expectation of box dimensions over a set of feature maps to make robust examples, which can make NMS malfunction in multiple detection models. There are some remaining problems we aim to address in future work. For example, considering the complexity of OD feature maps, it is difficult to make universal adversarial examples that can launch zero-knowledge attacks. The Daedalus attack reveals a vulnerability in object detection algorithms which can have fatal consequences in the real world. Nevertheless, a defence method against Daedalus is still a missing piece, and we will also try to address this problem in our future work.

REFERENCES

[1] T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollar, and C. L. Zitnick, “Microsoft COCO: Common objects in context,” in Proceedings of the European Conference on Computer Vision (ECCV). Springer, 2014, pp. 740–755.


[2] W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, C.-Y. Fu, and A. C. Berg, “SSD: Single shot multibox detector,” in Proceedings of the European Conference on Computer Vision (ECCV). Springer, 2016, pp. 21–37.

[3] T.-Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar, “Focal loss for dense object detection,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV). IEEE, 2017, pp. 2980–2988.

[4] J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, “You only look once: Unified, real-time object detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016, pp. 779–788.

[5] R. Girshick, J. Donahue, T. Darrell, and J. Malik, “Rich feature hierarchies for accurate object detection and semantic segmentation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2014, pp. 580–587.

[6] R. Girshick, “Fast R-CNN,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV). IEEE, 2015, pp. 1440–1448.

[7] A. Neubeck and L. Van Gool, “Efficient non-maximum suppression,” in Proceedings of the International Conference on Pattern Recognition (ICPR), vol. 3. IEEE, 2006, pp. 850–855.

[8] Y. Cong, D. Tian, Y. Feng, B. Fan, and H. Yu, “Speedup 3-D texture-less object recognition against self-occlusion for intelligent manufacturing,” IEEE Transactions on Cybernetics, vol. 49, no. 11, pp. 3887–3897, 2018.

[9] F. Schroff, D. Kalenichenko, and J. Philbin, “FaceNet: A unified embedding for face recognition and clustering,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2015, pp. 815–823.

[10] W. Wu, Y. Yin, X. Wang, and D. Xu, “Face detection with different scales based on Faster R-CNN,” IEEE Transactions on Cybernetics, vol. 49, no. 11, pp. 4017–4028, 2018.

[11] J. H. Hosang, R. Benenson, and B. Schiele, “Learning non-maximum suppression,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2017, pp. 6469–6477.

[12] N. Bodla, B. Singh, R. Chellappa, and L. S. Davis, “Soft-NMS: Improving object detection with one line of code,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV). IEEE, 2017, pp. 5562–5570.

[13] N. Akhtar and A. Mian, “Threat of adversarial attacks on deep learning in computer vision: A survey,” IEEE Access, vol. 6, pp. 14410–14430, 2018.

[14] J. Lu, H. Sibai, and E. Fabry, “Adversarial examples that fool detectors,” arXiv preprint arXiv:1712.02494, 2017.

[15] C. Xie, J. Wang, Z. Zhang, Y. Zhou, L. Xie, and A. Yuille, “Adversarial examples for semantic segmentation and object detection,” in Proceedings of the IEEE International Conference on Computer Vision (ICCV). IEEE, 2017, pp. 1369–1378.

[16] D. Song, K. Eykholt, I. Evtimov, E. Fernandes, B. Li, A. Rahmati, F. Tramer, A. Prakash, and T. Kohno, “Physical adversarial examples for object detectors,” in Proceedings of the 12th USENIX Workshop on Offensive Technologies (WOOT 18). USENIX Association, 2018.

[17] S.-T. Chen, C. Cornelius, J. Martin, and D. H. P. Chau, “ShapeShifter: Robust physical adversarial attack on Faster R-CNN object detector,” in Joint European Conference on Machine Learning and Knowledge Discovery in Databases. Springer, 2018, pp. 52–68.

[18] S. Thys, W. Van Ranst, and T. Goedeme, “Fooling automated surveillance cameras: Adversarial patches to attack person detection,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. IEEE, 2019.

[19] Y. Zhao, H. Zhu, R. Liang, Q. Shen, S. Zhang, and K. Chen, “Seeing isn’t believing: Towards more robust adversarial attack against real world object detectors,” in Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security. ACM, 2019, pp. 1989–2004.

[20] N. K. Verma, T. Sharma, S. D. Rajurkar, and A. Salour, “Object identification for inventory management using convolutional neural network,” in 2016 IEEE Applied Imagery Pattern Recognition Workshop (AIPR). IEEE, 2016, pp. 1–6.

[21] Z. Yang, W. Yu, P. Liang, H. Guo, L. Xia, F. Zhang, Y. Ma, and J. Ma, “Deep transfer learning for military object recognition under small training set condition,” Neural Computing and Applications, pp. 1–10, 2018.

[22] J. Redmon and A. Farhadi, “YOLOv3: An incremental improvement,” arXiv preprint arXiv:1804.02767, 2018.

[23] L. Liu, W. Ouyang, X. Wang, P. Fieguth, J. Chen, X. Liu, and M. Pietikainen, “Deep learning for generic object detection: A survey,” arXiv preprint arXiv:1809.02165, 2018.

[24] M. Everingham, L. Van Gool, C. K. Williams, J. Winn, and A. Zisserman, “The PASCAL visual object classes (VOC) challenge,” International Journal of Computer Vision, vol. 88, no. 2, pp. 303–338, 2010.

[25] M. J. Shafiee, B. Chywl, F. Li, and A. Wong, “Fast YOLO: A fast you only look once system for real-time embedded object detection in video,” arXiv preprint arXiv:1709.05943, 2017.

[26] K. Lu, X. An, J. Li, and H. He, “Efficient deep network for vision-based object detection in robotic applications,” Neurocomputing, vol. 245, pp. 31–45, 2017.

[27] J. Zhang, M. Huang, X. Jin, and X. Li, “A real-time Chinese traffic sign detection algorithm based on modified YOLOv2,” Algorithms, vol. 10, no. 4, p. 127, 2017.

[28] N. Carlini and D. Wagner, “Towards evaluating the robustness of neural networks,” in Proceedings of the IEEE Symposium on Security and Privacy (S&P). IEEE, 2017, pp. 39–57.

[29] R. Gurbaxani and S. Mishra, “Traits & transferability of adversarial examples against instance segmentation & object detection,” arXiv preprint arXiv:1808.01452, 2018.

[30] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, 2016, pp. 770–778.

[31] O. Sener and V. Koltun, “Multi-task learning as multi-objective optimization,” in Advances in Neural Information Processing Systems, 2018, pp. 527–538.

[32] M. Sharif, S. Bhagavatula, L. Bauer, and M. K. Reiter, “Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition,” in Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. ACM, 2016, pp. 1528–1540.

[33] D. Meng and H. Chen, “MagNet: A two-pronged defense against adversarial examples,” arXiv preprint arXiv:1705.09064, 2017.

[34] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” Computer Science, 2014.

[35] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” Computer Science, 2013.

[36] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” arXiv preprint arXiv:1607.02533, 2016.

[37] N. Papernot, P. McDaniel, S. Jha, M. Fredrikson, Z. B. Celik, and A. Swami, “The limitations of deep learning in adversarial settings,” in Proceedings of the IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2016, pp. 372–387.

[38] H. Hosseini and R. Poovendran, “Semantic adversarial examples,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops. IEEE, 2018, pp. 1614–1619.

[39] L. Engstrom, D. Tsipras, L. Schmidt, and A. Madry, “A rotation and a translation suffice: Fooling CNNs with simple transformations,” arXiv preprint arXiv:1712.02779, 2017.

[40] E. Yang, T. Liu, C. Deng, and D. Tao, “Adversarial examples for Hamming space search,” IEEE Transactions on Cybernetics, vol. PP, no. 99, pp. 1–12, 2018.

[41] A. Athalye and I. Sutskever, “Synthesizing robust adversarial examples,” arXiv preprint arXiv:1707.07397, 2017.

[42] T. B. Brown, D. Mane, A. Roy, M. Abadi, and J. Gilmer, “Adversarial patch,” arXiv preprint arXiv:1712.09665, 2017.

[43] P.-Y. Chen, H. Zhang, Y. Sharma, J. Yi, and C.-J. Hsieh, “ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models,” in Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security. ACM, 2017, pp. 15–26.

[44] S. Liu, B. Kailkhura, P.-Y. Chen, P. Ting, S. Chang, and L. Amini, “Zeroth-order stochastic variance reduction for nonconvex optimization,” in Advances in Neural Information Processing Systems, 2018, pp. 3727–3737.

[45] A. N. Bhagoji, W. He, B. Li, and D. Song, “Practical black-box attacks on deep neural networks using efficient query mechanisms,” in Proceedings of the European Conference on Computer Vision (ECCV). Springer, 2018, pp. 158–174.

[46] M. Alzantot, Y. Sharma, S. Chakraborty, and M. Srivastava, “GenAttack: Practical black-box attacks with gradient-free optimization,” arXiv preprint arXiv:1805.11090, 2018.

[47] X. Chen, C. Li, D. Wang, S. Wen, J. Zhang, S. Nepal, Y. Xiang, and K. Ren, “Android HIV: A study of repackaging malware for evading machine-learning detection,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 987–1001, 2019.

[48] I. Evtimov, K. Eykholt, E. Fernandes, T. Kohno, B. Li, A. Prakash, A. Rahmati, and D. Song, “Robust physical-world attacks on machine learning models,” arXiv preprint arXiv:1707.08945, vol. 2, no. 3, p. 4, 2017.


APPENDIX

More results of the digital Daedalus attack are included in this section.


Fig. 17: Adversarial examples made by the L0 attack. The first row contains the original images. The third row contains the low-confidence (γ = 0.3) adversarial examples. The fifth row contains the high-confidence (γ = 0.7) examples. The detection results from YOLO-v3 are in the rows below them. The confidence controls the density of the redundant detection boxes in the detection results.


Fig. 18: Adversarial examples made by the L2 attack. The first row contains the original images. The third row contains the low-confidence (γ = 0.3) adversarial examples. The fifth row contains the high-confidence (γ = 0.7) examples. The detection results from YOLO-v3 are in the rows below them. The confidence controls the density of the redundant detection boxes in the detection results.


Fig. 19: The detection results of Daedalus adversarial examples made by the L2 attack towards RetinaNet-ResNet-50. The adversarial examples are crafted based on a confidence of 0.3.


Fig. 20: The detection results of Daedalus adversarial examples made by the ensemble L2 attack towards YOLO-v3, RetinaNet, and SSD. The adversarial examples are crafted based on a confidence of 0.3. The first row displays the benign examples. The second, the third, and the fourth rows are the detection results of the benign examples from YOLO-v3, RetinaNet, and SSD, respectively. The Daedalus examples are displayed in the fifth row. The detection results of the adversarial examples from YOLO-v3, RetinaNet, and SSD are in the following rows.


Fig. 21: Attacking the object category which has the largest number of detection boxes before NMS. The attacked model is YOLO-v3.

Fig. 22: Demonstration of attacking objects in a specified category for YOLO-v3. We select ‘person’ to attack.


Fig. 23: Demonstration of attacking objects in a specified category for YOLO-v3. We select ‘car’ to attack.

A. Parameters of the attack

The details of the transformations used for crafting the Daedalus posters are summarised in Table V. The table presents the details for making a wp × hp poster.
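As a rough illustration of how one sample from Table V could be drawn, the sketch below generates a single random transformation for a wp × hp poster pasted into the 416 × 416 detector input obtained from a wv × hv video frame. It assumes NumPy; the function name, the returned dictionary, and the per-axis reading of the aspect ratio α are our own choices rather than the paper's code.

```python
import numpy as np

def sample_poster_transformation(wp, hp, wv, hv):
    """Draw one random transformation following the ranges in Table V (sketch)."""
    noise = np.random.uniform(0.0, 0.01)                  # additive random noise, U(0, 0.01)
    ax, ay = 416.0 / wv, 416.0 / hv                       # aspect ratio alpha, per axis
    m = min(wv / wp, hv / hp)
    z = np.random.uniform(0.1 * m, 0.7 * m)               # zooming factor z
    angle = np.random.uniform(-np.pi / 10, np.pi / 10)    # rotation
    x = np.random.uniform(0.0, 416.0 - ax * z * wp)       # top-left x of the pasted poster
    y = np.random.uniform(0.0, 416.0 - ay * z * hp)       # top-left y of the pasted poster
    beta = np.random.uniform(0.001, 0.01)                 # SNPS rate
    return {"noise": noise, "zoom": z, "angle": angle, "position": (x, y), "snps_rate": beta}
```

Sampling a fresh transformation at every optimisation step and averaging the loss over the samples is the usual expectation-over-transformation recipe for making the poster robust to viewpoint and scale changes.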

B. Demonstration of the attack

As shown in Fig. 24 and Fig. 25, the Daedalus attack can fool a smart inventory management (SIM) system by including false positives in the identification results3.


Fig. 24: (a) Detection results of a MacBook in a benign setting. (b) Detection results of the MacBook when we apply a Daedalus poster attacking all object categories.


Fig. 25: (a) Detection results of a monitor in a benign setting. (b) Detection results of the monitor when we apply a Daedalus poster attacking the ‘car’ category.

3 A demo video is on YouTube: https://youtu.be/U1LsTl8vufM.


TABLE V: List of Transformations

Inputs: Perturbation size: wp × hp; video frame size: wv × hv

Transformation                        Parameter
Random noise                          U(0, 0.01)
Aspect ratio (α)                      416/wv : 416/hv
Zooming (z)                           0.1 × min(wv/wp, hv/hp) ∼ 0.7 × min(wv/wp, hv/hp)
Rotation                              −π/10 → π/10
Position (top-left coordinates)       x ∼ U(0, 416 − α·z·wp), y ∼ U(0, 416 − α·z·hp)
SNPS rate (β)                         0.001 ∼ 0.01