How to Build Realistic Machine Learning Systems for Security?
Sadia Afroz ICSI and Avast
Rajarshi Gupta Avast
Machine Learning is necessary for detecting malware at scale
Evtimov, Ivan, et al. (2017). "Robust physical-world attacks on deep learning models." arXiv preprint arXiv:1707.08945.
Goodfellow, I. J., Shlens, J., & Szegedy, C. (2014). "Explaining and harnessing adversarial examples." arXiv preprint arXiv:1412.6572.
…but Machine Learning is unreliable, inexplicable and easily fooled
Is machine learning useful for security?
Let’s build a malware detector using machine learning
Malware + Benign → Extract features → Features → Train a model → Model
New file → Model → Malware
Quality of the data ==> Quality of the model
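The pipeline on this slide can be sketched in a few lines. This is a toy illustration, not the detector discussed in the talk: the byte-range histogram features, the nearest-centroid model, and the names `extract_features`, `train`, and `predict` are all assumptions made for the example, and the byte strings are invented stand-ins for labeled files.

```python
from collections import Counter

def extract_features(data, bins=4):
    """Toy feature extractor: fraction of bytes falling in each of `bins` value ranges."""
    counts = Counter(b * bins // 256 for b in data)
    total = max(len(data), 1)
    return [counts.get(i, 0) / total for i in range(bins)]

def train(samples):
    """Toy model: one centroid (mean feature vector) per label."""
    grouped = {}
    for data, label in samples:
        grouped.setdefault(label, []).append(extract_features(data))
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in grouped.items()
    }

def predict(model, data):
    """Assign the label whose centroid is closest to the new file's features."""
    feats = extract_features(data)

    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(feats, centroid))

    return min(model, key=lambda label: dist(model[label]))

# Invented stand-ins for labeled files (not real malware, not real data).
training_set = [
    (bytes([250, 251, 252, 253]), "malware"),
    (bytes([240, 245, 255, 200]), "malware"),
    (bytes([1, 2, 3, 4]), "benign"),
    (bytes([10, 20, 30, 40]), "benign"),
]
model = train(training_set)
new_file = bytes([230, 231, 232, 233])
print(predict(model, new_file))
```

Because the centroids are computed directly from the labels, swapping the labels in `training_set` swaps every prediction, which is the sense in which the quality of the data bounds the quality of the model.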
Is this malware?
CODE SAMPLE X
The answer depends on WHO you ask and WHEN you ask
X According to VirusTotal…
Sep 2019: ~42% of AVs considered it malware
Jan 2020: ~72% of AVs considered it malware
https://www.virustotal.com/gui/file/3120b563781b5ead9fdebc906818836329f362bf8e3ea7ee3dbfd4ceb0ebd8dd/detection
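The percentages above are the share of engines that flag the file at a given scan date. A minimal sketch of that arithmetic follows. The engine names and verdicts are invented for illustration and are not the actual VirusTotal report for this hash; `detection_ratio` is a hypothetical helper.

```python
def detection_ratio(verdicts):
    """Fraction of engines whose verdict is 'malicious' (True)."""
    if not verdicts:
        raise ValueError("no verdicts available")
    return sum(verdicts.values()) / len(verdicts)

# Invented verdicts from five hypothetical engines at two scan dates.
scan_early = {"av_a": True, "av_b": True, "av_c": False, "av_d": False, "av_e": False}
scan_late = {"av_a": True, "av_b": True, "av_c": True, "av_d": True, "av_e": False}

for name, scan in [("early", scan_early), ("late", scan_late)]:
    print(f"{name}: {detection_ratio(scan):.0%} of engines flag the file")
```

The same file yields a different ratio at each date, so any label derived from the ratio depends on when the scan was taken.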
How can we protect users from malware when we don’t know what malware is?
What is malware?
Malware → Run the file (users’ machine, virtual machine, or sandbox) → Analyze (static + dynamic)
Malware: highly suspicious files
Too time consuming!
What is malware? Solution: Get labels from other sources
We studied 40 papers from 2001-2019 to check where they get their ground truth from
[Bar chart of ground-truth sources: Collection, AV Label, Manual; axis ticks 0 to 50]
9 use labels by one AV
2 papers: Malware >=4, Benign == 0
2 papers: Malware >=5, Benign <=1
1 paper: Malware >=10, Benign == 0
1 paper: Malware == ALL, Benign == 0
1 paper: Malware == Majority, Benign == 0
1 paper: Malware == Weighted Majority, Benign == 0
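To see why these labeling rules are hard to compare, the sketch below applies each of them to the same scan result. The encodings are one reading of the slide's shorthand (`Malware >=4` taken to mean at least four engines flag the file, `Benign == 0` taken to mean none do). The weighted-majority rule is omitted because the slide does not give the weights. `label` and the example counts are hypothetical.

```python
def threshold_rule(malware_min, benign_max):
    """Build a rule: 'malware' at or above malware_min positives,
    'benign' at or below benign_max, otherwise None (sample left unlabeled)."""
    def rule(positives, total):
        if positives >= malware_min:
            return "malware"
        if positives <= benign_max:
            return "benign"
        return None
    return rule

def all_rule(positives, total):
    if positives == total:
        return "malware"
    return "benign" if positives == 0 else None

def majority_rule(positives, total):
    if positives > total / 2:
        return "malware"
    return "benign" if positives == 0 else None

RULES = {
    "Malware >=4, Benign == 0": threshold_rule(4, 0),
    "Malware >=5, Benign <=1": threshold_rule(5, 1),
    "Malware >=10, Benign == 0": threshold_rule(10, 0),
    "Malware == ALL, Benign == 0": all_rule,
    "Malware == Majority, Benign == 0": majority_rule,
}

def label(positives, total):
    """Label one scan result under every rule."""
    return {name: rule(positives, total) for name, rule in RULES.items()}

# Hypothetical scan: 6 of 60 engines flag the file.
for name, verdict in label(6, 60).items():
    print(f"{name:35s} -> {verdict}")
```

With 6 of 60 engines flagging the file, the first two rules call it malware while the other three leave it out of the dataset entirely, so papers using different rules are effectively evaluated on different data.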
How to compare different approaches?
What is malware? (AISec 2015)
• A number of very large and professional companies share their labels on VirusTotal
• Great correlation in general, especially for top companies
  • 96% agreement after 3 days
  • 99% agreement after 3 weeks
Professional Heuristics for Ground Truth
Avast Results (100k samples in Sep 2019)
[Plot; x-axis: # of days since first occurrence of sample]
Our (professional) rule of thumb for malware ground truth: one-week-delayed results on VT from the top few (<10) companies are good enough
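One possible encoding of this rule of thumb is sketched below. The slide fixes only the one-week delay and the restriction to a few top vendors. How those vendors' verdicts are combined is not stated, so the "any trusted vendor flags it" rule here is an assumption, as are the vendor names and the `ground_truth` helper.

```python
from datetime import date, timedelta

MIN_AGE = timedelta(days=7)                          # "one week delayed" from the slide
TOP_VENDORS = {"vendor_a", "vendor_b", "vendor_c"}   # placeholder names, fewer than 10

def ground_truth(first_seen, scan_date, verdicts):
    """Return a label only when the scan is at least a week after first sight."""
    if scan_date - first_seen < MIN_AGE:
        return None  # too fresh: vendor labels may still change
    trusted = {v: flagged for v, flagged in verdicts.items() if v in TOP_VENDORS}
    if not trusted:
        return None  # no trusted vendor has scanned the sample
    # Assumption: any trusted vendor flagging the file makes it malware.
    return "malware" if any(trusted.values()) else "benign"

verdicts = {"vendor_a": True, "vendor_b": False, "small_vendor": True}
print(ground_truth(date(2019, 9, 1), date(2019, 9, 3), verdicts))   # None (2 days old)
print(ground_truth(date(2019, 9, 1), date(2019, 9, 10), verdicts))  # malware
```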
Does the overall performance of the classifiers matter?
Which of the classifiers is best?
Depends upon where you look!
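One common way for "where you look" to matter is that two classifiers' ROC curves cross, so the ranking flips with the false-positive rate you can tolerate. Assuming that is the point of the slide's figure (which is not reproduced here), the sketch below shows the effect. All scores are made up for illustration, and `tpr_at_fpr` is a hypothetical helper.

```python
def tpr_at_fpr(benign_scores, malware_scores, max_fpr):
    """Best true-positive rate achievable by a score threshold with FPR <= max_fpr."""
    best = 0.0
    # Candidate thresholds: every observed score, plus one above all of them.
    candidates = sorted(set(benign_scores) | set(malware_scores))
    candidates.append(candidates[-1] + 1.0)
    for t in candidates:
        fpr = sum(s >= t for s in benign_scores) / len(benign_scores)
        tpr = sum(s >= t for s in malware_scores) / len(malware_scores)
        if fpr <= max_fpr:
            best = max(best, tpr)
    return best

# Made-up scores (higher = more malicious) for two hypothetical classifiers.
a_benign = [0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.5]
a_malware = [0.9] * 6 + [0.05] * 4
b_benign = [0.1] * 8 + [0.8, 0.9]
b_malware = [0.7] * 9 + [0.95]

for max_fpr in (0.0, 0.2):
    a = tpr_at_fpr(a_benign, a_malware, max_fpr)
    b = tpr_at_fpr(b_benign, b_malware, max_fpr)
    print(f"FPR <= {max_fpr:.0%}: A catches {a:.0%}, B catches {b:.0%}")
```

On this toy data, classifier A is better when no false positives are allowed and classifier B is better once 20% are tolerated, so a single overall number hides which one suits a given deployment.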
Adversarial attacks
More than 1500 papers on adversarial ML
Only 36 (2.4%) papers focus on evading malware detectors
Graph credit: Nicholas Carlini, Google Brain
Can adversarial malware evade malware detectors?
Are adversarial attacks harmful for users?
Adversarial attacks: feature space vs problem space
Extract features → Feature vector:
0 1 1 0
1 1 1 1
1 0 0 0
0 0 0 0
1 1 1 1
Evading Machine Learning Model
Checking Harm to Users
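The distinction can be made concrete with a toy feature-space attack. The sketch below flips bits of a binary feature vector until a linear model's score drops below its threshold. Everything in it (the weights, the threshold, the greedy strategy, the name `evade_in_feature_space`) is an illustrative assumption. Note what the code does not do: it never produces a file, so nothing guarantees that a working program with the resulting feature vector exists, or that such a program would still harm users.

```python
def score(weights, features):
    """Linear model: a higher score means 'more likely malware'."""
    return sum(w * x for w, x in zip(weights, features))

def evade_in_feature_space(weights, features, threshold):
    """Greedily flip the bits that lower the score most until score < threshold."""
    adv = list(features)

    def reduction(i):
        # How much the score drops if bit i is flipped:
        # +w when a set bit is cleared, -w when a clear bit is set.
        return weights[i] if adv[i] == 1 else -weights[i]

    flipped = []
    while score(weights, adv) >= threshold:
        i = max(range(len(adv)), key=reduction)
        if reduction(i) <= 0:
            break  # no remaining flip lowers the score
        adv[i] = 1 - adv[i]
        flipped.append(i)
    return adv, flipped

weights = [2.0, 1.5, -1.0, 0.5]   # made-up model weights
original = [1, 1, 0, 0]           # made-up feature vector
adv, flipped = evade_in_feature_space(weights, original, threshold=1.0)
print("adversarial vector:", adv, "flipped bits:", flipped)
```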
[Diagram: original file + New Section = file with the New Section added]
The new section can override an existing section
When adding a new section at the end of the last section, if the sample has overlay data, the new section will overwrite the overlay data.
[Diagram: file layout showing Section header, New section header, and New section 4]
Override existing sections
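The overlay problem described above comes down to offset arithmetic. Assuming PE-style files, overlay data is whatever lies past the end of the last section's raw data, and a new section written directly after the last section starts at that same offset. The sketch below works on plain numbers rather than a real parser: the `Section` tuple, the helper names, and the example layout are hypothetical, and file alignment is ignored for brevity.

```python
from typing import NamedTuple

class Section(NamedTuple):
    name: str
    raw_offset: int  # where the section's data starts in the file
    raw_size: int    # how many bytes it occupies in the file

def end_of_sections(sections):
    """File offset just past the raw data of the last section."""
    return max(s.raw_offset + s.raw_size for s in sections)

def overlay_size(sections, file_size):
    """Bytes stored after the last section (0 if there is no overlay)."""
    return max(file_size - end_of_sections(sections), 0)

def bytes_clobbered(sections, file_size, new_size):
    """Overlay bytes overwritten if a new section of `new_size` bytes is
    written directly after the last section instead of after the overlay."""
    return min(overlay_size(sections, file_size), new_size)

# Hypothetical layout: three sections followed by 0x300 bytes of overlay.
layout = [
    Section(".text", 0x400, 0x1000),
    Section(".data", 0x1400, 0x600),
    Section(".rsrc", 0x1A00, 0x200),
]
file_size = 0x1F00
print(hex(overlay_size(layout, file_size)))            # 0x300
print(hex(bytes_clobbered(layout, file_size, 0x200)))  # 0x200
```

A modification tool that checks `overlay_size` first can place the new section after the overlay, or refuse to modify the sample, instead of silently destroying data the program may rely on.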
Are adversarial attacks harmful to users?
9/36 papers changed the malware files
4/36 papers tried to execute the adversarial samples
0/36 papers check if the modified malware is harmful to users
[1] Xu et al., NDSS Talk: Automatically Evading Classifiers (including Gmail’s).
Is evading one classifier enough?
Sample → Signature* → Malware / Benign / Not Matched
Not Matched → Static → Malware / Benign / Maybe benign
* Hashes and hand-written rules
Static Sample
Benign
Maybe benign Dynamic
Malware Malware
Benign
Not MatchedSignature*
* Hashes and hand written rules
Is evading one classifier enough?
![Page 79: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/79.jpg)
Static Sample
Benign
Maybe benign Dynamic
Malware Malware
Benign
Malware
Benign
Not MatchedSignature*
* Hashes and hand written rules
Is evading one classifier enough?
![Page 80: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/80.jpg)
Static Sample
Benign
Maybe benign Dynamic Maybe Malware
More Analysis
Malware Malware
Benign
Malware
Benign
Not MatchedSignature*
* Hashes and hand written rules
Is evading one classifier enough?
![Page 81: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/81.jpg)
Static Sample
Benign
Maybe benign Dynamic Maybe Malware
More Analysis
Malware Malware
Benign
Malware
Benign
Not MatchedSignature*
* Hashes and hand written rules
Is evading one classifier enough?
![Page 82: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/82.jpg)
Static Sample
Benign
Maybe benign Dynamic Maybe Malware
More Analysis
Malware Malware
Benign
Malware
Benign
Not MatchedSignature*
* Hashes and hand written rules
Is evading one classifier enough?
![Page 83: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/83.jpg)
Static Sample
Benign
Maybe benign Dynamic Maybe Malware
More Analysis
Malware Malware
Benign
Malware
Benign
Not MatchedSignature*
We are here
* Hashes and hand written rules
Is evading one classifier enough?
![Page 84: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/84.jpg)
Is evading one classifier enough?
Sample → Signature* → Malware / Benign / Not Matched
Not Matched → Static → Malware / Benign / Maybe benign
Maybe benign → Dynamic → Malware / Benign / Maybe Malware
Maybe Malware → More Analysis
We are here
* Hashes and handwritten rules
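The layered flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the presenters' system: the hash sets, the `static_score` and `dynamic_score` fields, and the 0.1/0.9 thresholds are all hypothetical stand-ins. It shows why a sample that slips past the static stage without a confident verdict can still be caught by a later stage.

```python
from enum import Enum
from typing import Callable, List, Tuple


class Verdict(Enum):
    MALWARE = "malware"
    BENIGN = "benign"
    UNDECIDED = "undecided"  # "Not Matched", "Maybe benign", "Maybe Malware"


# Hypothetical signature data standing in for hashes and handwritten rules.
KNOWN_MALWARE_HASHES = {"hash-bad"}
KNOWN_BENIGN_HASHES = {"hash-good"}

Stage = Tuple[str, Callable[[dict], Verdict]]


def signature_stage(sample: dict) -> Verdict:
    """Return a verdict only when the sample hash is already known."""
    if sample["hash"] in KNOWN_MALWARE_HASHES:
        return Verdict.MALWARE
    if sample["hash"] in KNOWN_BENIGN_HASHES:
        return Verdict.BENIGN
    return Verdict.UNDECIDED


def score_stage(key: str, low: float, high: float) -> Callable[[dict], Verdict]:
    """Build a stage that thresholds a hypothetical score stored under `key`."""
    def stage(sample: dict) -> Verdict:
        score = sample[key]
        if score >= high:
            return Verdict.MALWARE
        if score <= low:
            return Verdict.BENIGN
        return Verdict.UNDECIDED
    return stage


# Assumed stage order: signature, then static, then dynamic analysis.
PIPELINE: List[Stage] = [
    ("signature", signature_stage),
    ("static", score_stage("static_score", low=0.1, high=0.9)),
    ("dynamic", score_stage("dynamic_score", low=0.1, high=0.9)),
]


def classify(sample: dict, stages: List[Stage] = PIPELINE) -> Tuple[str, Verdict]:
    """Run stages in order and stop at the first confident verdict."""
    for name, stage in stages:
        verdict = stage(sample)
        if verdict is not Verdict.UNDECIDED:
            return name, verdict
    # No stage was confident: hand the sample over for more analysis.
    return "more analysis", Verdict.UNDECIDED


# An adversarial sample that lowers its static score is still flagged later.
evasive = {"hash": "hash-new", "static_score": 0.4, "dynamic_score": 0.95}
print(classify(evasive))
```

In this sketch the perturbation that pushes the static score below the malware threshold does not end the analysis; the sample is simply passed on to the next stage.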
![Page 91: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/91.jpg)
Who is the adversary?
White box: Adversary has full access
Grey box (between the two extremes):
Adversary has full access to the features
Adversary can do unlimited queries
Adversary has access to the training data
Adversary can build substitute classifiers
Black box: Adversary has no access
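One way to make the adversary measurable is to write its capabilities down explicitly. The sketch below is an assumption-laden illustration, not a definition from the talk: the field names are hypothetical, and the rule "full access means white box, no capability means black box, anything in between means grey box" is one plausible reading of the spectrum on this slide.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class AdversaryCapabilities:
    """Hypothetical capability flags mirroring the items on the slide."""
    full_model_access: bool = False        # white box: full access
    feature_access: bool = False           # full access to the features
    unlimited_queries: bool = False        # can do unlimited queries
    training_data_access: bool = False     # has access to the training data
    substitute_classifier: bool = False    # can build substitute classifiers


def threat_model(caps: AdversaryCapabilities) -> str:
    """Label an adversary as white, grey, or black box (assumed mapping)."""
    if caps.full_model_access:
        return "white box"
    partial = [
        getattr(caps, f.name)
        for f in fields(caps)
        if f.name != "full_model_access"
    ]
    # Any partial capability places the adversary between the two extremes.
    return "grey box" if any(partial) else "black box"


print(threat_model(AdversaryCapabilities()))
print(threat_model(AdversaryCapabilities(unlimited_queries=True)))
print(threat_model(AdversaryCapabilities(full_model_access=True)))
```

Stating the flags up front makes it easier to compare evaluations that assume different adversaries.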
![Page 92: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/92.jpg)
How to Build Realistic Machine Learning Systems for Security?
Consistent ground truth
Measurable adversary
Proper evaluation
![Page 93: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/93.jpg)
Questions?
Rajarshi Gupta, VP, Head of AI, Avast
Deepali Garg, Senior Data Scientist, Avast
Fabrizio Bondi, AI Manager, Avast
Heng Yin, Associate Professor, UC Riverside
Wei Song, PhD Student, UC Riverside
Xuezixiang Li, PhD Student, UC Riverside
Research contributors
Sadia Afroz
![Page 94: How to Build Realistic Machine Learning Systems for Security?...How to Build Realistic Machine Learning Systems for Security? Sadia Afroz ICSI and Avast Rajarshi Gupta Avast Machine](https://reader033.vdocuments.net/reader033/viewer/2022051803/5febddef86d3a579617d6083/html5/thumbnails/94.jpg)