![Page 1: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/1.jpg)
Data-driven Cyber Security to Counterfeit Malicious Attacks
Yang Xiang
Swinburne University of Technology, Australia
![Page 3: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/3.jpg)
Cybersecurity Lab Core Capabilities
Application security
• FinTech and blockchain
• Risk and decision making
• Trustworthiness
• Data privacy
• Spam detection
Data security
• Security analytics
• Threat prediction
• Machine learning for cyber
• Social networks security
• Insider attacks detection
System security
• Network, SDN, NFV security
• Cloud security
• CPS/IoT security
• Ransomware/Malware
• Autonomous security
Hardware – Software – Data – Service
![Page 4: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/4.jpg)
Real-world Data
Security
Modelling
Reasoning
![Page 5: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/5.jpg)
Research Methodology
Data-driven Cyber Security
• Cyber threat analysis
• Model security problem
• Data collection
• Machine learning customization
![Page 6: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/6.jpg)
Examples
Data-driven Cyber Security
• Software vulnerability detection
• ML-based malware detection
• Twitter spam detection
• Network traffic classification
![Page 7: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/7.jpg)
Software vulnerability detection
![Page 8: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/8.jpg)
500,000 Servers Affected
Millions Servers Attacked
150 Countries Affected
$4 Billion Loss
7~8% CPU Loss
Intel SGX
2014
$xxx Loss
2017
2017
2018
![Page 9: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/9.jpg)
Challenge 1: Software Complexity
• 45 million lines
• 61 million lines
• 70 million lines
• 100+ million lines
![Page 10: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/10.jpg)
Challenge 2: Vulnerability Numbers
• 2015: 6,480
• 2016: 6,447
• 2017: 14,714
• 2018: 20,000+ (54+/day)
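The per-day figure on this slide appears to follow from dividing a yearly count by the number of days in a year. A minimal sketch of that arithmetic, assuming the four counts pair with the four years in the order they appear, a 365-day year, and 20,000 as the lower bound for 2018 (the names `yearly_counts` and `per_day` are illustrative):

```python
# Yearly vulnerability counts as listed on the slide.
# Assumption: counts and years pair up in listed order; 2018 is a lower bound ("20,000+").
yearly_counts = {2015: 6480, 2016: 6447, 2017: 14714, 2018: 20000}

def per_day(count, days_in_year=365):
    """Average number of newly reported vulnerabilities per day."""
    return count / days_in_year

for year, count in sorted(yearly_counts.items()):
    print(year, round(per_day(count), 1))
```

At 20,000 or more reports per year the average is a little under 55 per day, which matches the "54+/day" label.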
![Page 11: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/11.jpg)
Challenge 3: Lack of Data
• Efficiency
• Effectiveness
• Security consideration not prioritised
• Insufficient resources
• Lack of labelled data
• Lack of datasets
• Labour-intensive feature engineering
• Insufficient security knowledge
• Scalability
![Page 12: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/12.jpg)
Observations
• Abstract Syntax Trees (ASTs) are an effective code representation.
• Software source code shares similar statistical properties with natural language.
• Vulnerabilities from different projects share common knowledge, which is discoverable by deep learning algorithms.
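To make the first observation concrete, the sketch below serializes a function's AST into a depth-first sequence of node types, the kind of token sequence a language-style model could consume. It is only an illustration: it uses Python's built-in `ast` module on Python source for convenience, while the slides do not state which parser or programming language the work used, and `ast_node_sequence` is a hypothetical helper name.

```python
import ast

def ast_node_sequence(source):
    """Return the AST node type names of `source` in depth-first (pre-order) order."""
    tree = ast.parse(source)
    sequence = []

    def visit(node):
        # Record the node type, then descend into its children in order.
        sequence.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    return sequence

# A small function whose write to buf[idx] has no bounds check.
example = """
def copy_item(buf, idx, value):
    buf[idx] = value
    return buf
"""

tokens = ast_node_sequence(example)
print(tokens)
```

Because the sequence keeps structural node types rather than raw text, two functions with different identifier names but the same shape map to the same sequence, which is one reason a tree-derived representation can transfer across projects.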
![Page 13: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/13.jpg)
Representation learning
The input → Low-level features → Mid-level features → High-level features
Latent, abstract features describing programming patterns/characteristics
Figure labels: CNN, AST, RNN
![Page 14: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/14.jpg)
Methodology
![Page 15: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/15.jpg)
Network Architecture
![Page 16: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/16.jpg)
Taxonomy – Our Work
Data
• Source code
• Binary / Assembly
Feature Engineering
• Categories: Pattern-based, Text-based, Code Properties
• Trees: Abstract Syntax Tree (AST)
• Graphs: Function Call Graphs, Data Flow Graphs, Control Flow Graphs, Dependency Graphs
• Program Slice
• Code Gadgets
• Imports/API calls
• Rules / Templates
• Bag-of-words
• Word2Vec / FastText / Code2Vec…
• Code metrics
ML Algorithms
• Conventional: Logistic Regression, SVM, Random Forest, Markov model…
• Deep learning: RNN (LSTM, GRU), DNN, Deep belief network
• Others: Genetic Algorithm
Evaluations
• Detection Performance: Accuracy, Precision, Recall, F-measure, Top-k precision/recall
• Efficiency
• Detection Granularity
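The detection-performance measures named in the taxonomy can be sketched in a few lines. This is a minimal illustration using the standard definitions, not the evaluation code behind the talk; the labels, scores, the 0.5 threshold and both function names are made up for the example.

```python
def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F-measure (F1) for binary labels (1 = vulnerable)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

def top_k_precision_recall(y_true, scores, k):
    """Precision and recall over the k functions ranked most likely vulnerable."""
    if k <= 0:
        raise ValueError("k must be positive")
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    hits = sum(y_true[i] for i in ranked)
    total_positives = sum(y_true)
    return hits / k, (hits / total_positives if total_positives else 0.0)

# Hypothetical ground truth and model scores for eight functions.
y_true = [1, 0, 0, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.1, 0.7, 0.2, 0.3, 0.6, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(precision_recall_f1(y_true, y_pred))
print(top_k_precision_recall(y_true, scores, k=3))
```

Top-k measures suit a review workflow in which an analyst only has time to inspect the k highest-ranked functions, so the ranking matters more than any single threshold.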
![Page 17: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/17.jpg)
The Datasets
• 457 vulnerable functions
• 32,531 non-vulnerable functions
• 6 open-source projects
• 1,000+ releases
• NVD and CVE repositories
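The two function counts imply a heavily imbalanced dataset. The sketch below works through that arithmetic and derives inverse-frequency class weights, a common heuristic for imbalanced training data. The slides do not say how the imbalance was handled, so the weighting scheme and all variable names here are assumptions for illustration.

```python
# Function counts taken from the slide.
vulnerable = 457
non_vulnerable = 32531

total = vulnerable + non_vulnerable
positive_rate = vulnerable / total            # share of vulnerable functions
imbalance_ratio = non_vulnerable / vulnerable # non-vulnerable functions per vulnerable one

# A classifier that always answers "non-vulnerable" already reaches this accuracy,
# which is why precision and recall are more informative than accuracy here.
trivial_accuracy = non_vulnerable / total

# Assumed heuristic (not from the slides): weight = total / (n_classes * class_count).
class_weights = {
    "vulnerable": total / (2 * vulnerable),
    "non_vulnerable": total / (2 * non_vulnerable),
}

print(f"positive rate: {positive_rate:.2%}")
print(f"imbalance ratio: {imbalance_ratio:.1f} to 1")
print(f"trivial accuracy: {trivial_accuracy:.2%}")
print(class_weights)
```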
![Page 18: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/18.jpg)
Results
![Page 19: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/19.jpg)
Results
![Page 20: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/20.jpg)
Results
![Page 21: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/21.jpg)
Binary Vulnerability Detection
![Page 22: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/22.jpg)
Future Work
• Binary-level detection: focusing on scenarios where the source code is unavailable.
• Instruction-level granularity: identifying multiple instructions (reverse-engineering) that are potentially vulnerable.
• Specific-type vulnerability detection: focusing on vulnerabilities caused by missing checks (e.g. numeric errors).
![Page 23: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/23.jpg)
Example 2 - ML-based malware detection
![Page 24: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/24.jpg)
Example 3 – Twitter spam detection
![Page 25: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/25.jpg)
Example 4 - Network traffic classification
![Page 26: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/26.jpg)
Research Methodology
• Collect data for security problem
• Extract raw or low-level features
• Apply data analysis
Security professionals
Domain knowledge
Model analytics
Data-driven Cyber Security
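The three steps of the methodology can be sketched end to end on a toy problem. Everything below is an assumption made for illustration and not the method used in the cited work: the four hand-written code snippets, the bag-of-words features, and the nearest-centroid classifier with cosine similarity. Keyword-level features like these are far too weak for real vulnerability detection and serve only to show how the stages connect.

```python
import math
import re
from collections import Counter

# Step 1: collect data for the security problem (hypothetical labelled samples, 1 = risky).
SAMPLES = [
    ("strcpy(dst, src);", 1),
    ("gets(buffer);", 1),
    ("strncpy(dst, src, sizeof(dst) - 1);", 0),
    ("fgets(buffer, sizeof(buffer), stdin);", 0),
]

# Step 2: extract raw or low-level features (bag-of-words over identifier tokens).
def extract_features(code):
    return Counter(re.findall(r"[A-Za-z_]\w*", code))

# Step 3: apply data analysis (nearest centroid under cosine similarity).
def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def train(samples):
    """Sum the token counts of each class into one centroid per label."""
    centroids = {}
    for code, label in samples:
        centroids.setdefault(label, Counter()).update(extract_features(code))
    return centroids

def predict(centroids, code):
    features = extract_features(code)
    return max(centroids, key=lambda label: cosine(features, centroids[label]))

model = train(SAMPLES)
print(predict(model, "strcpy(out, input);"))
print(predict(model, "fgets(line, sizeof(line), stdin);"))
```

In this framing, domain knowledge from security professionals enters at step 2, where it decides which tokens or structures are worth counting, and model analytics enters at step 3.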
![Page 27: Data-driven Cyber Security to Counterfeit Malicious Attacks · Data-driven Cyber Security to Counterfeit Malicious Attacks Yang Xiang Swinburne University of Technology, Australia](https://reader030.vdocuments.net/reader030/viewer/2022041100/5ed8cecb6714ca7f47689a4e/html5/thumbnails/27.jpg)
Resources
• G. Lin, J. Zhang, W. Luo, L. Pan, Y. Xiang, O. De Vel, and P. Montague, “Cross-Project Transfer Representation Learning for Vulnerable Function Discovery,” IEEE Transactions on Industrial Informatics, vol. 14, no. 7, pp. 3289-3297, 2018.
• C. Chen, Y. Wang, J. Zhang, Y. Xiang, W. Zhou, and G. Min, “Statistical Features Based Real-time Detection of Drifted Twitter Spam,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 4, pp. 914-925, 2017.
• J. Zhang, X. Chen, Y. Xiang, W. Zhou, and J. Wu, “Robust Network Traffic Classification,” IEEE/ACM Transactions on Networking, vol. 23, no. 4, pp. 1257-1270, 2015.
• S. Cesare, Y. Xiang, and W. Zhou, “Control Flow-based Malware Variant Detection,” IEEE Transactions on Dependable and Secure Computing, vol. 11, no. 4, pp. 307-317, 2014.
• S. Cesare, Y. Xiang, and W. Zhou, “Malwise - An Effective and Efficient Classification System for Packed and Polymorphic Malware,” IEEE Transactions on Computers, vol. 62, no. 6, pp. 1193-1206, 2013.
• J. Zhang, Y. Xiang, Y. Wang, W. Zhou, Y. Xiang, and Y. Guan, “Network Traffic Classification Using Correlation Information,” IEEE Transactions on Parallel and Distributed Systems, vol. 24, no. 1, pp. 104-117, 2013.
Sponsors & Collaborators