Cyber-Security: Some Thoughts
V.S. Subrahmanian
Center for Digital International Government
Computer Science Dept. & UMIACS
University of Maryland
[email protected]/~vs/
Parts of this talk reflect joint work with M. Albanese, S. Jajodia, C. Molinaro, A. Pugliese, N. Rullo, C. Thomas
V.S. Subrahmanian, Geo-Intelligence India 2013
Disclaimers
• All work described in this talk only uses open-source data.
• All work in this talk is basic research tested wherever possible against real-world data.
• All work reported in this talk has been published in the scientific literature.
Talk Outline
• Terminology
– Vulnerabilities
– Exploits
• Technology
– Monitoring networks for known attacks
– Monitoring networks for unknown attacks
– Social media (Sybil, sockpuppet) attacks
Terminology
• Vulnerability: Feature of software that can be used by an attacker – usually in a way unanticipated by the software designer – to attack a system. US National Vulnerability Database (nvd.nist.gov) contains over 56K vulnerabilities together with suggested patches.
• Exploit: a piece of code that takes advantage of a vulnerability to carry out an attack. Databases of exploits also exist; e.g., some sites claim over 22K exploits in their database.
The Cyber Trade: The Scary Part
• “Exploits as a service” is now cheap and efficient for attackers [criminals, nation states]
• Exploits (or parts thereof) for different kinds of attacks can be bought for a very small price compared to the prices for artifacts used in kinetic attacks
[Figure: system architecture, divided into OFFLINE and ONLINE parts. Labeled components: Real-time Observation Data (database, network, resource use, and more); ALE (Activity Learning Engine); Known Activities - Bad; Known Activities - Good; PASS (Parallel Activity Search System); Parallel Unexplained Activity Detection; tMAGIC Activity Detection Engine; Unexplained Activity Detection Engine; Security Analyst Interface.]
Attack Graphs
Attack Graphs
• C’s are conditions.
• V’s are vulnerabilities.
• C4 and C5 are both needed to exploit vulnerability V4.
• Vulnerability V4 causes condition C6.
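The condition/vulnerability relationship in this panel can be sketched as a small forward-chaining check. This is only an illustrative sketch: the dictionary layout and the function name `reachable_conditions` are assumptions, and only the V4 entry (needs C4 and C5, causes C6) comes from the slide.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

# Hypothetical encoding: vulnerability -> (conditions required, conditions caused).
# Only the V4 entry is taken from the slide; a real attack graph has many entries.
ATTACK_GRAPH: Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]] = {
    "V4": (frozenset({"C4", "C5"}), frozenset({"C6"})),
}


def reachable_conditions(
    initial: Iterable[str],
    graph: Dict[str, Tuple[FrozenSet[str], FrozenSet[str]]] = ATTACK_GRAPH,
) -> Tuple[Set[str], Set[str]]:
    """Return (conditions that can hold, vulnerabilities that can be exploited)."""
    held: Set[str] = set(initial)
    exploited: Set[str] = set()
    changed = True
    while changed:
        changed = False
        for vuln, (needs, causes) in graph.items():
            # A vulnerability is exploitable once ALL of its conditions hold.
            if vuln not in exploited and needs <= held:
                exploited.add(vuln)
                held |= causes
                changed = True
    return held, exploited


held, exploited = reachable_conditions({"C4", "C5"})
print(sorted(held), sorted(exploited))
```

With only C4 available, V4 stays out of reach and C6 is never produced, which is the "both needed" reading of the third bullet.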
Temporal Attack Graphs
• Only worry about vulnerabilities.
• Figure on left says vulnerability V4 can be exploited if V3 and either V1 or V2 can be exploited.
• Probabilistic versions exist.
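The rule "V4 can be exploited if V3 and either V1 or V2 can be exploited" can be sketched as an ordering check over a sequence of exploits. The AND-of-ORs table `PREREQS` and the function `is_valid_attack_sequence` are hypothetical names for illustration; V1, V2 and V3 are assumed to have no prerequisites, and the probabilistic versions mentioned above are not modeled.

```python
from typing import Dict, List, Sequence, Set

# Hypothetical AND-of-ORs encoding: every inner group needs at least one
# member exploited earlier. The V4 entry encodes "V3 and (V1 or V2)".
PREREQS: Dict[str, List[List[str]]] = {
    "V1": [],
    "V2": [],
    "V3": [],
    "V4": [["V3"], ["V1", "V2"]],
}


def is_valid_attack_sequence(
    seq: Sequence[str], prereqs: Dict[str, List[List[str]]] = PREREQS
) -> bool:
    """True if each exploit in seq has its prerequisites exploited before it."""
    done: Set[str] = set()
    for vuln in seq:
        groups = prereqs.get(vuln, [])
        if not all(any(v in done for v in group) for group in groups):
            return False
        done.add(vuln)
    return True


print(is_valid_attack_sequence(["V1", "V3", "V4"]))  # V3 and V1 precede V4
print(is_valid_attack_sequence(["V1", "V2", "V4"]))  # V3 is missing
```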
Databases of vulnerabilities and attack graphs are available
Attack Graphs Can be Merged
Merging a large set of attack graphs means that a single search over one stream of transactional data can find occurrences of many different attacks at once, instead of running one search per attack graph.
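The slide does not say how the merge is implemented. As one possible illustration, the sketch below assumes each attack is simplified to a linear sequence of event types and merges them into a prefix tree, so that shared prefixes are matched once during a single pass over the stream. The names `build_trie` and `detect` and the event labels are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def build_trie(patterns: Dict[str, List[str]]) -> dict:
    """Merge named event sequences into one prefix tree."""
    root = {"children": {}, "ends": []}
    for name, steps in patterns.items():
        node = root
        for step in steps:
            node = node["children"].setdefault(step, {"children": {}, "ends": []})
        node["ends"].append(name)
    return root


def detect(stream: Iterable[str], trie: dict) -> Set[str]:
    """One pass over the stream; returns names of patterns that occur as
    (not necessarily contiguous) subsequences of it."""
    active = [trie]          # trie nodes reached so far; the root is always active
    seen = {id(trie)}
    found: Set[str] = set()
    for event in stream:
        reached = []
        for node in active:
            child = node["children"].get(event)
            if child is not None and id(child) not in seen:
                seen.add(id(child))
                reached.append(child)
                found.update(child["ends"])
        active.extend(reached)  # added after the loop so one event advances one step
    return found


patterns = {
    "A": ["scan", "login", "exfil"],
    "B": ["scan", "login", "wipe"],
}
print(detect(["scan", "noise", "login", "wipe"], build_trie(patterns)))
```

Here the shared prefix "scan", "login" of the two patterns is tracked by a single pair of trie nodes rather than once per pattern.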
Attack Graphs
Attack graphs can be built semi-automatically to monitor live network traffic. But two key problems need to be solved:
• How to monitor huge volumes of traffic?
• How to identify unexpected activities that you did not know about in the past and add them to your activity knowledge base?
• Activities are both bad (attacks) and good (innocuous).
• Need models of both good and bad activities in order to identify what is abnormal or unexplained.
Finding Known Activities
PASS: Parallel Activity Search System
• Developed algorithm to identify all instances of a [known] activity in an observation stream that have at least a certain probability.
• Demonstrated the ability to automatically detect activities in a stream of observation data arriving at 500K+ observations per second on an 8-node cloud.
• Demonstrated the ability to identify unexplained behavior in observation streams with precision over 80% and recall over 70%.
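The first bullet above (finding every instance of a known activity that has at least a certain probability) can be illustrated with a small sequential sketch. It is not the PASS algorithm and has none of its parallelism. It assumes an activity model given as transition probabilities between event types, scores an occurrence by the product of its transition probabilities, and uses hypothetical names (`find_instances`, `EDGES`) and made-up numbers.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical activity model: (event, next event) -> transition probability.
EDGES: Dict[Tuple[str, str], float] = {
    ("scan", "login"): 0.8,
    ("login", "exfil"): 0.5,
    ("scan", "exfil"): 0.1,
}


def find_instances(
    stream: Sequence[str],
    edges: Dict[Tuple[str, str], float],
    start: str,
    end: str,
    min_prob: float,
) -> List[Tuple[List[int], float]]:
    """Return (stream indices, probability) for every occurrence of the
    activity from `start` to `end` whose probability is >= min_prob."""
    results: List[Tuple[List[int], float]] = []

    def extend(path: List[int], prob: float) -> None:
        last = stream[path[-1]]
        if last == end:
            results.append((path, prob))
            return
        for j in range(path[-1] + 1, len(stream)):
            p = edges.get((last, stream[j]))
            # Probabilities only shrink as a path grows, so pruning is safe.
            if p is not None and prob * p >= min_prob:
                extend(path + [j], prob * p)

    for i, event in enumerate(stream):
        if event == start:
            extend([i], 1.0)
    return results


stream = ["scan", "login", "ping", "exfil"]
for indices, prob in find_instances(stream, EDGES, "scan", "exfil", 0.3):
    print(indices, round(prob, 3))
```

Lowering `min_prob` admits the less likely direct "scan" to "exfil" occurrence as well, which is the trade-off the probability threshold controls.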
Unexplained Activities
• How can we look for activities that have never been anticipated?
• Answer
– Set up a framework to continuously track unexplained activities;
– Present unexplained activities quickly to a security analyst who
  • Flags it as a bad activity or
  • Flags it as an OK activity
– Update repertoire of known activity models with this security analyst feedback.
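The track, present, flag, update loop above can be sketched in a few lines. This is a minimal sketch under assumptions: activities are represented as tuples of event names, the analyst is a callback returning "bad" or "good", and `review_unexplained` is a hypothetical name rather than part of any described system.

```python
from typing import Callable, Dict, Iterable, Set, Tuple

Activity = Tuple[str, ...]


def review_unexplained(
    unexplained: Iterable[Activity],
    known: Dict[str, Set[Activity]],
    ask_analyst: Callable[[Activity], str],
) -> Dict[str, Set[Activity]]:
    """Show each new unexplained activity to the analyst and store the verdict."""
    for activity in unexplained:
        # Skip anything the analyst has already labeled.
        if activity in known["bad"] or activity in known["good"]:
            continue
        label = ask_analyst(activity)
        if label not in ("bad", "good"):
            raise ValueError(f"unexpected label: {label!r}")
        known[label].add(activity)  # update the repertoire of known activities
    return known


known = {"bad": set(), "good": set()}
verdicts = {("scan", "login"): "bad", ("dns", "http"): "good"}
review_unexplained(
    [("scan", "login"), ("dns", "http"), ("scan", "login")],
    known,
    lambda activity: verdicts[activity],
)
print(known)
```

The repeated ("scan", "login") activity is only shown to the analyst once, since after the first verdict it is no longer unexplained.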
• What is an unexplained activity?
• It’s a sequence (not necessarily contiguous) of events that is inconsistent with all known activity models (good or bad).
• Unexplained does not necessarily mean bad.
• There is also a lot of work on statistical anomaly detection [not in my lab].
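To make the definition concrete, here is a deliberately simplified, deterministic sketch: each known model is assumed to be a plain sequence of event types, an event counts as explained if it takes part in some occurrence of a known model, and whatever is left over forms the unexplained sequence. The talk's actual detection is probability-based, which this sketch ignores. The function names and event labels are hypothetical.

```python
from typing import List, Sequence, Set, Tuple


def explained_indices(stream: Sequence[str], model: Sequence[str]) -> Set[int]:
    """Indices covered by greedy, non-overlapping occurrences of `model`
    (matched as a not necessarily contiguous subsequence of `stream`)."""
    covered: Set[int] = set()
    if not model:
        return covered
    current: List[int] = []
    pos = 0
    for i, event in enumerate(stream):
        if event == model[pos]:
            current.append(i)
            pos += 1
            if pos == len(model):      # a full occurrence was matched
                covered.update(current)
                current, pos = [], 0
    return covered


def unexplained_events(
    stream: Sequence[str], models: Sequence[Sequence[str]]
) -> List[Tuple[int, str]]:
    """Events not covered by any occurrence of any known (good or bad) model."""
    covered: Set[int] = set()
    for model in models:
        covered |= explained_indices(stream, model)
    return [(i, e) for i, e in enumerate(stream) if i not in covered]


stream = ["dns", "http", "ssh", "ftp", "http"]
known_models = [["dns", "http"], ["ssh"]]
print(unexplained_events(stream, known_models))
```

In this example the trailing "ftp", "http" events match no known model, good or bad, so they are reported; whether they are harmful is left to the analyst.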
Example Unexplained Activity
Unexplained Activity Detection
[Figure: panels labeled "Totally unexplained" and "Partially unexplained".]
Tested using network traffic from a university. Wireshark used to capture network traffic; SNORT used for activity models.
Unexplained Activity Detection
Tested using network traffic from a university. Wireshark used to capture network traffic; SNORT used for activity models.
[Figure annotations: looking for more top-K increases runtime; looking at more worlds increases runtime; increasing [symbol missing] reduces runtime; increasing sequence length reduces runtime.]
An Election Social Media Attack
Election Social Media Attack
Social Media Attacks
• A major state-backed threat.
• SMAs cause a viral increase in the number of social media posts in support of a particular cause or position.
• SMAs can destabilize decision making by a country by providing a false picture of support for or against a given position.
Other Relevant Work
• Algorithms to identify common patterns in huge networks (1B+ edges)
• Ability to update identified patterns in huge networks as the network changes (540M+ edges)
• Algorithms to find a set of K nodes that optimizes an arbitrary objective function on a network (31M+ edges)
• Algorithms to identify important nodes in attributed, weighted networks
• Learning to cluster malware variants
Current Directions
• Learning Activity Models – given that there is some set of low-level events that can be detected, can we learn the stochastic temporal automata directly from the data in a semi-supervised manner?
• Parallel Unexplained Activity Detection – can we scale up our current algorithms to identify unexplained activities in high throughput streams?
Contact Information
V.S. Subrahmanian
Dept. of Computer Science & UMIACS
University of Maryland
College Park, MD 20742
Tel: 301-405-6724
Email: [email protected]
www.cs.umd.edu/~vs/