AI In Cyber Security: A Balancing Force or Disruptor?

Vijay Dheap

@dheap

SESSION ID: AIR-T08

#RSAC

Topics

A quick primer on Artificial Intelligence (AI)

Motivations and applications of AI in cyber security

AI adoption by malicious actors

Challenges and risks with AI for Cyber Security

Conclusion: Beginning your AI journey

A quick primer on Artificial Intelligence

What is AI?

Definition of artificial intelligence: 1: a branch of computer science dealing with the simulation of intelligent behavior in computers. 2: the capability of a machine to imitate intelligent human behavior.

- Merriam-Webster Dictionary

AI capabilities (diagram): Vision, Speech, Natural Language Processing, Logical Analysis, Learned Behaviors, Autonomous Movement, …

Machine Learning: Building block of AI

A subfield of computer science that enables computers to learn without being explicitly programmed - Arthur Samuel in 1959

Supervised

Inferring a general rule or mathematical function from labeled training data to be applied to other data

Primary Use Cases
• Regression Analysis
o Deriving correlation relationships between variables and estimating the strength of those relationships
o Prediction & Forecasting
o Example: Vulnerability prioritization
• Classification
o Produces a model from a training set that can assign unseen inputs into different categories
o Example: Sentiment detection

Unsupervised

Detecting the presence of patterns or models from unlabeled data

Primary Use Cases
• Clustering
o Data is divided into different groups based on one or more attributes
o Example: Peer group determination
• Dimensionality Reduction
o Process of reducing the number of random variables under consideration
o Example: Feature Selection and extraction of false positives

Reinforcement

Refining behavior based on external assessment and feedback

Primary Use Cases
• Trade-off Analysis
o Balancing short term or long term reward
o Selection of rewards based on context
o Example: Risk assessment
• Optimization
o Maximizing reward function through simulations or observations
o Example: security configuration optimization
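The reinforcement idea of maximizing a reward function through simulations or observations can be sketched as an epsilon-greedy bandit. This is a minimal illustration, not something taken from the slides: the three configuration names, their reward values, and the `simulate_reward` function are hypothetical stand-ins for whatever simulation or observed outcome would score a security configuration.

```python
import random

# Hypothetical configurations and their mean rewards (unknown to the learner).
BASE_REWARD = {"strict": 0.4, "balanced": 0.8, "permissive": 0.2}


def simulate_reward(config, rng):
    """Stand-in for a simulation or observation that scores a configuration."""
    return BASE_REWARD[config] + rng.uniform(-0.05, 0.05)


def epsilon_greedy(configs, steps=500, epsilon=0.1, seed=0):
    """Learn which configuration earns the highest average reward."""
    rng = random.Random(seed)
    counts = {c: 0 for c in configs}
    estimates = {c: 0.0 for c in configs}

    def update(config):
        reward = simulate_reward(config, rng)
        counts[config] += 1
        # Incremental mean: new = old + (reward - old) / n
        estimates[config] += (reward - estimates[config]) / counts[config]

    for config in configs:  # try every configuration once
        update(config)
    for _ in range(steps):
        if rng.random() < epsilon:
            choice = rng.choice(configs)  # explore a random configuration
        else:
            choice = max(configs, key=estimates.get)  # exploit the best estimate
        update(choice)
    return max(configs, key=estimates.get), estimates, counts


best, estimates, counts = epsilon_greedy(list(BASE_REWARD))
print(best, {c: round(v, 2) for c, v in estimates.items()})
```

The `epsilon` parameter is the trade-off knob: a higher value spends more trials exploring alternatives, a lower value exploits the current best estimate sooner.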

Traditional Data Analytics vs. Machine Learning

Classical statistics aims to formalize relationships between variables in the form of mathematical formulas.

Machine learning can offer greater precision in its results because it avoids the simplifying assumptions typically incorporated into manually created statistical models.

The quality of the underlying data influences both traditional data analytics and machine learning since both are operating on the data syntactically

Deep Learning: Evolution of Machine Learning

Application to learning tasks of artificial neural networks (ANNs) that contain more than one hidden layer. The "deep" architectures can vary considerably, with each implementation being optimized for different tasks or goals

Reasons for Deep Learning
1. Pattern Complexity: the number of patterns to recognize
2. Pattern Reuse: Learn intricate patterns by building on the work of the previous layer

Example: Anomaly Detection

Cognitive Reasoning

Cognitive reasoning flow (diagram labels): Unknown Event (potentially incomplete context); Identification Model; Inference: Hypothesis Generation; Domain Knowledge; Evidence Gathering and Semantic Processing (context enrichment); Probabilistic Conclusion; Continuous learning

Example: Context enrichment of a security incident, quantification of impact and cataloging evidence of the kill chain
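One way to picture the step from hypothesis generation through evidence gathering to a probabilistic conclusion is a Bayesian update over competing hypotheses. The sketch below is an assumption about how such a step could look, not the method behind the slide: the hypotheses, prior beliefs, and likelihood values are invented for illustration, and the pieces of evidence are treated as conditionally independent given each hypothesis.

```python
def bayes_update(priors, likelihoods):
    """Return P(hypothesis | evidence) from priors and P(evidence | hypothesis)."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("evidence is impossible under every hypothesis")
    return {h: p / total for h, p in unnormalized.items()}


# Hypothetical hypotheses for an unknown event, with illustrative prior beliefs.
beliefs = {"malware": 0.2, "misconfiguration": 0.5, "benign": 0.3}

# Each entry: (evidence description, P(evidence | hypothesis)). Values are assumptions.
evidence = [
    ("outbound connection to rare domain",
     {"malware": 0.8, "misconfiguration": 0.1, "benign": 0.1}),
    ("new scheduled task created",
     {"malware": 0.6, "misconfiguration": 0.2, "benign": 0.1}),
]

for description, likelihoods in evidence:
    beliefs = bayes_update(beliefs, likelihoods)
    print(description, {h: round(p, 3) for h, p in beliefs.items()})
```

Each new piece of evidence reweights the hypotheses, so the conclusion stays probabilistic and can be revised as the context is enriched.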

Motivations and applications of AI in Cyber Security

The current cyber landscape favors malicious actors

Global information solutions company, Equifax, has reported a major cybersecurity incident affecting 143 million consumers in the US.

Anthem: Hacked Database Included 78.8 Million People

SEC reveals it was hacked, information may have been used for illegal stock trades

“Big Four” accounting firm Deloitte was likely breached in October or November 2016, but the breach wasn’t discovered by the firm until March 2017

The Cost of Cyber Security Operations Continues to Increase without Mitigating Risk

Cyber Security Imperatives to achieve a more favorable equilibrium

Scale Security Ops: With an increasing volume and sophistication of attacks, organizations need a force multiplier for their security teams

Assist Decisions: Given a broader threat surface area, security teams need assistance in effective and accurate data-driven decision making

Improve Responsiveness: Due to increasing risk of compromise, institutions and individuals will demand faster response to breaches

Be Proactive: The goal is to instrument proactive security controls to minimize exposure to emerging threats

The Appeal of AI for Cyber Security

Automate Operational Tasks

Present Complete Security Context

Mitigate Human Biases

Recall Relevant Knowledge

Develop Predictive Capabilities

Derive Actionable Intelligence

Provide Expert Advice

Propose Best Practices

Assess Risk

Remove Noise

Revisiting the Cyber Security Lifecycle

Detection of Threats & Risks

Key Requirements:
• Monitoring for anomalous activity
• Identification of current or potential threats

Activities of Security Professionals:
• System Integration & Configuration
• Threat hunting

Traditional Operations:
• Instrumented and automated but requires deployment integration
• Pattern/signature or rule based detection
• Generates significant volume of alerts

Investigation & Qualification of Security Alerts

Key Requirements:
• Validation of escalated incidents as security concerns
• Risk assessment of detected security threats – scope and magnitude

Activities of Security Professionals:
• Security Analysis
• Threat Hunting

Traditional Operations:
• Highly manual investigative process
• Inconsistent methodologies
• Prone to human biases and plagued by lack of skills

Incident Response & Governance

Key Requirements:
• Formulating responses to specific risks or incidents
• Tracking incidents from detection to investigation to resolution

Activities of Security Professionals:
• Incident Response
• System Administration / Program Management

Traditional Operations:
• Time intensive manual security assessments
• Heavily reliant on experiential knowledge of individuals
• Delays in instituting responses

Applied AI: Detection of Threats & Risks

Naive Bayes classifiers for Spam filtering

Holt Winters algorithm for Network Anomaly Detection

Clustering for baselining behaviors for anomaly detection

Peer group analysis for insider threat detection

Optimizing attack detection through observational reinforcement and deep nets

Timeline (horizontal axis): Late 1990’s and early 2000’s, Mid to Late 2000’s, Early to mid 2010’s, Present, Near Future

Vertical axis: Increasing AI sophistication and maturity

Incident Classification: recognizing type and nature of threat

Anomaly Detection: behavior analysis of users, network, assets, data, applications

Information Synthesis: profiling actors and activities by constructing dynamic models

Proactive Prioritization: anticipating outcomes based on current events and historical knowledge

Cognitive threat hunting

Malware detection using various techniques: clustering, classification decision trees and deep nets

Long history of automated data analysis, and initial focus of most AI initiatives in cyber security

Detection of malware-generated domains with Recurrent Neural Models
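The Holt Winters approach to network anomaly detection named on this slide can be sketched as follows: forecast each sample from a level, a trend, and a seasonal component, then flag samples whose forecast error is too large. This is a simplified additive variant under stated assumptions, not a production detector: the fixed `threshold`, the zero initial trend, the choice to skip model updates on flagged points, and the sample traffic series are all illustrative.

```python
def holt_winters_anomalies(series, period, alpha=0.2, beta=0.1, gamma=0.1,
                           threshold=25.0):
    """Return indices whose one-step-ahead forecast error exceeds `threshold`.

    Additive Holt-Winters; the first full period initializes the model.
    """
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods of data")
    level = sum(series[:period]) / period
    trend = 0.0  # simplification: assume no initial trend
    seasonal = [y - level for y in series[:period]]
    anomalies = []
    for t in range(period, len(series)):
        y = series[t]
        s = seasonal[t % period]
        forecast = level + trend + s
        if abs(y - forecast) > threshold:
            anomalies.append(t)
            continue  # do not let the outlier contaminate the baseline
        previous_level = level
        level = alpha * (y - s) + (1 - alpha) * (level + trend)
        trend = beta * (level - previous_level) + (1 - beta) * trend
        seasonal[t % period] = gamma * (y - level) + (1 - gamma) * s
    return anomalies


# Hypothetical request counts with a period of 4 samples and one injected spike.
traffic = [10, 20, 30, 20] * 6
traffic[14] = 120
print(holt_winters_anomalies(traffic, period=4))
```

In practice the threshold would usually be derived from the observed forecast error rather than fixed, and the smoothing parameters would be tuned to the traffic being monitored.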

Applied AI: Alert Investigation and Qualification

Vertical axis: Increasing AI sophistication and maturity

False positive and noise reduction using Principal Component Analysis and deep nets

Dimensionality Reduction: understanding the key traits that influence risk

Attack Tactics Resolution: Dissecting the process and goals of an attack

Threat Research: automated incident specific information gathering

Enriched Incident Context: identifying affected entities and relationships among them

Timeline (horizontal axis): Late 1990’s and early 2000’s, Mid to Late 2000’s, Early to mid 2010’s, Present, Near Future

Natural Language Processing (NLP) for semantic analysis powered by deep nets

Deducing behaviors and motivations using cognitive and time series analysis

Clustering and classification of indicators by threat type

Key investment segment to address skills shortage in security analysis to improve security outcomes
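To make the dimensionality reduction item concrete, the sketch below finds the first principal component of a small feature table using power iteration on the covariance matrix. It is a minimal, dependency-free illustration: the feature names and alert rows are hypothetical, and a real deployment would more likely rely on a numerical library. Power iteration can also stall if the starting vector happens to be orthogonal to the dominant direction, which this sketch does not guard against.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (unit direction, share of total variance) of the first component."""
    n, d = len(rows), len(rows[0])
    if n < 2:
        raise ValueError("need at least two rows")
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]

    def cov_times(vec):
        return [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]

    # Power iteration converges toward the eigenvector with the largest eigenvalue.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = cov_times(v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # no variance at all
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, cov_times(v)))
    total = sum(cov[i][i] for i in range(d))
    share = eigenvalue / total if total else 0.0
    return v, share


# Hypothetical alert features; the third column is constant and carries no signal.
features = ["failed_logins", "hosts_contacted", "sensor_id"]
alerts = [[1, 2, 5], [2, 4, 5], [3, 6, 5], [4, 8, 5]]

direction, share = first_principal_component(alerts)
for name, weight in zip(features, direction):
    print(f"{name}: {weight:.3f}")
print(f"variance explained: {share:.2f}")
```

The weights show which traits dominate the variation across alerts; features with near-zero weight are candidates to drop before further analysis.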

Applied AI: Incident Response

Timeline (horizontal axis): Early to mid 2010’s, Present, Near Future

Vertical axis: Increasing AI sophistication and maturity

Response Simulation: projecting the security and business repercussions of a response plan

Impact Assessment: identifying all affected entities and quantifying the scope of risk

Orchestration: automation of incident response plan

Recommended Actions: suggested response plan based on historical outcomes

Decision matrices and observational learning to mimic manual cognitive processes

Cognitive analysis for best practice selection

Risk classification of security incidents using regression analysis and deep nets

Trade-off analysis for process disruption and security risk calculation

Significant potential to scale security operations and improve overall response times.
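The trade-off between process disruption and security risk can be illustrated with a small decision matrix that ranks response options by residual risk plus disruption cost. Everything in this sketch is a hypothetical assumption made for illustration: the option names, the risk-reduction fractions, the costs (in arbitrary units), and the additive cost model itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResponseOption:
    name: str
    risk_reduction: float   # fraction of the expected loss removed, 0..1
    disruption_cost: float  # business cost of executing the response


def rank_responses(options, expected_loss):
    """Sort options by residual risk plus disruption cost, lowest first."""
    def total_cost(option):
        residual_risk = expected_loss * (1.0 - option.risk_reduction)
        return residual_risk + option.disruption_cost

    return sorted(options, key=total_cost)


# Hypothetical response options for a compromised workstation.
options = [
    ResponseOption("monitor only", risk_reduction=0.0, disruption_cost=0.0),
    ResponseOption("reset credentials", risk_reduction=0.6, disruption_cost=5.0),
    ResponseOption("isolate host", risk_reduction=0.9, disruption_cost=20.0),
    ResponseOption("shut down segment", risk_reduction=0.99, disruption_cost=80.0),
]

for loss in (1.0, 100.0, 1000.0):
    best = rank_responses(options, expected_loss=loss)[0]
    print(f"expected loss {loss}: {best.name}")
```

The recommended action changes as the estimated impact grows, which is why impact assessment feeds directly into the recommended response plan.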

A Case Study: Network Analysis powered by AI

Data: Network Traffic; Phishing Kits

Machine Learning Models: Malicious Traffic Classification; Principal components of malicious domains and sites

Security Value: Detection and validation of suspicious traffic, threat actors, malicious domains and susceptible internal systems and users

AI adoption by malicious actors

Malicious Actors and AI

Increasing Success & Falling Costs: Current tools and tactics are already delivering greater success while reducing costs, so any new investment in AI must promise higher returns

High Value Advanced Targets: Target organizations or specific outcomes that were previously deemed to carry too high a risk of exposure could now become feasible with AI

Short Time Window: When highly prevalent vulnerabilities are announced, some organizations may not respond quickly. AI could allow malicious actors to capitalize on that window of opportunity

Individualized Large-scale Compromise: Today centralized targets are prized for their high yield, but their compromise becomes public. AI could allow for large-scale decentralized compromises that stay hidden

AI Enhanced Kill Chain

Surveillance & Research

• Understanding security controls:
o Standard practices
o Specific Target
• Monitoring processes and activities
o Institutional practices
o Specific users
• Learn about IT infrastructures and solutions to reveal vulnerabilities

Breach

• Natural Context-aware messaging
o Email, text, tweets etc
• Adaptive tools
o Environment-aware behavior modification
o Evolving malware
• Reputation Spoofing

Exploit

• Diversionary or Evasive Tactics to confuse security controls
o Generate noise
• Dynamic Tactics
o Embedded data transfers
o Entity baseline modification
o False security event generation

Examples of AI Enhanced Kill Chain

Surveillance & Research

• Blackbox probing of security controls to gather results and possibly confidence (ex. through vendor product testing)
• Intelligent NLP powered crawlers to monitor social media and forum activities of individuals/aliases
• Automated analysis of security bulletins, policies and controls powered by NLP

Breach

• NLP powered social engineering bots
• Organization specific topic selection
• Simulating speech or writing style of colleagues
• Intelligent Malware
o Countermeasure aware polymorphism
o Information altering to mislead classifiers
o Generative adversarial network (GAN) to bypass machine learning based detection

Exploit

• Time synchronized DDOS smokescreen for data exfiltration
• Injection of benign activity similar to future malicious behavior to confuse anomaly detection
• Steganography to bundle sensitive data into copies of benign transmissions

Challenges and risks with AI for Cyber Security

Uniqueness of Applied AI in Cyber Operations

Active Adversary: Assume every action taken will be witnessed and an equal or greater effort will be invested to counter it

Data Availability: AI requires high volumes of high quality data to learn. Data silos and varying formats can affect training

Time Value Tradeoff: Given the dynamic cyber landscape, use cases need to stand the test of time and context or else they can negate value

Cautionary Tales

• Purely syntactic analysis can lead to overfitting to specific attributes or features of the data

• Lack of understanding of how an algorithm arrives at its answer can hide flaws in its design or data selection

Today it's still easy to throw off a recognition system by strategically inserting non-relevant data. If attackers know how information is classified, they can trick the learned behavior into a faulty interpretation

Altered street signs confuse driverless cars
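The same weakness shows up in even the simplest text classifier. The sketch below trains a tiny Naive Bayes spam filter and then shows how padding a message with unrelated, benign-looking words flips its verdict. The training messages and the padded message are invented for illustration, and the classifier is a bare-bones version written for this example rather than any particular product's implementation.

```python
import math
from collections import Counter


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self, labeled_docs):
        self.word_counts = {}
        self.doc_counts = Counter()
        for text, label in labeled_docs:
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}

    def log_score(self, text, label):
        counts = self.word_counts[label]
        total = sum(counts.values()) + len(self.vocab)
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in text.lower().split():
            if word in self.vocab:  # ignore words never seen in training
                score += math.log((counts[word] + 1) / total)
        return score

    def classify(self, text):
        return max(self.doc_counts, key=lambda label: self.log_score(text, label))


# Hypothetical training data.
training = [
    ("win free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim free prize click", "spam"),
    ("meeting agenda for monday", "ham"),
    ("project report attached for review", "ham"),
    ("lunch meeting monday", "ham"),
]
model = NaiveBayes(training)

original = "claim free prize now"
# The attacker appends words the model has learned to associate with legitimate mail.
padded = original + " meeting monday for report agenda review"

print(model.classify(original))
print(model.classify(padded))
```

Because the classifier only weighs surface features, an attacker who can guess which words count as evidence of legitimacy can outvote the malicious ones without changing them.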

Getting Started

Building AI powered Security Controls

Use Case Definition: Focus the problem statement to improve odds of success and minimize time to value

Gathering Training Data: Curate data to minimize possibility of poisoning the learned model and guarantee required richness

Features Selection: Employ security domain knowledge and context to validate the analytical process

Machine Learning Model Selection: Develop data science and machine learning expertise to match algorithms to problem

Tuning: Refine the model to improve robustness and iteratively broaden the scope of solution to expand value

Action Results: Emphasize consumability of implementation to operationalize the solution

Embracing AI: An Example

Use Case Definition

Goal: Insider threat – Identify risky identities and prevent exfiltration of data

Gathering Training Data

Security Data sources: SIEM, endpoint logs, proxy logs, access control logs, application logs, network traffic, email content

Data Preparation: normalization, filtering, annotation etc.

Features Selection

Attributes of risky identity: negative sentiment, inconsistent activity, access to critical or sensitive data

Machine Learning Model Selection

Build:
- Sentiment analytics of email content or social media activity
- Peer group analysis to highlight behavioral anomalies

Buy: validate use case coverage and accuracy

Tuning

Train: Machine learning models tested on internal data sources to minimize false positives

Action Results

Operational Focus: Visualization for dashboards and notifications and data export into existing security tools – SIEMs, Firewalls, Device Management etc.
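Peer group analysis for highlighting behavioral anomalies, one of the build options in this example, can be sketched by comparing each identity against the rest of its peer group. This is a minimal illustration under stated assumptions: the user names, group assignments, activity values (daily megabytes pulled from a sensitive share), and the `z_threshold` of 3.0 are hypothetical, and a real system would combine many signals rather than one metric.

```python
from statistics import mean, stdev


def peer_group_outliers(activity, groups, z_threshold=3.0):
    """Flag users whose activity deviates strongly from the rest of their peer group.

    activity: {user: numeric metric}; groups: {user: peer group name}.
    Each user is compared against peers excluding themselves (leave-one-out),
    so a single extreme value cannot hide by inflating its own group's spread.
    """
    flagged = {}
    for user, value in activity.items():
        peers = [v for u, v in activity.items()
                 if u != user and groups[u] == groups[user]]
        if len(peers) < 2:
            continue  # not enough peers to estimate a baseline
        spread = stdev(peers)
        if spread == 0:
            continue  # identical peers: z-score undefined in this sketch
        z = (value - mean(peers)) / spread
        if abs(z) > z_threshold:
            flagged[user] = round(z, 1)
    return flagged


# Hypothetical daily megabytes downloaded from a sensitive file share.
activity = {
    "alice": 120, "bob": 135, "carol": 110, "dave": 128, "erin": 900,
    "frank": 40, "grace": 45, "heidi": 38, "ivan": 42,
}
groups = {
    "alice": "engineering", "bob": "engineering", "carol": "engineering",
    "dave": "engineering", "erin": "engineering",
    "frank": "finance", "grace": "finance", "heidi": "finance", "ivan": "finance",
}

print(peer_group_outliers(activity, groups))
```

With groups this small the scores are sensitive to individual values, which is one reason the tuning step tests models against internal data to keep false positives down.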

Summary & Conclusion

Collectively, AI security teams merely have to achieve a Nash equilibrium state such that their detection and response matches new threats posed by attackers
• For malicious actors such an equilibrium will not provide the necessary returns to justify the investment

It is important for organizations to invest in data science competencies to build solutions for unique security requirements and to assess vendor solutions for more generalized or advanced use cases
• Several tools and APIs now make machine learning and AI accessible to a broader range of developers

AI facilitates collective defense: rapid sharing of insights within the security community. However, without a collaboration-based approach AI will not discourage malicious actors, because the total cost of security for each organization may outweigh the perceived business risk

Thank you!
