filar seymour oreilly_bot_story_
TRANSCRIPT
Security + Design * Data Science: A Bot Story
Bobby Filar & Rich Seymour
O’Reilly Security 2017
October 31, 2017
About Us
• Data Scientist
• Background in NLP
• @filar
• Data Scientist
• Background in HPC
• @rseymour
Bobby Filar Rich Seymour
• Endpoint Protection Platform
• ML/Domain Expertise
• @EndgameInc
• endgame.com
Endgame
State of Security Software
Detect Review Analyze Identify Notify Collect Validate React
Dwell Time Containment Time
Reduce Dwell Time using domain knowledge + ML
State of Security Software
Detect Review Analyze Identify Notify Collect Validate React
Dwell Time Containment Time
Reduce Dwell Time using domain knowledge + ML
• User and Entity Behavior Analytics (UEBA)
• Detection/Prevention
• Network Monitoring
State of Security Software
Detect Review Analyze Identify Notify Collect Validate React
Dwell Time Containment Time
Containment Time depends on real-time data availability and intuitivenessof the user interface
Opportunities to
Expand ML
Solutions
Reducing Containment Time Via ML is
Difficult!
• How do you optimize a human?
• How do you handle a diverse user base?
• Where do you find quality data?
• How can we optimize workflows?
User-Centric Design Study
Design Process
• Discovery• Understanding our users,
confirming/disproving biases, capturing
organizational workflows
• Concepting• Creating design requirements solutions
• Prototyping and User Testing• Feature creation and taking it back into the
‘wild’ for testing
Avoid Our Bias
Find What Users Need
Why Do This?
Objective Capture team dynamics and worker roles within security organization to identify challenges common across security teams
User Group Team Type Environment Collection Method
A Traditional SOC Day-to-day use User interviews
B Novice Training Team Mock Scenario Side-by-side
monitoring,
Retrospective & User
interviews
C Internal Red vs. Blue Mock Scenario Mirrored Scenario as
User Group B
D Traditional SOC & Consulting
group
Day-to-day use User testing
How our customers see it…
Insufficient Resources
• Onboarding & training
new hires & retention
• Limited time to review
alerts and incidents
Lack of easy-to-use
automated tools
• Difficult for non-
programmers to use
• Easy for programmers
to mess up!
Security platforms are just
difficult to use!
• Forces conformity
• Requires level of expertise
to extract value
Let’s meet the users
Tier 1
AnalystTier 3
Analyst
Forensic
Hunter
SOC
Manager
”Pretty much any alert or call here
goes through me first. It can be a
lot to handle."
Tier 1
Analyst
"I’m doing the hard work here,
also I could really use a Tier 2."
Tier 3
Analyst
"Don’t call me unless the server
room is on fire, I’m working on my
new radare2 theme."
Forensic
Hunter
"I have a meeting with the CISO,
then 3 one-on-ones and a report
due, no time for a quote."
SOC
Manager
Findings: Day in Life of a Tier I
Data
Deluge
Lack of
Context
Repetitive
Processes
Searching not
Analyzing
Lack of
Expertise
Lack of
Time
Findings: Pain Points
• Lacks context• Is it actually bad?• Is it anywhere else?• Did it talk to the network?
• Lacks connectivity• Is this alert tied to any others?
• Pivot on single IOC• Hash• Filename• IP address
Alert Type: Suspicious Binary
Alert Created: Feb 11, 2017
Severity: High
Confidence: 73%
File Path: C:\Temp\malware.exe
File Size: 45700
MD5: 5d41402abc4b2a76b9719d911017c592
File Created: Feb 11, 2017
What do you do when there are 100s of these each day?
Diverse Data Sources
Process File Users Network Registry
Solution
Make a Bot!
Hello, how can I help you?
You Can Start Making a Bot Today!
• Emphasize chatOps withHubot, Chattie, etc
• If you can leverage outside APIs!
• Define what you want to automate
• The bot transforms text into functions with arguments
search(process=‘fish’)
garbage input
usable output
scrubscrubscrub
But…
Solution
• An intelligent assistant can assist in the automation of tasks
• Guides user through complex tasks• Mimics human conversation• Natural Language Understanding
determines user intent
• Imagined an assistant that provides ability to:
• Ask questions• Execute workflows• Educate users• Recommend next steps
Hello, how can I help you?
Found the right tools (talk to us after for how)
spaCy
CRFsuite
From Research to Reality
Building an Intelligent Assistant From Scratch
Requirement #1
Eliminate query syntax via natural language
Natural Language vs. Query Language
Query LanguageSELECT * FROM TABLE process_event WHERE process_name == “odinaff.exe”;
Natural Language vs. Query Language
Query LanguageSELECT * FROM TABLE process_event WHERE process_name == “odinaff.exe”;
Natural LanguageSearch process event data for process odinaff.exe
Natural Language vs. Query Language
Query LanguageSELECT * FROM TABLE process_event WHERE process_name == “odinaff.exe”;
Natural LanguageSearch process event data for process odinaff.exe
RealityDid odinaff.exe run on any endpoints?
Interaction Types
Turn-based Conversation
User: Search processes
Artemis: Okay. Please provide a hash or filename
User: odinaff.exe
Artemis: Got it. Which endpoints would you like to target?
User: Windows 10 machines.
Artemis: Okay. Would you like to launch this search?
User: Yes
Artemis: Searching process data for odinaff.exe on
Windows10 endpoints.
Interaction Types
Turn-based Conversation
User: Search processes
Artemis: Okay. Please provide a hash or filename
User: odinaff.exe
Artemis: Got it. Which endpoints would you like to target?
User: Windows 10 machines.
Artemis: Okay. Would you like to launch this search?
User: Yes
Artemis: Searching process data for odinaff.exe on
Windows10 endpoints.
Resolve
Confirm
Handle
Interaction Types
Goal-oriented Conversation
User: Show me process event data for odinaff.exe
on all Windows 10 endpoints
Artemis: Are you sure?
User: Yes!
Interaction Types
API-Driven Investigations
curl 'api/v1/event_search' -H "Content-Type: application/json" -H
'authorization: <api_key> --data-binary '{"intent":"search_process",
"parameters": {
"process_name":"odinaff.exe",
"filepath": "C:\\Temp\\*.exe"
}
}'
Back to the Users!Requirement #2
Provide Context-Driven Alert Triage
•Understand what a user is currently viewing
• Ingest metadata in current view
•Map nested fields to common nomenclature
•Access indicators by saying “this <field>”
ContextSharing
Endpoint data
Endpoint data
Process data
Endpoint data
Process data
User data
“Search process data for this MD5 on all endpoints”
“Was this domain requested on any other endpoint?”
“Has user admin logged into this endpoint?”
Requirement #3
Educate Users on Platform Features
• Assist in feature discovery
• Reduce need to consult docs
• Recommend best practices
• Focus analysts in current view• Avoid copy/paste operations
WhisperText
Did it run anywhere else?
Where else is the file?
Has anyone else seen it in the wild?
Has it sent any data out of my
network?
Requirement #4
Recommend Next Steps
Short-term Memory
• Avoid defeated users
• Disambiguate user input
• Eliminate repetitive tasks
• Guide users through recommended actions using playbooks
search_network
search_network
search_networ
ksearch_proces
s
unknown
search_networ
k
unknown
unknown
“It looks like you’re trying to…”
“Would you like to run that
same query on this data?”
Repetition
Avoid Frustration
Long-Term MemoryCurrent User
Location Action
Alert List Accessed
Malware Alert
Malware Alert Open Artemis???
… …
Long-Term MemoryCurrent User
Location Action
Alert List Accessed
Malware Alert
Malware Alert Open Artemis???
… …
Location Action
Process Inject. Search process
Results view Kill process
Endpoint List Search DNS
Results View Search process
Results View Kill Process
Most Common Actions
Across Organization
Alert List Accessed
Malware Alert
Malware Alert Search Process
Results View Search Network
Results View Grab Memory
Long-Term MemoryCurrent User
Location Action
Alert List Accessed
Malware Alert
Malware Alert Open Artemis???
… …
Location Action
Process Inject. Search process
Results view Kill process
Endpoint List Search DNS
Results View Search process
Results View Kill Process
Most Common Actions
Across Organization
85% Of analysts perform
this sequence
Alert List Accessed
Malware Alert
Malware Alert Search Process
Results View Search Network
Results View Grab Memory
Long-Term MemoryCurrent User
Location Action
Alert List Accessed
Malware Alert
Malware Alert Open Artemis???
… …
Location Action
Process Inject. Search process
Results view Kill process
Endpoint List Search DNS
Results View Search process
Results View Kill Process
Most Common Actions
Across Organization
85% Of analysts perform
this sequence“We recommend the
following actions to triage
this alert”
Alert List Accessed
Malware Alert
Malware Alert Search Process
Results View Search Network
Results View Grab Memory
Requirement #5
Expedite Focused Collection
Structured Workflows & Automation
Domain expertise + Workflow *Automation• Use playbooks
• Endgame provides some built-in• Alert Remediation
• Quick Responses
• Multi-step analysis
• Orgs should be empowered to add their own
Make Security Software Work for You!
PowerShell Misuse Playbook
Traditional Investigation
1.Narrow scope to limited endpoints
2.Understand adversary TTPs
3.Gather events from limited endpoints
4.Analyze events from for signs of TTPs
5.Discover suspicious activity
6.Decode obfuscated commands
7.Pinpoint PowerShell activity
8.Expand scope to next set of endpoints
9.Repeat…
Endgame Artemis
“Find powershell activity”
Automatically discovers and
analyzes malicious activity across
your endpoints in minutes
Back to the Users!
“I want easy access to response actions”
“I want better insight into what my team is working on”
“I want to share new IOCs with my teammates”
“I still want an option for a structured query language”
How our customers NOW see it…
Better Use of Resources
• Helps with onboarding
& training new hires
• Decreases time to
review alerts and
incidents
Automation via NLP
• Easy for non-
programmers to use
• Exposes response
actions via API
Platform that works for you
• Encourages
personalization
• Reduces level of expertise
to extract value
ClosingThoughts
• Data science can be applied to more than just good/bad
• User-centric studies help capture diverse team environments
• Intelligent assistants can help alleviate rigid interfaces