BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015
TRANSCRIPT
![Page 1: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/1.jpg)
BE ready. BE safe. BE secure.
BSides Lisbon 2015: Security Data Metrics and Measurements at Scale
![Page 2: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/2.jpg)
binaryedge.io
Who
TIAGO HENRIQUES
• BSc Software Engineering / University of Brighton
• MSc Computer Security and Forensics / University of Bedfordshire
• 8 years' experience in information security consultancy, leadership and research
CEO and Founder @ BinaryEdge
TIAGO MARTINS
• BSc and MSc Computer Science / University of Lisbon
• 7 years' experience developing real-time systems and high-volume data processing
CTO and Co-Founder @ BinaryEdge
ROBERTO BARBOSA
• More than 20 years in the IT sector
• Ex-engineer at Sun Microsystems
• Former Philip Morris corporate auditor
• Expert in high scalability and availability in the finance sector (UBS, Citigroup and Leonteq) and in a mobile startup
COO and Head of Data Science @ BinaryEdge
![Page 3: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/3.jpg)
Where: Europe
• Switzerland: Headquarters, Market, Management
• Portugal: Workforce
• United Kingdom: Market, Workforce
![Page 4: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/4.jpg)
What
Machine Learning • Data Mining • Security data
![Page 5: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/5.jpg)
NEW COMMODITY
DATA IS THE NEW OIL
![Page 6: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/6.jpg)
NEW CURRENCY
![Page 7: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/7.jpg)
DATA BUSINESS MODEL
• Collect / supply: Gather and sell raw data
• Store / host: Hold onto someone else’s data for them
• Filter / refine: Strip out problematic records or data fields, or release interesting data subsets
• Enhance / enrich: Blend in other datasets to create a new and interesting picture
• Simplify access: Help people cherry-pick the data they want in the format they prefer
• Consult / advise: Provide guidance on others’ data efforts
![Page 8: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/8.jpg)
ORGANISATION
![Page 9: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/9.jpg)
BECOMING AN EXPONENTIAL ORGANIZATION
MASSIVE TRANSFORMATIVE PURPOSE (MTP)
MARKET
IDEAS (LEFT BRAIN: order, control, stability)
• Interfaces • Dashboards • Experimentation • Autonomy • Social
SCALE (RIGHT BRAIN: creativity, growth, uncertainty)
• Staff on Demand • Community and Crowd • Algorithms • Lease Assets • Engagement
![Page 10: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/10.jpg)
ORGANISATION RELATIONSHIP
[Diagram: relationships between areas and roles]
Areas: Business, Operations, Product, Security, DEV
Roles: data science, machine learning, math/stats, information retrieval, data agent, backend, frontend, mobile, UI, UX, design, product owner, tester, QA, quality, support, sysops, devops, security experts, audits, marketing, human resources, finance, sales, sales rep, assistant, social media, consulting
Domains: Internet of Things, Internet of Money, Internet of People, Internet of Content
![Page 11: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/11.jpg)
DATA ARCHITECTURE DESIGN

| Goals | Requirements | Results |
| --- | --- | --- |
| Easy to understand | Understandability | Simple architecture |
| Easy to extend | Extensibility | Loosely coupled services |
| Easy to change | Changeability | Built for replacement |
| Easy to replace | Replaceability | Self-dependency |
| Easy to deploy | Deployability | Immutability |
| Easy to scale | Scalability | Responsibility segregation |
| Easy to recover | Resilience | Decoupling and isolation |
| Easy to connect | Uniform interface | API based |
| Easy to afford | Cost efficiency | On-demand computing |
![Page 12: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/12.jpg)
PRODUCT IMPROVEMENTS 2015
[Roadmap chart: July, August, September, October, November, December 2015]
Legend: phase, importance, effort, milestones (less, more, greater, average, milestone)
Progression: data, information, knowledge, intelligence
Phases: POC, data, process, analysis, intel
Items: sensor agent (minions), backend, feed API, user API, data visualization, scalability, storage & archiving, search & classification, data analytics, image processing, predictive analytics, machine learning, threat classification, deep learning
![Page 13: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/13.jpg)
ENGINEERING
![Page 14: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/14.jpg)
Metrics collection at large scale
Very young startup
• No legacy to maintain
• Lots of experience in the team
• Lots of technologies to pick from
• Microservice-based approach
But where to start?
• Technologies? • Architecture? • Prototype?
![Page 15: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/15.jpg)
Metrics collection at large scale
![Page 16: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/16.jpg)
Architecture
Focus on architecture
• Simple, resilient, scalable, replaceable components
• Technology independent
![Page 17: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/17.jpg)
architecture overview
![Page 18: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/18.jpg)
architecture - job request
API oriented
• HTTP API
• Command line clients
• Modules: Python, NodeJS, Go
• Third-party APIs
Job types
• Data Collection
• Data Processing / Analytics
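A minimal sketch of what an API-oriented job request could look like, assuming a JSON body. The field names (`job_type`, `module`, `params`, `job_id`), the job type strings and the `portscan` module name are all hypothetical; the slides do not show the real request format.

```python
import json
import uuid
from dataclasses import dataclass, field, asdict

# The two job types named on the slide; the string values are assumptions.
JOB_TYPES = {"data_collection", "data_processing"}


@dataclass
class JobRequest:
    job_type: str                                   # one of JOB_TYPES
    module: str                                     # hypothetical module name
    params: dict = field(default_factory=dict)      # module-specific options
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def to_json(self) -> str:
        """Serialise the request, rejecting unknown job types."""
        if self.job_type not in JOB_TYPES:
            raise ValueError(f"unknown job type: {self.job_type}")
        return json.dumps(asdict(self), sort_keys=True)


def parse_job(payload: str) -> JobRequest:
    """Rebuild a JobRequest from the JSON body an HTTP API might receive."""
    return JobRequest(**json.loads(payload))


if __name__ == "__main__":
    req = JobRequest("data_collection", "portscan", {"target": "192.0.2.1"})
    print(req.to_json())
```

The same payload could be produced by an HTTP client, a command line client or a module in another language, which is the point of keeping the request a plain serialised document.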
![Page 19: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/19.jpg)
architecture - job execution
Agents
• Agents listen for work in channels
• Multiple types of agents
• Technologies: Go, Python, NodeJS, Scala, Java
Job control
• RabbitMQ • NSQ • Redis • Apollo
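As a rough illustration of agents listening for work in a channel, the sketch below uses Python's in-process `queue.Queue` as a stand-in for a broker channel. This is an assumption made to keep the example self-contained; it is not how RabbitMQ, NSQ, Redis or Apollo are used, and `run_agent` and `dispatch` are hypothetical names.

```python
import queue
import threading

STOP = object()  # sentinel telling an agent to shut down


def run_agent(name, channel, handler, results):
    """Block on the channel, handle each job, stop when the sentinel arrives."""
    while True:
        job = channel.get()
        if job is STOP:
            break
        results.put((name, handler(job)))


def dispatch(jobs, handler, n_agents=3):
    """Start n_agents workers on one channel and collect their results."""
    channel = queue.Queue()   # stand-in for a broker channel
    results = queue.Queue()
    agents = [
        threading.Thread(
            target=run_agent, args=(f"agent-{i}", channel, handler, results)
        )
        for i in range(n_agents)
    ]
    for agent in agents:
        agent.start()
    for job in jobs:
        channel.put(job)
    for _ in agents:
        channel.put(STOP)     # one sentinel per agent
    for agent in agents:
        agent.join()
    collected = []
    while not results.empty():
        collected.append(results.get())
    return collected


if __name__ == "__main__":
    for agent_name, value in dispatch(range(5), lambda job: job * 2):
        print(agent_name, value)
```

Which agent picks up which job is not deterministic; only the set of results is.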
![Page 20: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/20.jpg)
architecture - job execution
Messaging: Broker
• ActiveMQ • NATS • Kafka • Kestrel • NSQ • RabbitMQ • Redis • QPID • HornetQ • Apollo
http://bravenewgeek.com/dissecting-message-queues/
![Page 21: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/21.jpg)
architecture - job execution
Messaging: Brokerless
• ZeroMQ • nanomsg
http://bravenewgeek.com/dissecting-message-queues/
![Page 22: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/22.jpg)
architecture - job execution
Messaging: Cloud
• Amazon • Microsoft • Google • realtime.co • …
![Page 23: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/23.jpg)
architecture - data enrichment
Agents can feed other agents
Different types of enrichment • Clean data • Process data • Alarms
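One way to picture agents feeding other agents is a chain of small functions, each taking the previous agent's output. The sketch below assumes records are plain dictionaries; the record fields, the `remote_access` flag and the alert text are hypothetical, chosen only to mirror the three enrichment types on the slide.

```python
def clean(record):
    """Clean data: trim the banner and lower-case the service name."""
    return {
        **record,
        "banner": record.get("banner", "").strip(),
        "service": record.get("service", "").lower(),
    }


def process(record):
    """Process data: flag services that offer remote access."""
    remote_services = {"vnc", "rdp", "smb"}
    return {**record, "remote_access": record["service"] in remote_services}


def alarm(record):
    """Alarms: attach an alert when a remote-access service is exposed."""
    alerts = ["exposed remote access"] if record["remote_access"] else []
    return {**record, "alerts": alerts}


def enrich(record, stages=(clean, process, alarm)):
    """Pass the record through each stage, as one agent feeding the next."""
    for stage in stages:
        record = stage(record)
    return record


if __name__ == "__main__":
    print(enrich({"service": "VNC", "banner": " RFB 003.008 \n"}))
```

Each stage returns a new dictionary rather than mutating its input, so the raw record stays available for storage alongside the processed one.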
![Page 24: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/24.jpg)
architecture - store
Data Storage
• All information is stored: RAW data, processed data
• Geolocation of information
• Encrypted data for each client
Cloud Services • Amazon S3 • Amazon DynamoDB • Azure DocumentDB • Azure Storage • Google Cloud Storage • Google BigQuery • Rackspace Cloud Files • Constant Cloud Storage • Skylable • RunAbove
Database Solutions • MongoDB • ElasticSearch • Cassandra • Riak • Lucene
![Page 25: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/25.jpg)
architecture - serving
Delivering data • Realtime - Streaming • Storage for Analytics • API • Raw
Data Analytics • Kibana • InfluxDB • Druid
![Page 26: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/26.jpg)
architecture - serving
Data Processing • Apache Spark • Hadoop • Amazon Kinesis
Data Intelligence • Amazon Machine Learning/EMR • Google Prediction API • Azure Machine Learning
![Page 27: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/27.jpg)
agents / minions
Our agents are very simple • Simple tasks • Easy to maintain and adapt
Agents can be located/run anywhere • Geo distribution • Clouds • Dedicated servers • Raspberry Pis on Tiago Henriques’ dual gigabit connection
![Page 28: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/28.jpg)
monitoring
• New Relic • Logentries • Server Density • CloudWatch • Grafana • Logstash
![Page 30: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/30.jpg)
Machine Learning
![Page 31: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/31.jpg)
CHALLENGES IN DATA MINING
• Modelling large scale networks
• Discovery of threats
• Network dynamics and cyberattacks
• Privacy preservation in data mining
![Page 32: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/32.jpg)
[Diagram: data points]
• Addresses: IP address, URL address, linked URLs (internal, external)
• Company: registration, people, phone, social, search, photos, family & friends, behavior
• Content: news, forums, sub-reddits, topics, likes, metadata
• BGP: co-hosted sites, shared infrastructure, AS membership, AS peer, list of IPs, AS
• Whois: contact, geolocation, phone, social networks, office locations
• Portscan: services, banners, SMB, VNC, RDP, files, apps, SW, users
• Web: http, https, certificate, configuration, authorities, entities, web server, framework, headers, cookies, screenshots
• DNS: domains, AXFR, MX records
• Image classifier, OCR
• Labels: contingency, irrelevant, threat, malware
© 2015 binaryedge.io
![Page 33: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/33.jpg)
Machine Learning techniques
• Artificial Neural Network (ANN)
• Support Vector Machine (SVM)
• Decision trees
• Bayesian networks (BN)
• K-Nearest Neighbour (KNN)
• Hidden Markov Model (HMM)
![Page 34: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/34.jpg)
Machine Learning - Why?
• Classification • Detection • Clustering • Automation • Correlation • Prediction • Analysis
![Page 35: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/35.jpg)
Measurements on our own data
Support - the percentage of stored data that shows the correlation
Confidence - the probability that our assumption is correct
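Support and confidence as described here match the usual association-rule definitions, so they can be sketched as below. The sketch assumes each stored record is a set of observed attributes; the attribute names (`vnc`, `no_auth`, `http`) are made-up sample data, not BinaryEdge results.

```python
def support(records, items):
    """Fraction of records that contain every item in `items`."""
    wanted = set(items)
    if not records:
        return 0.0
    return sum(1 for record in records if wanted <= set(record)) / len(records)


def confidence(records, antecedent, consequent):
    """Estimate of P(consequent | antecedent) over the records."""
    base = support(records, antecedent)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(records, both) / base


if __name__ == "__main__":
    # Hypothetical observations: which attributes were seen on each host.
    hosts = [{"vnc", "no_auth"}, {"vnc", "no_auth"}, {"vnc"}, {"http"}]
    print(support(hosts, {"vnc", "no_auth"}))       # 2 of 4 hosts
    print(confidence(hosts, {"vnc"}, {"no_auth"}))  # 2 of the 3 VNC hosts
```

A rule with high confidence but very low support rests on few records, which is why the two measures are read together.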
![Page 36: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/36.jpg)
Improving our own data
• Kalman Filter
• AdaBoost (Adaptive Boosting)
SAMPLE 1 • SAMPLE 1.2 • SAMPLE 1.3 • SAMPLE 1.4
DEEPER DATA POINT SUPPORT
MANUAL CLASSIFICATION
• Weight 6/10: Portscan
• Weight 3/10: Geolocation
• Weight 5/10: OCR screenshot
• Weight 9/10: Previous known
Correct data
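The per-source weights on this slide suggest combining several data points into one decision. The sketch below is a plain weighted vote under that assumption; it is not a Kalman filter or AdaBoost, and how BinaryEdge actually applies the weights is not shown in the slides. The source keys and the `threat` / `safe` labels are illustrative.

```python
# Weights as listed on the slide, expressed as fractions of 10.
WEIGHTS = {
    "portscan": 0.6,
    "geolocation": 0.3,
    "ocr_screenshot": 0.5,
    "previous_known": 0.9,
}


def weighted_vote(votes, weights=WEIGHTS):
    """votes maps source -> label; returns (best_label, share of total weight)."""
    if not votes:
        raise ValueError("no votes given")
    totals = {}
    for source, label in votes.items():
        totals[label] = totals.get(label, 0.0) + weights[source]
    best = max(totals, key=totals.get)
    return best, totals[best] / sum(totals.values())


if __name__ == "__main__":
    label, share = weighted_vote({
        "portscan": "threat",
        "geolocation": "safe",
        "ocr_screenshot": "threat",
        "previous_known": "threat",
    })
    print(label, round(share, 3))
```

Manual classification could then be reserved for samples where the winning share is low, which is one reading of the "manual classification" step on the slide.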
![Page 37: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/37.jpg)
DATA chain
Collection of Data • Data Processing • ML • Report • Storage
![Page 38: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/38.jpg)
CYBER INNOVATION LOOP
[Diagram: observe, orient, decide, act loop]
• Observe: observations, fed by security feeds, real world data (RWD) and interaction with environment
• Orient: cultural, identity, new information, previous experience, analysis & synthesis, models
• Decide: decisions
• Act: action, interaction with environment
• Flows: feed forward between stages; feedback and guide & control back to earlier stages
![Page 39: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/39.jpg)
CYBER INNOVATION LOOP
[Diagram: knowledge applied to data in a cycle of facts, information, hypothesis and directives, linked by classification, assessment, resolution and enactment]
• Classification knowledge transforms fact to information
• Assessment knowledge transforms information to hypothesis
• Resolution knowledge transforms hypothesis to directive
• Enactment knowledge transforms directive to fact
![Page 40: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/40.jpg)
CYBERSECURITY DATA SCIENCE
[Diagram: from agents to real time predictions and decisions]
• Inputs from agents: telemetry, sensor data, contextual data, historical data, real world data (RWD)
• Data engineering: data, features
• Models: recommender, classifier, social, threat, fraud
• Data intelligence and value: real time predictions and decisions, custom dashboard
![Page 41: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/41.jpg)
DEMO
![Page 42: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/42.jpg)
DEMO
![Page 43: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/43.jpg)
DEMO
![Page 44: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/44.jpg)
DEMO
![Page 45: BinaryEdge - Security Data Metrics and Measurements at Scale - BSidesLisbon 2015](https://reader034.vdocuments.net/reader034/viewer/2022042512/55ce767abb61eb24058b4704/html5/thumbnails/45.jpg)
contingency • irrelevant • threat • safe
BE ready. BE safe. BE secure.
BINARYEDGE.IO
Finsterrütistrasse 4, 8134 Adliswil, ZURICH
Switzerland
+41 78 632 32 90 Email: [email protected]
www.binaryedge.io