BSidesSF 02/12/2017 - Make Alerts Great Again

Daniel Popescu / @danielpopes
Make Alerts Great Again


Post on 11-Apr-2017


TRANSCRIPT

Page 2: BSidesSF 02/12/2017 - Make Alerts Great Again

2015 - now(): Security Engineer, Yelp

2008 - 2015: Software Engineer, MSFT

Daniel Popescu

Page 3: BSidesSF 02/12/2017 - Make Alerts Great Again

Yelp’s Mission

Connecting people with great local businesses.

Page 4: BSidesSF 02/12/2017 - Make Alerts Great Again

Yelp Stats
As of Q3 2016

97M   32   74%   115M

Page 5: BSidesSF 02/12/2017 - Make Alerts Great Again

- 3000+ Servers
- 4000+ Employees
- 500+ Microservices

A TON of logs

Yelp Infrastructure

Page 6: BSidesSF 02/12/2017 - Make Alerts Great Again

Collect Logs -> Stream -> Index -> Visualize and Alert

osquery

Elastalert Kibana

Yelp Security Infrastructure

Page 7: BSidesSF 02/12/2017 - Make Alerts Great Again

- Lack of Visibility
- Not Actionable
- No Standards
- Functionally Correct?
- False Positives

Common Alerting Pitfalls

Page 8: BSidesSF 02/12/2017 - Make Alerts Great Again

- Historical Statistics
- Actionable
- Incident Response Steps
- Functionally Correct

Yelp Security Alerts

Page 9: BSidesSF 02/12/2017 - Make Alerts Great Again

Alerts span multiple systems
- Elasticsearch
- Splunk

Alert metrics unknown
- Count
- Frequency

No comprehensive dashboard

Lack of Visibility: Problem

Page 10: BSidesSF 02/12/2017 - Make Alerts Great Again

Alert Reporter
- Weekly Report
- Multiple Alert Sources
- Insights
- Frequencies
- Self Service
- Delivery Mechanism

Lack of Visibility: Solution
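
The slides do not show how the Alert Reporter is built. As a minimal sketch of the weekly-frequency idea only, assuming each alert is a dict with hypothetical `source`, `name`, and `fired_at` fields, a per-week count across several alert sources could look like this:

```python
from collections import Counter
from datetime import datetime, timedelta


def weekly_alert_counts(alerts, week_start):
    """Count fired alerts per (source, name) in the 7 days from week_start."""
    week_end = week_start + timedelta(days=7)
    counts = Counter()
    for alert in alerts:
        if week_start <= alert["fired_at"] < week_end:
            counts[(alert["source"], alert["name"])] += 1
    return counts


def format_report(counts):
    """Render one line per alert, most frequent first."""
    return "\n".join(
        f"{count:>5}  {source}  {name}"
        for (source, name), count in counts.most_common()
    )


# Made-up sample data from two alert sources; field names are assumptions.
alerts = [
    {"source": "elasticsearch", "name": "iam_role_created",
     "fired_at": datetime(2017, 2, 6, 9, 0)},
    {"source": "splunk", "name": "sketchy_dns",
     "fired_at": datetime(2017, 2, 7, 14, 30)},
    {"source": "elasticsearch", "name": "iam_role_created",
     "fired_at": datetime(2017, 2, 8, 11, 15)},
    {"source": "elasticsearch", "name": "iam_role_created",
     "fired_at": datetime(2017, 2, 14, 8, 0)},  # falls in the next week
]

week = weekly_alert_counts(alerts, datetime(2017, 2, 6))
print(format_report(week))
```

Keying the counter on the (source, name) pair keeps alerts from different systems in one report without merging alerts that happen to share a name.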

Page 12: BSidesSF 02/12/2017 - Make Alerts Great Again

Email
- Only a reporting channel
- No ownership

Ticketing
- Better than email
- No enforcement

Not Actionable: Problem

Page 15: BSidesSF 02/12/2017 - Make Alerts Great Again

{ "eventName": "CreateRole", "requestParameters": { "roleName": "rds-monitoring-role" }, "userIdentity": { "userName": "ioannis+admin" }}

Page 16: BSidesSF 02/12/2017 - Make Alerts Great Again

{ "eventName": "AddUserToGroup", "requestParameters": { "groupName": "admins", "userName": "jsendor" }, "userIdentity": { "userName": "mattc" }}

Page 17: BSidesSF 02/12/2017 - Make Alerts Great Again

{ "eventName": "RemoveUserFromGroup", "requestParameters": { "groupName": "RequireMFA", "userName": "martin" }, "userIdentity": { "userName": "martin" }}

Page 18: BSidesSF 02/12/2017 - Make Alerts Great Again

{ "eventName": "AuthorizeSecurityGroupIngress", "requestParameters": { "cidrIp": "0.0.0.0/0", "fromPort": 1, "toPort": 65535 }, "userIdentity": { "userName": "lmatthew" }}

Page 21: BSidesSF 02/12/2017 - Make Alerts Great Again

No more emails

JIRA Service Desk
- SLAs
- Queues

Not Actionable: Solution

Page 23: BSidesSF 02/12/2017 - Make Alerts Great Again

Actionable Alerting Service (AAS)
- Finds assignees for alerts
- Escalates when SLA is breached
- Looks at JIRA ticket metadata

Not Actionable: Solution
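
The deck does not describe the internals of the Actionable Alerting Service. The following is only a sketch of the "escalate when SLA breached" check, assuming ticket metadata is available as plain dicts. The field names (`key`, `priority`, `status`, `created_at`) and the SLA durations are hypothetical and are not taken from JIRA or from the talk.

```python
from datetime import datetime, timedelta

# Hypothetical SLA per priority; the talk does not give actual values.
SLA_BY_PRIORITY = {
    "P1": timedelta(hours=4),
    "P2": timedelta(days=1),
    "P3": timedelta(days=3),
}


def sla_breached(ticket, now):
    """Return True if an unresolved ticket is older than its SLA allows."""
    if ticket["status"] == "resolved":
        return False
    deadline = ticket["created_at"] + SLA_BY_PRIORITY[ticket["priority"]]
    return now > deadline


def tickets_to_escalate(tickets, now):
    """Return the keys of all tickets whose SLA has been breached."""
    return [t["key"] for t in tickets if sla_breached(t, now)]
```

In a real deployment the ticket dicts would come from the ticketing system and `now` from the clock; passing `now` in explicitly keeps the check easy to test.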

Page 24: BSidesSF 02/12/2017 - Make Alerts Great Again

Alerts automatically assigned to actors
- Common Administrative Tasks
- Infrastructure Changes
- Honor System (kinda)
- Mistakes
- Malware?

Self Service Alerts
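
Assigning an alert to the actor who triggered it first requires pulling the actor out of the event. A minimal sketch, assuming the three event shapes shown on the slides (`userIdentity.userName`, `username`, and `actor`); the `find_actor` helper and the `ACTOR_PATHS` list are illustrative names, not part of any real tool:

```python
# Key paths where the sample events in this deck carry the acting user.
ACTOR_PATHS = [
    ("userIdentity", "userName"),  # AWS-style events
    ("username",),                 # Duo-style administrator events
    ("actor",),                    # events with a top-level "actor" field
]


def find_actor(event):
    """Return the user name that triggered the event, or None if unknown."""
    for path in ACTOR_PATHS:
        value = event
        for key in path:
            if not isinstance(value, dict) or key not in value:
                value = None
                break
            value = value[key]
        if isinstance(value, str):
            return value
    return None


event = {
    "eventName": "AddUserToGroup",
    "requestParameters": {"groupName": "admins", "userName": "jsendor"},
    "userIdentity": {"userName": "mattc"},
}
assignee = find_actor(event)  # the actor, not the user who was added
```

Only the listed paths are consulted, so the `userName` inside `requestParameters` (the target of the change) is never mistaken for the actor.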

Page 25: BSidesSF 02/12/2017 - Make Alerts Great Again

Self Service - Human

duo_data { "action": "integration_create", "description": { "iname": "Auth API", "type": "rest" }, "eventtype": "administrator", "object": "Auth API", "username": "alect"}

Page 26: BSidesSF 02/12/2017 - Make Alerts Great Again

Self Service - Non Human

{ "actor": "svc-dasher", "event": { "name": "CREATE_ORG_UNIT", "parameters": { "ORG_UNIT_NAME": "TestJMA" }, "type": "ORG_SETTINGS" }}

Page 27: BSidesSF 02/12/2017 - Make Alerts Great Again

- Schedule name in alert metadata
- Assign alert to current on-point

Pagerduty Schedule
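
The lookup from a schedule name to the person currently on duty could be sketched as below. This does not call the PagerDuty API. It assumes the schedule has already been fetched into a list of shifts, and the `schedule` metadata key and the shift fields (`start`, `end`, `user`) are hypothetical.

```python
from datetime import datetime


def current_on_call(shifts, now):
    """Return the user whose shift covers `now`, or None if nobody is on."""
    for shift in shifts:
        if shift["start"] <= now < shift["end"]:
            return shift["user"]
    return None


def assignee_for_alert(alert, schedules, now):
    """Resolve the schedule named in the alert metadata to a person."""
    shifts = schedules.get(alert.get("schedule"), [])
    return current_on_call(shifts, now)


# Made-up schedule data for illustration only.
schedules = {
    "security-primary": [
        {"user": "alice", "start": datetime(2017, 2, 6), "end": datetime(2017, 2, 13)},
        {"user": "bob", "start": datetime(2017, 2, 13), "end": datetime(2017, 2, 20)},
    ],
}
alert = {"name": "sketchy_dns", "schedule": "security-primary"}
owner = assignee_for_alert(alert, schedules, datetime(2017, 2, 14, 9, 0))
```

An alert that names no schedule, or an unknown one, resolves to `None`, which leaves room for a fallback owner.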

Page 28: BSidesSF 02/12/2017 - Make Alerts Great Again

Pagerduty Schedule

Page 29: BSidesSF 02/12/2017 - Make Alerts Great Again

Alert Owner

Page 30: BSidesSF 02/12/2017 - Make Alerts Great Again

What to do when SLA is breached
- Ping user in JIRA
- Ping user in IRC / Slack channel
- Ping user’s manager in JIRA

Escalation Channels
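
The three channels above can be read as an escalation ladder. The sketch below assumes they are tried in the listed order and that each step unlocks after a fixed delay past the SLA deadline. Both the ordering rule and the delays are assumptions for illustration, and the action names are hypothetical labels rather than real API calls.

```python
from datetime import timedelta

# (time past SLA deadline, action) pairs; the thresholds are hypothetical.
ESCALATION_STEPS = [
    (timedelta(0), "ping_user_in_jira"),
    (timedelta(days=1), "ping_user_in_chat"),
    (timedelta(days=2), "ping_manager_in_jira"),
]


def escalation_actions(time_past_sla):
    """Return every escalation step that applies for a given overdue time."""
    if time_past_sla < timedelta(0):
        return []  # SLA not breached yet
    return [
        action
        for threshold, action in ESCALATION_STEPS
        if time_past_sla >= threshold
    ]
```

For example, `escalation_actions(timedelta(hours=30))` yields the JIRA ping and the chat ping but not yet the manager ping.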

Page 31: BSidesSF 02/12/2017 - Make Alerts Great Again

SLA Past Due - JIRA Ping

Page 32: BSidesSF 02/12/2017 - Make Alerts Great Again

SLA Past Due - CC Manager

Page 33: BSidesSF 02/12/2017 - Make Alerts Great Again

- No RFC for authoring alerts
- Feature Set Awareness

No Standards: Problem

Standards

Page 34: BSidesSF 02/12/2017 - Make Alerts Great Again

New Alerts Runbook
- Priorities
- Mandatory Fields
- Delivery Mechanism
- Feature Set
- Testing

No Standards: Solution
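
A runbook's mandatory fields can also be enforced mechanically. The slides do not list the actual fields or priority levels, so every name in this sketch (`name`, `priority`, `owner`, `delivery`, and the `P1` to `P3` levels) is a placeholder:

```python
# Hypothetical required fields and priority levels for a new alert.
MANDATORY_FIELDS = ("name", "priority", "owner", "delivery")
ALLOWED_PRIORITIES = ("P1", "P2", "P3")


def validate_alert_definition(definition):
    """Return a list of problems; an empty list means the definition passes."""
    problems = [
        f"missing field: {field}"
        for field in MANDATORY_FIELDS
        if not definition.get(field)
    ]
    priority = definition.get("priority")
    if priority and priority not in ALLOWED_PRIORITIES:
        problems.append(f"unknown priority: {priority}")
    return problems
```

Running a check like this in code review or CI would reject a new alert that lacks an owner or a delivery mechanism before it ever fires.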

Page 35: BSidesSF 02/12/2017 - Make Alerts Great Again

Alert Definition Bugs
- Typos
- Bad assumptions

Data Sources
- Flatlines
- Drop in volume

Functionally Correct?: Problem

Page 36: BSidesSF 02/12/2017 - Make Alerts Great Again

End-to-End Testing
- For 100% of new alerts
- Subset of existing alerts

Flatline alerts
- Test them too

Functionally Correct?: Solution
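
A flatline or drop-in-volume check on a data source can be sketched as a comparison between the latest event count and the recent average. This is a generic illustration, not the rule used in the talk. The hourly bucketing and the 50% `drop_ratio` threshold are assumptions.

```python
def check_volume(hourly_counts, drop_ratio=0.5):
    """Classify the latest hour of a log source against its recent history.

    hourly_counts is ordered oldest first; the last entry is the latest hour.
    """
    if len(hourly_counts) < 2:
        return "unknown"  # not enough history to compare against
    *history, latest = hourly_counts
    baseline = sum(history) / len(history)
    if latest == 0:
        return "flatline"
    if latest < baseline * drop_ratio:
        return "drop"
    return "ok"
```

In line with "test them too", a check like this can itself be exercised end to end by feeding it a series that is known to flatline and confirming the alert is raised.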

Page 37: BSidesSF 02/12/2017 - Make Alerts Great Again

There will be false positives
- Windows malware on Mac
- New production services
- Sketchy? DNS requests

False Positives: Problem

Page 38: BSidesSF 02/12/2017 - Make Alerts Great Again

Automation
- Incident Response
- Tools and Scripts

Constant alert improvement

False Positives: Solution

Page 39: BSidesSF 02/12/2017 - Make Alerts Great Again

Measuring Success

Active Tickets
- Manageable
- SLA Met > 50%

Positive Reception
- Corp Eng
- Operations

Security Team
- Happy

Page 40: BSidesSF 02/12/2017 - Make Alerts Great Again

Recap

Problem                   Solution
Visibility                Alert Reporter + JIRA Service Desk
Actionability             JIRA + Self Service Alerts + Pagerduty + Actionable Alerting Service
Standardization           Runbook For New Alerts
Functional Correctness    End-To-End Tests
False Positives           Automation

Page 41: BSidesSF 02/12/2017 - Make Alerts Great Again

Make your alerts actionable!

Make sure you have visibility into your alerting metrics!

Make sure your alerts actually work!

TLDR;

Page 42: BSidesSF 02/12/2017 - Make Alerts Great Again

@YelpEngineering

fb.com/YelpEngineers

engineeringblog.yelp.com

github.com/yelp