SESSION ID: EXP – T11
#RSAC
Advances in Cloud-Scale Machine Learning for Cyber-Defense
Mark Russinovich
CTO, Microsoft Azure, Microsoft Corporation
@markrussinovich

Upload: doancong

Post on 13-Aug-2018

TRANSCRIPT

#RSAC

Intelligence in every software
SQL Server + R, Microsoft R Server, Hadoop + R, Spark + R, Microsoft CNTK
Azure Machine Learning
Cortana Intelligence Suite
R Tools / Python Tools for Visual Studio
Azure Notebooks (Jupyter), Cognitive Services, Bot Framework, Cortana
Office 365, Bing, Skype, Xbox 360, Dynamics 365, HoloLens

#RSAC

Microsoft’s daily cloud security scale
10s of PBs of logs
300+ million active Microsoft Account users
1.3+ billion Azure Active Directory logons
Detected/reflected attacks
• >10,000 location-detected attacks
• 1.5 million compromise attempts deflected

#RSAC

WHAT IS ATTACK DISRUPTION?

#RSAC

Red Team Kill Chain
Recon -> Delivery -> Foothold -> Persist -> Move -> Elevate -> Exfiltrate

#RSAC

Blue Team Kill Chain for Attack Detection
Red team: Recon -> Delivery -> Foothold -> Persist -> Move -> Elevate -> Exfiltrate
Blue team: Gather -> Detect -> Alert -> Triage -> Context -> Plan -> Execute

#RSAC

Blue Team Kill Chain for Attack Disruption
Red team: Recon -> Delivery -> Foothold -> Persist -> Move -> Elevate -> Exfiltrate
Blue team: Gather -> Detect -> Alert -> Triage -> Context -> Plan -> Execute

#RSAC

Challenges for Attack Disruption
• False Positives
• Manual Triage

#RSAC

False Positives
Lose ability to triage
Red team: Recon -> Delivery -> Foothold -> Persist -> Move -> Elevate -> Exfiltrate
Blue team: Gather -> Detect -> Alert -> Triage -> Context -> Plan -> Execute

#RSAC

False Positives
FACT: You cannot salvage a false positive with just visualization. You need better solutions.

#RSAC

False Positives
Evolution of security detection techniques

TRADITIONAL PROGRAMMING
Data + Program/Rules -> Output
Hand-crafted rules by security professionals
Con: Rules are static and don’t change with changes in the environment => False Positives!

MACHINE LEARNING
Data + Output/Labels -> Program
System adapts to changes in the environment as new data is provided and the model is re-trained
Our supervised learning approach enables detection without generating many FPs
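
The contrast on this slide can be made concrete with a small sketch. It compares a fixed, hand-written threshold rule with a threshold re-learned from labeled data after the benign baseline has shifted. The feature (failed logons per hour), the numbers, and the function names are all invented for illustration; none of them come from the talk.

```python
def static_rule(failed_logons):
    """Traditional programming: a fixed, hand-written threshold."""
    return failed_logons > 10


def learn_threshold(samples):
    """Machine learning in miniature: choose the threshold that makes the
    fewest mistakes on labeled (value, is_attack) samples."""
    candidates = sorted({value for value, _ in samples})

    def errors(threshold):
        return sum((value > threshold) != label for value, label in samples)

    return min(candidates, key=errors)


# Hypothetical labeled data after a new service raised the benign baseline.
labeled = [(4, False), (12, False), (15, False), (18, False),
           (40, True), (55, True), (60, True)]

threshold = learn_threshold(labeled)
benign = [value for value, label in labeled if not label]
static_fps = sum(static_rule(value) for value in benign)      # rule never adapts
learned_fps = sum(value > threshold for value in benign)      # re-trained threshold
print(threshold, static_fps, learned_fps)
```

On this toy data the static rule flags three benign samples, while the re-learned threshold flags none. Re-running `learn_threshold` on fresh labels is the stand-in for periodic re-training.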

#RSAC

Labels in Microsoft
For supervised learning, Azure gets labeled data through:
• Domain experts and customers who provide feedback from alerts
• Automated attack bots
• Labels from other product groups (including O365, Windows Seville)
• Surgical red team exercises (OneHunt)
• Bug Bounty
• MSRC

#RSAC

• False Positives
• Manual Triage

#RSAC

Manual Triage
For Attack Disruption, we need to think beyond detection
Gather -> Detect -> Alert -> Triage -> Context -> Plan -> Execute
We need to change this

#RSAC

Properties of a Successful Machine Learning Solution
Successful Detection
• Adaptable
• Explainable
• Actionable

#RSAC

Adaptable in Cloud is Difficult. Why?

EVOLVING ATTACKS
Constantly changing environments lead to constantly changing attacks
• New services
• New features for existing services

EVOLVING LANDSCAPE
• Frequent deployments
• New services coming online
• Usage spikes

#RSAC

Explainability. Why?
• Surfacing a security event to an end-user can be useless if there is no explanation
• Explainability of results should be considered at the earliest possible stage of development
• Best detection signal with no explanation might be dismissed/overlooked

Example: How do you explain this to an analyst?

UserId  Time              EventId  Feature1  Feature2  Feature3  Feature4  …  Score
1a4b43  2016-09-01 02:01  4688     0.3       0.12      3.9       20        …  0.2
73d87a  2016-09-01 03:15  4985     0.4       0.8       0         11        …  0.09
9ca231  2016-09-01 05:10  4624     0.8       0.34      9.2       7         …  0.9
5e9123  2016-09-01 05:32  4489     2.5       0.85      7.6       2.1       …  0.7
1e6a7b  2016-09-01 09:12  4688     3.1       0.83      3.6       6.2       …  0.1
33d693  2016-09-01 14:43  4688     4.1       0.63      4.7       5.1       …  0.019
7152f3  2016-09-01 19:11  4688     2.7       0.46      3.9       1.4       …  0.03

Results without explanation are hard to interpret
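
One common way to attach an explanation to a score is to report which features contributed most to it. The sketch below does this for a simple linear score, using the feature values of row 9ca231 from the table. The weights are invented for illustration, so the resulting score does not reproduce the 0.9 on the slide, and nothing here is claimed to be the method used in the talk.

```python
def explain(weights, features, top_n=2):
    """Return a linear score and the features that contributed most to it."""
    contributions = {name: weights[name] * value for name, value in features.items()}
    score = sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return score, ranked[:top_n]


# Hypothetical model weights (assumption, not from the talk).
weights = {"Feature1": 0.5, "Feature2": 1.0, "Feature3": 0.05, "Feature4": -0.02}
# Feature values of row 9ca231 in the table above.
row = {"Feature1": 0.8, "Feature2": 0.34, "Feature3": 9.2, "Feature4": 7}

score, top = explain(weights, row)
for name, contribution in top:
    print(f"{name} contributed {contribution:+.2f} to the score")
```

An analyst-facing alert would then map names such as `Feature3` to a readable description of what was observed, which is why the slide argues for designing explainability in from the start.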

#RSAC

Actionable Detections
• Detections must result in downstream action
• Good explanation without being actionable is of little value

EXAMPLES
• Policy decisions
• Reset user password

#RSAC

Framework for a Successful Detection
Successful detections incorporate domain knowledge through disparate datasets and rules
[Figure: Successful Detection (Adaptable, Explainable, Actionable). Plotted labels: Outlier, Anomaly, Security Interesting Alerts. Axes: Sophistication of Algorithms (Basic to Advanced); Usefulness of Alerts (Less Useful to More Useful); Security Domain Knowledge (Little domain knowledge to More domain knowledge).]

#RSAC

Case Study 1
Successful detection through combining disparate datasets

PROBLEM STATEMENT
Detect compromised VMs in Azure

HYPOTHESIS
If the VM is sending spam, then it is most likely compromised.

SOLUTION
Use supervised machine learning to leverage labeled spam data from Office 365 and combine it with IPFIX data from Azure.

#RSAC

Case Study 1: Technique Overview
IPFIX features (Azure) + SPAM tags
Spam tags come from O365!

EXAMPLES (automated)
• All ports with traffic
• Number of connections
• Which TCP flag combinations exist
• Many more…

#RSAC

Case Study 1: Technique Overview
[Diagram labels: IPFIX data; Spam-labeled IPFIX data; Benign IPFIX data; Machine Learning; New Case; Automated Compromise Detection]

#RSAC

Case Study 1: Dataset

FEATURE SOURCES
• External IPs
• External ports
• TCP flags

FEATURE TYPES
• Existence (binary)
• Counts
• Normalized counts

WHY IS NETWORK DATA GOOD FOR DETECTION?
✓ No installation required – running on all Azure tenants
✓ No overload on the VM
✓ Resilient – cannot be maliciously turned off
✓ OS independent
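
The three feature types listed above (existence, counts, normalized counts) can be sketched for a single feature source, external ports. The flow record layout and field names such as `dst_port` are assumptions made for this example; they are not real IPFIX information element names and not the schema used in the talk.

```python
from collections import Counter


def port_features(flows, ports_of_interest):
    """Build existence, count and normalized-count features per external port."""
    counts = Counter(flow["dst_port"] for flow in flows)
    total = sum(counts.values())
    features = {}
    for port in ports_of_interest:
        n = counts.get(port, 0)
        features[f"port_{port}_exists"] = int(n > 0)                  # existence (binary)
        features[f"port_{port}_count"] = n                            # count
        features[f"port_{port}_share"] = n / total if total else 0.0  # normalized count
    return features


# Hypothetical flow summaries for one VM (documentation-range IP addresses).
flows = [
    {"dst_ip": "203.0.113.5", "dst_port": 25, "tcp_flags": "S"},
    {"dst_ip": "203.0.113.9", "dst_port": 25, "tcp_flags": "SA"},
    {"dst_ip": "198.51.100.7", "dst_port": 25, "tcp_flags": "S"},
    {"dst_ip": "198.51.100.7", "dst_port": 443, "tcp_flags": "SA"},
]
features = port_features(flows, ports_of_interest=[25, 443, 22])
print(features)
```

The same pattern would apply to external IPs and TCP flag combinations; a VM whose traffic is dominated by port 25 is the kind of signal the spam hypothesis relies on.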

#RSAC

Case Study 1: Results
Machine Learning Deep Dive: Gradient Boosting
[Figure panels: Input data for 1st iteration; Weak learner at 1st iteration]

#RSAC

Case Study 1: Results
Machine Learning Deep Dive: Gradient Boosting
[Figure panels: Input data for 2nd iteration; Learner at 2nd iteration]
The data points that were incorrectly categorized by the weak learner in the first iteration (the positive examples) are now weighted more.
Simultaneously, the correct points are down-weighted.

#RSAC

Case Study 1: Results
Machine Learning Deep Dive: Gradient Boosting
[Figure panels: Input data for 3rd iteration; Learner at 3rd iteration]
The data points that were incorrectly categorized in the second iteration (the negative examples) are now weighted more.
Simultaneously, the correct points are down-weighted.
Final result is a combination of learners from each iteration
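
The re-weighting walk-through on these three slides can be sketched in a few lines. Up-weighting misclassified points and down-weighting correct ones is the mechanism usually associated with AdaBoost, so that is what the sketch implements, with one-dimensional decision stumps as weak learners. The toy data, function names, and three-round setup are assumptions for illustration and say nothing about the production model.

```python
import math


def best_stump(xs, ys, weights):
    """Find the threshold/polarity stump with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            preds = [polarity if x > threshold else -polarity for x in xs]
            error = sum(w for w, p, y in zip(weights, preds, ys) if p != y)
            if best is None or error < best[0]:
                best = (error, threshold, polarity)
    return best


def boost(xs, ys, rounds=3):
    """Train `rounds` stumps, re-weighting the data after each one."""
    weights = [1.0 / len(xs)] * len(xs)
    learners = []
    for _ in range(rounds):
        error, threshold, polarity = best_stump(xs, ys, weights)
        error = max(error, 1e-10)  # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - error) / error)
        learners.append((alpha, threshold, polarity))
        # Misclassified points (y * pred == -1) gain weight; correct ones lose it.
        updated = []
        for x, y, w in zip(xs, ys, weights):
            pred = polarity if x > threshold else -polarity
            updated.append(w * math.exp(-alpha * y * pred))
        total = sum(updated)
        weights = [w / total for w in updated]
    return learners


def predict(learners, x):
    """Final result: the weighted combination of all learners."""
    score = sum(alpha * (polarity if x > threshold else -polarity)
                for alpha, threshold, polarity in learners)
    return 1 if score > 0 else -1


# Toy data that no single stump can separate (labels are +1 / -1).
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
learners = boost(xs, ys, rounds=3)
predictions = [predict(learners, x) for x in xs]
print(predictions == ys)
```

Each stump alone misclassifies some points, but the weighted vote of three stumps separates this toy set, which mirrors the "combination of learners from each iteration" on the slide.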

#RSAC

Case Study 1: Model Performance and Productization

Model trained in regular intervals
• Size of data: 360 GB per day
• Within minutes

Classification runs multiple times a day
• Completed within seconds

Dataset                          True Positive Rate  False Positive Rate
Only using Azure IPFIX data      55%                 1%
Using Azure IPFIX and O365 data  81%                 1%

26 points improvement
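
For reference, the two metrics in the table are computed from labeled outcomes as follows. The labels and predictions below are made up for the example; only the 55% and 81% figures are taken from the slide.

```python
def rates(labels, predictions):
    """Return (true positive rate, false positive rate) for binary outcomes."""
    tp = sum(1 for y, p in zip(labels, predictions) if y and p)
    fn = sum(1 for y, p in zip(labels, predictions) if y and not p)
    fp = sum(1 for y, p in zip(labels, predictions) if not y and p)
    tn = sum(1 for y, p in zip(labels, predictions) if not y and not p)
    return tp / (tp + fn), fp / (fp + tn)


# Hypothetical ground truth (1 = compromised VM) and model verdicts.
labels = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predictions = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
tpr, fpr = rates(labels, predictions)
print(f"TPR={tpr:.2f} FPR={fpr:.2f}")

# The improvement quoted on the slide is the difference in TPR, in percentage points.
improvement_points = 81 - 55
print(improvement_points)
```

Holding the false positive rate fixed at 1% while comparing true positive rates, as the table does, isolates the gain that comes from adding the O365 labels.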

#RSAC

Case Study 2
Successful detection through combining rules and machine learning

PROBLEM STATEMENT
Rule-based malware detection places hard constraints on whether something is malware or not. While rules are specific, they produce a lot of false positives and false negatives, and they are not adaptable.

HYPOTHESIS
Can we combine the hard logic of rule-based detections with the soft logic of machine learning systems?

SOLUTION
Build two ML models:
1) Model 1 that baselines malware behavior
2) Model 2 that incorporates rules as features
Combine the results of the two models

#RSAC

Case Study 2: Malware Detection Background
ATP Architecture

Conventional A/V

Detonation Chamber
• Spin up multiple VMs
• Multiple OS and Office versions
• Instrument attachment behavior

Safelinks
• Protects against malicious URLs in real time (on click)

#RSAC

Case Study 2: Technique Overview

PRE-ANALYSIS
• Hash/Fuzzy Hash
• PE Analyzer
• File Type Analyzer
• Photo Similarity
• …

DETONATION
• SysMon
• ETW Logger
• API hooks
• Crash dump
• …

POST-ANALYSIS
• YARA
• Threat Intel
• Network Analysis
• Macro Evidence
• …

[Diagram labels: Fingerprint Model; Behavioral Model; Combiner; Combined Verdict]

#RSAC

Case Study 2: Dataset (sample)

#RSAC

Case Study 2
Machine Learning Deep Dive: Fingerprint Model
Information gets more granular

CallOrder  Level1    Level2          Level3               Level4            Level5
1          Process   LoadImage       SYSTEM               .exe              wscript
2          Api       CallFunction    CreateMutexA         _!MSFTHISTORY!_
3          Api       CallFunction    CreateMutexW         !IETld!Mutex
4          Registry  SetRegValue     Tracing              wscript_rasapi32  EnableTracing
5          Registry  DeleteRegValue  InternetOption       internetsettings  ProxyBypass
6          Process   CreateProcess   NOT_SANDBOX_CHECK    LaunchedViaCom
7          Network   AccessNetWork   Wininet_Getaddrinfo
8          Api       CallFunction    CreateMutexW         RANDOM_STR
9          Network   ResolveHost     piglyeleutqq.com     UNKNOWN
10         Api       CallFunction    Connect              UNKNOWN

#RSAC

Case Study 2
Machine Learning Deep Dive: Fingerprint Model

Observations
Benefits of the Action-Chain prototype
• It can be RESILIENT to malware obfuscation because it captures the runtime semantics by considering the more IMPORTANT details
• Feature extraction is NON-PARAMETRIC
• Would generalize to many situations

Model
Current: L1 Logistic Regression followed by L2 Logistic Regression; weighted samples through cross-validation
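
One way to picture how an action chain with increasingly granular levels could become model features is to count every level prefix of every event. This is only a guess at the general idea, not the feature extraction used in the talk; the function name and the slash-joined feature keys are assumptions. The events are taken from the call-order table on the previous slide.

```python
from collections import Counter


def prefix_features(events):
    """Count every level prefix of every event, from coarse to granular."""
    counts = Counter()
    for levels in events:
        for depth in range(1, len(levels) + 1):
            counts["/".join(levels[:depth])] += 1
    return counts


# Rows 1, 2, 3 and 8 of the call-order table, one tuple of levels per event.
events = [
    ("Process", "LoadImage", "SYSTEM", ".exe", "wscript"),
    ("Api", "CallFunction", "CreateMutexA", "_!MSFTHISTORY!_"),
    ("Api", "CallFunction", "CreateMutexW", "!IETld!Mutex"),
    ("Api", "CallFunction", "CreateMutexW", "RANDOM_STR"),
]
features = prefix_features(events)
print(features["Api/CallFunction/CreateMutexW"])
```

Under this scheme a sample that randomizes its mutex names still produces the same coarse features (`Api/CallFunction/CreateMutexW`), which is one plausible reading of the resilience-to-obfuscation claim above.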

#RSAC

Case Study 2
Machine Learning Deep Dive: Behavioral Model
Incorporates security domain knowledge into the model

Source of features
• YARA rules
• Static analysis
• Aggregates from data:
  • Registry keys/values that are changed/created/deleted
  • Mutexes created
  • Number of spawned processes per process detail info

The model works well to detect new types of malware
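
"Rules as features" can be sketched as follows: instead of letting a rule hit decide the verdict on its own, each rule becomes a binary input alongside the behavioral aggregates, and a trained model weighs them together. The report layout, rule names, and process name are hypothetical; the registry value and mutex names are borrowed from the fingerprint table for flavor. No YARA library call is used or implied.

```python
def behavior_features(report, rule_names):
    """Turn rule hits and behavioral aggregates into one feature dictionary."""
    features = {f"rule_{name}": int(name in report["rule_hits"]) for name in rule_names}
    features["registry_changes"] = len(report["registry_changes"])
    features["mutexes_created"] = len(report["mutexes"])
    features["spawned_processes"] = len(report["spawned_processes"])
    return features


# Hypothetical detonation report for one sample.
report = {
    "rule_hits": {"suspicious_macro"},
    "registry_changes": ["EnableTracing", "ProxyBypass"],
    "mutexes": ["_!MSFTHISTORY!_", "!IETld!Mutex"],
    "spawned_processes": ["wscript.exe"],
}
features = behavior_features(report, rule_names=["suspicious_macro", "packed_binary"])
print(features)
```

Because a rule hit is now one signal among several, a model trained on such vectors can discount a noisy rule or flag a sample that trips no rule at all.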

#RSAC

Case Study 2: Model Performance and Productization

Model trained in regular intervals
• Size of data: 270 GB per day
• Completed within minutes

Classification runs multiple times a day
• Completed within milliseconds

Dataset                             True Positive Rate  False Positive Rate
YARA rules only                     82.6%               0.0178%
Machine Learning Model 1 + Model 2  93.6%               0.0127%

10 points improvement

#RSAC

For Attack Disruption, We Need to Think Beyond Detection
Gather -> Detect -> Alert -> Triage -> Context -> Plan -> Execute
We need to change this

#RSAC

Triage: incidents, not alerts

Alerts
• Anomalous DLL: rundll32.exe launched as sposql11 on CFE110095
• New process uploading: rundll32.exe to 40.114.40.133 on CFE110095
• Large transfer: 50 MB to 40.114.40.133 from sqlagent.exe on SQL11006

Alert entities
alert type: anomalous dll    process: rundll32.exe       user: sposql11              host: CFE110095
alert type: new proc upload  process: rundll32.exe       remote host: 40.114.40.133  host: CFE110095
alert type: large transfer   remote host: 40.114.40.133  process: sqlagent.exe       host: SQL11006
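
The three alerts above share entities (a process, a host, a remote host), which is what allows them to be presented as one incident. A minimal sketch of that grouping, assuming alerts are linked whenever they share any entity value, is shown below using a small union-find. The record layout and the fourth, unrelated alert are invented for the example; the talk does not describe its actual correlation logic.

```python
def group_alerts(alerts):
    """Group alerts that share any entity value into incidents."""
    parent = list(range(len(alerts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}
    for index, alert in enumerate(alerts):
        for value in alert["entities"].values():
            if value in first_seen:
                parent[find(index)] = find(first_seen[value])
            else:
                first_seen[value] = index

    incidents = {}
    for index, alert in enumerate(alerts):
        incidents.setdefault(find(index), []).append(alert["type"])
    return list(incidents.values())


alerts = [
    {"type": "anomalous dll",
     "entities": {"process": "rundll32.exe", "user": "sposql11", "host": "CFE110095"}},
    {"type": "new proc upload",
     "entities": {"process": "rundll32.exe", "remote_host": "40.114.40.133", "host": "CFE110095"}},
    {"type": "large transfer",
     "entities": {"remote_host": "40.114.40.133", "process": "sqlagent.exe", "host": "SQL11006"}},
    # Hypothetical unrelated alert, added only to show that it stays separate.
    {"type": "port scan",
     "entities": {"process": "scanner.exe", "host": "WEB0001"}},
]
incidents = group_alerts(alerts)
print(incidents)
```

The large transfer joins the incident only through the shared remote host, so an analyst sees one chain across two machines instead of three separate alerts.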

#RSAC

Triage: incidents, not alerts

#RSAC

Demo

#RSAC

Conclusion
Attack disruption means shortening the blue team kill chain
• Speed: Real-time detection
• Quality: Reduce false positives
• React: Fast triage

#RSAC

Attack Disruption Checklist
• Combine different datasets
• Labels, Labels, Labels
• Scalable ML solution and expertise

Example Azure services you can leverage:
• Azure Event Hubs
• Azure Machine Learning
• Azure Data Lake

#RSAC

Thank you