identification of the correct hard-scatter vertex at the...
TRANSCRIPT
Identification of the correct hard-scatter vertex at the Large Hadron Collider (LHC)
Pratik Kumar (pratikk), Neel Mani Singh (neelmani)
• ATLAS is a particle detector analyzing proton-proton collisions from the LHC.
• Identification of the correct hard-scatter primary vertex from around 60 collisions.
• Key challenge for the analysis of LHC events is pileup.
MOTIVATION
The current technique for the identification of the primary vertex selects the vertex with the highest total energy. The total energy is computed as the scalar sum of all particle tracks associated to the vertex. This method has a very poor performance when the number of pileup interactions is large, selecting the wrong vertex 40% of the time as seen in the graph.
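The baseline rule described above can be sketched in a few lines. This is only an illustration: the dictionary layout, the vertex ids, and the track values are hypothetical and are not taken from the ATLAS software.

```python
from typing import Dict, List


def select_highest_sum_vertex(vertices: Dict[str, List[float]]) -> str:
    """Return the id of the vertex whose tracks have the largest scalar sum."""
    return max(vertices, key=lambda vertex_id: sum(vertices[vertex_id]))


# Hypothetical toy event: vertex id -> per-track values (arbitrary units).
event = {"v0": [1.2, 0.8, 2.5], "v1": [20.0, 15.5], "v2": [0.4]}
chosen = select_highest_sum_vertex(event)
print(chosen)
```

The rule only looks at one number per vertex, which is why a pileup vertex with many soft tracks can outrank the true hard-scatter vertex.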
CURRENT METHOD
DATASET & FEATURES
Our dataset consists of computer-simulated events of Higgs bosons. Each event picture consists of a list of vertices (60 on average) and each vertex consists of a list of particle tracks. Each track is represented by a direction in 3D space, an origin (given by the vertex it belongs to), and its energy. [...] features that will be used as inputs for a classifier.
MODEL USED
RESULT
Model   F-Score (test)   F-Score (train)
LR      98.63            98.62
NN      96.84            96.72
BBLR    96.37            96.32
BBNN    55.18            55.01
DISCUSSIONS
FUTURE WORK
Features used:
• sumPt - scalar sum of transverse momentum of all the tracks.
• sumPtw - weighted sum of track.
• MET - missing transverse energy.
• eta1, eta2, eta3 - angle for top 3 tracks.
• pt1, pt2, pt3 - transverse momentum of top 3 tracks.
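A minimal sketch of how some of the per-vertex features listed above could be computed. It assumes each track is a hypothetical (pt, eta) pair, that the "top 3" tracks are the three with the highest transverse momentum, and that missing tracks are padded with zeros. sumPtw and MET are left out because the source does not define the weighting or the MET calculation.

```python
from typing import Dict, List, Tuple

Track = Tuple[float, float]  # (pt, eta): hypothetical track representation


def vertex_features(tracks: List[Track]) -> Dict[str, float]:
    """Compute sumPt plus pt and eta of the three highest-pt tracks."""
    ranked = sorted(tracks, key=lambda track: track[0], reverse=True)
    top3 = ranked[:3]
    # Assumption: vertices with fewer than 3 tracks are padded with zeros.
    top3 += [(0.0, 0.0)] * (3 - len(top3))

    features = {"sumPt": sum(pt for pt, _ in tracks)}
    for rank, (pt, eta) in enumerate(top3, start=1):
        features[f"pt{rank}"] = pt
        features[f"eta{rank}"] = eta
    return features


features = vertex_features([(5.0, 0.3), (12.0, -1.1)])
print(features)
```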
Since we have a class imbalance problem, we have to use a metric that is not biased towards the majority class. Therefore we have chosen to use weighted F1-score.
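For reference, a weighted F1-score is commonly computed as the per-class F1 averaged with weights equal to each class's share of the true labels. The sketch below follows that common definition; the toy labels are hypothetical and the poster does not state which implementation was used.

```python
from typing import List


def f1_for_class(y_true: List[int], y_pred: List[int], label: int) -> float:
    """F1 for one class, treating `label` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == label and p != label)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


def weighted_f1(y_true: List[int], y_pred: List[int]) -> float:
    """Average the per-class F1 scores, weighted by class support."""
    total = len(y_true)
    return sum(
        y_true.count(label) / total * f1_for_class(y_true, y_pred, label)
        for label in set(y_true)
    )


# Hypothetical toy case: a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
score = weighted_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy case the minority class contributes an F1 of zero, so the score falls below the plain accuracy of 0.8.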
VERTEX SELECTION
VERTEX SELECTION EFFICIENCY
The vertex selection efficiency from BBLR shows that our model performs better at high pileup densities than the current technique.
• Logistic Regression (LR)
• Neural Network (NN)
• Balanced Bagging (BB)
• Balanced Bagging with Logistic Regression (BBLR)
• Balanced Bagging with Neural Network (BBNN)
LR did not perform well due to class imbalance in data. Bagging techniques gave better results.
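The resampling idea behind balanced bagging can be sketched as follows. This is a simplified illustration, not the poster's implementation: it assumes label 1 marks the hard-scatter (minority) class, keeps every minority sample in each bag, and draws an equally sized random subset of the majority class. A base classifier such as logistic regression would then be trained on each bag and the predictions averaged.

```python
import random
from typing import List, Sequence


def balanced_bags(labels: Sequence[int], n_bags: int, seed: int = 0) -> List[List[int]]:
    """Return index lists, each with equal numbers of minority and majority samples."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    bags = []
    for _ in range(n_bags):
        # Under-sample the majority class down to the minority class size.
        sampled_majority = rng.sample(majority, k=len(minority))
        bags.append(minority + sampled_majority)
    return bags


# Hypothetical labels: two events with one hard-scatter vertex among 60 each.
labels = ([1] + [0] * 59) * 2
bags = balanced_bags(labels, n_bags=5)
print([len(bag) for bag in bags])
```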
• Until now, we have treated each vertex as an independent data input.
• But for our problem, we need to select a vertex from the group of vertices of an experiment. For this we evaluate our model per experiment and choose the vertex that gives the highest probability.
• Based on this, we calculate the vertex selection efficiency vs pileup density.
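The per-experiment selection and the resulting efficiency can be sketched as below. The row layout (event id, pileup density bin, predicted probability, truth flag) and the binning of the pileup density are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (event_id, pileup_density_bin, predicted_probability, is_hard_scatter)
Row = Tuple[int, int, float, bool]


def selection_efficiency(rows: List[Row]) -> Dict[int, float]:
    """Pick the highest-probability vertex per event, then score each density bin."""
    best: Dict[int, Row] = {}
    for row in rows:
        event_id = row[0]
        if event_id not in best or row[2] > best[event_id][2]:
            best[event_id] = row

    correct: Dict[int, int] = defaultdict(int)
    total: Dict[int, int] = defaultdict(int)
    for _, density_bin, _, is_hard_scatter in best.values():
        total[density_bin] += 1
        correct[density_bin] += int(is_hard_scatter)
    return {density_bin: correct[density_bin] / total[density_bin] for density_bin in total}


# Hypothetical toy predictions for three events in two density bins.
rows = [
    (0, 1, 0.9, True), (0, 1, 0.2, False),
    (1, 1, 0.4, True), (1, 1, 0.6, False),
    (2, 2, 0.7, True), (2, 2, 0.1, False),
]
efficiency = selection_efficiency(rows)
print(efficiency)
```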
• Neural Networks can be improved by tuning of the parameters - learning rate, hidden layer units, etc.
• Thresholds - Predictions based on different thresholds.
• Features - More features can be extracted from the simulation of the events.
REFERENCES
• https://atlas.cern
• Slides from Prof Ariel Schwartzman
• Debashree Devi, Saroj kr. Biswas and Biswajit Purkayastha, “Redundancy driven modified Tomek-link based undersampling: A solution to class imbalance”, 2016
• Kevin W. Bowyer, Nitesh V. Chawla, Lawrence O. Hall and W. Philip Kegelmeyer, “SMOTE: Synthetic Minority Over-sampling Technique”, CoRR, 2011.
• https://svds.com/learning-imbalanced-classes/
• The data is inherently unbalanced because of the nature of the experiment, so general training techniques do not work.
• Features other than sumPt have a discriminating effect for different types of collision events. That is why our model works better than the existing approach at high pileup densities, as shown by the vertex selection efficiency.
• Our model performs almost identically on the training and test sets, so there is no sign of overfitting.
• A Neural Network without the balanced bagging method of classification is unstable for this dataset, as it produces widely varying results.
The ROC curve for BBLR indicates that results can be improved by using different thresholds.
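As an illustration of what scanning thresholds means, the sketch below computes the true-positive and false-positive rates at a few cut values, which are the points that make up a ROC curve. The scores, labels, and thresholds are hypothetical and are not the BBLR outputs.

```python
from typing import List, Tuple


def roc_points(
    scores: List[float], labels: List[int], thresholds: List[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, true-positive rate, false-positive rate) triples."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in thresholds:
        # A sample is predicted positive when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, tp / positives, fp / negatives))
    return points


# Hypothetical classifier scores and truth labels.
scores = [0.9, 0.6, 0.4, 0.2]
labels = [1, 0, 1, 0]
points = roc_points(scores, labels, [0.3, 0.5, 0.7])
print(points)
```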