identification of the correct hard-scatter vertex at the...

Post on 18-Sep-2020


Identification of the correct hard-scatter vertex at the Large Hadron Collider (LHC)
Pratik Kumar (pratikk), Neel Mani Singh (neelmani)

MOTIVATION

• ATLAS is a particle detector analyzing proton-proton collisions from the LHC.

• The goal is to identify the correct hard-scatter primary vertex from around 60 collisions.

• A key challenge for the analysis of LHC events is pileup.

CURRENT METHOD

The current technique for identifying the primary vertex selects the vertex with the highest total energy. The total energy is computed as the scalar sum over all particle tracks associated with the vertex. This method performs very poorly when the number of pileup interactions is large, selecting the wrong vertex 40% of the time, as seen in the graph.
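The current technique can be sketched in a few lines. This is only an illustration, assuming each vertex is given as a plain list of per-track values (energy or transverse momentum); the function name and the toy numbers are hypothetical and not taken from the ATLAS software.

```python
from typing import Sequence


def select_vertex_by_scalar_sum(vertices: Sequence[Sequence[float]]) -> int:
    """Return the index of the vertex whose tracks have the largest scalar sum."""
    if not vertices:
        raise ValueError("event has no vertices")
    sums = [sum(track_values) for track_values in vertices]
    # max() returns the first index in case of a tie
    return max(range(len(sums)), key=sums.__getitem__)


# Toy event with three vertices; the values are made up for illustration.
event = [[1.2, 0.8, 0.5], [25.0, 18.5, 3.1], [2.0, 2.2]]
best = select_vertex_by_scalar_sum(event)
```

With many pileup vertices, a pileup vertex can by chance reach a larger sum than the hard-scatter vertex, which is the failure mode described above.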

DATASET & FEATURES

Our dataset consists of computer-simulated events of Higgs bosons. Each event consists of a list of vertices (60 on average), and each vertex consists of a list of particle tracks. Each track is represented by a direction in 3D space, an origin (given by the vertex it belongs to), and its energy. From these tracks we derive features that will be used as inputs for a classifier.

RESULT

Model   F-Score (test)   F-Score (train)
LR      98.63            98.62
NN      96.84            96.72
BBLR    96.37            96.32
BBNN    55.18            55.01

Features used:

• sumPt - scalar sum of the transverse momentum of all the tracks.

• sumPtw - weighted sum over the tracks.

• MET - missing transverse energy.

• eta1, eta2, eta3 - angle for the top 3 tracks.

• pt1, pt2, pt3 - transverse momentum of the top 3 tracks.
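A rough sketch of how such per-vertex features could be computed is shown below. It assumes each track is available as momentum components (px, py, pz), which is an assumption about the data layout and not something the poster states. The weighted sum sumPtw is omitted because the weights are not specified. Missing top-3 entries are padded with 0.0, which is also an assumption.

```python
import math
from typing import Dict, List, Tuple

Track = Tuple[float, float, float]  # (px, py, pz): assumed track representation


def vertex_features(tracks: List[Track]) -> Dict[str, float]:
    """Compute sumPt, MET and the pt/eta of the three highest-pt tracks."""
    pts = [math.hypot(px, py) for px, py, _ in tracks]
    etas = []
    for (_, _, pz), pt in zip(tracks, pts):
        # pseudorapidity: eta = asinh(pz / pt); set to 0.0 for a zero-pt track
        etas.append(math.asinh(pz / pt) if pt > 0 else 0.0)

    sum_px = sum(t[0] for t in tracks)
    sum_py = sum(t[1] for t in tracks)
    feats = {
        "sumPt": sum(pts),
        # magnitude of the (negative) vector sum of transverse momenta
        "MET": math.hypot(sum_px, sum_py),
    }

    order = sorted(range(len(tracks)), key=lambda i: pts[i], reverse=True)[:3]
    for rank in range(3):
        if rank < len(order):
            i = order[rank]
            feats[f"pt{rank + 1}"] = pts[i]
            feats[f"eta{rank + 1}"] = etas[i]
        else:
            feats[f"pt{rank + 1}"] = 0.0
            feats[f"eta{rank + 1}"] = 0.0
    return feats


# Two back-to-back tracks: their transverse momenta cancel, so MET is zero.
feats = vertex_features([(3.0, 4.0, 0.0), (-3.0, -4.0, 0.0)])
```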

Since we have a class imbalance problem, we have to use a metric that is not biased towards the majority class. Therefore, we have chosen to use the weighted F1-score.
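For reference, the weighted F1-score is the per-class F1 averaged with weights proportional to each class's share of the true labels. The minimal pure-Python sketch below assumes that standard definition; the poster does not say which implementation was used, and the labels in the example are invented.

```python
from collections import Counter
from typing import Sequence


def weighted_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Per-class F1, averaged with weights equal to each class's support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for cls, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Toy example: a classifier that always predicts the majority class.
y_true = [0, 0, 0, 0, 1]
y_pred = [0, 0, 0, 0, 0]
score = weighted_f1(y_true, y_pred)
```

scikit-learn's `f1_score` with `average="weighted"` computes the same quantity.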

VERTEX SELECTION EFFICIENCY

The vertex selection efficiency from BBLR shows that our model performs better at high pileup densities than the current technique.

MODEL USED

• Logistic Regression (LR)
• Neural Network (NN)
• Balanced Bagging (BB)
• Balanced Bagging with Logistic Regression (BBLR)
• Balanced Bagging with Neural Network (BBNN)

LR did not perform well due to class imbalance in the data. Bagging techniques gave better results.
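The idea behind balanced bagging can be sketched as follows: each base model is trained on a resampled bag containing equally many positive and negative examples, and the ensemble averages the predicted probabilities. This is a simplified illustration, not the implementation used for the results. The one-feature base learner `fit_midpoint` is a hypothetical stand-in for logistic regression or a neural network, and the data are invented.

```python
import math
import random
from typing import Callable, List, Sequence

Model = Callable[[float], float]  # maps a feature value to P(hard scatter)


def balanced_bag_indices(labels: Sequence[int], rng: random.Random) -> List[int]:
    """Sample with replacement so that both classes are equally represented."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    n = min(len(pos), len(neg))
    return [rng.choice(pos) for _ in range(n)] + [rng.choice(neg) for _ in range(n)]


def fit_midpoint(X: Sequence[float], y: Sequence[int]) -> Model:
    """Hypothetical base learner: a sigmoid centred between the class means."""
    mean_pos = sum(x for x, l in zip(X, y) if l == 1) / sum(y)
    mean_neg = sum(x for x, l in zip(X, y) if l == 0) / (len(y) - sum(y))
    mid = (mean_pos + mean_neg) / 2
    scale = (mean_pos - mean_neg) or 1.0
    return lambda x: 1.0 / (1.0 + math.exp(-4.0 * (x - mid) / scale))


def fit_balanced_bagging(X, y, fit_base, n_bags: int = 10, seed: int = 0) -> List[Model]:
    rng = random.Random(seed)
    models = []
    for _ in range(n_bags):
        idx = balanced_bag_indices(y, rng)
        models.append(fit_base([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict_proba(models: List[Model], x: float) -> float:
    """Average the base-model probabilities."""
    return sum(m(x) for m in models) / len(models)


# Toy data: 20 negatives with small feature values, 2 positives with large ones.
X = [0.1 * i for i in range(20)] + [5.0, 5.5]
y = [0] * 20 + [1, 1]
models = fit_balanced_bagging(X, y, fit_midpoint)
```

Because every bag is balanced, no single base model can score well by always predicting the majority class.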

VERTEX SELECTION

• So far, we have treated each vertex as an independent data input.

• For our problem, however, we need to select one vertex from the group of vertices in an experiment. To do this, we evaluate our model per experiment and choose the vertex with the highest predicted probability.

• Based on this, we calculate the vertex selection efficiency versus pileup density.
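The per-experiment selection and the efficiency calculation described above could look roughly like this. The row layout (event id, predicted probability, truth flag), the pileup lookup table and the bin width are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (event_id, predicted probability, is_hard_scatter): hypothetical layout
Row = Tuple[int, float, bool]


def select_per_event(rows: List[Row]) -> Dict[int, bool]:
    """Pick the highest-probability vertex per event; report if it is correct."""
    best: Dict[int, Row] = {}
    for row in rows:
        event_id, prob, _ = row
        if event_id not in best or prob > best[event_id][1]:
            best[event_id] = row
    return {event_id: row[2] for event_id, row in best.items()}


def efficiency_by_pileup(
    correct: Dict[int, bool], pileup: Dict[int, float], bin_width: float
) -> Dict[int, float]:
    """Fraction of correctly selected events per pileup-density bin index."""
    hits: Dict[int, int] = defaultdict(int)
    totals: Dict[int, int] = defaultdict(int)
    for event_id, ok in correct.items():
        bin_index = int(pileup[event_id] // bin_width)
        totals[bin_index] += 1
        hits[bin_index] += int(ok)
    return {b: hits[b] / totals[b] for b in sorted(totals)}


# Toy input: three events with two vertices each.
rows = [
    (0, 0.9, True), (0, 0.2, False),
    (1, 0.4, True), (1, 0.7, False),
    (2, 0.8, True), (2, 0.1, False),
]
pileup = {0: 0.3, 1: 0.4, 2: 1.2}
correct = select_per_event(rows)
eff = efficiency_by_pileup(correct, pileup, bin_width=0.5)
```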

FUTURE WORK

• Neural Networks can be improved by tuning the parameters: learning rate, hidden layer units, etc.

• Thresholds - Predictions based on different thresholds.

• Features - More features can be extracted from the simulation of the events.

REFERENCES

• https://atlas.cern
• Slides from Prof. Ariel Schwartzman
• Debashree Devi, Saroj Kr. Biswas and Biswajit Purkayastha, "Redundancy-driven modified Tomek-link based undersampling: A solution to class imbalance", 2016.
• Kevin W. Bowyer, Nitesh V. Chawla, Lawrence O. Hall and W. Philip Kegelmeyer, "SMOTE: Synthetic Minority Over-sampling Technique", CoRR, 2011.
• https://svds.com/learning-imbalanced-classes/

DISCUSSIONS

• The data is inherently unbalanced because of the nature of the experiment, so general training techniques do not work well.

• Features other than sumPt have a discriminating effect for different types of collision events. That is why our model works better than the existing approach at high pileup densities, as measured by the vertex selection efficiency.

• Our model performs almost identically on the training and test sets, so there is no sign of overfitting.

• A Neural Network without balanced bagging is unstable for this dataset, as it produces widely varying results.

The ROC curve for BBLR indicates that results can be improved by using a different threshold.
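To make the threshold dependence concrete, the sketch below computes the false-positive and true-positive rates for a few candidate thresholds, which are the points a ROC curve is built from. The scores and labels are invented for illustration and are not taken from the BBLR model.

```python
from typing import List, Sequence, Tuple


def roc_points(
    y_true: Sequence[int], scores: Sequence[float], thresholds: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (threshold, false-positive rate, true-positive rate) triples."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    if pos == 0 or neg == 0:
        raise ValueError("both classes must be present")
    points = []
    for thr in thresholds:
        tp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= thr and y == 0)
        points.append((thr, fp / neg, tp / pos))
    return points


# Toy scores: raising the threshold trades true positives for fewer false positives.
y_true = [0, 0, 0, 1, 1]
scores = [0.1, 0.4, 0.6, 0.55, 0.9]
points = roc_points(y_true, scores, thresholds=[0.5, 0.7])
```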
