identification of the correct hard-scatter vertex at the...

Identification of the correct hard-scatter vertex at the Large Hadron Collider (LHC)

Pratik Kumar (pratikk), Neel Mani Singh (neelmani)

MOTIVATION

• ATLAS is a particle detector analyzing proton-proton collisions from the LHC.

• The task is to identify the correct hard-scatter primary vertex from around 60 collisions.

• A key challenge for the analysis of LHC events is pileup.

CURRENT METHOD

The current technique for the identification of the primary vertex selects the vertex with the highest total energy. The total energy is computed as the scalar sum of all particle tracks associated to the vertex. This method performs very poorly when the number of pileup interactions is large, selecting the wrong vertex 40% of the time, as seen in the graph.
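The current selection rule can be sketched in a few lines. This is a minimal illustration, not the ATLAS implementation: the `Track` and `Vertex` classes and the function names are hypothetical, and the per-track quantity being summed is assumed to be the transverse momentum, in line with the sumPt feature listed on this poster.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Track:
    pt: float  # per-track transverse momentum (assumed quantity and units)


@dataclass
class Vertex:
    tracks: List[Track]


def sum_pt(vertex: Vertex) -> float:
    """Scalar sum of the per-track quantity over all tracks of one vertex."""
    return sum(track.pt for track in vertex.tracks)


def select_baseline(vertices: List[Vertex]) -> int:
    """Return the index of the vertex with the highest scalar sum."""
    return max(range(len(vertices)), key=lambda i: sum_pt(vertices[i]))


# Toy event with three vertices (values are made up for illustration).
event = [
    Vertex([Track(1.0), Track(2.0)]),
    Vertex([Track(20.0), Track(15.0), Track(5.0)]),
    Vertex([Track(0.5)]),
]
chosen = select_baseline(event)
```

With many pileup vertices, a pileup vertex can by chance reach a larger sum than the hard-scatter vertex, which is the failure mode described above.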

DATASET & FEATURES

Our dataset consists of computer-simulated events of Higgs bosons. Each event picture consists of a list of vertices (60 on average) and each vertex consists of a list of particle tracks. Each track is represented by a direction in 3D space, an origin (given by the vertex it belongs to), and its energy. From these tracks we compute features that will be used as inputs for a classifier.

MODEL USED

RESULT

Model   F-Score (test)   F-Score (train)
LR      98.63            98.62
NN      96.84            96.72
BBLR    96.37            96.32
BBNN    55.18            55.01

DISCUSSIONS

FUTURE WORK

Features used:

• sumPt - scalar sum of transverse momentum of all the tracks.

• sumPtw - weighted sum over the tracks.

• MET - missing transverse energy.

• eta1, eta2, eta3 - angle (pseudorapidity) of the top 3 tracks.

• pt1, pt2, pt3 - transverse momentum of the top 3 tracks.
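The features listed above could be derived from the raw track description (direction and energy) roughly as follows. This is a sketch under stated assumptions: tracks are treated as massless so that momentum equals energy times the unit direction, "top 3" is taken to mean the three highest-pT tracks, missing slots are padded with 0.0, and every track has non-zero transverse momentum. sumPtw is left out because the weights are not specified here. The function and key names are hypothetical.

```python
import math
from typing import Dict, List, Tuple

# A track as (dx, dy, dz, energy): a 3D direction plus its energy.
Track = Tuple[float, float, float, float]


def track_kinematics(track: Track) -> Tuple[float, float, float, float]:
    """Return (px, py, pt, eta) for one track, assuming a massless particle."""
    dx, dy, dz, energy = track
    norm = math.sqrt(dx * dx + dy * dy + dz * dz)
    px, py, pz = energy * dx / norm, energy * dy / norm, energy * dz / norm
    pt = math.hypot(px, py)          # transverse momentum
    eta = math.asinh(pz / pt)        # pseudorapidity (requires pt > 0)
    return px, py, pt, eta


def vertex_features(tracks: List[Track]) -> Dict[str, float]:
    """Compute sumPt, MET and the pt/eta of the three highest-pT tracks."""
    kin = sorted((track_kinematics(t) for t in tracks),
                 key=lambda k: k[2], reverse=True)
    features = {
        "sumPt": sum(k[2] for k in kin),
        # Magnitude of the vector sum of transverse momenta.
        "MET": math.hypot(sum(k[0] for k in kin), sum(k[1] for k in kin)),
    }
    for rank in range(3):
        has_track = rank < len(kin)
        features[f"pt{rank + 1}"] = kin[rank][2] if has_track else 0.0
        features[f"eta{rank + 1}"] = kin[rank][3] if has_track else 0.0
    return features


example = vertex_features([
    (1.0, 0.0, 0.0, 10.0),
    (-1.0, 0.0, 0.0, 4.0),
    (0.0, 1.0, 0.0, 3.0),
])
```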

Since we have a class imbalance problem, we have to use a metric that is not biased towards the majority class. Therefore we have chosen to use the weighted F1-score.
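For reference, the weighted F1-score is usually computed as the per-class F1 averaged with weights proportional to each class's number of true samples. The sketch below assumes that standard definition; the function name `weighted_f1` and the toy labels are illustrative only.

```python
from typing import Sequence


def weighted_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Per-class F1, averaged with weights equal to each class's support."""
    total = len(y_true)
    score = 0.0
    for label in sorted(set(y_true)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        support = tp + fn  # number of true samples of this class
        score += f1 * support / total
    return score


# Toy example: 8 pileup vertices (0), 2 hard-scatter vertices (1),
# and a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = weighted_f1(y_true, y_pred)
```

In this toy case the always-majority classifier reaches an accuracy of 0.8, while its weighted F1-score is lower because the minority class contributes an F1 of zero.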

VERTEX SELECTION

VERTEX SELECTION EFFICIENCY

The vertex selection efficiency from BBLR shows that our model performs better at high pileup densities than the current technique.

• Logistic Regression (LR)

• Neural Network (NN)

• Balanced Bagging (BB)

• Balanced Bagging with Logistic Regression (BBLR)

• Balanced Bagging with Neural Network (BBNN)

LR did not perform well due to class imbalance in the data. Bagging techniques gave better results.
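The idea behind balanced bagging with logistic regression can be sketched with the standard library alone. This is an illustration of the general technique, not the code used for the results here: each bag is assumed to combine a bootstrap sample of the minority class with an equally sized random undersample of the majority class, and the learning rate, epoch count and number of estimators are arbitrary placeholder values.

```python
import math
import random
from typing import List, Tuple

Model = Tuple[List[float], float]  # (weights, bias)


def sigmoid(z: float) -> float:
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_proba(model: Model, x: List[float]) -> float:
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def fit_logistic(X, y, lr=0.1, epochs=200) -> Model:
    """Plain batch gradient descent on the logistic loss."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi in zip(X, y):
            err = predict_proba((weights, bias), xi) - yi
            grad_w = [g + err * v for g, v in zip(grad_w, xi)]
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias


def fit_balanced_bagging(X, y, n_estimators=10, seed=0) -> List[Model]:
    """Train each estimator on a class-balanced resample of the data."""
    rng = random.Random(seed)
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    models = []
    for _ in range(n_estimators):
        bag = [rng.choice(minority) for _ in minority]   # bootstrap minority
        bag += rng.sample(majority, len(minority))       # undersample majority
        models.append(fit_logistic([X[i] for i in bag], [y[i] for i in bag]))
    return models


def ensemble_proba(models: List[Model], x: List[float]) -> float:
    """Average the predicted probabilities of all estimators."""
    return sum(predict_proba(m, x) for m in models) / len(models)


# Toy data: 50 majority samples near -3 and 5 minority samples near +3.
X = [[-3.0 + 0.01 * i] for i in range(50)] + [[3.0 + 0.01 * i] for i in range(5)]
y = [0] * 50 + [1] * 5
models = fit_balanced_bagging(X, y)
```

Because every bag contains both classes in equal numbers, no single estimator can minimize its loss by ignoring the minority class.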

• So far, we have treated each vertex as an independent data input.

• But for our problem, we need to select one vertex from the group of vertices of an experiment. For this, we evaluate our model per experiment and choose the vertex with the highest predicted probability.

• Based on this, we calculate the vertex selection efficiency vs pileup density.
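The per-experiment selection and the resulting efficiency curve could be computed along these lines. This is a sketch with hypothetical names: each event is assumed to be given as a tuple of (per-vertex probabilities, index of the true hard-scatter vertex, pileup value), and the bin width is an arbitrary choice.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Event = Tuple[Sequence[float], int, float]  # (probabilities, true index, pileup)


def select_vertex(probabilities: Sequence[float]) -> int:
    """Pick the vertex with the highest predicted probability in one event."""
    return max(range(len(probabilities)), key=lambda i: probabilities[i])


def efficiency_by_pileup(events: List[Event], bin_width: float = 10) -> Dict[int, float]:
    """Fraction of events with the correct vertex selected, per pileup bin."""
    hits: Dict[int, int] = defaultdict(int)
    totals: Dict[int, int] = defaultdict(int)
    for probabilities, true_index, pileup in events:
        bin_start = int(pileup // bin_width * bin_width)
        totals[bin_start] += 1
        hits[bin_start] += int(select_vertex(probabilities) == true_index)
    return {b: hits[b] / totals[b] for b in sorted(totals)}


# Toy events (made-up numbers): two in the 10-20 bin, one in the 30-40 bin.
events = [
    ([0.1, 0.9], 1, 12),
    ([0.8, 0.2], 1, 15),
    ([0.3, 0.7], 1, 35),
]
efficiency = efficiency_by_pileup(events)
```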

• Neural Networks can be improved by tuning the parameters: learning rate, hidden layer units, etc.

• Thresholds - Predictions based on different thresholds.

• Features - More features can be extracted from the simulation of the events.

REFERENCES

• https://atlas.cern

• Slides from Prof. Ariel Schwartzman

• Debashree Devi, Saroj Kr. Biswas and Biswajit Purkayastha, “Redundancy-driven modified Tomek-link based undersampling: A solution to class imbalance”, 2016.

• Kevin W. Bowyer, Nitesh V. Chawla, Lawrence O. Hall and W. Philip Kegelmeyer, “SMOTE: Synthetic Minority Over-sampling Technique”, CoRR, 2011.

• https://svds.com/learning-imbalanced-classes/

• The data is inherently unbalanced because of the nature of the experiment, so general training techniques do not work well.

• Features other than sumPt have a discriminating effect for different types of collision event. That is why our model works better than the existing approach at high pile-up densities, as shown by the vertex selection efficiency.

• Our model performs almost identically on the training and test sets, so there is no sign of overfitting.

• A Neural Network without balanced bagging is unstable for this dataset, as it produces widely varying results.

The ROC curve for BBLR indicates that results can be improved by using different thresholds.
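A threshold scan of this kind amounts to recomputing the true-positive and false-positive rates at each candidate cut on the classifier score. The sketch below shows that computation on made-up scores; the function name and the threshold values are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float], labels: Sequence[int],
               thresholds: Sequence[float]) -> List[Tuple[float, float, float]]:
    """Return (threshold, true-positive rate, false-positive rate) triples."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in thresholds:
        tp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 1)
        fp = sum(1 for s, l in zip(scores, labels) if s >= threshold and l == 0)
        points.append((threshold, tp / positives, fp / negatives))
    return points


# Toy scores for two hard-scatter (1) and two pileup (0) vertices.
points = roc_points([0.9, 0.6, 0.4, 0.2], [1, 0, 1, 0], [0.5, 0.3])
```

Here lowering the threshold from 0.5 to 0.3 recovers the second true vertex without admitting another false positive, which is the kind of trade-off a ROC curve makes visible.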
