Post on 23-Jun-2020

BIG DATA ANALYTICS

LARGE HADRON COLLIDER

Manuel Martín Márquez, Senior Project Leader

CERN – IT Department

CERN EUROPEAN ORGANIZATION FOR NUCLEAR RESEARCH

A WORLDWIDE COLLABORATION

FUNDAMENTAL RESEARCH

WHAT IS THE UNIVERSE MADE OF?

HOW DID IT START?

FUNDAMENTAL RESEARCH

Why do particles have mass?

FUNDAMENTAL RESEARCH

Why is there no antimatter left in the Universe?

What is 95% of the Universe made of?

Manuel Martin Marquez

Intel IoT Ignition Lab – Cloud and Big Data

Munich, September 17th

CERN’s PARTICLE ACCELERATORS AND EXPERIMENTS

12/6/2019 Document reference 9

CERN Aerial View

World’s largest scientific instrument: 27 km (16.8 miles) circumference, 6000+ superconducting magnets

Emptiest place in the solar system: high vacuum inside the magnets

Hottest spot in the galaxy: lead-ion collisions create temperatures 100 000x hotter than the heart of the sun

Fastest racetrack on Earth: protons circulate 11245 times/s (99.9999991% of the speed of light)
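As a rough cross-check of the quoted proton speed, the short sketch below converts 99.9999991% of the speed of light into a Lorentz factor and a proton energy. The proton rest energy (0.938272 GeV) is a standard constant that does not appear on the slide, and the variable names are illustrative only.

```python
import math

# Speed quoted on the slide, as a fraction of the speed of light.
beta = 0.999999991

# Standard constant, not stated on the slide: proton rest energy in GeV.
PROTON_REST_ENERGY_GEV = 0.938272

# Lorentz factor: gamma = 1 / sqrt(1 - beta^2).
gamma = 1.0 / math.sqrt(1.0 - beta ** 2)

# Total energy per proton, converted from GeV to TeV.
energy_tev = gamma * PROTON_REST_ENERGY_GEV / 1000.0

print(f"gamma  = {gamma:.0f}")
print(f"energy = {energy_tev:.2f} TeV per proton")
```

The result comes out close to 7 TeV per proton, which matches the LHC design beam energy, so the quoted percentage appears to refer to design conditions.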

150 million sensors

Control and detection sensors

Massive 3D camera

Capturing millions of collisions per second

CMS Detector

FUTURE CHALLENGES

HIGH-LUMINOSITY LHC

FUTURE CIRCULAR COLLIDER

Post-LHC accelerator projects (80-100 km)

CERN’s BIG DATA AND MACHINE LEARNING

HIGH ENERGY PHYSICS

TRIGGER SYSTEM – FILTERING EVENTS & DATA

The trigger system selects approximately 1000 of the 1.7 billion collisions that occur each second in the centre of the ATLAS detector.
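To put those two rates in proportion, the arithmetic below derives the fraction of collisions kept and the corresponding rejection factor. Only the two rates come from the slide; the variable names are illustrative.

```python
# Rates quoted on the slide.
collisions_per_second = 1.7e9   # collisions in the centre of the ATLAS detector
selected_per_second = 1000      # events kept by the trigger system

# Fraction of collisions that survive the trigger.
keep_fraction = selected_per_second / collisions_per_second

# How many collisions are discarded for each one kept (approximately).
rejection_factor = collisions_per_second / selected_per_second

print(f"kept fraction:    {keep_fraction:.2e}")
print(f"rejection factor: {rejection_factor:.2e}")
```

In other words, roughly one collision in 1.7 million is retained, which is why false positives and missed events in the filter matter so much.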

TRIGGER SYSTEM – FILTERING EVENTS & DATA

Improve efficiency, flexibility and quality of filtering

Remove the costly and rigid hardware-based model

Reduce false-positive rates

From rule-based systems to deep learning classifiers

TRIGGER SYSTEM – ML PIPELINE

Data Ingestion: complex datasets; 801x19 matrix; files of about 4 TB

Feature Preparation: data format preparation; 19 original features; 14 derived from domain knowledge (HLF)

Model Development: feed-forward DNN; recurrent DNN (GRUs); combined models

Training: hyper-parameter tuning; Scikit-learn, Keras, Spark (parallel); distributed training
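The input shapes listed above can be made concrete with a minimal, dependency-free sketch of a GRU pass. It is not the pipeline used at CERN. It assumes the 801x19 matrix is one event (801 rows of 19 features each), that a "combined" model concatenates the GRU summary with the 14 domain-knowledge features, and it uses a hypothetical hidden size of 8, random untrained weights, random placeholder data, and no bias terms.

```python
import math
import random

SEQ_LEN, N_FEATURES, N_HLF = 801, 19, 14  # shapes quoted on the slide
HIDDEN = 8                                # assumed hidden size, illustration only


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Single GRU cell without biases: update gate z, reset gate r, candidate n."""

    def __init__(self, n_in, n_hidden, rng):
        self.w = {g: rand_matrix(n_hidden, n_in, rng) for g in "zrn"}      # input weights
        self.u = {g: rand_matrix(n_hidden, n_hidden, rng) for g in "zrn"}  # recurrent weights

    def step(self, x, h):
        z = [sigmoid(a + b) for a, b in zip(matvec(self.w["z"], x), matvec(self.u["z"], h))]
        r = [sigmoid(a + b) for a, b in zip(matvec(self.w["r"], x), matvec(self.u["r"], h))]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        n = [math.tanh(a + b) for a, b in zip(matvec(self.w["n"], x), matvec(self.u["n"], gated_h))]
        # New state blends the candidate with the previous state.
        return [(1.0 - zi) * ni + zi * hi for zi, ni, hi in zip(z, n, h)]


def encode_event(rows, hlf, cell):
    """Run the GRU over all rows, then append the high-level features."""
    h = [0.0] * HIDDEN
    for x in rows:
        h = cell.step(x, h)
    return h + list(hlf)


rng = random.Random(0)
cell = GRUCell(N_FEATURES, HIDDEN, rng)
# Random placeholder data standing in for one event.
event = [[rng.gauss(0.0, 1.0) for _ in range(N_FEATURES)] for _ in range(SEQ_LEN)]
hlf = [rng.gauss(0.0, 1.0) for _ in range(N_HLF)]

combined = encode_event(event, hlf, cell)
print(len(combined))  # HIDDEN + N_HLF values that a classifier head would consume
```

In practice a framework such as Keras would supply the GRU layer and the training loop; the sketch only shows how a sequence input and a flat feature vector can feed one model.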

TRIGGER SYSTEM – SCORING

Strict latency constraints

Target: Level-1 trigger

Larger networks with longer latency: neutrino and astronomical experiments, industrial applications, etc.

FPGAs

Provide huge flexibility and allow us to cope with the required response time

Performance depends on how well you take advantage of them

QUANTUM COMPUTING – ANOTHER WEAPON?

Can Quantum Computing and Q-ML help?

Quantum Nearest Neighbors Clustering, PCA and SVM

But it is still hard to:

Get access to emulators and simulators

Get access to real devices, benchmark, compare results

Engineering aspects of QC installation, like cryogenics and material science

Use cases:

Track reconstruction in dense environments

Reconstruct neutrino interactions

Optimize Grid workflow

CERN’s BIG DATA AND MACHINE LEARNING

CERN CONTROL SYSTEMS AND IOT

CERN ACCELERATOR LOGGING SERVICE

More than 2 million signals produce over 2.5 TB of data per day.

From scalars to arrays of up to 4 million elements.

Data diverse in nature:

Accelerator running modes,

Equipment statuses,

Magnet currents,

Cryogenics temperatures,

Particle beam positions
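The quoted volumes translate into the following rough averages. Both slide figures are lower bounds, so the results are lower bounds too; decimal terabytes and the variable names are assumptions made for the illustration.

```python
# Figures quoted on the slide (both are stated as lower bounds).
SIGNALS = 2_000_000
BYTES_PER_DAY = 2.5e12        # 2.5 TB per day, assuming decimal terabytes
SECONDS_PER_DAY = 86_400

# Average sustained ingest rate across all signals, in MB/s.
aggregate_mb_per_s = BYTES_PER_DAY / SECONDS_PER_DAY / 1e6

# Average daily volume per signal, in MB.
per_signal_mb_per_day = BYTES_PER_DAY / SIGNALS / 1e6

# Volume accumulated over a 365-day year, in TB.
yearly_tb = BYTES_PER_DAY * 365 / 1e12

print(f"{aggregate_mb_per_s:.1f} MB/s aggregate")
print(f"{per_signal_mb_per_day:.2f} MB/day per signal")
print(f"{yearly_tb:.1f} TB/year")
```

The per-signal average hides a wide spread, since individual signals range from single scalars to arrays of up to 4 million elements.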

CERN ACCELERATOR LOGGING SERVICE

Control I-IoT data at CERN is dispersed across several data silos

Current system is optimized for real-time serving but not for data exploration:

Find hidden correlations

Anomaly detection

Post-mortem analysis

Root cause analysis (RCA)

Intelligent alarm systems

Etc.
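As one illustration of the kind of exploration listed above, the sketch below flags anomalies in a single logged signal with a trailing-window z-score. This is a generic textbook technique, not the method used at CERN; the function name, window size, threshold, and the synthetic "magnet current" readings are all hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def rolling_zscore_anomalies(values, window=20, threshold=4.0):
    """Return indices whose value deviates from the trailing window
    by more than `threshold` sample standard deviations."""
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(values):
        # Only score a point once a full window of earlier points exists.
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                anomalies.append(i)
        history.append(value)
    return anomalies


# Synthetic readings (arbitrary units): small periodic ripple around 100,
# with one injected spike at index 45.
readings = [100.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(60)]
readings[45] = 103.0

flagged = rolling_zscore_anomalies(readings)
print(flagged)
```

A trailing window keeps the check usable on streaming data, but it also means a spike inflates the window statistics for the next few points, so thresholds need tuning per signal.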

ENHANCE EXPLORATION - AUTONOMOUS TECH

Oracle Autonomous Strategy

Automated creation of required resources, administration, patches, backups, memory handling

Flexibility in provisioning and scaling

Easy solution prototyping

Move fast from PoC to production states

Cost-effective solutions

Hybrid systems integrating DB and external object storage

DATA INTERFACE – NOTEBOOKS

The human factor

SWAN

Jupyter notebooks

Integrated with CERN

GPUs – Cloud

Oracle collaboration
