Dr Jukka Klem, CHEP06: Public Resource Computing at CERN (Philippe Defert, Markku Degerholm, ...)


Upload: kristin-sparks

Post on 08-Jan-2018


DESCRIPTION

Slide 3, Public Resource Computing: also called global computing or volunteer computing. Based on BOINC (Berkeley Open Infrastructure for Network Computing), a platform for distributed computing using volunteered computing resources. Profits from unused CPU cycles for scientific computing.

TRANSCRIPT

Slide 1: Public Resource Computing at CERN
Dr Jukka Klem, CHEP06
Philippe Defert, Markku Degerholm, Francois Grey, Jukka Klem, Juan Antonio Lopez Perez, Eric Mcintosh, Jakob Pedersen, Ignacio Reguero, Frank Schmidt, Ben Segal, Christian Soettrup

Slide 2: Outline
- Public Resource Computing and BOINC
- LHC@home project and applications
- Physics results with LHC@home
- Precision of numerical results
- Statistics
- BOINC and grid computing

Slide 3: Public Resource Computing
- Also called global computing or volunteer computing
- Based on BOINC (Berkeley Open Infrastructure for Network Computing)
- Platform for distributed computing using volunteered computing resources
- Profit from unused CPU cycles for scientific computing

Slide 4: BOINC Infrastructure
- Each project runs a server identified by a master URL (e.g. http://lhcathome.cern.ch)
- Project components: [diagram]

Slide 5: Client Screensaver

Slide 6: Basic Principles
- Communication is initiated by the client
- Security: BOINC uses code signing to prevent distribution of malicious executables. Each project has a key pair for code signing, and the private key is kept on a network-isolated machine.
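The code-signing idea on slide 6 can be illustrated with a deliberately tiny hash-then-sign sketch. This is not BOINC's actual signing code or file format; the key size, the primes, and the function names `sign` and `verify` are assumptions chosen so the example runs with the Python standard library only. The point is the split of roles: signing needs the private exponent (kept offline), while any client can verify with the public key.

```python
import hashlib

# Toy RSA key pair (assumption: far too small to be secure, illustration only).
P = 2**61 - 1                       # Mersenne prime
Q = 2**89 - 1                       # Mersenne prime
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent, stays on the isolated machine


def digest(data: bytes) -> int:
    """Hash the executable and reduce the hash into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Project side: sign the digest with the private exponent."""
    return pow(digest(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Client side: check the signature using only the public key (N, E)."""
    return pow(signature, E, N) == digest(data)


executable = b"bytes of an application binary"
signature = sign(executable)
print(verify(executable, signature))            # untouched download
print(verify(executable + b"!", signature))     # modified download
```

A client that refuses to run anything for which `verify` fails cannot be handed a modified executable, even if the project's web server is compromised, because the server never holds the private exponent.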

Slide 7: Redundant Computing
- Results from different hosts are validated using an application-specific function
- Credit: a numeric measure of how much a user has contributed. Important in motivating users.

Slide 8: BOINC Applications
- Public appeal
- Independent parallelism
- Easiest for a low data/compute ratio
- Usually available at least for Windows and Linux (also for MacOS, Solaris, ...)
- Existing applications in e.g. C, C++, Fortran can run as BOINC applications with small modifications (BOINC API)

Slide 9: Some BOINC Projects
- [...]: look for extraterrestrial life
- Climateprediction.net: study climate change
- [...]: search for gravitational signals
- [...]: protein-related diseases
- [...]: cures for human diseases
- World Community Grid: IBM project
- [...]: African humanitarian causes
- LHC@home: improve the LHC particle accelerator

Slide 10: LHC@home applications
- Main application: SixTrack. Others prepared: ATLAS fast simulation, Garfield, Geant4 (poster presentation)
- The SixTrack application simulates protons as they travel around the LHC ring
- Superconducting magnets generate unwanted multipole field errors, which limit the phase space area available for stable particle motion (the dynamic aperture)
- Each job typically tracks 60 particles for 10^5 or 10^6 turns in the LHC (1-10 hours on a modern PC)
- The SixTrack program is part of the SPEC CPU2000 benchmark suite

Slide 11: Physics Results
- Phase space images of stable particle motion (top) and unstable chaotic motion (bottom)
- Chaotic motion predicts that the particle will be lost from the LHC
- Map out the conditions under which particle motion is stable

Slide 12: Physics Results
- Long-range and head-on beam-beam interactions reduce the dynamic aperture
- Different beam crossing schemes and the effect of triplet errors studied
- Figure: average stable phase space area (Dynamic Aperture, DA) for different tune values
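To make "dynamic aperture" concrete, here is a minimal sketch using the Hénon map, a standard textbook toy model of a ring with one nonlinear kick per turn. It is not SixTrack and not the LHC lattice; the tune value, the loss limit, the scan step, and the function names `survives` and `dynamic_aperture` are all assumptions for illustration. The sketch tracks a particle turn by turn and scans the starting amplitude until a particle is lost.

```python
import math


def survives(x0, tune=0.205, turns=1000, limit=10.0):
    """Track one particle through a Henon map; return False if it is lost.

    Each turn applies a quadratic kick (p -> p + x^2) followed by a
    phase-space rotation by the angle 2*pi*tune.
    """
    mu = 2.0 * math.pi * tune
    c, s = math.cos(mu), math.sin(mu)
    x, p = x0, 0.0
    for _ in range(turns):
        kicked = p + x * x
        x, p = x * c + kicked * s, -x * s + kicked * c
        if abs(x) > limit:          # amplitude blew up: particle counted as lost
            return False
    return True


def dynamic_aperture(step=0.01, max_amp=2.0, **kwargs):
    """Largest starting amplitude (on a coarse grid) below the first loss."""
    last_stable = 0.0
    amp = step
    while amp <= max_amp:
        if not survives(amp, **kwargs):
            return last_stable
        last_stable = amp
        amp += step
    return last_stable


print("estimated dynamic aperture (toy units):", dynamic_aperture())
```

Each call to `survives` is independent of every other, which is the "independent parallelism" that makes this kind of scan a good fit for volunteer computing: different amplitudes, tunes, or error seeds can be handed to different hosts.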

Slide 13: Physics Results
- These studies would not have been possible without LHC@home resources
- A large number of parameters can be carefully studied
- Results used in LHC design to provide more interesting collisions for the experiments

Slide 14: Precision of Numerical Results (1/2)
- LHC@home is a heterogeneous distributed system: different processors and operating systems
- Redundant computing (each job sent to different computers) makes it possible to find differences in results
- The IEEE-754 standard for floating-point arithmetic helps but does not specify everything needed (logarithm, trigonometric functions etc.)
- Getting correct results depends on processor, operating system, programming language, and compiler
- Systems are often optimized for performance

Slide 15: Precision of Numerical Results (2/2)
- If particle motion is chaotic, small errors can lead to large differences in final results
- Some PCs give consistently wrong results (e.g. a 10-year-old desktop PC and one Linux batch PC)
- Many small differences in results found (due to log and exp functions on different processors)
- Solution: link the executable statically to a portable library, crlibm (https://lipforge.ens-lyon.fr/projects/crlibm/). Provides correct results on different platforms with a performance cost of less than 10%.
- Need to be very careful if consistent results are expected from heterogeneous resources

Slide 16: Some Result Statistics
- LHC@home has about [...] active users in 108 countries
- Users contribute about [...] host computers
- >800 CPU years processed for the LHC (assuming 1 CPU = 1 KSfp2K = 2.8 GHz Xeon)
- BOINC projects combined: about [...] users and [...] hosts
- Estimate: 1 billion PCs in operation (less than 0.05% participate)

Slide 17: BOINC and Grids
- Bridges between BOINC and grid have been built for LCG and NorduGrid/ARC grid middleware
- Sending jobs from BOINC to grid is easier; grid to BOINC is more difficult (security)
- BOINC has a lightweight infrastructure, with some limitations on applications

Slide 18: Useful links
- LHC@home server: [...] (Please join the project!)
- Some background information: [...]
- BOINC web site: [...]
- Unofficial BOINC Wiki: [...]

Slide 19: Summary
- Public resource computing projects are built using the BOINC platform
- Can obtain large resources at low cost
- Results very useful for the LHC
- Numerical results are checked with redundant computing. Have to be careful when using heterogeneous resources.
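
The interplay between redundant computing and numerical precision (slides 7, 14 and 15) can be sketched as follows. This is an illustrative model, not the LHC@home validator: the chaotic logistic map stands in for a tracking job, the `rel_error` parameter is a hypothetical way to mimic a host whose math library rounds slightly differently, and exact equality with a quorum of two is an assumed validation rule.

```python
from collections import Counter


def run_job(x0, turns, rel_error=0.0):
    """Stand-in for a tracking job: iterate a chaotic map `turns` times.

    `rel_error` (assumption) models a host whose arithmetic differs by a
    tiny relative amount on every step.
    """
    x = x0
    for _ in range(turns):
        x = 3.9 * x * (1.0 - x) * (1.0 + rel_error)
    return x


def validate(results, quorum=2):
    """Return the result reported identically by at least `quorum` hosts, else None."""
    value, count = Counter(results).most_common(1)[0]
    return value if count >= quorum else None


host_a = run_job(0.3, 200)
host_b = run_job(0.3, 200)                    # same arithmetic, same answer
host_c = run_job(0.3, 200, rel_error=1e-12)   # slightly different rounding

print("hosts A and B agree:", host_a == host_b)
print("host C differs by:", abs(host_c - host_a))
print("validated result:", validate([host_a, host_b, host_c]))
```

Because the motion is chaotic, the per-step discrepancy of one part in 10^12 grows until host C's answer is unrelated to the others, so a tolerance-based comparison would not rescue it. This is why the slides recommend a portable math library that makes results bit-for-bit identical across platforms, after which an exact-match quorum becomes a workable check.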