LSST/DM: Building a Next Generation Survey Data Processing System

CfA Code Coffee | Harvard-Smithsonian Center for Astrophysics | June 4, 2014. LSST/DM: Building a Next Generation Survey Data Processing System. Mario Juric, LSST Data Management Project Scientist. Robyn Allsman, Yusra AlSayyad, Tim Axelrod, Jacek Becla, Andrew Becker, Steve Bickerton, Jim Bosch, Bill Chickering, Andy Connolly, Greg Daues, Gregory Dubois-Felsmann, Mike Freemon, Andy Hanushevsky, Fabrice Jammes, Lynne Jones, Jeff Kantor, Kian-Tat Lim, Dustin Lang, Ron Lambert, Robert Lupton (the Good), Simon Krughoff, Serge Monkewitz, Jon Myers, Russell Owen, Steve Pietrowicz, Ray Plante, Paul Price, Andrei Salnikov, Dick Shaw, Schuyler Van Dyk, Daniel Wang, and the LSST Project Team

Upload: mario-juric

Post on 13-Jul-2015

TRANSCRIPT

Page 1: LSST/DM: Building a Next Generation Survey Data Processing System

CfA Code Coffee | Harvard-Smithsonian Center for Astrophysics | June 4, 2014

LSST/DM: Building a Next Generation Survey Data Processing System

Mario Juric, LSST Data Management Project Scientist

CfA Code Coffee, June 4, 2014

Robyn Allsman, Yusra AlSayyad, Tim Axelrod, Jacek Becla, Andrew Becker, Steve Bickerton, Jim Bosch, Bill Chickering, Andy Connolly, Greg Daues, Gregory Dubois-Felsmann, Mike Freemon, Andy Hanushevsky, Fabrice Jammes, Lynne Jones, Jeff Kantor, Kian-Tat Lim, Dustin Lang, Ron Lambert, Robert Lupton (the Good), Simon Krughoff, Serge Monkewitz, Jon Myers, Russell Owen, Steve Pietrowicz, Ray Plante, Paul Price, Andrei Salnikov, Dick Shaw, Schuyler Van Dyk, Daniel Wang, and the LSST Project Team

Page 2: LSST/DM: Building a Next Generation Survey Data Processing System

A Dedicated Survey Telescope

− A wide (half the sky), deep (24.5/27.5 mag), fast (image the sky once every 3 days) survey telescope. Beginning in 2022, it will repeatedly image the sky for 10 years.

− The LSST is an integrated survey system. The Observatory, Telescope, Camera and Data Management system are all built to support the LSST survey. There’s no PI mode, proposals, or time.

− The ultimate deliverable of LSST is not the telescope, nor the instruments; it is the fully reduced data.
  • All science will come from survey catalogs and images

Telescope → Images → Catalogs

Page 3: LSST/DM: Building a Next Generation Survey Data Processing System

Open Data, Open Source: A Community Resource

− LSST data, including images and catalogs, will be available with no proprietary period to the astronomical community of the United States, Chile, and International Partners

− Alerts to variable sources (“transient alerts”) will be available world-wide within 60 seconds, using standard protocols

− LSST data processing stack will be free software (licensed under the GPL, v3-or-later)

− All science will be done by the community (not the Project!), using LSST’s data products

Page 4: LSST/DM: Building a Next Generation Survey Data Processing System

Why LSST: The Science

Page 5: LSST/DM: Building a Next Generation Survey Data Processing System

LSST: Evolution of Design and Purpose

History

1996-2000: “Dark Matter Telescope”
This project began as a quest to understand cosmology and the Solar System.

2000 - …: “LSST”
Emphasizes a broad range of science from the same multi-wavelength survey data, including unique time domain exploration.

A single telescope, a single data set, can serve to answer a wide swath of science questions.

The evolution of LSST design

Page 6: LSST/DM: Building a Next Generation Survey Data Processing System

LSST: A Deep, Wide, Fast, Optical Sky Survey

8.4 m telescope; 18,000+ deg²; 10 mas astrometry; r < 24.5 (< 27.5 @ 10 yr)

ugrizy; 0.5-1% photometry

3.2 Gpix camera; 30 s exposure / 4 s readout; 15 TB/night; 37 B objects

Imaging the visible sky, once every 3 days, for 10 years (825 revisits)

http://lsst.org

Page 7: LSST/DM: Building a Next Generation Survey Data Processing System

Frontiers of Survey Astronomy

− Time domain science
  • Novae, supernovae, GRBs
  • Source characterization
  • Instantaneous discovery

− Census of the Solar System
  • NEOs, MBAs, Comets
  • KBOs, Oort Cloud

− Mapping the Milky Way
  • Tidal streams
  • Galactic structure

− Dark energy and dark matter
  • Strong lensing
  • Weak lensing
  • Constraining the nature of dark energy

Page 8: LSST/DM: Building a Next Generation Survey Data Processing System

Funding Status

− December 6th, 2013: Passed the NSF Final Design Review; declared ready for Construction!

− January 17th, 2014: FY2014 budget signed, with NSF appropriation allowing for LSST start.

− May 8th, 2014: NSB authorizes NSF Director to start the project.

− Expecting the signing of cooperative agreement and start of construction in July 2014!

Page 9: LSST/DM: Building a Next Generation Survey Data Processing System

Location: Cerro Pachón, Chile

Leveling of El Peñón (the summit of Cerro Pachón)

Page 10: LSST/DM: Building a Next Generation Survey Data Processing System

LSST Observatory (ca. late 2018)

Page 11: LSST/DM: Building a Next Generation Survey Data Processing System


Page 12: LSST/DM: Building a Next Generation Survey Data Processing System

Combined Primary/Tertiary Mirror

− Primary-Tertiary was cast in the spring of 2008.
− Fabrication underway at the Steward Observatory Mirror Lab; completion by the end of 2014.

Thin Meniscus Secondary

− Secondary substrate fabricated by Corning in 2009.
− Currently in storage waiting for construction.

Page 13: LSST/DM: Building a Next Generation Survey Data Processing System

LSST Camera

Parameter    Value
Diameter     1.65 m (5'-5")
Length       3.7 m
Weight       3000 kg
F.P. Diam    634 mm

− 3.2 Gigapixels
− 0.2 arcsec pixels
− 9.6 square degree FOV
− 2 second readout
− 6 filters
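As a rough consistency check on the camera numbers above, the sketch below derives the sky area covered by 3.2 gigapixels at 0.2 arcsec per pixel, and the raw size of a single exposure. The 2 bytes per pixel figure is an assumption made for illustration (it is not stated on the slide), and the function names are hypothetical.

```python
ARCSEC_PER_DEG = 3600.0

def pixel_area_deg2(n_pixels, pixel_scale_arcsec):
    """Total sky area (deg^2) covered by n_pixels square pixels."""
    one_pixel_deg2 = (pixel_scale_arcsec / ARCSEC_PER_DEG) ** 2
    return n_pixels * one_pixel_deg2

def raw_exposure_bytes(n_pixels, bytes_per_pixel=2):
    """Uncompressed size of one exposure; bytes_per_pixel is an assumption."""
    return n_pixels * bytes_per_pixel

area = pixel_area_deg2(3.2e9, 0.2)
size_gb = raw_exposure_bytes(3.2e9) / 1e9

print(f"pixel-covered area: {area:.2f} deg^2")
print(f"raw exposure size:  {size_gb:.1f} GB")
```

The pixel-covered area comes out close to the quoted 9.6 square degree field of view; the small difference is expected because the pixel count and pixel scale are rounded values.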

Page 14: LSST/DM: Building a Next Generation Survey Data Processing System

Bandpasses: u, g, r, i, z, y

Page 15: LSST/DM: Building a Next Generation Survey Data Processing System
Page 16: LSST/DM: Building a Next Generation Survey Data Processing System

Community: LSST Science Collaborations

Next meeting: August 11-15, 2014, Phoenix, AZ (http://ls.st/hf9)

2012 All Hands Meeting Group Photo, Aug 13-17 2012, Marana, AZ

Page 17: LSST/DM: Building a Next Generation Survey Data Processing System

LSST From the Astronomer’s Perspective

Level 1

− A stream of ~10 million time-domain events per night, detected and transmitted to event distribution networks within 60 seconds of observation.

− A catalog of orbits for ~6 million bodies in the Solar System.

Level 2

− A catalog of ~37 billion objects (20B galaxies, 17B stars), ~7 trillion observations (“sources”), and ~30 trillion measurements (“forced sources”), produced annually, accessible through online databases.

− Deep co-added images.

Level 3

− Services and computing resources at the Data Access Centers to enable user-specified custom processing and analysis.

− Software and APIs enabling development of analysis codes.
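The catalog sizes quoted above can be related to each other with simple ratios. The sketch below uses only the counts given on the slide; the variable names are hypothetical, and the reading of the result (that forced photometry measures every object in roughly every visit) is an interpretation, not a statement from the talk.

```python
# Counts as quoted on the slide (approximate).
n_objects = 37e9          # objects in the catalog
n_sources = 7e12          # per-visit detections ("sources")
n_forced_sources = 30e12  # per-visit forced measurements ("forced sources")

sources_per_object = n_sources / n_objects
forced_per_object = n_forced_sources / n_objects

print(f"sources per object:        {sources_per_object:.0f}")
print(f"forced sources per object: {forced_per_object:.0f}")
```

The forced-source ratio lands near the 825 revisits quoted earlier in the deck, while the source ratio is several times smaller, which is what one would expect if many objects are too faint to be detected in a single visit.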

Page 18: LSST/DM: Building a Next Generation Survey Data Processing System

LSST Data Management System (from readout to delivery to the user)

 

Page 19: LSST/DM: Building a Next Generation Survey Data Processing System

LSST Data Management: Roles

− Archive Raw Data: Receive the incoming stream of images generated by the Camera system and archive the raw images.

− Process to Data Products: Detect and alert on transient events within one minute of visit acquisition. Approximately once per year create and archive a Data Release, a static self-consistent collection of data products generated from all survey data taken from the date of survey initiation to the cutoff date for the Data Release.

− Publish: Make all LSST data available through an interface that uses community-accepted standards, and facilitate user data analysis and production of user-defined data products at Data Access Centers (DACs) and external sites.

Page 20: LSST/DM: Building a Next Generation Survey Data Processing System

HQ Site
  • Science Operations
  • Observatory Management
  • Education and Public Outreach

Archive Site
  • Archive Center: Alert Production, Data Release Production, Calibration Products Production, EPO Infrastructure, Long-term Storage (copy 2)
  • Data Access Center: Data Access and User Services

Summit and Base Sites
  • Telescope and Camera
  • Data Acquisition
  • Crosstalk Correction
  • Long-term storage (copy 1)
  • Chilean Data Access Center

Dedicated Long Haul Networks
  • Two redundant 40 Gbit links from La Serena to Champaign, IL (existing fiber)

Page 21: LSST/DM: Building a Next Generation Survey Data Processing System

Infrastructure: Petascale Computing, Gbit Networks

Long Haul Networks to transport data from Chile to the U.S.
  • 200 Gbps from Summit to La Serena (new fiber)
  • 2x40 Gbit (minimum) for La Serena to Champaign, IL (protected, existing fiber)

Archive Site and U.S. Data Access Center: NCSA, Champaign, IL

Base Site and Chilean Data Access Center: La Serena, Chile

The computing cluster at the LSST Archive (at NCSA) will run the processing pipelines.
  • Single-user, single-application, dedicated data center
  • Process images in real-time to detect changes in the sky
  • Produce annual data releases
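To put the link capacities in context, the sketch below estimates how long it would take to move one night of data (15 TB, as quoted earlier in the deck) over the La Serena to Champaign links. It assumes the full nominal bandwidth is usable unless an efficiency factor is given; that factor and the function name are hypothetical.

```python
def transfer_time_s(data_bytes, link_gbps, efficiency=1.0):
    """Seconds needed to move data_bytes over a link of link_gbps Gbit/s.

    efficiency (0..1) is a hypothetical allowance for protocol overhead.
    """
    bits = data_bytes * 8
    return bits / (link_gbps * 1e9 * efficiency)

night_bytes = 15e12  # 15 TB per night

one_link = transfer_time_s(night_bytes, 40)    # a single 40 Gbit link
both_links = transfer_time_s(night_bytes, 80)  # both links together

print(f"one 40 Gbit link:  {one_link / 60:.0f} min")
print(f"two 40 Gbit links: {both_links / 60:.0f} min")
```

Under these idealized assumptions a full night of data fits through a single link in under an hour, which is consistent with the second link being described as redundant.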

Page 22: LSST/DM: Building a Next Generation Survey Data Processing System

“Applications”: Scientific Core of LSST DM

− Applications carry core scientific algorithms that process or analyze raw LSST data to generate output Data Products

− Variety of processing
  • Image processing
  • Measurement of source properties
  • Associating sources across space and time, e.g. for tracking solar system objects

− Applications framework layer (afw; not shown) allows them to be written in a high-level language

Page 23: LSST/DM: Building a Next Generation Survey Data Processing System

Middleware Layer: Isolating Hardware, Orchestrating Software

Enabling execution of science pipelines on hundreds of thousands of cores.
  • Frameworks to construct pipelines out of basic algorithmic components
  • Orchestration of execution on thousands of cores
  • Control and monitoring of the whole DM System

Isolating the science pipelines from details of underlying hardware
  • Services used by applications to access/produce data and communicate
  • "Common denominator" interfaces handle changing underlying technologies
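The idea of building pipelines out of basic components and then orchestrating their execution can be sketched in a few lines. This is a toy illustration only: the `Pipeline` class, the stage functions, and the data layout are all hypothetical and are not the LSST middleware API, and a local thread pool stands in for a cluster scheduler.

```python
from concurrent.futures import ThreadPoolExecutor

class Pipeline:
    """Chain of stages; each stage takes and returns a data dictionary."""

    def __init__(self, *stages):
        self.stages = stages

    def run(self, data):
        for stage in self.stages:
            data = stage(data)
        return data

def remove_bias(data):
    # Subtract a constant bias level from every pixel.
    pixels = [p - data["bias"] for p in data["pixels"]]
    return {**data, "pixels": pixels}

def detect(data, threshold=50):
    # Count pixels above a fixed threshold.
    n = sum(1 for p in data["pixels"] if p > threshold)
    return {**data, "n_detections": n}

pipeline = Pipeline(remove_bias, detect)

exposures = [
    {"visit": 1, "bias": 100, "pixels": [101, 99, 400, 102]},
    {"visit": 2, "bias": 100, "pixels": [300, 98, 250, 100]},
]

# "Orchestration": run the same pipeline over independent exposures in parallel.
with ThreadPoolExecutor(max_workers=2) as pool:
    results = list(pool.map(pipeline.run, exposures))

print([(r["visit"], r["n_detections"]) for r in results])
```

Because the stages only see a dictionary, the same pipeline definition could be handed to a different executor without changing the science code, which is the kind of isolation the slide describes.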

Page 24: LSST/DM: Building a Next Generation Survey Data Processing System

Database and Science UI: Delivering to Users

Massively parallel, distributed, fault-tolerant relational database.
  • To be built on existing, robust, well-understood, technologies (MySQL and xrootd)
  • Commodity hardware, open source
  • Advanced prototype in existence (qserv)

Science User Interface to enable the access to and analysis of LSST data
  • Web and machine interfaces to LSST databases
  • Visualization and analysis capabilities
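A minimal sketch of the scatter-gather idea behind a distributed relational database follows: the catalog is partitioned by sky position into chunks, each chunk lives in its own database, and a query is sent to every chunk and the partial results merged. This is not qserv's design or API; in-memory SQLite databases stand in for worker nodes, and the table layout, chunking rule, and function names are all assumptions made for illustration.

```python
import sqlite3

N_CHUNKS = 4
CHUNK_WIDTH_DEG = 360.0 / N_CHUNKS

def chunk_id(ra_deg):
    """Map a right ascension to a chunk (simple RA stripes)."""
    return int((ra_deg % 360.0) // CHUNK_WIDTH_DEG)

def make_shards():
    shards = []
    for _ in range(N_CHUNKS):
        conn = sqlite3.connect(":memory:")
        conn.execute(
            "CREATE TABLE Object (id INTEGER, ra REAL, decl REAL, r_mag REAL)"
        )
        shards.append(conn)
    return shards

def insert_object(shards, obj_id, ra, decl, r_mag):
    shards[chunk_id(ra)].execute(
        "INSERT INTO Object VALUES (?, ?, ?, ?)", (obj_id, ra, decl, r_mag)
    )

def query_all(shards, sql, params=()):
    """Scatter the same SQL to every shard and gather the rows."""
    rows = []
    for conn in shards:
        rows.extend(conn.execute(sql, params).fetchall())
    return rows

shards = make_shards()
catalog = [
    (1, 10.0, -5.0, 21.3),
    (2, 100.0, 2.0, 24.9),
    (3, 190.0, -30.0, 19.8),
    (4, 359.9, 0.5, 23.1),
]
for row in catalog:
    insert_object(shards, *row)

bright = sorted(
    query_all(shards, "SELECT id, r_mag FROM Object WHERE r_mag < ?", (24.0,))
)
total = sum(n for (n,) in query_all(shards, "SELECT COUNT(*) FROM Object"))

print(bright)
print(total)
```

Aggregates such as `COUNT(*)` are computed per shard and combined afterwards, which is why the final sum is taken outside the SQL.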

Page 25: LSST/DM: Building a Next Generation Survey Data Processing System

Going Where the Talent is: One Distributed Team

Infrastructure

Middleware

Core Algorithms (“Apps”)

Database

UI

Mgmt, I&T, and Science QA

Page 26: LSST/DM: Building a Next Generation Survey Data Processing System

The LSST Software Stack (science pipelines, middleware, database, user interfaces)

“Enabling LSST science by creating a well documented, state-of-the-art, high-performance, scalable, multi-camera, open source, O/IR survey data processing and analysis system.”

Page 27: LSST/DM: Building a Next Generation Survey Data Processing System

LSST Science Pipelines

− 02C.01.02.01/02. Data Quality Assessment Pipelines (slides by Juric)
− 02C.01.[02.01.04,04.01,04.02] Calibration Pipelines (slides by Axelrod, Yoachim)
− 02C.03.01. Single-Frame Processing Pipeline (slides by Krughoff, Lupton)
− 02C.03.02. Association Pipeline (slides by Lupton)
− 02C.03.03. Alert Generation Pipeline (slides by Becker)
− 02C.03.04. Image Differencing Pipeline (slides by Becker)
− 02C.03.06. Moving Object Pipeline (slides by Jones)
− 02C.04.03. PSF Estimation Pipeline (slides by Lupton)
− 02C.04.04. Image Coaddition Pipeline (slides by AlSayyad)
− 02C.04.05. Deep Detection Pipeline (slides by Lupton)
− 02C.04.06. Object Characterization Pipeline (slides by Lupton, Bosch)
− 02C.01.02.03. Science Pipeline Toolkit (slides by Dubois-Felsmann)
− 02C.03.05/04.07 Application Framework (slides by Lupton)

Calibration reviewed in July ’13, by Wood-Vasey et al.

Pipelines reviewed in Sep. ’13, by Magnier et al.

Level 1    Level 2    L3

Data Management Applications Design (LDM-151)

Page 28: LSST/DM: Building a Next Generation Survey Data Processing System

Implementation Strategy: Transfer Know-how, not Code

− Difficulty adapting existing public codes to LSST requirements (AstroMatic suite, PHOTO, Elixir, IRAF-based pipelines, etc.)
  • Need to run efficiently at scale
  • Need to be flexible (plugging/unplugging of algorithms at runtime)
  • Need to have it developed by a large team (20+ scientists and programmers)
  • Need to be maintainable over ~25 years of R&D, Construction, and Survey Operations
  • Need to run on a variety of hardware and software platforms
  • Need to have logging and provenance built into the design

− Early on (~2006), a decision was made to (largely) transfer the scientific know-how, but not code.
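The requirement of plugging and unplugging algorithms at runtime is commonly met with a registry that maps configuration strings to implementations. The sketch below is one generic way to do that in Python; the registry, the decorator, and the configuration key are hypothetical and do not reproduce the LSST stack's own configuration system.

```python
import statistics

ALGORITHMS = {}

def register(name):
    """Decorator that records an algorithm under a configuration name."""
    def decorator(func):
        ALGORITHMS[name] = func
        return func
    return decorator

@register("mean")
def mean_background(pixels):
    return sum(pixels) / len(pixels)

@register("median")
def median_background(pixels):
    return statistics.median(pixels)

def estimate_background(pixels, config):
    # The algorithm is chosen by name at run time, not at import time.
    algorithm = ALGORITHMS[config["background.algorithm"]]
    return algorithm(pixels)

pixels = [10, 11, 9, 10, 500]  # one bright outlier
robust = estimate_background(pixels, {"background.algorithm": "median"})
naive = estimate_background(pixels, {"background.algorithm": "mean"})

print(robust, naive)
```

Swapping the estimator is then a one-line configuration change, and a new algorithm can be added from a separate package simply by registering it.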

Page 29: LSST/DM: Building a Next Generation Survey Data Processing System

Maintainable Design / Language Choices

− LSST software stack is largely written from scratch, in Python, unless computational demands require the use of C++
  • C++:
    - Computationally intensive code
    - Made available to Python via SWIG
  • Python:
    - All high-level code
    - Prefer Python to C++ unless performance demands otherwise

− Modularity
  • Virtually everything is a Python module.
  • ~60 packages (git repositories, ~corresponding to python packages)

− Build system: scons; Version control: git; Package management: EUPS

Page 30: LSST/DM: Building a Next Generation Survey Data Processing System

Modular Architecture

Application Framework (comp. intensive C++, SWIG-wrapped into Python)

Middleware (I/O, configuration, …)

External C/C++ Libraries (Boost, FFTW, Eigen, CUDA ..)

External Python Modules (numpy, pyfits, matplotlib, …)

Camera Abstraction Layer (obs_* packages)

Measurement Algorithms (meas_*)

Tasks (ISR, Detection, Co-adding, …)

Command-line driver scripts

Cluster execution middleware

…

Red: Mostly C++ (but Python wrapped); Blue: Mostly Python; Black: External Libraries

Page 31: LSST/DM: Building a Next Generation Survey Data Processing System

Module Dependency Tree

eigen, xpa, fftw, implicitProducts, minuit2, afwdata, cuda_toolkit, pysqlite, mysqlclient, libpng, freetype, astrometry_net, suprime_data, testdata_subaru, distEst, hscAstrom, astrometry_net_data, zlib, tcltk

cfitsio, doxygen, gsl, python, sqlite, swig

boost, mysqlpython, numpy, scons, wcslib

matplotlib, pyfits

sconsUtils

base

ndarray, pex_exceptions

utils

daf_base, geom

pex_logging, pex_policy

daf_persistence, pex_config

afw, obs_test

coadd_utils, pipe_base, skymap, skypix, testing_displayQA

coadd_chisquared, daf_butlerUtils, meas_algorithms

ip_diffim, ip_isr, meas_astrom, meas_extensions_photometryKron, meas_extensions_rotAngle, meas_extensions_shapeHSM, obs_lsstSim, obs_subaru

pipe_tasks

Page 32: LSST/DM: Building a Next Generation Survey Data Processing System

Module Dependency Tree

(Same dependency graph as on the previous slide, annotated with package groups.)

External Tools and Libraries

AFW

Camera abstractions

Measurement Algorithms

Top-level scripts
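A dependency tree like this one determines the order in which packages have to be built: every package must come after everything it depends on. The sketch below shows that ordering with a small topological sort. The package names are taken from the figure, but the specific edges listed here are illustrative assumptions, not read off the graph, and `build_order` is a hypothetical helper rather than part of scons or EUPS.

```python
def build_order(deps):
    """Return packages ordered so each appears after all of its dependencies."""
    remaining = {pkg: set(reqs) for pkg, reqs in deps.items()}
    # Make sure packages that only appear as dependencies are included.
    for reqs in deps.values():
        for req in reqs:
            remaining.setdefault(req, set())

    order = []
    while remaining:
        # Packages whose dependencies have all been built already.
        ready = sorted(pkg for pkg, reqs in remaining.items() if not reqs)
        if not ready:
            raise ValueError("dependency cycle detected")
        for pkg in ready:
            order.append(pkg)
            del remaining[pkg]
        for reqs in remaining.values():
            reqs.difference_update(ready)
    return order

# Illustrative edges only (package -> packages it needs).
deps = {
    "utils": ["base"],
    "ndarray": ["base"],
    "afw": ["utils", "ndarray"],
    "pipe_base": ["afw"],
    "meas_algorithms": ["afw"],
    "pipe_tasks": ["pipe_base", "meas_algorithms"],
}

order = build_order(deps)
print(order)
```

Packages that become ready in the same round have no dependency on each other, so a build system is free to compile them in parallel.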

Page 33: LSST/DM: Building a Next Generation Survey Data Processing System

(Very Basic) SExtractor with lsst primitives (1/2)

Page 34: LSST/DM: Building a Next Generation Survey Data Processing System

(Very Basic) SExtractor with lsst primitives (2/2)
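The code shown on these two slides is not preserved in the transcript. As a stand-in, the sketch below shows the basic steps a "very basic SExtractor" performs: estimate the background and noise, threshold the image, group connected pixels into sources, and measure each one. It is written in plain Python and deliberately does not use the LSST stack; every function name and parameter here is hypothetical, and it should not be read as the code from the talk.

```python
import statistics

def detect_sources(image, nsigma=5.0):
    """Threshold-and-label detection on a 2-D list of pixel values."""
    flat = [v for row in image for v in row]
    background = statistics.median(flat)
    # Robust noise estimate from the median absolute deviation.
    mad = statistics.median(abs(v - background) for v in flat)
    sigma = 1.4826 * mad
    if sigma == 0:
        sigma = 1.0  # arbitrary fallback for a perfectly flat image
    threshold = background + nsigma * sigma

    ny, nx = len(image), len(image[0])
    seen = [[False] * nx for _ in range(ny)]
    sources = []
    for y0 in range(ny):
        for x0 in range(nx):
            if seen[y0][x0] or image[y0][x0] <= threshold:
                continue
            # Flood fill over 4-connected pixels above the threshold.
            stack, pixels = [(y0, x0)], []
            seen[y0][x0] = True
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    yy, xx = y + dy, x + dx
                    if (0 <= yy < ny and 0 <= xx < nx
                            and not seen[yy][xx] and image[yy][xx] > threshold):
                        seen[yy][xx] = True
                        stack.append((yy, xx))
            # Background-subtracted flux and flux-weighted centroid.
            flux = sum(image[y][x] - background for y, x in pixels)
            cx = sum(x * (image[y][x] - background) for y, x in pixels) / flux
            cy = sum(y * (image[y][x] - background) for y, x in pixels) / flux
            sources.append({"x": cx, "y": cy, "flux": flux, "npix": len(pixels)})
    return sources

# Synthetic 8x8 image: a checkerboard "noise" floor plus two sources.
image = [[10 + 2 * ((x + y) % 2) for x in range(8)] for y in range(8)]
image[2][2], image[2][3] = 100, 60   # a two-pixel source
image[6][5] = 80                     # a single-pixel source

sources = detect_sources(image)
for s in sources:
    print(s)
```

A production detector would also convolve with the PSF, deblend overlapping sources, and propagate variance; those steps are omitted here to keep the control flow visible.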

Page 35: LSST/DM: Building a Next Generation Survey Data Processing System


Page 36: LSST/DM: Building a Next Generation Survey Data Processing System

Current Status: Advanced Prototypes

− 8-year prototyping effort
  • 8 software releases (Data Challenges)
  • Status: A rapidly maturing state-of-the-art astronomical data reduction system
    - ~SDSS/SExtractor level quality of reductions
    - Most recently tested by building co-adds using SDSS Stripe 82 data
    - Used in commissioning of the Hyper Suprime-Cam Survey on Subaru

− Prototyped Features:
  • Instrumental signature removal
  • Single-frame processing
  • Point source photometry
  • Extended source photometry (model fitting)
  • Deblender
  • Co-addition of images
  • Image differencing
  • Object characterization on multi-epoch data (StackFit/MultiFit)
  • …

Planning to begin addressing it over the next few months.

Page 37: LSST/DM: Building a Next Generation Survey Data Processing System

New Algorithms: Background-matched co-add of SDSS Stripe 82 in the vicinity of M2. Background matching preserves diffuse structures. Generated with LSST pipeline prototypes.

Figure: 5 sq. deg. background-matched coadd composite (g, r, i), ~55 epochs. Region: Aqr, Galactic lat = -35.0

http://moe.astro.washington.edu/sdss/

Slide: Yusra AlSayyad

Page 38: LSST/DM: Building a Next Generation Survey Data Processing System

Streams in LSST-reprocessed SDSS Stripe 82

Stripe 82 background-matched coadds built with LSST Data Management stack (http://moe.astro.washington.edu)

http://moe.astro.washington.edu/sdss/

Page 39: LSST/DM: Building a Next Generation Survey Data Processing System

Example: Forced Photometry on SDSS Stripe 82

Forced Photometry

For every detection in the deep co-add, perform PSF photometry on individual frames (ugriz). Note that the majority of these will be below the single-frame SNR detection threshold.

Averaging those fluxes allows one to go deeper.

Left: comparison of Ivezic et al. (2004) w and y color loci; single frame vs. deep catalog.
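Why averaging forced fluxes reaches deeper can be shown with an inverse-variance weighted mean: for N measurements with similar errors, the uncertainty of the mean shrinks as 1/sqrt(N). In the sketch below the per-epoch flux and error are invented for illustration, the epoch count of 55 is borrowed from the Stripe 82 coadd figure earlier in the deck, and the function names are hypothetical.

```python
import math

def combine_fluxes(fluxes, sigmas):
    """Inverse-variance weighted mean flux and its uncertainty."""
    weights = [1.0 / s ** 2 for s in sigmas]
    wsum = sum(weights)
    mean = sum(w * f for w, f in zip(weights, fluxes)) / wsum
    return mean, math.sqrt(1.0 / wsum)

def depth_gain_mag(n_epochs):
    """Magnitudes gained by averaging n_epochs equal-noise measurements."""
    return 2.5 * math.log10(math.sqrt(n_epochs))

n_epochs = 55
fluxes = [2.0] * n_epochs   # each epoch: SNR = 2, below a 5-sigma cut
sigmas = [1.0] * n_epochs

mean_flux, mean_sigma = combine_fluxes(fluxes, sigmas)
snr = mean_flux / mean_sigma

print(f"combined SNR: {snr:.1f}")
print(f"depth gain:   {depth_gain_mag(n_epochs):.2f} mag")
```

A source at SNR 2 in every single frame would never be detected on its own, yet the combined measurement is well above a 5-sigma threshold.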

Page 40: LSST/DM: Building a Next Generation Survey Data Processing System

Winter 2014 Software Release

Installing

    curl -O http://sw.lsstcorp.org/eupspkg/newinstall.sh
    bash newinstall.sh

• Supported platforms (platforms we regularly build on; generally builds on any Linux/BSD)
  - RHEL 6
  - OS X 10.8 Mountain Lion
  - OS X 10.9 Mavericks

 

Page 41: LSST/DM: Building a Next Generation Survey Data Processing System


Page 42: LSST/DM: Building a Next Generation Survey Data Processing System

WARNING! ADVERTENCIA! AVERTISSEMENT!

THIS IS STILL NOT A FINISHED, POLISHED, READY-TO-USE END-USER PRODUCT! BEFORE DOWNLOADING, PLEASE MAKE SURE TO READ THE DM STACK FAQ:

http://dev.lsstcorp.org/trac/wiki/DM/Policy/UsingDMCode/FAQ

KEY POINTS:
- POOR DOCUMENTATION
- YOU’RE DOWNLOADING UNSUPPORTED, PROTOTYPE, CODE
- THIS CODE WILL NOT WORK OUT OF THE BOX FOR CAMERAS OTHER THAN LSST (AND SDSS).
- EXPECT TO WRITE SOME PYTHON CODE

Page 43: LSST/DM: Building a Next Generation Survey Data Processing System

The Big Picture: Preparing for the Data Driven Astronomy of the Next Decade (and beyond)

Page 44: LSST/DM: Building a Next Generation Survey Data Processing System


Page 45: LSST/DM: Building a Next Generation Survey Data Processing System


 

Page 46: LSST/DM: Building a Next Generation Survey Data Processing System

“Astro 2020”: Rise of the Machines

− We’re witnessing a change in how astronomy is done, and the technical knowledge and tools needed to do it.
  • The rise of big projects and end to data scarcity
  • The rise of systematics limited science
  • The rise of open (source), (massively) collaborative, science

− Consequences
  • Ability to collect data has outstripped the ability to analyze it
    - Extraction of features from the data (“image processing”)
    - Mining of knowledge from the data (“data mining”)
  • We are critically dependent on computing infrastructure and software/algorithm research for astronomical progress
    - Yet we don’t generally acknowledge, encourage, or teach it

− Challenges
  • Elevating software engineering to a footing equal to mathematics?
    - Learn-by-osmosis not sufficient any more
  • T(construction) >> T(discovery)
    - Research becoming more data driven
    - Broad interests in astrophysics
    - Statistics, CS, software engineering, etc.
  • Software reusability
    - Increasing complexity makes perpetual wheel reinventions infeasible (and, honestly, silly…)

Page 47: LSST/DM: Building a Next Generation Survey Data Processing System

LSST: Helping Build the Common Codebase for the Next Quarter Century

− LSST software will be general purpose and highly reusable by design.
  • Necessary to deal with real-world hardware
  • Necessary to be able to process precursor data
  • Necessary to enable science (“Level 3”) software to be written on top of it

− Opportunities for using LSST-derived code on other data sets
  • More work ahead, but becoming a state of the art, well supported, codebase
  • Possibilities: SDSS, CFHT-LS, PanSTARRS, HSC, DES, WFIRST, Euclid, …
  • Good basis for analysis frameworks (LSST DESC)
  • Leveraging a 100M+ NSF investment in large survey data management

− The benefits feed back to LSST: more users, fewer bugs, better understanding, shorter path to science.

Page 48: LSST/DM: Building a Next Generation Survey Data Processing System

LSST: A Piece of the Puzzle

− LSST can help position us for the future in two ways
  • With code (see previous slide)
  • With people/culture

− Software Development Culture
  • We will run the software effort as an open source project with reusability in mind
    - A source tarball at the very end is not useful!
    - Open bug trackers, mailing lists, repositories
    - Still have a job to do! But that doesn’t mean we must do it in a closed, insulated, manner!
  • Think Fedora Project/RedHat, Android/Google, Debian/Ubuntu/Mint
  • Use what works: numpy, scipy, astropy, etc…
    - Improve upstream rather than fork!
    - Where we run into problems: poor software engineering, performance issues, licenses
  • Startup mentality: excellence wins, agile process, continuous change & learning, collaborative spirit, sense of urgency and excitement.

− People
  • We will have 40+ people working on LSST Data Management over (1)8+ yrs
    - Creating a career path for software instrumentalists
  • We can help train a whole generation of “data driven astronomers”
    - Imparting the know-how needed to make the best use of the next generation of surveys

Page 49: LSST/DM: Building a Next Generation Survey Data Processing System

@LSST @mjuric