LSST Data Science Fellowship Program BigSkyEarth meeting October 24, 2016, Sorrento, Italy Željko Ivezić Department of Astronomy University of Washington With special thanks to: Adam Miller Lucianne Walkowicz DSFP Advisory Council

LSST Data Science Fellowship Program

BigSkyEarth meeting, October 24, 2016, Sorrento, Italy

Željko Ivezić, Department of Astronomy, University of Washington

With special thanks to: Adam Miller, Lucianne Walkowicz, DSFP Advisory Council

Outline

• Motivation for the Program
  - large data sets, in particular LSST
  - “Big Data” problems
  - inadequate astronomy curricula

• Organization
  - multi-year program
  - open source philosophy

A catalog of 20 billion stars and 20 billion galaxies with exquisite photometry, astrometry and image quality!

More information at www.lsst.org and arXiv:0805.2366

LSST in one sentence: An optical/near-IR survey of half the sky in ugrizy bands to r~27.5 based on ~1000 visits over a 10-year period:

LSST: a digital color movie of the Universe...

r~27.5 corresponds to 3.6x10^-31 erg/s/cm^2/Hz (36 nJy): 100x fainter than SDSS
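The depth and flux figures on this slide are linked by the AB magnitude definition. The sketch below is a minimal illustration, assuming that r~27.5 is an AB magnitude (the slide does not say so explicitly) and using the standard AB zero point of 3631 Jy. The function names are hypothetical helpers, not part of any LSST software.

```python
import math

# Standard AB zero point: a magnitude-0 source has a flux density of 3631 Jy.
AB_ZERO_POINT_JY = 3631.0


def ab_mag_to_njy(mag):
    """Convert an AB magnitude to a flux density in nanojansky."""
    return AB_ZERO_POINT_JY * 10 ** (-0.4 * mag) * 1e9


def njy_to_cgs(flux_njy):
    """Convert nanojansky to erg/s/cm^2/Hz (1 Jy = 1e-23 erg/s/cm^2/Hz)."""
    return flux_njy * 1e-9 * 1e-23


def flux_ratio_to_mag_difference(ratio):
    """Magnitude difference corresponding to a given flux ratio."""
    return 2.5 * math.log10(ratio)


# Assumption: the quoted depth r~27.5 is on the AB system.
depth_njy = ab_mag_to_njy(27.5)
depth_cgs = njy_to_cgs(depth_njy)
mag_gap = flux_ratio_to_mag_difference(100.0)

print(f"r = 27.5 -> {depth_njy:.1f} nJy = {depth_cgs:.2e} erg/s/cm^2/Hz")
print(f"A factor of 100 in flux is {mag_gap:.1f} magnitudes")
```

Under that assumption the conversion reproduces the roughly 36 nJy quoted on the slide. The "100x fainter" factor corresponds to a 5-magnitude gap, which would put the implied SDSS comparison depth near r~22.5.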

LSST Science Themes

• Dark matter, dark energy, cosmology (spatial distribution of galaxies, gravitational lensing, supernovae, quasars)

• Time domain (cosmic explosions, variable stars)

• The Solar System structure (asteroids)

• The Milky Way structure (stars)

LSST Science Book: arXiv:0912.0201. Summarizes LSST hardware, software, and observing plans, science enabled by LSST, and educational and outreach opportunities. 245 authors, 15 chapters, 600 pages.

First light: 2019

LSST From the User’s Perspective: A Data Stream, a Database, and a (small) Cloud

− A stream of ~10 million time-domain events per night, detected and transmitted to event distribution networks within 60 seconds of observation.

− A catalog of orbits for ~6 million bodies in the Solar System.

− A catalog of ~37 billion objects (20B galaxies, 17B stars), ~7 trillion single-epoch detections (“sources”), and ~30 trillion forced sources, produced annually, accessible through online databases.

− Deep co-added images.

− Services and computing resources at the Data Access Centers to enable user-specified custom processing and analysis.

− Software and APIs enabling development of analysis codes.

Level 1: Nightly Alert Stream

Level 2: Yearly Data Releases

Level 3: Community Services

LSST Data Products: see http://ls.st/dpdd
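To get a feel for the scale of these data products, the quoted counts can be combined in a quick back-of-the-envelope calculation. This is only a rough sketch: the catalog and alert numbers are taken from the list above, while the 10-hour observing night is an assumption introduced here to turn the nightly alert count into a rate.

```python
# Figures quoted in the list of data products above.
N_GALAXIES = 20e9
N_STARS = 17e9
N_SOURCES = 7e12          # single-epoch detections
N_FORCED_SOURCES = 30e12  # forced-photometry measurements
ALERTS_PER_NIGHT = 10e6

# Assumption (not from the slide): a 10-hour observing night.
ASSUMED_NIGHT_HOURS = 10.0

total_objects = N_GALAXIES + N_STARS
sources_per_object = N_SOURCES / total_objects
forced_per_object = N_FORCED_SOURCES / total_objects
alerts_per_second = ALERTS_PER_NIGHT / (ASSUMED_NIGHT_HOURS * 3600.0)

print(f"Total objects: {total_objects:.2e}")
print(f"Average single-epoch detections per object: {sources_per_object:.0f}")
print(f"Average forced sources per object: {forced_per_object:.0f}")
print(f"Average alert rate over the assumed night: {alerts_per_second:.0f} per second")
```

With these inputs each object averages roughly 190 single-epoch detections and about 800 forced measurements. The alert stream averages a few hundred events per second over the assumed night.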

What is the LSST Data Science Fellowship Program?

- there is a disconnect between the skills that are needed for an era rich in data and those that we teach to incoming graduate students and early-career postdocs

- DSFP is a program that will enhance a traditional astrophysics curriculum; extending a strong physics education to one that encompasses computational techniques, programming skills, data management, statistics, and data analysis; it is inspired by the EU Gaia Research for European Astronomy Training (GREAT) Network

- DSFP is a program tailored to the science of all of the LSST science collaborations

- The primary DSFP goal is to teach the skills required for LSST science that are not easily addressed by current astrophysics graduate programs (i.e. to enhance and not replace the current physics based curricula).

LSST DSFP Goals

1) Educate next generation of astro data-scientists

2) Supplement existing graduate curriculum

3) Develop an inclusive and welcoming community

4) Create a unique program

5) Foster new collaborations

6) Produce well-documented, open resource - available to all

7) Expose wider community to LSST tools

8) Cultivate ambassadors for LSST and the LSSTC DSFP

DSFP Curriculum

1) Data Science Basics: an introduction to statistics, machine learning, and information theory including a brief introduction to common tools used in software engineering (e.g. code repositories, issue tracking, GitHub)

2) Image processing: an introduction to and hands-on experience with the LSST image processing tools including the analysis of existing data sets

3) Unsupervised machine learning: the application and use of density estimation and clustering techniques with astronomical data

4) Supervised machine learning and time series: classification of multispectral data and time series analysis of variable sources with incomplete and noisy sampling

5) Scalable programming and data management: use of databases and parallel programming that can enable analyses to scale to large data sets

6) Data visualization: visualization of LSST images and catalog data including interactive visualization

Communication & Collaboration
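Item 4 of the curriculum mentions time series analysis of variable sources with incomplete and noisy sampling. As a flavor of that topic, here is a minimal sketch of a period search using the classical (unnormalized) Lomb-Scargle periodogram, written with the Python standard library only. It is not taken from the DSFP materials. The simulated light curve, the 2.5-day period, the noise level, and the frequency grid are all assumptions chosen for illustration.

```python
import math
import random


def lomb_scargle_power(times, values, frequency):
    """Unnormalized Lomb-Scargle power at one frequency (cycles per unit time)."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    omega = 2.0 * math.pi * frequency
    # The offset tau makes the sine and cosine terms orthogonal
    # for this particular (irregular) set of sampling times.
    s2 = sum(math.sin(2.0 * omega * t) for t in times)
    c2 = sum(math.cos(2.0 * omega * t) for t in times)
    tau = math.atan2(s2, c2) / (2.0 * omega)
    cos_terms = [math.cos(omega * (t - tau)) for t in times]
    sin_terms = [math.sin(omega * (t - tau)) for t in times]
    yc = sum(yi * c for yi, c in zip(y, cos_terms))
    ys = sum(yi * s for yi, s in zip(y, sin_terms))
    cc = sum(c * c for c in cos_terms)
    ss = sum(s * s for s in sin_terms)
    return 0.5 * (yc * yc / cc + ys * ys / ss)


def best_period(times, values, frequencies):
    """Return the period at which the periodogram peaks on the given grid."""
    powers = [lomb_scargle_power(times, values, f) for f in frequencies]
    peak = max(range(len(frequencies)), key=powers.__getitem__)
    return 1.0 / frequencies[peak]


# Simulated data (all values are illustrative assumptions):
# 80 randomly timed observations over 100 days of a sinusoid plus Gaussian noise.
rng = random.Random(42)
true_period = 2.5  # days
times = sorted(rng.uniform(0.0, 100.0) for _ in range(80))
values = [math.sin(2.0 * math.pi * t / true_period) + rng.gauss(0.0, 0.2)
          for t in times]

# Frequency grid from 0.05 to 2.0 cycles per day in steps of 0.001.
frequencies = [0.05 + 0.001 * i for i in range(1951)]
recovered_period = best_period(times, values, frequencies)
print(f"True period: {true_period} d, recovered: {recovered_period:.3f} d")
```

The grid spacing is chosen to be several times finer than the approximate peak width of one over the 100-day baseline, so the peak is not missed between grid points. A production analysis would more likely use an established library implementation than a hand-written loop like this one.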

DSFP Organization

1) a two-year training program for graduate students, with three one-week schools per year

2) an initial cohort of 15 students, with 15 new students admitted (and 15 graduating) each year once the first two-year program is complete, so that 50% of the students enter and leave each year

3) each workshop emphasizes a particular theme (e.g. image analysis or machine learning) but statistics, data management, software engineering, and visualization will be incorporated into all of the workshops (spiral philosophy)

4) run by the Director (Lucianne Walkowicz), the Deputy Director (Vicky Kalogera), and a supporting postdoc with the title Director of Program Development and Communications (50%, Adam Miller), with input from the Leadership Council (Chris Lintott, Andrew Connolly, Željko Ivezić, Phil Marshall, Mario Juric, Robert Lupton)
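The cohort arithmetic in item 2 can be checked with a few lines of code. This sketch assumes one reading of the slide: a cohort of 15 is admitted every year starting in year one, and each cohort stays for two years. The function name and parameters are hypothetical.

```python
def enrollment_by_year(n_years, cohort_size=15, program_length=2):
    """Number of enrolled students in each year, admitting one cohort per year.

    Assumes a new cohort starts every year and each stays `program_length` years.
    """
    enrolled = []
    for year in range(1, n_years + 1):
        active_cohorts = min(year, program_length)
        enrolled.append(active_cohorts * cohort_size)
    return enrolled


SCHOOLS_PER_YEAR = 3  # three one-week schools per year (item 1)
PROGRAM_YEARS = 2

enrollment = enrollment_by_year(4)
steady_state = enrollment[-1]
turnover_fraction = 15 / steady_state
schools_per_student = SCHOOLS_PER_YEAR * PROGRAM_YEARS

print(f"Enrollment by year: {enrollment}")
print(f"Steady-state turnover: {turnover_fraction:.0%} per year")
print(f"Schools attended per student: {schools_per_student}")
```

Under this reading the program reaches 30 students from the second year onward, the 15 entering and 15 leaving each year are the 50% quoted above, and each student attends six one-week schools.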

Produce well-documented, open resource

2016 LSSTC Face-to-Face Board Meeting | October 5-6, 2016 | Tucson, AZ 14

First school: August 1st-5th 2016

Hosted by Northwestern/CIERA & the Adler Planetarium

First week: overview session providing intro to program

All curriculum hosted on GitHub: https://github.com/LSSTC-DSFP

Curriculum

Reflections on first school

− Active workspace and lectures interwoven with hands-on work proved effective

− Wide range of skills was challenging but not insurmountable; group work helps

− Very full schedule for an entire week; will build more free-form time and breaks into the next session

− Student evaluations overwhelmingly positive

Looking forward

− Next session ~ early Feb 2017

− Subsequent sessions will delve into individual topics in greater depth

− Curriculum will gradually encourage students to bring in problems from their own research

− Exploring additional funding for international applicants

− Admissions for Cohort 2 begin next spring

More details at http://ciera.northwestern.edu/Education/LSSTC_FAQ.php