LSST SCIENCE ADVISORY COMMITTEE MEETING | SEPTEMBER 25TH, 2017
What to Expect of the LSST Archive: The LSST Science Platform
Mario Juric, University of Washington
LSST Data Management Subsystem Scientist
for the LSST Data Management Team.
LSST Data Products: Level 1, 2, and 3
Level 1 (nightly)
− A stream of ~10 million time-domain events per night, detected and transmitted to event distribution networks within 60 seconds of observation.
− A catalog of orbits for ~6 million bodies in the Solar System.
Level 2 (annual)
− A catalog of ~37 billion objects (20B galaxies, 17B stars), ~7 trillion observations (“sources”), and ~30 trillion measurements (“forced sources”), produced annually, accessible through online databases.
− Reduced single-epoch, deep co-added images.
Level 3 (user generated)
− User-produced added-value data products (deep KBO/NEO catalogs, variable star classifications, shear maps, …)
For more details, see the “Data Products Definition Document”, http://ls.st/lse-163
How Do We Make Those Data Available?
[Diagram: LSST Users reaching the data over the Internet.]
Dumping FITS tables onto an FTP site will not suffice in the 2020s…
Modeling LSST User Needs
− A large majority of users are likely to begin by accessing the dataset through a web portal interface. They wish to become familiar with the LSST dataset. They may query smaller subsets of data for “at home” analysis.
− Some users will wish to use the tools they’re accustomed to (e.g., TOPCAT, Aladin, AstroPy, etc.) to grab the data from the LSST archive.
− Some fraction of our users may choose to continue their analysis by utilizing resources available to them at the DAC. This avoids the latency (and the necessary local resources) associated with downloading a (large) subset of the LSST dataset. Their science cases may not require too much computing, but are limited by storage, latency, or even just having the right software prerequisites. They would benefit from a prepared, next-to-the-data analysis environment utilizing the 10% Level 3 allocation.
− Use cases demanding larger resources may be able to acquire them at adjacent computing facilities (e.g., XSEDE). These users will benefit from connectivity to such resources.
− Finally, for the most demanding use cases, the rights-holders may utilize their own computing facilities to support larger-scale processing or even put up their own Data Access Centers. They need the ability to move, re-process, and/or re-serve large datasets.
The LSST Science Platform: Accessing LSST Data and Enabling LSST Science
[Diagram: LSST Science Platform. Aspects: Portal, JupyterLab, Web APIs. Components: Data Releases, Alert Streams, User Files, User Databases, User Computing, Software Tools. Access: LSST Users via the Internet.]
The LSST Science Platform is a set of integrated web applications and services deployed at the LSST Data Access Centers (DACs) through which the scientific community will access, visualize, subset, and perform next-to-the-data analysis of the data.
The LSST Science Platform Aspects: Portal, JupyterLab, Web APIs
The Science Platform exposes the underlying DAC services through three user-facing aspects: the Portal (novice), the JupyterLab (intermediate), and the Web APIs (expert and remote tools).
Through these, we enable access to the Data Releases and Alert Streams, and support next-to-the-data analysis and Level 3 product creation using the computing resources available at the DAC.
LSST Portal: The Web Window into the LSST Archive
The Web Portal to the archive will enable browsing and visualization of the available datasets in ways the users are accustomed to at archives such as IRSA, MAST, or the SDSS archive, with an added level of interactivity.
Through the Portal, the users will be able to view the LSST images, request subsets of data (via simple forms or SQL queries), construct simple plots, and generally explore the LSST dataset in a way that allows them to identify and access (subsets of) data required by their science case.
This will all be backed by a petascale-capable RDBMS.
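The kind of small subset request that a form or SQL query might express can be sketched as below. This is an illustration only: the in-memory SQLite database merely stands in for the petascale RDBMS, and the table name `object`, its columns, and the sample rows are hypothetical, not the LSST schema.

```python
import sqlite3

# Hypothetical stand-in for the archive database: an in-memory SQLite table.
# Table name, column names, and rows are assumptions made for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE object (id INTEGER PRIMARY KEY, ra REAL, dec REAL, r_mag REAL)"
)
conn.executemany(
    "INSERT INTO object VALUES (?, ?, ?, ?)",
    [
        (1, 10.10, -5.20, 21.3),
        (2, 10.15, -5.25, 24.8),
        (3, 45.00, 12.00, 19.9),
        (4, 10.12, -5.18, 22.7),
    ],
)


def select_box(conn, ra_min, ra_max, dec_min, dec_max, r_max):
    """Return ids of objects inside an RA/Dec box that are brighter than r_max."""
    rows = conn.execute(
        "SELECT id FROM object "
        "WHERE ra BETWEEN ? AND ? AND dec BETWEEN ? AND ? AND r_mag < ? "
        "ORDER BY id",
        (ra_min, ra_max, dec_min, dec_max, r_max),
    ).fetchall()
    return [row[0] for row in rows]


# Request a small patch of sky, keeping only objects brighter than r = 23.
subset = select_box(conn, 10.0, 10.2, -5.3, -5.1, 23.0)
print(subset)
```

A subset selected this way is small enough to download for “at home” analysis, which is the use the Portal is aimed at.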
LSST Portal: The Web Window into the LSST Archive
The Firefly Web Science User Interface (Wu et al., 2016; ADASS)
We currently have an initial version of the Portal running at NCSA.
Datasets:
• SDSS Stripe 82
• NEOWISE
Soon:
• HSC (LSST-reprocessed)
JupyterLab: Next-to-the-data Analysis
The tools exposed through the Web Portal will permit simple exploration, subsetting, and visualization of LSST data. They may not, however, be suitable for more complex data selection or analysis tasks.
To enable that next level of next-to-the-data work, we plan to let users launch their own Jupyter notebooks on our computing resources at the DAC. These will have fast access to the LSST database and files. They will come with commonly used and useful tools preinstalled (e.g., AstroPy, the LSST data processing software stack).
This service is similar in nature to efforts such as SciServer at JHU, or the JupyterHub deployment for DES at NCSA.
JupyterLab: Next-to-the-data Analysis
YouTube demo of the LSST JupyterLab Aspect: http://ls.st/bgt
Web APIs: Integrating With Existing Tools
Backend Platform services, such as access to databases, images, and other files, will be exposed through machine-accessible web APIs.
We have a preference for industry-standard and/or VO APIs (e.g., WebDAV, TAP, SIA, etc.); the goal is to support what’s broadly accepted within the community. This will allow the discoverability of LSST data products from within the Virtual Observatory, federation of the LSST dataset with other archives, and the use of widely utilized tools (e.g., TOPCAT or others).
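As a rough sketch of what a remote tool would do with a TAP service, the snippet below builds a synchronous TAP query URL from the standard `REQUEST`, `LANG`, and `QUERY` parameters. It sends nothing over the network. The service root `https://dac.example.org/tap` and the table and column names in the ADQL are assumptions for illustration; the slides do not give the actual LSST endpoints or schema.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical service root; the real LSST TAP endpoint is not given in the slides.
TAP_BASE_URL = "https://dac.example.org/tap"


def build_tap_sync_url(base_url, adql):
    """Build a synchronous TAP query URL using the standard TAP parameters."""
    params = {"REQUEST": "doQuery", "LANG": "ADQL", "QUERY": adql}
    return f"{base_url.rstrip('/')}/sync?{urlencode(params)}"


# Cone search written in ADQL; the table and column names are placeholders.
adql = (
    "SELECT TOP 10 ra, dec FROM object "
    "WHERE 1=CONTAINS(POINT('ICRS', ra, dec), CIRCLE('ICRS', 10.1, -5.2, 0.05))"
)

url = build_tap_sync_url(TAP_BASE_URL, adql)
print(url)

# Decode the query string again to confirm the ADQL survives URL encoding.
decoded = parse_qs(urlsplit(url).query)
print(decoded["QUERY"][0] == adql)
```

Because the request uses only standard parameters, any TAP-aware client (TOPCAT, for instance) could issue the equivalent query without LSST-specific code, which is the point of preferring VO APIs.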
Computing, Storage, and Database Resources
Computing, file storage, and personal databases (the “user workspace”) will be made available to support the work via the Portal and within the Notebooks.
An important feature is that no matter how the user accesses the DAC (Portal, Notebook, or VO APIs), they always “see” the same workspace.
How big is the “LSST Science Cloud” (@DR2)?
− Computing:
• ~2,400 cores
• ~18 TFLOPs
− File storage:
• ~4 PB
− Database storage:
• ~3 PB
This is shared by all users. We’re estimating the number of potential DAC users not to exceed 7,500 (relevant for file and database storage).
Not all users will be accessing the computing cluster concurrently. We are estimating on the order of ~100 concurrent users.
Though this is a relatively small cluster by 2020-era standards, it will be sufficient to enable preliminary end-user science analyses (working on catalogs, smaller numbers of images) and creation of some added-value (Level 3) data products.
Think of this as having your own server with a few TB of disk and database storage, right next to the LSST data, with a chance to use tens to hundreds of cores for analysis.
For larger endeavors (e.g., pixel-level reprocessing of the entire LSST dataset), the users will want to use resources beyond the LSST DAC (more later).
These resources will be made available to the users of the U.S. Data Access Center.
All DAC users will begin with some default (small) allocation, with more likely to be made available via a (TBD) proposal process.
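For a sense of scale, the quoted totals can be divided evenly across users. This is a naive equal split under stated assumptions (decimal units, every potential user holding storage, ~100 users computing at once); it is not the allocation policy, which the slide describes as a small default plus a TBD proposal process.

```python
# Figures quoted on the slide for the DR2-era "LSST Science Cloud".
CORES = 2_400
FILE_STORAGE_PB = 4
DB_STORAGE_PB = 3
MAX_USERS = 7_500        # upper bound on potential DAC users
CONCURRENT_USERS = 100   # order-of-magnitude estimate of simultaneous users

TB_PER_PB = 1_000        # assumption: decimal units


def even_share(total, n_users):
    """Naive equal split of a shared resource across n_users."""
    return total / n_users


file_tb_per_user = even_share(FILE_STORAGE_PB * TB_PER_PB, MAX_USERS)
db_tb_per_user = even_share(DB_STORAGE_PB * TB_PER_PB, MAX_USERS)
cores_per_active_user = even_share(CORES, CONCURRENT_USERS)

print(f"File storage per user:     {file_tb_per_user:.2f} TB")
print(f"Database storage per user: {db_tb_per_user:.2f} TB")
print(f"Cores per concurrent user: {cores_per_active_user:.0f}")
```

The even split comes to roughly half a terabyte of each storage type per potential user and 24 cores per concurrent user, the latter in line with the “tens to hundreds of cores” picture.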
Level 3: Added-value Data Products
− Level 3 Data Products: Added-value products created by the community
− These may enable science use cases not fully covered by what we’ll generate in Level 1 and 2:
• Custom processing of deep drilling fields
• SNe photometry (e.g., CFHT-LS type forward modeling)
• Extremely crowded field photometry (e.g., globular clusters)
• Characterization of diffuse structures (e.g., ISM)
• Custom measurement algorithms
• Catalogs of SNe light echoes
− The user computing and storage present in the LSST Science Platform are meant to enable next-to-the-data realization of use cases like the ones above.
− Level 3 software/data products may be migrated to Level 2 (with owners’ permission); this is one of the ways Level 2 products will evolve.
How we (think) we will work with LSST data?
[Diagram: LSST Science Platform. Aspects: Portal, JupyterLab, Web APIs. Components: Data Releases, Alert Streams, User Files, User Databases, User Computing, Software Tools. Access: LSST Users via the Internet.]
Mapping the Platform to Model Users
− Most users are likely to begin with the Web Portal, to become familiar with the LSST dataset and query smaller subsets of data for “at home” analysis. Some may use the tools they’re accustomed to (e.g., TOPCAT, Aladin, AstroPy, etc.) to grab the data using LSST’s VO-compatible APIs.
− Some users may choose to continue their analysis by utilizing resources available to them at the DAC. They’ll access these through Jupyter notebook-type remote interfaces, with access to a mid-sized computing cluster. It’s quite possible that a large fraction of end-user (“single PI”) science may be achievable this way.
− Users who need larger resources may be able to apply for more at adjacent computing facilities. For example, U.S. computing is located in the National Petascale Computing Facility at the National Center for Supercomputing Applications (NCSA). Significant additional supercomputing is expected to be available at the same site (e.g., NPCF currently hosts the Blue Waters supercomputer).
− Finally, rights-holders may utilize their own computing facilities to support larger-scale processing or even put up their own Data Access Centers. Because our software (pipelines, middleware, databases) is open source, they may re-use it to the extent possible.
Putting it all together: the LSST Science Platform
[Diagram: LSST Science Platform. Aspects: Portal, JupyterLab, Web APIs. Components: Data Releases, Alert Streams, User Files, User Databases, User Computing, Software Tools. Access: LSST Users via the Internet.]
For more details, see the “LSST Science Platform Vision Document”, http://ls.st/lsp