Experience Building The World Wide Telescope aka: The Virtual Observatory


Post on 07-Jan-2016

Page 1

Experience Building The World Wide Telescope aka: The Virtual Observatory

Jim Gray

Alex Szalay

Page 2

The Evolution of Science

• Observational Science
  – Scientist gathers data by direct observation
  – Scientist analyzes data
• Analytical Science
  – Scientist builds analytical model
  – Makes predictions
• Computational Science
  – Simulate analytical model
  – Validate model and make predictions
• Data Exploration Science
  – Data captured by instruments, or data generated by simulator
  – Processed by software
  – Placed in a database / files
  – Scientist analyzes database / files

Page 3

Information Avalanche

• In science, industry, government, …
  – Better observational instruments and better simulations are producing a data avalanche
• Examples
  – BaBar: grows 1 TB/day (2/3 simulation information, 1/3 observational information)
  – CERN: LHC will generate 1 GB/s, ~10 PB/y
  – VLBA (NRAO) generates 1 GB/s today
  – Pixar: 100 TB/movie
• New emphasis on informatics:
  – Capturing, Organizing, Summarizing, Analyzing, Visualizing
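The CERN rate and yearly volume above can be related with simple arithmetic. The sketch below is illustrative only: it assumes decimal units (1 PB = 10^6 GB) and a 365-day year, and the duty-cycle reading of the ~10 PB/y figure is an inference, not something the slide states.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 s, ignoring leap years (assumption)
GB_PER_PB = 1_000_000               # decimal units: 1 PB = 10^6 GB (assumption)


def petabytes_per_year(gb_per_second: float, duty_cycle: float = 1.0) -> float:
    """Yearly volume in PB for a sustained rate and a fraction of time spent recording."""
    return gb_per_second * SECONDS_PER_YEAR * duty_cycle / GB_PER_PB


full_time = petabytes_per_year(1.0)   # 1 GB/s recorded around the clock
implied_duty = 10.0 / full_time       # duty cycle that would give ~10 PB/y
babar_tb_per_year = 1.0 * 365         # BaBar growing at 1 TB/day

print(f"1 GB/s continuous: {full_time:.1f} PB/y")
print(f"duty cycle for ~10 PB/y: {implied_duty:.2f}")
print(f"BaBar at 1 TB/day: {babar_tb_per_year:.0f} TB/y")
```

Under these assumptions a continuous 1 GB/s stream comes to about 31.5 PB/y, so ~10 PB/y corresponds to recording roughly a third of the time.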

Images: image courtesy C. Meneveau & A. Szalay @ JHU; BaBar, Stanford; Space Telescope; P&E Gene Sequencer, from http://www.genome.uci.edu/

Page 4

World Wide Telescope
Virtual Observatory

http://www.ivoa.net/

• Premise: Most data is (or could be) online
• The Internet is the world’s best telescope:
  – It has data on every part of the sky
  – In every measured spectral band: optical, x-ray, radio, …
  – As deep as the best instruments (2 years ago).
  – It is up when you are up. The “seeing” is always great (no working at night, no clouds, no moons, no …).
  – It’s a smart telescope: links objects and data to literature on them.

Page 5

The WWT Components

• Data Sources
  – Literature
  – Archives
• Unified Definitions
  – Units
  – Semantics/Concepts/Metrics, Representations
  – Provenance
• Object model
• Classes and methods
• Portals

Page 6

Data Sources

• Literature online and cross indexed
  – Simbad, ADS, NED: http://simbad.u-strasbg.fr/Simbad, http://adswww.harvard.edu/, http://nedwww.ipac.caltech.edu/
• Many curated archives online
  – FIRST, DPOSS, 2MASS, USNO, IRAS, SDSS, VizieR, …
  – Typically files with English meta-data and some programs
• Groups, Researchers, Amateurs Publish
  – Datasets online in various formats
  – Documentation varies
  – Publications are ephemeral
  – Unknown provenance

Page 7

Unified Definitions

• Universal Content Definitions: http://vizier.u-strasbg.fr/doc/UCD.htx
  – Collated all table heads from all the literature
  – 100,000 terms reduced to ~1,500
  – Rough consensus that this is the right thing.
  – Refinement in progress as people use UCDs
• Defines
  – Units: gram, radian, second, ...
  – Semantic Concepts / Metrics: Std error, Chi2 fit, magnitude, flux @ passband, velocity, ...
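A minimal sketch of how UCD tags let software treat differently named columns as the same quantity. The two catalogs, their column names, and the sample row are hypothetical; `ID_MAIN`, `POS_EQ_RA_MAIN`, and `POS_EQ_DEC_MAIN` are used as UCD1-style tags for the object identifier and main position.

```python
from typing import Dict, Tuple

# Hypothetical column-name -> UCD annotations for two made-up catalogs.
CATALOG_A = {"objID": "ID_MAIN", "ra": "POS_EQ_RA_MAIN", "dec": "POS_EQ_DEC_MAIN"}
CATALOG_B = {"name": "ID_MAIN", "RAJ2000": "POS_EQ_RA_MAIN", "DEJ2000": "POS_EQ_DEC_MAIN"}


def column_for(ucd_map: Dict[str, str], ucd: str) -> str:
    """Return the single column tagged with `ucd`, whatever its local name is."""
    matches = [column for column, tag in ucd_map.items() if tag == ucd]
    if len(matches) != 1:
        raise LookupError(f"expected one column tagged {ucd}, found {len(matches)}")
    return matches[0]


def position(row: Dict[str, float], ucd_map: Dict[str, str]) -> Tuple[float, float]:
    """Extract (RA, Dec) from a row using UCDs instead of column names."""
    ra = row[column_for(ucd_map, "POS_EQ_RA_MAIN")]
    dec = row[column_for(ucd_map, "POS_EQ_DEC_MAIN")]
    return ra, dec


row_b = {"name": "hypothetical-1", "RAJ2000": 185.0, "DEJ2000": 15.5}
print(column_for(CATALOG_A, "POS_EQ_RA_MAIN"))
print(position(row_b, CATALOG_B))
```

The same `position` call works on either catalog, which is the point of agreeing on semantic tags rather than on column names.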

Page 8

Provenance

• Most data will be derived.
• To do science, need to trace derived data back to source.
• So programs and inputs must be registered.
• Must be able to re-run them.
• Example: Space Telescope Calibrated Data
  – Run on demand
  – Can specify software version (to get old answers)
• Scientific Data Provenance and Curation are largely unsolved problems (some ideas but no science).
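One way to picture "register programs and inputs, then re-run them" is the sketch below. It is an illustration under stated assumptions, not a description of how Space Telescope calibration works: the program name `calibrate`, its two versions, and the byte-string "frame" are all hypothetical.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Registered programs, keyed by (name, version). All entries are hypothetical.
PROGRAMS: Dict[Tuple[str, str], Callable[[bytes], bytes]] = {}


def register(name: str, version: str, func: Callable[[bytes], bytes]) -> None:
    """Record a program version so derived data can be regenerated later."""
    PROGRAMS[(name, version)] = func


@dataclass(frozen=True)
class Provenance:
    program: str
    version: str
    input_sha256: str  # fingerprint of the source data the result was derived from


def derive(name: str, version: str, raw: bytes) -> Tuple[bytes, Provenance]:
    """Run a registered program and return its output with a provenance record."""
    output = PROGRAMS[(name, version)](raw)
    return output, Provenance(name, version, hashlib.sha256(raw).hexdigest())


def rerun(record: Provenance, raw: bytes) -> bytes:
    """Regenerate derived data from its record, refusing mismatched inputs."""
    if hashlib.sha256(raw).hexdigest() != record.input_sha256:
        raise ValueError("input does not match the registered source data")
    return PROGRAMS[(record.program, record.version)](raw)


register("calibrate", "1.0", lambda raw: raw.upper())
register("calibrate", "2.0", lambda raw: raw.upper() + b"!")

old_output, record = derive("calibrate", "1.0", b"raw frame")
new_output, _ = derive("calibrate", "2.0", b"raw frame")
print(rerun(record, b"raw frame") == old_output)  # old version still reproducible
```

Keeping old versions registered is what makes it possible to ask for the "old answers" after the software has moved on.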

Page 9

Object Model

• General acceptance of XML
• Recent acceptance of XML Schema (XSD over DTD)
• Wait-and-See about SOAP/WSDL/…
  – “Web Services are just Corba with angle brackets.”
  – FTP is good enough for me.
• Personal opinion:
  – Web Services are much more than “Corba + <>”
  – Huge focus on interop
  – Huge focus on integrated tools
• But the community says “Show me!”
  – Many technologists sold, but not the astronomers

Page 10

Classes and Methods

• First Class: VOTable http://www.us-vo.org/VOTable/VOTable-1-0.htm
  – Represents an answer set in XML
    • Defined by an XML Schema (XSD)
    • Metadata (in terms of UCDs)
    • Data representation (numbers and text)
  – First method
    • Cone Search: Get objects in this cone
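The two ideas above can be sketched together: an answer set whose columns carry UCD metadata, and a cone search that keeps only the objects within a given radius of a sky position. The XML below is a simplified, hand-written VOTable-like fragment with hypothetical objects (it is not validated against the schema), and the filtering runs locally rather than calling any real service.

```python
import math
import xml.etree.ElementTree as ET

# Simplified VOTable-like answer set; the three objects are hypothetical.
VOTABLE_XML = """
<VOTABLE version="1.0">
  <RESOURCE>
    <TABLE name="hypothetical_objects">
      <FIELD name="id" ucd="ID_MAIN" datatype="char" arraysize="*"/>
      <FIELD name="ra" ucd="POS_EQ_RA_MAIN" datatype="double" unit="deg"/>
      <FIELD name="dec" ucd="POS_EQ_DEC_MAIN" datatype="double" unit="deg"/>
      <DATA><TABLEDATA>
        <TR><TD>A</TD><TD>180.00</TD><TD>0.00</TD></TR>
        <TR><TD>B</TD><TD>180.05</TD><TD>0.05</TD></TR>
        <TR><TD>C</TD><TD>185.00</TD><TD>2.00</TD></TR>
      </TABLEDATA></DATA>
    </TABLE>
  </RESOURCE>
</VOTABLE>
"""


def parse_rows(xml_text):
    """Return one dict per TR, keyed by the ucd attribute of each FIELD."""
    table = ET.fromstring(xml_text.strip()).find("RESOURCE/TABLE")
    ucds = [fld.get("ucd") for fld in table.findall("FIELD")]
    return [
        dict(zip(ucds, (td.text for td in tr.findall("TD"))))
        for tr in table.findall("DATA/TABLEDATA/TR")
    ]


def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, using the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def cone_search(rows, ra, dec, sr):
    """Keep rows whose position lies within sr degrees of (ra, dec)."""
    return [
        row for row in rows
        if separation_deg(ra, dec,
                          float(row["POS_EQ_RA_MAIN"]),
                          float(row["POS_EQ_DEC_MAIN"])) <= sr
    ]


rows = parse_rows(VOTABLE_XML)
hits = cone_search(rows, ra=180.0, dec=0.0, sr=0.1)
print([row["ID_MAIN"] for row in hits])
```

Because the filter looks up positions by UCD rather than by column name, the same function would work on any answer set that tags its position columns the same way.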

Page 11

Other Classes

• Space-Time class
  – http://hea-www.harvard.edu/~arots/nvometa/STCdoc.pdf
• Image Class (returns pixels)
  – SdssCutout
  – Simple Image Access Protocol: http://bill.cacr.caltech.edu/cfdocs/usvo-pubs/files/ACF8DE.pdf
  – HyperAtlas: http://bill.cacr.caltech.edu/usvo-pubs/files/hyperatlas.pdf
• Spectral
  – Simple Spectral Access Protocol
  – 500K spectra available at http://voservices.net/wave
• Query Services
  – ADQL and SkyNode: http://skyservice.pha.jhu.edu/develop/vo/adql/
• Registry
  – see below

Page 12

The Registry

• UDDI seemed inappropriate
  – Complex
  – Irrelevant questions
  – Relevant questions missing
• Evolved Dublin Core
  – Represent Datasets, Services, Portals
  – Needs to be machine readable
  – Federation (DNS model)
  – Push & Pull: register then harvest
• http://www.ivoa.net/twiki/bin/view/IVOA/IvoaResReg
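The "register then harvest" pattern can be sketched as follows. This is a toy model under stated assumptions, not the IVOA registry interface: the class and method names, the logical clock, and the resource identifiers are all hypothetical; only `identifier` and `title` borrow Dublin Core element names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class ResourceRecord:
    identifier: str  # Dublin Core "identifier"
    title: str       # Dublin Core "title"
    kind: str        # "dataset", "service", or "portal"
    stamp: int       # logical update time at the registry of origin


@dataclass
class Registry:
    name: str
    local: Dict[str, ResourceRecord] = field(default_factory=dict)
    harvested: Dict[str, ResourceRecord] = field(default_factory=dict)
    clock: int = 0

    def register(self, identifier: str, title: str, kind: str) -> ResourceRecord:
        """Push: a provider registers (or updates) a resource at this registry."""
        self.clock += 1
        record = ResourceRecord(identifier, title, kind, self.clock)
        self.local[identifier] = record
        return record

    def changed_since(self, stamp: int) -> List[ResourceRecord]:
        """Records registered here after the given logical time."""
        return [r for r in self.local.values() if r.stamp > stamp]

    def harvest(self, other: "Registry", since: int = 0) -> int:
        """Pull: copy another registry's records newer than `since`."""
        for record in other.changed_since(since):
            self.harvested[record.identifier] = record
        return other.clock  # high-water mark for the next incremental harvest

    def search(self, kind: str) -> List[ResourceRecord]:
        """Answer queries from local and harvested records alike."""
        everything = {**self.harvested, **self.local}
        return [r for r in everything.values() if r.kind == kind]


registry_a = Registry("registry-a")
registry_b = Registry("registry-b")
registry_a.register("archive-a/catalog", "Hypothetical sky catalog", "dataset")
registry_a.register("archive-a/cone", "Hypothetical cone search", "service")
first_mark = registry_b.harvest(registry_a)
registry_a.register("archive-a/cutout", "Hypothetical image cutout", "service")
second_mark = registry_b.harvest(registry_a, since=first_mark)
print(sorted(r.identifier for r in registry_b.search("service")))
```

Each registry stays authoritative for what was registered with it, while harvesting lets any registry answer queries about the whole federation, which is the sense of the DNS analogy.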

Page 13

SkyQuery
A Prototype WWT

• Started with SDSS data and schema

• Imported about 9 other datasets into that spine schema.

• Unified them with a portal

• Implicit spatial join among the datasets.

• All built on Web Services
  – Pure XML
  – Pure SOAP
  – Used .NET toolkit
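A spatial join pairs up objects from different archives that lie at (nearly) the same place on the sky. The sketch below swaps in the simplest possible technique, a brute-force nearest-neighbour match within a fixed tolerance, purely to illustrate the idea; it is not SkyQuery's actual matching algorithm, and the two catalogs are hypothetical.

```python
import math
from typing import List, Tuple

Source = Tuple[str, float, float]  # (id, ra_deg, dec_deg)


def separation_arcsec(a: Source, b: Source) -> float:
    """Great-circle separation of two sources in arcseconds (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (a[1], a[2], b[1], b[2]))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0


def cross_match(left: List[Source], right: List[Source],
                tolerance_arcsec: float) -> List[Tuple[str, str, float]]:
    """For each left source, keep its nearest right source within the tolerance."""
    pairs = []
    for a in left:
        best, best_sep = None, tolerance_arcsec
        for b in right:  # brute force: O(len(left) * len(right))
            sep = separation_arcsec(a, b)
            if sep <= best_sep:
                best, best_sep = b, sep
        if best is not None:
            pairs.append((a[0], best[0], best_sep))
    return pairs


# Two hypothetical catalogs: s1 and t1 are the same object seen twice.
survey_s = [("s1", 10.0, 20.0), ("s2", 50.0, -5.0)]
survey_t = [("t1", 10.0002, 20.0001), ("t2", 120.0, 30.0)]

matches = cross_match(survey_s, survey_t, tolerance_arcsec=2.0)
for left_id, right_id, sep in matches:
    print(f"{left_id} <-> {right_id}: {sep:.2f} arcsec")
```

A production system would replace the inner loop with a spatial index, since comparing every pair does not scale to survey-sized catalogs.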

Page 14

Demo

• SkyServer:
  – Navigator: showing cutout web service
  – List: showing many calls and variant use.
• SkyQuery:
  – Show integration of various archives.
  – Explain spatial join xMatch operator.

Page 15

MyDB

• Portal allows federation of data but…

• Intermediate results may be large.

• Intermediate results feed into next analysis step.

• Sending them back-and-forth to client is costly and sometimes infeasible.

• Solution: create a working DB for client at Portal: MyDB
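The idea of keeping intermediate results next to the data can be sketched with an in-memory SQLite database standing in for the portal. Everything here is an assumption made for illustration: the `archive` table, the `mydb_bright` table name, and the synthetic rows. It does not describe the SkyServer implementation.

```python
import sqlite3

# In-memory SQLite database standing in for the portal-side server (assumption).
portal = sqlite3.connect(":memory:")
portal.execute(
    "CREATE TABLE archive (id INTEGER, ra_deg REAL, dec_deg REAL, mag REAL)"
)
portal.executemany(
    "INSERT INTO archive VALUES (?, ?, ?, ?)",
    [(i, 180.0 + i * 0.001, 0.0, 15.0 + (i % 10)) for i in range(10000)],
)

# Step 1: the large intermediate result is written to the user's working table
# on the server instead of being shipped back to the client.
portal.execute("CREATE TABLE mydb_bright AS SELECT * FROM archive WHERE mag < 17")

# Step 2: the next analysis step reads that table in place.
count, mean_mag = portal.execute(
    "SELECT COUNT(*), AVG(mag) FROM mydb_bright"
).fetchone()

# Only this small summary crosses the network to the client.
print(count, mean_mag)
```

The client exchanges two short statements and one summary row with the portal, while the thousands of intermediate rows never leave the server.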

Page 16

MyDB

• Anyone can create a personal DB at SkyServer portal.
  – It is about 100 MB
  – It is private

• Simple queries done immediately

• Complex queries done by batch scheduler

• All queries can create/read/write MyDB tables

• Very popular with “serious” users.

• MyDB will be sharable by a group.

Page 17

Open SkyQuery

• SkyQuery being adopted by AstroGrid as reference implementation for OGSA-DAI (Open Grid Services Architecture, Data Access and Integration).

• SkyNode: basic archive object
  http://www.ivoa.net/twiki/bin/view/IVOA/SkyNode

• SkyQuery Language (VoQL) is evolving.
  http://www.ivoa.net/twiki/bin/view/IVOA/IvoaVOQL

Page 18

The WWT Components
Outline

• Data Sources
  – Literature
  – Archives
• Unified Definitions
  – Units
  – Semantics/Concepts/Metrics, Representations
  – Provenance
• Object model
• Classes and methods
• Portals
• WWT is a poster child for the Data Grid.

What we learned

• Astro is a community of 10,000
• Homogeneous & Cooperative
• If you can’t do it for Astro, do not bother with 3M bio-info.
• Agreement
  – Takes time
  – Takes endless meetings
• Big problems are non-technical
  – Legacy is a big problem.
• Plumbing and tools are there. But…
  – What is the object model?
  – What do you want to save?
  – How to document provenance?