National Virtual Observatory

Upload: robert

Post on 21-Jan-2016

TRANSCRIPT

Page 1: National Virtual Observatory

National Virtual Observatory

Page 2: National Virtual Observatory

The National Virtual Observatory

National

• distributed in scope across institutions and agencies

• available to all astronomers and the public

Virtual

• not tied to a single “brick-and-mortar” location

• supports astronomical “observations” and discoveries via remote access to digital representations of the sky

Observatory

• general purpose

• access to large areas of the sky at multiple wavelengths

• supports a wide range of astronomical explorations

• enables discovery via new computational tools

Page 3: National Virtual Observatory

Why Now?

The past decade has witnessed

• a thousand-fold increase in computer speed

• a dramatic decrease in the cost of computing & storage

• a dramatic increase in access to broadly distributed data

• large archives at multiple sites and high speed networks

• significant increases in detector size and performance

These form the basis for science of a qualitatively different nature

Page 4: National Virtual Observatory

Trends

Future dominated by detector improvements

Figure: total area of 3 m+ telescopes in the world (in m²) and total number of CCD pixels (in megapixels) as a function of time. Growth over 25 years is a factor of 30 in glass and 3000 in pixels.

• Moore’s Law growth in CCD capabilities

• Gigapixel arrays on the horizon

• Improvements in computing and storage will track growth in data volume

• Investment in software is critical, and growing
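As a rough check on the growth factors quoted in the figure caption, the sketch below converts them into doubling times. It assumes steady exponential growth over the 25 years, which the slide does not state, and the function name `doubling_time` is only illustrative.

```python
import math


def doubling_time(growth_factor, years):
    """Years per doubling, assuming steady exponential growth (an assumption)."""
    return years * math.log(2) / math.log(growth_factor)


# Growth factors over 25 years, as quoted on the slide.
glass_years = doubling_time(30, 25)     # total telescope area
pixel_years = doubling_time(3000, 25)   # total CCD pixel count

print(f"glass doubles about every {glass_years:.1f} years")
print(f"pixels double about every {pixel_years:.1f} years")
```

Under that assumption, pixel counts double roughly every two years, in line with the Moore's Law comparison on this slide, while telescope area doubles roughly every five years.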

Page 5: National Virtual Observatory

The Discovery Process

Past: observations of small, carefully selected samples of objects in a narrow wavelength band

Future: high quality, homogeneous multi-wavelength data on millions of objects, allowing us to

• discover significant patterns from the analysis of statistically rich and unbiased image/catalog databases

• understand complex astrophysical systems via confrontation between data and large numerical simulations

The discovery process will rely heavily on advanced visualization and statistical analysis tools

Page 6: National Virtual Observatory

NVO Science: Discoveries

Discoveries of rare objects: searches for exotic new sources

truly rare, at the level of 1 source in 10 million

Multi-wavelength identification of large statistical samples of previously rare objects:

• brown dwarfs, high-z quasars, ultra-luminous IR galaxies, etc.

Efficient cross-identification of “unidentified sources” from new surveys

• Example: Use radio, optical, and IR surveys to identify serendipitous Chandra X-ray sources

Selection of targets for spectroscopic follow-up

Page 7: National Virtual Observatory

NVO Science: Statistical Surveys

Homogeneous samples of typical objects

• Mega-surveys: sample size not a problem any more

• Statistical accuracy determined entirely by systematics

• Multi-wavelength data enables accurate sample selection (evolution, rest-frame selection)

High Precision Astrophysics of Origins

• Large scale structure of the universe

• Galactic structure

• Galaxy evolution

• Active galaxies, galaxy clusters, ...

• Stellar populations

Leading to New Astronomy

Page 8: National Virtual Observatory

New Astronomy – Different!

Systematic Data Exploration will have a central role in the New Astronomy

Digital Archives of the Sky will be the main access to data

Data “Avalanche”: the flood of Terabytes of data is already happening, whether we like it or not!

Transition to the new may be organized or chaotic

Page 9: National Virtual Observatory

Ongoing Mega-Surveys

Large number of new surveys

• Multi-terabyte in size, 100 million objects or larger

• Individual archives planned and under way

Multi-wavelength view of the sky

• Coverage in more than 13 wavelength bands within 5 years

Impressive early discoveries

• Finding exotic objects by unusual colors: L and T dwarfs, high redshift quasars

• Finding objects by time variability: gravitational micro-lensing

MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...

Page 10: National Virtual Observatory

High Redshift Quasars

Several z>5 QSOs discovered by SDSS in the early test data

Page 11: National Virtual Observatory

Methane/T Dwarf

Discovery of several new objects by SDSS & 2MASS

SDSS T-dwarf (June 1999)

Page 12: National Virtual Observatory

DPOSS Discoveries

Page 13: National Virtual Observatory

New Neighbor of the Milky Way

Finding new galaxies by spatial clustering of red objects

New galaxy is about 30 million light years away

Larger than most of the spiral galaxies in the Messier Catalogue

Clearly visible in the 2MASS infrared image

Expect to find 1000’s of such galaxies with 2MASS

Figure panels: Infrared, Optical

Page 14: National Virtual Observatory

The Observatories

NOAO/NRAO

• 20% of the time on all its telescopes dedicated to major surveys using a wide range of telescope and instrumentation packages

The NASA Great Observatories

• new opportunities for surveys

• combine mission-specific data with those from other missions and from the ground

Several multi-Terabyte databases and further extensive catalogs of objects

Page 15: National Virtual Observatory

HST Data Archive

Several Terabytes/year

Already more retrieval than ingest!

Page 16: National Virtual Observatory

Some Proposed Surveys

Next Decade: New optimized “survey systems”

exploring new parameter space

Dark Matter Telescope

• map the distribution of matter for z<1.5 from weak lensing, through deep, high quality images of galaxies

• moving and variable objects through repetitive surveys

Spectroscopic Wide-Field Telescope

• evolution of galaxies from z~4 to the present from star formation rates

• determine chemical abundances and kinematics

Page 17: National Virtual Observatory

The Road to the NVO

The environment to exploit these huge sky surveys does not exist today!

• 1 Terabyte at 10 Mbyte/s takes 1 day

• Expect 100’s of intensive queries and 1000’s of casual queries per day

• Data will reside at multiple locations

• Existing analysis tools do not scale to Terabyte data sets

Acute need in a few years, it will not just happen

A New Initiative is needed!

Page 18: National Virtual Observatory

NVO: A New Initiative

A new initiative is needed

• to ensure an evolutionary, cost-effective transition

• to maximize the impact of large current and future efforts

• to create the necessary new standards in the community

• to develop the software tools needed

• to ensure that the astronomical community has the proper network and hardware infrastructure to carry out its science

The National Virtual Observatory

• can be the catalyst of the “New Astronomy”

Page 19: National Virtual Observatory

The Goals of the NVO

Virtual observations of the sky in multiple wavelengths, by integrating all-sky Mega-surveys

Query the individual object catalogs and image databases thousands of times per day

Joint queries of the combined catalogs thousands of times per day

Enable discovery in these archives via new tools: novel visualization techniques, supervised and unsupervised learning, advanced classification techniques

Page 20: National Virtual Observatory

NVO: The Challenges

Size of the archived data

• 40,000 square degrees is 2 trillion pixels

• One band: 4 Terabytes

• Multi-wavelength: 10-100 Terabytes

• Time dimension: few Petabytes
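The size estimates above can be cross-checked with a few lines of arithmetic. The sketch below derives the pixel scale implied by 2 trillion pixels over 40,000 square degrees and the size of one band. The 2 bytes per pixel is an assumption inferred from the slide's own ratio of 4 Terabytes to 2 trillion pixels, and decimal units (1 TB = 10^12 bytes) are assumed.

```python
ARCSEC_PER_DEG = 3600

area_deg2 = 40_000        # sky area quoted on the slide
n_pixels = 2e12           # "2 trillion pixels"
bytes_per_pixel = 2       # assumption: 16-bit pixels, implied by 4 TB per band

# Area of one pixel in square arcseconds, and its side length in arcseconds.
pixel_area_arcsec2 = area_deg2 * ARCSEC_PER_DEG**2 / n_pixels
pixel_scale_arcsec = pixel_area_arcsec2 ** 0.5

# Size of a single band in Terabytes (decimal).
band_tb = n_pixels * bytes_per_pixel / 1e12

print(f"pixel scale: {pixel_scale_arcsec:.2f} arcsec")
print(f"one band:    {band_tb:.0f} TB")
```

Under these assumptions the numbers correspond to pixels about half an arcsecond on a side.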

The development of

• new archival methods

• new analysis tools

Hardware requirements

Training the next generation

Page 21: National Virtual Observatory

Necessary Components

New archival methods

New analysis tools

New hardware requirements

Page 22: National Virtual Observatory

New Archival Methods

Structure and manage multi-TB (and soon PB) data archives, distributed across the continent

Rapid and transparent access to image/catalog databases across all wavelengths, via intelligent query agents

Efficient query and data retrieval by more than 10,000 scientists world-wide, with enhanced search operators (like spatial proximity)

Page 23: National Virtual Observatory

Examples: non-local queries

Find all objects within 1' which have more than two neighbors with u-g, g-r, i-K colors within 0.05 mag

Find all star-like objects within dm=0.2 mag of the colors of a quasar at 5.5<z<6.5, using all colors in all available catalogs

Find galaxies that are blended with a star, and output the deblended magnitudes

Provide a list of moving objects consistent with an asteroid, based on all the surveys, and estimate possible orbit parameters

Find binary stars where at least one of them has the colors of a white dwarf, within the error boxes of hard x-ray sources
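To make the first example query concrete, here is a brute-force sketch under one possible reading: select objects that have more than two neighbors within 1 arcminute whose three colors each differ by at most 0.05 mag. The record layout (`id`, `ra`, `dec`, `colors`) and the toy catalog are hypothetical, not taken from any survey.

```python
import math


def unit_vector(ra_deg, dec_deg):
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def separation_arcmin(a, b):
    """Angular separation of two unit vectors, stable for small angles."""
    cross = (a[1] * b[2] - a[2] * b[1],
             a[2] * b[0] - a[0] * b[2],
             a[0] * b[1] - a[1] * b[0])
    cross_norm = math.sqrt(sum(c * c for c in cross))
    dot = sum(x * y for x, y in zip(a, b))
    return math.degrees(math.atan2(cross_norm, dot)) * 60.0


def crowded_similar(objects, radius_arcmin=1.0, color_tol=0.05):
    """Ids of objects with more than two nearby neighbors of similar color."""
    vecs = [unit_vector(o["ra"], o["dec"]) for o in objects]
    hits = []
    for i, obj in enumerate(objects):
        count = 0
        for j, other in enumerate(objects):
            if i == j or separation_arcmin(vecs[i], vecs[j]) > radius_arcmin:
                continue
            if all(abs(c1 - c2) <= color_tol
                   for c1, c2 in zip(obj["colors"], other["colors"])):
                count += 1
        if count > 2:
            hits.append(obj["id"])
    return hits


# Toy catalog: colors are (u-g, g-r, i-K) in magnitudes.
catalog = [
    {"id": 1, "ra": 150.000, "dec": 2.000, "colors": (1.20, 0.50, 2.00)},
    {"id": 2, "ra": 150.005, "dec": 2.000, "colors": (1.22, 0.52, 2.01)},
    {"id": 3, "ra": 150.000, "dec": 2.005, "colors": (1.19, 0.49, 1.98)},
    {"id": 4, "ra": 150.005, "dec": 2.005, "colors": (1.21, 0.51, 2.02)},
    {"id": 5, "ra": 150.002, "dec": 2.002, "colors": (0.30, 0.10, 1.00)},
    {"id": 6, "ra": 151.000, "dec": 2.000, "colors": (1.20, 0.50, 2.00)},
]
print(crowded_similar(catalog))
```

The double loop compares every pair of objects, so its cost grows with the square of the catalog size. That is workable for a toy list but not for 100 million objects, which is why such queries call for spatial indexing.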

Page 24: National Virtual Observatory

Examples: Today’s I/O rates

Reading a 1 TB data set

Data access                 Speed       Time [days]
Fast database server        50 MB/s     0.23
Local SCSI/Fast Ethernet    10 MB/s     1.2
T1                          0.5 MB/s    23
Typical ‘good’ www          20 KB/s     580

Brute force is not enough – we need clever techniques
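The times in the table follow directly from dividing the data set size by the transfer rate. The sketch below reproduces them, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes); the rates are the ones quoted on the slide, and the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400
DATA_SET_BYTES = 1e12  # 1 TB, decimal units assumed

# Sustained rates in bytes per second, as quoted in the table.
rates = {
    "Fast database server": 50e6,
    "Local SCSI/Fast Ethernet": 10e6,
    "T1": 0.5e6,
    "Typical 'good' www": 20e3,
}


def read_time_days(n_bytes, bytes_per_second):
    """Time to stream n_bytes once at a constant rate, in days."""
    return n_bytes / bytes_per_second / SECONDS_PER_DAY


for name, rate in rates.items():
    print(f"{name:26s} {read_time_days(DATA_SET_BYTES, rate):8.2f} days")
```

Even the fastest entry needs several hours for a single sequential pass over one Terabyte, before any analysis is done.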

Page 25: National Virtual Observatory

Geometric Indexing

“Divide and Conquer” Partitioning of the 3 + N + M attributes

Attributes          Number
Sky Position        3
Multiband Fluxes    N = 5+
Other               M = 100+

Sky Position (3): Hierarchical Triangular Mesh

Multiband Fluxes (N): split as k-d tree, stored as r-tree of bounding boxes

Other (M): using regular indexing techniques

Page 26: National Virtual Observatory

Sky coordinates

Stored as Cartesian coordinates: projected onto a unit sphere

Longitude and Latitude lines: intersections of planes and the sphere

Boolean combinations: query polyhedron
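The idea on this slide can be sketched as follows: once positions are unit vectors, a circular region or a latitude cut becomes a test against a plane, and a query region is a Boolean combination of such tests. This is a minimal illustration with hypothetical function names, not the code of any survey archive.

```python
import math


def unit_vector(ra_deg, dec_deg):
    """Project (RA, Dec) in degrees onto the unit sphere."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra),
            math.cos(dec) * math.sin(ra),
            math.sin(dec))


def in_halfspace(p, normal, offset):
    """True if p lies on the side of the plane where normal . p >= offset."""
    return sum(a * b for a, b in zip(normal, p)) >= offset


def cone(ra_deg, dec_deg, radius_deg):
    """Half-space whose cut through the sphere is a cap around (RA, Dec)."""
    return unit_vector(ra_deg, dec_deg), math.cos(math.radians(radius_deg))


def dec_above(dec_deg):
    """Half-space for Dec >= dec_deg: a latitude line is a plane cut."""
    return (0.0, 0.0, 1.0), math.sin(math.radians(dec_deg))


def in_query(p, halfspaces):
    """Boolean AND of half-spaces, i.e. a convex query polyhedron."""
    return all(in_halfspace(p, n, d) for n, d in halfspaces)


# Objects within 1 degree of (180, +30) that also lie at Dec >= +30.
query = [cone(180.0, 30.0, 1.0), dec_above(30.0)]
print(in_query(unit_vector(180.2, 30.5), query))  # inside both constraints
print(in_query(unit_vector(180.2, 29.5), query))  # inside the cap, below the cut
```

Each test is a single dot product and a comparison, with no trigonometry per object once the unit vectors are stored.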

Page 27: National Virtual Observatory

Sky Partitioning

Hierarchical Triangular Mesh - based on octahedron

Page 28: National Virtual Observatory

Hierarchical Subdivision

Hierarchical subdivision of spherical triangles represented as a quadtree

In SDSS the tree is 5 levels deep (8192 triangles)

In 2MASS the tree goes much deeper in the Galactic plane

One shoe fits all…

This indexing is now adopted by SDSS, 2MASS, GSC2, POSS2, FIRST and is considered by CDS, PLANCK and GAIA

New standard spatial index for astronomy!
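The subdivision described on this slide can be sketched in a few lines: start from the 8 faces of an octahedron and split each spherical triangle into 4 by joining the edge midpoints, pushed back onto the unit sphere. This is an illustrative reconstruction only; triangle naming and ordering in the actual Hierarchical Triangular Mesh software are not reproduced here.

```python
import math


def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)


def midpoint(a, b):
    """Midpoint of the arc between two unit vectors, on the unit sphere."""
    return normalize(tuple(x + y for x, y in zip(a, b)))


def subdivide(tri):
    """Split one spherical triangle into four children (a quadtree node)."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]


def mesh(level):
    """All triangles at a given depth, starting from the octahedron faces."""
    tris = [((sx, 0.0, 0.0), (0.0, sy, 0.0), (0.0, 0.0, sz))
            for sx in (1.0, -1.0) for sy in (1.0, -1.0) for sz in (1.0, -1.0)]
    for _ in range(level):
        tris = [child for t in tris for child in subdivide(t)]
    return tris


for level in range(6):
    print(level, len(mesh(level)))
```

The count at depth n is 8 × 4^n, so a tree 5 levels deep has 8 × 1024 = 8192 triangles, matching the SDSS figure quoted on the slide.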

Page 29: National Virtual Observatory

Result of the Query

Page 30: National Virtual Observatory

New Analysis Tools

Discover new patterns through advanced statistical methods and visualization techniques

Confront catalogs and image databases with numerical simulations of astrophysical systems

Collaborative exploration of multi-wavelength databases by multiple groups working at remote sites

Page 31: National Virtual Observatory

New Hardware Requirements

Large distributed database engines with Gbyte/s aggregate I/O speed

High speed (>10 Gbits/s) backbones cross-connecting the major archives

Scalable computing environment with hundreds of CPUs for statistical analysis and discovery

Page 32: National Virtual Observatory

What is the NVO? - Content

Source Catalogs, Image Data

Query Tools

Specialized Data: Spectroscopy, Time Series, Polarization

Information Archives: Derived & legacy data: NED, Simbad, ADS, etc.

Analysis/Discovery Tools: Visualization, Statistics

Standards

Page 33: National Virtual Observatory

What is the NVO? - Components

Information Providers: e.g. ADS, NED, ...

Data Providers: surveys, observatories, archives, SW repositories

Service Providers: query engines, compute engines

Page 34: National Virtual Observatory

Conceptual Architecture

Data Archives

Analysis tools

Discovery tools

User Gateway

Page 35: National Virtual Observatory

NVO Layers

Three layers built on top of one another, tied together with standards

Archives

• Data content

• Interconnections

• Cross identifications

• Services

Basic analysis tools

• Query capabilities

• Statistical tools

• Ability to run user code (API)

• Browsing tools

Discovery tools

• Visualization

• Advanced classification methods

• Supervised/unsupervised learning

• Data mining

Standards

• Meta-data

• Interfaces between archives

• Cross-identification standards

• Archive-tool interfaces

Page 36: National Virtual Observatory

The Flavor/Role of the NVO

Highly Distributed and Decentralized

Multiple Phases, built on top of one another

• Establish standards, meta-data formats

• Integrate main catalogs

• Develop initial querying tools

• Develop collaboration requirements, establish procedure to import new catalogs

• Develop distributed analysis environment

• Develop advanced visualization tools

• Develop advanced querying tools

Page 37: National Virtual Observatory

NVO Development Functions

Software development

– query generation/optimization, software agents, user interfaces, discovery tools, visualization tools

Standards development

– Meta-data, meta-services, streaming formats, object relationships, object attributes, ...

Infrastructure development

– archival storage systems, query engines, compute servers, high speed connections of main centers

Train the Next Generation

– train scientists equally at home in astronomy and modern computer science, statistics, visualization

Page 38: National Virtual Observatory

The Mission of the NVO

The National Virtual Observatory should

• provide seamless integration of the digitally represented multi-wavelength sky

• enable efficient simultaneous access to multi-Terabyte to Petabyte databases

• develop and maintain tools to find patterns and discoveries contained within the large databases

• develop and maintain tools to confront data with sophisticated numerical simulations

Page 39: National Virtual Observatory

NVO Funding

The NVO is ideal for multi-agency and IT funding

• relevant for all areas of astronomy and space science

• excellent match to goals of the IT2 initiative

• requires funding from NASA and NSF

• needs serious involvement of computer scientists

Scope

• approximately $25M for the first 5 years, could be larger in the second half

Requires long term commitment

• development/deployment (5 + 5 years)

Needs to start soon

• data avalanche has already begun

An effort for the whole astronomy - astrophysics community!

Page 40: National Virtual Observatory

National Virtual Observatory