National Virtual Observatory
The National Virtual Observatory
National
• distributed in scope across institutions and agencies
• available to all astronomers and the public
Virtual
• not tied to a single “brick-and-mortar” location
• supports astronomical “observations” and discoveries via remote access to digital representations of the sky
Observatory
• general purpose
• access to large areas of the sky at multiple wavelengths
• supports a wide range of astronomical explorations
• enables discovery via new computational tools
Why Now?
The past decade has witnessed
• a thousand-fold increase in computer speed
• a dramatic decrease in the cost of computing & storage
• a dramatic increase in access to broadly distributed data
• large archives at multiple sites and high speed networks
• significant increases in detector size and performance
These form the basis for science of a qualitatively different nature
Trends
Future dominated by detector improvements
Figure: total area of 3m+ telescopes in the world (in m²) and total number of CCD pixels (in Megapixels), as a function of time. Growth over 25 years is a factor of 30 in glass, 3000 in pixels.
• Moore’s Law growth in CCD capabilities
• Gigapixel arrays on the horizon
• Improvements in computing and storage will track growth in data volume
• Investment in software is critical, and growing
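As a rough check on the growth factors quoted above, the implied doubling times can be worked out. This is only a sketch: it assumes steady exponential growth over the 25 years, which the figure does not state, and the function name is illustrative.

```python
import math

def doubling_time_years(growth_factor, span_years):
    """Doubling time implied by steady exponential growth over span_years."""
    return span_years * math.log(2) / math.log(growth_factor)

# Growth factors quoted on the slide, over 25 years.
glass = doubling_time_years(30, 25)      # telescope collecting area
pixels = doubling_time_years(3000, 25)   # total CCD pixel count

print(f"glass doubles every {glass:.1f} years")
print(f"pixels double every {pixels:.1f} years")
```

Under that assumption the pixel count doubles roughly every two years, consistent with the Moore's Law comparison, while collecting area doubles roughly every five.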
The Discovery Process
Past: observations of small, carefully selected samples of objects in a narrow wavelength band
Future: high quality, homogeneous multi-wavelength data on millions of objects, allowing us to
• discover significant patterns from the analysis of statistically rich and unbiased image/catalog databases
• understand complex astrophysical systems via confrontation between data and large numerical simulations
The discovery process will rely heavily on advanced visualization and statistical analysis tools
NVO Science: Discoveries
Discoveries of rare objects:
• Searches for exotic new sources, truly rare at the level of 1 source in 10 million
Multi-wavelength identification of large statistical samples of previously rare objects:
• brown dwarfs, high-z quasars, ultra-luminous IR galaxies, etc.
Efficient cross-identification of “unidentified sources” from new surveys
• Example: Use radio, optical, and IR surveys to identify serendipitous Chandra X-ray sources
Selection of targets for spectroscopic follow-up
NVO Science: Statistical Surveys
Homogeneous samples of typical objects
• Mega-surveys: sample size not a problem any more
• Statistical accuracy determined entirely by systematics
• Multi-wavelength data enables accurate sample selection (evolution, rest-frame selection)
High Precision Astrophysics of Origins
• Large scale structure of the universe
• Galactic structure
• Galaxy evolution
• Active galaxies, galaxy clusters, ...
• Stellar populations
Leading to New Astronomy
New Astronomy – Different!
Systematic Data Exploration
• will have a central role in the New Astronomy
Digital Archives of the Sky
• will be the main access to data
Data “Avalanche”
• the flood of Terabytes of data is already happening, whether we like it or not!
Transition to the new
• may be organized or chaotic
Ongoing Mega-Surveys
Large number of new surveys
• Multi-terabyte in size, 100 million objects or larger
• Individual archives planned and under way
Multi-wavelength view of the sky
• More than 13 wavelengths covered within 5 years
Impressive early discoveries
• Finding exotic objects by unusual colors: L, T dwarfs, high redshift quasars
• Finding objects by time variability: gravitational micro-lensing
Surveys: MACHO, 2MASS, SDSS, DPOSS, GSC-II, COBE, MAP, NVSS, FIRST, GALEX, ROSAT, OGLE, ...
High Redshift Quasars
Several z>5 QSOs discovered by SDSS in the early test data
Methane/T Dwarf
Discovery of several new objects by SDSS & 2MASS
SDSS T-dwarf (June 1999)
DPOSS Discoveries
New Neighbor of the Milky Way
Finding new galaxies by spatial clustering of red objects
New galaxy is about 30 million light years away
Larger than most of the spiral galaxies in the Messier Catalogue
Clearly visible in the 2MASS infrared image
Expect to find 1000’s of such galaxies with 2MASS
[Image panels: Infrared, Optical]
The Observatories
NOAO/NRAO
• 20% of the time on all its telescopes dedicated to major surveys using a wide range of telescope and instrumentation packages
The NASA Great Observatories
• new opportunities for surveys
• combine mission-specific data with those from other missions and from the ground
Several multi-Terabyte databases and further extensive catalogs of objects
HST Data Archive
Several Terabytes/year
Already more retrieval than ingest!
Some Proposed Surveys
Next Decade: New optimized “survey systems” exploring new parameter space
Dark Matter Telescope
• map the distribution of matter for z<1.5 from weak lensing, through deep, high quality images of galaxies
• moving and variable objects through repetitive surveys
Spectroscopic Wide-Field Telescope
• evolution of galaxies from z~4 to the present from star formation rates
• determine chemical abundances and kinematics
The Road to the NVO
The environment to exploit these huge sky surveys does not exist today!
• 1 Terabyte at 10 Mbyte/s takes 1 day
• Expect 100’s of intensive queries and 1000’s of casual queries per day
• Data will reside at multiple locations
• Existing analysis tools do not scale to Terabyte data sets
Acute need in a few years, it will not just happen
A New Initiative is needed!
NVO: A New Initiative
A new initiative is needed
• to ensure an evolutionary, cost-effective transition
• to maximize the impact of large current and future efforts
• to create the necessary new standards in the community
• to develop the software tools needed
• to ensure that the astronomical community has the proper network and hardware infrastructure to carry out its science
The National Virtual Observatory
• can be the catalyst of the “New Astronomy”
The Goals of the NVO
Virtual observations of the sky in multiple wavelengths, by integrating all-sky Mega-surveys
Query the individual object catalogs and image databases thousands of times per day
Joint queries of the combined catalogs thousands of times per day
Enable discovery in these archives via new tools: novel visualization techniques, supervised and unsupervised learning, advanced classification techniques
NVO: The Challenges
Size of the archived data
• 40,000 square degrees is 2 trillion pixels
• One band: 4 Terabytes
• Multi-wavelength: 10-100 Terabytes
• Time dimension: few Petabytes
The development of
• new archival methods
• new analysis tools
Hardware requirements
Training the next generation
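The archive sizes quoted above can be reproduced with a short calculation. The pixel scale (0.5 arcsec) and pixel depth (2 bytes) below are assumptions, not values from the slides; they are chosen because they recover the slide's round numbers.

```python
ARCSEC_PER_DEG = 3600

def survey_pixels(area_sq_deg, pixel_arcsec):
    """Number of pixels needed to tile area_sq_deg at the given pixel scale."""
    pixels_per_sq_deg = (ARCSEC_PER_DEG / pixel_arcsec) ** 2
    return area_sq_deg * pixels_per_sq_deg

# Assumed values (not stated in the source).
PIXEL_ARCSEC = 0.5     # assumption: pixel scale in arcseconds
BYTES_PER_PIXEL = 2    # assumption: 16-bit pixels

pixels = survey_pixels(40_000, PIXEL_ARCSEC)
one_band_tb = pixels * BYTES_PER_PIXEL / 1e12   # decimal Terabytes

print(f"{pixels:.2e} pixels, {one_band_tb:.1f} TB per band")
```

With these assumptions one band comes to about 2 trillion pixels and roughly 4 Terabytes; the multi-wavelength range then corresponds to a few to a few tens of such bands.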
Necessary Components
New archival methods
New analysis tools
New hardware requirements
New Archival Methods
Structure and manage multi-TB (and soon PB) data archives, distributed across the continent
Rapid and transparent access to image/catalog databases across all wavelengths, via intelligent query agents
Efficient query and data retrieval by more than 10,000 scientists world-wide, with enhanced search operators (like spatial proximity)
Examples: non-local queries
Find all objects within 1' which have more than two neighbors with u-g, g-r, i-K colors within 0.05 mag
Find all star-like objects within dm=0.2 of the colors of a quasar at 5.5<z<6.5, using all colors in all available catalogs
Find galaxies that are blended with a star, output the deblended magnitudes
Provide a list of moving objects consistent with an asteroid, based on all the surveys, estimate possible orbit parameters
Find binary stars where at least one of them has the colors of a white dwarf, within the error boxes of hard x-ray sources
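To make the first example query concrete, here is a brute-force sketch over a toy catalog. It reads the query as "objects with more than two neighbors within 1 arcminute whose three colors all agree to 0.05 mag", which is one interpretation of the slide. The row layout (`id`, `ra`, `dec`, `colors`) and all function names are hypothetical, and a real archive would use a spatial index rather than comparing every pair.

```python
import math

def unit_vector(ra_deg, dec_deg):
    """Convert sky coordinates in degrees to a Cartesian unit vector."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def separation_arcmin(a, b):
    """Angular separation between two catalog rows, in arcminutes."""
    va, vb = unit_vector(a["ra"], a["dec"]), unit_vector(b["ra"], b["dec"])
    cross = (va[1] * vb[2] - va[2] * vb[1],
             va[2] * vb[0] - va[0] * vb[2],
             va[0] * vb[1] - va[1] * vb[0])
    sin_sep = math.sqrt(sum(c * c for c in cross))
    cos_sep = sum(x * y for x, y in zip(va, vb))
    # atan2 stays accurate for the very small angles involved here.
    return math.degrees(math.atan2(sin_sep, cos_sep)) * 60.0

def similar_colors(a, b, tol_mag):
    """True if every color index differs by at most tol_mag."""
    return all(abs(x - y) <= tol_mag for x, y in zip(a["colors"], b["colors"]))

def objects_with_similar_neighbors(rows, radius_arcmin=1.0, tol_mag=0.05, more_than=2):
    """Ids of rows with more than `more_than` nearby, similarly colored neighbors."""
    hits = []
    for i, obj in enumerate(rows):
        n = sum(1 for j, other in enumerate(rows)
                if j != i
                and separation_arcmin(obj, other) <= radius_arcmin
                and similar_colors(obj, other, tol_mag))
        if n > more_than:
            hits.append(obj["id"])
    return hits

# Toy catalog; colors are (u-g, g-r, i-K). All values are made up.
catalog = [
    {"id": "A", "ra": 10.000, "dec": 0.000, "colors": (1.00, 0.50, 2.00)},
    {"id": "B", "ra": 10.005, "dec": 0.000, "colors": (1.02, 0.51, 2.03)},
    {"id": "C", "ra": 10.000, "dec": 0.005, "colors": (0.98, 0.49, 1.99)},
    {"id": "D", "ra": 10.005, "dec": 0.005, "colors": (1.01, 0.52, 2.01)},
    {"id": "E", "ra": 10.002, "dec": 0.002, "colors": (0.30, 0.10, 1.00)},  # close, different colors
    {"id": "F", "ra": 50.000, "dec": 20.000, "colors": (1.00, 0.50, 2.00)}, # similar colors, far away
]

result = objects_with_similar_neighbors(catalog)
print(result)
```

The query is "non-local" in the sense that whether a row qualifies depends on other rows, which is why it cannot be answered by a simple per-object filter.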
Examples: Today’s I/O rates
Reading a 1 TB data set
Data access                 Speed       Time [days]
Fast database server        50 MB/s     0.23
Local SCSI/Fast Ethernet    10 MB/s     1.2
T1                          0.5 MB/s    23
Typical ‘good’ www          20 KB/s     580
Brute force is not enough – we need clever techniques
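The times in the table follow directly from size divided by rate. A small sketch, assuming decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes), reproduces them; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400

def transfer_days(size_bytes, rate_bytes_per_s):
    """Days needed to read size_bytes at a sustained rate."""
    return size_bytes / rate_bytes_per_s / SECONDS_PER_DAY

ONE_TB = 1e12  # assumption: decimal Terabyte

# Sustained rates as quoted on the slide, in bytes per second.
rates = {
    "Fast database server": 50e6,
    "Local SCSI/Fast Ethernet": 10e6,
    "T1": 0.5e6,
    "Typical 'good' www": 20e3,
}

for name, rate in rates.items():
    print(f"{name:26s} {transfer_days(ONE_TB, rate):8.2f} days")
```

Even the fastest entry needs several hours for one sequential pass, so any query that must scan the full data set is expensive no matter how it is written.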
Geometric Indexing
“Divide and Conquer” Partitioning
Attributes          Number      Partitioning
Sky Position        3           Hierarchical Triangular Mesh
Multiband Fluxes    N = 5+      Split as k-d tree, stored as r-tree of bounding boxes
Other               M = 100+    Using regular indexing techniques
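The flux-space partitioning can be illustrated with a minimal sketch: points are split at the median along cycling axes (the k-d tree step), and each leaf keeps a bounding box so that a range query can skip leaves that cannot contain matches. This is a simplification under stated assumptions: the leaves are kept in a flat list rather than a true r-tree, the data are two made-up color indices, and all names are hypothetical.

```python
def build_kd_leaves(points, leaf_size=2, depth=0):
    """Split points at the median along cycling axes; return (lo, hi, points) leaves."""
    if len(points) <= leaf_size:
        dims = range(len(points[0]))
        lo = tuple(min(p[d] for p in points) for d in dims)
        hi = tuple(max(p[d] for p in points) for d in dims)
        return [(lo, hi, points)]
    axis = depth % len(points[0])
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return (build_kd_leaves(ordered[:mid], leaf_size, depth + 1)
            + build_kd_leaves(ordered[mid:], leaf_size, depth + 1))

def box_query(leaves, lo, hi):
    """Return points inside the query box, skipping leaves whose boxes miss it."""
    dims = range(len(lo))
    found = []
    for leaf_lo, leaf_hi, pts in leaves:
        if any(leaf_hi[d] < lo[d] or leaf_lo[d] > hi[d] for d in dims):
            continue  # bounding boxes do not overlap, so this leaf is never read
        found.extend(p for p in pts if all(lo[d] <= p[d] <= hi[d] for d in dims))
    return found

# Made-up points in a two-dimensional color space.
points = [(0.1, 0.2), (0.4, 0.9), (0.5, 0.3), (0.8, 0.7),
          (1.2, 0.1), (1.5, 1.1), (1.9, 0.4), (2.3, 0.8)]

leaves = build_kd_leaves(points, leaf_size=2)
hits = box_query(leaves, lo=(0.3, 0.2), hi=(1.0, 1.0))
print(len(leaves), sorted(hits))
```

In this toy case half of the leaves are rejected from their bounding boxes alone, which is the effect the divide-and-conquer approach relies on at Terabyte scale.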
Sky coordinates
Stored as Cartesian coordinates: projected onto a unit sphere
Longitude and Latitude lines: intersections of planes and the sphere
Boolean combinations: query polyhedron
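A brief sketch of why the Cartesian representation is convenient: a plane cutting the unit sphere is a single linear inequality on (x, y, z), so declination limits and circular search regions are both dot-product tests, and a Boolean AND of such tests describes a convex query region. The function names below are illustrative and not taken from any survey software.

```python
import math

def radec_to_xyz(ra_deg, dec_deg):
    """Project sky coordinates (degrees) onto the unit sphere."""
    ra, dec = math.radians(ra_deg), math.radians(dec_deg)
    return (math.cos(dec) * math.cos(ra), math.cos(dec) * math.sin(ra), math.sin(dec))

def in_halfspace(p, normal, offset):
    """True if unit vector p satisfies normal . p >= offset."""
    return sum(a * b for a, b in zip(normal, p)) >= offset

def cone(ra_deg, dec_deg, radius_deg):
    """Half-space (normal, offset) for a circular cap around a sky position."""
    return radec_to_xyz(ra_deg, dec_deg), math.cos(math.radians(radius_deg))

def dec_above(dec_deg):
    """Half-space for declination >= dec_deg, i.e. the plane z = sin(dec)."""
    return (0.0, 0.0, 1.0), math.sin(math.radians(dec_deg))

def in_all(p, constraints):
    """Boolean AND of half-spaces: a convex region on the sphere."""
    return all(in_halfspace(p, n, c) for n, c in constraints)

# Northern half of a 5 degree cap centered on (RA, Dec) = (180, 30).
region = [cone(180.0, 30.0, 5.0), dec_above(30.0)]

inside = in_all(radec_to_xyz(180.0, 32.0), region)
below_dec_limit = in_all(radec_to_xyz(180.0, 28.0), region)
outside_cone = in_all(radec_to_xyz(180.0, 40.0), region)
print(inside, below_dec_limit, outside_cone)
```

No trigonometry is needed per object once positions are stored as unit vectors; each test is three multiplications and a comparison.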
Sky Partitioning
Hierarchical Triangular Mesh - based on octahedron
Hierarchical Subdivision
Hierarchical subdivision of spherical triangles represented as a quadtree
In SDSS the tree is 5 levels deep - 8192 triangles
In 2MASS the tree goes much deeper in the Galactic plane
One shoe fits all…
This indexing is now adopted by SDSS, 2MASS, GSC2, POSS2, FIRST and is considered by CDS, PLANCK and GAIA
New standard spatial index for astronomy!
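The hierarchical subdivision can be sketched in a few lines: start from the 8 faces of an octahedron, split each spherical triangle into 4 by joining the edge midpoints, and descend into whichever child contains the point. This is an illustrative sketch only; the child ordering and path encoding below are arbitrary choices and do not follow the naming scheme of the actual Hierarchical Triangular Mesh software.

```python
import math
from itertools import product

def normalize(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def midpoint(a, b):
    """Midpoint of a great-circle arc, pushed back onto the unit sphere."""
    return normalize(tuple(x + y for x, y in zip(a, b)))

def contains(tri, p, eps=1e-12):
    """True if p lies inside the spherical triangle (either vertex orientation)."""
    a, b, c = tri
    sign = 1.0 if dot(cross(a, b), c) > 0 else -1.0
    return all(sign * dot(cross(u, v), p) >= -eps for u, v in ((a, b), (b, c), (c, a)))

def children(tri):
    """Split a spherical triangle into 4 by joining its edge midpoints."""
    a, b, c = tri
    ab, bc, ca = midpoint(a, b), midpoint(b, c), midpoint(c, a)
    return [(a, ab, ca), (b, bc, ab), (c, ca, bc), (ab, bc, ca)]

# The 8 faces of the octahedron, one per octant.
BASE = [((float(sx), 0.0, 0.0), (0.0, float(sy), 0.0), (0.0, 0.0, float(sz)))
        for sx, sy, sz in product((1, -1), repeat=3)]

def locate(p, depth):
    """Return (path, triangle): the index chosen at the base and at each level."""
    path, tri, candidates = [], None, BASE
    for _ in range(depth + 1):
        for i, t in enumerate(candidates):
            if contains(t, p):
                path.append(i)
                tri = t
                break
        else:
            raise ValueError("point is not on the unit sphere")
        candidates = children(tri)
    return path, tri

def triangle_count(depth):
    """Number of triangles after `depth` subdivisions of the octahedron."""
    return 8 * 4 ** depth

p = normalize((0.5, 0.3, 0.8))
path, tri = locate(p, 5)
print(triangle_count(5), path)
```

Five subdivisions of the 8 base triangles give 8 × 4^5 = 8192 triangles, matching the SDSS figure above. Because the tree is a quadtree, a region of the sky can be subdivided more deeply where the source density is higher.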
Result of the Query
New Analysis Tools
Discover new patterns through advanced statistical methods and visualization techniques
Confront catalogs and image databases with numerical simulations of astrophysical systems
Collaborative exploration of multi-wavelength databases by multiple groups working at remote sites
New Hardware Requirements
Large distributed database engines with Gbyte/s aggregate I/O speed
High speed (>10 Gbits/s) backbones cross-connecting the major archives
Scalable computing environment with hundreds of CPUs for statistical analysis and discovery
What is the NVO? - Content
Source Catalogs, Image Data
Query Tools
Specialized Data: Spectroscopy, Time Series, Polarization
Information Archives: Derived & legacy data: NED, Simbad, ADS, etc.
Analysis/Discovery Tools: Visualization, Statistics
Standards
What is the NVO? - Components
Information Providers
• e.g. ADS, NED, ...
Data Providers
• Surveys, observatories, archives, SW repositories
Service Providers
• Query engines, compute engines
Conceptual Architecture
Data Archives
Analysis tools
Discovery tools
User
Gateway
NVO Layers
Three layers built on top of one another, tied together with standards
Archives
• Data content
• Interconnections
• Cross identifications
• Services
Basic analysis tools
• Query capabilities
• Statistical tools
• Ability to run user code (API)
• Browsing tools
Discovery tools
• Visualization
• Advanced classification methods
• Supervised/unsupervised learning
• Data mining
Standards
• Meta-data
• Interfaces between archives
• Cross-identification standards
• Archive-tool interfaces
The Flavor/Role of the NVO
Highly Distributed and Decentralized
Multiple Phases, built on top of one another
• Establish standards, meta-data formats
• Integrate main catalogs
• Develop initial querying tools
• Develop collaboration requirements, establish procedure to import new catalogs
• Develop distributed analysis environment
• Develop advanced visualization tools
• Develop advanced querying tools
NVO Development Functions
Software development
• query generation/optimization, software agents, user interfaces, discovery tools, visualization tools
Standards development
• Meta-data, meta-services, streaming formats, object relationships, object attributes, ...
Infrastructure development
• archival storage systems, query engines, compute servers, high speed connections of main centers
Train the Next Generation
• train scientists equally at home in astronomy and modern computer science, statistics, visualization
The Mission of the NVO
The National Virtual Observatory should
• provide seamless integration of the digitally represented multi-wavelength sky
• enable efficient simultaneous access to multi-Terabyte to Petabyte databases
• develop and maintain tools to find patterns and discoveries contained within the large databases
• develop and maintain tools to confront data with sophisticated numerical simulations
NVO Funding
The NVO is ideal for multi-agency and IT funding
• relevant for all areas of astronomy and space science
• excellent match to goals of the IT2 initiative
• requires funding from NASA and NSF
• needs serious involvement of computer scientists
Scope
• approximately $25M for the first 5 years, could be larger in the second half
Requires long term commitment
• development/deployment (5 + 5 years)
Needs to start soon
• data avalanche has already begun
An effort for the whole astronomy - astrophysics community!