E-Science: International Collaborations, Research Networks and Grids
Pan-American Advanced Studies Institute (PASI) Program
NSF OISE #0418366
Mendoza, Argentina
May 15-21, 2005
Julio Ibarra, PI
Heidi Alvarez, Co-PI
Goals and Objectives
• Understand current issues and challenges involving e-Science collaborations that span beyond national boundaries
• Improve our understanding of the role of technology, in particular research networks and Grids, in e-Science
• Understand how research faculty, students and practitioners are collaborating in e-Science
• Learn from the high-energy physics and astronomy communities about how they use Grids and advanced networking technologies for e-Science
• Understand how Grids and advanced networking technologies are being applied for international e-Science collaborations
What is the phenomenon of e-Science?
• Experimental science is no longer limited to being conducted in laboratories made of bricks and mortar, and is no longer done in isolation
• Science is increasingly being conducted in virtual laboratory environments; it is increasingly collaborative and increasingly global
• In many experimental disciplines, measuring apparatus are in one location, data is captured and reduced at different sites, data analysis is conducted at yet another site, and data is then stored at archives located elsewhere
• The increasing rate at which data is generated or collected is affecting how e-Scientists solve problems and coordinate data-intensive work
The Very-Long-Baseline Interferometry (VLBI) Technique
Why is e-Science Happening?
• Discovery requires larger, faster, higher-precision measuring apparatus
  – E.g., discovery of new particles or astronomical objects in the Universe
• Measuring apparatus have become prohibitively expensive for a single nation to develop
  – E.g., International Space Station, radio and optical telescopes
• Technology, in particular high-speed networks, low-cost compute resources, low-cost high-speed disk storage and large-size high-resolution display for visualization of very large data sets, is more affordable and more accessible
More Diversity, New Devices, New Applications
[Figure labels: Instruments (picture of digital sky); Knowledge from Data; Sensors (picture of earthquake and bridge); Wireless networks; Personalized Medicine]
How e-Science has been defined
• “e-Science is about global collaboration in key areas of science and the next generation of infrastructure that will enable it.”
  – Dr John Taylor, Director General of Research Councils, United Kingdom
• e-Science refers to large-scale science carried out through distributed global collaborations enabled by networks, requiring access to very large data collections, very-large-scale computing resources and high-performance visualization
  – NSF CISE Grand Challenges in e-Science Workshop Report
• E-Science developed from the field of computational science
  – Since the late 1980s, computational science became established as a third avenue of scientific discovery, alongside theoretical and experimental methodologies
  – E-Science goes further by focusing not only on compute-intensive simulations, but also on the remote use of large-scale data and knowledge repositories, scientific instruments and experiments, sensor arrays
What is the Scientific Method?
1. Observe some aspect of the universe.
2. Invent a tentative description, called a hypothesis, that is consistent with what you have observed.
3. Use the hypothesis to make predictions.
4. Test those predictions by experiments or further observations and modify the hypothesis in the light of your results.
5. Repeat steps 3 and 4 until there are no discrepancies between theory and experiment and/or observation.
(source: Wudka, Jose, The Physics 7 Page, University of California, Riverside, http://phyun5.ucr.edu/~wudka/physics7.html)
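The repeat-until-no-discrepancy loop in the steps above can be sketched in code. This is a toy illustration only and is not taken from the slides: it assumes the hypothesis is a single slope in a linear model y = slope * x, the observations are made up, and the names `refine_hypothesis`, `step` and `tolerance` are hypothetical.

```python
def refine_hypothesis(observations, slope=0.0, step=0.05,
                      tolerance=1e-6, max_rounds=10_000):
    """Adjust a one-parameter hypothesis (y = slope * x) until its
    predictions agree with the observations to within `tolerance`."""
    for round_number in range(1, max_rounds + 1):
        # Step 3: use the current hypothesis to make predictions.
        predictions = [slope * x for x, _ in observations]
        # Step 4: test the predictions against what was observed.
        errors = [y - p for (_, y), p in zip(observations, predictions)]
        if max(abs(e) for e in errors) < tolerance:
            return slope, round_number
        # Step 4 (continued): modify the hypothesis in light of the errors.
        slope += step * sum(e * x for e, (x, _) in zip(errors, observations))
    # Step 5 is the loop itself; give up if it never settles.
    raise RuntimeError("hypothesis did not converge")


# Steps 1 and 2: made-up observations and an initial guess of slope = 0.
observations = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]
slope, rounds = refine_hypothesis(observations)
print(f"slope = {slope:.4f} after {rounds} rounds")
```

The fixed `step` is chosen small enough for this particular data set; a real analysis would use a proper fitting method rather than this hand-rolled update.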
The Scientific Method in e-Science
[Diagram. Regions: Experimental/Observational Science; Theoretical Science; Virtual Science. Boxes: Natural Phenomenon; Measuring Engine; Data Capture & Reduction; Quick Look Data; Data Analysis; Data Archives; Human Creativity; Theory of Nature; Model of Nature; Simulation of Nature; Simulation Analysis. Feedback: Compare (+/-), Error Signal, Adjust.]
• Heavy lines indicate probable high-performance network connections. Dashed lines represent access to archived data, including virtual science, virtual observatories, and other forms of data mining
• A measuring engine might be operated by remote control from a Data Capture site
• Humans manage almost all of the process boxes. As a result, there is an information process layer, not represented in the diagram, that is necessary for the exchange of ideas, consultation, collaboration and coordination required to conduct research (source: NSF CISE Grand Challenges in e-Science Workshop Report)
Challenges of Next Generation Science in the Information Age
Petabytes of complex data explored and analyzed by 1000s of globally dispersed scientists, in hundreds of teams
• Flagship Applications
  – High Energy & Nuclear Physics, AstroPhysics Sky Surveys: Multi-Terabyte “block” transfers at 1-10+ Gbps
  – Fusion Energy: Time Critical Burst-Data Distribution; Distributed Plasma Simulations, Visualization, Analysis
  – eVLBI: Many real time data streams at 1-10 Gbps
  – BioInformatics, Clinical Imaging: GByte images on demand
• NEW “Analysis” Challenge: Provide results to thousands of scientists, with rapid turnaround, over networks of varying capability in different world regions
• Advanced integrated Grid applications rely on reliable, high performance operation of our LANs and WANs
Source: Harvey Newman
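To give a feel for what multi-terabyte "block" transfers at 1-10 Gbps mean in practice, the sketch below computes ideal transfer times. It assumes decimal units (1 TB = 10^12 bytes) and a link that sustains its full nominal rate with no protocol overhead, which real networks rarely achieve; the function name `transfer_seconds` is hypothetical.

```python
def transfer_seconds(size_bytes, rate_bits_per_second):
    """Ideal time to move `size_bytes` over a link sustaining the given rate."""
    return size_bytes * 8 / rate_bits_per_second


TERABYTE = 10**12  # decimal terabyte in bytes (assumption)
GBPS = 10**9       # one gigabit per second, in bits per second

for rate_gbps in (1, 10):
    seconds = transfer_seconds(TERABYTE, rate_gbps * GBPS)
    print(f"1 TB at {rate_gbps} Gbps: {seconds:.0f} s ({seconds / 3600:.2f} h)")
```

Under these assumptions a single terabyte takes 8000 seconds at 1 Gbps and 800 seconds at 10 Gbps, so petabyte-scale data sets remain a matter of days even on the fastest links listed.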
The Data Deluge
Science Areas | Today End2End Throughput | 5 Years End2End Throughput | 5-10 Years End2End Throughput | Remarks
High Energy Physics | 0.5 Gb/s | 100 Gb/s | 1000 Gb/s | high bulk throughput
Climate (Data & Computation) | 0.5 Gb/s | 160-200 Gb/s | N x 1000 Gb/s | high bulk throughput
NanoScience - Spallation Neutron Source | Not yet started | 1 Gb/s | 1000 Gb/s + QoS for control channel | remote control and time critical throughput
Fusion Energy | 0.066 Gb/s (500 MB/s burst) | 0.198 Gb/s (500MB/20 sec. burst) | N x 1000 Gb/s | time critical throughput
Astrophysics | 0.013 Gb/s (1 TBy/week) | N*N multicast | 1000 Gb/s | computational steering and collaborations
Genomics Data & Computation | 0.091 Gb/s (1 TBy/day) | 100s of users | 1000 Gb/s + QoS for control channel | high throughput and steering
Source: DOE Roadmap to 2008 Report
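The parenthetical data volumes in the table can be related to the quoted Gb/s figures with simple arithmetic. The sketch below is an illustration under stated assumptions, not the report's own method: it uses decimal units (1 TBy = 10^12 bytes, 1 MB = 10^6 bytes) and reads "500MB/20 sec. burst" as 500 MB moved every 20 seconds. The function name `sustained_gbps` is hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TERABYTE = 10**12  # bytes, decimal (assumption)
MEGABYTE = 10**6   # bytes, decimal (assumption)


def sustained_gbps(volume_bytes, period_seconds):
    """Average rate in Gb/s needed to move `volume_bytes` every `period_seconds`."""
    return volume_bytes * 8 / period_seconds / 1e9


examples = {
    "Astrophysics, 1 TBy/week": sustained_gbps(TERABYTE, 7 * SECONDS_PER_DAY),
    "Genomics, 1 TBy/day": sustained_gbps(TERABYTE, SECONDS_PER_DAY),
    "Fusion, 500 MB per 20 s": sustained_gbps(500 * MEGABYTE, 20),
}
for label, rate in examples.items():
    print(f"{label}: {rate:.3f} Gb/s")
```

The results, roughly 0.013, 0.093 and 0.200 Gb/s, are close to the table's 0.013, 0.091 and 0.198 Gb/s; the small differences may come from rounding or unit conventions in the source report.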
Technology of e-Science
Last updated: 27 April 2005
Abilene International Peering
NSF International Research Network Connections (IRNC)
US - Latin America Year 1 Topology
• LILA links reestablish direct connectivity to South America from east and west coasts
• Reduces delay reaching sites in Chile and Brazil from the US and Asia-Pacific
• Introduces an infrastructure to develop distributed international exchange points and peering fabrics
• Leverages network resources to provide route diversity and high-availability production services
European lambdas to US (red)
– 10Gb Amsterdam-Chicago
– 10Gb London-Chicago
– 10Gb Amsterdam-NYC
Canadian lambdas to US (white)
– 30Gb Chicago-Canada-NYC
– 30Gb Chicago-Canada-Seattle
US to Europe (grey)
– 10Gb Chicago-Amsterdam
Japan JGN II lambda to US (cyan)
– 10Gb Chicago-Tokyo
European lambdas (yellow)
– 10Gb Amsterdam-CERN
– 2.5Gb Prague-Amsterdam
– 2.5Gb Stockholm-Amsterdam
– 10Gb London-Amsterdam
IEEAF lambdas (blue)
– 10Gb NYC-Amsterdam
– 10Gb Seattle-Tokyo
CAVEWave/PacificWave (purple)
– 10Gb Chicago-Seattle
– 10Gb Seattle-LA-San Diego
TRANSLIGHT 2004 *Lambdas
[Map labels: NorthernLight; UKLight; CERN; Japan; SurfNet]
Global Lambda Integrated Facility
World Map – December 2004
Enabling Communities of Researchers and Communities of Interest