e-Science and the Grid – for Research and Industry
Tony Hey
Director of UK e-Science Core Programme
The e-Science Paradigm
• The Integrative Biology Project involves the University of Oxford (and others) in the UK and the University of Auckland in New Zealand
– Models of electrical behaviour of heart cells developed by Denis Noble’s team in Oxford
– Mechanical models of beating heart developed by Peter Hunter’s group in Auckland
• Researchers need to be able to easily build a secure ‘Virtual Organisation’ allowing access to each group’s resources
– Will enable researchers to do different science
The Grid = A set of core middleware services running on top of high performance global networks
RCUK e-Science Funding
First Phase: 2001–2004
• Application Projects
– £74M
– All areas of science and engineering
• Core Programme
– £15M Research infrastructure
– £20M Collaborative industrial projects
Second Phase: 2003–2006
• Application Projects
– £96M
– All areas of science and engineering
• Core Programme
– £16M Research Infrastructure
– £10M DTI Technology Fund
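As a quick arithmetic check on the figures above, the following minimal Python sketch totals each phase. The amounts are the ones listed on the slide; the dictionary layout and the names `funding` and `phase_totals` are illustrative assumptions, not part of the talk.

```python
# Amounts in £M, copied from the funding slide.
# The data layout and all identifiers are illustrative, not from the talk.
funding = {
    "First Phase (2001-2004)": {
        "Application Projects": 74,
        "Core Programme: research infrastructure": 15,
        "Core Programme: collaborative industrial projects": 20,
    },
    "Second Phase (2003-2006)": {
        "Application Projects": 96,
        "Core Programme: research infrastructure": 16,
        "Core Programme: DTI Technology Fund": 10,
    },
}


def phase_totals(table):
    """Sum the listed line items for each phase."""
    return {phase: sum(items.values()) for phase, items in table.items()}


totals = phase_totals(funding)
for phase, total in totals.items():
    print(f"{phase}: £{total}M")
```

The totals cover only the line items shown on the slide.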
UK Focus on Data and Security
• Data Access and Integration
– OGSA-DAI and DAIT project with IBM
• Key grid data services
– Workflow, Provenance
– Distributed Query, Knowledge Management
• Data Curation and Data Handling
– Digital Curation Centre with JISC
• Security, AA and all that
– e-Science CA, GSI and WS-Security
– Shibboleth/PERMIS deployment with Internet2
Comb-e-Chem Project
[Diagram: X-Ray e-Lab; Diffractometer; Analysis; Properties; Properties e-Lab; Simulation; Video; Structures Database; all connected through Grid Middleware]
myGrid Project
• Imminent ‘deluge’ of data
• Highly heterogeneous
• Highly complex and inter-related
• Convergence of data and literature archives
Nucleotide Annotation Workflows
Discovery Net Project
[Diagram: Download sequence from Reference Server; Execute distributed annotation workflow; Save to Distributed Annotation Server; Interactive Editor & Visualisation; data sources: NCBI, EMBL, TIGR, SNP, InterPro, SMART, SWISSPROT, GO, KEGG]
• 1800 clicks, 500 Web accesses, 200 copy/pastes
• 3 weeks’ work in 1 workflow and a few seconds’ execution
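To make the idea of collapsing many manual steps into one workflow concrete, here is a minimal Python sketch of the three stages named on the slide: download a sequence, run the annotation lookups, save the result. This is not the Discovery Net implementation. The database names come from the slide, but every function, the accession string and the in-memory store are hypothetical stand-ins, and no real service is contacted.

```python
from concurrent.futures import ThreadPoolExecutor

# Data source names as listed on the slide; everything else is hypothetical.
ANNOTATION_SOURCES = [
    "NCBI", "EMBL", "TIGR", "SNP", "InterPro",
    "SMART", "SWISSPROT", "GO", "KEGG",
]


def download_sequence(accession):
    """Stand-in for 'Download sequence from Reference Server'."""
    return {"accession": accession, "sequence": "ATGC"}  # placeholder data


def annotate(record, source):
    """Stand-in for one remote annotation lookup; returns a placeholder string."""
    return source, f"annotation of {record['accession']} from {source}"


def run_workflow(accession, store):
    """Run download, parallel annotation and save as a single call."""
    record = download_sequence(accession)
    with ThreadPoolExecutor() as pool:
        # One lookup per source, executed concurrently.
        results = dict(pool.map(lambda s: annotate(record, s), ANNOTATION_SOURCES))
    # Stand-in for 'Save to Distributed Annotation Server'.
    store[accession] = results
    return results


annotation_store = {}
result = run_workflow("EXAMPLE0001", annotation_store)
print(len(result), "annotations saved for EXAMPLE0001")
```

The point of the sketch is the shape: once each manual step is wrapped as a callable service, the whole sequence becomes one repeatable invocation.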
DAME Project
[Diagram: In-flight data; Airline; Maintenance Centre; Ground Station; Global Network (e.g. SITA); Internet, e-mail, pager; DS&S Engine Health Center; Data centre]
eDiaMoND Project
Mammograms have different appearances, depending on image settings and acquisition systems
[Diagram: Standard Mammo Format; Temporal mammography; Computer Aided Detection; 3D View]
The UK e-Science Experience: Phase 1
• All Research Council e-Science funds committed
– e-Science pilots launched covering many areas of science, engineering and medicine
• UK e-Science Core Programme
– DTI £20M for collaborative industrial R&D
– About 80 UK companies participating
– Over £30M industrial contributions
• Engineering, Pharmaceutical, Petrochemical
• IT companies, Commerce, Media
UK e-Science: Phase 2
Three major new activities:
1. Deploy National Grid Service and establish Grid Operation Support Centre
2. Fund Open Middleware Infrastructure Institute for testing, software engineering and repository for UK middleware
3. Set up Digital Curation Centre for R&D into long-term data preservation issues
UK National Grid Service
• From April 2004, NGS offers free access to two 128 processor compute nodes and two data nodes
• Initial software is based on GT2 via VDT and LCG releases plus SRB and OGSA-DAI
• Plan to move to Web Services based Grid middleware by April 2005
• Need for resource allocation mechanisms
– Accounting, Performance Prediction
The Web Services ‘Magic Bullet’
[Diagram: Company A (J2EE), Company B (LAMP) and Company C (.Net) connected through Web services]
Open Grid Services Architecture
• Development of Web Services
• OGSA/WSRF/… will provide Naming / Authorization / Security / Privacy /…
– Projects should look at higher level services: Workflow, Transactions, Data Mining, Knowledge Discovery…
– Exploit Synergy: Commercial Internet with Grid Services
The UK Open Middleware Infrastructure Institute (OMII)
• Repository for UK-developed Open Source ‘e-Science/Cyber-infrastructure’ Middleware
• Documentation, specification, QA and standards
• Fund work to bring ‘research project’ software up to ‘production strength’
• Fund Middleware projects for identified ‘gaps’
• Work with US NSF, EU Projects and others
• Supported by major IT companies
Southampton selected as the OMII site
Digital Curation Centre (DCC)
• In next 5 years e-Science projects will produce more scientific data than has been collected in the whole of human history
• In 20 years can guarantee that the operating system and spreadsheet program and the hardware used to store data will not exist
– Research curation technologies and best practice
– Need to liaise closely with individual research communities, data archives and libraries
Edinburgh with Glasgow, CLRC and UKOLN selected as site of DCC
MIT DSpace Vision
‘As more and more research and educational material is ‘born digital’, institutions and organizations are increasingly realizing the need for a stable place in which such material may be stored and accessed long-term. The Massachusetts Institute of Technology is a perfect example of an organization with this need. Much of the material produced by faculty, such as datasets, experimental results and rich media data as well as more conventional document-based material (e.g. articles and reports) is housed on an individual’s hard drive or department Web server. Such material is often lost forever as faculty and departments change over time.’
Three Industry Perspectives
• An SAP view
• BAESystems and Virtual Organisations
• The Burger Model from T-Systems
Naturally Distributed Processing
[Diagram: ERP System; Buyer; Vendor; WM; TRM; Event Management; AII; steps shown: Purchase Order, Cust. Order, Pick or Produce, Create HU, Build HU, Associate Items / Pallet / Tags, Register ID of Pallet, Scan IDs, Issue Goods (Loading), Post Goods Issue, Adv. Ship Notification, Create Event Handler, Delivery]
BAEgrid – deployment of virtual organisations
[Diagram: BAE sites R, B, F, G and W; Platform; Southampton e-Science; Singapore iHPC; Cardiff e-Science; Swansea U.; Manchester e-Science; HP Labs]
• VO needs better definition to support asymmetric operation.
• VO lifecycle tools are required.
– Identification
– Formation
– Operation
– Dissolution
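The four stages above can be read as a small state machine, which is roughly what a lifecycle tool would have to track. The Python sketch below assumes a strictly linear progression, with members joining only during Formation. These rules, and all class, method and site names, are illustrative assumptions rather than anything specified by BAE Systems.

```python
from enum import Enum


class VOStage(Enum):
    # Stage names taken from the slide.
    IDENTIFICATION = 1
    FORMATION = 2
    OPERATION = 3
    DISSOLUTION = 4


# Assumption: stages are visited strictly in order, with no way back.
NEXT_STAGE = {
    VOStage.IDENTIFICATION: VOStage.FORMATION,
    VOStage.FORMATION: VOStage.OPERATION,
    VOStage.OPERATION: VOStage.DISSOLUTION,
}


class VirtualOrganisation:
    """Hypothetical lifecycle tracker for one virtual organisation."""

    def __init__(self, name):
        self.name = name
        self.stage = VOStage.IDENTIFICATION
        self.members = set()

    def add_member(self, site):
        # Assumption: membership is only negotiated during Formation.
        if self.stage is not VOStage.FORMATION:
            raise RuntimeError(f"cannot add members during {self.stage.name}")
        self.members.add(site)

    def advance(self):
        """Move to the next stage; dissolving releases all members."""
        if self.stage not in NEXT_STAGE:
            raise RuntimeError("VO is already dissolved")
        self.stage = NEXT_STAGE[self.stage]
        if self.stage is VOStage.DISSOLUTION:
            self.members.clear()
        return self.stage


vo = VirtualOrganisation("example-vo")
vo.advance()                 # Identification -> Formation
vo.add_member("site-a")
vo.add_member("site-b")
vo.advance()                 # Formation -> Operation
members_in_operation = set(vo.members)
vo.advance()                 # Operation -> Dissolution
print(vo.stage.name, sorted(members_in_operation))
```

A real tool would also need to address the asymmetric operation mentioned above, for example partners with different access rights, which this sketch does not model.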
T-Systems Burger Model
Corporate Computing
Information Glue
Pervasive Computing
The Commoditization of Middleware
• Microsoft and IBM have agreed on the Web Services ‘open standard’ approach to interoperable low level distributed middleware
• Providing high value-added services and products based on this secure, robust, common open standard middleware infrastructure will be central to the new economy.
• The existence of ‘open source’ implementations of this ‘open standard’ middleware will enable new SMEs to compete with traditional packaged software vendors
Realizing Licklider’s Vision
“Lick had this concept of the intergalactic network which he believed was everybody could use computers anywhere and get at data anywhere in the world. He didn’t envision the number of computers we have today by any means, but he had the same concept – all of the stuff linked together throughout the world, that you can use a remote computer, get data from a remote computer, or use lots of computers in your job. The vision was really Lick’s originally.”
Larry Roberts – Principal Architect of the ARPANET
e-Government and the Grid
‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’
Tony Blair, 2002