
Page 1: New e-Science and Cyberinfrastructure: A Middleware Perspective Hey.pdf · 2019. 12. 6. · A Middleware Perspective Tony Hey Corporate VP for Technical Computing Microsoft Corporation

JPGRID-GW06

e-Science and Cyberinfrastructure:
A Middleware Perspective

Tony Hey
Corporate VP for Technical Computing
Microsoft Corporation

The e-Science Vision

e-Science is about multidisciplinary science and the technologies to support such distributed, collaborative scientific research

Many areas of science are now being overwhelmed by a ‘data deluge’ from new high-throughput devices, sensor networks, satellite surveys …

Areas such as bioinformatics, genomics, drug design, engineering and healthcare require collaboration between different domain experts

‘e-Science’ is a shorthand for a set of technologies to support collaborative networked science


http://www.neptune.washington.edu/

Undersea Sensor Network

Connected & Controllable

Over the Internet


Visual Programming

Persistent Distributed Storage

Distributed Computation

Interoperability & Legacy Support via Web Services


Live Documents

Searching & Visualization

Reputation & Influence

Cyberinfrastructure

Cyberinfrastructure and e-Infrastructure

In the US, Europe and Asia there is a common vision for the ‘cyberinfrastructure’ required to support the e-Science revolution
  Set of Middleware Services supported on top of high bandwidth academic research networks
  Software, hardware and organizations to support e-Science

Similar to vision of the Grid as a set of services that allows scientists – and industry – to routinely set up ‘Virtual Organizations’ for their research – or business

The ‘Microsoft Grid’ vision is as much about integrating and managing data and information as about compute cycles


Technical Computing at Microsoft

Advanced Computing for Science and Engineering
  Application of new algorithms, tools and technologies to scientific and engineering problems

High Performance Computing
  Application of high performance clusters and database technologies to industrial and scientific applications

Radical Computing
  Research in potential breakthrough technologies

Partnering with Japanese Academia in Computer Science Research

[Diagram labels: MS IJARC; Microsoft Japan; Microsoft Research; Microsoft Corporation]

CORE project, Prof. Igarashi: User Interface (Gestural control for Appliance)

CORE project, Prof. Tsujii: Natural Language (Data Mining with NLP technology)

CORE project, Prof. Aizawa: Search Technology (Search for graphical data)

Human/Computer Interaction; Cutting Edge Software Technology

[Topic labels: NLP, Speech, Gaming, CG, Search Technology, UI, Pen-Ink, HPC, Windows, Core, Embedded, TwC, Security]


Fighting HIV with Computer Science
Nebojsa Jojic and David Heckerman

A major problem: Over 40 million infected
  Drug treatments are effective but are an expensive life commitment

Vaccine needed for third world countries
  Effective vaccine could eradicate disease

Methods from computer science are helping with the design of vaccine
  Machine learning: Finding biological patterns that may stimulate the immune system to fight the HIV virus
  Optimization methods: Compressing these patterns into a small, effective vaccine

HIV: The diabolical virus

The train-and-kill mechanism doesn’t work for HIV – the virus adapts through rapid mutation. As soon as the killer cells get the upper hand, the epitopes start changing.

Strategy:
  Find peptides or epitopes that occur commonly across a *population* of HIV viruses
  Compact the known or potential immune targets into a small vaccine
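
The two-step strategy (find targets that are common across a population, then pack them into something small) can be illustrated with a greedy coverage heuristic. This is only a sketch of the general idea and is not the method used by the authors named on the slide; the strain names, the epitope labels E1..E3 and the function pick_epitopes are all invented for illustration.

```python
def pick_epitopes(strains, budget):
    """Greedily choose up to `budget` epitopes that cover the most strains.

    `strains` maps a strain name to the set of epitopes it carries.
    A strain counts as covered once any chosen epitope occurs in it.
    """
    uncovered = set(strains)
    chosen = []
    candidates = sorted(set().union(*strains.values()))
    while uncovered and len(chosen) < budget:
        # Pick the epitope present in the largest number of uncovered strains.
        best = max(candidates,
                   key=lambda e: sum(e in strains[s] for s in uncovered))
        gained = {s for s in uncovered if best in strains[s]}
        if not gained:
            break
        chosen.append(best)
        uncovered -= gained
    return chosen, uncovered


# Hypothetical toy data: four strains and the epitopes each one carries.
toy_strains = {
    "strain_a": {"E1", "E2"},
    "strain_b": {"E1", "E3"},
    "strain_c": {"E2", "E3"},
    "strain_d": {"E1"},
}
vaccine, missed = pick_epitopes(toy_strains, budget=2)
print(vaccine, missed)
```

With a budget of two, the sketch first takes the epitope shared by three strains and then one that covers the remaining strain. A greedy choice is simple but not guaranteed to be optimal.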


Set of Computational Tools

Chromatogram deconvolution
Pathway analysis/association/causal models
Clustering/Trees (phylo, haplotypes etc.)
Protein binding and folding
Sequence diversity models (epitomes)
Image analysis/classification
Evolution modeling and inference
Epitope prediction

International Virtual Observatory

Data has no commercial value
  No privacy concerns
  Can freely share results with others
  Great for experimenting with algorithms

Data is real and well documented
  High-dimensional data
  Spatial data
  Temporal data

Data from many different instruments, places and times

Federation is a key goal
There is a lot of data (petabytes)

With thanks to Jim Gray

IRAS 100µ

ROSAT ~keV

DSS Optical

2MASS 2µ

IRAS 25µ

NVSS 20cm

WENSS 92cm

GB 6cm


The Multiwavelength Crab Nebulae

X-ray, optical, infrared, and radio views of the nearby Crab Nebula, which is now in a state of chaotic expansion after a supernova explosion first sighted in 1054 A.D. by Chinese Astronomers.

Slide courtesy of Robert Brunner @ CalTech.

Crab star 1053 AD

SkyServer (http://cas.sdss.org)

A modern archive
  Access to Sloan Digital Sky Survey
  Spectroscopic and Optical surveys
  Raw Pixel data lives in file servers
  Catalog data (derived objects) lives in Database
  Online query to any and all

Interesting things
  Spatial data search
  Query interface via Java Applet
  Query from Emacs, Python, ….
  Template design cloned by other surveys
  Web Services are core of it


SkyQuery (http://skyquery.net/)

Distributed Query tool using a set of Web Services
Federates many astronomy archives from Pasadena, Chicago, Baltimore, Cambridge UK
Grown from 4 to 15 archives, becoming international standard
WebService ‘Poster Child’
Allows queries like:

    SELECT o.objId, o.r, o.type, t.objId
    FROM SDSS:PhotoPrimary o,
         TWOMASS:PhotoPrimary t
    WHERE XMATCH(o,t)<3.5
      AND AREA(181.3,-0.76,6.5)
      AND o.type=3 and (o.I - t.m_j)>2
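
To give a feel for what a cross-archive match involves, here is a minimal local sketch that pairs objects from two small catalogues by position. It does not reproduce SkyQuery's XMATCH operator: it substitutes a plain angular-distance cut, and treating the threshold as arcseconds is an assumption. The function names and the sample rows are hypothetical.

```python
import math


def angular_sep_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation of two sky positions (degrees in, arcseconds out)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations.
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600


def cross_match(cat_a, cat_b, max_sep_arcsec):
    """Pair each object in cat_a with every cat_b object within the threshold."""
    return [(a["objId"], b["objId"])
            for a in cat_a for b in cat_b
            if angular_sep_arcsec(a["ra"], a["dec"], b["ra"], b["dec"]) <= max_sep_arcsec]


# Hypothetical rows standing in for the two archives named in the query.
sdss = [{"objId": 1, "ra": 181.3000, "dec": -0.7600},
        {"objId": 2, "ra": 181.3500, "dec": -0.7000}]
twomass = [{"objId": 10, "ra": 181.3003, "dec": -0.7601},
           {"objId": 11, "ra": 181.4000, "dec": -0.9000}]

pairs = cross_match(sdss, twomass, max_sep_arcsec=3.5)
print(pairs)
```

The nested loop is quadratic in catalogue size, which is fine for a toy example; a real federation would push the spatial search into each archive's database.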

IVO: An Astronomy Data Grid

Working to build world-wide telescope
  All astronomy data and literature online and cross indexed
  Tools to analyze it

Built SkyServer.SDSS.org
Built Analysis system
  MyDB
  CasJobs (batch job)

OpenSkyQuery
  Federation of ~20 observatories.

Results:
  It works and is used every day
  Spatial extensions in SQL 2005
  A good example of Data Grid
  A good example of Web Services


HPC: Top 500 Trends

Industry usage rising

Clusters over 50%

x86 is winning

GigE is gaining

HPC: Market Trends

Segment                  Price band    2004 Systems
Capability, Enterprise   $1M+                 1,167
Divisional               $250K-$1M            3,915
Departmental             $50-250K            22,712
Workgroup                <$50K              127,802

Source: IDC, 2005

<$250K – 97% of systems, 52% of revenue
In 2004 clusters grew 96% to 37% by revenue

Average cluster size 10-16 nodes
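
The "97% of systems" figure can be checked against the slide's 2004 system counts. The short calculation below assumes the four numbers line up with the four price bands in the order they appear (largest systems first); that pairing is inferred from the layout and is not stated explicitly in the source.

```python
# 2004 system counts per price band, read off the slide in the order shown.
# Assumption: the four numbers map onto the four bands top to bottom.
systems_2004 = {
    "Capability/Enterprise ($1M+)": 1_167,
    "Divisional ($250K-$1M)": 3_915,
    "Departmental ($50K-$250K)": 22_712,
    "Workgroup (<$50K)": 127_802,
}

total = sum(systems_2004.values())
under_250k = (systems_2004["Departmental ($50K-$250K)"]
              + systems_2004["Workgroup (<$50K)"])
share = under_250k / total
print(f"{under_250k:,} of {total:,} systems = {share:.1%}")
```

Under that pairing, the two lowest bands account for 150,514 of 155,596 systems, which rounds to the 97% quoted on the slide.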


Continuing Trend Towards Decentralized, Networked Resources

Grids of personal & departmental clusters

Personal workstations & departmental servers

Minicomputers

Mainframes

Microsoft Strategy for HPC

Reduce barriers to adoption for HPC clusters
  Easy to deploy and use
  Easy to manage and own

Provide application support in key HPC verticals
  Engagement with the top HPC ISVs
  Enabling Open Source applications via University relationships

Leverage a breadth of standard knowledge-management tools
  Web Services, SQL, SharePoint, InfoPath, Excel

Focused Approach to Market
  Enable broad HPC adoption and make HPC into a high volume market


Tokyo Institute of Technology
Tokyo, Japan

Center of Innovation
Tokyo Institute of Technology

• Primary Focus
  Integrating Windows CCE into a large heterogeneous computing environment

• Current System
  NEC/Intel Xeon Server x 65 nodes, 130 CPUs (64 Computing Nodes, 1 Head Node)

Goals:
1) Build credibility in HPC community through relationships with HPC leaders world-wide
2) Secure high quality sources of product feedback and advanced HPC research

Windows HPC Consortium

Broad engagement with public corporations and organizations

Share output to the public

Doshisha University (同志社大学)


Radical Computing

The end of Moore’s Law as we know it
  Number of transistors on a chip will continue to increase
  No significant increase in clock speed

Future of silicon chips
  “100’s of cores on a chip in 2015” (Justin Rattner, Intel)
  “4 cores”/Tflop => 25 Tflops/chip

Challenge for IT industry
  Can we make parallel computing on a chip easier than message-passing?

Service-Orientation for building Distributed Systems

[Diagram labels: Service (six boxes); Administrative domain; network; boundaries; messages]


The Service Revolution

Web 2.0
  Social networks, tagging for sharing e.g. Flickr, Del.icio.us, MySpace, …
  Wikis, Blogs, RSS …

Software delivered as a service
  Live services
    Microsoft Office Live
    XboxLive
    AcademicLive

Mashups
  Craigslist + GoogleMap
  http://mashupcamp.com

Combine services to give added value

e-Science Mashups?


Scenario for an e-Science Mashup

Combine datasets to create new dataset
Document results – data and article
Deposit in repository

Added value is in data synthesis and analysis
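
The three steps of the scenario (combine, document, deposit) can be sketched in a few lines. Everything here is illustrative: the two input datasets, their column names, the in-memory dictionary standing in for a repository, and the functions mashup and deposit are assumptions rather than anything described in the talk.

```python
from datetime import date


def mashup(left, right, key):
    """Join two lists of records on a shared key to form a new dataset."""
    index = {row[key]: row for row in right}
    return [{**row, **index[row[key]]} for row in left if row[key] in index]


def deposit(repository, name, dataset, sources, article):
    """Store the derived dataset together with its documentation."""
    repository[name] = {
        "data": dataset,
        "sources": sources,    # provenance: which inputs were combined
        "article": article,    # pointer to the write-up of the result
        "deposited": date.today().isoformat(),
    }
    return repository[name]


# Hypothetical inputs: two datasets that share an "id" column.
observations = [{"id": 1, "temp_c": 4.2}, {"id": 2, "temp_c": 3.9}, {"id": 3, "temp_c": 5.1}]
locations = [{"id": 1, "depth_m": 1500}, {"id": 3, "depth_m": 2200}]

repo = {}
combined = mashup(observations, locations, key="id")
record = deposit(repo, "temp-by-depth", combined,
                 sources=["observations", "locations"], article="draft-001")
print(combined)
```

Keeping the source list and the article pointer next to the data is one simple way to make the deposited result traceable back to its inputs.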

The Web Services ‘Magic Bullet’

Company A (J2EE)

Open Source (OMII)

Company C (.Net)

Web Services


Convergence in Web Services Systems Management

Different approaches lead to confusion and uncertainty
  WS-DM and WS-Management
  WS-RF and WS-Transfer
  WS-Notification and WS-Eventing

Microsoft, IBM, HP, and Intel agreed to a convergence roadmap
  No specific timeline yet announced

Possible New Web Services?

New Roadmap proposes:
  WS-ResourceTransfer
  WS-EventingNotification
  WS-? For Systems Management

Problem:
  It will be some time before these proposals are stable and widely adopted with multiple implementations and robust tooling available


Web Services and the Grid

A Complicated Story:

Basic Web Service specifications
  WS-I (SOAP, WSDL) from 2001 onwards

Web Service Grids
  G-WSDL and OGSI (2001 – 2003)
  WS-RF, WS-N and WS-DM (2004 - ?)

Lesson:
  Build Web Service Grids incrementally only on stable, mature and widely-accepted WS foundations

Web Service Grids: An Evolutionary Approach

WS-I

stable profile

Standards that have broad industry support and multiple interoperable implementations

Specifications that are emerging from standardisation process and are recognised as being ‘useful’

Specifications that have/will enter a standardisation process but are not stable and are still experimental


Grids for Virtual Organizations

Virtual Organizations

Application domain-specific services

Web Services technologies

Security

HPC

Data

Workflow


Premise: GGF/EGA will soon be able to deliver some specifications for Web Service Grids

By focusing on simple Grid services built on accepted Web Services we can reach agreement quickly

Look at three key areas for Grids for Virtual Organizations
  Security
  HPC Services
  Data Services

Virtual Organization Security

Not yet routine and seamless: many technologies and standards exist in the security space
Interoperability only works if proposed solutions are widely accepted by both industry and academia
Larger problem than just for the GGF community
IT industry will provide high quality, well documented tooling and services to construct secure Virtual Organizations


Security Challenges (1)

Security decisions about multiple principals
  Users, groups, machines, clusters …

Fine-grained trust
  Loosely coupled domains, dynamic relationships

Simple and scalable authentication
  Seamless cross-domain authentication
  Flexible revocation approaches

Uniform and flexible authorization
  Uniform inter- and intra-domain access control
  Distributed/hierarchical policy

Seamless communications security
  Efficient discovery/negotiation of requirements

Security Challenges (2)

Automatic and safe code deployment
  Code identity and policy-controlled actions
  Securely deliver code and provisioning information

Distributed resource management
  Policy controlled resource disclosure
  Authorization for job scheduling, monitoring, cancellations, …

Constrained delegation
  Delegated access rights and authorization

Uniform auditing approach to support forensics
  Integrated with authorization policy


Today: Some Partial Solutions

Many relevant technologies around
  Name/Password, Kerberos, X.509 PKI, SAML
  Permis, XACML
  X.509 CA, MyProxy, Shibboleth CA, VOMS
  WS-Security, SAML, HTTPS

Present solutions work with limitations
  Complex to build, deploy, maintain and manage
  Multiple security weaknesses, unaddressed needs
  Interoperability is difficult to achieve and maintain

Challenge for the Grid community
  All the pieces are there for us to provide industrial strength solutions for building Grids

The OGSA HPC Profile

Defines a minimalist base interface plus optional extensions
  Small base interface enables simple interoperability widely and quickly
  Common use cases covered by extensions
  Extension model enables principled experimentation and evolution

Defines minimal set of composable, extensible services
  Job Submission
  Data Staging
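
The "small base interface plus optional extensions" idea can be sketched as follows. This is not the profile itself: the class names, the method names, the "Pending" state and the "data-staging" extension label are all hypothetical, chosen only to show how a client can rely on the base operations and use an extension when a service advertises it.

```python
from abc import ABC, abstractmethod
import itertools


class BaseJobService(ABC):
    """Minimal base interface: the only operations every service must offer."""

    @abstractmethod
    def submit(self, command): ...

    @abstractmethod
    def status(self, job_id): ...

    def extensions(self):
        """Names of optional extensions this service advertises."""
        return set()


class DataStagingExtension:
    """Optional extension: declare files to stage in before a job runs."""

    def stage_in(self, job_id, files):
        self.staged.setdefault(job_id, []).extend(files)


class SimpleService(BaseJobService):
    """Implements only the base interface."""

    def __init__(self):
        self.jobs = {}
        self._ids = itertools.count(1)

    def submit(self, command):
        job_id = next(self._ids)
        self.jobs[job_id] = {"command": command, "state": "Pending"}
        return job_id

    def status(self, job_id):
        return self.jobs[job_id]["state"]


class StagingService(SimpleService, DataStagingExtension):
    """Base interface plus the data-staging extension."""

    def __init__(self):
        super().__init__()
        self.staged = {}

    def extensions(self):
        return {"data-staging"}


def run(service, command, files=()):
    """Client that relies on the base interface and uses extensions if offered."""
    job_id = service.submit(command)
    if files and "data-staging" in service.extensions():
        service.stage_in(job_id, list(files))
    return job_id, service.status(job_id)


print(run(SimpleService(), "simulate", files=["in.dat"]))
print(run(StagingService(), "simulate", files=["in.dat"]))
```

The same client code works against both services, which is the interoperability argument made on the slide: agreement is needed only on the base operations, while extensions can evolve separately.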


Builds on GGF specifications

Leverages existing work
  Profile over JSDL, BES, etc.
  Likely to need extensions/restrictions
  May need new protocols that are the equivalent of BES for resource reservation, provisioning, etc.

Uses only stable, widely-accepted WS specifications in the design
  e.g. SOAP, WSDL, WS-Security, etc.
  Independent of WS-‘Systems Management’ reconciliation

An OGSA Data Profile?

Guiding principles:

Keep profile as simple as possible
  Example of Amazon S3

DAIS Working Group specifications
  WS-DAI
  WS-DAIR and WS-DAIX

Build on only widely accepted Web Services
  WS-I + ….


S3: Simple Storage Service

S3 is storage for the Internet
  Designed to make web-scale computing easier for developers

Provides a simple Web Services interface to store and retrieve any amount of data from anywhere on the Web

‘CRUD’ philosophy – Create, Read, Update and Delete operations

Amazon S3 Functionality (1)

Intentionally built with a minimal feature set
  Write, read, and delete objects containing from 1 byte to 5 gigabytes of data each

Can store unlimited number of objects
  Each object is stored and retrieved via a unique, developer-assigned key

Authentication mechanisms provided
  Objects can be made private or public, and rights can be granted to specific users
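
The minimal feature set described above (key-addressed objects, a size bound, private or public visibility, per-user grants) is small enough to mock in memory. The class below is only a stand-in written for illustration; it is not the Amazon S3 API, and the method names, the reading of "gigabyte" as 2**30 bytes, and the example key are assumptions.

```python
class ObjectStore:
    """In-memory stand-in for a key-addressed object store (not the S3 API)."""

    # 5 gigabytes per object, as stated on the slide.
    # Assumption: "gigabyte" is read as 2**30 bytes here.
    MAX_BYTES = 5 * 1024 ** 3

    def __init__(self):
        self._objects = {}

    def write(self, key, data, public=False):
        """Create or overwrite the object stored under a developer-chosen key."""
        if not 1 <= len(data) <= self.MAX_BYTES:
            raise ValueError("object must hold between 1 byte and 5 GB")
        self._objects[key] = {"data": bytes(data), "public": public, "readers": set()}

    def grant(self, key, user):
        """Give one named user the right to read a private object."""
        self._objects[key]["readers"].add(user)

    def read(self, key, user=None):
        obj = self._objects[key]
        if not (obj["public"] or user in obj["readers"]):
            raise PermissionError(f"{user!r} may not read {key!r}")
        return obj["data"]

    def delete(self, key):
        del self._objects[key]


store = ObjectStore()
store.write("results/run-1.csv", b"t,value\n0,1.5\n")      # create
store.grant("results/run-1.csv", "alice")
print(store.read("results/run-1.csv", user="alice"))       # read
store.write("results/run-1.csv", b"t,value\n0,1.7\n")      # update = overwrite
store.delete("results/run-1.csv")                          # delete
```

Note that overwriting an object in this sketch resets its grants, a simplification that a real service would not necessarily share.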


Amazon S3 Functionality (2)

Uses simple standards-based REST and SOAP Web Service interfaces

Built to be flexible so that protocol or functional layers can easily be added

Default download protocol is HTTP
  BitTorrent protocol interface is provided to lower costs for high-scale distribution

Add additional interfaces later
  Model of incremental development

WS-DAI Specifications

WS-DAI: Sets general pattern for DAIS realisations

Extensions for specific kinds of data resource
  WS-DAIR: Relational, SQL
  WS-DAIX: XML, XQuery/XPath
  Possible Future Realisations


WS-DAI Specifications: Design Features (1)

Independent of WS Management debate

Uses existing data access languages:
  SQL, XQuery, …

Functionality: classified as:
  Data description: properties that characterise the behaviour provided
  Data access: request / response access to a data resource
  Data factory: service manages response to a request
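
The three functional categories can be contrasted in a toy service over an in-memory SQLite database. This is a sketch of the distinction only: the class, its method names and the property names returned by describe are hypothetical and are not taken from the WS-DAI specifications.

```python
import sqlite3


class RelationalDataService:
    """Toy service showing description, access and factory roles side by side."""

    def __init__(self, connection):
        self._db = connection
        self._responses = {}

    def describe(self):
        """Data description: properties a client can inspect before querying."""
        return {"language": "SQL", "readable": True, "writeable": False}

    def execute(self, query, params=()):
        """Data access: direct request/response, rows come back in the reply."""
        return self._db.execute(query, params).fetchall()

    def execute_factory(self, query, params=()):
        """Data factory: keep the result on the service side and return a handle."""
        handle = f"response-{len(self._responses) + 1}"
        self._responses[handle] = self.execute(query, params)
        return handle

    def fetch(self, handle, start=0, count=None):
        """Read part of a stored response through its handle."""
        rows = self._responses[handle]
        return rows[start:] if count is None else rows[start:start + count]


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE obs (id INTEGER, mag REAL)")
db.executemany("INSERT INTO obs VALUES (?, ?)", [(1, 14.2), (2, 15.9), (3, 13.1)])

service = RelationalDataService(db)
print(service.describe())
print(service.execute("SELECT id FROM obs WHERE mag < ? ORDER BY id", (15,)))
handle = service.execute_factory("SELECT id, mag FROM obs ORDER BY mag")
print(service.fetch(handle, count=2))
```

The factory path matters when results are large: the client receives a small handle and can pull the rows in pieces, or pass the handle to another party.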

WS-DAI Specifications: Design Features (2)

Extensibility points:
  Different data models and languages by extending WS-DAI in new specifications
  Different response formats, by advertising and selecting supported representations
  Different delivery mechanisms, by implementing data movement interfaces on response access services

Designed to be both usable alone and to combine effectively with other standards

Key Data Grid Services?

S3 good for unstructured data
  Can do search but not structured or relational queries
Technical Computing needs more functionality
  Files
  Structured Data
  Data Transfer
  Data Query exploiting Metadata
  Federation
  Replication
  …

Towards an OGSA Data Profile?

Files
  S3 and/or SRB-like services?
  WebDAV and/or ByteIO for file manipulation?
Structured Data: DAIS WG
  WS-DAIR for RDBMSs?
  WS-DAIX for XML Databases?
Data Transfer
  Fast secure data transfer
  GridFTP, DMIS?
Data Query exploiting Metadata
  SRB-like Metadata services?
Federation
  SkyQuery, DQP …

Roles of Industry and the Research Community?

New technologies and proposed specifications need to be investigated and evaluated
  Vital role for the research community
  After experimentation, extra functionality can be added to basic OGSA services
When services have gained wide acceptance and demonstrable interoperability, industry will provide high quality tooling for new Web and Grid Services

Summary

The GGF/EGA merger gives great opportunity for the new organization to launch a small set of basic OGSA services
Important to harness the power of the world-wide Grid community to develop open source reference implementations that help users build Grids
Grid research community needs to propose and explore new features in real experiments
By taking small steps at a time we reassure industry about progress in Grid standards and grow the market for all

Microsoft and Scientific Computing

Recognize reality of heterogeneous research infrastructure and open source software
Interoperability and open standards using Web Services, OpenXML and so on
Work with the Grid community to develop interoperable Grid Services
Begin by building Web Service Grids on widely accepted Web Services
Great opportunities for GGF/EGA and the Grid community

Acknowledgements

With special thanks to Malcolm Atkinson, Neil Chue Hong, Geoffrey Fox, Jim Gray, Marty Humphrey, Steven Newhouse, Stuart Ozer, Savas Parastatidis, Norman Paton and Paul Watson