100503 bioinfo instsymp

The Impact of BeSTGRID Developing NZ research infrastructure www.bestgrid.org twitter: @bestgrid Nick Jones Director, BeSTGRID Co-Director eResearch, Centre for eResearch University of Auckland [email protected]

Upload: nick-jones

Post on 11-May-2015

DESCRIPTION

Presentation on the work we've done within BeSTGRID as it relates to bioinformatics in NZ, for the 2010 Bioinformatics Symposium https://www.bestgrid.org/NZ-Bioinformatics-Symposium-2010

TRANSCRIPT

Page 1: 100503 bioinfo instsymp

The Impact of BeSTGRID
Developing NZ research infrastructure

www.bestgrid.org
twitter: @bestgrid

Nick Jones
Director, BeSTGRID
Co-Director eResearch, Centre for eResearch
University of Auckland
[email protected]

Page 2: 100503 bioinfo instsymp

“BeSTGRID aims to enhance e-Research capability in New Zealand by providing the skill base to help the various research disciplines engage with new eResearch services”

“BeSTGRID aims to kick start centralised infrastructure with some capital investment at key institutions”

BeSTGRID: since 2006

Page 3: 100503 bioinfo instsymp

KAREN

Page 4: 100503 bioinfo instsymp

Current
Planned

Identity provider

Virtual Applications

Computation Cluster

Storage

Page 5: 100503 bioinfo instsymp

People

The University of Auckland
• Stephen Cope
• Mark Gahegan
• Yuriy Halytskyy
• Nick Jones
• Andrey Kharuk

Auckland University Technology
• Slava Kitaev
• Gene Soudlenkov

Canterbury University
• Tim David
• Vladimir Mencl

Industrial Research Limited
• John Burnell
• Peter McGavin

Landcare Research
• Robert Gibb
• Aaron Hicks
• Niels Hoffmann
• David Medyckyj-Scott

Lincoln University
• Stuart Charters

Massey University
• Martin Johnson
• Guy Kloss

Otago University
• Mik Black
• Marcus Davy
• Matthew Grant

Victoria University Wellington
• Kevin Buckley
• John Hine

Waikato University
• Joseph Lane

Page 6: 100503 bioinfo instsymp

Successes: within NZ

• National grid infrastructure, primarily focused on Bioscience and Geoscience
– applications & services, data storage and sharing, computation, along with many others

• Established the foundations of a shared research infrastructure service delivery model (governance, management, services) across research sector institutions (Universities, CRIs)

• Piloted first federated identity management service in NZ, now moving to production through the NZ Access Federation (nzfed.auckland.ac.nz)

• Acting as the basis for sector wide HPC and eResearch programme:
– National eScience Infrastructure proposal: 2010 – 2015

Page 7: 100503 bioinfo instsymp

What did we learn?

Stay Connected, Collaborate
• With Scientists, Administrators, Technologists, Policy makers, Funders
• Locally AND Internationally

Buy (Translate) - don’t build
• Development where necessary – others have been here before
• Source from international community: technologies, approaches, strategies, configurations, programmes

Page 8: 100503 bioinfo instsymp

* VDT - OpenScienceGrid
* Globus – Argonne
* Shibboleth – Internet2
* iRODS – DICE / RENCI
* Sakai – Sakai Foundation
* OCI - NSF
* PRAGMA – UCSD
* EVO – CalTech
* AccessGrid - Argonne
* GenePattern – MIT Broad
* caBIG - NIH

* UK eScience - JISC
* eFramework – JISC
* Shibboleth – JISC, SWITCH
* SLCS - Switch
* Sakai - JISC

* Grisu – ARCS
* Confluence - Atlassian
* NCRIS - DIISR
* QFAB - QCIF

Page 9: 100503 bioinfo instsymp

What did we learn?

Change difficult within Institutions
– Medium to Long term timeframes required with researchers, ITS, SMT, research support units
– “Final Report: A Workshop on Effective Approaches to Campus Research Computing Cyberinfrastructure” Klingenstein, Morooney, Olshansky. 2006

Collaboration seen as critical, yet complex and expensive
– Working with user communities essential to mutual understanding of needs and opportunities, and developing capabilities
– Need significant capability before international collaborations (and related learning) possible
– New forms of organisation and support required
– “Beyond Being There: A Blueprint for Advancing the Design, Development, and Evaluation of Virtual Organizations” Cummings, Finholt, Foster, Kesselman, Lawrence. 2008
– “The Importance of Long-Term Science and Engineering Infrastructure for Digital Discovery” Wilkins-Diehr, Alameda, Droegemeier, Gannon. 2008

Government and Institutional Investment well below international levels
– Australia NCRIS + Super Science
– EU FP7 e-Infrastructure call + National Grid Initiatives + Identity + HPC + Networks in member states
– UK JISC + NGS + DCC + AAA
– US Office of Cyberinfrastructure, NSF, NIH, DoE all funding major Cyberinfrastructure initiatives

Education programs lacking
– Who is training the next generation?
– Need to grow awareness of career paths in computational and data intensive sciences for computer science and software engineering graduates

Page 10: 100503 bioinfo instsymp

“BeSTGRID could well be New Zealand’s earliest dedicated eResearch infrastructure providers – even predating KAREN. Since 2006, BeSTGRID has been delivering services and tools to support research and research collaboration on shared data sets, and in accessing computational resources.”

KAREN hypen, Issue 09, September 2009

BeSTGRID: since 2006

Page 11: 100503 bioinfo instsymp

MoRST eResearch programme

In Vote RS&T 2008/09, funding was appropriated to develop capability within the Kiwi Advanced Research and Education Network user group to make effective use of KAREN:

• Advanced Video Conferencing Collaboration and Support Centre

• Federated Identity and Access Management
• Towards a Federated Approach – strategy development
• Technical Support and Resources

• Semantic Data and Public Access to Research

• BeSTGRID Grid Middleware Initiative

• Hardship fund for remote site connection to KAREN

Page 12: 100503 bioinfo instsymp

BeSTGRID 2009 – 2010

• MoRST Dec 2008 funding request for Middleware development for eResearch in NZ

• Provides ongoing support for BeSTGRID
– Widened from founders through active participation of new Universities and CRIs

• Aims:
• Coordinate access to compute and data resources
• Provide discipline specific services and applications
• Build a sustainable resource administration, middleware and applications & services development community

Page 13: 100503 bioinfo instsymp

Compute and storage platforms
• Shared investment
• Heterogeneous architectures
• Scale up science
• Create new opportunities

Grid Middleware
• Provides simplified & secure access to job queues & data sharing
• Common across compute and storage platforms

Applications and tools
• Managed, secure, stable, reliable
• Improve collaboration
• ‘Science as a service’

Research communities
• Deep engagement to map problems onto HPC and Grid
• Increased throughput, efficient

Governance / Priorities
• Nimble, responsive, focused & fair
• By member institutions

Page 14: 100503 bioinfo instsymp

Diagram (labels only):
• Steering Committee
• Project Management
• Technical Working Group
• Bioscience Leads
• Geoscience Leads
• eResearch Centres
• Faculty IT, Institution IT
• Site admin

Layers, repeated from page 13: Compute and storage platforms; Grid Middleware; Applications and tools; Research communities; Governance / Priorities

Page 15: 100503 bioinfo instsymp

Take lead from community

• Lead Users from each community
– Define requirements
– Evaluate developments
– Promote services
– Disseminate learning

• Technology group
– “User needs” lead requirements
– Scans undertaken for technologies to acquire
– Incremental development iterations

Aligned with national research & economic growth agenda

Page 16: 100503 bioinfo instsymp

Diagram (labels only):

Jack Flanagan: Drug discovery
• Virtual screening pipeline
• Molecular docking

Shaun Lott: Biological Structure
• X ray crystallography

Maurice Wilkins Centre ?

Gene sequence DB

Shared Storage

Virtual servers
• Shibboleth Service Provider
• Databases
• 3D Docking libraries

GenePattern – MIT Broad
Local Analysis Modules
• Auckland
• Otago

Data Storage
Computation Grid

Page 17: 100503 bioinfo instsymp

Bioscience projects

• BioMirror
– Service: Database hosting, 5TB storage, FTP
– Partner: Bioinformatics Institute
– A public service in New Zealand for high-speed access to up-to-date DNA & protein biological sequence databanks

• GenePattern
– Service: Genomics processing portal
– Partners: Integration Genomics, Otago
– Reuse existing and deploy custom genomics processing codes
– Reich M, Liefeld T, Gould J, Lerner J, Tamayo P, Mesirov JP (2006) GenePattern 2.0. Nature Genetics 38(5): 500-501. doi:10.1038/ng0506-500.

• CellML Simulator
– Service: Run simulations of CellML models on computational cluster
– Partner: Auckland Bioengineering Institute

• GOLD Protein Docking
– Service: GOLD protein docking on computational cluster
– Partner: Auckland Cancer Society Research Centre
– Molecular Mechanics and Structural Biology

Page 18: 100503 bioinfo instsymp

Any questions..?

www.bestgrid.org
@bestgrid

Nick Jones
Director, BeSTGRID
Co-Director eResearch, Centre for eResearch
University of Auckland
[email protected]