THE AUSTRALIAN NATIONAL DATA SERVICE AND OPEN ACCESS TO DATA
Andrew Treloar
Director, ANDS Establishment Project
ANDS: Building the Australian Research Data Commons
ands.org.au
Outline
• Context
• Blueprint
• Goal
• Structure
• Progress
• Internationally
• Acknowledgements
Context
eResearch Co-ordinating Committee (2006)
Thematic Issues
• Continuing Need for a Focus: through national coordination
• Human Capabilities: people, skills and understanding
• Linkage of eResearch Resources: seamless access to resources
• Access to Data: best practice data management and curation
• Structural and Cultural Change: evolution of organisational structures and cultures
• Awareness and Support: develop researchers’ ability to adopt eResearch
Service Clusters
• Data
  – outreach, curation, data management
  – meta-services, location, access, movement
  – practice, providers and users
• Computing
  – capability computing facilities
  – national computing environment
• Interoperation
  – discipline services (tools (software))
  – user and operations support
  – collaboration services support
• Access
  – the Australian access federation
  – the Australian research and education network
http://www.dest.gov.au/sectors/research_sector/publications_resources/profiles/e_research_strat_imp_framework.htm
Australian Code for the Responsible Conduct of Research
• Describes the responsibilities of institutions and researchers in a range of areas, including the management of research data and primary materials
• Institutions are to retain research data, provide secure data storage, identify ownership, and ensure security and confidentiality of research data
• Researchers are to retain research data and primary materials, manage storage of research data and primary materials, and maintain confidentiality of research data and primary materials
http://www.nhmrc.gov.au/publications/synopses/r39syn.htm
NCRIS Investments
$542M over the five years 2007-2011
• Evolving bio-molecular platforms and informatics
• Integrated biological systems
• Characterisation
• Fabrication
• Biotechnology products
• Optical and radio astronomy
• Integrated marine capability
• Structure and evolution of the Australian continent
• Networked biosecurity framework
• Population health and clinical data linkage
• Terrestrial ecosystem research network
+ Platforms for Collaboration (allocated $82M)
NCRIS Budget Breakdown
Platforms for Collaboration: Major Investments 2007-2011
• Compute: Capability Computing, Advanced models (NCI, $26M)
• Data: The Data Commons, Data Federations (ANDS, $24M)
• Access: Research connectivity, Seamless reach (AAF+AREN, $6M)
• Interoperation: Collaboration services, Research workflows (ARCS, $20M)
Blueprint
The ANDS Blueprint
Towards the Australian Data Commons (TADC)
Developed during 2007 by ANDS Technical Working Group
Mapped out coherent vision of what needs to be done in the data space
Available at http://www.pfc.org.au/bin/view/Main/Data
TADC: Why Data? Why Now?
• We are in an era of increasing data-intensive research
• Almost all data is now born digital
• Increasing amount of data generated (semi-)automatically
• “Consequently, increasing effort and therefore funding will necessarily be diverted to data and data management over time” (TADC, p. 4)
TADC: Need for standardisation
• Software and hardware keep getting cheaper, wetware keeps getting more expensive
• Fixing data management problems is enormously labour intensive and costly
• “Consequently, standardisation within forms of data and simplification in the frameworks around retention, storage, access and use of data, and the elimination of differences whose resolution requires labour, must be made, if the on-going keeping and reuse of data is to remain affordable” (TADC, p. 5)
TADC: Role of data federations
• With more data online, more can be done
• Possible now to answer questions unrelated to reasons why data was collected originally
• Increasing focus on cross-disciplinary science
• “Consequently greater clarity is needed over control and access to community-funded data, and the means of aggregating, federating and accessing such data are increasingly important” (TADC, p. 5)
Changing Data, Changing Research
• New scientific instruments
  – Large Hadron Collider at CERN will generate 1.5 gigabytes of data per second
  – the Square Kilometre Array (1 EB/day!)
• New scientific models
  – The mapping of the Human Genome: a billion DNA letters in a human sequence
  – Global climate models with ever finer resolution
• New knowledge from unlocked data
  – Hubble data has to be shared six months after collection
  – Majority of published research from Hubble telescope data was not “first use”
http://www.nature.com/news/specials/bigdata/ (was free for two weeks, now isn’t)
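The two instrument figures quoted above use different units (per second and per day), which makes them hard to compare at a glance. The short calculation below puts them on a common footing. It is only a rough sketch: it assumes decimal SI units (1 GB = 10^9 bytes, 1 EB = 10^18 bytes) and sustained round-the-clock rates, neither of which the slide states.

```python
# Rough comparison of the two data rates quoted on the slide.
# Assumptions (not stated in the source): decimal SI units and
# continuous operation, 24 hours a day.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

GB = 10**9   # bytes in a gigabyte (decimal)
TB = 10**12  # bytes in a terabyte (decimal)
EB = 10**18  # bytes in an exabyte (decimal)

lhc_bytes_per_second = 1.5 * GB  # Large Hadron Collider figure
ska_bytes_per_day = 1 * EB       # Square Kilometre Array figure

# Convert each figure into the other's time unit.
lhc_tb_per_day = lhc_bytes_per_second * SECONDS_PER_DAY / TB
ska_tb_per_second = ska_bytes_per_day / SECONDS_PER_DAY / TB

# How many times larger the SKA daily volume is than the LHC daily volume.
ratio = ska_bytes_per_day / (lhc_bytes_per_second * SECONDS_PER_DAY)

print(f"LHC: {lhc_tb_per_day:.1f} TB/day")
print(f"SKA: {ska_tb_per_second:.1f} TB/s")
print(f"SKA / LHC: about {ratio:,.0f}x")
```

Under these assumptions the LHC figure works out to roughly 130 TB per day, while the SKA figure corresponds to more than 11 TB every second, a difference of nearly four orders of magnitude.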
Goal
The ANDS Goal
“to deliver greater access to Australia’s research data assets in forms that support easier and more effective data use and reuse” (TADC, p. 18)
And to be a “voice for data” (RF, 24/9/08)
ANDS implementation assumptions
• ANDS doesn’t have enough money to fund storage
• And so is predicated on institutionally-supported solutions
• Not all data shared by ANDS will be open
• ANDS aims to leverage existing activity, and coordinate/fund new activity
• ANDS will only start to build the Australian Research Data Commons
• ANDS governance and management arrangements are sized for the current funding
Realising the goal
• Develop user and owner frameworks for data commons
• Develop and operate national registries and discovery
• Seed the commons by connecting existing stores/federations
• Increase capabilities across sector in data management, integration
Structure
ANDS Delivery Structure
ANDS has been structured as four inter-related and co-ordinated service delivery programs:
• Developing Frameworks
• Providing Utilities
• Seeding the Commons
• Building Capabilities
Plus candidate service development activities funded through National eResearch Architecture Taskforce projects
Developing Frameworks (Monash)
• Influencing relevant national policies
• Building common understanding of data management issues and solutions across government, research funding agencies, and research intensive organizations
Assisting OA by encouraging moves in favour of discipline-acceptable default data sharing practices
Providing Utilities (ANU)
• Building and delivering national technical services to support the data commons
• Initial services
  – Discovery
    – Both “you come to us” and “we come to you” flavours
    – Probably a two-step process for some collections
    – Includes surfacing of ISO2146 entities (next slide) for web harvesting
  – Persistent identifier minting and management
  – Collections registry to underpin discovery
• Plus Services Roadmap for later years
• Providing capability within ANDS for integration of existing systems into Australian Data Commons
Assisting OA by improving discoverability, particularly across disciplines
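To make "persistent identifier minting" and a "collections registry to underpin discovery" more concrete, here is a minimal in-memory sketch. Everything in it is hypothetical: the `CollectionRegistry` class, the `example:` identifier prefix, the record fields and the URLs are invented for illustration and are not the actual ANDS services or schema. It shows only the general idea that a minted identifier stays fixed while the location it resolves to may change, and that registered descriptions can be searched.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class CollectionRecord:
    """Hypothetical description of a data collection held at an institution."""
    title: str
    institution: str
    location: str  # current URL of the collection; may change over time


class CollectionRegistry:
    """Toy registry: mints identifiers and supports keyword discovery."""

    def __init__(self, prefix="example"):
        self._prefix = prefix      # hypothetical identifier namespace
        self._counter = count(1)   # source of unique numeric suffixes
        self._records = {}         # identifier -> CollectionRecord

    def mint(self, record):
        """Register a collection and return its new persistent identifier."""
        identifier = f"{self._prefix}:{next(self._counter):06d}"
        self._records[identifier] = record
        return identifier

    def resolve(self, identifier):
        """Return the current location for an identifier."""
        return self._records[identifier].location

    def relocate(self, identifier, new_location):
        """Update the location; the identifier itself never changes."""
        self._records[identifier].location = new_location

    def discover(self, keyword):
        """Return identifiers whose title contains the keyword (case-insensitive)."""
        keyword = keyword.lower()
        return [pid for pid, rec in self._records.items()
                if keyword in rec.title.lower()]


# Usage: mint an identifier, then move the data without breaking the identifier.
registry = CollectionRegistry()
pid = registry.mint(CollectionRecord(
    title="Reef temperature logs",
    institution="Example University",
    location="http://data.example.edu/reef-v1",
))
registry.relocate(pid, "http://archive.example.edu/reef")
print(pid, "->", registry.resolve(pid))
```

The design point is the separation of identity from location: anything that cites the identifier keeps working after the collection moves, because only the registry entry is updated.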
ISO2146
Seeding the Commons (Monash)
• In targeted areas (because not enough resource to do everything), working to improve:
  – fabric for data management
  – amount of content
  – state of data capture and management
• Selection process to identify targets
• Plus, opportunistic content recruitment in first year
Assisting OA by increasing the amount of content available, much of it (hopefully!) OA
Building Capabilities (ANU)
• Improving level of capability for research data management and research access to data
• Train-the-trainer model
• Two initial target populations
  – Early career researchers
  – Research support staff (IT, data management)
  – NOTE: Overlapping but different messages
• Building a community around data management concerns
Assisting OA by advocating to researchers for changed practices
Progress
ANDS: From Project to Service
• Government asked Monash, ANU, CSIRO to set up ANDS
• Establishment Project has met all its deliverables
• DIISR has now signed contract for ANDS
• First (interim) Business Plan available at http://ands.org.au/andsinterimbusinessplan-final.pdf
  – This will run until June 2009
• Next Business Plan needs to be complete by March 2009 for consideration and approval
• ANDS will run until July 2011
Australian Strategic Roadmap Review
• Data Storage (p. 21)
  – National data-fabric, based on institutional nodes
• Shared Data (p. 22)
  – More ANDS
• Coordination Component (p. 23)
  – Integration of eResearch activities
• Expertise as an enabling infrastructure (p. 23)
http://www.innovation.gov.au/ScienceAndResearch/Documents/Strategic%20Roadmap%20Aug%202008.pdf
National Innovation System Review
• R7.10: A specific strategy for ensuring the scientific knowledge produced in Australia is placed in machine searchable repositories be developed and implemented using public funding agencies and universities as drivers
• R7.14: To the maximum extent practicable, information, research and content funded by Australian governments including national collections should be made freely available over the internet as part of the global public commons…
http://www.innovation.gov.au/innovationreview/Pages/home.aspx
Internationally
Wellcome Trust
Policy on data management and sharing
• The Trust “wishes to ensure that the outputs of the research it funds, including research data, are managed and used in ways that maximise public benefit.”
• Benefits gained from research data “will be maximised when they are made widely available to the research community as soon as feasible, so that they can be verified, built upon and used to advance knowledge.”
• Trust “expects the researchers that it funds to maximise the availability of research data with as few restrictions as possible”
http://www.wellcome.ac.uk/About-us/Policy/Policy-and-position-statements/WTX035043.htm
National Institutes of Health
Final NIH Statement On Sharing Research Data (February 26, 2003)
• “Data sharing is essential for expedited translation of research results into knowledge, products, and procedures to improve human health”
• NIH “endorses the sharing of final research data to serve these and other important scientific goals”
• Investigators “are expected to include a plan for data sharing or state why data sharing is not possible”
http://grants.nih.gov/grants/guide/notice-files/NOT-OD-03-032.html
Acknowledgements
ANDS Project Management Committee
• Paul Bonnington, Monash
• Cathrine Harboe-Ree, Monash
• Alan McMeekin, Monash
• David Groenewegen, Monash
• Vic Elliott, ANU
• Adrian Burton, ANU
• Markus Buchhorn, ANU
• Alex Zelinsky, CSIRO
• David Toll, CSIRO
• Tracey Hind, CSIRO
• Clare McLaughlin/Jacqueline Cooke/Peter Nicholson, DIISR
• Rhys Francis, AeRIC
ANDS Organising Network
• Andrew Treloar, Monash
• David Groenewegen, Monash
• Adrian Burton, ANU
• Margaret Henty, ANU
• Chris Blackall, ANU
• Ross Wilkinson, CSIRO
• Tracey Hind, CSIRO
• John Morrissey, CSIRO
Senior Representatives
• Edwina Cornish, Monash
• Robin Stanton, ANU
• Alex Zelinsky, CSIRO