Australian Ecosystems Science Cloud


Upload: terrestrial-ecosystem-research-network

Post on 13-Apr-2017


Page 1: Australian Ecosystems Science Cloud

TERN is supported by the Australian Government through the National Collaborative Research Infrastructure Strategy.

Australian Ecosystems Science Cloud

overview

Presentation by Siddeswara Guru, Director, Data Science

Page 2: Australian Ecosystems Science Cloud

Ecosystem science

• Inter-relationship among the living organisms, physical features, bio-chemical processes, natural phenomena, and human activities in ecological communities [1]
• Focusing on terrestrial ecosystems
  – Terrestrial Ecosystem Research Network
  – Atlas of Living Australia
• Data is heterogeneous: a wide variety from different domains
  – Observation (human, in-situ sensors and satellite remote sensing)
  – Variety of scale: spatial and temporal
  – Different data formats used in the community

Page 3: Australian Ecosystems Science Cloud

Data Use

• Conventional data access
  – Need to find data
  – Access via services
  – Copy from source to destination for further use, even for large datasets

(Image from internet)

Page 4: Australian Ecosystems Science Cloud

Storage and Compute

• Advent of NeCTAR and RDS
  – Researchers are moving data and computation to the cloud
  – Building tools (virtual labs, research tools and platforms)
  – However, easy accessibility of data is still an issue
    • Multiple interfaces to search for data
    • No clear access mechanism from different nodes

Page 5: Australian Ecosystems Science Cloud

Goal

• Offer an open data platform: harmonised, cloud-enabled data infrastructure for data interoperability with a simplified service model
• Offer compute next to data to minimise data movement
• Make data accessible to different research platforms and virtual labs from a common platform
• Offer a scalable managed computing environment with access to distributed and data-intensive computation technologies
• Develop a support system for cross-discipline use of data

Page 6: Australian Ecosystems Science Cloud

User Stories

• As an ecosystem science continental-scale gridded data user, I want to query a dataset, perform spatial and temporal sub-setting of data, and access and use that data from a cloud platform as a local file so that I can work on further analyses.

• As an application developer, I need enough compute and storage for a short period of time to run a distributed, large-scale, data-intensive application so that the outputs of the analyses are available in a reasonable amount of time.

• As a regular ecology data user, I need an easily accessible cloud compute platform with common tools (RStudio, Jupyter Python, NetCDF viewer, spatial data viewer, CSV file viewer) attached to the TERN ecology and biophysical data collection so that I can build applications for analysis and synthesis.

• As a data-intensive application developer, I need a flexible way to create and access a Hadoop cluster so that I can distribute my computation.

• As a data user, I want easy access to reference datasets with compute resources so that I can use them in my analysis and research work.

• As an ecosystem data user, I want a one-stop shop to search, query and access ecosystem data and use it in my analysis so that I don't have to go through multiple portals to access and use data.

• As an application developer, I want a cloud platform to run my simulation with local access to data so that I don't have to move data around or download it to my desktop.
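The first user story (spatial and temporal sub-setting of gridded data) can be pictured with a small sketch. This is only an illustration under stated assumptions: the `Grid` class, the `subset` function, and the toy numbers are all hypothetical and are not part of the platform or of any TERN service. A real workflow would more likely read NetCDF files with an established library.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Grid:
    """Toy gridded dataset: values[t][i][j] is the value at times[t], lats[i], lons[j]."""
    times: list
    lats: list
    lons: list
    values: list


def subset(grid, lat_range, lon_range, time_range):
    """Return the part of `grid` inside the inclusive lat/lon/time bounds."""
    t_idx = [k for k, t in enumerate(grid.times) if time_range[0] <= t <= time_range[1]]
    i_idx = [k for k, y in enumerate(grid.lats) if lat_range[0] <= y <= lat_range[1]]
    j_idx = [k for k, x in enumerate(grid.lons) if lon_range[0] <= x <= lon_range[1]]
    return Grid(
        times=[grid.times[t] for t in t_idx],
        lats=[grid.lats[i] for i in i_idx],
        lons=[grid.lons[j] for j in j_idx],
        values=[[[grid.values[t][i][j] for j in j_idx] for i in i_idx] for t in t_idx],
    )


# Made-up 3 x 3 x 4 grid; each number only encodes its own (time, lat, lon) index.
times = [date(2016, m, 1) for m in (1, 2, 3)]
lats = [-30.0, -25.0, -20.0]
lons = [130.0, 135.0, 140.0, 145.0]
values = [[[100 * t + 10 * i + j for j in range(4)] for i in range(3)] for t in range(3)]
grid = Grid(times, lats, lons, values)

# Keep February to March 2016, latitudes -26 to -19, longitudes 134 to 141.
region = subset(grid, (-26.0, -19.0), (134.0, 141.0), (date(2016, 2, 1), date(2016, 3, 1)))
print(region.values)  # [[[111, 112], [121, 122]], [[211, 212], [221, 222]]]
```

The point of the sketch is that only the selected cells leave the store; the rest of the continental-scale grid never has to be copied to the user's machine.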

Page 7: Australian Ecosystems Science Cloud

High-level conceptual Architecture

Page 8: Australian Ecosystems Science Cloud

Current status

• Set up a Technical Advisory Group to advise on the scoping and implementation of the project
• In the first iteration, reference datasets will be made available
  – Remote sensing reference data (fractional cover)
  – Long-term ecological monitoring data
  – Climate variables
• Scoping the mediation layer and overall architecture
• Building a coalition of the willing for partnership and collaboration
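The slides do not describe how the mediation layer will work, since it is still being scoped. One way to picture the idea of a single entry point over several data nodes is the sketch below. Everything in it is an assumption made for illustration: the `Record`, `InMemoryNode` and `Mediator` names, the node names, and the `example.org` URLs are hypothetical placeholders, not the project's design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Common description of a dataset, whichever node holds it."""
    node: str
    title: str
    access_url: str


class InMemoryNode:
    """Stand-in for a data node; a real adapter would call that node's own service."""

    def __init__(self, name, datasets):
        self.name = name
        self._datasets = datasets  # list of (title, access_url) pairs

    def search(self, keyword):
        kw = keyword.lower()
        return [Record(self.name, title, url)
                for title, url in self._datasets if kw in title.lower()]


class Mediator:
    """Single entry point that forwards one query to every registered node."""

    def __init__(self, nodes):
        self._nodes = list(nodes)

    def search(self, keyword):
        results = []
        for node in self._nodes:
            results.extend(node.search(keyword))
        # Sort so the merged list does not depend on node registration order.
        return sorted(results, key=lambda r: (r.title, r.node))


# Placeholder nodes and entries, invented for illustration only.
mediator = Mediator([
    InMemoryNode("node-a", [("Fractional cover", "https://example.org/a/fc"),
                            ("Climate variables", "https://example.org/a/climate")]),
    InMemoryNode("node-b", [("Long-term ecological monitoring", "https://example.org/b/ltem"),
                            ("Fractional cover mosaics", "https://example.org/b/fc-mosaic")]),
])

hits = mediator.search("fractional")
for hit in hits:
    print(hit.node, hit.title, hit.access_url)
```

In this sketch each node keeps its own interface behind an adapter, and users see one search call and one record shape, which is the situation the "one-stop shop" user story asks for.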

Page 9: Australian Ecosystems Science Cloud

Contributions

• NeCTAR – Major project sponsor
• TERN, ALA – NCRIS Domain Projects, partners
• QCIF – implementation partner
• NCI – collaborator, partners