The Technologies behind EUBrazilOpenBio Services and Tools
Leonardo Candela
National Research Council of Italy
Science as an Open Enterprise: some key sentences
• “Rapid and pervasive technological change has created new ways of acquiring, storing, manipulating and transmitting vast data volumes, as well as stimulating new habits of communication and collaboration amongst scientists.”
• R1: “Scientists should communicate the data they collect and the models they create, to allow free and open access, and in ways that are intelligible, assessable and usable for other specialists in the same or linked fields wherever they are in the world.”
EUBrazilOpenBio Technologies, Recife, 19th Sep. 2012
The Technological Challenges
• Nowadays science poses challenging tasks to “systems” engineers:
– highly evolving requirements;
– large-scale distribution of resources and players;
– heterogeneity.
• This makes standard development approaches often too expensive and not sustainable:
– “from-scratch” development of ad-hoc solutions;
– HW investment (even if intermittently needed).
… infrastructures …
Infrastructure vs e-Infrastructures
• Science has been traditionally based on infrastructures
• e-Infrastructures are becoming popular, gathering (computer) resources and exposing them to the scientific community through user-friendly interfaces
A Hybrid Data Infrastructure Approach
• A facility where research resources (HW, SW, Data) can be shared and exploited on-demand
• … built on existing systems, infrastructures and repositories
• … conceived to supplement but not supplant “systems” mandates and arrangements
• … supporting an innovative application-delivery-model
[Figure: Hybrid Data Infrastructure. Labels: Application #1, Application #2, … Application #N; Infrastructure/system A, Infrastructure/system B, … Infrastructure/system Z; data, server, service, apps.]
Supporting Virtual Research Environments
A Virtual Research Environment is a complete “system” consisting of hardware, data, and applications, deployed to support the needs of a community of practice and to promote effective collaboration
Supporting Virtual Research Environments [cont.]
User uploads/selects apps
User registers/selects data sets
User exploits most suitable resources
User invites co-workers
EUBrazilOpenBio offers apps through a software repository
EUBrazilOpenBio offers a rich array of mediators interfacing with data sources
EUBrazilOpenBio deploys, configures, executes and monitors services
EUBrazilOpenBio controls authentication and enforces policies
Supporting two models of provision
• For end-users
– A GUI-centric approach focusing on visual interfaces for accessing Data Infrastructure facilities via a Web browser
• For service providers
– An API-centric approach focusing on a comprehensive set of specifications and methods for accessing Data Infrastructure facilities in a programmatic way
EUBrazilOpenBio Infrastructure Current Constituents
• Rely on existing European and Brazilian resources and efforts
– Data Providers
• CoL, speciesLink, Brazilian Flora, GBIF, PANGAEA, Biodiversity Heritage Library, Bioline International
– Service Providers
• D4Science, VENUS-C, COMPSs
– Software Providers
• gCube, openModeller, USTO.RE
…
* more will come from the community *
Data Discovery and Access Facilities
• Conceived to accommodate the “integration” and exploitation of existing data repositories
– beyond “data import”
– by relying on standards and protocols (e.g. TAPIR, OAI-PMH)
– aiming at reducing the costs of developing ad-hoc mediators
– aiming at improving data providers' service (e.g. caching, query expansion)
• Support for a rich array of data and metadata formats
– from files to tabular data and complex objects
– information objects can be annotated with multiple metadata
• Cater for efficient and scalable data storage
• Offering unifying views over data coming from heterogeneous and distributed systems
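As an illustration of a standards-based mediator with caching, here is a minimal Python sketch. It is not the EUBrazilOpenBio or gCube implementation: the `CachingMediator` class, the `fake_fetch` stand-in and the endpoint URL are all hypothetical. Only the `verb=ListRecords` and `metadataPrefix=oai_dc` request parameters come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a repository endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


class CachingMediator:
    """Hypothetical mediator: forwards requests to a data source and caches replies."""

    def __init__(self, fetch):
        self._fetch = fetch      # callable that retrieves a URL (assumption)
        self._cache = {}         # request URL -> cached response
        self.remote_calls = 0    # how many requests reached the data source

    def get(self, url):
        if url not in self._cache:
            self.remote_calls += 1
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def fake_fetch(url):
    """Stand-in for a real HTTP call so the sketch runs offline."""
    return f"<response for {url}>"


# Hypothetical endpoint, used only for illustration.
url = list_records_url("https://repository.example.org/oai")
mediator = CachingMediator(fake_fetch)
first = mediator.get(url)
second = mediator.get(url)   # answered from the cache, no second remote call
print(url)
print(mediator.remote_calls)
```

In a real mediator, `fake_fetch` would be replaced by an HTTP client, and the cache would need an expiry policy so that updates at the data provider become visible.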
Data Processing Facilities
• A comprehensive offering of computational approaches and facilities
– different computational models and platforms
• Simplifying the realisation of advanced services having computation-intensive needs (e.g. ENM service)
• Hundreds of “computers” made available to the community
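The pattern of farming out many independent computations can be sketched in Python as follows. This is only an analogy on a single machine: a local thread pool stands in for the distributed resources, and `run_model` is a hypothetical placeholder, not an interface of the ENM service, VENUS-C or COMPSs.

```python
from concurrent.futures import ThreadPoolExecutor


def run_model(species_name):
    """Hypothetical placeholder for one computation-intensive modelling job."""
    return {"species": species_name, "status": "done"}


def run_all(species_names, max_workers=4):
    """Run one job per species concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_model, species_names))


# Placeholder species labels, not real data.
results = run_all(["Species A", "Species B", "Species C"])
for result in results:
    print(result["species"], result["status"])
```

Because each job is independent, the same structure carries over when the worker pool is replaced by remote computing nodes.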
What OpenBio can do for the Biodiversity Community now
• Support for the deployment and operation of multiple VREs on demand
• Amplifying data providers' “markets”
– e.g. by enlarging the set of data presentation “formats”
• Support Taxonomies Integration by catering for
– effective and easy comparison of checklists
– extraction of checklists from multiple data sources
– enabling users to simply upload their checklists
• Support Ecological Niche Modeling by catering for
– modeling, testing and projection “as-a-Service”, directly from the web browser
– retrieval of occurrence points from multiple data sources and visualisation
– enabling users to use their own data
– massive modeling activities by relying on distributed computing capabilities
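To give a rough idea of what comparing two checklists involves, here is a minimal Python sketch. It assumes a checklist is simply a list of name strings and matches names after trimming whitespace and lowercasing. The `compare_checklists` function and the placeholder names are hypothetical, and the actual service may match names quite differently (for example by resolving synonyms).

```python
def compare_checklists(first, second):
    """Compare two checklists of names, ignoring case and surrounding spaces."""
    names_first = {name.strip().lower() for name in first}
    names_second = {name.strip().lower() for name in second}
    return {
        "shared": sorted(names_first & names_second),
        "only_in_first": sorted(names_first - names_second),
        "only_in_second": sorted(names_second - names_first),
    }


# Placeholder names, not real taxa.
checklist_a = ["Aus bus", "Aus cus", "Xus yus"]
checklist_b = ["aus bus ", "Xus zus"]

report = compare_checklists(checklist_a, checklist_b)
print(report["shared"])          # ['aus bus']
print(report["only_in_first"])   # ['aus cus', 'xus yus']
print(report["only_in_second"])  # ['xus zus']
```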
* live demonstration after the break *