NatSigDCs within Landcare Research
Information Management: towards integration and interoperability
Jerry Cooper
Information Services, Landcare Research
Some Information Management Initiatives within LCR
covering the spectrum from non-digital, unstructured data through to highly structured digital data, and focused on internal requirements
• Database Integration
• Spatial Data Management
• Knowledge Management Strategy - Tiaki Mātauranga
• Biodiversity Informatics Project – a prototype dataset repository
Towards data integration & interoperability
• LCR Database Integration Project
• Related ‘Biodiversity Informatics’ projects
– within LCR
– national projects
– international projects
• Future directions
– A National Interoperability Platform, NGI infrastructure, GRIDs & the Semantic Web
Database Integration Project
• 5 year, NSOF/FRST funded project (1999-2004)
• Incorporates some LCR NatSigDCs:
– CHR: Plant herbarium collections, names & bibliography
– PDD: Fungal herbarium collections, names, bibliography, pathology, descriptions, keys, images
– ICMP: Bacterial/fungal culture collections
– NVS: Vegetation Survey Databank
– NSD: National Soils Database
Database Integration Project
• Mixture of custom applications/databases and ‘off the shelf’ software solutions, e.g.:
– custom, web-service based taxonomic/bibliographic editor
– Robert Colwell’s Biota data management system for insect collections
• Mixture of data migration/normalization/integration and interoperability solutions, e.g.:
– Integrate names & their applications
– Integrate bibliographic information
– Ensure collection data is interoperable
– Ensure geo-spatial data is interoperable
• Employs/develops appropriate standard schemas and data dictionaries, particularly from TDWG (Taxonomic Databases Working Group)/CODATA
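As an illustration of making collection data interoperable through a shared schema, the sketch below maps rows from two differently structured collection databases onto one common XML vocabulary. It is a minimal sketch under stated assumptions, not the project's implementation: the source names, local column names, shared element names (`DataSet`, `Unit`, `CatalogueNumber`, `ScientificName`, `Locality`) and the sample values are all hypothetical and are not taken from any TDWG/CODATA schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical mappings from each database's local column names
# to element names in an assumed shared exchange vocabulary.
FIELD_MAPS = {
    "herbarium": {"acc_no": "CatalogueNumber", "taxon": "ScientificName", "loc": "Locality"},
    "cultures": {"strain_id": "CatalogueNumber", "organism": "ScientificName", "origin": "Locality"},
}


def to_shared_record(source, row):
    """Build one <Unit> element in the shared vocabulary from a local row."""
    unit = ET.Element("Unit", attrib={"source": source})
    for local_name, shared_name in FIELD_MAPS[source].items():
        value = row.get(local_name)
        if value is not None:  # omit elements for missing values
            ET.SubElement(unit, shared_name).text = str(value)
    return unit


def export_records(rows_by_source):
    """Serialize rows from several sources into a single XML document."""
    root = ET.Element("DataSet")
    for source, rows in rows_by_source.items():
        for row in rows:
            root.append(to_shared_record(source, row))
    return ET.tostring(root, encoding="unicode")


# Made-up sample rows, for illustration only.
xml_text = export_records({
    "herbarium": [{"acc_no": "H-0001", "taxon": "Genus speciesA", "loc": "Site 1"}],
    "cultures": [{"strain_id": "C-0001", "organism": "Genus speciesB", "origin": None}],
})
print(xml_text)
```

Keeping the mapping in a plain table rather than in code means a curator can adjust it as the shared standard changes, without the source databases themselves being restructured.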
Database Integration Project
• Coordinated by a Reference Group
– consisting of data curators, power users, project management & informatics specialists
• Some guiding principles:
– ensure participants share a common vision
– provide dynamic access to managed primary data, not snapshots of aggregated, secondary data (data applicability too short to maintain such systems)
– work iteratively on manageable, modular sub-projects with defined outcomes (but work towards the bigger picture)
– be flexible and prepared to re-invent as technology changes
– incorporate/develop standards (but be flexible as standards change!)
– interact/collaborate with other Informatics groups both nationally and internationally
– maintain the ownership of data by the providers/curators
– keep things as simple as possible (system lifetime inversely related to complexity)
– empower providers/curators/users to manage and manipulate their own data (desktop apps and the knowledge to use them)
– maintain a strong information-science linkage. Informatics is a specialized discipline, not a standard IT support function, and not generally available from contractors
Application Development
• Business Systems Development Team (BSDT):
– 2 Business Analysts (Cooper & Kolster)
– 4 programmer/analysts (Wilson, Spencer, Cochrane [contractor], Connel [contractor])
– 1 web developer (Fuglestad)
• Development environment:
– MS Visual Studio .Net
– MS SQL-Server
– ESRI ARC-IMS/SDE (GIS)
– Crystal Reports
– MS Access (Prototyping)
– TextML (XML Database)
• Emerging technologies:
– XML-DBs, XML-schema, XSLT, X-Path (X-Query), Web Services
LCR BSDT recent projects
• NVS
– web delivery of metadata & GIS-based plot localities
• Genetically Modified Organisms
– bibliography and meta-analysis environment for data on crop hybridization and pollination systems
• Carbon Monitoring System
– pilot study of a data-entry and analysis tool for NZ’s carbon budget implemented using web-services
• Biodiversity Informatics Platform
– a prototype of a metadata/semi-structured data repository and query/analysis environment for research-related data using XML and MS .Net technologies
• TFBIS/GBIF projects...
Related National Projects - TFBIS
• Terrestrial & Freshwater Biodiversity Information System Programme
– managed by DOC and a cross-agency steering group, on behalf of government, and in support of the NZ Biodiversity Strategy
– TFBIS is funding a number of projects in addition to DOC’s extranet GIS platform
– LCR has 8 projects (maybe) encompassing data/image/flora/fauna digitization, end-user analysis & GIS tools, web delivery of information
– LCR’s emerging ‘biodiversity information systems’ provide a natural backbone for such activities
Related National Projects – GBIF
• Global Biodiversity Information Facility
– Based on an MOU signed by many countries including NZ
  • including Europe, US, Australia etc
  • incorporates ITIS/Species2000 cataloging initiatives
– MOU requires, amongst other things, each GBIF participant to:
  • promote the sharing of biodiversity data in GBIF under a common set of standards
  • form a node or nodes, accessible via GBIF, that will provide access to biodiversity data
– The initial GBIF network will be set up during this year
– The biodiversity data within NatSigDCs, and that being made available through TFBIS, is an important component of the NZ contribution
Related International Activities
• GBIF
– Penman: member of GB Governing Board, Chair Finance Committee
– Cooper: member Scientific, Technical & Advisory Group for the work programme on Data Access and Database Interoperability (DADI)
– Cooper: member science committee for the work programme on Electronic Catalogue of Names (ECAT)
– Cooper: interim node manager for NZ (work plan for establishment of the NZ GBIF node in preparation)
• VegBank (US Vegetation Survey Databank)
– Proposed FGDC/NSDI standard vegetation classification model
– Wiser & Cooper attended VegBank workshop
• TDWG Standards Development
– Cooper & Kolster: contributed to development of CODATA/BioCASE ABCD XML-Schema (Access to Biological Collection Data)
– Kolster: contributed to on-going development of Structured Descriptive Data standard
– Cooper: chair new committee to formulate a data exchange standard for biological names
Future Directions?
Effective participation in the knowledge economy requires the interoperability of information resources at the organizational, national and international levels
– through multi-agency, multi-disciplinary, funded, collaborative initiatives that address:
• use of emerging integration-enabling technologies
– e.g. XML, Web services, UDDI etc
• development of appropriate physical infrastructure
– e.g. Next Generation Internet (NZ Broadband Internet)
• development of appropriate application environments
– e.g. Computing GRIDs (big news everywhere – except NZ?)
• development of defined and interoperable standard vocabularies
– e.g. domain ontologies, semantic maps and killer apps
– The Semantic Web (major theme in EU Framework VI)