geoscientific data management principles


Upload: nigellaubsch

Post on 11-Jul-2015


Page 1: Geoscientific Data Management Principles
Page 2: Geoscientific Data Management Principles


Introduction

The quality and consistency of geoscientific data

management practices across the minerals

exploration and mining industry vary greatly.

This occurs for a variety of reasons:

• Budgetary constraints

• Technical knowledge constraints

• Lack of appreciation of the value of data within the organisation

• Lack of an IT Dept (or lack of integration with the IT group)

• Staff turnover

• Technology change-overs/updates

• “Islands” of accountability

Page 3: Geoscientific Data Management Principles

Introduction

The following slides list the primary factors conducive to best-practice geoscientific data management.

Each principle is discussed first to give a clear understanding of what it involves and why, along with, in some instances, an example of its application or requirement.

Page 4: Geoscientific Data Management Principles

Principle 1: Centralised Data Management

Simply put, this is the practice of having all geoscience data in a

centralised location, preferably not on an operations site, on fully

maintained/monitored servers, with fully tested back-up systems

in an industry-standard server room. This data may be derived from

site/project based databases, or replicated to them – the main

point is that the version on site is not the only copy.

It is not strictly necessary for the site copy of the data to be

maintained; however, experience has shown that site personnel

have a significantly improved attitude towards the quality of the

data when a site copy is maintained as they have a stronger sense

of ownership of the data and view the centralised storage as

merely a backup.

Page 5: Geoscientific Data Management Principles

Principle 2: Standard Geoscientific Legend

This is a standard set of observational data codes used across all

sites/projects. Multiple legends within one organisation

frequently cause problems both in the database and at the point

of data capture.

Problems at the point of capture emerge as the different legends

will “cross-breed” when a geologist transferred to a new site

sometimes uses codes from their previous posting either out of

preference or by force of habit. This results in contaminated data

that is frequently useless if this practice is allowed to persist for

some time.

Problems at the database end are the result of either differing

legend codes being stored in the same field, or multiple fields

being created for each data type to cater for each legend. The

former is confusing whilst the latter is both inefficient and

confusing.
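One way to stop legends from cross-breeding at the point of capture is to check every logged code against the single company legend before the record is accepted. The following is a minimal sketch only: the field names, the codes and the `find_invalid_codes` helper are hypothetical and are not taken from any particular logging package.

```python
# Hypothetical company-wide legend: field name -> set of permitted codes.
STANDARD_LEGEND = {
    "lithology": {"BAS", "GRN", "SST", "SHL"},
    "alteration": {"CHL", "SER", "SIL"},
}


def find_invalid_codes(records, legend=STANDARD_LEGEND):
    """Return (row_index, field, code) for every code not in the legend."""
    problems = []
    for index, record in enumerate(records):
        for field, allowed in legend.items():
            code = record.get(field)
            if code is not None and code not in allowed:
                problems.append((index, field, code))
    return problems


# Illustrative logged intervals; the second row uses a code from another legend.
logged_intervals = [
    {"hole_id": "DH001", "lithology": "BAS", "alteration": "CHL"},
    {"hole_id": "DH001", "lithology": "BSLT", "alteration": "SER"},
]
invalid = find_invalid_codes(logged_intervals)
```

Rejecting the record at capture, rather than cleaning it up later in the database, keeps the contaminated code from ever being stored.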

Page 6: Geoscientific Data Management Principles

Principle 3: Standard Geoscientific Data Model

This is simply the use of one data model across the organisation

rather than having a different data model for each mining

operation or exploration project. Some companies run one data

model for their mining operations and another for their

exploration.

Ultimately these data sets should be coming together so that all

data for a project, deposit or terrane is in one location and the

maximum value can be achieved from analysis of the data.

Complete data sets such as this are essential for understanding

the geological setting and processes involved in forming the

deposit, thereby allowing for predictive tools for discovering

another.

Page 7: Geoscientific Data Management Principles

Principle 4: Same System Digital Data Logging


Page 8: Geoscientific Data Management Principles

Principle 5: Direct Data Transfer

This principle requires that the data is either transferred directly

from the data collection tool to the database, or via a

secure facility, e.g. Acquire’s Briefcase mechanism.

Systems where data is exported to a text-based file are open to,

and frequently subjected to, manual editing which is outside of

the validation controls inherent in the system. This can result in

contaminated data in the database, or difficulty in loading the

data which then requires support from specialised users.

Furthermore, these files are commonly not transferred

immediately to the database and therefore are exposed to the

risks of loss and multiple versions.

A further point here is that importers must be constructed, have

validation coded in, and subsequently be maintained to enable

the importing of the exported, text-based file.

Page 9: Geoscientific Data Management Principles

Principle 6: Digital Sample Submission

• This is where all sampling data is derived from a digital data

collection tool and submitted to the lab digitally. Where physical

sampling sheets are required, there should be a facility associated

with the data collection tool to provide a printed version.

• While barcode tags are now a common technology for assisting in

managing samples, they still have two issues: they must be manually

handled at several points in the transport and processing of the

sample, and they can be difficult to read when dirty and/or

wet.

• It is recommended that RFID technology be used to manage samples

as this eliminates the multiple-point manual handling of the samples

to obtain their sample numbers. RFID tags are now extremely

affordable and readily available. Even in a small hole of a hundred

samples, the time saved by avoiding having to find and scan each

barcode is significant. Depending on how samples are placed, this may

also remove the risk of injury through bending over or physically

lifting the samples.

Page 10: Geoscientific Data Management Principles

Principle 7: Automated Assay Loading

This principle involves the assays being imported directly into the

database without the opportunity to be manually edited by personnel.

The idea behind this is very similar to that behind Principle 5 (Direct

Data Transfer). A useful analogy is a piece of medical

equipment for surgery: the more hands that come into contact with it,

the dirtier it gets. The less ability personnel have to manually

interact with, or edit, the data before it gets to the database, the cleaner

it is.

There are many ways of achieving this:

• Emails from the lab may be delivered to a common folder where a

batch process extracts and loads them into the database.

• The laboratory concerned may have a portal or some other web-

hosted access through which the DHDMS can access the assay data

for loading.

• The laboratory has direct access to the DHDMS and loads the data

directly.
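The first option above, a batch process working through a common folder, might look something like the sketch below. It assumes the lab delivers CSV files with `sample_id`, `element` and `value` columns and uses SQLite as a stand-in for the drillhole database; the table layout, the folder names and the `load_assay_files` function are all illustrative assumptions rather than any vendor's actual loader.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_assay_files(inbox: Path, conn: sqlite3.Connection) -> int:
    """Load every CSV in `inbox` into the assay table; return rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS assay ("
        "sample_id TEXT, element TEXT, value REAL, source_file TEXT)"
    )
    loaded_dir = inbox / "loaded"
    loaded_dir.mkdir(exist_ok=True)
    rows_loaded = 0
    for path in sorted(inbox.glob("*.csv")):
        with path.open(newline="") as handle:
            rows = [
                (r["sample_id"], r["element"], float(r["value"]), path.name)
                for r in csv.DictReader(handle)
            ]
        with conn:  # one transaction per file: all rows load or none do
            conn.executemany("INSERT INTO assay VALUES (?, ?, ?, ?)", rows)
        path.rename(loaded_dir / path.name)  # avoid loading the same file twice
        rows_loaded += len(rows)
    return rows_loaded


# Demonstration with a temporary inbox and an in-memory database.
with tempfile.TemporaryDirectory() as tmp:
    inbox = Path(tmp)
    (inbox / "batch_001.csv").write_text(
        "sample_id,element,value\nS0001,Au,1.25\nS0002,Au,0.08\n"
    )
    conn = sqlite3.connect(":memory:")
    first_run = load_assay_files(inbox, conn)
    second_run = load_assay_files(inbox, conn)  # nothing left in the inbox
    total_rows = conn.execute("SELECT COUNT(*) FROM assay").fetchone()[0]
    conn.close()
```

The file name is stored against each row so that any assay can be traced back to the lab file it came from, and nobody needs to open the file by hand at any stage.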

Page 11: Geoscientific Data Management Principles

Principle 8: Drillhole Data Staging

The recommendation here is that the data is loaded into the

database but is not available to general users or any

extraction/reporting facilities until it has been approved (i.e. checked

that all relevant data is present, QAQC is acceptable, etc). Ultimately

what is to be avoided is unapproved data being used in what may be

critical calculations or decisions.

An example would be a geochemist including assay data in an extract

they ran, when it is later revealed by the geologist responsible for the

data that it in fact failed its QAQC and was subsequently re-assayed

by the laboratory. Meanwhile the geochemist is unaware that they have

some poor quality data that has been superseded.

While this principle is intimately linked to the following one and may

at first appear to be the same, they are in fact separate as many

companies apply Principle 9 but not Principle 8.
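A simple way to picture staging is a database in which extracts read only from a view that filters out unapproved holes. The sketch below uses SQLite purely for illustration; the table names, the `assay_released` view and the `approve_hole` function are hypothetical, and a production system would also restrict permissions on the base tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE drillhole (
        hole_id     TEXT PRIMARY KEY,
        approved_by TEXT,            -- NULL while the hole is still staged
        approved_on TEXT
    );
    CREATE TABLE assay (
        hole_id   TEXT REFERENCES drillhole(hole_id),
        sample_id TEXT,
        au_ppm    REAL
    );
    -- General users and extracts read this view, never the base tables.
    CREATE VIEW assay_released AS
        SELECT a.hole_id, a.sample_id, a.au_ppm
        FROM assay AS a
        JOIN drillhole AS d ON d.hole_id = a.hole_id
        WHERE d.approved_by IS NOT NULL;
    """
)
conn.executemany(
    "INSERT INTO drillhole (hole_id) VALUES (?)", [("DH001",), ("DH002",)]
)
conn.executemany(
    "INSERT INTO assay VALUES (?, ?, ?)",
    [("DH001", "S0001", 1.25), ("DH002", "S0002", 0.40)],
)


def approve_hole(conn, hole_id, user_id):
    """Release a staged hole and record who signed it off."""
    conn.execute(
        "UPDATE drillhole SET approved_by = ?, approved_on = datetime('now') "
        "WHERE hole_id = ?",
        (user_id, hole_id),
    )


before = conn.execute("SELECT COUNT(*) FROM assay_released").fetchone()[0]
approve_hole(conn, "DH001", "jsmith")
released = conn.execute(
    "SELECT hole_id, sample_id FROM assay_released"
).fetchall()
```

Both holes are loaded, but only the approved one is visible to anyone querying the view, and the approver's userid is stored against it.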

Page 12: Geoscientific Data Management Principles

Principle 9: Drillhole Signoff/Approval

This principle is centred around the assigning of accountability for

the quality of the data to the person who is responsible for it. This is

the logical subsequent step to the previous point and records the

name of the approver against the data.

Elements of a sense of ownership of the data, as discussed in

Principle 1, are equally valid here.

Page 13: Geoscientific Data Management Principles

Principle 10: Audit Trail Facility

The recommendation here is that all inserts/deletes/modifications made

in the database are logged (date/time, userid, previous value) to

sufficient detail to allow for rollback to occur if required. This

then allows for the correction of data contaminated whilst in the

database, whether by accident or malicious intent.
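Database triggers are one common way to build such a log. The sketch below, which uses SQLite for illustration, records the date/time, userid and previous value for each change to a single column and then uses the log to undo a bad edit. The `collar` and `collar_audit` tables, the `current_session` table standing in for the database login, and the `rollback_last_update` function are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE collar (hole_id TEXT PRIMARY KEY, easting REAL);
    CREATE TABLE current_session (user_id TEXT);  -- stand-in for the DB login
    CREATE TABLE collar_audit (
        changed_at  TEXT DEFAULT CURRENT_TIMESTAMP,
        user_id     TEXT,
        action      TEXT,
        hole_id     TEXT,
        old_easting REAL             -- previous value, NULL for inserts
    );
    CREATE TRIGGER collar_insert AFTER INSERT ON collar
    BEGIN
        INSERT INTO collar_audit (user_id, action, hole_id, old_easting)
        VALUES ((SELECT user_id FROM current_session), 'INSERT', NEW.hole_id, NULL);
    END;
    CREATE TRIGGER collar_update AFTER UPDATE ON collar
    BEGIN
        INSERT INTO collar_audit (user_id, action, hole_id, old_easting)
        VALUES ((SELECT user_id FROM current_session), 'UPDATE', OLD.hole_id, OLD.easting);
    END;
    CREATE TRIGGER collar_delete AFTER DELETE ON collar
    BEGIN
        INSERT INTO collar_audit (user_id, action, hole_id, old_easting)
        VALUES ((SELECT user_id FROM current_session), 'DELETE', OLD.hole_id, OLD.easting);
    END;
    """
)


def rollback_last_update(conn, hole_id):
    """Restore the easting recorded by the most recent UPDATE audit row."""
    row = conn.execute(
        "SELECT old_easting FROM collar_audit "
        "WHERE hole_id = ? AND action = 'UPDATE' ORDER BY rowid DESC LIMIT 1",
        (hole_id,),
    ).fetchone()
    if row is not None:
        conn.execute(
            "UPDATE collar SET easting = ? WHERE hole_id = ?", (row[0], hole_id)
        )


conn.execute("INSERT INTO current_session VALUES ('jsmith')")
conn.execute("INSERT INTO collar VALUES ('DH001', 500100.0)")
conn.execute("UPDATE collar SET easting = 50010.0 WHERE hole_id = 'DH001'")  # bad edit
rollback_last_update(conn, "DH001")
restored = conn.execute(
    "SELECT easting FROM collar WHERE hole_id = 'DH001'"
).fetchone()[0]
audit_actions = [
    r[0] for r in conn.execute("SELECT action FROM collar_audit ORDER BY rowid")
]
```

Note that the rollback is itself an update and is therefore logged too, so the audit trail keeps a complete history of the mistake and its correction.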

Page 14: Geoscientific Data Management Principles

Principle 11: External Database Audits

This is a self-explanatory principle: external audits provide an

independent assessment of the quality of the data stored and the

processes used in obtaining and approving it. Given that

large investment decisions may be made on the basis of this data,

it is essential that this process occur on a semi-regular basis.

Page 15: Geoscientific Data Management Principles

Principle 12: Database Photo Management

While storing photos within a database is a recent technology (e.g. SQL

Filestream), its adoption is recommended for the following reasons:

• Current folder-based systems do not easily allow for integration into

other systems or software packages

• Accidental or malicious deletion may not be recoverable in folder-

based systems

• Folder-based systems currently offer no useful way to store metadata about the

image(s)

• Folder-based systems do not cater well for ATV/OTV images or

images from emerging technologies such as Hylogger.

Standardised Folder-based Photo Management

• If, due to budget or technology restrictions, it is not possible to

implement Principle 12, then a folder-based system is still better

than no system at all. In this case it is imperative that the correct

permissions be set up on the folders/system to minimise the risk of

accidental or malicious deletion. Further steps should also be taken

to regularly back up the system for the same reason.

Page 16: Geoscientific Data Management Principles

Principle 13: One GIS Software Standard

A simple principle, though one that often gets overruled by

personnel in islands of accountability standing their ground and

insisting that they need a particular system despite the fact that

no one else in the organisation is using it.

The advantages are obvious:

• Potential savings on licensing costs.

• Elimination of conflict with IT groups who logically want to

reduce the number of applications they need to cater for.

• Less doubling up of data, i.e. data stored once for each system,

which creates the potential for multiple, unsynchronised versions

(“multiple truths”).

• Less conversion of data from one package to another, which

allows for the possibility of mistakes and contamination,

particularly where coordinate system conversions are involved.

Page 17: Geoscientific Data Management Principles

Principle 14: Controlled GIS Data

This principle is primarily concerned with avoiding multiple truths

and lost data. In the application of this principle all GIS data is

published to a structured area and users are expected to access

this area for their GIS data. Other data sets brought into the

organisation must go through this process of being published

prior to use.

Implementation of this principle may be done simply with a folder

structure where proper permissions have been set to avoid

deletion or over-writing of the published data. A more

sophisticated option would be an environment such as SharePoint

where data can be checked in and out with full version control.

Page 18: Geoscientific Data Management Principles

Principle 15: Centralised Grid Transformations

Grid and coordinate conversions are a constant source of error

and contamination within many organisations. Implementation of

this principle involves a sophisticated system where grid

definitions are entered into a database by a surveyor and their

userid is recorded against the entry in much the same vein as in

Principle 9. The system must be capable of versioning these

definitions as they do change over time.

The database then produces a definition file that is accessed by

the conversion software. The apparently complex part then is

integrating your GIS and other packages to utilise this conversion

software to do all coordinate conversions.
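The versioning and lookup side of this can be sketched briefly. The example below assumes, purely for illustration, that a local mine grid is related to the regional grid by a two-dimensional similarity transformation (rotation, scale and shift); the grid name, the parameter values, the surveyor userids and the function names are all hypothetical.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class GridDefinition:
    name: str
    valid_from: date      # date this version of the definition took effect
    entered_by: str       # surveyor's userid, recorded against the entry
    rotation_deg: float
    scale: float
    shift_east: float
    shift_north: float


# Hypothetical versions of one grid definition, as stored in the database.
DEFINITIONS = [
    GridDefinition("MINE_GRID", date(2010, 1, 1), "surveyor_a", 0.0, 1.0, 1000.0, 2000.0),
    GridDefinition("MINE_GRID", date(2014, 6, 1), "surveyor_b", 90.0, 1.0, 1000.0, 2000.0),
]


def definition_for(name, on_date, definitions=DEFINITIONS):
    """Return the latest version of a grid definition in force on `on_date`."""
    candidates = [
        d for d in definitions if d.name == name and d.valid_from <= on_date
    ]
    if not candidates:
        raise LookupError(f"no definition of {name} in force on {on_date}")
    return max(candidates, key=lambda d: d.valid_from)


def local_to_regional(east, north, grid):
    """Apply the rotation, scale and shift of `grid` to a local coordinate."""
    theta = math.radians(grid.rotation_deg)
    regional_east = grid.scale * (
        east * math.cos(theta) - north * math.sin(theta)
    ) + grid.shift_east
    regional_north = grid.scale * (
        east * math.sin(theta) + north * math.cos(theta)
    ) + grid.shift_north
    return regional_east, regional_north


old_grid = definition_for("MINE_GRID", date(2012, 3, 15))
new_grid = definition_for("MINE_GRID", date(2015, 7, 11))
old_result = local_to_regional(100.0, 50.0, old_grid)
new_result = local_to_regional(100.0, 50.0, new_grid)
```

Because every package asks the same lookup for the definition in force on a given date, a survey from 2012 and one from 2015 are each converted with the parameters that applied at the time.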

While the above does sound overly complex, the truth is that

the architecture and execution are not particularly difficult. What

this then allows for is:

Continued on next slide

Page 19: Geoscientific Data Management Principles

Principle 15: Centralised Grid Transformations (cont)

• Elimination of multiple versions of coordinate conversion

formulas and macros that once released are impossible to

control.

• Following on from the above is the elimination of potentially

expensive mistakes caused by using the wrong or outdated

conversion facility.

Standardised Grid Transformations

• If it is not possible to implement Principle 15 as described

above, then it is recommended that surveyor-approved

transformation parameters or formulae are published to a

central area where they can be accessed by users, in much the

same way as discussed in Principle 14. This area is likely a

folder structure and as such should have the correct

permissions to prevent deletion or editing except by the

surveyors.

Page 20: Geoscientific Data Management Principles

Principle 16: Controlled Geophysics Data

The principle in this instance is very similar to Principles 14 & 15

in that approved data is published to a central area, protected by

permissions, where users go to access the processed geophysical

data.

With regard to the raw geophysical data, while this is almost

useless to anybody but the geophysicists, the data should still be

stored in a protected folder system to prevent the contamination

or loss of the primary, unprocessed data.

Page 21: Geoscientific Data Management Principles

Principle 17: Database Driven Tenement Management


Page 22: Geoscientific Data Management Principles

Principle 18: Exploration Embedded IT People

This principle involves IT specialists embedded in, and paid for by,

the exploration group but who have a reporting line through to

the company’s IT department. This is the preferable choice as the

personnel are fully exposed to the exploration requirements,

challenges and planning schedule but are grounded in the IT

requirements of standardisation where feasible and security

issues.

Exploration-centric IT People

Should Principle 18 not be a feasible option, then the

organisation’s IT group should have support and architecture

people for whom a significant part of their focus is the exploration

group, and who are familiar with its requirements (which sometimes

change rapidly) and with the limitations/demands of the

remote environs in which exploration personnel frequently work.