gci analysis

20
® QUAlity aware VIsualisation for the QUAlity aware VIsualisation for the Global Earth Observation system of Global Earth Observation system of systems systems Reading meeting. December 12-14th, 20 GCI Analysis December 12-14th, 2011 University of Reading CREAF-UAB

Upload: elma

Post on 13-Jan-2016

73 views

Category:

Documents


0 download

DESCRIPTION

GCI Analysis. December 12-14th, 2011 University of Reading CREAF-UAB. Numbers. Total metadata records in the Clearinghouse 97203 Metadata records with quality indicators 19107 Total number of quality indicators 52187 Metadata records with lineage 10899 (9261 process, 3771 source) - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: GCI Analysis

®

QUAlity aware VIsualisation for the Global Earth QUAlity aware VIsualisation for the Global Earth Observation system of systemsObservation system of systems

Reading meeting. December 12-14th, 2011

GCI Analysis

December 12-14th, 2011 University of Reading

CREAF-UAB

Page 2: GCI Analysis

Reading meeting. December 12-14th,

2011

2

Numbers

• Total metadata records in the Clearinghouse– 97203

• Metadata records with quality indicators– 19107

• Total number of quality indicators– 52187

• Metadata records with lineage

– 10899 (9261 process, 3771 source)

• Metadata with usage– 1226

Page 3: GCI Analysis

Reading meeting. December 12-14th,

2011

3

Methodology

• Harvest (download) all XML documents

• Massive extraction of MD quality elements– Quality elements– Lineage– Usage

PTB. London 12-14th December 2011UAB. WP7

97203 XML documents

XLS Table document

GestBD

Xpath extraction

CSW

GEOSS Clearinghouse

Page 4: GCI Analysis

Reading meeting. December 12-14th,

2011

4

Quality indicators

• 19.66% Metadata records with quality• 2.7 quality indicators per document

Page 5: GCI Analysis

Reading meeting. December 12-14th,

2011

5

Quality indicators

Page 6: GCI Analysis

Reading meeting. December 12-14th,

2011

6

Quality indicators

PositionalAccuracy95.38%

ThematicAccuracy

2.60%

Completeness0.46%

TemporalAccuracy

0.06%Logical

Consistency0.02%

Quality Indicators in IDEC Metadata

GEOSS

IDEC

Page 7: GCI Analysis

Reading meeting. December 12-14th,

2011

7

Quality indicator values

• Quantitative 22275• Conformance 3669 (mainly conformance to INSPIRE) • Coverage 5

Page 8: GCI Analysis

Reading meeting. December 12-14th,

2011

8

19115-2 Extension for "per pixel" quality

Page 9: GCI Analysis

Reading meeting. December 12-14th,

2011

9

Coverage result (ISO19115-2 extension)

• Clearinghouse record: 273234, 273232, 273233, 273235, 273236)• Only 5 records use this. Bad news for visualizing data + quality maps

• Title: OMNO2e:OMI Column Amount NO2:ColumnAmountNO2CS30

<gmd:DQ_QuantitativeAttributeAccuracy><gmd:measureDescription>

<gco:CharacterString>The 'version 003' product is the second public release. It is based on improved radiance calibration. For details, please see document: http://disc.sci.gsfc.nasa.gov/Aura/OMI/OMTO3e_v003.shtml</gco:CharacterString>

</gmd:measureDescription><gmd:result><gmi:QE_CoverageResult>

<gmi:spatialRepresentationType><gmd:MD_SpatialRepresentationTypeCode codeList="./resources/codeList.xml#MD_SpatialRepresentationTypeCode" codeListValue="grid">grid</gmd:MD_SpatialRepresentationTypeCode></gmi:spatialRepresentationType>

<gmi:resultFile gco:nilReason="missing" /> <gmi:resultFormat>

<gmd:MD_Format><gmd:name><gco:CharacterString>CF-netCDF</gco:CharacterString></gmd:name></gmd:MD_Format>

</gmi:resultFormat></gmi:QE_CoverageResult></gmd:result>

</gmd:DQ_QuantitativeAttributeAccuracy>

Page 10: GCI Analysis

Reading meeting. December 12-14th,

2011

10

LI_Lineage

??

Page 11: GCI Analysis

Reading meeting. December 12-14th,

2011

11

LI_Lineage: MD_Source

• 3771• 1702 with temporal extent

• This is the simplest way to mention that this resource was derived from previous (re)sources

• Gives credit (attribution, and eventually some trust on them)

• Allows some from of descriptive quality report based on sources

class LI_Source_only

LI_Source

+ description :CharacterString [0..1]+ sourceSpatialResolution :MD_Resolution [0..1]+ sourceReferenceSystem :MD_ReferenceSystem [0..1]+ sourceCitation :CI_Citation [0..1]+ sourceMetadata :CI_Citation [0..*]+ scope :DQ_Scope [0..*]

constraints{"description" is mandatory if "scope" is not documented}{"scope" is mandatory if "description" is not documented}

LI_Lineage

+ statement :CharacterString [0..1]+ scope :DQ_Scope [0..*]

constraints{"source" role is mandatory if LI_Lineage.statement and "processStep" role are not documented}{"processStep" role is mandatory if LI_Lineage.statement and "source" role are not documented}

Metadata Information::MD_Metadata

+source 0..*

+resourceLineage0..*

Page 12: GCI Analysis

Reading meeting. December 12-14th,

2011

12

LI_Lineage: MD_ProcessStep

• 9261• 8035 MD_ProcessStep only• 292 with date

• Information on the exact list of processes execution and the order of these processes.

• Difficult to infer resource quality with only a process list

class From_LI_ProcessStep_to_LI_Source

LI_ProcessStep

+ description :CharacterString

+ rationale :CharacterString [0..1]

+ stepDateTime :TM_Primitive [0..*]

+ processor :CI_ResponsiblePartyInfo [0..*]

+ reference :CI_Citation [0..*]

+ scope :DQ_Scope [0..*]

LI_Lineage

+ statement :CharacterString [0..1]

+ scope :DQ_Scope [0..*]

constraints

{"source" role is mandatory if LI_Lineage.statement

and "processStep" role are not documented}

{"processStep" role is mandatory if

LI_Lineage.statement and "source" role are not

documented}

Metadata Information::MD_Metadata

+processStep 0..*

+resourceLineage0..*

Page 13: GCI Analysis

Reading meeting. December 12-14th,

2011

13

True Provenance:MD_ProcessStep with MD_Source

• 1226 cases

• List of processes execution and the order of these processes

• How and when the data sources where used

• We can deduce which sources have more influence in the quality of the final result

class From_LI_ProcessStep_to_LI_Source

LI_Source

+ description :CharacterString [0..1]+ sourceSpatialResolution :MD_Resolution [0..1]+ sourceReferenceSystem :MD_ReferenceSystem [0..1]+ sourceCitation :CI_Citation [0..1]+ sourceMetadata :CI_Citation [0..*]+ scope :DQ_Scope [0..*]

constraints{"description" is mandatory if "scope" is not documented}{"scope" is mandatory if "description" is not documented}

LI_ProcessStep

+ description :CharacterString+ rationale :CharacterString [0..1]+ stepDateTime :TM_Primitive [0..*]+ processor :CI_ResponsiblePartyInfo [0..*]+ reference :CI_Citation [0..*]+ scope :DQ_Scope [0..*]

LI_Lineage

+ statement :CharacterString [0..1]+ scope :DQ_Scope [0..*]

constraints{"source" role is mandatory if LI_Lineage.statement and "processStep" role are not documented}{"processStep" role is mandatory if LI_Lineage.statement and "source" role are not documented}

Metadata Information::MD_Metadata

+processStep 0..*

+resourceLineage0..*

+source

0..*

Page 14: GCI Analysis

Reading meeting. December 12-14th,

2011

14

Complete provenance in ISO19115-2

• LI_ProcessStep includes a LE_Processing that has a runTimeParameters attribute that allows us describing the exact list of parameters used in the execution.

• Exact information needed to execute the process and to reproduce it.

• There is a citation of the algorithm used (LI_Algorithm).

• We can completely evaluate de quality of the resulting product if we know the uncertainties that sources have in their metadata (sourceMetadata citation in LI_Source).

class From_LE_ProcessStep_to_LE_Source

From ISO 19115-2:2009 shown for informative purposes only

LI_Source

+ description :CharacterString [0..1]+ sourceSpatialResolution :MD_Resolution [0..1]+ sourceReferenceSystem :MD_ReferenceSystem [0..1]+ sourceCitation :CI_Citation [0..1]+ sourceMetadata :CI_Citation [0..*]+ scope :DQ_Scope [0..*]

constraints{"description" is mandatory if "scope" is not documented}{"scope" is mandatory if "description" is not documented}

LI_ProcessStep

+ description :CharacterString+ rationale :CharacterString [0..1]+ stepDateTime :TM_Primitive [0..*]+ processor :CI_ResponsiblePartyInfo [0..*]+ reference :CI_Citation [0..*]+ scope :DQ_Scope [0..*]

LI_Lineage

+ statement :CharacterString [0..1]+ scope :DQ_Scope [0..*]

constraints{"source" role is mandatory if LI_Lineage.statement and "processStep" role are not documented}{"processStep" role is mandatory if LI_Lineage.statement and "source" role are not documented}

Metadata Information::MD_Metadata

Data quality information - Imagery::

LE_ProcessStep

Data quality information - Imagery::LE_ProcessStepReport

+ name :CharacterString+ description :CharacterString [0..1]+ fi leType :CharacterString [0..1]

If "LE_NominalResolution.scanningResolution" is usedthen "LE_Source.scaleDenominator" is required

Data quality information - Imagery::LE_Source

+ processedLevel :MD_Identifier [0..1]+ resolution :LE_NominalResolution [0..1]

Data quality information - Imagery::LE_Processing

+ identifier :MD_Identifier+ softwareReference :CI_Citation [0..1]+ procedureDescription :CharacterString [0..1]+ documentation :CI_Citation [0..*]+ runTimeParameters :CharacterString [0..1]

Data quality information - Imagery::LE_Algorithm

+ citation :CI_Citation+ description :CharacterString

«Union»Data quality information - Imagery::

LE_NominalResolution

+ scanningResolution :Distance+ groundResolution :Distance

"description" is mandatory if "sourceExtent" is not documented

"sourceExtent" is mandatory if "description" is not documented

+processStep 0..*

+resourceLineage

0..*

+report 0..*

+output

0..*

+processingInformation0..1

+algorithm 0..*

Page 15: GCI Analysis

Reading meeting. December 12-14th,

2011

15

MD_ProcessStep with MD_Source Example

Clearinghouse record 131007 (simplified)• Compile survey input data from the best and most current survey records.

– BLM database of the index to all official (microfilm, CD, other) BLM survey records.– USFS survey records.– Private land surveyor records– GCDB Data Collection Attribute Definitions Version 2.0, Appendix A, 2/14/1991. Survey records used - source

abbreviations.• Compile listings of known locations of PLSS corners.

– USGS topographic quadrangles and other sources.– USC&amp;GS published coordinate data.– NGS published coordinate data.– BLM global positioning Data.– USFS global positioning data.

• Coordinates of control stations are entered into a control data base with associated reliabilities.• Topologically correct GIS coverages are modified to use FGDC compliant naming conventions and

then loaded into the LSI database. These layers can then be downloaded as shapefiles through the LSI website.

• GCDB Data was downloaded for Kiowa and Cheyenne Counties, Colorado.– C:\f\gis_data\sand\zipped\kiowa\twnshp.shp.xml

• Metadata imported and data was exported from regions format to shapefile format• Dataset copied.

– C:\f\gis_data\sand\data\basedata\plss\ck_gcdb_region_township• Source Contribution: Survey data in the form of official (microfilm, CD, other) survey and BLM,

abstracted into a vector digital format.online• Source Contribution: Survey and control data from the Cartographic Feature File (CFF) data set.disc• Source Contribution: Digitized control data from standard topological quadrangle sheets.disc

Page 16: GCI Analysis

Reading meeting. December 12-14th,

2011

16

User feedback

• There is one an small entry for user feedback in the current ISO-19115:

• MD_Usage– Brief description of ways in which

the resource is currently or has been used

Page 17: GCI Analysis

Reading meeting. December 12-14th,

2011

17

MD_Usage

• MD_Usage– specificUsage

• brief description of the resource and/or resource series usage– usageDateTime

• date and time of the first use or range of uses of the resource and/or resource series– userDeterminedLimitations

• applications, determined by the user for which the resource and/or resource series is not suitable

– userContactInfo • identification of and means of communicating with person(s) and organization(s) using the

resource(s)

• There are 1133 entries (only specificUsage and userContactInfo)

• ALL MADE BY ONE PERSON!!:– Landesvermessung und Geobasisinformation Brandenburg (LGB) – Tel +49-331-8844-123, Fax. +49-331-8844-16123 – Heinrich-Mann-Allee 103, Potsdam, Brandenburg 14473, Deutschland– [email protected] – http://www.geobasis-bb.de

Page 18: GCI Analysis

Reading meeting. December 12-14th,

2011

18

Conclusions

• Quality indicators are present and in many different kinds– A previous study in the Catalan SDI shows a worse situation

• Lineage information is rich in many documents with some documents with more that 100 entries in source or processSteps

• We have usage examples• Quality coverage results (by pixel) are almost inexistent an

the link not there

• Current data is enough to demonstrate search and visualization with some limitations

• Summary data tables and metadata xml files are available to the group to work with.

Page 19: GCI Analysis

Reading meeting. December 12-14th,

2011

19

Presented to

• EuroGEOSS conference• The EC as D7.2 deliverable

• If possible, we will extent this study to some of the capacity catalogues

Page 20: GCI Analysis

®

QUAlity aware VIsualisation for the Global Earth QUAlity aware VIsualisation for the Global Earth Observation system of systemsObservation system of systems

Reading meeting. December 12-14th, 2011

Thank you!