CLARUS – H2020-ICT-2014 – G.A. 644024
© CLARUS Consortium 1 / 80
A framework for user centred privacy and security in the cloud
Definition of Application Cases
Type (distribution level): Public
Contractual date of delivery: 30-04-2015
Actual date of delivery: 12-06-2015
Deliverable number: D2.1
Deliverable name: Definition of Application Cases
Version: V1.1
Number of pages: 80
WP/Task related to the deliverable: Task 2.1
WP/Task responsible: AKKA
Author(s): AKKA and FCRB Teams
Partner(s) contributing: All
Document ID: CLARUS-D2.1-DefinitionOfApplicationCases-v1.1
Abstract This document analyses and specifies the application cases targeting
e-Health and publication of Geo-referenced data on the Internet. The
goal of this analysis is the identification of a number of
demonstration cases that are the main input for the refinement of
CLARUS requirements (WP2) and for the CLARUS implementation
(WP5). The demonstration cases cover all major aspects of the
CLARUS results. The demonstrations developed on the basis of this
specification will enable integration testing and support the final
evaluation of the project results to be carried out in WP6. In addition,
the application cases provide working examples that will support the
exploitation and dissemination activities (WP7).
Disclaimer
CLARUS (G.A. 644024) is a Research and Innovation Actions project funded by the EU Framework Programme for Research and Innovation Horizon 2020. This document contains information on CLARUS core activities, findings and outcomes. Any reference to content in this document should clearly indicate the authors, source, organisation and publication date. The content of this publication is the sole responsibility of the CLARUS consortium and cannot be considered to reflect the views of the European Commission.
Table of Contents
1 INTRODUCTION
1.1 SCOPE OF THE DOCUMENT
1.2 METHODOLOGY / DOCUMENT PLAN
1.3 DELIVERABLE OUTCOME AND FITTING IN THE PROJECT WORKFLOW
1.4 APPLICABLE AND REFERENCE DOCUMENTS
1.5 REVISION HISTORY
1.6 NOTATIONS, ABBREVIATIONS AND ACRONYMS
1.6.1 Acronyms
1.6.2 Definitions
2 GEO PUBLICATION APPLICATION CASE
2.1 OVERVIEW
2.2 ACTORS
2.2.1 Data Providers
2.2.2 Data Consumers
2.2.3 Application Providers
2.2.4 IT Team
2.2.5 Security Manager
2.2.6 Cloud Service Provider
2.3 DATASETS
2.3.1 Geospatial data
2.3.2 Geospatial datasets for CLARUS
2.4 SERVICES
2.4.1 Introduction
2.4.2 Publication Services
2.4.3 Access Services
2.4.4 Computation Services
2.4.5 Exploitation & Operation Services
2.5 SECURITY EXPECTATIONS
2.5.1 Expectations regarding geospatial data
2.5.2 Expectations regarding geospatial services
2.5.3 Expectations regarding personal data
2.5.4 Expectations regarding Cloud-based hosting
2.6 DEMONSTRATION CASES FOR CLARUS
2.6.1 Geodata storage in the Cloud
2.6.2 Geodata publication in the Cloud
2.6.3 Collaboration on geodata in the Cloud
2.7 SECURING DEMONSTRATION CASES
2.7.1 Securing geodata storage in the Cloud
2.7.2 Securing geodata publication in the Cloud
2.7.3 Securing geodata collaboration in the Cloud
2.7.4 Summary
3 E-HEALTH APPLICATION CASE
3.1 OVERVIEW
3.2 ACTORS
3.2.1 Data Providers
3.2.2 Data Consumers
3.2.3 IT Team
3.2.4 Security Manager
3.2.5 Cloud Service Provider
3.3 DATASETS
3.3.1 Introduction
3.3.2 Standards used in the e-Health use case
3.3.3 E-Health Dataset
3.4 SERVICES
3.4.1 Introduction
3.4.2 Data Publication
3.4.3 Metadata Management
3.4.4 Search
3.4.5 Advanced Queries
3.4.6 Statistics Computation
3.4.7 Transformation Services
3.4.8 Exploitation Services
3.5 SECURITY EXPECTATIONS
3.5.1 Expectations in terms of data
3.5.2 Expectations in terms of services
3.6 DEMONSTRATION CASES FOR CLARUS
3.6.1 Securing Passive Medical Health Records storage in the cloud
3.6.2 Securing Passive Medical Health Records access and retrieval from the cloud
3.6.3 Securing Passive Medical Health Record for the Advanced Query
3.6.4 Securing Passive Medical Health Record for Statistics Computation query
3.7 SUMMARY
4 CONCLUSION
Table of Figures
Figure 1 - INSPIRE network services ([R8])
Figure 2 - Kriging interpolation formula
Figure 3 - Kriging semi-variance formula
Figure 4 - Illustration of semivariogram components [R11]
Figure 5 - Geodata Storage Activity Diagram
Figure 6 - Geodata Publication Diagram
Figure 7 - Geo-processing Activity Diagram
Figure 8 - Server configuration of WPS Kriging implementation [R10]
Figure 9 - Geodata Collaboration (Consultation) Activity diagram
Figure 10 - Geodata Collaboration (Modification) Activity diagram
Figure 11 - Geo Data Demonstration Cases with regard to Security Expectations
Figure 12 - CLARUS Generic Scenarios with regard to Geo Data Demonstration Cases
Figure 13 - Medical Health Records “Passivation” process
Figure 14 - Medical Health Records access and data retrieval process
Figure 15 - Advanced Query to the CLARUS Cloud process
Figure 16 - Medical Record Statistics Computation query
Figure 17 - Medical Health Records storage in the cloud diagram
Figure 18 - Medical Health Records access and retrieval diagram
Figure 19 - Medical Health Record Advanced Query diagram
Figure 20 - Medical Health Record Statistics Computation query diagram
Figure 21 - e-Health Demonstration Cases with regard to Security Expectations
1 Introduction
1.1 Scope of the document
The requirements elicitation and specification process for CLARUS methodologies, technologies and
tools is supported through the analysis of two main application cases that will serve to demonstrate
the appropriateness and applicability of project results (see [A1]).
This document defines in more detail the application cases that will inform all further development in
the project, drawn from the two main domains:
Publication of geo-referenced data on the Internet
e-Health
This includes identifying the main actors involved and the services they use, maintain and/or develop,
which rely on various types of datasets that we attempt to categorize from trust and security
perspectives. This gives an overview of the domains at stake as a basis for further work,
including the specification of CLARUS requirements.
1.2 Methodology / Document Plan
For this document, we have reviewed several of the most prominent projects and initiatives in each
application domain. This allowed us to converge on lists of Actors, Datasets, and Application Cases
that appear relevant for demonstrating the appropriateness and applicability of CLARUS solutions.
Chapter 2 describes the domain of geospatial data publication on the internet, presenting actors,
datasets, services, security expectations, and detailing three cloud-based scenarios where data
confidentiality is an issue. These scenarios will serve as demonstration cases for CLARUS.
Chapter 3 describes the domain of e-Health, presenting actors, datasets, services, security expectations
and detailing two cloud-oriented scenarios where data privacy is an issue. These scenarios will serve as
demonstration cases for CLARUS.
Chapter 4 provides a conclusion to the present document, paving the way for the definition of CLARUS
requirements.
1.3 Deliverable outcome and fitting in the project workflow
The publication of this deliverable should help in the specification of CLARUS requirements and in
defining the evaluation, testing and validation undertaken in WP6.
1.4 Applicable and Reference documents
App./Ref. Title
[A1] CLARUS Grant Agreement 644024
[A2] CLARUS Consortium Agreement v3.0, Dec. 2014
[A3] Internet Security Glossary, RFC 4949
[A4] CLARUS D2.1 Annex I - Review of EU geo-publication projects
[R1] INSPIRE Thematic Working Group Geology, Data Specification on Geology – Technical Guidelines, Dec. 2013
[R2] INSPIRE Thematic Working Group Mineral Resources, Data Specification on Mineral Resources – Technical Guidelines, Dec. 2013
[R3] INSPIRE Thematic Working Group EMF, Data Specification on Environmental Monitoring Facilities (EMF) – Technical Guidelines
[R4] INSPIRE Thematic Working Group Utility and governmental services, Data Specification on Utility and governmental services – Technical Guidelines
[R5] Official Journal of the European Union, Directive 2007/2/EC of the European Parliament and of the Council of 14 March 2007 establishing an Infrastructure for Spatial Information in the European Community (INSPIRE)
[R6] Provisions contained in Article 4(2) of Directive 2003/4/EC on public access to environmental information
[R7] Provisions contained in Article 8(1) of the EU Data Protection Directive 95/46/EC on the protection of individuals with regard to the processing of personal data and on the free movement of such data
[R8] InGeoCloudS D2.1, Use Cases for InGeoCloudS Data and Services, Version 1.1, May 2013
[R9] InGeoCloudS D2.2, Interface of web services and models of data, Version 1.0
[R10] Kriging implementation documentation, InGeoCloudS communication sheet, E. Grinias, EKBAA
[R11] ArcGIS Help 10.1: How Kriging works. Available at: http://resources.arcgis.com/en/help/main/10.1/index.html#//00q90000001t000000
[R12] EGDI-Scope D5.1, Report on trust and authentication, Katleen Janssen, Jos Dumortier (KU Leuven), August 2013
[R13] Design security and geo-rights management services in spatial data infrastructure, T. Kubik et al.
[R14] Estimating Kriging-Based predictions with privacy, B. Tugrul, H. Polat, Oct. 2012
[R15] OpenStreetMap website. Available at: https://www.openstreetmap.org
[R16] Successful Response Starts with a Map, Improving Geospatial Support for Disaster Management, The National Academy of Sciences, 2007
[R17] Editing map client for geodata updating in emergency situations, M. Konecny et al., Masaryk University
[R18] Geographic Information Systems (GIS) for Disaster Management, Brian Tomaszewski, 2015
[R19] InGeoCloudS D3.1.3, Analysis and Monitoring of Clouds for Geo-Data Services, Version 1.0
[R20] InGeoCloudS D3.3, Maintenance plan and Service Profiling, Version 1.0
[R21] InGeoCloudS D4.2, Fully Operational InGeoCloudS Pilot, Version 1.0
[R22] Available at: http://www.hl7.org
[R23] Available at: http://www.who.int/classifications/icd/en/
[R24] Available at: https://loinc.org/
[R25] Available at: http://www.ihtsdo.org/snomed-ct
[R26] Available at: http://dicom.nema.org/
1.5 Revision History
Version | Date | Author | Description
0.1 | 20/02/2015 | AKKA | AKKA internal initial iteration
0.2 | 09/03/2015 | AKKA | AKKA internal iteration: integration of harmonized structure for Use Cases description
0.3 | 13/03/2015 | AKKA, FCRB | Incorporation of 1st FCRB inputs, restructuring of §2.5, share with consortium
0.4 | 14/04/2015 | AKKA | General comments on v0.3 from partners (mails + WebMeeting of 19/03) led to more details about the treated datasets, to activity diagrams for better describing business processes, and to sections focused on demonstrators.
0.5 | 30/04/2015 | AKKA, FCRB | Mapping security expectations with demonstrators in the geo case summary section, incorporation of FCRB inputs, share with consortium for WebMeeting 30/04
0.6 | 13/05/2015 | AKKA | Finalizing geo-publication part: added Kriging description, explanation on activity diagrams, securing demonstration cases section, reference to Annex I, conclusion
0.7 | 29/05/2015 | AKKA, FCRB | Finalizing e-Health part: added security manager actor, securing for advanced query and statistics computation queries part, references, figure title. Final version of deliverable D2.1.
0.8 | 01/06/2015 | FCRB | Added advanced query and statistics computation explanations.
0.9 | 09/06/2015 | AKKA | New release following reviews (MTI, THALES) and KUL comments. Changes in §1.6.2 definitions, §1.4 added R6, R7, R15 references, §Appendix, ToC, minor format revisions.
1.0 | 12/06/2015 | AKKA, FCRB | New release following MTI review for e-Health section. Final version of D2.1
1.1 | 12/06/2015 | Jesús A. Manjón (URV) | Style modifications
1.6 Notations, Abbreviations and Acronyms
1.6.1 Acronyms
BRGM: Bureau de Recherche Géologique et Minière
ANSI: American National Standards Institute
API: Application Programming Interface
CDA: Clinical Document Architecture
CSW: Catalogue Service for the Web
CT: Computed Tomography
DoW: Description of Work
DICOM: Digital Imaging and Communications in Medicine
DP: Data Provider
EC: European Commission
ESRI: Environmental Systems Research Institute
FP7: Seventh Framework Programme for Research
FTPS: File Transfer Protocol Secure
GA: Grant Agreement
GEUS: Geological Survey of Denmark and Greenland
GIS: Geographical Information System
GML: Geography Markup Language
HCCC: Història Clínica Compartida de Catalunya
HIS: Hospital Information System
HL7: Health Level 7 International
IaaS: Infrastructure as a Service
ICD: International Classification of Diseases
IGN: Institut National de l'Information Géographique et Forestière
IHTSDO: International Health Terminology Standards Development Organisation
InGeoCloudS: Inspired GEOdata CLOUD Services
INSPIRE: Infrastructure for Spatial Information in Europe
LIS: Laboratory Information Systems
LOINC: Logical Observation Identifiers Names and Codes
LOPD: Personal Data Protection Law in Spain (Ley Orgánica de Protección de Datos)
MHR: Medical Health Record
OGC: Open Geospatial Consortium
OSM: OpenStreetMap
OWL: Web Ontology Language
PaaS: Platform as a Service
PACS: Picture Archiving and Communication System
PMB: Project Management Board
PCIS: Primary Care Information Systems
PDF: Portable Document Format
RDBMS: Relational Database Management System
RDF: Resource Description Framework
REST: Representational State Transfer
RIF: Rule Interchange Format
SaaS: Software as a Service
SAML: Security Assertion Markup Language
SCP: Secure Copy
SFTP: Secure File Transfer Protocol
SNOMED CT: Systematized Nomenclature of Medicine Clinical Terms
ToC: Table of Contents
WFS: Web Feature Service
WHO: World Health Organization
WMS: Web Map Service
WP: Work Package
WPS: Web Processing Service
1.6.2 Definitions
Critical
A condition of a system resource such that denial of access to, or lack of availability of, that resource
would jeopardize a system user’s ability to perform a primary function or would result in other serious
consequences, such as human injury or loss of life [A3].
Confidential
Confidential information refers to information that is restricted from public dissemination for reasons
established by law, including inter alia national security, intellectual property, trade secrets,
international relations, public security or national defence [R5][R6].
Sensitive
Sensitive information related to personal data is defined as “personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs, trade-union membership, and the processing of data concerning health or sex life” [R7].
2 Geo Publication Application Case
2.1 Overview
Earth systems are coupled and tightly integrated; that is why discovering and sharing geo-referenced
data is critical for environmental professionals. However, finding data and processing resources across
disciplines is often an issue for this community.
All around the world, initiatives aim to remove technical obstacles to institutional information sharing
and to facilitate the adoption of open, spatially enabled reference architectures in enterprise
environments. The most relevant frameworks for (geo-)data professionals are the OGC (Open Geospatial
Consortium), which acts worldwide, and INSPIRE, initially designed and developed by the European
Commission through its JRC (Joint Research Centre):
The INSPIRE (Infrastructure for Spatial Information in the European Community) Directive establishes
rules for geographic and environmental data (geodata) supporting environmental policies or relating to
any activities which might have an impact on the European environment. This Directive aims at
ensuring that geodata are consistently available, interpretable and usable across European
regional and state boundaries. The consequence of the Directive is a requirement that geodata
definitions follow agreed and established norms and standards and that the data be readily
available online. Many of the standards promoted by INSPIRE currently come from the OGC.
The Open Geospatial Consortium (OGC) is an international industry consortium of 506
companies, government agencies and universities participating in a consensus process to
develop publicly available interface standards. OGC members work together in Standards
Working Groups and Domain Working Groups (such as Earth Systems Science, Hydrology,
Metadata, Meteorology & Oceanography, Sensor Web Enablement) to provide free and openly
available standards to the market. The OGC leads worldwide in the creation and establishment
of standards that enable global infrastructures for delivery and integration of geospatial content
and services into business and civic processes.
While these initiatives do not impose any solution in terms of data infrastructures, the cloud has
attracted ever-growing interest in recent years because of its ability to address common
requirements such as huge data volumes, ubiquitous access, quality of service, computation power
and economic competitiveness. Security issues are therefore highly relevant for the different actors in
the field, in particular when confidential/critical data are at stake.
2.2 Actors
2.2.1 Data Providers
These are actors that produce or collect information and feed it into a system. They can be commercial or
non-profit organizations (e.g. meteorological agency, geological survey, transport organization, Google,
IGN...), academic institutions, government agencies, scientific laboratories, or citizens (amateur scientists).
Data providers use the system in an authenticated manner.
2.2.2 Data Consumers
These are actors consuming data from the system. They can be end-users such as citizens, commercial or
non-profit organisations, academic institutions, government agencies, or scientific laboratories. They can
also be a third-party client application or system, such as an orchestration framework or an INSPIRE service.
Some services might not require any authentication from the user (e.g. public data published in line with
INSPIRE recommendations). Some data consumers might be granted access to data that are not public.
2.2.3 Application Providers
In the context of a platform as a service (PaaS), these actors have the capability of integrating new data
management applications and services in the system. They can be commercial or non-profit organizations
(e.g., meteorological agency, geological survey, transport organization, Google, IGN...), academic
institutions, government agencies, or scientific laboratories. They use the system in an authenticated manner.
2.2.4 IT Team
This is the technical team in charge of managing, monitoring and maintaining the (geo-publication)
system in operational conditions. They notably ensure that all technical components operate in nominal
conditions and that system issues are solved as quickly as possible.
2.2.5 Security Manager
This actor is in charge of defining, provisioning and maintaining security policies in an organisation for the
system under consideration. This can notably include account management and authorisation management.
2.2.6 Cloud Service Provider
This is the institution providing the IaaS, PaaS and/or SaaS services that Data Providers use for
pushing and managing their own data in the cloud.
As examples of IaaS services, we can cite Amazon Web Services (AWS), Microsoft Azure, etc.
For PaaS, we can cite ESRI Managed Cloud Services or InGeoCloudS, which notably allows application
providers to extend geo-spatial applications already online with additional services and technical
capabilities.
SaaS examples in the field include web mapping services such as MangoMap, ArcGIS Online,
InGeoCloudS, GISCloud, etc.
2.3 Datasets
2.3.1 Geospatial data
2.3.1.1 Data types
Geographical information is encoded following the standards defined by international consortiums such
as the Open Geospatial Consortium (OGC) or by GIS software vendors.
The GIS data types can be classified into two different categories: vector and raster.
Vector data are geographical features considered as geometrical shapes. Different geographical features
are expressed by different types of geometry:
Points are geographical features with zero dimensions and can best be expressed by a single
point reference. Points are used as simple location for wells, peaks, points of interest, etc.
Lines are geographical features with one dimension and are used for linear features such as
rivers, roads, railroads, topographic lines, etc.
Polygons are geographical features with two dimensions and are used to cover a particular area
of the earth's surface such as the boundary of a city (on a large scale map), lake, or forest.
Raster datasets represent geographic features by dividing the world into discrete square or rectangular
cells laid out in a grid. Each cell has a value that is used to represent some characteristic of that location.
Raster datasets are commonly used for representing and managing imagery, digital elevation models,
and numerous other phenomena.
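The distinction between the two categories can be illustrated with a minimal Python sketch. The class names, field names and sample values below are hypothetical and only serve the illustration; real GIS libraries use richer models.

```python
from dataclasses import dataclass
from typing import List, Tuple

Coordinate = Tuple[float, float]  # (x, y) in some coordinate reference system


@dataclass
class VectorFeature:
    """A vector feature: a geometry type, its coordinates and its attributes."""
    geometry_type: str             # "Point", "LineString" or "Polygon"
    coordinates: List[Coordinate]  # one coordinate for a point, several otherwise
    attributes: dict


@dataclass
class RasterGrid:
    """A raster: a grid of square cells anchored at its top-left corner."""
    x_min: float               # x of the left edge of the grid
    y_max: float               # y of the top edge of the grid
    cell_size: float           # width and height of one cell
    values: List[List[float]]  # values[row][col], row 0 is the top row

    def value_at(self, x: float, y: float) -> float:
        """Return the value of the cell that contains the location (x, y)."""
        col = int((x - self.x_min) // self.cell_size)
        row = int((self.y_max - y) // self.cell_size)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise ValueError("coordinate lies outside the raster extent")
        return self.values[row][col]


# Hypothetical sample data: one point feature and a 3 x 3 elevation grid.
well = VectorFeature("Point", [(2.0, 48.5)], {"name": "hypothetical well"})
elevation = RasterGrid(x_min=0.0, y_max=3.0, cell_size=1.0,
                       values=[[10.0, 11.0, 12.0],
                               [20.0, 21.0, 22.0],
                               [30.0, 31.0, 32.0]])
```

A vector feature carries explicit coordinates, whereas a raster only stores cell values and derives the location of each cell from the grid origin and the cell size.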
2.3.1.2 Formats
Formats vary and are designed either for data exchange or for data rendering and serving. Performance
also differs significantly between formats: serving large coverage datasets with good performance
requires some knowledge and tuning.
Raster data is well suited to data serving: it allows for multi-resolution extraction and
supports quick subset extraction at native resolution. Popular raster formats are GeoTIFF or
BLOB in RDBMS.
Vector data, on the other hand, is better suited to data processing: it allows visually smooth rendering
and easy implementation of overlay operations, displays data as vector graphics, and simplifies
combining vector layers from different sources. Moreover, vector data is simpler to update and
maintain, is more compatible with relational database environments and is usually smaller than raster
data. Popular vector formats are shapefile, MapInfo TAB and PostGIS.
2.3.1.3 Storage types
There are two common ways of storing geospatial data:
2.3.1.3.1 GIS files
Geographical information is encoded in a standardized manner into a file. There are many options for
GIS files. Among the popular ones, we can mention the following:
Shapefile is a very common format for the storage of vector data. It is developed and regulated
by Esri as a (mostly) open specification for data interoperability. It stores geometric
locations and associated attribute information, and can be read and written by a wide
variety of software. A shapefile consists of a
collection of files with a common filename prefix, stored in the same directory. Three
mandatory files contain binary data: the main file (.shp) that contains geometry data, the shape
index file (.shx) and the feature attribute data file (.dbf) stored in dBASE format.
MapInfo TAB is another popular format for the storage of vector data. It is developed and regulated
by MapInfo Corporation as a proprietary format. The minimum required files are the main file (.tab),
which holds in ASCII format the information about the type of data, the feature attribute data file
(.dat) stored in dBASE format, the map file (.map), which stores the graphic and geographic
information needed to display each vector feature on a map, and its associated index file (.id).
GeoTIFF is a public domain metadata standard which allows georeferencing information to be
embedded within a TIFF file. It results from an effort by over 160 different remote sensing, GIS,
cartographic, and surveying related companies and organizations to establish a TIFF based interchange
format for geo-referenced raster imagery. GeoTIFF has emerged as a standard image file format for
various GIS applications worldwide.
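Because a shapefile is a collection of files rather than a single file, a dataset pushed to the cloud can be incomplete if one component is left behind. The following sketch checks that the three mandatory shapefile components described above are present. The function name is hypothetical and the check is case-sensitive, which is a simplifying assumption.

```python
from pathlib import Path
from typing import List

# The three mandatory components of a shapefile, as listed above.
MANDATORY_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_shapefile_parts(shp_path: str) -> List[str]:
    """Return the mandatory extensions for which no sibling file exists.

    All components share the same filename prefix and directory, so each
    one is located by swapping the extension of the given path.
    """
    base = Path(shp_path)
    return [ext for ext in MANDATORY_EXTENSIONS
            if not base.with_suffix(ext).exists()]
```

An empty result means that the main file, the index file and the attribute file were all found next to each other.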
2.3.1.3.2 Spatial databases
Spatial databases are alternative data sources to GIS files and are essential if GIS applications (either
web applications or rich-client applications) perform transactions.
Spatial databases are designed to store and query geospatial data, using spatial indexes to speed up
database operations. In addition to typical SELECT statements, spatial databases provide a wide variety
of spatial operations such as set operations (e.g. union, difference), predicates (e.g. overlap
between two features) and functions (e.g. area, distance, perimeter, location of the center of a geometrical
shape).
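To make the spatial functions and predicates mentioned above concrete, here is a minimal pure-Python sketch of a few of them on planar coordinates. A spatial database evaluates such operations server-side with spatial indexes; the function names below are hypothetical and do not correspond to any database API.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def distance(a: Point, b: Point) -> float:
    """Euclidean distance between two points."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def _edges(ring: List[Point]):
    """Yield consecutive vertex pairs, closing the ring implicitly."""
    return zip(ring, ring[1:] + ring[:1])


def polygon_area(ring: List[Point]) -> float:
    """Area of a simple polygon (shoelace formula)."""
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in _edges(ring))) / 2.0


def polygon_perimeter(ring: List[Point]) -> float:
    """Sum of the edge lengths of the polygon."""
    return sum(distance(a, b) for a, b in _edges(ring))


def polygon_centroid(ring: List[Point]) -> Point:
    """Centroid of a simple, non-degenerate polygon."""
    signed_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in _edges(ring)) / 2.0
    cx = sum((x1 + x2) * (x1 * y2 - x2 * y1) for (x1, y1), (x2, y2) in _edges(ring))
    cy = sum((y1 + y2) * (x1 * y2 - x2 * y1) for (x1, y1), (x2, y2) in _edges(ring))
    return (cx / (6.0 * signed_area), cy / (6.0 * signed_area))


def bboxes_overlap(a: BBox, b: BBox) -> bool:
    """Predicate: do two bounding boxes share at least one point?"""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]
```

Bounding-box predicates of this kind are typically what a spatial index evaluates first, before the exact geometry test is applied to the remaining candidates.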
Although there are many options for spatial databases, PostgreSQL/PostGIS is usually recommended.
Alternatives are RDBMS like H2GIS, Oracle spatial, DB2, SQL Server with spatial extensions, Spatialite or
ArcSDE, but also non-relational databases like MongoDB.
The GIS objects supported by PostGIS are all the vector types defined in the "Simple Features for SQL
1.2.1" standard defined by the Open Geospatial Consortium (OGC), and the ISO "SQL/MM Part 3: Spatial"
document. In addition, PostGIS supports a raster type (no standards exist to follow), and a topology
model (following an early draft ISO standard for topology that has not been published as yet).
2.3.2 Geospatial datasets for CLARUS
2.3.2.1 Introduction
2.3.2.1.1 Critical/confidential datasets
In this section we introduce the environmental datasets that require a certain level of security / privacy.
Qualification of datasets regarding security and confidentiality is of high importance for the definition of
CLARUS application cases. However, this information is not directly available in the different projects we
studied (e.g. as metadata or in data documentation).
Qualification is deduced from interviews of significant stakeholders and analysis of key projects in the
domain of publication of geo-referenced data (e.g. EGDI-Scope, Minerals4EU, InGeoCloudS) [A4]. In
the current document, we deliberately consider several types of datasets in order to address the security
issues from a relatively broad perspective, while taking into account some specificities of each
dataset type.
These datasets are further described in the Demonstration Cases section (2.6), where we attempt to
categorize the data manipulated in terms of criticality / sensitivity (from the viewpoint
of data owners / service providers) and of the corresponding data types.
2.3.2.1.2 INSPIRE classification
The INSPIRE (Infrastructure for Spatial Information in Europe) Directive mandates all European Union
Member States to provide environmentally related datasets so that they can be easily accessed by
other public organisations within their own country, in surrounding European countries and by the
European Commission for Europe-wide policy making.
INSPIRE Regulations apply to data:
With a geographic reference (i.e. “geodata”)
Relating to an area where the Member states have or exercise jurisdictional rights
Held by a public authority, third party or others working on behalf of either
In electronic format
Relating to one of the 34 themes classified in the 3 INSPIRE annexes below:
1. Addresses, Geographical names, Administrative units, Hydrography, Cadastral parcels,
Protected sites, Coordinate reference systems, Transport networks, Geographical grid
systems.
2. Elevation, Geology, Land cover, Orthoimagery.
3. Agricultural and aquaculture facilities, Habitats and biotopes, Population distribution
and demography, Area management, Human health and safety, Production and
industrial facilities, Atmospheric conditions, Land use, Sea regions, Bio-geographical
regions, Meteorological geographical features, Soil, Buildings, Mineral Resources,
Species distribution, Energy Resources, Natural risk zones, Statistical units,
Environmental monitoring Facilities, Oceanographic geographical features, Utility and
governmental services.
2.3.2.2 Human Health and Safety
The INSPIRE Human Health and Safety (HH) theme describes the geographical distribution of dominance
of pathologies and the effect on health or well-being of humans linked to the quality of the environment.
It does not address personal data, whereas the E-Health Application Case (section 3) specifically addresses
the issues related to the protection of personal health data stored in the cloud.
Because the e-Health application case is therefore more relevant for demonstrating the CLARUS
solutions, we will not consider the INSPIRE Human Health and Safety theme in the field of geo
publication.
2.3.2.3 Geology
The INSPIRE Geology (GE) theme is split into the following sub-themes: Geology, Hydrogeology and
Geophysics.
2.3.2.3.1 Geology and boreholes from the oil industry
The particular field of Geology provides basic knowledge about the physical properties and composition
of geologic materials, their structure and their age as depicted in geological maps, as well as landforms
(geomorphological features). The model also covers boreholes - another important source of
information for interpreting the subsurface geology. [R1]
Boreholes falling under the Geology theme are categorized according to hydrocarbon production (i.e.
production of petroleum oil and/or gas). Borehole data from the oil industry has high commercial value
and security is a concern, as emerged from our interview with the Geological Survey of
Denmark and Greenland (GEUS).
2.3.2.3.2 Hydrogeology and groundwater boreholes
The particular field of Hydrogeology describes the flow, occurrence, and behaviour of water in the
subsurface environment. The two basic elements are the rock system (including aquifers) and the
groundwater system (including groundwater bodies). Man-made or natural hydrogeological
objects/features (such as groundwater wells and natural springs) are also included. [R1]
As emerged from the EGDI-Scope stakeholders’ survey, security is critical for boreholes, wells and
groundwater data (cf. Datasets section in the EGDI-Scope annex [A4]).
2.3.2.4 Environmental Monitoring Facilities
Location and operation of environmental monitoring facilities (EMF) includes observation and
measurement of emissions, of the state of environmental media and of other ecosystem parameters
(biodiversity, ecological conditions of vegetation, etc.) by or on behalf of public authorities. [R3]
2.3.2.4.1 EMF and groundwater boreholes
Groundwater boreholes can be modelled under either the Geology (GE) or the Environmental
Monitoring Facilities (EMF) INSPIRE themes. (see above for security aspects)
2.3.2.5 Mineral Resources
The Mineral resources data (MR) theme refers to the description of natural concentrations of very
diverse mineral resources of potential or proven economic interest. Mineral resources are used in
various domains [R2] :
Management of resources and exploitation activities: Providing information on inventoried
mineral resources.
Environmental impact assessments: mapping and measuring environmental geological
parameters at desk, in the field and in laboratory, for assessing geological material to be used
for construction and rehabilitation at the mine site.
Mineral exploration: the quantitative assessment of undiscovered mineral resources, the
modelling of mineral deposits, the mapping of lithological areas and units potentially hosting
mineral deposits, the use of by-products from natural stone quarrying as "secondary
aggregates" or as raw material for other industries.
Promotion of private sector investment: providing geodata and services for mining and
exploration companies.
2.3.2.5.1 Mineral resources and Rare Earth Elements
Rare Earth Elements (REE) are a group of critical raw materials.
As emerged from the EGDI-Scope and Minerals4EU projects, rare earths are critical to many
defence, energy and other high-tech products. China today controls about 95 % of the world’s REE
production, making the sustainable supply of these elements for European industry highly vulnerable
and therefore a concern for the national security of Member States. [A4]
2.3.2.6 Utility and Governmental Services
The INSPIRE Utility and Governmental Services (US) theme includes both utility networks (such as
electricity, oil & gas, sewage, telecommunication and water networks) and administrative/social
governmental services (such as public administrations, civil protection sites, schools and hospitals).
2.3.2.6.1 Utility networks
The scope of this sub-theme covers 6 distinct categories of network:
Water Network,
Sewer Network,
Electricity Network,
Oil, Gas & Chemicals Network,
Thermal Network,
Telecommunications.
The utility networks sub-theme overlaps other INSPIRE themes such as Hydrography, Production and
industrial facilities and Energy resources. [R4]
The INSPIRE Directive states that “in every particular case, the public interest served by disclosure shall
be weighed against the interest served by limiting or conditioning the access” [R5]. In some instances,
spatial datasets covered by the INSPIRE directive may therefore not be made available to the public, for
example when these datasets are linked to public security or when they belong to a third party that
does not give permission for re-use.
Utility networks are a major concern for public security. On the gas distribution networks of
the Member States alone, there are numerous incidents of damage every day, sometimes with very serious
consequences for the safety of both workers and residents and for the protection of the environment and the
economy. At the same time, utility network datasets have significant business value. Data on
pipelines are the property of their operators, often private companies, and they cannot be used without
permission.
These datasets are therefore particularly interesting in the case of emergency and geo-hazard risk
management. Utility data must be quickly available for disaster-response personnel, but confidentiality
of these data should be guaranteed to the organisations to which they belong. Otherwise these
organisations might be reluctant to share them.
2.4 Services
2.4.1 Introduction
The INSPIRE directive also applies to spatial data services through which it is possible to access or use
the data described in the previous section.
These services (called geodata services, or network services), are operations that can be executed on
the Web thanks to software applications managing geospatial data and/or geospatial metadata. These
services are:
Discovery,
Download,
View,
Transformation.
INSPIRE makes available a number of technical guidance documents intended to assist government
and public bodies in making their information available using standards such as OGC services. The
recommended standards include:
OGC Catalog Service for the Web (CSW) for discovery,
OGC Web Map Service (WMS) for viewing,
OGC Web Feature Service (WFS) for direct access download,
ATOM or WFS for pre-defined dataset download.
Figure 1 - INSPIRE network services ([R8])
These INSPIRE architecture services form the building blocks of some of the services offered by a
Geodata Cloud Service Provider. The combination of such services forms a SaaS application. Moreover a
Geodata Cloud Service Provider can provide the solution as PaaS, allowing consumers to deploy their
own application. An example is given by the FP7/CIP InGeoCloudS project that provides an open-source,
cloud-based platform (PaaS) with Geodata services (SaaS). [A4]
These Geodata cloud services can be divided into:
Publication services,
Access services,
Computation services,
Recovery services.
We describe these services in general terms in the following sections. The demonstrators description
(Section 2.6) shows in more detail the data workflows that we will have to consider in CLARUS.
2.4.2 Publication Services
The data publication use case gathers the services related to the publication of geospatial datasets [R8]. These
services are used to:
Install custom applications in the cloud,
Upload datasets into the cloud,
Publish the datasets through OGC/INSPIRE services,
Manage metadata.
2.4.2.1 Custom Application
The application providers have the capability to install their custom application on the cloud.
The cloud provider delivers a computing platform (PaaS), including operating system, programming
language execution environment, database, and web server.
Two kinds of applications can be installed by application providers on the cloud:
A Web application dedicated to the domain of publication of geo-referenced data addressed by
the application provider.
An application utility dedicated to the synchronization of the dataset stored on the cloud with
the dataset stored on premises.
2.4.2.2 Data Import / Synchronization
The data providers have complete knowledge of their datasets and select the data to push into
the cloud. Each data provider has a dedicated and secured storage space on the file system and on the
database server. Only the owner of the data can access it.
The data providers also fully master the procedures used to manage their data on the cloud,
controlling how and when data are pushed, updated or deleted. The cloud solution provides basic
features (SaaS) to help data providers:
Services to manually manage or synchronize their datasets from own premises (FTP, FTPS, SCP,
SFTP),
Services to register synchronization tasks running on the cloud.
2.4.2.3 Service Provisioning
This use case defines how data providers publish the datasets through interoperable OGC/INSPIRE
compliant services (SaaS). They create or edit layers, configure Web Map Service (WMS) for raster data
and/or Web Feature Service (WFS) for vector data.
The data providers define access rights to their published data for critical, confidential or marketable
data.
2.4.2.4 Metadata Management
Metadata allow data providers to describe their data, maps and services. Metadata are published and
used by data consumers who are looking for particular datasets and services.
These metadata are available and managed through a so-called Catalogue compliant with interoperable
OGC/INSPIRE CSW service.
Standard tools for managing and exposing discovery services (SaaS) include Geonetwork, Geosource,
Deegree, etc.
2.4.3 Access Services
2.4.3.1 Search
Discovery Services allow GIS software to search for spatial datasets and spatial data services on the
basis of the content of the corresponding metadata, and to display the metadata content.
These operations are performed through HTTP(S), following the CSW or OpenSearch standards.
2.4.3.2 View
GIS client applications invoke interoperable OGC/INSPIRE compliant services (SaaS) to display, navigate,
zoom in/out, pan, or overlay spatial datasets and display legend information and any relevant content of
metadata. They retrieve layers (raster data) and legend through Web Map Service (WMS, WMS-C,
WMTS) and/or features (vector data) through Web Feature Service (WFS).
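As an illustration of how a GIS client retrieves a map image, the sketch below builds a WMS 1.3.0 GetMap request URL from the standard key-value parameters. The endpoint, layer name and helper function are hypothetical.

```python
from typing import Sequence
from urllib.parse import urlencode


def wms_getmap_url(base_url: str, layers: Sequence[str],
                   bbox: Sequence[float], width: int, height: int,
                   crs: str = "EPSG:4326",
                   image_format: str = "image/png") -> str:
    """Build a WMS 1.3.0 GetMap URL.

    Note: with WMS 1.3.0 and EPSG:4326 the BBOX axis order is
    latitude first (min_lat, min_lon, max_lat, max_lon).
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": ",".join(layers),
        "STYLES": "",                 # empty value selects the default style
        "CRS": crs,
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": str(width),
        "HEIGHT": str(height),
        "FORMAT": image_format,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = wms_getmap_url("https://example.org/geoserver/wms",
                     ["geology:boreholes"], (41.0, 1.0, 42.0, 3.0), 800, 400)
```

Zooming or panning in the client amounts to issuing the same request with a different BBOX, WIDTH or HEIGHT.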
2.4.3.3 Download
Download Services (SaaS) enable copies of complete spatial datasets, or of parts of such sets, to be
downloaded. Download Services could be of two types:
Pre-defined dataset download service(s): A pre-defined dataset download service provides for
the simple download of predefined datasets (or pre-defined parts of a dataset) with no ability to
query datasets or select user-defined subsets of datasets. A pre-defined dataset or a pre-defined
part of a dataset could be (for example) a file stored in a dataset repository, which can
be downloaded as a complete unit with no possibility to change its content, encoding,
the CRS of the coordinates, etc. Pre-defined datasets are usually downloaded through INSPIRE
compliant services (SaaS) such as WFS or ATOM (optionally extended with GeoRSS or
OpenSearch).
Direct access download service(s): A direct access download service extends the functionality of
a pre-defined dataset download service to include the ability to query and download subsets of
datasets. The direct access download service allows more control over the download than the
simple download of a pre-defined dataset or pre-defined part of a dataset. In this case, the
spatial information is typically stored in a repository (e.g. a database) and only accessible
through a middleware data management system (although the precise implementation may
vary). The query can be based upon spatial or temporal criteria, or by specific properties of the
instances of the spatial object types contained in the repository. Direct access download is
usually done through OGC/INSPIRE compliant WFS service (SaaS).
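The difference between the two download types can be sketched as follows: a pre-defined download is a fixed URL, whereas a direct access download lets the client add query criteria. The example assumes the key-value encoding of WFS 2.0.0 (TYPENAMES, BBOX, COUNT); the endpoint and feature type name are hypothetical.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode


def wfs_getfeature_url(base_url: str, type_name: str,
                       bbox: Optional[Sequence[float]] = None,
                       count: Optional[int] = None) -> str:
    """Build a WFS 2.0.0 GetFeature URL, optionally restricted to a subset."""
    params = {
        "SERVICE": "WFS",
        "VERSION": "2.0.0",
        "REQUEST": "GetFeature",
        "TYPENAMES": type_name,
    }
    if bbox is not None:
        # Spatial criterion: only features intersecting this bounding box.
        params["BBOX"] = ",".join(str(v) for v in bbox)
    if count is not None:
        # Upper bound on the number of features returned.
        params["COUNT"] = str(count)
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and feature type, for illustration only.
whole_dataset = wfs_getfeature_url("https://example.org/wfs", "hydro:wells")
subset = wfs_getfeature_url("https://example.org/wfs", "hydro:wells",
                            bbox=(1.0, 41.0, 3.0, 42.0), count=100)
```

The first URL always returns the same complete dataset; the second one returns only the features selected by the client's criteria.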
2.4.4 Computation Services
2.4.4.1 Transformation Services
Transformation services (SaaS) enable spatial datasets to be transformed with a view to achieving
interoperability. Examples of transformations are:
Data format transformation. It is used, for example, to transform dataset in a data provider’s
proprietary format to a dataset in a standard format such as GML.
Coordinates reference system transformation. Examples of coordinates systems frequently used
in geospatial dataset are the “Universal Transverse Mercator” coordinate system, the “British
national grid” reference system or the “United States National Grid”.
Data and/or Application providers provision, configure and publish transformation services in order to
allow data consumers to retrieve their geospatial datasets in a format that is compliant to a specific
standard or their own application needs.
Application providers:
o Deploy transformation services implementation to the cloud,
o Publish transformation services and related metadata.
Data providers:
o Manage transformation configurations (source and target data schema, mapping
definition) and related metadata
o Publish transformation configurations and related metadata
The implementation of transformation services relies on geospatial domain standards (RIF, GML) and/or
on other standards (XSD, XSLT, OWL (RDF), SQL). Commercial solutions can also be used (e.g. Talend).
Data consumers invoke transformation service in order to retrieve datasets provided by data providers
in a format that conforms to their specific needs.
Data consumers:
Gather information about available transformation services
Invoke a dataset transformation service providing
o Source dataset and schema (by reference or by value)
o Target dataset destination and schema
Retrieve transformed dataset
Applicable standards: WSDL, WADL, WS-Addressing.
2.4.4.2 GeoProcessing / Invocation Services
Geospatial data often need to be processed before the information can be used effectively. The
geographic calculations run on the cloud and are provided as a service (SaaS). Web Processing Service
(WPS) is an OGC standard which provides rules for standardizing the implementation of geographic
calculations ("processes") as a Web Service.
The WPS standard defines three types of operations:
GetCapabilities: describes the contents and properties of the service; lists the available processes; lists
the supported operations and methods,
DescribeProcess: provides the description of a process (description of inputs and outputs, i.e.
requests and responses),
Execute: invokes the process/calculation with the supplied input parameters and returns the
corresponding result.
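The three operations can be sketched as HTTP GET requests, assuming the key-value-pair encoding of WPS 1.0.0. The endpoint, the process identifier and the input names below are hypothetical; a real service advertises its own in GetCapabilities and DescribeProcess.

```python
from typing import Dict
from urllib.parse import urlencode


def wps_urls(base_url: str, process_id: str, inputs: Dict[str, str]) -> Dict[str, str]:
    """Return one URL per WPS operation for the given process."""
    common = {"service": "WPS", "version": "1.0.0"}
    # Execute inputs are passed as "name=value" pairs separated by ";".
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    queries = {
        "GetCapabilities": {"service": "WPS", "request": "GetCapabilities"},
        "DescribeProcess": {**common, "request": "DescribeProcess",
                            "identifier": process_id},
        "Execute": {**common, "request": "Execute",
                    "identifier": process_id, "DataInputs": data_inputs},
    }
    return {op: base_url + "?" + urlencode(q) for op, q in queries.items()}


# Hypothetical process and inputs, for illustration only.
urls = wps_urls("https://example.org/wps", "interpolation",
                {"dataset": "boreholes", "cell_size": "100"})
```

A client would typically call the operations in this order: discover the processes, read the description of the chosen one, then execute it.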
2.4.4.2.1 Kriging Example
Kriging (named after D. Krige) is an example of a widely used geographic calculation that could be
implemented via WPS. More precisely, Kriging is a spatial interpolation method, “which relies on the fact
that as distance between points increases, their similarity, defined by the covariance or correlation
between points, decreases.” [R10]
2.4.4.2.1.1 Principles
Let us consider a Regionalized Variable (Z), e.g. the concentration of some mineral in a geographical
zone, the temperature in a region, etc. Given the value of Z at a set of sample points (𝑆1 … 𝑆𝑛), a spatial
interpolation method aims at predicting its value at any other point (𝑆0) of the region. Broadly speaking,
the principle is to weight the sample points:
\[ Z(S_0) = \sum_{i=1}^{n} w_i \, Z(S_i) \]
Figure 2 - Kriging interpolation formula
When using interpolation in geography, we assume that locations that are close to each other tend to
be more similar than locations that are far apart. The weight (𝑤𝑖) given to values measured at distant
points will therefore be less than the weight given to values measured at points near the prediction
location.
For instance we can consider the weight to be directly a function of distance (or inverse distance)
between the prediction location and the sample points. This deterministic interpolation is called Inverse
Distance Weighting (IDW).
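A minimal sketch of IDW follows, with weights proportional to the inverse of the distance raised to a power. The power of 2 is a common default and an assumption here, as are the function name and the sample layout (x, y, value).

```python
import math
from typing import Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, measured value)


def idw_predict(samples: Sequence[Sample], target: Tuple[float, float],
                power: float = 2.0) -> float:
    """Predict the value at `target` as a distance-weighted average."""
    weighted_sum = 0.0
    weight_total = 0.0
    for x, y, z in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # The target coincides with a sample point: return its value.
            return z
        w = 1.0 / d ** power       # closer points receive larger weights
        weighted_sum += w * z
        weight_total += w
    # Dividing by the total normalises the weights so that they sum to 1.
    return weighted_sum / weight_total
```

With two samples of values 10 and 20 located at x = 0 and x = 2, the prediction at x = 1 is the plain average (15), and it moves towards 10 as the target approaches the first sample.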
Unlike IDW, Kriging also takes into account the statistical relationships of the measured points between
themselves (i.e. autocorrelation) – using statistical methods. Kriging weights are based not only on the
distance between the measured points and the prediction location, but also on the overall spatial
arrangement of the measured points. This is particularly useful when measurements are spatially
correlated or when they show a directional bias (e.g. N-S vs. W-E), as it is often the case in geology and
soil science.
Kriging is a multi-step process:
The first step (“Variography”) is to quantify and depict the spatial autocorrelation of the measured sample points, i.e. to express the degree of relationship between them. To this end, Kriging uses the semi-variance, which is simply half the variance of the differences between the values at all pairs of points spaced a constant distance apart.
\[ \gamma(S_i, S_j) = \tfrac{1}{2} \, \mathrm{var}\big( Z(S_i) - Z(S_j) \big) \]
Figure 3 - Kriging semi-variance formula
The relationship between semi-variance and distance can be shown in a graph called a semi-variogram.
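In practice, the experimental semi-variogram is often estimated by grouping pairs of sample points into distance bins (lags) and averaging half the squared differences in each bin. The sketch below follows this common estimator; the binning scheme and the function name are assumptions, not a procedure prescribed by the references.

```python
import math
from collections import defaultdict
from typing import Dict, Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, measured value)


def empirical_semivariogram(samples: Sequence[Sample],
                            lag_width: float) -> Dict[float, float]:
    """Map each lag-bin centre to the semi-variance of the pairs in that bin."""
    squared_diffs = defaultdict(float)
    pair_counts = defaultdict(int)
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            xi, yi, zi = samples[i]
            xj, yj, zj = samples[j]
            h = math.hypot(xi - xj, yi - yj)       # separation distance
            bin_index = int(h // lag_width)        # which lag bin the pair falls in
            squared_diffs[bin_index] += (zi - zj) ** 2
            pair_counts[bin_index] += 1
    # Semi-variance = half the mean squared difference of the pairs in the bin.
    return {(k + 0.5) * lag_width: squared_diffs[k] / (2.0 * pair_counts[k])
            for k in sorted(pair_counts)}
```

Plotting the returned semi-variances against the bin centres gives the experimental semi-variogram.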
Once the squared difference between the values for all pairs of locations is plotted on the experimental variogram, the next step (“Spatial Autocorrelation Modeling”) is to fit a model through them. Indeed the experimental variogram, being irregular, cannot be used directly for calculating the Kriging weights. Instead a smooth mathematical function (model) must be used, e.g. Spherical, Exponential, Circular, Gaussian, Linear, etc. Even though they are different, these variogram models share certain characteristics (namely the range, the sill, and the nugget).
Figure 4 - Illustration of semivariogram components [R11]
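As an example of such a model, the spherical variogram can be written in terms of the three components named above. This is a sketch under the usual convention that the sill is the total plateau value (nugget included); parameter names are ours.

```python
def spherical_variogram(h: float, nugget: float, sill: float,
                        range_: float) -> float:
    """Spherical variogram model.

    nugget: semi-variance just above zero distance (discontinuity at the origin)
    sill:   plateau reached by the model (here including the nugget)
    range_: distance at which the plateau is reached
    """
    if h <= 0.0:
        return 0.0                      # a point is perfectly correlated with itself
    if h >= range_:
        return sill                     # beyond the range, no more autocorrelation
    ratio = h / range_
    return nugget + (sill - nugget) * (1.5 * ratio - 0.5 * ratio ** 3)
```

Fitting the model consists of choosing the nugget, sill and range so that the curve follows the experimental points as closely as possible.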
The last step (“Prediction”) is to predict the unknown values, either at a specific location (single point) or for a continuous surface (map grid).
In the case of a prediction for a specific location, the semi-variance values between the prediction point
(𝑆0) and the surrounding sample measurements (𝑆1 … 𝑆𝑘) (sample subset within a given search radius)
are computed using the variogram model; the prediction value is then calculated from a series of linear
equations (cf. Kriging document [R10]).
Actually Kriging uses the same data twice: the first time to estimate the spatial autocorrelation of the
dataset (variography + spatial modeling) and the second time to make the prediction.
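The prediction step can be sketched for a single location with ordinary Kriging, one common variant (the text above does not state which variant is meant, so this choice is an assumption). The weights solve the linear system formed by the modelled semi-variances between samples, plus a constraint forcing the weights to sum to 1. The exponential model and its parameters are placeholders for a model fitted during the previous step.

```python
import math
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[float, float, float]  # (x, y, measured value)


def exponential_variogram(h: float, sill: float = 1.0, range_: float = 10.0) -> float:
    """Placeholder variogram model (hypothetical parameters, zero nugget)."""
    return sill * (1.0 - math.exp(-h / range_))


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system (duplicate sample points?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                     # back substitution
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def ordinary_kriging(samples: Sequence[Sample], target: Tuple[float, float],
                     variogram: Callable[[float], float]) -> Tuple[float, List[float]]:
    """Return the predicted value at `target` and the Kriging weights."""
    n = len(samples)

    def dist(p, q) -> float:
        return math.hypot(p[0] - q[0], p[1] - q[1])

    # Semi-variances between samples, bordered by the sum-to-one constraint.
    a = [[variogram(dist(samples[i], samples[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    # Semi-variances between each sample and the prediction location.
    b = [variogram(dist(s, target)) for s in samples] + [1.0]
    solution = solve_linear_system(a, b)
    weights = solution[:n]                  # last unknown is the Lagrange multiplier
    prediction = sum(w * s[2] for w, s in zip(weights, samples))
    return prediction, weights
```

Unlike IDW, the weights here depend on the spatial arrangement of the samples relative to each other, not only on their distance to the prediction location. Producing a map grid repeats the same computation for every cell centre.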
2.4.4.2.1.2 Implementation
There are different technical solutions to implement Kriging. An example of a WPS Kriging
Execute operation will be described in order to demonstrate the process. Please refer to section
2.6.2.2 for more details.
2.4.5 Exploitation & Operation Services
Data providers can only have confidence in a geo publication solution running on the cloud if the integrity of
their geospatial data and the availability of the provided geo-services are preserved.
Data recovery, monitoring and user management must be addressed in order to achieve an acceptable
level of quality.
2.4.5.1 Back-Up and Data Recovery
Minimizing data loss and improving performance are opposing goals. CSPs usually provide different
services (IaaS) to target those goals:
object storage services (e.g. S3 on Amazon, Swift on OpenStack) are designed to reduce the risk
of data loss,
disk storage services (e.g. Instance Store or EBS on Amazon, Ephemeral Disk or Cinder on
OpenStack) focus on performances (high I/O).
The geo publication solution must rely on both kinds of services to achieve an acceptable performance
level while avoiding data loss.
Disk storage services (IaaS) improve performance, so that client requests are served with a
good response time. However, even if CSPs offer solutions to minimize the loss of data stored on disk (e.g.
Amazon EBS volumes are tied to one data center and are automatically replicated within that data
center), they do not provide turnkey solutions to ensure data recovery (e.g. EBS volumes tied to a data
center are lost if the whole data center fails).
The geo publication solution must define backup and data recovery procedures that rely on the object
storage service (IaaS) provided by the CSP (or on another solution independent of the CSP) [R19].
Object storage services (IaaS) are suitable to store backup of data because they replicate data to
minimize data loss. Some CSPs go further with replication on multiple data centers (e.g. Amazon S3
replicates data on multiple data centers in the same region). An alternative is to rely on object storage
services provided by other CSPs.
The geo publication solution must at least back up all data providers’ data and optionally provide a
service allowing data providers to explicitly back up and restore their datasets.
2.4.5.2 Monitoring
CSPs (IaaS) do not ensure high availability of the services provided by applications running on the cloud.
A service may fail for many reasons: application bug (e.g. disk full), cyber attack, CSP hardware failure,
CSP maintenance task, etc.
CSPs usually provide an SLA with redundant IaaS services running in multiple data centers to achieve high
availability. However, customers do not automatically benefit from this strategy for their applications. The
architecture of the geo publication solution must be adapted, following the best practice guides
provided by the CSP.
If the geo publication solution does not achieve high availability, monitoring services should make it possible
to react as soon as possible and restart the faulty service. The IT team must rely on the monitoring and
support services provided by the CSP [R21]:
Health dashboard publishes information on availability of the CSP’s services (IaaS),
Monitoring service for cloud resources provisioned by the geo publication solution running on
the cloud (SaaS),
Support service must notify the IT team (e.g. by mail) of failure events and scheduled
maintenance tasks that will imply stopping the service,
Support service must also help the IT team, according to the subscribed level of support.
In order to achieve monitoring of the services provided by the geo publication solution, the IT team
must use a dedicated application that relies on specific metrics [R20].
Moreover, when the geo publication solution acts as a PaaS or SaaS platform managing the
geospatial data of multiple data providers, sharing the costs of the cloud resources provisioned on
the underlying IaaS platform is difficult. Costs should be shared according to each data provider's
usage of the geo publication solution: storage (e.g. volume of data), access services (e.g. number of
client requests), computation services (e.g. custom geo processing) [R19].
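As an illustration of usage-based cost sharing, the following minimal sketch splits each cost category in proportion to each data provider's measured usage. The metric names, the three cost categories and the proportional rule are assumptions made for this example; the actual allocation policy is not specified here.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    """Usage of one data provider over a billing period (hypothetical metrics)."""
    storage_gb: float      # volume of data stored
    requests: int          # number of client requests served
    compute_hours: float   # time spent on custom geo processing


def share_costs(usage, costs):
    """Split each cost category proportionally to each provider's usage.

    usage: dict mapping provider name -> Usage
    costs: dict with keys 'storage', 'access', 'compute' (total bill per category)
    Returns a dict mapping provider name -> amount owed.
    """
    totals = {
        "storage": sum(u.storage_gb for u in usage.values()),
        "access": sum(u.requests for u in usage.values()),
        "compute": sum(u.compute_hours for u in usage.values()),
    }
    shares = {}
    for name, u in usage.items():
        metrics = {
            "storage": u.storage_gb,
            "access": u.requests,
            "compute": u.compute_hours,
        }
        owed = 0.0
        for category, total in totals.items():
            if total > 0:  # a category nobody used is charged to nobody
                owed += costs[category] * metrics[category] / total
        shares[name] = owed
    return shares
```

With this rule, the amounts owed add up to the total bill as long as every category has been used by at least one provider.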
2.4.5.3 User Authentication and Authorization
Data providers expect the geo publication solution to authenticate users and authorize access to data
and services according to their needs in terms of security. The geo publication solution must therefore
provide an authentication mechanism.
Although the geo publication solution usually integrates multiple software components, the end user
should have to authenticate only once and shall then be able to access all the services and applications (according
to their rights). The geo publication solution must therefore provide an authorization mechanism that
implements single sign-on (SSO). Applicable standards are OAuth and SAML.
Data providers may need to audit access to their data or may need to restrict access to their data to
authorized people. The geo publication solution must therefore control and log access to the services
that operate on the data providers’ data.
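The requirement to both control and log access can be sketched as follows. This is only an illustration: the access-control list, the logger name and the log fields are hypothetical, and a real deployment would take the user identity from the SSO mechanism.

```python
import logging
from datetime import datetime, timezone

# Hypothetical logger name; handlers are configured by the hosting application.
audit_log = logging.getLogger("geopub.audit")

# Hypothetical access-control list: dataset name -> set of authorized users.
ACL = {"boreholes": {"alice"}}


def access_dataset(user, dataset):
    """Return True if access is granted; every attempt is logged for auditing."""
    granted = user in ACL.get(dataset, set())
    audit_log.info(
        "time=%s user=%s dataset=%s granted=%s",
        datetime.now(timezone.utc).isoformat(), user, dataset, granted,
    )
    return granted
```

Logging refused attempts as well as granted ones is a design choice of this sketch; it lets a data provider audit both legitimate use and attempted misuse.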
Moreover, a data provider that needs to audit access or to restrict access to some features of its
application has to integrate its application with the authentication mechanism (SSO) provided by the
geo publication solution.
2.4.5.4 User Management
User authentication and authorization require providing the IT team with a service to manage users and
their rights [R21].
In some cases, the geo publication solution may also provide restricted access to user management to
allow data providers to manage users’ permissions on their data and on their application.
2.5 Security Expectations
When considering the use of a geospatial data infrastructure, it is essential that a certain level of
confidence is guaranteed to both data providers and users. This level of confidence relies mainly on
security measures.
As a prerequisite, we should therefore assess which security measures are required. The aim of this
section is to sum up security requirements coming from various projects representative of the
“publication of geo-referenced data on the internet” domain.
These requirements cover three main aspects:
The geospatial data,
The geospatial services,
The protection of personal data.
We should note that confidence also relies on transparency. Transparency means that the user finds it
easy to answer questions such as "- where are my data?"; "- what do the services do?"; "- how is the
privacy of my personal data guaranteed?"
2.5.1 Expectations regarding geospatial data
The users need to have confidence in the environmental datasets they can access. This confidence
derives to a great extent from guarantees given about the security of the data:
To the user – the data are not altered (data integrity, authenticity),
To the data provider – the data are protected (access control).
2.5.1.1 Preserving accessibility and protecting data against alteration
In the case of public data, the security requirements are limited because there is no need to restrict data
access for some people. We should nevertheless be careful about threats to data accessibility which can
seriously damage the business and affect the image of the organization and/or institution.
Protecting data accessibility implies being able to ensure the authenticity and integrity of the data (e.g.
give guarantees about data integrity using hash codes). Indeed the data that are not protected can
easily be altered in an undetectable way. This protection is applicable to all data, including public data.
Possible solutions are:
Implementing Data Origin Authentication,
Ensuring traceability: it is not enough to know that data has been changed, it is necessary to
know whether it has been modified by an authorized person, through authentication and authorization
mechanisms.
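The hash-code guarantee mentioned above can be sketched in a few lines. The choice of SHA-256 and the function names are assumptions made for this example.

```python
import hashlib
import hmac


def digest(data: bytes) -> str:
    """SHA-256 hex digest that a provider could publish alongside a dataset."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(data: bytes, published_digest: str) -> bool:
    """Check downloaded bytes against the digest published by the provider."""
    return hmac.compare_digest(digest(data), published_digest)
```

A plain digest only detects alteration if the digest itself is obtained through a trusted channel. Data origin authentication additionally requires a keyed mechanism such as a digital signature or a message authentication code.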
2.5.1.2 Publish access limitations
In order to improve confidence in their data, data providers are advised to define and publish information on
the legal aspects and on restrictions on the use of data.
2.5.1.2.1 Using metadata to define access limitations
Geospatial metadata allow the users to retrieve the datasets or services that best fit their needs. They
may contain legal information and information on the restrictions on the use of data.
According to the INSPIRE Metadata implementing rules, ISO 19115 provides a general mechanism for
documenting different categories of constraints applicable to the resource or its metadata. There are
two major requirements expressed in the Directive in terms of documentation of the constraints as part
of the metadata:
The limitations on public access: the Member States may limit public access to spatial datasets
and spatial data services in a set of cases defined in Article 13. These cases include public
security or national defense, i.e. more generally the existence of a security constraint.
The conditions applying to access and use of the resource, and where applicable, the
corresponding fees (Articles 5-2(b) and 11-2(f)).
2.5.1.2.1.1 Limitations on public access (AccessConstraint)
The contents of this property must be an XML fragment corresponding to the AccessConstraints element
as defined in the OGC WMS 1.1.1 DTD.
Metadata restriction code list (source: ISO 19115)
MD_RestrictionCode | Limitation(s) placed upon the access or use of the data
copyright | exclusive right to the publication, production, or sale of the rights to a literary, dramatic, musical, or artistic work, or to the use of a commercial print or label, granted by law for a specified period of time to an author, composer, artist, distributor
patent | government has granted exclusive right to make, sell, use or license an invention or discovery
patentPending | produced or sold information awaiting a patent
trademark | a name, symbol, or other device identifying a product, officially registered and legally restricted to the use of the owner or manufacturer
license | formal permission to do something
intellectualPropertyRights | rights to financial benefit from and control of distribution of non-tangible property that is a result of creativity
restricted | withheld from general circulation or disclosure
otherRestrictions | limitation not listed
2.5.1.2.1.2 Conditions applying to access and use (useLimitation)
This free text metadata is used to describe:
Terms and conditions, including where applicable, the corresponding fees that shall be provided
through this element,
A link (URL) where these terms and conditions are described.
To our knowledge, there is no widespread technical solution that enforces access limitations declared in
metadata; ad hoc implementations are used.
2.5.1.3 Protecting data confidentiality
This security measure aims at ensuring that the content of the information remains secret, except for
authorized people. In order to ensure the confidentiality of a dataset, we should define who has the
right to access the corresponding data, divide users into different groups or classes, and assign a role to
these groups.
A particular type of data whose confidentiality should always be protected is personal data. As
geological data cannot be qualified as personal data, the question is: are there geological data for which
confidentiality must be ensured? The answer is yes at least for the personal data relating to
geological data, that is, who has read or downloaded a particular dataset and who has used a particular
service.
There are several ways to ensure confidentiality, among which:
Access control,
Cryptography.
2.5.1.4 Ensuring data quality and assessing authenticity of sources
Trust in the geospatial data also depends on guarantees about the quality (i.e. how the user can ensure
that the data is reliable, good quality and corresponds to the expected use), which can be derived from
metadata. However it can be difficult for non-expert users to assess the quality of geospatial data
directly from metadata, as metadata contain little or no information on the expected use of data.
On the other hand, quality of geospatial data can be assessed through the use of authoritative data and
the ability to check the authenticity of sources. Quality is assumed when for instance the data come
from government agencies responsible for collecting data (e.g. national geological surveys). The concept
of authentic sources is related to one of the basic principles of INSPIRE: collect the data once at the most
suitable place, and re-use the data multiple times [R12].
2.5.2 Expectations regarding geospatial services
In order to set up a security policy for the publication of geo-referenced data, we must pay attention to
services – and more precisely to service continuity and to access management.
The security policy developed for the data needs to be extended to the services. It is important to
ensure continuity and to protect services against (distributed) Denial of Service attacks, power failures
and other external incidents.
2.5.2.1 Protecting services against unauthorized access
Data providers need to control the dissemination of their data in the geospatial value chain.
Access management is essential: it helps ensure that only authorized individuals have access to the
service and can use the service for the purposes identified as part of their role.
2.5.2.1.1 Implementing GeoRM services to manage rights
The INSPIRE directive specifies payment regulations for accessing spatial services provided by public
authorities (article 14) and mentions the use of services for electronic commerce, licensing, and other
mechanisms to ensure the preservation of rights and security of transactions.
Ideally, implementations of geospatial data infrastructure should follow open standards defined by the
OGC, ISO and INSPIRE. These standards include the OGC specification of GeoDRM architecture (digital
rights management), for geographic information. This reference model aims at simplifying the
management and protection of intellectual property on geospatial data.
Examples of features offered by GeoRM services are:
Authorization,
Authentication,
Pricing,
Billing,
Licensing access to the data (limited in time, spatial extent, specific position, particular user,
etc.).
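To make the licensing feature more concrete, the following minimal sketch checks a request against a licence limited in time, spatial extent and user. The `Licence` record and the function name are hypothetical and are not taken from the GeoDRM specification.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Licence:
    """Hypothetical licence record; not the GeoDRM data model."""
    user: str
    valid_from: date
    valid_until: date
    bbox: tuple  # (min_x, min_y, max_x, max_y) of the licensed extent


def is_request_licensed(licence, user, day, request_bbox):
    """True if the user, the date and the requested extent all fall within the licence."""
    if user != licence.user:
        return False
    if not (licence.valid_from <= day <= licence.valid_until):
        return False
    lx0, ly0, lx1, ly1 = licence.bbox
    rx0, ry0, rx1, ry1 = request_bbox
    # The requested extent must lie entirely inside the licensed extent.
    return lx0 <= rx0 and ly0 <= ry0 and rx1 <= lx1 and ry1 <= ly1
```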
The model allows licensing to be adapted to different types of relationships between participants (direct
licensing, indirect licensing, B2B, B2C, licensing for WMS and WFS, with modules for RM, REL,
encryption, license verification, authentication, authorization, etc.).
Authentication standards and rights management are not in the CLARUS scope; however, the GeoRM
specification is important to consider in parallel when securing geospatial services.
2.5.2.2 Publish access limitations
The access constraints for the data can be extended to the geospatial services, e.g. WMS, WFS or WPS
services.
2.5.2.2.1 Using service metadata to define access limitations
Geospatial metadata allow the users to retrieve the services that best fit their needs. They may contain
legal information and information on the restrictions on the use of a service, even though authorization
mechanisms based on such metadata are rarely implemented [R13].
For instance the wms-service-AccessConstraints element specifies any access constraints associated with
the WMS service. This information applies to the whole WMS instance, that is to say: it is not particular
to any layer or piece of data published by the WMS instance.
Metadata-based access limitations should be implemented by the application itself.
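As a sketch of how an application could read the service-level constraint before deciding what to do with it, the example below extracts the AccessConstraints element from a WMS 1.1.1 capabilities document. The sample document is made up and heavily shortened; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, made-up WMS 1.1.1 capabilities document used for illustration only.
SAMPLE_CAPABILITIES = """<WMT_MS_Capabilities version="1.1.1">
  <Service>
    <Name>OGC:WMS</Name>
    <Title>Example groundwater WMS</Title>
    <Fees>none</Fees>
    <AccessConstraints>restricted</AccessConstraints>
  </Service>
</WMT_MS_Capabilities>"""


def service_access_constraints(capabilities_xml):
    """Return the service-level AccessConstraints text, or None if absent."""
    root = ET.fromstring(capabilities_xml)
    return root.findtext("Service/AccessConstraints")


def is_declared_open(capabilities_xml):
    """True only if the service explicitly declares no access constraints.

    WMS reserves the case-insensitive keyword "none" for that purpose.
    """
    value = service_access_constraints(capabilities_xml)
    return value is not None and value.strip().lower() == "none"
```

Reading the declaration does not enforce anything by itself: the application still has to map the declared value to its own authorization rules.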
2.5.2.3 Additional considerations
Service Level Agreement: the users depend on certain services in order to retrieve their data
(cf. INSPIRE) and expect a certain level of service (SLA) from the Cloud Service Provider.
Ease of understanding: Trust in the geospatial services relies on ease of understanding. Ease of
understanding means that security should not complicate the user experience. Security and
user-friendliness should be properly balanced, i.e. a security policy should not deter potential
users from using the service.
Using secure communication protocols: In this use case, Cloud Service Providers provide
secured communication protocols to manage metadata.
2.5.3 Expectations regarding personal data
Restricting access to a given set of data implies collecting and processing personal data, more precisely
information on the persons authorized to access this data or service, on the moment when they access this
data or service, on the volume they used, etc.
Data consumers: It is important for data consumers to know where the data that they want to
access come from, whether they can access it or not, and why. Furthermore, they want to
ensure that information about their identity and their use of the data will not be misused by the
data provider.
Data providers: It may be important for data providers to know who is using their data/services
and how they are used.
2.5.3.1 Ensuring identity data protection
Ensuring personal data protection may imply managing identity (through registration, identification,
authentication, and the administration of rights and privileges).
Managing identity could also lead to identity federation matters, i.e.:
Outsourcing identification and authentication processes to third parties,
Making life easier for users through a Single Sign-On for multiple services rather than using
many user IDs and passwords (cf. use case below).
In addition, any solution aiming at ensuring personal data protection should be harmonized with the EU’s
Data Protection Directive. Such legislation may change over time, so that data which are considered public
may become private.
2.5.3.2 Ensuring access auditing
As stated before, knowing who is using their data/services and how they are used is an important
concern for data providers.
2.5.4 Expectations regarding Cloud-based hosting
The transition to the cloud leads to specific security risks that should be compared with risks linked to
information systems outsourcing. The CSPs traditionally implement their own governance rules with
limited visibility and choice for the customer/user. It is one of the main objectives of CLARUS to give
back control to the user.
2.5.4.1 Risks regarding data location
In Europe, the legal framework for personal data protection is based on the principle that it should
be possible at any time to control the data location (territoriality principle).
Yet in a public cloud service getting this information is not possible. This raises the question of the
jurisdiction of courts and applicable law. The fact that it is not possible to perform audits hinders the
control on security measures implementation.
Similarly, personal data transfer outside EU’s borders is regulated. Without a consistent level of data
protection and guarantees as to the security measures implemented, data confidentiality is uncertain.
2.5.4.2 Risks of information system loss of control
Governance: using the services of a cloud provider, the customer grants the provider full
control, including the management of security incidents.
Portability/Reversibility: cloud services do not always ensure reversibility of data, applications
or services. In these conditions it seems difficult to consider changing provider.
2.5.4.3 Risks related to multi-tenancy and resource sharing
Faulty isolation: resource separation mechanisms (storage, memory) may be faulty and integrity
and confidentiality of data compromised;
Incomplete or insecure deletion: there is no guarantee that the data is actually deleted or that
there are no other copies stored in the cloud.
2.6 Demonstration Cases for CLARUS
The aim of this section is to describe precisely those application cases that will be used as
demonstrators for CLARUS (especially in the frame of WP6 work), i.e. cases that will help in specifying
CLARUS, designing its architecture and validating it all along the project.
We have seen that the geopublication domain is wide and that numerous scenarios of use of datasets and
services can be identified. We have listed a wide range of security expectations, even if some of them
cannot be fully met by CLARUS as currently foreseen. We nevertheless focus on the most common
scenarios where cloud technologies are used and where CLARUS could bring breakthrough solutions to
important security expectations.
Both IaaS and PaaS delivery models are covered, as well as situations where distinct CSPs can be
involved.
In particular three main cases are detailed, that could be mapped to the different CLARUS typical
scenarios:
Storage of geospatial data,
Publication and processing of geospatial data,
Collaboration on geospatial data.
For each of these cases, a four-sided perspective has been adopted, answering key questions about:
Why the corresponding demonstration case is relevant for both CLARUS and the geospatial
domain,
Who is likely to use it in a real-life context,
What are the sample data selected, their type and their sensitivity/criticality,
How the application case is usually designed and implemented.
2.6.1 Geodata storage in the Cloud
2.6.1.1 Purpose (Why?)
Exploiting the cloud’s storage facilities is crucial for actors in the geospatial field, as the volume of
data they use (data with a spatial reference, usually called geodata) keeps growing
over time. In addition to storage capabilities, the cloud provides the users with ubiquitous access to and
sustainability of their data.
However securing the storage of geodata is a challenge.
For instance, applying ordinary measures to secure data access like disabling external connections to the
database could be very detrimental for the geospatial data users. Indeed geodata viewing, editing, and
analysis sometimes require the use of dedicated rich-client applications (e.g. QGIS) running on premises.
2.6.1.2 Actors (Who?)
People who create, interpret, and use geodata: geomatics/Geo-IT professionals, geodata and content
providers (e.g. public authority, national survey, territorial authority, etc.).
2.6.1.3 Data (What?)
The data that need to be secured in the typical context of a geospatial infrastructure are:
Geographical coordinates, and/or,
Scientific data (e.g. measurement value, measurement type, etc.).
For this demonstration case, any geospatial dataset requiring a certain level of security / privacy may fit
(cf. §2.2).
Example: Rare Earth and Minerals Resources Data.
This dataset includes information about the nature, genesis, location, extent, mining and distribution of
mineral resources, presence of rare earth, etc.
2.6.1.4 Design (How?)
In this typical case the data is accessed through a rich-client application supporting spatial databases
and GIS file formats (e.g. QGIS). A typical use case of QGIS is to extract data from a spatial database (e.g.
PostGIS) and convert them into a shapefile in order to use them in another Geographical Information
System.
The rich-client application runs on premises whereas Geodata are stored on the cloud.
Figure 5 - Geodata Storage Activity Diagram
Cloud storage is partitioned into multiple dataspaces in order to structure hierarchy of geo-datasets.
A dataspace can be either stored on a file storage system (with one or more GIS files depending on the
GIS file format), or on a database server (as a spatial database).
GIS files: the rich-client application (e.g. QGIS) requires local access to GIS files via the file
system, whereas file storage is on the cloud. Therefore a mount point (on Unix) or a virtual drive
(on Windows) must be created and mapped to the remote file storage server (AWS S3,
OpenStack Swift, Ceph or instances exposing a remote directory through NFS, GlusterFS or CIFS).
Spatial databases: the rich-client application directly accesses remote spatial databases
running on the cloud. Therefore the remote database server must support spatial databases
(PostgreSQL with the PostGIS extension).
Settings of a dataspace include configuration specific to the type of storage or specific to GIS format but
also include security settings:
GIS file(s): in order to make management efficient, a dataspace is a directory where the GIS file(s) is
(are) stored. The administrator manages the security settings of the GIS directories as if they were
located on premises, through the file system.
Spatial database: a dataspace is necessarily a spatial database on the database server. The
administrator manages the security settings of the spatial databases directly on the database
server.
2.6.1.4.1 Dataspace management (use case)
The way the administrator creates, modifies or deletes a dataspace depends on the type of storage:
GIS file(s):
o Create dataspace: the administrator creates a dedicated directory (using any file tool),
then the administrator may configure specific settings if required by the GIS file
format,
finally the administrator sets the owner of the dedicated directory and sets
access rights for all users (POSIX permissions on Unix, ACLs on Windows). Access
rights can be read-only, write or none,
o Modify settings: the administrator can modify name, path and access rights of a GIS
directory,
o Delete dataspace: the administrator simply deletes the GIS directory dedicated to the
dataspace.
Spatial database:
o Create dataspace: the administrator creates a dedicated spatial database directly on the
database server (using any database administration tool),
configures the policy of client authentication for the new database,
finally the administrator sets the owner of the spatial database and grants
privileges to other users. Access rights can be read-only, write or none,
o Modify settings: the administrator can modify any setting of a spatial database,
including the policy of client authentication and granted privileges,
o Delete dataspace: the administrator deletes the spatial database dedicated to the
dataspace directly on the database server (using any database administration tool). The
administrator also deletes the associated policy of client authentication.
2.6.1.4.2 Data management (use case)
2.6.1.4.2.1 Upload:
The way the provider uploads a dataset depends on the type of storage:
GIS file(s): the data provider copies the GIS file(s) directly to the GIS directory (using any file
tools). The GIS file(s) are transparently transferred into the cloud storage.
Spatial database: through any database client tool, the data provider initializes the remote
spatial database with a database script that defines the schema of the database and/or contains
geodata.
2.6.1.4.2.2 Modify geodata:
The way the provider creates, modifies or deletes geodata of a dataspace depends on the type of
storage:
GIS file(s):
o Connection: through the rich-client application (e.g. QGIS), the data provider navigates
in the file system to the GIS directory and selects the main GIS file to open,
the rich-client application opens the GIS file(s),
o View geodata: the data provider can browse, search and view all geodata stored in the
GIS file(s) using the rich-client application,
o Modify geodata: the data provider can modify geodata stored in the GIS file(s) using the
rich-client application,
o Delete geodata: the data provider can delete geodata stored in the GIS file(s) using the
rich-client application,
o Create geodata: the data provider can create new geodata in the GIS file(s) using the
rich-client application,
o Disconnection: through the rich-client application (e.g. QGIS), the data provider closes
the GIS file(s).
Spatial database:
o Connection: through the rich-client application (e.g. QGIS), the data provider connects
directly to the remote spatial database using a user account,
the data provider must specify all connection information: host, port, database
name, user and password,
the rich-client application connects to the remote spatial database,
o View geodata: the data provider can browse, search and view all geodata stored in the
spatial database using the rich-client application,
o Modify geodata: the data provider can modify geodata stored in the spatial database
using the rich-client application,
o Delete geodata: the data provider can delete geodata stored in the spatial database
using the rich-client application,
o Create geodata: the data provider can create new geodata in the spatial database using
the rich-client application,
o Disconnection: through the rich-client application (e.g. QGIS), the data provider closes
the connection to the spatial database.
2.6.2 Geodata publication in the Cloud
2.6.2.1 Web mapping (metadata and map creation)
2.6.2.1.1 Purpose (Why?)
Geospatial data providers such as public authorities are required to comply with the European INSPIRE
directive, which states that the data they are responsible for must be:
accessible on the internet and reusable,
through research, consultation and downloading,
thanks to metadata and services
Securing the geo-publication is an important concern as data providers often want to limit the access to
some of their spatial datasets and data services.
Access to data and/or services could be restricted due to public security or national defense,
and more generally to the existence of a security constraint.
Data could also be withheld from general disclosure due to business matters (i.e. selling data). In
this case the data has a high commercial value, not only for the provider (who sells it) but even
more for the companies who have acquired the data and reported it to the provider.
Furthermore some data are confidential (information about which companies are buying which
data).
2.6.2.1.2 Actors (Who?)
Public authorities acting as geo-referenced data providers, such as national surveys, municipalities, territorial authorities, etc.
2.6.2.1.3 Data (What?)
Example: Boreholes, Wells and Groundwater Data
In the case of a restricted access due to national security concerns, the dataset used to
demonstrate the publication scenario could be the groundwater boreholes data in France (i.e.
groundwater bank of basement maintained by the BRGM, the French national geological
survey). This dataset is subject to restricted access due to national security matters (at least
from the 1:100,000 scale)
Data category | Data name / data type
Critical | Geom (spatial geometry), Gisement (varchar)
Confidential | (none)
In the case of a restricted access due to business concerns, the dataset used to demonstrate
the publication scenario could be the borehole data from the oil industry in Denmark and
Greenland (maintained by GEUS, the Danish geological survey).
Data category | Data name / data type
Critical | (none)
Confidential | the whole dataset
The publication of geodata should be done through the definition of metadata, helping the
users to find the data they are looking for. Metadata need not be secured themselves, however
they contain information about the category of constraints applicable to the data or the service
itself. For instance the AccessConstraints element specifies any access constraints associated
with the corresponding resource (see below).
Screenshot - Access constraints metadata on groundwater dataset
2.6.2.1.4 Design (How?)
Publication relies on services used by geo web applications (JavaScript libraries, e.g. OpenLayers or
Leaflet):
Geo catalogue services: CSW (provided by a cataloguing application, e.g. GeoNetwork),
Geo web services: WMS, WFS (provided by a geospatial server, e.g. MapServer or GeoServer),
Download services: ATOM or WFS (for simple download, i.e. whole dataset), WFS (for direct
download).
Figure 6 - Geodata Publication Diagram
Different data are stored on the cloud:
Geo-spatial data storage is in charge of storing and serving geodata. Geodata are scientific data
associated to geographical features (points, lines, polygons, etc.).
Maps and layers storage is in charge of storing and serving the information which is necessary to
build an image for a map: source of geodata (GIS files or spatial database), symbology used to represent
data, etc.
Metadata storage is in charge of storing and serving metadata on geodata and on maps and
layers.
Tile cache is in charge of storing and serving tiles built from an image of a map.
All data are exclusively accessed from the computation services.
Computation services and storage services are dissociated on the diagram, which means that
different CSPs could be used: one for running the services and one (or more) for data storage.
However, in order to have an efficient solution, we should consider using a single CSP for both
running the services and for data storage. The combination of computation services running on
the cloud and storage on the cloud forms a SaaS application provided by the CSP.
2.6.2.1.4.1 Publish data (use case)
On premises, the data providers prepare geodata and maps for the geopublication:
Import data: the data provider uses the file transfer service to upload the dataset files to the
cloud through any file transfer tool which supports the SCP or SFTP protocols.
Publish data: on premises, the data provider publishes geodata:
o Creating a map with all its layers:
Defining the source of geodata and the symbology,
Publishing layers of the map: the background layer is published as a WMS service and
other layers are published as WFS services.
All information (map, layers, symbology, WMS service, WFS service, etc.) is
stored in a single .map file,
o Then, using the file transfer service, the .map file is uploaded to the cloud through any
file transfer tool which supports the SCP or SFTP protocols,
o Finally the data provider creates and publishes the metadata for the geodata and for the
map using the metadata catalog running on the cloud.
After that, geodata can be found using the dedicated Web application running on the cloud or using the
CSW service. Geodata, maps and tiles are accessible using the WFS, WMS and ATOM services.
2.6.2.1.4.2 Access data (use case)
On premises, the data consumer uses an application compatible with OGC services to search and view
geodata:
Search: through the GIS application, the data consumer invokes the CSW service running on the
cloud to search for geodata and discover how to access it (metadata gives endpoints for WFS,
WMS and ATOM services),
o Then the GIS application displays the list of available layers.
Display map: then the GIS application calls the WMS service and displays the background image
of the map:
o On the cloud, the tile service builds the background image from the cached tiles,
o If a tile does not exist in cache, the tile service invokes the map service via WMS. The
returned image is split into multiple tiles and put in the tile cache.
Selection of layers: then the data consumer selects one or more layers,
o For each selected layer, the GIS application calls the WFS service and displays the
features (geodata) on the map.
Navigation: each time the data consumer zooms in/out or pans, the map must be refreshed.
The Display map step is executed again.
Download: the data consumer may download geodata:
o Only a subset through the WFS service,
o Or the entire dataset through the ATOM service or through the WFS service.
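The requests behind the Display map, Selection of layers and Download steps can be sketched as key-value-pair (KVP) GET requests. This is only an illustration: the endpoint URL, the layer names and the bounding boxes are hypothetical, and the service versions (WMS 1.1.1, WFS 1.1.0) are an assumption, since the document does not state which versions are deployed.

```python
from urllib.parse import urlencode

# Hypothetical endpoint: the document does not name the actual service URLs.
BASE_URL = "https://geo.example.org/ows"


def ogc_url(base_url, service, version, request, **params):
    """Build a key-value-pair (KVP) GET request for an OGC web service."""
    query = {"service": service, "version": version, "request": request}
    query.update(params)
    return base_url + "?" + urlencode(query)


def wms_get_map(base_url, layer, bbox, width, height):
    """WMS 1.1.1 GetMap request for the background image of the map."""
    return ogc_url(
        base_url, "WMS", "1.1.1", "GetMap",
        layers=layer, styles="", srs="EPSG:4326",
        bbox=",".join(str(v) for v in bbox),
        width=width, height=height, format="image/png",
    )


def wfs_get_feature(base_url, type_name, bbox=None):
    """WFS 1.1.0 GetFeature request; a bbox restricts the result to a subset."""
    params = {"typeName": type_name}
    if bbox is not None:
        params["bbox"] = ",".join(str(v) for v in bbox)
    return ogc_url(base_url, "WFS", "1.1.0", "GetFeature", **params)


# Background image, then a subset of one (hypothetical) feature layer.
print(wms_get_map(BASE_URL, "background", (8.0, 54.5, 15.2, 57.8), 512, 256))
print(wfs_get_feature(BASE_URL, "boreholes", bbox=(8.0, 54.5, 9.0, 55.0)))
```

Calling wfs_get_feature without a bounding box corresponds to downloading the entire dataset through the WFS service.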
2.6.2.2 Geo-processing (Kriging computation)
The following demonstration case is about securing a geo-processing service for commercial purpose.
This scenario might not be found in a single organization but at least all the elements of the scenario are
known to occur in different organizations: for instance providers selling geospatial data on one hand,
and providers offering computational services on the other.
In this demonstration case, we have selected a dataset (borehole data) known to have a high
commercial value, and a computational service that requires significant effort and
investment, suggesting that the service provider may want a return on it (at least to
compensate what has been spent): namely, a Kriging service.
Kriging is an interpolation method, which is widely used to estimate the value at an unmeasured
location from known measurements observed at nearby locations (cf. §2.4.4.2).
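As an illustration of the method, the following minimal sketch implements ordinary Kriging in plain Python. It is not the implementation used in the project: the linear variogram model, the helper names and the sample values are assumptions chosen to keep the example short and self-contained.

```python
import math


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the caller's data is left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular Kriging system (duplicate sample points?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def linear_variogram(h, slope=1.0):
    """Assumed variogram model gamma(h) = slope * h (no nugget effect)."""
    return slope * h


def distance(p, q):
    """Euclidean distance using the first two coordinates of each point."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def ordinary_kriging(samples, target, variogram=linear_variogram):
    """Estimate the value at target (x, y) from (x, y, value) samples."""
    n = len(samples)
    # Variogram matrix bordered by ones: the extra row and column enforce
    # the unbiasedness constraint that the weights sum to 1.
    a = [[variogram(distance(samples[i], samples[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(distance(s, target)) for s in samples] + [1.0]
    solution = solve_linear_system(a, b)
    weights = solution[:n]  # the last entry is the Lagrange multiplier
    estimate = sum(w * s[2] for w, s in zip(weights, samples))
    return estimate, weights


# Hypothetical measurements at the four corners of a unit square.
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (0.0, 1.0, 14.0), (1.0, 1.0, 16.0)]
estimate, weights = ordinary_kriging(samples, (0.5, 0.5))
print(estimate, weights)
```

The estimate is a weighted sum of the known measurements, which is why both the sample coordinates and the measured values have to be available to the computation.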
2.6.2.2.1 Purpose (Why?)
Why secure a commercial computational service?
In a transaction involving a Kriging interpolation, the service provider holds measurements together
with their related coordinates and wants to provide the computational services in return for some
benefits. On the other side of the transaction, the data consumer wants to obtain an estimated
prediction for a specific location without making measurements.
According to B. Tugrul and H. Polat, the problem is that although Kriging is increasingly becoming
popular and widely used for estimating predictions, it fails to protect confidentiality. Data providers
and data consumers could hesitate to participate in Kriging transactions. [R14]
2.6.2.2.2 Actors (Who?)
Data providers: public authorities proposing commercial services, business data providers,
Data consumers: companies that are ready to pay for using the geoprocessing service.
2.6.2.2.3 Data (What?)
From the provider perspective, measurements collected for Kriging interpolations and their related
coordinates are considered as confidential. They are its valuable assets. The provider “might lose
competitive edge over other rival companies” in case of data disclosure. [R14]
From the consumer perspective, the location where a prediction is requested and the estimated
prediction for that location are considered confidential. Based on the outcome of the Kriging
interpolation, data consumers plan investments and they do not want to reveal the location and the
related estimated prediction to the service provider.
To sum up, the confidential data that should be protected from the parties involved
are:
Coordinate values of the sample locations (provider-side),
Observed measurements (provider-side),
Coordinate values of the research location (consumer-side),
Estimated prediction (consumer-side).
In addition, privacy should also be preserved, i.e.:
Commercial transaction (which company buys/sells which data).
Kriging interpolation is used in various areas (mine reservoirs, petroleum industry, environmental
sciences, agriculture, etc.).
In this case, we plan to demonstrate a privacy-preserving Kriging interpolation on the groundwater
borehole data in Denmark and Greenland.
Data category Data name / data type
Critical
Confidential Geom (spatial geometry)
Owners and buyers of data
2.6.2.2.4 Design (How?)
Geo web services : WPS,
Used by geo web applications.
Figure 7 - Geo-processing Activity Diagram
The Geo-processing activity diagram extends the Geodata publication diagram by adding a Kriging
service running on the cloud.
2.6.2.2.4.1 Publish kriging (use case)
The data provider is in charge of publishing the Kriging application:
Install Kriging application: the data provider uses the transfer file service to upload the Kriging
application on the cloud through any file transfer tool which supports SCP or SFTP protocols.
Publish Kriging service: the data provider creates and publishes the metadata for the Kriging
service using the metadata catalog running on the cloud.
After that, the Kriging service can be found using the CSW service and is accessible using the WPS service.
2.6.2.2.4.2 Invoke a kriging service (use case)
On premises, the data consumer uses an application compatible with OGC services to search and invoke
a Kriging service:
Search: through the GIS application, the data consumer invokes the CSW service running on the
cloud to search for the kriging service and discover how to access it (metadata gives endpoints
for WPS services).
Selection of features: on the map, the data consumer selects the features to be used by the
Kriging service.
Estimation: then the GIS application invokes the kriging service with the selected point through
the WPS service.
Display estimated points: the GIS application displays the estimated points on the map.
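The Estimation step can be sketched as a WPS 1.0.0 Execute request using the KVP (GET) encoding. The endpoint, the process identifier and the input names below are hypothetical; the document does not specify them.

```python
from urllib.parse import urlencode


def wps_execute_url(base_url, process_id, inputs):
    """Build a WPS 1.0.0 Execute request using the KVP (GET) encoding."""
    # DataInputs is a semicolon-separated list of name=value pairs.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = {
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "DataInputs": data_inputs,
    }
    return base_url + "?" + urlencode(query)


# Hypothetical endpoint, process identifier and input names.
url = wps_execute_url(
    "https://geo.example.org/wps",
    "kriging",
    {"x": 12.5, "y": 55.7},
)
print(url)
```

Note that the location for which a prediction is requested travels in the request itself, which is one of the consumer-side confidential items listed in 2.6.2.2.3.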
How is the Kriging WPS implemented?
The WPS Kriging process can be implemented in different ways: for instance within a GIS server like
GeoServer (which provides WPS support and Sextante integration) or within a Java-based WPS container
like the 52°North Web Processing Service.
The following diagram shows an implementation of the Kriging process through a Java class on a Linux
machine, using the Apache Tomcat Java web server and the 52°North WPS 3.1.1 implementation. Ordinary
Kriging is applied on the input data using the R Gstat package. The interconnection between the Java
module located in the WPS container and R is handled by the TCP/IP server Rserve. [R10]
Figure 8 - Server configuration of WPS Kriging implementation [R10]
2.6.3 Collaboration on geodata in the Cloud
Aside from storage, publication and computation, the collaborative editing of geographic data is another
important feature of a Geographic Information System (at least in some cases). One of the most popular
initiatives to collaborate on geo-referenced data online is OpenStreetMap (OSM), an open-source
project which provides online JavaScript/Flash-based editors and a RESTful editing API for collaborating
on geodata [R15]. Thanks to OSM, any mapper (whether amateur or professional) can add or edit any
feature in any area, such as roads, but more generally any geographic object one may wish to
reference (phone boxes, bus stops, parks, public toilets, etc.).
OpenStreetMap mainly relies on volunteers to work on the task of collecting geodata and providing
maps, free to use, without restriction, i.e. it promotes a crowdsourcing approach to geography. The
benefit is that while in other systems, the update cycle of maps is often very slow, OSM is continually
updated.
This project is very popular, but this popularity comes with drawbacks, from deliberate vandalism to
inaccuracies or mistakes. As OSM keeps a full editing history for every object, any mistake or
deliberate vandalism can be rolled back. However, these drawbacks may become critical in some
domains, for instance disaster management. Indeed, map projects sometimes take place
in an “extremely sensible environment like a war zone, where tolerance for inaccuracy is low, security
and privacy concerns for respondents are too high. In these cases, encouraging the crowd to submit
sensitive data might not be the best strategy.” [R18]
For the geodata collaboration demonstration case, we have decided to focus on
disaster management. Despite the vital role that OSM plays in this domain (as exemplified by the
Humanitarian OpenStreetMap Team), we set aside the OSM crowdsourcing approach and choose a
standard GIS solution based on the WFS-T service.
2.6.3.1 Purpose (Why?)
The geospatial community can meet the challenge of disaster management, and providing support to
situations of emergency is one of the key roles of a GIS. This is especially the case for complex
emergency actions where geo-information plays an important role. It is crucial that information on the
spatial extent and consequences of disasters is made available within a short time to decision makers,
and shared between disaster-response personnel for the coordination of emergency efforts. Mobile
applications are obviously desirable in such a context as they could help in taking near real-time
decisions in the field. However, these tasks require more than ‘classic’ GIS functionalities, as not only
visualization is needed, but also quick updates of geo-information.
A common approach for supporting this geodata collaboration scenario is to provide a web-based
application with a user interface to select and update map features. This interface (a mapping client
application) allows the user to draw new features, and to move, modify or delete features in the map according to the
current situation. This editing client is primarily designed for an operational officer in the field, for
emergency tasks with a spatial reference such as marking evacuation routes, marking a
helicopter landing site, etc. [R17]
However, security is an important concern. It is “one of the major reasons cited by organizations for
failing to share data in support of emergency response.” While there are “enormous amounts of data”
essential to emergency management, “many organizations are unwilling to share their data or will
provide it only under very restrictive agreements because of concerns about data security or liability”.
[R16]
It is therefore crucial to develop a set of security requirements for data to be shared in the event of a
local, regional, or national emergency. These guidelines should be implemented by the parties involved.
2.6.3.2 Actors (Who?)
Geohazard experts (in the field), Disaster-response personnel.
2.6.3.3 Data (What?)
Utility and governmental services liable to be damaged in the event of a disaster, and whose
security and confidentiality are critical for authorities and/or private companies: oil and gas pipelines, water
supply networks, electricity transmission lines, and transmission networks for different kinds of data and
signals.
This broad dataset is part of the INSPIRE data specifications, whose typical use example is risk
management. In the context of the Seveso Directive, which is of major importance in regulating
management of risk, access to utility data is needed.
Data to be secured in this context:
Geospatial coordinates of critical points
2.6.3.4 Design (How?)
Through mobile applications (mobile-friendly JavaScript map library)
Using WFS-T* (for transactions)
With geospatial servers supporting WFS-T (e.g. GeoServer or deegree)
Publishing data from a PostgreSQL/PostGIS database (layers) with OGC web services
The editing (mobile) application allows users to display WFS layers from a geospatial server (e.g.
GeoServer or deegree). The user can edit a selected WFS layer by clicking on a dedicated button ("Edit").
The layer being edited is highlighted in the layer list and simultaneously an editing toolbox is displayed.
In this toolbox, the user can choose to move existing features of the edited layer, create new ones,
or delete existing ones. When the user edits features, a WFS transaction is created (i.e. an XML document
specifying which features have been modified and how). When the user clicks the Save button in the
editing toolbox, the generated WFS transaction (WFS-T) is sent to the server and the data is updated. The
modifications are then displayed on any other device connecting to the application.
* WMS and WFS services are specified as read-only. WFS-T is a part of the WFS specification that allows
updates of the underlying data (rarely implemented due to the inefficiency of XML and HTTP for uploading large
GIS data).
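To make the notion of a WFS transaction concrete, the sketch below builds a WFS 1.1.0 Transaction document containing Update and Delete actions. The layer name, property name and feature identifiers are hypothetical, and the creation of new features (wfs:Insert) is left out because it requires a GML geometry encoding that depends on the published schema.

```python
import xml.etree.ElementTree as ET

WFS_NS = "http://www.opengis.net/wfs"
OGC_NS = "http://www.opengis.net/ogc"
ET.register_namespace("wfs", WFS_NS)
ET.register_namespace("ogc", OGC_NS)


def _feature_filter(parent, feature_id):
    """Attach an ogc:Filter selecting one feature by its identifier."""
    flt = ET.SubElement(parent, f"{{{OGC_NS}}}Filter")
    ET.SubElement(flt, f"{{{OGC_NS}}}FeatureId", {"fid": feature_id})


def build_transaction(type_name, updates=(), deletes=()):
    """Build a WFS 1.1.0 Transaction with Update and Delete actions.

    updates: iterable of (feature_id, property_name, new_value)
    deletes: iterable of feature_id
    """
    root = ET.Element(f"{{{WFS_NS}}}Transaction",
                      {"service": "WFS", "version": "1.1.0"})
    for feature_id, prop, value in updates:
        upd = ET.SubElement(root, f"{{{WFS_NS}}}Update", {"typeName": type_name})
        prop_el = ET.SubElement(upd, f"{{{WFS_NS}}}Property")
        ET.SubElement(prop_el, f"{{{WFS_NS}}}Name").text = prop
        ET.SubElement(prop_el, f"{{{WFS_NS}}}Value").text = str(value)
        _feature_filter(upd, feature_id)
    for feature_id in deletes:
        dele = ET.SubElement(root, f"{{{WFS_NS}}}Delete", {"typeName": type_name})
        _feature_filter(dele, feature_id)
    return ET.tostring(root, encoding="unicode")


# Hypothetical edit: close one evacuation route and delete another.
print(build_transaction(
    "evacuation_routes",
    updates=[("evacuation_routes.12", "status", "closed")],
    deletes=["evacuation_routes.7"],
))
```

The resulting document would be sent to the server in an HTTP POST request; all the actions it contains are processed as a single transaction.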
Figure 9 - Geodata Collaboration (Consultation) Activity diagram
Figure 10 - Geodata Collaboration (Modification) Activity diagram
Different data are stored on the cloud, and all these data are exclusively accessed from the computation
services (see the previous case for details).
2.6.3.4.1 Publish data (use case)
The data providers prepare geodata for the geopublication on premises and create maps on the cloud:
Import data: the data provider uses the transfer file service to upload the dataset files on the
cloud through any file transfer tool which supports SCP or SFTP protocols.
Publish data: on the cloud, the data provider publishes geodata:
o Creating a map with all its layers:
Defining the source of geodata and the symbology,
Publishing layers of the map: background layer is published as WMS service and
other layers are published as WFS service.
o Then the data provider creates and publishes the metadata for the geodata and for the
map using the metadata catalog running on the cloud.
After that, geodata can be found using the dedicated Web application running on the cloud or
using the CSW service. Geodata, maps and tiles are accessible using the WFS, WMS and ATOM
services.
2.6.3.4.2 Access data (use case)
On devices, the data consumer uses an application compatible with OGC services to search and view
geodata:
Search: through the GIS application, the data consumer invokes the CSW service running on the
cloud to search for geodata and discover how to access it (metadata gives endpoints for WFS,
WMS and ATOM services),
o Then the GIS application displays the list of available layers.
Display map: then the GIS application calls the WMS service and displays the background image
of the map:
o On the cloud, the tile service builds the background image from the cached tiles,
o If a tile does not exist in cache, the tile service invokes the map service via WMS.
Returned image is split into multiple tiles and put in the tile cache.
Selection of layers: then the data consumer selects one or more layers,
o For each selected layer, the GIS application calls the WFS service and displays the
features on the map.
Navigation: each time the data consumer zooms in/out or pans, the map must be refreshed.
The Display map step is executed again.
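The behaviour of the tile service in the Display map step (serve from the cache, call the map service on a miss) can be sketched as follows. This is a simplified in-memory model: the renderer is a hypothetical stand-in for the WMS call to the map service, and it produces one tile per miss instead of splitting a larger returned image into several tiles.

```python
class TileCache:
    """In-memory cache-aside store for map tiles keyed by (zoom, column, row)."""

    def __init__(self, render_tile):
        # render_tile stands in for the WMS call to the map service.
        self._render_tile = render_tile
        self._tiles = {}
        self.misses = 0

    def get_tile(self, zoom, col, row):
        """Return the cached tile, rendering and caching it on a miss."""
        key = (zoom, col, row)
        if key not in self._tiles:
            self.misses += 1
            self._tiles[key] = self._render_tile(zoom, col, row)
        return self._tiles[key]

    def get_background(self, zoom, cols, rows):
        """Assemble the background as a grid of tiles, row by row."""
        return [[self.get_tile(zoom, c, r) for c in cols] for r in rows]


def fake_map_service(zoom, col, row):
    """Hypothetical renderer returning placeholder bytes instead of an image."""
    return f"tile-{zoom}-{col}-{row}".encode()


cache = TileCache(fake_map_service)
cache.get_background(3, range(2), range(2))   # four misses: tiles are rendered
cache.get_background(3, range(2), range(2))   # served entirely from the cache
print(cache.misses)
```

Because the tile service writes to the cache on behalf of the data consumer, it needs write access to the cloud storage even though the consumer's own access is read-only.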
2.6.3.4.3 Modify data (use case)
On devices, the data provider or mandated people use an application compatible with OGC services to
modify features:
Select the layer: through the GIS application, the data provider or mandated people select a
layer to modify
o The GIS application displays the edit toolbox,
o The GIS application calls the WFS-T service to lock the features.
Modify the layer: then the data provider or mandated people modify the layer
o Either selecting the feature to modify:
Modifying the feature value,
Deleting the feature,
o Or creating a new feature (specifying the coordinates and the scientific data),
o Then the data provider or mandated people can continue to modify the layer.
Save modifications: then the data provider or mandated people save the modified layer:
o The GIS application calls the WFS-T service with all modifications (new features, features
modified and features deleted):
The WFS-T service modifies all the features in one transaction,
The WFS-T service releases the lock on the data,
o The GIS application closes the edit toolbox,
Update metadata: then the data provider or mandated people update the metadata for the
map using the metadata catalog running on the cloud.
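The lock, modify and save cycle described above can be illustrated with a toy in-memory model. It does not call a real WFS-T service (where the corresponding operations are LockFeature and Transaction); the class and its method names are hypothetical and only show the intended semantics: one editor at a time, and all modifications applied together before the lock is released.

```python
class FeatureLayer:
    """Toy in-memory layer mimicking the lock / modify / save cycle."""

    def __init__(self, features):
        self.features = dict(features)   # feature id -> attribute value
        self.locked_by = None

    def lock(self, user):
        """Give the lock to user unless somebody else already holds it."""
        if self.locked_by not in (None, user):
            raise RuntimeError(f"layer is locked by {self.locked_by}")
        self.locked_by = user

    def commit(self, user, inserts=None, updates=None, deletes=()):
        """Apply all pending changes at once, then release the lock."""
        if self.locked_by != user:
            raise RuntimeError("lock must be held before saving")
        staged = dict(self.features)          # work on a copy first
        staged.update(inserts or {})
        for fid, value in (updates or {}).items():
            if fid not in staged:
                raise KeyError(fid)
            staged[fid] = value
        for fid in deletes:
            del staged[fid]
        self.features = staged                # all-or-nothing swap
        self.locked_by = None


layer = FeatureLayer({"route.1": "open", "route.2": "open"})
layer.lock("field-officer")
layer.commit("field-officer",
             inserts={"landing.1": "helicopter landing site"},
             updates={"route.1": "closed"},
             deletes=["route.2"])
print(layer.features)
```

If any change is invalid, the staged copy is discarded and the stored features are left untouched, which mirrors the requirement that the service modifies all the features in one transaction.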
2.7 Securing demonstration cases
The aim of this section is to give an overview of the security expectations that relate to the different
demonstration cases described in the previous part of the document. Particular attention is paid
to the following matters, which are important for the CLARUS requirements:
Protecting the data, knowing:
o Where the data is
o How it should be protected
o Who is accessing it (audit)
Protecting confidentiality against:
o The CSP,
o Potential attackers.
2.7.1 Securing geodata storage in the Cloud
2.7.1.1 Protecting the data
Access to a dataspace must be protected, so that only users with granted privileges can access it. The
way a dataspace is protected depends on the type of storage:
GIS files: because users access GIS files locally via the file system, thanks to a mount point (on
Unix) or a virtual drive (on Windows), access control must rely on the system user account and
on the rights defined on the GIS directory. Only the administrator is allowed to manage the GIS
directories (create, modify, delete).
Spatial database: because users connect directly to the spatial database, access control must
rely on the database user, on the defined policy of client authentication, and on the privileges
given for the spatial database. Only the administrator is allowed to manage the spatial
databases, using a privileged account (create, modify, delete).
A user that can connect to a dataspace can access all data of the dataspace.
Data is accessed by:
Dataset administrators are in charge of managing datasets using suitable tools (e.g. file browser,
database admin tool) running on premises.
Data providers manage the content of dataspace on which they have write permissions using
suitable tools (e.g. QGIS) running on premises.
2.7.1.2 Protecting confidentiality
Because dataspaces are located on the cloud whereas rich-client application runs on premises, traffic
must be filtered and communications protected:
Storage service must be configured to allow access only from the premises.
Users must be authenticated and/or requests must be signed.
Protocol(s) used to access the remote file storage server (AWS S3, OpenStack Swift, Ceph or
instances exposing a remote directory through NFS, GlusterFS or CIFS) and protocol(s) used to
access the database server (PostgreSQL protocol) must be encrypted.
In case of data leakage out of the cloud storage, data must be unusable. As mentioned previously, data
that need to be particularly secured are geographical features and/or scientific data (e.g. measurement
value, measurement type, etc.). The way those data are encoded depends on the type of storage and on
the GIS format. However the data types are generally the following ones:
Geographical features are encoded with a composite type which varies according to the GIS
format. A geographical feature is by nature difficult to interpret but is easy to identify.
Scientific data are generally encoded with a primitive type (boolean, number, string). Scientific
data may be difficult to identify because they are drowned among the other data, but they are easy to
interpret.
2.7.2 Securing geodata publication in the Cloud
2.7.2.1 Protecting the data
2.7.2.1.1 Constraints and limitations
Geodata, maps and tiles are intended for publication. However data providers may want to restrict
access to authorized people only, or give public access to low-quality data only, degrading exactness and
accuracy of maps and data:
Restrict access to the entire dataset and maps: only authorized data consumers can access all
the geodata, maps and tiles.
Degrade exactness and accuracy of maps and data: any data consumer can access the geodata,
maps and tiles, but data is made imprecise and inoperable at small scale levels.
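One simple way to degrade the accuracy of published coordinates is to snap them to a coarse grid. The sketch below is only an assumption about how such a degradation could be done; the document does not prescribe a technique, and the cell size and sample point are hypothetical.

```python
def snap_to_grid(coordinate, cell_size):
    """Round a coordinate to the nearest multiple of cell_size."""
    return round(coordinate / cell_size) * cell_size


def degrade_points(points, cell_size):
    """Return a lower-precision copy of (x, y) points for public release."""
    return [(snap_to_grid(x, cell_size), snap_to_grid(y, cell_size))
            for x, y in points]


# Hypothetical point, degraded to a 0.5 degree grid.
precise = [(12.3456, 55.6789)]
print(degrade_points(precise, 0.5))
```

The original list is left unchanged, so the precise data can still be served to authorized data consumers while the degraded copy is published.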
Metadata do not need to be protected and can be used by anyone to have information on data
(including information on security).
Data integrity may be important for data consumers. Maps and tiles could be signed in order to
assure data consumers of the origin and integrity of the returned images.
2.7.2.1.2 Roles and access
In the case of geo-publication, data is accessed by:
Data providers, in charge of
o Managing geodata and maps through the transfer services (SCP, FTP, etc.) (write),
o Managing metadata through the dedicated Web application running on the cloud or
through CSW (write).
Data consumers, likely to
o Request images of map or tiles through the WMS service and request features (geodata)
through the WFS service (read-only),
o Download the entire dataset through the ATOM service or through the WFS service
(read-only).
Services (on behalf of the data consumer)
o Get tile service, that creates and caches tiles when requested tiles do not exist (write),
o Get map service, that creates an image of the requested map (write).
2.7.2.1.3 Geo-processing peculiarities (Kriging)
The Kriging data is accessed by:
Data providers, in charge of
o Installing and maintaining the application (write) that serves the Kriging service through
the WPS service,
o Managing metadata on WPS through the dedicated Web application running on the
cloud or through the CSW service (write).
Data consumers, that request Kriging processing running on the cloud through the WPS service
(read-only)
Kriging service (on behalf of the data consumer), that processes estimation of points from the
geodata samples (read-only).
The data consumer may request a private estimation, so the result should be considered as private data
that the data provider and the CSP should neither store nor process.
Samples of geodata must not be directly accessible on the cloud storage. All users must use
computation services to access data.
If data degradation is required for security reasons, imprecise data must not lead to
incorrect estimations by the Kriging service.
2.7.2.2 Protecting confidentiality
Because geodata, maps and tiles are intended for publication, confidentiality is generally not an issue.
However, geodata, maps and tiles must not be directly accessible on the cloud storage. All users must
use computation services to access data.
Moreover, confidentiality must be protected when a restricted access to the entire dataset and maps is
required.
Traffic must be filtered and communications protected:
Compute service could be configured to allow access only from a white list of IP networks.
Users must be authenticated and requests could be signed.
Protocols used to access geodata, maps and tiles must be encrypted and protected (OGC services
and ATOM service rely on HTTP/S).
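The first two measures (a white list of IP networks and signed requests) can be sketched with the Python standard library. The networks, the shared secret and the canonical form of the signed message are assumptions made for the example; a real deployment would follow the scheme required by the chosen CSP or gateway.

```python
import hashlib
import hmac
import ipaddress

# Hypothetical white list; both ranges are reserved for documentation.
ALLOWED_NETWORKS = [ipaddress.ip_network("192.0.2.0/24"),
                    ipaddress.ip_network("198.51.100.0/24")]


def is_allowed(client_ip, networks=ALLOWED_NETWORKS):
    """Return True when the client address belongs to a white-listed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in networks)


def sign_request(secret_key, method, path_and_query):
    """Compute an HMAC-SHA256 signature over the method and the request target."""
    message = f"{method}\n{path_and_query}".encode()
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def verify_request(secret_key, method, path_and_query, signature):
    """Check a signature using a constant-time comparison."""
    expected = sign_request(secret_key, method, path_and_query)
    return hmac.compare_digest(expected, signature)


key = b"shared-secret-for-illustration-only"
target = "/ows?service=WFS&version=1.1.0&request=GetFeature&typeName=boreholes"
signature = sign_request(key, "GET", target)
print(is_allowed("192.0.2.17"), verify_request(key, "GET", target, signature))
```

Filtering and signing complement, but do not replace, the encrypted transport (HTTPS) required by the third item.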
In case of data leakage out of the cloud storage, data must be unusable. As mentioned previously, data
that need to be particularly secured are geographical features and/or scientific data (e.g. measurement
value, measurement type, etc.). The way those data are encoded depends on the type of storage and on
the GIS format. However the data types are generally the following ones:
Geographical features are encoded with a composite type which varies according to the GIS
format. A geographical feature is by nature difficult to interpret but is easy to identify.
Scientific data are generally encoded with a primitive type (boolean, number, string). Scientific
data may be difficult to identify because they are drowned among the other data, but they are easy to
interpret.
When degraded accuracy of maps and data is required, Geodata must be imprecise and maps
generated for small levels must be inaccurate in case of data leakage out of the Cloud storage.
2.7.3 Securing geodata collaboration in the Cloud
2.7.3.1 Protecting the data
Geodata, maps and tiles are intended for publication. They can be viewed or downloaded by any data
consumer but only data provider or mandated people can modify them.
Metadata do not need to be protected and can be used by anyone to have information on data
(including information on security). However, only data provider or authorised people can modify
metadata.
Data integrity is very important for people mandated for data modification. Geodata, maps and tiles
could be signed in order to assure the mandated people of the origin and integrity of the
returned data and images.
2.7.3.1.1 Roles and Access
In the case of geodata collaboration, data is accessed by:
Data providers, in charge of managing
o Geodata through the transfer services (SCP, FTP, etc.) (write),
o Maps and layers through the dedicated Web application running on the cloud (write),
o Metadata on geodata and on maps through the dedicated Web application running on
the cloud (write).
Data providers or mandated people, likely to:
o Modify features through the WFS-T service (write)
o Update metadata associated to modified features through the dedicated Web
application running on the cloud (write).
Data consumers, likely to:
o Request images of map or tiles through the WMS service and request features (geodata)
through the WFS service (read-only).
Services (on behalf of data providers)
o Get tile service, that creates and caches tiles when requested tiles do not exist (write).
o Get map service, that creates an image of the requested map (write).
2.7.3.2 Protecting confidentiality
Because geodata, maps and tiles are intended for publication, confidentiality is usually not an issue.
However, confidentiality is an issue while modifying geodata (see previous section for expectations
regarding the protection of confidentiality).
2.7.4 Summary
To summarise, we have mapped the various demonstration cases proposed in § 2.6 to the
security expectations on which they depend.
Note that these security expectations will not necessarily all be covered by the CLARUS solution.
CLARUS may also be constrained by other parameters (performance, availability, etc.). These
non-functional constraints will be expressed in deliverable D2.2 of the same Work Package (WP2), which relates
to the requirements specification. The only expectations considered here are the security expectations
regarded as prerequisites of confidence (cf. § [reference not found]).
Security Expectations (Geospatial subjects) with regard to the Geo Data Demonstration Cases:
Storage (§ 2.6.1), Publication (§ 2.6.2.1), Processing (§ 2.6.2.2), Collaboration (§ 2.6.3)
Category | Security expectation | Refers to §
Data | accessibility and alteration protection | 2.5.1.1
Data | metadata-based access limitations | 2.5.1.2
Data | confidentiality | 2.5.1.3
Data | authenticity of sources (data quality) | 2.5.1.4 []
Services | access authentication (e.g. GeoRM) | 2.5.2.1 []
Services | metadata-based access limitations | 2.5.2.2.1
Services | secure communication protocols | 2.5.2.3
Personal Data | personal data protection | 2.5.3.1
Personal Data | access auditing | 2.5.3.2
Cloud-based hosting | data location risks | 2.5.4.1
Cloud-based hosting | information system loss of control risks | 2.5.4.2
Cloud-based hosting | multi-tenancy and resource sharing risks | 2.5.4.3
Relevance Levels : : relevant [] : potentially relevant
Figure 11 - Geo Data Demonstration Cases with regard to Security Expectations
For each of these demonstration cases, we additionally propose the following mapping to
the different CLARUS scenarios.
CLARUS Generic Scenarios (refers to [A2] §2.1.2.1, Technical approach)
Scenario configuration | Scenario A | Scenario B | Scenario C
Cloud customer(s) | 1 | More than one | More than one
CSP | 1 | 1 | More than one (>1 CSP, or 1 CSP and >1 user account)
Geo Data Demonstration Cases | Refers to § | Scenario
Storage | 2.6.1 | A, C
Publication | 2.6.2.1 | B, C
Processing | 2.6.2.2 | A, B, C
Collaboration | 2.6.3 | B, C
Figure 12 - CLARUS Generic Scenarios with regard to Geo Data Demonstration Cases
3 E-Health Application Case
3.1 Overview
Due to the high degree of digitalization of the Medical Health Record in Hospitals nowadays, and
because of many legal regulations related to long term data preservation, there is a need to transform
active Medical Records, which contain the medical information of active patients (the ones who keep
contact with the Hospital for Healthcare purposes), to passive ones, based on different criteria, i.e.
cease of activity (when for any reason the patient stops having contact with the Hospital) and/or death,
with a time constraint that varies amongst the different European countries due to their specific laws (5
years in Spain).
The Hospital Healthcare professionals need to search and retrieve passive Medical Health Records data
for different reasons, mainly for healthcare purposes but also to meet legal requirements,
typically one Medical Health Record at a time.
Another need we should take into account, specifically in the research field, is the possibility to make
Advanced Queries with or without Statistics Computation, over the passive Medical Health Records
data, in order to retrieve datasets (with multiple Medical Health Records extraction) meeting certain
conditions specified by the Hospital Healthcare professionals and to apply different statistical
calculations when required by the physician and/or the researcher.
In big Hospitals this creates an increasing demand on storage capacity, which pushes up the required
technology assets and related costs of ownership. In this scenario it is highly recommended to
move towards cloud storage services in order to cover these growing needs, which at the same time
raises security issues, because we are dealing with very sensitive personal health data.
Cloud services, as we know them currently, are not able to fulfil the security needs of such sensitive
data, as they cannot assure the integrity and confidentiality of the datasets they store. This is the main
reason why the CLARUS approach, understood in this case as a layer of solutions on top of cloud storage
services, must be a suitable answer to the issues previously mentioned.
3.2 Actors
3.2.1 Data Providers
Patients’ Medical Health Records stored in the Hospital Information System (HIS), as active Medical
Records, are the source of data to be transformed into passive Medical Records, and then to be stored
in the cloud services using the CLARUS methodologies, technologies and tools.
The Medical Records Department of the Hospital will be responsible for the task of transforming the
active Medical Records into passive ones, so they will periodically trigger the Medical Records
“Passivation” process, based on different time constraints criteria as mentioned in the overview.
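The selection step of the periodic "Passivation" process could look like the following sketch. It assumes that a record becomes passive once the retention period (5 years, the value quoted for Spain in the overview) has elapsed since the patient's death or, failing that, since the last contact with the Hospital. The record structure and the function names are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

RETENTION_YEARS = 5  # value quoted for Spain in the overview; other countries differ


@dataclass
class MedicalRecord:
    patient_id: str
    last_contact: date
    date_of_death: Optional[date] = None


def years_before(day, years):
    """Return the same calendar day `years` earlier (29 Feb maps to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def should_passivate(record, today, retention_years=RETENTION_YEARS):
    """True when the reference event (death or last contact) is old enough."""
    reference = record.date_of_death or record.last_contact
    return reference <= years_before(today, retention_years)


# Hypothetical records evaluated on an arbitrary run date.
run_date = date(2015, 6, 12)
records = [
    MedicalRecord("P001", last_contact=date(2009, 3, 1)),
    MedicalRecord("P002", last_contact=date(2014, 11, 20)),
]
to_passivate = [r.patient_id for r in records if should_passivate(r, run_date)]
print(to_passivate)
```

Keeping the retention period as a parameter reflects the fact that the time constraint varies between European countries.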
3.2.2 Data Consumers
Hospital Healthcare professionals (mostly Physicians and Nurses) are the main potential users of the
passive Medical Records that will be stored in the cloud services using the CLARUS solutions. They need
to search and retrieve those records in order to respond to Healthcare needs (access to the historical
data of a patient or of a patient’s relatives, for instance to check family background), legal
requirements (access to any patient record following a Court order) and research studies (access to
historical clinical data of groups of patients with certain diseases for data analysis and statistical
calculations), among others. Access to this data will always go through the current Hospital
Clinical Workstation interface, which applies the Hospital security policies on Medical Records when
granting access to the Healthcare professionals.
3.2.3 IT Team
Application developers of the Hospital IT Team, who are in charge of enhancing and deploying the HIS
functionalities, will have to ensure that the Healthcare professionals can access (search, retrieve,
query and compute) the passive Medical Records of the Hospital’s patients through the Clinical
Workstation interface. These records are stored in the cloud services using the CLARUS functionalities
provided through its API. The Clinical Workstation interface is in fact the HIS application interface, as
mentioned above. Nevertheless, for testing and simulation purposes within the CLARUS project,
it will be a stand-alone web-based interface module offering only part of the HIS functionalities and
developed for that single purpose, which is a simpler and easier approach.
3.2.4 Security Manager
This actor is in charge of defining, provisioning and maintaining the security policies of the Hospital
for the HIS. This usually includes account management, profile management and authorisation management.
3.2.5 Cloud Service Provider
The company or companies that will provide the IaaS and/or PaaS type of service used by the Data
Providers and Data Consumers when storing and accessing the passive Medical Records of the
Hospital’s patients through the Clinical Workstation interface developed by the IT Team of the Hospital.
3.3 Datasets
3.3.1 Introduction
Several types of standards are commonly used in Healthcare to enable semantic interoperability among
the Information Systems found in any community of Healthcare Providers (Hospital Information
Systems – HIS, Primary Care Information Systems – PCIS, Social Care Information Systems, Mental
Health Information Systems, Laboratory Information Systems – LIS, Departmental Information Systems,
etc.). They also aim to ensure that old Datasets can still be accessed in the future and their contents
(semantics) understood.
All those standards allow us to represent and exchange the Healthcare data stored in these
Information Systems in a commonly understood manner (semantics), which in turn eases
information exchange among them and with other communities.
In the e-Health domain we find different types of standards, and the most important of them are:
- Messaging standards: which define the structure and workflow of the messages required to
exchange Healthcare Information between Healthcare Information Systems
- Clinical documentation standards: which define the structure of clinical documents to be
stored in and/or exchanged between Healthcare Information Systems
- Terminology standards: which define the codification or classification of different clinical terms
(diseases, clinical observations/results, drugs, etc.) that will be embedded as specific segments
in the standardized messages or in the standardized clinical documents.
As mentioned in the overview, all data contained in the e-Health use case Dataset, i.e. the patients’
passive Medical Records, is considered by the Personal Data Protection Law (LOPD in Spain, with new
legislation for all of Europe expected in the near future) to be personal data of a high level of
sensitivity, which means that it requires the highest possible level of security and privacy.
3.3.2 Standards used in the e-Health use case
In the following sections, we describe the different standards currently in use by the Healthcare
Information Systems that are in production at Hospital Clínic de Barcelona.
3.3.2.1 HL7 : Health Level 7 International [R22]
HL7 is a not-for-profit ANSI (American National Standards Institute) accredited standards developing
organization dedicated to providing a comprehensive framework and related standards for the
exchange, integration, sharing, and retrieval of electronic health information that supports clinical
practice and the management, delivery and evaluation of health services.
HL7 Version 2.X messaging standard is the workhorse of electronic data exchange in the clinical domain
and arguably the most widely implemented standard for healthcare in the world. This messaging
standard allows the exchange of clinical data between systems. It is designed to support a central
patient care system as well as a more distributed environment where data resides in departmental
systems. Hospital Clínic de Barcelona is using this standard to connect to the internal departmental
systems and also for the exchange of clinical information with other Healthcare providers in the
community, Atenció Integral de Salut de Barcelona Esquerra (AISBE), using the standards based
interoperability platform provided by Departament de Salut - TicSalut (iSISS.Cat).
The HL7 Version 3 Clinical Document Architecture (CDA®) is a document markup standard that specifies
the structure and semantics of "clinical documents" for the purpose of exchange between healthcare
providers and patients. It defines a clinical document as having the following six characteristics: 1)
Persistence, 2) Stewardship, 3) Potential for authentication, 4) Context, 5) Wholeness and 6) Human
readability. Hospital Clínic de Barcelona is using this standard to publish structured clinical documents to
the Shared Medical Record of Catalonia (HCCC – Història Clínica Compartida de Catalunya). However,
this standard is currently under discussion because of its complexity, which has led us to use it only
when mandatory.
3.3.2.2 International Classification of Diseases (ICD)[R23]
The International Classification of Diseases (ICD) is the standard diagnostic tool for epidemiology, health
management and clinical purposes. This includes the analysis of the general health situation of
population groups. It is used to monitor the incidence and prevalence of diseases and other health
problems, providing a picture of the general health situation of countries and populations.
ICD is used by physicians, nurses, other providers, researchers, health information managers and coders,
health information technology workers, policy-makers, insurers and patient organizations to classify
diseases and other health problems recorded on many types of health and vital records, including death
certificates and health records. In addition to enabling the storage and retrieval of diagnostic
information for clinical, epidemiological and quality purposes, these records also provide the basis for
the compilation of national mortality and morbidity statistics by World Health Organization (WHO)
Member States. Finally, ICD is used for reimbursement and resource allocation decision-making by
countries.
The ICD is developed and administered collaboratively between the World Health Organization (WHO) and
international centers. All the countries involved in the CLARUS project are currently members of WHO.
The ICD is revised regularly to incorporate changes in the medical domain, in order to reflect advances in
medical knowledge. The latest revision of the ICD is ICD-10, published between 1992 and 1994, which
doubles the number of diagnosis codes compared to ICD-9, providing more precise descriptions.
Hospital Clínic de Barcelona is using this standard, in its revision ICD-9, for the codification of diagnoses
throughout the Hospital, as it is a mandatory requirement of the Departament de Salut – CatSalut
(Catalan public insurer) for official reporting purposes. Revision ICD-10 is becoming compulsory in Spain,
and specifically in Catalonia, starting January 2016. Again, the new standard increases complexity, and
the more complex it becomes, the more resistance we find from our Physicians, as coding is a
time-consuming activity. Codification is therefore done by the Medical Records Department
professionals (coders), always after the clinical encounters have occurred, which means that higher
complexity will increase the time demand and costs of this Service.
3.3.2.3 Logical Observation Identifiers Names and Codes (LOINC) [R24]
LOINC terminology was initiated in 1994 by the Regenstrief Institute associated with the US Indiana
University, and it is used worldwide.
LOINC is a rich catalog (terminology) of measurements, including laboratory tests, clinical measures like
vital signs and anthropometric measures, standardized survey instruments, and more. LOINC enables
the exchange and aggregation of clinical results for care delivery, outcomes management, and research
by providing a set of universal codes and structured names to unambiguously identify things you can
measure or observe. The LOINC codes are universal identifiers for laboratory tests and other clinical
observations.
The main issue solved by LOINC is the interoperability of laboratory test results, mainly from
the Laboratory Information Systems (LIS) to the Hospital Information Systems (HIS), allowing the use of
universal identifiers, embedded in standard HL7 messages, instead of each system's own internal code values.
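As an illustration of how a LOINC code travels inside an HL7 Version 2.X message, the sketch below parses a single OBX (observation result) segment. The sample segment and its values are invented for this example, and the helper name parse_obx is hypothetical; the field positions follow the usual HL7 v2 layout (OBX-3 observation identifier, OBX-5 value, OBX-6 units), but this is a minimal sketch and not the parsing code used in the Hospital systems.

```python
# Invented sample segment: a glucose result identified by LOINC code 2345-7.
SAMPLE_OBX = (
    "OBX|1|NM|2345-7^Glucose [Mass/volume] in Serum or Plasma^LN"
    "||95|mg/dL|70-105|N|||F"
)


def parse_obx(segment: str) -> dict:
    """Extract the coded identifier, value and units from one OBX segment."""
    fields = segment.split("|")  # '|' is the default HL7 v2 field separator
    if fields[0] != "OBX":
        raise ValueError("not an OBX segment")
    # '^' separates the components of OBX-3: identifier ^ text ^ coding system
    components = (fields[3].split("^") + ["", "", ""])[:3]
    code, text, system = components
    return {
        "code": code,             # OBX-3.1, here the LOINC code
        "name": text,             # OBX-3.2, human-readable name
        "coding_system": system,  # OBX-3.3, 'LN' designates LOINC
        "value": fields[5],       # OBX-5, observation value
        "units": fields[6],       # OBX-6, units of measure
    }


result = parse_obx(SAMPLE_OBX)
print(result["code"], result["coding_system"], result["value"], result["units"])
```

Because the receiving system reads the universal identifier from OBX-3 rather than a laboratory-specific code, the same result can be interpreted by any HIS that knows LOINC.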
Hospital Clínic de Barcelona is using this standard to publish structured clinical documents (CDA)
containing the laboratory test results to the Shared Medical Health Record of Catalonia (HCCC –
Història Clínica Compartida de Catalunya), as it is mandatory. Taking advantage of this requirement,
we are also using this codification internally in the patients’ Medical Health Records.
3.3.2.4 Systematized Nomenclature of Medicine – Clinical Terms (SNOMED CT) [R25]
SNOMED CT is a comprehensive, multilingual clinical healthcare terminology, with scientifically validated
clinical content that enables consistent, processable representation of clinical information in electronic
health records. It allows meaning-based retrieval of the clinical information, which increases the
opportunities to support evidence based care, real time decision support, and more accurate
retrospective reporting for research and management.
It is supported and developed by the International Health Terminology Standards Development
Organisation (IHTSDO), in a collaborative manner to ensure that it meets the diverse needs and
expectations of the worldwide medical profession. Several countries of the EU are members of the
Organisation (CLARUS partners: Belgium, Spain and United Kingdom).
SNOMED CT contains concepts with unique meanings and formal logic based definitions and is
organized into hierarchies, and is represented using three types of components: concepts, descriptions
and relationships (among concepts). It is also mapped to other health-related classifications and
terminologies in use around the world: ICD-9, ICD-10, ICD-11 (foundation layer) and LOINC.
Hospital Clínic de Barcelona is using this standard to code the Pathology diagnoses that are published in
structured clinical documents (CDA) to the Shared Medical Health Record of Catalonia (HCCC – Història
Clínica Compartida de Catalunya), as it is mandatory, and as a means to have standard codification of
those diagnoses in the patients’ Medical Health Record. There are no plans for now to widen the use of
this standard, again because of its high degree of complexity, and the efforts will be put in the evolution
from ICD-9 to ICD-10 in the near future.
3.3.2.5 Digital Imaging and Communication in Medicine (DICOM) [R26]
DICOM is the international standard for medical images and related information (ISO 12052). It defines
the formats for medical images that can be exchanged with the data and quality necessary for clinical
use. DICOM is implemented in almost every radiology, cardiology imaging, and radiotherapy device (X-
ray, CT, MRI, ultrasound, etc.), and increasingly in devices in other medical domains such as
ophthalmology and dentistry. With tens of thousands of imaging devices in use, DICOM is one of the
most widely deployed healthcare messaging standards in the world. DICOM has revolutionized the
practice of radiology, allowing the replacement of X-ray film with a fully digital workflow.
Hospital Clínic de Barcelona has been using this standard since 2002 to store in the PACS [1] any medical
image generated within the Hospital walls, or even coming from other Healthcare providers when
necessary, always tied to the patients’ Medical Records in the Hospital Information System. Since then,
there has been an ongoing process of integrating any medical device that generates images into the PACS.
[1] PACS stands for Picture Archiving and Communication System. It is the medical imaging repository, which
typically links the medical images of a patient to his/her Medical Records stored in the Healthcare Information
Systems; access to those images is granted from the interface of the latter.
3.3.2.6 OntoFarma
OntoFarma is an ontology of Pharmaceutical products, basically drugs or medicines, which has been
entirely developed by the Hospital Clínic de Barcelona Medical Informatics team from the Information
Systems Department, using the Web Ontology Language (OWL), under a Project commissioned by the Spanish
Ministerio de Sanidad (Health Ministry) by means of an official tender in 2014. This ontology gives us
a standardized codification of the medication prescribed to the patients by the Physicians in the
Hospital in-patient premises. It also allows “intelligent” querying to search for non-explicit equivalent
drugs based on their active principles and to check drug interactions based on the implicit knowledge
that it contains.
3.3.2.7 OntoRegClin
OntoRegClin is an ontology of clinical observations, basically the ones that come from clinical monitors,
but not excluding manually obtained ones. It is under development by the Hospital Clínic de
Barcelona Medical Informatics team from the Information Systems Department, using OWL, and is
self-funded by Hospital Clínic de Barcelona in its own interest, as an evolution of the Hospital
Information System (HIS) and the web-based Clinical Workstation Interface (CWI) that we have in
production. The natural next step for this ontology, once finished, will be to map it to the LOINC
standard. This ontology will allow us to ensure that wherever a clinical observation is shown and/or
stored throughout our HIS and our CWI, it is correctly classified for future understanding of the data,
for follow-up and comparison across time (historical view), and of course for interoperability with
other Information Systems and with other Healthcare Providers or Healthcare Authorities.
3.3.3 E-Health Dataset
The Passive Medical Records Database, stored in a cloud service using the CLARUS services and tools,
which we expect will enhance the security and privacy of the Dataset, will contain all the historical
clinical information of the patients of Hospital Clínic de Barcelona who have become “Passive” due to a
lack of encounters over a specific period of time (5 years according to specific Spanish legislation),
death, change of residence, etc. The patients’ historical clinical information is composed of a set of
different types of data: PDF documents, unstructured data (free text), structured data
(based on standard terminologies or not), and medical images.
The goal is to keep all those types of data as standardized as possible in the Passive Medical
Records Database, using the standards described in the previous sections, in order to guarantee that
they remain accessible in the future and that their contents (semantics) can be fully understood
clinically, and obviously to facilitate their meaningful use and exploitation.
We describe the different types of data we will have in the Passive Medical Records Database in the
following sections.
3.3.3.1 PDF documents
The PDF documents we can find in the Medical Health Records are Discharge and Results reports, which
are physically structured in a patient friendly layout to facilitate their reading, as they are usually given
to the patients, and must be stored in the Hospital Information System (HIS) as a legal requirement.
They must be stored “as is” in the Passive Medical Health Records Database for legal reasons.
Nevertheless, their contents are mostly based on other types of data also present in the HIS, which
are covered in the next sections. No standards apply in this case.
3.3.3.2 Unstructured data (free text fields)
Many data fields in the HIS contain unstructured data, as free text fields, which means that we will
not be able to map their contents to any standard terminology.
Nevertheless, we will have to store them in the Passive Medical Records Database, embedding them in
CDA standard documents (or any equivalent, commonly used clinical document standard), using the
segment that corresponds to the semantics of the specific field when mapping it into the
standard document.
The unstructured data fields in our HIS are among others: Clinical Notes, anamnesis, reason for
encounter, allergies, family background, personal background, physical exploration, etc.
3.3.3.3 Structured data
The structured data we can find in the Medical Health Records falls into two types: data
already mapped to a standard terminology or ontology, and data not mapped to any standard.
Data already mapped to a terminology will have to be stored in the Passive Medical
Records Database, embedded in CDA standard documents (where applicable) or directly as specific
database fields or sets of fields, using the segment that corresponds to the terminology used when
mapping it into the standard document.
Data not mapped to any standard will also have to be stored in the Passive Medical Records
Database, embedded in CDA standard documents (where applicable) or directly as specific database
fields or sets of fields, using the segment that corresponds to the semantics of the specific
field when mapping it into the standard document.
The structured data fields in our HIS are: diagnoses (ICD-9), pathology findings (SNOMED CT), laboratory
test results (LOINC), in-patient medication prescriptions (OntoFarma), vital signs and other clinical
registers (OntoRegClin).
3.3.3.4 DICOM medical images
The DICOM images, as they are already standardized in our Picture Archiving and Communication
System (PACS), will have to be stored in the Passive Medical Records Database “as is”, keeping the
accession number which links any image to the corresponding patient’s Medical Health Record in our
HIS.
3.4 Services
3.4.1 Introduction
The Clinical Workstation interface of the Hospital Information System (HIS) will be the means by which
the Medical Records Department will have the capability to “Passivate” active Medical Records for
storing, and the other Hospital Healthcare professionals will have access, for search, retrieval, query and
computation, to the passive Medical Records. This means that the CLARUS Services will always be
invoked through that application.
We show this in the following figures where we include a diagram of each of the four main processes
involved in this use case.
The first diagram shows the process in which the Data Providers are involved, Medical Health Records
“Passivation”:
Figure 13 - Medical Health Records “Passivation” process
[Figure 13 diagram. Labels: Data Provider; HIS; Active patient Medical Health Record; Patient MHR
Passivation Process (patient becomes passive: healing, change of residence, death, lack of encounters
with a time lapse of 5 years); HIS PASSIVE MHR DB in Cloud + CLARUS, holding PDF documents, CDA
documents (unstructured data, structured data) and DICOM images.]
The second diagram shows the process in which the Data Consumers are involved, passive Medical
Health Records access and data retrieval:
Figure 14 - Medical Health Records access and data retrieval process
[Figure 14 diagram. Labels: Data Consumer; HIS; Passive patient Medical Health Record Query (a
Healthcare Professional needs to access a patient’s Passive Medical Health Record for consultation);
Passive patient MHR data retrieval; HIS PASSIVE MHR DB in Cloud + CLARUS, holding PDF documents, CDA
documents (unstructured data, structured data) and DICOM images.]
The third diagram shows another process in which the Data Consumers are involved: Advanced Query to
the CLARUS Cloud:
Figure 15 - Advanced Query to the CLARUS Cloud process
[Figure 15 diagram. Labels: Data Consumer; HIS; Active patient Medical Health Record; Advanced Query;
PASSIVE MHR DB in Cloud + CLARUS.]
The last diagram shows another process in which the Data Consumers are involved: Medical Record
Statistics Computation query.
Figure 16 - Medical Record Statistics Computation query
In the following sections we describe the services required to execute the different processes described
above.
[Figure 16 diagram. Labels: Data Consumer; HIS; Active patient Medical Health Record; Statistics
computation; PASSIVE MHR DB in Cloud + CLARUS.]
3.4.2 Data Publication
Data publication services will be used by the “Passivation” process present in the Clinical Workstation
interface, which transforms active Medical Records into passive ones. This means that those records
will be moved from the HIS application database (or a fake dataset for testing purposes) to the cloud
data storage space using the CLARUS project API.
3.4.3 Metadata Management
In the publication process, i.e. “Passivation”, the required Metadata will be added to the records moved
from the HIS application database (or a fake dataset for testing purposes) to the cloud data storage
space using the CLARUS tool, in order to keep the same structure of Clinical documentation
categorization present in the HIS application database.
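The publication and metadata steps described above can be sketched as follows. This is only an illustration under stated assumptions: the MedicalRecord class, the is_passive and passivate helpers and the metadata keys are hypothetical names, the five-year lapse is approximated as 1826 days, and the actual upload through the CLARUS API is left out because its interface is not specified here.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Optional

# Assumption: the 5-year lapse is approximated as 5 * 365.25 days, rounded down.
PASSIVE_AFTER = timedelta(days=1826)


@dataclass
class MedicalRecord:
    patient_id: str
    last_encounter: date
    deceased: bool = False
    moved_away: bool = False
    # Hypothetical layout: documentation category -> list of document names
    documents: dict = field(default_factory=dict)


def is_passive(record: MedicalRecord, today: date) -> bool:
    """Apply the criteria named in the text: death, change of residence, time lapse."""
    if record.deceased or record.moved_away:
        return True
    return today - record.last_encounter >= PASSIVE_AFTER


def passivate(record: MedicalRecord, today: date) -> Optional[dict]:
    """Build the payload to publish, or return None if the record is still active."""
    if not is_passive(record, today):
        return None
    return {
        "patient_id": record.patient_id,
        "metadata": {
            "passivated_on": today.isoformat(),
            # Keep the documentation categories used in the HIS database
            "categories": sorted(record.documents),
        },
        "documents": record.documents,
    }


old = MedicalRecord(
    "P001",
    date(2009, 3, 1),
    documents={"discharge_reports": ["dr1.pdf"], "lab_results": ["cda1.xml"]},
)
recent = MedicalRecord("P002", date(2014, 1, 1))
payload = passivate(old, date(2015, 6, 12))
still_active = passivate(recent, date(2015, 6, 12))
```

Carrying the category names in the metadata is one simple way to let the passive store mirror the categorization of the HIS database; other layouts are equally possible.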
3.4.4 Search
The Clinical Workstation interface will provide a search capability, which will allow the Healthcare
professionals to access the passive Medical Records of a certain patient by his/her Patient ID, provided
they were previously recorded in the cloud data storage space using CLARUS.
Once the search is done, the Healthcare professionals will be able to view the passive Medical Records of
that patient, always keeping the same structure of Clinical documentation categorization present in the
HIS application database. When necessary, they will also be able to download any of the Medical
Records of that patient, for any purpose they may require.
3.4.5 Advanced Queries
The Clinical Workstation interface will also provide an advanced search capability, which will allow the
Healthcare professionals to make Advanced Queries on structured data contained in the passive Medical
Records using multi-field search criteria, i.e. by defining different conditions to be
met by multiple structured data fields, such as specific values or ranges of values, presence or absence of
certain medication records, duration, etc.
Once the search is done, the Healthcare professionals will be able to view the dataset resulting from the
Advanced Query, and also to download it, for any purpose they may require (population assessment,
scientific research, etc.).
Two examples of this type of Advanced Query would be:
Example 1: List of patient IDs that meet the following conditions:
o Diagnosis: HIV (ICD9 = 042, V08, 042+079.53, etc.)
o Route of infection: infected (field INF4B = 2)
o Prescribed drugs: 2 nucleotides and 1 IP (OntoFarma codes)
o Prescription duration: > 1 year
o Follow-up duration: > 1 year
o Result: Checking of the viral load evolution over time: CD4 (LOINC = 3325)
Example 2: List of patient IDs that meet the following conditions:
o Diagnosis: HIV + HPC
o Follow-up duration: > 12 weeks
o Combination of drugs prescribed (specific combination)
o Result: Checking of the RNA HPC, comparing level at the beginning to level at the end of
the treatment
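A multi-field query of the kind listed in these examples can be expressed as a conjunction of independent conditions. The sketch below is illustrative only: the PassiveRecord fields, the condition helpers and the sample data are assumptions made for this example, and the drug codes are placeholders rather than real OntoFarma identifiers. It runs locally on plain objects and says nothing about how CLARUS would evaluate such a query over protected data.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class PassiveRecord:
    patient_id: str
    icd9_diagnoses: frozenset   # ICD-9 codes, e.g. {"042"}
    follow_up_days: int
    prescription_days: int      # longest prescription duration
    drug_codes: frozenset       # placeholder codes, not real OntoFarma identifiers


Condition = Callable[[PassiveRecord], bool]


def has_diagnosis(*codes: str) -> Condition:
    """True when the record carries at least one of the given ICD-9 codes."""
    wanted = set(codes)
    return lambda r: bool(wanted & r.icd9_diagnoses)


def follow_up_longer_than(days: int) -> Condition:
    return lambda r: r.follow_up_days > days


def prescription_longer_than(days: int) -> Condition:
    return lambda r: r.prescription_days > days


def advanced_query(records: Iterable[PassiveRecord],
                   conditions: List[Condition]) -> List[str]:
    """Return the patient IDs of the records that satisfy every condition."""
    return [r.patient_id for r in records if all(c(r) for c in conditions)]


# Invented sample data.
records = [
    PassiveRecord("P001", frozenset({"042"}), 800, 500, frozenset({"DRUG_A", "DRUG_B"})),
    PassiveRecord("P002", frozenset({"042"}), 200, 100, frozenset({"DRUG_A"})),
    PassiveRecord("P003", frozenset({"250.00"}), 900, 700, frozenset({"DRUG_C"})),
]
matches = advanced_query(
    records,
    [has_diagnosis("042", "V08"), follow_up_longer_than(365), prescription_longer_than(365)],
)
print(matches)
```

Keeping each condition as a separate predicate makes it easy to add further criteria, such as the presence or absence of a given medication, without changing the query function.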
3.4.6 Statistics Computation
The Clinical Workstation interface will also provide one last capability, which will allow the
Healthcare professionals to request Statistics Computations on structured data contained in the passive
Medical Records. These will be performed in the cloud, i.e. without having to download any data to
the user's Information System. Those Statistics Computations will typically consist of the calculation of
means, standard deviations, frequencies (%), etc., always performed on structured data fields.
Once the calculations are done, the Healthcare professionals will be able to view the figures resulting from
the Statistics Computations requested, and also to download them, for any purpose they may require.
An example of this type of Statistics Computations would be:
For the patients identified in any of the previous examples of advanced queries:
o Age: mean and standard deviation of treated patients
o Sex: % of males and females
o Route of infection: % of each predefined category of infection route
o Follow-up time: mean and standard deviation of treated patients
o Outpatient: mean and standard deviation of number of visits
o Inpatient: mean and standard deviation of number of stays in Hospital
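The calculations listed above reduce to means, standard deviations and relative frequencies over structured fields. The following sketch shows that arithmetic on an invented cohort, using only the Python standard library; the field names and the summarize and percentages helpers are hypothetical. It does not model the requirement that the computation take place in the cloud.

```python
from collections import Counter
from statistics import mean, stdev

# Invented sample cohort; in the use case these would be the patients
# returned by an Advanced Query.
cohort = [
    {"age": 34, "sex": "M", "visits": 12},
    {"age": 51, "sex": "F", "visits": 8},
    {"age": 45, "sex": "M", "visits": 15},
    {"age": 62, "sex": "F", "visits": 5},
]


def summarize(values):
    """Mean and sample standard deviation of a numeric field."""
    return {"mean": mean(values), "stdev": stdev(values)}


def percentages(values):
    """Relative frequency (%) of each category of a categorical field."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: 100.0 * n / total for category, n in counts.items()}


report = {
    "age": summarize([p["age"] for p in cohort]),
    "visits": summarize([p["visits"] for p in cohort]),
    "sex": percentages([p["sex"] for p in cohort]),
}
print(report)
```

Note that stdev computes the sample standard deviation; statistics.pstdev would be the choice if the cohort is treated as the whole population.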
3.4.7 Transformation Services
In order to preserve the integrity, and especially the confidentiality, of the passive Medical Records
stored in the cloud services using CLARUS, the Clinical Workstation interface will have to use CLARUS
services for masking or anonymizing the data when it is pushed into the cloud at “Passivation”
time.
At search and retrieval time, the Clinical Workstation interface will have to use CLARUS services
for unmasking or de-anonymizing the retrieved data.
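To make the mask-on-store and unmask-on-retrieve round trip concrete, the sketch below uses keyed tokenization as a stand-in technique: sensitive values are replaced by HMAC-SHA256 tokens before leaving the hospital, and a reverse table kept on premises restores them. This is an assumption chosen for illustration, not the CLARUS mechanism; the Pseudonymizer class, the helper functions and the sample record are all hypothetical.

```python
import hashlib
import hmac


class Pseudonymizer:
    """Keyed tokenization; the key and the reverse table stay on premises."""

    def __init__(self, secret_key: bytes):
        self._key = secret_key
        self._reverse = {}  # token -> original value

    def mask(self, value: str) -> str:
        token = hmac.new(self._key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        self._reverse[token] = value
        return token

    def unmask(self, token: str) -> str:
        return self._reverse[token]


def mask_record(record: dict, sensitive: set, p: Pseudonymizer) -> dict:
    """Replace the sensitive fields with tokens before sending to the cloud."""
    return {k: (p.mask(v) if k in sensitive else v) for k, v in record.items()}


def unmask_record(record: dict, sensitive: set, p: Pseudonymizer) -> dict:
    """Restore the original values after retrieval from the cloud."""
    return {k: (p.unmask(v) if k in sensitive else v) for k, v in record.items()}


p = Pseudonymizer(b"demo-key-not-for-production")
sensitive_fields = {"patient_id", "name"}
original = {"patient_id": "P001", "name": "Jane Doe", "icd9": "042"}
stored = mask_record(original, sensitive_fields, p)      # what the cloud would hold
restored = unmask_record(stored, sensitive_fields, p)    # what the user sees again
```

Tokens produced this way are deterministic, so equal values map to equal tokens. That permits exact-match search on masked fields but also reveals equality to the cloud provider, a trade-off a real design would need to weigh.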
3.4.8 Exploitation Services
In order to preserve the integrity and availability of the passive Medical Records stored in the cloud
services using the CLARUS framework, we expect to be able to use Back-up, Data Recovery and
Monitoring services, which should be offered by the Cloud Service Provider.
3.5 Security Expectations
3.5.1 Expectations in terms of data
As mentioned in the overview, we expect the assurance of the preservation of the integrity,
confidentiality and accessibility of the passive Medical Records stored in the cloud data storage space
using the CLARUS methodologies and tools, and of course compliance with the legal regulations on
personal data protection (LOPD in Spain), since our data is very sensitive (high level according to the law).
It is important to mention that, because users will execute their processes directly from the Clinical
Workstation Interface of the Hospital Information System, any user security check will be done by the
Hospital Information System, which is where all security matters related to Medical Health Record
access, Active or Passive, are handled. In other words, the HIS will be in
charge of user access control for any data stored in the Passive Medical Health Records in the Cloud
Service Storage, prior to invoking any CLARUS service or tool.
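The ordering described here, with the HIS authorizing first and a CLARUS service being called only afterwards, can be sketched as a simple gate. Everything in this example is hypothetical: the permission table, the user names, the his_authorize helper and the fake_clarus_fetch stand-in, which replaces a CLARUS retrieval call whose real interface is not specified in this document.

```python
class AccessDenied(Exception):
    """Raised when the HIS refuses an action for a user."""


# Hypothetical HIS permission table: user -> set of allowed actions.
HIS_PERMISSIONS = {
    "dr_garcia": {"search", "retrieve", "query", "compute"},
    "records_clerk": {"passivate"},
}


def his_authorize(user: str, action: str) -> None:
    """Raise AccessDenied unless the HIS grants the action to the user."""
    if action not in HIS_PERMISSIONS.get(user, set()):
        raise AccessDenied(f"{user} may not {action}")


def retrieve_passive_record(user: str, patient_id: str, clarus_fetch):
    """Check authorization in the HIS first, then call the cloud-side service."""
    his_authorize(user, "retrieve")
    return clarus_fetch(patient_id)


def fake_clarus_fetch(patient_id: str) -> dict:
    # Stand-in for a CLARUS retrieval call; the real API is an open point here.
    return {"patient_id": patient_id, "documents": []}


record = retrieve_passive_record("dr_garcia", "P001", fake_clarus_fetch)
```

Since the check runs before the fetch function is ever called, a refused user never triggers a request to the cloud side.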
The main benefit we expect from the CLARUS framework is a guarantee of the security and privacy of all
the Passive Medical Health Records stored in the Cloud Service. This is achieved by masking and/or
anonymizing the data sent to the cloud when “Passivating” it, and of course by unmasking and
de-anonymizing it when it is accessed and retrieved.
3.5.2 Expectations in terms of services
Besides, we expect that the services will also assure confidentiality and availability of the Medical
Records stored in the cloud data storage space using the CLARUS framework.
3.6 Demonstration Cases for CLARUS
The aim of this section is to describe precisely those application cases that will be used as
demonstrators for CLARUS (especially in the frame of WP6 work), i.e. cases that will help in specifying
CLARUS, designing its architecture and validating it all along the project.
In particular, four main cases are detailed, which can be mapped to the different typical CLARUS
scenarios:
- Storage of Passive Medical Health Records (Data providers)
- Access and retrieval of Passive Medical Health Records (Data consumers)
- Advanced Queries on structured data contained in Passive Medical Health Records (Data
consumers)
- Statistics Computation on structured data contained in Passive Medical Health Records (Data
consumers)
For each of these cases, a four-sided perspective has been adopted, answering key questions about:
- Why the corresponding demonstration case is relevant for both CLARUS and the e-Health
domain
- Who is likely to use it in a real-life context
- What are the sample data selected, their type and their sensitivity/criticality
- How the application case is usually designed and implemented
3.6.1 Securing Passive Medical Health Records storage in the cloud
Figure 17 - Medical Health Records storage in the cloud diagram
3.6.1.1 Purpose (Why?)
Demonstrate that sensitive information (Medical Health Records) can be stored securely in the cloud.
3.6.1.2 Actors (Who?)
Data Providers: experts in collecting or producing medical data in the Hospital Information System.
These experts are responsible for the Medical Record “Passivation” process and for storing
the passive medical data in the CLARUS cloud through the Clinical Workstation interface.
3.6.1.3 Data (What?)
Non-structured information: Discharge reports, Results Reports, Allergies, Clinical Notes. Structured
information: Diagnoses (ICD-9), Lab Results (LOINC), in-patient medication, etc.
3.6.1.4 Design (How?)
Fake medical records with the same structure as the real ones will be stored in the CLARUS cloud
through the Clinical Workstation interface.
[Figure 17 diagram. On Premises: the Data Provider, with the actions Create Dataset, Upload Dataset,
Search Data, Browse Data, then View Data, Modify Data and Delete Data when Found, or Create Dataset
when Not Found. On Cloud: the Cloud Service with the Healthcare Data Storage.]
3.6.2 Securing Passive Medical Health Records access and retrieval from the cloud
Figure 18 - Medical Health Records access and retrieval diagram
3.6.2.1 Purpose (Why?)
In many cases, it is necessary to retrieve passive medical health records for use in
medical practice, medical research, legal purposes, and others. Data will be searched, accessed and
retrieved by Patient ID only.
3.6.2.2 Actors (Who?)
Medical doctors who need the Passive Health Records for the purposes mentioned above.
3.6.2.3 Data (What?)
The data stored in the CLARUS cloud: Non-structured information: Discharge reports, Results Reports,
Allergies, Clinical Notes. Structured information: Diagnostics (ICD-9), Lab Results (LOINC), in-patient
medication, etc.
3.6.2.4 Design (How?)
Medical doctors will access this information through the Clinical Workstation interface, which will
manage the access to and retrieval of the data stored in the CLARUS cloud.
Diagram labels (Figure 18): Data Consumer; Cloud Service; Healthcare Data Storage; On Premises; On Cloud;
Search Data; Browse Data; Found; Not Found; View Data; Download Data.
3.6.3 Securing Passive Medical Health Record for the Advanced Query
Figure 19 - Medical Health Record Advanced Query diagram
3.6.3.1 Purpose (Why?)
In many cases, it is necessary to retrieve specific data for certain purposes, such as local
organization (distribution of its efforts), research and others. This type of search will involve the use of
multiple fields for the search criteria, i.e. by defining different conditions to be met by multiple
structured data fields, like specific values or ranges of values, presence or absence of certain medication
records, duration, etc.
3.6.3.2 Actors (Who?)
Medical doctors who need the Passive Health Records for the purposes mentioned above.
3.6.3.3 Data (What?)
The data stored in the CLARUS cloud: Non-structured information: Discharge reports, Results Reports,
Allergies, Clinical Notes. Structured information: Diagnostics (ICD-9), Lab Results (LOINC), in-patient
medication, etc. The expected result is a dataset containing one or more Passive Health Records
meeting the conditions established by the Healthcare professionals when making the Advanced Query, or
an empty dataset if the conditions are not met.
3.6.3.4 Design (How?)
Medical doctors will access the Advanced Query results through the Clinical Workstation interface,
which will manage the access to and retrieval of the data stored in the CLARUS cloud.
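As an illustration only, the multi-field conditions described in 3.6.3.1 (specific values, ranges of values, presence or absence of medication records, duration) can be sketched as follows. The record layout, the field names (`icd9`, `lab_results`, `medications`, `stay_days`) and the sample codes and values are hypothetical and are not taken from the CLARUS specification; the sketch works on plain in-memory data and does not address data protection.

```python
from dataclasses import dataclass, field


@dataclass
class PassiveRecord:
    """Hypothetical structured part of a Passive Health Record."""
    patient_id: str
    icd9: set[str]                    # diagnosis codes
    lab_results: dict[str, float]     # LOINC code -> measured value
    medications: set[str] = field(default_factory=set)
    stay_days: int = 0


def advanced_query(records, *, diagnosis=None, lab_range=None,
                   with_medication=None, without_medication=None,
                   min_stay_days=None):
    """Return the records that meet every condition that has been set."""
    result = []
    for rec in records:
        if diagnosis is not None and diagnosis not in rec.icd9:
            continue
        if lab_range is not None:
            code, low, high = lab_range
            value = rec.lab_results.get(code)
            if value is None or not (low <= value <= high):
                continue
        if with_medication is not None and with_medication not in rec.medications:
            continue
        if without_medication is not None and without_medication in rec.medications:
            continue
        if min_stay_days is not None and rec.stay_days < min_stay_days:
            continue
        result.append(rec)
    return result


# Invented sample records (fake data, as foreseen for the demonstration).
RECORDS = [
    PassiveRecord("P001", {"250.00"}, {"2345-7": 180.0}, {"metformin"}, 5),
    PassiveRecord("P002", {"250.00"}, {"2345-7": 95.0}, set(), 2),
    PassiveRecord("P003", {"401.9"}, {"2345-7": 200.0}, {"metformin"}, 7),
]

matches = advanced_query(RECORDS, diagnosis="250.00",
                         lab_range=("2345-7", 140.0, 400.0),
                         with_medication="metformin")
print([rec.patient_id for rec in matches])
```

A query whose conditions match no record returns an empty list, which corresponds to the empty dataset mentioned in 3.6.3.3.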
3.6.4 Securing Passive Medical Health Record for Statistics Computation query
Figure 20 - Medical Health Record Statistics Computation query diagram
3.6.4.1 Purpose (Why?)
In many cases, it is necessary to compute statistics on Medical Records for research
purposes, quality analysis or cases where an improvement is needed in certain health areas.
3.6.4.2 Actors (Who?)
Medical doctors or Medical Managers who need the Passive Health Records for the purposes
mentioned above.
3.6.4.3 Data (What?)
The calculations of the Statistics Computations will be performed over structured data stored in CLARUS
cloud: Diagnostics (ICD-9), Lab Results (LOINC), in-patient medication records, etc.
3.6.4.4 Design (How?)
Medical doctors will access the Statistics Computation results through the Clinical Workstation interface,
which will ask the cloud to execute the calculation and will then manage the access to and
retrieval of the figures calculated by the CLARUS cloud.
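As an illustration only, two simple statistics over structured fields (number of patients per diagnosis code and mean value of one lab result) could be computed as in the sketch below. The tuple layout, the sample codes and the values are hypothetical, and the sketch runs on unprotected in-memory data; how such computations would be carried out on data protected by CLARUS is not covered here.

```python
from collections import Counter
from statistics import mean

# Hypothetical structured extracts: (patient id, ICD-9 codes, {LOINC code: value})
RECORDS = [
    ("P001", ["250.00", "401.9"], {"2345-7": 180.0}),
    ("P002", ["250.00"], {"2345-7": 95.0}),
    ("P003", ["401.9"], {}),
]


def diagnosis_counts(records):
    """Count how many patients carry each diagnosis code."""
    counts = Counter()
    for _patient_id, codes, _labs in records:
        counts.update(set(codes))  # a code counts once per patient
    return counts


def mean_lab_value(records, loinc_code):
    """Mean of one lab result over the records that contain it, else None."""
    values = [labs[loinc_code] for _pid, _codes, labs in records
              if loinc_code in labs]
    return mean(values) if values else None


print(dict(diagnosis_counts(RECORDS)))
print(mean_lab_value(RECORDS, "2345-7"))
```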
3.7 Summary
For comparison purposes, we have also mapped the four demonstration cases proposed
for the e-Health part to the corresponding security expectations, as presented in §2.7.4.
Security Expectations and E-Health Demonstration Cases for CLARUS
Security expectations | Security item | Refers to § | MHR: Medical Health Records Storage (§ 3.6.1) | MHR: Medical Health Records Access and Retrieval (§ 3.6.2)
Data | accessibility and alteration protection / data integrity | 2.5.1.1; Error! Reference source not found. | |
    Even though MHR are restricted data access
Data | metadata-based access limitations | 2.5.1.2.1 | NA | NA
Data | confidentiality | 2.5.1.3 | |
Data | authenticity of sources (data quality) | 2.5.1.4 | NA | NA
Data | Anonymisation, masking and inverse | Error! Reference source not found. | |
Services | access authentication (e.g. GeoRM) | 2.5.2.1 | NA | NA
Services | metadata-based access limitations | 2.5.2.2.1 | NA | NA
Services | secure communication protocols | 2.5.2.3 | NA | NA
Personal Data | personal data protection | 2.5.3.1; Error! Reference source not found. | * | *
    * It is assumed that all data contained in MHR is under personal data protection
Personal Data | access auditing | 2.5.3.2 | NA | NA
Cloud-based hosting | data location risks | 2.5.4.1 | NA | NA
Cloud-based hosting | information system loss of control risks | 2.5.4.2 | NA | NA
Cloud-based hosting | multi-tenancy and resource sharing risks | 2.5.4.3 | NA | NA
Security Expectations and E-Health Demonstration Cases for CLARUS
Security expectations | Security item | Refers to § | MHR: Passive Medical Health Record for the Advanced Query (§ 3.6.3) | MHR: Passive Medical Health Record for Statistics Computation Query (§ 3.6.4)
Data | accessibility and alteration protection / data integrity | 2.5.1.1; Error! Reference source not found. | |
    Even though MHR are restricted data access
Data | metadata-based access limitations | 2.5.1.2.1 | NA | NA
Data | confidentiality | 2.5.1.3 | |
Data | authenticity of sources (data quality) | 2.5.1.4 | NA | NA
Data | Anonymisation, masking and inverse | Error! Reference source not found. | |
Services | access authentication (e.g. GeoRM) | 2.5.2.1 | NA | NA
Services | metadata-based access limitations | 2.5.2.2.1 | NA | NA
Services | secure communication protocols | 2.5.2.3 | NA | NA
Personal Data | personal data protection | 2.5.3.1; Error! Reference source not found. | * | *
    * It is assumed that all data contained in MHR is under personal data protection
Personal Data | access auditing | 2.5.3.2 | NA | NA
Cloud-based hosting | data location risks | 2.5.4.1 | NA | NA
Cloud-based hosting | information system loss of control risks | 2.5.4.2 | NA | NA
Cloud-based hosting | multi-tenancy and resource sharing risks | 2.5.4.3 | NA | NA
Legend : : relevant ; [] : potentially relevant ; NA: Not applicable Security item : common Health/Geodata ; e-Health only ; Geodata only.
Figure 21 - e-Health Demonstration Cases with regard to Security Expectations
4 Conclusion
In this document, two main application cases have been analysed in order to demonstrate the
appropriateness and applicability of the CLARUS methodologies, technologies and tools, namely, e-
Health and publication of geodata on the Internet. This analysis resulted in the definition of actors,
datasets and services in the corresponding domains, as well as a comprehensive list of security
expectations.
Preserving privacy in the cloud is a rather intuitive concern when thinking about the e-Health scenario.
Healthcare information systems deal with very sensitive personal health data, and at the same
time there is a growing demand for storage and computing capacity, which makes the transition to the
cloud almost inevitable in the near future.
In comparison, geospatial data (owned by European institutions and environmental actors) cannot be
considered very sensitive personal data. However, geospatial data are sensitive for different reasons:
they may be mission-critical for public safety and national security, or have a strong business potential
(datasets and/or services are sold), etc.
As shown in this document, the needs for security and confidentiality when moving to the cloud are
somewhat different depending on whether one considers a Healthcare or a Geospatial information
system. Providing demonstration cases in each of these two domains therefore gives us confidence that
a broad range of application scenarios for the CLARUS solution is covered.
On the other hand, this detailed analysis highlights similarities and common security expectations.
Ultimately the demonstration cases proposed also provide a different perspective (through distinct
actors, datasets and services) on these common issues.
While some of the security expectations identified can be covered by existing solutions,
implemented either by the CSP or by the application itself, others cannot be fulfilled today
without impairing typical cloud functionality or reducing the benefits associated with it. This fully
justifies the development of an innovative set of techniques improving trust in cloud environments.
The two main application cases differ in how such an innovative set of security techniques is
integrated into an application. In the e-Health application case, CLARUS is “a layer of
solutions on top of a cloud storage service”. With this approach, CLARUS is a proxy running exclusively
on premises. It is a kind of cloud storage service with advanced security features, and could be
transparently integrated in any application (provided that the application relies on transfer protocols
supported by CLARUS). In the Geo Publication application case, by contrast, some services
operate on geospatial data and run in the cloud, which involves using CLARUS innovative
techniques directly in the cloud, either explicitly through CLARUS libraries, or transparently using
CLARUS as a proxy.
During the further course of WP2, the application cases described will support the definition of
requirements for such a solution, namely the CLARUS framework, seeking to comply with legal
requirements and technical standards set up or approved by European authorities.
Appendix A. Review of Related Projects
A review of several projects in the field of Geodata publication is provided in the additional document
“CLARUS D2.1 Annex I - Review of EU geo-publication projects” [A4].
*** End of the Document ***
CLARUS – H2020-ICT-2014 – G.A. 644024
© CLARUS Consortium 1 / 27
A framework for user
centred privacy and
security in the cloud
Definition of Application Cases – Annex I:
Review of EU geo-publication projects
Type (distribution level) Public
Contractual date of Delivery 30-04-2015
Actual date of delivery 12-06-2015
Deliverable number D2.1-AnnexI
Deliverable name Definition of Application Cases – Annex I: Review of EU geo-
publication projects
Version V1.0
Number of pages 15
WP/Task related to the
deliverable Task 2.1
WP/Task responsible AKKA
Author(s) AKKA Team
Partner(s) Contributing
Document ID CLARUS-D2.1-DefinitionOfApplicationCases-AnnexI-v1.0
CLARUS – H2020-ICT-2014 – G.A. 644024 CLARUS-D2.1-DefinitionOfApplication Cases-Annex1-v1.0
© CLARUS Consortium 2 / 27
Abstract This document analyses relevant projects representative of
publication of Geo-referenced data on the Internet.
Disclaimer
CLARUS (G.A. 644024) is a Research and Innovation Actions project funded by the EU Framework Programme for Research and Innovation Horizon 2020. This document contains information on CLARUS core activities, findings and outcomes. Any reference to content in this document should clearly indicate the authors, source, organization and publication date. The content of this publication is the sole responsibility of the CLARUS consortium and cannot be considered to reflect the views of the European Commission.
Table of Contents
1 INTRODUCTION
1.1 SCOPE OF THE DOCUMENT
1.2 APPLICABLE AND REFERENCE DOCUMENTS
1.3 REVISION HISTORY
1.4 NOTATIONS, ABBREVIATIONS AND ACRONYMS (OPTIONAL)
2 INGEOCLOUDS
2.1 OVERVIEW
2.2 BUSINESS USE CASES OF THE INGEOCLOUDS PARTNERS
2.2.1 UC1: Publication of Geospatial Dataset in the Environmental Field
2.2.2 UC2: Susceptibility Map of Triggering Landslides Due to Rainfall Forecast
2.2.3 UC3: Shakemaps
2.2.4 UC4: Pesticides in Groundwater
2.2.5 UC5: Ground Water Resources Management in Granular Aquifers
2.2.6 UC6: Active Landslide Inventory Mapping and Susceptibility Zoning
2.3 ACTORS
2.4 SERVICES
2.4.1 Account management
2.4.2 Workspace
2.4.3 Application installation and maintenance
2.4.4 Data import/Synchronization
2.4.5 Data publication services
2.4.6 Metadata management
2.4.7 Search service
2.4.8 View service
2.4.9 Download service
2.4.10 Processing service
2.4.11 Account billing
2.4.12 Support, maintenance and monitoring service
2.5 SECURITY IN INGEOCLOUDS
2.5.1 Protecting data against unauthorized access
2.5.2 Protecting services against unauthorized access
2.5.3 Ensuring identity data protection
3 EGDI-SCOPE
3.1 OVERVIEW
3.2 ACTORS
3.2.1 EGDI stakeholder panel
3.2.2 EGDI user categories
3.3 DATASETS
3.3.1 EGDI methodology for prioritizing datasets
3.3.2 Relevant answers to EGDI stakeholder survey
3.3.3 Compilation of the INSPIRE dataset inventory
3.4 USE CASES
3.4.1 Geohazards: Ground Instability in densely populated areas
3.4.2 Raw Materials: Rare Earth Element potential within the European Union
3.4.3 Renewable Energy - Planning for offshore Wind Farms
3.4.4 Geology and Soils – Ecosystem Mapping
3.5 TRUST AND AUTHENTICATION
3.5.1 Elements of trust
3.5.2 Moving to the cloud
4 MINERALS4EU
4.1 THE RAW MATERIALS INITIATIVE
4.2 THE PROJECT
4.3 ACTORS: DATA OWNERS AND CONSUMERS
4.4 SERVICES
4.4.1 Mineral Statistics
1 Introduction
1.1 Scope of the document
The aim of this document is to describe different projects representative of the “publication of geo-
referenced data on the internet” domain, providing further details on the Geo-Publication application
case in the CLARUS Deliverable D2.1 (Definition of Application cases).
1.2 Applicable and reference documents
This document refers to the following documents:
CLARUS Grant Agreement
CLARUS-D2.1-Definition of Application Cases-V1.0
INGC-D2.1-Use Cases for InGeoCloudS Data and Services-v1.1
INGC-D2.2-Interface of web services and models of data-v1.0
INGC-D2.3-InGeoCloudS Web Services Covering Use Cases-v1.1
INGC-D4.2-Fully Operational InGeoCloudS Pilot-v1.0
EGDI-scope User requirements and use cases
EGDI-scope D2.3 Functional User Requirements and Use Cases
EGDI-scope D5.2 Report on trust and authentication
1.3 Revision History
Version Date Author Description
0.1 15/01/2015 AKKA Initial version
1.0 12/05/2015 AKKA V1.0 Version
1.4 Notations, Abbreviations and Acronyms (optional)
API: Application Programming Interface
CRUD: Create/Read/Update/Delete
CSP: Cloud Service Provider
CSW: Catalog Service for the Web
DG ENTR: Directorate General for Enterprise and Industry
DG JRC: Directorate General Joint Research Centre
DoW: Description of Work
EC: European Commission
EEA: European Environment Agency
EGDI: European Geological Data Infrastructure
ENISA: European Network and Information Security Agency
EPPO: Earthquake Planning and Protection Organisation
ETPSMR: European Technology Platform on Sustainable Mineral Resources
FTP: File Transfer Protocol
FTPS: File Transfer Protocol Secure
GA: Grant Agreement
GEMAS: Geochemical Mapping of Agricultural Soils of Europe
GeoZS: Geological Survey Of Slovenia
GIS: Geographical Information System
GML: Geography Markup Language
HTTP: Hypertext Transfer Protocol
IaaS: Infrastructure as a Service
InGeoCloudS: INspired GEOdata CLOUD Services
NGO: Non-governmental Organization
OGC: Open Geospatial Consortium
OWL: Web Ontology Language
PaaS: Platform as a Service
PMB: Project Management Board
PSI: Public Sector Information
RDF: Resource Description Framework
REE: Rare Earth Elements
RIF: Rule Interchange Format
SAML: Security Assertion Markup Language
SCP: Secure Copy
SFTP: Secure File Transfer Protocol
SQL: Structured Query Language
ToC: Table of Contents
UC: Use Case
WADL: Web Application Description Language
WFS: Web Feature Service
WMS: Web Map Service
WMS-C: Web Mapping Service Caching
WMST: Web Map Service Time
WP: Work Package
WS-Addressing: Web Services Addressing
WSDL: Web Service Description Language
XSD: XML Schema Definition
XSLT: eXtensible Stylesheet Language Transformations
2 InGeoCloudS
2.1 Overview
The INspired GEOdata CLOUD Services (InGeoCloudS) project aimed at demonstrating the feasibility of
employing a cloud-based infrastructure coupled with the necessary services to provide seamless access
to geospatial public sector information, especially targeting the geological, geophysical and other
geoscientific information.
Project partners' data and services, available under more traditional infrastructures, are easily deployed
to the cloud. One of the project challenges has been linking the partners’ data among themselves
and with relevant external datasets.
On top of the cloud services implemented (mostly according to the European Directive INSPIRE), the
project demonstrates the ability to build more intelligent services by using Linked Open Data principles
and technologies to combine data seamlessly integrated through the cloud.
The project has been co-funded by the European Commission under the Information and
Communication Technologies Policy Support Programme.
2.2 Business use cases of the InGeoCloudS partners
As described in InGeoCloudS project documentation, the use cases implemented by InGeoCloudS have
been proposed by the data providers and have been elaborated through interactions between data
providers and interested stakeholders. The implemented use cases cover the areas of a) hydrogeology
and b) hazards due to earth phenomena.
The following gives an overview of each use case.
2.2.1 UC1: Publication of Geospatial Dataset in the Environmental Field
Public authorities producing or collecting geo-information in the environmental field have to publish
and share data with the public and other authorities: geological maps, piezometric maps, geo-hazards maps
and other geospatial datasets as defined in the annexes of the INSPIRE Directive.
The use case allows a provider to publish his dataset on the web, to create a specific map and to be
compliant with the INSPIRE technical requirements without the intervention of an IT team or INSPIRE
experts.
A provider is any public authority (local, national, associations, etc.) that has to publish and share
geo-information in the environmental field. The public or other authorities can view and/or download the
shared data.
The use case is split into 2 business processes:
- the process for the provider in charge of the creation of a map and dataset publication,
- the process for the user who searches, visualizes datasets and re-uses data created by the system.
2.2.2 UC2: Susceptibility Map of Triggering Landslides Due to Rainfall Forecast
The system predicts, as well as possible, the areas where the probability of triggering landslides is
increased due to higher precipitation levels. The endangered zones are predicted by combining
the landslide susceptibility map, the precipitation forecast and the landslide triggering threshold
values.
GeoZS (Geological Survey Of Slovenia) provides information about geo-hazards induced by mass
movement processes. Intended users are the Administration of the Republic of Slovenia for Civil Protection and
Disaster Relief, the Slovenian Environment Agency, Municipalities, Planners, Infrastructure owners and
operators, First responders, the general public/citizens and Real Estate Agencies.
The use case is split into 2 business processes:
- the process for the provider in charge of pushing raw data and periodically executing the
calculation of the landslide susceptibility map,
- the process for the user who searches and re-uses data created by the system.
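As an illustration only, one way to combine the three inputs named above (susceptibility map, precipitation forecast, triggering thresholds) is to flag a grid cell when the forecast rainfall reaches the threshold of the cell's susceptibility class. This comparison rule, the class names and the threshold values are assumptions made for the sketch; the actual GeoZS model is not described in this document.

```python
# Hypothetical triggering thresholds (mm of forecast rainfall) per susceptibility class.
THRESHOLDS_MM = {"low": 150.0, "medium": 100.0, "high": 60.0}


def endangered_cells(susceptibility, forecast_mm, thresholds=THRESHOLDS_MM):
    """Return (row, col) of cells whose forecast rainfall reaches the class threshold."""
    flagged = []
    for row, (class_row, rain_row) in enumerate(zip(susceptibility, forecast_mm)):
        for col, (sus_class, rain) in enumerate(zip(class_row, rain_row)):
            if rain >= thresholds[sus_class]:
                flagged.append((row, col))
    return flagged


# Invented 2 x 2 grids: susceptibility class and forecast rainfall per cell.
susceptibility = [["low", "high"],
                  ["medium", "high"]]
forecast_mm = [[120.0, 70.0],
               [90.0, 40.0]]

print(endangered_cells(susceptibility, forecast_mm))
```

With these invented inputs only the cell at row 0, column 1 is flagged: it is the only one where the forecast (70 mm) reaches the threshold of its class (60 mm for "high").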
2.2.3 UC3: Shakemaps
The Shakemaps service provides shake-maps for major earthquakes in the Greek region. Shake-maps are
maps showing ground movement and shaking intensity following major earthquakes. The shake-maps
are calculated using information about earthquakes that is extracted automatically in near-real time
(within a few minutes) from accelerogram records.
EPPO (Earthquake Planning and Protection Organisation) provides shake-maps for major earthquakes in
the Greek region. Other data providers can also use the Shakemaps service to calculate their own shake-maps.
The use case is split into 2 business processes:
- the production process for the provider in charge of pushing initial data parameters, uploading
sensor data and executing the calculation,
- the process for the user who searches and re-uses data created by the system.
2.2.4 UC4: Pesticides in Groundwater
This use case makes it possible for users to find areas where there are high concentrations of pesticides
in the groundwater. The search can cover either pesticides in general or specific pesticides. It is also possible to
restrict the output to pesticides found at a certain depth interval and/or from certain geology (lithology
or lithostratigraphy).
Intended users include NGOs, EEA, national environmental authorities, national or European
environmental portals and researchers.
The use case is split into 2 business processes:
- the production process for the provider in charge of the configuration of the map and the
synchronization with the local DB,
- the process for the user who searches and re-uses data created by the system.
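As an illustration only, the restrictions mentioned for this use case (a specific pesticide, a depth interval, a given lithology) can be sketched as optional filters over groundwater samples. The `Sample` fields, the well identifiers, the concentration limit and all values are hypothetical; they are not taken from the InGeoCloudS data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical groundwater sample."""
    well_id: str
    pesticide: str
    concentration_ug_l: float
    depth_m: float
    lithology: str


def high_concentration_wells(samples, limit_ug_l, pesticide=None,
                             depth_range=None, lithology=None):
    """Return sorted well ids with a sample above the limit that meets the filters."""
    wells = set()
    for s in samples:
        if s.concentration_ug_l <= limit_ug_l:
            continue
        if pesticide is not None and s.pesticide != pesticide:
            continue
        if depth_range is not None:
            top, bottom = depth_range
            if not (top <= s.depth_m <= bottom):
                continue
        if lithology is not None and s.lithology != lithology:
            continue
        wells.add(s.well_id)
    return sorted(wells)


# Invented samples.
SAMPLES = [
    Sample("W1", "atrazine", 0.25, 12.0, "sand"),
    Sample("W2", "atrazine", 0.05, 8.0, "sand"),
    Sample("W3", "bentazone", 0.40, 35.0, "chalk"),
    Sample("W1", "bentazone", 0.02, 12.0, "sand"),
]

print(high_concentration_wells(SAMPLES, limit_ug_l=0.1))
print(high_concentration_wells(SAMPLES, limit_ug_l=0.1, depth_range=(0.0, 20.0)))
```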
2.2.5 UC5: Ground Water Resources Management in Granular Aquifers
The use case provides data from field measurements, from chemical analyses in accredited laboratories
and from databases of various geospatial data. Considering these data and the fact that there is very
good and adequate scientific knowledge of the geological, hydrological, hydro-geological and
hydro-chemical characteristics and properties of the study areas, important interdisciplinary and multi-layered
conclusions can be drawn, such as a) water balance estimation and assessment, b) piezometric
surface maps for dry and wet periods, c) hydro-chemical maps.
The use case is split into 2 business processes:
- the production process to run groundwater statistical operations with geo-processing tools
(e.g. kriging) on the fly,
- the process for the user to access published data.
2.2.6 UC6: Active Landslide Inventory Mapping and Susceptibility Zoning
The use case provides an active inventory map of the landslides that have occurred (e.g. location, classification,
volume, activity, date of occurrence of landslide, etc.), updated after every new event recorded in the
system. It is possible to retrieve data concerning the landslides’ characteristics (type of movement,
causes, season and year of occurrence, etc.), as well as any information available for the region of
occurrence (geology, precipitation, altitude, slope, etc.).
The calculation of the spatial probability produces a susceptibility zoning map available to the system.
The map results from the analysis of the spatial distribution of the landslides (landslides’ density)
against a group of generative causes (geological, topographical, hydrological, etc. characteristics of the
area), based on the assumption that future landslides will occur under the same circumstances as
those in the past. The density map is recalculated each time new data for the
study area are provided to the system.
The use case is split into 2 business processes:
- the production process to perform the calculation of the spatial probability to produce a
susceptibility zoning map,
- the process for the user to access published data.
2.3 Actors
Data consumer: Actor consuming information provided by the InGeoCloudS platform. This
category of actors interacts with the system to find and view information, download data,
access to treatment or geoscience modeling. The user can also consume Web services or APIs. In
CLARUS – H2020-ICT-2014 – G.A. 644024 CLARUS-D2.1-DefinitionOfApplication Cases-Annex1-v1.0
© CLARUS Consortium 11 / 27
most cases, this access is one without authentication but it is possible, for safety or traceability
reasons, to request authentication for certain functions.
Data provider: Actor producing or collecting information (or having a function of data
collection) in relation to the thematic geoscience and / or environmental project. This category
of actors interacts with the system to push information or launch the harvest of information
from their local system, configure treatments dedicated to these data, develop models, etc. The
producer interacts with the platform in an authenticated mode and secure access.
Application provider: Actor with the capability to integrate new applications and services into the
InGeoCloudS infrastructure. The application provider can use the API infrastructure and build
specific services; the use cases and their implementations can serve as examples of
integration.
Registered user: Actor that needs to be authenticated on the system to access additional
functionalities in providers’ applications (e.g. register for notification). The application provider
is fully responsible for access control to its application.
Public: Actor that is not authenticated. The InGeoCloudS platform treats the public as a guest.
Stakeholder: Actor that does not interact directly with the system but has a direct or indirect
interest in the features offered by the system.
Administrator: Actor managing, monitoring and maintaining the InGeoCloudS platform. The
administrator is also in charge of managing data provider and application provider accounts,
and of managing and backing up workspaces.
Cloud provider: Actor responsible for managing the underlying cloud platform. A
cloud provider provides the necessary infrastructure (IaaS) to run the InGeoCloudS platform but
does not interact directly with it (the InGeoCloudS platform is seen as a black box).
2.4 Services
The InGeoCloudS project provides cloud services for the environmental field on a scalable and
efficient infrastructure.
Within the frame of the public infrastructure, users consume the various services offered by the
platform, whether for viewing, downloading or reuse, through interoperable technical services
or other Application Programming Interfaces (APIs).
2.4.1 Account management
Administrator creates/deletes/modifies accounts of type data provider or application provider. (All UCs)
Application provider creates/deletes/registers/unregisters accounts of type registered user. (All UCs)
Administrator, data provider, application provider, registered user modify their profile. (All UCs)
2.4.2 Workspace
Data/Application providers have a dedicated and secured workspace on the InGeoCloudS platform to
store their datasets. Workspace is mainly a dedicated directory on the file system, but a dedicated
database can optionally be created. (All UCs)
Administrator automatically and transparently creates a dedicated directory when a new account of
type Data provider or Application provider is created. (All UCs)
Administrator creates a dedicated database when an Application provider requests it. (UC1, UC2, UC4)
2.4.3 Application installation and maintenance
The InGeoCloudS solution is a computing platform (PaaS) including operating system, programming
language execution environment, database, and web server.
The application providers have the capability to install their custom application on the cloud (web
applications dedicated to the publication of geo-referenced data or application utilities dedicated to the
synchronization of the dataset stored on the cloud with the dataset stored on premises).
- APPLICATION INSTALLATION
This use case defines how application providers manage the installation of application files stored in
their dedicated space on the cloud.
Application providers upload their application files using a file transfer utility (FTP, FTPS), a file transfer
API or a web application provided by InGeoCloudS. (All UCs)
Application providers connect to the InGeoCloudS platform using SSH to configure their application and
initialize its database. (UC1, UC2, UC4)
- APPLICATION MAINTENANCE
This use case defines how application providers maintain their application running in the cloud.
Application providers download (for analysis) the log files of their application using a file transfer utility,
a file transfer API or a web application provided by the InGeoCloudS platform.
2.4.4 Data import/Synchronization
The data providers choose the data to be pushed into InGeoCloudS. Each data provider has a dedicated
and secured storage space on the file system and on the database server. Only the owner of the data
can access it.
Thanks to dedicated features, data providers manage their data in InGeoCloudS, controlling how and
when data are pushed, updated or deleted.
- MANUAL MANAGEMENT OF DATASET FILES
This use case defines how data providers and/or application providers manage dataset files stored in
their dedicated space in InGeoCloudS.
o Data providers manage their dataset files (i.e. upload, move, delete, etc.) using a file
transfer utility, a file transfer API (the InGeoCloudS API), or a web application provided
by the platform (the InGeoCloudS workspace explorer) (All UCs)
o Application providers manage their dataset files using their own application.
- MANUAL MANAGEMENT OF DATASETS STORED IN A DATABASE
In this use case, data providers manage the content of their dedicated database using a web application
provided by the InGeoCloudS platform.
On the other hand, application providers manage the content of their dedicated database through their
own application.
- SYNCHRONIZATION OF DATASET FILES FROM DP’S OWN PREMISES
This use case allows data providers to update dataset files stored on InGeoCloudS, in their dedicated
space, using a synchronization process running on their own premises. Data providers typically configure
on their premises an automatic process that runs locally and updates the dataset files in InGeoCloudS,
through a standard transfer protocol (FTP, FTPS) or through a file transfer API provided by InGeoCloudS.
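A minimal sketch of such a locally running synchronization process, using FTPS, is given below. The host, credentials and directory names are hypothetical parameters, and comparing file sizes to detect changes is a simplifying assumption (a real process might compare timestamps or checksums); the source does not describe how changes are detected.

```python
from ftplib import FTP_TLS, error_perm
from pathlib import Path


def files_to_upload(local_sizes, remote_sizes):
    """Return names of local files that are missing remotely or differ in size."""
    return sorted(
        name for name, size in local_sizes.items()
        if remote_sizes.get(name) != size
    )


def sync_directory(host, user, password, local_dir, remote_dir):
    """Upload new or changed files from local_dir to remote_dir over FTPS."""
    local_dir = Path(local_dir)
    local_sizes = {p.name: p.stat().st_size for p in local_dir.iterdir() if p.is_file()}
    with FTP_TLS(host) as ftps:
        ftps.login(user, password)
        ftps.prot_p()                 # encrypt the data channel as well
        ftps.cwd(remote_dir)
        try:
            remote_names = ftps.nlst()
        except error_perm:
            remote_names = []         # some servers report an empty directory as an error
        ftps.voidcmd("TYPE I")        # binary mode, required by some servers for SIZE
        remote_sizes = {}
        for name in remote_names:
            try:
                remote_sizes[name] = ftps.size(name)
            except error_perm:
                remote_sizes[name] = None   # not a plain file, or SIZE unsupported
        for name in files_to_upload(local_sizes, remote_sizes):
            with open(local_dir / name, "rb") as fh:
                ftps.storbinary(f"STOR {name}", fh)
```

Scheduling `sync_directory` with a local task scheduler (for example cron) would give the automatic behaviour described above.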
- SYNCHRONIZATION OF DATASETS IN INGEOCLOUDS
This use case allows application providers to register synchronization tasks that periodically update
dataset files stored in their dedicated space, and/or datasets stored in their dedicated database.
A synchronization task is an application installed by an application provider that is executed by
InGeoCloudS on behalf of the application provider:
o Application providers manage synchronization jobs (to list, register/unregister or modify
data) using a web application or a task management API provided by InGeoCloudS (UC2,
UC4)
The InGeoCloudS platform executes the registered tasks at the scheduled time on behalf of the
application provider.
2.4.5 Data publication services
This use case defines how data providers publish the datasets through interoperable OGC/INSPIRE
compliant services. They create or edit layers, configure Web Map Service (WMS) for raster data and/or
Web Feature Service (WFS) for vector data.
Data providers:
o create/open a map in their environment and define characteristics of the map. (UC1,
UC2, UC5, UC6)
o create layers in a map with their files or other public WMS/WFS services and define
characteristics of the layers. (UC1, UC2, UC5, UC6)
o publish each layer as WMS and/or WFS services. (UC1, UC2, UC5, UC6)
o publish the map on the Internet. (UC1, UC2, UC5, UC6)
In addition, data providers define access rights to their published data for critical, sensitive or
marketable data.
2.4.6 Metadata management
Metadata allow data providers to describe their data, maps and services. Metadata are published and
used by data consumers who are looking for particular datasets and services. These metadata are
available and managed through a so-called Catalogue.
In the InGeoCloudS solution, the standard tool used for managing and exposing discovery services is
GeoNetwork.
- METADATA CREATION AND STORAGE
In the use cases (UC1, UC2, UC5, UC6), Data providers:
- perform CRUD operations on data metadata, service metadata (protocols, security constraints,
etc.) and/or map metadata (i.e. map layers and portrayal parameters).
- publish these metadata (optionally compliant with OGC/INSPIRE standards: CSW)
- define access rights to their published metadata.
Application providers may publish their data/service metadata using other web service mechanisms.
In addition, data providers may register their own discovery service with a global discovery service (e.g.
Member State discovery service). This might help in improving trust in services and data.
2.4.7 Search service
Discovery services make it possible to search for spatial datasets and spatial data services through an
interoperable OGC/INSPIRE CSW service, on the basis of the content of the corresponding metadata, and to
display the metadata content.
- SEARCH and CONSULTATION THROUGH METADATA
In these cases (UC1, UC2, UC5, UC6), Data consumers:
o invoke the GetCapabilities operation in order to design and build queries on metadata
o browse a geo-portal by area of interest
o perform a keyword-based search on a geo metadata catalogue
These operations are performed through HTTP(S), following the CSW standard.
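The requests involved can be sketched as plain HTTP(S) GET URLs. The example below builds a GetCapabilities request and a keyword-based GetRecords request using the key-value-pair parameters of CSW 2.0.2. The endpoint URL and the keyword are hypothetical, and the example assumes a catalogue that accepts CQL text constraints; it is an illustration, not a description of the InGeoCloudS catalogue configuration.

```python
from urllib.parse import urlencode


def csw_get_capabilities_url(endpoint):
    """Build a CSW 2.0.2 GetCapabilities request URL."""
    params = {"service": "CSW", "version": "2.0.2", "request": "GetCapabilities"}
    return f"{endpoint}?{urlencode(params)}"


def csw_keyword_search_url(endpoint, keyword, max_records=10):
    """Build a CSW 2.0.2 GetRecords URL searching all text fields for a keyword.

    The keyword is inserted into a CQL expression as-is, so it must not
    contain quote characters in this simplified sketch.
    """
    params = {
        "service": "CSW",
        "version": "2.0.2",
        "request": "GetRecords",
        "typeNames": "csw:Record",
        "resultType": "results",
        "elementSetName": "summary",
        "constraintLanguage": "CQL_TEXT",
        "constraint_language_version": "1.1.0",
        "constraint": f"AnyText like '%{keyword}%'",
        "maxRecords": max_records,
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical catalogue endpoint.
capabilities_url = csw_get_capabilities_url("https://example.org/csw")
search_url = csw_keyword_search_url("https://example.org/csw", "landslide")
```

A data consumer would typically fetch the capabilities document first to learn which queryable fields and output formats the catalogue supports, then issue the search request.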
- SEARCH and CONSULTATION THROUGH OTHER MEANS
In this use case (UC1, UC2, UC5, UC6), data consumers use the search functionalities of the application
provided by the application provider (inGeoCloudS SmartQueries, Shake Maps application, full text
search).
2.4.8 View service
View services make it possible, as a minimum, to display, navigate, zoom in/out, pan or overlay spatial
datasets, and to display legend information and any relevant metadata content.
- VISUALIZATION THROUGH STANDARDIZED SERVICES
In these use cases (UC1, UC2, UC5, UC6), Data consumers:
o display spatial data and interact with them (zoom in/out, navigate, pan, overlay)
through a web application provided by the cloud platform
o reuse the spatial data in their own applications or services
o consult legend information about the spatial data displayed.
These operations are performed through HTTP(S), following the WMS standard.
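To make the interaction concrete, the sketch below shows how a client could express panning and zooming as changes to a bounding box and then request the corresponding image with a WMS 1.3.0 GetMap call. The endpoint and layer name are hypothetical, and the use of the CRS:84 coordinate reference system (longitude, latitude order) is an assumption chosen to keep the axis order simple.

```python
from urllib.parse import urlencode


def pan(bbox, dx, dy):
    """Shift a (min_x, min_y, max_x, max_y) bounding box by dx, dy."""
    min_x, min_y, max_x, max_y = bbox
    return (min_x + dx, min_y + dy, max_x + dx, max_y + dy)


def zoom(bbox, factor):
    """Zoom around the centre of the box; factor > 1 zooms in, < 1 zooms out."""
    min_x, min_y, max_x, max_y = bbox
    cx, cy = (min_x + max_x) / 2, (min_y + max_y) / 2
    half_w = (max_x - min_x) / (2 * factor)
    half_h = (max_y - min_y) / (2 * factor)
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


def wms_get_map_url(endpoint, layer, bbox, width=800, height=600):
    """Build a WMS 1.3.0 GetMap URL; bbox is in CRS:84 (lon/lat) order."""
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.3.0",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",
        "CRS": "CRS:84",
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return f"{endpoint}?{urlencode(params)}"


# Hypothetical endpoint and layer name.
view = zoom((0, 0, 10, 10), 2)
url = wms_get_map_url("https://example.org/wms", "susceptibility", (0, 0, 10, 10))
```

Overlaying several datasets corresponds to passing a comma-separated list of layer names in `LAYERS`, with a matching list in `STYLES`.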
- VISUALIZATION THROUGH OTHER MEANS
In this use case, Data consumers display spatial data and interact with them through a web application
provided by the application provider (inGeoCloudS SmartQueries, Shake Maps application).
2.4.9 Download service
Download Services enable copies of complete spatial datasets, or of parts of such sets, to be
downloaded.
- DOWNLOAD THROUGH STANDARDIZED SERVICES
In these use cases (UC1, UC2, UC5, UC6), Data consumers:
o download complete (or a part of) spatial datasets as static resources/URL (through
ATOM, WFS)
o query subsets of spatial datasets.
- DOWNLOAD THROUGH OTHER MEANS
In this use case, Data consumers download complete (or a part of) spatial datasets using:
o a file transfer utility
o the file transfer API provided by the InGeoCloudS platform
o the web application provided by the InGeoCloudS platform
o the web application provided by the application provider (UC3, UC4).
2.4.10 Processing service
This use case defines how:
o Application providers invoke a custom geo-processing service through the InGeoCloudS
API (HTTPS) to calculate or process their data. (UC2, UC3)
o Cloud provider creates one temporary instance to run the custom geo-processing
service. Instance is terminated after geo-processing service is completed. (UC3)
o Application provider, data provider and data consumer invoke a shared geo-processing
service provided by the InGeoCloudS platform (e.g. kriging). (UC6)
2.4.11 Account billing
- Cloud provider periodically provides cost details for the usage of its services. (All UCs)
- Administrator splits costs between data providers and application providers. Split is based on
usage statistics by users. (All UCs)
- Application provider and data provider bear the costs according to the usage
statistics. (All UCs)
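The cost split can be illustrated with a simple proportional rule. The sketch below assumes a single usage figure per account and a strictly proportional allocation; both the account names and the allocation rule are assumptions, since the source only states that the split is based on usage statistics.

```python
def split_costs(total_cost, usage_by_account):
    """Split total_cost between accounts in proportion to their usage."""
    total_usage = sum(usage_by_account.values())
    if total_usage <= 0:
        raise ValueError("total usage must be positive")
    return {
        account: total_cost * usage / total_usage
        for account, usage in usage_by_account.items()
    }


# Hypothetical monthly bill of 1000 and usage figures for three providers.
shares = split_costs(1000.0, {"provider_a": 30, "provider_b": 10, "provider_c": 60})
```

With these figures the three providers are charged 300, 100 and 600 respectively, and the shares always sum to the amount billed by the cloud provider.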
2.4.12 Support, maintenance and monitoring service
- Cloud provider:
o provides services to administrator for monitoring the cloud services. (All UCs)
- Administrator:
o monitors the usage using the monitoring service provided by the InGeoCloudS platform.
(All UCs)
o adjusts the computation and storage capacities (scalability) of the InGeoCloudS platform
according to the usage statistics. (All UCs)
o automatically and transparently performs a periodic backup (every night) of the
workspaces (file system and database) of the data providers and application providers.
(All UCs)
- Application providers:
o explicitly back up their database using the InGeoCloudS API (HTTPS). (UC1,
UC2, UC4)
o perform basic maintenance operations (rebuild indexes, vacuum) on their database
using the InGeoCloudS API (HTTPS). (UC1, UC2, UC4)
2.5 Security in InGeoCloudS
2.5.1 Protecting data against unauthorized access
- AUTHENTICATION WITH FILE TRANSFER UTILITIES
In this use case, data providers and application providers need to authenticate to access their dedicated
space using a file transfer utility and a standard transfer protocol (FTPS, SCP, SFTP).
2.5.2 Protecting services against unauthorized access
- AUTHENTICATION FOR PUBLICATION
In this use case, data providers need to authenticate to publish their data.
2.5.3 Ensuring identity data protection
- SINGLE SIGN ON TO WEB APPLICATIONS AND APIs PROVIDED BY THE SYSTEM (USE CASE)
In this use case, users need to authenticate once in order to have access to the Web applications and
APIs provided by InGeoCloudS (according to their permissions).
- Users:
o log on once using a login page provided by the InGeoCloudS platform, or calling a session
management API provided by the InGeoCloudS platform (valid session + HTTPS)
o can access web applications and/or call APIs provided by the InGeoCloudS platform
according to their permissions
o log out using a logout page provided by the InGeoCloudS platform, or calling a session
management API provided by the InGeoCloudS platform (valid session + HTTPS)
- Security Manager:
o invalidates a user session using a web application provided by the InGeoCloudS platform,
or calling a session management API provided by the InGeoCloudS platform
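The session life cycle listed above (log on once, access according to a valid session, log out, invalidation by the Security Manager) can be sketched with a small in-memory session store. The class and method names are hypothetical, credential checking and permissions are deliberately left out, and nothing here reflects the actual InGeoCloudS session management API.

```python
import secrets
import time


class SessionStore:
    """In-memory session registry shared by all web applications and APIs."""

    def __init__(self, lifetime_seconds=3600):
        self._lifetime = lifetime_seconds
        self._sessions = {}  # token -> (user, expiry timestamp)

    def log_in(self, user):
        """Create a session for an already authenticated user and return its token."""
        token = secrets.token_urlsafe(32)
        self._sessions[token] = (user, time.time() + self._lifetime)
        return token

    def user_for(self, token):
        """Return the user owning a valid session, or None if invalid or expired."""
        entry = self._sessions.get(token)
        if entry is None:
            return None
        user, expires = entry
        if time.time() >= expires:
            del self._sessions[token]
            return None
        return user

    def log_out(self, token):
        """End a single session."""
        self._sessions.pop(token, None)

    def invalidate_user(self, user):
        """Security Manager action: drop every session of the given user."""
        stale = [t for t, (u, _) in self._sessions.items() if u == user]
        for t in stale:
            del self._sessions[t]
        return len(stale)
```

Because every application checks the same store through `user_for`, a single log-on gives access everywhere, and an invalidation takes effect for all applications at once.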
3 EGDI-Scope
3.1 Overview
The EGDI-Scope project was a 2012-2014 feasibility study on the creation of a European Geological
Data Infrastructure (EGDI), co-funded by the European Commission under the 7th Framework
Programme for Research and Technological Development, and executed by a consortium of four
National Geological Surveys (NL, UK, FR, DK), the Catholic University of Leuven (BE) and EuroGeoSurveys
– with the direct contribution of the 32 member European Geological Surveys.
The EGDI-Scope study laid the basis for a pan-European Geological Service facilitating easy, open
access to digital geological data at the European scale.
In the context of the CLARUS application case definition, the EGDI-Scope project has been identified as
representative of the “publication of geo-referenced data on the internet” domain, along with other EU
projects and directives (inGeoCloudS, Minerals4EU, INSPIRE, etc.).
3.2 Actors
3.2.1 EGDI stakeholder panel
A core group of pan-European institutions and projects was assembled into a “stakeholder panel”,
which the consortium consulted regularly in order to ensure that the project was on the right track.
Members of this panel included the European Environment Agency (EEA), the Directorate General for
Enterprise and Industry (DG ENTR), the Directorate General Joint Research Centre (DG JRC) and
Insurance Europe.
3.2.2 EGDI user categories
Four general types of EGDI users have been identified, each of which may be subdivided
according to the sector they represent:
High level end-users: Users such as policy makers who will not need direct access to the
geological data infrastructure, but who depend on the ability of experts to access up-to-date,
reliable data in order to respond quickly to requests for information.
System end-users: Users that access the geological data infrastructure directly in order to find
data and information.
For example:
- end-users of inGeoCloudS, Minerals4EU, OneGeology-Europe, Promine, etc.
- some members of the stakeholder panel (i.e. the European Environment Agency and Insurance
Europe) as well as geological experts from different domains (EGS expert groups)
Data providers: Stakeholders that will feed data into the geological data infrastructure.
Representatives of all EuroGeoSurveys (the association of the European Geological Surveys)
members are involved.
Other stakeholders: Organizations that have an interest in EGDI-Scope to ensure integration with
other projects and programs (on a political or technical level). This concerns past and ongoing
European projects, for example Minerals4EU.
3.3 Datasets
3.3.1 EGDI methodology for prioritizing datasets
In order to prioritize datasets (and identify infrastructure needs), the EGDI consortium developed a
questionnaire and distributed it to all engaged members of the Stakeholder Panel. In addition, a larger
number of stakeholders were identified and invited to fill in the questionnaire in order to get input
from as many different types of organizations as possible.
Furthermore, identification of geospatial datasets has been performed via the INSPIRE Monitoring and
Reporting web portal.
3.3.2 Relevant answers to EGDI stakeholder survey
The questionnaire was relatively simple, and aimed at asking people about their need for geological data
and services.
Among others, one question seems particularly relevant for CLARUS:
Do you have any current legal barriers relating to your use of geological data?
- “Occasionally, standard copyright policies might apply”
- “Data from wells”
- “For any private data, in particular borehole data”
- “Yes, reserves/resources data are confidential”
- “classified and confidential data”
- “Yes, issues of national security”
- “National legislation in other countries”
- “Yes. Law restriction.”
- “All geological maps freely available. Generated geological information from site investigations
depends on confidentiality.”
- “Regarding geological reports and borehole data (rights to inspection, copy rights) - Mineral
royalty - Intellectual property rights (IPR)”
- “Some data are not public and is hard to have access to them, even if I am part of the Geological
Survey”
- “Yes – legal regulations are not clear”
3.3.3 Compilation of the INSPIRE dataset inventory
In addition, key INSPIRE indicators were selected to help create a synthesized overview of existing
spatial datasets (INSPIRE is an EU Directive aiming at standardizing infrastructures for spatial information
and data across Europe).
3.4 Use cases
Four use cases were selected by the EGDI-Scope project consortium in order to study
future EGDI user needs in detail. These use cases were chosen to represent different policy areas and
different stakeholder groups.
The aim of these use cases was to:
- Describe how geological data are used
- Assess the user requirements for data and functionality
- Assess the availability of geological datasets to fulfil the requirements
- Study the dependencies towards previous and ongoing projects
- Demonstrate interfaces to other e‐Infrastructures
- Assess legal, licensing and governance aspects
- Study issues of relevance for the EGDI architecture
- Study issues of relevance for the implementation of EGDI and conversion of existing datasets.
3.4.1 Geohazards: Ground Instability in densely populated areas
A number of European projects have been dealing with ground stability assessments in densely
populated areas. The present use case focuses on the PanGeo project and data.
Datasets
The PanGeo project has developed a ground stability GIS layer and a geohazard description report for a
number of cities around Europe (Urban atlas).
Users
- End-users: European politicians, general public or private companies
- Professional users: geological experts from within the national geological surveys, mining or
finance companies, geological scientists or scientists from non-geological domains.
Examples of required functionality
- Geohazard and demographic information for ground stability polygons (click‐info).
- Display of PSI data:
o Average annual velocities
o Cumulative displacements
- GIS Inquiry tools such as the visualization of ground motion time series for individual PSI points
(click‐info, click a point and visualize a graph of movement in time)
- Download of geohazard descriptions
- WMS/WFS
- Visualization in Google Earth
3.4.2 Raw Materials: Rare Earth Element potential within the European Union
This use case is closely connected to Minerals4EU.
Datasets
Rare Earth Elements (REE) are a group of critical raw materials. The sustainable supply of these elements
for European Industry is highly vulnerable.
Users
- End-users: European politicians, general public or private companies
- Professional users: geological experts from within the national geological surveys, mining or
finance companies, geological scientists or scientists from non-geological domains.
Examples of required functionality
- Interactive map functionality
- Download of printable maps
- WMS/WFS functionality
- Download in Excel format
- Download in GIS format
- Various search facilities (to be specified)
3.4.3 Renewable Energy - Planning for offshore Wind Farms
Users
- Private companies planning for the construction of wind farms
- Governmental agencies preparing calls for tender and evaluating applications
- Environmental agencies doing e.g. habitat mapping
Datasets
A wind farm project is planned taking into account a number of geological and geophysical data, e.g. the nature of the sea bottom, the potential impacts on the environment and potential submerged archaeological sites. The following data have been collected:
- Ground investigation
- Mapping of marine physical processes
- Marine water and sediment quality
- Marine and intertidal ecology
- Impact on other marine users
- Marine and coastal archaeology
Free and open geological and geophysical data are easily available and are of a high value when
preparing for wind farm applications. A number of European projects have for many years worked on
putting together harmonized geological and geophysical data and making these available via the web.
The most important of these projects to be considered by EGDI-Scope are EMODnet-geology and
Geo-Seas.
3.4.4 Geology and Soils – Ecosystem Mapping
The ecosystem mapping and assessment described within the current use case relates to the EU Biodiversity strategy. The part presented here concentrates on geoscientific studies/data sets about soils.
Users
- The European Environment Agency (EEA)
Datasets
The EGS geochemical mapping project (GEMAS) has recently produced a high quality pan‐European
dataset of geochemistry from agricultural and grazing soils. The dataset can supply important
information for the ecosystem assessment performed by EEA.
3.5 Trust and authentication
The study paid particular attention to legal and organizational aspects (particularly trust and
authentication matters).
3.5.1 Elements of trust
Trust in the EGDI may take different levels and different forms.
Trust in the data
According to the report on trust and authentication delivered in the 5th EGDI-Scope work package, “the
user has to feel comfortable in using the geological data sets offered”. “He or she has to have enough
guarantees and safeguards that the data are reliable and of sufficient quality and proper format for the
objectives he wants to obtain”. This could be done through:
- Metadata
- Transparent quality assessment procedures
- Security measures for maintaining the authenticity and integrity of the data
Trust in the services
If a user has to rely on obtaining data via services such as the INSPIRE network services, he or she has to
be able to rely on the availability of these services whenever they are needed. This implies:
- A sufficient level of service has to be guaranteed by the service providers in the EGDI.
- The offered level of service has to be communicated clearly to the users of the services via
what is generally referred to as service level agreements or terms of service.
- The required level of service is to a large extent determined by the INSPIRE implementing
rules relating to the network services.
Trust in the people
Trust in the people covers different aspects, such as authentication and identity management, rights management and appropriate personal data protection. It may also imply setting up an appropriate identity management system that allows for cross-border transactions without imposing too heavy a burden on the users of the system. Furthermore, the EGDI consortium has differentiated:
- Trust from the perspective of data providers: it may be important, depending on the data and use conditions, to know who is
using their data and how they are using it.
- Trust from the perspective of users: it is important to know who the data is stemming from and that access to and use
of the data is not unnecessarily restricted; users also need to be sure that the information on their identity and their use of the
data is not misused by the data provider.
3.5.2 Moving to the cloud
As stated in the EGDI-Scope report, moving to the cloud “may have considerable benefits relating to
scalability and efficiency”. “However, there are a number of risks and possible disadvantages that need
to be taken into account.”
These risks may even be intensified if the EGDI uses multiple cloud service providers.
In order to identify these risks, one may refer to ENISA’s extensive report “Cloud computing. Benefits,
risks and recommendations for information security.”
Risks about security
The user will depend on the cloud service provider’s security measures, and in most cases he or she will
not be able to impose any security requirements.
Inadequate security may lead to “loss of data”, “corruption of data”, “problems in extracting the data
from the cloud service”, “unintended exposure of data”, or “continuity problems”.
Risks about personal data protection
A particular type of data requiring confidentiality is personal data. The data sets included in the EGDI
cannot be qualified as personal data, and the only personal data that will be processed in the EGDI will
most likely be the data regarding the persons accessing the data and services in case of restricted data.
The processing of such data has to comply with specific requirements under the Data Protection
Directive.
The study for an EGDI concluded that a key question in the context of cloud computing concerns “the
division of responsibilities and liability between the different actors in the cloud computing value chain,
and the determining of the processors and controllers of the data processing operations and their
obligations.” Putting personal data in the cloud implies a “transfer of personal data to third parties
under the Data Protection Directive, possibly to countries outside of the European Union”.
Risks about control and ownership
For example, it should be ensured that the data are properly deleted at the end of the contract, after
an appropriate transition period.
4 Minerals4EU
4.1 The Raw Materials Initiative
Raw materials are fundamental to Europe’s economy, and they are essential for maintaining and
improving our quality of life. Recent years have seen a rapid growth in the number of materials used
across products. Securing reliable and undistorted access to certain raw materials is of growing concern
within the EU and across the globe. As a consequence of these circumstances, the Raw Materials
Initiative was instigated to manage responses to raw materials issues at an EU level. Critical raw
materials have a high economic importance to the EU combined with a high risk associated with their
supply.
4.2 The Project
The Minerals4EU project is designed to meet the recommendations of the Raw Materials Initiative and
will develop an EU Mineral intelligence network structure delivering a web portal, a European Minerals
Yearbook and foresight studies.
The network will provide data, information and knowledge on mineral resources around Europe, based
on an accepted business model, making a fundamental contribution to the European Innovation
Partnership on Raw Materials (EIP RM), seen by the Competitiveness Council as key for the successful
implementation of the major EU2020 policies.
4.3 Actors: Data owners and consumers
Initially, the network membership will be drawn from the consortium, whose members are expected
to be the data owners at EU level. The consortium mainly comprises the National Geological Surveys
and other key EU players, including all the main European raw materials industry associations
(EUROMINES, IMA, EUROMETAUX and UEPG). The network must also be extended to include academics
and the main technology and related platforms. The involvement of the European Technology Platform
on Sustainable Mineral Resources (ETP SMR), which brings together all the European industrial players,
is guaranteed by EuroGeoSurveys (www.eurogeosurveys.org).
An Industrial Consultation Committee has been formed of main stakeholders and end user industries.
Academic institutes are not normally the owners of large, regional data sets, but may be able to
contribute some data. A major party interested in the outcome of data processing is the European Commission
itself. The network should therefore strive to become the authoritative source on which the European
Commission can base its judgments and policy making.
The following types of stakeholders have been identified as recipients of information about the project results, since they
may become users of the data collected and referenced:
- experts/professionals from the industry sector or their representative associations, as well as
research and innovation institutes, that are involved in a Sustainable Consumption and
Production approach following a circular economy, from natural resources through design,
manufacturing, distribution, use, collection and reuse, to recycling and recovery.
- environmental organizations/professionals, regarding the existing initiatives on good practices
on raw materials sustainability and resource efficiency. Special attention will be given to
clarifying the production, transformation and recycling routes followed by ‘Raw Materials’
during their lifespan. The alternatives for specific raw materials in terms of natural occurrences,
needs, toxicity, hazard and risk will be presented alongside the relevant regulations.
- social and labor bodies, which will be informed of the existing work practices for the
production/transformation of raw materials by the different main producers.
4.4 Services
4.4.1 Mineral Statistics
A central part of any “Mineral Intelligence Network” is the ‘intelligence’, in other words the
quality-controlled data and the information derived from that data.
Existing datasets relating to mineral production and trade will be enhanced. New datasets for
exploration activity, resources and reserves and secondary raw materials will be developed. As data are
received, an initial assessment will be conducted, which will consider data quality and requirements for
harmonization, and identify key data and knowledge gaps.