From Digital Objects to Content across eInfrastructures

DILIGENT: Deploying Virtual Research Environments on-demand

Donatella Castelli, Pasquale Pagano
ISTI-CNR

Yannis Ioannidis
Univ. of Athens

Rome, 29-30th October 2007
European Information Space: Infrastructures, Services and Applications Workshop

Outline

Motivations & overview
Achievements
  DL related services
  DILIGENT Infrastructure
  ImpECt application
D4Science

Motivations - from DLs to VREs

DLs are evolving into “Virtual Research Environments” (Collaboratoria)

Distributed frameworks for carrying out cooperative activities like “in silico experiments”, data analysis and processing, production of new knowledge using specialised tools

Largely based on retrieval and access of always updated knowledge from diverse heterogeneous content sources

The knowledge produced is preserved and made available for other usages inside and outside the VRE

VREs trend

Highly dynamic, created and dismissed on-demand

Based on specialised tools which support the generation of new knowledge

[Chart, M26: status of each service on a 0 to 1,2 scale. Legend: Prototype, Available, Build. Services: Information Service, Broker & Matchmaker, Keeper, DVOS, VDL Generator, Content Management, Wrapper & Monitor, Content Security, Metadata Broker, Annotation, Metadata Management, Data Fusion, CSDS, Personalization, Index Service, Search Service, Feature Extraction Service, Process Design & Verification, Process Execution & Reliability, Process Optimization, Arte Portal, ImpECt Portal]

VRE system

VRE

VRE System

Content Sources

Dedicated Resources

Services

Computing & storage elements

Management and Orchestration…

1-to-1 model: sustainability

Content Sources

Management and Orchestration

Dedicated Resources

Services

Computing & storage elements

The cost of a dedicated system can be too high for volatile VREs that use many resources

Outsourcing to the e-Infrastructure

e-Infrastructure

Shared Resources

Management and Orchestration

Success factors/challenges

Infrastructure sustainability
  Mechanisms for reducing the cost of infrastructure management

Supported VREs
  Flexible and high quality solutions for satisfying the needs of many different application domains
  Simple procedures for creating VREs

DILIGENT achievements

ImpECt: Environmental Monitoring

DILIGENT Infrastructure

SAPIR-enabled AV search

ARTE: Education in the Humanities

gCube System

gCube overview

gCube Mw

gCube Data kit

VRE Generator

gCube middleware

Simplifies the infrastructure management

Resource: Content Source, Service, Comp & Storage

Resources registration, monitoring, notification, …
Service deployment, dynamic reallocation, …
Service composition

gCube Middleware [cont.]

gCube Mw

gCube Data kit

Preservation Data kit

VRE generator

Transparent selection and orchestration of resources by
  Offering a GUI
  Abstracting over complexity
  Abstracting over heterogeneity

Simplifies the construction of a VRE system

VRE generator [cont.]

gCube Mw

gCube Data kit

VRE Generator

Preservation Data kit

gCube Data Kit

Provides flexible search and management functionality

Data Fusion, Browse, Source selection, Feature extraction

Focus: Search Management

Most important framework for Information Spaces

Most important functionality / service in Information Access

[Diagram labels: Replication, Browse, Encryption, Search Mgt, Content Mgt, Data fusion]

Main Objectives

An open, feature-rich, inherently-distributed Search Engine
  Composed out of diverse, autonomous, pluggable elements
  Capturing complex application scenarios combining
    Information retrieval
    Data processing

Maximization of resources placed at the disposal of VRE managers and users
  Ease of sharing of resources, avoiding mis-utilization and misuse
  Reduction of cost of ownership and use

Objective: Optimal Utilization of Resources

Essential for:
  Maintaining QoS contracts
  Confronting infrastructure-raised challenges
  Attracting resources to the Grid

Special challenges:
  Uncontrolled and dynamic environment
  High-dimensional search space
  Multi-facet quality metrics
  Heterogeneity

Search Management Kit

Search Management: orchestration of search services

Operation highlights:
  Planning & Optimization
  Distributed Information Retrieval
  Incremental result delivery
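
As an illustration of "incremental result delivery", here is a minimal Python sketch. It is not taken from gCube: the slides do not describe the mechanism, so the sketch assumes each source yields hits already sorted by descending score and merges them lazily, so a consumer can read the first results before any source has been fully drained. The names `ranked_source` and `incremental_results` are hypothetical.

```python
import heapq
from itertools import islice


def ranked_source(name, hits, log):
    """Yield (score, doc_id) hits one at a time, recording each fetch in `log`."""
    for score, doc_id in hits:
        log.append((name, doc_id))  # stands in for a remote fetch
        yield score, doc_id


def incremental_results(*sources):
    """Lazily merge sources that are each sorted by descending score."""
    return heapq.merge(*sources, key=lambda hit: hit[0], reverse=True)


if __name__ == "__main__":
    fetched = []
    a = ranked_source("a", [(0.9, "a1"), (0.5, "a2"), (0.1, "a3")], fetched)
    b = ranked_source("b", [(0.8, "b1"), (0.2, "b2")], fetched)
    # Only the first two results are consumed; later hits are never fetched.
    for score, doc_id in islice(incremental_results(a, b), 2):
        print(doc_id, score)
```

Because the merge is a generator, the cost of fetching is paid only for the results a user actually looks at.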

Distribution x 2

Retrieval of Distributed Information

Distributed Retrieval of Information

Distribution #1: Information Sources

System diversity
  Internal, registered/indexed by the system
  External, Google, JDBC data sources, ISIS/OSIRIS system

Data diversity
  Structured and semi-structured (xml)
  Geospatial and temporal
  Potentially thematically focused

Processing diversity
  Metadata structures
  Querying cost
  Ranking estimation

Images

Distribution #1: Information Sources

THE CHALLENGE
  Characterizing and indexing a diversity of sources
  Selecting the appropriate sources
  Fusing/Merging the results in meaningful lists

Indexing for Content Based Search

[Diagram labels: Query, Extract features, Portal, Feature Extraction, Query Index, Metadata & Content Mgt, Index, 203 236 172 210 78, Access metadata & create ResultSet, MD, Present results, Index Mgt, Feed, Build Index, Content & Metadata, Feature Extraction Service, Feature Index]

Selecting Sources and Fusing Results

[Diagram labels]
Components: Index, Metadata Manager, Content Source Description, Content Source Selection, Data Fusion, Search, External Source, External Source, Content Manager
Data: Metadata Collections, Content Collections, External Repositories, Indices, Index Statistics, Source Descriptions
Steps: Describe, Select Sources, Query Sources, Acquire Results, Reranked Lists
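
The fusion step that turns per-source results into a single reranked list can be sketched in Python as follows. The slides do not say which fusion algorithm DILIGENT uses; this sketch swaps in a common stand-in (min-max score normalisation followed by CombSUM) and assumes each source returns `(doc_id, score)` pairs. The function names are hypothetical.

```python
from collections import defaultdict


def normalise(results):
    """Min-max normalise (doc_id, score) pairs from one source into [0, 1]."""
    if not results:
        return []
    scores = [score for _, score in results]
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores equal: treat every hit as equally relevant.
        return [(doc_id, 1.0) for doc_id, _ in results]
    return [(doc_id, (score - lo) / (hi - lo)) for doc_id, score in results]


def fuse(result_lists):
    """CombSUM: sum the normalised scores per document, then rank descending."""
    totals = defaultdict(float)
    for results in result_lists:
        for doc_id, score in normalise(results):
            totals[doc_id] += score
    # Highest fused score first; doc_id breaks ties deterministically.
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    internal = [("d1", 10.0), ("d2", 5.0), ("d3", 0.0)]
    external = [("d2", 0.9), ("d4", 0.1)]
    for doc_id, score in fuse([internal, external]):
        print(doc_id, score)
```

Normalising before summing matters because heterogeneous sources report scores on unrelated scales; a document returned by several sources is rewarded by the sum.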

Distribution #2: Information Retrieval

Numerous Search services, for info retrieval & processing
  Structured data and XML processing (scanners, sorters, joiners, filterers, transformers, retrievers)
  Lookups (indices, FT indices, XML indices, Geo indices)
  Content-based searches
  External source probes
  Fusion / Merging of results

Query language (internal) for interfacing
Workflow language (BPEL) for execution
Data transport mechanism (ResultSet) for communication

Query and Workflow Management

project by 'title', 'description', 'subject'
on (keeptop 20
  on (sort ASC by 'DocID'
    on (merge
      on (fieldedsearch by 'title' contains '*woman' in 'ENGLISH' on 'CollectionOfMedicalImages' as 'dc')
      and (fieldedsearch by 'description' contains '*term*' in 'ENGLISH' on 'CollectionOfMedicalBooks' as 'dc')
    )
  )
)

Produce & Execute BPEL Workflow

Optimization
  Complex Cost Calculation
  Profiling / Monitoring
  Resource selection “hinting”
  Domain specific planning
  …

Parallelization

Active Planning

Query
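
To make the nesting of the query on this slide concrete, the following Python sketch models each operator (fieldedsearch, merge, sort, keeptop, project) as a plain function over in-memory records. It is only an illustration, not the gCube engine, which compiles queries into BPEL workflows. Every function name is hypothetical, and two behaviours are assumptions: wildcards are matched per word, case-insensitively, and `merge` keeps one record per DocID.

```python
from fnmatch import fnmatchcase


def fielded_search(collection, field, pattern):
    """Return records whose `field` contains a word matching the wildcard pattern."""
    pat = pattern.lower()
    return [
        record for record in collection
        if any(fnmatchcase(word, pat)
               for word in str(record.get(field, "")).lower().split())
    ]


def merge(*result_lists):
    """Concatenate result lists, keeping the first record seen per DocID (assumed)."""
    seen, merged = set(), []
    for results in result_lists:
        for record in results:
            if record["DocID"] not in seen:
                seen.add(record["DocID"])
                merged.append(record)
    return merged


def sort_asc(results, key):
    """Sort records in ascending order of the given field."""
    return sorted(results, key=lambda record: record[key])


def keeptop(results, n):
    """Keep only the first n records."""
    return results[:n]


def project(results, *fields):
    """Keep only the named fields of each record."""
    return [{f: record.get(f) for f in fields} for record in results]


def run_query(images, books):
    """Python equivalent of the nested query: project(keeptop(sort(merge(...))))."""
    hits = merge(
        fielded_search(images, "title", "*woman"),
        fielded_search(books, "description", "*term*"),
    )
    return project(keeptop(sort_asc(hits, "DocID"), 20),
                   "title", "description", "subject")
```

Reading the function calls inside out gives the execution order: the two searches run first (and could run in parallel), then merge, sort, keeptop and project.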

Queries & Workflows: It can get complex…

project by 'title', 'date' on
(sort ASC by 'DocID' on
  (merge on
    // MAP REPORTS
    keeptop 8 on
      (sort ASC by 'RankID' on
        (join inner by 'DocID' on
          (fulltextsearch by 'Mediterranean' in 'ENGLISH' on 'd369b3e0-fa4c-11db-a297-9c01d805f283')
          and
          (fulltextsearch by 'Environmental' in 'ENGLISH' on 'd369b3e0-fa4c-11db-a297-9c01d805f283')))
    // EEA reports
    keeptop 8 on
      (sort ASC by 'RankID' on
        (fieldedsearch by 'date' contains '*1999*' on
          (join inner by 'DocID' on
            (fulltextsearch by 'air polution' in 'ENGLISH' on '25ad3c50-fa41-11db-a270-9c01d805f283')
            and
            (fulltextsearch by 'european' in 'ENGLISH' on '25ad3c50-fa41-11db-a270-9c01d805f283')
          ))
      ))
)

Optimal Utilization of Resources

Pre-query optimization:
  Monitoring and adaptation of VRE layout for optimal resource use

Content Source Selection:
  Filtering of collections unlikely to contain useful data
  Query terms and automatically pre-constructed Content Source Descriptors

Query Planning:
  Cost based optimization
  Heuristics and space-search

Process Execution:
  Process optimization selects and allocates appropriate resource for tasks

On-The-Spot processing:
  ResultSet mechanism to allow local filtering of large XML chunks of data

Further mechanisms to facilitate efficient searches:
  Indices
  ResultSet transport mechanism
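
The Content Source Selection idea (filter out collections unlikely to contain useful data by comparing the query terms with pre-constructed descriptors) can be sketched in Python as below. The descriptor layout and the scoring rule are assumptions made for illustration: each descriptor is taken to hold a document count and per-term document frequencies, and a source scores the summed fraction of its documents containing each query term. The slides do not specify the actual descriptors or ranking function.

```python
def score_source(descriptor, query_terms):
    """Sum, over query terms, of the fraction of documents containing the term."""
    n_docs = descriptor["n_docs"]
    if n_docs == 0:
        return 0.0
    doc_freq = descriptor["doc_freq"]
    return sum(doc_freq.get(term.lower(), 0) / n_docs for term in query_terms)


def select_sources(descriptors, query_terms, max_sources=3):
    """Drop sources with no matching term, then keep the best-scoring ones."""
    scored = [(name, score_source(d, query_terms))
              for name, d in descriptors.items()]
    useful = [(name, score) for name, score in scored if score > 0]
    useful.sort(key=lambda item: (-item[1], item[0]))
    return [name for name, _ in useful[:max_sources]]


if __name__ == "__main__":
    # Hypothetical descriptors, built ahead of query time.
    descriptors = {
        "reports": {"n_docs": 100,
                    "doc_freq": {"mediterranean": 40, "environmental": 60}},
        "images": {"n_docs": 50, "doc_freq": {"mediterranean": 5}},
        "books": {"n_docs": 10, "doc_freq": {"anatomy": 7}},
    }
    print(select_sources(descriptors, ["Mediterranean", "Environmental"]))
```

Because the descriptors are built in advance, selection touches only small summaries and never the collections themselves, which is what keeps this step cheap.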

Information Retrieval: How it Works

[Diagram labels]
Components: Search Service, SearchMaster, Query Preprocessing, Query Parser, Planner, Active Planning, Environment info, PES, Workflow, Personalization, CSS, Linguistics, DIS, XML Sorter, XML Merger, XML Transformer, XML Joiner, XML Processor, External Source, FTI Lookup, Geo Index Lookup, Feature Index Lookup, Data Fusion, Metadata Catalog
Inputs and outputs: Query, Results
Workflow (bpel4ws) nodes: S1, S2, S3, F1, F2, F3, F4, J1, M1, C, T
Other labels: Q, E, P

from theory ... to reality

Next step: DILIGENT for Science

Provide and operate a production D4Science e-Infrastructure
Consolidate and extend gCube
Build VREs serving Environmental Monitoring and Fishery Resources Management domains

Main technological challenges

Provide and operate a production D4Science e-Infrastructure
  Define the operational procedures for sites (sites include content and service sites)

Consolidate and extend gCube
  Extend the Data Kit to deal with very large and heterogeneous content sources (e.g. textual repositories, satellite images, statistical databases) and other content-related resources (e.g. gazetteers, ontologies, thesauri)

Build VREs serving Environmental Monitoring and Fishery Resources Management domains
  Serve the needs of a multitude of researchers and decision-makers from many disciplines (biologists, climatologists, GIS experts, socio-economists, fishery managers, etc.) operating with many different tools

http://www.diligentproject.org

http://www.d4science.org/

Thank you! Questions?

Additional Slides

gCube System

An application framework for the development of services that can be outsourced to a grid-enabled infrastructure

An advanced container for the hosting of WS on the grid

A runtime environment for the
  provision of information about shared resources
  management of services and applications
  execution of VRE built-in services: content and metadata management; indexing, selection, fusion, extraction, description, annotation, transformation, and presentation of content

VREs: new requirements

Persistent and consolidated
  e.g. serving a team of individuals in addressing the mission of an institution

Analysis and production of new knowledge
  e.g. serving a research team which produces new results through complex analysis and simulation

Focus on publication
  e.g. supporting the publishing and archival of content

Highly dynamic, created and dismissed on-demand
  e.g. supporting the activities of a project addressing a specific challenge