international conference on intelligent systems design and applications 2009

A Business Intelligence Process to support Information Retrieval in an Ontology-Based Environment

Tommaso Federici, Tuscia University, 01010 Viterbo, Italy, [email protected]

Filippo Sciarrone, Paolo Starace, Open Informatica srl – BI Division, Via dei Castelli Romani, 12/A, 00040 Pomezia, Italy, {f.sciarrone, p.starace}@openinformatica.org

Upload: paolo-starace

Post on 18-Dec-2014


Page 1: International Conference on Intelligent Systems Design and Applications 2009

A Business Intelligence Process to support Information Retrieval in an Ontology-Based Environment

Tommaso Federici

Tuscia University

01010 Viterbo, Italy

[email protected]

Filippo Sciarrone, Paolo Starace

Open Informatica srl – BI Division

Via dei Castelli Romani, 12/A

00040 Pomezia, Italy

{f.sciarrone, p.starace}@openinformatica.org

Page 2: International Conference on Intelligent Systems Design and Applications 2009

Introduction

• The competitiveness of a company depends on how fast it can make decisions regarding its business mission

– Today’s market requires that data also be processed for its semantics, so as to gain an extra competitive advantage in the business

– In Business Intelligence (BI) applications, Information Retrieval (IR) can be a critical function, because most BI applications today rely on traditional keyword searching as their primary retrieval mechanism

Page 3: International Conference on Intelligent Systems Design and Applications 2009

Research Question

How can Business Intelligence and Information Retrieval techniques be merged to improve the decision support process?

• We propose to use BI techniques to manage and reuse all the data stored in an ontology-based Data Warehouse (DW):

– Data level: the data involved in the DW aggregation process are the result of an IR indexing process

– OLAP structures: the multidimensional OLAP structures are derived automatically from the ontology structures involved in the IR process

Page 4: International Conference on Intelligent Systems Design and Applications 2009

Summary

• Semantic indexing process and system overview

• Dynamic dimension definition process

• Integration of the indexed data process

• Case study

Page 5: International Conference on Intelligent Systems Design and Applications 2009

The semantic indexing process

[Figure: the semantic indexing process. Labels: unstructured docs, ontologies, terms set]

Page 6: International Conference on Intelligent Systems Design and Applications 2009

The overall system

[Figure: the overall system. Labels: semantic dictionary, structured data, index, custom ETL process, data warehouse and datamarts (dm1, dm2, dm3), additional integrable data, presentation layer]

Page 7: International Conference on Intelligent Systems Design and Applications 2009

Summary

• Semantic indexing process and system overview

• Dynamic dimension definition process

• Integration of the indexed data process

• Case study

Page 8: International Conference on Intelligent Systems Design and Applications 2009

Dynamic dimension definition

[Figure labels: unbalanced ontology tree, dynamic computation]

dimension table

concept_id   description   parent
1            one
2            two           1

bridge table

parent   subsidiary   distance
1        1            0
1        2            1

fact table

concept_id   other_dim_id   measure1
…            …              …
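The bridge table on this slide can be derived from the parent column of the dimension table. The Python sketch below is one possible way to do it, not the authors' implementation: the function name `build_bridge` is hypothetical, the ontology is assumed to be a tree (no cycles), and every concept is also paired with itself at distance 0, which yields one more row, (2, 2, 0), than the slide lists.

```python
from typing import Dict, List, Optional, Tuple


def build_bridge(parents: Dict[int, Optional[int]]) -> List[Tuple[int, int, int]]:
    """Return (parent, subsidiary, distance) rows for a parent-pointer tree.

    Each concept is paired with itself (distance 0) and with every ancestor,
    where distance is the number of edges between the two concepts.
    Assumes the input is a tree: a cycle would make the loop run forever.
    """
    rows: List[Tuple[int, int, int]] = []
    for concept in parents:
        ancestor: Optional[int] = concept
        distance = 0
        # Walk up from the concept to the root, emitting one row per step.
        while ancestor is not None:
            rows.append((ancestor, concept, distance))
            ancestor = parents[ancestor]
            distance += 1
    return sorted(rows)


# Dimension table from the slide: concept 1 is the root, concept 2 its child.
dimension = {1: None, 2: 1}
bridge = build_bridge(dimension)
```

Because the walk follows parent pointers up to the root, branches of different depth need no special handling, which is what makes the approach suitable for an unbalanced tree.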

Page 9: International Conference on Intelligent Systems Design and Applications 2009

Dynamic dimension definition

• We suggest defining some ad-hoc ontologies that describe the scope of the analysis, with the aim of dynamically building the corresponding dimensional structures and joining them to the ones previously obtained

– Through the Computer Expert Ontology, the analyst can analyze all the information content stored inside the unstructured documents

Page 10: International Conference on Intelligent Systems Design and Applications 2009

Summary

• Semantic indexing process and system overview

• Dynamic dimension definition process

• Integration of the indexed data

• Case study

Page 11: International Conference on Intelligent Systems Design and Applications 2009

Integration of the indexed data

-- Note: index and column are placeholder names; both are keywords in many SQL engines and may need quoting or renaming.

-- Step 1: join the index to the operational fact table.
SELECT *
FROM index
JOIN op_fact_table ON index.op_fact_id = op_fact_table.id;

-- Step 2: add a standard dimension table.
SELECT *
FROM ((index
JOIN op_fact_table ON index.op_fact_id = op_fact_table.id)
JOIN std_dimension_dt ON op_fact_table.column = std_dimension_dt.column);

-- Step 3: add the ontology dimension table, joined through the indexed concept.
SELECT *
FROM ((index
JOIN op_fact_table ON index.op_fact_id = op_fact_table.id)
JOIN std_dimension_dt ON op_fact_table.column = std_dimension_dt.column)
JOIN ontology_dt ON index.concept = ontology_dt.concept;

-- Step 4: aggregate the measure by ontology concept and standard dimension.
SELECT ontology_dt.ontology_id, std_dimension_dt.std_dimension_id, sum(op_fact_table.measure)
FROM ((index
JOIN op_fact_table ON index.op_fact_id = op_fact_table.id)
JOIN std_dimension_dt ON op_fact_table.column = std_dimension_dt.column)
JOIN ontology_dt ON index.concept = ontology_dt.concept
GROUP BY std_dimension_dt.std_dimension_id, ontology_dt.ontology_id;
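The final aggregation query can be tried on toy data with Python's standard `sqlite3` module. This is a minimal sketch under stated assumptions: the table `idx` stands in for the slide's `index` table, the column `region` stands in for the slide's placeholder `column`, and all rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE op_fact_table (id INTEGER PRIMARY KEY, region TEXT, measure INTEGER);
CREATE TABLE std_dimension_dt (std_dimension_id INTEGER PRIMARY KEY, region TEXT);
CREATE TABLE ontology_dt (ontology_id INTEGER PRIMARY KEY, concept TEXT);
-- "idx" replaces the slide's "index" table name, because INDEX is a keyword.
CREATE TABLE idx (op_fact_id INTEGER, concept TEXT);

-- Hypothetical sample rows, not taken from the presentation.
INSERT INTO op_fact_table VALUES (1, 'north', 10), (2, 'north', 5), (3, 'south', 7);
INSERT INTO std_dimension_dt VALUES (100, 'north'), (200, 'south');
INSERT INTO ontology_dt VALUES (1, 'java'), (2, 'sql');
INSERT INTO idx VALUES (1, 'java'), (2, 'java'), (3, 'sql');
""")

# Same joins and grouping as the last query on the slide, with an ORDER BY
# added so that the result order is deterministic.
rows = conn.execute("""
SELECT ontology_dt.ontology_id,
       std_dimension_dt.std_dimension_id,
       SUM(op_fact_table.measure)
FROM idx
JOIN op_fact_table    ON idx.op_fact_id = op_fact_table.id
JOIN std_dimension_dt ON op_fact_table.region = std_dimension_dt.region
JOIN ontology_dt      ON idx.concept = ontology_dt.concept
GROUP BY std_dimension_dt.std_dimension_id, ontology_dt.ontology_id
ORDER BY ontology_dt.ontology_id, std_dimension_dt.std_dimension_id
""").fetchall()

for ontology_id, std_dimension_id, total in rows:
    print(ontology_id, std_dimension_id, total)
```

Note that a fact row indexed under several concepts contributes its measure once per concept in this query.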

Page 12: International Conference on Intelligent Systems Design and Applications 2009

Summary

• Semantic indexing process and system overview

• Dynamic dimension definition process

• Integration of the indexed data process

• Case study

Page 13: International Conference on Intelligent Systems Design and Applications 2009

The Star Schema

The resulting schema, produced by our system, is composed of one dimension and one fact

• The dimension computerexpert_dt represents the ontology tree

• The fact experience_ft represents the number of CVs that contain the concept
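A schema of this shape could look as follows. The table names computerexpert_dt and experience_ft come from the slide; the column names (`concept_id`, `description`, `parent`, `cv_count`) and all rows are assumptions made for this sketch, which again uses Python's standard `sqlite3` module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension: one row per ontology concept, with a pointer to its parent.
CREATE TABLE computerexpert_dt (
    concept_id  INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    parent      INTEGER REFERENCES computerexpert_dt(concept_id)
);
-- Fact: number of CVs that contain each concept.
CREATE TABLE experience_ft (
    concept_id INTEGER NOT NULL REFERENCES computerexpert_dt(concept_id),
    cv_count   INTEGER NOT NULL
);

-- Hypothetical sample rows, not taken from the presentation.
INSERT INTO computerexpert_dt VALUES
    (1, 'programming', NULL), (2, 'java', 1), (3, 'databases', NULL);
INSERT INTO experience_ft VALUES (1, 12), (2, 8), (3, 5);
""")

# List each concept with the number of CVs that mention it.
report = conn.execute("""
SELECT d.description, f.cv_count
FROM experience_ft AS f
JOIN computerexpert_dt AS d ON d.concept_id = f.concept_id
ORDER BY f.cv_count DESC
""").fetchall()

for description, cv_count in report:
    print(description, cv_count)
```

The counts are read per concept rather than summed up the tree, since a single CV may mention several concepts under the same parent.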

Page 14: International Conference on Intelligent Systems Design and Applications 2009

Pivot Table

With this solution, the system helps a Human Resources manager analyze the enterprise CV repository

Page 15: International Conference on Intelligent Systems Design and Applications 2009

Conclusions and Future Work

• Our process allows the ontologies defined in semantic search engine dictionaries to be reused as OLAP dimensions, thus providing a stable solution for integrating indexed data in a semantic context

• To this aim, a two-step process was implemented so that the overall system is independent of the complexity of the ontologies

• Our future work will focus on resolving the problems related to the management of many-to-many relationships

• A first evaluation was carried out in an industrial context, yielding positive results on the viability of our proposal

Page 16: International Conference on Intelligent Systems Design and Applications 2009

Thanks to all for your attention

Questions?