International Conference on Intelligent Systems Design and Applications 2009
A Business Intelligence Process to support Information Retrieval
in an Ontology-Based Environment
Tommaso Federici
Tuscia University
01010 Viterbo, [email protected]
Filippo Sciarrone, Paolo Starace
Open Informatica srl – BI Division
Via dei Castelli Romani, 12/A
00040 Pomezia, Italy
{f.sciarrone, p.starace}@openinformatica.org
Introduction
• The competitiveness of a company depends on how fast it can take decisions regarding its business mission
– Today’s market requires that data also be processed for its semantics, so as to gain an extra competitive advantage in the business
– In Business Intelligence (BI) applications, Information Retrieval (IR) can be a critical function because most BI applications today rely on traditional keyword searching as their primary retrieval mechanism
• We propose to use BI techniques to manage and reuse all data stored in an ontology-based Data Warehouse (DW):
– Data level: the data involved in the DW aggregation process are the result of an IR indexing process
– OLAP structures: multidimensional OLAP structures are derived automatically from the ontology structures involved in the IR process
Research Question
How can Business Intelligence and Information Retrieval techniques be merged to improve the decision support process?
Summary
• Semantic indexing process and system overview
• Dynamic dimension definition process
• Integration of the indexed data process
• Case study
The semantic indexing process
[Diagram labels: unstructured docs; ontologies; terms set]
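As a rough illustration of what a semantic indexing step might look like, the sketch below matches a hypothetical term set for each ontology concept against document text. The `ONTOLOGY_TERMS` dictionary, the `index_documents` function, and the sample CVs are all assumptions made for this example; the slides do not describe the actual matching algorithm.

```python
import re
from collections import defaultdict

# Hypothetical ontology: concept id -> terms that signal the concept.
ONTOLOGY_TERMS = {
    "programming": {"java", "python"},
    "database": {"sql", "oracle"},
}


def index_documents(docs):
    """Return {doc_id: set of concept ids whose terms occur in the text}."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Lowercase and split into simple word tokens.
        tokens = set(re.findall(r"[a-z0-9+#]+", text.lower()))
        for concept, terms in ONTOLOGY_TERMS.items():
            if tokens & terms:
                index[doc_id].add(concept)
    return dict(index)


# Hypothetical unstructured documents (CV snippets).
docs = {
    "cv1": "Five years of Java and SQL development.",
    "cv2": "Python developer.",
    "cv3": "Sales manager.",
}
doc_index = index_documents(docs)
```

Documents that match no concept (here `cv3`) simply do not appear in the resulting index.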
The overall system
[Diagram labels: semantic dictionary; structured data; index; custom ETL process; data warehouse and datamarts (dm1, dm2, dm3); additional integrable data; presentation layer]
Summary
• Semantic indexing process and system overview
• Dynamic dimension definition process
• Integration of the indexed data process
• Case study
Dynamic dimension definition
[Diagram: unbalanced ontology tree; dynamic computation of the tables below]

dimension table
concept_id  description  parent
1           one
2           two          1

bridge table
parent  subsidiary  distance
1       1           0
1       2           1

fact table
concept_id  other_dim_id  measure
1           …             …
…           …             …
• We suggest defining ad-hoc ontologies that describe the scope of the analysis, so that dimensional structures can be built dynamically and joined to the ones previously obtained
– Through the Computer Expert Ontology, the analyst can analyze all the information content stored inside the unstructured documents
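One way to compute a bridge table of (parent, subsidiary, distance) rows from a parent-child dimension is to walk from every concept up to the root. The sketch below is a minimal illustration under that assumption, not the authors' implementation; `build_bridge` and concepts 3 and 4 are hypothetical additions that make the tree unbalanced.

```python
def build_bridge(parent_of):
    """parent_of maps concept_id -> parent concept_id (None for the root).

    Returns sorted (parent, subsidiary, distance) rows covering every
    ancestor/descendant pair, including each concept paired with itself
    at distance 0.
    """
    rows = []
    for concept in parent_of:
        ancestor, distance = concept, 0
        # Climb towards the root, emitting one row per ancestor.
        while ancestor is not None:
            rows.append((ancestor, concept, distance))
            ancestor = parent_of[ancestor]
            distance += 1
    return sorted(rows)


# Concepts 1 and 2 follow the slide; 3 and 4 are hypothetical and make the
# tree unbalanced (leaf 3 sits at depth 2, leaf 4 at depth 1).
dimension = {1: None, 2: 1, 3: 2, 4: 1}
bridge = build_bridge(dimension)
```

Because each concept is processed independently of its depth, the same loop handles balanced and unbalanced trees alike.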
Summary
• Semantic indexing process and system overview
• Dynamic dimension definition process
• Integration of the indexed data
• Case study
Integration of the indexed data
-- Note: index and column are reserved words in many SQL dialects and may need quoting.

-- Step 1: link each index entry to the operational fact it refers to
SELECT *
FROM index
JOIN op_fact_table ON index.op_fact_id = op_fact_table.id;

-- Step 2: add a standard dimension
SELECT *
FROM index
JOIN op_fact_table ON index.op_fact_id = op_fact_table.id
JOIN std_dimension_dt ON op_fact_table.column = std_dimension_dt.column;

-- Step 3: add the ontology-derived dimension
SELECT *
FROM index
JOIN op_fact_table ON index.op_fact_id = op_fact_table.id
JOIN std_dimension_dt ON op_fact_table.column = std_dimension_dt.column
JOIN ontology_dt ON index.concept = ontology_dt.concept;

-- Step 4: aggregate the measure by both dimensions
SELECT ontology_dt.ontology_id, std_dimension_dt.std_dimension_id, SUM(op_fact_table.measure)
FROM index
JOIN op_fact_table ON index.op_fact_id = op_fact_table.id
JOIN std_dimension_dt ON op_fact_table.column = std_dimension_dt.column
JOIN ontology_dt ON index.concept = ontology_dt.concept
GROUP BY std_dimension_dt.std_dimension_id, ontology_dt.ontology_id;
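To see the final aggregate query run end to end, the sketch below loads a few made-up rows into an in-memory SQLite database. The column layouts, the `dept_code` join column (standing in for the slide's generic `column`), and all data values are assumptions for illustration; the table name `index` is quoted because it is a keyword in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE op_fact_table (id INTEGER PRIMARY KEY, dept_code TEXT, measure INTEGER);
CREATE TABLE std_dimension_dt (std_dimension_id INTEGER PRIMARY KEY, dept_code TEXT);
CREATE TABLE ontology_dt (ontology_id INTEGER PRIMARY KEY, concept TEXT);
CREATE TABLE "index" (op_fact_id INTEGER, concept TEXT);

-- Hypothetical sample data
INSERT INTO op_fact_table VALUES (1, 'HR', 10), (2, 'HR', 5), (3, 'IT', 7);
INSERT INTO std_dimension_dt VALUES (100, 'HR'), (200, 'IT');
INSERT INTO ontology_dt VALUES (1, 'java'), (2, 'sql');
INSERT INTO "index" VALUES (1, 'java'), (2, 'java'), (3, 'java'), (3, 'sql');
""")

# Same shape as the last query on the slide, with explicit ordering added
# so the result is deterministic.
rows = conn.execute("""
SELECT ontology_dt.ontology_id,
       std_dimension_dt.std_dimension_id,
       SUM(op_fact_table.measure)
FROM "index"
JOIN op_fact_table ON "index".op_fact_id = op_fact_table.id
JOIN std_dimension_dt ON op_fact_table.dept_code = std_dimension_dt.dept_code
JOIN ontology_dt ON "index".concept = ontology_dt.concept
GROUP BY std_dimension_dt.std_dimension_id, ontology_dt.ontology_id
ORDER BY 1, 2
""").fetchall()

for row in rows:
    print(row)
```

In this sample, fact 3 is indexed under two concepts, so its measure contributes to both the `java` and the `sql` groups.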
Summary
• Semantic indexing process and system overview
• Dynamic dimension definition process
• Integration of the indexed data process
• Case study
The Star Schema
The resulting schema produced by our system is composed of one dimension and one fact
• The dimension computerexpert_dt represents the ontology tree
• The fact experience_ft represents the number of CVs that contain the concept
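A star schema of this shape can be rolled up along the ontology tree. The sketch below is one possible illustration in SQLite: the column names, the concept descriptions, the `cv_count` values, and the use of a recursive CTE to derive the bridge rows on the fly are all assumptions, not details taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE computerexpert_dt (concept_id INTEGER PRIMARY KEY, description TEXT, parent INTEGER);
CREATE TABLE experience_ft (concept_id INTEGER, cv_count INTEGER);

-- Hypothetical ontology tree and CV counts
INSERT INTO computerexpert_dt VALUES
  (1, 'computer expert', NULL),
  (2, 'programming', 1),
  (3, 'java', 2),
  (4, 'database', 1);
INSERT INTO experience_ft VALUES (2, 1), (3, 4), (4, 2);
""")

# The CTE pairs every concept with itself and with each of its ancestors;
# the outer query then sums the counts of all concepts under each ancestor.
rollup = conn.execute("""
WITH RECURSIVE bridge(parent, subsidiary, distance) AS (
  SELECT concept_id, concept_id, 0 FROM computerexpert_dt
  UNION ALL
  SELECT d.parent, b.subsidiary, b.distance + 1
  FROM bridge AS b
  JOIN computerexpert_dt AS d ON d.concept_id = b.parent
  WHERE d.parent IS NOT NULL
)
SELECT d.description, SUM(f.cv_count)
FROM computerexpert_dt AS d
JOIN bridge AS b ON b.parent = d.concept_id
JOIN experience_ft AS f ON f.concept_id = b.subsidiary
GROUP BY d.concept_id, d.description
ORDER BY d.concept_id
""").fetchall()

for description, total in rollup:
    print(description, total)
```

Summing per-concept counts in this way can count a CV more than once when it mentions several concepts in the same subtree, so the totals are an upper bound on distinct CVs.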
Pivot Table
With this solution, the system helps a Human Resources manager analyze the enterprise CV repository
Conclusions and Future Work
• Our process allows ontologies defined in the dictionaries of semantic search engines to be reused as OLAP dimensions, providing a stable way to integrate indexed data in a semantic context
• To this end, a two-step process was implemented so that the overall system is independent of the complexity of the ontologies
• Our future work will focus on resolving problems related to the management of many-to-many relationships
• A first evaluation was carried out in an industrial context, yielding positive results on the viability of our proposal
Thank you all for your attention
Questions?