applying ontology-informed lattice reduction using the … · 2019-07-18 · applying...

The University of Manchester Research Applying Ontology-Informed Lattice Reduction Using the Discrimination Power Index to Financial Domain DOI: 10.1007/978-3-030-19037-8_11 Document Version Accepted author manuscript Link to publication record in Manchester Research Explorer Citation for published version (APA): Quboa, Q., Mehandjiev, N., & Behnaz, A. (2019). Applying Ontology-Informed Lattice Reduction Using the Discrimination Power Index to Financial Domain. In N. Mehandjiev, & B. Saadouni (Eds.), Enterprise Applications, Markets and Services in the Finance Industry (Vol. 345, pp. 165-179). Springer Nature. https://doi.org/10.1007/978-3-030-19037-8_11 Published in: Enterprise Applications, Markets and Services in the Finance Industry Citing this paper Please note that where the full-text provided on Manchester Research Explorer is the Author Accepted Manuscript or Proof version this may differ from the final Published version. If citing, it is advised that you check and use the publisher's definitive version. General rights Copyright and moral rights for the publications made accessible in the Research Explorer are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. Takedown policy If you believe that this document breaches copyright please refer to the University of Manchester’s Takedown Procedures [http://man.ac.uk/04Y6Bo] or contact [email protected] providing relevant details, so we can investigate your claim. Download date:02. Oct. 2020



Applying Ontology-informed Lattice Reduction Using the Discrimination Power Index to Financial Domain

Qudamah Quboa1, Nikolay Mehandjiev2 and Ali Behnaz3

1 Alliance Manchester Business School, University of Manchester, UK

2 School of Computer Science and Engineering, University of New South Wales, Australia

{qudamah.quboa, n.mehandjiev}@manchester.ac.uk,

[email protected]

Abstract. Contemporary financial institutions rely on varied and voluminous data, so they need advanced technologies to provide their customers with the best possible services. Capturing the meaning, or semantics, of data and presenting these semantics in simplified yet relevant models are key challenges to achieving this. Formal Concept Analysis (FCA) automates the analysis of properties and instances of the data, generating a lattice which groups properties and instances into concepts. This lattice can be used as an automatically generated semantic structure describing the domain, yet the complexity and size of the resultant lattice render this technique unusable in most practical cases involving financial data. To tackle this, our Ontology-informed Lattice Reduction approach can guide the reduction of the lattices generated from sampled financial data. We validate the adaptation of the approach to the financial domain through a real-world asset allocation case study, demonstrating that the approach achieves good overall performance and relevant results.

Keywords: FCA, Semantic structures, Lattice reduction, Validation.

1 Introduction

The financial industry has been undergoing immense changes and disruption in the last decade. Indeed, one can argue that no force has influenced the industry more than abundant data and cheap computational power. Modern financial institutions are focused on excellence through advanced information technologies. To provide state-of-the-art customer service, financial institutions collect vast amounts of data, including news data, financial market data, sentiment data and data from other exogenous factors. However, the heterogeneous nature of this data makes it difficult for players in the financial industry to obtain real benefits from it. One key obstacle in leveraging data assets is the difficulty of capturing the meaning (semantics) of the data. Therefore, knowledge acquisition is a key step in the process and its automation will be an enabler. Automating knowledge acquisition involves [1] processing voluminous data [2], understanding the meaning of this data and its relations [3] and presenting the outcomes in relevant yet simplified models [4].


For practical use, manual semantic tagging of data is a very expensive task, and many researchers are working to automate it. As an example, most data in financial markets has high volume and needs to be processed at high speed to provide value for its users. Formal Concept Analysis (FCA) is one technique for attaching semantics to data. FCA takes a matrix of incidence relationships between sampled data properties (intent) and their object instances (extent), named a formal context, and then builds a lattice of partial order relations between the two sets (instance and property sets). One major issue is that the lattices produced by FCA are too complex and noisy to be used for practical semantic analysis of real-world datasets. To solve this issue, the lattices need to be reduced, and existing approaches to achieve that are based on mathematical measurements of relevancy [4]. They are agnostic about any prior knowledge regarding the targeted domain, even when it is already formalised and represented in an ontology or semantic structure.

Inspired by the similarities between ontology-based semantic representations and FCA representations, different approaches have been proposed that combine the two, such as ontology modelling and attribute exploration [5,6] and merging different ontologies [7,8]. However, the use of existing domain knowledge (represented as an ontology) to support lattice reduction has not been explored until now. In our research [9], we proposed the Ontology-informed Lattice Reduction approach, which addresses the attaching of semantics to further instances in the domain through FCA using an existing domain ontology.

The approach uses prior domain knowledge (encoded in a semantic format, an ontology) to classify and guide the reduction process of a sampled formal context where not all instances are in the ontology. In addition, the approach relies on a new relevancy metric called the Discrimination Power Index (DPI), which is used to automatically classify any new instances based on the shared instances and the overall power of a property within the formal context.

In this paper, we present a more detailed analysis and testing of the proposed approach [9] to evaluate and validate its performance, especially when facing the problem of large lattices generated when applying FCA to real-world data. A real-world financial case study is presented to confirm the feasibility and the validity of the approach.

The remainder of this paper is structured as follows: Section 2 briefly describes the background of the work. Section 3 briefly summarises the proposed approach and its stages. Section 4 introduces a real-world case study in financial markets to evaluate the adaptation of the approach. Section 5 then discusses the results and the statistical analysis of applying the approach, and Section 6 concludes the paper.

2 Literature Review

2.1 Analytics in Financial Domain

As data is becoming more abundant, organizations are looking for ways to acquire

actionable insights from their data, and hence make better decisions, achieve value and

competitiveness. Domain experts and analytics units at organizations have access to

sophisticated analytics solutions which serve their users’ ways of conducting analytics.


For instance, a bank would be interested in forecasting price moves in a particular asset. There are thousands of assets (instances) to be analysed, and for each asset hundreds of properties (intents) can be input into a predictive model to forecast the changes in the respective asset [10, 11].

A typical analytics problem in the finance domain involves heterogeneous, complicated datasets which need to be understood easily and acted upon. As a result, knowledge acquisition and knowledge representation are key elements in enhancing big data analytics. Semantic web technology and ontologies present a solution to capture knowledge in a domain [12].

2.2 Semantic Web Ontology

Ontology is a well-known knowledge representation method, widely supported in both academia and industry in terms of available software and tools. Ontologies are defined as “explicit formal specifications of the terms in the domain and relations among them” [12]. Different general and specialist ontologies in various domains have been developed by experts to capture knowledge and pass information in a standardised way.

An ontology is mainly designed to define a set of data, its structure and its constraints for use by other applications, and is commonly utilised as a data sharing mechanism between various programs or software agents. For instance, FIBO (the Financial Industry Business Ontology) is an “industry initiative to define financial industry terms, definitions and synonyms using semantic web principles” [13].

2.3 Formal Concept Analysis (FCA)

Formal Concept Analysis (FCA) is a mathematical formalism to automatically analyse

the structure of a domain of interest [2]. It generates a lattice of concepts representing

incidence relations between sets of observed properties (intent) and their object

instances (extent) in a target domain. The generated lattice is constituted by formal

concepts produced from mapping these relationships onto a knowledge structure that

reflects the specialisation and generalisation among the concepts of resultant lattice [2].
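To make the notion of a formal concept concrete, the following minimal Python sketch enumerates all (extent, intent) pairs of a very small formal context. The context shown is a hypothetical toy example invented for illustration (it is not the data of Table 1), the function names are our own, and the brute-force enumeration is only meant to illustrate the definition, not to stand in for the algorithms used by FCA tools.

```python
from itertools import combinations

# Hypothetical toy formal context (illustrative only):
# each object instance maps to the set of properties it has.
context = {
    "swimming": {"in water", "individual sport"},
    "water polo": {"in water", "collective sport", "using ball"},
    "football": {"on land", "collective sport", "using ball"},
}


def common_properties(objects, context, all_properties):
    """Derivation operator: properties shared by every object in `objects`."""
    shared = set(all_properties)
    for g in objects:
        shared &= context[g]
    return shared


def objects_having(properties, context):
    """Derivation operator: objects that have every property in `properties`."""
    return {g for g, props in context.items() if properties <= props}


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of objects.

    Exponential in the number of objects, so only suitable for tiny examples.
    """
    all_properties = set().union(*context.values())
    objects = sorted(context)
    concepts = set()
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            intent = common_properties(subset, context, all_properties)
            extent = objects_having(intent, context)
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


concepts = formal_concepts(context)
for extent, intent in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Ordering the resulting pairs by inclusion of their extents gives the subconcept-superconcept relation that the lattice edges represent.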

“Sports and their attributes” [14] (presented in Table 1) is an example of a formal context (K). In this example, the extent set (G) is {Run, Gymnastics, Triathlon, Football, Tennis, Baseball, Curling, Diving, Rowing}, the intent set (M) is {on land, on ice, in water, collective sport, individual sport, using ball, needs opponent, multiple disciplines} and the sign X identifies a relation between an instance of G and a property of M (together these relations form the incidence relation I).


Table 1. “Sports and their attributes” formal context (taken from [14]).

on land | on ice | in water | collective sport | individual sport | using ball | needs opponent | multiple disciplines

Run X X
Gymnastics X X X
Triathlon X X X X
Football X X X X
Tennis X X X X
Baseball X X X X
Curling X X
Diving X X
Rowing X X

By processing the relationships between properties and instances of this example, the corresponding concept lattice is formed, as shown in Fig. 1, where each node represents a formal concept and each connecting edge represents a subconcept-superconcept relationship.

Fig. 1. Concept Explorer-generated FCA lattice of “Sports and their attributes” example [14].

2.4 Existing Lattice Reduction Techniques

A bottleneck problem with the FCA mechanism is the huge size of lattices created from real-world data sets because of noise and exceptions [2, 15]. Various methods have been proposed to reduce the lattice using the structure of the lattice itself. These are categorised into redundant information removal, simplification or selection [4].

The focus of this work is the last category, selection reduction, which covers any approach that emphasises selecting specific properties, instances and/or concepts based on different measurements of relevancy (a set of constraints that must be satisfied). The selection can depend on various measurements, such as logic (according to a user's attribute priorities [16]), weight (frequent weighted concept reduction [17]), or hierarchies (using hierarchically ordered attributes [18]). This kind of reduction is performed after completing the construction of the formal context [4].

All these approaches mainly rely on the lattice structure and are agnostic about any prior knowledge about the domain, which makes the results more vulnerable to systemic noise in the data.

2.5 Similarity Measurements

Three similarity measurements are used to align the two different formal contexts

(ontology-derived and sampled) and integrate them. The first two are:

Jaccard Similarity Coefficient Index. This well-known similarity measurement [19]

is based on the following formula:

$$\text{Jaccard Index } S_{Jac} = \frac{|B_1 \cap B_2|}{|B_1 \cup B_2|} \tag{1}$$

Hamming Distance Index. This is also a well-known similarity measurement [20] and

is based on the following formula:

$$D_{hamming} = b + c \tag{2}$$

It is worth mentioning that these two measurements weight all properties as equally important, whilst our observations reveal that specific properties could have a higher discrimination power than others.

We thus introduced a new complementary index called Discrimination Power

Index (DPI) [9] to identify a unique most-similar concept. It enhances the similarity

selection process in picking one of the possible concepts (pre-filtered from the previous

indices) based on their properties’ overall discrimination power. This is formally

defined as [9]:

$$DPI = \frac{|\{\forall g \in G \mid b \in B_1 \cap B_2 : (g, b) \in I\}|}{|G|} \tag{3}$$

The theoretical explanations of each of the mentioned equations are covered in [9].
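As an illustration of Equations (1) to (3), the following minimal Python sketch computes the three indices on a small hypothetical formal context. Several points are assumptions rather than definitions taken from [9]: the function names are ours; b and c in Equation (2) are read as the numbers of properties present in only one of the two intents; and Equation (3) is read as the share of objects in G that have at least one property from the shared set B1 ∩ B2 (a stricter reading that requires all shared properties would replace the intersection test with a subset test). The ETF names and property assignments are invented for the example.

```python
def jaccard_index(b1, b2):
    """Eq. (1): |B1 ∩ B2| / |B1 ∪ B2| for two property sets (intents)."""
    union = b1 | b2
    # Assumption: two empty intents are treated as identical.
    return len(b1 & b2) / len(union) if union else 1.0


def hamming_distance(b1, b2):
    """Eq. (2): b + c, assuming b and c count properties found in only one intent."""
    b = len(b1 - b2)
    c = len(b2 - b1)
    return b + c


def discrimination_power_index(b1, b2, context):
    """Eq. (3) under one reading: share of objects having a shared property."""
    shared = b1 & b2
    hits = [g for g, props in context.items() if props & shared]
    return len(hits) / len(context)


# Hypothetical sampled formal context: object -> set of properties.
context = {
    "etf_a": {"actively managed", "currency hedged"},
    "etf_b": {"actively managed"},
    "etf_c": {"invests in swaps"},
    "etf_d": {"currency hedged", "invests in swaps"},
}
b1 = {"actively managed", "currency hedged"}
b2 = {"actively managed", "invests in swaps"}

print(jaccard_index(b1, b2))
print(hamming_distance(b1, b2))
print(discrimination_power_index(b1, b2, context))
```

In this toy case the two intents share one of three distinct properties, differ in two properties, and the shared property occurs in two of the four objects.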

3 Ontology-informed Lattice Reduction Approach

Our approach presented in [9] uses existing knowledge about the target domain (encoded in a semantic ontology format) to support reducing the FCA-generated lattice when extracting structure from a sampled formal context (data). The approach begins by extracting and transforming all required information into acceptable formats. It then recognises object instances that exist in both the formal context and the ontology-derived context and aligns concepts using these identified instances. This is followed by automatically structuring the rest of the object instances from the formal context, using (a) a basic alignment routine that relies on the properties they have in common with the shared instances and (b) an advanced alignment routine that is based on the similarity measurements and the discrimination power of the properties. At the end, the resultant extended structure is reduced based on the user’s reduction threshold. The general outline of the approach is shown in Fig. 2.

Fig. 2. The outline of the Ontology-informed Lattice Reduction approach [adapted from 9].

In the next subsections, a summary of each stage of the proposed approach (presented

in Fig. 2) is provided.

3.1 Data Extraction

During this stage, Protégé, a well-known ontology editor and knowledge acquisition tool, is used to establish a deep understanding of the domain ontology and to help construct the constraints of the retrieval queries.

To retrieve the required features and their instances from any Semantic Web dataset, it is necessary to use a semantic query language. SPARQL, recommended by the W3C, is used as the protocol and Semantic Web query language to perform the querying via pattern matching [21].

Lastly, the ARQ engine, a query engine for Jena (a Java framework for building Semantic Web applications) that supports the standard SPARQL query language, is used to execute the SPARQL queries, extract the results, and save them into a temporary comma-separated values (CSV) file that is used in the next stage.

3.2 Data Transformation

The main purpose of this stage is to reformat the information extracted from the ontology source into a formal context (in a tabulated format), to transform the hierarchical structure of the extracted data based on the ontology source itself, and to store the results in a matrix format for ease of access.

The algorithms developed for this stage support multiple hierarchical levels of ontology structure and any number of instances. This stage works as a preprocessing stage that completes all the preparations required to align the different datasets and reduce the results later on.
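The transformation stage can be pictured with the following minimal Python sketch, which turns extracted CSV rows into a binary incidence matrix. It is an illustration under stated assumptions rather than the algorithms of [9]: the CSV column names (`instance`, `class`, `subclass`, `superclass`), the sample values and the function names are hypothetical, each class is assumed to have a single superclass, the hierarchy is assumed to be acyclic, and an instance is assumed to inherit every ancestor class of its direct class as a property.

```python
import csv
import io

# Hypothetical extraction output; column names and values are illustrative only.
INSTANCE_CSV = """instance,class
ETF_A,EquityFund
ETF_B,BondFund
"""
HIERARCHY_CSV = """subclass,superclass
EquityFund,Fund
BondFund,Fund
Fund,Asset
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def ancestors(cls, parent_of):
    """Follow subclass -> superclass links up to the root (acyclic, single parent)."""
    chain = []
    while cls in parent_of:
        cls = parent_of[cls]
        chain.append(cls)
    return chain


def to_formal_context(instance_rows, hierarchy_rows):
    """Build (objects, properties, binary matrix) from the extracted rows."""
    parent_of = {r["subclass"]: r["superclass"] for r in hierarchy_rows}
    incidence = {}
    for row in instance_rows:
        classes = [row["class"]] + ancestors(row["class"], parent_of)
        incidence.setdefault(row["instance"], set()).update(classes)
    objects = sorted(incidence)
    properties = sorted(set().union(*incidence.values()))
    matrix = [[1 if p in incidence[g] else 0 for p in properties] for g in objects]
    return objects, properties, matrix


objects, properties, matrix = to_formal_context(
    read_rows(INSTANCE_CSV), read_rows(HIERARCHY_CSV)
)
print(properties)
for g, row in zip(objects, matrix):
    print(g, row)
```

Keeping the hierarchy as explicit ancestor columns means that later stages can work on the matrix alone without going back to the ontology source.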

3.3 Data Alignment

Initially, a basic matching routine is applied based on instances existing in both the ontology and the sampled formal context. This is followed by a more complex classification routine for any instance that does not exist in the ontology and for which none of the classified instances shares the same properties. This is achieved by assigning any unknown concept from the sampled formal context to one of the existing ontology concepts using a combination of different similarity indices to evaluate and align concepts from both contexts based on their intents and extents.

This advanced alignment routine starts with a Jaccard similarity coefficient index

followed by optional use of a Hamming distance, and then the proposed DPI [9] (again

optional). The outcome of this stage is an extended formal context incorporating both

the ontology-derived knowledge and the sampled formal context.
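The cascade of indices described above can be sketched in Python as follows. This is a hedged illustration, not the routine of [9]: the function names, the toy concepts and the toy context are hypothetical, and the tie-breaking rules are assumptions (keep the candidates with the highest Jaccard index, then the lowest Hamming distance, then the highest DPI, with DPI read as the share of objects having at least one shared property).

```python
def jaccard(b1, b2):
    union = b1 | b2
    return len(b1 & b2) / len(union) if union else 1.0


def hamming(b1, b2):
    # Number of properties present in exactly one of the two intents.
    return len(b1 ^ b2)


def dpi(b1, b2, context):
    # Assumed reading: share of objects having at least one shared property.
    shared = b1 & b2
    return sum(1 for props in context.values() if props & shared) / len(context)


def keep_best(candidates, score, prefer_high=True):
    """Keep only the candidates with the best score (ties are all kept)."""
    scores = {name: score(name) for name in candidates}
    best = max(scores.values()) if prefer_high else min(scores.values())
    return [name for name in candidates if scores[name] == best]


def align(unknown_intent, ontology_intents, context):
    """Return the ontology concept(s) most similar to an unknown intent."""
    candidates = sorted(ontology_intents)
    candidates = keep_best(
        candidates, lambda c: jaccard(unknown_intent, ontology_intents[c])
    )
    if len(candidates) > 1:  # optional Hamming step, used only on ties
        candidates = keep_best(
            candidates,
            lambda c: hamming(unknown_intent, ontology_intents[c]),
            prefer_high=False,
        )
    if len(candidates) > 1:  # optional DPI step, used only on remaining ties
        candidates = keep_best(
            candidates, lambda c: dpi(unknown_intent, ontology_intents[c], context)
        )
    return candidates


# Hypothetical ontology-derived intents and sampled context.
ontology_intents = {"A": {"p1"}, "B": {"p1", "p2", "p3", "p4"}, "C": {"p2"}}
context = {"g1": {"p1"}, "g2": {"p1", "p3"}, "g3": {"p1", "p2"}, "g4": {"p4"}}

print(align({"p1", "p2"}, ontology_intents, context))
```

In this toy case all three concepts tie on the Jaccard index (0.5), the Hamming step discards B, and the DPI step separates A from C because the property shared with A occurs in more objects of the context. Whether a higher or lower DPI should win is not stated in this summary, so the direction used here should be checked against [9].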

3.4 Data Reduction

Two indices are proposed in [9] to provide an indication whether a property is essential

to a specific concept from the ontology or not:

RAindex is the first reduction index, which evaluates the weight of every property in the sampled formal context with regard to each concept extracted from the domain ontology.

RBindex is a complementary index that indicates the concepts that need reduction, relying on both the property and the ontology’s concept in making the call.

Depending on the outcomes of both the RAindex and RBindex indices, the reduction function is executed when required. The purpose of this function is to eliminate the incidence relations between the assessed property related to the used ontology concept and the object instances.

The theoretical explanations and justifications of each of the mentioned indices and the reduction function are covered in [9]. This stage depends mainly on the sampled formal context and the extracted ontology information and hierarchical structure (prepared in the data extraction and data transformation stages) to work out which of the incidence relationships need to be removed from the extended formal context (resulting from the data alignment stage).

4 Financial Assets Allocation Case Study (Ontology and Formal Context of Exchange Traded Funds)

In this case study, we use semantic technology to capture and represent the knowledge related to asset allocation. Asset allocation intends to distribute investment among different financial assets so as to achieve a certain investment strategy [22]. The knowledge is extracted and represented in an ontology including a list of Exchange Traded Funds (ETF¹). The goal in this case is to design and build an automated financial advice system which helps people invest and manage their funds at a fraction of the cost of human financial advice. As part of asset allocation, we explore, analyse and select a number of assets (in this case ETFs). We have obtained the data for these ETFs from Bloomberg. In doing so, we have adopted Bloomberg terminology for assets, their properties and respective categories. The sample consists of 87 ETFs (instances) and 42 properties presented in the Asset Selection ontology (see Fig. 3). Table 2 provides simple statistics for both the ontology and the sampled formal context, and Table 3 presents the data description of the properties of the sampled financial formal context.

Table 2. Basic statistical analysis of the financial sampled formal context and its ontology.

| | No of Instances | No of Properties (Classes) | No of Concepts | No of Levels / Height | No of Edges |
|---|---|---|---|---|---|
| Sampled Formal Context | 87 | 36 | 546 | 9 | 1722 |
| Ontology | 87 | 42 | 44 | 4 | 74 |

Table 3. The data description of the sampled financial formal context.

| Properties | Possible Values |
|---|---|
| Closed for new creations | {Yes, No} |
| Leverage / Short | {Yes, No} |
| Invests in Derivatives | {Yes, No} |
| Invests in Swaps | {Yes, No} |
| Invests in Physical Commodities | {Yes, No} |
| Actively Managed | {Yes, No} |
| Currency Hedged | {Yes, No} |
| Index Replication Strategy | {Full, Optimized, Not Applicable, Derivative} |
| Index Weighting Methodology | {Market Cap, Not Applicable, Single Asset, Multi Factor, Fundamentals, Dividend, Proprietary, Equal} |
| Rebalancing Frequency | {Quarterly, Not Applicable, Yearly, Semi-Annually, Other, Monthly} |
| Creation / Redemption | {In-kind, In-kind/Cash, Not Applicable, Cash} |
| Dividend Frequency | {Quarter, Semi-Anl, Annual, None, Monthly, Irreg} |
| Risk | {Low, High} |

¹ Exchange Traded Funds (ETFs) are baskets of other assets that are designed to track the performance of an index.


Fig. 3. Asset Selection Ontology.

5 Analysis and Discussion

To validate the application of the approach in this financial context, different validation measurements are carried out to ensure that each stage generates the expected outcome with the right results.

Relying on Protégé, the SPARQL query is constructed, and the results retrieved (using the ARQ engine) contain all the main classes and sub-classes and their hierarchical structure levels within the domain ontology. The extracted information is then compared with the ontology itself to confirm the correctness of the outcome. This includes matching the number of extracted instances (objects) and their class-subclass properties.

Moving to the second phase, Data Transformation, the validation here is achieved

by comparing the outcome of the previous stage with the transformed results and

making sure that all the data are transformed and correctly aligned.

During the Data Alignment stage, two measurements are used to evaluate and validate the process: (1) testing the functionality of the basic alignment routine alone, and (2) testing the efficiency of the alignment process with the advanced alignment routine (the Jaccard index, Hamming distance, and DPI).

For the first test, the experiment starts by using only known instances that exist in

both the financial sampled formal context and the financial ontology. Then the

validation of the alignment routine is confirmed by comparing the outcome of this stage

(Extended financial formal context) with the original sources and ensuring that all the

instances are classified accurately as they should be.

For the second evaluation, 50% of the ontology instances are removed to create unknown instances in the sampled financial formal context, while the other 50% are kept to construct the training part for the advanced alignment routine to work. The resulting classification outcome is then evaluated and compared with the original one to confirm the performance of the alignment. Table 4 presents a summary of the results. It is worth mentioning that the accuracy of the alignment for the top level of the ontology is 100%, and the reason for the lower accuracy for the sublevel concepts is the large number of possible property combinations that do not exist in the training dataset.

Table 4. Alignment Validation Results (50% Training - 50% Testing).

| Semantics Attachment Alignment Accuracy | No of Instances | Overall Percentage |
|---|---|---|
| 100% | 54 | 62% |
| 95% | 1 | 1% |
| 93% | 10 | 11% |
| 90% | 22 | 25% |
| Below 90% | 0 | 0% |

During the last stage of applying the approach, the performance of the reduction function is evaluated using two different extended formal contexts: (a) the one extended using the basic alignment routine and (b) the one extended using both alignment routines (50% Training - 50% Testing). The performance of the function is evaluated at various reduction thresholds (presented in Table 5 and Table 6 respectively).

Table 5. The reduction performance using the fully classified formal context (the extended formal context) based on the basic alignment routine only.

| Reduction Threshold | No of the concepts of the Formal Context | Actual Reduction % | Total No of Edges | No of Levels / Height | Total No of Concepts with the Ontology's Concepts | No of Ontology Concepts | No of overlapped concepts |
|---|---|---|---|---|---|---|---|
| 90% | 24 | 95.6% | 40 | 5 | 59 | 44 | 9 |
| 80% | 24 | 95.6% | 40 | 5 | 59 | 44 | 9 |
| 70% | 24 | 95.6% | 40 | 4 | 61 | 44 | 7 |
| 60% | 31 | 94.3% | 54 | 5 | 85 | 44 | 10 |
| 50% | 97 | 82.2% | 248 | 8 | 203 | 44 | 62 |
| 40% | 148 | 72.9% | 414 | 8 | 293 | 44 | 101 |
| 30% | 269 | 50.7% | 849 | 9 | 487 | 44 | 174 |
| 20% | 324 | 40.7% | 1030 | 9 | 598 | 44 | 230 |
| 10% | 477 | 12.6% | 1519 | 9 | 801 | 44 | 280 |
| 0% | 546 | 0.0% | 1722 | 9 | 878 | 44 | 288 |


Table 6. The reduction performance using the extended formal context based on the basic and advanced alignment routines (50% Training - 50% Testing).

| Reduction Threshold | No of the concepts of the Formal Context | Actual Reduction % | Total No of Edges | No of Levels / Height | Total No of Concepts with the Ontology's Concepts | No of Ontology Concepts | No of overlapped concepts |
|---|---|---|---|---|---|---|---|
| 90% | 24 | 95.6% | 39 | 4 | 57 | 44 | 11 |
| 80% | 24 | 95.6% | 39 | 4 | 59 | 44 | 9 |
| 70% | 30 | 94.5% | 48 | 5 | 69 | 44 | 5 |
| 60% | 43 | 92.1% | 76 | 6 | 92 | 44 | 5 |
| 50% | 121 | 77.8% | 315 | 8 | 214 | 44 | 49 |
| 40% | 185 | 66.1% | 533 | 8 | 303 | 44 | 74 |
| 30% | 313 | 42.7% | 984 | 9 | 477 | 44 | 120 |
| 20% | 334 | 38.8% | 1058 | 9 | 520 | 44 | 142 |
| 10% | 486 | 11.0% | 1546 | 9 | 682 | 44 | 152 |
| 0% | 546 | 0.0% | 1722 | 9 | 790 | 44 | 200 |

Figs. 4 and 5 illustrate the generated lattice of the reduced sampled formal context at a reduction threshold of 60% for the two alignment experiments respectively, without the semantic attachments.

It can be seen that the performance of the approach using the advanced alignment routine (training and testing datasets) is very similar to that obtained with the actual known classification, which reconfirms the efficiency of the alignment stage.

In addition, the reduction function works as expected and performs well in reducing the sampled data and presenting the results as a simplified (and relevant) financial lattice, even when the knowledge base only provides partial coverage of the domain of interest.

It is worth mentioning that (1) the reduction threshold represents the minimum ratio needed to pass the RBindex without being flagged as an unnecessary concept, and (2) the actual reduction is different from the reduction threshold, and its value varies depending on the incidence relationships of the actual formal context and its semantic extension.
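The difference between the reduction threshold and the actual reduction can be checked numerically. The short Python sketch below assumes that the "Actual Reduction %" column is the share of concepts removed relative to the 546 concepts of the unreduced sampled formal context; this formula is inferred from the reported values rather than stated in the text, and the function name is ours. The rows are copied from Table 5.

```python
def actual_reduction(remaining_concepts, original_concepts):
    """Percentage of concepts removed relative to the unreduced lattice."""
    return round(100 * (1 - remaining_concepts / original_concepts), 1)


ORIGINAL_CONCEPTS = 546  # concepts of the unreduced sampled formal context (Table 2)

# (reduction threshold, remaining concepts, reported actual reduction %) from Table 5
table5_rows = [
    ("90%", 24, 95.6),
    ("80%", 24, 95.6),
    ("70%", 24, 95.6),
    ("60%", 31, 94.3),
    ("50%", 97, 82.2),
    ("40%", 148, 72.9),
    ("30%", 269, 50.7),
    ("20%", 324, 40.7),
    ("10%", 477, 12.6),
    ("0%", 546, 0.0),
]

for threshold, remaining, reported in table5_rows:
    computed = actual_reduction(remaining, ORIGINAL_CONCEPTS)
    print(threshold, remaining, computed, reported)
```

Under this assumption the computed values match the reported column, which shows that thresholds of 70% to 90% all leave the same 24 concepts even though the threshold itself changes.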


Fig. 4. The actual reduced sampled formal context at 60% reduction threshold (without the ontology attachments).


Fig. 5. The actual reduced sampled formal context at 60% reduction threshold (without the ontology attachments) based on 50% Training and 50% Testing.


6 Conclusion

The financial institutions gather vast amount of data from various resources including

financial market data and news data. However, to gain an advantage as a player in the

market and obtain real benefits, the understanding of the meaning (semantics) of the

data and the presentations of outcomes in simplified and relevant models are the key

obstacles that need to be solved. Formal Concept Analysis (FCA) helps with the

semantics by generating a lattice that comprises partial order relations between sets of

properties (intent) and their instances (extent) in a domain that maps onto a semantic

structure. The problem is the resultant lattice is too complex and noisy.

Using existing domain knowledge to inform and reduce a formal context (that is

taken by FCA) is an opportunity that is being utilised in this work to simplify the

resultant lattice and presents relevant models.

In this research, the Ontology-informed Lattice Reduction approach is applied to the financial domain. The approach relies on an existing ontology to inform and reduce the financial sampled formal context based on different alignment and reduction measurements.

We specifically apply the approach to the asset allocation problem in financial markets and assess the feasibility and validity of its different stages, as well as its performance on this real-world case study. The achieved results pass all the testing measurements, producing a simplified, yet relevant, result that could be used in practice.

In future work, we will (1) extend the reduction approach to include reasoning over the ontology’s description logics (DLs) to increase the accuracy of the reduction constraints, and (2) continue the work on the semantic attachments and add a “loopback” feature as a new extension that permits the utilisation of the approach outcomes to enrich the existing ontology.
