(in)formal concept analysis

Post on 08-May-2015


DESCRIPTION

An informal and intuitive explanation of formal concept analysis

TRANSCRIPT

Lecture Notes : (In)Formal concept analysis 30/03/2009

(In)Formal Concept Analysis

Prof. Kim Mens

Louvain School of Engineering, Department of Computing Science and Engineering

UCL

http://www.info.ucl.ac.be/~km

Lecture Notes: (In)Formal concept analysis, 30/03/2009. Prof. Kim Mens, UCL, Belgium

Information explosion

IT advances in the last decade(s) have caused an explosion of information

E.g., growth of the internet

This leads to a real information overload

How to manage (i.e., search, structure) all that information?


(Small) example

Dataset = someone’s iTunes™ music library

≥ 5000 songs each having a name, artist, rating, genre, ...

How to manage all that data?

How to find a song we like?

Can we find interesting relations between songs?

which songs are similar?

in what way are they similar?


Managing large data sets

Given a data set with many thousands of elements:

web pages, text or other documents

data libraries (books, songs, movies, ...)

customer and personnel databases

having certain properties:

indexes, relevant keywords, tags, genres, ...

In general ...


Questions

1. How to find relevant data?

2. How to discover (hidden) structure in that data?


Running example (revisited)

[Figure: table of songs and their genres]


Running example

How to manage all those songs?

Three concrete applications

1. Finding a song based on its genre

2. Discover (un)expected dependencies between genres

• as well as absence of expected dependencies

3. Discover a user profile

• e.g., what songs does she like most



A Google-like search engine for songs: Galois

Genres (separated by spaces): party dance

Search results [ party, dance ]:

• Technologic – Daft Punk
• Whole Again – Atomic Kitten
• Get Busy – Sean Paul
• Destination Calabria – Alex Gaudino
• Rock This Party – Bob Sinclar

Refine search by genres:

• [ slow, pop, soft ]
• [ beat ]

Remove genres from search:

• party
• dance


A Google-like search engine for songs: Galois

Genres (separated by spaces): party dance beat

Search results [ party, dance, beat ]:

• Technologic – Daft Punk
• Get Busy – Sean Paul
• Destination Calabria – Alex Gaudino
• Rock This Party – Bob Sinclar

Refine search by genres:

• [ electronic ]
• [ reggae ]

Remove genres from search:

• party
• dance
• beat


A Google-like search engine for songs: Galois

Genres (separated by spaces): party dance beat reggae

Search results [ party, dance, beat, reggae ]:

• Get Busy – Sean Paul

Remove genres from search:

• party
• dance
• beat
• reggae


A Google-like search engine for songs: Galois

Genres (separated by spaces): party reggae

Search results [ party, reggae ]:

• Could You Be Loved – Bob Marley

Refine search by genres:

• [ dance, beat ]

Remove genres from search:

• party
• reggae


Structure of the world-wide music scene

http://sixdegrees.hu/last.fm/index.html


Dependencies between genres

New wave is so eighties

Dance music is party music

Disco is from the seventies

Classical music and slows are for softies

...


Discover a user profile

To analyse the preferred genres of a user

for match-making or publicity purposes

For example,

most of her music is party music

she likes background music

she’s not such a big fan of classical

none of her music is hard


So how can we achieve all this?


Formal concept analysis...

... may be of help

FCA was invented around 1980 in Darmstadt as a mathematical theory for modelling the notion of a “concept”

Since then it has been applied in many domains of computer science dealing with large data sets

data analysis

knowledge discovery

software engineering


Data set is represented by a “context”

[Figure: a context table showing objects, attributes, and the relation between them]


Formal concept analysis...

Starts from a context C

a set G of objects

a set M of attributes

a relation I between the objects and the attributes

Determines concepts

Maximal groups of objects and attributes

Plus hierarchical relationships

Subset relationships between those groups
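As a minimal sketch of these definitions, a context can be written down directly in Python. The song-to-genre tags below are illustrative assumptions loosely based on the running example, and the helper names `common_attributes` and `common_objects` are hypothetical, not taken from any FCA tool.

```python
# Toy context loosely based on the running example; the tags are assumptions.
SONG_GENRES = {
    "Alice": {"new wave", "party", "eighties"},
    "A Forest": {"new wave", "party", "eighties"},
    "Technologic": {"party", "electronic", "dance", "beat"},
    "In Da Club": {"party", "dance", "beat"},
    "Get Busy": {"party", "dance", "beat", "reggae"},
}

G = set(SONG_GENRES)                    # objects: the songs
M = set().union(*SONG_GENRES.values())  # attributes: the genres
I = {(g, m) for g, tags in SONG_GENRES.items() for m in tags}  # relation


def common_attributes(objects):
    """Attributes shared by every object in `objects` (all of M if empty)."""
    return {m for m in M if all((g, m) in I for g in objects)}


def common_objects(attributes):
    """Objects that have every attribute in `attributes` (all of G if empty)."""
    return {g for g in G if all((g, m) in I for m in attributes)}
```

For instance, `common_objects({"dance"})` collects every song tagged as dance, and `common_attributes` applied to that result gives the genres those songs all share. These two mappings are the building blocks for finding concepts.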


A “concept” represents a group of related objects and attributes

Intuitively, we look for maximal “rectangles” in the binary relation I


A concept

Objects: Alice – Sisters of Mercy; A Forest – The Cure

Attributes: New Wave, Party, Eighties

A concept is a maximal group of objects and attributes

Group:

Every object of the concept has those attributes

Every attribute of the concept holds for those objects

Maximal

No other object (outside the concept) has those same attributes

No other attribute (outside the concept) is shared by these objects
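The "group" and "maximal" conditions can be checked mechanically. The sketch below assumes a small hypothetical context (the genre tags are illustrative) and uses made-up helper names `intent`, `extent` and `is_concept`: a pair is a concept exactly when the attributes are all the attributes shared by the objects, and the objects are all the objects having those attributes.

```python
# Toy context; the tags are assumptions for illustration only.
CONTEXT = {
    "Alice": {"new wave", "party", "eighties"},
    "A Forest": {"new wave", "party", "eighties"},
    "Technologic": {"party", "electronic", "dance", "beat"},
    "In Da Club": {"party", "dance", "beat"},
}


def intent(objects):
    """All attributes shared by every object in `objects`."""
    shared = set().union(*CONTEXT.values())
    for g in objects:
        shared &= CONTEXT[g]
    return shared


def extent(attributes):
    """All objects that have every attribute in `attributes`."""
    return {g for g, tags in CONTEXT.items() if set(attributes) <= tags}


def is_concept(objects, attributes):
    """True if (objects, attributes) is a maximal group in the context."""
    return intent(objects) == set(attributes) and extent(attributes) == set(objects)
```

With this data, the pair ({Alice, A Forest}, {new wave, party, eighties}) passes the check, while dropping "A Forest" from the objects or "eighties" from the attributes makes it fail because the group is no longer maximal.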


Not a concept

Need to include this

Need to include this as well

Intuitively, we look for maximal “rectangles” in the binary relation I


Formal concept analysis...

... derives hierarchies of concepts from data sets

It generates and visualizes hierarchies of concepts on a mathematically founded basis


A concept hierarchy


Yet another concept


A subconcept

The blue concept is a subconcept of the green one.


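A subconcept

Concept: objects { Technologic, In Da Club, Get Busy, Destination Calabria, Rock This Party }, attributes { Party, Dance, Beat }

Subconcept: objects { Technologic, Destination Calabria, Rock This Party }, attributes { Party, Electronic, Dance, Beat }

The objects of the subconcept are a subset of the objects of the concept, and the attributes of the concept are a subset of the attributes of the subconcept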
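The two subset relations can be tested directly on the pair of concepts shown on this slide. This is only a sketch: the function name `is_subconcept` and the representation of a concept as an (objects, attributes) pair of frozensets are choices made for illustration.

```python
def is_subconcept(sub, sup):
    """True if `sub` has fewer (or equal) objects and more (or equal) attributes."""
    sub_objects, sub_attributes = sub
    sup_objects, sup_attributes = sup
    return sub_objects <= sup_objects and sub_attributes >= sup_attributes


# The two concepts from the slide, as (objects, attributes) pairs.
green = (
    frozenset({"Technologic", "In Da Club", "Get Busy",
               "Destination Calabria", "Rock This Party"}),
    frozenset({"Party", "Dance", "Beat"}),
)
blue = (
    frozenset({"Technologic", "Destination Calabria", "Rock This Party"}),
    frozenset({"Party", "Electronic", "Dance", "Beat"}),
)
```

For genuine concepts of one context, either subset condition implies the other; the sketch checks both so that it also rejects malformed input.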


Concept lattice

For a given context, the set of all formal concepts, together with the partial order “is subconcept of”, forms a lattice

A lattice is a mathematical structure with some interesting properties:

for any two concepts there is always a greatest common subconcept and a least common superconcept

it is even a complete lattice, i.e. every set of concepts has a greatest common subconcept and a least common superconcept; in particular, a unique top element (the superconcept of all concepts) and a unique bottom element (the subconcept of all concepts) exist
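To make the lattice tangible, the sketch below enumerates all concepts of a tiny context by brute force and computes the greatest common subconcept (`meet`) and least common superconcept (`join`) of two concepts. The context data and all function names are assumptions for illustration; the enumeration tries every attribute subset, so it is only suitable for toy examples and is not one of the efficient algorithms used by real FCA tools.

```python
from itertools import combinations

# Toy context; the tags are assumptions for illustration only.
CONTEXT = {
    "Alice": {"new wave", "party", "eighties"},
    "Technologic": {"party", "electronic", "dance", "beat"},
    "Get Busy": {"party", "dance", "beat", "reggae"},
    "Could You Be Loved": {"party", "reggae"},
}
ATTRIBUTES = set().union(*CONTEXT.values())


def extent(attributes):
    return frozenset(g for g, tags in CONTEXT.items() if attributes <= tags)


def intent(objects):
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return frozenset(shared)


def all_concepts():
    """Naive enumeration: close every subset of attributes."""
    concepts = set()
    for k in range(len(ATTRIBUTES) + 1):
        for subset in combinations(sorted(ATTRIBUTES), k):
            objects = extent(set(subset))
            concepts.add((objects, intent(objects)))
    return concepts


def meet(c1, c2):
    """Greatest common subconcept: intersect the object sets."""
    objects = c1[0] & c2[0]
    return (objects, intent(objects))


def join(c1, c2):
    """Least common superconcept: intersect the attribute sets."""
    attributes = c1[1] & c2[1]
    return (extent(attributes), attributes)
```

In this toy context the top concept contains all four songs with the single shared attribute "party", and the bottom concept contains no songs and all attributes.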


A concept lattice


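Concept: objects { Alice – Sisters of Mercy, A Forest – The Cure }, attributes { New Wave, Party, Eighties }

Concept: objects { Technologic, In Da Club, Get Busy, Destination Calabria, Rock This Party }, attributes { Party, Dance, Beat }

Concept: objects { Technologic, Destination Calabria, Rock This Party }, attributes { Party, Electronic, Dance, Beat }, which is a subconcept of the previous one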

Tool support: Concept Explorer

A concept lattice in detail (sparse labelling)


Running example revisited

How does it work?

How to manage all those songs?

Three concrete applications

1. Finding a song based on its genre

2. Discover (un)expected dependencies between genres

• as well as absence of expected dependencies

3. Discover a user profile

• e.g., what songs does she like most
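For the first application, one plausible way to drive the search-and-refine interaction from the concept lattice is sketched below. A query is mapped to the concept whose objects are the search results, and the "refine" suggestions are the extra genres that lead to its direct subconcepts. This is an assumption about how such an engine could work, not a description of how the Galois tool is actually implemented; the genre tags and the function name `search` are likewise illustrative.

```python
# Toy context; the tags are assumptions loosely based on the search slides.
CONTEXT = {
    "Technologic": {"party", "dance", "beat", "electronic"},
    "Whole Again": {"party", "dance", "slow", "pop", "soft"},
    "Get Busy": {"party", "dance", "beat", "reggae"},
    "Destination Calabria": {"party", "dance", "beat", "electronic"},
    "Rock This Party": {"party", "dance", "beat", "electronic"},
    "Could You Be Loved": {"party", "reggae"},
}
ATTRIBUTES = set().union(*CONTEXT.values())


def extent(attributes):
    return {g for g, tags in CONTEXT.items() if attributes <= tags}


def intent(objects):
    shared = set(ATTRIBUTES)
    for g in objects:
        shared &= CONTEXT[g]
    return shared


def search(query):
    """Return (matching songs, refinement suggestions) for a set of genres."""
    results = extent(set(query))
    closed = intent(results)  # every genre the results have in common
    candidates = set()
    for genre in ATTRIBUTES - closed:
        songs = extent(closed | {genre})
        if songs:  # skip genres that would give an empty result
            added = frozenset(intent(songs) - closed)
            candidates.add((added, frozenset(songs)))
    # Keep only the largest refined result sets: the direct subconcepts.
    direct = [c for c in candidates
              if not any(c[1] < other[1] for other in candidates)]
    return results, sorted(sorted(added) for added, _ in direct)
```

With this data, searching for party and dance suggests refining by [beat] or by [pop, slow, soft]. The "remove genres" list needs no lattice at all: it is simply the genres of the current query.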


Implications

New wave is from the eighties

Dance music is party music

Disco is from the seventies

Slows are soft

Classical music is soft


Implications

Slows are soft

Classical music is soft

Disco is from the seventies

Dance music is party music

New wave is from the eighties
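An implication such as "dance music is party music" holds in a context when every object with the premise attributes also has the conclusion attributes. The sketch below checks this on a small hypothetical context; the tags and the names `holds` and `counterexamples` are assumptions for illustration.

```python
# Toy context; the tags are assumptions for illustration only.
CONTEXT = {
    "Alice": {"new wave", "party", "eighties"},
    "A Forest": {"new wave", "party", "eighties"},
    "Technologic": {"party", "dance", "beat", "electronic"},
    "Whole Again": {"party", "dance", "slow", "pop", "soft"},
}


def counterexamples(premise, conclusion):
    """Objects that have all premise attributes but miss some conclusion attribute."""
    return {g for g, tags in CONTEXT.items()
            if premise <= tags and not conclusion <= tags}


def holds(premise, conclusion):
    """True if the implication premise -> conclusion has no counterexample."""
    return not counterexamples(premise, conclusion)
```

An implication found this way only reflects the data at hand: in this toy context "eighties implies new wave" also holds, simply because no other eighties song was included.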


Associations

Most dance music has a beat

Most of her music is party music

A lot of music from the eighties is party music
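Unlike an implication, an association only has to hold for most objects. A common way to quantify "most" is the confidence of the rule: the fraction of objects with the premise attributes that also have the conclusion attributes. The sketch below assumes a hypothetical context and function name; no particular threshold for "most" is implied by the slides.

```python
# Toy context; the tags are assumptions for illustration only.
CONTEXT = {
    "Technologic": {"party", "dance", "beat", "electronic"},
    "Whole Again": {"party", "dance", "slow", "pop", "soft"},
    "Get Busy": {"party", "dance", "beat", "reggae"},
    "Destination Calabria": {"party", "dance", "beat", "electronic"},
    "Rock This Party": {"party", "dance", "beat", "electronic"},
    "Could You Be Loved": {"party", "reggae"},
}


def confidence(premise, conclusion):
    """Fraction of objects with `premise` that also have `conclusion`."""
    with_premise = [tags for tags in CONTEXT.values() if premise <= tags]
    if not with_premise:
        return 0.0  # no supporting objects: report zero rather than divide by zero
    with_both = [tags for tags in with_premise if conclusion <= tags]
    return len(with_both) / len(with_premise)
```

In this data four of the five dance songs have a beat, so "most dance music has a beat" has confidence 0.8, whereas an implication corresponds to confidence 1.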


Concept lattice (with number of objects)

Preferred music is party music

Also likes some background music

Not such a big fan of classical

and so on ...
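The numbers attached to the lattice nodes are object counts, which can be turned into a simple profile. The sketch below computes, for a combination of genres, the share of the library that matches it. The library contents (including the placeholder classical track) and the function name `share` are hypothetical.

```python
# Hypothetical music library; titles and tags are assumptions for illustration.
LIBRARY = {
    "Alice": {"new wave", "party", "eighties"},
    "Technologic": {"party", "dance", "beat", "electronic"},
    "Get Busy": {"party", "dance", "beat", "reggae"},
    "Whole Again": {"party", "dance", "slow", "pop", "soft"},
    "Hypothetical classical track": {"classical", "soft"},
}


def share(genres):
    """Fraction of the library whose songs carry all the given genres."""
    matching = [song for song, tags in LIBRARY.items() if genres <= tags]
    return len(matching) / len(LIBRARY)


# Rank individual genres by how much of the library they cover.
all_genres = set().union(*LIBRARY.values())
profile = sorted(all_genres, key=lambda genre: (-share({genre}), genre))
```

Here `profile[0]` is "party", matching the observation that the preferred music is party music, while `share({"hard"})` is zero because none of the music is tagged as hard.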


Some problems...

Concept lattice can get very dense for large data sets

Concept lattice can grow exponentially in the size of the context

Attributes are not always binary

What if data is incomplete or imprecise?

False positives and negatives

...

(Some solutions have been proposed to overcome these problems)
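The exponential growth can be demonstrated with a small experiment. In the sketch below, `worst_case(n)` builds an artificial context in which object i has every attribute except attribute i; such a context with n objects and n attributes has 2^n concepts. The function names are hypothetical, and the counting is done by brute force over all attribute subsets, which is itself exponential and meant only as an illustration.

```python
from itertools import combinations


def count_concepts(context, attributes):
    """Count concepts by collecting the distinct closed attribute sets."""
    intents = set()
    for k in range(len(attributes) + 1):
        for subset in combinations(sorted(attributes), k):
            closed = set(attributes)
            for tags in context.values():
                if set(subset) <= tags:
                    closed &= tags  # keep only attributes shared by all matches
            intents.add(frozenset(closed))
    return len(intents)


def worst_case(n):
    """Context where object i has every attribute except attribute i."""
    attributes = set(range(n))
    context = {f"object{i}": attributes - {i} for i in range(n)}
    return context, attributes


sizes = [count_concepts(*worst_case(n)) for n in range(1, 6)]
```

Each concept is identified by its attribute set, so counting distinct closed attribute sets counts concepts. For n from 1 to 5 the count doubles at every step.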


Conclusion

FCA is an interesting technique to analyse large data sets

especially to discover interesting concepts, relations and structures in the data

Can be applied to many application domains

Based on a formal mathematical theory

Yet easy to use and understand intuitively

Quality of results depends on size and quality of the data


Sources

B. Ganter, R. Wille: Formal Concept Analysis – Mathematical Foundations. Springer, Heidelberg, 1999

Uta Priss’ Formal Concept Analysis Homepage

http://www.upriss.org.uk/fca/fca.html

Gerd Stumme’s course “Formale Begriffsanalyse”

http://www.kde.cs.uni-kassel.de/lehre/ss2005/formale_begriffsanalyse

Concept Explorer (ConExp)

http://conexp.sourceforge.net/

J. Fallon: Application des treillis de Galois à la recherche d’informations. Master’s thesis, Université catholique de Louvain, Département d’Ingénierie Informatique, 2004
