Post on 13-Aug-2015

The Design and Scaling of a Linguistic Data Warehouse

Marco Liberati

Supervisor: Prof. Paolo Atzeni
NYU Tutor: Prof. Dennis Shasha

Outline

• Goal of the project
• Prototype
• Terraling
• New Functionalities
• New Facilities
• Conclusions

Linguistic Explorer: Goal

• Data from 150+ languages
• Cross-linguistic Analysis
  • Example: Analysis by Property Combinations
• Data by Expert Authors
• Analysis Accessible for All

Prototype: Syntactic Structure of World Languages (SSWL)

• Data model tailored to one app
• Hard to maintain
• Performance does not scale

Terraling

• Designed for multiple linguistic applications
• Enhanced functionality
• High performance
• Developed with TDD & BDD

Terraling: Data warehouse structure

[Diagram: Lings, Props, and Language Property Value, linked with "*" and "1" cardinalities]

• Languages can have children (Ling depth)
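
The structure on this slide can be pictured with a small sketch. It is only an illustration under stated assumptions: the class and field names (`Ling`, `Prop`, `LingsProperty`, `parent`, `depth`) are hypothetical, a root language is assumed to have depth 0, and the parent link between Italian and Bellinzonese is made up for the example. Python is used purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ling:
    """A language; `parent` is how a ling can have children (hypothetical name)."""
    name: str
    parent: Optional["Ling"] = None

    @property
    def depth(self) -> int:
        # Assumption: a root ling has depth 0, each nesting level adds 1.
        return 0 if self.parent is None else self.parent.depth + 1


@dataclass(frozen=True)
class Prop:
    """A linguistic property, e.g. "Adj. Noun"."""
    name: str


@dataclass
class LingsProperty:
    """One fact in the warehouse: a language has a value for a property."""
    ling: Ling
    prop: Prop
    value: str


italian = Ling("Italian")
# Hypothetical parent/child link, used only to show a non-zero depth.
bellinzonese = Ling("Bellinzonese", parent=italian)
fact = LingsProperty(bellinzonese, Prop("Adj. Noun"), "No")
```

Keeping each value as its own language/property/value record is what lets the same store serve several analyses without a schema tailored to one application.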

Architecture of Terraling

[Diagram labels: Props, Lings, Keywords; Intersection Filter, Value Pair Filter, Keyword Filter; ResultMapper Builder; Regular, Cross, Compare, Implication, Clustering]

• Web Page
• CSV file
• Image file

New Modular Structure

• New Analysis? Add it as a module!
• Dynamically use the chosen Strategy

[Diagram labels: Your new functionality; Created a ResultMapper Builder; Regular, Analysis #1, Analysis #2]
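
The "add it as a module, pick the strategy at run time" idea can be sketched as a small registry. This is not Terraling's code: `ANALYSES`, `analysis`, `build_result` and the `count` module are hypothetical names invented for the illustration, and Python is used purely for illustration.

```python
from typing import Callable, Dict, List

# Registry mapping an analysis name to the function that implements it.
ANALYSES: Dict[str, Callable[[List[dict]], object]] = {}


def analysis(name: str):
    """Decorator that registers a function as a pluggable analysis module."""
    def register(func):
        ANALYSES[name] = func
        return func
    return register


@analysis("regular")
def regular(rows):
    # Regular search: hand back the rows unchanged.
    return rows


@analysis("count")  # hypothetical "new functionality" added as a module
def count(rows):
    return len(rows)


def build_result(strategy: str, rows: List[dict]):
    """Look up the chosen strategy at run time and apply it to the rows."""
    if strategy not in ANALYSES:
        raise ValueError(f"unknown analysis: {strategy}")
    return ANALYSES[strategy](rows)


rows = [{"ling": "Greek", "prop": "Adj. Noun", "value": "Yes"}]
```

Adding an analysis then means writing one decorated function; the dispatch code in `build_result` does not change.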

New Functionalities: Cross

          SSWL           Terraling
Cross     2 properties   Up to 4 properties

Adj. Noun: Yes   Verb. Noun: Yes   Double Neg.: Yes   Adj. Verb: Yes   122
Adj. Noun: Yes   Verb. Noun: No    Double Neg.: No    Adj. Verb: No    45
Adj. Noun: No    Verb. Noun: Yes   Double Neg.: No    Adj. Verb: Yes   13
…                …                 …                  …                …

• Ready for Lings depth
• High performance as the number of combinations grows
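
A cross search of this kind boils down to counting languages per combination of property values. The sketch below assumes that reading; `VALUES` and `cross` are hypothetical names, the data is the four-language example used elsewhere in these slides (two properties only), and skipping languages with a missing value is an assumption. Python is used purely for illustration.

```python
from collections import Counter

# Four-language example from these slides; Hindi has no "Verb. Noun" value.
VALUES = {
    "Greek":        {"Adj. Noun": "Yes", "Verb. Noun": "No"},
    "Italian":      {"Adj. Noun": "No",  "Verb. Noun": "Yes"},
    "Bellinzonese": {"Adj. Noun": "No",  "Verb. Noun": "Yes"},
    "Hindi":        {"Adj. Noun": "Yes"},
}


def cross(values, props):
    """Count how many languages show each combination of values for `props`.

    Assumption: a language missing a value for any chosen property is skipped.
    """
    counts = Counter()
    for row in values.values():
        if all(p in row for p in props):
            counts[tuple(row[p] for p in props)] += 1
    return counts


result = cross(VALUES, ["Adj. Noun", "Verb. Noun"])
```

A single pass over the languages keeps the cost linear in the number of languages, whatever the number of observed combinations.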

New Functionalities: Compare

          SSWL          Terraling
Compare   2 languages   Up to 9 languages!

Property    Greek   Italian   Bellinzonese   Hindi
Adj. Noun   Yes     No        No
…           …       …         …              …

Common Property   Value
Adj. Noun         Yes

• Scales to thousands of properties
• Ready for Lings depth
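
One way to read the Compare search is as a split between properties whose value is shared by every selected language and properties that differ. The sketch below assumes that reading; `VALUES` and `compare` are hypothetical names, the data is the four-language example from these slides, and treating a missing value as "not common" is an assumption. Python is used purely for illustration.

```python
# Four-language example from these slides; Hindi has no "Verb. Noun" value.
VALUES = {
    "Greek":        {"Adj. Noun": "Yes", "Verb. Noun": "No"},
    "Italian":      {"Adj. Noun": "No",  "Verb. Noun": "Yes"},
    "Bellinzonese": {"Adj. Noun": "No",  "Verb. Noun": "Yes"},
    "Hindi":        {"Adj. Noun": "Yes"},
}


def compare(values, lings):
    """Split properties into common (same value everywhere) and differing."""
    props = set()
    for ling in lings:
        props.update(values[ling])
    common, differing = {}, {}
    for prop in sorted(props):
        row = {ling: values[ling].get(prop) for ling in lings}
        distinct = set(row.values())
        # Assumption: a missing value (None) prevents a property being common.
        if len(distinct) == 1 and None not in distinct:
            common[prop] = distinct.pop()
        else:
            differing[prop] = row
    return common, differing


common, differing = compare(VALUES, ["Italian", "Bellinzonese"])
```

Because the function takes a list of languages, nothing in it is specific to comparing exactly two.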

New Functionalities: Universal Implication (Both)

Property     Greek   Italian   Bellinzonese   Hindi
Adj. Noun    Yes     No        No             Yes
Verb. Noun   No      Yes       Yes            -

Property Antecedent   Property Consequent   Languages
Adj. Noun: No         Verb. Noun: Yes       2
Verb. Noun: Yes       Adj. Noun: No         2
Verb. Noun: No        Adj. Noun: Yes        1
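
The implication table above can be reproduced with a simple subset test: an implication holds when every language with the antecedent value also has the consequent value. The sketch assumes that definition; `VALUES` and `implications` are hypothetical names, and counting a language with no consequent value as a counterexample is an assumption (it is what makes "Adj. Noun: Yes" drop out, as in the table). Python is used purely for illustration.

```python
from itertools import permutations

# The four-language example from the table above; Hindi has no "Verb. Noun".
VALUES = {
    "Greek":        {"Adj. Noun": "Yes", "Verb. Noun": "No"},
    "Italian":      {"Adj. Noun": "No",  "Verb. Noun": "Yes"},
    "Bellinzonese": {"Adj. Noun": "No",  "Verb. Noun": "Yes"},
    "Hindi":        {"Adj. Noun": "Yes"},
}


def implications(values):
    """Return (antecedent, consequent, language_count) for each implication."""
    # Index: (property, value) -> set of languages with that value.
    index = {}
    for ling, row in values.items():
        for prop, value in row.items():
            index.setdefault((prop, value), set()).add(ling)
    found = []
    for (ante, ante_lings), (cons, cons_lings) in permutations(index.items(), 2):
        # Different properties, and no language breaks the implication.
        if ante[0] != cons[0] and ante_lings <= cons_lings:
            found.append((ante, cons, len(ante_lings)))
    return found


found = implications(VALUES)
```

Building the index once and comparing sets avoids rescanning every language for every candidate pair.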

New Functionalities

Universal Implication: Performance Comparison

[Chart: running time in seconds (axis from 0 to 250) for SSWL and Terraling, for the Both, Antecedent, Consequent and Double Both implications]

New Functionalities: Similarity Tree

• Brand new Clustering Library, 100% Ruby
• Similarity Tree by Properties
• AJAX to improve server performance

[Figure: Similarity Tree and Similarity Tree (Radial), with leaves Italian, Hurdu, ǂHoan, Greek, English, American Sign Language]
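
A similarity tree by properties can be sketched as hierarchical clustering over a property-based distance. The slides do not say which algorithm the Ruby clustering library uses, so single-linkage agglomerative clustering is swapped in here purely as an example; `VALUES`, `distance` and `similarity_tree` are hypothetical names, and the data is the four-language example from these slides. Python is used purely for illustration.

```python
# Four-language example from these slides; Hindi has no "Verb. Noun" value.
VALUES = {
    "Greek":        {"Adj. Noun": "Yes", "Verb. Noun": "No"},
    "Italian":      {"Adj. Noun": "No",  "Verb. Noun": "Yes"},
    "Bellinzonese": {"Adj. Noun": "No",  "Verb. Noun": "Yes"},
    "Hindi":        {"Adj. Noun": "Yes"},
}


def distance(a, b):
    """Share of properties (valued in both languages) on which they disagree."""
    shared = [p for p in a if p in b]
    if not shared:
        return 1.0  # assumption: nothing to compare means maximally distant
    return sum(a[p] != b[p] for p in shared) / len(shared)


def similarity_tree(values):
    """Single-linkage agglomerative clustering; returns a nested-tuple tree."""
    # Each cluster is (tree, member_names).
    clusters = [(name, [name]) for name in sorted(values)]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(distance(values[x], values[y])
                        for x in clusters[i][1] for y in clusters[j][1])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = ((clusters[i][0], clusters[j][0]),
                  clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]


tree = similarity_tree(VALUES)
```

The nested tuples are enough to drive either a regular or a radial drawing, since both layouts only need the merge order.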

New Facilities: Geomapping

Available for:
• Regular
• Cross
• Compare
• Any implication

Conclusions

High quality Framework
• BDD: 200+ Scenarios
• TDD: 95.55% of code lines tested

Flexible and Modular new Structure

Improved Search Algorithm

www.terraling.com