Leveraging Big Data, Linked (Open) Data and (Multilingual) Language Technology Standards in Global Enterprises

Prof. Dr. Jörg Schütz

bioloom group

MultilingualWeb-LT Workshop June 11-13, 2012 Dublin, Ireland

Today's Agenda

Overview

• Application scenario

• Needs and Requirements

Challenges

• Data, Processes and Workflows

• Tools, Curation and Management

Solutions

• Standards

• Existing Gaps

• Proprietary Solution

• Future(s)

MLW-LT Workshop, June 11-13, 2012 -- Dublin, Ireland – © bioloom group -- js

Enterprise-related Data

RDB Data (internal)

Multiple Data Streams (external)

Data and Meaning

• Semantics

• Metadata

• Storage / Tools

• Vocabularies

• Processing

Needs and Requirements

Social Media

• Identify

• Extract

• Analyze

• Categorize

• Channel

Core Language

• Assure Quality

• Curate Rules, Styles, Meta and Vocabularies

• Monitor and Optimize

Corporate Languages

• Assure Quality

• Curate Rules, Styles, Meta, TMs and Vocabularies

• Monitor and Optimize

Dynamic Streams

Multiple Languages

Entailed Knowledge

Relate to Curated Repositories

Data, Processes and Workflows

Multilingual Streams

Process Language Item (shown three times in the diagram)

• Meta LS, Relate LS, Combine and Merge LS

• Meta TS, Relate TS, Combine and Merge TS

• Meta TS, Relate TS, Combine and Merge TS
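
The Meta, Relate, and Combine and Merge steps named on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the `LanguageItem` class, the `CURATED_TERMS` vocabulary, the function names, and the sample stream are all hypothetical and do not come from the talk, and relating by plain term lookup merely stands in for whatever linking method is actually used.

```python
from dataclasses import dataclass, field

# Hypothetical curated vocabulary: surface term -> concept identifier.
CURATED_TERMS = {
    "battery": "concept:battery",
    "akku": "concept:battery",
    "charger": "concept:charger",
}


@dataclass
class LanguageItem:
    """One item taken from a multilingual stream (illustrative structure)."""
    text: str
    lang: str
    source: str
    meta: dict = field(default_factory=dict)
    concepts: set = field(default_factory=set)


def add_meta(item):
    """'Meta' step: attach simple descriptive metadata to the item."""
    item.meta["tokens"] = len(item.text.split())
    item.meta["stream"] = item.source
    return item


def relate(item):
    """'Relate' step: link the item to curated concepts by term lookup."""
    for token in item.text.lower().split():
        concept = CURATED_TERMS.get(token.strip(".,!?"))
        if concept:
            item.concepts.add(concept)
    return item


def combine_and_merge(items):
    """'Combine and Merge' step: group items of all languages by concept."""
    merged = {}
    for item in items:
        for concept in item.concepts:
            merged.setdefault(concept, []).append((item.lang, item.text))
    return merged


stream = [
    LanguageItem("The battery drains fast.", "en", "forum"),
    LanguageItem("Der Akku ist schnell leer.", "de", "twitter"),
    LanguageItem("Nice weather today", "en", "twitter"),
]
merged = combine_and_merge(relate(add_meta(item)) for item in stream)
```

In this sketch the English and German items end up under the same concept, while the unrelated item is dropped from the merged view because it matches nothing in the curated vocabulary.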

Tools, Curation and Management

Corporate Communication

• Relational Database Systems

• Content Management

• BPM (BPMN, EPC, …)

• IT Governance and Compliance

• …

Language Technologies

• Terminology Database Systems

• Translation Memories

• Checking Tools

• Crawler, Parser, …, Machine Translation

• …

Combining Standards

• Processes

• Workflows

• Monitoring

• Communication

• Language Services

• Asset Sharing

• Semantics

• Provenance

• Statistics

• Container, ML Data

• Guidance, Navigation

• Transport

XLIFF

ITS

RDF

OWL

BPMN

EPC

TMX

TBX
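
As one possible illustration of combining two of the standards listed here, the sketch below reads translation units from a small TMX fragment and re-expresses them as RDF-style triples. It uses only the Python standard library and plain tuples rather than a real RDF toolkit. The sample TMX content, the `urn:example:tm/` base, and the `ex:hasVariant` predicate are assumptions made up for this example, not part of TMX, RDF, or the talk.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical TMX sample with a single translation unit.
TMX_SAMPLE = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0"
          segtype="sentence" o-tmf="none" adminlang="en"
          srclang="en" datatype="plaintext"/>
  <body>
    <tu tuid="tu1">
      <tuv xml:lang="en"><seg>Battery</seg></tuv>
      <tuv xml:lang="de"><seg>Akku</seg></tuv>
    </tu>
  </body>
</tmx>"""


def tmx_to_triples(tmx_text, base="urn:example:tm/"):
    """Turn each <tuv> of each <tu> into a (subject, predicate, object) tuple.

    The object is a (text, language) pair, loosely mirroring an RDF
    language-tagged literal. The predicate name is an assumption.
    """
    root = ET.fromstring(tmx_text)
    triples = []
    for tu in root.iter("tu"):
        subject = base + tu.get("tuid")
        for tuv in tu.iter("tuv"):
            lang = tuv.get(XML_LANG)
            seg = tuv.findtext("seg")
            triples.append((subject, "ex:hasVariant", (seg, lang)))
    return triples


triples = tmx_to_triples(TMX_SAMPLE)
```

Each translation unit becomes one subject with a language-tagged variant per `<tuv>`, which is roughly the shape a triple store would need in order to relate translation memory content to other enterprise data.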

Existing Gaps …

Too complex, not intuitive, not fitting, …

Standards (partly) immature, too leaky, too flexible, too error-prone, too non-existent, …

General lack of interoperability

How everything fits together

Data Processing Modeling & Metadata

Visualization

Extend / Refit or Mash up?

LT

• Pragmatic evolution necessary to avoid existing gaps – mash up

BP

• Examine W3C PROV Working Drafts

Services

• Deploy REST to keep it simple

Enterprise Data Web

RDB Data (internal)

Multiple Data Streams (external)

Linked Enterprise Data

• RDF / RDFS

• Provenance

• Triple Stores

• Vocabularies

• SKOS, OWL

• SPARQL, Inferencing
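
To make the step from internal relational data to linked enterprise data more concrete, here is a minimal sketch that lifts rows from a relational table into triples, records a provenance label alongside each one, and answers a simple pattern query. It deliberately stays within the Python standard library. The tuples and the `match` helper only stand in for a real triple store and SPARQL, and the table, the `ex:` prefix, and the `rdb:internal` source label are hypothetical.

```python
import sqlite3


def rows_to_triples(conn, table, key, source):
    """Lift each row of `table` into (subject, predicate, object, source).

    `table` and `key` are interpolated into the SQL text, so they must be
    trusted identifiers, never user input. `source` is a provenance label.
    """
    cur = conn.execute(f"SELECT * FROM {table} ORDER BY {key}")
    columns = [d[0] for d in cur.description]
    triples = []
    for row in cur:
        record = dict(zip(columns, row))
        subject = f"ex:{table}/{record[key]}"
        for column, value in record.items():
            if column != key and value is not None:
                triples.append((subject, f"ex:{column}", value, source))
    return triples


def match(triples, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Hypothetical internal database with one small table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, lang TEXT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [(1, "Battery", "en"), (2, "Akku", "de")],
)

triples = rows_to_triples(conn, "product", "id", source="rdb:internal")
german = match(triples, p="ex:lang", o="de")
```

Carrying the source label on every triple is one simple way to keep provenance available once internal and external data are merged; a production setup would more likely use named graphs or a dedicated provenance vocabulary.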

Recommendations

KISS Principle

Re-use and Combine

Show how it works

Enterprise Data Web

RDB Data (internal)

Multiple Data Streams (external)

Linked Enterprise Data

• RDF / RDFS

• Provenance

• Triple Stores

• Vocabularies

• SKOS, OWL

• SPARQL, Inferencing

World Wide Web

Thanks for listening!

Additional Info: [email protected]