ben gardner | delivering a linked data warehouse and integrating across the wider enterprise

Delivering a Linked Data warehouse and integrating across the wider enterprise Ben Gardner – Linklaters LLP Semantics September 2016

Upload: semanticsconference

Post on 15-Apr-2017


TRANSCRIPT

Page 1: Ben Gardner | Delivering a Linked Data warehouse and integrating across the wider enterprise

Delivering a Linked Data warehouse and integrating across the wider enterprise

Ben Gardner – Linklaters LLP
Semantics
September 2016

Page 2

Summary

• Information discovery requirements
• What we did
• Linked Data in Action
• Conclusion

Page 4

What we did

Page 6

Linked Data and Model

• Traditional approaches try to identify how the data is to be “captured” upfront.

• You can do this with the linked data model
• But we don’t… Why?
• Always leads to “Paralysis by Analysis”
• You will miss so much.
• And take a huge amount of time doing it.

• You will find a huge amount of information and relationships that you would never have thought of if starting from the model.

• Then there are tricks you can do to add huge value

• The data model evolves very rapidly from the data and can be further tweaked at any time.

Let the data express itself
• Source by source, row by row, let the data tell you what it is describing.
• What it is, what relationships and metadata it has.
• You’ll find a lot more information that you simply couldn’t describe in an RDBMS
• Another source can add to an existing item without you even having to think
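The "row by row" idea above can be sketched in a few lines. This is a minimal illustration, not the tooling used in the project: the source rows, field names, and the `rows_to_triples` helper are all hypothetical, and plain Python tuples stand in for RDF triples.

```python
# Hypothetical source rows; field names and values are invented for illustration.
matter_rows = [{"matter_id": "M1", "name": "Project Alpha", "client_id": "C1"}]
client_rows = [{"client_id": "C1", "name": "Acme Ltd", "sector": "Energy"}]


def rows_to_triples(rows, id_field, type_name, link_fields=None):
    """Turn each row into (subject, predicate, object) triples.

    link_fields maps a field name to the type of thing it refers to;
    every other field becomes a literal attribute of the row's subject.
    """
    link_fields = link_fields or {}
    triples = set()
    for row in rows:
        subject = f"{type_name}/{row[id_field]}"
        triples.add((subject, "type", type_name))
        for field, value in row.items():
            if field == id_field:
                continue
            if field in link_fields:
                # A reference to another thing, not a plain value.
                triples.add((subject, field, f"{link_fields[field]}/{value}"))
            else:
                triples.add((subject, field, value))
    return triples


# Load source by source into one graph; no schema is declared up front.
graph = set()
graph |= rows_to_triples(matter_rows, "matter_id", "Matter", {"client_id": "Client"})
graph |= rows_to_triples(client_rows, "client_id", "Client")


def describe(graph, subject):
    """Return every (predicate, object) pair known about a subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


print(describe(graph, "Client/C1"))
```

The first source only mentions `Client/C1` as the target of a link. The second source then adds facts to that same item simply by using the same identifier, which is the behaviour the last bullet describes.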

Page 7

Linked Data and Model: Individual Model Fragments

[Diagram: individual model fragments. Node labels: Degree, Person, College, Matter, Jurisdiction, Sector, Client, Manager, Partner, Area. Person, Client and Jurisdiction appear in more than one fragment.]

Page 8

Linked Data and Model: Fragments automatically align

[Diagram: the fragments joined into a single model. Node labels: Degree, Matter, Jurisdiction, College, Sector, Person, Client, Manager, Partner, Area.]
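One way to picture the alignment is as a set union over edges, where fragments join wherever they name the same thing. The sketch below assumes that reading. The node labels come from the diagrams, but the fragment names and relationship names are hypothetical, since the slides show only the node labels.

```python
# Each fragment is a set of (thing, relationship, thing) edges.
# Fragment names and relationship names are hypothetical.
fragments = {
    "hr": {
        ("Person", "holds", "Degree"),
        ("Degree", "awardedBy", "College"),
    },
    "matters": {
        ("Matter", "managedBy", "Person"),
        ("Matter", "governedBy", "Jurisdiction"),
        ("Matter", "forClient", "Client"),
    },
    "clients": {
        ("Client", "inSector", "Sector"),
        ("Client", "locatedIn", "Area"),
    },
}


def nodes(edges):
    """Collect every thing mentioned at either end of an edge."""
    return {n for s, _, o in edges for n in (s, o)}


def align(fragments):
    """Merge fragments; identical node names become a single node."""
    return set().union(*fragments.values())


model = align(fragments)

# Things that occur in more than one fragment are the join points.
shared = {
    n for n in nodes(model)
    if sum(n in nodes(f) for f in fragments.values()) > 1
}
print(sorted(shared))
```

No mapping step is written here: the fragments connect because they share node identities, which is the sense in which they "automatically align".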

Page 9

ETL & Linked Data Creation & Management

In4mium Talend modules
• Semantic modules ready to use through configuration in Talend
• No API knowledge required by users
• Range of modules (over 60) for all aspects of linked data creation and management
• Create fully semantic apps
• Or pick and mix with traditional aspects

• Works seamlessly with existing Talend environment and modules

• Model driven behaviours are now possible

• Easily add semantic technologies into existing service architectures

• All the benefits without the hassle

Page 10

OData4Sparql – Simplifying integration

• Brings together the strength of a ubiquitous RESTful interface standard (OData) with the flexibility and federation ability of RDF/SPARQL.

• SPARQL/OData Interop: a proposed W3C interoperation proxy between OData and SPARQL (Kal Ahmed, 2013)

• Opens up many popular user-interface development frameworks and tools such as Kendo UI, SAPUI5, etc.

• Acts as a Janus-point between application development and data-sources.

• User interface developers are not, and do not want to be, database developers. Therefore they want to use a standardized interface that abstracts away the database, even to the extent of what type of database: RDBMS, NoSQL, or RDF/SPARQL

• Providing an OData4SPARQL server opens up any SPARQL data-source to the C#/LINQ development world.

• Opens up many productivity tools, such as Excel/PowerQuery and SharePoint, as consumers of SPARQL data such as DBpedia, ChEMBL, ChEBI, BioPAX and any of the Linked Open Data endpoints!

• Microsoft has been joined by IBM and SAP in using OData as their primary interface method, which means there will be many application developers familiar with OData as the means to communicate with a backend data source.
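To make the proxy idea concrete, here is a toy translation of an OData-style request into a SPARQL query. It is a sketch under stated assumptions and not how OData4Sparql itself works: it handles only `$select` and `$top`, the namespace is invented, and it assumes the entity set name and property names map one to one onto class and predicate names.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical namespace for the illustration; not taken from the slides.
NS = "http://example.org/model#"


def odata_to_sparql(url: str) -> str:
    """Translate a tiny subset of OData ($select, $top) into a SPARQL SELECT."""
    parts = urlsplit(url)
    # Assumption: the last path segment names the class being queried.
    entity_set = parts.path.rstrip("/").split("/")[-1]
    options = parse_qs(parts.query)
    props = [p.strip() for p in options.get("$select", [""])[0].split(",") if p.strip()]
    top = options.get("$top", [None])[0]

    variables = " ".join(f"?{p}" for p in props)
    lines = [
        f"PREFIX m: <{NS}>",
        f"SELECT ?s {variables}".rstrip(),
        "WHERE {",
        f"  ?s a m:{entity_set} .",
    ]
    for p in props:
        # OPTIONAL keeps things that lack a value for this property.
        lines.append(f"  OPTIONAL {{ ?s m:{p} ?{p} . }}")
    lines.append("}")
    if top is not None:
        lines.append(f"LIMIT {int(top)}")
    return "\n".join(lines)


sparql = odata_to_sparql("http://example.org/odata/Matter?$select=name,openedOn&$top=10")
print(sparql)
```

The UI developer only ever sees the REST-style URL. The graph pattern, prefixes and `LIMIT` clause stay behind the proxy, which is the abstraction the bullets above argue for.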

Page 11

Model Driven UI

[Screenshots: Linklaters Data Model and Northwind Data Model, each annotated with ‘Things’, ‘Relationships between Things’ and ‘Sample Query’.]

Page 12

Demo of Linked Data in action

Page 13

Strings to Things to Facts

Clicking on a ‘thing’ displays a ‘Lens’ about that ‘thing’, made up of different fragments that display facts about the thing.

The ‘About’ fragment shows the most relevant information. Compare with the Google Knowledge Graph.

The ‘Person Involved’ fragment lists all persons involved with the matter.

The ‘Financial Summary’ fragment calculates a financial summary.

… and we can find associated deal ‘things’. If we want more details about any ‘thing’ we can now navigate to its ‘lens’.
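The lens-and-fragment structure described above can be modelled as a set of small functions, each producing one view of the facts about a thing. This is only an illustrative sketch: the facts, predicate names and amounts are invented, and the real application's fragment definitions are not shown in the slides.

```python
# Hypothetical facts about one matter; names and amounts are invented.
facts = [
    ("Matter/M1", "name", "Project Alpha"),
    ("Matter/M1", "status", "Open"),
    ("Matter/M1", "personInvolved", "Person/P1"),
    ("Matter/M1", "personInvolved", "Person/P2"),
    ("Matter/M1", "billed", 1200.0),
    ("Matter/M1", "billed", 800.0),
]


def about(thing, facts):
    """Headline facts only."""
    return {p: o for s, p, o in facts if s == thing and p in ("name", "status")}


def persons_involved(thing, facts):
    """Links to other things, each of which has its own lens."""
    return sorted(o for s, p, o in facts if s == thing and p == "personInvolved")


def financial_summary(thing, facts):
    """A fragment can compute values rather than just list them."""
    amounts = [o for s, p, o in facts if s == thing and p == "billed"]
    return {"entries": len(amounts), "total_billed": sum(amounts)}


# A lens is an ordered collection of named fragments.
LENS = {
    "About": about,
    "Person Involved": persons_involved,
    "Financial Summary": financial_summary,
}


def render_lens(thing, facts):
    return {title: fragment(thing, facts) for title, fragment in LENS.items()}


lens = render_lens("Matter/M1", facts)
print(lens)
```

Because the ‘Person Involved’ fragment returns identifiers of other things, following one of them is just another call to `render_lens`, which mirrors the navigation from lens to lens.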

Page 14

Lens Discovery

Navigating through ‘Gerald Grant’, the managing partner for the Matter, takes us to his Lens.

Navigating through the associated deal takes us to that deal’s Lens.

Or show the Lens on the client of the matter.

One is not limited to facts within the application. In the case of a client we can navigate to their Companies House page (or it could have been D&B, LinkDocs, etc.).

Page 15

Composing Questions

Advanced Searches can be selected from the list, which then displays a query in a different format that allows better control over the search.

The advanced search allows conditions to be added that link to other ‘things’ or limit the values of ‘facts’ about the associated ‘thing’. This allows much more precise searches to be executed.
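The two kinds of condition described above, linking to another ‘thing’ and limiting the value of a ‘fact’, can be sketched as path-based filters over a small in-memory graph. Everything here is hypothetical (the data, the property names and the `search` helper); the actual application presumably issues queries against the triple store instead.

```python
import operator

# Hypothetical data: things keyed by id, with facts and links to other things.
things = {
    "Matter/M1": {"type": "Matter", "value": 5_000_000, "client": "Client/C1"},
    "Matter/M2": {"type": "Matter", "value": 250_000, "client": "Client/C2"},
    "Client/C1": {"type": "Client", "sector": "Energy"},
    "Client/C2": {"type": "Client", "sector": "Retail"},
}

OPS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}


def resolve(thing_id, path):
    """Follow a path of property names, hopping to linked things as needed."""
    value = thing_id
    for prop in path:
        record = things.get(value)
        if record is None or prop not in record:
            return None
        value = record[prop]
    return value


def matches(thing_id, condition):
    path, op, expected = condition
    actual = resolve(thing_id, path)
    return actual is not None and OPS[op](actual, expected)


def search(type_name, conditions):
    """Return ids of things of the given type that satisfy every condition."""
    return sorted(
        thing_id
        for thing_id, record in things.items()
        if record["type"] == type_name
        and all(matches(thing_id, c) for c in conditions)
    )


# One condition limits a fact on the matter; the other reaches through
# the link to the client and limits a fact on that associated thing.
hits = search("Matter", [
    (["value"], "gt", 1_000_000),
    (["client", "sector"], "eq", "Energy"),
])
print(hits)
```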

Page 16

OData integration with Excel Power Query/Pivot

[Diagram labels: OData, OData4Sparql, Power Query, Power Pivot, Power View, Power Map, Pivots, Charts & Grids, Tableau, etc.]

Power Query: Data Grabber/Shaper
• Build queries and utilise expand to traverse graph
• Limited data transformation can be incorporated into the queries
• Create multiple views

Power Pivot: Self Service BI
• Integrate across Power Queries and other sources to build ROLAP models
• Explore model with Pivot tables
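The "utilise expand to traverse graph" point refers to the standard OData `$expand` query option, which pulls related entities into the same response. A minimal sketch of building such a feed URL follows. The service root, entity set and property names are hypothetical, and the helper is an illustration rather than part of any tool named on the slide.

```python
from urllib.parse import urlencode

# Hypothetical service root; entity and property names below are also invented.
SERVICE_ROOT = "http://example.org/odata"


def odata_url(entity_set, select=(), expand=(), top=None):
    """Build an OData feed URL using the $select, $expand and $top options."""
    options = {}
    if select:
        options["$select"] = ",".join(select)
    if expand:
        # $expand follows a relationship to a related thing: one hop in the graph.
        options["$expand"] = ",".join(expand)
    if top is not None:
        options["$top"] = str(top)
    # Keep "$" and "," readable instead of percent-encoding them.
    query = urlencode(options, safe="$,")
    return f"{SERVICE_ROOT}/{entity_set}" + (f"?{query}" if query else "")


url = odata_url("Matter", select=["name", "value"], expand=["client"], top=100)
print(url)
```

A URL of this shape is the kind of address an OData-aware client, such as the Power Query OData connector, can be pointed at, so each "view" in the workbook corresponds to a different combination of options.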

Page 17

Conclusion

Page 18

Linked Data has delivered

• Elimination of silos through creation of a logical data warehouse that is extensible across internal and external data sources

• Enabled “find and explore” information seeking behaviours

• Separation of data modelling from integration provides for easy addition of internal & external data

• Ability to support diverse range of specialised domain views onto data

• Introduces a Service Orientated Data Architecture simplifying application development

• Based on W3C web standards, providing future proofing and protection of the firm’s IP (data models)

Page 19

Building a Linked Data Warehouse pilot

[Architecture diagram labels: Matter, Time, People, Financials, DealFinder, Client Book, Client, Engage, K_Docs, SAP; ETL Platform; RDF Management; Triple Store; Model; Sparql; OData4Sparql; OData; UI.]

One FTE (2x0.5) and nine months delivered:
• Integrated 3 years and 9 months of data from 9 sources
• 24 million triples
• 62 Things (People, Matters, Clients, etc.)
• 127 Relationships between Things
• 223 Data attributes

Page 20

Questions?