The state of RDF in Drupal 7

Post on 09-May-2015


Copyright 2008 Digital Enterprise Research Institute. All rights reserved. www.deri.org

Digital Enterprise Research Institute www.deri.ie

scorlosquet@gmail.com

The state of RDF in Drupal 7

DrupalCon Paris 2009

Stéphane “scor” Corlosquet


Presentation outline

• The current web
• The vision of the Semantic Web
• Semantic Web technologies
• Initiatives and projects
  ◦ Data portability
  ◦ Linking Open Data


The current web


Many web applications


Many information silos


* Source: Pidgin Technologies, www.pidgintech.com


Current Web

• web pages
  ◦ 20 billion public pages
  ◦ 900 billion deep web pages
  ◦ 62 links per page
  ◦ = 55 trillion links in the full web
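
The last figure is derived from the three above it. As a rough check, here is a small Python sketch of that arithmetic (the variable names are illustrative). Multiplying out the rounded inputs gives about 57 trillion, a little above the 55 trillion quoted, presumably because the published inputs are themselves rounded.

```python
# Back-of-the-envelope check of the link count quoted on the slide.
public_pages = 20 * 10**9     # 20 billion public pages
deep_pages = 900 * 10**9      # 900 billion deep web pages
links_per_page = 62           # average per page, assumed equal for deep pages

total_links = (public_pages + deep_pages) * links_per_page
print(f"{total_links / 10**12:.2f} trillion links")
```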


http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php


Current Web

• web storage
  ◦ 246 exabytes of data (246 billion GB)
• traffic
  ◦ 8 terabytes / s
  ◦ 2 million emails / s


http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php


Current Web

• mostly text and links

[Screenshot: Google search results for “who is webchick?” (results 1 - 10 of about 31,600), captured 30/08/2007. The results are plain text snippets and links from drupal.org, webchick.net, urlfan.com and webchick.org.]


[…] Machine (one billion from the one billion online PCs) as there transistors in an Itanium chip. The Machine is a super computer where each "transistor" is computer. A very rough estimate of the computing power of this Machine then is that it contains a billion times a billion, or one quintillion (10^18) transistors. Since only the newest servers have a billion processors, the figure is probably an order of magnitude smaller. When we add the transistors for cell phones, handhelds, it calculates out to about 170 quadrillion (10^17) transistors wired into the Machine.

There are about 100 billion neurons in the human brain. Today the Machine has as 5 orders more transistors than you have neurons in your head. And the Machine, unlike your brain, is doubling in power every couple of years at the minimum.

In 2003 alone a total one quintillion transistors were produced, but not all of them are wired up into the Machine. Many transistors made their way into cameras, TVs, GPS units and the like, few of which are currently online. One day they will be. Every chip will eventually connect to the web in some fashion. That would mean we would be adding as many transistors to the Machine in a year as exist right now.

If the Machine has 100 quadrillion transistors, how fast is it running? If we include spam, there are 196 billion emails sent every day. That's 2.2 million per second, or 2 megahertz. Every year 1 trillion text messages are sent. That works out to 31,000 per second, or 31 kilohertz. Each day 14 billion instant messages are sent, at 162 kilohertz. The number of searches runs at 14 kilohertz. Links are clicked at the rate of 520,000 per second, or .5 megahertz.

There are 20 billion visible, searchable web pages and another 900 billion dark, unsearchable, or deep web pages (for instance pages behind passwords or the kind of dynamic page that Amazon will produce when you query it). The average number of links found on each searchable web page is 62. Assuming the same count for dynamic pages that means there's 55 trillion links in the full web. We could think of each link as a synapse -- a potential connection waiting to be made. There is roughly between 100 billion and 100 trillion synapses in the human brain, which puts the Machine in the same neighborhood as our brains.

Source: Kevin Kelly, The Technium, http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php
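
The "hertz" figures in the excerpt are message counts divided by the number of seconds in the period. A minimal Python sketch of that conversion follows; the helper name `per_second` is made up for illustration and the inputs are the rounded counts quoted in the excerpt.

```python
# Convert the quoted message volumes into events per second ("hertz").
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY

def per_second(count, period_seconds):
    """Average number of events per second over the given period."""
    return count / period_seconds

emails_hz = per_second(196 * 10**9, SECONDS_PER_DAY)   # emails per day
texts_hz = per_second(10**12, SECONDS_PER_YEAR)        # text messages per year
ims_hz = per_second(14 * 10**9, SECONDS_PER_DAY)       # instant messages per day

for label, rate in [("emails", emails_hz), ("texts", texts_hz), ("IMs", ims_hz)]:
    print(f"{label}: {rate:,.0f} per second")
```

The results land close to the excerpt's 2.2 million, 31,000 and 162,000 per second.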


The vision of the Semantic Web


Giant Global Graph (2007)

• Transition
  ◦ WWW = content+links
  ◦ GGG = WWW+relationships+descriptions
• Universal medium for data, information and knowledge exchange

http://dig.csail.mit.edu/breadcrumbs/node/215

Tim Berners-Lee


The One machine


http://www.kk.org/thetechnium/archives/2007/11/dimensions_of_t.php

• The One machine (Kevin Kelly, 2007)
  ◦ 1.2 billion personal computers
  ◦ 27 million data servers
  ◦ 2.7 billion cell phones
  ◦ 80 million wireless PDAs
  ◦ 600 billion RFID tags in use

Evolution of the Web


The Key


http://www.flickr.com/photos/11437726@N08/2781739886/

Agree on standards

Open your data


Semantic Web technologies


Links

page1 -> user1

page1 -> book1

page1 -> page24

page1 -> Cats

• Let's give meaning to the hyperlinks

page1 -hasAuthor-> user1

page1 -isPartOf--> book1

page1 -refersTo--> page24

page1 -isAbout---> Cats

triple: subject -property-> object
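
As a minimal sketch of the idea, the labelled links above can be held as (subject, property, object) tuples and filtered by property. The `Triple` class and `objects_of` helper are illustrative names, not part of any RDF library.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One statement: subject -property-> object."""
    subject: str
    prop: str
    obj: str

# The four labelled links from the slide.
triples = [
    Triple("page1", "hasAuthor", "user1"),
    Triple("page1", "isPartOf", "book1"),
    Triple("page1", "refersTo", "page24"),
    Triple("page1", "isAbout", "Cats"),
]

def objects_of(subject, prop):
    """Return every object linked from `subject` through `prop`."""
    return [t.obj for t in triples if t.subject == subject and t.prop == prop]

print(objects_of("page1", "hasAuthor"))
```

With plain links, the same question ("who wrote page1?") cannot be answered, because nothing distinguishes the link to user1 from the link to Cats.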


Graph Model - RDF


Graph Model - RDF


Resources on the Semantic Web

• Internet of Things
  ◦ URI: Uniform Resource Identifier
  ◦ http://dbpedia.org/resource/Apple
  ◦ http://dbpedia.org/resource/Apple_Inc
  ◦ http://dbpedia.org/resource/Apple_River
  ◦ http://dbpedia.org/resource/Apple_(band)
  ◦ http://dbpedia.org/resource/Apple_(album)
  ◦ URIs should be dereferenceable
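
"Dereferenceable" means an HTTP client can request the URI and get a description back. The sketch below only builds such a request with Python's standard library and does not send it. The choice of `application/rdf+xml` as the `Accept` value, and the assumption that the server honours it, are illustrative.

```python
from urllib.request import Request

def rdf_request(uri, media_type="application/rdf+xml"):
    """Build (but do not send) a GET request asking for an RDF representation."""
    return Request(uri, headers={"Accept": media_type})

req = rdf_request("http://dbpedia.org/resource/Apple_Inc")
print(req.get_method(), req.full_url, req.get_header("Accept"))

# To actually dereference the URI, pass the request to
# urllib.request.urlopen(req); this needs network access.
```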


RDF - Describe your data

• Various RDF formats
  ◦ RDF is not XML! XML is one of the ways to write RDF data, i.e. it's a language/syntax
  ◦ RDF/XML
  ◦ N-Triples
  ◦ Turtle
  ◦ RDFa
• shortcut notation for URIs: CURIE (Compact URI)
  ◦ prefix:id
    – example: foaf:knows, sioc:User, etc.
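
A CURIE is expanded by replacing the prefix with its namespace URI. Here is a minimal sketch in Python; the `expand_curie` helper is hypothetical, while the two namespace URIs are the published FOAF and SIOC namespaces.

```python
# Prefix table: each prefix stands for a namespace URI.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
}

def expand_curie(curie, prefixes=PREFIXES):
    """Turn prefix:id into a full URI, or raise if the prefix is unknown."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + reference

print(expand_curie("foaf:knows"))
print(expand_curie("sioc:User"))
```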


RDF - Describe your data

• Various languages
  ◦ scor knows danbri (English)
  ◦ scor connaît danbri (French)
  ◦ scor danbri (drawing)
• One meaning in RDF
  ◦ scor foaf:knows danbri

[Diagram: scor -foaf:knows-> walkah; scor -foaf:knows-> danbri]
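
Written out with full URIs, the statement "scor foaf:knows danbri" becomes a single N-Triples line. The sketch below is a simplified serializer that handles URI objects only (no literals, no escaping). The two `example.org` person URIs are placeholders, not real identifiers.

```python
FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

def to_ntriples(subject, predicate, obj):
    """Serialize one URI-only triple as an N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."

line = to_ntriples(
    "http://example.org/people/scor#me",    # placeholder URI for scor
    FOAF_KNOWS,
    "http://example.org/people/danbri#me",  # placeholder URI for danbri
)
print(line)
```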


RDF - Vocabularies

• Semantic links are categorized in vocabularies
  ◦ Dublin Core - DC
    – title, creator, description, date
  ◦ Friend of a Friend - FOAF
    – name, knows, homepage
  ◦ Description of a Project - DOAP
  ◦ Semantically Interlinked Online Communities - SIOC
  ◦ Simple Knowledge Organization System - SKOS

SPARQL - query the GGG data

  ◦ standardized in January 2008
  ◦ Example, return the capitals of all African countries:

    PREFIX abc: <http://example.com/exampleOntology#>
    SELECT ?capital ?country
    WHERE {
      ?x abc:cityname ?capital ;
         abc:isCapitalOf ?y .
      ?y abc:countryname ?country ;
         abc:isInContinent abc:Africa .
    }
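
To show what the WHERE clause does, here is a naive Python sketch that matches the same four triple patterns against a tiny in-memory dataset. It is not a SPARQL engine. The `city:` and `country:` identifiers and the sample data are invented for illustration, and the `abc:` terms come from the example ontology in the query.

```python
# Sample data; identifiers are illustrative only.
TRIPLES = {
    ("city:Nairobi", "abc:cityname", "Nairobi"),
    ("city:Nairobi", "abc:isCapitalOf", "country:Kenya"),
    ("country:Kenya", "abc:countryname", "Kenya"),
    ("country:Kenya", "abc:isInContinent", "abc:Africa"),
    ("city:Paris", "abc:cityname", "Paris"),
    ("city:Paris", "abc:isCapitalOf", "country:France"),
    ("country:France", "abc:countryname", "France"),
    ("country:France", "abc:isInContinent", "abc:Europe"),
}

def match(pattern, binding):
    """Yield every extension of `binding` under which `pattern` matches a triple."""
    for triple in TRIPLES:
        new = dict(binding)
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # Variable: bind it, or check it against its existing value.
                if new.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield new

def solve(patterns):
    """Join the patterns one after another, carrying the bindings along."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [b for binding in bindings for b in match(pattern, binding)]
    return bindings

QUERY = [
    ("?x", "abc:cityname", "?capital"),
    ("?x", "abc:isCapitalOf", "?y"),
    ("?y", "abc:countryname", "?country"),
    ("?y", "abc:isInContinent", "abc:Africa"),
]

results = sorted((b["?capital"], b["?country"]) for b in solve(QUERY))
print(results)
```

Only the Nairobi/Kenya pair survives, because the last pattern requires `?y` to be in `abc:Africa`.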


Semantic Web practical applications and initiatives


Dataportability

• Merge my social networks between various sites
• Move information from one service to another

Local communities


* Source: Pidgin Technologies, www.pidgintech.com


Many isolated and disparate communities


* Source: Pidgin Technologies, www.pidgintech.com


(De-)centralized profile


http://www.johnbreslin.com/blog/


Decentralized profiles


http://www.johnbreslin.com/blog/


Linking Open Data project


http://richard.cyganiak.de/2007/10/lod/

Sindice - The Semantic Web index

[Screenshot: Sindice search results for the term “europe” (about 54.2 thousand results), listing DBpedia resources such as Category:Birds_of_Europe, Category:Europe, Europe_1, Category:Flora_of_Europe and Europe_(band), each with its triple count. Source: http://sindice.com/search?q=europe&qt=term]

http://sindice.com/


RDF in Drupal


RDF in Drupal core

• RDFa only
  ◦ RDF serialization format recommended by W3C
  ◦ RDF in XHTML
  ◦ Yahoo! SearchMonkey and Google parse it
  ◦ no need to generate another output: human and machine readable document

DrupalCon DC RDFa video


Status of RDF in Drupal 7: architecture

• Semantics at the module level
  ◦ Modules can export data along with their semantics in the format they want
    – Core => RDFa
    – Contrib => RDF/XML, N-Triples and so on
  ◦ No duplicate definition of semantics.
  ◦ Built-in semantics can be altered.
  ◦ The theme layer no longer has to worry about the semantics; it simply outputs them along with the data.
  ◦ Better control over which namespaces are used for a given page, so that only those namespaces are included in the header of the HTML document.
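
The last point can be illustrated with a small sketch: collect the prefixes of the CURIEs actually used on a page and emit declarations only for those. This is a Python model of the idea, not Drupal code. The `xmlns_attributes` helper and the prefix table are assumptions made for the example.

```python
# Illustrative prefix table; a real site would define its own.
KNOWN_NAMESPACES = {
    "dc": "http://purl.org/dc/terms/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "sioc": "http://rdfs.org/sioc/ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

def xmlns_attributes(curies_on_page):
    """Return xmlns declarations for only the prefixes used on the page."""
    used = sorted({curie.split(":", 1)[0] for curie in curies_on_page})
    return " ".join(f'xmlns:{p}="{KNOWN_NAMESPACES[p]}"' for p in used)

print(xmlns_attributes(["dc:title", "foaf:name", "dc:date"]))
```

Here `sioc` and `skos` are known but unused, so they are left out of the output.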


Status of RDF in Drupal 7

• Architecture of the RDF API in core
    – hook_rdf_mapping(): allows modules to define their own RDF mappings
    – hook_rdf_mapping_alter(&$mapping): allows modules to override existing mappings
    – rdf_get_mapping($bundle): returns the mapping for the attributes of the given bundle as an associative array
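
The three functions form a define / alter / read pipeline. The sketch below models that flow in Python rather than Drupal's PHP. The module names, the `blog` bundle and the shape of the mapping array are hypothetical and only mirror the descriptions above.

```python
def node_rdf_mapping():
    """Stand-in for one module's hook_rdf_mapping(): declares its mappings."""
    return {"blog": {"title": ["dc:title"], "created": ["dc:date"]}}

def custom_rdf_mapping_alter(mapping):
    """Stand-in for hook_rdf_mapping_alter(&$mapping): edits in place."""
    mapping["blog"]["title"] = ["dc:title", "rdfs:label"]

# Hypothetical registries of hook implementations.
MAPPING_HOOKS = [node_rdf_mapping]
ALTER_HOOKS = [custom_rdf_mapping_alter]

def rdf_get_mapping(bundle):
    """Collect all mappings, let modules alter them, return one bundle's mapping."""
    mapping = {}
    for hook in MAPPING_HOOKS:
        mapping.update(hook())
    for alter in ALTER_HOOKS:
        alter(mapping)
    return mapping.get(bundle, {})

print(rdf_get_mapping("blog"))
```

Because each mapping is declared once and overrides go through the alter step, no module needs to redefine another module's semantics.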


Status of RDF in Drupal 7

• hook_rdf_mapping()

Status of RDF in Drupal 7

• rendered HTML
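
The slide's markup sample is not preserved in this transcript. As a rough sketch of the general idea only, the Python helper below wraps a field value in a `span` whose RDFa `property` attribute carries the mapped CURIEs. The `rdfa_span` function is hypothetical and does not reproduce Drupal's actual output.

```python
from html import escape

def rdfa_span(value, predicates):
    """Wrap a value in a span annotated with RDFa property CURIEs."""
    prop = escape(" ".join(predicates), quote=True)
    return f'<span property="{prop}">{escape(value)}</span>'

print(rdfa_span("Hello & welcome", ["dc:title"]))
```

The same document stays readable for people while the `property` attribute makes the title machine readable.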


Status of RDF in Drupal 7

• What’s already committed
  ◦ RDFa doctype

Status of RDF in Drupal 7

• What’s already committed
  ◦ Common RDF prefix definitions

Status of RDF in Drupal 7

• What’s pending
  ◦ The rest!
  ◦ 1 week for the API
  ◦ 6 weeks for testing (code slush)

Status of RDF in Drupal 7

• Theming layer
  ◦ Hardest part of the work
  ◦ Many tags are hardcoded in the tpl files
    – we want to avoid modifying these; themers should not have to care about RDFa
  ◦ Dilemma
    – centralize everything in the RDF module
    – distribute the RDF in all modules (and patch these modules)

Status of RDF in Drupal 7

building block modules: page/block, node, field, user, comment, taxonomy

beneficiary modules: blog, forum, book, openid, profile, all contributed modules

Thank you

• Credits
  ◦ Frédéric Marand
  ◦ Florian Lorétan
  ◦ John Breslin
  ◦ John Morahan
  ◦ Mark Birbeck
  ◦ Rolf Guescini
  ◦ Benjamin Doherty
  ◦ Benjamin Melançon
  ◦ Stefan Freudenberg
  ◦ Peter Wolanin
  ◦ Barry Jaspan
  ◦ yched
  ◦ catch
  ◦ ...

Contribute

• IRC: #drupal-rdf
• list of issues to review at http://drupal.org/project/issues/search/drupal?issue_tags=RDF
• Talk to us
• Keynote tomorrow by Dan Brickley
• code sprint on Saturday
