using semantics to enhance content publishing

Integrating the Cloud into Content Using Semantics to Enhance Content Publishing http://semprog.com/presentations/web20ny Jamie Taylor

Upload: jamie-taylor

Post on 05-Dec-2014

DESCRIPTION

Integrating the cloud into content. Web2.0 Expo NY 2009 Workshop

TRANSCRIPT

Page 1: Using Semantics to Enhance Content Publishing

Integrating the Cloud into Content

Using Semantics to Enhance Content Publishing

http://semprog.com/presentations/web20ny

Jamie Taylor

Page 2: Using Semantics to Enhance Content Publishing

What do y'all mean, "Semantics"?

Page 3: Using Semantics to Enhance Content Publishing

KNOCK

misfortune, bad luck, sound, occurrence, zing, vroom, zizz, bump, knocking, bash, belt, bang, roast, critique, blow, rap, whack

Page 4: Using Semantics to Enhance Content Publishing

LJOMF

misfortune, bad luck, sound, occurrence, zing, vroom, zizz, bump, knocking, bash, belt, bang, roast, critique, blow, rap, whack

Page 5: Using Semantics to Enhance Content Publishing

IBM

Page 6: Using Semantics to Enhance Content Publishing

Labels: Headquarters, CEO, Legal Structure, Operating Income, Ticker Symbol, CIK, SIC, NAIC, Founders, Date Founded, Subsidiaries, Software Developed

Values: 0000051143; NYSE:IBM; Sam Palmisano; 17,604,000,000 USD 2006; SANSF, ViaVoice, Lotus Notes; Cognos, Cross Worlds; 334111: Electronic Computer Manufacturing; 3571: Electronic Computers; 1889; Thomas Watson; 1 New Orchard Road, Armonk, New York; Publicly Listed Company

IBM

Page 7: Using Semantics to Enhance Content Publishing

Labels: Headquarters, CEO, Legal Structure, Operating Income, Ticker Symbol, CIK, SIC, NAIC, Founders, Date Founded, Subsidiaries, Software Developed

Values: 0000051143; NYSE:IBM; Sam Palmisano; 17,604,000,000 USD 2006; SANSF, ViaVoice, Lotus Notes; Cognos, Cross Worlds; 334111: Electronic Computer Manufacturing; 3571: Electronic Computers; 1889; Thomas Watson; 1 New Orchard Road, Armonk, New York; Publicly Listed Company

Page 8: Using Semantics to Enhance Content Publishing
Page 9: Using Semantics to Enhance Content Publishing
Page 10: Using Semantics to Enhance Content Publishing
Page 12: Using Semantics to Enhance Content Publishing

PageRank™

Page 13: Using Semantics to Enhance Content Publishing

Values: 0000051143; NYSE:IBM; Sam Palmisano; 17,604,000,000 USD 2006; SANSF, ViaVoice, Lotus Notes; Cognos, Cross Worlds; 334111: Electronic Computer Manufacturing; 3571: Electronic Computers; 1889; Thomas Watson; 1 New Orchard Road, Armonk, New York; Publicly Listed Company

Page 14: Using Semantics to Enhance Content Publishing
Page 15: Using Semantics to Enhance Content Publishing

Earlier this year, the AP slashed prices to try to hold on to subscribers.

That's not the answer, says Jeff Jarvis, journalism professor at City University of New York.

JEFF JARVIS: The fundamentals of the media economy are changing, from a content economy to a link-based economy.

Jarvis says the AP needs to become the broker for those links, like helping the Baltimore Sun link to a story about GM from the Detroit Free Press.

Page 16: Using Semantics to Enhance Content Publishing

http://www.flickr.com/photos/pagedooley/

Jarvis resorts to the concept of a "gift economy" to explain the link economy

Page 17: Using Semantics to Enhance Content Publishing

I am a behavioral economist.

Gift economics are frequently used as explanations for what we don't understand

Page 18: Using Semantics to Enhance Content Publishing

Worse, I am a Behaviorist

Only talk about what you can observe

Page 19: Using Semantics to Enhance Content Publishing

Semantics

Process of communicating enough meaning to result in an action

Page 20: Using Semantics to Enhance Content Publishing

Link Economy

• Enriching links focuses meaning

• Improves "findability" (SEO)

• Increased usability

• Better ad selection

Page 21: Using Semantics to Enhance Content Publishing

Link Economy

• Semantics Benefit

• Site owners

• Site users

• Developers

• You

At the end of this talk - you should be able to say how semantics benefits each of these groups

Page 22: Using Semantics to Enhance Content Publishing
Page 23: Using Semantics to Enhance Content Publishing
Page 24: Using Semantics to Enhance Content Publishing

Wish it were real

Page 25: Using Semantics to Enhance Content Publishing

Might be real

Page 26: Using Semantics to Enhance Content Publishing

Is real, but don't believe it

Page 27: Using Semantics to Enhance Content Publishing

Is very useful

Build Flexible Applications with

Graph Data

Page 28: Using Semantics to Enhance Content Publishing

Not Your Typical Semantic Web Talk

Page 29: Using Semantics to Enhance Content Publishing

The Cake (taken from http://www.w3.org/2007/Talks/0130-sb-W3CTechSemWeb/layerCake-4.png)

The W3C Layer Cake

Page 31: Using Semantics to Enhance Content Publishing

Ontologies

Page 32: Using Semantics to Enhance Content Publishing

<http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000005b7ab1a> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/business.employment_tenure> .
<http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000005b7ab1a> <http://rdf.freebase.com/ns/business.employment_tenure.company> <http://rdf.freebase.com/ns/en.determine_software> .
<http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://rdf.freebase.com/ns/education.education.institution> <http://rdf.freebase.com/ns/en.mounds_view_high_school> .
<http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/education.education> .
<http://rdf.freebase.com/ns/guid.9202a8c04000641f8000000007e53e16> <http://rdf.freebase.com/ns/education.education.student> <http://rdf.freebase.com/ns/en.jamie_taylor> .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/business.company_founder.companies_founded> <http://rdf.freebase.com/ns/en.mobius_net> .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://creativecommons.org/ns#attributionName> "Source: Freebase - The World's database" .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/people.person.nationality> <http://rdf.freebase.com/ns/en.united_states> .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/common.topic.image> <http://rdf.freebase.com/ns/en.jamie_headshot> .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://rdf.freebase.com/ns/type.object.name> "Jamie Taylor"@en .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/user.skud.freebase_events.tshirt_recipient> .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/user.skud.freebase_events.topic> .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/book.author> .
<http://rdf.freebase.com/ns/en.jamie_taylor> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.freebase.com/ns/people.person> .

RDF Serialization Formats

Page 33: Using Semantics to Enhance Content Publishing

Instead....

• Part I
  • Why
  • Uses, Benefits

• Part II
  • How
  • Representation, Concepts

Part I: so you can explain to others

Part II: so you can do what you say

Page 34: Using Semantics to Enhance Content Publishing

Part I: Why

Page 35: Using Semantics to Enhance Content Publishing

Is very useful

Build Flexible Applications with

Graph Data

Page 36: Using Semantics to Enhance Content Publishing

Graph Data Model

John Krasinski (Person, Actor)

John Krasinski --stars in--> The Office (US) (TV Program)
John Krasinski --starred in--> Leatherheads (Film)
John Krasinski --attended--> Brown University (College/university)

Page 37: Using Semantics to Enhance Content Publishing

A socially managed semantic database

Page 38: Using Semantics to Enhance Content Publishing

Freebase has Many Types of Things

Page 39: Using Semantics to Enhance Content Publishing
Page 40: Using Semantics to Enhance Content Publishing
Page 41: Using Semantics to Enhance Content Publishing

9,547,107 Topics

Page 42: Using Semantics to Enhance Content Publishing

government position held

topic:United States

Senator

topic:Barack Obama

Freebase

topic:UBS AG

took money from

topic:Switzerland

is based in

Contributions over $50,000 made to members of the US Congress in the 2008 election cycle by companies headquartered outside of the United States

Page 43: Using Semantics to Enhance Content Publishing

Industry Browser Identity Model

Industry (USCB) / NAICS
Industry (SEC) / SIC
NAICS/SIC Map / Freebase
Company / CIK / SEC
People / CIK / SEC
Person / Wikipedia / Freebase
Company / CRP ID / CRP
Donations / CRP ID / CRP
Location / ZIP Code / Freebase
Company / Ticker / SEC
Article / Wikipedia

Page 44: Using Semantics to Enhance Content Publishing

Industry Browser

http://kiwitobes.com/industry_mashup/

Page 45: Using Semantics to Enhance Content Publishing

Web 2.0 + Semantics

Barriers between science and the humanities impede solving humanity's important problems

Page 46: Using Semantics to Enhance Content Publishing

"Smoov" (Ankolekar et al., 2007)

Page 49: Using Semantics to Enhance Content Publishing
Page 50: Using Semantics to Enhance Content Publishing
Page 51: Using Semantics to Enhance Content Publishing

Patrick Sinclair (BBC)

Page 52: Using Semantics to Enhance Content Publishing

About the Content (and visitor?)

Page 53: Using Semantics to Enhance Content Publishing

MIT Simile

Page 54: Using Semantics to Enhance Content Publishing

Simile

http://dev.mqlx.com/~jamie/simile/timeline.html

Page 55: Using Semantics to Enhance Content Publishing

Data Portability

Data

Data

Data

Data

Semantics allows data to be utilized by unanticipated new applications

Page 56: Using Semantics to Enhance Content Publishing

Simile

Page 57: Using Semantics to Enhance Content Publishing

MIT Simile: Exhibit

Page 58: Using Semantics to Enhance Content Publishing

User Experience

Page 59: Using Semantics to Enhance Content Publishing

Topic Hubs

Page 60: Using Semantics to Enhance Content Publishing

Open Calais

Page 61: Using Semantics to Enhance Content Publishing

Open Calais

Page 63: Using Semantics to Enhance Content Publishing

<rdf:Description rdf:nodeID="A1">
  <att:lastupdated>2009-06-18T21:22:28</att:lastupdated>
  <att:text>IBM Corporation And Siemens Announce Integrated Solutions To Help Companies</att:text>
</rdf:Description>
<rdf:Description rdf:nodeID="A2">
  <att:code>3577</att:code>
  <att:description>Computer Periph'L Equipment, Nec</att:description>
</rdf:Description>
<rdf:Description rdf:nodeID="A3">
  <att:code>7371</att:code>
  <att:description>Computer Programming Services</att:description>
</rdf:Description>
<rdf:Description rdf:nodeID="A4">
  <att:age>46</att:age>
  <att:lastname>Iwata</att:lastname>
  <att:officerurl rdf:resource="http://www.reuters.com/finance/stocks/officerProfile?symbol=IBM.N&amp;officerId=222727"/>
  <att:firstname>Jon</att:firstname>
  <att:title>Senior Vice President - Marketing and Communications</att:title>
  <att:middle>C.</att:middle>
</rdf:Description>

http://p.opencalais.com/er/company/ralg-tr1r/9e3f6c34-aa6b-3a3b-b221-a07aa7933633

Open Calais

Page 65: Using Semantics to Enhance Content Publishing

Herman Tolentino et al. http://epispider.net/index.php

Epispider

Page 66: Using Semantics to Enhance Content Publishing

guardian.co.uk Open Platform

Chris Thorpe

Page 67: Using Semantics to Enhance Content Publishing

Vocabulary

Do you understand the words that are coming out of my mouth?

-Chris Tucker, Rush Hour

Page 68: Using Semantics to Enhance Content Publishing

Labels: Headquarters, CEO, Legal Structure, Operating Income, Ticker Symbol, CIK, SIC, NAIC, Founders, Date Founded, Subsidiaries, Software Developed

Values: 0000051143; NYSE:IBM; Sam Palmisano; 17,604,000,000 USD 2006; SANSF, ViaVoice, Lotus Notes; Cognos, Cross Worlds; 334111: Electronic Computer Manufacturing; 3571: Electronic Computers; 1889; Thomas Watson; 1 New Orchard Road, Armonk, New York; Publicly Listed Company

Page 69: Using Semantics to Enhance Content Publishing

Herman Tolentino et al. http://epispider.net/index.php

Epispider

Page 70: Using Semantics to Enhance Content Publishing

vocabularies...are everywhere

Page 71: Using Semantics to Enhance Content Publishing

The Twitter Vocabulary

@

#

Short URLs

Page 72: Using Semantics to Enhance Content Publishing

Pivot on an @ tag

Page 73: Using Semantics to Enhance Content Publishing

Pivot on a # tag
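Pivoting on an @ or # tag amounts to building an index from each tag to the messages that contain it. The following is a minimal sketch of that idea, assuming made-up tweets, a hypothetical @jamietaylor handle, and a simple regular expression rather than Twitter's real tokenization rules:

```python
import re
from collections import defaultdict

# Hypothetical sample tweets (not taken from the talk).
TWEETS = [
    "Great semantics workshop by @jamietaylor at #w2e http://bit.ly/example",
    "@jamietaylor is showing Freebase topic hubs #w2e #semweb",
    "Lunch break #nyc",
]

# A tag is "@" or "#" followed by word characters, not glued to a preceding word.
MENTION = re.compile(r"(?<!\w)@(\w+)")
HASHTAG = re.compile(r"(?<!\w)#(\w+)")


def build_index(tweets):
    """Map every @mention and #hashtag to the tweets that contain it."""
    index = defaultdict(list)
    for tweet in tweets:
        for user in MENTION.findall(tweet):
            index["@" + user].append(tweet)
        for tag in HASHTAG.findall(tweet):
            index["#" + tag].append(tweet)
    return index


index = build_index(TWEETS)
for tweet in index["#w2e"]:
    print(tweet)
```

The two-character vocabulary is what makes the pivot possible: the software needs no understanding of the text, only agreement on what @ and # mean.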

Page 75: Using Semantics to Enhance Content Publishing

Vocabularies make links more understandable

...and thus content more findable

Page 76: Using Semantics to Enhance Content Publishing

microformats

Annotate existing HTML so the content can be "extracted by software and indexed, searched for, saved, cross-referenced or combined."

Page 77: Using Semantics to Enhance Content Publishing

microformats

Page 78: Using Semantics to Enhance Content Publishing

microformats

<div class="vcard">
.....
<div id="view">
<div id="home">
<table>
  <tr>
    <td class="f">address</td>
    <td class="v">
      <div class="adr">
        <span class="locality">Berkeley</span>, <span class="region">CA</span>
        <div class="country-name">United States</div>
      </div>
    </td>
  </tr>
  <tr>
    <td class="f">aim</td>
    <td class="v"><a id="aim" class="url im offline" href="aim:[email protected]">[email protected]</a></td>
  </tr>

Page 79: Using Semantics to Enhance Content Publishing

microformats.org

Page 80: Using Semantics to Enhance Content Publishing

microformats

•(Relatively) easy to use

•Small, fixed vocabulary

•No standard parsing pattern

•No strong identifiers

• Limits utility
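Because microformats reuse the HTML class attribute, a consumer only has to look for agreed class names. A rough sketch of that extraction, assuming Python's standard-library html.parser, a hypothetical ClassTextExtractor helper, and a hand-picked subset of the adr class names (this is not a complete hCard parser):

```python
from html.parser import HTMLParser

# Subset of adr class names to extract (an assumption for this sketch).
WANTED = {"locality", "region", "country-name"}


class ClassTextExtractor(HTMLParser):
    """Collects the text of elements whose class names a wanted term.

    Simplification: assumes every start tag has a matching end tag.
    """

    def __init__(self):
        super().__init__()
        self.stack = []   # one entry per open element: matched term or None
        self.fields = {}  # term -> extracted text

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.stack.append(next((c for c in classes if c in WANTED), None))

    def handle_endtag(self, tag):
        if self.stack:
            self.stack.pop()

    def handle_data(self, data):
        # Credit the text to the innermost enclosing wanted element, if any.
        for term in reversed(self.stack):
            if term:
                self.fields[term] = self.fields.get(term, "") + data.strip()
                break


SAMPLE = (
    '<div class="adr">'
    '<span class="locality">Berkeley</span>, '
    '<span class="region">CA</span> '
    '<div class="country-name">United States</div>'
    '</div>'
)

parser = ClassTextExtractor()
parser.feed(SAMPLE)
parser.close()
fields = parser.fields
print(fields)
```

The sketch also shows the weakness listed above: the extracted values are plain strings, so nothing says which Berkeley is meant.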

Page 81: Using Semantics to Enhance Content Publishing

RDFa

Annotate HTML with machine readable RDF

Page 83: Using Semantics to Enhance Content Publishing

RDFa

•Unambiguous identifiers

•Extensible vocabulary

•Standard parsing pattern

• Produces RDF

•Hard to use

• Rules about formatting based on RDF

Page 84: Using Semantics to Enhance Content Publishing

What “concepts” are covered in content

Like existing tagging, but with strong identifiers!

<resource> --tagged--> Tag
Tag --means--> <resource> (Strong identifier goes here!)
Tag --label--> "text"
Tag --taggingDate--> "2001-01-01"

Page 85: Using Semantics to Enhance Content Publishing

<div class="rdfa" xmlns:ctag="http://commontag.org/ns#">

NASA's

<a typeof="ctag:Tag"

rel="ctag:means"

href="http://rdf.freebase.com/ns/en.phoenix_mars_mission"

property="ctag:label">Phoenix Mars Lander</a>

has deployed its robotic arm.

</div>

<resource> --tagged--> Tag
Tag --means--> <resource>
Tag --label--> "text"
Tag --taggingDate--> "2001-01-01"
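One simplified way to read the Common Tag markup above is as three statements about an unnamed Tag node. The sketch below assumes Python's standard-library html.parser and a hypothetical CommonTagReader class that handles only this one attribute pattern; it is not a conformant RDFa processor:

```python
from html.parser import HTMLParser

CTAG = "http://commontag.org/ns#"  # namespace declared in the markup


class CommonTagReader(HTMLParser):
    """Reads <a typeof=... rel=... href=... property=...> into triples."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self.pending = None  # (subject, predicate, text parts) while inside <a>
        self.count = 0

    @staticmethod
    def expand(curie):
        # Only the ctag: prefix is known to this sketch.
        prefix, _, local = curie.partition(":")
        return CTAG + local if prefix == "ctag" else curie

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "a" and "typeof" in a:
            self.count += 1
            subject = f"_:tag{self.count}"  # blank node standing for the Tag
            self.triples.append((subject, "rdf:type", self.expand(a["typeof"])))
            if "rel" in a and "href" in a:
                self.triples.append((subject, self.expand(a["rel"]), a["href"]))
            if "property" in a:
                self.pending = (subject, self.expand(a["property"]), [])

    def handle_data(self, data):
        if self.pending:
            self.pending[2].append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.pending:
            subject, predicate, parts = self.pending
            self.triples.append((subject, predicate, "".join(parts).strip()))
            self.pending = None


SAMPLE = (
    '<div class="rdfa" xmlns:ctag="http://commontag.org/ns#">NASA\'s '
    '<a typeof="ctag:Tag" rel="ctag:means" '
    'href="http://rdf.freebase.com/ns/en.phoenix_mars_mission" '
    'property="ctag:label">Phoenix Mars Lander</a> '
    'has deployed its robotic arm.</div>'
)

reader = CommonTagReader()
reader.feed(SAMPLE)
reader.close()
triples = reader.triples
for triple in triples:
    print(triple)
```

The "means" statement carries the strong identifier: the link text can change, but the Freebase URI still names the same mission.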

Page 86: Using Semantics to Enhance Content Publishing
Page 87: Using Semantics to Enhance Content Publishing

And the winner is....

Page 88: Using Semantics to Enhance Content Publishing

HTML5 MicroData

• Annotate HTML with machine readable data

• Simple Name-Value Pair design

Page 89: Using Semantics to Enhance Content Publishing

HTML5 MicroData

Sometimes, it is desirable to annotate content with specific machine-readable labels, e.g. to allow generic scripts to provide services that are customised to the page, or to enable content from a variety of cooperating authors to be processed by a single script in a consistent manner.

Page 90: Using Semantics to Enhance Content Publishing

HTML5

Simple! 15 pages of 657 page spec

Page 91: Using Semantics to Enhance Content Publishing

HTML5 MicroData

<section itemscope itemtype="http://example.org/animals#cat" itemid="http://semprog.com/jamiestuff/hedral">

<h1 itemprop="name">Hedral</h1> <p itemprop="desc">Hedral is a male american domestic

shorthair, that is <span itemprop="http://example.com/color">black</span> and <span itemprop="http://example.com/color">white</span>.</p>

<img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months">

</section>
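The name-value design means the markup above can be reduced to a list of pairs with very little code. A minimal sketch, assuming Python's standard-library html.parser and a hypothetical MicrodataReader class; it handles a single, non-nested item and takes the value of an img from its src attribute without resolving it to an absolute URL, so it is much looser than the real Microdata algorithm:

```python
from html.parser import HTMLParser

VOID = {"img", "br", "hr", "input", "meta", "link"}  # elements without end tags


class MicrodataReader(HTMLParser):
    """Collects (itemprop, value) pairs for one item; a simplified reading."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.open = []  # (depth, property name, text parts) for open elements
        self.item = {"type": None, "id": None, "properties": []}

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if "itemscope" in a:
            self.item["type"] = a.get("itemtype")
            self.item["id"] = a.get("itemid")
        if tag in VOID:
            if "itemprop" in a:
                self.item["properties"].append((a["itemprop"], a.get("src")))
            return
        self.depth += 1
        if "itemprop" in a:
            self.open.append((self.depth, a["itemprop"], []))

    def handle_data(self, data):
        # Text belongs to every enclosing property element.
        for _, _, parts in self.open:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.open and self.open[-1][0] == self.depth:
            _, name, parts = self.open.pop()
            value = " ".join("".join(parts).split())  # collapse whitespace
            self.item["properties"].append((name, value))
        self.depth -= 1


SAMPLE = """
<section itemscope itemtype="http://example.org/animals#cat"
         itemid="http://semprog.com/jamiestuff/hedral">
  <h1 itemprop="name">Hedral</h1>
  <p itemprop="desc">Hedral is a male american domestic shorthair, that is
     <span itemprop="http://example.com/color">black</span> and
     <span itemprop="http://example.com/color">white</span>.</p>
  <img itemprop="img" src="hedral.jpeg" alt="" title="Hedral, age 18 months">
</section>
"""

reader = MicrodataReader()
reader.feed(SAMPLE)
reader.close()
item = reader.item
for name, value in item["properties"]:
    print(name, "=", value)
```

Note that the color property appears twice, which is why the result is a list of pairs rather than a dictionary.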

Page 92: Using Semantics to Enhance Content Publishing

MicroData Widgets

Page 93: Using Semantics to Enhance Content Publishing

HTML5 MicroData

• Easy to use

• Strong identifiers

• Extensible vocabulary

• Easy to parse

• In last call for comments stage!

• Usable! Now!

Page 94: Using Semantics to Enhance Content Publishing

Vocabulary Powered Search

Search Applications:
- Enhanced results
- Info Bar

Page 95: Using Semantics to Enhance Content Publishing

<div class="hReview-aggregate"><div class="item vcard"> <h1 class="fn org">Taylor&#39;s Automatic Refresher</h1> <div class=rating>

<img class="stars_3_half rating average" width="83" height="325" title="3.5 star rating" alt="3.5 star rating"

src="http://static1.px.yelp.com/static/2843250757/i/new/ico/stars/stars_map.png"/></div> <em>based on <span class="count">888</span> reviews</em>

</div>

<div id="bizInfoContent"> <p id="bizCategories">Category: <span id="cat_display"><a href="/c/sf/burgers">Burgers</a> </span><address class="adr"> Neighborhood: Embarcadero<br/>

<span class="street-address">1 Ferry Bldg<br />Marketplace Shop #6</span><br /><span class="locality">San Francisco</span>, <span class="region">CA</span> <span class="postal-code">94111</span><br />

</address><span id="bizPhone" class="tel">(866) 328-3663</span>

Page 96: Using Semantics to Enhance Content Publishing
Page 97: Using Semantics to Enhance Content Publishing

<div class="hReview-aggregate"><div class="item vcard"> <h1 class="fn org">Taylor&#39;s Automatic Refresher</h1> <div class=rating>

<img class="stars_3_half rating average" width="83" height="325" title="3.5 star rating" alt="3.5 star rating" src="http://static1.px.yelp.com/static/2843250757/i/new/ico/stars/stars_map.png"/></div>

<em>based on <span class="count">888</span> reviews</em></div>

<div id="bizInfoContent"> <p id="bizCategories">Category: <span id="cat_display"><a href="/c/sf/burgers">Burgers</a> </span><address class="adr"> Neighborhood: Embarcadero<br/>

<span class="street-address">1 Ferry Bldg<br />Marketplace Shop #6</span><br /><span class="locality">San Francisco</span>, <span class="region">CA</span> <span class="postal-code">94111</span><br />

</address><span id="bizPhone" class="tel">(866) 328-3663</span>

Page 98: Using Semantics to Enhance Content Publishing

Search Monkey Vocabulary

Page 99: Using Semantics to Enhance Content Publishing

Search Monkey Vocabulary

Page 100: Using Semantics to Enhance Content Publishing

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdf:Description rdf:about="http://dbpedia.org/ontology/areaTotal"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:nodeID="b29203"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:about="http://dbpedia.org/ontology/Place/nickname"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:about="http://dbpedia.org/ontology/Place/location"><rdfs:range rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:about="http://dbpedia.org/ontology/maximumDepth"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:about="http://dbpedia.org/ontology/Place/maximumElevation"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:nodeID="b29250"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:about="http://dbpedia.org/ontology/nearestCity"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:about="http://dbpedia.org/ontology/PopulatedPlace"><rdfs:subClassOf rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:about="http://dbpedia.org/ontology/Place/maximumDepth"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:about="http://dbpedia.org/ontology/Place/location"><rdfs:domain rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>
<rdf:Description rdf:nodeID="b29225"><rdf:first rdf:resource="http://dbpedia.org/ontology/Place"/></rdf:Description>

DBPedia Place Vocabulary

Page 101: Using Semantics to Enhance Content Publishing

Rich Snippet Vocabulary

http://data-vocabulary.org

• name • affiliation • nickname • price • postal-code • dtReviewed • photo • country-name • locality • reviewer • region • count • address • itemReviewed • title • brand • category • role

Page 102: Using Semantics to Enhance Content Publishing

<rdf:Property rdf:ID="affiliation">
  <rdfs:comment>An affiliation can be specified by a string literal or an Organization instance.</rdfs:comment>
  <rdfs:domain rdf:resource="#Person"/>
  <rdfs:range>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Organization"/>
        <owl:Class rdf:about="xsd:string"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:range>
</rdf:Property>

<rdf:Property rdf:ID="brand">
  <rdfs:domain rdf:resource="#Product"/>
</rdf:Property>

<rdf:Property rdf:ID="category">
  <rdfs:domain>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Organization"/>
        <owl:Class rdf:about="#Product"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:domain>
</rdf:Property>

Rich Snippet Vocabulary

Page 103: Using Semantics to Enhance Content Publishing

HTML5 Vocabularies

Page 104: Using Semantics to Enhance Content Publishing

Vocab Hub

http://microdata.freebaseapps.com/

Page 105: Using Semantics to Enhance Content Publishing

Part II: How

(or why we wrote the book)

Page 106: Using Semantics to Enhance Content Publishing
Page 107: Using Semantics to Enhance Content Publishing

Rich Graph Data

John Krasinski (Person, Actor)

John Krasinski --stars in--> The Office (US) (TV Program)
John Krasinski --starred in--> Leatherheads (Film)
John Krasinski --attended--> Brown University (College/university)

Page 108: Using Semantics to Enhance Content Publishing

Connected to other rich sources

Page 109: Using Semantics to Enhance Content Publishing

Where does your data live?

Page 110: Using Semantics to Enhance Content Publishing

Traditional data-modeling

Page 111: Using Semantics to Enhance Content Publishing

The beloved spreadsheet

Restaurant         Address         Cuisine         Price  Open
Deli Llama         Peachtree Rd    Deli            $      Mon, Tue, Wed, Thu, Fri
Peking Inn         Lake St         Chinese         $$$    Thu, Fri, Sat
Thai Tanic         Branch Dr       Thai            $$     Tue, Wed, Thu, Fri, Sat, Sun
Lord of the Fries  Flower Ave      Fast food       $$     Tue, Wed, Thu, Fri, Sat, Sun
Marquis de Salade  Main St         French          $$$    Thu, Fri, Sat
Wok this way       Second St       Chinese         $      Mon, Tue, Wed, Thu, Fri, Sat, Sun
Luna Sea           Autumn Dr       Seafood         $$$    Tue, Thu, Fri, Sat
Pita Pan           Thunder Rd      Middle Eastern  $$     Mon, Tue, Wed, Thu, Fri, Sat, Sun
Award Weiners      Dorfold Mews    Fast food       $      Mon, Tue, Wed, Thu, Fri, Sat
Lettuce Eat        Rustic Parkway  Deli            $$     Mon, Tue, Wed, Thu, Fri

Tabular data

Page 112: Using Semantics to Enhance Content Publishing

Too much information, not enough cells

Restaurant  Address       Cuisine  Price  Open
Deli Llama  Peachtree Rd  Deli     $      Mon (11a-4p), Tue (11-4), Wed (11-4), Thu (11-7), Fri (11-8)
Peking Inn  Lake St       Chinese  $$$    Thu (5p-10p), Fri (5p-1a), Sat (5p-1a)
etc…

Tabular Data

Page 113: Using Semantics to Enhance Content Publishing

Allows for simple queries

Restaurant: id, name, address, cuisine_id

Hours: restaurant_id, day, open, close

Cuisine: id, name

A simple schema

Page 114: Using Semantics to Enhance Content Publishing

Filled with data

id  name        address       price
1   Deli Llama  Peachtree Rd  $
2   Peking Inn  Lake St       $$$
...

restaurant_id  day  open  close
1              Mon  11    16
1              Tue  11    16
1              Thu  11    19
2              Fri  5     23
...

A simple schema

Page 115: Using Semantics to Enhance Content Publishing

This doesn’t fit into our schema...

Bar Address DJ Best Drink

The Bitter End 14th Ave No Beer

Peking Inn Lake St No Scorpion Bowl

Hammer Time Wildcat Dr Yes Hennessey

Marquis de Salade Main St Yes Martini

Some new data

Page 116: Using Semantics to Enhance Content Publishing

Maybe ok now, but can’t this keep happening?

Restaurant         Address         Price  DJ   Best Drink
Deli Llama         Peachtree Rd    $
Peking Inn         Lake St         $$$    No   Scorpion Bowl
Thai Tanic         Branch Dr       $$
Lord of the Fries  Flower Ave      $$
Marquis de Salade  Main St         $$$    Yes  Martini
Wok this way       Second St       $
Luna Sea           Autumn Dr       $$$
Pita Pan           Thunder Rd      $$
Award Weiners      Dorfold Mews    $
Lettuce Eat        Rustic Parkway  $$
Hammer Time        Wildcat Dr             Yes  Hennessey
The Bitter End     14th St                No   Beer

Half-empty columns

Page 117: Using Semantics to Enhance Content Publishing

But now the information is duplicated :(

Restaurant: id, name, address, cuisine_id

RB_Link: restaurant_id, bar_id

Bar: id, name, dj, best_drink

Link the tables

Page 118: Using Semantics to Enhance Content Publishing

Better, but now we have to “migrate”

Venue: id, name, address

Hours: venue_id, day, open, close

Restaurant: id, venue_id, cuisine_id

Bar: id, venue_id, dj, best_drink

Split place / purpose

Page 119: Using Semantics to Enhance Content Publishing

A small section of a limited product

Large schemas

Page 120: Using Semantics to Enhance Content Publishing

Does this look familiar?

Venue: id, name, address

Properties: venue_id, field_id, value

Field: id, name

A flexible schema

Page 121: Using Semantics to Enhance Content Publishing

simple enough...

Venue
id  name        address
1   Deli Llama  Peachtree Rd
2   Peking Inn  Lake St
...

Properties
venue_id  field_id  value
1         1         Deli
1         2         $
2         1         Chinese
2         2         $$$
2         3         Scorpion Bowl
2         4         No

Field
id  name
1   Cuisine
2   Price
3   Specialty Cocktail
4   DJ?

Add some data

Page 122: Using Semantics to Enhance Content Publishing

No schema change required

Venue
id  name        address
1   Deli Llama  Peachtree Rd
2   Peking Inn  Lake St
3   Thai Tanic  Branch Dr

Properties
venue_id  field_id  value
1         1         Deli
1         2         $
2         1         Chinese
2         2         $$$
2         3         Scorpion Bowl
2         4         No
3         5         Yes
3         6         Jazz

Field
id  name
1   Cuisine
2   Price
3   Specialty Cocktail
4   DJ?
5   Live Music
6   Music Genre

Add live music info
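The flexible schema can be tried out in a few lines. This is a sketch using Python's built-in sqlite3 module with an in-memory database; the lowercase table names and the properties_of helper are choices made for the sketch, not something the slides prescribe:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE venue (id INTEGER PRIMARY KEY, name TEXT, address TEXT);
CREATE TABLE field (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE properties (venue_id INTEGER, field_id INTEGER, value TEXT);
""")

conn.executemany("INSERT INTO venue VALUES (?, ?, ?)", [
    (1, "Deli Llama", "Peachtree Rd"),
    (2, "Peking Inn", "Lake St"),
])
conn.executemany("INSERT INTO field VALUES (?, ?)", [
    (1, "Cuisine"), (2, "Price"), (3, "Specialty Cocktail"), (4, "DJ?"),
])
conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", [
    (1, 1, "Deli"), (1, 2, "$"),
    (2, 1, "Chinese"), (2, 2, "$$$"), (2, 3, "Scorpion Bowl"), (2, 4, "No"),
])

# Adding live music info: only new rows, no ALTER TABLE.
conn.execute("INSERT INTO venue VALUES (3, 'Thai Tanic', 'Branch Dr')")
conn.executemany("INSERT INTO field VALUES (?, ?)", [
    (5, "Live Music"), (6, "Music Genre"),
])
conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", [
    (3, 5, "Yes"), (3, 6, "Jazz"),
])


def properties_of(venue_name):
    """Return {field name: value} for one venue by joining the three tables."""
    rows = conn.execute(
        """
        SELECT field.name, properties.value
        FROM properties
        JOIN venue ON venue.id = properties.venue_id
        JOIN field ON field.id = properties.field_id
        WHERE venue.name = ?
        ORDER BY field.id
        """,
        (venue_name,),
    ).fetchall()
    return dict(rows)


print(properties_of("Thai Tanic"))
print(properties_of("Peking Inn"))
```

Each row of the properties table is already close to a statement of the form (venue, field, value), which is where the next slides pick up.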

Page 123: Using Semantics to Enhance Content Publishing

Explicit semantics

Page 124: Using Semantics to Enhance Content Publishing

Remember this from grammar class?

subject predicate object

The basic data unit

Page 125: Using Semantics to Enhance Content Publishing

Machine readable and almost human readable

subject  predicate     object
S1       cuisine       “Deli”
S1       price         “$”
S1       name          “Deli Llama”
S2       cuisine       “Chinese”
S2       price         “$”
S2       name          “Peking Inn”
S2       best drink    “Scorpion Bowl”
S2       address       “Lake St”
S2       DJ?           “No”
S4       name          “Fendalton”
S4       contained-by  S5
S5       name          “Christchurch”
S1       location      S4
S6       name          “Downtown”
S6       contained-by  S7
S7       name          “Wellington, NZ”
S2       location      S6

Restaurants as triples
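The triples above can be held in an ordinary list and queried by pattern. A minimal sketch in plain Python, where match and value are hypothetical helper names and None acts as a wildcard:

```python
# The restaurant triples from the slide as (subject, predicate, object).
triples = [
    ("S1", "cuisine", "Deli"),
    ("S1", "price", "$"),
    ("S1", "name", "Deli Llama"),
    ("S2", "cuisine", "Chinese"),
    ("S2", "price", "$"),
    ("S2", "name", "Peking Inn"),
    ("S2", "best drink", "Scorpion Bowl"),
    ("S2", "address", "Lake St"),
    ("S2", "DJ?", "No"),
    ("S4", "name", "Fendalton"),
    ("S4", "contained-by", "S5"),
    ("S5", "name", "Christchurch"),
    ("S1", "location", "S4"),
    ("S6", "name", "Downtown"),
    ("S6", "contained-by", "S7"),
    ("S7", "name", "Wellington, NZ"),
    ("S2", "location", "S6"),
]


def match(s=None, p=None, o=None):
    """Return every triple that agrees with the given terms (None = anything)."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def value(subject, predicate):
    """First object for a subject/predicate pair, or None if there is none."""
    hits = match(s=subject, p=predicate)
    return hits[0][2] if hits else None


# Which restaurants serve Chinese food?
chinese = [value(s, "name") for s, _, _ in match(p="cuisine", o="Chinese")]

# Follow links: restaurant -> suburb -> city.
city = value(value(value("S1", "location"), "contained-by"), "name")

print(chinese, city)
```

Following location and contained-by is a walk along graph edges, something the spreadsheet version had no place for.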

Page 126: Using Semantics to Enhance Content Publishing

...or as a graph

S1 --Name--> Deli Llama
S1 --Cuisine--> Deli
S1 --Price--> $

Page 127: Using Semantics to Enhance Content Publishing

Restaurant Graph

Nodes: S1 (Deli Llama), S2 (Peking Inn), S4 (Fendalton), Christchurch, Deli, Chinese, $

Edge labels: Name, Cuisine, Price, Location, Contained-by

Page 128: Using Semantics to Enhance Content Publishing

Extending The Restaurant Model

Nodes: S1 (Deli Llama), S4 (Fendalton), Christchurch, Deli, $, Live DJ, Urban Chic

Edge labels: Name, Cuisine, Price, Location, Contained-by, Music, Decor

Page 129: Using Semantics to Enhance Content Publishing

Integrating Graph Data Models

Graph 1 nodes: S1, Deli Llama, Deli, $ (edge labels: Name, Cuisine, Price)

Graph 2 nodes: A2, Deli Llama, Z6, Leinenkugel, Pabst BR, OnTap (edge labels: Name, Brand, Brand)

Page 130: Using Semantics to Enhance Content Publishing

What Went Wrong?

Things change

Requirements change

User expectations change

Data structures change

Our data models aren’t keeping up

Scripting Languages facilitate change

....where is the data model that does the same?

Page 131: Using Semantics to Enhance Content Publishing

Semantic Representation

Relationships are represented explicitly

Schema can be represented as a graph

Data integration is the union of two graphs

This makes creating, extending, and combining data much easier than before
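When graphs are stored as sets of triples, the union really is just a set union. The sketch below is loosely based on the restaurant and beer graphs shown earlier; the on_tap predicate, the exact triples, and the rename helper are assumptions made for illustration:

```python
# Two independently built graphs as sets of (subject, predicate, object).
restaurants = {
    ("S1", "name", "Deli Llama"),
    ("S1", "cuisine", "Deli"),
    ("S1", "price", "$"),
}
beers = {
    ("A2", "name", "Deli Llama"),
    ("A2", "on_tap", "Z6"),          # hypothetical predicate
    ("Z6", "brand", "Leinenkugel"),
}


def rename(graph, old, new):
    """Rewrite one identifier everywhere it occurs in a graph."""
    return {tuple(new if term == old else term for term in t) for t in graph}


# Once both sources agree that A2 and S1 name the same restaurant,
# integration is a plain union; the duplicate name triple collapses.
merged = restaurants | rename(beers, "A2", "S1")

about_s1 = sorted((p, o) for s, p, o in merged if s == "S1")
print(about_s1)
```

The only real work is the identity step, which is why the later slides spend so much time on strong identifiers.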

Page 132: Using Semantics to Enhance Content Publishing

Just enough RDF

Page 133: Using Semantics to Enhance Content Publishing

Just Enough RDF

RDF is a Data Model

A very simple model!

Page 134: Using Semantics to Enhance Content Publishing

Cosmos was written by Carl Sagan

Page 135: Using Semantics to Enhance Content Publishing

Subject  Predicate  Object

(Cosmos) (was written by) (Carl Sagan)

Cosmos --author--> Carl Sagan

Page 136: Using Semantics to Enhance Content Publishing

(Cosmos)

Subject Which Cosmos?

Page 137: Using Semantics to Enhance Content Publishing

(Cosmos)

Subject Which Cosmos?

Page 138: Using Semantics to Enhance Content Publishing

Identifiers are Everywhere

#w2e

Page 139: Using Semantics to Enhance Content Publishing

The humble URI

•URIs provide strong references

•Much like pointing in the physical world

“this is red”
“this is a pen”

•a URIref is an unambiguous pointer to something of meaning

Page 140: Using Semantics to Enhance Content Publishing

(Cosmos)

Subject

http://rdf.freebase.com/ns/authority.openlibrary.book.OL3568862M

Which Cosmos?

Page 141: Using Semantics to Enhance Content Publishing

Cosmos --author--> Carl Sagan

http://rdf.freebase.com/ns/book.written_work.author

What do you mean, author?

vocabulary

Page 142: Using Semantics to Enhance Content Publishing

Cosmos --author-->

There are billions of Carl Sagans...

http://rdf.freebase.com/ns/en.carl_sagan

Page 143: Using Semantics to Enhance Content Publishing

Cosmos --author--> Carl Sagan

Cosmos --published--> “1980”

Page 144: Using Semantics to Enhance Content Publishing

RDF Data Model

Nodes (“Subjects”)

connect via Links (“Predicates”)

to Objects

• either Nodes or Literals
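The Cosmos example can be written out with the identifiers shown on the earlier slides. In this sketch the URI and Literal classes and the ntriples_line function are hypothetical helpers, and the predicate URI for "published" is an assumption (the slides do not give one); the output follows the N-Triples line format:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    """A node or link identified by a URI."""
    value: str

    def nt(self):
        return f"<{self.value}>"


@dataclass(frozen=True)
class Literal:
    """A plain string value; only backslash and quote are escaped here."""
    value: str

    def nt(self):
        escaped = self.value.replace("\\", "\\\\").replace('"', '\\"')
        return f'"{escaped}"'


def ntriples_line(subject, predicate, obj):
    """Serialize one (subject, predicate, object) statement."""
    return f"{subject.nt()} {predicate.nt()} {obj.nt()} ."


COSMOS = URI("http://rdf.freebase.com/ns/authority.openlibrary.book.OL3568862M")
AUTHOR = URI("http://rdf.freebase.com/ns/book.written_work.author")
SAGAN = URI("http://rdf.freebase.com/ns/en.carl_sagan")
PUBLISHED = URI("http://example.org/published")  # assumed, not from the slides

statements = [
    (COSMOS, AUTHOR, SAGAN),               # object is a node
    (COSMOS, PUBLISHED, Literal("1980")),  # object is a literal
]

lines = [ntriples_line(*s) for s in statements]
print("\n".join(lines))
```

Subjects and predicates are always URIs in this sketch; only the object position accepts either kind of term, which mirrors the rule on the slide.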

Page 145: Using Semantics to Enhance Content Publishing

Expressions of RDF

RDF has many (inconvenient) serializations

•RDF/XML

•N3

•Turtle

•N-Triples

•RDFa

Page 146: Using Semantics to Enhance Content Publishing

URIs provide identity

http://rdf.freebase.com/ns/en.robert_cook

Stability

Simplicity

Manageability

Page 147: Using Semantics to Enhance Content Publishing

Not all URLs are good identifiers

Page 148: Using Semantics to Enhance Content Publishing

Data

Data

Data

Data

Semantics allows an application to utilize unanticipated new data sources

Pluggable Data

Page 149: Using Semantics to Enhance Content Publishing

Pluggable Data

Page 150: Using Semantics to Enhance Content Publishing

Data Portability

Data

Data

Data

Data

Semantics allows data to be utilized by unanticipated new applications

Page 151: Using Semantics to Enhance Content Publishing

Data Portability

http://dev.mqlx.com/~jamie/simile/timeline.html

Page 152: Using Semantics to Enhance Content Publishing

Data Portability

Page 153: Using Semantics to Enhance Content Publishing

Semantics facilitate shared meaning through

• Subject Identity

• Strong and Consistent Semantics

• Open APIs + Open Data

These principles make it much easier to extend, combine, and integrate data

Why Does This Work?

Page 154: Using Semantics to Enhance Content Publishing

RDF Graphs

Carrie Fisher --Starred In--> Star Wars
Harrison Ford --Starred In--> Star Wars
Harrison Ford --Starred In--> Blade Runner
Daryl Hannah --Starred In--> Blade Runner

Page 155: Using Semantics to Enhance Content Publishing

Triple Stores (aka Graph Stores)

Page 156: Using Semantics to Enhance Content Publishing

Allegro Graph

Page 157: Using Semantics to Enhance Content Publishing

Keep your data as flexible as the source

+

+

Page 158: Using Semantics to Enhance Content Publishing

Strong Identifiers

Strong Semantics (strong vocabularies)

Open Data

Page 159: Using Semantics to Enhance Content Publishing

Can describe?!

• Semantics Benefit

• Site owners

• Site users

• Developers

• You

At the end of this talk - you should be able to say how semantics benefits each of these groups

Page 160: Using Semantics to Enhance Content Publishing