breaking down walls in enterprise with social semantics
DESCRIPTION
Keynote Talk at the Workshop on New Trends in Service Oriented Architecture for massive Knowledge processing in Modern Enterprise (SOA-KME 2012) / Palermo, Italy / 6th July 2012
TRANSCRIPT
Breaking down walls in enterprise with social semantics
John Breslin
National University of Ireland, Galway
Lecturer at NUI Galway
• Engineering and informatics
• Researcher at DERI, NUI Galway
Founder of the SIOC project
• Semantically-Interlinked Online Communities
• Enables interoperability and exchange of social content:
– Blogs, forums, wikis...
Co-founder of boards.ie
• Ireland’s largest discussion forum site
• 2.25 million visitors/month
• Irish people seeking information, or just chatting about sports, TV, politics, health, whatever
Co-founder of StreamGlider, Inc.
• Real-time streaming newsreader
• Supports social, multimedia, news
• Can be used as an enterprise dashboard
Social platforms are like data silos
image from pidgintech.com
Many isolated communities of users and their data
Need ways to connect these islands
Allowing users to easily travel from one to another
Enabling users to easily bring their data with them
Parallels in enterprise
image from tinyurl.com/orionw
• Workers are using a variety of collaboration platforms internally in a localised or distributed enterprise
• These platforms remain largely isolated from each other
• Vast amounts of shared items and profiles are disconnected
Object-centred sociality (AKA social objects)
• Users are connected via a common object:
– Their job, university, hobbies, interests, a date…
• “According to this theory, people don’t just connect to each other. They connect through a shared object. […] Good services allow people to create social objects that add value.” – Jyri Engestrom
– Flickr = photos
– del.icio.us = bookmarks
– Blogs = discussion posts
It’s the social objects we create…
• Discussions
• Bookmarks
• Annotations
• Profiles
• Microblogs
• Multimedia
…that connect us to other people
Semantics
The Semantic Web
A brief overview
What’s in a page? And in a link?
Tim Berners-Lee, The 1st World Wide Web Conference, Geneva, May 1994
To a computer, the Web is a flat, boring world, devoid of meaning. This is a pity, as in fact documents on the Web describe real objects and imaginary concepts, and give particular relationships between them. […] Adding semantics to the Web involves two things: allowing documents which have information in machine-readable forms, and allowing links to be created with relationship values. Only when we have this extra level of semantics will we be able to use computer power to help us exploit the information to a greater extent than our own reading.
Identifying resources with URIs
• URIs are used to identify everything in a unique and non-ambiguous way:
– Not only pages (as on the current Web), but any resource (people, documents, books, interests, etc.)
– A URI for a person is different from a URI for a document about the person, because a person is not a document!
– e.g. http://dbpedia.org/resource/Galway
Defining assertions with RDF
• URIs identify resources:
– How do we define assertions about these resources?
• We use RDF (Resource Description Framework):
– A data model; a directed, labeled graph using URIs
– Various serialisations (RDF/XML, N3, RDFa, etc.)
• RDF is based on triples:
– <subject> <predicate> <object> .
RDF by example
@prefix dct: <http://purl.org/dc/terms/> .

<http://example.org/dm110-semweb>
    dct:title "Introduction to the Semantic Web" ;
    dct:author <http://apassant.net/alex> ;
    dct:subject <http://dbpedia.org/resource/Semantic_Web> .
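The three statements in the example above can also be held as plain (subject, predicate, object) tuples. The following is a minimal sketch in standard-library Python, not a real RDF toolkit; the Literal marker class and the helper names term and to_ntriples are assumptions made for illustration. It writes each triple in the line-based N-Triples form.

```python
# Sketch only: RDF statements as (subject, predicate, object) tuples.
DCT = "http://purl.org/dc/terms/"
DOC = "http://example.org/dm110-semweb"


class Literal(str):
    """Hypothetical marker for a plain string value, as opposed to a URI."""


triples = [
    (DOC, DCT + "title", Literal("Introduction to the Semantic Web")),
    (DOC, DCT + "author", "http://apassant.net/alex"),
    (DOC, DCT + "subject", "http://dbpedia.org/resource/Semantic_Web"),
]


def term(value):
    """Render one term: literals in double quotes, URIs in angle brackets."""
    if isinstance(value, Literal):
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'
    return "<" + value + ">"


def to_ntriples(statements):
    """Serialise triples one per line, each terminated by ' .'."""
    return "\n".join(
        " ".join(term(part) for part in statement) + " ."
        for statement in statements
    )


print(to_ntriples(triples))
```

Each printed line is one complete statement, which is why the same data can be merged with other sources simply by concatenating lines.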
Defining semantics with ontologies
• RDF provides a way to write assertions about URIs:
– But what about the semantics of these assertions, e.g. to state that http://xmlns.com/foaf/0.1/knows identifies an acquaintance relationship?
• Ontologies provide common semantics for resources on the Semantic Web:
– “An ontology is a specification of a conceptualization”
– RDFS and OWL have different expressiveness levels
• Ontologies consist mainly of classes and properties:
– :Person a rdfs:Class .
– :father a rdf:Property .
– :father rdfs:domain :Person .
– :father rdfs:range :Person .
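What the rdfs:domain and rdfs:range statements above buy us is inference: from a single :father statement, a reasoner can conclude that both ends are of type :Person. The following is a minimal sketch of that one rule in plain Python, not a full RDFS reasoner; the individuals :alice and :bob and the function name infer_types are assumptions for illustration.

```python
# Sketch only: apply the RDFS domain and range rules to a tiny graph.
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"
RDFS_RANGE = "rdfs:range"

schema = {
    (":father", RDFS_DOMAIN, ":Person"),
    (":father", RDFS_RANGE, ":Person"),
}

# Hypothetical instance data: the father of :alice is :bob.
data = {(":alice", ":father", ":bob")}


def infer_types(schema, data):
    """Return the rdf:type triples implied by domain and range statements."""
    inferred = set()
    for prop, constraint, cls in schema:
        for s, p, o in data:
            if p != prop:
                continue
            if constraint == RDFS_DOMAIN:
                inferred.add((s, RDF_TYPE, cls))  # subject falls in the domain
            elif constraint == RDFS_RANGE:
                inferred.add((o, RDF_TYPE, cls))  # object falls in the range
    return inferred


for triple in sorted(infer_types(schema, data)):
    print(triple)
```

Neither :alice nor :bob was declared to be a :Person; both types follow from the ontology alone.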
Linked Data
• Building a “Web of Data” to enhance the current Web
• Exposing, sharing and connecting data about things via dereferenceable URIs
• The Linking Open Data (LOD) project:
– http://linkeddata.org/
– Translating existing datasets into RDF and linking them together, for example DBpedia (Wikipedia) and GeoNames, Freebase, BBC programmes, etc.
– Government data available as Linked Data
– LOD cloud in 2007:
image from richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.png
Social semantic representation models
Using ontologies to model social data
Two-way street: the Semantic Web can help social spaces, and vice versa
• Can use the Semantic Web to describe people, content objects and the connections that bind them all together so that social spaces can interoperate via semantics
• In the other direction, object-centered social spaces can serve as rich social data sources for semantic applications
image from tinyurl.com/highway2
The Social Semantic Web
FOAF
Friend Of A Friend
What is FOAF?
• An ontology for describing people and the relationships that exist between them:
– http://foaf-project.org/
– Identity, personal profiles and social networks
– Can be integrated with other SW vocabularies
• FOAF on the Web:
– LiveJournal, MyOpera, identi.ca, MyBlogLog, hi5, Fotothing, Videntity, FriendFeed, Ecademy, Typepad
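To make the shape of a FOAF description concrete, here is a small sketch that builds a Turtle profile as a string. It uses only the well-known FOAF terms foaf:Person, foaf:name, foaf:homepage and foaf:knows; the function name foaf_profile and all example.org URIs are assumptions for illustration, and the sketch assumes the name contains no double quotes.

```python
# Sketch only: emit a minimal FOAF profile in Turtle.
def foaf_profile(person_uri, name, homepage=None, knows=()):
    """Build a Turtle description of one person and the people they know."""
    statements = ["a foaf:Person", f'foaf:name "{name}"']
    if homepage:
        statements.append(f"foaf:homepage <{homepage}>")
    for friend_uri in knows:
        # foaf:knows links one person URI to another, forming the network.
        statements.append(f"foaf:knows <{friend_uri}>")
    body = " ;\n    ".join(statements)
    return (
        "@prefix foaf: <http://xmlns.com/foaf/0.1/> .\n\n"
        f"<{person_uri}>\n    {body} .\n"
    )


# Hypothetical people, for illustration only.
print(foaf_profile(
    "http://example.org/people/alice#me",
    "Alice Example",
    homepage="http://example.org/people/alice",
    knows=["http://example.org/people/bob#me"],
))
```

Because the friend is identified by a URI rather than a name, that URI can point at a profile hosted on a different site, which is what makes the identity distributed.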
FOAF at a glance
Distributed identity with FOAF
FOAF from Flickr
FOAF from Twitter
Interlinking identities and networks
SIOC, pronounced shock
image from tinyurl.com/siocshock
Semantically-Interlinked Online Communities (SIOC)
• Goal of the SIOC ontology is to address interoperability issues on the Social Web
– sioc-project.org
– W3C member submission in 2007
– SIOC has been adopted in a framework of applications or modules deployed on hundreds of sites
– Web 2.0, enterprise information integration, HCLS, e-government
image from tinyurl.com/friendship2
Some of the SIOC core ontology classes and properties
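As a rough illustration of how these classes and properties fit together, the sketch below describes two posts made from two different accounts and then finds everything written by the one person behind both. It uses the SIOC terms sioc:Post, sioc:Forum, sioc:UserAccount, sioc:has_creator, sioc:has_container, sioc:reply_of and sioc:account_of; the ex: resources and the function name posts_by_person are assumptions for illustration.

```python
# Sketch only: SIOC data as prefixed-name triples, queried in plain Python.
triples = {
    ("ex:post1", "rdf:type", "sioc:Post"),
    ("ex:post1", "sioc:has_container", "ex:forum"),
    ("ex:post1", "sioc:has_creator", "ex:alice_forum"),
    ("ex:forum", "rdf:type", "sioc:Forum"),
    ("ex:reply1", "rdf:type", "sioc:Post"),
    ("ex:reply1", "sioc:reply_of", "ex:post1"),
    ("ex:reply1", "sioc:has_creator", "ex:alice_blog"),
    ("ex:alice_forum", "rdf:type", "sioc:UserAccount"),
    ("ex:alice_blog", "rdf:type", "sioc:UserAccount"),
    # Two accounts on different platforms belong to one foaf:Person.
    ("ex:alice_forum", "sioc:account_of", "ex:alice"),
    ("ex:alice_blog", "sioc:account_of", "ex:alice"),
    ("ex:alice", "rdf:type", "foaf:Person"),
}


def posts_by_person(triples, person):
    """Follow account_of, then has_creator, to collect a person's posts."""
    accounts = {
        s for s, p, o in triples if p == "sioc:account_of" and o == person
    }
    return sorted(
        s for s, p, o in triples if p == "sioc:has_creator" and o in accounts
    )


print(posts_by_person(triples, "ex:alice"))
```

The link from accounts to a person is what lets content from separate community sites be gathered under one identity.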
Some applications using SIOC
RDFa on newsweek.com
RDFa in Drupal 7
• Drupal CMS used by 2 percent of all sites
• Drupal 7 release has Semantic Web support built-in
• RDFa (SIOC, FOAF, Dublin Core, SKOS) data for blog posts, forums, etc.
• Video at www.semantic-drupal.com
image from tinyurl.com/drupaper
How much SIOC data is out there?
images (this one and later backgrounds) from publicdomainpictures.net
Sindice 2012: classes
• Total instances of SIOC classes: 7.7M
– Up 200k in three months
• Most occurrences: sioc:Item (2.2M)
– Followed by: UserAccount (1.6M), MicroblogPost (1.3M), Post (800k), User (700k), Comment (400k)...
– Note: 1 billion foaf:Person instances!!!
• Used on most [distinct] sites:
– Item (7k), UserAccount (7k), Post (3k)...
Sindice 2012: predicates
• Total instances of SIOC predicates: 22.5M
– Up 400k in three months
• Most occurrences: sioc:follows (4.6M)
– Followed by: topic (4M), account_of (3.5M), has_creator (2.7M), links_to (1.5M), has_discussion (1.3M)...
• Used on most [distinct] sites:
– has_creator (8k), num_replies (7k), name (2k), account_of (1.5k), reply_of (1.5k)...
Sindice 2012: namespaces
• SIOC data is being generated from 10k distinct domains (2k SLDs) (plus 2k domains for the SIOC Types module)
– Increasing by about 100 domains a month
– No doubt helped by Drupal!
• FOAF data is being generated from 3M distinct domains (100k SLDs)
– Increasing by over 1000 domains a month
CommonCrawl
• Muehleisen and Bizer
– LDOW @ WWW 2012
• 1.5 billion web pages
• 3 billion RDF quads
• SIOC available from at least 22k PLDs (pay-level domains)
• FOAF on 27k PLDs
• Results published on Monday 2 July 2012 at:
– webdatacommons.org/vocabulary-usage-analysis/index.html
Online Presence Ontology (OPO)
Tagging issues
• Tagging enables user-generated classification of content with evolving and user-driven vocabularies
• But it also raises various issues:
– Tag ambiguity: “apple” = fruit or computer brand?
– Tag heterogeneity: “socialmedia”, “social_media”, “socmed”
– Lack of organisation: no links between tags, e.g. “SPARQL” and “RDF”
The Tag Ontology
• The “Tag Ontology” by Newman from 2005:
– http://www.holygoat.co.uk/projects/tags/
– Based on Gruber’s tag model
– tags:Tag rdfs:subClassOf skos:Concept
– A “Tagging” class describing relationships between:
• A user
• An annotated resource
• Some tags
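The three-way relationship can be sketched as a small record that expands into triples. The property names tags:taggedBy, tags:taggedResource and tags:associatedTag are written as recalled from the Tag Ontology and should be checked against its specification; the Python class layout and the ex: resources are assumptions for illustration.

```python
# Sketch only: one tagging event relating a user, a resource and some tags.
from dataclasses import dataclass


@dataclass(frozen=True)
class Tagging:
    uri: str        # identifier of the tagging event itself
    user: str       # who tagged
    resource: str   # what was tagged
    tags: tuple     # which tags were applied

    def to_triples(self):
        """Expand the event into (subject, predicate, object) triples."""
        triples = [
            (self.uri, "rdf:type", "tags:Tagging"),
            (self.uri, "tags:taggedBy", self.user),
            (self.uri, "tags:taggedResource", self.resource),
        ]
        # One triple per tag, all hanging off the same tagging event.
        triples += [(self.uri, "tags:associatedTag", tag) for tag in self.tags]
        return triples


# Hypothetical event, for illustration only.
event = Tagging(
    uri="ex:tagging1",
    user="ex:alice",
    resource="ex:report42",
    tags=("ex:tag/sparql", "ex:tag/rdf"),
)
for triple in event.to_triples():
    print(triple)
```

Modelling the event as its own resource keeps track of who applied which tag to what, which a flat list of tags on a document cannot express.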
MOAT
• MOAT (Meaning Of A Tag):
– http://moat-project.org/
– A model to define “meanings” of tags
– e.g. SPARQL → http://dbpedia.org/resource/SPARQL
– User-driven interlinking
– Tagged content enters the “Linked Data” web
– Collaborative approach to share meanings in a community
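The idea of a community sharing tag meanings can be sketched as a lookup from a tag string to the URIs people have chosen for it. This is not the MOAT vocabulary or its client and server software, only an illustration of the principle; the class name MeaningStore, its methods and the ex: resources are assumptions.

```python
# Sketch only: a shared store of tag meanings, in the spirit of MOAT.
class MeaningStore:
    def __init__(self):
        self._meanings = {}  # lower-cased tag -> list of meaning URIs

    def suggest(self, tag):
        """Return the meanings the community has already used for a tag."""
        return list(self._meanings.get(tag.lower(), []))

    def annotate(self, resource, tag, meaning_uri):
        """Record the meaning a user picked, and share it with everyone."""
        candidates = self._meanings.setdefault(tag.lower(), [])
        if meaning_uri not in candidates:
            candidates.append(meaning_uri)
        # The tagged content is now linked to a URI, not just a string.
        return (resource, tag, meaning_uri)


store = MeaningStore()
store.annotate("ex:doc1", "apple", "http://dbpedia.org/resource/Apple")
store.annotate("ex:doc2", "Apple", "http://dbpedia.org/resource/Apple_Inc.")
store.annotate("ex:doc3", "SPARQL", "http://dbpedia.org/resource/SPARQL")

print(store.suggest("apple"))
```

The ambiguous tag keeps both meanings, and the next user who types it is offered the two URIs to choose from rather than having to guess.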
Tagging process with MOAT and DBpedia
MOAT in Drupal
Unifying collaborations
Some more semantically-enhanced systems, with enterprise applicability
Semantic MediaWiki (SMW)
Sample output from a SMW query
Linking IRC to the Web of Data
Mailing lists
Bulletin boards
SMOB
Semantic #tagging
• User-driven interlinking
• URIs are suggested in real time when writing content
• Added ability to add new webservices (e.g. enterprise microblogging with contextual semantics)
Distributed architecture
An ontology stack for social semantic collaborative spaces
Semantic Enterprise 2.0
Enterprise 2.0 goes semantic
Enterprise 2.0
• Web 2.0 includes applications such as blogs, wikis, RSS feeds and social networking, while Enterprise 2.0 is the packaging of those technologies in both corporate IT and workplace environments:
– Corporate blogging, wikis, microblogging
– Social networking within organisations, etc.
• “Enterprise 2.0 is the use of emergent social software platforms within companies, or between companies and their partners or customers” - McAfee, MIT Sloan, 2006
Enterprise 2.0 and the Web
• Many enterprises have an online presence on various Web 2.0 services to reach their customers:
– Slideshare
– Flickr
– etc.
The SLATES acronym
• Search: Easy and relevant access to information
• Links: Enable better browsing capabilities between content
• Authoring: Easy interfaces to produce content, in a collaborative way
• Tagging: User-generated classification, enables serendipity and knowledge discovery
• Extension: Recommendation of relevant content
• Signals: Identify relevant content
Social aspects of Enterprise 2.0
• Enterprise 2.0 introduces new paradigms in organisations with regard to knowledge sharing and communication patterns:
– Enterprise 2.0 is a philosophy
• Enterprise 2.0’s success depends on a company’s background:
– A study by AIIM showed that 41% of companies do not have a clear understanding of what Enterprise 2.0 is, while this percentage goes down to 15% in KM-oriented companies.
Keys to Enterprise 2.0 adoption
• Combining top-down and bottom-up approaches helps to realise Enterprise 2.0:
– Top-down: Hierarchy (bosses!) sets up new tools and requires that various sections use them
– Bottom-up: Users become evangelists and word-of-mouth improves the number of new users
Business metrics for Enterprise 2.0
• 13% of the Fortune 500 companies have a public blog maintained by their employees
• Forrester Research predicts a global market for Enterprise 2.0 solutions of 4.6 billion dollars by 2013, and according to Gartner, more social computing platforms will be adopted by companies in the next 10 years
• Lots of companies and products in this space:
– Awareness, Mentor Scout, SelectMinds, introNetworks, Jive Software, Visible Path, Web Crossing, SocialText, etc.
Open-source applications
• Open-source Web 2.0 apps can be efficiently used in organisations to build Enterprise 2.0 ecosystems:
– Blogging: WordPress, etc.
– Wikis: MediaWiki, MoinMoin, etc.
– RSS readers and APIs: MagpieRSS, etc.
– Integrated CMSs: Drupal, etc.
Information fragmentation issues
• Heterogeneity of people, services, needs and practices leads to various services and tools being deployed
• By using various services (blogs, wikis, etc.), information about a particular object (e.g. a project) is fragmented over a company’s network:
– Getting a global picture is difficult
• Applications act as independent data silos, with different APIs, different data formats, etc.:
– Data integration can be a costly task
Lack of machine-readable data and tagging issues
• Enterprise 2.0 enables and encourages people to provide valuable content inside organisations:
– However, information is complex to re-use, generally remains locked inside services, and is for human consumption only
• Some queries cannot be answered automatically:
– “List all the US-based companies involved in sustainable energies”
– Plus there’s the aforementioned issue with tagging
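Once the same facts are available as structured statements, a question like the one above becomes a pattern match. The sketch below is a toy triple-pattern matcher in plain Python, in the spirit of a SPARQL basic graph pattern but not SPARQL itself; the company data, the ex: vocabulary and the function names match and query are all assumptions for illustration.

```python
# Sketch only: answer a two-pattern query over a handful of triples.
triples = [
    ("ex:SolarCo", "ex:basedIn", "ex:USA"),
    ("ex:SolarCo", "ex:sector", "ex:SustainableEnergy"),
    ("ex:WindAG", "ex:basedIn", "ex:Germany"),
    ("ex:WindAG", "ex:sector", "ex:SustainableEnergy"),
    ("ex:OilInc", "ex:basedIn", "ex:USA"),
    ("ex:OilInc", "ex:sector", "ex:FossilFuels"),
]


def match(triples, pattern):
    """Yield variable bindings for one pattern; '?name' terms are variables."""
    for triple in triples:
        binding = {}
        for want, got in zip(pattern, triple):
            if want.startswith("?"):
                if binding.get(want, got) != got:
                    break  # same variable would need two different values
                binding[want] = got
            elif want != got:
                break      # constant term does not match
        else:
            yield binding


def query(triples, patterns):
    """Join several patterns, carrying bindings from one to the next."""
    solutions = [{}]
    for pattern in patterns:
        extended = []
        for solution in solutions:
            # Substitute variables that earlier patterns already bound.
            bound = tuple(solution.get(term, term) for term in pattern)
            for binding in match(triples, bound):
                extended.append({**solution, **binding})
        solutions = extended
    return solutions


results = query(triples, [
    ("?company", "ex:basedIn", "ex:USA"),
    ("?company", "ex:sector", "ex:SustainableEnergy"),
])
print([r["?company"] for r in results])
```

The same question asked of free-text blog posts and wiki pages would need a person to read them, which is the gap the slide is pointing at.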
Semantic Web in enterprises
• Semantic Web technologies are already widely used in organisations:
– Ontology-based information management
– Semantic middleware between databases
– Intelligent portals
– etc.
• Semantic Web Education and Outreach (W3C):
– http://www.w3.org/2001/sw/sweo/public/UseCases/
– NASA, Lilly, Oracle, Yahoo!, etc.
A Semantic Enterprise 2.0 architecture
• Lightweight add-ons to existing applications to provide RDF data:
– Exporters, wrappers, dedicated scripts, etc.
– Taking into account the social aspect (e.g. semantic wikis)
• Models to give meaning to this RDF data:
– Domain ontologies, taxonomies, etc.
• Applications on the top of it:
– Thanks to RDF(S)/OWL and SPARQL
The RDF Bus approach
• RDF Bus architecture (Tim Berners-Lee):
– Add-ons to produce RDF data from existing Web 2.0 applications
• Store distributed data using RDF stores
• Create new applications:
– Semantic mashups
– Semantic search
• Open architecture thanks to a SPARQL endpoint, with services as plugins to the architecture
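A very small sketch of the bus idea follows: each existing application gets an exporter that turns its native records into triples, everything lands in one shared store, and new applications query the store instead of each silo. The record layouts, the ex: resources and the names blog_exporter, wiki_exporter and RdfBus are assumptions for illustration; a real deployment would use an RDF store behind a SPARQL endpoint rather than a Python set.

```python
# Sketch only: exporters feeding a shared triple store.
def blog_exporter(posts):
    """Add-on for a hypothetical blog engine."""
    for post in posts:
        yield (post["url"], "dct:title", post["title"])
        yield (post["url"], "sioc:topic", post["project"])


def wiki_exporter(pages):
    """Add-on for a hypothetical wiki, with a different native format."""
    for page in pages:
        yield (page["uri"], "rdfs:label", page["name"])
        yield (page["uri"], "sioc:topic", page["about"])


class RdfBus:
    def __init__(self):
        self.store = set()

    def publish(self, triples):
        """Merge triples from any exporter into the shared store."""
        self.store.update(triples)

    def subjects(self, predicate, obj):
        """Find every resource with the given predicate and object."""
        return sorted(
            s for s, p, o in self.store if p == predicate and o == obj
        )


bus = RdfBus()
bus.publish(blog_exporter([
    {"url": "ex:blog/1", "title": "Kick-off notes", "project": "ex:ProjectX"},
]))
bus.publish(wiki_exporter([
    {"uri": "ex:wiki/Design", "name": "Design", "about": "ex:ProjectX"},
    {"uri": "ex:wiki/Canteen", "name": "Canteen menu", "about": "ex:Food"},
]))

# One query now spans both applications.
print(bus.subjects("sioc:topic", "ex:ProjectX"))
```

Adding a third application means writing one more exporter; the consumers of the store do not change.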
Relational DB to RDF mapping
• Relational data (RDB) is structured and can be mapped to RDF in a straightforward way:
– Allows integration of existing enterprise databases into the Semantic Enterprise 2.0 architecture
• Main issues include: closed-world vs. open-world modeling; assigning URIs for entities (records); mapping language expressivity
• For a state-of-the-art survey, see http://www.w3.org/2005/Incubator/rdbrdf/RDB2RDF_SurveyReport.pdf
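As a rough sketch of such a mapping, the code below turns each row of a table into a resource and each non-key column into a property, loosely in the spirit of a direct mapping. It uses Python's built-in sqlite3 module; the base URI, the employee table and the function name table_to_triples are assumptions for illustration, and the table name is interpolated into SQL so it must come from trusted configuration.

```python
# Sketch only: map rows of a relational table to RDF-style triples.
import sqlite3

BASE = "http://example.org/db/"  # hypothetical base URI for minted identifiers


def table_to_triples(conn, table, key):
    """Yield one typed resource per row and one triple per non-NULL column."""
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key]}"   # URI assigned to the record
        yield (subject, "rdf:type", f"{BASE}{table}")
        for column, value in record.items():
            # A NULL produces no triple: the value is unknown, not asserted.
            if column != key and value is not None:
                yield (subject, f"{BASE}{table}#{column}", value)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Alice", "R&D"), (2, "Bob", None)],
)

triples = list(table_to_triples(conn, "employee", "id"))
for triple in triples:
    print(triple)
```

The two design points from the list above show up directly: the primary key is what gives each record a URI, and skipping NULLs is one simple way to handle the gap between closed-world tables and open-world RDF.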
LOD and Semantic Enterprise 2.0
• Huge potential for internal IT infrastructures to enhance existing applications (mashups, extended UIs, etc.):
– Integration of open and structured data from various sources at minor cost
• Issue: dependence on external services; replication may be required
• RSS is already widely used in organisations as a way to get information from the Web; LOD provides structured data to extend IT ecosystems
Semantic Enterprise 2.0 use cases
• Electricité De France R&D:
– Integration of Enterprise 2.0 components using lightweight semantics
• Ecospace EU project:
– Interoperability of collaborative work environments
• Boeing inSite:
– Uses SIOC, FOAF and other social web standards to reduce time and effort spent finding and sharing
Use case: EDF R&D
Use case: CWE interoperability
Figure labels: BSCW shadow folder, BC semantic folder, private folders
Use case: Boeing inSite
Related ongoing work
SPARQL+XMPP+spreading activation for linking enterprise collaborations (Cisco)
Using PPO/PPM to access Linked (Enterprise) Data
Aggregated, interoperable and multi-platform user profiles
Summary
• Object-centred sociality refers to how we really use social spaces:
– Can use semantics to describe this usage, by representing objects for linkage and reuse
• Applicability in the enterprise for collaboration platforms
• Describe people, networks, content, presence, knowledge, tags, etc. with semantics
• Providing solutions for novel uses in organisations:
– Not just for the Social Web, but for Enterprise 2.0
Acknowledgements
• Thanks to my colleagues in the Unit for Social Software (USS) at DERI, especially for their slides!
• We appreciate the support of Science Foundation Ireland and the Irish Research Council
image from tinyurl.com/starshiptr
Our book…
…at Amazon.com