Apache Stanbol
![Page 1: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/1.jpg)
www.sti-innsbruck.at © Copyright 2012 STI INNSBRUCK www.sti-innsbruck.at
Apache Stanbol
![Page 2: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/2.jpg)
Overview
• Features overview
• Components
– Stanbol Content Enhancer
– Stanbol Entity Hub
– Stanbol Content Hub
– Stanbol Ontology
• Technologies
![Page 3: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/3.jpg)
Features
• Apache Stanbol provides a set of reusable components for semantic content management.
• Apache Stanbol's main features are:
– Content Enhancement: services that add semantic information to "non-semantic" pieces of content.
– Reasoning: services that retrieve additional semantic information about the content, based on the semantic information obtained via content enhancement.
– Knowledge Models: services used to define and manipulate the data models (e.g. ontologies) in which the semantic information is stored.
– Persistence: services that store (or cache) semantic information, i.e. enhanced content, entities and facts, and make it searchable.
![Page 4: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/4.jpg)
Components
• Enhancer: Extracts knowledge from parsed content
• Entityhub: Manage entities and topics of interest to your domain
• Contenthub: Semantic indexing / search over your semantically enhanced content
• CMS Adapter: Sync your CMS with Apache Stanbol (JCR/CMIS)
• Ontology Manager: Manage your formal domain knowledge
• Reasoners & Rules: Apply domain knowledge to improve / validate extracted information. Refactor / refine knowledge to align it with public schemas such as schema.org
![Page 5: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/5.jpg)
Stanbol Content Enhancer
• Entity Tagging - replacing text-based tags such as "Bob Marley" with entities such as dbpedia:Bob_Marley to improve content search and categorization.
• Entity Disambiguation - enhance the entity tagging experience with explicit support for disambiguation between different suggested entities. This allows users to explicitly link to Paris (Texas), Bob Marley (Comedian) or any other entities that share similar labels.
• Entity Checker - interact with extracted entities much as with today's spellcheckers: show extracted/suggested entities directly within the content; allow users to interact with suggestions and to disambiguate between different matches if necessary; support search for additional/other entities.
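The tagging and disambiguation steps above can be pictured with a small sketch. This is not Stanbol code: the label index, its entries and the function names are illustrative assumptions, and a real deployment would query an entity index instead of a hard-coded dictionary.

```python
import re

# Illustrative label index (example data only, not shipped with Stanbol).
LABEL_INDEX = {
    "bob marley": ["dbpedia:Bob_Marley", "dbpedia:Bob_Marley_(comedian)"],
    "paris": ["dbpedia:Paris", "dbpedia:Paris,_Texas"],
    "kingston": ["dbpedia:Kingston,_Jamaica"],
}


def suggest_entities(text):
    """Return (label, candidates) pairs for every known label found in text."""
    lowered = text.lower()
    suggestions = []
    for label, candidates in LABEL_INDEX.items():
        # Match whole words only, so "paris" does not fire on "comparison".
        if re.search(r"\b" + re.escape(label) + r"\b", lowered):
            suggestions.append((label, list(candidates)))
    return suggestions


def needs_disambiguation(suggestion):
    """A label with more than one candidate has to be resolved by the user."""
    _label, candidates = suggestion
    return len(candidates) > 1


if __name__ == "__main__":
    for item in suggest_entities("Bob Marley was born near Kingston."):
        print(item, "ambiguous" if needs_disambiguation(item) else "unique")
```

A label with a single candidate can be linked automatically, while a label with several candidates is what an entity checker would present to the user for selection.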
![Page 6: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/6.jpg)
Stanbol Content Enhancer (II)
![Page 7: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/7.jpg)
Stanbol Content Enhancer (III)
• Support for domain-specific vocabularies
![Page 8: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/8.jpg)
Stanbol Content Enhancer (IV)
• The following languages are supported for Named Entity Recognition and can therefore be used for Named Entity Linking:
– English (via NamedEntityTaggingEngine, OpenCalais)
– Spanish (via NamedEntityTaggingEngine, OpenCalais)
– Dutch (via NamedEntityTaggingEngine)
– French (via CELI NER engine, OpenCalais)
– Italian (via CELI NER engine)
• For the following languages, NLP support is available to improve results when using the Keyword Extraction Engine:
– Danish
– Dutch
– English
– German
– Portuguese
– Spanish
– Swedish
![Page 9: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/9.jpg)
Stanbol Content Enhancer (V)
![Page 10: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/10.jpg)
Stanbol Entity Hub
• Responsible for providing information about the entities relevant to the user's domain. The figure on this slide gives an overview of the features of the Entityhub.
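A minimal sketch of how a client might ask the Entityhub for a single entity over REST. It assumes a Stanbol instance at `http://localhost:8080` with a referenced site named `dbpedia` and the `/entityhub/site/{site}/entity?id=...` lookup path; check both against your installation. The helper name `entity_lookup_request` is hypothetical, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: base URL of a locally running Stanbol instance.
STANBOL_BASE = "http://localhost:8080"


def entity_lookup_request(entity_uri, site="dbpedia", base=STANBOL_BASE):
    """Build (but do not send) a GET request for one entity by its URI."""
    # urlencode percent-encodes the entity URI so it is safe as a query value.
    query = urlencode({"id": entity_uri})
    url = f"{base}/entityhub/site/{site}/entity?{query}"
    return Request(url, headers={"Accept": "application/json"}, method="GET")


if __name__ == "__main__":
    req = entity_lookup_request("http://dbpedia.org/resource/Paris")
    print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would perform the actual lookup against a running server.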
![Page 11: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/11.jpg)
Stanbol Content Hub
• Add Semantic Search to your CMS
– RESTful Faceted Search Interface
– Related Keyword Search using Entityhub, Ontonet or Wordnet
– Improve Search by Semantic Indexing
• Use the Stanbol Contenthub for semantic indexing
![Page 12: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/12.jpg)
Stanbol Ontology
• Manage your Ontologies
– and use/combine them in Scopes
• Reasoning
– on volatile data loaded into Sessions
– consistency check / classification / enrichment
– RDFS, OWL and OWL 2
• Support for background Jobs
– for long-running reasoning tasks
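To make "enrichment" concrete, here is a small sketch of RDFS-style inference over plain tuples. It is not the Stanbol reasoner API; it only applies two standard RDFS entailment rules (rdfs9 and rdfs11), and the `ex:` class names are made up for the example.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def rdfs_enrich(triples):
    """Forward-chain two RDFS rules until no new triples appear.

    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A), (A subClassOf B)       => (x type B)
    """
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        subclass = [(s, o) for s, p, o in inferred if p == SUBCLASS_OF]
        new = set()
        for a, b in subclass:
            for s, p, o in inferred:
                if p == SUBCLASS_OF and s == b:
                    new.add((a, SUBCLASS_OF, o))  # transitivity (rdfs11)
                if p == RDF_TYPE and o == a:
                    new.add((s, RDF_TYPE, b))     # type propagation (rdfs9)
        if not new <= inferred:
            inferred |= new
            changed = True
    return inferred


if __name__ == "__main__":
    data = {
        ("ex:Singer", SUBCLASS_OF, "ex:Artist"),
        ("ex:Artist", SUBCLASS_OF, "ex:Person"),
        ("dbpedia:Bob_Marley", RDF_TYPE, "ex:Singer"),
    }
    for triple in sorted(rdfs_enrich(data) - data):
        print(triple)
```

The loop stops once a pass adds nothing new, so the result is the closure of the input under these two rules.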
![Page 13: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/13.jpg)
Stanbol Ontology
![Page 14: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/14.jpg)
Stanbol Ontology (Rules)
• Stanbol Rules
– Recipes: Manage a set of Rules that are executed together
– Rules are converted to SWRL, Jena Rules or SPARQL CONSTRUCT, depending on the available RuleEngine
• Typical Use Cases
– integrity checks for imported data
– harmonizing vocabularies, e.g. simple SEO by using schema.org
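The vocabulary harmonization use case can be sketched as a simple rewrite over triples, similar in effect to a CONSTRUCT-style rule. This is not Stanbol Rules syntax; the term mapping and the `harmonize` helper are hypothetical and chosen only for illustration.

```python
# Hypothetical mapping from source-vocabulary terms to schema.org terms.
TERM_MAP = {
    "foaf:Person": "schema:Person",
    "foaf:name": "schema:name",
    "foaf:homepage": "schema:url",
}


def harmonize(triples, term_map=TERM_MAP):
    """Rewrite mapped predicates and class objects; keep everything else."""
    result = []
    for s, p, o in triples:
        new_p = term_map.get(p, p)
        # Only the object of rdf:type is treated as a class and rewritten,
        # so literal values that happen to equal a mapped term stay untouched.
        new_o = term_map.get(o, o) if p == "rdf:type" else o
        result.append((s, new_p, new_o))
    return result


if __name__ == "__main__":
    source = [
        ("ex:bob", "rdf:type", "foaf:Person"),
        ("ex:bob", "foaf:name", "Bob Marley"),
    ]
    for triple in harmonize(source):
        print(triple)
```

Grouping several such rewrites and running them together corresponds loosely to the idea of a recipe.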
![Page 15: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/15.jpg)
Technologies
• Functionalities are provided as RESTful services returning results as RDF (Resource Description Framework) and JSON.
– Apache Stanbol also supports the use of JSON-LD.
• Apache Stanbol can be run as a standalone application (packaged as a runnable JAR) or as a web application (packaged as a WAR file) deployable in servlet containers such as Apache Tomcat.
• Written in Java, based on OSGi as the component framework.
• Implemented using frameworks such as
– Apache Solr for semantic search;
– Apache Tika for plain text and metadata extraction;
– Apache OpenNLP for natural language processing;
– Apache Clerezza and Apache Jena as RDF and storage frameworks;
– Apache Felix as OSGi framework; and
– Apache Sling for deployment.
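As a rough illustration of the RESTful style described above, the sketch below builds a request that posts plain text to the Enhancer and asks for RDF in Turtle. The URL `http://localhost:8080/enhancer` assumes a locally started standalone launcher on its default port, and the helper name `enhancer_request` is hypothetical. Nothing is sent until the request is handed to `urlopen`.

```python
from urllib.request import Request

# Assumption: a local Stanbol launcher listening on port 8080.
ENHANCER_URL = "http://localhost:8080/enhancer"


def enhancer_request(text, accept="text/turtle", url=ENHANCER_URL):
    """Build (but do not send) a POST request carrying plain text to enhance."""
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            # The Accept header selects the serialization of the results.
            "Accept": accept,
        },
        method="POST",
    )


if __name__ == "__main__":
    req = enhancer_request("Bob Marley was born in Jamaica.")
    print(req.get_method(), req.full_url, req.get_header("Accept"))
```

Changing the `accept` argument to another RDF or JSON media type supported by the server would request a different serialization of the same enhancement results.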
![Page 16: Apache Stanbol](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816390550346895dd485e2/html5/thumbnails/16.jpg)
Technologies (II)
• Stanbol Components provide
– RESTful API
– Java API and OSGi services
• Stanbol Components do NOT depend on each other
– however, they can be easily combined