The Road to the Semantic Web Michael Genkin SDBI 2010@HUJI


DESCRIPTION

A seminar lecture presenting a Carnegie Mellon research project "Read the Web" (http://rtw.ml.cmu.edu/rtw/). Presented at the Databases & the Internet seminar at Hebrew University of Jerusalem

TRANSCRIPT

Page 1: The Road to the Semantic Web

The Road to the Semantic Web
Michael Genkin

SDBI 2010@HUJI

Page 2: The Road to the Semantic Web

Michael Genkin ([email protected])

"The Semantic Web is not a separate Web but an extension of the current one, in which information is given well-defined meaning, better enabling computers and people to work in cooperation."

Tim Berners-Lee, James Hendler and Ora Lassila; Scientific American, May 2001

Page 3: The Road to the Semantic Web

Over 25 billion RDF triples (October 2010)

More than 24 billion web pages (June 2010)

Probably more than one triple per page is needed; in fact, a lot more.

Page 4: The Road to the Semantic Web

How will we populate the Semantic Web?

Humans will enter structured data

Data-store owners will share their data

Computers will read unstructured data

Page 5: The Road to the Semantic Web

Read the Web
http://rtw.ml.cmu.edu/rtw/
(or google it)

Page 6: The Road to the Semantic Web

Roadmap

Motivation
Some definitions
Natural language processing
Machine learning
Macro reading the web
Coupled training
NELL
Demo
Summary

Page 7: The Road to the Semantic Web

Some Definitions

Natural Language Processing
Machine Learning

Page 8: The Road to the Semantic Web

Natural Language Processing

Part-of-speech tagging (e.g. noun, verb).
Noun phrase: a phrase that normally consists of a (modified) head noun; it may be pre-modified (e.g. this, that, the red…) or post-modified (e.g. …with long hair, …where I live).
Proper noun: a noun which represents a unique entity (e.g. Jerusalem, Michael).
Common noun: a noun which represents a class of entities (e.g. car, university).

Page 9: The Road to the Semantic Web

Learning: What is it?

Assume there is some knowledge base KB.
Let some algorithm perform a set of tasks T.
Let Perf be a performance metric.
We will say that a computer program learns if its performance on the tasks in T, as measured by Perf, improves as the knowledge base KB grows.

Page 10: The Road to the Semantic Web

Training Methods

Page 11: The Road to the Semantic Web

Supervised

We have a set of examples (KB) and a domain (D).
Examples might be positive or negative, i.e. for every input x in D the KB gives a label y = f(x) for some target function f.
The learning algorithm A tries to find a function f* that approximates f; f* is called a classifier (or a regression, for continuous labels).
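To make the supervised setting concrete, here is a minimal sketch (not from the lecture) in which the algorithm A searches for a one-dimensional threshold classifier f* over a handful of invented labeled examples:

```python
# (input, label) pairs: the labeled examples forming a tiny KB.
KB = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 1)]

def learn_threshold(examples):
    """Pick the threshold that misclassifies the fewest examples."""
    best_t, best_err = None, float("inf")
    for t, _ in examples:
        err = sum((x >= t) != bool(y) for x, y in examples)
        if err < best_err:
            best_t, best_err = t, err
    return best_t

t = learn_threshold(KB)          # the learned f* is "x >= t"
classifier = lambda x: int(x >= t)
print(t, [classifier(x) for x, _ in KB])  # 3.0 [0, 0, 1, 1]
```

The search over candidate thresholds stands in for A; with richer hypothesis spaces the search gets harder, but the shape of the problem is the same.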

Page 12: The Road to the Semantic Web

Unsupervised

Distinguished from supervised learning by the fact that there are no labeled examples (KB = D).
The unsupervised learning algorithm A tries to find a classifier that, given some x as input, returns some arbitrary label; i.e. the algorithm A analyses the structure of D.

Page 13: The Road to the Semantic Web

Semi-Supervised

A middle way between supervised and unsupervised.
Use a minimal amount of labeled examples and a large amount of unlabeled ones.
Learn the structure of D in an unsupervised manner, but use the labeled examples to constrain the results. Repeat. This is known as bootstrapping.

Page 14: The Road to the Semantic Web

Bootstrapping: Iterative Semi-Supervised Learning

Seed instances for the category City: Jerusalem, Tel Aviv, Haifa.
Learned extraction patterns: "mayor of arg1", "life in arg1".
Newly promoted instances: Ness-Ziona, London, Amsterdam, but also denial, anxiety, selfishness, extracted by overly general patterns such as "arg1 is home of" and "traits such as arg1".
The process is under-constrained: semantic drift!
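The bootstrapping loop, and the semantic drift it suffers from, can be sketched in a few lines of Python. This is a toy illustration, not code from the Read the Web project; the corpus, the pattern-matching scheme, and the promotion rule are all invented for the example.

```python
import re

# Toy corpus; each sentence mentions one candidate instance.
corpus = [
    "the mayor of Jerusalem spoke",
    "the mayor of Haifa spoke",
    "life in Jerusalem is busy",
    "life in Amsterdam is busy",
    "life in denial is busy",
]

# Seed instances for the category City.
promoted = {"Jerusalem", "Tel Aviv", "Haifa"}
patterns = set()

def extract_patterns(instances):
    """Turn sentences containing a known instance into 'arg1' patterns."""
    found = set()
    for sentence in corpus:
        for inst in instances:
            if inst in sentence:
                found.add(sentence.replace(inst, "arg1"))
    return found

def extract_instances(pats):
    """Apply each pattern as a regex to extract new candidate instances."""
    found = set()
    for pat in pats:
        regex = re.escape(pat).replace("arg1", r"(\w+)")
        for sentence in corpus:
            match = re.fullmatch(regex, sentence)
            if match:
                found.add(match.group(1))
    return found

# Two bootstrapping iterations: learn patterns from instances,
# then new instances from patterns.
for _ in range(2):
    patterns |= extract_patterns(promoted)
    promoted |= extract_instances(patterns)

# 'Amsterdam' is a correct extraction, but the over-general pattern
# "life in arg1 is busy" also promotes 'denial': semantic drift.
print(sorted(promoted))
```

With nothing constraining the labels, every iteration compounds the errors of the previous one; the coupling constraints discussed later exist precisely to stop this.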

Page 15: The Road to the Semantic Web

Macro Reading the Web

Populating the Semantic Web by Macro-Reading Internet Text. T.M. Mitchell, J. Betteridge, A. Carlson, E.R. Hruschka Jr., and R.C. Wang. Invited paper, in Proceedings of the International Semantic Web Conference (ISWC), 2009.

Page 16: The Road to the Semantic Web

Problem Specification (1): Input

An initial ontology that contains:
Dozens of categories and relations (e.g. Company, CompanyHeadquarteredInCity)
Relations between categories and relations (e.g. mutual exclusion, type constraints)
A few seed examples of each predicate in the ontology
The web
Occasional access to a human trainer

Page 17: The Road to the Semantic Web

Problem Specification (2): The Task

Run forever (24x7). Each day:
Run over ~500 million web pages.
Extract new facts and relations from the web to populate the ontology.
Perform better than the day before.
Populate the semantic web.

Page 18: The Road to the Semantic Web

A Solution? An automatic, learning, macro-reader.

Page 19: The Road to the Semantic Web

Micro vs. Macro Reading (1)

Micro-reading: the traditional NLP task of annotating a single web page to extract the full body of information contained in the document. NLP is hard!
Macro-reading: the task of "reading" a large corpus of web pages (e.g. the web) and returning a large collection of facts expressed in the corpus, though not necessarily all the facts.

Page 20: The Road to the Semantic Web

Micro vs. Macro Reading (2)

Macro-reading is easier than micro-reading. Why?
Macro-reading doesn't require extracting every bit of information available.
In text corpora as large as the web, many important facts are stated redundantly, thousands of times, using different wordings.
It benefits by ignoring complex sentences.
It benefits by statistically combining evidence from many fragments to determine a belief in a hypothesis.

Page 21: The Road to the Semantic Web

Why an Input Ontology?

The problem with understanding free text is that it can mean virtually anything.
By formulating the problem of macro-reading as populating an ontology, we allow the system to focus only on relevant documents.
The ontology can define meta-properties of its categories and relations.
This allows populating those parts of the semantic web for which an ontology is available.

Page 22: The Road to the Semantic Web

Machine Learning Methods

Semi-supervised (use an ontology to learn).
Learn textual patterns for extraction.
Employ methods such as Coupled Training to improve accuracy.
Expand the ontology to improve performance.

Page 23: The Road to the Semantic Web

Coupled Training

Page 24: The Road to the Semantic Web

Bootstrapping – Revised: Iterative Semi-Supervised Learning

The same example as before: seed instances (Jerusalem, Tel Aviv, Haifa), learned patterns ("mayor of arg1", "life in arg1", "arg1 is home of", "traits such as arg1"), and candidate instances both correct (Ness-Ziona, London, Amsterdam) and drifted (denial, anxiety, selfishness).

Page 25: The Road to the Semantic Web

Coupled Training

Couple the training of multiple functions to make unlabeled data more informative

Makes the learning task easier by adding constraints

Page 26: The Road to the Semantic Web

Coupling (1): Output Constraints

We wish to train a function f: X → Y, e.g. Y = {city, not city}.
Assume we have two different functions f1 and f2 that assign the label city, but receive different input.
Coupling constraint: f1 and f2 must agree over the unlabeled data.

Page 27: The Road to the Semantic Web

Coupling (1): Output Constraints

Example: arg1 in "Nir Barkat is the mayor of Jerusalem".
X1 = arg1 → Y = city?
X2 = arg1 → Y = country? (must disagree with city)
X2 = arg1 → Y = city? (must agree with the first classifier)

Page 28: The Road to the Semantic Web

Coupling (2): Compositional Constraints

Assume we have a relation function r(x1, x2) in addition to category functions over x1 and x2.
Assume we have a constraint on valid (x1, x2) pairs given r.
Coupling constraint: promoted relation instances must satisfy the constraint on (x1, x2), e.g. r "type checks" its first argument against the required category.

Page 29: The Road to the Semantic Web

Coupling (2): Compositional Constraints

Example: "Nir Barkat is the mayor of Jerusalem" yields the candidate MayorOf(X1, X2). Type checking asks of each argument: city? location? politician? X1 must check as a politician and X2 as a city (and hence a location).

Page 30: The Road to the Semantic Web

Coupling (3): Multi-view Agreement

We have a function f: X → Y.
X can be partitioned into two "views" X1 and X2, and we assume each of X1 and X2 alone can predict Y.
We wish to learn both f1: X1 → Y and f2: X2 → Y.
Coupling constraint: f1 and f2 must agree.

Page 31: The Road to the Semantic Web

Coupling (3): Multi-view Agreement

Let Y be a set of possible web page categories, and let X be a set of web pages.
Assume X1 represents the words in a page, and X2 represents the words in hyperlinks pointing to the page.
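A toy sketch of this multi-view coupling follows. The word lists and pages are invented, and real systems learn the two view-specific classifiers rather than hard-coding them; the point is only the agreement rule.

```python
# Two independent views of a web page: the words on the page (X1) and
# the words in hyperlinks pointing at it (X2). A category is promoted
# only when the classifiers for both views agree.

ACADEMIC_PAGE_WORDS = {"course", "syllabus", "lecture"}
ACADEMIC_ANCHOR_WORDS = {"professor", "course", "homepage"}

def label_from_page_words(page_words):
    return "academic" if page_words & ACADEMIC_PAGE_WORDS else "other"

def label_from_anchor_words(anchor_words):
    return "academic" if anchor_words & ACADEMIC_ANCHOR_WORDS else "other"

def coupled_label(page_words, anchor_words):
    """Promote a category only when the two views agree."""
    y1 = label_from_page_words(page_words)
    y2 = label_from_anchor_words(anchor_words)
    return y1 if y1 == y2 else None

print(coupled_label({"course", "exam"}, {"professor"}))  # 'academic'
print(coupled_label({"course"}, {"shop", "cart"}))       # None
```

Because each view makes independent mistakes, requiring agreement lets the unlabeled data itself act as a check on both classifiers.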

Page 32: The Road to the Semantic Web

NELL – Never-Ending Language Learning

Coupled Semi-Supervised Learning for Information Extraction. A. Carlson, J. Betteridge, R.C. Wang, E.R. Hruschka Jr. and T.M. Mitchell. In Proceedings of the ACM International Conference on Web Search and Data Mining (WSDM), 2010.
Never-Ending Language Learning. Tom Mitchell's invited talk in the Univ. of Washington CSE Distinguished Lecture Series, October 21, 2010.

Page 33: The Road to the Semantic Web

Motivation

Humans learn many things, for years, and become better learners over time.
Why not machines?

Page 34: The Road to the Semantic Web

Coupled Constraints (1)

Mutual exclusion: two mutually exclusive predicates can't both be satisfied by the same input x.
Relation argument type checking: ensure the noun phrases that satisfy each relation correspond to the categories defined for that relation, e.g. the CompanyIsInEconomicSector relation has arguments of the Company and EconomicSector categories.
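These two constraints can be sketched as simple filters over candidate labels. The small ontology fragment below is invented, standing in for NELL's real one:

```python
# Toy ontology fragment: pairs of mutually exclusive categories, and
# the argument categories required by each relation.
MUTEX = {("city", "country"), ("city", "emotion")}
RELATION_TYPES = {"CompanyIsInEconomicSector": ("company", "economicSector")}

def violates_mutex(labels):
    """True if one noun phrase carries two mutually exclusive labels."""
    return any((a, b) in MUTEX or (b, a) in MUTEX
               for a in labels for b in labels if a != b)

def type_checks(relation, arg1_labels, arg2_labels):
    """True if a relation instance's arguments have the required categories."""
    t1, t2 = RELATION_TYPES[relation]
    return t1 in arg1_labels and t2 in arg2_labels

print(violates_mutex({"city", "country"}))  # True: reject this candidate
print(type_checks("CompanyIsInEconomicSector",
                  {"company"}, {"economicSector"}))  # True: accept
```

In the real system these checks run over candidate extractions on every iteration, pruning exactly the kind of drift shown in the bootstrapping example.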

Page 35: The Road to the Semantic Web

Coupled Constraints (2)

Unstructured and semi-structured text features: noun phrases appear on the web in free-text contexts or semi-structured contexts.
The unstructured- and semi-structured-text classifiers will make independent mistakes, but each is sufficient for classification.
Both classifiers must agree.

Page 36: The Road to the Semantic Web

Coupled Pattern Learner (CPL): Overview

Learns to extract category and pattern instances.
Learns high-precision textual patterns, e.g. "arg1 scored a goal for arg2".

Page 37: The Road to the Semantic Web

Coupled Pattern Learner (CPL): Extracting

Runs forever; on each iteration, bootstraps the patterns promoted in the last iteration to extract instances, then selects the 1000 instances that co-occur with the most patterns.
A similar procedure is used for patterns, but using recently promoted instances.
Uses PoS heuristics to accomplish extraction, e.g. a per-category proper/common noun specification; a pattern is a sequence of verbs followed by adjectives, prepositions, or determiners (and optionally preceded by nouns).


Page 38: The Road to the Semantic Web

Coupled Pattern Learner (CPL): Filtering and Ranking

Candidates are filtered to enforce mutual exclusion and type constraints.
A candidate is rejected unless it co-occurs with a promoted pattern at least three times more than it co-occurs with mutually exclusive predicates.
Candidates are ranked as follows:
Instances: by the number of promoted patterns they co-occur with.
Patterns: by a precision estimate.

Page 39: The Road to the Semantic Web

Coupled Pattern Learner (CPL): Promoting Candidates

For each predicate, promotes at most 100 instances and 5 patterns, the highest rated.
Instances and patterns are promoted only if they co-occur with at least two promoted patterns or instances, respectively.
Relation instances are promoted only if their arguments are candidates for the specified categories.

Page 40: The Road to the Semantic Web

Coupled SEAL (1)

SEAL is an established wrapper-induction algorithm.
It creates page-specific extractors.
It is independent of language.
Category wrappers are defined by prefix and postfix; relation wrappers are defined by infix.
Wrappers for each predicate are learned independently.

Page 41: The Road to the Semantic Web

Coupled SEAL (2)

Coupled SEAL adds mutual exclusion and type-checking constraints to SEAL.
It bootstraps recently promoted wrappers.
It filters candidates that are mutually exclusive or not of the right type for the relation.
It uses a single page per domain for ranking.
It promotes the top 100 instances extracted by at least two wrappers.

Page 42: The Road to the Semantic Web

Meta-Bootstrap Learner

Couples the training of multiple extraction techniques.
Intuition: different extractors will make independent errors.
Replaces the PROMOTE step of the subordinate extractor algorithms.
Promotes any instance recommended by all the extractors, as long as mutual exclusion and type checks hold.
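A minimal sketch of that replaced PROMOTE step follows; the extractor outputs and the constraint check are invented stand-ins for CPL, Coupled SEAL, and the ontology checks:

```python
# Outputs of two subordinate extractors (toy data).
cpl_recommends = {"Jerusalem", "London", "denial"}
cseal_recommends = {"Jerusalem", "London", "Amsterdam"}

def passes_constraints(instance):
    # Stand-in for the mutual-exclusion and type checks.
    return instance != "denial"

def meta_promote(*extractor_outputs):
    """Promote instances recommended by all extractors that pass checks."""
    agreed = set.intersection(*extractor_outputs)
    return {i for i in agreed if passes_constraints(i)}

print(sorted(meta_promote(cpl_recommends, cseal_recommends)))
# ['Jerusalem', 'London']
```

Requiring unanimity exploits the independence of the extractors' errors: an instance that survives every extractor and every constraint is far more likely to be correct than one promoted by a single extractor alone.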

Page 43: The Road to the Semantic Web

Page 44: The Road to the Semantic Web

Page 45: The Road to the Semantic Web

Learning New Constraints

Data-mine the KB to infer new beliefs.
Generates probabilistic, first-order Horn clauses.
Connects previously uncoupled predicates.
Rules are filtered manually.

Page 46: The Road to the Semantic Web

Page 47: The Road to the Semantic Web

Demo Time http://rtw.ml.cmu.edu/rtw/kbbrowser/

Page 48: The Road to the Semantic Web

Summary

Populating the semantic web by using NELL for macro reading.

Page 49: The Road to the Semantic Web

Populating the Semantic Web

There are many ways to accomplish it.
Use an initial ontology to focus and constrain the learning task.
Couple the learning of many, many extractors.
Macro reading: instead of annotating a single page each time, read many pages simultaneously.
A never-ending task.

Page 50: The Road to the Semantic Web

Macro-Reading

Helps to improve accuracy.
Still doesn't help to annotate a single page, but many things that are true for a single page are also true for many pages.
Helps to populate databases with frequently mentioned knowledge.

Page 51: The Road to the Semantic Web

Future Directions

Coupling with external sources: DBpedia, Freebase
Ontology extension: new relations through reading, subcategories
Using a macro-reader to train a micro-reader
Self-reflection, self-correction
Distinguishing tokens from entities
Active learning: crowd-sourcing