![Page 1: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/1.jpg)
STS: Temporal and Spatial Constraints on Text Similarity
James Pustejovsky
Brandeis University
March 13, 2012
![Page 2: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/2.jpg)
Measuring Similarity
• Objects
• Events
![Page 3: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/3.jpg)
Object similarity is a function of:
1. Sortal correlation
2. Temporal proximity
3. Spatial proximity
• The Latin Quarter of the 1920s
– The 5th Arrondissement in 1929
– Paris in 1925
– The Left Bank in the early 20th Century
![Page 4: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/4.jpg)
Event similarity is a function of:
1. Predicative similarity
2. Participant correlation
3. Temporal proximity
4. Spatial proximity
cf. Kim (1993), Davidson (1980), Lewis (1986)
![Page 5: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/5.jpg)
Brandeis CS114-2012 Pustejovsky
Event Similarity
a. Mary visited John in Boston on Tuesday.
b. The woman/she saw her husband in Copley Square yesterday.
• Sim(P1,P2): visit vs. see
• Sim(Subj1,Subj2): Mary vs. the woman
• Sim(Obj1,Obj2): John vs. her husband
• Sim(Loc1,Loc2): Boston vs. Copley Square
• Sim(Time1,Time2): Tuesday vs. yesterday
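The five component comparisons above could be combined into one event-level score. The following is only a minimal sketch under stated assumptions: the slides do not say how the components are combined, so the weighted average, the `ComponentScores` class, and the uniform weights are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ComponentScores:
    """Per-component similarities in [0, 1] for a pair of event descriptions."""
    predicate: float   # Sim(P1, P2), e.g. visit vs. see
    subject: float     # Sim(Subj1, Subj2), e.g. Mary vs. the woman
    obj: float         # Sim(Obj1, Obj2), e.g. John vs. her husband
    location: float    # Sim(Loc1, Loc2), e.g. Boston vs. Copley Square
    time: float        # Sim(Time1, Time2), e.g. Tuesday vs. yesterday


# Hypothetical uniform weights; a real system would tune or learn these.
DEFAULT_WEIGHTS = {
    "predicate": 1.0,
    "subject": 1.0,
    "obj": 1.0,
    "location": 1.0,
    "time": 1.0,
}


def event_similarity(scores: ComponentScores, weights=DEFAULT_WEIGHTS) -> float:
    """Weighted average of the component similarities (an assumed combination rule)."""
    total_weight = sum(weights.values())
    weighted_sum = sum(w * getattr(scores, name) for name, w in weights.items())
    return weighted_sum / total_weight


# Illustrative, made-up component scores for sentences (a) and (b).
example = ComponentScores(predicate=0.5, subject=1.0, obj=1.0, location=1.0, time=1.0)
print(event_similarity(example))
```

Raising the weight on `time` or `location` is one way to let temporal and spatial constraints dominate the overall judgment.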
![Page 6: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/6.jpg)
Predicative Similarity
• Lexical resources
• LSA
• Vector-based models
![Page 7: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/7.jpg)
Argument Alignment
• Semantic Role Labeling +
• Sortal Similarity
![Page 8: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/8.jpg)
Temporal Similarity
• Normalization
– Map to standardized ISO-TimeML format
• Referencing
– Reference relative to local temporal values
Val(Tuesday) = Val(yesterday)
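The equation Val(Tuesday) = Val(yesterday) only holds relative to an anchor date. As a rough illustration, the sketch below resolves a handful of relative expressions against an assumed anchor. The `resolve` function, the rule that a bare weekday name means the most recent such day, and the anchor date of Wednesday, March 14, 2012 are all assumptions for the example, not part of ISO-TimeML.

```python
from datetime import date, timedelta

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]


def resolve(expression: str, anchor: date) -> date:
    """Resolve a small set of relative time expressions against an anchor date."""
    expr = expression.lower()
    if expr == "today":
        return anchor
    if expr == "yesterday":
        return anchor - timedelta(days=1)
    if expr == "tomorrow":
        return anchor + timedelta(days=1)
    if expr in WEEKDAYS:
        # Assumption: a bare weekday refers to the most recent such day
        # on or before the anchor (past-tense narrative reading).
        days_back = (anchor.weekday() - WEEKDAYS.index(expr)) % 7
        return anchor - timedelta(days=days_back)
    raise ValueError(f"unsupported expression: {expression!r}")


anchor = date(2012, 3, 14)  # hypothetical anchor date, a Wednesday
tuesday = resolve("Tuesday", anchor)
yesterday = resolve("yesterday", anchor)
print(tuesday.isoformat(), yesterday.isoformat(), tuesday == yesterday)
```

With a different anchor the two expressions resolve to different dates, which is why referencing has to happen before the temporal values are compared.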
![Page 9: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/9.jpg)
Spatial Similarity
• Normalization
– Map to standardized ISO-Space format
• Referencing
– Reference relative to accessible spatial values
Val(Copley_Sq) Spatial-IN Val(Boston)
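A containment relation such as Val(Copley_Sq) Spatial-IN Val(Boston) can be checked once both expressions are mapped to entries in a place hierarchy. The sketch below assumes a tiny hand-built parent table (`PARENT`) in place of a real gazetteer; the table and the `spatially_in` helper are hypothetical and are not ISO-Space constructs.

```python
# Hypothetical, hand-built containment table: each place maps to the
# place that immediately contains it.
PARENT = {
    "Copley Square": "Back Bay",
    "Back Bay": "Boston",
    "Boston": "Massachusetts",
}


def spatially_in(inner: str, outer: str, parent=PARENT) -> bool:
    """True if `inner` is (transitively) contained in `outer` per the table."""
    current = parent.get(inner)
    while current is not None:
        if current == outer:
            return True
        current = parent.get(current)
    return False


print(spatially_in("Copley Square", "Boston"))
print(spatially_in("Boston", "Copley Square"))
```

Containment is directional, so a similarity measure built on it has to decide how to score the asymmetric case where one description is simply more specific than the other.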
![Page 10: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/10.jpg)
Temporal Issues
• Subsumption in anchoring
– The bombing occurred Monday morning.
– The bombing occurred Monday.
– The bombing occurred last week.
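Subsumption can be pictured as interval containment: each of the three anchors above covers the one before it. The following sketch assumes half-open intervals, a "morning" running from 06:00 to 12:00, and made-up calendar dates (the week starting Monday, March 5, 2012); none of these values come from the slides.

```python
from datetime import datetime
from typing import NamedTuple


class Interval(NamedTuple):
    """A half-open time interval [start, end)."""
    start: datetime
    end: datetime

    def contains(self, other: "Interval") -> bool:
        """True if `other` lies entirely within this interval."""
        return self.start <= other.start and other.end <= self.end


# Hypothetical normalized values for the three anchors.
monday_morning = Interval(datetime(2012, 3, 5, 6), datetime(2012, 3, 5, 12))
monday = Interval(datetime(2012, 3, 5), datetime(2012, 3, 6))
last_week = Interval(datetime(2012, 3, 5), datetime(2012, 3, 12))

print(monday.contains(monday_morning))     # Monday subsumes Monday morning
print(last_week.contains(monday))          # last week subsumes Monday
print(monday_morning.contains(last_week))  # but not the other way round
```

Under this view the three sentences are compatible descriptions of one event at different granularities, not three conflicting timestamps.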
![Page 11: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/11.jpg)
Motivation for time and event markup
• Natural language is filled with references to past and future events, as well as planned activities and goals;
• Without a robust ability to identify and temporally situate events of interest from language, the real importance of the information can be missed;
• A robust annotation standard can help leverage this information from natural language text.
![Page 12: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/12.jpg)
Temporal Awareness in Real Text
• The bridge collapsed during the storm but after traffic was rerouted to the Bay Bridge.
• President Roosevelt died in April 1945 before
– the war ended. (event happened)
– he dropped the bomb. (event didn’t happen)
• The CEO plans to retire next month.
• Last week Bill was running the marathon when he twisted his ankle. Someone had tripped him. He fell and didn't finish the race.
![Page 13: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/13.jpg)
Current Time Analysis Technology
• Document Time Linking
– Find the document creation time and link that to all events in the text;
• Local Time Stamping
– Find an event and a “local temporal expression”, and link it to that time;
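The two strategies can be read as a simple fallback rule: use a local temporal expression when one is attached to the event, otherwise fall back to the document creation time. This is a sketch of that reading only. The `Event` class, the `anchor_event` function, and the hand-normalized values are hypothetical, and the rule is deliberately naive.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Event:
    text: str
    # Normalized value of a nearby ("local") temporal expression, if any.
    local_timex: Optional[str] = None


def anchor_event(event: Event, dct: str) -> Tuple[str, str]:
    """Return (strategy, anchor value) for one event.

    Local time stamping is preferred; document time linking is the fallback.
    """
    if event.local_timex is not None:
        return ("local", event.local_timex)
    return ("document", dct)


dct = "2010-04-25"  # document creation time of a hypothetical news story
events = [
    Event("paid tribute", local_timex="2010-04-25"),  # "Sunday", normalized by hand
    Event("blamed"),                                  # no local time expression
]
for ev in events:
    print(ev.text, anchor_event(ev, dct))
```

Events that fall back to the document time are the cases where this kind of stamping is least reliable.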
![Page 14: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/14.jpg)
Document Time Stamping
April 25, 2010
• President Obama paid tribute Sunday to 29 workers killed in an explosion at a West Virginia coal mine earlier this month, saying they died "in pursuit of the American dream." The blast at the Upper Big Branch Mine was the worst U.S. mine disaster in nearly 40 years. Obama ordered a review earlier this month and blamed mine officials for lax regulation.
![Page 16: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/16.jpg)
Identify which Events Should be Ordered
• The annotation specification should specify a kernel of events and time expressions to be annotated.
• Anchoring relations between events and times depend on genre, style, and register.
• Ordering relations between events depend largely on discourse relations in the text.
![Page 17: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/17.jpg)
Creation vs. Narrative Time
• Document Creation Time
– when the utterance is made (speech time)
• Narrative Time
– when the event occurs
![Page 18: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/18.jpg)
Genre, Style, and Register
• Participants
• Relations among participants
• Channel
• Production Circumstances
• Setting
• Communicative Purpose
• Topic
![Page 19: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/19.jpg)
Genre, Register, and Style
• Help distinguish text types in order to better characterize the information structure of the text
• Example: news wire vs. news article
– narrative time (NT) is a function of publication/creation frequency.
![Page 20: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/20.jpg)
Narrative Time
• Identifies the temporal interval of the events being described in the text.
– Document Narrative Time: set by text-genre
– Current Narrative Time: shifts through the text
![Page 21: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/21.jpg)
Document Time Stamping: for real
April 25, 2010
• President Obama paid tribute Sunday to 29 workers killed in an explosion at a West Virginia coal mine earlier this month, saying they died "in pursuit of the American dream." The blast at the Upper Big Branch Mine was the worst U.S. mine disaster in nearly 40 years. Obama ordered a review earlier this month and blamed mine officials for lax regulation.
![Page 22: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/22.jpg)
Narrative Container
April 25, 2010
• President Obama paid tribute Sunday to 29 workers killed in an explosion at a West Virginia coal mine earlier this month, saying they died "in pursuit of the American dream." The blast at the Upper Big Branch Mine was the worst U.S. mine disaster in nearly 40 years. Obama ordered a review earlier this month and blamed mine officials for lax regulation.
![Page 23: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/23.jpg)
Time Stamping: the good, bad, …
• ✓ ☺ Set up a meeting on Tuesday with EMC.
• ✓ ☺ Franklin arrives tomorrow from London.
• ✗ ☹ Franklin arrives on the afternoon flight from London tomorrow.
• ✗ ☹☹ Most people drive today while talking on the phone.
![Page 24: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/24.jpg)
ISO-TimeML Enables Temporal Parsing
• A new generation of language analysis tools that are able to temporally organize events in terms of their ordering and time of occurrence
• These tools can be integrated with visualization, summarization, question answering, and link analysis systems to help analyze large event-rich information spaces.
![Page 25: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/25.jpg)
ISO-TimeML Provides elements to:
• Find all events and times in newswire text
• Link events to the document time and to local times
• Order events relative to other events
• Ensure consistency of the temporal relations
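One simple consistency check on ordering relations is to look for cycles among BEFORE links: if A is before B and B is before A (directly or through a chain), the annotation cannot be satisfied. The sketch below shows only that check, using a depth-first search over plain (earlier, later) pairs. The function name and the event labels are hypothetical, and full temporal closure over all ISO-TimeML relation types is more involved than this.

```python
from collections import defaultdict


def has_before_cycle(before_pairs) -> bool:
    """Return True if the (earlier, later) pairs contain a cycle."""
    graph = defaultdict(set)
    for earlier, later in before_pairs:
        graph[earlier].add(later)

    state = {}  # node -> "visiting" while on the DFS stack, "done" afterwards

    def visit(node) -> bool:
        if state.get(node) == "visiting":
            return True   # reached a node already on the current path
        if state.get(node) == "done":
            return False  # already fully explored, no cycle through it
        state[node] = "visiting"
        found = any(visit(nxt) for nxt in graph.get(node, ()))
        state[node] = "done"
        return found

    return any(visit(node) for node in list(graph))


consistent = [("explosion", "review"), ("review", "tribute"), ("explosion", "tribute")]
inconsistent = consistent + [("tribute", "explosion")]
print(has_before_cycle(consistent))
print(has_before_cycle(inconsistent))
```

A cycle-free graph can be topologically sorted to yield at least one timeline compatible with the annotated ordering.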
![Page 26: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/26.jpg)
ISO-Space
• Capture the complex constructions of spatial language in text
• Provide an inventory of how spatial information is presented in natural language
• ISO-Space is not designed to provide a formalism that fully represents the complexity of spatial language
![Page 27: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/27.jpg)
Applications of ISO-Space
• Building a spatial map of objects relative to one another.
• Reconstructing spatial information associated with a sequence of events.
• Determining object location given a verbal description.
• Translating viewer-centric verbal descriptions into other relative descriptions or absolute coordinate descriptions.
• Constructing a route given a route description.
• Constructing a spatial model of an interior or exterior space given a verbal description.
• Integrating spatial descriptions with information from other media.
![Page 28: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/28.jpg)
Semantic Requirements for Annotation
• Fundamental distinction between the concepts of annotation and representation
– Based on ISO CD 24612 Language resource management - Linguistic Annotation Framework (Ide and Romary, 2004)
• Distinguish between abstract syntax and concrete syntax
– Concrete Syntax: XML encoding
– Abstract Syntax: Conceptual inventory and a set of syntactic rules defining the combination of these elements
![Page 29: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/29.jpg)
Spatial Expressions
• Constructions that make explicit reference to the spatial attributes of an object or spatial relations between objects
• Four grammatically defined classes:
– Spatial Prepositions and Particles: on, in, under, over, up, down, left of
– Verbs of Position and Movement: lean over, sit, run, swim, arrive
– Spatial Attributes: tall, long, wide, deep
– Spatial Nominals: area, room, center, corner, front, hallway
![Page 30: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/30.jpg)
Spatial Relations
• Topological:
– In, inside, touching, outside
• Orientational (with frame of reference):
– Behind, left of, in front of
• Topo-metric:
– Near, close by
• Topological-orientational:
– On, over, below
• Metric:
– 20 miles away
![Page 31: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/31.jpg)
Frames of Reference (Levinson, 2003)
• Absolute
– The lake is north of the city.
• Relative
– The book is to your left.
– The tree is between the Pru and the Monitor.
• Intrinsic
– There’s a ball in front of the car.
– The tree is behind the bench.
![Page 32: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/32.jpg)
Frames of reference
• The tree to the left of the entrance
• The steps in front of me/the entrance
![Page 33: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/33.jpg)
ISO-Space 1.4
• Spatial Relations are split into 4 types:
– Topological (QSLink)
– Relational (OrientLink)
– Movement (MoveLink)
– Measurement (MLINK, from TimeML)
• Spatial Relations are identified with role labels, including Figure and Ground
• SPATIAL_NAMED-ENTITY
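To make the four link types and the Figure/Ground roles concrete, here is a minimal in-memory sketch. It is not the ISO-Space XML schema: the `LinkType` enum, the `SpatialLink` class, and the `relation` field are hypothetical names chosen for illustration, and only the four link labels and the two role names come from the slide.

```python
from dataclasses import dataclass
from enum import Enum


class LinkType(Enum):
    """The four relation types listed on the slide."""
    QSLINK = "topological"
    ORIENTLINK = "relational"
    MOVELINK = "movement"
    MLINK = "measurement"


@dataclass(frozen=True)
class SpatialLink:
    link_type: LinkType
    figure: str    # the entity being located
    ground: str    # the reference entity it is located against
    relation: str  # hypothetical free-text relation value, e.g. "IN"


# "Copley Square is in Boston" expressed as a topological link.
link = SpatialLink(LinkType.QSLINK, figure="Copley Square", ground="Boston", relation="IN")
print(link.link_type.name, link.figure, link.relation, link.ground)
```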
![Page 34: STS: Tempora l and Spatial Constraints on Text Similarity](https://reader036.vdocuments.net/reader036/viewer/2022062310/5681666a550346895dd9fff8/html5/thumbnails/34.jpg)
Conclusion: Measuring Semantic Similarity
• Normalizing temporal and spatial expressions
• Developing standardized specifications contributes towards corpora for training and evaluation for such normalization
• Cases in point:
– ISO-TimeML (ISO adopted)
– ISO-Space (in development)