Hierarchical Segmentation: Finding Changes in a Text Signal
Malcolm Slaney and Dulce Ponceleon
IBM Almaden Research Center
Problem Statement
Problem: How do we browse video?
Goal: Create a table of contents
Solution: Look for topic changes in text
TOC Example
Chapter 1
Chapter 2
Overview of This Talk
Goal and approach
Latent semantic indexing (LSI)
Scale space
Combination
Results
LSI -> Scale-Space Filter -> Segment
Approach
Sentences -> Semantic Space
Filter at multiple scales
Look for large jumps
Three subjects (loops) shown
Loop 1: Polychromaticity Artifacts
Loop 2: Emission Tomography
Loop 3: Ultrasound Tomography
Courtesy of Jianbo Shi (CMU)
Building on Previous Work
LSI and clustering
Text tiling
Change point analysis
Segmentation
Scale space
Latent Semantic Indexing
Collect histogram of word frequencies
Use SVD to capture frequent combinations
Orthogonal decomposition
Represent in low-dimensional space
[Figure: Words x Docs matrix reduced to 10D x Docs]
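The SVD step on this slide can be illustrated with a minimal sketch. It assumes NumPy; the function name `lsi_project`, the parameter `k`, and the toy count matrix are illustrative and not from the talk, which uses a 10-dimensional space.

```python
import numpy as np

def lsi_project(counts, k):
    """Project documents (columns of a words-by-docs count matrix) into k dimensions."""
    # Thin SVD: counts = U @ diag(s) @ Vt, with orthonormal columns in U.
    U, s, Vt = np.linalg.svd(counts, full_matrices=False)
    # Keep the k strongest directions; each column of the result is one document.
    return s[:k, None] * Vt[:k, :]

# Toy words-by-docs matrix (rows: words, columns: documents); values are made up.
counts = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])
docs_2d = lsi_project(counts, k=2)
```

In this toy matrix the first two documents share one vocabulary and the last two share another, so the two groups land on orthogonal axes of the reduced space.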
LSI Within a Document
Split into chunks: fixed size or sentences
Compute histograms
Perform SVD
Look at results
Sources: “Principles of Computerized Tomographic Imaging”, PBS News Hour
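A rough sketch of the chunking and histogram steps, using only the Python standard library. The splitting rule, the tokenizer, and the helper names (`split_sentences`, `word_histogram`, `fixed_size_chunks`) are assumptions for illustration; the talk does not say how sentences or words were extracted.

```python
import re
from collections import Counter

def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]

def word_histogram(chunk):
    """Count lowercase alphabetic word tokens in one chunk of text."""
    return Counter(re.findall(r"[a-z]+", chunk.lower()))

def fixed_size_chunks(words, size):
    """Alternative chunking: consecutive groups of `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]

# Made-up sample text.
text = "X-rays pass through tissue. Detectors measure the X-rays! Is ultrasound different?"
sentences = split_sentences(text)
histograms = [word_histogram(s) for s in sentences]
```

Stacking these histograms as columns gives the words-by-chunks matrix that the SVD step operates on.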
LSI – 2D Projection
Chapter 4 of Principles of Computerized Tomographic Imaging
LSI – Self-similarity
Measure similarity: cosine of angle between “documents”
Plot all pairs of chunks/sentences
Look for block diagonal
Chapter 4 of Principles of Computerized Tomographic Imaging
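The self-similarity measure can be sketched as a matrix of pairwise cosines. This assumes NumPy; `self_similarity` and the toy vectors are illustrative names and values, not taken from the talk.

```python
import numpy as np

def self_similarity(vectors):
    """Cosine similarity between every pair of rows in `vectors`."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.where(norms == 0, 1.0, norms)  # all-zero rows stay zero
    return unit @ unit.T

# Made-up 2-D "semantic" vectors: two chunks on one topic, then two on another.
chunks = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]])
S = self_similarity(chunks)
```

Plotted as an image, `S` for this toy input shows two bright 2x2 blocks along the diagonal, which is the block-diagonal pattern the slide refers to.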
Scale-space Filtering
What size are the features?
Look at different scales!
Continuous scale
Used for object recognition and feature detection
Scale-space Movie
Green line marks best high-level segmentation
10D semantic space
Scale varies from 1 to 400 sentences
Scale-space Segmentation
Low-pass filter signal
Form image of scale vs. time
Look for changes
Track peaks of vector derivative across scale
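A minimal sketch of the filtering and derivative steps, assuming NumPy and a Gaussian low-pass kernel. The Gaussian is the usual scale-space choice but is an assumption here, as are the names `gaussian_smooth` and `scale_space_derivative` and the edge-padding strategy.

```python
import numpy as np

def gaussian_smooth(signal, sigma):
    """Low-pass filter each column of a (time, dims) signal with a Gaussian of width sigma."""
    radius = max(1, int(3 * sigma))
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (t / sigma) ** 2)
    kernel /= kernel.sum()
    # Repeat the edge samples so the filtered signal keeps its length.
    padded = np.pad(signal, ((radius, radius), (0, 0)), mode="edge")
    return np.stack(
        [np.convolve(padded[:, d], kernel, mode="valid") for d in range(signal.shape[1])],
        axis=1,
    )

def scale_space_derivative(signal, sigmas):
    """Rows: scales; columns: time steps; values: size of the vector change between neighbours."""
    rows = []
    for sigma in sigmas:
        smooth = gaussian_smooth(signal, sigma)
        rows.append(np.linalg.norm(np.diff(smooth, axis=0), axis=1))
    return np.array(rows)

# Toy 2-D signal with one topic change between samples 19 and 20.
signal = np.vstack([np.tile([1.0, 0.0], (20, 1)), np.tile([0.0, 1.0], (20, 1))])
image = scale_space_derivative(signal, sigmas=[1.0, 2.0, 4.0])
```

Each row of `image` peaks at the topic change, and the peak gets lower and wider as the scale grows.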
Scale-space Example
Derivative as function of scale and sentence
LSI and Scale Space
Putting it all together
Split document/transcript
Perform LSI analysis
Look at change in angle
Perform scale-space segmentation
Show tree
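The "change in angle" step can be illustrated with a small standard-library sketch. The helper names `angle_between` and `angle_changes` and the toy path are hypothetical; the talk applies this idea to the 10-dimensional LSI vectors after scale-space filtering.

```python
import math

def angle_between(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / norm)))

def angle_changes(vectors):
    """Angle between each vector and the one that follows it."""
    return [angle_between(a, b) for a, b in zip(vectors, vectors[1:])]

# Made-up path through a 2-D semantic space with one sharp turn.
path = [(1.0, 0.0), (1.0, 0.1), (0.0, 1.0), (0.1, 1.0)]
changes = angle_changes(path)
boundary = max(range(len(changes)), key=changes.__getitem__)
```

Here `boundary` is the index of the largest jump, which sits between the second and third vectors.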
Scale-Space Image
Peaks in scale-space derivative
Peaks traced to their origin
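One simple way to trace a coarse-scale peak back to its fine-scale origin is nearest-neighbour linking between adjacent scales. The sketch below is an illustrative stand-in, not necessarily the tracking rule used in the talk; `local_peaks`, `trace_to_finest`, and the toy image are hypothetical.

```python
def local_peaks(row):
    """Indices whose value is strictly greater than both neighbours."""
    return [i for i in range(1, len(row) - 1) if row[i - 1] < row[i] > row[i + 1]]

def trace_to_finest(image, start_scale, start_index):
    """Follow a peak from a coarse scale down to scale 0 by hopping to the nearest peak."""
    index = start_index
    for scale in range(start_scale - 1, -1, -1):
        candidates = local_peaks(image[scale])
        if not candidates:
            break  # nothing to link to; keep the last known position
        index = min(candidates, key=lambda c: abs(c - index))
    return index

# Made-up derivative image: row 0 is the finest scale, row 2 the coarsest.
image = [
    [0, 1, 0, 0, 3, 0, 0, 2, 0],
    [0, 0, 1, 2, 1, 0, 1, 0, 0],
    [0, 0, 0, 1, 2, 3, 2, 1, 0],
]
origin = trace_to_finest(image, start_scale=2, start_index=5)
```

Peaks that survive to coarse scales mark high-level boundaries, and the traced fine-scale position gives a more precise location for the boundary.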
Results – CT Comparison
Scale-Space vs. Book Headings
Results – News Comparison
Scale-Space vs. Ground Truth
Results – Autocorrelation
Block sentences
Measure correlation
Positive peak
Anti-correlation
Discussion Issues
Evaluation (and ground truth): Lafferty’s measure
Temporal properties: Histogram/SVD chunking size, Autocorrelation
Computational Effort
Histogram: O(N)
SVD: O(N^3)
Scale space: O(N^2)
N < 1000: Number of sentences in a video or document is not large
LSI Document Lookup
Histogram documents
Entropy term weighting
Compute SVD
Use first 10-100 vectors to model space
Encode query as histogram
Look for documents in similar direction
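The lookup steps can be sketched as follows, assuming NumPy and the common log-entropy weighting scheme. The talk says only "entropy term weighting", so the exact formula is an assumption, as are the names `log_entropy_weights`, `build_index`, `lookup`, and the toy data; the talk keeps the first 10-100 vectors rather than the 2 used here.

```python
import numpy as np

def log_entropy_weights(counts):
    """Global entropy weight per word (row) of a words-by-docs count matrix."""
    n_docs = counts.shape[1]
    totals = counts.sum(axis=1, keepdims=True)
    p = counts / np.where(totals == 0, 1.0, totals)
    # Treat 0 * log(0) as 0.
    plogp = np.where(p > 0, p * np.log(np.where(p > 0, p, 1.0)), 0.0)
    return 1.0 + plogp.sum(axis=1) / np.log(n_docs)

def build_index(counts, k):
    """Weight the histograms, take the SVD, and keep the first k directions."""
    weights = log_entropy_weights(counts)
    weighted = weights[:, None] * np.log1p(counts)
    U, s, Vt = np.linalg.svd(weighted, full_matrices=False)
    doc_vectors = (s[:k, None] * Vt[:k, :]).T  # one row per document
    return weights, U[:, :k], doc_vectors

def lookup(query_counts, weights, U_k, doc_vectors):
    """Return document indices ordered from most to least similar in direction."""
    q = U_k.T @ (weights * np.log1p(query_counts))
    denom = np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q) + 1e-12
    return np.argsort(-(doc_vectors @ q) / denom)

# Made-up words-by-docs counts: docs 0-1 share words 0-1, docs 2-3 share words 2-3.
counts = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
])
weights, U_k, doc_vectors = build_index(counts, k=2)
ranking = lookup(np.array([0.0, 0.0, 1.0, 0.0]), weights, U_k, doc_vectors)
```

A query containing only word 2 ranks documents 2 and 3 first, because they point in the same direction in the reduced space.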
LSI Example
Collection of book titles
Differential equations vs. algorithms and applications