Understanding The Semantics of Media

Chapter 8

Camilo A. Celis

Questions

1. What kinds of applications does SVD have? How is it used in this paper?

2. What does MPESAR stand for? What does this system do?

3. How does MPESAR generally work?

Contents

Understanding the problem

Analysis Tools

Segmenting Video

Semantic Retrieval

Contents

Understanding the problem
- Different Approaches
- Segmentation Literature
- Semantic Retrieval Literature

Analysis Tools

Segmenting Video

Semantic Retrieval


Semantics (n.): the study of meaning; the study of linguistic development by classifying and examining changes in meaning and form.

Rapid growth of media: personal media, social media...
- Low price
- Social pressure

We do not yet understand media.

Different Approaches

Increasing number of methods to retrieve information from media

[Aner and Kender] Finds a background in a video shot, and then clusters shots into physical scenes by noting shots with a common background.

QBIC (IBM) allows users to search for images based on the colors and visual content of an example image. Known as query-by-example.

Where is the semantics of the media?

"The most important information is in the WORDS!"

Segmentation Literature

Extension of others' work. Latent semantic indexing (LSI): summarizes the semantic content of a document and measures similarities.

Visualization and segmentation algorithm based on wavelet analysis of text documents. (time and frequency)

Scale-space ideas applied to the segmentation problem. Multi-dimensional signals.

Semantic Retrieval Literature

Multimedia retrieval systems. (audio and video)

Mixture of probability experts for semantic-audio retrieval (MPESAR) is a sophisticated model connecting words and media.

- Considers the acoustic and semantic similarity of sounds, allowing users to retrieve sounds without searching on an exact word.

"MPESAR algorithm is appropriate for mapping one type of media to another."

Contents


Analysis Tools
- SVD Principles
- Color Space
- Word Space

Segmenting Video

Semantic Retrieval

Analysis Tools

Common tools and mathematics used to analyze multimedia signals.

Two types of transformations, which reduce raw text and video signals into meaningful spaces.

Preprocessing the data

* Mapping from a one dimensional signal (speech) into a multidimensional signal (video).

Analysis Tools: SVD

SVD (Singular Value Decomposition) principles: Factorization of real or complex matrices.

Noise reduction.

Semantic and video data are expressed as vector-valued functions of time.

Collect data from an entire video and put the data into a matrix X. (Columns of X represent the signal at different times)

Using SVD, rewrite the matrix X in terms of three matrices U, S, and V, so that X = U S V^T.
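The SVD step can be sketched in a few lines of NumPy. This is only an illustration: the function name reduce_with_svd, the number of kept dimensions k, and the random test matrix are assumptions, not details taken from the chapter.

```python
import numpy as np

def reduce_with_svd(X, k):
    """Project the columns of X (one column per time step) onto the
    top-k left singular vectors. Returns the reduced signal, the
    basis used, and all singular values."""
    # X = U @ diag(s) @ Vt, with singular values sorted in descending order.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U_k = U[:, :k]                 # hypothetical choice: keep the k strongest directions
    return U_k.T @ X, U_k, s

# Hypothetical data: 512 features observed at 40 time steps.
rng = np.random.default_rng(0)
X = rng.random((512, 40))
Y, U_k, s = reduce_with_svd(X, 10)   # Y has shape (10, 40)
```

Dropping the directions with small singular values is what gives the noise-reduction effect mentioned above; how many directions to keep is a tuning choice.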

Analysis Tools: Color Space

Color changes are useful metrics for finding the boundary between shots.

Collect a histogram of the colors in each frame (512 histogram bins).

Convert the three RGB intensities (0-255) to a single histogram bin by taking the log base 2 of each intensity value.

Pack the three colors into a 9-bit number, using floor() to convert to an integer.
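A minimal sketch of the histogram step, under two assumptions that the slides do not state: intensity 0 is clamped to 1 so that the logarithm is defined, and red occupies the most significant 3 bits of the 9-bit bin number. The function name color_histogram is hypothetical.

```python
import numpy as np

def color_histogram(frame):
    """frame: uint8 array of shape (H, W, 3) holding RGB values.
    Returns a 512-bin histogram of log-quantized colors."""
    # Assumption: clamp 0 up to 1 so log2 is defined (0 and 1 share level 0).
    clamped = np.maximum(frame.astype(np.float64), 1.0)
    # floor(log2(x)) for x in 1..255 lies in 0..7, i.e. 3 bits per channel.
    levels = np.floor(np.log2(clamped)).astype(np.int64)
    r, g, b = levels[..., 0], levels[..., 1], levels[..., 2]
    # Assumption: pack as RRRGGGBB B, red in the high bits (9 bits total).
    bins = 64 * r + 8 * g + b
    return np.bincount(bins.ravel(), minlength=512)
```

Stacking one such histogram per frame as the columns of X gives the matrix that the SVD step operates on.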

Analysis Tools: Word Space

Latent semantic indexing (LSI) uses an SVD in direct analogy to the color analysis.

Analyze the audio data by collecting a histogram of the words in a transcript of the video. There is only one document to study.

Consider sentences of the document, which define a semantic space.

Issues? Synonymy and polysemy.

SVD captures both relationships.
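The word-space analysis can be sketched the same way as the color analysis: one word histogram per sentence, then an SVD. The tokenization (lowercase, whitespace split), the function names, and the example sentences are assumptions made for illustration only.

```python
import numpy as np

def sentence_word_matrix(sentences):
    """Build a (vocabulary x sentences) matrix of word counts."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(vocab), len(sentences)))
    for j, sentence in enumerate(sentences):
        for w in sentence.lower().split():
            X[index[w], j] += 1
    return X, vocab

def lsi_project(X, k):
    """Project each sentence (column of X) into a k-dimensional semantic space."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U[:, :k].T @ X

# Hypothetical transcript sentences.
sentences = [
    "the storm hit the coast",
    "the hurricane hit the shore",
    "stocks fell on wall street",
]
X, vocab = sentence_word_matrix(sentences)
Y = lsi_project(X, 2)   # one 2-dimensional point per sentence
```

In this toy example the two weather sentences share only "the" and "hit", yet they land on the same direction in the reduced space, while the finance sentence is orthogonal to both.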

Contents


Analysis Tools

Segmenting Video
- Temporal Properties
- Video Segmentation Overview
- Scale Space
- Combined Image and Audio Data
- Hierarchical Segmentation Results

Semantic Retrieval

Segmenting Video

Indexing by combining two major sources of data: images and words.

Describe the semantic path of a video's transcript as a signal, from the initial sentence to the conclusion.

Instead of trying to find similarities (segments), treat the audio-visual content as a signal and look for large changes in this signal.

Scale Space

Used to find boundaries in a signal.

Analyze the signal with many different kernels that vary in the size of the temporal neighborhood included in the analysis at each point in time.

Look for changes in the signal over time. (Do so by calculating the derivative of the signal with respect to time.)
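A rough sketch of the scale-space idea, assuming Gaussian kernels of increasing width and a simple first difference as the time derivative. The kernel family, the edge padding, and the function names are illustrative choices, not necessarily those used in the chapter.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalized Gaussian kernel covering about three sigmas on each side."""
    radius = int(np.ceil(3 * sigma))
    t = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (t / sigma) ** 2)
    return k / k.sum()

def scale_space_change(signal, sigmas):
    """signal: array of shape (T, D), a vector-valued function of time.
    Returns an array of shape (len(sigmas), T - 1) holding the magnitude
    of the time derivative of the smoothed signal at each scale."""
    rows = []
    for sigma in sigmas:
        k = gaussian_kernel(sigma)
        pad = len(k) // 2
        # Assumption: repeat the edge samples so the output keeps length T.
        padded = np.pad(signal, ((pad, pad), (0, 0)), mode="edge")
        smooth = np.stack(
            [np.convolve(padded[:, d], k, mode="valid") for d in range(signal.shape[1])],
            axis=1,
        )
        deriv = np.diff(smooth, axis=0)            # first difference in time
        rows.append(np.linalg.norm(deriv, axis=1))  # size of the change
    return np.array(rows)
```

Peaks that persist from small to large kernels mark the strongest boundaries, while peaks visible only at small scales correspond to fine-grained changes.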

Overall

Form a hierarchical segmentation and compare it with other forms of segmentation.

A simple description of a video is possible by unifying the representations.

Combine two well-known techniques to find boundaries in a video. Reduce dimensionality (SVD) to put everything in the same format, and apply this to both color and word data.

Combining color, words and scale space

The result is a 20-dimensional vector function of time and scale.
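One way to picture the combined signal is to bring the reduced word signal onto the time base of the reduced color signal and stack the two. The sketch below assumes 10 dimensions for each source and linear interpolation between sentence times; both are assumptions for illustration, as are the function and parameter names.

```python
import numpy as np

def combine_signals(color, color_times, words, word_times):
    """color: (Tc, Dc) samples at color_times; words: (Tw, Dw) samples at
    word_times (increasing). Returns a (Tc, Dc + Dw) combined signal."""
    # Assumption: linearly interpolate each word dimension at the frame times.
    resampled = np.stack(
        [np.interp(color_times, word_times, words[:, d]) for d in range(words.shape[1])],
        axis=1,
    )
    return np.hstack([color, resampled])
```

With 10 color dimensions and 10 word dimensions this yields a 20-dimensional vector at every time step, which can then be analyzed at multiple scales.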

Scale Space representations:

Scale Space

Results: Autocorrelation

Results: Grouping correlation

Results (cont.)

Representations of the semantic information in the HeadlineNews video in scale space.

The top image shows the cosine of the angular change of the semantic trajectory with different amounts of low-pass filtering.

The middle plot shows the peaks of the scale-space derivative.

The bottom plot shows the peaks traced back to their original starting point. These peaks represent topic boundaries.

Results: Shot Boundary Segmentation

Segmentation in Perspective

New framework for combining multiple types of information from a video into a unified representation, and for segmenting the video from it.

Described hierarchical segmentation.

(Unexpectedly) good amount of information in the color.

This method is also applicable to other types of information (musical key, audio emotion, etc.).

Contents


Analysis Tools

Segmenting Video

Semantic Retrieval
- The Algorithm
- Testing
- Conclusions

Semantic Retrieval: MPESAR

Connecting sounds to words and vice versa. Queries with sounds and words.

Learn about the connections between semantic space and acoustic space.

Algorithm

Semantic Features

Uses the Porter stemmer to remove common suffixes from words, and deletes common words before further processing.

Partition the space into overlapping clusters of regions.

Acoustic Features

Signal processing and machine learning calculations endeavor to capture the sound.

MFCC (mel-frequency cepstral coefficients): used to analyze speech sounds and to reduce the audio signal.

A GMM (Gaussian mixture model) captures the long-term characteristics of each sound.
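The semantic preprocessing (drop common words, strip common suffixes) might look roughly like the sketch below. Note that the suffix stripper here is a crude stand-in and is NOT the Porter algorithm; the stop-word list, the suffix list, and the function names are all hypothetical.

```python
# Hypothetical stop-word list; a real system would use a much longer one.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "on", "is", "with"}

# Crude stand-in for a stemmer, NOT the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")

def strip_suffix(word):
    """Remove the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def semantic_tokens(description):
    """Lowercase, drop punctuation and common words, then strip suffixes."""
    words = [w.strip(".,;:!?").lower() for w in description.split()]
    return [strip_suffix(w) for w in words if w and w not in STOP_WORDS]

print(semantic_tokens("The barking of dogs in the rain"))
# prints ['bark', 'dog', 'rain']
```

The resulting tokens are what would be counted to place a sound description in the semantic space.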

Semantic Retrieval

Acoustic signal processing chain

Building MPESAR models

Testing

Audio to semantic testing procedure.

Retrieval Results

Histogram of true label ranks based on likelihood from audio-to-semantic test.

Histogram of true label ranks based on likelihood from semantic-to-audio test.
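A histogram of true-label ranks can be computed as sketched here, assuming each test item comes with one likelihood score per candidate label. The score matrix layout and the function names are assumptions for illustration.

```python
import numpy as np

def true_label_rank(scores, true_index):
    """Rank of the true label when candidates are sorted by descending
    likelihood. Rank 1 means the true label scored highest."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    return int(np.where(order == true_index)[0][0]) + 1

def rank_histogram(score_matrix, true_indices):
    """score_matrix: (items x labels) likelihoods. Returns counts for
    ranks 1..number_of_labels."""
    score_matrix = np.asarray(score_matrix, dtype=float)
    ranks = [true_label_rank(row, t) for row, t in zip(score_matrix, true_indices)]
    return np.bincount(ranks, minlength=score_matrix.shape[1] + 1)[1:]
```

A histogram concentrated at low ranks means the true label is usually among the most likely candidates.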

Questions

1. What kinds of applications does SVD have? How is it used?

The SVD also has applications in digital signal processing, e.g., as a method for noise reduction. Here it summarizes different kinds of video data and combines the results into a common representation.

2. What does MPESAR stand for? What does this system do?

Mixture of Probability Experts for Semantic-Audio Retrieval. It learns the connections between a semantic space and an acoustic space.

- Ex) Given a description in words, the system finds the audio signal that best fits it.

3. How does MPESAR generally work?

The semantic space maps words into a high-dimensional probabilistic space. The acoustic space describes sounds by a multidimensional vector. The two are linked by a many-to-many connection.

Thank you
