
Post on 15-Jan-2016

Page 1: Presented by: Emmanuel Velasco City College of New York

Discussion on Video Analysis and Extraction, MPEG-4 and MPEG-7 Encoding and Decoding in Java, Java 3D, or OpenGL

Presented by: Emmanuel Velasco

City College of New York

Page 2

Video Analysis and Extraction

• As more videos are created, digitized, and archived, content-based search and retrieval becomes necessary. This involves analyzing a video and extracting its contents.

• The videos are cut into frames. The frames are analyzed, and objects can be extracted from them using image processing techniques.

Page 3

Video Analysis and Extraction

Temporal Video Segmentation

• Cut detection: The change in content is visible and occurs instantaneously between consecutive frames.

• Gradual transition detection: The image changes gradually, so multiple frames must be analyzed. Gradual transitions include fade in, fade out, wipe, and dissolve.

Page 4

Video Analysis and Extraction

Examples:

• Cut transition

• Gradual transition

Page 5

Video Analysis and Extraction

• The cut transition is easier to detect. We compute the difference between two consecutive frames and check whether it is greater than a certain threshold. If it is, a cut is declared.

• Gradual transitions are harder to detect. There are several methods, including the twin-comparison algorithm. It relies on the observation that the first and last frames of a transition differ strongly, while any two consecutive frames between them are similar.
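The threshold test for cuts can be sketched in a few lines of Java. This is a minimal illustration, not the presenter's implementation: it assumes frames are already decoded into grayscale arrays (values 0 to 255), uses the mean absolute pixel difference as the frame difference, and the names `CutDetector`, `frameDifference`, and `detectCuts` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal cut detector over grayscale frames (pixel values 0..255). */
final class CutDetector {

    /** Mean absolute pixel difference between two frames of equal size. */
    static double frameDifference(int[] a, int[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("frames must have the same size");
        }
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return (double) sum / a.length;
    }

    /** Returns every index i where frame i differs from frame i-1 by more than the threshold. */
    static List<Integer> detectCuts(int[][] frames, double threshold) {
        List<Integer> cuts = new ArrayList<>();
        for (int i = 1; i < frames.length; i++) {
            if (frameDifference(frames[i - 1], frames[i]) > threshold) {
                cuts.add(i);
            }
        }
        return cuts;
    }
}
```

The threshold is the only tuning parameter here. A value that is too low reports camera motion as cuts, and a value that is too high misses cuts between visually similar shots, so in practice it would have to be chosen per source.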

Page 6

Video Analysis and Extraction

Twin-Comparison Algorithm Results
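The twin-comparison idea from the previous page can be sketched as follows. This is a simplified reading of the description in these slides, not a full implementation of the published algorithm: it assumes grayscale frames, uses two hypothetical thresholds (`low` for "consecutive frames are changing slightly" and `high` for "first and last frames are different"), and compares the first and last candidate frames directly.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified twin-comparison sketch: two thresholds over grayscale frames. */
final class TwinComparison {

    /** Mean absolute pixel difference between two frames of equal size. */
    static double diff(int[] a, int[] b) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.abs(a[i] - b[i]);
        }
        return (double) sum / a.length;
    }

    /** Returns {start, end} frame-index pairs of detected gradual transitions. */
    static List<int[]> detectGradual(int[][] frames, double low, double high) {
        List<int[]> transitions = new ArrayList<>();
        int start = -1; // first frame of the current candidate transition, or -1 if none
        for (int i = 1; i < frames.length; i++) {
            double step = diff(frames[i - 1], frames[i]);
            if (start < 0) {
                // A small but noticeable change opens a candidate transition.
                if (step >= low && step < high) {
                    start = i - 1;
                }
            } else if (step < low) {
                // The change has settled: accept only if start and end differ strongly.
                if (diff(frames[start], frames[i - 1]) >= high) {
                    transitions.add(new int[] {start, i - 1});
                }
                start = -1;
            } else if (step >= high) {
                // An abrupt change inside the candidate is a cut, not a gradual transition.
                start = -1;
            }
        }
        // A candidate may still be open when the video ends.
        if (start >= 0 && diff(frames[start], frames[frames.length - 1]) >= high) {
            transitions.add(new int[] {start, frames.length - 1});
        }
        return transitions;
    }
}
```

For a fade that moves from brightness 0 to 100 in steps of 20, no single step exceeds a `high` threshold of 50, so a plain cut detector reports nothing, while the sketch above reports one transition spanning the whole fade.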

Page 7

Video Analysis and Extraction

Scene and Object Detection

• We want to identify objects in a video. One method is the opposite of transition detection: instead of finding differences between frames that are above a threshold, we look for image regions whose difference is below a certain threshold.

• Another method compares two images by trying all possible transformations between their edges.
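One possible reading of the first bullet is a block-wise comparison: split two frames into square blocks and keep the blocks whose difference stays below a threshold, since those regions show the same content in both frames. The sketch below assumes grayscale frames stored as `int[row][column]`; the block size, the threshold, and the class name `StableRegions` are assumptions made for illustration.

```java
/** Marks the blocks of a frame that changed less than a threshold since the previous frame. */
final class StableRegions {

    /**
     * Returns a grid with one entry per block; true means the mean absolute
     * difference inside that block is below the threshold.
     * Rows and columns that do not fill a whole block are ignored.
     */
    static boolean[][] stableBlocks(int[][] prev, int[][] curr, int blockSize, double threshold) {
        int blockRows = prev.length / blockSize;
        int blockCols = prev[0].length / blockSize;
        boolean[][] stable = new boolean[blockRows][blockCols];
        for (int br = 0; br < blockRows; br++) {
            for (int bc = 0; bc < blockCols; bc++) {
                long sum = 0;
                for (int y = br * blockSize; y < (br + 1) * blockSize; y++) {
                    for (int x = bc * blockSize; x < (bc + 1) * blockSize; x++) {
                        sum += Math.abs(prev[y][x] - curr[y][x]);
                    }
                }
                double mean = (double) sum / (blockSize * blockSize);
                stable[br][bc] = mean < threshold;
            }
        }
        return stable;
    }
}
```

Connected groups of stable (or, inversely, changed) blocks would then be candidate regions for further analysis; that grouping step is left out here.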

Page 8

Video Analysis and Extraction

Text Extraction

• We want to retrieve the captions in a video. While most text segmentation is done on high-resolution media, video is low resolution.

• One method is to assume that the gray levels of the text are lighter or darker than the background. By requiring a minimum difference from the background, the text can be extracted.
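The gray-level assumption can be illustrated with a small Java sketch. The slides do not say how the background level is obtained, so this example assumes it is the most frequent gray level in the caption region; `CaptionMask`, `backgroundLevel`, and `textMask` are hypothetical names.

```java
/** Separates caption pixels from the background by gray-level difference. */
final class CaptionMask {

    /** Estimates the background as the most frequent gray level (0..255). */
    static int backgroundLevel(int[] gray) {
        int[] histogram = new int[256];
        for (int value : gray) {
            histogram[value]++;
        }
        int mode = 0;
        for (int level = 1; level < 256; level++) {
            if (histogram[level] > histogram[mode]) {
                mode = level;
            }
        }
        return mode;
    }

    /** Marks pixels that are lighter or darker than the background by at least minDifference. */
    static boolean[] textMask(int[] gray, int minDifference) {
        int background = backgroundLevel(gray);
        boolean[] mask = new boolean[gray.length];
        for (int i = 0; i < gray.length; i++) {
            mask[i] = Math.abs(gray[i] - background) >= minDifference;
        }
        return mask;
    }
}
```

Because the test uses the absolute difference, the same code handles light text on a dark background and dark text on a light one. The resulting mask would still need to be passed to a character recognizer to obtain the caption text.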

Page 9

Video Analysis and Extraction

Example of Text Extraction

Page 10

Video Analysis and Extraction

So we see that video analysis and extraction is useful in our projects.

The Classroom Project:

Object detection is used for finding the location of the professor.

Text extraction is useful for capturing text in the PowerPoint slides shown in a video.

Page 11

Video Analysis and Extraction

The NYC Traffic Project:

Object detection is used for detecting how heavy or light the traffic is.

Transition detection is used to see if we are looking at the same view, or if the view has changed.

Page 12

MPEG-4

• Is an ISO/IEC compression standard created by the Moving Picture Experts Group (MPEG).

• Has been successfully used in:
  • digital television
  • interactive graphics applications
  • interactive multimedia

Page 13

MPEG-4

• Can bring multimedia to new networks such as mobile networks.

• Media objects are audio, video, or audiovisual content and can be natural (recorded using a camera and/or microphone) or synthetic (generated using a computer).

Page 14

MPEG-4

• An example of an MPEG-4 scene.

Page 15

MPEG-4

• The media objects are independent of their background. This allows easy extraction and easier editing of an object.

• The objects are synchronized in time and space.

Page 16

MPEG-4

• With a set of media objects, MPEG-4 allows us to:
  • place objects anywhere in a given coordinate system.
  • apply transforms to change a visual object geometrically or an audio object acoustically.
  • group objects together (such as the visual image of a person and their voice).
  • apply streamed data to media objects to modify their attributes.
  • change the user’s viewpoint or listening point anywhere in the scene.
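The placing, transforming, and grouping operations listed above can be pictured as a small scene tree. The Java sketch below is purely illustrative: it is not the MPEG-4 scene description format or any toolkit's API, and the names `SceneSketch`, `Node`, and `absolutePosition` are invented for this example.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy scene tree: every node has a position and scale relative to its parent. */
final class SceneSketch {

    /** A single media object, or a group when it has children. */
    static class Node {
        final String name;
        double x;
        double y;
        double scale = 1.0; // applied to the positions of this node's children
        final List<Node> children = new ArrayList<>();

        Node(String name, double x, double y) {
            this.name = name;
            this.x = x;
            this.y = y;
        }

        /** Groups a child under this node and returns this node for chaining. */
        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Returns the target's {x, y} in scene coordinates, or null if it is not in the tree. */
    static double[] absolutePosition(Node root, Node target) {
        return locate(root, target, 0.0, 0.0, 1.0);
    }

    private static double[] locate(Node node, Node target,
                                   double originX, double originY, double parentScale) {
        double absX = originX + node.x * parentScale;
        double absY = originY + node.y * parentScale;
        if (node == target) {
            return new double[] {absX, absY};
        }
        for (Node child : node.children) {
            double[] found = locate(child, target, absX, absY, parentScale * node.scale);
            if (found != null) {
                return found;
            }
        }
        return null;
    }
}
```

Grouping a person's image and voice under one node means that moving or scaling the group moves both together, which is the point of the grouping bullet above.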

Page 17

Encoder / Decoder Definitions

• Encoder: formats (electronic data) according to a standard format.

• Decoder: recognizes and interprets (an electronic signal).

Page 18

MPEG-4 Encoder / Decoder

While many MPEG-4 encoders and decoders exist as standalone applications, we want to be able to encode and decode using Java, Java 3D, or OpenGL.

Page 19

MPEG-4 Encoder / Decoder

• IBM Toolkit for MPEG-4 is a set of Java classes and APIs with five applications:
  • AVgen: a simple, easy-to-use GUI tool for creating audio/video-only content for ISMA- or 3GPP-compliant devices
  • XMTBatch: a tool for creating rich MPEG-4 content beyond simple audio and video
  • M4Play: an MPEG-4 client playback application
  • M4Applet for ISMA: a Java player applet for ISMA-compliant content
  • M4Applet for HTTP: a Java applet for MPEG-4 content played back over HTTP

Page 20

MPEG-4 Encoder / Decoder

IBM MPEG-4 XMT Editor Tool

[Screenshot of the editor. Callouts: Add media object, Time Frame, Object Attributes]

Page 21

MPEG-4 Encoder / Decoder

• IBM MPEG-4 Demos:

http://www.research.ibm.com/mpeg4/Demos/DemoSystems.htm

• SKLMP4 Encoder / Decoder is a C++ library that is capable of encoding and decoding MPEG-4.

http://skal.planet-d.net/coding/mpeg4codec.html

Page 22

MPEG-4

MPEG-4 can make it easier for us to extract the objects, since the objects are independent of one another.

The Classroom Project:

The professor is an image object, separated from the PowerPoint background.

Page 23

MPEG-4

The NYC Traffic Project:

The background (roads) is separate from the objects (cars).

The interactivity that MPEG-4 allows can make the user interface easier to use. Users can point and click on the map and view the cameras at that location.

Page 24

MPEG-7

• Since the amount of audiovisual data is increasing and it comes from many different sources, searching for a certain type of media content will become more difficult. Therefore we need a way to search the data quickly and efficiently. The solution is MPEG-7.

• MPEG-7 is a standard for describing media content. Unlike MPEG-1, MPEG-2, and MPEG-4, MPEG-7 is not a standard for the actual coding of moving pictures and audio.

Page 25

MPEG-7

• MPEG-7 uses XML Schema as the language of choice for content description.

• These descriptions may include information about the creation of the content (title, author), the storage features of the content (storage format, encoding), and low-level features of the content (color, texture, shape, motion, audio).
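To make the three kinds of information concrete, the sketch below builds a small XML description with the DOM and transformer classes that ship with Java. The element names (`ContentDescription`, `Creation`, `Storage`, `Features`, and their children) are simplified placeholders chosen for this example; they are not taken from the MPEG-7 schema, and the output is not a valid MPEG-7 document.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Builds an illustrative (not schema-valid) XML description of a media item. */
final class DescriptionSketch {

    static String describe(String title, String author, String format, String dominantColor) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("ContentDescription");
            doc.appendChild(root);

            // Creation information: who made the content and what it is called.
            Element creation = child(doc, root, "Creation", null);
            child(doc, creation, "Title", title);
            child(doc, creation, "Author", author);

            // Storage features: how the content is stored.
            Element storage = child(doc, root, "Storage", null);
            child(doc, storage, "Format", format);

            // Low-level features extracted from the content itself.
            Element features = child(doc, root, "Features", null);
            child(doc, features, "DominantColor", dominantColor);

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (ParserConfigurationException | TransformerException e) {
            throw new IllegalStateException("could not build the description", e);
        }
    }

    /** Appends a child element, with optional text content, and returns it. */
    private static Element child(Document doc, Element parent, String name, String text) {
        Element element = doc.createElement(name);
        if (text != null) {
            element.setTextContent(text);
        }
        parent.appendChild(element);
        return element;
    }
}
```

A real description would use the element names and types defined by the standard's schema, so that any MPEG-7 aware search tool could read it.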

Page 26

So what will MPEG-7 standardize?

• A set of descriptors (D): Descriptors define the syntax and the semantics of each feature (metadata element).

• A set of description schemes (DS): A description scheme specifies the structure and semantics of the relationships between its components.

Page 27

So what will MPEG-7 standardize?

• Description Definition Language (DDL): to define the syntax of the descriptors and description schemes.

Page 28

Some possible MPEG-7 Applications

• Audio: play a few notes on the keyboard, and it will return musical pieces with similar tunes.

• Graphics: sketch a few lines on a screen and get a set of images containing similar graphics or logos.

• Images: define objects, color patterns or textures and retrieve images that look like the image described.
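The image-retrieval scenario can be sketched as a query-by-example search over stored feature vectors. The example below stands in a plain gray-level histogram for the feature and ranks stored images by histogram intersection; this is a common simple technique chosen for illustration, not one of the descriptors defined by MPEG-7, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

/** Ranks stored images by how closely their histograms match a query histogram. */
final class HistogramSearch {

    /** Normalized gray-level histogram with the given number of bins (pixels 0..255). */
    static double[] histogram(int[] gray, int bins) {
        double[] h = new double[bins];
        for (int value : gray) {
            h[value * bins / 256]++;
        }
        for (int i = 0; i < bins; i++) {
            h[i] /= gray.length;
        }
        return h;
    }

    /** Histogram intersection: 1.0 for identical distributions, 0.0 for disjoint ones. */
    static double similarity(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += Math.min(a[i], b[i]);
        }
        return sum;
    }

    /** Returns the names of the stored images, most similar to the query first. */
    static List<String> rank(double[] query, Map<String, double[]> index) {
        List<String> names = new ArrayList<>(index.keySet());
        // Negate the score so that the highest similarity sorts first.
        names.sort(Comparator.comparingDouble((String n) -> -similarity(query, index.get(n))));
        return names;
    }
}
```

The design point is that the search never touches the pixels of the stored images, only their precomputed descriptions, which is what makes searching a large archive fast.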

Page 29

MPEG-7 Encoder / Decoder

• MPEG-7 Library is a set of C++ classes implementing the MPEG-7 standard.

http://iis.joanneum.at/mpeg-7/overview.htm

• Java MPEG-7 Audio Encoder is a Java library that provides an MPEG-7 audio encoder to describe audio content with some descriptors of the MPEG-7 standard.

http://www.ient.rwth-aachen.de/team/crysandt/software/mpeg7audioenc/

Page 30

MPEG-7

Once we have a lot of media content, MPEG-7 allows us to search through it more easily.

The Classroom Project:

If we have a lot of video, sound, or both, we can find the content we need quickly.

The NYC Traffic Project:

If there are many cameras at several locations, finding a specific location can be easier.
