mpeg & h.26l overview · 2004. 6. 5. · home entertainment (e.g., systems for the management...
MPEG & H.26L Overview
Nuno Vasconcelos (with thanks to Truong Nguyen)
Video Compression
• Codec Characteristics
• Temporal & Spatial Compression
• Codec Settings
• Compression Standards
• MPEG-7
Codec Characteristics
• Lossy v. Lossless
  – Can't use lossy compression for data or programs
• Spatial v. Temporal Compression
  – Intraframe: Discrete Cosine Transform (DCT)
  – Interframe: Keyframe + Difference frame
• Symmetric v. Asymmetric
  – Usually decoding needs to be faster, except for capture
• Software v. Hardware
  – Real-time capture needs hardware
  – MPEG-2 usually needs hardware
  – Web video should never need special hardware
Codec Characteristics (contd.)
• Hardware Requirements
  – Fast disk access for storage/retrieval of compressed file
  – Powerful processor for real-time compression/decompression
• Change makes Compression Difficult
  – Fast motion
  – Dramatic lighting changes
  – Low light level introduces noise
• Artifacts
  – Blockiness: low DCT values only
  – Blurriness: loss of high frequency DCT coefficients
Low Bit Rate Video Coding
• Why? Increasing demand for video conferencing and telephony applications; limited bandwidth in PSTN and wireless networks
• Video coding algorithms:
  – Waveform-based coding: MC+DCT/wavelets, 3D subband, etc.
  – Object- and model-based coding: shape coding, wireframes, etc.
• Video coding standards:
  – ITU-T: H.261 (1990), H.262 (1994), H.263 (1995), H.263+ (1998)
  – ISO/IEC: MPEG1 (1992), MPEG2 (1994), MPEG4 (1999)
• H.263 version 2 (H.263+): higher coding efficiency, more flexibility, scalability support, error resilience support
The H.263 Standard
[Figure: H.263 encoder block diagram. Labels: Video in, Inter/Intra, DCT, Q, IQ, IDCT, PRED, ME, VLC, MUX, Bit Stream]
Video Compression Standards
[Figure: standards grouped by application area]
• Communications: H.261, H.263 (real-time encode & decode; low delay, low bit-rate)
• Information/Entertainment: MPEG-1, MPEG-2 (real-time decode; delay not critical)
• MPEG-4
Video Stream Data Hierarchy
Types of Pictures (1)
• I (Intra) Picture
• P (Predicted) Picture
• B (Bidirectional) Picture
Types of Pictures (2)
[Figure: forward prediction and bidirectional prediction across a sequence of pictures]
Display order:      1 2 3 4 5 6 7 8 9
Picture type:       I B B B P B B B I
Transmission order: 1 5 2 3 4 9 6 7 8
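The reordering on this slide can be reproduced with a short sketch. It assumes the simple rule that every B picture waits until the next I or P picture (its future reference) has been sent; the function name `transmission_order` is illustrative and not part of any MPEG specification.

```python
def transmission_order(frame_types):
    """Return 1-based display indices in the order they would be sent.

    Assumption: each B picture is held back until the next I or P
    picture (its future reference) has been transmitted.
    """
    order = []
    pending_b = []  # B pictures waiting for their future reference
    for index, kind in enumerate(frame_types, start=1):
        if kind == "B":
            pending_b.append(index)
        else:
            # I or P picture: send it first, then the waiting B pictures.
            order.append(index)
            order.extend(pending_b)
            pending_b = []
    # Trailing B pictures with no later reference are appended unchanged.
    order.extend(pending_b)
    return order


print(transmission_order("IBBBPBBBI"))  # [1, 5, 2, 3, 4, 9, 6, 7, 8]
```

This is also why B pictures cost memory and delay: the decoder must hold both reference pictures before it can reconstruct them.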
Spatial Compression Process Flow
Chrominance Subsampling
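The figure for this slide is not recoverable, so the following is only a minimal sketch of one common scheme. It assumes 4:2:0-style subsampling (half resolution horizontally and vertically) done by averaging each 2x2 block of a chroma plane; real encoders may use other filters and sample positions.

```python
def subsample_420(chroma):
    """Average each 2x2 block of a chroma plane.

    `chroma` is a list of rows with even width and height (an
    assumption of this sketch). The result has half the width and
    half the height, i.e. one quarter of the samples.
    """
    height, width = len(chroma), len(chroma[0])
    return [
        [
            (chroma[y][x] + chroma[y][x + 1]
             + chroma[y + 1][x] + chroma[y + 1][x + 1]) / 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


cb = [[100, 102, 50, 52],
      [104, 106, 54, 56]]
print(subsample_420(cb))  # [[103.0, 53.0]]
```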
Perceptual Sensitivity
Discrete Cosine Transform
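As an illustration of the transform named on this slide, here is a direct (slow) implementation of the orthonormal 2D DCT-II on a square block. It is a sketch for understanding only; production codecs use fast, integer-friendly factorizations.

```python
import math


def dct_2d(block):
    """Orthonormal 2D DCT-II of an n x n block given as a list of rows."""
    n = len(block)

    def scale(k):
        # Normalization that makes the transform orthonormal.
        return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):          # vertical frequency
        for v in range(n):      # horizontal frequency
            total = 0.0
            for y in range(n):
                for x in range(n):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * total
    return out


# A flat block has all of its energy in the DC coefficient.
flat = [[128] * 8 for _ in range(8)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0]))  # 1024
```

Smooth image regions behave like the flat block: most energy lands in a few low-frequency coefficients, which is what makes the subsequent quantization effective.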
Quantization
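A minimal sketch of the quantization step, assuming a single uniform step size for every coefficient. This is a simplification: MPEG-style codecs typically use a quantization matrix with coarser steps for higher frequencies. The coefficient values are made up for illustration.

```python
def quantize(coeffs, step):
    """Map each DCT coefficient to an integer level (lossy)."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


def dequantize(levels, step):
    """Reconstruct approximate coefficients from the integer levels."""
    return [[level * step for level in row] for row in levels]


# Hypothetical 3x3 corner of a coefficient block.
coeffs = [[1024.0, 37.0, -5.0],
          [12.0, -3.0, 1.0],
          [4.0, 0.5, -0.2]]
levels = quantize(coeffs, step=16)
print(levels)                  # [[64, 2, 0], [1, 0, 0], [0, 0, 0]]
print(dequantize(levels, 16))  # [[1024, 32, 0], [16, 0, 0], [0, 0, 0]]
```

Small coefficients collapse to zero, which is where the compression comes from and also where the information is lost.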
Scan Types
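The slide's figure is lost, so which scans it showed is an assumption; the zigzag scan is the most familiar one and is sketched below. It orders the coefficients of a block from low to high frequency so that the zeros produced by quantization cluster at the end of the sequence.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd anti-diagonals are walked with the row increasing,
        # even anti-diagonals with the column increasing.
        return (diagonal, row if diagonal % 2 else col)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)


print(zigzag_order(4)[:6])
# [(0, 0), (0, 1), (1, 0), (2, 0), (1, 1), (0, 2)]
```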
Block Matching Algorithm
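A minimal sketch of block matching, assuming an exhaustive (full) search and the sum of absolute differences (SAD) as the matching cost. The slide does not say which search strategy or cost it illustrated, and practical encoders usually use faster search patterns. All names are illustrative.

```python
def sad(cur, ref, cy, cx, ry, rx, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(size) for i in range(size)
    )


def best_motion_vector(cur, ref, y, x, size, search):
    """Full search: return (dy, dx, cost) of the best match in `ref`
    for the block of `cur` whose top-left corner is (y, x)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = y + dy, x + dx
            # Skip candidates that fall outside the reference frame.
            if 0 <= ry <= height - size and 0 <= rx <= width - size:
                cost = sad(cur, ref, y, x, ry, rx, size)
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2], best[0]


def frame_with_patch(top, left, size=6):
    """Black frame with a bright 2x2 patch at (top, left)."""
    frame = [[0] * size for _ in range(size)]
    for j in range(2):
        for i in range(2):
            frame[top + j][left + i] = 200
    return frame


ref = frame_with_patch(1, 1)   # previous frame
cur = frame_with_patch(2, 3)   # patch moved down 1, right 2
print(best_motion_vector(cur, ref, y=2, x=3, size=2, search=2))  # (-1, -2, 0)
```

The vector points from the current block back to its match in the reference frame, so a patch that moved down and right yields negative components.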
Example – Motion Vector Field
Backward Prediction
GOP (Group Of Pictures) - Order of Arrival
• I frames (intra-frame): spatially compressed only
• P frames (predicted frames): predicted from I frames or other P frames
• B frames (bidirectional frames): interpolated between I and P frames (must be buffered)
GOP – Quality Tradeoff
MPEG Streams
Scalability
• Data partitioning
  – Only lower order DCT coefficients are transmitted
• SNR
  – Standard quality picture + low-noise "helper" signal
• Spatial
  – Standard size + additional High Definition (HD) layer
• Temporal
  – e.g. MPEG-2 where B pictures are in separate layer
• Level & Profile
  – MP@ML 15 Mb/sec
Problem video compression scenarios for MPEG-2
• Quick changes in luminosity, e.g. leaves, water, flashbulbs
• Circular motion, because motion prediction assumes objects move in a straight line
• Alternating wavy lines, a variation of circular motion
• Sharp, high-contrast edges, as for fonts or graphics
• Multiple motions, where a single image splits into two or more, which confuses motion prediction
Codec Settings
• Quality
  – Don't use less than 50
  – Law of diminishing returns for higher values
• Frames/Sec
  – Use a submultiple of the source frame rate if possible
• Keyframe every N frames
  – Depends on the codec; Sorenson default is 1 every 10 secs
  – Transitions and cuts require keyframes
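One way an encoder might combine a fixed keyframe interval with keyframes at cuts is sketched below. It assumes a cut is detected when the mean absolute pixel difference between successive frames exceeds a threshold; the function names, the threshold, and the interval are hypothetical and do not describe how Sorenson or any other codec actually decides.

```python
def mean_abs_diff(a, b):
    """Mean absolute pixel difference between two equally sized frames."""
    total = sum(
        abs(p - q)
        for row_a, row_b in zip(a, b)
        for p, q in zip(row_a, row_b)
    )
    return total / (len(a) * len(a[0]))


def choose_keyframes(frames, threshold, max_interval):
    """Return indices of frames to encode as keyframes.

    A keyframe is placed at frame 0, at every detected cut, and whenever
    `max_interval` frames have passed since the last keyframe.
    """
    keyframes = [0]
    for n in range(1, len(frames)):
        cut = mean_abs_diff(frames[n], frames[n - 1]) > threshold
        overdue = n - keyframes[-1] >= max_interval
        if cut or overdue:
            keyframes.append(n)
    return keyframes


dark = [[10, 10], [10, 10]]
bright = [[200, 200], [200, 200]]
frames = [dark, dark, dark, bright, bright, bright, bright, bright]
print(choose_keyframes(frames, threshold=50, max_interval=4))  # [0, 3, 7]
```

Frame 3 is chosen because of the cut from dark to bright; frame 7 is chosen only because four frames have passed since the previous keyframe.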
Codec Settings (contd.)
• Automatic Key Frames
  – Can specify a difference threshold
• Limit data rate
  – Limits the size for variable rate codecs
• Data Rate Tracking
  – Combination of fixed and variable rates
  – 0% => fixed only
  – 100% => data rate depends entirely on content
• Temporal Scalability (very simple)
  – 2:1: drop every second frame
  – 3:2:1: first drop every third frame, then every second
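The frame-dropping scheme can be sketched as follows. The reading of "3:2:1" used here is an assumption: dropping every third frame leaves two thirds of the frames, and dropping every second of those leaves one third.

```python
def drop_every(frames, n):
    """Drop the n-th, 2n-th, ... frame (1-based) and keep the rest."""
    return [f for i, f in enumerate(frames, start=1) if i % n != 0]


frames = list(range(1, 13))          # frame numbers 1..12

half = drop_every(frames, 2)
print(half)                          # [1, 3, 5, 7, 9, 11]

two_thirds = drop_every(frames, 3)
print(two_thirds)                    # [1, 2, 4, 5, 7, 8, 10, 11]

one_third = drop_every(two_thirds, 2)
print(one_third)                     # [1, 4, 7, 10]
```

Dropping frames this freely is only safe for frames that nothing else depends on, which is why B pictures are the natural candidates in an MPEG stream.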
Compression Standards
• Motion JPEG (MJPEG)
  – Sequence of JPEGs
  – Often used for video capture
  – MJPEG A: 3 Mbytes/sec, 7:1 compression
• H.261
  – Videoconferencing over ISDN: 64 to 1920 kbits/sec
  – Part of H.32X series of standards
• MPEG-1
  – Structure: block – macroblock – slice – picture – GOP – sequence
  – Uses prediction or motion estimation: I, P & B pictures
  – No interlacing
Compression Standards (contd.)
• MPEG-2 (H.262)
  – Wide range of bit rates, resolutions and frame sizes
  – Interlacing
  – Scalability: receiver can decode a subset of the full bitstream
• MPEG-4
  – Designed for low bitrate multimedia applications
  – Video Object Planes (VOPs): similar to a sprite or a Photoshop layer
  – Segmentation of picture into irregular shapes
  – Texture coding: Discrete Wavelet Transform (DWT) used instead of DCT (for still textures)
Compression Standards (contd.)
• H.263
  – Incorporates MPEG 1 & 2 technology into videoconferencing
• MPEG-7
  – Multimedia content description interface: indexing & query
  – Features extracted from keyframes and stored as metadata
MPEG 7
• Multimedia Content Description Interface
  – MPEG-1 & MPEG-2 are for compression
  – MPEG-4 is for content objects such as sprites
  – MPEG-7
    • How to identify and manage audio-visual content
    • Can be used independently of other MPEG standards
    • Similar to XML
• Specifies a standard set of descriptors which can be used to describe various types of multimedia information
• Offers different levels of granularity for feature descriptions
  – Low
    • Visual: shape, size, texture, color, motion descriptors
    • Audio: key, tempo, timbre/spectral composition
  – High
    • "Roy Keane scores winning goal in Ireland v. Cameroon match"
• Features may be extracted manually or automatically
Additional multimedia descriptive information
• Format
  – The coding scheme used (e.g. JPEG, MPEG-2, MP3), or the overall data size
• Conditions for accessing
  – This could include intellectual property rights information, and price
• Classification
  – This could include parental rating, and content classification into a number of pre-defined categories
• Links to other relevant material
  – In the case of recorded non-fiction content, it is very important to know the occasion of the recording. The information may help the user speed up the search
MPEG-7 Architecture
Possible Application Areas
• Architecture, real estate, and interior design (e.g., searching for ideas)
• Broadcast media selection (e.g., radio channel, TV channel)
• Cultural services (history museums, art galleries, etc.)
• Digital libraries (e.g., image catalogue, musical dictionary, bio-medical imaging catalogues, film, video and radio archives)
• E-Commerce (e.g., personalised advertising, on-line catalogues, directories of e-shops)
• Education (e.g., repositories of multimedia courses, multimedia search for support material)
• Home Entertainment (e.g., systems for the management of personal multimedia collections, including manipulation of content, e.g. home video editing, searching a game, karaoke)
• Investigation services (e.g., human characteristics recognition, forensics)
• Journalism (e.g. searching speeches of a certain politician using his name, his voice or his face)
• Multimedia directory services (e.g. yellow pages, tourist information, geographical information systems)
• Multimedia editing (e.g., personalised electronic news service, media authoring)
• Remote sensing (e.g., cartography, ecology, natural resources management)
• Shopping (e.g., searching for clothes that you like)
• Social (e.g. dating services)
• Surveillance (e.g., traffic control, surface transportation, non-destructive testing in hostile environments)
Taken From: http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm