  • H.264 Deblocking Filter. Irfan Ullah, Department of Information and Communication Engineering, Myongji University, Yongin, South Korea. Copyright solarlits.com

  • Outline
    Introduction
    H.264 encoder and decoder
    Overview of DBF algorithm
    Hardware architecture of DBF
    Comparison with previous architectures

  • Introduction
    Video compression: H.264/MPEG-4
    The H.264 encoder and decoder both include a deblocking filter (DBF)
    DBF improves the visual quality of decoded frames by reducing artifacts and discontinuities
    The DBF algorithm is more complex in H.264 than in older standards

  • Introduction
    Steps
    Applied to each edge of all the 4x4 luma and chroma blocks in a macroblock
    Updates up to 3 pixels in each direction
    Whether deblocking is applied depends on the current and neighboring 4x4 blocks
    DBF 16x16
    DBF 16x16 hardware has less area and consumes less power than DBF 4x4 hardware
    To address the hardware cost issue
    A macroblock is a 16x16 pixel array

  • Objective
    Hardware implementation of deblocking filter for H.264
    4x4 block and 16x16 block
    For portable devices (hardware cost issue)
    4x4 for high performance
    16x16 for low power consumption

  • H.264 encoder block diagram

  • H.264 decoder block diagram

  • Edge filtering order
    DBF removes disturbing block boundaries
    4x4 luma and chroma blocks
    Vertical block edges are filtered before horizontal block edges
    Up to 3 pixels are changed on both sides of an edge
    DBF steps
    Edge level (boundary strength)
    Sample level (α, β threshold values)
    Slice level (offset parameters)

  • Edge-level adaptivity of the filter
    A boundary strength (Bs) parameter is assigned to every edge of a 4x4 block
    Evaluated from top to bottom
    Bs determines the strength of the filtering
    4 means strongest filter, 0 means no filter
    1-3 means standard filter
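The Bs assignment on this slide can be sketched as follows. This is a simplified illustration, not the presenter's hardware: the derivation rules follow the commonly described H.264 scheme for progressive frames with one motion vector per block, and the `Block4x4` fields and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Block4x4:
    """Hypothetical summary of one 4x4 block (field names are illustrative)."""
    intra: bool        # block belongs to an intra-coded macroblock
    has_coeffs: bool   # block contains non-zero transform coefficients
    ref_frame: int     # index of the reference frame used for prediction
    mv: tuple          # motion vector (x, y) in quarter-sample units


def boundary_strength(p, q, on_mb_edge):
    """Simplified Bs for the edge between neighboring blocks p and q."""
    if p.intra or q.intra:
        # Intra blocks get the strongest values; 4 only on macroblock edges.
        return 4 if on_mb_edge else 3
    if p.has_coeffs or q.has_coeffs:
        return 2
    if (p.ref_frame != q.ref_frame
            or abs(p.mv[0] - q.mv[0]) >= 4
            or abs(p.mv[1] - q.mv[1]) >= 4):
        return 1
    return 0


def filter_mode(bs):
    """Map Bs to the filter named on the slide."""
    if bs == 0:
        return "none"
    if bs == 4:
        return "strong"
    return "standard"


if __name__ == "__main__":
    inter = Block4x4(intra=False, has_coeffs=False, ref_frame=0, mv=(0, 0))
    intra = Block4x4(intra=True, has_coeffs=False, ref_frame=0, mv=(0, 0))
    print(filter_mode(boundary_strength(intra, inter, on_mb_edge=True)))
    print(filter_mode(boundary_strength(inter, inter, on_mb_edge=False)))
```

The sketch ignores field/frame coding and multiple reference pictures, which add further cases in the standard.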

  • Sample-level adaptivity of the filter
    Distinguish between true edges and those created by quantization
    True edges should be left unfiltered while artificial edges are filtered
    Quantization-dependent parameters α and β
    Filtering or not?
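The "filtering or not?" decision can be sketched as below. The test is the usual H.264 sample-level condition; the sample names `p1, p0, q0, q1` (two samples on each side of the edge) and the numeric thresholds in the example are illustrative assumptions, since the real α and β come from tables indexed by the quantization parameter.

```python
def should_filter(p1, p0, q0, q1, bs, alpha, beta):
    """Return True if the line of samples across the edge should be filtered.

    p1, p0 lie on one side of the edge and q0, q1 on the other.
    A step larger than alpha across the edge, or larger than beta inside
    either block, is treated as a true image edge and left untouched.
    """
    return (bs > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)


if __name__ == "__main__":
    # Small step across the edge: likely a quantization artifact.
    print(should_filter(60, 62, 66, 68, bs=2, alpha=10, beta=4))
    # Large step across the edge: likely a true edge in the picture.
    print(should_filter(60, 62, 120, 122, bs=2, alpha=10, beta=4))
```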

  • Slice-level adaptivity of the filter
    Encoder selects offsets to adjust α and β
    True edges should be left unfiltered while artificial edges are filtered
    The properties of the deblocking filter are controlled by transmitting nonzero offsets
    Negative offsets reduce the amount of filtering
    Positive offsets increase the amount of filtering
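A minimal sketch of how the slice-level offsets act, assuming the usual H.264 scheme in which the offsets shift the table indices used to look up α and β. The function and argument names are hypothetical, and the threshold tables themselves are not reproduced here.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def threshold_indices(qp_p, qp_q, offset_a=0, offset_b=0):
    """Return the (alpha, beta) table indices for one edge.

    qp_p and qp_q are the quantization parameters of the two blocks;
    offset_a and offset_b are the slice-level offsets chosen by the encoder.
    """
    qp_avg = (qp_p + qp_q + 1) >> 1
    index_a = clip3(0, 51, qp_avg + offset_a)
    index_b = clip3(0, 51, qp_avg + offset_b)
    return index_a, index_b


if __name__ == "__main__":
    print(threshold_indices(28, 30))                           # no offsets
    print(threshold_indices(28, 30, offset_a=-4, offset_b=2))  # shifted
```

Because the thresholds grow with the index, a negative offset lowers α or β and filters fewer edges, while a positive offset raises them and filters more.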

  • H.264 deblocking filter algorithm
    Small change in intensity
    Clipping to remove blurring by limiting the change
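The clipping step can be illustrated with the standard (Bs < 4) filter for the two samples next to the edge. This is a sketch under assumptions: `tc` is passed in directly rather than looked up from the standard's table, 8-bit samples are assumed, and the updates to `p1`/`q1` and the strong (Bs = 4) filter are left out.

```python
def clip3(lo, hi, x):
    """Clamp x to the inclusive range [lo, hi]."""
    return max(lo, min(hi, x))


def clip1(x, bit_depth=8):
    """Clamp x to the valid sample range for the given bit depth."""
    return clip3(0, (1 << bit_depth) - 1, x)


def filter_edge_normal(p1, p0, q0, q1, tc):
    """Filter p0 and q0 across an edge, limiting the correction to +/- tc."""
    delta = clip3(-tc, tc, (((q0 - p0) << 2) + (p1 - q1) + 4) >> 3)
    return clip1(p0 + delta), clip1(q0 - delta)


if __name__ == "__main__":
    # Small step: the correction is below the clipping limit.
    print(filter_edge_normal(60, 62, 66, 68, tc=2))
    # Larger step: the raw correction of 8 is clipped to tc = 3.
    print(filter_edge_normal(60, 60, 80, 80, tc=3))
```

Limiting the correction to `tc` keeps the filter from smoothing away more than a quantization step could plausibly have introduced.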

  • Hardware architecture
    IBUF is used to store one reconstructed MB (256 luma pixels + 128 chroma pixels)
    SPAD and SRAMs store partially filtered pixels
    DATAPATH serves both DBF 4x4 and DBF 16x16

  • Hardware architecture contd..
    Two-stage pipeline
    1st stage includes a 12-bit adder and two shifters
    2nd stage includes a 12-bit comparator, several two's complementers and multiplexers

    Conditional branch results
    Multiplication and addition

  • Hardware architecture contd..
    4x4 DBF starts filtering as soon as a new 4x4 block is ready
    16x16 DBF waits for IBUF to be filled by IT/IQ
    Starts filtering after a new block is ready
    Processing order of 4x4 blocks by the IT/IQ module
    Hybrid edge filtering order
    Standard sequential filtering order

  • Hardware architecture contd..
    Neighbors should be available in local on-chip memory
    Left 4x4 blocks are stored in SPAD
    Upper 4x4 blocks are stored in the LUMA and CHRM SRAMs
    For a CIF (352x288) video
    Upper 4x4 luminance blocks: 4 x 352 x 8 = 11264
    Upper 4x4 chrominance blocks: 4 x 88 x 8 + 4 x 88 x 8 = 5632

    Previously, off-chip memory was used, but on-chip memory consumes less power

    No need to transpose pixel arrays
    To remove irregularity
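The storage expressions on this slide can be evaluated with a short sketch. It assumes each factor means rows x width x bits per pixel (4 rows of upper-neighbor pixels, 8 bits per pixel) and uses the widths as written on the slide (352 for luma, 88 for each chroma term); the function name is hypothetical.

```python
def upper_neighbor_bits(width_pixels, rows=4, bits_per_pixel=8):
    """Bits needed to hold `rows` pixel rows across the given width."""
    return rows * width_pixels * bits_per_pixel


# Luma: 4 x 352 x 8
luma_bits = upper_neighbor_bits(352)
# Chroma: 4 x 88 x 8 + 4 x 88 x 8 (two terms, as written on the slide)
chroma_bits = upper_neighbor_bits(88) + upper_neighbor_bits(88)

if __name__ == "__main__":
    print(luma_bits)    # 11264
    print(chroma_bits)  # 5632
```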

  • Implementation
    Video frame is loaded into SRAM
    It is used as an input to the DBF running on the FPGA
    DBF hardware applies the H.264 DBF algorithm
    And writes the frame back to SRAM
    The resulting frame is shown on the LCD

    200 MHz, 30 VGA (640x480) frames/second. Synthesized to 7.4 K and 5.3 K gates
    Xilinx Virtex II FPGA, power estimated using Xilinx XPower tool
    ARM Versatile PB926EJ-S development board

  • Performance
    36% less power consumption
    Reading unfiltered MBs and writing the filtered MBs to the SRAM

  • Comparison
    [20]
    Standard cell methodology is a method of designing application-specific integrated circuits (ASICs) with digital-logic features.

    A standard cell library is a collection of low-level electronic logic functions such as AND, OR, INVERT, flip-flops, latches, and buffers.

  • Low power H.264 Deblocking Filter with hybrid filtering
    Presented by: Irfan Ullah

  • Outline
    Introduction of edge filtering
    Architecture of DBF
    Transposition buffer
    Comparison with previous architectures

  • Edge filter order for 16x16 macroblock

  • Edge filter order for 16x16 macroblock contd

  • Edge filter order for 16x16 macroblock contd

  • low power DF architecture

  • Hybrid architecture
    Horizontal Edge Skip Processing Architecture (HESPA)

  • Transposition buffer usage
    For QCIF video: 176x144
    Left neighbor SRAM: 32x32 bits
    Upper neighbor SRAM: 352x32 bits
    Transposition buffer: 640 bits
    Transposition buffer operates on a 4x4 block of the current MB
    32-bit data bus to access 4 samples at a time
    Each filtered output needs 4 clock cycles
    Total cycles: 4 x 48 + 4 = 196 cycles
    Correct arrangement of data with separate SRAMs
    HESPA reduces 100 clock cycles per MB
    48 edges
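The cycle count and memory sizes on this slide can be checked with a small sketch. The function names are hypothetical, and reading the trailing "+ 4" as a fixed start-up cost is an assumption; the slide gives only the expression.

```python
def total_cycles(edges=48, cycles_per_edge=4, extra_cycles=4):
    """Cycles per macroblock: 4 cycles for each of 48 edges, plus 4.

    `extra_cycles` models the "+ 4" term from the slide; its exact
    meaning (for example pipeline start-up) is assumed, not stated.
    """
    return cycles_per_edge * edges + extra_cycles


def sram_bits(words, word_width=32):
    """Capacity in bits of an SRAM with `words` entries of `word_width` bits."""
    return words * word_width


if __name__ == "__main__":
    print(total_cycles())   # 196
    print(sram_bits(32))    # left neighbor SRAM: 1024 bits
    print(sram_bits(352))   # upper neighbor SRAM: 11264 bits
```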

  • Transposition Buffer Usage contd..

  • Comparison

  • 3D image processing VLSI system. Irfan Ullah, Department of Information and Communication Engineering, Myongji University, Yongin, South Korea. Copyright solarlits.com

  • Introduction
    Image processing (vision systems, multimedia processing, consumer electronics)
    Fast computational speed, small chip size, low power
    Read/write operation, signal control, data flow

  • 3D image system
    The 3D VLSI image chip is divided into several layers
    Stacked vertically
    Through-Silicon Via (TSV)
    Used to
    Avoid multi-layer pipeline delay
    Improve system operation
    Reconfigurable memory
    Bandwidth
    Decrease size
    IBM's Silicon Carrier Packaging Technology

  • 3D image system Contd..
    Single instruction multiple data (SIMD)
    Multiple instruction multiple data (MIMD)

  • 3D image system Contd..

  • Chip architecture 3D image system

  • Process control
    The 3D image processor system can control image memory configuration and pipeline data flow.

  • Network on chip

  • 3D image system Contd..
    A common robust design method for repairing VLSI system errors is reconfigurable re-healing technology