Cache Optimization for Real Time MPEG-4 Encoder
![Page 1: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/1.jpg)
Cache Optimization for Real Time MPEG-4 ENCODER
EEL 6935 Embedded Systems Long Presentation 2
Group Members: Qin Chen, Xiang Mao
ECE@UFL
4/2/2010
![Page 2: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/2.jpg)
Outline
• Design goals and challenges
• Video encoding basics
• Memory/cache optimization
  • Arrangement of L2 cache
  • Arrangement of L1 cache
• Experimental results
• Possible improvements
• Summary
![Page 3: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/3.jpg)
Design Goal
• Platform: TI TMS320DM642 DSP
  • Video application oriented DSP
  • 5760 MIPS @ 720 MHz
• MPEG-4 encoding
  • Resolution: D1 (720x576)
  • Frame rate: > 25 fps
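To give a feel for how tight the design goal is, the sketch below estimates the average number of CPU cycles available per 16x16 macroblock at D1 resolution and 25 fps. The function names are illustrative and not from the presentation, and the estimate ignores memory stalls, which is exactly the cost that cache optimization aims to reduce.

```c
#include <assert.h>
#include <stdint.h>

/* Number of 16x16 macroblocks in a frame; dimensions must be multiples of 16. */
uint32_t macroblocks_per_frame(uint32_t width, uint32_t height)
{
    assert(width % 16 == 0 && height % 16 == 0);
    return (width / 16) * (height / 16);
}

/* Average CPU cycles available per macroblock at a given clock and frame rate
 * (integer division, so the result is rounded down). */
uint64_t cycles_per_macroblock(uint64_t clock_hz, uint32_t fps,
                               uint32_t width, uint32_t height)
{
    assert(fps > 0);
    return clock_hz / ((uint64_t)fps * macroblocks_per_frame(width, height));
}
```

With a 720 MHz clock, `cycles_per_macroblock(720000000ULL, 25, 720, 576)` covers 1620 macroblocks per frame and leaves roughly 17,777 cycles for each one, including motion estimation, transform, and entropy coding.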
![Page 4: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/4.jpg)
Two Level Cache
![Page 5: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/5.jpg)
Challenges
• Limited cache capacity vs. high-rate data exchange
• Needs smart configuration of memory/cache, as well as intelligent arrangement of the algorithm flow
![Page 6: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/6.jpg)
Video Sequence Hierarchy
![Page 7: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/7.jpg)
Still Image and Video Coding
• Spatial redundancy (within a frame)
• Temporal redundancy (between frames)
• Frame sequence: intra frame, inter frame, inter frame
![Page 8: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/8.jpg)
Motion Estimation/Compensation
![Page 9: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/9.jpg)
Video Encoder
![Page 10: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/10.jpg)
Video Encoder vs. Decoder
• Video standards (MPEG-2/4, H.263/H.264) only define decoder operation, i.e., the bitstream syntax and the interpretation of the bitstream
• The encoder has the flexibility to choose from the available coding tools, as long as the encoded bitstream has valid syntax
• Encoders may have different optimization targets:
  • Bitrate, quality, complexity, power consumption, etc.
• The encoder is usually more complex than the decoder, due to motion estimation, DCT, etc.
![Page 11: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/11.jpg)
YUV Color Space
• RGB
  • 3 bytes/pixel
• YUV420
  • 1 + 1/4 + 1/4 = 1.5 bytes/pixel
• For D1 resolution
  • 720 x 576 x 1.5 = 607 KB/frame
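The per-frame arithmetic can be checked with a short sketch. It assumes 8-bit samples and planar YUV 4:2:0, where the U and V planes are each subsampled by two in both directions; the function names are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one RGB frame: three 8-bit samples per pixel. */
size_t rgb_frame_bytes(size_t width, size_t height)
{
    return width * height * 3;
}

/* Bytes in one YUV 4:2:0 frame: a full-resolution Y plane plus U and V
 * planes at half resolution in each direction (1 + 1/4 + 1/4 = 1.5 B/pixel). */
size_t yuv420_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    size_t luma = width * height;
    size_t chroma = (width / 2) * (height / 2);
    return luma + 2 * chroma;
}
```

For 720 x 576 this gives 622,080 bytes (about 607 KB with 1 KB = 1024 bytes) for YUV420, half of the 1,244,160 bytes an RGB frame would need.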
![Page 12: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/12.jpg)
Problem 1
• How to use the L2 cache efficiently for both data and code?
• L2 cache size: 256 KB
• Configurable: can be set up as addressable memory or as cache
![Page 13: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/13.jpg)
Frame Data
• For encoding each frame, 3 frames need to be accessed
  • Current frame
  • Reference frame
  • Reconstructed frame
• Size of one frame in D1 resolution
  • 720 x 576 x 1.5 = 607 KB
• Note that the memory for the current and reconstructed frames can be shared.
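A quick comparison of the frame storage against the 256 KB of L2 shows why whole frames cannot simply be kept on chip. This is a minimal sketch with illustrative names; it assumes 8-bit YUV420 frames as above.

```c
#include <assert.h>
#include <stddef.h>

#define L2_BYTES (256u * 1024u)   /* L2 size stated in the slides */

/* Bytes needed to hold n_frames YUV 4:2:0 frames of the given size. */
size_t frame_store_bytes(size_t width, size_t height, size_t n_frames)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return n_frames * (width * height * 3 / 2);
}

/* Returns 1 if the given number of bytes fits entirely in L2, 0 otherwise. */
int fits_in_l2(size_t bytes)
{
    return bytes <= L2_BYTES;
}
```

Three D1 frames need 1,866,240 bytes, and two frames (if current and reconstructed share memory) still need 1,244,160 bytes. Even a single frame exceeds L2, which suggests that frames are held outside L2 and only the portion being worked on is brought in.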
![Page 14: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/14.jpg)
ME Data Flow
![Page 15: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/15.jpg)
L2 Cache Configuration
[Figure labels: 192K, 64K, 102K, 56K, 34K]
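The figure itself is not recoverable from the transcript, but the five sizes are arithmetically consistent with one reading: 192 KB configured as addressable memory plus 64 KB left as cache, with the 192 KB further divided into three regions. The sketch below checks only that arithmetic. The interpretation of each number, and the region names, are assumptions rather than something the slides state.

```c
#include <assert.h>

enum {
    L2_TOTAL_KB = 256,  /* L2 size from the slides */
    L2_SRAM_KB  = 192,  /* assumed: part configured as addressable memory */
    L2_CACHE_KB = 64,   /* assumed: part left as cache */
    /* Assumed sub-regions of the addressable part; the slide gives only
     * the sizes, so the names here are placeholders. */
    REGION_A_KB = 102,
    REGION_B_KB = 56,
    REGION_C_KB = 34
};

/* Compile-time check that the assumed top-level split covers all of L2. */
static_assert(L2_SRAM_KB + L2_CACHE_KB == L2_TOTAL_KB,
              "SRAM and cache portions must add up to the L2 size");

/* Returns 1 if the three assumed regions exactly fill the SRAM portion. */
int l2_regions_fill_sram(void)
{
    return REGION_A_KB + REGION_B_KB + REGION_C_KB == L2_SRAM_KB;
}
```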
![Page 16: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/16.jpg)
Problem 2
• How to use the L1 program cache (L1P) efficiently?
• L1P has a capacity of 16 KB
• The MPEG-4 encoder code is around 100~200 KB in total. The MB-level code is several dozen KB
• Without proper arrangement, the cache miss rate is high
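One way to see why code placement matters is to model L1P as a direct-mapped cache and test whether two routines compete for the same lines. The sketch below assumes a direct-mapped organization with 32-byte lines, which is how TI documents the C64x L1P; the addresses in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define L1P_BYTES      (16u * 1024u)  /* L1P capacity from the slides */
#define L1P_LINE_BYTES 32u            /* assumed line size */
#define L1P_LINES      (L1P_BYTES / L1P_LINE_BYTES)

/* Line index that a code address maps to in a direct-mapped cache. */
uint32_t l1p_line_index(uint32_t addr)
{
    return (addr / L1P_LINE_BYTES) % L1P_LINES;
}

/* Returns 1 if code regions [a, a+a_len) and [b, b+b_len) share at least
 * one cache line index, so that running them alternately makes each one
 * evict the other's instructions. Returns 0 otherwise. */
int l1p_regions_conflict(uint32_t a, uint32_t a_len, uint32_t b, uint32_t b_len)
{
    unsigned char used[L1P_LINES] = {0};
    uint32_t line;

    assert(a_len > 0 && b_len > 0);

    /* Mark every line index touched by region a. */
    for (line = a / L1P_LINE_BYTES; line <= (a + a_len - 1) / L1P_LINE_BYTES; ++line)
        used[line % L1P_LINES] = 1;

    /* Region b conflicts if it touches any marked index. */
    for (line = b / L1P_LINE_BYTES; line <= (b + b_len - 1) / L1P_LINE_BYTES; ++line)
        if (used[line % L1P_LINES])
            return 1;
    return 0;
}
```

Two 4 KB routines placed exactly 16 KB apart (for example at 0x0000 and 0x4000) map to the same lines and conflict, while placing the second one at 0x1000 avoids the overlap. Grouping the routines of the MB-level loop so that they occupy distinct line ranges is the kind of arrangement the slide refers to.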
![Page 17: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/17.jpg)
Encoding Code Flow
[Figure labels: Intra Frame, Motion Estimation, Motion Compensation]
![Page 18: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/18.jpg)
Experiment Setup
• TI DM642 evaluation board
• DVD player: D1 video source
• PC: video decoder
![Page 19: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/19.jpg)
L1 Program Cache
![Page 20: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/20.jpg)
L1 Data Cache
![Page 21: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/21.jpg)
Final Frame Rate
• Scene 1
  • Action movie, intensive motion
• Scene 2 & 3
  • Medium motion
• Scene 4
  • Live head-and-shoulder scene captured from camera
![Page 22: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/22.jpg)
Possible Improvements
• If the data transfer time using DMA is long, we can use a ping-pong buffer to further reduce the DMA waiting time
• Ping-pong buffer
  • Two buffers for the reference frame
  • Two buffers for the current frame
  • While processing one buffer, bring data into the other buffer.
  • When processing is done, continue with the second buffer and bring new data into the first buffer
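The ping-pong scheme can be sketched as follows. This is a host-side illustration, not the presenters' implementation: `dma_start` and `dma_wait` are hypothetical stand-ins built on `memcpy`, so nothing actually overlaps here, and `CHUNK_BYTES` and the byte-sum "processing" are placeholders. On the DSP, the transfer would be an asynchronous DMA request that runs while the CPU processes the other buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_BYTES 1024u  /* hypothetical transfer size */

/* Stand-in for starting a DMA transfer. On real hardware this would return
 * immediately and the copy would proceed in the background. */
static void dma_start(uint8_t *dst, const uint8_t *src, size_t n)
{
    memcpy(dst, src, n);
}

/* Stand-in for waiting on DMA completion; a no-op here because the
 * memcpy above has already finished. */
static void dma_wait(void) {}

/* Placeholder processing step: sum the bytes in the buffer. */
static uint32_t process(const uint8_t *buf, size_t n)
{
    uint32_t sum = 0;
    for (size_t i = 0; i < n; ++i)
        sum += buf[i];
    return sum;
}

/* Walk src in CHUNK_BYTES pieces using two alternating buffers. */
uint32_t pingpong_sum(const uint8_t *src, size_t total)
{
    static uint8_t buf[2][CHUNK_BYTES];
    size_t n_chunks = total / CHUNK_BYTES;
    uint32_t result = 0;
    int cur = 0;

    assert(total % CHUNK_BYTES == 0);
    if (n_chunks == 0)
        return 0;

    dma_start(buf[cur], src, CHUNK_BYTES);      /* prime the first buffer */
    for (size_t i = 0; i < n_chunks; ++i) {
        dma_wait();                             /* chunk i has arrived */
        if (i + 1 < n_chunks)                   /* fetch chunk i+1 into the other buffer */
            dma_start(buf[cur ^ 1], src + (i + 1) * CHUNK_BYTES, CHUNK_BYTES);
        result += process(buf[cur], CHUNK_BYTES); /* would overlap with the transfer */
        cur ^= 1;                               /* swap buffer roles */
    }
    return result;
}
```

The cost is twice the buffer memory, which has to come out of the same limited on-chip space, so the benefit depends on how much of the DMA time is otherwise spent waiting.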
![Page 23: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/23.jpg)
Summary
• TI DM642 cache and memory architecture
• Video encoding basics
• L2 cache and memory configuration
• L1P cache optimization
• Experiments
![Page 25: Cache Optimization for Real Time MPEG-4 ENCODER](https://reader036.vdocuments.net/reader036/viewer/2022081505/56815f53550346895dce313d/html5/thumbnails/25.jpg)
Questions?