Elad Hadar Omer Norkin Supervisor: Mike Sumszyk Winter 2010/11, Single semester project. Date:22/4/12 Technion – Israel Institute of Technology Faculty of Electrical Engineering High Speed Digital System Lab (HS DSL) Exploring new implementation tools for GIDEL PROCSTAR platform (PART II - PROCAPI)

Elad Hadar, Omer Norkin
Supervisor: Mike Sumszyk

Winter 2010/11, single semester project. Date: 22/4/12

Technion – Israel Institute of Technology
Faculty of Electrical Engineering
High Speed Digital System Lab (HS DSL)

Exploring new implementation tools for GIDEL PROCSTAR platform

(PART II - PROCAPI)


Project motivation

Implementing video analysis designs on the GIDEL PROCSTAR III platform, to enable use and exploration of a new development platform (PART I – PROCHILs, PART II – PROCWIZARD, PROCAPI, PROCMegaFIFO).

Proper usage of development tools throughout all stages of implementation from algorithm to hardware.

Preparing a clear user-guide that will enable a fast and simple ramp-up of the tools and the appropriate flow.


First part - PROCHILs

• PROCHILs is a Hardware-In-the-Loop acceleration tool for running Simulink designs on FPGAs.

• It automatically translates Simulink designs into FPGA code (compatible with the PROC board installed on the target PC) and runs that code under Simulink.


PROCHILs main advantages

• Speed - Dramatically improves simulation speed, with a dedicated accelerator for Simulink designs. Direct hardware burn. Direct generation of HDL code that matches the target board. Fast HW simulation using the Simulink/Matlab interface.

• Simplicity - Enables building a design visually and uploading it directly, with minimal effort, into the PROC board.

• Efficiency - Enables concurrent engineering at an early stage. Cuts development cycle time (and costs). Extremely efficient on resource-consuming processing algorithms.

• Reliability - Improves design reliability.


PROCHILs Performance

• We measured the input data processing time of the hardware generated by PROCHILs and of the equivalent software simulation, for data vectors of different lengths.

• The generated hardware significantly accelerates processing, especially for longer data vectors.


PROCHILs Performance

• Ratio between the hardware and software running times for input vectors of different lengths.

• An exponential curve fit shows that the ratio converges to ~128.


PROCHILs main weakness

• Even when considering the highest processing ratio, we get a rate of

$$\frac{15{,}000{,}000\ [\text{pixels/sec}]}{54.83} = 273{,}573\ [\text{pixels/sec}]$$

which is very low.

• For a 512x512 pixel image this means 1.04 [frames/sec], which is insufficient for video streaming.

• PROCHILs is not suited to streaming-data (real-time) designs, which leads to PROCAPI.
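The frame-rate figure follows from simple arithmetic on the numbers quoted on this slide. A minimal sketch in Python; the variable names are illustrative and not taken from the project:

```python
# Pixel rate quoted on the slide: 15,000,000 [pixels/sec] divided by 54.83.
pixel_rate = 15_000_000 / 54.83           # pixels per second

# One 512x512 frame, as used in the slide's frame-rate estimate.
frame_pixels = 512 * 512

# Frames per second that this pixel rate can sustain.
frame_rate = pixel_rate / frame_pixels

print(f"{pixel_rate:,.0f} pixels/sec -> {frame_rate:.2f} frames/sec")
```

At roughly one frame per second the design cannot keep up with a live video stream, which is the weakness the slide points out.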


PROCAPI introduction

A set of functions that provide the means to access PROC boards by supplying methods that enable real-time configuration and querying of the board.

Motivation: Learning and practice of effective debug methodology using PROCAPI while streaming video through an image processing design.

PROCAPI allows the user to control data transfer between the PC and the PROC board (using controllable DMA channels). In PROCHILs this transfer is hidden from the user.


Main goals and phases of work

1) Learning PROCAPI and PROCMegaFIFO.

2) Defining and building an integrated DSP Builder design combining PROCAPI video streaming functions, data channels and PROCMegaFIFO memories.


Hardware and Development environment

• GiDEL PROCAPI (Version 8.8)

• Altera's DSP Builder blockset for Simulink (Version 10.1)

• ProcWizard (Version 8.8)

• Quartus II (Version 10.1)

• Matlab (Version 2009a)

• OpenCV (Version 2.1)

• GiDEL PROCStar III (Altera Stratix III) board (4-FPGA)


System block diagram


Project flow


Prewitt edge detector implementation

[Diagram blocks: Controller, Prewitt edge detector]



Prewitt edge detector

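$$A = \begin{bmatrix} 222 & 255 & 201 & 180 \\ 155 & 111 & 143 & 96 \\ 87 & 87 & 55 & 27 \\ 34 & 67 & 0 & 3 \end{bmatrix} \qquad \text{Result} = \begin{bmatrix} 0 & 0 & 255 & 0 \\ 255 & 255 & 255 & 0 \\ 0 & 0 & 255 & 0 \\ 0 & 0 & 255 & 0 \end{bmatrix}$$

$$G_X = \begin{bmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{bmatrix} * A \qquad G_Y = \begin{bmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{bmatrix} * A$$

$$G = |G_X| + |G_Y|$$

$$\text{Result} = \begin{cases} 0 & \text{if } G < \text{threshold} \\ 255 & \text{else} \end{cases}$$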
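A software reference model can make the thresholded Prewitt operation concrete. The sketch below is a plain-Python illustration, not the DSP Builder design: it assumes the gradient magnitude is the sum of absolute values, G = |G_X| + |G_Y|, leaves border pixels at 0, and uses a made-up step-edge test image. The function and variable names are hypothetical.

```python
# Prewitt kernels for the horizontal and vertical gradients.
KX = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]
KY = [[-1, -1, -1], [0, 0, 0], [1, 1, 1]]


def prewitt_threshold(image, threshold):
    """Return a binary edge map: 255 where G >= threshold, else 0.

    The kernels are applied without flipping (correlation). Because G uses
    absolute values, flipping would only change the sign of G_X and G_Y and
    give the same result. Border pixels are left at 0 (an assumption).
    """
    rows, cols = len(image), len(image[0])
    result = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = gy = 0
            for i in range(3):
                for j in range(3):
                    pixel = image[r + i - 1][c + j - 1]
                    gx += KX[i][j] * pixel
                    gy += KY[i][j] * pixel
            g = abs(gx) + abs(gy)
            result[r][c] = 0 if g < threshold else 255
    return result


# Hypothetical test image: a vertical step edge between columns 1 and 2.
step = [[0, 0, 255, 255] for _ in range(4)]
edges = prewitt_threshold(step, threshold=100)
for row in edges:
    print(row)
```

The two interior columns next to the step are marked as edges, while the untouched border rows stay at 0.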


Pixel neighborhood storing


Controller



Preventing pipe contamination

• enableBit and clken are both connected to read_acknowledge from FIFO IN.

• The data pipeline of the Prewitt edge detector is 512+4 stages long.

• When FIFO IN is empty:

1. All data propagation stops, so that no garbage enters the pipe and affects the correctness of the algorithm.

2. Writing resumes only when a new valid pixel arrives at the end of the data pipeline.
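The stall behaviour can be illustrated with a small software model. This is a sketch under stated assumptions, not the hardware: the class `GatedPipeline`, its `clock` method and the depth of 3 are hypothetical, and `read_ack` stands in for the read_acknowledge signal that drives the clock enable.

```python
from collections import deque


class GatedPipeline:
    """Shift register that advances only when read_ack is asserted."""

    def __init__(self, depth):
        # None marks a stage that does not yet hold a valid pixel.
        self.stages = deque([None] * depth, maxlen=depth)

    def clock(self, read_ack, pixel=None):
        """Simulate one clock cycle and return the pixel to write, if any."""
        if not read_ack:
            # FIFO IN is empty: every stage holds its value, nothing is written.
            return None
        out = self.stages[-1]           # value leaving the last stage
        self.stages.appendleft(pixel)   # shift in the new pixel, drop the oldest
        return out                      # None while the pipe is still filling


pipe = GatedPipeline(depth=3)
stream = [(True, 1), (False, None), (True, 2), (True, 3),
          (False, None), (True, 4), (True, 5)]
written = []
for read_ack, pixel in stream:
    out = pipe.clock(read_ack, pixel)
    if out is not None:
        written.append(out)
print(written)
```

Cycles in which FIFO IN is empty neither shift the pipe nor produce a write, so only valid pixels reach the output, in their original order.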


Interrupt control

• Two inputs:

1. WE - Is FIFO IN empty?

2. AD - Is the arriving pixel the last pixel of a frame?

• One output:

1. dmaInterrupt


Controller implementation

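• Data may be propagating or stopped; the pixel is not the last pixel of a frame.

• One-cycle state: the interrupt is sent.

• The frame is finished but FIFO IN is empty; AD stays on until a new pixel resets the counter.

The state bits $F_1$, $F_2$ and the interrupt output $DMAI$ depend on the inputs as follows, where $f_1$, $f_2$ and $g$ are Boolean functions:

$$F_1(n+1) = f_1\big(F_1(n),\, F_2(n),\, WE(n)\big)$$

$$F_2(n+1) = f_2\big(F_1(n),\, F_2(n),\, WE(n)\big)$$

$$DMAI(n) = g\big(F_1(n),\, F_2(n),\, AD(n)\big)$$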


Interrupt controller



Signal compiler

VHDL library


Quartus

VHDL library

.rbf


Source & Header files

PROCWizard

PROCAPI

Read Frame

Display Frame


C code structure

Main function (pre processing)

• Opening the camera

• Setting DMA channels

• Defining new buffers

Loop:

• Capturing a new frame

• Writing the frame from the input buffer to the input FIFO

• Interrupt

• Showing the original frame

Second thread function (post processing)

• Writing the frame from the output FIFO to the output buffer

• Showing the processed frame

Legend: Blue - OpenCV function, Purple - API function, Red - interrupt
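The split between the main loop and the second thread can be sketched in software. The example below is a hedged stand-in written in Python with only the standard library: it does not call PROCAPI or OpenCV, the queues merely imitate FIFO IN and FIFO OUT, the `hardware` thread imitates the FPGA design, and the blocking `get` plays the role of the interrupt that wakes the post-processing thread. All names are hypothetical.

```python
import queue
import threading


def run_pipeline(frames, process):
    """Push frames through a stand-in for FIFO IN -> design -> FIFO OUT."""
    fifo_in = queue.Queue()    # imitates FIFO IN (PC to board)
    fifo_out = queue.Queue()   # imitates FIFO OUT (board to PC)
    shown = []                 # frames "displayed" by the second thread

    def hardware():
        # Imitates the FPGA design: consume from FIFO IN, produce to FIFO OUT.
        while True:
            frame = fifo_in.get()
            if frame is None:          # sentinel: end of stream
                fifo_out.put(None)
                return
            fifo_out.put(process(frame))

    def post_processing():
        # Second thread: copy processed frames out and "show" them.
        while True:
            frame = fifo_out.get()     # blocks until a frame is ready
            if frame is None:
                return
            shown.append(frame)

    workers = [threading.Thread(target=hardware),
               threading.Thread(target=post_processing)]
    for worker in workers:
        worker.start()

    # Main loop (pre processing): capture a frame and write it to the input FIFO.
    for frame in frames:
        fifo_in.put(frame)
    fifo_in.put(None)

    for worker in workers:
        worker.join()
    return shown


result = run_pipeline(range(5), lambda frame: frame * 2)
print(result)
```

Keeping capture and display of processed frames in separate threads lets the main loop keep feeding the input side while results are drained independently.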


Frame Rate

• We used TIC & TOC macros built on OpenCV functions to assess the frame rate of the video output stream.

• We measured the time elapsed while presenting 30 frames and derived the frame rate from it.

• Full Prewitt edge detector design: 12 [fps]

• Empty design (image capture and presentation only), hardware: 25 [fps]

• Empty design (image capture and presentation only), Simulink: 23 [fps]
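The measurement idea, timing a fixed number of presented frames and dividing, can be sketched as follows. This is a Python illustration rather than the project's TIC & TOC macros; `measure_fps`, `present_frame` and the injectable `clock` parameter are hypothetical names, and the fake clock in the usage lines exists only to make the example deterministic.

```python
import time


def measure_fps(present_frame, frames=30, clock=time.perf_counter):
    """Time `frames` calls of present_frame and return frames per second."""
    start = clock()                 # "TIC"
    for _ in range(frames):
        present_frame()
    elapsed = clock() - start       # "TOC"
    return frames / elapsed


# Deterministic usage with a fake clock: 30 frames presented in 2.5 seconds.
ticks = iter([0.0, 2.5])
fps = measure_fps(lambda: None, frames=30, clock=lambda: next(ticks))
print(fps)
```

Averaging over 30 frames smooths out per-frame jitter, at the cost of reporting only a mean rate.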


Frame loss detection

[Diagram: the input is convolved with the horizontal 3x3 kernel and with the vertical 3x3 kernel; each convolution is followed by an "amp" block]


Frame loss detection

• By creating another version of the Prewitt edge detector, we were able to divide the doubled number back to its original value.


Time delay

• We also noticed a time delay between the original input stream and the filtered one. We had two hypotheses for its cause:

1. The delay is caused by code complexity: multiple loops and memory copying.

2. The delay is caused by the FIFO and its size, due to an accumulation of frames in the FIFO waiting to be extracted.

• Although we made many optimizations in the C++ code, the delay was not reduced at all.

• When we reduced the FIFO size from 8MB to 1MB, the time delay decreased dramatically, from about 3 seconds to less than half a second.


Time delay

[Diagram: Stratix III FPGA containing FIFO IN, the DSP Builder based edge detection design, and FIFO OUT]

• We assumed that FIFO OUT is always nearly empty, because extraction of the processed images is not limited by the hardware's rate (12 [fps]) and is performed at the high rate that the DMA can sustain.

• The delay is mainly due to the time it takes the images to pass through the 1MB FIFO IN at a rate of 12 [fps]. This amounts to a delay of 4 images, i.e. 4/12 ≈ 0.333 seconds, as we observed.
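The delay estimate can be reproduced with a few lines of arithmetic. The sketch below assumes a 512x512 frame at 1 byte per pixel and 1MB = 2^20 bytes; these values are inferred from the 4-image figure and are not stated on this slide. The function name is hypothetical.

```python
def fifo_delay_seconds(fifo_bytes, frame_bytes, frames_per_second):
    """Return (frames held in the FIFO, delay in seconds to drain them)."""
    frames_buffered = fifo_bytes // frame_bytes
    return frames_buffered, frames_buffered / frames_per_second


FRAME_BYTES = 512 * 512   # assumption: 512x512 frame, 1 byte per pixel
MB = 2 ** 20              # assumption: 1MB = 2^20 bytes

frames_1mb, delay_1mb = fifo_delay_seconds(1 * MB, FRAME_BYTES, 12)
frames_8mb, delay_8mb = fifo_delay_seconds(8 * MB, FRAME_BYTES, 12)

print(frames_1mb, round(delay_1mb, 3))
print(frames_8mb, round(delay_8mb, 3))
```

Under these assumptions the 1MB FIFO holds 4 frames, and the 8MB FIFO holds 32 frames for a delay of about 2.7 seconds, in line with the roughly 3 seconds observed before the FIFO was shrunk.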


Results & Conclusions

• Frame rate is satisfactory: 12 [frames/sec]

• No frames are lost!

• Time delay is very low: less than half a second


Useful outputs

• An implemented template for all video streaming processing algorithms. Only minimal effort is needed to integrate a new algorithm and run it!

• A full user guide that enables a fast and simple ramp-up of the tools, sums up all conclusions, and includes the needed background knowledge.


• The Lab team

• Special thanks to Mike Sumszyk, who guided us with devotion.