
ECE 532 Design Project Group Report

Virtual Piano

Chi Wei

Hecheng Wang

April 9, 2012


Table of Contents

1 Overview
  1.1 Goals
  1.2 Background and Motivation
  1.3 System Overview
  1.4 IP and Hardware Descriptions
2 Outcome
  2.1 Review of Initial Concept
  2.2 Our Final Product
  2.3 Improvements for Future Development
3 Project Schedule
  3.1 Weekly Milestone Accomplishments
  3.2 Discussion and Evaluation
4 Description of the Blocks
  4.1 MicroBlaze (software)
    4.1.1 Software Overview
    4.1.2 Data Structures
    4.1.3 Detection Algorithm
    4.1.4 Sound generation
    4.1.5 Sound Playback Control
  4.2 Video_TO_RAM Custom Tracking Logic
    4.2.1 Overview of Operation
    4.2.2 Tracking Logic
    4.2.3 Finite State Machine
  4.3 MPMC Controller
  4.4 PLB Bus
  4.5 OPB Bus
  4.6 OPB-PLB, PLB-OPB Bridge
  4.7 OPB_AC97 Controller
5 Description of Our Design Tree
References
Appendix
  Appendix A – Tone Samples in Waveform


1 Overview

1.1 Goals

Any flat surface can become a virtual piano! The user merely plays as if on a real piano and a video camera tracks the movement of the fingers. When a finger touches any point on the Virtual Piano, the corresponding note is played back through the speakers. The user could lay down a paper cut-out of the piano keys on the surface or remove this visual reference for the truly wondrous experience of playing an invisible piano.

1.2 Background and Motivation

Musical instruments have long been a major source of entertainment in our culture. With all the advancements in DSP and electronics in recent years, generating music electronically has never been easier. Yet many musicians still train with and prefer to play physical instruments. With our project, we hope to advance the creation of music further into the digital realm while still drawing upon the skills learned on traditional instruments. We picked the piano as the instrument to digitize.

The piano is widely played by music students and musicians, and it has already seen one level of digitization: the keyboard. However, while the keyboard allows us to fiddle with the digital sound output, it still relies on physical keys for input. Our project takes the piano to the next level by removing the physical presence of the keys and allowing the piano to be played by recognizing the gestures of the player's hands.

1.3 System Overview

Below is our system-level block diagram.

Figure 1 System Level Block Diagram


1.4 IP and Hardware Descriptions

The following table provides a summary of all IP cores and hardware devices used for this project.

Table 1 Summary of IP cores and devices

IP/Hardware | Function | Author
video_to_ram (Tracking Logic) | Modified version of video_to_ram from a previous year's laser pointer game project; detects dots of green color and keeps track of the positions of 4 different dots | Group
MicroBlaze | Soft processor core for setting up video, hit detection, and generation of sound | Xilinx; program implemented by group
dlmb/dlmb_ctrl | Data memory controller interfaced through the Local Memory Bus (LMB) | Xilinx
ilmb/ilmb_ctrl | Instruction memory controller interfaced through the LMB | Xilinx
PLB Bus | 2 PLB buses are used: (1) by the MicroBlaze to talk to the AC97 controller (through the OPB) and memory; (2) between video_to_ram and memory | Xilinx
OPB Bus | Used to communicate with the AC97 controller | Xilinx
PLB2OPB Bridge / OPB2PLB Bridge | Establish communication channels between the PLB and OPB buses | Xilinx
Debug module (mdm) | Enables XMD | Xilinx
DDR_SDRAM (mpmc) | Memory to store the color limit value and the coordinates of the 4 green dots | Xilinx
opb_ac97 | Audio codec controller | Mike Wirthlin (ISU)
LM4550 | AC97 codec | National Semiconductor
Camera | Captures the video and finger movement | Sony
Video Daughter Card | Interface with the camera | Analog Devices
UART | Transmit debug info using RS232 | Xilinx
clock_generator | Generate timing signals | Xilinx
led_debug_mux | Display the debugging value controlled by DIP switches | Xilinx
Speakers | Play audio


2 Outcome

2.1 Review of Initial Concept

All of the primary design goals have been met. These include:

1. Ability to play the virtual piano on any flat, light-colored surface.
2. Simultaneous recognition of up to four fingers.*
3. Ability to play one tone at a time in the range of a single octave.
4. White piano keys only.
5. Response time of less than 0.25 seconds. In fact, the response time appears to be less than 0.1 seconds.

Some of the optional goals were also met, including:

1. Allow the piano to be played on any colored surface, light or dark. In fact, the background color does not matter as long as it is not green. Furthermore, lighting has minimal impact on the accuracy of the detection.
2. Increased the range to one note plus a full octave, for 8 notes in total: C4, D4, E4, F4, G4, A4, B4, C5.
3. Increased the number of tones that can be produced simultaneously to three.

* The original goal of 5-finger tracking was revised to 4 fingers after consulting with the TA, since the difference has only a minimal impact on the overall performance of the system.

2.2 Our Final Product

The finished product, as of the writing of this report, is capable of detecting and tracking the coordinates of 4 green dots on fingertips simultaneously. We did not extend tracking to more than 4 fingers because chord generation limits the number of tones in a chord to 3; it would therefore be pointless to track 10 fingers as originally proposed, since it would make no difference to the system's sound-generation performance.

For the physical setup of the system, users are required to put green stickers on their fingertips to enable finger tracking. A layout of keys is drawn on a whiteboard as a reference to guide the user's playing. The system successfully generates the corresponding tones when a downward finger movement is detected in a specific region of the screen. Multiple fingers generate a chord, and the response time is satisfactory with little lag. Overall, the project showed positive results in meeting our initial goals; most of the features important to playing the virtual piano are implemented and fully tested.


Our physical system setup is shown below. Green stickers need to be put on the fingertips to play the virtual piano.

2.3 Improvements for Future Development

1. The 3-tone chord is not quite as smooth as the 2-tone chord. We suspect this is due to the slow processing speed of floating-point operations, which results in gaps in the AC97 FIFO buffer. As a future improvement, an additional MicroBlaze could be added and dedicated to sound generation. Alternatively, the sound generation could be done in hardware to speed up the process and eliminate the undesired characteristics of the current system.

2. The playback duration is currently fixed. To make the piano-playing experience more realistic, we could make the playback length vary depending on the actual position and movement of the fingertips. Our detection module can already detect both upward and downward finger movement, so this expansion is straightforward and only requires modifications to the sound generation and playback control.

3. Black keys, i.e. half-tone regions and sounds, could be added to increase the number of playable keys. Additional tones in general would also make the whole system more appealing.

3 Project Schedule

3.1 Weekly Milestone Accomplishments

This is a summary comparing the proposed milestones with our actual weekly progress.

Table 2 Summary of proposed milestones and actual weekly progress

Dates | Proposed Milestones | Actual Progress
Feb 8 | Create the base project and experiment with the VGA decoder and AC'97 audio decoder; add the MPMC to the system. | Constructed a base project with the MPMC and DDR memory as well as the opb_ac97 core to control the AC97 codec.
Feb 15 | Complete VGA_to_RAM; test that video is captured and stored in RAM correctly. | Successfully implemented the video_to_ram module to capture the video data and store it in SDRAM. Tested sound generation as well as storing audio PCM data in memory for playback.
Feb 22 | (Reading week) | (Reading week)
Feb 29 | Build and test the basic sound generator module and active region detection module. Demonstrate software detection algorithm functionality. | Implemented the audio playback controller to play pre-stored audio tone samples from memory. Implemented the hardware detection block to track a single red dot, without simulation.
Mar 7 | Complete the active region detection module and sound generator module. Complete the software detection algorithm. | Implemented the software detection algorithm, then simulated and debugged it in C. Simulation of single red dot tracking works; began building multi-point tracking.
Mar 14 | First integration attempt. Debug the entire system. | Integrated the software and hardware systems. Added an interrupt signal from the hardware and scaled down to 2-point tracking for debugging purposes.
Mar 21 | Major bugs should be fixed. | Major bugs fixed; as a result, the basic functions of tracking two red dots and playing the corresponding tones are working.
Mar 28 | Optimize the software detection algorithm and hardware modules. Improve the sound. | Changed the color detection to green for better performance and increased the number of tracked dots to 3. Expanded the playable tones to 7.
Apr 4 | The system should work properly. Fix the remaining minor bugs. | Added additional tones for playing. Fixed the 4-dot tracking issue. Physical setup of the system was finalized.
Apr 11 | Final Demo | Final Demonstration

3.2 Discussion and Evaluation

Throughout the development of this project, we were able to follow the pace of the proposed milestones fairly well. Although we intended to make the project progress at an even pace, it turned out that the original plan had a heavier load on the back end. The reason was that as the complexity of the system increased, the time and effort required to debug it also increased drastically. As expected, the majority of our time was consumed trying to verify functionality and to debug the various issues resulting from system integration. Even though we modified the milestones each week, in most cases they were merely changed to include specific details and to better reflect the actual progress of our work. During one week, however, we did fall behind our schedule due to a bug in the video detection block. To remedy this, we worked extra hard the following week and were able to catch up without further delays in our timeline. Overall, the original milestones were well set, but they could be improved by compressing some of the work at the front end to leave more time for debugging.

4 Description of the Blocks

4.1 MicroBlaze (software)

4.1.1 Software Overview

The software component contains three modules: detection, sound generation, and sound playback control. Below is the program flow chart.


Figure 2 Software Core Flow Chart

The interrupt from the tracking logic is triggered when new coordinates are available. The interrupt routine is kept to a minimal size so that the main loop has enough time to feed sufficient samples into the audio FIFO. The flow chart breaks down as follows:

Program initialization: set up the video module, set up the audio module, initialize the program's global data structures and parameters, and enable the MicroBlaze interrupt.

Program main loop: check whether the audio INFIFO is full; if it is full, skip this iteration. Otherwise, generate sound samples and feed the data into the audio FIFO. If none of the playback structures is occupied, clear the FIFO.

Interrupt routine (entered on the interrupt signal from the tracking logic, after which the main loop resumes): read the updated coordinates from memory and validate them, update the four fingers' states stored in the dot structures, and update a playback structure if a new event is triggered (a key is pressed).

4.1.2 Data Structures

Two important data structures are created: the dot structure and the playback structure. The dot_structure contains the position and state information of each finger; it is only updated in the interrupt routine. The playback_structure contains all the information needed for sound generation. We support up to three playback sources simultaneously, so up to three tones can be played together to form a chord. A playback_structure can only be marked as busy in the interrupt routine, when a key has been pressed, and it can only be cleared in the main loop once its tone has finished playing.


In addition, some enum types are created to better organize the program. The dot_state type contains the supported tracking states, and the region type stores all supported key regions; REGION_NON is the area outside of the key regions.
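The report does not reproduce the C definitions themselves, so the following is only a minimal sketch of what these types might look like, based on the descriptions above. All field names, widths, and the exact set of enumerators (beyond REGION_NON and the states named in Section 4.1.3) are assumptions.

```c
#include <stdint.h>

/* Hypothetical sketch only -- field names, widths, and the exact set of
 * states/regions are assumptions based on the descriptions in this report. */

typedef enum {            /* supported tracking states (see the FSM in 4.1.3) */
    STATE_IDLE,
    STATE_MOVING_DOWN,
    STATE_DONE_DOWNWARD,  /* a key has been pressed */
    STATE_MOVING_UP
} dot_state;

typedef enum {            /* supported key regions */
    REGION_C4, REGION_D4, REGION_E4, REGION_F4,
    REGION_G4, REGION_A4, REGION_B4, REGION_C5,
    REGION_NON            /* area outside of the key regions */
} region;

typedef struct {          /* one entry per tracked finger; updated only in the interrupt routine */
    uint16_t  x, y;          /* last validated coordinates from the tracking logic */
    dot_state state;         /* current FSM state of this finger */
    region    key;           /* key region the finger is currently over */
    uint8_t   frames_in_dir; /* frames spent moving in the same direction (threshold counter) */
} dot_structure;

typedef struct {          /* one of the three playback sources */
    int       busy;          /* set in the interrupt routine on a key press, cleared in the main loop */
    region    tone;          /* which tone's sample table to play */
    uint32_t  sample_pos;    /* position within the pre-generated one-period table */
    uint32_t  samples_left;  /* remaining samples in the fixed-length playback */
} playback_structure;
```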

4.1.3 Detection Algorithm

The detection algorithm is implemented as a finite state machine (FSM). The following is the FSM state diagram.

Figure 3 Detection FSM

The FSM is only updated in the interrupt routine. After processing each frame, the tracking logic writes the new coordinates to memory and sends an interrupt signal to notify the software core. Done_Downward is the most important state in this project, as it indicates that a key has been pressed. The threshold is the number of frames the finger must keep moving in the same direction before it enters the next state; it adjusts the sensitivity of the detection. After a key is pressed, if the finger stays in the key regions, it is blocked in the IDLE state until it moves out of the key regions. This covers the case where the user leaves the finger in the key regions after pressing a key: the user has to move the finger out of the key regions to reset the state, and then move downward again to trigger another event. Otherwise, it is considered the zombie state of the previous event and will not trigger any new event. As described in the diagram, the detection algorithm does not only detect key-press events; it also detects the movement of the fingers. This implementation makes it possible to further develop the project if additional features need to be added. For example, a piano note usually lasts for a certain length and starts to fade out when the key is released; this key-release state is already captured in the FSM.
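To make the interrupt-time state update concrete, here is a minimal C sketch of how one finger's state might be advanced, reusing the hypothetical types sketched in Section 4.1.2. The report only names the IDLE and Done_Downward states and the frame-count threshold; the other names, the threshold value, and the helper functions region_of and trigger_key_press are assumptions.

```c
#include <stdint.h>

/* Uses the dot_structure, dot_state, and region types sketched in Section 4.1.2. */

#define DOWN_THRESHOLD 3   /* hypothetical: frames of downward motion required */

extern region region_of(uint16_t x, uint16_t y);   /* maps a coordinate to a key region */
extern void   trigger_key_press(region key);       /* marks a free playback_structure busy */

static void update_dot(dot_structure *d, uint16_t new_x, uint16_t new_y)
{
    int moving_down = (new_y > d->y);   /* image Y grows downward (assumption) */
    region r = region_of(new_x, new_y);

    switch (d->state) {
    case STATE_IDLE:
        if (r != REGION_NON && moving_down && ++d->frames_in_dir >= DOWN_THRESHOLD) {
            d->state = STATE_DONE_DOWNWARD;   /* key pressed */
            trigger_key_press(r);
        } else if (!moving_down) {
            d->frames_in_dir = 0;             /* direction changed: restart the count */
        }
        break;
    case STATE_DONE_DOWNWARD:
        /* Blocked ("zombie") until the finger leaves the key regions. */
        if (r == REGION_NON) {
            d->state = STATE_IDLE;
            d->frames_in_dir = 0;
        }
        break;
    default:
        break;
    }

    d->x = new_x;
    d->y = new_y;
    d->key = r;
}
```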

4.1.4 Sound generation

Considering the speed of the MicroBlaze, basic sound samples (one period for each tone) are pre-generated on a PC with a Python script. The AC'97 codec is set to a 44.1 kHz sampling rate, so every tone has a fixed number of samples in one period. To obtain the best sound quality, we also added harmonic frequencies and an envelope. In total we implemented 8 tones, from C4 to C5. They are stored in Motorola (big-endian) 16-bit format so that they are consistent with the system setup. Below are the frequency and waveform for C4 (see Appendix A for the remaining tones).

Figure 4 Piano Tone C4 Samples – 261Hz, 167 Samples
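The tables themselves are produced offline by the authors' Python script, which is not included in the report. Purely as an illustration of the kind of computation involved (not the authors' actual script), the C sketch below builds a one-period, signed 16-bit table for a tone with two added harmonics. The harmonic amplitudes and the function name are assumptions, and the envelope shown in Figure 5 is not applied here.

```c
#include <math.h>
#include <stdint.h>
#include <stdlib.h>

#ifndef M_PI
#define M_PI 3.14159265358979323846
#endif

#define SAMPLE_RATE 44100.0   /* matches the AC'97 sampling rate used in the report */

/* Illustrative only: build one period of a tone at freq_hz into a signed
 * 16-bit table.  The fundamental plus two harmonics are summed; the
 * 0.60/0.25/0.15 amplitude split is an arbitrary assumption.  The caller
 * frees the returned buffer. */
static int16_t *build_tone_period(double freq_hz, int *num_samples)
{
    int n = (int)(SAMPLE_RATE / freq_hz + 0.5);   /* samples per period */
    int16_t *table = malloc((size_t)n * sizeof *table);
    if (!table)
        return NULL;

    for (int i = 0; i < n; i++) {
        double t = (double)i / SAMPLE_RATE;
        double s = 0.60 * sin(2.0 * M_PI * freq_hz * t)
                 + 0.25 * sin(2.0 * M_PI * 2.0 * freq_hz * t)
                 + 0.15 * sin(2.0 * M_PI * 3.0 * freq_hz * t);
        table[i] = (int16_t)(s * 32767.0);
    }

    *num_samples = n;
    return table;
}
```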


To emulate the sound of a piano, the envelope we generated has the following shape.

Figure 5 Envelope Coefficients

Our system supports up to three tracks, so up to three tones can be played simultaneously (the three tones may also be the same tone). Because the tones' samples are generated separately, the samples are added up to form the combined sound before being fed into the audio buffer. The details of combining sounds are discussed in the Sound Playback Control section.

4.1.5 Sound Playback Control

The main loop keeps checking whether any of the three playback sources is active. If a playback source has data available, the active sources are combined and the samples are fed into the AC'97 input buffer. Otherwise, the INFIFO is cleared and the loop waits for new tones. Below is the sound combination logic.

Figure 6 Sound Combination Logic

The playback of each tone is set to the same length. When all of a tone's samples have been fed into the buffer, its playback source is marked free and made available to accept new tones. When all three playback sources are occupied, pressing a key has no effect; the event is ignored and lost.
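As a concrete illustration of this control flow, here is a minimal C sketch of one main-loop iteration, reusing the hypothetical playback_structure from Section 4.1.2. The tone tables, the envelope table, the fixed playback length, and the ac97_* FIFO helpers are all placeholders; the actual opb_ac97 driver API and the report's exact combination code are not reproduced here.

```c
#include <stdint.h>

/* playback_structure as sketched in Section 4.1.2 */

#define NUM_SOURCES   3
#define PLAYBACK_LEN  44100u   /* hypothetical fixed note length (1 s of samples) */
#define ENVELOPE_LEN  256u     /* hypothetical envelope table length */

extern playback_structure sources[NUM_SOURCES];
extern const int16_t *tone_table[];             /* one-period tables, indexed by tone */
extern const int      tone_period[];            /* samples per period for each tone */
extern const int16_t  envelope[ENVELOPE_LEN];   /* envelope coefficients in Q15 (assumption) */

extern int  ac97_infifo_full(void);             /* placeholder driver calls */
extern void ac97_infifo_write(int16_t sample);
extern void ac97_infifo_clear(void);

void playback_main_loop_iteration(void)
{
    if (ac97_infifo_full())
        return;                                  /* FIFO full: skip this iteration */

    int any_active = 0;
    int32_t mixed = 0;

    for (int i = 0; i < NUM_SOURCES; i++) {
        playback_structure *p = &sources[i];
        if (!p->busy)
            continue;
        any_active = 1;

        /* Loop over the one-period table and scale by the envelope coefficient. */
        int16_t s   = tone_table[p->tone][p->sample_pos % (uint32_t)tone_period[p->tone]];
        int32_t env = envelope[(p->sample_pos * ENVELOPE_LEN) / PLAYBACK_LEN];
        mixed += (s * env) >> 15;

        p->sample_pos++;
        if (--p->samples_left == 0)
            p->busy = 0;                         /* source becomes free for a new tone */
    }

    if (!any_active) {
        ac97_infifo_clear();                     /* no active tones: clear the FIFO */
        return;
    }

    /* Clamp the sum of up to three tones back into 16-bit range. */
    if (mixed >  32767) mixed =  32767;
    if (mixed < -32768) mixed = -32768;
    ac97_infifo_write((int16_t)mixed);
}
```

In the real system, this loop would run continuously between interrupts, and the interrupt routine would presumably set busy, tone, sample_pos = 0, and samples_left = PLAYBACK_LEN for a free source when a key press is detected.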


4.2 Video_TO_RAM Custom Tracking Logic

4.2.1 Overview of Operation

This custom block is based on the video_to_ram block from the video demo project. It performs two key functions, color detection and position tracking, built on top of the functionality embedded in the original video_to_ram block. Namely, the original block decodes the streaming video data, converts it from YCrCb format into RGB format, writes it into two line buffers, and then writes the line buffers' contents to memory. See the figure below.

Since we do not have a video output feature, we changed the block so that it no longer writes the buffer contents to memory. Instead, the custom tracking logic reads every pixel from the line buffers (alternating between the two, and reading a buffer only when it is not being written to by the RGB decoder). The RGB value of each pixel is then evaluated against the contents of the r_current_color_limits register, and the coordinates of the pixel are saved if the pixel is deemed valid.

4.2.2 Tracking Logic

The following diagram shows the detection logic used for color detection and tracking.

Figure 7 Video Processing Flow Chart


Figure 8 Tracking Logic Diagram

The color margin and position margin are encoded in the r_current_color_limits register, which is read from memory location 0x45000000. It is encoded as follows:

Not Used | Positional Margin | Color Margin
[31:16] | [15:8] | [7:0]

The location registers contain the frame number, pixel count, line count, and valid bit. The frame number, which is incremented for every new frame, is used only for debugging purposes. The pixel count, which is incremented every time a new pixel is read from the buffer for detection, is used as the X coordinate. The line count, which is incremented for every new line of the incoming video stream, is used as the Y coordinate. They are encoded as follows:

Frame Number | Y Coordinate | Valid Bit | X Coordinate
[0:9] | [10:19] | [22] | [21:31]


The location registers are sorted according to their X coordinate values and then written to memory from smallest to largest at addresses 0x45000004, 0x45000008, 0x4500000C, and 0x45000010.
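As an illustration of how the software side might use these memory-mapped words (the report does not give the corresponding driver code), here is a minimal C sketch. The color-limit table above uses ordinary LSB-0 bit numbering, while the location-word table appears to use the PLB convention in which bit 0 is the most significant bit; the masks below follow each table literally and are illustrative assumptions only, as are the macro and function names.

```c
#include <stdint.h>

/* Addresses from the report: the color-limit word at 0x45000000 is followed by
 * the four sorted dot-coordinate words. */
#define TRACK_BASE        0x45000000u
#define COLOR_LIMIT_WORD  (*(volatile uint32_t *)(TRACK_BASE + 0x00u))
#define DOT_WORD(i)       (*(volatile uint32_t *)(TRACK_BASE + 0x04u + 4u * (uint32_t)(i)))

/* r_current_color_limits: [31:16] unused, [15:8] positional margin, [7:0] color margin. */
static void set_color_limits(uint8_t pos_margin, uint8_t color_margin)
{
    COLOR_LIMIT_WORD = ((uint32_t)pos_margin << 8) | color_margin;
}

/* Location word (bit 0 = MSB, per the table above):
 * frame number [0:9], Y coordinate [10:19], X coordinate [21:31]. */
static void read_dot(int i, unsigned *x, unsigned *y)
{
    uint32_t w = DOT_WORD(i);   /* i = 0..3, sorted by X */
    *y = (w >> 12) & 0x3FFu;    /* bits 10..19 -> 10-bit Y */
    *x = w & 0x7FFu;            /* bits 21..31 -> 11-bit X */
}
```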

4.2.3 Finite State Machine

The key FSM states are described below.

S_LINE_HOLDER: A single pixel is read out of the line buffer, and the tracking logic determines whether the location registers should be updated according to the color limit register.

S_READ_REQUEST: The appropriate signals are asserted on the PLB in order to request a 4-byte read from the memory location of the color limits.

S_READ: The color limit register is updated with the contents read from memory.

S_WRITE_REQUEST_X: The appropriate signals are asserted on the PLB in order to request a 4-byte write to the memory location for the dots' coordinates. The content of location register X is also placed on the data bus.

S_FINAL: The FSM stalls here until a new frame is ready, i.e. the line count becomes 0. It also sends an interrupt signal to the MicroBlaze to signal that the memory has been updated.

Figure 9 Tracking Logic FSM Diagram

Essentially, the FSM processes an entire frame while it spins in S_LINE_HOLDER. During this time, the location registers' contents are updated accordingly. Once the whole frame is done, the FSM performs the memory updates: it first reads a 32-bit word from memory to update the color limit register, then writes the 4 sets of coordinates one by one to memory.

4.3 MPMC Controller

The MPMC (multi-port memory controller) is an IP block in the Xilinx IP library. It allows up to 8 buses to be connected to the same memory through different ports. In this project, we used a total of 2 ports on the MPMC:

Port 0: used by the MicroBlaze to write the hardware configuration and color limit, and to read the coordinates of the 4 fingertips.

Port 1: used by the video_to_ram custom tracking IP block to read the color limit and write the coordinates to memory.

4.4 PLB Bus

The Processor Local Bus is obtained from the Xilinx IP catalog under the name plb_v64. The PLB provides a connection between a configurable number of PLB masters and slaves, and it is the key transport for data from one module to another. To eliminate a bottleneck to the memory, we used 2 PLB buses in our system:

1. Used by the MicroBlaze to communicate with the DDR memory and the opb_AC97 controller (through the OPB).

2. Used between the video_to_ram tracking block and memory.

4.5 OPB Bus

The On-Chip Peripheral Bus is obtained as a Xilinx IP and is used for communication between the MicroBlaze and its peripherals. This bus was used because the AC97 controller block was built with an OPB interface.


4.6 OPB-PLB, PLB-OPB Bridge

These bridges enable communication between the PLB and OPB buses, allowing the MicroBlaze, which sits on the PLB, to send commands to the AC97 controller on the OPB bus and to write playback data to the codec.

4.7 OPB_AC97 Controller

The opb_AC97 (Wirthlin) is the audio codec controller. It is an OPB slave that provides a register-based interface for the codec. It was found as part of the labs for an ECE course at Iowa State University and was created by Prof. Mike Wirthlin of Brigham Young University. The controller interacts with the audio codec using 5 ports:

SDATA_OUT: From opb_AC97 to codec. Serial input of data frames for the codec, sampled on the falling edge of BIT_CLK.

BIT_CLK: From codec to opb_AC97. 12.288 MHz clock.

SDATA_IN: From codec to opb_AC97. Serial output of data frames from the codec, sampled on the rising edge of BIT_CLK.

SYNC: From opb_AC97 to codec. Defines the boundaries of data frames.

RESET#: From opb_AC97 to codec. Active-low hardware reset.

5 Description of Our Design Tree

Table 3 Design Tree

Directory/File | Description
./_xps | Xilinx-generated directory; contains options files for bitinit, libgen, simgen, and platgen
./blkdiagram | Block diagram generated by XPS
./data/system.ucf | System constraint file; contains external pin assignments
./drivers | Includes the drivers for the AC97
./etc | Option files for bitgen and downloading
./ub | The processor
./doc/Final Report.pdf | Group report of our project
./pcores/led_debug_mux_v1_00_a | Contains core files for the LED mux
./pcores/opb_ac97_v2_00_a | Contains core files for the AC97
./pcores/video_to_ram_v1_00_a | Contains core files for the custom tracking module
./sw/main.c | Software core; contains three modules: finger movement detection, sound generation, and sound playback control
./sw/main.h | Header file containing constants and parameters needed by the software core
./README | Design tree documentation


References

[1] Video to RAM demo project, with the video_to_ram core from Jeffrey Goeders, retrieved from Piazza.

[2] Latchezar Dimitrov, Jonathon Riley, and Steven Doo, "Laser Pointer Project," ECE532 Digital System Design, 2011.

[3] IBM Corp. (2007, May). IBM 128-Bit Processor Local Bus Architecture Specifications, Version 4.7. [Online]. Available: https://www-01.ibm.com/chips/techlib/techlib.nsf/techdocs/3BBB27E5BCC165BA87256A2B0064FFB4/$file/PlbBus_as_01_pub.pdf

[4] AC97-LM4550, Texas Instruments. Literature Number: SNAS032E.


Appendix

Appendix A – Tone Samples in Waveform

Piano Tone D4 – 293Hz, 150 samples

Piano Tone E4 – 329Hz, 134 samples


Piano Tone F4 – 349Hz, 126 samples

Piano Tone G4 – 392Hz, 112 samples


Piano Tone A4 – 440Hz, 100 samples

Piano Tone B4 – 494Hz, 89 samples


Piano Tone C5 – 523Hz, 84 samples