EEC-492/592 Kinect Application Development, Lecture 11, Wenbing Zhao, wenbing@ieee.org
TRANSCRIPT
![Page 1: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/1.jpg)
EEC-492/592 Kinect Application Development
Lecture 11
Wenbing Zhao
wenbing@ieee.org
![Page 2: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/2.jpg)
Outline
- Using Kinect’s microphone array
  - Verifying the Kinect audio configuration
  - The Kinect SDK architecture for audio
  - How Kinect processes audio signals
  - Inside Kinect's microphone array
  - Capturing and playing audio
  - Processing audio data by suppressing and canceling noise
  - Understanding sound source localization and beamforming
![Page 3: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/3.jpg)
Verifying the Kinect audio configuration
- Navigate to Control Panel | Device Manager, look for the Kinect for Windows node, and find Kinect for Windows Audio Array Control
- Also check the Sound, video and game controllers node
  - Sound driver for the microphone array
![Page 4: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/4.jpg)
Troubleshooting: Kinect USB Audio not recognized
- Make sure the Kinect device is unplugged while installing the SDK
- Restart your system once the installation is done, before using the Kinect
- Assigning a dedicated USB controller to the Kinect sensor is always recommended
![Page 5: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/5.jpg)
Using the Kinect microphone array with your computer
- Navigate to Control Panel | Sound and switch to the Recording tab, find Kinect's Microphone Array as a recording device, and set it as the default
- You can test it using Windows Sound Recorder
![Page 6: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/6.jpg)
The Kinect SDK architecture for Audio
![Page 7: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/7.jpg)
DirectX Media Object Controls
- Noise Suppression (NS)
- Acoustic Echo Cancellation (AEC)
- Automatic Gain Control (AGC)
- The Kinect SDK exposes APIs to control these features
![Page 8: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/8.jpg)
Major focus area of Kinect audio
- Human speech recognition
- Recognize the player’s voice despite loud noise and echoes
- Identify speech within a dynamic area
  - While playing, the player could change position, or
  - Multiple players could be speaking from different directions
![Page 9: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/9.jpg)
Why Microphone Array?
- The logic behind placing microphones in different places is to identify the following:
  - The direction of the incoming sound
  - The distance of the sound source => origin of the sound
- Because the microphones are placed in different positions, the sound arrives at each microphone at a slightly different time
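The arrival-time idea above can be sketched with textbook two-microphone, far-field geometry. This is not the Kinect's actual algorithm, which the slides do not describe; the class name, the method name, and the 0.1 m spacing used in the usage note are hypothetical.

```csharp
using System;

static class ArrivalTimeSketch
{
    // Approximate speed of sound in air at room temperature, in m/s.
    const double SpeedOfSound = 343.0;

    // Far-field estimate of the arrival angle (degrees) from the delay between
    // two microphones spaced micSpacingMeters apart. 0 means straight ahead.
    public static double AngleFromDelay(double delaySeconds, double micSpacingMeters)
    {
        // Extra path length travelled to the farther microphone, as a fraction of the spacing.
        double ratio = SpeedOfSound * delaySeconds / micSpacingMeters;

        // Clamp so measurement noise cannot push the argument outside the domain of asin.
        ratio = Math.Max(-1.0, Math.Min(1.0, ratio));

        return Math.Asin(ratio) * 180.0 / Math.PI;
    }
}
```

For example, `ArrivalTimeSketch.AngleFromDelay(0.0, 0.1)` returns 0 because a sound from straight ahead reaches both microphones at the same time; larger delays map to larger angles.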
![Page 10: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/10.jpg)
Audio signal processing in Kinect
- Once the source and position of the sound are calculated, the audio-processing pipeline merges the signals of all microphones and produces a high-quality signal
- Kinect can identify sound within a range of +50 to -50 degrees (degrees, NOT radians as stated in the textbook)
- KinectAudioSource class
  - MaxSoundSourceAngle
  - MinSoundSourceAngle
![Page 11: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/11.jpg)
Audio signal processing in Kinect
- The SDK fires the SoundSourceAngleChanged event if there is any change in the source angle
- The SoundSourceAngleChangedEventArgs class contains two properties:
  - The current source angle and the confidence that the sound source angle is correct
- The sound source angle identifies the direction (not the location) of a sound source
  - The range of the angle is [-50, +50] degrees
- Confidence level
  - Range: 0.0 (no confidence) to 1.0 (full confidence)
  - The KinectAudioSource class has the SoundSourceAngleConfidence property
![Page 12: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/12.jpg)
Beamforming
- Like the cone of light from a lighthouse where the light is the brightest, the audio capture hardware has an imaginary cone in which it captures audio signals best
- Audio waves that propagate along the length of the cone can be separated from audio waves that travel across the cone
- If you point the cone in the direction of the audio that your application is most interested in capturing, you can improve the ability to capture and separate that audio source from other competing audio sources
- Use the beam angle to set the direction of the imaginary cone to improve your ability to capture a specific audio source
![Page 13: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/13.jpg)
Beam Angle
- The beam angle identifies a preferred direction for the sensor to listen
- The angle is one of the following values (in degrees): {-50, -40, -30, -20, -10, 0, +10, +20, +30, +40, +50}
- The sign determines direction:
  - A negative number indicates the audio source is on the right side of the sensor (left side of the user)
  - A positive value indicates the audio source is on the left side of the sensor (right side of the user)
  - 0 indicates the audio source is centered in front of the sensor
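As a small illustration of the value set and sign convention above, the sketch below snaps an arbitrary angle to the nearest listed beam angle and names the side it points to. The class and method names are hypothetical helpers and are not part of the Kinect SDK.

```csharp
using System;

static class BeamAngleSketch
{
    // Snap an angle in degrees to the nearest value in {-50, -40, ..., +40, +50}.
    public static int SnapToBeam(double angleDegrees)
    {
        // Keep the angle inside the range the sensor supports.
        double clamped = Math.Max(-50.0, Math.Min(50.0, angleDegrees));

        // Round to the nearest multiple of 10; exact midpoints round away from zero.
        return (int)Math.Round(clamped / 10.0, MidpointRounding.AwayFromZero) * 10;
    }

    // Describe the side of the sensor a beam angle points to, following the slide's convention.
    public static string Side(int beamAngle)
    {
        if (beamAngle < 0) return "right of sensor (user's left)";
        if (beamAngle > 0) return "left of sensor (user's right)";
        return "center";
    }
}
```

For example, `BeamAngleSketch.SnapToBeam(23.4)` yields 20 and `BeamAngleSketch.Side(20)` reports the left side of the sensor.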
![Page 14: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/14.jpg)
Sound Source vs. Beam Angle
- The sound source angle and the beam angle are both defined from the sensor location. Both angles are defined in the x-z plane of the sensor perpendicular to the z-axis of the sensor. However, the two angles have completely different functionality
- Both angles (beam and sound source) are updated continuously once the sensor has started streaming audio data (when the Start method is called)
- Use the sound source angle to choose the beam angle if you want to capture a particular sound source
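One way to use the sound source angle to choose a beam angle is to follow the source only when the reported confidence is high enough. The sketch below shows that gating logic on plain numbers; the class name and the threshold passed to the constructor are hypothetical, and nothing here calls the Kinect SDK.

```csharp
using System;

sealed class BeamSteeringSketch
{
    private readonly double minConfidence;

    // The most recently chosen beam angle in degrees; starts centered at 0.
    public double BeamAngle { get; private set; }

    public BeamSteeringSketch(double minConfidence)
    {
        this.minConfidence = minConfidence;
    }

    // Feed in one sound source reading. Returns true when the beam was moved.
    public bool OnSourceAngle(double sourceAngle, double confidence)
    {
        // Ignore low-confidence readings so the beam does not chase noise.
        if (confidence < minConfidence)
        {
            return false;
        }

        // Keep the beam inside the supported [-50, +50] degree range.
        BeamAngle = Math.Max(-50.0, Math.Min(50.0, sourceAngle));
        return true;
    }
}
```

In the app, the resulting `BeamAngle` would be assigned to ManualBeamAngle after switching BeamAngleMode to Manual, with the readings coming from the SoundSourceAngleChanged event.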
![Page 15: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/15.jpg)
Programming Audio Source
- Start/stop audio source
  - The Kinect sensor must be in the running state in order to start the audio stream
  - If you try to start the audio stream and the sensor is not running, the application will throw an InvalidOperationException
- Register sound source angle and beam angle events
- Specify audio processing parameters
- Manually control beam angle
- Recording audio

    // Start and stop the audio source
    this.sensor.AudioSource.Start();
    this.sensor.AudioSource.Stop();

    // Register the sound source angle and beam angle event handlers
    this.sensor.AudioSource.SoundSourceAngleChanged += soundSourceAngleChanged;
    this.sensor.AudioSource.BeamAngleChanged += beamAngleChanged;
![Page 16: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/16.jpg)
Setting Audio Processing Parameters
- Audio data can be processed by interacting with the DMO via the KinectAudioSource class
- Echo cancellation
  - CancellationOnly
  - CancellationAndSuppression
  - None (default setting)
- Noise suppression (enabled by default)
- Automatic gain control: amplifies weak audio, disabled by default

    this.sensor.AudioSource.EchoCancellationMode = EchoCancellationMode.CancellationOnly;
    this.sensor.AudioSource.EchoCancellationSpeakerIndex = 0;
    this.sensor.AudioSource.NoiseSuppression = true;
    this.sensor.AudioSource.AutomaticGainControlEnabled = true;
![Page 17: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/17.jpg)
Beam Angle Mode
- BeamAngleMode enumeration
  - Automatic and Adaptive: the Kinect runtime sets the beam angle automatically
    - Automatic: default setting, use this for a low-volume loudspeaker and/or isotropic background noise
    - Adaptive: use this for a high-volume loudspeaker and/or higher noise levels
  - Manual: the user sets the beam angle to point in the direction of the audio source of interest
- The beam angle is located in the xz plane between the z-axis and the direction of an audio source
- Set the angle using the ManualBeamAngle property

    // Let the runtime choose the beam angle (default)
    this.sensor.AudioSource.BeamAngleMode = BeamAngleMode.Automatic;

    // Or take manual control of the beam angle
    this.sensor.AudioSource.BeamAngleMode = BeamAngleMode.Manual;

    // Point the manual beam angle at the right hand position (radians converted to degrees)
    this.sensor.AudioSource.ManualBeamAngle =
        Math.Atan(rightHand.Position.X / rightHand.Position.Z) * (180 / Math.PI);
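The conversion from a joint position to a beam angle can be checked on plain numbers. The sketch below is a hedged variant that uses Math.Atan2 instead of Math.Atan, guards against a missing depth value, and clamps the result to the supported range; the class and method names are hypothetical, and x and z stand for the joint's Position.X and Position.Z in meters.

```csharp
using System;

static class HandBeamSketch
{
    // Angle in degrees, measured in the x-z plane from the z-axis, toward the point (x, z).
    public static double AngleToward(double x, double z)
    {
        // A joint that is not tracked may report z = 0; fall back to the centered beam.
        if (z <= 0.0)
        {
            return 0.0;
        }

        double degrees = Math.Atan2(x, z) * 180.0 / Math.PI;

        // The beam angle only covers [-50, +50] degrees.
        return Math.Max(-50.0, Math.Min(50.0, degrees));
    }
}
```

For a hand one meter to the side and one meter away, `HandBeamSketch.AngleToward(1.0, 1.0)` gives approximately 45 degrees; positions further off-axis are clamped to 50.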
![Page 18: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/18.jpg)
Recording Audio
- RecordAudio() is blocking

    public void RecordAudio()
    {
        // 10 seconds * 2 bytes per sample * 16000 samples per second
        int recordingLength = 10 * 2 * 16000;
        byte[] buffer = new byte[1024];
        bool startAudioStreamHere = false;
        using (FileStream fileStream = new FileStream(wavfilename, FileMode.Create))
        {
            WriteWavHeader(fileStream, recordingLength);
            // Start the audio stream only if it is not already running
            if (audioStream == null)
            {
                startAudioStreamHere = true;
                audioStream = this.sensor.AudioSource.Start();
            }
            int count, totalCount = 0;
            while ((count = audioStream.Read(buffer, 0, buffer.Length)) > 0
                   && totalCount < recordingLength)
            {
                fileStream.Write(buffer, 0, count);
                totalCount += count;
            }
            // Stop the audio stream only if this method started it
            if (startAudioStreamHere)
                this.sensor.AudioSource.Stop();
        }
    }
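The recording length used in RecordAudio() follows from the audio format: 16-bit mono samples at 16 kHz. The sketch below spells out that arithmetic; the class name, method names, and default parameters are hypothetical helpers chosen to match the format that appears in the WAV header code.

```csharp
using System;

static class RecordingSizeSketch
{
    // Number of raw PCM bytes needed for a recording of the given duration.
    public static int PcmByteCount(int seconds, int sampleRate = 16000,
                                   int bitsPerSample = 16, int channels = 1)
    {
        int bytesPerSample = bitsPerSample / 8;
        return seconds * sampleRate * bytesPerSample * channels;
    }

    // Minimum number of full-buffer reads needed to collect totalBytes (rounded up).
    public static int MinReads(int totalBytes, int bufferSize)
    {
        return (totalBytes + bufferSize - 1) / bufferSize;
    }
}
```

`RecordingSizeSketch.PcmByteCount(10)` gives 320000 bytes, the same value as `10 * 2 * 16000`. With a 1024-byte buffer this takes at least 313 reads, and possibly more, since Stream.Read may return fewer bytes than requested.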
![Page 19: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/19.jpg)
Recording Audio
- To avoid blocking the UI, use a thread

    private void startBtn_Click(object sender, RoutedEventArgs e)
    {
        // Run the blocking RecordAudio() call on a background thread
        var audioThread = new Thread(new ThreadStart(RecordAudio));
        audioThread.SetApartmentState(ApartmentState.MTA);
        audioThread.Start();
    }
![Page 20: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/20.jpg)
Recording Audio
- Helper methods

    struct WAVEFORMATEX
    {
        public ushort wFormatTag;
        public ushort nChannels;
        public uint nSamplesPerSec;
        public uint nAvgBytesPerSec;
        public ushort nBlockAlign;
        public ushort wBitsPerSample;
        public ushort cbSize;
    }

    static void WriteWavHeader(Stream stream, int dataLength)
    {
        using (var memStream = new MemoryStream(64))
        {
            int cbFormat = 18; // sizeof(WAVEFORMATEX)
            // PCM, mono, 16 kHz, 16 bits per sample
            WAVEFORMATEX format = new WAVEFORMATEX()
            {
                wFormatTag = 1,
                nChannels = 1,
                nSamplesPerSec = 16000,
                nAvgBytesPerSec = 32000,
                nBlockAlign = 2,
                wBitsPerSample = 16,
                cbSize = 0
            };
            using (var binarywriter = new BinaryWriter(memStream))
            {
                // RIFF header
                WriteString(memStream, "RIFF");
                // RIFF chunk size: every byte after this field (file size - 8)
                binarywriter.Write(dataLength + cbFormat + 20);
                WriteString(memStream, "WAVE");
                WriteString(memStream, "fmt ");
                binarywriter.Write(cbFormat);
                // WAVEFORMATEX
                binarywriter.Write(format.wFormatTag);
                binarywriter.Write(format.nChannels);
                binarywriter.Write(format.nSamplesPerSec);
                binarywriter.Write(format.nAvgBytesPerSec);
                binarywriter.Write(format.nBlockAlign);
                binarywriter.Write(format.wBitsPerSample);
                binarywriter.Write(format.cbSize);
                // data header
                WriteString(memStream, "data");
                binarywriter.Write(dataLength);
                memStream.WriteTo(stream);
            }
        }
    }

    static void WriteString(Stream stream, string s)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(s);
        stream.Write(bytes, 0, bytes.Length);
    }
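To see how the header bytes are laid out, the sketch below builds the same 16 kHz, 16-bit, mono PCM header into a byte array so its size and fields can be inspected. The class and method names are hypothetical. Note that the RIFF size field is computed here as the file size minus 8, that is, the 38 header bytes that follow the field plus the data length.

```csharp
using System;
using System.IO;
using System.Text;

static class WavHeaderSketch
{
    // Size of the WAVEFORMATEX structure written into the "fmt " chunk.
    public const int FormatSize = 18;

    // Build a WAV header for dataLength bytes of 16 kHz, 16-bit, mono PCM audio.
    public static byte[] Build(int dataLength)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms))
        {
            bw.Write(Encoding.ASCII.GetBytes("RIFF"));
            bw.Write(dataLength + FormatSize + 20);   // bytes remaining after this field
            bw.Write(Encoding.ASCII.GetBytes("WAVE"));
            bw.Write(Encoding.ASCII.GetBytes("fmt "));
            bw.Write(FormatSize);                     // size of the format chunk body
            bw.Write((ushort)1);                      // wFormatTag: PCM
            bw.Write((ushort)1);                      // nChannels: mono
            bw.Write((uint)16000);                    // nSamplesPerSec
            bw.Write((uint)32000);                    // nAvgBytesPerSec = 16000 * 2
            bw.Write((ushort)2);                      // nBlockAlign: bytes per sample frame
            bw.Write((ushort)16);                     // wBitsPerSample
            bw.Write((ushort)0);                      // cbSize: no extra format bytes
            bw.Write(Encoding.ASCII.GetBytes("data"));
            bw.Write(dataLength);                     // size of the audio data that follows
            bw.Flush();
            return ms.ToArray();
        }
    }
}
```

The header is 46 bytes long: 12 bytes for the RIFF/WAVE preamble, 8 + 18 bytes for the format chunk, and 8 bytes for the data chunk header.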
![Page 21: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/21.jpg)
Build KinectAudio App
- Create a new C# WPF project with the name KinectAudio
- Add the Microsoft.Kinect reference
- Design the GUI
- Add code
![Page 22: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/22.jpg)
GUI Design
Image
MediaElement
![Page 23: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/23.jpg)
Adding Code
- Import namespaces
- Add member variables
- Register the WindowLoaded event handler programmatically

    // Namespaces
    using Microsoft.Kinect;
    using System.IO;
    using System.Threading;

    // Member variables
    KinectSensor sensor;
    WriteableBitmap colorBitmap;
    byte[] colorPixels;
    // Skeleton[] totalSkeleton = new Skeleton[6];
    Stream audioStream;
    string wavfilename = "c:\\temp\\kinectAudio.wav";

    // Constructor: register the WindowLoaded handler
    public MainWindow()
    {
        InitializeComponent();
        Loaded += new RoutedEventHandler(WindowLoaded);
    }
![Page 24: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/24.jpg)
Adding Code: WindowLoaded

    private void WindowLoaded(object sender, RoutedEventArgs e)
    {
        // Nothing to do if no Kinect sensor is connected
        if (KinectSensor.KinectSensors.Count == 0)
            return;

        this.sensor = KinectSensor.KinectSensors[0];
        this.sensor.SkeletonStream.Enable();
        this.sensor.ColorStream.Enable();
        this.sensor.AllFramesReady += allFramesReady;

        // Buffer and bitmap for displaying the color stream
        this.colorPixels = new byte[this.sensor.ColorStream.FramePixelDataLength];
        this.colorBitmap = new WriteableBitmap(this.sensor.ColorStream.FrameWidth,
            this.sensor.ColorStream.FrameHeight, 96.0, 96.0, PixelFormats.Bgr32, null);
        this.image1.Source = this.colorBitmap;

        // Audio event handlers
        this.sensor.AudioSource.SoundSourceAngleChanged += soundSourceAngleChanged;
        this.sensor.AudioSource.BeamAngleChanged += beamAngleChanged;

        // Start the sensor
        this.sensor.Start();
    }
![Page 25: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/25.jpg)
Adding Code: AudioStream Start/Stop
- Start/stop via button clicks

    private void startAudioStreamBtn_Click(object sender, RoutedEventArgs e)
    {
        audioStream = this.sensor.AudioSource.Start();
    }

    private void stopAudioStreamBtn_Click(object sender, RoutedEventArgs e)
    {
        this.sensor.AudioSource.Stop();
    }
![Page 26: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/26.jpg)
Adding Code: Audio Event Handlers

    void beamAngleChanged(object sender, BeamAngleChangedEventArgs e)
    {
        this.soundBeamAngle.Text = e.Angle.ToString();
    }

    void soundSourceAngleChanged(object sender, SoundSourceAngleChangedEventArgs e)
    {
        this.soundSourceAngle.Text = e.Angle.ToString();
        this.confidenceLevel.Text = e.ConfidenceLevel.ToString();
    }
![Page 27: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/27.jpg)
Adding Code: AllFramesReady Event Handler

    void allFramesReady(object sender, AllFramesReadyEventArgs e)
    {
        using (ColorImageFrame imageFrame = e.OpenColorImageFrame())
        {
            // The frame can be null if it arrived too late to be opened
            if (null == imageFrame)
                return;

            imageFrame.CopyPixelDataTo(colorPixels);
            int stride = imageFrame.Width * imageFrame.BytesPerPixel;

            // Copy the pixels into the bitmap shown by the Image control
            this.colorBitmap.WritePixels(
                new Int32Rect(0, 0, this.colorBitmap.PixelWidth, this.colorBitmap.PixelHeight),
                this.colorPixels, stride, 0);
        }
    }
![Page 28: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/28.jpg)
Recording/Replay Button Click Events

    private void startBtn_Click(object sender, RoutedEventArgs e)
    {
        var audioThread = new Thread(new ThreadStart(RecordAudio));
        audioThread.SetApartmentState(ApartmentState.MTA);
        audioThread.Start();
    }

    private void playBtn_Click(object sender, RoutedEventArgs e)
    {
        // Play the recording only if the WAV file exists
        if (!string.IsNullOrEmpty(wavfilename) && File.Exists(wavfilename))
        {
            kinectaudioPlayer.Source = new Uri(wavfilename, UriKind.RelativeOrAbsolute);
            kinectaudioPlayer.LoadedBehavior = MediaState.Play;
            kinectaudioPlayer.UnloadedBehavior = MediaState.Close;
        }
    }
![Page 29: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/29.jpg)
Handle Checkbox Events
- You must register both Checked and Unchecked event handlers for each checkbox

    private void noiseSuppression_Checked(object sender, RoutedEventArgs e)
    {
        this.sensor.AudioSource.NoiseSuppression = true;
    }

    private void echoCancellation_Checked(object sender, RoutedEventArgs e)
    {
        this.sensor.AudioSource.EchoCancellationMode = EchoCancellationMode.CancellationOnly;
        this.sensor.AudioSource.EchoCancellationSpeakerIndex = 0;
    }

    private void gainControl_Checked(object sender, RoutedEventArgs e)
    {
        this.sensor.AudioSource.AutomaticGainControlEnabled = true;
    }
![Page 30: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/30.jpg)
Handle Checkbox Events

    private void gainControl_Unchecked(object sender, RoutedEventArgs e)
    {
        this.sensor.AudioSource.AutomaticGainControlEnabled = false;
    }

    private void echoCancellation_Unchecked(object sender, RoutedEventArgs e)
    {
        this.sensor.AudioSource.EchoCancellationMode = EchoCancellationMode.None;
    }

    private void noiseSuppression_Unchecked(object sender, RoutedEventArgs e)
    {
        this.sensor.AudioSource.NoiseSuppression = false;
    }
![Page 31: EEC-492/592 Kinect Application Development Lecture 11 Wenbing Zhao wenbing@ieee.org](https://reader037.vdocuments.net/reader037/viewer/2022103023/56649de95503460f94ae44b0/html5/thumbnails/31.jpg)
Challenge Tasks
- For advanced students, improve the KinectAudio app in the following ways:
  - Add a ProgressBar for recording and replaying
  - Draw a vertical line in the color image indicating the audio source angle, and another line indicating the beam angle
  - Manually point the beam angle at a particular joint, such as the right hand
  - Observe the sound source angle with the new beam angle