multi-modal user interfaces - marsyas.info
TRANSCRIPT
ICASSP 2013 tutorial
Multi-modal User Interfaces a new music instruments
Sidney Fels (ssfels@ece.ubc.ca), University of British Columbia, Canada
George Tzanetakis (gtzan@cs.uvic.ca)
TUTORIAL ICASSP 2013
Tuesday 22 October 13
Human-Computer Interaction
Motivation and Overview
Which is more interesting?
This…
Or this…
Multi-modal Interfaces
• Multiple modalities for both input and output
• Information feedback
• Generalize any type of existing interface
• The ultimate multi-modal interface is our body and the physical world
• Blending of the physical and the virtual
• Challenging to design, develop, and adopt
• Huge potential to have impact specifically
Why music?
• Musical instruments are the ultimate multi-modal interfaces (physical predates digital and analog interfaces)
• The complexity and subtlety of the communication of a musician with their instrument, as well as in interactions with other musicians, is staggering
• New musical instruments are a great domain-specific research area in which to design, test, and evaluate radical ideas for HCI
Discrete Control
Continuous Control
Human to human interaction and music performance
Evolution of output devices
More output devices
SAGE
REACTABLE
Reactable, Music Technology Group (2006)
Smartphones as instruments
iPhone Ocarina from Smule™ (Wang et al. 2009)
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP, or programming
• Integration – putting it all together
New Interfaces for Musical Expression (NIME)
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Research on HCI/Music
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort
Effort and pain
If you are lucky
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
Transduction and Digitizing
• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced Current
  – Field Direction
Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
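One common way to turn a resistance change into a voltage is a voltage divider: the sensor sits in series with a fixed resistor and the ADC samples the voltage across the fixed resistor. The sketch below assumes that circuit (the slide's own diagram is not in the transcript); the 5 V supply, the 10 kΩ resistor, the 10-bit ADC, and all function names are illustrative choices, not values from the tutorial.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_in=5.0):
    """Voltage across the fixed resistor of a two-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)


def adc_counts(voltage, v_ref=5.0, bits=10):
    """Ideal ADC reading for a voltage, clipped to the range [0, v_ref]."""
    clipped = max(0.0, min(voltage, v_ref))
    return int(clipped / v_ref * (2 ** bits - 1))


def sensor_resistance(counts, r_fixed=10_000.0, bits=10):
    """Invert the divider (assumes v_ref equals the divider supply voltage)."""
    full_scale = 2 ** bits - 1
    if counts <= 0:
        return float("inf")  # no voltage across r_fixed: open circuit
    return r_fixed * (full_scale - counts) / counts


if __name__ == "__main__":
    # A lower sensor resistance (e.g. a harder press) gives a higher reading.
    for r in (100_000.0, 10_000.0, 1_000.0):
        v = divider_voltage(r)
        print(f"{r:>9.0f} ohm -> {v:.2f} V -> {adc_counts(v)} counts")
```

The fixed resistor is usually chosen near the middle of the sensor's resistance range so that the readings use as much of the ADC range as possible.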
Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Human Action Oriented
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
Piezoelectric Sensors
Accelerometers
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction - moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
CCD & CMOS Camera
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms, and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition, and noise cancellation
• Cheap
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
Motors
• Synchronous electric motors
  – i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
DAQ
• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
Tooka: a simple example (Fels et al
Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example, microphone arrays)]
2D, 3D, ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Sensors <-> DSP <->
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
Signals and Features
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
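Several of the peak detection variants listed on the slide can be combined in a few lines. The following is a minimal sketch, not the tutorial's own implementation: it uses a fixed threshold, a local-maximum test, and a minimum spacing constraint, and returns peak locations with amplitudes. The function name and parameters are hypothetical; adaptive thresholds and sub-sample interpolation are left out.

```python
def detect_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    When two candidates are closer than min_spacing samples, the stronger
    one wins.
    """
    candidates = [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] >= threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]
    kept = []
    # Visit candidates from strongest to weakest, keeping only those that are
    # far enough from every peak already accepted.
    for i in sorted(candidates, key=lambda idx: signal[idx], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    kept.sort()
    return [(i, signal[i]) for i in kept]


if __name__ == "__main__":
    onsets = [0, 1, 0, 3, 0, 2.5, 0, 0.2, 0]
    print(detect_peaks(onsets, threshold=0.5, min_spacing=2))
    print(detect_peaks(onsets, threshold=0.5, min_spacing=3))
```

Much of the "secret sauce" in practice lies in how the threshold and the spacing are chosen for a particular sensor.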
Fourier Transform
Spectrum
Short Time Fourier Transform
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Spectrogram
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
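The time-frequency tradeoff for the two window sizes on the slide can be worked out directly: a window of N samples at sample rate fs spans N/fs seconds and gives DFT bins spaced fs/N Hz apart. The helper below is an illustrative sketch (the function name is made up) that ignores the additional smearing caused by the choice of window shape.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, DFT bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


if __name__ == "__main__":
    for n in (256, 4096):
        ms, hz = stft_resolution(n, 22050)
        print(f"{n:5d} samples: {ms:6.1f} ms per window, {hz:5.1f} Hz per bin")
```

At 22050 Hz, the 256-sample window covers about 11.6 ms with bins about 86.1 Hz apart, while the 4096-sample window covers about 185.8 ms with bins about 5.4 Hz apart: better frequency resolution is paid for with worse time resolution.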
Wavelets
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
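For sensor data the linear variant is often enough. The sketch below resamples irregularly timestamped readings onto a uniform grid by linear interpolation; it is an illustration under simple assumptions (timestamps strictly increasing, at least two readings), and the function name and signature are hypothetical.

```python
def resample_linear(times, values, rate_hz):
    """Linearly interpolate (time, value) readings onto a uniform grid.

    Assumes times are strictly increasing. Returns a list of (t, value)
    pairs spaced 1 / rate_hz apart, starting at times[0].
    """
    if len(times) < 2 or len(times) != len(values):
        raise ValueError("need at least two matching (time, value) pairs")
    n_out = int((times[-1] - times[0]) * rate_hz + 1e-9) + 1
    out = []
    j = 0  # index of the input segment [times[j], times[j + 1]] that holds t
    for k in range(n_out):
        t = times[0] + k / rate_hz
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)  # 0 at the left reading, 1 at the right one
        out.append((t, values[j] + w * (values[j + 1] - values[j])))
    return out


if __name__ == "__main__":
    # Three readings that arrived at uneven times, resampled at 2 Hz.
    for t, v in resample_linear([0.0, 0.5, 2.0], [0.0, 1.0, 4.0], rate_hz=2):
        print(f"t={t:.1f}s value={v:.2f}")
```

Linear interpolation is cheap and causal with one sample of look-ahead, at the cost of attenuating high frequencies compared with band-limited (sinc) interpolation.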
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  – Power law functions frequently used
• Frequently used in conjunction with calibration
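Scaling and calibration combine naturally: calibration supplies the observed minimum and maximum of a sensor, and scaling maps that range onto the control parameter, optionally through a power law. The function below is a minimal sketch; its name, its arguments, and the example ranges are illustrative assumptions rather than anything specified in the tutorial.

```python
def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a calibrated sensor reading onto a control range.

    The reading is normalized to [0, 1] using the calibrated input range,
    clipped, shaped with a power law, and stretched to [out_min, out_max].
    """
    x = (reading - in_min) / (in_max - in_min)
    x = max(0.0, min(1.0, x))  # readings outside the calibrated range saturate
    return out_min + (out_max - out_min) * x ** exponent


if __name__ == "__main__":
    # A 10-bit slider mapped to a 0..1 volume with a squared response curve.
    for raw in (0, 256, 512, 1023):
        print(raw, round(scale(raw, 0, 1023, 0.0, 1.0, exponent=2.0), 3))
```

An exponent of 1 gives a linear mapping; exponents above 1 give finer control at the low end of the output range, which is often preferred for perceptual quantities such as loudness.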
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
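The slide's formulas are not in the transcript, so the following is a sketch of one standard route to the efficient computation it mentions: zero-pad to a power of 2, take the FFT, form the power spectrum, and transform back (the Wiener-Khinchin relation). The small radix-2 FFT is included only to keep the example self-contained; in practice a library FFT would be used. All function names are illustrative.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def ifft(spectrum):
    """Inverse FFT computed through the conjugate trick."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def autocorrelation(x):
    """Linear (non-circular) autocorrelation of x for lags 0 .. len(x) - 1."""
    n = len(x)
    size = 1
    while size < 2 * n:  # pad so circular correlation equals linear correlation
        size *= 2
    padded = [complex(v) for v in x] + [0j] * (size - n)
    power = [abs(v) ** 2 for v in fft(padded)]
    return [v.real for v in ifft(power)[:n]]


if __name__ == "__main__":
    pulse_train = [1, 0, 1, 0, 1, 0, 1, 0]  # period of 2 samples
    print([round(r, 6) for r in autocorrelation(pulse_train)])
```

For a periodic input the autocorrelation peaks at lags that are multiples of the period, which is what makes it useful for the periodicity detection task described earlier.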
Similarity Matrix
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
NIME Object tracking
Feature Extraction: Audio
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter with center frequency CF; lower edge at CF-130 or CF/1.0718, upper edge at CF+130 or CF×1.0718]
Mel-filtering -> Log -> DCT -> MFCCs
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
Reduced to 256 colors
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
Principal Component Analysis
Projection matrix
PCA: Eigenanalysis of correlation matrix
Self-Organizing Maps
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
Uncertainty and Time
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Surrogate Sensing and the Ground Truth problem
Two case studies
• Regression
• Classification
Some Results
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each with Feature Extraction, Dimensionality Reduction, Feature Selection, and Classification stages]
Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data is acquired
Dynamic Time Warping
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition Model: P(current state | previous state)
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
Hidden Markov Models I
[Figure: HMM with a hidden state (single discrete variable) and observations; the model consists of transition probabilities P(state at t | state at t-1) and emission probabilities P(observation at t | state at t)]
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is a linear function of current state plus some Gaussian noise, i.e., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
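The predict/update cycle and the trade-off between model and observation reliability are easiest to see in one dimension. The sketch below assumes the simplest possible model, a scalar state that follows a random walk and is measured directly; the function name, the parameter names `process_var` and `meas_var`, and the initial values are illustrative choices, not part of the tutorial.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed directly.

    x is the state estimate (mean) and p its variance. process_var is the
    variance of the model noise, meas_var the variance of the sensor noise.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: the model says the state stays put, but the
        # uncertainty grows by the process noise.
        p = p + process_var
        # Update step: the gain weighs the measurement against the prediction.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    noisy = [1.2, 0.8, 1.1, 0.9, 1.3, 1.0]
    print([round(v, 3) for v in kalman_1d(noisy, process_var=1e-3, meas_var=0.1)])
```

A large `meas_var` relative to `process_var` makes the filter trust its model and smooth heavily; the opposite setting makes it follow the measurements closely. The full filter replaces the scalars with a mean vector and a covariance matrix.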
Multimodal tempo detection for the E-sitar
Case Studies
Beat tracking
Human Computer Interaction
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful: very early in development; build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor, by IBM (1984)
[Figure: “What the user sees” and “The Wizard”]
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at significance p
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions showing the regions (two-sided) or region (one-sided) for rejecting the null hypothesis, with labels X1, X2 and the critical value t(p,df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
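The formulas on this slide did not survive extraction. As a minimal sketch, assuming the independent-samples form with equal variances (the assumption listed on the Student t-test slide), the three steps could look like this in Python. The function name `pooled_t` and the sample values are illustrative and not from the tutorial.

```python
import math


def pooled_t(sample1, sample2):
    """Two-sample t statistic using the combined (pooled) variance."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    # Sums of squared deviations from each sample mean.
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2                     # degrees of freedom
    combined_var = (ss1 + ss2) / df      # combined variance for the two samples
    # Standard error of the difference between the two means.
    sed = math.sqrt(combined_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / sed, df


# Hypothetical measurements for a control and a treatment group.
t, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)
```

The returned `t` is then compared (in absolute value, for a two-tailed test) with the critical value t(p,df).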
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
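The "means, variances, ratio" recipe can be sketched for the one-way case as follows. This is an illustrative Python sketch, not code from the tutorial; `one_way_anova_f` and the data are hypothetical. It also checks the claim that the result does not change under scaling and bias.

```python
def one_way_anova_f(groups):
    """F ratio for a one-way ANOVA: between-group over within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical scores for three interface conditions.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio, df_b, df_w = one_way_anova_f(groups)

# Same data after scaling by 10 and adding a bias of 5: F is unchanged.
rescaled = [[10 * x + 5 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
print(f_ratio, f_rescaled, df_b, df_w)
```

The F ratio is then compared with a table value for (df_between, df_within) at the chosen significance.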
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings, labelled Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - XY position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, V/C Decision Network, Vowel Network, Consonant Network, Fixed Pitch Mapping, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (015)
  - Input: xy values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 01 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
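To make the normalization concrete, here is a small Python sketch of a normalized RBF layer with Gaussian units and a fixed width. It is an illustration under stated assumptions, not the Glove-TalkII implementation: the centres, the width of 0.2 and the output weights are made-up values, and `normalized_rbf` and `rbf_output` are hypothetical names.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activation per centre, normalized to sum to 1 over all units."""
    acts = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-dist_sq / (2.0 * width ** 2)))
    total = sum(acts)
    if total == 0.0:
        # Far from every centre: all Gaussians underflow, so let the closest unit win.
        nearest = min(range(len(centres)), key=lambda j: math.dist(x, centres[j]))
        return [1.0 if j == nearest else 0.0 for j in range(len(centres))]
    return [a / total for a in acts]


def rbf_output(x, centres, width, out_weights):
    """Weighted sum of normalized hidden activations (one row of weights per unit)."""
    hidden = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(h * w[k] for h, w in zip(hidden, out_weights)) for k in range(n_out)]


# Two hypothetical centres in the XY plane, each tied to two output parameters.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[100.0, 10.0], [200.0, 20.0]]
print(rbf_output((0.0, 0.0), centres, 0.2, weights))   # near centre 0: plateau
print(rbf_output((0.5, 0.0), centres, 0.2, weights))   # halfway: interpolation
```

A smaller width sharpens the plateaus around each centre, while a larger width gives smoother interpolation between them.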
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 005 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available
  www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca   George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al. 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human-to-human interaction
  - Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical
  Voltage
  Resistance
  Impedance
Optical
  Color
  Intensity
Magnetic
  Induced Current
  Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
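One common way to do this conversion is a voltage divider followed by an analog-to-digital converter. The sketch below is an assumption about the circuit (the slide's figure is not in the transcript): a variable-resistance sensor in series with a fixed resistor, read by a 10-bit ADC. The supply voltage, resistor values and function names are hypothetical.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor of a series divider (sensor on the supply side)."""
    return v_in * r_fixed / (r_sensor + r_fixed)


def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to ADC counts in [0, 2**bits - 1], clipping out-of-range input."""
    levels = (1 << bits) - 1
    clipped = min(max(voltage, 0.0), v_ref)
    return round(clipped / v_ref * levels)


# Hypothetical sensor whose resistance drops from 100 kOhm to 1 kOhm when actuated.
for r_sensor in (100_000, 10_000, 1_000):
    v = divider_voltage(r_sensor, r_fixed=10_000)
    print(r_sensor, round(v, 3), adc_counts(v))
```

Choosing the fixed resistor near the middle of the sensor's resistance range spreads the readings over more of the ADC range.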
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10 are frequently observed)
• An easy force sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD amp CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50mW vs 1-2W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage->MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
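As a concrete instance of a digital filter operating on samples, the sketch below implements a one-pole low-pass (exponential smoothing), a common choice for taming noisy sensor readings. It is an illustrative Python example, not code from the tutorial; the smoothing factor `alpha` and the test signals are hypothetical.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]), with 0 < alpha <= 1.

    Small alpha means heavy smoothing (low cutoff); alpha = 1 passes the input through.
    """
    output = []
    y = samples[0] if samples else 0.0   # start at the first sample to avoid a jump
    for x in samples:
        y = y + alpha * (x - y)
        output.append(y)
    return output


# A slowly varying (constant) signal passes unchanged ...
steady = one_pole_lowpass([3.0] * 5, alpha=0.2)
# ... while a signal alternating at the highest possible frequency is attenuated.
jitter = one_pole_lowpass([1.0, -1.0] * 25, alpha=0.2)
print(steady[-1], max(abs(v) for v in jitter[-10:]))
```

The trade-off is latency: the heavier the smoothing, the longer the output takes to follow a real change in the input.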
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
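A minimal Python sketch combining three of the variants listed above (fixed threshold, local maximum, spacing constraint) follows. The function name `find_peaks`, the parameter values and the signal are hypothetical; adaptive thresholds and interpolation are left out.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return sorted (index, amplitude) pairs for local maxima above a fixed threshold.

    When two candidates are closer than min_spacing samples, the larger one is kept.
    """
    candidates = [
        (i, signal[i])
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]
    kept = []
    # Visit the strongest candidates first so they win spacing conflicts.
    for index, amplitude in sorted(candidates, key=lambda p: p[1], reverse=True):
        if all(abs(index - k) >= min_spacing for k, _ in kept):
            kept.append((index, amplitude))
    return sorted(kept)


# Hypothetical onset-strength curve with one weak peak too close to a strong one.
curve = [0, 1, 0, 0.5, 0, 3, 2.9, 0, 0.2, 0]
print(find_peaks(curve, threshold=0.4, min_spacing=3))
```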
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
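For sensor data the linear case is often enough. The sketch below, with hypothetical names and values, resamples irregularly timestamped readings onto new time points by linear interpolation, holding the end values outside the measured range. It is not band-limited (sinc) interpolation.

```python
import bisect


def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times; times must be increasing."""
    resampled = []
    for t in new_times:
        if t <= times[0]:
            resampled.append(values[0])      # hold first value before the data starts
        elif t >= times[-1]:
            resampled.append(values[-1])     # hold last value after the data ends
        else:
            right = bisect.bisect_right(times, t)   # first timestamp strictly after t
            left = right - 1
            fraction = (t - times[left]) / (times[right] - times[left])
            resampled.append(values[left] + fraction * (values[right] - values[left]))
    return resampled


# Hypothetical asynchronous sensor events (seconds, reading).
times = [0.0, 0.1, 0.25, 0.4]
values = [0.0, 1.0, 4.0, 1.0]
print(resample_linear(times, values, [0.05, 0.1, 0.175, 0.5]))
```

The same function can put several sensor streams on one common clock before they are combined.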
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
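The range-finding example can be sketched as a small calibrator that tracks the minimum and maximum seen during a warm-up phase and then normalizes readings to [0, 1]. The class name `RangeCalibrator` and the raw readings are hypothetical.

```python
class RangeCalibrator:
    """Learn the min/max of a sensor from observed readings, then normalize to [0, 1]."""

    def __init__(self):
        self.low = None
        self.high = None

    def observe(self, reading):
        """Update the known range with one calibration reading."""
        self.low = reading if self.low is None else min(self.low, reading)
        self.high = reading if self.high is None else max(self.high, reading)

    def normalize(self, reading):
        """Map a reading into [0, 1], clipping values outside the calibrated range."""
        if self.low is None or self.high == self.low:
            return 0.0   # not calibrated yet, or no usable range
        position = (reading - self.low) / (self.high - self.low)
        return min(max(position, 0.0), 1.0)


# Hypothetical warm-up: the performer sweeps the sensor through its full range.
calibrator = RangeCalibrator()
for raw in (200, 850, 430):
    calibrator.observe(raw)
print(calibrator.normalize(525), calibrator.normalize(100), calibrator.normalize(900))
```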
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
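A power-law mapping from a normalized control value to a parameter range might look like the following Python sketch. The exponent, the output ranges and the name `scale_power` are illustrative assumptions rather than values from the tutorial.

```python
def scale_power(position, out_min, out_max, exponent=2.0):
    """Map a normalized position in [0, 1] to [out_min, out_max] with a power law.

    exponent > 1 gives finer control at the low end, exponent < 1 at the high end,
    and exponent = 1 is a plain linear mapping.
    """
    clipped = min(max(position, 0.0), 1.0)
    return out_min + (out_max - out_min) * clipped ** exponent


# Hypothetical slider-to-volume mapping (0 to 100) and knob-to-frequency mapping (Hz).
print(scale_power(0.5, 0.0, 100.0))
print(scale_power(0.5, 20.0, 20000.0, exponent=3.0))
```

In practice the normalized position would come from a calibration step, which is why the two are frequently used together.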
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
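The slide's equations are not in the transcript. As a sketch of correlation-based periodicity detection, the Python below computes the autocorrelation directly in the time domain and picks the lag with the largest value. This direct form costs on the order of N times the number of lags; the efficient computation mentioned on the slide would use the FFT instead. Function names and the test signal are hypothetical.

```python
import math


def autocorrelation(signal, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag (direct form)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(signal, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation inside [min_lag, max_lag]."""
    r = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical repeated gesture: a sinusoid with a period of 20 samples.
gesture = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
print(estimate_period(gesture, min_lag=5, max_lag=50))
```

Restricting the search to a plausible lag range avoids the trivial maximum at lag 0.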
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
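Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter around centre frequency CF; edge labels CF-130 and CF+130 (linearly spaced), CF/1.0718 and CF×1.0718 (log spaced)]
Processing chain: Mel-filtering, Log, DCT, MFCCs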
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal, so the high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
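Energy compaction can be checked numerically. The sketch below implements an orthonormal DCT-II directly from its definition (an O(N^2) illustration, not a fast transform) and measures how much of the energy of a smooth ramp lands in the first two coefficients. The name `dct2` and the test signal are hypothetical.

```python
import math


def dct2(signal):
    """Orthonormal DCT-II computed directly from the definition."""
    n = len(signal)
    coefficients = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            signal[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        )
        coefficients.append(scale * total)
    return coefficients


# A smooth signal (linear ramp): most energy should sit in the low coefficients.
ramp = [float(i) for i in range(8)]
coeffs = dct2(ramp)
energy = sum(c * c for c in coeffs)
low_fraction = (coeffs[0] ** 2 + coeffs[1] ** 2) / energy
print(round(low_fraction, 4))
```

With the orthonormal scaling the total energy of the coefficients equals the energy of the signal, so the fraction is directly comparable across signals.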
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
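The non-parametric case is the easiest to sketch: K-nearest neighbours keeps the training set and votes among the closest examples. The Python below is an illustration only; the feature vectors, gesture labels and the name `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the majority label among the k training vectors closest to query."""
    by_distance = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D gesture features with ground-truth labels.
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["tap", "tap", "tap", "swipe", "swipe", "swipe"]
print(knn_predict(features, labels, (0.5, 0.5)))
print(knn_predict(features, labels, (5.2, 5.1)))
```

There is no training step, but every prediction scans the whole training set, which matters for real-time use.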
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided, based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
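A bare-bones K-means loop is sketched below in Python. To keep the example deterministic it seeds the centroids with the first K points, which is an assumption made here for illustration; practical implementations usually use random or k-means++ seeding. The name `kmeans` and the data are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Alternate nearest-centroid assignment and centroid update."""
    centroids = [tuple(p) for p in points[:k]]   # simplistic seeding: first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its closest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


# Hypothetical unlabeled feature vectors forming two groups.
points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
centroids, assignment = kmeans(points, k=2)
print(centroids, assignment)
```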
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
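For a single input feature, ordinary least squares and the coefficient of determination fit in a few lines. The Python sketch below is illustrative; the function names and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_residual / ss_total


# Hypothetical sensor value (x) against a continuous target parameter (y).
xs = [0, 1, 2, 3]
ys = [1, 3, 4, 8]
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predictions))
```

An R squared of 1 means the line reproduces the targets exactly; values near 0 mean it does no better than always predicting the mean.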
87
Uncertainty and Time
Tuesday 22 October 13
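A small sketch of ordinary least squares for a single feature, together with the coefficient of determination. It assumes one input variable and an intercept term; the function names are made up for this example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predictions):
    """1 minus the ratio of residual to total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot
```

An R squared of 1 means the predictions explain all the variance in the target; a value near 0 means the model does no better than predicting the mean.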
Surrogate Sensors
• Use direct sensors to "learn" indirect acquisition
• Use an augmented instrument for training
• Record the acoustic signal
• Train a model to associate the direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Surrogate Sensing and the Ground Truth problem
Two case studies: Regression, Classification
Some Results
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
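To make the multi-rate point concrete, here is a sketch that linearly interpolates one sensor stream onto the timestamps of another and then concatenates the aligned values into joint feature vectors (one form of early fusion). The function names and the clamping behaviour at the ends are assumptions for this example.

```python
from bisect import bisect_right

def resample(times, values, new_times):
    """Linearly interpolate a stream (sorted times) at new_times,
    holding the first/last value outside the recorded range."""
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            k = bisect_right(times, t)          # first sample after t
            t0, t1 = times[k - 1], times[k]
            w = (t - t0) / (t1 - t0)
            out.append(values[k - 1] + w * (values[k] - values[k - 1]))
    return out

def early_fusion(stream_a, stream_b):
    """Pair up time-aligned samples into joint feature vectors."""
    return list(zip(stream_a, stream_b))
```

Late fusion would instead run a separate model per stream and combine their decisions.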
Sensor Fusion
• Early and late fusion are the extremes of a full spectrum of possibilities
[Figure: pipeline stages, each shown twice: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Multi-modal Results
Main idea: use the camera to constrain factorization results, taking advantage of uncorrelated errors
Tuesday 22 October 13
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
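The difference between causal and non-causal processing can be seen with two moving averages. The causal one uses only the current and past samples; the centered one also needs future samples, so it cannot run on live sensor data without adding delay. Both function names are hypothetical.

```python
def causal_moving_average(samples, width):
    """Average of the current sample and up to width-1 past samples."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - width + 1): n + 1]
        out.append(sum(window) / len(window))
    return out

def centered_moving_average(samples, half_width):
    """Non-causal: the window extends half_width samples into the future."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - half_width): n + half_width + 1]
        out.append(sum(window) / len(window))
    return out
```

On an impulse, the causal output only rises at or after the impulse, while the centered output rises one sample before it.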
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
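The slide gives only the title, so the following is a generic textbook sketch of dynamic time warping rather than the formulation shown in the talk. It aligns two 1D sequences by dynamic programming, using absolute difference as the local cost and allowing match, insertion and deletion steps.

```python
def dtw_distance(a, b):
    """Minimum cumulative cost of aligning sequence a with sequence b."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances
                                     cost[i][j - 1],      # b advances
                                     cost[i - 1][j - 1])  # both advance
    return cost[len(a)][len(b)]
```

This version needs both complete sequences, so it is not causal; online variants restrict the search to a window around the current position.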
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
Inference tasks in temporal
• Filtering: posterior distribution over the current state given all evidence to date (the same computation yields the likelihood of the evidence)
• Prediction: posterior distribution of a future state given evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update the models
  – Updated models provide new estimates and
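Filtering for a discrete hidden state can be sketched as a predict/update recursion. The code below assumes the prior describes the state one step before the first observation; the function name and the table layout (transition[i][j] for moving from state i to j, emission[j][obs] for seeing obs in state j) are choices made for this example.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each time step t."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history
```

The normalizer `total` at each step is the probability of the new observation given the earlier ones, which is how the same recursion also gives the likelihood of the evidence.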
Hidden Markov Models I
[Figure: HMM model. A hidden state (a single discrete variable) with transition probabilities P(state at t | state at t-1), and observations with emission probabilities p(observation at t | state at t)]
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state (for linear models with Gaussian noise)
Kalman Filter
• Linear Gaussian (LG) conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
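A one-dimensional sketch of the predict/update cycle, assuming the simplest possible model: the state is expected to stay constant between steps, with variance `process_var` added for model uncertainty and `measurement_var` describing sensor noise. These parameter names and the scalar setting are assumptions; the general filter uses state vectors and covariance matrices.

```python
def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Track a scalar state from noisy measurements."""
    estimates = []
    for z in measurements:
        # Prediction step: constant-state model, uncertainty grows.
        var = var + process_var
        # Update step: the gain balances model and observation reliability.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1 - gain) * var
        estimates.append(mean)
    return estimates
```

A large `measurement_var` makes the gain small, so the filter trusts its model; a large `process_var` makes it follow the measurements more closely.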
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC), ethnography of air traffic controllers
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amounts of data
  – output is a description of practices rather than specific designs
    • ethnographers are not trained as designers; they are taught to "interfere" as little as possible with the community
participatory design
• users (often musicians) become first-class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: the wizard; what the user sees]
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference (or a larger one) arising by chance if the null hypothesis is true
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions of sample means X1 and X2 with the critical value t(p,df), showing the region (one-tailed) or regions (two-tailed) for rejecting the null hypothesis]
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
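The formulas on this slide did not survive extraction. The sketch below follows the three listed steps using the standard independent-samples t-test with pooled (combined) variance, which assumes equal variances in the two groups; it may differ in notation from the original slide, and the function name is hypothetical.

```python
from math import sqrt

def independent_t(sample_a, sample_b):
    """Return (t, df) for two independent samples, pooled-variance form."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = sum(sample_a) / na, sum(sample_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = na + nb - 2                               # degrees of freedom
    pooled_var = (ss_a + ss_b) / df                # combined variance
    sed = sqrt(pooled_var * (1 / na + 1 / nb))     # standard error of difference
    return (mean_a - mean_b) / sed, df
```

For example, the samples [1, 2, 3, 4, 5] and [3, 4, 5, 6, 7] give t = -2 with df = 8. The two-tailed critical value for α = 0.05 and df = 8 is 2.306, so the null hypothesis would not be rejected at that level.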
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
ANOVA I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
ANOVA II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
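The "dividing two variances" step can be sketched for the one-way case: the F ratio is the between-group mean square divided by the within-group mean square. The function name is made up for this example, and looking the ratio up against the F distribution is left out.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

Multiplying every value by a constant or adding a constant offset leaves F unchanged, which is the independence of scaling and bias noted above.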
Integration and
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Sources: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Spectrum of Gesture-to-Speech Mappings
[Figure: mappings arranged by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Categories: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
GTII Mapping
• Input: 26+ dimensions
• constrained subspace
• Output: 10 dimensions
GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches and foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech]
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
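A minimal sketch of a normalized RBF layer as described: Gaussian units centred on points in input space, with activations divided by their sum so they add up to one. The function names, the shared `width` parameter and the linear output layer are assumptions for illustration; this is not the Glove-TalkII implementation.

```python
from math import exp

def normalized_rbf(x, centres, width):
    """Gaussian activation per centre, normalized over all hidden units."""
    acts = [exp(-sum((a - c) ** 2 for a, c in zip(x, centre)) / (2 * width ** 2))
            for centre in centres]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, weights):
    """Linear combination of hidden activations; weights[k] is the
    output vector associated with hidden unit k."""
    hidden = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h * w[j] for h, w in zip(hidden, weights)) for j in range(n_out)]
```

With a small width the closest unit dominates, so the output plateaus near each centre; midway between two centres the output interpolates between their weight vectors.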
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
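One plausible reading of the combining function in a mixture-of-experts setup is a blend of the two expert outputs weighted by the decision network's vowel probability, followed by quantization of each synthesizer parameter to 6 bits. The sketch below shows that reading only; the function names, the blending rule and the parameter ranges are assumptions, not details taken from the slides.

```python
def combine_experts(p_vowel, vowel_params, consonant_params):
    """Blend expert outputs by the probability that the gesture is a vowel."""
    return [p_vowel * v + (1 - p_vowel) * c
            for v, c in zip(vowel_params, consonant_params)]

def quantize_6bit(value, lo, hi):
    """Map a value in [lo, hi] to an integer in [0, 63], clipping outside."""
    scaled = (value - lo) / (hi - lo) * 63
    return max(0, min(63, round(scaled)))
```

A soft blend of this kind gives smooth transitions between vowel and consonant sounds when the decision network is uncertain.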
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
Smartphones as instruments
iPhone Ocarina from Smule™ (Wang et al. 2009)
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain- and person-specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher-level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together
New Interfaces for Musical Expression (NIME)
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
Transduction and Digitizing
• Electrical: voltage, resistance, impedance
• Optical: color, intensity
• Magnetic: induced current, field direction
Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
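A common way to turn a change in resistance into a voltage is a voltage divider read by an analog-to-digital converter. The sketch below assumes the sensor sits between the supply and the measurement point, with a fixed resistor to ground, and a 10-bit converter with a 5 V reference; the slide does not specify the circuit, so all of these are assumptions, as are the function names.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor in a two-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Ideal ADC reading for a voltage between 0 and v_ref."""
    return min(2 ** bits - 1, int(voltage / v_ref * 2 ** bits))
```

With a 10 kΩ fixed resistor, a sensor that drops from 100 kΩ to 10 kΩ under load moves the output from about 0.45 V to 2.5 V, which is easy to digitize.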
Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
CCD & CMOS Camera
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution
Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
Motors
• Synchronous electric motors
  – i.e. rotation frequency synchronized with the frequency of the supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
[Figure: asynchronous events and synchronous samples drawn along a time axis; multiple channels (for example microphone arrays)]
Tuesday 22 October 13
ICASSP 2013 tutorial
2D/3D/ND + time
50
Sensors and Actuators
[Figure: image and volume data along time axes]
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
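As a minimal sketch of a digital low-pass filter operating on samples, the function below applies a one-pole (exponential smoothing) filter to a stream of sensor readings. The function name, the `alpha` parameter, and the example data are illustrative assumptions, not taken from the tutorial.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth samples with y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha close to 0 gives heavy smoothing (low cutoff);
    alpha = 1 passes the input through unchanged.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    # Start the filter state at the first sample to avoid a start-up transient.
    previous = samples[0] if samples else 0.0
    for x in samples:
        previous = alpha * x + (1.0 - alpha) * previous
        smoothed.append(previous)
    return smoothed


# Hypothetical jittery sensor stream alternating between 0 and 1.
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
print(one_pole_lowpass(noisy, 0.25))
```

The filter is causal (it only uses past samples), so it can run on a live sensor stream at the cost of some lag.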
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
54
Signals and Features
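The variants listed for peak detection can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing, and parabolic interpolation through the peak sample and its two neighbours; it is one possible combination under these assumptions, and the names and parameters are hypothetical.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima above threshold.

    Locations are refined by fitting a parabola through the peak sample
    and its two neighbours, so they may be fractional. When two peaks are
    closer than min_spacing samples, the earlier one is kept.
    """
    peaks = []
    last_index = -min_spacing
    for i in range(1, len(signal) - 1):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        is_local_max = mid > left and mid >= right
        if not is_local_max or mid < threshold:
            continue
        if i - last_index < min_spacing:
            continue
        # Parabolic interpolation: denom is negative at a local maximum.
        denom = left - 2.0 * mid + right
        offset = 0.5 * (left - right) / denom
        amplitude = mid - 0.25 * (left - right) * offset
        peaks.append((i + offset, amplitude))
        last_index = i
    return peaks


print(find_peaks([0.0, 1.0, 0.0, 0.0, 2.0, 0.0], threshold=0.5))
```

An adaptive threshold could be substituted by computing `threshold` from a running mean or median of the recent signal.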
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: the FFT (fast Fourier transform)
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
[Figure: two spectrograms, one with 256-sample windows and one with 4096-sample windows, both at 22050 Hz]
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
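The time-frequency tradeoff for the two window sizes shown on the slide can be worked out directly: a window of N samples at sample rate fs lasts N / fs seconds and gives FFT bins spaced fs / N Hz apart. The helper name below is an illustrative assumption.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in milliseconds, FFT bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    ms, hz = stft_resolution(n, 22050)
    print(f"{n} samples: {ms:.1f} ms window, {hz:.2f} Hz per bin")
```

At 22050 Hz, the 256-sample window spans about 11.6 ms with bins about 86 Hz apart, while the 4096-sample window spans about 186 ms with bins about 5.4 Hz apart: better frequency resolution is paid for with worse time resolution.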
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion - preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
60
Signals and Features
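For sensor data, the linear case mentioned in the last bullet can be sketched as follows. The function resamples a uniformly sampled stream from one rate to another by linear interpolation between neighbouring samples; names and rates are illustrative assumptions, and no anti-aliasing filter is applied, so this is only appropriate when the signal is slow relative to the target rate.

```python
import math


def resample_linear(samples, source_rate, target_rate):
    """Resample uniformly spaced samples to target_rate by linear interpolation."""
    if len(samples) < 2:
        return list(samples)
    # Number of output samples that fit inside the input's time span.
    count = math.floor((len(samples) - 1) * target_rate / source_rate + 1e-9) + 1
    output = []
    for k in range(count):
        position = k * source_rate / target_rate  # fractional input index
        i = min(int(position), len(samples) - 2)
        frac = position - i
        output.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
    return output


# Hypothetical 100 Hz sensor stream upsampled to 200 Hz.
print(resample_linear([0.0, 1.0, 2.0, 3.0], 100, 200))
```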
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
62
Signals and Features
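Calibration and scaling are often used together: first find the range of the raw readings, then map that range onto the control parameter, optionally through a power law. The sketch below assumes a hypothetical 10-bit slider mapped to a volume between 0 and 1; the function names and the exponent are illustrative.

```python
def calibrate_range(readings):
    """Calibration step: find the minimum and maximum of observed readings."""
    return min(readings), max(readings)


def scale(reading, sensor_min, sensor_max, out_min, out_max, exponent=1.0):
    """Map a raw reading to [out_min, out_max] with an optional power law."""
    span = sensor_max - sensor_min
    if span == 0:
        raise ValueError("calibration range is empty")
    normalized = (reading - sensor_min) / span
    normalized = min(1.0, max(0.0, normalized))  # clamp out-of-range readings
    return out_min + (out_max - out_min) * normalized ** exponent


# Hypothetical calibration pass followed by a mapping to volume.
low, high = calibrate_range([0, 130, 512, 1024, 700])
print(scale(512, low, high, 0.0, 1.0))                # linear mapping
print(scale(512, low, high, 0.0, 1.0, exponent=2.0))  # power-law mapping
```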
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
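A minimal sketch of correlation-based periodicity detection: compute the autocorrelation directly in the time domain and pick the lag with the largest value inside a search range. This direct form costs on the order of N times the number of lags; the efficient computation mentioned on the slide would use the FFT instead. Function names and the test signal are illustrative assumptions.

```python
import math


def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation r[lag] for lag = 0..max_lag."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    r = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical periodic gesture signal: a sine with a 20-sample period.
sine = [math.sin(2.0 * math.pi * i / 20.0) for i in range(200)]
print(estimate_period(sine, min_lag=5, max_lag=50))
```

The lower bound `min_lag` keeps the search away from the trivial maximum at lag 0.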
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top down)
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
CFCF-130CF 10718
CF+130CF 10718
Mel-filtering -> Log -> DCT -> MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal - the high ones can be thrown away
• MFCCs keep the first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
79
Uncertainty and Time
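As a small illustration of the non-parametric case, the sketch below classifies a feature vector by majority vote among its k nearest training examples. The two-dimensional "drum hit" features and labels are made-up data for illustration only.

```python
import math
from collections import Counter


def knn_classify(training_set, query, k=3):
    """Predict a label for query from (feature_vector, label) training pairs."""
    # Sort training examples by Euclidean distance to the query vector.
    neighbours = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical training set: (feature vector, ground truth label).
training_set = [
    ((0.1, 0.2), "kick"),
    ((0.2, 0.1), "kick"),
    ((0.9, 0.8), "snare"),
    ((0.8, 0.9), "snare"),
    ((0.85, 0.85), "snare"),
]
print(knn_classify(training_set, (0.15, 0.15)))
```

There is no training step beyond storing the labelled examples, which is why k-NN is convenient for quick experiments with new sensors.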
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
83
Uncertainty and Time
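A minimal sketch of the basic K-means iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points). The initial centroids are passed in explicitly so the example is deterministic; names and data are illustrative assumptions.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Run a fixed number of K-means iterations; return (centroids, clusters)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda c: math.dist(point, centroids[c]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[index]
            for index, cluster in enumerate(clusters)
        ]
    return centroids, clusters


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])[0])
```

In practice the result depends on the initial centroids, so several random restarts are common.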
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels) - similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
87
Uncertainty and Time
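A minimal sketch of ordinary least squares for a single input feature, together with the coefficient of determination as a goodness-of-fit measure. The data are made up and the function names are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]  # hypothetical sensor-to-parameter measurements
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predictions))
```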
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies: Regression, Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each with Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing keeps pace with the incoming sensor data, possibly with some delay
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
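Dynamic time warping aligns two sequences that may differ in speed by finding the cheapest monotonic path through a table of pairwise distances. The sketch below is the textbook dynamic-programming form for one-dimensional sequences with absolute difference as the local cost; the function name and example data are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1D sequences."""
    inf = float("inf")
    # cost[i][j] = cheapest alignment of a[:i] with b[:j].
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[len(a)][len(b)]


# The same hypothetical gesture performed at two speeds aligns at zero cost.
print(dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]))
```

This offline form needs both complete sequences, so it is not causal; online variants restrict the search to a moving window.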
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history - typically just previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
99
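The filtering task can be sketched for a discrete hidden state as a repeated predict-then-update step (the forward algorithm). The two-state model below uses made-up transition and sensor probabilities; all names and numbers are illustrative assumptions, and the prior is taken to describe the state one time slice before the first observation.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state given all observations so far.

    transition[i][j] = P(next state j | current state i)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
        # Update: weight by the likelihood of the observation and normalize.
        unnormalized = [emission[j][obs] * predicted[j] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief


# Hypothetical two-state model: state 0 = "playing", state 1 = "resting".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.1, 0.9], [0.8, 0.2]]  # columns: sensor quiet (0) / sensor active (1)
print(forward_filter(prior, transition, emission, [1, 1]))
```

Because each step only uses the previous belief and the newest observation, filtering is causal and suitable for real-time use.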
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
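[Figure: hidden Markov model. The hidden state is a single discrete variable; the model consists of transition probabilities P(state at t | state at t-1) and emission probabilities P(observation at t | state at t); the observations are the observed variables]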
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea (t -> t+1)
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is a linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
102
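A minimal sketch of the prediction and update steps for a scalar state, assuming the simplest possible model (the state stays constant apart from Gaussian process noise). The variances and names are illustrative assumptions; a real sensor would need a tuned model and, usually, a vector state with covariance matrices.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_mean=0.0, initial_var=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    mean, var = initial_mean, initial_var
    estimates = []
    for z in measurements:
        # Prediction step: constant-state model, so only the uncertainty grows.
        var = var + process_var
        # Update step: the gain trades off model reliability against
        # observation reliability.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append(mean)
    return estimates


# Hypothetical noisy readings of a quantity whose true value is about 5.
readings = [5.3, 4.6, 5.1, 4.9, 5.4, 4.8]
print(kalman_1d(readings, process_var=1e-4, measurement_var=0.5))
```

A large `measurement_var` makes the filter trust its model and smooth heavily; a small one makes it follow the measurements closely.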
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings - detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become first-class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure panels: "What the user sees" and "The Wizard"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: two-tailed case (regions for rejecting the null hypothesis) or one-tailed case (region for rejecting the null hypothesis), bounded by the critical value t(p, df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
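A minimal sketch of the three steps, assuming the standard pooled-variance form of the independent-samples t-test with df = n1 + n2 - 2; this is the textbook formulation and is not taken from the slide itself. The samples are made up.

```python
import math


def independent_t(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n1, n2 = len(sample_a), len(sample_b)
    mean1 = sum(sample_a) / n1
    mean2 = sum(sample_b) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample_a)
    ss2 = sum((x - mean2) ** 2 for x in sample_b)
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df                      # combined variance
    sed = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))    # standard error of difference
    t = (mean1 - mean2) / sed
    return t, df


# Hypothetical task-completion scores for a control and a treatment group.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [3.0, 4.0, 5.0, 6.0, 7.0]
print(independent_t(control, treatment))
```

For these samples |t| is 2.0 with df = 8. The two-tailed critical value for α = 0.05 and df = 8 is 2.306, so the null hypothesis would not be rejected at that level.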
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    httpwwwmedcalcbemanualmpage13-04bhtml
    httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
ANOVA I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA - widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
ANOVA II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings laid out along an axis of approximate time per gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500. Categories: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
bull 26+ dimensionsbull constrained subspace
bull 10 dimensions
Input Output
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Diagram blocks. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Processing: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function. Output: Synthesizer, Speech Output]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (015)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 01 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
• Weights are centre
  - Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
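The behaviour of normalized RBF units can be sketched as follows: each hidden unit has a Gaussian activation around its centre, and the activations are divided by their sum so that they always add up to one. The centres, width, and function name below are illustrative assumptions and are not the Glove-TalkII parameters.

```python
import math


def normalized_rbf(x, centres, width):
    """Normalized Gaussian RBF activations of input x for the given centres."""
    raw = [
        math.exp(-math.dist(x, centre) ** 2 / (2.0 * width ** 2))
        for centre in centres
    ]
    total = sum(raw)  # could underflow to 0 if x is very far from every centre
    return [r / total for r in raw]


# Hypothetical 2D input space (e.g. x, y hand position) with three centres.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(normalized_rbf((0.0, 0.0), centres, width=0.3))  # first unit dominates
print(normalized_rbf((0.5, 0.0), centres, width=0.3))  # interpolates between two
```

With a small width the closest unit takes almost all of the activation (a plateau around each centre); a larger width gives smoother interpolation between neighbouring centres.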
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 005 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al. 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP, or programming
• Integration – putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processingActuators + signal processingHCI
Engineering and programmingMusic Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
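One common way to turn a resistance change into a voltage is a two-resistor voltage divider read by an analog-to-digital converter. The sketch below is only an illustration: the 5 V supply, the 10 kΩ fixed resistor, the 10-bit converter, and the function names are assumptions, not values from the tutorial.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor when it is in series with the sensor."""
    return v_in * r_fixed / (r_fixed + r_sensor)


def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC reading, clamped to the valid range."""
    full_scale = (1 << bits) - 1
    counts = round(voltage / v_ref * full_scale)
    return max(0, min(full_scale, counts))


# Hypothetical sensor whose resistance drops from 100 kOhm to 1 kOhm when pressed.
for r_sensor in (100_000.0, 10_000.0, 1_000.0):
    v_out = divider_voltage(5.0, 10_000.0, r_sensor)
    print(r_sensor, round(v_out, 3), adc_counts(v_out))
```

With the sensor on the supply side of the divider, a lower sensor resistance gives a higher output voltage, so the reading increases as the sensor resistance falls.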
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When conductor (such as finger) applied to layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms, and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition, and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
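As a minimal illustration of a digital filter operating on samples, the sketch below implements a causal moving-average low-pass filter. The moving average is chosen here only because it is short; the function name and window width are assumptions and the tutorial does not prescribe a particular filter.

```python
def moving_average(samples, width):
    """Causal moving-average low-pass filter; output has the same length as input."""
    out = []
    acc = 0.0
    for i, x in enumerate(samples):
        acc += x
        if i >= width:
            acc -= samples[i - width]  # drop the sample that left the window
        out.append(acc / min(i + 1, width))
    return out


# A noisy ramp: the filter attenuates the fast alternation and keeps the slow trend.
noisy = [i + (1 if i % 2 else -1) for i in range(10)]
smooth = moving_average(noisy, 4)
```

Subtracting the low-pass output from the input sample by sample gives a crude high-pass filter built from the same block.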
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
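The variants listed above can be combined in a few lines. The following is a rough sketch, not a reference implementation: it assumes a fixed threshold, a three-point local-maximum test, a minimum spacing in samples, and parabolic interpolation around each peak. All names and parameters are illustrative.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima above a fixed threshold."""
    peaks = []  # entries are (index, interpolated location, interpolated amplitude)
    for i in range(1, len(x) - 1):
        a, b, c = x[i - 1], x[i], x[i + 1]
        if b < threshold or not (a < b >= c):
            continue
        # Parabolic interpolation through the three points around the maximum.
        # The denominator is strictly negative because a < b and c <= b.
        offset = 0.5 * (a - c) / (a - 2 * b + c)
        amplitude = b - 0.25 * (a - c) * offset
        if peaks and i - peaks[-1][0] < min_spacing:
            # Spacing constraint: keep only the stronger of two nearby peaks.
            if amplitude > peaks[-1][2]:
                peaks[-1] = (i, i + offset, amplitude)
            continue
        peaks.append((i, i + offset, amplitude))
    return [(location, amplitude) for _, location, amplitude in peaks]
```

An adaptive threshold could be obtained by replacing the constant with, for example, a running mean of the signal; that choice is application specific.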
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for powers of 2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
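The time-frequency tradeoff for the two window sizes on this slide can be worked out directly. The helper below is a small illustrative calculation (the function name is an assumption): a longer window covers more time per frame but gives finer frequency bins.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    time_resolution = window_size / sample_rate
    frequency_resolution = sample_rate / window_size
    return time_resolution, frequency_resolution


for window_size in (256, 4096):
    duration, bin_spacing = stft_resolution(window_size, 22050)
    print(window_size, round(duration * 1000, 1), "ms", round(bin_spacing, 2), "Hz")
```

At 22050 Hz, a 256-sample window spans about 11.6 ms with bins roughly 86 Hz apart, while a 4096-sample window spans about 186 ms with bins roughly 5.4 Hz apart.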
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
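For sensor data, linear interpolation is often the first thing to try. The sketch below resamples a uniformly sampled stream to a different rate; it assumes at least two input samples and makes no attempt at band-limiting, so it is not a substitute for sinc-based resampling of audio. Names are illustrative.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a uniformly sampled sequence by linear interpolation."""
    n_out = int((len(samples) - 1) * dst_rate / src_rate) + 1
    out = []
    for k in range(n_out):
        t = k * src_rate / dst_rate          # position in units of input samples
        i = min(int(t), len(samples) - 2)    # left neighbour, kept inside the array
        frac = t - i
        out.append((1 - frac) * samples[i] + frac * samples[i + 1])
    return out


# Bring a 1 Hz sensor stream up to 2 Hz so it can be combined with a faster stream.
upsampled = resample_linear([0, 10, 20], 1, 2)
```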
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
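Calibration and scaling are often chained: first find the observed range of a sensor, then map the normalized reading onto a control parameter. The sketch below is one possible arrangement under stated assumptions: the 110-880 Hz range, the exponential frequency map, and the squared gain curve are illustrative choices, not values from the tutorial.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)


def normalize(reading, low, high):
    """Map a raw reading to [0, 1] using the calibrated range, with clamping."""
    if high == low:
        return 0.0
    return max(0.0, min(1.0, (reading - low) / (high - low)))


def to_frequency(u, f_min=110.0, f_max=880.0):
    """Exponential map: equal knob movements give equal musical intervals."""
    return f_min * (f_max / f_min) ** u


def to_gain(u, exponent=2.0):
    """Power-law map from a normalized slider position to a gain in [0, 1]."""
    return u ** exponent


low, high = calibrate([37, 512, 980, 45, 760])   # hypothetical raw knob readings
frequency = to_frequency(normalize(512, low, high))
```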
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
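As a sketch of correlation-based periodicity detection, the code below computes the autocorrelation directly in the time domain and picks the lag with the largest value inside a search range. This direct form costs O(N × max_lag) operations; the efficient FFT-based computation mentioned on the slide is not shown. Function names and the test signal are assumptions for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation for lags 0..max_lag (direct computation)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A sinusoid that repeats every 20 samples, e.g. a periodic strumming gesture.
signal = [math.sin(2 * math.pi * i / 20) for i in range(200)]
period = estimate_period(signal, 5, 30)
```

The lower bound on the lag keeps the trivial maximum at lag 0 out of the search.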
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
CFCF-130CF 10718
CF+130CF 10718
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
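The last two stages of the MFCC pipeline (log, then DCT, then truncation) can be sketched as follows. This is an unnormalized DCT-II written out directly for clarity; the mel filterbank itself is not shown, and the function names and the default of 13 retained coefficients are illustrative assumptions.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a sequence."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i in range(n))
        for k in range(n)
    ]


def cepstral_coefficients(band_energies, n_keep=13):
    """Log-compress positive band energies, then keep the first DCT coefficients."""
    log_energies = [math.log(e) for e in band_energies]
    return dct2(log_energies)[:n_keep]


# Energy compaction: a constant input ends up entirely in coefficient 0.
flat = dct2([1.0, 1.0, 1.0, 1.0])
```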
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
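To make the eigenanalysis concrete, the sketch below finds the first principal component by power iteration. It is a simplified stand-in under stated assumptions: it uses the covariance matrix of mean-centered data (the slide mentions the correlation matrix, which would add a per-feature normalization), it returns only the dominant direction, and it assumes the data has non-zero variance. Names are illustrative.

```python
import math


def first_principal_component(data, iterations=200):
    """Return (feature means, unit vector of the dominant covariance eigenvector)."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [
        [sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        # Power iteration: repeatedly apply the matrix and renormalize.
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return means, v


def project(row, means, direction):
    """Coordinate of one feature vector along the principal direction."""
    return sum((x - m) * c for x, m, c in zip(row, means, direction))
```

Stacking several such directions as rows gives a projection matrix that reduces the feature dimension.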
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
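K-nearest neighbors is the shortest of these classifiers to write down. The sketch below assumes Euclidean distance and a simple majority vote; the training data, labels, and function name are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Predict a label by majority vote among the k nearest training vectors."""
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional gesture features with ground truth labels.
training_set = [
    ((0.0, 0.0), "tap"), ((0.0, 1.0), "tap"), ((1.0, 0.0), "tap"),
    ((5.0, 5.0), "swipe"), ((5.0, 6.0), "swipe"), ((6.0, 5.0), "swipe"),
]
label = knn_predict(training_set, (0.2, 0.1))
```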
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
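The three measures on this slide can be computed from paired lists of true and predicted labels. The sketch below is a minimal version with illustrative names; the F-measure shown is the balanced F1 for a single class treated as positive.

```python
from collections import Counter


def confusion_matrix(truth, predicted):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(truth, predicted))


def accuracy(truth, predicted):
    """Fraction of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    pairs = confusion_matrix(truth, predicted)
    tp = pairs[(positive, positive)]
    fp = sum(n for (t, p), n in pairs.items() if p == positive and t != positive)
    fn = sum(n for (t, p), n in pairs.items() if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a cross-validation setup these functions would be applied to each held-out fold and the results averaged.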
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters, noise and border points ignored)
  – DBScan
• Graph-based
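A bare-bones K-means loop looks like the sketch below. It assumes the initial centers are supplied by the caller, runs a fixed number of iterations rather than testing for convergence, and leaves an empty cluster's center where it was. Names and data are illustrative.

```python
import math


def kmeans(points, initial_centers, iterations=20):
    """Alternate between assigning points to the nearest center and re-averaging."""
    centers = [tuple(c) for c in initial_centers]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        centers = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)
        ]
    return centers, clusters


points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centers, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```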
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector predict a continuous value, e.g. given day of the year and humidity predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination R squared (squared correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
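For the single-feature case, ordinary least squares and the coefficient of determination can be written out directly. The sketch below is illustrative only; the function names and the toy data are assumptions, and a multi-feature fit would normally use a linear algebra library.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual variation / total variation."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]          # hypothetical noisy sensor-to-parameter data
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```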
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: hidden Markov model. The hidden state is a single discrete variable; the model consists of transition probabilities P(state at t | state at t-1) and emission probabilities P(observation at t | state at t)]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea, t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(c)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
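The predict/update cycle can be shown with a scalar filter. The sketch below is a deliberately reduced case under stated assumptions: a single state variable that is modeled as constant between steps, with process and measurement noise variances chosen by hand. It is not the multivariate filter with a covariance matrix described above, and all names are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed constant between time steps."""
    x, p = x0, p0                     # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the state stays put, the uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement according to the gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


estimates = kalman_1d([5.0] * 50, process_var=1e-3, measurement_var=0.1)
```

The ratio of `process_var` to `measurement_var` expresses the trade-off between model reliability and observation reliability: a larger measurement variance makes the filter trust its own prediction more and react more slowly.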
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers: taught to “interfere” as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
The Wizard
What the user sees
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: two plots marking the critical value t(p, df); one shows two regions for rejecting the null hypothesis, the other a single region]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note df computation
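The formulas from this slide did not survive in the transcript. As a stand-in, the sketch below follows the three listed steps using the standard equal-variance (pooled) form of the independent two-sample Student t-test; it is an assumption that the slide used this form, and the function name and data are illustrative.

```python
import math


def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2                                   # degrees of freedom
    pooled_var = (ss_a + ss_b) / df                      # combined variance
    sed = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))    # standard error of difference
    return (mean_a - mean_b) / sed, df


# Hypothetical task completion times for a control and a treatment group.
t, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

For this toy data the magnitude of t is 1 with 8 degrees of freedom, well below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would not be rejected.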
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
Tuesday 22 October 13
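The "means, variances, ratio, table" recipe can be made concrete for the one-way case. The sketch below is illustrative only: the function name and the three groups are invented, and 5.14 is the usual table value of F for α = 0.05 with (2, 6) degrees of freedom.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    # ratio of the two variance estimates
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical ratings for three interface variants
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_b, df_w = one_way_anova_f(groups)
F_CRIT = 5.14    # table value for alpha = 0.05 with (2, 6) degrees of freedom
significant = f > F_CRIT
```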
Integration and Implementation
I&I Case studies
• Typically several different software packages
• Laundry list of names – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, Chuck, Supercollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
– two antennae
– change in electrostatic field translated to pitch control of heterodyne oscillators
– beat frequency amplified
• Clara Rockmore playing

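The heterodyne idea can be illustrated numerically: two radio-frequency oscillators are mixed, and the audible pitch is their difference (beat) frequency. The sketch below assumes ideal LC oscillators. The component values and the amount of capacitance the hand adds are invented for illustration and are not taken from any actual theremin design.

```python
import math

L_COIL = 1e-3        # henries (hypothetical)
C_FIXED = 100e-12    # farads (hypothetical)

def lc_frequency(inductance, capacitance):
    """Resonant frequency of an ideal LC oscillator in Hz."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance * capacitance))

def beat_frequency(hand_capacitance):
    """Difference tone between the fixed and the hand-detuned oscillator."""
    f_fixed = lc_frequency(L_COIL, C_FIXED)
    f_variable = lc_frequency(L_COIL, C_FIXED + hand_capacitance)
    return abs(f_fixed - f_variable)

# Beat frequency in Hz as the hand adds 0, 0.25, 0.5 and 1 pF near the antenna
pitches = [beat_frequency(c * 1e-12) for c in (0.0, 0.25, 0.5, 1.0)]
```

With these made-up values, a fraction of a picofarad moves a roughly 500 kHz oscillator by an audible amount, which is why very small hand movements are enough to control pitch.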
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
– downward and side-to-side
– potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
– like a musical instrument
• mapping partially learned
• speech is
– intelligible
– expressive
– slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: mapping types ordered by approximate time per gesture for connected speech (msec): Artificial Vocal Tract (10-30), Phoneme Generator (100), Finger Spelling (130), Syllable Generator (200), Word Generator (500). Systems marked along the axis: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
– open configuration of hand represents open vocal tract
– XY position determines vowel sounds (like tongue)
• Consonants
– constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
– constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
– Mixture-of-experts model
• Vowel/Consonant decision network
• Vowel Expert network
• Consonant Expert network

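One common way to combine a decision (gating) network with two expert networks is a probability-weighted blend of the expert outputs. The sketch below shows that generic scheme only. Whether the Glove-TalkII combining function has exactly this form is an assumption, and the function name and parameter values are hypothetical.

```python
def combine(p_vowel, vowel_params, consonant_params):
    """Blend two expert outputs using the gate's vowel probability."""
    if not 0.0 <= p_vowel <= 1.0:
        raise ValueError("p_vowel must be a probability")
    return [p_vowel * v + (1.0 - p_vowel) * c
            for v, c in zip(vowel_params, consonant_params)]

# Hypothetical expert outputs (two formant parameters each)
vowel_out = [700.0, 1200.0]
consonant_out = [300.0, 2200.0]
blended = combine(0.75, vowel_out, consonant_out)
```

A soft blend like this gives smooth transitions between vowel-like and consonant-like hand shapes instead of a hard switch.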
Glove-TalkII System
[Diagram blocks: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz); Contact Switches; Foot Pedal; Preprocessor; Fixed Pitch Mapping; V/C Decision Network; Vowel Network; Consonant Network; Fixed Stop Mapping; Combining Function; Synthesizer; Speech Output]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
– Input: 10 flex angles
• 4x2 finger joints
• Thumb abduction and thumb rotation
– Output
• Probability of vowel
– Training
• 2600 consonants, 700 vowels
• 0 error
– Testing
• 1380 consonants, 234 vowels
• 0 error

GTII Vowel Network
• Various networks tried
– Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
– Fixed std deviation (0.15)
– Input: x, y values
– Output: 8 formant parameters
• Training
– 100 examples of each vowel with random noise added
– 0.1 error
• Testing
– 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
– Gaussian activation
• Weights are centres
– Normalized over all units in group
• Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
– interpolation according to width parameter
– Plateaus around nearest centre
• Closest RBF dominates

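The behaviour described for normalized RBF units (the closest centre dominates, with interpolation in between) can be checked with a small sketch. It assumes Gaussian units with a shared width of 0.15, as on the vowel network slide, followed by a linear readout. The two centres and their formant targets are invented for illustration and do not come from Glove-TalkII.

```python
import math

def normalized_rbf(x, centres, weights, width=0.15):
    """Output of a normalized Gaussian RBF layer followed by a linear readout.

    centres: list of input-space points, one per hidden unit
    weights: list of output vectors, one per hidden unit
    """
    acts = []
    for c in centres:
        dist2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-dist2 / (2.0 * width ** 2)))
    total = sum(acts)
    norm = [a / total for a in acts]          # activations sum to 1
    n_out = len(weights[0])
    return [sum(n * w[k] for n, w in zip(norm, weights)) for k in range(n_out)]

# Two hypothetical vowel centres in the x-y plane with made-up formant targets
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[300.0, 2300.0], [700.0, 1100.0]]
at_centre = normalized_rbf((0.0, 0.0), centres, weights)
halfway = normalized_rbf((0.5, 0.0), centres, weights)
```

At a centre the output is essentially that unit's target, and halfway between two centres it is their average. An input very far from every centre would make all activations underflow to zero, which a real implementation would need to guard against.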
Consonant Network
• 10 - 14 - 9 normalized RBF network
– Input: finger and thumb angles
• Note: added thumb abduction and rotation later
– Output: formant parameters and voicing
• Training
– 350 approximants, 1510 fricatives and 740 nasals
– 0.05 error
• Testing
– 255 approximants, 960 fricatives, 160 nasals
– 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
– ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
– 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

SAGE
Motivation and Overview

REACTABLE
Motivation and Overview
Reactable, Music Technology Group (2006)

Smartphones as instruments
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al. 2009)

Beyond direct mapping
• Direct Mapping
– Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
– Easy to learn and interpret
– Expressive, especially for continuous controllers
• Beyond Direct Mapping
– Gesture recognition (pinch to zoom)
– Speech recognition
– Adaptive, possibly domain and person specific
– More similar to human to human interaction
– Require layer of DSP and ML between input and
Motivation and Overview

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
Motivation and Overview

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
– Supervised and unsupervised learning
– Time alignment
• Real-time and causality
Motivation and Overview

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
– MATLAB culture
– No HCI, user centered training
– Focus on algorithms, not programming experience
• CS background
– No DSP
– No circuits
– Focus on programming experience, not algorithms
• Music
– Performance and composition culture
– No HCI, DSP or programming
• Integration – putting it all together
Motivation and Overview

New Interfaces for Musical Expression (NIME)
Motivation and Overview
First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Research on HCI/Music

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
Sensors and Actuators
[Side annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
– Heart rate
– Brain waves
– Galvanic skin response
– Muscle activations
• Many more …
Sensors and Actuators

Transduction and Digitizing
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction

Digitizing
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)

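A voltage divider is one standard way to turn a change in resistance into a voltage that an ADC can read. The sketch below assumes a 5 V supply, a 10 kΩ fixed resistor to ground and an ideal 10-bit converter. All three values and the function names are assumptions chosen for illustration.

```python
V_SUPPLY = 5.0       # volts (assumed)
R_FIXED = 10_000.0   # ohms, fixed resistor to ground (assumed)
ADC_MAX = 1023       # full-scale code of an assumed 10-bit converter

def divider_voltage(r_sensor):
    """Voltage across the fixed resistor for a given sensor resistance."""
    return V_SUPPLY * R_FIXED / (R_FIXED + r_sensor)

def adc_code(voltage):
    """Ideal ADC: map 0..V_SUPPLY onto integer codes 0..ADC_MAX."""
    return round(voltage / V_SUPPLY * ADC_MAX)

def sensor_resistance(code):
    """Invert the divider to estimate the sensor resistance from an ADC code."""
    if code <= 0:
        return float("inf")
    voltage = code / ADC_MAX * V_SUPPLY
    return R_FIXED * (V_SUPPLY - voltage) / voltage

# A sensor at 30 kOhm produces 1.25 V, which the ADC quantizes to one code
code = adc_code(divider_voltage(30_000.0))
estimate = sensor_resistance(code)
```

The round trip does not return exactly 30 kΩ because of quantization, which is one source of the measurement noise discussed later in the tutorial.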
Physical Property Sensors
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented
Sensors and Actuators

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
Sensors and Actuators

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
Sensors and Actuators

Microphones and Microphone Arrays
Sensors and Actuators
• Carbon
– lightly packed carbon
– compression increases conduction
– needs power supply
• Capacitor (condenser)
– capacitor between a stationary metal plate and a light metallic diaphragm
– compression changes capacitance by moving diaphragm
– needs power supply
• Electret and Piezoelectric
– mentioned before
– no external power needed
• Magnetic (moving coil)
– induction - moving conductor in magnetic field
– diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera
Sensors and Actuators

CMOS Cameras
• CCDs have to transfer charge, rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
– about 3 transistors (photodiode version)
– 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
– only viable recently as sub-micron tech gets better
– only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
– head, arms and feet
– body geometry
– 20 joints per person
• face recognition
• RGB camera
– 30 Hz
• depth sensor
– Infrared projection + camera
• microphone array
– directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
Sensors and Actuators

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
Sensors and Actuators

Motors
• Synchronous electric motors
– i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
Sensors and Actuators

Stepper and Servo Motors
• Stepper Motor
– Brushless DC that divides rotation into equal steps
– Move and hold, no feedback circuitry required
– Low cost
• Servo Motor
– Motor coupled with sensor for feedback
– High performance alternative to stepper motors
– Higher cost
Sensors and Actuators

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
Sensors and Actuators

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
Sensors and Actuators

Mobile phones and tablets
• Sensors embedded
– GPS
– Accelerometer
– Gyro
– Temperature
– Microphone
– Capacitive display
– Networking
– input port for more
• Actuators
– Speaker
– Headphones
– Vibrator
– High-res display
– Networking
– Output port

DAQ
• use a data acquisition board plugged into your computer
– e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka a simple example (Fels et al

Events and Time Series
Sensors and Actuators
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

2D/3D/ND + time
Sensors and Actuators
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->
Sensors and Actuators

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
Signals and Features

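As a concrete instance of a digital filter operating on samples, here is a one-pole low-pass (exponential smoothing) filter of the kind often used to calm jittery sensor readings. It is a minimal sketch; the smoothing coefficient and the sample data are arbitrary illustrative choices.

```python
def one_pole_lowpass(samples, alpha=0.2):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]), with 0 < alpha <= 1.

    The state starts at the first sample to avoid a start-up transient.
    """
    out = []
    y = samples[0] if samples else 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

# A jittery, roughly constant hypothetical sensor reading
noisy = [1.0, 1.4, 0.6, 1.3, 0.7, 1.2, 0.8, 1.1, 0.9, 1.0]
smooth = one_pole_lowpass(noisy)
```

Smaller values of `alpha` smooth more but respond more slowly, which matters for the real-time constraints discussed later.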
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
– Fixed and adaptive threshold
– Local maximum
– Spacing constraints
– Interpolation
– Output is peak locations and amplitudes
Signals and Features

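A simple peak picker combining three of the variants listed (fixed threshold, local maximum, spacing constraint) might look like the sketch below. It is one of many possible designs. The function name, the tie-breaking rule and the example envelope are illustrative assumptions.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    A sample is a peak if it is strictly greater than its left neighbour and
    at least as large as its right neighbour. When two candidates are closer
    than min_spacing samples, the larger one is kept.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        v = signal[i]
        if v < threshold or not (signal[i - 1] < v >= signal[i + 1]):
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            if v > peaks[-1][1]:
                peaks[-1] = (i, v)      # replace the weaker nearby peak
            continue
        peaks.append((i, v))
    return peaks

# Hypothetical envelope of a piezo pickup: two hits and some ripple
envelope = [0.0, 0.1, 0.9, 0.3, 0.4, 0.2, 0.1, 0.0, 0.7, 0.6, 0.1]
hits = find_peaks(envelope, threshold=0.35, min_spacing=3)
```

The ripple at index 4 passes the threshold but is suppressed by the spacing constraint, leaving only the two hits.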
Fourier Transform
Signals and Features
Spectrum

Short Time Fourier Transform
Signals and Features
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes
• FFT: the fast Fourier Transform

Spectrogram
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time varying power spectra (3D with animation or 2D)

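The time-frequency tradeoff for the two window sizes on the spectrogram slide can be worked out directly: a longer window gives finer frequency bins but covers a longer stretch of time. The helper name below is made up for illustration.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    duration = window_size / sample_rate
    bin_spacing = sample_rate / window_size
    return duration, bin_spacing

short = stft_resolution(256, 22050)     # about 11.6 ms, about 86.1 Hz per bin
long_ = stft_resolution(4096, 22050)    # about 185.8 ms, about 5.4 Hz per bin
```

The product of the two numbers is always 1, so improving one resolution necessarily worsens the other.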
Wavelets
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution

Denoising
• Audio
– Simplest approach: LPF
– Spectral subtraction using noise profile
– Median filtering in time-frequency plane
• Image
– Challenge: preserve edge discontinuities
– Gaussian filtering (2D)
– Anisotropic diffusion – preserves edges
– Median filtering
– Frequently done in wavelet domain
Signals and Features

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
– L/M: interpolation (L) followed by decimation (M)
– Band-limited interpolation at arbitrary points
– Sinc interpolation results in perfect reconstruction for band-limited continuous signals
– Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
Signals and Features

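For sensor data, the linear case mentioned in the last bullet is often enough. The sketch below resamples irregularly timed readings onto a regular grid by linear interpolation. The function name and the data are hypothetical, and no band-limiting is attempted.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each point of new_times.

    times must be strictly increasing; new_times must be sorted and lie
    inside [times[0], times[-1]].
    """
    out = []
    j = 0
    for t in new_times:
        if not times[0] <= t <= times[-1]:
            raise ValueError("new time outside the sampled range")
        # advance to the segment [times[j], times[j + 1]] that contains t
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

# Irregularly timed hypothetical sensor readings resampled onto a 0.5 s grid
times = [0.0, 0.4, 1.1, 2.0]
values = [0.0, 4.0, 11.0, 20.0]
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
resampled = resample_linear(times, values, grid)
```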
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
Signals and Features

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
Signals and Features

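Calibration and scaling are often paired as described: first find the sensor's observed range, then map readings into the control range through a power-law curve. The sketch below assumes a knob read as ADC codes and mapped to a frequency in Hz. The sweep values, the output range and the exponent are all invented for illustration.

```python
def calibrate(readings):
    """Find the observed range of a sensor from a calibration sweep."""
    return min(readings), max(readings)

def scale(raw, lo, hi, out_min, out_max, exponent=1.0):
    """Map raw in [lo, hi] to [out_min, out_max] through a power-law curve."""
    x = (raw - lo) / (hi - lo)
    x = min(1.0, max(0.0, x))           # clamp readings outside the range
    return out_min + (x ** exponent) * (out_max - out_min)

# Hypothetical knob sweep recorded during calibration (ADC codes)
lo, hi = calibrate([37, 120, 512, 880, 990])
# Map the knob to a frequency in Hz with a squared response
freq_mid = scale(513.5, lo, hi, 100.0, 1100.0, exponent=2.0)
```

With an exponent of 2 the midpoint of the knob lands at a quarter of the output range, which gives finer control at the low end.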
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
– Correlation based
Signals and Features

Autocorrelation and Cross-Correlation
Signals and Features
Efficient computation when N is a power of 2

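A correlation-based period estimate can be sketched with a direct (time-domain) autocorrelation. This version is quadratic in the signal length and is meant only to show the idea; the efficient route mentioned on the slide goes through the FFT. The function names and the test signal are illustrative.

```python
import math

def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n + lag], for lag = 0 .. max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) of the strongest autocorrelation value in the range."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Hypothetical strumming signal: a sinusoid repeating every 20 samples
signal = [math.sin(2.0 * math.pi * i / 20.0) for i in range(200)]
period = estimate_period(signal, min_lag=5, max_lag=60)
```

The `min_lag` bound skips the trivial maximum at lag 0. Multiples of the true period also produce peaks, so the search range needs to be chosen with the expected tempo or pitch in mind.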
Similarity Matrix
Signals and Features

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
– K-means
– Compression-based
– Histogram-based
– Edge Detection
Signals and Features

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
– Target representation and location (bottom up)
– Target filtering and data association (top
Signals and Features

NIME Object tracking
Signals and Features

Feature Extraction Audio
Signals and Features

Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter centred at CF, with lower edge at CF-130 or CF/1.0718 and upper edge at CF+130 or CF×1.0718]
Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

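Energy compaction is easy to see on a smooth input. The sketch below uses a plain, unnormalized DCT-II written out from its definition (no FFT), applied to a made-up 16-band ramp standing in for a smooth log-spectrum. Because the transform is unnormalized, the energy fraction is indicative rather than an exact Parseval figure.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]

# A smooth hypothetical log-spectrum: a slow ramp across 16 bands
ramp = [0.5 * i for i in range(16)]
coeffs = dct2(ramp)
energy = [c * c for c in coeffs]
# fraction of the total (unnormalized) coefficient energy in the first 4 terms
low_fraction = sum(energy[:4]) / sum(energy)
```

Almost all of the coefficient energy sits in the first few terms, which is why keeping only the low coefficients loses little.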
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
Signals and Features
Reduced to 256 colors

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
– Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
Signals and Features

Principal Component Analysis
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
Uncertainty and Time

Classification Algorithms
• Parametric
– Generative approaches
• Naïve Bayes
• Gaussian Mixture Models
– Discriminative approaches
• Support Vector Machines
• Decision trees
• Non-parametric
– K-nearest Neighbors
Uncertainty and Time

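Of the algorithms listed, K-nearest neighbors is the shortest to write down. The sketch below uses Euclidean distance and a majority vote. The two-dimensional features and the drum labels are invented for illustration and are not data from the tutorial.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features (e.g. spectral centroid, decay time) for drum hits
train = [((0.2, 0.9), "kick"), ((0.3, 0.8), "kick"), ((0.1, 0.7), "kick"),
         ((0.8, 0.2), "snare"), ((0.9, 0.3), "snare"), ((0.7, 0.1), "snare")]
label = knn_predict(train, (0.75, 0.25))
```

There is no training step beyond storing the labeled vectors, which is what makes the method non-parametric.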
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
Uncertainty and Time

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
– Hierarchical
– Semi-supervised
Uncertainty and Time

Clustering Algorithms
• K-means and variants
• Distribution based
– EM algorithm
• Density-based (areas of high density are clusters, noise and border points ignored)
– DBScan
• Graph-based
Uncertainty and Time

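A bare-bones K-means (Lloyd's algorithm) alternates between assigning points to the nearest centre and moving each centre to the mean of its members. The sketch below seeds the centres with the first k points, which keeps it deterministic but is a simplification; practical implementations usually use better initialization. The data points are made up.

```python
import math

def kmeans(points, k, iterations=20):
    """Plain k-means (Lloyd's algorithm) with the first k points as seeds."""
    centres = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # assignment step: nearest centre for every point
        labels = [min(range(k), key=lambda j: math.dist(p, centres[j]))
                  for p in points]
        # update step: move each centre to the mean of its members
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centres[j] = [sum(c) / len(members) for c in zip(*members)]
    return centres, labels

# Two well-separated hypothetical groups of 2-D feature vectors
points = [(0.0, 0.1), (5.0, 5.1), (0.2, 0.0), (5.2, 4.9), (0.1, 0.2), (4.9, 5.0)]
centres, labels = kmeans(points, k=2)
```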
Clustering Evaluation
• Internal evaluation (cluster properties)
– Davies-Bouldin Index
– Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
– F-measure
– Rand measure
– Jaccard Index
– Confusion matrix
• Various types of user studies
Uncertainty and Time

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
– Linear regression
– Ordinary least squares
• Non-parametric
– Kernel Regression
– Regression Trees
Uncertainty and Time

Regression Evaluation
• Goodness of fit
– Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
– Parametric methods
– F-test
– t-test for individual parameters
Uncertainty and Time

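Ordinary least squares with one input, together with the coefficient of determination, fits in a few lines. The sketch below is a minimal illustration with invented data. The function names are not from any particular library.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = mean(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical data: a sensor reading (x) against a measured position (y)
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
```

An R squared close to 1 says the line explains nearly all the variance in y. It says nothing about statistical significance, which is what the F-test and t-tests address.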
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time

Surrogate Sensing and the Ground Truth problem
Uncertainty and Time

Two case studies
Regression
Classification

Some Results
Uncertainty and Time

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
Uncertainty and Time

Sensor Fusion
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Diagram: two parallel pipelines, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done with some delay at the same time as the sensor data
Uncertainty and Time

Dynamic Time Warping
Uncertainty and Time

ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
99
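The filtering task can be sketched for a discrete model as a predict-then-update recursion. This is a minimal illustration under stated assumptions: the two-state model, its probabilities, and the name `forward_filter` are invented for the example and are not taken from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | obs_1..t) for each t (the filtering task)."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Hypothetical 2-state model: state 0 = "resting", state 1 = "playing"
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]   # transition[previous][current]
emission = [[0.9, 0.1], [0.2, 0.8]]     # emission[state][observation]
beliefs = forward_filter(prior, transition, emission, [1, 1])
```

Each repeated observation of symbol 1 raises the belief in state 1, which is the expected behaviour when the evidence keeps pointing the same way.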
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: HMM model. Hidden state (single discrete variable) with transition probabilities P(state t | state t-1); observations with emission probabilities p(observation t | state t)]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea (t -> t+1)
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian (LG) conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
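The predict/update cycle and the reliability trade-off can be sketched with a scalar filter. Note the assumptions: this uses a random-walk state model (simpler than the constant dx/dt model mentioned above), the readings are made up, and `kalman_1d`, `q` and `r` are names chosen for the example.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state x_t = x_{t-1} + noise.

    q: process noise variance (how much we distrust the model)
    r: measurement noise variance (how much we distrust the sensor)
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: mean stays put, uncertainty grows
        p = p + q
        # Update step: the gain balances the model against the observation
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates

readings = [1.2, 0.8, 1.1, 0.9, 1.0, 1.3, 0.7, 1.0]  # made-up noisy readings
trust_sensor = kalman_1d(readings, q=1.0, r=0.01)
trust_model = kalman_1d(readings, q=0.0001, r=1.0)
```

With a small `r` the estimate follows each reading closely; with a small `q` it behaves more like a slowly updated average, which is one way to see the observation-versus-model trade-off.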
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful: very early in development; build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: two-tailed (regions for rejecting the null hypothesis) or one-tailed (region for rejecting the null hypothesis) rejection areas beyond the critical value t(p, df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
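The formulas on this slide did not survive extraction. The steps listed above can be sketched as follows, assuming the standard independent-samples form with a pooled variance and df = n1 + n2 - 2; the sample data and the name `independent_t` are invented for the example.

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Student's t for two independent samples, assuming equal variances."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    # Standard error of the difference between the means
    sed = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t = (mean(a) - mean(b)) / sed
    return t, df

control = [12.0, 14.0, 11.0, 13.0, 15.0]      # made-up scores
treatment = [16.0, 18.0, 15.0, 17.0, 19.0]    # made-up scores
t, df = independent_t(treatment, control)
```

For these made-up samples t comes out at 4 with df = 8, which exceeds the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would be rejected at that level.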
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
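A minimal sketch of the "divide two variances" step for the one-way case, under the usual between-groups / within-groups decomposition. The data and the name `one_way_anova_f` are invented for illustration; the resulting F would still have to be compared against a table or converted to a p-value.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]  # made-up scores
f, dfb, dfw = one_way_anova_f(groups)

# Rescale and shift every score: the F ratio should not change
shifted = [[10 * v + 3 for v in g] for g in groups]
f_shifted, _, _ = one_way_anova_f(shifted)
```

The second call illustrates the first bullet: multiplying all scores by a constant and adding an offset leaves the ratio of the two variances unchanged.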
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
10-30, 100, 130, 200, 500: approximate time/gesture for connected speech (msec)
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech Output]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std. deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
• Weights are centre
  - Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
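The plateau and interpolation behaviour can be sketched with two Gaussian units. This is an illustration only: the centres, the per-centre target values and the function names are hypothetical, and the width of 0.15 is just a guess at the fixed standard deviation quoted for the vowel network.

```python
import math

def normalized_rbf(x, centres, width):
    """Activations of Gaussian RBF units, normalized to sum to 1."""
    raw = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                    / (2 * width ** 2)) for c in centres]
    total = sum(raw)
    return [r / total for r in raw]

def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations (one output value)."""
    acts = normalized_rbf(x, centres, width)
    return sum(a * w for a, w in zip(acts, weights))

centres = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical gesture centres
weights = [300.0, 700.0]             # hypothetical target value per centre
near_first = rbf_output((0.05, 0.0), centres, 0.15, weights)
midway = rbf_output((0.5, 0.0), centres, 0.15, weights)
```

Close to the first centre the output sits almost exactly on that centre's target (the plateau, where the closest RBF dominates), while halfway between the centres the two targets are blended equally.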
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
[Block diagram repeated from slide 134]
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al., 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human-to-human interaction
  - Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more ...
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
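One common way to do this conversion is a voltage divider read by an ADC. The sketch below assumes that circuit (the slide's figure is not available); the resistor values, the 5 V supply, the 10-bit converter and the function names are all illustrative choices.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor in a two-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Ideal ADC reading for a voltage in [0, v_ref]."""
    levels = 2 ** bits - 1
    return round(max(0.0, min(voltage, v_ref)) / v_ref * levels)

# Hypothetical sensor swinging from 100 kOhm (idle) to 10 kOhm (pressed),
# in series with a 10 kOhm fixed resistor
idle = adc_counts(divider_voltage(100_000, 10_000))
pressed = adc_counts(divider_voltage(10_000, 10_000))
```

As the sensor's resistance drops, more of the supply voltage appears across the fixed resistor, so the digitized reading rises.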
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more ...
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
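A minimal sketch combining three of the variants above (fixed threshold, local maximum, spacing constraint). The tie-breaking rule, where the larger of two nearby candidates wins, is one possible choice rather than a standard, and the signal and the name `find_peaks` are made up for the example.

```python
def find_peaks(x, threshold, min_spacing):
    """Local maxima above `threshold`, at least `min_spacing` samples apart.

    Returns (index, amplitude) pairs; when two candidates are too close,
    the larger one wins.
    """
    candidates = [i for i in range(1, len(x) - 1)
                  if x[i] > threshold and x[i - 1] < x[i] >= x[i + 1]]
    peaks = []
    # Visit candidates from largest to smallest so strong peaks claim space first
    for i in sorted(candidates, key=lambda i: x[i], reverse=True):
        if all(abs(i - j) >= min_spacing for j in peaks):
            peaks.append(i)
    return [(i, x[i]) for i in sorted(peaks)]

signal = [0.0, 0.2, 1.0, 0.3, 0.9, 0.1, 0.0, 0.4, 0.05, 0.8, 0.2]
peaks = find_peaks(signal, threshold=0.3, min_spacing=3)
```

An adaptive threshold or parabolic interpolation around each detected index could be layered on top of this; both are left out to keep the sketch short.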
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
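The trade-off for the two window sizes above can be worked out directly: the window duration is the window size divided by the sample rate, and the FFT bin spacing is the sample rate divided by the window size. The helper name `stft_resolution` is illustrative.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, FFT bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_hz = sample_rate / window_size
    return duration_ms, bin_hz

short_win = stft_resolution(256, 22050)    # roughly 11.6 ms, 86.1 Hz per bin
long_win = stft_resolution(4096, 22050)    # roughly 185.8 ms, 5.4 Hz per bin
```

The short window localizes events in time but smears nearby frequencies together; the long window separates frequencies well but blurs anything that changes within about a fifth of a second.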
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering, 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
60
Signals and Features
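For sensor data, the linear case mentioned above is often enough. A minimal sketch, assuming timestamps are strictly increasing and the output grid starts at the first timestamp (the function name and arguments are illustrative):

```python
def resample_linear(times, values, rate_hz):
    """Resample irregularly timed samples onto a uniform grid by
    linear interpolation. Returns a list of (time, value) pairs."""
    if len(times) < 2 or len(times) != len(values):
        raise ValueError("need at least two matching time/value samples")
    out = []
    step = 1.0 / rate_hz
    i = 0  # index of the segment [times[i], times[i + 1]] containing t
    k = 0  # index of the output sample
    t = times[0]
    while t <= times[-1]:
        while times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        w = (t - t0) / (t1 - t0)
        out.append((t, (1.0 - w) * values[i] + w * values[i + 1]))
        k += 1
        t = times[0] + k * step  # avoids accumulating rounding error
    return out


uniform = resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], rate_hz=1.0)
```

The same routine can bring two sensor streams with different rates onto a common time base before further processing.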
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
62
Signals and Features
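Calibration and scaling are often chained: first find the sensor range, then map it to the control parameter. The following is a sketch under assumed values (a hypothetical knob with raw readings between 112 and 901, mapped to 100-2000 Hz with a power law); none of these numbers come from the slides.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)


def scale(raw, lo, hi, out_min, out_max, exponent=1.0):
    """Map raw in [lo, hi] to [out_min, out_max] through a power law."""
    norm = (raw - lo) / (hi - lo)
    norm = min(1.0, max(0.0, norm))  # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent


# Hypothetical calibration pass: the performer sweeps the knob end to end.
lo, hi = calibrate([112, 130, 845, 901, 498])
frequency_hz = scale(498, lo, hi, 100.0, 2000.0, exponent=2.0)
```

With `exponent=1.0` the mapping is linear; exponents above 1 give finer control near the low end of the output range.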
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
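A direct (non-FFT) sketch of autocorrelation-based periodicity detection is shown below. It uses the plain O(N x lags) sum for clarity; the efficient power-of-2 computation mentioned on the slide would go through the FFT instead. Function names and the test signal are assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the highest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Synthetic gesture signal repeating every 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
period = dominant_period(signal, min_lag=5, max_lag=50)
```

Cross-correlation follows the same pattern with two different signals in the inner product.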
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
CFCF-130CF 10718
CF+130CF 10718
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
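Energy compaction can be checked numerically. The sketch below applies an orthonormal DCT-II to a hypothetical smooth curve standing in for 40 log filterbank energies (the curve is made up for illustration) and measures how much energy the first 13 coefficients retain.

```python
import math


def dct2_orthonormal(x):
    """DCT-II with orthonormal scaling, so total energy is preserved."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


# Hypothetical smooth "log filterbank energy" curve over 40 bands.
log_energies = [math.log(1.0 + 10.0 * math.exp(-0.2 * band)) for band in range(40)]
coeffs = dct2_orthonormal(log_energies)

# Fraction of the total energy captured by the first 13 coefficients.
kept = sum(c * c for c in coeffs[:13]) / sum(c * c for c in coeffs)
```

For smooth inputs like this one, nearly all of the energy sits in the low coefficients, which is why truncating to the first 13-20 loses little.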
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
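Delta features and aggregate statistics are simple to compute for a single feature trajectory. A minimal sketch, assuming a first-order difference as the delta (other definitions use a regression over several frames); the helper names and sample values are hypothetical.

```python
import statistics


def deltas(frames):
    """First-order difference between consecutive feature values."""
    return [b - a for a, b in zip(frames, frames[1:])]


def summarize(frames):
    """Aggregate statistics of a feature trajectory over a texture window."""
    return {
        "mean": statistics.fmean(frames),
        "median": statistics.median(frames),
        "variance": statistics.pvariance(frames),
    }


# Hypothetical per-frame feature values (e.g. a spectral centroid trajectory).
trajectory = [1.0, 2.0, 3.0, 4.0]
summary = summarize(trajectory)
changes = deltas(trajectory)
```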
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
79
Uncertainty and Time
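The non-parametric case is the easiest to sketch. Below is a minimal k-nearest-neighbors classifier with Euclidean distance and majority voting; the two-dimensional features and the "kick"/"snare" labels are made-up training data for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training vectors.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2D features (e.g. normalized centroid and decay time).
train = [
    ((0.10, 0.20), "kick"), ((0.20, 0.10), "kick"), ((0.15, 0.15), "kick"),
    ((0.90, 0.80), "snare"), ((0.80, 0.90), "snare"), ((0.85, 0.85), "snare"),
]
label = knn_predict(train, (0.2, 0.2), k=3)
```

There is no training step: the labelled set itself is the model, which is what makes the method non-parametric.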
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
83
Uncertainty and Time
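K-means can be sketched as Lloyd's algorithm: assign each point to its nearest centroid, then move each centroid to the mean of its points. The version below takes explicit initial centroids and a fixed iteration count, both simplifying assumptions; the sample points are invented.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm starting from the given initial centroids.

    Returns (centroids, clusters); an empty cluster keeps its old centroid.
    """
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, centroids=[(0, 0), (10, 10)])
```

The result depends on the initial centroids, which is one reason K is usually chosen from domain knowledge and the run repeated with several starts.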
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
87
Uncertainty and Time
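For a single input feature, ordinary least squares and the coefficient of determination fit in a few lines. This is a sketch with made-up data; the function names are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```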
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines of Feature Extraction, Dimensionality Reduction, Feature Selection and Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
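A minimal sketch of dynamic time warping between two one-dimensional sequences follows, assuming absolute difference as the local cost and the standard three-way step pattern (the slide itself shows only a figure, so these choices are assumptions).

```python
def dtw_distance(a, b):
    """Cost of the cheapest monotonic alignment between sequences a and b."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the best of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


# The same gesture performed at two speeds aligns with zero cost.
slow = [1, 2, 2, 3]
fast = [1, 2, 3]
distance = dtw_distance(fast, slow)
```

This offline version needs both complete sequences, so it is not causal; real-time score following uses constrained or incremental variants.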
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history, typically just previous time slice)
• Transition Model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  - Estimate what transitions occurred and what states generated the sensor reading and update models
  - Updated models provide new estimates and
99
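The filtering task can be sketched as a recursive predict/update loop over a discrete hidden state. The two-state model and its probabilities below are illustrative numbers (a rain/umbrella toy example), not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each time step t.

    transition[i][j] = P(next state j | current state i)
    emission[j][obs] = P(obs | state j)
    """
    belief = list(prior)
    history = []
    n = len(belief)
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update step: weight by the evidence likelihood and renormalize.
        updated = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(updated)
        belief = [u / total for u in updated]
        history.append(belief)
    return history


# Illustrative model: state 0 = rain, state 1 = no rain.
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [{"umbrella": 0.9, "none": 0.1}, {"umbrella": 0.2, "none": 0.8}]
beliefs = forward_filter([0.5, 0.5], transition, emission, ["umbrella", "umbrella"])
```

The loop only uses past evidence, so filtering is causal and suited to real-time use; smoothing additionally needs a backward pass.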
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
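[Figure: hidden Markov model diagram. The model consists of transition probabilities P( | ) between the hidden state at times t-1 and t, and emission probabilities p( | ) linking the hidden state at time t to the observations. The hidden state is a single discrete variable; numbered states 1 to 4 are shown.]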
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
102
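In one dimension the forward step reduces to a few scalar operations. The sketch below assumes a random-walk state model (the state is expected to stay put, with process noise) and a direct noisy measurement of the state; the variable names and noise values are hypothetical.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One predict/update cycle for a scalar random-walk state."""
    # Prediction step: the model keeps the mean, uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Update step: the gain trades model reliability against
    # observation reliability.
    gain = pred_var / (pred_var + measurement_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


# Track a noisy sensor that hovers around 1.0 (made-up readings).
mean, var = 0.0, 1.0
for z in [1.1, 0.9, 1.05, 0.95]:
    mean, var = kalman_step(mean, var, z, process_var=0.01, measurement_var=0.1)
```

A large `measurement_var` makes the filter trust its model and respond slowly; a large `process_var` makes it follow the measurements closely.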
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
103
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers: taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: what the user sees; the Wizard]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions, work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with critical value t(p,df), showing either two regions or a single region for rejecting the null hypothesis; axis labels X1, X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
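The formulas on this slide did not survive extraction, so the sketch below assumes the standard independent two-sample t-test with pooled (combined) variance: pooled variance weighted by degrees of freedom, sed = sqrt(pooled * (1/n1 + 1/n2)), t = difference of means / sed, df = n1 + n2 - 2. The sample data are made up.

```python
import math
import statistics


def students_t(a, b):
    """Independent two-sample t statistic with pooled variance.

    Returns (t, df). Assumes roughly equal variances in both groups.
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples.
    pooled = ((n1 - 1) * statistics.variance(a)
              + (n2 - 1) * statistics.variance(b)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return (statistics.mean(a) - statistics.mean(b)) / sed, df


# Hypothetical task completion times for a control and a treatment group.
control = [1, 2, 3, 4, 5]
treatment = [2, 3, 4, 5, 6]
t, df = students_t(control, treatment)
```

The resulting |t| is then compared with the critical value t(p,df) from a table or a statistics package.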
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
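The "dividing two variances" step can be sketched for the one-way case: the F statistic is the between-group mean square divided by the within-group mean square. The function name and the three small groups are assumptions for illustration.

```python
def one_way_anova_f(groups):
    """One-way ANOVA: return (F, (df_between, df_within))."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    f = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f, (k - 1, n - k)


groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, dfs = one_way_anova_f(groups)

# Rescaling and shifting every observation leaves F unchanged.
rescaled = [[10 * v + 5 for v in g] for g in groups]
f_rescaled, _ = one_way_anova_f(rescaled)
```

The F value is compared against the F distribution with the returned degrees of freedom to determine significance.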
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of gesture-to-speech mappings, ranging over Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator and Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500.]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X,Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches and foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x,y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
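The plateau behaviour can be reproduced with a small sketch of normalized RBF units. It assumes Gaussian activations of the form exp(-d^2 / (2 * width^2)) divided by their sum over all units; the exact form used in Glove-TalkII is not given on the slides, and the two centres are made up.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian RBF activations normalized to sum to 1 over all units.

    Note: if x is extremely far from every centre, all raw activations can
    underflow to zero; this sketch does not guard against that case.
    """
    raw = [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centres]
    total = sum(raw)
    return [r / total for r in raw]


# Two hypothetical vowel centres in the x,y plane.
centres = [(0.0, 0.0), (1.0, 0.0)]
near_first = normalized_rbf((0.1, 0.0), centres, width=0.15)  # closest RBF dominates
halfway = normalized_rbf((0.5, 0.0), centres, width=0.15)     # even interpolation
```

With a small width, the output stays close to the nearest centre's value over a wide region and switches quickly near the midpoint, which matches the plateaus described above.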
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
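The 6-bit quantization of a synthesizer parameter can be sketched as a clamp followed by rounding to one of 64 levels. The parameter range passed in is an assumption; the slides only state the [0, 63] output range.

```python
def quantize_6bit(value, lo, hi):
    """Map a value in [lo, hi] to an integer code in [0, 63]."""
    clamped = min(hi, max(lo, value))
    norm = (clamped - lo) / (hi - lo)
    return round(norm * 63)


# Hypothetical normalized network output in [0, 1].
code = quantize_6bit(0.25, 0.0, 1.0)
```

At a 100 Hz update rate, each of the ten parameters would be re-quantized one hundred times per second.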
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al. 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human to human interaction
  - Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCI/Music
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure bull Motivation and Overview bull Sensors and Actuators bull Signals and Features bull Uncertainty and time bull Human Computer Interaction bull Integration and implementation bull Case Studies bull Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME processbull Pick control spacebull Pick sound spacebull Pick mappingbull Connect with softwarebull Compose and practicebull Repeat
bull 1 and 2 often switched
bull Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processingActuators + signal processingHCI
Engineering and programmingMusic Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
Transduction and Digitizing
• Electrical: Voltage, Resistance, Impedance
• Optical: Color, Intensity
• Magnetic: Induced Current, Field Direction
Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
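The resistance-to-voltage conversion above is commonly done with a voltage divider read by an ADC. The following is a minimal sketch under that assumption; the 10 kΩ fixed resistor, 5 V supply, 10-bit converter, and both function names are illustrative choices, not values from the tutorial.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider.

    r_fixed and v_supply are assumed example values.
    """
    return v_supply * r_fixed / (r_sensor + r_fixed)


def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC reading, clamped to the valid range."""
    full_scale = (1 << bits) - 1
    counts = round(voltage / v_ref * full_scale)
    return max(0, min(full_scale, counts))


# A sensor whose resistance drops under pressure yields a rising voltage.
for r in (100_000.0, 30_000.0, 10_000.0):
    v = divider_voltage(r)
    print(f"{r:>9.0f} ohm -> {v:.3f} V -> {adc_counts(v)} counts")
```

Placing the fixed resistor on the ground side makes the reading rise as the sensor resistance falls; swapping the two resistors inverts the response.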
Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Human Action Oriented
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
Piezoelectric Sensors
Accelerometers
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction - moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
CCD & CMOS Camera
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage->MIDI signal)
• Arduino board
Tooka a simple example (Fels et al
Events and Time Series
Figure labels: Asynchronous Events; Synchronous Samples; Multiple channels (for example microphone arrays); axis: Time
2D, 3D, ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Sensors <-> DSP <->
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
Signals and Features
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
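Several of the peak-detection variants listed above can be combined in a few lines. This is only a sketch of one possible combination (fixed threshold, local maximum, a greedy spacing constraint, and parabolic interpolation); the function name `find_peaks` and its parameters are hypothetical and not taken from the tutorial or from any library.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima above a fixed threshold."""
    # Local maxima that exceed the threshold (end points are never peaks).
    candidates = [i for i in range(1, len(x) - 1)
                  if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]]

    # Spacing constraint: visit candidates from strongest to weakest and keep
    # one only if it is far enough from every peak already kept.
    kept = []
    for i in sorted(candidates, key=lambda i: x[i], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)

    # Parabolic interpolation through the peak sample and its two neighbours
    # refines the location to a fraction of a sample.
    peaks = []
    for i in sorted(kept):
        a, b, c = x[i - 1], x[i], x[i + 1]
        offset = 0.5 * (a - c) / (a - 2 * b + c)
        peaks.append((i + offset, b - 0.25 * (a - c) * offset))
    return peaks


signal = [0, 1, 0, 0, 3, 0, 2, 0]
print(find_peaks(signal, threshold=0.5, min_spacing=3))
```

An adaptive threshold could be obtained by replacing the constant `threshold` with a value derived from a running statistic of the signal.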
Fourier Transform
Spectrum
Short Time Fourier Transform
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Spectrogram
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
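The time-frequency tradeoff for the two window sizes shown above can be worked out directly: a window of N samples at sampling rate fs lasts N/fs seconds and yields frequency bins spaced fs/N Hz apart. A small sketch (the function name is illustrative):

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    ms, hz = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {ms:7.2f} ms per window, {hz:6.2f} Hz per bin")
```

The short window localizes events to about 12 ms but separates frequencies only about 86 Hz apart; the long window resolves about 5 Hz but smears events over roughly 186 ms.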
Wavelets
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
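For sensor data, the linear case mentioned in the last bullet is often enough to bring an irregularly timed stream onto a regular clock. A minimal sketch, assuming increasing timestamps and holding the end values outside the measured range (the function name and that boundary policy are illustrative choices):

```python
from bisect import bisect_right


def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times; times must be increasing."""
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])      # hold first value before the first sample
            continue
        if t >= times[-1]:
            out.append(values[-1])     # hold last value after the last sample
            continue
        k = bisect_right(times, t)     # times[k - 1] <= t < times[k]
        t0, t1 = times[k - 1], times[k]
        w = (t - t0) / (t1 - t0)
        out.append((1 - w) * values[k - 1] + w * values[k])
    return out


# Sensor events at irregular times, resampled onto a regular grid.
print(resample_linear([0, 1, 3], [0, 10, 30], [0.5, 1.0, 1.5, 2.0, 2.5]))
```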
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  – Power law functions frequently used
• Frequently used in conjunction with calibration
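Calibration by range finding and power-law scaling, as described above, can be sketched together. The class and function names, the exponent of 2, and the 100-500 Hz target range are assumptions made for illustration only.

```python
class RangeCalibrator:
    """Tracks the minimum and maximum reading seen during a calibration pass."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, reading):
        self.lo = min(self.lo, reading)
        self.hi = max(self.hi, reading)

    def normalize(self, reading):
        """Map a reading to [0, 1] using the observed range (clamped)."""
        if self.hi <= self.lo:
            return 0.0
        u = (reading - self.lo) / (self.hi - self.lo)
        return max(0.0, min(1.0, u))


def scale_power(u, out_min, out_max, exponent=2.0):
    """Map a normalized value u in [0, 1] to [out_min, out_max] with a power law."""
    return out_min + (out_max - out_min) * u ** exponent


cal = RangeCalibrator()
for raw in (200, 800, 500):          # readings gathered while sweeping the knob
    cal.observe(raw)
print(scale_power(cal.normalize(500), 100.0, 500.0))  # knob position -> frequency in Hz
```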
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
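A correlation-based periodicity estimate can be sketched as follows. Note that this is the direct O(N·L) form of the autocorrelation, written for clarity; it is not the FFT-based computation the slide refers to. The function names and the toy pulse train are illustrative.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation inside [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A pulse every 8 samples, e.g. onsets of a steady strumming gesture.
pulses = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
print(dominant_period(pulses, min_lag=2, max_lag=20))
```

Restricting the search to a plausible lag range avoids the trivial maximum at lag 0 and limits confusion with multiples of the true period.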
Similarity Matrix
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
NIME Object tracking
Feature Extraction: Audio
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
Filter band edges: CF-130 and CF+130 (linear filters); CF scaled by 1.0718 (log filters)
Mel-filtering -> Log -> DCT -> MFCCs
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal - can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
Reduced to 256 colors
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
Principal Component Analysis
Projection matrix
PCA: Eigenanalysis of correlation matrix
Self-Organizing Maps
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
Uncertainty and Time
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
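Of the algorithms listed, k-nearest neighbors is the simplest to write down and is a reasonable first baseline for gesture data. A minimal sketch with Euclidean distance and majority voting; the function name, the value k=3, and the toy "tap"/"shake" feature vectors are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for query from (feature_vector, label) training pairs."""
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors with ground-truth labels.
training_set = [
    ((0.0, 0.0), "tap"), ((0.0, 1.0), "tap"), ((1.0, 0.0), "tap"),
    ((5.0, 5.0), "shake"), ((5.0, 6.0), "shake"), ((6.0, 5.0), "shake"),
]
print(knn_predict(training_set, (0.2, 0.1)))
print(knn_predict(training_set, (5.5, 5.5)))
```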
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
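The K-means entry above alternates two steps: assign each point to its nearest centroid, then move each centroid to the mean of its members. A minimal sketch of that loop (Lloyd's algorithm), assuming the caller supplies the initial centroids; the function name, fixed iteration count, and toy data are illustrative.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm with caller-supplied initial centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(len(centroids)),
                      key=lambda k: math.dist(p, centroids[k]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, assignment = kmeans(data, centroids=[(0, 0), (10, 10)])
print(centers, assignment)
```

Results depend on the initial centroids, which is one reason variants with smarter seeding or multiple restarts are common.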
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
Regression Evaluation
• Goodness of fit
  – Coefficient of determination R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
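Ordinary least squares and the coefficient of determination can be illustrated for a single input feature. This is a sketch with illustrative function names and toy data; real sensor regression would typically involve several features.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


xs, ys = [0, 1, 2, 3], [1, 2, 4, 5]
slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predicted))
```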
Surrogate Sensors
• Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Surrogate Sensing and the Ground Truth problem
Two case studies
• Regression
• Classification
Some Results
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
Diagram labels (shown twice): Feature Extraction; Dimensionality Reduction; Feature Selection; Classification
Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing keeps pace with the incoming sensor data, with some delay
Dynamic Time Warping
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just the previous time slice)
• Transition model: P(current state | previous state)
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
Hidden Markov Models I
Diagram labels: Hidden state (single discrete variable); Observations; Model: transition probabilities, emission probabilities
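The filtering task for a hidden Markov model, given its transition and emission probabilities, reduces to a predict-then-weight loop (the forward algorithm). A minimal sketch with a toy two-state model; the function name and all probabilities are made-up illustrative values.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state after each observation is folded in.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the emission probability and renormalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief


# Toy model: state 0 = "playing", state 1 = "silent"; observation 0 = "loud frame".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
print(forward_filter(prior, transition, emission, [0, 0]))
```

Only past observations are used, so the same loop can run causally on a live sensor stream.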
Kalman Filtering
• Streams of noisy input data
• Basic idea t->t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG P(xy)=N(ay y + by σy)(c)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
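The predict/update cycle can be shown in its smallest form with a scalar state. Note the simplification: this sketch assumes a random-walk model (the state stays constant apart from process noise) rather than the constant-velocity model mentioned above, so mean and covariance are single numbers. The function name and the noise variances are illustrative assumptions.

```python
def kalman_1d(measurements, x0=0.0, p0=1.0, process_var=1e-3, meas_var=0.1):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0                      # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged, uncertainty grows by the process noise.
        p = p + process_var
        # Update step: the gain balances model reliability against sensor reliability.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


noisy = [1.1, 0.9, 1.05, 0.95, 1.0, 1.02, 0.98]
print(kalman_1d(noisy))
```

Raising `meas_var` makes the filter trust the model more and smooth harder; raising `process_var` makes it follow the measurements more closely.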
Multimodal tempo detection for the E-sitar
Case Studies
Beat tracking
Human Computer Interaction
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
Figure labels: The Wizard; What the user sees
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
Figure labels: regions for rejecting the null hypothesis, or region for rejecting the null hypothesis; X1, X2; critical value t(p,df)
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
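The three computations listed above can be sketched for the independent two-sample case. This assumes the equal-variance (pooled) form of the test with df = n1 + n2 - 2; the function name and the sample data are illustrative, and the slide's own formulas were not preserved in this transcript.

```python
import math
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two sample means.
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df


control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]
t, df = pooled_t(control, treatment)
print(f"t = {t:.3f}, df = {df}")
```

The magnitude of t is then compared with the critical value t(p,df) from a table or a statistics package to decide whether the null hypothesis can be rejected.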
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
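The "dividing two variances" step can be made concrete for the one-way case: the F ratio is the between-group mean square over the within-group mean square. A minimal sketch with an illustrative function name and toy data:

```python
from statistics import mean


def one_way_anova_f(groups):
    """F ratio for one-way ANOVA; returns (F, df_between, df_within)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


interfaces = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # e.g. task scores for three designs
print(one_way_anova_f(interfaces))
```

With exactly two groups the F ratio equals the square of the pooled-variance t statistic, which is the sense in which ANOVA generalizes the t-test.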
Integration and
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Spectrum of Gesture-to-Speech Mappings
Mapping levels: Artificial Vocal Tract; Phoneme Generator; Finger Spelling; Syllable Generator; Word Generator
Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
Approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500
Glove-TalkII Vocabularybull Loosely based on articulatory model of speechbull Vowels
ndash open configuration of hand represents open vocal tractndash XY position determines vowel sounds (like tongue)
bull Consonantsndash constrictions in hand represent constriction in vocal tract
bull Stop consonants produced with ContactGlovebull Volume controlled with foot pedal (air pressure)bull Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Blocks shown: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal, Preprocessor, V/C Decision Network, Vowel Network, Consonant Network, Fixed Pitch Mapping, Fixed Stop Mapping, Combining Function, Synthesizer, Speech Output.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
Tuesday 22 October 13
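The behaviour of the normalized RBF units described on the slide above can be sketched in a few lines. This is a minimal illustration and not the Glove-TalkII implementation: the function name `normalized_rbf`, the three centres and the width value are all hypothetical choices made for the example.

```python
import numpy as np

def normalized_rbf(x, centres, width):
    """Return normalized Gaussian RBF activations for a single input x."""
    d2 = np.sum((centres - x) ** 2, axis=1)   # squared distance to each centre
    act = np.exp(-d2 / (2.0 * width ** 2))    # Gaussian activation per hidden unit
    return act / np.sum(act)                  # normalize over all units in the group

# Hypothetical 2-D centres (e.g. positions in an x,y plane) and width.
centres = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
near_first = normalized_rbf(np.array([0.1, 0.0]), centres, width=0.2)
midway = normalized_rbf(np.array([0.5, 0.0]), centres, width=0.2)
```

Close to a centre the nearest unit takes almost all of the normalized activation (the plateau), while halfway between two centres the two units share it equally, which is the interpolation behaviour. An input very far from every centre would make all activations underflow to zero, so a real system would need to guard that division.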
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Multi-modal Interfaces
• Multiple modalities for both input and output
• Information feedback
• Generalize any type of existing interface
• The ultimate multi-modal interface is our body and the physical world
• Blending of the physical and the virtual
• Challenging to design, develop and adopt
• Huge potential to have impact specifically
6
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Why music?
• Musical instruments are the ultimate multi-modal interfaces (physical predates digital and analog interfaces)
• The complexity and subtlety of the communication of a musician with their instrument, as well as in interactions with other musicians, is staggering
• New musical instruments are a great domain-specific research area to design, test and evaluate radical ideas for HCI
7
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Control
8
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Continuous Control
9
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Human to human interaction and music performance
10
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Evolution of output devices
11
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
More output devices
12
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al., 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure?
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
Tuesday 22 October 13
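One common way to turn a resistance change into a voltage, as on the Digitizing slide above, is a two-resistor voltage divider read by an ADC. The sketch below assumes a particular wiring (sensor between the supply and the ADC pin, fixed resistor from the pin to ground); the function names and component values are hypothetical and only illustrate the arithmetic.

```python
def divider_voltage(r_sensor, r_fixed, v_in):
    """Voltage across the fixed resistor of a two-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def sensor_resistance(adc_value, adc_max, r_fixed):
    """Invert the divider: recover the sensor resistance from an ADC reading."""
    if adc_value <= 0:
        # No voltage across the fixed resistor: the sensor looks like an open circuit.
        return float("inf")
    ratio = adc_value / adc_max              # equals v_out / v_in
    return r_fixed * (1.0 - ratio) / ratio

# Hypothetical values: 5 V supply, 10 kOhm fixed resistor, 4.7 kOhm sensor.
v_out = divider_voltage(4700.0, 10000.0, 5.0)
recovered = sensor_resistance(v_out, 5.0, 10000.0)
```

Choosing the fixed resistor near the middle of the sensor's resistance range tends to give the largest voltage swing, which is one reason the same sensor can behave quite differently from one circuit to another.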
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10 are frequently observed)
• An easy force sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When conductor (such as finger) applied to layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction - moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancelation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker/Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage->MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
54
Signals and Features
Tuesday 22 October 13
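Several of the peak detection variants on the slide above can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test and a minimum spacing constraint, and returns peak locations and amplitudes; the function name `find_peaks` and the parameter values are illustrative assumptions, and adaptive thresholds or interpolation are left out.

```python
def find_peaks(x, threshold, min_spacing):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold."""
    peaks = []
    for i in range(1, len(x) - 1):
        is_local_max = x[i - 1] < x[i] >= x[i + 1]
        if not (is_local_max and x[i] > threshold):
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger of the two.
            if x[i] > peaks[-1][1]:
                peaks[-1] = (i, x[i])
            continue
        peaks.append((i, x[i]))
    return peaks

signal = [0, 1, 0, 5, 0, 4, 0, 0, 0, 3, 0]
peaks = find_peaks(signal, threshold=2, min_spacing=3)
```

Here the small bump at index 1 is rejected by the threshold and the maximum at index 5 is rejected by the spacing constraint because it sits too close to the larger peak at index 3.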
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
Tuesday 22 October 13
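The time-frequency tradeoff on the Spectrogram slide can be made concrete with the two window sizes it mentions. The sketch below is an assumption-laden illustration using NumPy: the helper names, the Hann window, the hop size and the 1000 Hz test tone are all choices made for the example.

```python
import numpy as np

def resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    return sample_rate / window_size, window_size / sample_rate

def spectrogram(x, window_size, hop):
    """Power spectra of successive Hann-windowed frames (rows are time frames)."""
    window = np.hanning(window_size)
    starts = range(0, len(x) - window_size + 1, hop)
    frames = np.array([x[s:s + window_size] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

sr = 22050
short_res = resolution(256, sr)    # coarse in frequency, fine in time
long_res = resolution(4096, sr)    # fine in frequency, coarse in time

t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 1000.0 * t)   # hypothetical 1 kHz test tone
S = spectrogram(tone, window_size=4096, hop=2048)
peak_hz = S[0].argmax() * sr / 4096
```

With 256 samples the bins are roughly 86 Hz apart but each frame covers only about 12 ms; with 4096 samples the bins are roughly 5 Hz apart but each frame covers about 186 ms.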
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
60
Signals and Features
Tuesday 22 October 13
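For sensor data, the linear case mentioned on the slide above is often enough to bring irregularly timed readings onto a uniform grid. The following is a minimal sketch using `numpy.interp`; the function name `resample_linear` and the sample values are hypothetical, and no anti-aliasing is attempted.

```python
import numpy as np

def resample_linear(times, values, rate_hz):
    """Linearly interpolate irregularly timed samples onto a uniform time grid."""
    n = int(np.floor((times[-1] - times[0]) * rate_hz)) + 1
    grid = times[0] + np.arange(n) / rate_hz
    return grid, np.interp(grid, times, values)

# Hypothetical sensor readings arriving at uneven times (seconds).
times = np.array([0.0, 0.5, 1.25, 2.0])
values = np.array([0.0, 1.0, 2.5, 4.0])
grid, resampled = resample_linear(times, values, rate_hz=4.0)
```

Linear interpolation is cheap and causal-friendly, but it attenuates high frequencies, so band-limited or sinc-based methods are the better fit when the signal content near the Nyquist rate matters.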
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
62
Signals and Features
Tuesday 22 October 13
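The scaling step on the slide above can be sketched as a clamp-and-map function driven by a calibrated range. The function names, the 0 to 1024 reading range and the 110 to 440 Hz frequency range below are assumptions made for illustration, not values from the tutorial.

```python
def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a sensor reading to a control range through a power-law curve."""
    u = (reading - in_min) / (in_max - in_min)   # normalize using the calibrated range
    u = min(max(u, 0.0), 1.0)                    # clamp readings outside that range
    return out_min + (out_max - out_min) * u ** exponent

def knob_to_frequency(reading, in_min, in_max, f_low, f_high):
    """Exponential mapping, so equal knob steps give equal pitch ratios."""
    u = min(max((reading - in_min) / (in_max - in_min), 0.0), 1.0)
    return f_low * (f_high / f_low) ** u

half_linear = scale(512, 0, 1024, 0.0, 1.0)
half_squared = scale(512, 0, 1024, 0.0, 1.0, exponent=2.0)
mid_freq = knob_to_frequency(512, 0, 1024, 110.0, 440.0)
```

The `in_min` and `in_max` arguments are exactly what a calibration pass would supply, which is why scaling and calibration tend to appear together.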
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
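The note about efficient computation for power-of-2 sizes refers to computing correlation through the FFT. The sketch below is one way to do that with NumPy, assuming zero-padding to the next power of 2 to avoid circular wrap-around; the function name, the 50 Hz test signal and the lag search range are hypothetical.

```python
import numpy as np

def autocorrelation(x):
    """Autocorrelation via FFT, zero-padded to a power of 2 to avoid wrap-around."""
    n = len(x)
    size = 1 << (2 * n - 1).bit_length()          # power of 2 that is >= 2n - 1
    spectrum = np.fft.rfft(x, size)
    acf = np.fft.irfft(spectrum * np.conj(spectrum), size)[:n]
    return acf / acf[0]                            # normalize so that lag 0 equals 1

sr = 1000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 50.0 * t)                   # 50 Hz tone, period of 20 samples
acf = autocorrelation(x)
period = 10 + int(np.argmax(acf[10:40]))           # strongest lag in a plausible range
```

Restricting the search to a plausible lag range is a simple way to skip the trivial maximum at lag 0; cross-correlation follows the same pattern with two different spectra.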
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: mel filterbank; filter edge labels CF-130, CF+130 and CF scaled by 1.0718]
Pipeline: Mel-filtering, Log, DCT, MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal - high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
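The energy compaction property on the DCT slide can be checked numerically. The sketch below implements an unnormalized DCT-II directly from its definition with NumPy; the 40-band smooth input is a made-up stand-in for log filterbank energies and is not real MFCC data.

```python
import numpy as np

def dct_ii(x):
    """Unnormalized DCT-II of a 1-D array, computed from its definition."""
    n = len(x)
    k = np.arange(n)[:, None]                  # output coefficient index
    m = np.arange(n)[None, :]                  # input sample index
    basis = np.cos(np.pi / n * (m + 0.5) * k)
    return basis @ x

# Hypothetical smooth log filterbank output with 40 bands.
log_energies = np.log(np.linspace(1.0, 2.0, 40))
coeffs = dct_ii(log_energies)
kept = coeffs[:13]                             # keep only the first 13 coefficients
compaction = np.sum(kept ** 2) / np.sum(coeffs ** 2)
```

For a smooth input like this, nearly all of the squared coefficient magnitude sits in the first few coefficients, which is why truncating to 13-20 coefficients loses little.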
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
Tuesday 22 October 13
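The projection matrix on the PCA slide comes out of an eigenanalysis. The sketch below uses the covariance matrix of centred data rather than the correlation matrix named on the slide (the two coincide when features are standardized first); the function name and the synthetic two-feature data are assumptions for illustration.

```python
import numpy as np

def pca_projection(X, n_components):
    """Return a projection matrix from eigenanalysis of the covariance matrix."""
    centred = X - X.mean(axis=0)
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # largest variance first
    return eigvecs[:, order]

# Synthetic data: two features driven by one hidden factor plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 1))
X = np.hstack([latent, 2.0 * latent]) + 0.01 * rng.normal(size=(200, 2))
W = pca_projection(X, n_components=1)
reduced = (X - X.mean(axis=0)) @ W
```

Because the second feature is about twice the first, the single retained component points close to the direction (1, 2), and `reduced` holds one value per sample instead of two.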
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
79
Uncertainty and Time
Tuesday 22 October 13
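Of the algorithms listed above, K-nearest neighbors is the simplest to write down. The sketch below is a bare-bones version with Euclidean distance and majority voting; the function name, the two-dimensional features and the "soft"/"loud" labels are invented for the example.

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbours."""
    dists = np.linalg.norm(train_X - query, axis=1)   # Euclidean distance to each example
    nearest = np.argsort(dists)[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: two feature values per example, one label each.
train_X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
train_y = ["soft", "soft", "soft", "loud", "loud", "loud"]
first = knn_predict(train_X, train_y, np.array([0.2, 0.3]))
second = knn_predict(train_X, train_y, np.array([5.5, 5.2]))
```

There is no training step beyond storing the labelled examples, which makes this a convenient baseline, although every prediction has to scan the whole training set.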
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters, noise and border points ignored)
  – DBScan
• Graph-based
83
Uncertainty and Time
Tuesday 22 October 13
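K-means, the first algorithm on the slide above, alternates between assigning points to the nearest centre and moving each centre to the mean of its points. The sketch below is a plain version with a fixed iteration count and random initial centres; the function name, the seed and the six sample points are illustrative assumptions.

```python
import numpy as np

def kmeans(X, k, iterations=20, seed=0):
    """Plain k-means: returns (labels, centres) after a fixed number of iterations."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centre.
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centre moves to the mean of its assigned points.
        for j in range(k):
            if np.any(labels == j):
                centres[j] = X[labels == j].mean(axis=0)
    return labels, centres

X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centres = kmeans(X, k=2)
```

The number of clusters K has to be supplied, as the formulation slide notes, and the result can depend on the initial centres when clusters are less clearly separated than in this toy data.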
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (Cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (Additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
87
Uncertainty and Time
Tuesday 22 October 13
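The goodness-of-fit measure on the slide above can be computed directly from the residuals. The sketch below fits a line by ordinary least squares with `numpy.linalg.lstsq` and then evaluates R squared; the helper names and the four data points are hypothetical.

```python
import numpy as np

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    A = np.column_stack([x, np.ones_like(x)])
    (slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
    return slope, intercept

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])       # roughly y = 2x + 1 with small deviations
slope, intercept = fit_line(x, y)
score = r_squared(y, slope * x + intercept)
```

A value near 1 means the model explains most of the variance in the target; computing it on held-out data rather than the training data gives a more honest picture.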
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two processing chains, each with Feature Extraction, Dimensionality Reduction, Feature Selection and Classification stages]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. can not “look” ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history – typically just previous time slice)
• Transition Model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
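[Figure: hidden Markov model. A hidden state (a single discrete variable, with example states labelled 1 to 4) evolves according to transition probabilities P(state at t | state at t-1); the observations are related to the hidden state through emission probabilities p(observation at t | state at t). Together the transition and emission probabilities form the model.]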
Tuesday 22 October 13
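The filtering task for a hidden Markov model reduces to a short predict-and-update loop (the forward algorithm). The sketch below assumes a discrete model given as NumPy arrays; the function name and the two-state probabilities are illustrative numbers, not parameters from the tutorial.

```python
import numpy as np

def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for each time step."""
    belief = np.asarray(prior, dtype=float)
    history = []
    for obs in observations:
        predicted = transition.T @ belief        # push belief through the transition model
        updated = emission[:, obs] * predicted   # weight by likelihood of the observation
        belief = updated / updated.sum()         # renormalize to a distribution
        history.append(belief)
    return np.array(history)

# Hypothetical two-state model: transition[i, j] = P(next = j | current = i),
# emission[i, o] = P(observation = o | state = i).
transition = np.array([[0.7, 0.3], [0.3, 0.7]])
emission = np.array([[0.9, 0.1], [0.2, 0.8]])
beliefs = forward_filter([0.5, 0.5], transition, emission, observations=[0, 0])
```

Each step uses only past and current observations, so the same loop can run causally on a live sensor stream; smoothing and most-likely-sequence inference need a backward pass or the Viterbi algorithm instead.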
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t->t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
102
Tuesday 22 October 13
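The prediction and update steps of the Kalman filter are easiest to see in one dimension. The sketch below assumes the simplest possible model (a constant state with a little process noise) and scalar measurements; the function name, the noise variances and the measurement values are hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Scalar Kalman filter for a constant-state model; returns the running estimates."""
    estimates = []
    for z in measurements:
        # Prediction: the state is assumed constant, uncertainty grows by the process noise.
        var = var + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append(mean)
    return estimates

noisy = [1.2, 0.9, 1.1, 0.8, 1.0, 1.05, 0.95, 1.1, 0.9, 1.0]
estimates = kalman_1d(noisy, process_var=1e-4, measurement_var=0.1)
```

The gain expresses the trade-off on the slide: a large measurement variance makes the filter trust its model and change slowly, while a large process variance makes it follow the measurements closely.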
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• upside
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• downside
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become first-class members in the design process
  - active collaborators vs. passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• upside
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• downside
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: two panels, "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution of sample means X1 and X2 with the critical value t(p,df) marked; two regions for rejecting the null hypothesis (two-tailed) or one region for rejecting the null hypothesis (one-tailed)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference (sed)
• compute t
note: df computation
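The slide's formulas did not survive extraction, so the three steps are sketched here using the standard independent-samples t-test with pooled (equal) variance. That choice is an assumption, and the function name pooled_t is hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic assuming equal variances."""
    n1, n2 = len(sample_a), len(sample_b)
    df = n1 + n2 - 2                      # degrees of freedom
    # Combined (pooled) variance of the two samples
    pooled_var = ((n1 - 1) * variance(sample_a)
                  + (n2 - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two means
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df
```

The returned t is then compared against the critical value t(p,df) for the chosen significance level.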
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g., search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
[Table header: two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]
Tuesday 22 October 13
ICASSP 2013 tutorial
ANOVA I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
ANOVA II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
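The "dividing two variances" step can be illustrated with a one-way ANOVA F ratio. This is a minimal sketch with a hypothetical function name; looking up the significance of F for the two df values is left to a table or statistics package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = len(all_values) - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

Rescaling or shifting every observation by the same amount leaves F unchanged, which is the independence of scaling and bias noted above.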
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mapping types, from Artificial Vocal Tract through Phoneme Generator, Finger Spelling, and Syllable Generator to Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500.]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - XY position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
137
Glove-TalkII System
[Figure: Glove-TalkII system block diagram, repeated from slide 134]
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
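The behaviour of normalized RBF units can be sketched as follows. This is an illustrative layer only, not the Glove-TalkII code: the function names, the shared width parameter and the output weight layout are assumptions.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations around each centre, normalized to sum to 1."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    nearest = min(d2)
    # Subtracting the nearest distance leaves the normalized result
    # unchanged but avoids all activations underflowing to zero
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_layer_output(x, centres, width, weights):
    """Weighted sum of normalized hidden units; weights[j] is one row per centre."""
    h = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(hj * wj[k] for hj, wj in zip(h, weights)) for k in range(n_out)]
```

Between two centres the output interpolates according to the width; far from every centre the closest unit takes almost all of the normalized activation, which produces the plateaus.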
Tuesday 22 October 13
141
Glove-TalkII System
[Figure: Glove-TalkII system block diagram, repeated from slide 134]
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives, and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
[Figure: Glove-TalkII system block diagram, repeated from slide 134]
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test, and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al., 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human-to-human interaction
  - Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI or user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP, or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCI/Music
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of the areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical
  - Voltage
  - Resistance
  - Impedance
Optical
  - Color
  - Intensity
Magnetic
  - Induced Current
  - Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
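One common way to do this conversion is a voltage divider read by an ADC. The sketch below assumes the sensor sits between the supply and the ADC pin, with a fixed resistor from the pin to ground; the 10 kΩ value, the 10-bit ADC range and the function name are illustrative assumptions, not taken from the slide.

```python
def sensor_resistance(adc_value, r_fixed=10_000.0, adc_max=1023):
    """Recover sensor resistance (ohms) from a voltage-divider ADC reading.

    Assumed wiring: Vcc - sensor - ADC pin - r_fixed - ground, so
    Vout / Vcc = r_fixed / (r_sensor + r_fixed).
    """
    if not 0 < adc_value <= adc_max:
        raise ValueError("reading must be in (0, adc_max]")
    ratio = adc_value / adc_max           # Vout / Vcc
    return r_fixed * (1.0 - ratio) / ratio
```

A reading at half scale means the sensor resistance equals the fixed resistor, which is why the fixed value is usually chosen near the middle of the sensor's range.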
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge, rows and columns, one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms, and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition, and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold; no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High-performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• A large diversity of control styles is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker/Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Fundamental building blocks of signal processing
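As a small illustration of a digital filter operating on samples, here is a one-pole low-pass (exponential smoothing) filter. The function name and the default smoothing coefficient alpha are assumptions chosen for the example.

```python
def one_pole_lowpass(samples, alpha=0.1):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]); smaller alpha smooths more."""
    out = []
    y = 0.0
    for i, x in enumerate(samples):
        # Start from the first sample to avoid a start-up transient
        y = x if i == 0 else y + alpha * (x - y)
        out.append(y)
    return out
```

Slowly varying input passes almost unchanged, while rapid sample-to-sample alternation (the highest frequency) is strongly attenuated. Such a filter is often enough for smoothing noisy sensor readings.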
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
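Three of these variants (fixed threshold, local maximum, spacing constraint) can be combined in a few lines. This is one possible sketch with hypothetical names; adaptive thresholds and interpolation of the peak position are left out.

```python
def find_peaks(signal, threshold=0.0, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] <= threshold:
            continue
        # Local maximum: higher than the left neighbour, not lower than the right
        if signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]:
            if peaks and i - peaks[-1][0] < min_spacing:
                # Too close to the previous peak: keep only the larger one
                if signal[i] > peaks[-1][1]:
                    peaks[-1] = (i, signal[i])
                continue
            peaks.append((i, signal[i]))
    return peaks
```

The output is the list of peak locations and amplitudes named in the last bullet.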
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
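The tradeoff for the two window sizes above can be made concrete: a 256-sample window at 22050 Hz spans about 11.6 ms with bins about 86 Hz apart, while a 4096-sample window spans about 186 ms with bins about 5.4 Hz apart. The helper name below is hypothetical.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, frequency bin spacing in Hz)."""
    time_resolution = window_size / sample_rate
    freq_resolution = sample_rate / window_size
    return time_resolution, freq_resolution

for n in (256, 4096):
    seconds, hertz = stft_resolution(n, 22050)
    print(f"{n} samples: {seconds * 1000:.1f} ms, {hertz:.1f} Hz per bin")
```

The product of the two values is always 1, so improving one resolution necessarily worsens the other.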
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data: frequently linear or quadratic
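For sensor data, linear interpolation is often sufficient. The sketch below resamples irregularly timed readings onto new time points; it assumes strictly increasing timestamps, holds the end values outside the measured range, and uses a hypothetical function name.

```python
import bisect

def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each entry of new_times."""
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])         # before the first sample: hold
            continue
        if t >= times[-1]:
            out.append(values[-1])        # after the last sample: hold
            continue
        j = bisect.bisect_right(times, t) # first sample strictly after t
        t0, t1 = times[j - 1], times[j]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j - 1] + frac * (values[j] - values[j - 1]))
    return out
```

This is a convenient way to bring asynchronous sensor events onto a common synchronous clock before further processing.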
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
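The range-finding example can be sketched as a small calibrator that records the minimum and maximum readings seen during a calibration phase and then rescales readings to 0..1. The class and method names are hypothetical.

```python
class RangeCalibrator:
    """Tracks the observed range of a sensor and normalizes readings to 0..1."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def observe(self, reading):
        # Widen the known range with every calibration reading
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def normalize(self, reading):
        if self.lo is None or self.hi == self.lo:
            return 0.0                    # no usable range observed yet
        scaled = (reading - self.lo) / (self.hi - self.lo)
        return min(1.0, max(0.0, scaled)) # clip readings outside the range
```

In practice the performer would move the sensor through its full range while observe is called, after which normalize gives a device-independent value.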
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency, or a slider to volume
• Representation dynamic range is different from the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
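Two illustrative mappings from a normalized control value in 0..1 are sketched below: an exponential knob-to-frequency map (a common choice for pitch, used here as an assumption rather than a power law) and a power-law slider-to-gain map. The names, frequency range and exponent are hypothetical.

```python
def knob_to_frequency(knob, f_min=20.0, f_max=20000.0):
    """Exponential map: equal knob steps give equal frequency ratios."""
    return f_min * (f_max / f_min) ** knob

def slider_to_gain(slider, exponent=2.0):
    """Power-law map: finer control at the quiet end of the slider."""
    return slider ** exponent
```

With these defaults the middle of the knob lands at the geometric mean of the range (about 632 Hz) rather than the arithmetic mean, which matches how pitch is perceived more closely than a linear map.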
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
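The slide's equations did not survive extraction. As an illustration, the sketch below computes the autocorrelation directly in the time domain (not the FFT-based method that the power-of-2 note refers to) and picks the lag with the strongest correlation as the period. Function names are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    m = sum(x) / n
    c = [v - m for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
```

The direct form costs O(N x max_lag); for long signals the FFT route is much cheaper, which is why the window length is usually a power of 2.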
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
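Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter centred at CF, with edges at CF-130 and CF+130 (linearly spaced) or CF/1.0718 and CF×1.0718 (log spaced)]
Pipeline: Mel-filtering -> Log -> DCT -> MFCCs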
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals it approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be discarded
• MFCCs keep the first 13-20
• MDCT (overlap-based) is used in MP3, AAC and Vorbis audio compression
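To make the energy-compaction point concrete, here is a small sketch of an unnormalised DCT-II and the "keep the first coefficients" step used for MFCCs. The function names and the default of 13 coefficients are illustrative assumptions. A real front end would also apply the mel filterbank and the log before this step.

```python
import math

def dct2(x):
    """Unnormalised DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]

def truncate_cepstrum(log_mel_energies, n_keep=13):
    """Keep only the first n_keep DCT coefficients of a log-mel spectrum."""
    return dct2(log_mel_energies)[:n_keep]
```

A constant input puts all of its energy into coefficient 0, which is the extreme case of compaction.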
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features
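The first two bullets can be sketched in a few lines. This assumes that features arrive as a list of equal-length frames, and the helper names are hypothetical.

```python
from statistics import mean, pvariance

def delta(frames):
    """First-order difference between consecutive feature frames."""
    return [
        [b - a for a, b in zip(prev, cur)]
        for prev, cur in zip(frames, frames[1:])
    ]

def aggregate(frames):
    """Per-dimension mean and (population) variance over a frame sequence."""
    columns = list(zip(*frames))
    return [mean(c) for c in columns], [pvariance(c) for c in columns]
```

The aggregate statistics turn a variable-length sequence into one fixed-length vector, which is convenient as input to a classifier.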
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest neighbors
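As an illustration of the simplest entry in this list, a k-nearest-neighbors classifier fits in a few lines. The data layout (a list of `(feature_vector, label)` pairs) and the function name are assumptions made for this sketch.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k closest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

There is no training step: the whole training set is kept and searched at prediction time, which can be costly for real-time use with large sets.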
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution-based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  - DBSCAN
• Graph-based
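A minimal sketch of plain k-means (Lloyd's iteration), assuming the caller supplies the initial centroids and a fixed number of iterations. The function name and signature are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda j: math.dist(p, centroids[j]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(v) / len(c) for v in zip(*c)) if c else centroids[j]
            for j, c in enumerate(clusters)
        ]
    return centroids, clusters
```

The result depends on the initial centroids, which is one reason the "variants" mentioned on the slide exist.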
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
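A small worked sketch of ordinary least squares with a single feature, followed by the coefficient of determination. The function names are hypothetical, and the single-feature restriction is a simplification for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot
```

An R squared of 1 means the predictions explain all of the variance in the target. Values near 0 mean the model does no better than predicting the mean.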
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: block diagram with two copies each of Feature Extraction, Dimensionality Reduction, Feature Selection and Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: HMM diagram. Labels: Model; Hidden State (single discrete variable) with states 1, 2, 3, 4; Observations; Transition Probs P( | ) from t-1 to t; Emission Probs p( | ) at t]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
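The predict/update cycle can be sketched for the simplest possible case: a scalar state that is assumed to stay constant apart from process noise. The parameter names `q` (process noise variance) and `r` (measurement noise variance) are conventional, but the function itself is only an illustration, not the tutorial's implementation.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the model x_t = x_{t-1} + noise, z_t = x_t + noise."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the state is carried over, uncertainty grows by q.
        p = p + q
        # Update step: the gain k weighs measurement against prediction.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates
```

The gain `k` is where the trade-off on the slide shows up: a large `r` (unreliable sensor) makes the filter trust its model, while a large `q` (unreliable model) makes it follow the measurements.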
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: two panels labelled "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
• Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: sample means X1 and X2 with the critical value t(p,df), showing the region or regions for rejecting the null hypothesis]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
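The formulas on this slide were images and did not survive extraction. The sketch below follows the three listed steps using the standard pooled-variance form of the independent two-sample t-test, with df = n1 + n2 - 2. That this is the exact variant shown on the slide is an assumption, and the function name is hypothetical.

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Step 1: combined (pooled) variance of the two samples.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    # Step 2: standard error of the difference between the means.
    sed = math.sqrt(pooled * (1 / n1 + 1 / n2))
    # Step 3: the t statistic.
    t = (mean(a) - mean(b)) / sed
    return t, df
```

The returned `t` is then compared with the critical value for the chosen significance level and `df`.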
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    httpwwwmedcalcbemanualmpage13-04bhtml
    httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
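The "dividing two variances" step can be made concrete for the one-way case. This is a sketch with a hypothetical function name: it returns the F ratio and the two degrees of freedom, and leaves the table lookup to the reader.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F ratio = between-group mean square / within-group mean square."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand = mean([x for g in groups for x in g])
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, k - 1, n - k
```

Rescaling or shifting every observation by the same constants leaves F unchanged, which is the invariance noted in the first bullet.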
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings along an axis of approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Systems cited: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: block diagram]
Inputs: Right Hand Data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz); Contact Switches; Foot Pedal
Blocks: Preprocessor; Fixed Pitch Mapping; V/C Decision Network; Vowel Network; Consonant Network; Fixed Stop Mapping; Combining Function (X); Synthesizer
Output: Speech Output
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (015)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 01 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
• Weights are centre
  - Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
• Closest RBF dominates
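A minimal sketch of the hidden layer described on this slide, assuming Gaussian units with a shared width and activations normalised over the whole group. The function name is hypothetical and the output layer is omitted.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian RBF activations, normalised so that they sum to 1."""
    raw = [
        math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2))
        for c in centres
    ]
    total = sum(raw)  # can underflow to 0.0 if x is very far from all centres
    return [r / total for r in raw]
```

Near a centre the corresponding unit takes almost all of the normalised activation (the plateau), while halfway between two centres the activation is shared, which gives the interpolation behaviour.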
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 005 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
[Figure: same block diagram as on slide 134]
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca   George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Control
8
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Continuous Control
9
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Human to human interaction and music performance
10
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Evolution of output devices
11
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
More output devices
12
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al. 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human-to-human interaction
  - Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project, Seattle, April 2001
Lectures / Discussions / Demos / Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more ...
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
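The circuit on this slide was an image. A common way to turn a resistance change into a voltage is a two-resistor voltage divider read by an ADC, and the sketch below assumes that arrangement. The 5 V reference and 10-bit resolution are illustrative defaults, not values from the tutorial.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantise a voltage to an integer ADC reading in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * levels)
```

With the sensor on the high side, a falling sensor resistance (for example, more force on an FSR) raises the measured voltage.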
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more...
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10 are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) touches the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD amp CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold; no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer (3-axis)
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
ICASSP 2013 tutorial
Tooka: a simple example (Fels et al
ICASSP 2013 tutorial
Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels are possible (for example microphone arrays)]
ICASSP 2013 tutorial
2D/3D/ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
ICASSP 2013 tutorial
Sensors <-> DSP <->
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
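A minimal sketch of the low-pass case, assuming a one-pole recursive filter. The function name and the smoothing coefficient `alpha` are illustrative choices, not taken from the slides.

```python
def one_pole_lowpass(samples, alpha):
    """Causal one-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha is an assumed smoothing coefficient in (0, 1]; smaller values
    attenuate high frequencies more strongly.
    """
    y = 0.0
    out = []
    for x in samples:
        y += alpha * (x - y)  # move a fraction of the way toward the input
        out.append(y)
    return out
```

Because each output depends only on the current input and the previous output, the filter operates sample by sample, which suits streaming sensor data.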
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
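One possible combination of these variants can be sketched as follows: a fixed threshold, a local-maximum test, a minimum spacing between peaks, and parabolic interpolation of the peak location. The function name and parameters are assumptions made for illustration.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima of x.

    threshold   : fixed amplitude a peak must exceed
    min_spacing : minimum distance in samples between accepted peaks
    Locations are refined by fitting a parabola through the peak and its
    two neighbours, so they may be fractional.
    """
    candidates = [
        i for i in range(1, len(x) - 1)
        if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]
    # Spacing constraint: accept the strongest candidates first.
    candidates.sort(key=lambda i: x[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    kept.sort()

    peaks = []
    for i in kept:
        a, b, c = x[i - 1], x[i], x[i + 1]
        denom = a - 2 * b + c
        offset = 0.0 if denom == 0 else 0.5 * (a - c) / denom
        amplitude = b - 0.25 * (a - c) * offset
        peaks.append((i + offset, amplitude))
    return peaks
```

An adaptive threshold could replace the fixed one by passing, for example, a multiple of a running median; that choice is part of the "secret sauce" and is left open here.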
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
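The trade-off for the two window sizes above can be worked out directly: a window of N samples at sample rate fs spans N/fs seconds and gives bins spaced fs/N Hz apart. The sketch below also builds a small spectrogram with a naive DFT and a Hann window; an FFT would be used in practice, and all function names are illustrative.

```python
import cmath
import math


def stft_resolution(window_size, sample_rate):
    """Return (time span in seconds, bin spacing in Hz) of one analysis window."""
    return window_size / sample_rate, sample_rate / window_size


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(N^2) DFT."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * k / n))
                for k, s in enumerate(frame)]
    return [
        abs(sum(w * cmath.exp(-2j * math.pi * b * k / n)
                for k, w in enumerate(windowed))) ** 2
        for b in range(n // 2 + 1)
    ]


def spectrogram(signal, window_size, hop):
    """Sequence of power spectra, one per hop."""
    return [power_spectrum(signal[start:start + window_size])
            for start in range(0, len(signal) - window_size + 1, hop)]
```

With these definitions, a 256-sample window at 22050 Hz covers about 11.6 ms with bins about 86 Hz apart, while a 4096-sample window covers about 186 ms with bins about 5.4 Hz apart.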
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
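For the sensor-data case, a linear interpolator is often enough. The sketch below resamples irregularly timed readings onto new time points; it assumes both time lists are sorted and that the new times lie inside the measured range. Names are illustrative.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times.

    Assumes times and new_times are sorted ascending, len(times) >= 2,
    and every new time lies within [times[0], times[-1]].
    """
    out = []
    j = 0
    for t in new_times:
        # Advance to the segment [times[j], times[j + 1]] containing t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out
```

The same routine can bring two sensor streams with different rates onto a common clock before they are combined.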
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  – Power law functions frequently used
• Frequently used in conjunction with calibration
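Calibration and scaling can be sketched together: first find the observed range of a sensor, then map a raw reading onto a control parameter with a power-law curve. The function names, the clamping behaviour and the example numbers are assumptions for illustration.

```python
def calibrate(readings):
    """Return the (minimum, maximum) observed during a calibration run."""
    return min(readings), max(readings)


def scale(raw, lo, hi, out_min, out_max, exponent=1.0):
    """Map raw from [lo, hi] to [out_min, out_max] with a power-law curve.

    exponent = 1 is a linear mapping; other values bend the curve.
    Readings outside the calibrated range are clamped.
    """
    norm = (raw - lo) / (hi - lo)
    norm = min(1.0, max(0.0, norm))
    return out_min + (out_max - out_min) * norm ** exponent
```

For example, with a calibrated range of 130 to 890, a raw reading of 510 sits halfway and maps to 0.25 of the output range when the exponent is 2.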
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
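A correlation-based period estimate can be sketched with a direct time-domain autocorrelation. This version costs O(N × max_lag) and does not use the FFT shortcut mentioned above; the function names and the lag search range are illustrative.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    c = [v - mean for v in x]
    return [sum(c[n] * c[n + lag] for n in range(len(c) - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
```

Cross-correlation follows the same pattern with two different signals in the product, and the lag of its maximum estimates the delay between them.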
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel scale: 13 linearly spaced filters, 27 log-spaced filters
Filter edges around centre frequency CF: CF − 130 and CF + 130 (linear); CF / 1.0718 and CF × 1.0718 (log)
Pipeline: Mel-filtering → Log → DCT → MFCCs
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
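Energy compaction can be checked numerically with a small orthonormal DCT-II. The sketch below is a direct O(N^2) implementation for illustration, not the method used by any particular MFCC library; the function names are assumptions.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a sequence (direct O(N^2) evaluation)."""
    n = len(x)
    out = []
    for i in range(n):
        s = sum(x[k] * math.cos(math.pi * (k + 0.5) * i / n) for k in range(n))
        scale = math.sqrt(1.0 / n) if i == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def energy_fraction(coeffs, keep):
    """Fraction of total energy held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total
```

For a smooth input such as the ramp 0, 1, ..., 7, the first two coefficients already hold more than 99% of the energy, which is the property that lets MFCCs discard the higher coefficients.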
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: example image reduced to 256 colors]
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification: Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
ICASSP 2013 tutorial
Classification: Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
  – Non-parametric
    • K-nearest Neighbors
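As a concrete instance of the non-parametric case, a k-nearest-neighbour classifier fits in a few lines. This is a brute-force sketch with Euclidean distance and majority voting; the function name and the toy labels are illustrative.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label of `query` by majority vote among its k nearest training vectors."""
    neighbours = sorted(zip(train_x, train_y),
                        key=lambda pair: math.dist(pair[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]
```

There is no training step: the labelled feature vectors are the model, so prediction cost grows with the size of the training set.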
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification: Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
ICASSP 2013 tutorial
Clustering: Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
ICASSP 2013 tutorial
Clustering: Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
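The K-means case can be sketched as alternating assignment and update steps. Initial centroids are passed in explicitly here to keep the example deterministic; real implementations usually pick them randomly or with a seeding heuristic. The function name is illustrative.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Basic K-means: returns (centroids, clusters) after a fixed number of passes."""
    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters
```

The number of clusters K is fixed by the length of the initial centroid list, matching the formulation in which K comes from domain knowledge.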
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering: Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels)
  – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
ICASSP 2013 tutorial
Regression: Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees
ICASSP 2013 tutorial
Regression: Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
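Ordinary least squares for a single feature and the coefficient of determination can be sketched as below. The function names are illustrative, and the single-feature restriction is an assumption made to keep the example short.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

An R squared of 1 means the predictions reproduce the targets exactly; a value near 0 means the model does no better than predicting the mean.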
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
“Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input, output, or intermediate stages
• Various possible architectures combining machine learning building blocks
ICASSP 2013 tutorial
Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
ICASSP 2013 tutorial
Hidden Markov Models I
[Figure: HMM with states 1-4. The model consists of transition probabilities P(state at t | state at t-1) over the hidden state (a single discrete variable) and emission probabilities p(observation at t | state at t) that generate the observations]
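The filtering task for a discrete hidden state can be sketched as the forward recursion: predict with the transition model, weight by the emission probability of the new observation, then normalize. The function name and the matrix layout (`transition[i][j]` = P(next = j | current = i), `emission[j][o]` = P(observation = o | state = j)) are assumptions of this sketch.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered belief over hidden states after each observation."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by how well each state explains the observation.
        updated = [predicted[j] * emission[j][obs] for j in range(n)]
        norm = sum(updated)
        belief = [u / norm for u in updated]
        history.append(belief)
    return history
```

Only the previous belief is needed at each step, so the recursion is causal and runs in constant memory, which makes it usable on live sensor streams.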
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
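The predict/update cycle can be sketched for the simplest case: a scalar state assumed constant between steps, so the mean and covariance reduce to two numbers. The function name, the default prior, and the variance parameters are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement.

    x is the state mean and p its variance. process_var expresses how much
    the model is trusted, measurement_var how much the sensor is trusted.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction: state assumed unchanged, uncertainty grows.
        p = p + process_var
        # Update: the gain balances model against measurement reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates
```

A large `measurement_var` makes the filter smooth heavily and react slowly; a large `process_var` makes it follow the measurements closely.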
ICASSP 2013 tutorial
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: two photos captioned “The Wizard” and “What the user sees”]
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with the critical value t(p, df) marked, showing either two regions or a single region for rejecting the null hypothesis; axis marks X1, X2]
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
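The formulas on this slide did not survive extraction. The steps can be sketched with the standard independent-samples (pooled-variance) form, which is an assumption about what the slide showed; the function name is illustrative.

```python
import math
import statistics


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples.
    pooled = ((n1 - 1) * statistics.variance(a)
              + (n2 - 1) * statistics.variance(b)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    t = (statistics.fmean(a) - statistics.fmean(b)) / sed
    return t, df
```

For the samples [2, 4, 6] and [1, 2, 3] this gives t of about 1.549 with df = 4. The two-tailed critical value at α = 0.05 for df = 4 is 2.776, so the null hypothesis would not be rejected.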
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
ICASSP 2013 tutorial
ANOVA I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
ICASSP 2013 tutorial
ANOVA II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
ICASSP 2013 tutorial
Integration and
I&I: Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, Chuck, Supercollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Sources: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: mappings ordered along an axis of approximate time per gesture for connected speech (msec), with tick marks at 10-30, 100, 130, 200, 500]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions
  – constrained subspace
• Output: 10 dimensions
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Glove-TalkII System
[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, and foot pedal. A preprocessor feeds the Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network and Fixed Stop Mapping; a Combining Function drives the Synthesizer, which produces the Speech Output]
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0% error
  – Testing
    • 1380 consonants, 234 vowels
    • 0% error
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1% error
• Testing
  – 50 examples of each vowel
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
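The behaviour of normalized RBF units can be sketched as Gaussian activations divided by their sum, followed by a weighted sum for the output layer. This is a generic illustration, not the Glove-TalkII implementation; the function names, the shared width, and the example weights are assumptions.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations of each centre, normalized to sum to 1."""
    sq = [sum((a - b) ** 2 for a, b in zip(x, c)) for c in centres]
    # Subtracting the smallest distance avoids underflow far from all
    # centres and cancels out in the normalization.
    nearest = min(sq)
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in sq]
    total = sum(acts)
    return [a / total for a in acts]


def rbf_output(x, centres, width, weights):
    """Output layer: weighted sum of normalized hidden activations.

    weights[j] is the output vector associated with hidden unit j.
    """
    h = normalized_rbf(x, centres, width)
    return [sum(h[j] * weights[j][k] for j in range(len(h)))
            for k in range(len(weights[0]))]
```

Near a centre its unit takes almost all of the normalized activation, giving the plateau; between centres the outputs are interpolated, with the width controlling how sharp the transition is.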
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Continuous Control
9
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Human to human interaction and music performance
10
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Evolution of output devices
11
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
More output devices
12
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al. 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human to human interaction
  - Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration - putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
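One common way to turn a resistance change into a voltage is a voltage divider followed by an analog-to-digital converter. The sketch below assumes a 5 V supply, a 10-bit converter, and a fixed resistor in series with the sensor; these values and the function names are illustrative and do not come from the slide.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor when it is in series with the sensor."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)  # an ADC cannot read outside its range
    return round(clamped / v_ref * levels)

# A lower sensor resistance (e.g. a pressed force-sensing resistor) raises the output
pressed = adc_code(divider_voltage(r_sensor=1_000, r_fixed=10_000))
released = adc_code(divider_voltage(r_sensor=100_000, r_fixed=10_000))
```

The fixed resistor is usually chosen near the middle of the sensor's resistance range so that the output voltage uses as much of the converter's range as possible.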
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10 are frequently observed)
• An easy force sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction - moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50mW vs 1-2W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
[Figure: asynchronous events and synchronous samples, each plotted against time; multiple channels (for example microphone arrays)]
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
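The variants listed above can be combined in a few lines. The sketch below is one possible arrangement rather than the tutorial's own implementation: a fixed threshold, a local-maximum test, a greedy spacing constraint that keeps the earlier of two close peaks, and parabolic interpolation through three samples. The name `find_peaks` and its parameters are illustrative.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (position, amplitude) pairs for local maxima at or above threshold.

    Positions and amplitudes are refined by fitting a parabola through the
    peak sample and its two neighbours.
    """
    peaks = []
    last_index = -min_spacing  # lets the very first candidate pass the spacing test
    for i in range(1, len(signal) - 1):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        if not (mid > left and mid >= right):  # local maximum test
            continue
        if mid < threshold:                    # fixed threshold
            continue
        if i - last_index < min_spacing:       # spacing constraint (greedy)
            continue
        denom = left - 2.0 * mid + right       # curvature of the fitted parabola
        offset = 0.5 * (left - right) / denom if denom != 0 else 0.0
        amplitude = mid - 0.25 * (left - right) * offset
        peaks.append((i + offset, amplitude))
        last_index = i
    return peaks

signal = [0, 1, 3, 1, 0, 0, 2, 5, 2, 0]
print(find_peaks(signal, threshold=2.5))
```

An adaptive threshold could replace the fixed one by passing, for example, a multiple of a running median instead of a constant.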
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time varying power spectra (3D with animation or 2D)
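The time-frequency tradeoff can be made concrete for the two window sizes shown on the slide. This small sketch (the function name is illustrative) computes the duration covered by one window and the spacing between DFT bins at a 22050 Hz sampling rate.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, DFT bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz

for size in (256, 4096):
    duration_ms, bin_hz = stft_resolution(size, 22050)
    print(f"{size} samples: {duration_ms:.1f} ms per window, {bin_hz:.1f} Hz per bin")
```

The short window localizes events to roughly 12 ms but smears frequency over about 86 Hz per bin; the long window resolves about 5 Hz per bin but averages over roughly 186 ms.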
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion - preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
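For sensor data, the linear case mentioned in the last bullet is often enough. The sketch below resamples irregularly timed readings onto a uniform grid; it assumes strictly increasing timestamps in seconds, and the function name and arguments are illustrative.

```python
def resample_linear(times, values, rate_hz):
    """Resample (time, value) samples onto a uniform grid by linear interpolation.

    Assumes `times` is strictly increasing and has at least two entries.
    """
    if len(times) < 2 or len(times) != len(values):
        raise ValueError("need at least two samples and matching lengths")
    step = 1.0 / rate_hz
    # Number of grid points that fit between the first and last timestamp
    count = int((times[-1] - times[0]) * rate_hz + 1e-9) + 1
    out = []
    j = 0  # index of the segment [times[j], times[j + 1]] containing t
    for k in range(count):
        t = times[0] + k * step
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

# Readings that arrived at uneven times, resampled to 10 Hz
uniform = resample_linear([0.0, 0.1, 0.4], [0.0, 1.0, 4.0], rate_hz=10)
```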
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
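Range calibration and power-law scaling fit together naturally. The sketch below first finds the minimum and maximum of some recorded readings, then maps a new reading into a control range through a power law. The sensor values, the 20 to 20000 Hz target range, and the exponent of 2 are illustrative assumptions.

```python
def calibrate(readings):
    """Return the observed (minimum, maximum) of a calibration recording."""
    return min(readings), max(readings)

def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from the calibrated range [lo, hi] to [out_min, out_max].

    exponent=1 is a linear map; other exponents give a power-law response.
    """
    if hi == lo:
        raise ValueError("calibration range is empty")
    norm = (reading - lo) / (hi - lo)
    norm = min(max(norm, 0.0), 1.0)  # clip readings outside the calibrated range
    return out_min + (norm ** exponent) * (out_max - out_min)

lo, hi = calibrate([112, 530, 901, 300])            # e.g. raw knob readings
freq = scale(506.5, lo, hi, 20.0, 20000.0, exponent=2.0)
```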
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
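A minimal sketch of autocorrelation-based period estimation. It uses the direct O(N^2) sum for clarity; the efficient computation mentioned on the slide would normally go through the FFT. The peak-picking rule here (skip the main lobe around lag 0, then take the largest value) is one simple choice among many, and the function names are illustrative.

```python
import math

def autocorrelation(x):
    """Autocorrelation of the mean-removed signal for lags 0..len(x)-1."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    return [sum(centred[i] * centred[i + lag] for i in range(n - lag))
            for lag in range(n)]

def dominant_period(x):
    """Lag (in samples) of the strongest periodicity, or None if none is found."""
    r = autocorrelation(x)
    lag = 1
    # Skip the main lobe around lag 0: advance until the autocorrelation turns non-positive
    while lag < len(r) and r[lag] > 0:
        lag += 1
    if lag >= len(r):
        return None
    # The largest value after the main lobe marks the strongest periodicity
    return max(range(lag, len(r)), key=lambda k: r[k])

# A sinusoid that repeats every 20 samples
tone = [math.sin(2 * math.pi * i / 20) for i in range(200)]
period = dominant_period(tone)
```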
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
[Figure: Mel-scale filterbank with 13 linearly-spaced filters and 27 log-spaced filters, each filter positioned relative to its centre frequency CF]
Mel-filtering -> Log -> DCT -> MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
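As an illustration of the first entry in the list, here is a minimal K-means sketch (Lloyd's algorithm) in plain Python. It assumes points are tuples of floats, uses random initial centres with a fixed seed, and runs a fixed number of iterations instead of testing for convergence; all names are illustrative.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20, seed=0):
    """Return (centres, labels) after a fixed number of Lloyd iterations."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # initial centres drawn from the data
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centre
        labels = [min(range(k), key=lambda c: sq_dist(p, centres[c])) for p in points]
        # Update step: each centre moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centre
                centres[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centres, labels

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centres, labels = kmeans(points, k=2)
```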
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
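The goodness-of-fit measure can be sketched with a one-variable ordinary least squares fit. The data below are made up for illustration, and the function names are not from the tutorial.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 4.0, 8.0]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```

A value near 1 means the line explains most of the variance in the targets; a value near 0 means it does little better than predicting the mean.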
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history - typically just previous time slice)
• Transition Model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
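[Figure: hidden Markov model. Hidden state (single discrete variable) with nodes labelled 1 to 4; observed variables (observations). The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t)]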
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
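A minimal one-dimensional sketch of the predict and update cycle. It assumes the simplest linear model (the state stays the same apart from Gaussian process noise), so the mean and covariance reduce to two scalars; the noise variances and names are illustrative. The gain `k` expresses the trade-off in the last bullet: it moves toward 1 when measurements are trusted more than the model and toward 0 in the opposite case.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Filter a stream of noisy scalar measurements; returns the state estimates."""
    x, p = x0, p0  # state mean and variance (prior knowledge of state)
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed constant, so only uncertainty grows
        p = p + process_var
        # Update step: compare the prediction with the measurement
        k = p / (p + measurement_var)  # Kalman gain
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# The estimate moves from the prior (0.0) toward a repeated reading of 5.0
estimates = kalman_1d([5.0] * 50, process_var=0.01, measurement_var=1.0)
```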
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings - detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: the Wizard; what the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions, work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of same population
  - p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One and two-tailed variants for p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions of X1 and X2 with the critical value t(p,df) marked; two-tailed case (regions for rejecting the null hypothesis) or one-tailed case (region for rejecting the null hypothesis)]
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
• note df computation
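The formulas on this slide did not survive extraction. As a hedged sketch, the steps listed above can be written out for the independent two-sample case with pooled variance (equal-variance assumption). The function name and the sample values are illustrative, not taken from the tutorial.

```python
import math

def pooled_t_test(sample1, sample2):
    """Independent two-sample t statistic using the pooled (combined) variance."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    # Sums of squared deviations from each sample mean
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2                      # degrees of freedom
    pooled_var = (ss1 + ss2) / df         # combined variance of the two samples
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))  # standard error of difference
    t = (mean1 - mean2) / sed
    return t, df

# Hypothetical task scores for a control and a treatment group
control = [12.0, 14.0, 11.0, 13.0, 15.0]
treatment = [16.0, 18.0, 15.0, 17.0, 19.0]
t, df = pooled_t_test(treatment, control)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared with the critical value t(p,df) from a table, or converted to a p-value with a statistics package.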
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
[Table of critical t values; column header: two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
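A minimal sketch of the one-way case described above: partition the variance into between-group and within-group parts and divide the two mean squares to get the F ratio. The group data are made up for illustration; the last lines check the claim that scaling and bias do not change the result.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical measurements for three interface conditions
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_b, df_w = one_way_anova_f(groups)

# Same data after scaling by 10 and adding a bias of 5
rescaled = [[10.0 * x + 5.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
print(f"F = {f_ratio:.3f} (df {df_b}, {df_w}), rescaled F = {f_rescaled:.3f}")
```

The F ratio is then looked up against the F distribution with (df_between, df_within) degrees of freedom.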
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Spectrum of Gesture-to-Speech Mappings
[Figure: mapping types arranged along an axis of approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
GTII Mapping
• Input: 26+ dimensions
• constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Glove-TalkII System
[Block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, V/C Decision Network, Vowel Network, Consonant Network, Fixed Pitch Mapping, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech]
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0% error
  – Testing
    • 1380 consonants, 234 vowels
    • 0% error
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1% error
• Testing
  – 50 examples of each vowel
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
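To make the normalization concrete, here is a small sketch of a layer of normalized Gaussian RBF hidden units. It is not the Glove-TalkII implementation: the centres are invented, and the width of 0.15 is only borrowed from the "fixed std deviation" mentioned for the vowel network.

```python
import math

def normalized_rbf(x, centres, width):
    """Normalized Gaussian activations of the hidden units for input x."""
    # Log-activations: negative squared distance scaled by the width parameter
    logits = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        logits.append(-dist_sq / (2.0 * width ** 2))
    # Subtract the maximum before exponentiating to avoid underflow to all zeros
    peak = max(logits)
    raw = [math.exp(v - peak) for v in logits]
    total = sum(raw)
    return [r / total for r in raw]   # activations sum to 1

# Hypothetical 2-D centres (e.g. x,y hand positions for three vowels)
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
near_first = normalized_rbf((0.1, 0.0), centres, width=0.15)
midway = normalized_rbf((0.5, 0.0), centres, width=0.15)
print([round(a, 3) for a in near_first])
print([round(a, 3) for a in midway])
```

Near a centre the closest unit dominates (the plateau), while halfway between two centres their activations are equal, which is where interpolation between the associated outputs happens.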
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Glove-TalkII
DIVA in the hands of
Magic Eyes
Leveraging traditional
Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina
Percussion Robots
Tele-operation
Drum sound classification
Self-calibration and mapping based on listening
Physical Modeling
System Architecture
Feedback Loop
Playing a piece
Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable, Music Technology Group (2006)
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al., 2009)
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and
16
Motivation and Overview
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
18
Motivation and Overview
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
New Interfaces for Musical Expression (NIME)
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances
20
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
24
Sensors and Actuators
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more ...
25
Sensors and Actuators
Transduction and Digitizing
• Electrical: Voltage, Resistance, Impedance
• Optical: Color, Intensity
• Magnetic: Induced Current, Field Direction
26
Sensors and Actuators
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
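One common way to do this conversion is a voltage divider read by an analog-to-digital converter. The sketch below inverts the divider equation to recover the sensor resistance from a raw ADC count. The circuit layout (supply, sensor, ADC node, fixed resistor, ground), the 10-bit range, the 5 V supply and the 10 kΩ resistor are all assumptions for illustration, not values from the tutorial.

```python
def sensor_resistance(adc_value, adc_max=1023, v_supply=5.0, r_fixed=10_000.0):
    """Estimate a variable sensor resistance from a voltage-divider ADC reading.

    Assumed circuit: v_supply -> sensor -> ADC node -> r_fixed -> ground,
    so v_out = v_supply * r_fixed / (r_sensor + r_fixed).
    """
    v_out = v_supply * adc_value / adc_max
    if v_out <= 0.0:
        return float("inf")           # no current flowing: open circuit
    return r_fixed * (v_supply - v_out) / v_out

for reading in (0, 341, 1023):
    print(reading, sensor_resistance(reading))
```

With these assumed values, a reading at one third of full scale corresponds to a sensor resistance of about twice the fixed resistor.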
Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more ...
28
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
35
Sensors and Actuators
CCD & CMOS Camera
36
Sensors and Actuators
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancelation
• Cheap
38
Sensors and Actuators
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage->MIDI signal)
• Arduino board
46
Tooka a simple example (Fels et al
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
2D/3D, ND + time
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
50
Sensors and Actuators
Sensors <-> DSP <->
51
Sensors and Actuators
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
54
Signals and Features
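As a sketch of how several of these variants combine, the following picks local maxima above a fixed threshold, enforces a minimum spacing, and refines each location with parabolic interpolation through the three samples around the peak. The function names, threshold and test signal are illustrative assumptions.

```python
def detect_peaks(signal, threshold, min_spacing):
    """Return (index, amplitude) of local maxima above threshold, at least min_spacing apart."""
    peaks = []
    for i in range(1, len(signal) - 1):
        if signal[i] < threshold:
            continue
        if not (signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]):
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger of the two
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks

def refine_peak(signal, i):
    """Parabolic interpolation around sample i: returns (fractional index, amplitude)."""
    a, b, c = signal[i - 1], signal[i], signal[i + 1]
    denom = a - 2.0 * b + c
    if denom == 0.0:
        return float(i), b
    offset = 0.5 * (a - c) / denom
    return i + offset, b - 0.25 * (a - c) * offset

envelope = [0.0, 1.0, 3.0, 1.0, 0.0, 0.5, 0.2, 0.0, 2.0, 4.0, 3.0, 0.0]
peaks = detect_peaks(envelope, threshold=1.0, min_spacing=3)
refined = [refine_peak(envelope, i) for i, _ in peaks]
print(peaks)
print(refined)
```

An adaptive threshold (for example a running median of the signal) could replace the fixed one without changing the rest of the structure.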
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform
56
Signals and Features
Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
57
Signals and Features
Wavelets
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
58
Signals and Features
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering, 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
60
Signals and Features
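For the sensor-data case mentioned last, a minimal sketch of linear interpolation: irregularly timestamped readings are resampled onto a uniform grid. The function name, timestamps and values are made up for illustration, and timestamps are assumed to be strictly increasing.

```python
def resample_linear(times, values, rate_hz):
    """Linearly interpolate (time, value) samples onto a uniform grid at rate_hz.

    times must be strictly increasing and contain at least two entries.
    """
    if len(times) < 2:
        raise ValueError("need at least two samples")
    step = 1.0 / rate_hz
    count = int((times[-1] - times[0]) * rate_hz + 1e-9) + 1
    out = []
    j = 0  # index of the input segment [times[j], times[j + 1]] containing t
    for k in range(count):
        t = times[0] + k * step
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

# Hypothetical asynchronous sensor events (seconds, reading)
times = [0.0, 0.1, 0.25, 0.5]
values = [0.0, 1.0, 4.0, 9.0]
uniform = resample_linear(times, values, rate_hz=10)
print([round(v, 3) for v in uniform])
```

Quadratic or band-limited interpolation would follow the same pattern with a different per-segment formula, at higher computational cost.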
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
62
Signals and Features
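A small sketch of the two mappings mentioned above, assuming the sensor range has already been found by calibration. The ranges (a 10-bit knob, 110 to 880 Hz) and function names are illustrative assumptions; the exponential map is used for frequency because equal knob movements then give equal musical intervals.

```python
def scale_power(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a calibrated sensor reading to an output range through a power law."""
    norm = (reading - in_min) / (in_max - in_min)
    norm = min(1.0, max(0.0, norm))          # clip readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent

def scale_exponential(reading, in_min, in_max, out_min, out_max):
    """Map a reading so that equal input steps give equal output ratios (out_min > 0)."""
    norm = (reading - in_min) / (in_max - in_min)
    norm = min(1.0, max(0.0, norm))
    return out_min * (out_max / out_min) ** norm

# Hypothetical 10-bit knob mapped to 110-880 Hz, and a slider mapped to gain
low = scale_exponential(0, 0, 1023, 110.0, 880.0)
high = scale_exponential(1023, 0, 1023, 110.0, 880.0)
gain = scale_power(512, 0, 1024, 0.0, 1.0, exponent=2.0)
print(low, high, gain)
```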
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
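The formula on this slide was an image. As a sketch, the autocorrelation can be computed directly in the time domain and its strongest peak used as a period estimate. This direct form costs O(N x lags); the efficient power-of-2 computation mentioned above would use the FFT instead. The lag limits and the test signal are assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag (direct form)."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]

def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation inside [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Hypothetical periodic gesture signal: a sinusoid repeating every 25 samples
signal = [math.sin(2.0 * math.pi * i / 25.0) for i in range(200)]
period = estimate_period(signal, min_lag=5, max_lag=60)
print(period)
```

The lower lag limit excludes the trivial maximum near lag 0; cross-correlation between two different signals follows the same pattern with the second signal in place of the shifted copy.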
Similarity Matrix
66
Signals and Features
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
67
Signals and Features
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Mel Frequency Cepstral Coefficients
[Figure: mel-scale filterbank with 13 linearly-spaced filters and 27 log-spaced filters; filter labels CF, CF-130, CF+130 and a spacing factor of 1.0718. Processing chain: Mel-filtering, Log, DCT, MFCCs]
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
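To illustrate energy compaction, here is a sketch of an orthonormal DCT-II applied to a smooth 8-point ramp. The signal is an arbitrary example; with the orthonormal scaling the total energy is preserved, so the share held by the first coefficients can be read off directly.

```python
import math

def dct_ii_orthonormal(x):
    """Orthonormal DCT-II of a list of samples (direct O(N^2) form)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs

ramp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
coeffs = dct_ii_orthonormal(ramp)
energy = sum(c * c for c in coeffs)
low_fraction = sum(c * c for c in coeffs[:2]) / energy
print(f"fraction of energy in first 2 of 8 coefficients: {low_fraction:.4f}")
```

For this smooth input nearly all of the energy sits in the first two coefficients, which is the property that lets MFCCs keep only the low-order terms.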
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
79
Uncertainty and Time
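As a concrete instance of the formulation, a minimal k-nearest-neighbors classifier: the training set is a list of feature vectors with ground truth labels, and a query is assigned the majority label among its k closest training points. The two-dimensional features and the "kick"/"snare" labels are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest training vectors."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D audio features for two drum classes
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
train_y = ["kick", "kick", "kick", "snare", "snare", "snare"]

print(knn_predict(train_x, train_y, (0.2, 0.2)))
print(knn_predict(train_x, train_y, (0.7, 0.9)))
```

Being non-parametric, the method has no training step beyond storing the examples; the cost is paid at prediction time.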
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided, based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
83
Uncertainty and Time
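A minimal sketch of plain K-means (Lloyd's iteration): alternate between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The initial centroids are passed in explicitly here, which is an assumption made to keep the example deterministic; practical implementations usually pick them at random or with a seeding heuristic.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Run a fixed number of K-means iterations; returns (centroids, clusters)."""
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster (keep it if empty)
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[c]
            for c, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical unlabeled 2-D feature vectors forming two groups
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```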
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
87
Uncertainty and Time
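A short sketch tying the formulation to the evaluation: fit a line by ordinary least squares, then compute R squared as one minus the ratio of residual to total sum of squares. The data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical sensor value (x) against a measured control parameter (y)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, R^2 = {r2:.4f}")
```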
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Two case studies
Regression
Classification
Some Results
Uncertainty and Time
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
94
Uncertainty and Time
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
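To make the distinction concrete, here is a sketch contrasting a causal moving average with a centered (non-causal) one. The function names and the test signal are illustrative only.

```python
def causal_average(x, width):
    """Moving average that only uses the current and past samples."""
    out = []
    for n in range(len(x)):
        window = x[max(0, n - width + 1): n + 1]
        out.append(sum(window) / len(window))
    return out

def centered_average(x, half_width):
    """Non-causal moving average: output n needs samples up to n + half_width."""
    out = []
    for n in range(len(x)):
        window = x[max(0, n - half_width): n + half_width + 1]
        out.append(sum(window) / len(window))
    return out

step = [0, 0, 0, 3, 3, 3]            # the input jumps at index 3
causal = causal_average(step, 3)     # starts rising at index 3
centered = centered_average(step, 1) # already rising at index 2
```

The centered version reacts one sample before the jump happens, which is only possible offline or by delaying the output.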
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
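The slide content here is a figure. As a reminder of the technique, a textbook dynamic-programming sketch of DTW for two 1-D sequences follows; the absolute-difference local cost and the function name are assumptions, not something taken from the slide.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A gesture played at a different speed, e.g. `[1, 2, 3]` against `[1, 2, 2, 3]`, aligns with zero cost, whereas a sample-by-sample comparison would not even be defined for the unequal lengths.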
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and
99
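A sketch of the filtering task for a discrete hidden state, using the standard forward recursion (predict with the transition model, weight by the observation likelihood, normalize). The two-state model and all probabilities below are made up for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each t (discrete model)."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Hypothetical model: state 0 = "playing", state 1 = "resting";
# observation 0 = "sensor high", observation 1 = "sensor low".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]  # transition[i][j] = P(next = j | current = i)
emission = [[0.9, 0.1], [0.2, 0.8]]    # emission[j][o] = P(obs = o | state = j)
history = forward_filter(prior, transition, emission, [0, 0])
```

The recursion is causal: each belief depends only on evidence to date, so it can run alongside the sensor stream.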
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: hidden Markov model. Hidden state (single discrete variable) with transition probabilities P(state at t | state at t-1); observations with emission probabilities p(observation at t | state at t)]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian (LG) conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
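A scalar sketch of the predict/update cycle, assuming the simplest possible linear model (the state stays the same between steps, plus Gaussian noise). The variable names and noise values are illustrative; a real sensor would need its own variances.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed to stay constant between steps."""
    x, p = x0, p0  # state estimate (mean) and its variance
    estimates = []
    for z in measurements:
        # Prediction: the model says "same value"; uncertainty grows.
        p = p + process_var
        # Update: the gain balances model reliability against sensor reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates
```

With a large `measurement_var` the gain is close to 0 and the filter trusts its model; with a small one the gain is close to 1 and it follows the measurements.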
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers: taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
Student t-test
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: regions (two-tailed) or region (one-tailed) for rejecting the null hypothesis, beyond the critical value t(p,df); axis marks X1, X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
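The formulas on this slide did not survive extraction. The sketch below follows the three steps using the standard pooled-variance (independent, equal-variance) form of the t-test, which is an assumption about what the slide showed; the function name and the sample data are made up.

```python
import math

def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df                         # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))   # std error of difference
    return (mean_a - mean_b) / sed, df

t, df = pooled_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
```

For this toy data |t| is 2.0 with df = 8, which is below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would not be rejected.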
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
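A sketch of the "dividing two variances" step for the one-way case: the F ratio is the between-group mean square over the within-group mean square. The function name and the toy groups are illustrative.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        mean_g = sum(g) / len(g)
        ss_between += len(g) * (mean_g - grand_mean) ** 2
        ss_within += sum((x - mean_g) ** 2 for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

f, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
```

Rescaling or shifting every value (say `10 * x + 5`) leaves F unchanged, and with exactly two groups F equals the square of the pooled t statistic.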
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of gesture-to-speech mappings, ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Systems: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions
  • constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram]
Inputs: Right Hand Data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz); Contact Switches; Foot Pedal
Processing: Preprocessor; Fixed Pitch Mapping; V/C Decision Network; Vowel Network; Consonant Network; Fixed Stop Mapping; Combining Function
Output: Synthesizer -> Speech Output
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0% error
  – Testing
    • 1380 consonants, 234 vowels
    • 0% error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1% error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
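A sketch of the hidden-layer behaviour described above: Gaussian units centred on stored points, with activations normalized over the group. The centres below are made up; the width of 0.15 echoes the fixed standard deviation quoted for the vowel network, and the function name is illustrative.

```python
import math

def normalized_rbf(x, centres, width):
    """Activations of Gaussian RBF units, normalized to sum to 1.

    Note: if x is extremely far from every centre, all raw activations
    underflow to 0 and the division fails; this sketch does not guard for that.
    """
    raw = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        raw.append(math.exp(-dist_sq / (2.0 * width ** 2)))
    total = sum(raw)
    return [r / total for r in raw]

centres = [(0.0, 0.0), (1.0, 0.0)]                   # hypothetical vowel targets
near_first = normalized_rbf((0.1, 0.0), centres, 0.15)
midway = normalized_rbf((0.5, 0.0), centres, 0.15)
```

Close to a centre the nearest unit takes almost all of the activation (the plateau); halfway between two centres they share it equally, which is where the interpolation happens.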
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
[Figure: Glove-TalkII system block diagram (repeated)]
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al. 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
• Pick control space
• Pick sound space
• Pick mapping
• Connect with software
• Compose and practice
• Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
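One common way to do this conversion is a voltage divider read by an ADC. The sketch below assumes that circuit (the slide's own figure is not available), an ideal converter, and illustrative component values; both function names are hypothetical.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a two-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Ideal ADC: map 0..v_ref onto integer codes 0..2**bits - 1."""
    clipped = min(max(voltage, 0.0), v_ref)
    return int(clipped / v_ref * (2 ** bits - 1))

# A sensor whose resistance equals the fixed resistor sits at mid-scale.
v = divider_voltage(5.0, 10000.0, 10000.0)
code = adc_counts(v)
```

As the sensor resistance drops (for example when a force-sensing resistor is pressed), the voltage across the fixed resistor rises and so does the ADC code.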
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button
Force-sensing resistors
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
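As a minimal example of a digital filter operating on samples, here is a one-pole low-pass (exponential smoother). The choice of filter and the coefficient value are illustrative, not taken from the slides.

```python
def one_pole_lowpass(x, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1."""
    y = []
    prev = 0.0  # filter state, assumed to start at rest
    for sample in x:
        prev = alpha * sample + (1.0 - alpha) * prev
        y.append(prev)
    return y

slow = one_pole_lowpass([1.0] * 200, 0.1)        # constant (0 Hz) input
fast = one_pole_lowpass([1.0, -1.0] * 100, 0.1)  # fastest possible alternation
```

The constant input passes through almost unchanged once the filter settles, while the alternating input is strongly attenuated, which is what makes this a quick way to smooth jittery sensor readings.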
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
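A sketch combining three of the variants listed above: a fixed threshold, a local-maximum test and a spacing constraint. The tie-breaking rule (keep the larger of two close peaks) and the example signal are assumptions made for illustration.

```python
def find_peaks(x, threshold, min_spacing):
    """Local maxima above a fixed threshold, at least min_spacing samples apart.

    Returns a list of (location, amplitude) pairs.
    """
    peaks = []
    for n in range(1, len(x) - 1):
        is_local_max = x[n - 1] < x[n] >= x[n + 1]
        if not is_local_max or x[n] < threshold:
            continue
        if peaks and n - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger one.
            if x[n] > peaks[-1][1]:
                peaks[-1] = (n, x[n])
            continue
        peaks.append((n, x[n]))
    return peaks

signal = [0, 1, 0, 0.2, 0, 3, 2.5, 2.8, 0, 0, 2, 0]
peaks = find_peaks(signal, threshold=0.5, min_spacing=3)
```

Here the small bump at index 3 fails the threshold and the secondary maximum at index 7 is suppressed by the spacing constraint.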
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
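The trade-off between the two window sizes shown can be put in numbers: a window of N samples at sample rate fs spans N/fs seconds and gives FFT bins spaced fs/N Hz apart. A small sketch (the function name is illustrative):

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

short = stft_resolution(256, 22050)   # about 11.6 ms, about 86.1 Hz per bin
long_ = stft_resolution(4096, 22050)  # about 185.8 ms, about 5.4 Hz per bin
```

The product of the two numbers is always 1, so sharpening one resolution necessarily blurs the other.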
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain
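A 1-D sketch of median filtering, shown on a plain sample sequence rather than on the time-frequency plane or a 2-D image that the slide mentions; the edge handling (shrunken window) is an assumption.

```python
def median_filter(x, width):
    """Sliding median with an odd window width; edges use a shrunken window."""
    half = width // 2
    out = []
    for n in range(len(x)):
        window = sorted(x[max(0, n - half): n + half + 1])
        out.append(window[len(window) // 2])
    return out

spiky = median_filter([1, 1, 9, 1, 1], 3)     # isolated spike is removed
edge = median_filter([0, 0, 0, 5, 5, 5], 3)   # step edge is left intact
```

Unlike a low-pass filter, the median removes the outlier without smearing the step, which is why it appears under both audio and image denoising.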
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
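A sketch of the cheapest option in the list: resampling by a rational factor L/M with linear interpolation. The function and parameter names (`up` for L, `down` for M) are illustrative, and no anti-aliasing filter is applied, so it is only reasonable for slowly varying sensor data.

```python
def resample_linear(x, up, down):
    """Resample x by the rational factor up/down using linear interpolation."""
    n_out = (len(x) - 1) * up // down + 1
    out = []
    for k in range(n_out):
        pos = k * down / up  # position of output sample k in input-sample units
        i = int(pos)
        frac = pos - i
        nxt = x[min(i + 1, len(x) - 1)]
        out.append(x[i] + frac * (nxt - x[i]))
    return out

doubled = resample_linear([0, 2, 4, 6], 2, 1)  # twice the rate
halved = resample_linear([0, 2, 4, 6], 1, 2)   # half the rate
```

Band-limited (sinc-based) interpolation would replace the two-point blend with a longer weighted sum, at higher computational cost.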
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range, units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
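Range calibration and power-law scaling are frequently combined, as the two slides above suggest. The following sketch assumes a sensor whose minimum and maximum are learned from observed readings and a power-law curve to the target range; the class name, the exponent and the example frequency range are hypothetical.

```python
class CalibratedControl:
    """Learn a sensor's range, then map readings to a control parameter."""

    def __init__(self, out_min, out_max, exponent=1.0):
        self.out_min = out_min
        self.out_max = out_max
        self.exponent = exponent      # 1.0 = linear, other values = power law
        self.lo = None
        self.hi = None

    def observe(self, reading):
        """Calibration step: track the minimum and maximum reading seen."""
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def map(self, reading):
        """Scaling step: normalise to [0, 1], apply power law, rescale."""
        if self.lo is None or self.hi == self.lo:
            return self.out_min
        norm = (reading - self.lo) / (self.hi - self.lo)
        norm = max(0.0, min(1.0, norm))
        return self.out_min + (self.out_max - self.out_min) * norm ** self.exponent


knob = CalibratedControl(100.0, 1000.0, exponent=2.0)   # e.g. knob -> frequency in Hz
for raw in (200, 800, 500):                              # readings seen while calibrating
    knob.observe(raw)
print(knob.map(200), knob.map(500), knob.map(800))
```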
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
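A correlation-based period estimate can be sketched directly from the definition of autocorrelation. The version below is the plain O(N x lag) form, not the FFT-based computation the slide alludes to; the function names and the lag search range are illustrative.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed sequence x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    return [sum(centred[i] * centred[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A sinusoid that repeats every 20 samples.
x = [math.sin(2 * math.pi * i / 20) for i in range(200)]
print(estimate_period(x, 5, 50))
```

Cross-correlation has the same structure with a second sequence in place of the shifted copy of `x`.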
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: filterbank labels CF, CF-130, CF/1.0718, CF+130, CF*1.0718]
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
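Energy compaction can be checked numerically on a small example. The sketch below assumes the orthonormal DCT-II (so that the sum of squared coefficients equals the signal energy) and a smooth ramp as test signal; it is a direct O(N^2) evaluation, not a fast transform, and is not the exact variant used in any particular MFCC implementation.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of the sequence x (direct O(N^2) evaluation)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                                  for i in range(n)))
    return coeffs


ramp = [float(i) for i in range(16)]
c = dct2(ramp)
total = sum(v * v for v in c)
low = sum(v * v for v in c[:3])
print(f"energy in first 3 of 16 coefficients: {low / total:.4f}")
```

For this smooth signal nearly all of the energy lands in the lowest coefficients, which is why the high ones can be discarded with little loss.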
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
  – Non-parametric
    • K-nearest Neighbors
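Of the algorithms listed, K-nearest neighbours is the shortest to write down. The sketch below assumes Euclidean distance and a simple majority vote; the two-dimensional feature vectors and the "soft"/"hard" labels are made-up training data for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbours."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training set: feature vectors with ground truth labels.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
           (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
train_y = ["soft", "soft", "soft", "hard", "hard", "hard"]

print(knn_predict(train_x, train_y, (0.2, 0.2)))
print(knn_predict(train_x, train_y, (0.7, 0.9)))
```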
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
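The basic K-means loop alternates between assigning points to the nearest centroid and recomputing centroids. The sketch below assumes the initial centroids are supplied by the caller (so the run is deterministic) and a fixed number of iterations; names and data are illustrative.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Plain K-means: returns (final centroids, clusters from the last assignment)."""
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl
                     else centroids[i]
                     for i, cl in enumerate(clusters)]
    return centroids, clusters


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```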
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
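Ordinary least squares and the coefficient of determination can be illustrated with a single predictor. The sketch below assumes a straight-line model y = slope * x + intercept; the function names and the data are illustrative only.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.5, 4.5, 7.0]
slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predicted))
```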
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies: Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel chains, each labelled Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done with some delay at the same time as the sensor data
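The distinction between causal and non-causal processing can be seen with two moving averages. In this sketch, which uses illustrative function names, the first filter uses only current and past samples, while the second also needs future samples and therefore cannot run sample by sample on a live sensor stream.

```python
def causal_average(x, width):
    """y[n] averages x[n-width+1 .. n]: only the present and the past."""
    out = []
    for n in range(len(x)):
        window = x[max(0, n - width + 1):n + 1]
        out.append(sum(window) / len(window))
    return out


def centred_average(x, half_width):
    """y[n] averages x[n-half_width .. n+half_width]: needs future samples."""
    out = []
    for n in range(len(x)):
        window = x[max(0, n - half_width):n + half_width + 1]
        out.append(sum(window) / len(window))
    return out


x = [0.0, 0.0, 0.0, 3.0, 0.0, 0.0, 0.0]    # a single event at n = 3
print(causal_average(x, 3))
print(centred_average(x, 1))
```

The centred filter already responds at n = 2, one sample before the event arrives, which is only possible offline; the causal filter responds from n = 3 onwards and so introduces a delay instead.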
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: hidden Markov model. The MODEL has a hidden state (single discrete variable) and observations. Transition probs: P(state at t | state at t-1). Emission probs: p(observation at t | state at t).]
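Filtering in a hidden Markov model alternates a prediction through the transition model with an update from the emission model. The sketch below uses a two-state toy model with made-up probabilities (a textbook-style "rain / umbrella" example, not taken from the tutorial); the function and variable names are illustrative.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered state distribution after each observation.

    transition[i][j] = P(state_t = j | state_t-1 = i)
    emission[j][o]   = P(observation = o | state = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalise.
        unnormalised = [emission[j][obs] * predicted[j] for j in range(n)]
        total = sum(unnormalised)
        belief = [u / total for u in unnormalised]
        history.append(belief)
    return history


# Hypothetical model: state 0 = rain, state 1 = no rain.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [{"umbrella": 0.9, "none": 0.1}, {"umbrella": 0.2, "none": 0.8}]
for belief in forward_filter(prior, transition, emission, ["umbrella", "umbrella"]):
    print([round(b, 3) for b in belief])
```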
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = \mathcal{N}(a_y y + b_y, \sigma_y)(x)$
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
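The predict/update cycle and the reliability trade-off can be seen in a scalar Kalman filter. The sketch below assumes the simplest possible model, a constant state observed directly with Gaussian noise, so the mean and covariance reduce to two numbers; parameter names and default values are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant state measured directly.

    process_var     -- how much the model lets the state drift per step
    measurement_var -- how noisy the sensor is assumed to be
    """
    x, p = x0, p0                  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the state is unchanged, its uncertainty grows.
        p = p + process_var
        # Update step: the gain balances model reliability against
        # observation reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


print(kalman_1d([1.0] * 5, process_var=1e-4, measurement_var=0.1))
```

A large `measurement_var` makes the filter trust its model and move slowly; a small one makes it follow the measurements closely.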
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time: both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
The Wizard
What the user sees
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One and two-tailed variants for p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: regions for rejecting the null hypothesis, or region for rejecting the null hypothesis, beyond the critical value t(p, df); sample means X1 and X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
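The three steps on this slide can be sketched as follows. The slide's own formulas did not survive extraction, so this assumes the standard independent-samples form with a pooled (combined) variance and df = n_a + n_b - 2; the function name and sample data are illustrative.

```python
import math


def pooled_t(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in sample_a)
    ss_b = sum((v - mean_b) ** 2 for v in sample_b)
    df = na + nb - 2
    combined_var = (ss_a + ss_b) / df                        # combined variance
    sed = math.sqrt(combined_var * (1.0 / na + 1.0 / nb))    # standard error of difference
    return (mean_a - mean_b) / sed, df


t, df = pooled_t([5.0, 6.0, 7.0, 8.0, 9.0], [1.0, 2.0, 3.0, 4.0, 5.0])
print(t, df)
```

For this made-up data t = 4.0 with df = 8, which exceeds the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would be rejected at that level.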
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    httpwwwmedcalcbemanualmpage13-04bhtml
    httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
ANOVA I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
ANOVA II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, comparing ratio to table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: axis of mapping types labelled Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator; systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990); approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of expert model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer, Speech Output.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
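The behaviour described on these two slides (Gaussian units centred on their weights, activations normalized over the group, the closest unit dominating) can be reproduced in a few lines. This is a generic sketch rather than the Glove-TalkII code; the centres, the width and the function names are hypothetical.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations of x around each centre, normalized to sum to 1."""
    sq_dists = [math.dist(x, c) ** 2 for c in centres]
    nearest = min(sq_dists)
    # Subtracting the smallest squared distance leaves the normalized result
    # unchanged but avoids dividing by zero when x is far from every centre.
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in sq_dists]
    total = sum(acts)
    return [a / total for a in acts]


def rbf_output(x, centres, width, weights):
    """Output layer: weighted sum of the normalized hidden activations."""
    hidden = normalized_rbf(x, centres, width)
    return [sum(h * w[k] for h, w in zip(hidden, weights))
            for k in range(len(weights[0]))]


centres = [(0.0, 0.0), (1.0, 0.0)]
print(normalized_rbf((0.1, 0.0), centres, width=0.2))   # plateau near the first centre
print(normalized_rbf((0.5, 0.0), centres, width=0.2))   # halfway: equal activations
```

A smaller width sharpens the transition between plateaus, while a larger width gives a smoother interpolation between neighbouring centres.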
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
160
wwwnimeorg
Sid George ssfelseceubcca gtzancsuvicca
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al., 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
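A common way to turn a variable resistance into a voltage is a voltage divider followed by an analog-to-digital converter. The sketch below assumes a 5 V supply, a fixed resistor in series with the sensor and an ideal 10-bit ADC; these values and the function names are illustrative and not taken from the slide.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor / fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)


def adc_code(voltage, v_ref=5.0, bits=10):
    """Ideal ADC: quantize a voltage in [0, v_ref] to an integer code."""
    full_scale = 2 ** bits - 1
    code = int(voltage / v_ref * full_scale + 0.5)    # round to nearest code
    return max(0, min(full_scale, code))


# As the sensor resistance drops (e.g. more force on an FSR) the reading rises.
for r_sensor in (100_000.0, 10_000.0, 1_000.0):
    v = divider_voltage(r_sensor, r_fixed=10_000.0)
    print(f"{r_sensor:9.0f} ohm -> {v:.3f} V -> code {adc_code(v)}")
```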
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10 are frequently observed)
• An easy force sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• A small voltage is applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors – i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D/3D/ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
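As a rough illustration of three of the variants listed above (fixed threshold, local maximum, spacing constraint), here is a minimal pure-Python sketch. The function name `find_peaks`, its parameters and the greedy strongest-first spacing rule are assumptions made for this example; adaptive thresholds and sub-sample interpolation are left out.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return [(index, amplitude), ...] for local maxima above a fixed threshold.

    When two candidates are closer than min_spacing samples, the stronger one wins.
    """
    # Local maxima above the threshold (the left edge of a plateau is chosen).
    candidates = [
        i for i in range(1, len(x) - 1)
        if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]
    # Greedy spacing constraint: visit candidates from strongest to weakest.
    candidates.sort(key=lambda i: x[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    kept.sort()
    return [(i, x[i]) for i in kept]


if __name__ == "__main__":
    signal = [0, 1, 0, 2, 0, 5, 4, 0, 3, 0]
    print(find_peaks(signal, threshold=0.5, min_spacing=3))
```

The output matches the last bullet: a list of peak locations and amplitudes.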
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
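The time-frequency tradeoff for the two window sizes on this slide can be worked out directly. The sketch below assumes the usual relations (bin spacing = sample rate / window size, window duration = window size / sample rate); the function name is hypothetical.

```python
def stft_resolution(window_size, sample_rate=22050):
    """Return (frequency resolution in Hz, window duration in ms)."""
    freq_res_hz = sample_rate / window_size
    duration_ms = 1000.0 * window_size / sample_rate
    return freq_res_hz, duration_ms


if __name__ == "__main__":
    for n in (256, 4096):
        hz, ms = stft_resolution(n)
        print(f"{n:5d} samples: {hz:7.2f} Hz per bin, {ms:7.2f} ms per window")
```

With 256 samples each window spans about 11.6 ms but bins are about 86 Hz apart; with 4096 samples the bins are about 5.4 Hz apart but each window spans about 186 ms.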
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data: frequently linear or quadratic
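For sensor data, the linear case mentioned in the last bullet might look like the following sketch. The function name and the choice to hold the final sample at the end of the stream are assumptions for this example; it is not band-limited interpolation, so it is only appropriate for slowly varying control signals.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a sequence from src_rate to dst_rate by linear interpolation."""
    if len(samples) < 2:
        return list(samples)
    n_out = int((len(samples) - 1) * dst_rate / src_rate) + 1
    out = []
    for k in range(n_out):
        t = k * src_rate / dst_rate      # output time in source-sample units
        i = int(t)                       # index of the sample at or before t
        frac = t - i                     # fractional position between i and i+1
        if i >= len(samples) - 1:
            out.append(samples[-1])      # hold the final sample
        else:
            out.append(samples[i] * (1.0 - frac) + samples[i + 1] * frac)
    return out


if __name__ == "__main__":
    # A 1 Hz sensor stream upsampled to 2 Hz.
    print(resample_linear([0.0, 2.0, 4.0], src_rate=1, dst_rate=2))
```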
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range or units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  – Power law functions frequently used
• Frequently used in conjunction with calibration
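Calibration (finding a sensor's minimum and maximum) and scaling (mapping to a control range, optionally through a power law) combine naturally. The sketch below is one hedged way to do it; the class name, the running min/max strategy and the default exponent are assumptions for this example, not the tutorial's own code.

```python
class CalibratedControl:
    """Track the observed range of a sensor and map readings to a control range."""

    def __init__(self, out_min, out_max, exponent=1.0):
        self.out_min = out_min
        self.out_max = out_max
        self.exponent = exponent   # 1.0 = linear, other values = power law
        self.lo = None             # smallest raw reading seen so far
        self.hi = None             # largest raw reading seen so far

    def observe(self, raw):
        """Calibration step: update the observed minimum and maximum."""
        self.lo = raw if self.lo is None else min(self.lo, raw)
        self.hi = raw if self.hi is None else max(self.hi, raw)

    def map(self, raw):
        """Scaling step: normalize to [0, 1], apply power law, map to output range."""
        if self.lo is None or self.hi == self.lo:
            return self.out_min    # not calibrated yet
        norm = (raw - self.lo) / (self.hi - self.lo)
        norm = min(max(norm, 0.0), 1.0)
        return self.out_min + (self.out_max - self.out_min) * norm ** self.exponent


if __name__ == "__main__":
    volume = CalibratedControl(0.0, 1.0, exponent=2.0)
    for reading in (200, 800, 500):     # e.g. wiggle a slider through its range
        volume.observe(reading)
    print(volume.map(500))              # mid-travel reading after power-law scaling
```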
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
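A correlation-based period estimate can be sketched as follows. This version computes the autocorrelation directly in the time domain for clarity; the efficient computation noted on the slide would presumably use the FFT, which is not shown here. Function names and the lag search range are assumptions for this example.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


if __name__ == "__main__":
    # An impulse every 4 samples, e.g. an idealized onset stream from strumming.
    onsets = [1, 0, 0, 0] * 8
    print(estimate_period(onsets, min_lag=2, max_lag=10))
```

Because the direct sum shrinks with lag, multiples of the true period score lower than the period itself in this example; real signals often need normalization or extra peak picking.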
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: filter centered at CF; labels CF-130 and CF+130, CF/1.0718 and CF×1.0718]
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
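The last two stages of the MFCC pipeline (log, then DCT, then truncation) can be sketched with a naive DCT-II. The unnormalized DCT-II definition, the function names and the default of 13 kept coefficients are assumptions for this example; real MFCC implementations differ in scaling and filterbank details.

```python
import math


def dct2(x):
    """Naive unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def cepstral_coefficients(log_band_energies, n_keep=13):
    """DCT of log filterbank energies, keeping only the first n_keep coefficients."""
    return dct2(log_band_energies)[:n_keep]


if __name__ == "__main__":
    # A smooth (here: constant) input puts all of its energy in coefficient 0.
    print([round(c, 6) for c in dct2([1.0] * 8)])
```

The constant-signal case shows energy compaction in its extreme form: one nonzero coefficient out of eight.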
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
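Of the algorithms listed, K-nearest neighbors is the simplest to sketch. The example below assumes Euclidean distance and majority voting; the function name, the toy feature vectors and the gesture labels are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for query from (feature_vector, label) training pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical 2-D features for two gesture classes.
    training_set = [
        ((0.0, 0.0), "tap"), ((0.0, 1.0), "tap"), ((1.0, 0.0), "tap"),
        ((5.0, 5.0), "swipe"), ((5.0, 6.0), "swipe"), ((6.0, 5.0), "swipe"),
    ]
    print(knn_predict(training_set, (0.2, 0.1)))
    print(knn_predict(training_set, (5.5, 5.5)))
```

Being non-parametric, it stores the training set itself rather than fitted parameters, which makes prediction cost grow with the amount of training data.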
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
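A minimal K-means sketch (Lloyd's algorithm) for the first item in the list above. Initializing the centroids from the first k points keeps the example deterministic; that choice, the function name and the iteration cap are assumptions for this example, and practical implementations usually use randomized or k-means++ initialization.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points (tuples of floats) into k groups; returns (centroids, labels)."""
    centroids = [tuple(p) for p in points[:k]]   # deterministic initialization

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iterations):
        # Assignment step: attach every point to its closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
        if updated == centroids:
            break                                # converged
        centroids = updated
    return centroids, [nearest(p) for p in points]


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    print(kmeans(data, k=2))
```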
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels)
  – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel chains of Feature Extraction, Dimensionality Reduction, Feature Selection and Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: HMM diagram. Hidden state (single discrete variable) and observations; the model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t)]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, i.e., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
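The predict/update cycle and the trade-off between observation and model reliability can be seen in a one-dimensional sketch. It assumes a scalar random-walk state model (so the covariance matrix reduces to a single variance); the function name, the noise variances and the initial values are illustrative assumptions.

```python
def kalman_1d(measurements, process_var=1e-3, measurement_var=0.1, x0=0.0, p0=1.0):
    """Filter a stream of noisy scalar measurements; returns the state estimates."""
    x, p = x0, p0                      # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed to stay put, uncertainty grows.
        p = p + process_var
        # Update step: the gain weighs model reliability against sensor reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    readings = [1.0] * 50              # a constant signal, starting from a wrong prior
    est = kalman_1d(readings)
    print(round(est[0], 3), round(est[-1], 3))
```

A larger `measurement_var` makes the filter trust its model more and react more slowly; a larger `process_var` does the opposite.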
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers: taught to “interfere” as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
The Wizard
What the user sees
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions around sample means X1 and X2, showing the regions (two-tailed) or region (one-tailed) for rejecting the null hypothesis and the critical value t(p, df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
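The slide's formulas did not survive extraction, so the sketch below fills in the three steps with the standard independent-samples, pooled-variance form of Student's t-test. That this is the exact variant shown on the slide is an assumption; the function name and the sample data are hypothetical.

```python
import math


def pooled_t(sample1, sample2):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((v - mean1) ** 2 for v in sample1)   # sum of squared deviations
    ss2 = sum((v - mean2) ** 2 for v in sample2)
    df = n1 + n2 - 2                               # degrees of freedom
    pooled_var = (ss1 + ss2) / df                  # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))  # std error of difference
    return (mean1 - mean2) / sed, df


if __name__ == "__main__":
    control = [1, 2, 3, 4, 5]        # made-up scores, control group
    treatment = [3, 4, 5, 6, 7]      # made-up scores, treatment group
    t, df = pooled_t(control, treatment)
    print(f"t = {t:.3f}, df = {df}")
```

The absolute value of t is then compared against the critical value t(p, df) for the chosen significance level.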
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g., search on “student’s t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I: Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings, with categories Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator and Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time per gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech Output]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization: interpolation according to width parameter
  - Plateaus around nearest centre
• Closest RBF dominates
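The behaviour of normalized RBF units can be sketched in a few lines. This is only an illustration, not the Glove-TalkII implementation: the function names, the two example centres and the output weights are hypothetical, and the width of 0.15 is borrowed from the vowel network slide purely as an example value.

```python
import math


def normalized_rbf(x, centres, width):
    """Normalized Gaussian RBF activations of input x; they sum to 1."""
    dist_sq = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtracting the smallest distance leaves the normalized result
    # unchanged but avoids dividing by zero far away from every centre.
    nearest = min(dist_sq)
    raw = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in dist_sq]
    total = sum(raw)
    return [r / total for r in raw]


def rbf_network(x, centres, width, out_weights):
    """Linear output layer on top of the normalized hidden units."""
    hidden = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [
        sum(hidden[j] * out_weights[j][k] for j in range(len(hidden)))
        for k in range(n_out)
    ]


# Hypothetical two-centre example in a 2-D (x, y) input space.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[1.0], [3.0]]          # one output value per hidden unit
at_centre = normalized_rbf((0.0, 0.0), centres, 0.15)
midway = rbf_network((0.5, 0.0), centres, 0.15, weights)
```

Near a centre the closest unit takes almost all of the activation (the plateau), while halfway between two centres the output interpolates between their weights.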
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca  George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
SAGE
13
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al. 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human to human interaction
  - Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more…
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
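A common way to convert a resistance change into a voltage is a voltage divider read by an ADC. The sketch below assumes a particular wiring that the slide does not specify: the sensor sits between the supply and the ADC pin, a fixed resistor goes from the pin to ground, and the ADC reports counts from 0 to `adc_max`. The function name and default values are hypothetical.

```python
def sensor_resistance(adc_count, r_fixed=10_000.0, adc_max=1023):
    """Recover the sensor resistance (ohms) from a voltage-divider reading.

    Assumed wiring: V_in -- sensor -- ADC pin -- r_fixed -- ground, so
    V_out / V_in = r_fixed / (r_sensor + r_fixed).
    """
    if not 0 < adc_count <= adc_max:
        raise ValueError("reading must be in (0, adc_max]")
    ratio = adc_count / adc_max            # V_out / V_in
    return r_fixed * (1.0 - ratio) / ratio


# With a 10 kOhm fixed resistor, a half-scale reading means the sensor
# is also at 10 kOhm; lower readings mean higher sensor resistance.
half_scale = sensor_resistance(500, adc_max=1000)
quarter_scale = sensor_resistance(250, adc_max=1000)
```

Choosing the fixed resistor near the middle of the sensor's resistance range usually gives the most usable voltage swing.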
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - Input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D/3D/ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
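Three of these variants (fixed threshold, local maximum, spacing constraint) can be combined in a short sketch. The function name and the greedy "strongest peak wins" spacing rule are illustrative assumptions; real systems tune these choices heavily, and adaptive thresholds and interpolation are left out here.

```python
def detect_peaks(signal, threshold=0.0, min_spacing=1):
    """Return (index, amplitude) pairs for the peaks of a sampled signal.

    A peak is a local maximum above `threshold`. When two peaks are closer
    than `min_spacing` samples, only the larger one is kept.
    """
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]
    kept = []
    # Visit candidates from strongest to weakest so that strong peaks
    # suppress weaker neighbours, not the other way round.
    for i in sorted(candidates, key=lambda idx: signal[idx], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    return [(i, signal[i]) for i in sorted(kept)]


readings = [0, 1, 0, 3, 0, 2, 0, 0.2, 0]
all_peaks = detect_peaks(readings, threshold=0.5)
spaced_peaks = detect_peaks(readings, threshold=0.5, min_spacing=3)
```

The output is the list of peak locations and amplitudes, as on the slide; the small bump at index 7 is rejected by the threshold.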
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the Fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
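The tradeoff can be made concrete with a small sketch. It uses a direct (slow) DFT rather than an FFT so that it stays self-contained, and a Hann window as an assumed choice; the function names are hypothetical.

```python
import cmath
import math


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a direct DFT."""
    n = len(frame)
    windowed = [
        (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1))) * s
        for i, s in enumerate(frame)
    ]
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(windowed))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, window_size, hop):
    """Sequence of power spectra, one per (possibly overlapping) frame."""
    return [
        power_spectrum(signal[start:start + window_size])
        for start in range(0, len(signal) - window_size + 1, hop)
    ]


def resolution(window_size, sample_rate):
    """Return (frequency resolution in Hz, window duration in seconds)."""
    return sample_rate / window_size, window_size / sample_rate


# A sinusoid with exactly 8 cycles per 64-sample window.
tone = [math.sin(2 * math.pi * 8 * i / 64) for i in range(256)]
frames = spectrogram(tone, window_size=64, hop=32)
short_window = resolution(256, 22050)    # about 86 Hz bins, about 12 ms frames
long_window = resolution(4096, 22050)    # about 5.4 Hz bins, about 186 ms frames
```

The longer window separates frequencies roughly 16 times more finely but smears events over a frame that is 16 times longer, which is the tradeoff the two spectrograms illustrate.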
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
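For sensor data the linear case is often enough. The sketch below resamples irregularly timestamped readings onto a uniform grid; the function name and the example timestamps are hypothetical, and the input times are assumed to be strictly increasing with at least two samples.

```python
def resample_linear(times, values, rate_hz):
    """Linearly interpolate (times, values) onto a uniform grid.

    Returns (time, value) pairs starting at times[0], spaced 1/rate_hz apart.
    """
    period = 1.0 / rate_hz
    out = []
    seg = 0          # index of the segment [times[seg], times[seg + 1]]
    step = 0
    t = times[0]
    while t <= times[-1] + 1e-12:
        # Advance to the segment that contains t.
        while seg < len(times) - 2 and times[seg + 1] < t:
            seg += 1
        t0, t1 = times[seg], times[seg + 1]
        frac = (t - t0) / (t1 - t0)
        out.append((t, values[seg] + frac * (values[seg + 1] - values[seg])))
        step += 1
        t = times[0] + step * period   # avoids accumulating rounding error
    return out


# Readings arrive at 0 s, 1 s and 3 s; resample at 1 Hz.
uniform = resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], rate_hz=1.0)
```

Bringing every stream onto a common grid like this is one way to combine sensors that report at different or irregular rates.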
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
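Calibration and scaling can be sketched together. The class and function names, the example readings and the exponent of 2 are illustrative assumptions, not values from the tutorial.

```python
class RangeCalibrator:
    """Track the observed minimum and maximum of a sensor reading."""

    def __init__(self):
        self.low = float("inf")
        self.high = float("-inf")

    def observe(self, reading):
        self.low = min(self.low, reading)
        self.high = max(self.high, reading)

    def normalize(self, reading):
        """Map a reading to [0, 1] using the calibrated range (clamped)."""
        if self.high <= self.low:
            return 0.0
        unit = (reading - self.low) / (self.high - self.low)
        return min(1.0, max(0.0, unit))


def power_law_scale(unit_value, out_min, out_max, exponent=2.0):
    """Map a [0, 1] value to [out_min, out_max] along a power-law curve."""
    return out_min + (out_max - out_min) * unit_value ** exponent


# Calibration pass: the performer sweeps a slider through its range.
slider = RangeCalibrator()
for raw in (200, 350, 800, 640):
    slider.observe(raw)

# Performance: map a mid-travel reading to a volume between 0 and 1.
volume = power_law_scale(slider.normalize(500), 0.0, 1.0, exponent=2.0)
```

With an exponent above 1 the lower part of the slider travel gives finer control over small output values, which is one reason power-law curves are popular for volume-like parameters.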
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
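The power-of-2 remark refers to computing correlations through the FFT. The sketch below is a minimal, unoptimized illustration: it uses a textbook recursive radix-2 FFT, zero-pads to a power of 2 to avoid circular wrap-around, and picks the strongest lag as the period. All function names are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddled = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddled
        out[k + n // 2] = even[k] - twiddled
    return out


def ifft(spectrum):
    """Inverse FFT using the conjugate trick."""
    n = len(spectrum)
    return [v.conjugate() / n for v in fft([s.conjugate() for s in spectrum])]


def autocorrelation(signal):
    """Autocorrelation for lags 0..N-1 as the inverse FFT of the power spectrum."""
    n = len(signal)
    size = 1
    while size < 2 * n:          # zero-pad so the correlation is not circular
        size *= 2
    padded = [complex(s) for s in signal] + [0j] * (size - n)
    power = [complex(abs(v) ** 2) for v in fft(padded)]
    return [v.real for v in ifft(power)[:n]]


def dominant_period(signal, min_lag=1):
    """Lag (in samples) with the strongest autocorrelation."""
    acf = autocorrelation(signal)
    return max(range(min_lag, len(signal) // 2), key=lambda lag: acf[lag])


# An impulse every 8 samples, e.g. a regular strumming gesture.
pulses = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
acf = autocorrelation(pulses)
period = dominant_period(pulses)
```

Cross-correlation between two signals follows the same pattern, multiplying one spectrum by the conjugate of the other instead of squaring a single magnitude.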
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
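Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Filter diagram: triangular filters around a centre frequency CF. Linear filters span CF-130 to CF+130; log-spaced filters are related by a factor of 1.0718.]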
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
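A rough sketch of the idea follows. Two simplifications are assumed: it uses the covariance matrix of mean-centred data rather than the correlation matrix named on the slide (the two coincide once each feature is scaled to unit variance), and it finds only the first principal component by power iteration instead of a full eigenanalysis. The function names are hypothetical.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (unit direction of largest variance, feature means)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    # Power iteration: repeated multiplication by the covariance matrix
    # converges to its dominant eigenvector, provided the start vector is
    # not orthogonal to it.
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v, means


def project(row, component, means):
    """Coordinate of one feature vector along the principal component."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))


# Two perfectly correlated sensor channels collapse onto one dimension.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
component, means = first_principal_component(data)
coordinate = project((3.0, 6.0), component, means)
```

Stacking several such components as rows gives the projection matrix that maps a high-dimensional feature vector to a lower-dimensional one.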
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
  - Non-parametric
    • K-nearest Neighbors
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters, noise and border points ignored)
  - DBScan
• Graph-based
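As an illustration of the first family, here is a minimal K-means sketch. The initial centroids are passed in explicitly (an assumption that keeps the example deterministic; random or K-means++ initialization is more usual), and the function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, iterations=20):
    """Alternate assignment and centroid update until nothing changes.

    Returns (centroids, labels); labels[i] is the cluster index of points[i].
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(
                    tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(centroids[k])   # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


gestures = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centres, assignment = kmeans(gestures, [(0.0, 0.0), (10.0, 10.0)])
```

The number of clusters is fixed by the number of initial centroids, matching the slide's point that K is usually supplied from domain knowledge.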
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
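Ordinary least squares with a single feature, together with the R squared goodness-of-fit measure, fits in a few lines. The function names and the example data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical sensor reading (x) against the parameter to predict (y).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```

An R squared close to 1 means the fitted line explains most of the variance; it says nothing by itself about statistical significance, which is what the F-test and t-tests address.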
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Diagram: two parallel processing chains, each with Feature Extraction, Dimensionality Reduction, Feature Selection and Classification stages]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done with some delay at the same time as the sensor data
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
99
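Of these tasks, filtering is the simplest to sketch for a discrete hidden state: alternate a prediction through the transition model with an update from the new observation. The function name is hypothetical, and the two-state numbers are the familiar textbook "umbrella" example, used here only as an assumed illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the hidden state after each observation.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        # Update: weight by the likelihood of the observation, renormalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


# States: 0 = rain, 1 = no rain. Observations: 0 = umbrella, 1 = no umbrella.
beliefs = forward_filter(
    prior=[0.5, 0.5],
    transition=[[0.7, 0.3], [0.3, 0.7]],
    emission=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0, 0],
)
```

The procedure is causal: each belief depends only on observations received so far, which is what makes filtering usable in real time.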
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Diagram: HMM with a hidden state (single discrete variable) and an observation at each time step. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities P(observation at t | state at t).]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
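The predict/update cycle is easiest to see in one dimension. The sketch below assumes a random-walk state model (the state stays put apart from process noise), which is simpler than the constant-velocity model mentioned on the slide; the function name and noise values are hypothetical.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0                   # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction step: the mean is unchanged, the uncertainty grows.
        p = p + process_var
        # Update step: the gain balances model against measurement reliability.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# A constant true value of 5.0 seen through a noisy sensor.
readings = [5.3, 4.6, 5.1, 4.9, 5.2, 4.8, 5.0, 5.1]
smoothed = kalman_1d(readings, process_var=0.01, meas_var=1.0)
```

A small `meas_var` makes the filter trust each reading almost completely, while a small `process_var` makes it trust the model and smooth heavily; this is the reliability trade-off from the last bullet.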
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Photos: the Wizard; what the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of a difference at least as large as the observed one arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at significance p
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: region (one-tailed) or regions (two-tailed) for rejecting the null hypothesis, bounded by the critical value t(p,df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
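The slide's formulas did not survive extraction, so the following is only a sketch of the three steps above for the independent, equal-variance case. The function name pooled_t_test and the sample data are illustrative assumptions, not taken from the tutorial.

```python
import math

def pooled_t_test(a, b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    # Unbiased sample variance of each group
    v1 = sum((x - m1) ** 2 for x in a) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in b) / (n2 - 1)
    # Degrees of freedom for two independent groups
    df = n1 + n2 - 2
    # Step 1: combined (pooled) variance of the two samples
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    # Step 2: standard error of the difference between the means (sed)
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    # Step 3: the t statistic
    return (m1 - m2) / sed, df

control = [1, 2, 3, 4, 5]      # made-up scores
treatment = [2, 3, 4, 5, 6]    # made-up scores
t, df = pooled_t_test(control, treatment)
print(t, df)
```

For this toy data |t| is 1 with df = 8, well below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would not be rejected.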
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g. httpwwwmedcalcbemanualmpage13-04bhtml, httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g., search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
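As a rough illustration of "dividing two variances", here is a minimal one-way ANOVA sketch. The helper name one_way_anova_f and the data are assumptions made for the example; the resulting F would still be compared against an F table (or converted to a p-value) for the two degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    # F is the ratio of the two variance estimates (mean squares)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]   # made-up measurements
f, df_b, df_w = one_way_anova_f(groups)
print(f, df_b, df_w)

# Rescaling and shifting every observation leaves F unchanged
rescaled = [[10 * x + 7 for x in g] for g in groups]
print(one_way_anova_f(rescaled)[0])
```

The second call illustrates the slide's point that the result is independent of scaling and bias.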
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I: Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: mapping types along an axis of approximate time per gesture for connected speech (msec; tick marks 10-30, 100, 130, 200, 500): Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems cited: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X,Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, V/C decision network, vowel network, consonant network, fixed pitch mapping, fixed stop mapping, combining function, synthesizer. Output: speech]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
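A small sketch of a normalized RBF layer may make the plateau and interpolation behaviour concrete. This is not the Glove-TalkII implementation: the two centres, the single output weight per unit and the function name normalized_rbf are invented for illustration, and the width of 0.15 is simply borrowed from the fixed standard deviation mentioned for the vowel network.

```python
import math

def normalized_rbf(x, centres, weights, width):
    """Return (normalized activations, outputs) of a normalized RBF layer."""
    # Gaussian activation: largest when the input is at the unit's centre
    acts = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2 * width ** 2))
            for c in centres]
    # Normalize over all hidden units so the activations sum to 1.
    # (Inputs very far from every centre could underflow to a zero total.)
    total = sum(acts)
    norm = [a / total for a in acts]
    # Each output is the activation-weighted blend of the per-unit output weights
    n_out = len(weights[0])
    outputs = [sum(norm[j] * weights[j][k] for j in range(len(centres)))
               for k in range(n_out)]
    return norm, outputs

centres = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical hand positions
weights = [[100.0], [200.0]]         # hypothetical output value per unit

at_centre = normalized_rbf((0.0, 0.0), centres, weights, 0.15)
halfway = normalized_rbf((0.5, 0.0), centres, weights, 0.15)
print(at_centre[1], halfway[1])
```

Near a centre the closest unit dominates and the output sits on that unit's value; halfway between the two centres the output is an even blend of both.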
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al., 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human-to-human interaction
  - Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more…
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
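The circuit on this slide was lost in extraction; a common way to turn a variable resistance into a voltage is a voltage divider, sketched below. The divider wiring, the 5 V supply, the 10 kΩ resistor and the 10-bit converter are assumptions chosen for illustration, and both function names are hypothetical.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor when it is in series with the sensor."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code, as an ideal ADC would."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    # Clamp to the valid code range [0, levels - 1]
    return max(0, min(levels - 1, code))

# As the sensor resistance drops (e.g., more force on an FSR), the voltage rises
for r_sensor in (100_000, 10_000, 1_000):
    v = divider_voltage(5.0, 10_000, r_sensor)
    print(r_sensor, round(v, 3), adc_code(v))
```

The response is nonlinear in the sensor resistance, which is one reason the later calibration and scaling steps are usually needed.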
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
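As one concrete instance of a digital filter operating on samples, the sketch below smooths a noisy sensor stream with a one-pole low-pass filter. The choice of this particular filter, the name one_pole_lowpass and the alpha values are illustrative assumptions; the slides do not prescribe a specific design.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence with y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha in (0, 1]: small values smooth heavily, alpha = 1 passes the input through.
    """
    out = []
    y = samples[0] if samples else 0.0  # start from the first sample to avoid a jump
    for x in samples:
        y = y + alpha * (x - y)
        out.append(y)
    return out

step = [0.0, 0.0, 1.0, 1.0, 1.0]
print(one_pole_lowpass(step, 0.5))   # the step is approached gradually
print(one_pole_lowpass(step, 1.0))   # no smoothing at all
```

The filter is causal (each output depends only on current and past input), so it can run sample by sample on live sensor data, at the cost of some lag.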
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
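The variants listed above can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing and parabolic interpolation; it is only one of many possible combinations, and the function name, parameters and test signal are made up for the example.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) for peaks in the sequence x."""
    peaks = []  # entries are (sample index, refined location, refined amplitude)
    for i in range(1, len(x) - 1):
        a, b, c = x[i - 1], x[i], x[i + 1]
        # Fixed threshold and local-maximum test
        if b < threshold or not (b > a and b >= c):
            continue
        # Parabolic interpolation through the three points around the maximum
        offset = 0.5 * (a - c) / (a - 2 * b + c)
        candidate = (i, i + offset, b - 0.25 * (a - c) * offset)
        # Spacing constraint: of two peaks that are too close, keep the larger
        if peaks and i - peaks[-1][0] < min_spacing:
            if candidate[2] > peaks[-1][2]:
                peaks[-1] = candidate
        else:
            peaks.append(candidate)
    return [(loc, amp) for _, loc, amp in peaks]

signal = [0, 1, 0, 0, 3, 0, 0.2, 0]
print(find_peaks(signal, threshold=0.5))
print(find_peaks(signal, threshold=0.5, min_spacing=4))
```

An adaptive threshold (for example a multiple of a running median) could replace the fixed one without changing the rest of the function.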
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
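The tradeoff can be made concrete with a little arithmetic for the two window sizes shown: a window of N samples at sampling rate fs spans N/fs seconds, and its frequency bins are spaced fs/N Hz apart. The helper name below is hypothetical.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, bin spacing in Hz) for an STFT window."""
    time_res_ms = 1000.0 * window_size / sample_rate
    freq_res_hz = sample_rate / window_size
    return time_res_ms, freq_res_hz

for n in (256, 4096):
    ms, hz = stft_resolution(n, 22050)
    print(f"{n} samples: {ms:.1f} ms per window, {hz:.2f} Hz per bin")
```

The short window localizes events to roughly 12 ms but only resolves frequency to about 86 Hz; the long window resolves about 5 Hz but smears events over roughly 186 ms.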
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering, 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
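For sensor data the linear case is often enough. The sketch below resamples timestamped readings onto new time points by linear interpolation; it assumes the timestamps are sorted and strictly increasing, and the function name and data are illustrative.

```python
from bisect import bisect_right

def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times."""
    out = []
    for t in new_times:
        # Index of the sample at or just before t, clamped so that i + 1 exists.
        # Times outside the recorded range are extrapolated from the end segment.
        i = min(max(bisect_right(times, t) - 1, 0), len(times) - 2)
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

times = [0.0, 1.0, 2.0]          # seconds, possibly irregular in practice
values = [0.0, 10.0, 20.0]       # sensor readings
print(resample_linear(times, values, [0.5, 1.5, 2.0]))
```

This is a typical way to bring streams sampled at different or irregular rates onto a common clock before fusing them.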
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
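A minimal sketch of such a mapping: normalize the raw reading using the calibrated minimum and maximum, apply a power law, then rescale to the control range. The function names, the 10-bit knob range and the 20 Hz to 20 kHz target range are assumptions for the example.

```python
def normalize(reading, in_min, in_max):
    """Map a raw reading to [0, 1] using the calibrated range, clamping outliers."""
    value = (reading - in_min) / (in_max - in_min)
    return min(1.0, max(0.0, value))

def scale_power(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a reading to [out_min, out_max]; exponent = 1 gives a linear map."""
    u = normalize(reading, in_min, in_max) ** exponent
    return out_min + u * (out_max - out_min)

# A 10-bit knob reading mapped to a frequency in Hz with a squared response
for raw in (0, 256, 512, 1023):
    print(raw, round(scale_power(raw, 0, 1023, 20.0, 20000.0, exponent=2.0), 1))
```

With an exponent above 1 the lower part of the knob's travel covers a smaller slice of the output range, giving finer control at low frequencies.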
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
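A correlation-based estimate can be sketched directly from the definition of autocorrelation: the lag at which the signal best matches a shifted copy of itself is taken as the period. This direct version costs O(N × max_lag); FFT-based computation is the usual faster route. Function names and the test signal are illustrative.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed sequence for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    return [sum(centred[i] * centred[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Return the lag (in samples) with the strongest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sinusoid that repeats every 20 samples
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
print(estimate_period(signal, min_lag=5, max_lag=50))
```

The min_lag bound keeps the trivial maximum at lag 0 out of the search; because fewer terms are summed at longer lags, the first repetition wins over its multiples.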
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
[Figure: MFCC pipeline: Mel-filtering -> Log -> DCT -> MFCCs; mel-scale filterbank with 13 linearly-spaced filters and 27 log-spaced filters]
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
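To make the non-parametric case concrete, here is a minimal k-nearest-neighbors sketch: there is no training step beyond storing the labeled feature vectors, and a query is labeled by majority vote of its closest neighbors. The drum-like labels and two-dimensional features are invented for the example, and the code needs Python 3.8+ for math.dist.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbors."""
    # Euclidean distance from the query to every stored training vector
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Toy features, e.g. (spectral centroid, decay time) on arbitrary scales
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15), (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
train_y = ["kick", "kick", "kick", "snare", "snare", "snare"]

print(knn_predict(train_x, train_y, (0.2, 0.2)))
print(knn_predict(train_x, train_y, (0.7, 0.9)))
```

Features on very different scales would normally be normalized first, since the distance otherwise favors the largest-valued dimension.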
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
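As a reference point for the first item, the sketch below implements plain k-means (Lloyd's algorithm), alternating between assigning points to the nearest centroid and moving each centroid to the mean of its members. The starting centroids are passed in explicitly to keep the example deterministic; the function name and data are illustrative, and math.dist requires Python 3.8+.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Run k-means from the given starting centroids; return (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(d) / len(members) for d in zip(*members))
    return centroids, labels

points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids, labels)
```

In practice the result depends on the starting centroids, so the algorithm is usually run from several initializations and the tightest clustering is kept.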
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
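A small worked example of fitting and evaluating a regression: ordinary least squares for a single feature, followed by the coefficient of determination. The function names and the data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, preds):
    """Coefficient of determination: 1 - residual variation / total variation."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]    # roughly y = 2x + 1 with some noise
slope, intercept = fit_line(xs, ys)
preds = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, preds))
```

An R squared close to 1 means the fitted line accounts for almost all of the variation in the target; evaluating on held-out data gives a more honest estimate than evaluating on the training points as done here.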
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late fusion are the extremes of a full spectrum of possibilities
[Figure: two processing chains, each with Feature Extraction, Dimensionality Reduction, Feature Selection and Classification stages]
Tuesday 22 October 13
Multi-modal Results
Main idea: use the camera to constrain factorization results, taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real time: the processing is done, with some delay, at the same time as the sensor data arrives
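To make the distinction concrete, here is a sketch of a causal and a non-causal smoother. Both function names are hypothetical. The causal version only uses the current and past samples, so it can run while the data arrives; the centered version needs future samples and therefore cannot.

```python
def causal_moving_average(samples, width):
    """Average of the current sample and up to width - 1 previous ones."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - width + 1): n + 1]
        out.append(sum(window) / len(window))
    return out

def centered_moving_average(samples, half_width):
    """Non-causal: output n also depends on half_width future samples."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - half_width): n + half_width + 1]
        out.append(sum(window) / len(window))
    return out
```

For an impulse at index 2, the causal output only reacts from index 2 onward, while the centered output already reacts at index 1.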
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
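A minimal sketch of dynamic time warping for two one-dimensional sequences, assuming the textbook formulation with absolute difference as the local cost and unconstrained steps. The function name is hypothetical.

```python
def dtw_distance(a, b):
    """Cost of the cheapest monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Note that this offline form needs both complete sequences, so it is not causal as written.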
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
99
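The filtering task can be sketched for a discrete state space as a prediction through the transition model followed by an update with the observation likelihood. This is a minimal sketch; the function name and the table layout (nested lists indexed by state and observation) are assumptions made for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each time step.

    prior[i]         : P(state_0 = i)
    transition[i][j] : P(state_t = j | state_{t-1} = i)
    emission[i][o]   : P(observation = o | state = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        # Update: weight by the observation likelihood, then normalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history
```

The normalizing constants `total` are the per-step likelihoods of the evidence, which is why filtering and evidence likelihood are computed together.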
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: hidden Markov model. Hidden state (a single discrete variable) with transition probabilities P(state t | state t-1); observations (observed variables) with emission probabilities p(observation t | state t).]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea, t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state (for linear models with Gaussian noise)
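The predict/update cycle can be sketched for a single scalar state. This minimal example assumes the simplest possible model (the state stays the same between steps, with some process noise); the function and parameter names are hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Track a roughly constant scalar state from noisy measurements.

    x is the state estimate, p its variance (our uncertainty about x).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: the model says "unchanged", uncertainty grows.
        p = p + process_var
        # Update step: the gain weighs prediction against measurement.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates
```

A large `measurement_var` makes the gain small, so the filter trusts its model; a large `process_var` does the opposite.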
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian (LG) conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman (formerly at Xerox PARC): ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor, by IBM (1984)
[Figure panels: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of a difference at least as large as the observed one arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could keep its use of statistics confidential)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
• Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions showing the regions (two-tailed) or region (one-tailed) for rejecting the null hypothesis, with X1, X2 and the critical value t(p, df) marked]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference (sed)
• compute t
note: df computation
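The three steps above can be sketched as follows, assuming the standard independent two-sample formulation with equal variances (pooled variance, df = n1 + n2 - 2). The function name is hypothetical.

```python
from math import sqrt

def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic, assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = na + nb - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = (ss_a + ss_b) / df
    # Standard error of the difference between the two means.
    sed = sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    t = (mean_a - mean_b) / sed
    return t, df
```

The returned t is then compared (in absolute value for a two-tailed test) to the critical value for the chosen significance and df.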
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    httpwwwmedcalcbemanualmpage13-04bhtml
    httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g., search on “student’s t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias (constant offset)
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
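A minimal sketch of the one-way case: the F statistic is the between-group variance divided by the within-group variance. The function name is hypothetical, and looking up the significance of F for the two degrees of freedom is left to a table or statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

With two groups, F equals the square of the pooled t statistic, and rescaling or offsetting all data leaves F unchanged.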
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (10-30, 100, 130, 200, 500 msec): Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems cited: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990).]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions
• constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches and foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer, Speech Output.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
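The behaviour of normalized RBF units can be sketched as Gaussian activations divided by their sum. This is an illustrative sketch rather than the Glove-TalkII implementation; the function names, the shared `width` parameter and the linear output layer are assumptions.

```python
from math import exp

def normalized_rbf_activations(x, centres, width):
    """Gaussian activation around each centre, normalized to sum to 1."""
    raw = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        raw.append(exp(-dist_sq / (2.0 * width ** 2)))
    # Assumes x is close enough to some centre that the sum does not
    # underflow to zero.
    total = sum(raw)
    return [r / total for r in raw]

def normalized_rbf_output(x, centres, width, out_weights):
    """Linear output layer: one weight vector per hidden unit."""
    acts = normalized_rbf_activations(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(a * w[k] for a, w in zip(acts, out_weights)) for k in range(n_out)]
```

Midway between two centres the output interpolates between their weight vectors; away from that boundary the closest unit dominates, which produces the plateaus.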
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
REACTABLE
14
Motivation and Overview
Reactable Music Technology Group (2006)
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al., 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Side annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
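A common way to do this conversion is a voltage divider read by an ADC. The sketch below assumes the sensor sits between the supply and the measurement point, with a fixed resistor to ground, and an ideal ADC; all names and component values are hypothetical.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor, with the sensor on the supply side."""
    return v_supply * r_fixed / (r_fixed + r_sensor)

def sensor_resistance(v_out, v_supply, r_fixed):
    """Invert the divider to recover the sensor resistance from a reading."""
    return r_fixed * (v_supply - v_out) / v_out

def adc_counts(v_out, v_ref, bits):
    """Ideal ADC: quantize the range 0..v_ref into 2**bits levels."""
    levels = 2 ** bits
    return min(levels - 1, int(v_out / v_ref * levels))
```

Choosing `r_fixed` near the middle of the sensor's resistance range tends to give the largest voltage swing.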
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e., rotation frequency is synchronized with the frequency of the supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on the sensor bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
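As one of the simplest digital filters, a one-pole low-pass can be sketched in a few lines. The function name and the smoothing coefficient `alpha` are assumptions for illustration; smaller values of `alpha` smooth more.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1."""
    y = 0.0
    out = []
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out
```

A constant input passes through (after a short transient), while a signal that alternates sign every sample is strongly attenuated. The filter is causal, so it suits streaming sensor data.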
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
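A sketch combining three of the variants above: a fixed threshold, a local maximum test and a spacing constraint. The function name is hypothetical, and resolving spacing conflicts in favour of the stronger peak is one possible design choice among several.

```python
def find_peaks(samples, threshold, min_spacing):
    """Return sorted (index, amplitude) pairs of local maxima above threshold.

    A peak closer than min_spacing samples to a stronger peak is dropped.
    """
    candidates = [
        (i, samples[i])
        for i in range(1, len(samples) - 1)
        if samples[i] > threshold
        and samples[i] > samples[i - 1]
        and samples[i] >= samples[i + 1]
    ]
    kept = []
    # Visit the strongest candidates first so they win spacing conflicts.
    for idx, amp in sorted(candidates, key=lambda p: -p[1]):
        if all(abs(idx - k) >= min_spacing for k, _ in kept):
            kept.append((idx, amp))
    return sorted(kept)
```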
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
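The tradeoff for the two window sizes above can be worked out directly: the spacing between frequency bins is the sample rate divided by the window size, and the window duration is the inverse of that. The helper name is hypothetical.

```python
def stft_resolution(window_size, sample_rate):
    """Return (bin spacing in Hz, window duration in seconds)."""
    return sample_rate / window_size, window_size / sample_rate

for n in (256, 4096):
    df, dt = stft_resolution(n, 22050)
    print(f"{n} samples: {df:.2f} Hz per bin, {dt * 1000:.1f} ms per window")
# 256 samples: 86.13 Hz per bin, 11.6 ms per window
# 4096 samples: 5.38 Hz per bin, 185.8 ms per window
```

The product of the two numbers is always 1, so finer frequency resolution is paid for with coarser time resolution.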
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering, 2D
  – Anisotropic diffusion, preserves edges
  – Median filtering
  – Frequently done in wavelet domain
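Median filtering is easy to sketch in one dimension, for example on a single sensor stream. The same idea extends to the time-frequency plane or to images by taking the median over a 2D neighbourhood. The function name is hypothetical.

```python
from statistics import median

def median_filter(samples, width):
    """Replace each sample by the median of a centred window of odd width.

    The window is shortened at the two ends of the sequence.
    """
    half = width // 2
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - half): n + half + 1]
        out.append(median(window))
    return out
```

Unlike a low-pass filter, the median removes isolated spikes while leaving step edges in place.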
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data: frequently linear or quadratic
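For sensor data, linear interpolation is often enough. The sketch below resamples irregularly timestamped readings onto new time points; holding the end values outside the measured range is an assumption made here, and the function name is hypothetical.

```python
from bisect import bisect_right


def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times; times must be increasing."""
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])    # assumption: hold the first value before the start
            continue
        if t >= times[-1]:
            out.append(values[-1])   # assumption: hold the last value after the end
            continue
        i = bisect_right(times, t) - 1
        frac = (t - times[i]) / (times[i + 1] - times[i])
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out


# Readings taken at uneven times, resampled onto other time points.
print(resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], [0.5, 2.0, 3.5]))
```

The same routine can bring two sensor streams with different rates onto a common clock before they are combined.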
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power-law functions frequently used
• Frequently used in conjunction with calibration
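Calibration and scaling are often chained: first learn the usable range of a sensor, then map the normalized reading onto a control parameter. The sketch below is one hedged way to do that; the class name, the function name, and the example ranges are all illustrative assumptions.

```python
class RangeCalibrator:
    """Tracks the minimum and maximum raw reading seen so far."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def observe(self, reading):
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def normalize(self, reading):
        """Map a raw reading to [0, 1] using the observed range (clamped)."""
        if self.lo is None or self.hi == self.lo:
            return 0.0
        u = (reading - self.lo) / (self.hi - self.lo)
        return min(1.0, max(0.0, u))


def scale_power(u, out_min, out_max, exponent=2.0):
    """Power-law mapping of a normalized value u in [0, 1] to [out_min, out_max]."""
    return out_min + (out_max - out_min) * (u ** exponent)


calibrator = RangeCalibrator()
for raw in (100, 900, 500):      # hypothetical raw knob readings during calibration
    calibrator.observe(raw)
frequency = scale_power(calibrator.normalize(500), 20.0, 20000.0, exponent=3.0)
```

With an exponent above 1, most of the knob travel covers the low end of the output range, which is one common reason power laws are used for frequency and volume.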
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation-based
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
• Efficient computation when N is a power of 2
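A correlation-based periodicity detector can be sketched directly from the definition. The version below sums products explicitly, which is slower than the FFT-based computation the slide alludes to but easier to follow; function names and the lag search range are assumptions.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag (direct sum)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [sum(centered[i] * centered[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A pulse every 8 samples, e.g. a regularly strummed gesture.
pulses = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
print(estimate_period(pulses, min_lag=2, max_lag=20))
```

Restricting the lag range is a simple way to avoid picking the trivial peak at lag 0 and to limit octave-type errors at multiples of the true period.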
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
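Mel Frequency Cepstral Coefficients
• Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
• Filter band edges relative to the centre frequency CF: CF-130 and CF+130 (linear filters); CF/1.0718 and CF*1.0718 (log filters)
• Pipeline: Mel-filtering, then Log, then DCT, giving the MFCCs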
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; the high coefficients can be discarded
• MFCCs keep the first 13-20
• MDCT (overlap-based) is used in MP3, AAC, and Vorbis audio compression
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms (figure: image reduced to 256 colors)
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features
ICASSP 2013 tutorial
Principal Component Analysis
• PCA: eigenanalysis of the correlation matrix
• Projection matrix
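As a rough illustration of the eigenanalysis step, the sketch below estimates only the first principal component with power iteration. It works on the covariance matrix (the correlation matrix is the covariance of standardized features) and assumes the data is not degenerate; all names are hypothetical.

```python
import math


def covariance_matrix(rows):
    """Return (feature means, sample covariance matrix) for a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return means, cov


def first_principal_component(cov, iterations=200):
    """Dominant eigenvector of cov via power iteration (unit length)."""
    d = len(cov)
    v = [1.0] * d  # assumption: the start vector is not orthogonal to the answer
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def project(rows, means, component):
    """Project each mean-removed row onto the component (one column of a projection matrix)."""
    return [sum((r[j] - means[j]) * component[j] for j in range(len(component)))
            for r in rows]


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, cov = covariance_matrix(data)
pc1 = first_principal_component(cov)
scores = project(data, means, pc1)
```

Stacking several such components as columns gives the projection matrix used for dimensionality reduction.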
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning: a training set consisting of feature vectors and the associated ground-truth labels is provided
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest neighbors
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  – DBSCAN
• Graph-based
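K-means is the simplest of the algorithms listed. The sketch below alternates assignment and centroid update for a fixed number of iterations; seeding the centroids with the first K points is an assumption made to keep the example deterministic, and practical implementations usually use random or smarter initialization.

```python
def kmeans(points, k, iterations=20):
    """Return (centroids, labels) after a fixed number of Lloyd iterations."""

    def nearest(p, centroids):
        return min(range(len(centroids)),
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))

    centroids = [list(p) for p in points[:k]]  # assumption: first k points as seeds
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The number of clusters K is passed in, matching the slide's point that K is frequently provided from domain knowledge.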
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin index
  – Dunn index
• External evaluation (additional ground-truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard index
  – Confusion matrix
• Various types of user studies
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
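To illustrate goodness of fit, the sketch below fits a straight line by ordinary least squares and computes the coefficient of determination. It handles a single input feature only, and the function names and sample numbers are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]          # made-up sensor-to-parameter pairs
slope, intercept = fit_line(xs, ys)
fit = r_squared(ys, [slope * x + intercept for x in xs])
```

An R squared close to 1 means the line explains most of the variance in the target; it says nothing on its own about statistical significance.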
ICASSP 2013 tutorial
Surrogate Sensors
• Use direct sensors to "learn" indirect acquisition
  – Use augmented instrument for training
  – Record acoustic signal
  – Train model to associate direct sensor with the acoustic signal
  – Evaluate and iterate
  – Use trained model in non-
Reference: A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
• Regression
• Classification
ICASSP 2013 tutorial
Some Results
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input, output, or intermediate stages
• Various possible architectures combining machine learning building blocks
ICASSP 2013 tutorial
Sensor Fusion
• Early and late fusion are the extremes of a full spectrum of possibilities
(figure: two sensor streams, each passing through Feature Extraction, Dimensionality Reduction, Feature Selection, and Classification)
Multi-modal Results
• Main idea: use the camera to constrain factorization results, taking advantage of uncorrelated errors
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
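The slide gives only the title, so the following is a generic textbook sketch of dynamic time warping rather than the specific formulation used in the tutorial. It aligns two one-dimensional sequences that may be locally stretched in time and returns the accumulated cost; the absolute-difference local cost is an assumption.

```python
def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a only, step in b only, step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# The same gesture performed at two speeds aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Note that this version needs both complete sequences, so it is not causal in the sense of the previous slide.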
ICASSP 2013 tutorial
Modeling Uncertainty over Time
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property: current state depends only on a finite history (typically just the previous time slice)
• Transition model: P(current state | previous state)
ICASSP 2013 tutorial
Inference tasks in temporal models
• Filtering: posterior distribution over current state given evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
ICASSP 2013 tutorial
Hidden Markov Models I
(figure: a hidden state, a single discrete variable, evolves over time while observations are emitted at each step)
• Model
  – Transition probabilities: P(hidden state at t | hidden state at t-1)
  – Emission probabilities: P(observation at t | hidden state at t)
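The filtering task for a hidden Markov model can be sketched as a predict step using the transition probabilities followed by an update step using the emission probabilities. The two-state model and its numbers below are made up for illustration and do not come from the slides.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered belief P(state_t | observations up to t) for each t.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


# Hypothetical model: state 0 = "playing", state 1 = "resting";
# observation 0 = "sensor active", observation 1 = "sensor quiet".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, [0, 0])
```

Each step uses only past and current observations, so this form of inference is causal and suitable for real-time use.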
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian (LG) conditional distributions represent state and sensor models
• LG: $P(x \mid y) = \mathcal{N}(a_y\,y + b_y,\ \sigma_y)(x)$
• Next state is a linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
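The prediction and update steps can be shown with the simplest possible case: a scalar state that is assumed to stay roughly constant between measurements. This is a sketch under that assumption, not the multi-dimensional filter implied by the covariance matrices on the slide; names and default values are hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly drifting value; returns the estimate per step."""
    x, p = x0, p0              # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction: the model says "unchanged", so only the uncertainty grows.
        p = p + process_var
        # Update: the gain balances model reliability against observation reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


readings = [5.0] * 50          # hypothetical steady sensor value
smoothed = kalman_1d(readings, process_var=1e-4, measurement_var=0.1)
```

A large `measurement_var` makes the filter trust its model and react slowly; a large `process_var` makes it follow the measurements closely.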
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
(figure: what the user sees; the wizard)
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p and df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
(figure: two-tailed case with two regions for rejecting the null hypothesis, one-tailed case with one region; critical value t(p, df) marked)
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
• note df computation
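The formulas on this slide did not survive extraction, so the sketch below uses the standard equal-variance (pooled) form of the independent two-sample t statistic as an assumption about what was shown. Function and variable names are hypothetical, and the sample data is made up.

```python
import math


def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2                                      # degrees of freedom
    pooled_var = (ss1 + ss2) / df                         # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))   # standard error of difference
    return (mean1 - mean2) / sed, df


control = [1.0, 2.0, 3.0, 4.0, 5.0]      # e.g. task times with the old interface
treatment = [3.0, 4.0, 5.0, 6.0, 7.0]    # e.g. task times with the new interface
t, df = pooled_t(control, treatment)
```

Here t is -2.0 with df = 8. The two-tailed critical value at α = 0.05 for df = 8 is 2.306, so with these made-up numbers the null hypothesis would not be rejected.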
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    httpwwwmedcalcbemanualmpage13-04bhtml
    httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
ICASSP 2013 tutorial
Integration and implementation
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Sources: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
(figure: mapping types arranged by approximate time per gesture for connected speech, in msec)
• Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
• Approximate time per gesture (msec): 10-30, 100, 130, 200, 500
• Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network
Glove-TalkII System
(figure: system block diagram)
• Inputs
  – Right hand data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz)
  – Foot pedal
  – Contact switches
• Preprocessor
• Fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping
• Combining function
• Synthesizer
• Speech output
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10-5-1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2-11-8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
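The behaviour described above (the closest unit dominates, with interpolation between centres controlled by the width) can be seen in a small sketch of a normalized Gaussian RBF layer. This is a generic illustration, not the Glove-TalkII implementation; the centres and weights are made up, and only the width of 0.15 echoes the vowel network slide.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations for each centre, normalized to sum to 1."""
    activations = [
        math.exp(-sum((a - c) ** 2 for a, c in zip(x, centre)) / (2.0 * width ** 2))
        for centre in centres
    ]
    total = sum(activations)  # assumption: x is close enough that this is not zero
    return [a / total for a in activations]


def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations (one output value)."""
    hidden = normalized_rbf(x, centres, width)
    return sum(h * w for h, w in zip(hidden, weights))


centres = [(0.0, 0.0), (1.0, 0.0)]       # hypothetical hand positions for two vowels
weights = [300.0, 700.0]                 # hypothetical formant value per centre
near_first = rbf_output((0.0, 0.0), centres, 0.15, weights)
halfway = rbf_output((0.5, 0.0), centres, 0.15, weights)
```

Near a centre the output plateaus at that centre's weight, and halfway between two centres it is their average, which is the interpolation behaviour the slide describes.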
ICASSP 2013 tutorial 142
Consonant Network
• 10-14-9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives, and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Glove-TalkII System: output
• 3 neural nets
• Output: parallel formant speech synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
• Covered many aspects
  – Sensors and Actuators
  – Signals and Features
  – Uncertainty and time
  – Human Computer Interaction
  – Integration and implementation
  – Case Studies
Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test, and evaluate radical ideas for HCI
Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
ICASSP 2013 tutorial
Smartphones as instruments
(figure: iPhone Ocarina from Smule™ (Wang et al. 2009))
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain- and person-specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher-level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI/user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP, or programming
• Integration: putting it all together
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
• First organized as a workshop of ACM CHI'2001
• Experience Music Project, Seattle, April 2001
• Lectures/Discussions/Demos/Performances
ICASSP 2013 tutorial
Research on HCI/Music
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
(figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky)
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more …
ICASSP 2013 tutorial
Transduction and Digitizing
• Electrical: voltage, resistance, impedance
• Optical: color, intensity
• Magnetic: induced current, field direction
ICASSP 2013 tutorial
Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
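One common way to convert a change in resistance into a voltage is a voltage divider read by an analog-to-digital converter. The sketch below assumes that circuit, a 5 V supply, and a 10-bit converter; none of these values are stated on the slide, and the function names are hypothetical.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor in a sensor + fixed-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)


def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code in [0, 2**bits - 1]."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))


# A hypothetical sensor whose resistance drops from 100 kOhm to 1 kOhm when pressed,
# read through a 10 kOhm fixed resistor.
for r_sensor in (100000.0, 10000.0, 1000.0):
    v = divider_voltage(r_sensor, 10000.0)
    print(r_sensor, round(v, 3), adc_code(v))
```

The response is nonlinear in the sensor resistance, which is one reason the calibration and scaling steps discussed elsewhere in the tutorial are usually needed afterwards.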
ICASSP 2013 tutorial
Physical Property Sensors
• Piezoelectric sensors
• Force-sensing resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric field sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs. 1-2 W)
• offer single-chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors – i.e., frequency synchronized with the frequency of the supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on the sensor bar (placed on the TV) for triangulation, for use as a pointing device
• A large diversity of control styles is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
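As a minimal illustration of a digital filter operating sample by sample, the sketch below implements a one-pole low-pass (exponential smoothing) filter. The choice of this particular filter and the `alpha` parameter name are assumptions made for illustration; the slides do not prescribe a specific design.

```python
def one_pole_lowpass(samples, alpha=0.1):
    """Smooth a sequence with y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha in (0, 1]: smaller values attenuate high frequencies more strongly.
    """
    out = []
    state = samples[0] if samples else 0.0  # start at the first sample to avoid a jump
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


if __name__ == "__main__":
    noisy_step = [0.0] * 5 + [1.0, 0.8, 1.2, 0.9, 1.1] * 4
    print([round(v, 3) for v in one_pole_lowpass(noisy_step, alpha=0.3)])
```

This kind of smoother is often enough to tame jitter in slowly varying sensor readings before mapping them to a control parameter.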
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
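The variants listed above can be combined in a few lines. The following is a sketch under stated assumptions: a fixed threshold, a strict local-maximum test, a minimum spacing that keeps the larger of two nearby peaks, and parabolic interpolation through three samples. The function name and parameters are hypothetical.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (position, amplitude) pairs for local maxima above threshold."""
    peaks = []
    for n in range(1, len(x) - 1):
        a, b, c = x[n - 1], x[n], x[n + 1]
        if b <= threshold or not (a < b >= c):
            continue
        # Parabolic interpolation through the three samples around the maximum.
        offset = 0.5 * (a - c) / (a - 2 * b + c)
        pos = n + offset
        amp = b - 0.25 * (a - c) * offset
        if peaks and pos - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger one.
            if amp > peaks[-1][1]:
                peaks[-1] = (pos, amp)
            continue
        peaks.append((pos, amp))
    return peaks


if __name__ == "__main__":
    signal = [0, 1, 0, 0, 2, 0, 0.2, 0.1]
    print(find_peaks(signal, threshold=0.5))
```

An adaptive threshold (for example a running median plus a margin) could replace the fixed `threshold` without changing the rest of the loop.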
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
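The time-frequency tradeoff for the two window sizes on this slide can be worked out directly: a longer window gives finer frequency resolution but coarser time resolution. The helper name below is hypothetical; the arithmetic is simply window duration = N / fs and bin spacing = fs / N.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


if __name__ == "__main__":
    for n in (256, 4096):
        ms, hz = stft_resolution(n, 22050)
        print(f"{n:5d} samples: {ms:6.1f} ms window, {hz:6.2f} Hz per bin")
```

At 22050 Hz, 256 samples span about 11.6 ms with bins about 86 Hz apart, while 4096 samples span about 186 ms with bins about 5.4 Hz apart.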
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
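For sensor data, linear interpolation is often sufficient. The sketch below resamples irregularly timed readings onto a uniform grid; the function name and argument layout are illustrative assumptions, and timestamps are assumed to be strictly increasing.

```python
def resample_linear(times, values, period):
    """Linearly interpolate (times, values) onto a uniform grid with the given period."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs")
    out = []
    k = 0  # index of the segment [times[k], times[k + 1]] used for interpolation
    steps = int((times[-1] - times[0]) / period)
    for i in range(steps + 1):
        t = times[0] + i * period
        while k < len(times) - 2 and times[k + 1] < t:
            k += 1
        frac = (t - times[k]) / (times[k + 1] - times[k])
        out.append(values[k] + frac * (values[k + 1] - values[k]))
    return out


if __name__ == "__main__":
    # Readings arrived at t = 0, 1 and 3 seconds; resample every 0.5 s.
    print(resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], 0.5))
```

The same routine can bring several sensor streams with different rates onto a shared time base before they are combined.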
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
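Calibration and scaling are frequently chained: first learn the observed range of a sensor, then map the normalized reading through a power law onto the control parameter. The class and function names below are hypothetical, and the exponent of 2 is only an example value.

```python
class RangeCalibrator:
    """Tracks the minimum and maximum raw reading seen during calibration."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, raw):
        self.lo = min(self.lo, raw)
        self.hi = max(self.hi, raw)

    def normalize(self, raw):
        """Map a raw reading to [0, 1], clamping values outside the learned range."""
        if self.hi <= self.lo:
            return 0.0
        u = (raw - self.lo) / (self.hi - self.lo)
        return min(max(u, 0.0), 1.0)


def power_law_scale(u, out_min, out_max, exponent=2.0):
    """Map u in [0, 1] to [out_min, out_max] along a power-law curve."""
    return out_min + (out_max - out_min) * u ** exponent


if __name__ == "__main__":
    cal = RangeCalibrator()
    for reading in (100, 900, 512):  # e.g. the performer sweeps the sensor once
        cal.observe(reading)
    print(power_law_scale(cal.normalize(500), 0.0, 1.0))  # slider to volume
```

An exponent above 1 gives finer control near the low end of the output range, which is one reason such curves are popular for volume.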
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
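A correlation-based period estimate can be sketched as follows: compute the autocorrelation of the mean-removed signal and pick the lag with the largest value inside a plausible range. This direct O(N · max_lag) form is an illustrative assumption; an FFT-based computation would be the efficient route for long signals.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    return [sum(d[i] * d[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


if __name__ == "__main__":
    # A pulse every 8 samples, e.g. onsets of a steady strumming gesture.
    pulses = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
    print(estimate_period(pulses, min_lag=2, max_lag=20))
```

Restricting the lag range is what keeps the estimator from returning lag 0 or a multiple of the true period that happens to score well.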
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
Linear filters: CF-130, CF, CF+130
Log filters: CF/1.0718, CF, CF×1.0718
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal; high coefficients can be thrown away
• MFCCs keep the first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
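The last stage of the MFCC pipeline is a DCT of the log filterbank energies, after which only the low-order coefficients are kept. The sketch below uses an unnormalized DCT-II written out directly; the input values and the `keep` parameter are made up for illustration.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def truncated_cepstrum(log_energies, keep=13):
    """Keep only the first `keep` DCT coefficients of the log filterbank energies."""
    return dct_ii(log_energies)[:keep]


if __name__ == "__main__":
    # 40 made-up log energies standing in for the output of 40 mel filters.
    log_energies = [math.log(1.0 + 0.1 * i) for i in range(40)]
    print([round(c, 3) for c in truncated_cepstrum(log_energies)])
```

For a smooth input like this one, most of the magnitude ends up in the first few coefficients, which is the energy compaction property the slide refers to.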
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground-truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest neighbors
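As a small concrete instance of a non-parametric classifier, the sketch below implements k-nearest neighbors with Euclidean distance and majority voting. The data layout (a list of `(feature_vector, label)` pairs) and the gesture labels are assumptions made for the example.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points.

    train: list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    train = [
        ((0.0, 0.0), "tap"), ((0.0, 1.0), "tap"), ((1.0, 0.0), "tap"),
        ((5.0, 5.0), "shake"), ((5.0, 6.0), "shake"), ((6.0, 5.0), "shake"),
    ]
    print(knn_predict(train, (0.2, 0.1)))
    print(knn_predict(train, (5.5, 5.5)))
```

There is no training step beyond storing the examples, which makes this a convenient baseline when labeled gesture data is scarce.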
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground-truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
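For a single input variable, ordinary least squares and the coefficient of determination fit in a few lines. The function names below are hypothetical, and the one-dimensional case is chosen only to keep the sketch short.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1]
    slope, intercept = fit_line(xs, ys)
    preds = [slope * x + intercept for x in xs]
    print(round(slope, 3), round(intercept, 3), round(r_squared(ys, preds), 4))
```

An R squared close to 1 means the fitted line explains most of the variance in the target; it says nothing by itself about statistical significance, which is what the F-test and t-tests address.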
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history – typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
99
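The filtering task can be illustrated with the forward algorithm for a discrete hidden Markov model: predict with the transition model, weight by the likelihood of the new observation, renormalize. The two-state numbers in the usage example are the familiar rain/umbrella toy values and are assumptions for illustration, not data from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered belief P(state_t | observations up to t) for each t.

    prior[i]          : P(state_0 = i)
    transition[i][j]  : P(next state = j | current state = i)
    emission[i][o]    : P(observation = o | state = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight by the observation likelihood and renormalize.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history


if __name__ == "__main__":
    prior = [0.5, 0.5]                      # state 0 = rain, state 1 = no rain
    transition = [[0.7, 0.3], [0.3, 0.7]]
    emission = [[0.9, 0.1], [0.2, 0.8]]     # observation 0 = umbrella seen
    for belief in forward_filter(prior, transition, emission, [0, 0]):
        print([round(p, 3) for p in belief])
```

Because each step uses only the previous belief and the current observation, the procedure is causal and runs in constant memory, which suits streaming sensor input.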
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
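[Figure: HMM as a model over time slices t-1 and t. The hidden state (a single discrete variable, with states numbered 1 to 4) evolves according to transition probabilities P( | ); observations are generated from the hidden state according to emission probabilities p( | ).]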
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
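A scalar Kalman filter makes the predict/update cycle and the reliability trade-off explicit. The sketch below assumes a random-walk state model (the state is expected to stay where it is, with some process noise); the variable names and default values are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Scalar Kalman filter with a random-walk state model.

    Returns a list of (estimated mean, estimated variance) after each measurement.
    """
    estimates = []
    for z in measurements:
        # Prediction: the state is assumed unchanged, but uncertainty grows.
        var += process_var
        # Update: the gain balances model reliability against observation reliability.
        gain = var / (var + measurement_var)
        mean += gain * (z - mean)
        var *= 1.0 - gain
        estimates.append((mean, var))
    return estimates


if __name__ == "__main__":
    noisy = [1.2, 0.8, 1.1, 0.9, 1.05, 0.95, 1.0]
    for m, v in kalman_1d(noisy, process_var=0.01, measurement_var=0.5):
        print(f"estimate {m:.3f}  variance {v:.3f}")
```

A large `measurement_var` makes the filter trust its model and respond slowly; a large `process_var` makes it follow the measurements more closely.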
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs. passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions showing the region (one-tailed) or regions (two-tailed) for rejecting the null hypothesis, relative to X1, X2 and the critical value t(p,df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
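The three steps above can be sketched for the independent-samples case with a pooled (combined) variance. The formulas used here are the standard equal-variance ones (df = n1 + n2 - 2); they are an assumption about what the missing slide equations showed, and the function name is hypothetical.

```python
import math


def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled variance.

    Returns (t, degrees of freedom).
    """
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                       # combined variance
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))     # standard error of difference
    return (m1 - m2) / sed, df


if __name__ == "__main__":
    control = [1.0, 2.0, 3.0, 4.0, 5.0]
    treatment = [2.0, 3.0, 4.0, 5.0, 6.0]
    t, df = pooled_t(control, treatment)
    print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared against the critical value t(p,df) for the chosen significance level, or converted to a p-value with a statistics package.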
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g., search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
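The "divide two variances" step for one-way ANOVA can be sketched as below: the between-group mean square is divided by the within-group mean square to give the F ratio. The function name is hypothetical, and looking up significance for the returned degrees of freedom is left to a table or statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside each group.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


if __name__ == "__main__":
    groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print(one_way_anova_f(groups))
    # Rescaling and shifting every observation leaves F unchanged.
    print(one_way_anova_f([[2 * x + 5 for x in g] for g in groups]))
```

The second call illustrates the independence from scaling and bias: both sums of squares are multiplied by the same factor, so their ratio is unaffected.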
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of gesture-to-speech mappings along an axis of approximate time per gesture for connected speech (10-30, 100, 130, 200, 500 msec). Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990).]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centres
  – Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
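A minimal sketch of a normalized Gaussian RBF layer, to illustrate why the closest centre dominates and how the width parameter controls interpolation. The function names, the example centres and the output weights are hypothetical; this is not the Glove-TalkII implementation.

```python
import math

def normalized_rbf(x, centres, width):
    """Return Gaussian RBF activations for point x, normalized to sum to 1."""
    # Squared distance from the input to every centre.
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtracting the smallest distance leaves the normalized result
    # unchanged but avoids dividing by zero far away from all centres.
    d2_min = min(d2)
    raw = [math.exp(-(d - d2_min) / (2.0 * width ** 2)) for d in d2]
    total = sum(raw)
    return [r / total for r in raw]

def rbf_output(x, centres, width, out_weights):
    """Blend one output vector per hidden unit by the normalized activations."""
    acts = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(a * w[k] for a, w in zip(acts, out_weights)) for k in range(n_out)]
```

With a small width the output plateaus at the value attached to the nearest centre; halfway between two centres it is the average of their values.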
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]
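A rough sketch of the last stage of such a pipeline: blending the two expert outputs and quantizing each synthesizer parameter to 6 bits. The slides only name a "Combining Function", so the probability-weighted blend below is an assumption, as are the function names and the parameter ranges.

```python
def combine_experts(p_vowel, vowel_params, consonant_params):
    """Assumed combining rule: weight each expert by the V/C probability."""
    return [p_vowel * v + (1.0 - p_vowel) * c
            for v, c in zip(vowel_params, consonant_params)]

def quantize_6bit(value, lo, hi):
    """Map a value in [lo, hi] to an integer code in [0, 63], clipping outliers."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(value, lo), hi)
    return int(round((clipped - lo) / (hi - lo) * 63))
```

At a 100 Hz frame rate this would be called once per parameter every 10 ms.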
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Smartphones as instruments
15
Motivation and Overview
iPhone Ocarina from Smule™ (Wang et al., 2009)
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project, Seattle, April 2001
Lectures / Discussions / Demos / Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processingActuators + signal processingHCI
Engineering and programmingMusic Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
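One common way to turn a variable resistance into a voltage is a voltage divider feeding an ADC. The sketch below assumes that circuit (the slide's figure is not recoverable); the function names, the placement of the sensor on the supply side and the component values are illustrative only.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor, with the sensor on the supply side."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref, bits):
    """Ideal ADC: clip to [0, v_ref] and truncate to an integer code."""
    levels = (1 << bits) - 1
    clipped = min(max(voltage, 0.0), v_ref)
    return int(clipped / v_ref * levels)

def sensor_resistance(v_out, v_in, r_fixed):
    """Invert the divider to recover the sensor resistance from a reading."""
    if v_out <= 0.0:
        raise ValueError("v_out must be positive")
    return r_fixed * (v_in - v_out) / v_out
```

For example, a 10 kΩ sensor in series with a 10 kΩ fixed resistor on a 5 V supply gives 2.5 V at the ADC input.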
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When conductor (such as finger) applied to layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancelation
• Cheap
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
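The variants listed above can be combined in a few lines. The sketch below is one possible arrangement (fixed threshold, local maximum test, minimum spacing, parabolic interpolation), not a reference implementation; the function name and the tie-breaking rules are choices made for illustration.

```python
def find_peaks(x, threshold, min_spacing):
    """Return a list of (location, amplitude) pairs for peaks in x.

    A sample is a candidate if it is at or above `threshold` and is a local
    maximum. Its location and amplitude are refined by fitting a parabola
    through the sample and its two neighbours. Candidates closer than
    `min_spacing` samples to the previous peak keep only the larger one.
    """
    peaks = []
    for i in range(1, len(x) - 1):
        if x[i] < threshold:
            continue
        if not (x[i] > x[i - 1] and x[i] >= x[i + 1]):
            continue
        # Parabolic interpolation; the denominator is strictly negative here
        # because x[i] > x[i-1] and x[i] >= x[i+1].
        denom = x[i - 1] - 2.0 * x[i] + x[i + 1]
        offset = 0.5 * (x[i - 1] - x[i + 1]) / denom
        loc = i + offset
        amp = x[i] - 0.25 * (x[i - 1] - x[i + 1]) * offset
        if peaks and loc - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (loc, amp)
            continue
        peaks.append((loc, amp))
    return peaks
```

An adaptive threshold could be obtained by passing, for instance, a multiple of a running median instead of a constant.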
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for powers of 2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time varying power spectra (3D with animation or 2D)
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
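For sensor streams, the linear case mentioned in the last bullet is often enough. A minimal sketch, assuming timestamps are strictly increasing and the requested times are sorted and lie inside the recorded range (the function name is hypothetical):

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times.

    Assumes len(times) >= 2, times strictly increasing, and new_times
    sorted in ascending order within [times[0], times[-1]].
    """
    out = []
    j = 0  # index of the left neighbour of the current query time
    for t in new_times:
        if t < times[0] or t > times[-1]:
            raise ValueError("query time outside the recorded range")
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out
```

This is handy for bringing an irregularly timed sensor stream onto the fixed clock of another stream before combining them.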
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
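A small sketch of calibration followed by scaling: first find the observed range of a sensor, then map a normalized reading onto a control range with a power law. The function names, the exponent and the example ranges are illustrative assumptions.

```python
def calibrate_range(readings):
    """Return (lo, hi) observed while the performer sweeps the sensor."""
    lo, hi = min(readings), max(readings)
    if hi == lo:
        raise ValueError("sensor did not move during calibration")
    return lo, hi

def normalize(reading, lo, hi):
    """Map a raw reading to [0, 1], clipping values outside the range."""
    return min(max((reading - lo) / (hi - lo), 0.0), 1.0)

def scale_power(u, out_min, out_max, exponent):
    """Power-law mapping of u in [0, 1] onto [out_min, out_max]."""
    return out_min + (out_max - out_min) * u ** exponent
```

An exponent above 1 gives finer control at the low end of the range, which is often preferred for volume-like parameters.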
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
CFCF-130CF 10718
CF+130CF 10718
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
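To make the energy compaction claim concrete, here is a direct (O(N²)) orthonormal DCT-II together with a helper that measures how much energy the first few coefficients hold. The function names are hypothetical, and real systems would use a fast transform instead.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) evaluation)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * s)
    return coeffs

def energy_fraction(coeffs, keep):
    """Fraction of total energy contained in the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total
```

For a smooth signal such as a ramp, the first two coefficients already hold nearly all of the energy, which is the property MFCCs exploit when they keep only the low-order coefficients of the log mel spectrum.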
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
  – Non-parametric
    • K-nearest Neighbors
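As a concrete instance of the non-parametric case, here is a minimal k-nearest-neighbours classifier using squared Euclidean distance and majority voting. The function name and the toy labels in the usage note are made up for illustration.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training vectors (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

With training vectors clustered around (0, 0) labelled "kick" and around (5, 5) labelled "snare", a query near (0.5, 0.5) is classified as "kick". There is no training step; all the cost is paid at prediction time.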
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
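A short sketch of ordinary least squares for a single input feature, together with the coefficient of determination. The function names are hypothetical; multi-feature regression would need a linear solver instead of these closed-form sums.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

An R squared of 1 means the predictions match the data exactly; a value near 0 means the model does no better than predicting the mean.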
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each consisting of Feature Extraction, Dimensionality Reduction, Feature Selection and Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history – typically just previous time slice)
• Transition Model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
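The filtering task can be sketched for a discrete hidden state as a predict/update recursion (the forward algorithm). The function name and the table layout (`transition[i][j]` is P(next = j | current = i), `emission[j][o]` is P(observation = o | state = j)) are conventions chosen here, not taken from the slides.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the posterior over the hidden state after each observation."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history
```

The normalizer `total` at each step is the likelihood of the new observation given the earlier ones, which is why filtering also yields the likelihood of the evidence.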
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: hidden Markov model. Hidden state (single discrete variable) evolving over time with transition probabilities P(state at t | state at t-1); observations generated with emission probabilities p(observation at t | state at t).]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
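A scalar sketch of one predict/update cycle. It assumes the simplest possible model (the state stays the same apart from Gaussian process noise, and the sensor measures it directly), which is simpler than the constant-velocity model mentioned above; the function and parameter names are hypothetical.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One predict/update cycle of a 1-D Kalman filter.

    Model assumed here: x[t+1] = x[t] + process noise,
                        z[t]   = x[t] + measurement noise.
    Returns the new (mean, variance) of the state estimate.
    """
    # Prediction: the mean is unchanged, uncertainty grows.
    pred_mean = mean
    pred_var = var + process_var
    # Update: the gain balances model reliability against sensor reliability.
    gain = pred_var / (pred_var + measurement_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var
```

A noisy sensor (large `measurement_var`) drives the gain toward 0 so the estimate barely moves; a reliable sensor drives it toward 1 so the estimate follows the measurement.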
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: the Wizard; what the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of a difference at least as large as the observed one arising by chance when the null hypothesis holds
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions with sample means X1 and X2, the critical value t(p,df), and the region(s) for rejecting the null hypothesis]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
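The formulas for these steps did not survive in the transcript. A minimal sketch follows, assuming the standard pooled-variance (equal variance) independent two-sample t-test; the function name and the sample data are hypothetical.

```python
import math
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled variance (assumes equal variances)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2                              # degrees of freedom
    # combined (pooled) variance of the two samples
    s2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    sed = math.sqrt(s2 * (1.0 / n1 + 1.0 / n2))   # standard error of the difference
    t = (mean(sample1) - mean(sample2)) / sed
    return t, df

# Hypothetical measurements for a control and a treatment group
control = [10.0, 12.0, 11.0, 13.0]
treatment = [14.0, 15.0, 13.0, 16.0]
t, df = pooled_t(control, treatment)
print(t, df)
```

With these numbers |t| is about 3.29 with df = 6, which exceeds the two-tailed critical value of 2.447 at α = 0.05, so the null hypothesis would be rejected at that level.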
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    httpwwwmedcalcbemanualmpage13-04bhtml
    httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
[Table: critical values of t; column headers are two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
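As an illustration of "dividing two variances", here is a minimal one-way ANOVA sketch in plain Python. The function name and data are hypothetical; the resulting F ratio would still have to be compared against an F table (or a statistics package) to obtain significance.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = between-group mean square / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)     # variance explained by group membership
    ms_within = ss_within / (n - k)       # residual variance inside the groups
    return ms_between / ms_within, (k - 1, n - k)

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, dfs = one_way_anova_f(groups)

# Scaling and adding a bias to every observation leaves F unchanged.
rescaled = [[10.0 * x + 3.0 for x in g] for g in groups]
f_rescaled, _ = one_way_anova_f(rescaled)
print(f, dfs, f_rescaled)
```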
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: gesture-to-speech mappings ordered by approximate time/gesture for connected speech (msec); axis ticks 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: block diagram. Right Hand Data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz). Other inputs: Foot Pedal, Contact Switches. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer, Speech Output]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
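A minimal sketch of normalized Gaussian RBF units, to make the "closest RBF dominates" behaviour concrete. This is not the Glove-TalkII code; the centres, the width value and the function name are assumptions chosen for illustration.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian RBF activations, normalized so they sum to 1 over the group."""
    dist2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    nearest = min(dist2)
    # Subtracting the nearest distance avoids underflow far from every centre
    # and does not change the normalized result.
    raw = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in dist2]
    total = sum(raw)
    return [r / total for r in raw]

centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]    # hypothetical targets in the x, y plane
near = normalized_rbf((0.1, 0.0), centres, 0.15)  # close to the first centre
mid = normalized_rbf((0.5, 0.0), centres, 0.15)   # halfway between the first two
far = normalized_rbf((-5.0, 0.0), centres, 0.15)  # far from all centres
print(near, mid, far)
```

Near a centre that unit's activation is close to 1, halfway between two centres they share the activation equally, and far from all centres the nearest unit still dominates (the plateau). A larger width gives smoother interpolation between centres.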
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
160
wwwnimeorg
Sid George ssfelseceubcca gtzancsuvicca
Tuesday 22 October 13
ICASSP 2013 tutorial
Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and
16
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
• Electrical: Voltage, Resistance, Impedance
• Optical: Color, Intensity
• Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
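One common way to turn a resistance change into a voltage is a voltage divider read by an ADC. The sketch below assumes the sensor sits between the supply and the measurement node with a fixed resistor to ground; the 5 V reference, 10-bit resolution and resistor values are assumptions for illustration, not values from the slides.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a sensor + fixed-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Ideal ADC reading for a voltage between 0 and v_ref."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * levels)

# Hypothetical sensor whose resistance drops from 10 kOhm to 1 kOhm when pressed
v_rest = divider_voltage(5.0, 10_000.0, 10_000.0)
v_pressed = divider_voltage(5.0, 10_000.0, 1_000.0)
counts_rest = adc_counts(v_rest)
counts_pressed = adc_counts(v_pressed)
print(v_rest, counts_rest, v_pressed, counts_pressed)
```

As the sensor resistance falls, the measured voltage (and the ADC count) rises, so the digitized value tracks the physical quantity.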
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD amp CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure bull Motivation and Overview bull Sensors and Actuators bull Signals and Features bull Uncertainty and time bull Human Computer Interaction bull Integration and implementation bull Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
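A minimal sketch of a digital low-pass filter operating sample by sample: a one-pole recursive smoother, one of the simplest possible choices. The coefficient value and function name are assumptions for illustration.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n - 1], with 0 < alpha <= 1."""
    out, y = [], 0.0
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out

slow = [1.0] * 50                          # constant (0 Hz) input
fast = [(-1.0) ** n for n in range(50)]    # alternates every sample (highest frequency)

slow_out = one_pole_lowpass(slow, 0.2)
fast_out = one_pole_lowpass(fast, 0.2)
print(slow_out[-1], fast_out[-1])
```

The constant input passes almost unchanged once the filter settles, while the rapidly alternating input is strongly attenuated. A smaller alpha smooths more but responds more slowly, which matters for noisy sensor readings in real-time use.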
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
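A minimal sketch combining several of the variants listed above: fixed threshold, local maximum, a spacing constraint and parabolic interpolation. It is one possible recipe rather than a reference method, and the function name and parameters are hypothetical.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima above a fixed threshold.

    Peaks closer than min_spacing samples to the previously accepted peak are
    skipped (a simple greedy rule). Location and amplitude are refined by
    fitting a parabola through the peak sample and its two neighbours.
    """
    peaks = []
    last_index = None
    for n in range(1, len(signal) - 1):
        left, mid, right = signal[n - 1], signal[n], signal[n + 1]
        if mid < threshold or not (mid > left and mid >= right):
            continue
        if last_index is not None and n - last_index < min_spacing:
            continue
        denom = left - 2.0 * mid + right          # negative at a local maximum
        offset = 0.5 * (left - right) / denom if denom != 0 else 0.0
        amplitude = mid - 0.25 * (left - right) * offset
        peaks.append((n + offset, amplitude))
        last_index = n
    return peaks

signal = [0.0, 1.0, 3.0, 2.0, 0.0, 0.5, 0.2, 0.0]
peaks = find_peaks(signal, threshold=1.0)
print(peaks)
```

The small bump at index 5 is rejected by the threshold, and the interpolated location of the main peak falls slightly to the right of sample 2 because its right neighbour is larger than its left one.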
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
[Figure: spectrograms labelled "256 samples, 22050 Hz" and "4096 samples, 22050 Hz"]
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
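For sensor data the linear case is often enough. The sketch below resamples irregularly timed readings onto a uniform grid with linear interpolation; the function name, the timestamps and the target rate are assumptions for illustration, and timestamps are expected to be strictly increasing.

```python
def resample_linear(times, values, rate_hz):
    """Resample (time, value) pairs onto a uniform grid by linear interpolation."""
    period = 1.0 / rate_hz
    count = int((times[-1] - times[0]) / period + 1e-9) + 1
    out, i = [], 0
    for k in range(count):
        t = times[0] + k * period
        # advance to the segment [times[i], times[i + 1]] that contains t
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

# Hypothetical asynchronous sensor events (seconds, reading)
times = [0.0, 0.1, 0.4, 0.5]
values = [0.0, 1.0, 4.0, 5.0]
uniform = resample_linear(times, values, 10.0)   # 10 Hz output grid
print(uniform)
```

Because the example readings lie on a straight line, the resampled values fill the gap between 0.1 s and 0.4 s with evenly spaced steps.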
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  – Power law functions frequently used
• Frequently used in conjunction with calibration
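A minimal sketch of scaling a calibrated sensor range onto a control range, with an optional power law and an exponential knob-to-frequency mapping. The ranges (0-1023 readings, 110-880 Hz) and function names are assumptions for illustration.

```python
def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a calibrated input range onto an output range with an optional power law."""
    norm = (reading - in_min) / (in_max - in_min)
    norm = min(max(norm, 0.0), 1.0)               # clamp out-of-range readings
    return out_min + (norm ** exponent) * (out_max - out_min)

def knob_to_frequency(reading, in_min, in_max, f_low=110.0, f_high=880.0):
    """Exponential mapping: equal knob movements give equal musical intervals."""
    norm = min(max((reading - in_min) / (in_max - in_min), 0.0), 1.0)
    return f_low * (f_high / f_low) ** norm

volume = scale(511.5, 0.0, 1023.0, 0.0, 1.0, exponent=2.0)   # slider to volume
clipped = scale(2000.0, 0.0, 1023.0, 0.0, 1.0)               # reading above range
freq_top = knob_to_frequency(1023.0, 0.0, 1023.0)
freq_mid = knob_to_frequency(511.5, 0.0, 1023.0)
print(volume, clipped, freq_top, freq_mid)
```

Here in_min and in_max would come from a calibration step, such as recording the minimum and maximum readings while the performer sweeps the control.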
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
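A minimal sketch of correlation-based periodicity detection. It uses the direct time-domain sum, which costs O(N * max_lag); the efficient computation mentioned on the slide would use the FFT instead, and is not shown here. Function names and the test signal are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Direct autocorrelation r[lag] = sum over n of x[n] * x[n + lag]."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Lag with the largest autocorrelation value in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Sinusoid that repeats every 20 samples
signal = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
period = estimate_period(signal, 5, 50)
print(period)
```

The minimum lag excludes the trivial peak at lag 0. Multiples of the true period (here 40) also produce peaks, which is a common source of octave errors in pitch and tempo estimation.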
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: filter shapes around a centre frequency CF, with labels CF-130, CF+130 and CF 10718]
Processing chain: Mel-filtering, Log, DCT, MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
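A minimal sketch of energy compaction, using an unnormalized DCT-II written out directly. The input is an arbitrary smooth ramp chosen for illustration; real MFCC code would apply the DCT to log mel-filter energies.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum over n of x[n] * cos(pi / N * (n + 0.5) * k)."""
    size = len(x)
    return [sum(x[n] * math.cos(math.pi / size * (n + 0.5) * k) for n in range(size))
            for k in range(size)]

smooth = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]    # slowly varying input
coeffs = dct2(smooth)
energy = [c * c for c in coeffs]
low_share = sum(energy[:2]) / sum(energy)            # share held by the first 2 coefficients
print(coeffs, low_share)
```

For this input the first two coefficients carry more than 99% of the (unnormalized) coefficient energy, which is why the high coefficients can be discarded with little loss.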
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBScan
• Graph-based
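A minimal sketch of plain k-means (Lloyd's algorithm) in pure Python. Initializing the centroids with the first k points is a simplification made here for reproducibility, and the data are hypothetical; practical implementations use better initialization and a convergence test.

```python
def kmeans(points, k, iterations=20):
    """Alternate between assigning points to the nearest centroid and updating centroids."""
    centroids = [points[i] for i in range(k)]       # simplistic deterministic start
    labels = [0] * len(points)
    for _ in range(iterations):
        # assignment step: nearest centroid by squared Euclidean distance
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # update step: each centroid becomes the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, labels

points = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.3), (4.8, 5.2)]
centroids, labels = kmeans(points, 2)
print(centroids, labels)
```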
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (Cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (Additional ground truth labels): similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
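A minimal sketch of goodness of fit for a single-feature ordinary least squares line. The data and function names are hypothetical, and the significance tests listed above are not covered.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]      # roughly y = 2x with small deviations
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
print(slope, intercept, r2)
```

An R squared close to 1 means the fitted line explains nearly all of the variance in the targets; a value near 0 means it does no better than predicting the mean.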
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: fusion pipeline stages, each shown twice: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
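The slide shows dynamic time warping only as a figure. A minimal sketch of the classic dynamic-programming formulation follows, assuming 1D sequences and absolute difference as the local cost; both are illustrative choices rather than details from the tutorial.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1D sequences.

    cost[i][j] holds the cheapest alignment of a[:i] with b[:j], where
    each step may advance in a, in b, or in both.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances
                                 cost[i][j - 1],      # b advances
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The same gesture performed at two speeds aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

This version needs both complete sequences, so it is not causal. Real-time use would require an online variant, which is not shown here.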
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor readings, and update models
  - Updated models provide new estimates and
Tuesday 22 October 13
ICASSP 2013 tutorial
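Hidden Markov Models I
100
Uncertainty and Time
[Figure: hidden Markov model. The hidden state is a single discrete variable; transition probabilities P(state at t | state at t-1) link successive hidden states, and emission probabilities P(observation at t | state at t) link each hidden state to its observation.]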
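As a sketch of the filtering task for a hidden Markov model, the following forward recursion alternates a prediction through the transition model with an update by the emission model. The two states, the two observation symbols and all probabilities are hypothetical values chosen for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(current state | all observations so far) for a discrete HMM.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    belief = list(prior)
    n = len(prior)
    for obs in observations:
        # Prediction: push the belief one step through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by how well each state explains the observation.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
    return belief


# Hypothetical model: state 0 = "strumming", state 1 = "resting";
# observation 0 = "high accelerometer energy", 1 = "low energy".
prior = [0.5, 0.5]
transition = [[0.7, 0.3],
              [0.3, 0.7]]
emission = [[0.9, 0.1],
            [0.2, 0.8]]
belief = forward_filter(prior, transition, emission, [0, 0])
print(belief)
```

Only past observations are used, so the recursion is causal and can run while the sensor data arrives.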
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
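A minimal one-dimensional sketch of the predict/update cycle follows. It assumes a random-walk state model (the state is expected to stay put between steps), which is simpler than the constant-velocity model mentioned above; the function name, variances and data are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Scalar Kalman filter with a random-walk state model.

    process_var     : how much the true state may drift per step (model reliability)
    measurement_var : sensor noise variance (observation reliability)
    """
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged, uncertainty grows.
        var = var + process_var
        # Update step: the gain balances model against observation.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append(mean)
    return estimates


# Hypothetical constant sensor value observed repeatedly.
estimates = kalman_1d([1.0] * 50, process_var=0.01, measurement_var=0.1)
print(estimates[0], estimates[-1])
```

Raising measurement_var makes the filter trust its model more and react more slowly; raising process_var has the opposite effect. This is the trade-off named in the last bullet.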
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure panels: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions, work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution plots marking the critical value t(p, df), one with two regions for rejecting the null hypothesis and one with a single region for rejecting the null hypothesis; axis labels X1, X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
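The formulas on this slide did not survive extraction. The steps can be sketched as follows for the independent two-sample case, assuming the standard pooled-variance form with df = n1 + n2 - 2. The function name and the sample data are illustrative, not the tutorial's.

```python
import math


def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled (combined) variance."""
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2                                  # degrees of freedom
    pooled_var = (ss1 + ss2) / df                     # combined variance
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of difference
    t = (m1 - m2) / sed
    return t, df


# Hypothetical task-completion scores for a control and a treatment group.
control = [5, 6, 7, 8, 9]
treatment = [7, 8, 9, 10, 11]
t, df = pooled_t(control, treatment)
print(t, df)
```

Here |t| is 2.0 with df = 8. The two-tailed critical value for α = 0.05 and df = 8 in a standard table is 2.306, so in this made-up example the null hypothesis would not be rejected.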
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g. httpwwwmedcalcbemanualmpage13-04bhtml, httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
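A sketch of the "dividing two variances" step for the one-way case follows. The function name and the three groups are hypothetical, and the final table lookup that turns F into a significance level is left out.

```python
def one_way_anova_f(groups):
    """F ratio for one-way ANOVA: between-group variance / within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical ratings for three interface variants.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_b, df_w = one_way_anova_f(groups)
print(f, df_b, df_w)

# Rescaling and shifting every observation leaves F unchanged.
rescaled = [[10 * x + 5 for x in g] for g in groups]
print(one_way_anova_f(rescaled)[0])
```

The second call illustrates the first bullet: both variances scale by the same factor and the bias cancels in the differences, so the ratio is the same.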
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mapping types (Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator) annotated with example systems (Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)). Axis: approximate time/gesture for connected speech (msec), with ticks 10-30, 100, 130, 200, 500.]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Processing blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function. Output: Synthesizer, Speech Output.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
• Weights are centre
  - Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
• Closest RBF dominates
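The behaviour described above can be sketched with Gaussian units whose activations are divided by their sum. The centres, width and output weights below are made-up values for a 2D input; this illustrates the general technique and is not the Glove-TalkII network itself.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations, normalized so that they sum to 1 over the group.

    Note: if x is extremely far from every centre, all activations can
    underflow to 0.0 and the division would fail; inputs here stay close.
    """
    acts = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2 * width ** 2))
            for c in centres]
    total = sum(acts)
    return [a / total for a in acts]


def rbf_output(x, centres, width, weights):
    """Output = normalized activations times per-unit output weights."""
    h = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h[k] * weights[k][j] for k in range(len(h))) for j in range(n_out)]


# Two hypothetical hidden units, each with a made-up 2-parameter output.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[300.0, 2500.0],
           [700.0, 1200.0]]
width = 0.15  # illustrative width parameter

print(rbf_output((0.0, 0.0), centres, width, weights))  # close to unit 0's output
print(rbf_output((0.5, 0.0), centres, width, weights))  # halfway between the two
```

With a small width the output plateaus near the closest centre and switches quickly in between; a larger width gives smoother interpolation.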
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System (same block diagram)
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces
17
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more ...
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
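One common way to turn a resistance change into a voltage is a voltage divider followed by an ADC. The sketch below assumes that circuit, a 5 V supply and a 10-bit converter; these values and the function names are illustrative assumptions and not details from the slide.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a two-resistor divider.

    The variable-resistance sensor sits between the supply and the output
    node; the fixed resistor sits between the output node and ground.
    """
    return v_in * r_fixed / (r_fixed + r_sensor)


def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC reading (clipped to 0..v_ref)."""
    levels = (1 << bits) - 1
    clipped = min(max(voltage, 0.0), v_ref)
    return round(clipped / v_ref * levels)


# Hypothetical force-sensing resistor: resistance drops as force increases.
for r_sensor in (100_000, 10_000, 1_000):
    v = divider_voltage(5.0, 10_000, r_sensor)
    print(r_sensor, round(v, 3), adc_counts(v))
```

The fixed resistor is usually chosen near the middle of the sensor's resistance range, so that the output voltage spans as much of the ADC range as possible.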
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more ...
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) touches the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD amp CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - Input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
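A minimal sketch combining three of the variants listed above: a fixed threshold, a local-maximum test and a minimum spacing between peaks. The function name, parameter values and test signal are illustrative; adaptive thresholds and sub-sample interpolation are left out.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return (index, amplitude) pairs for local maxima above threshold.

    When two candidates are closer than min_spacing samples, only the
    larger one is kept.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep whichever is larger.
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks


# Hypothetical onset-strength curve from a piezo sensor.
signal = [0, 1, 0, 0.2, 0, 3, 2.5, 2.8, 0, 0]
print(find_peaks(signal, threshold=0.5, min_spacing=3))
```

The small bump at index 3 is rejected by the threshold, and the secondary maximum at index 7 is rejected by the spacing constraint because the larger peak at index 5 is only two samples away.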
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
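A sketch of scaling combined with calibration follows: the observed minimum and maximum of a sensor define the input range, and a power-law curve maps the normalized reading to the control parameter. The function name, the exponent and the ranges are illustrative assumptions.

```python
def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a sensor reading to a control range with an optional power law.

    in_min / in_max would typically come from a calibration pass that
    records the smallest and largest readings the sensor produces.
    """
    x = (reading - in_min) / (in_max - in_min)
    x = min(max(x, 0.0), 1.0)  # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * x ** exponent


# Hypothetical calibration: readings seen while the performer sweeps a slider.
observed = [37, 512, 980, 45, 730]
lo, hi = min(observed), max(observed)

# Linear mapping to a 0..1 volume, and a squared curve for finer low-end control.
print(scale(512, lo, hi, 0.0, 1.0))
print(scale(512, lo, hi, 0.0, 1.0, exponent=2.0))
```

An exponent above 1 spreads the low end of the control range over more of the sensor travel, and an exponent below 1 does the opposite. Which is appropriate depends on how the controlled parameter is perceived.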
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
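A minimal sketch of FFT-based autocorrelation for periodicity detection, assuming a textbook recursive radix-2 FFT (hence the power-of-2 length). The function names and the impulse-train test signal are illustrative.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def ifft(spectrum):
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]

def autocorrelation(signal):
    """Linear autocorrelation via the power spectrum (zero-padded)."""
    n = len(signal)
    size = 1
    while size < 2 * n:
        size *= 2
    spectrum = fft(list(signal) + [0.0] * (size - n))
    acf = ifft([abs(v) ** 2 for v in spectrum])
    return [acf[lag].real for lag in range(n)]

def dominant_period(signal, min_lag=1):
    acf = autocorrelation(signal)
    return max(range(min_lag, len(signal) // 2), key=lambda lag: acf[lag])

# Hypothetical onset train: one impulse every 8 samples
onsets = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
period = dominant_period(onsets)
```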
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
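Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: filter centre frequencies (CF); extracted labels: CF-130, CF+130 for the linear filters and a factor of 1.0718 for the log filters]
Pipeline: Mel-filtering → Log → DCT → MFCCs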
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; the high coefficients can be thrown away
• MFCCs keep the first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
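A minimal sketch of energy compaction, assuming an orthonormal DCT-II so that signal energy is preserved in the coefficients. The ramp signal is a made-up smooth example.

```python
import math

def dct_orthonormal(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) form)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                    for i in range(n))
        coeffs.append(scale * total)
    return coeffs

ramp = [float(i) for i in range(8)]       # smooth, slowly varying signal
coeffs = dct_orthonormal(ramp)
energy = sum(c * c for c in coeffs)        # equals the signal energy
fraction = sum(c * c for c in coeffs[:2]) / energy
```

For this ramp the first two coefficients carry over 99% of the energy, which is the property that lets MFCCs keep only the low coefficients.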
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
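A minimal sketch of finding the first principal component. It assumes the covariance matrix (not the correlation matrix named on the slide) and uses power iteration instead of a full eigenanalysis; the data and function name are illustrative.

```python
import math

def first_principal_component(rows, iterations=200):
    """Return (direction, means) of the dominant principal axis.

    Assumes the data is not constant, so the covariance is non-zero.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    v = [1.0] * d
    for _ in range(iterations):
        # Power iteration: repeated multiplication converges to the top eigenvector
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v, means

# Hypothetical 2-D features lying along the line y = 2x
data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
direction, means = first_principal_component(data)
# Projection of each point onto the first component (1-D representation)
scores = [sum((p[j] - means[j]) * direction[j] for j in range(2)) for p in data]
```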
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
  - Non-parametric
    • K-nearest Neighbors
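A minimal sketch of the K-nearest-neighbors classifier from the list above. The gesture labels and feature values are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to query."""
    dists = sorted((math.dist(x, query), label)
                   for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D gesture features with ground truth labels
train_x = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.85),
           (0.9, 0.1), (0.8, 0.2), (0.85, 0.15)]
train_y = ["tap", "tap", "tap", "swipe", "swipe", "swipe"]
prediction = knn_predict(train_x, train_y, (0.2, 0.9))
```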
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
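A minimal K-means sketch, assuming a naive initialisation from the first K points (real implementations usually randomise or use k-means++). The points are illustrative.

```python
import math

def kmeans(points, k, iterations=20):
    """Alternate nearest-centroid assignment and centroid update."""
    centroids = [list(p) for p in points[:k]]  # naive init: first k points
    assignment = [0] * len(points)
    for _ in range(iterations):
        assignment = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignment

# Hypothetical unlabeled feature vectors forming two groups
points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0),
          (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignment = kmeans(points, k=2)
```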
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
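A minimal sketch of ordinary least squares for a single feature, with R squared as the goodness-of-fit measure. The data points and function names are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical sensor value (x) against a measured control value (y)
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.1, 4.9, 7.0]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```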
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Diagram: two parallel chains of Feature Extraction → Dimensionality Reduction → Feature Selection → Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
99
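A minimal sketch of the filtering task for a discrete hidden state (the forward recursion). The state names, observation symbols and all probabilities are hypothetical, chosen only to make the arithmetic easy to follow.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the posterior over the hidden state after each observation."""
    belief = dict(prior)
    history = []
    for obs in observations:
        # Predict: push the current belief through the transition model
        predicted = {s: sum(belief[p] * transition[p][s] for p in belief)
                     for s in belief}
        # Update: weight by the likelihood of the observation, then normalise
        unnorm = {s: predicted[s] * emission[s][obs] for s in belief}
        total = sum(unnorm.values())
        belief = {s: v / total for s, v in unnorm.items()}
        history.append(belief)
    return history

# Hypothetical model: is the performer strumming or resting?
prior = {"strum": 0.5, "rest": 0.5}
transition = {"strum": {"strum": 0.7, "rest": 0.3},
              "rest": {"strum": 0.3, "rest": 0.7}}
emission = {"strum": {"loud": 0.9, "quiet": 0.1},
            "rest": {"loud": 0.2, "quiet": 0.8}}
beliefs = forward_filter(prior, transition, emission, ["loud", "loud"])
```

The recursion only uses past observations, so it is causal and suitable for real-time use.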
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Diagram: HMM unrolled over time slices 1 to 4. Hidden state (single discrete variable) in the top row, observations in the bottom row. Model: transition probs P( | ) between slices t-1 and t, emission probs p( | ) at slice t]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea (t -> t+1)
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
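A minimal 1-D sketch of the predict/update cycle. It assumes a constant-state (random walk) model rather than the constant dx/dt model mentioned above, and the variances and measurements are made up.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each sample."""
    x, p = x0, p0            # state mean and variance
    estimates = []
    for z in measurements:
        # Predict: state assumed unchanged, uncertainty grows
        p = p + process_var
        # Update: the gain balances model and observation reliability
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# Hypothetical noisy-free readings of a constant value, to show convergence
estimates = kalman_1d([5.0, 5.0, 5.0, 5.0], process_var=0.0, measurement_var=1.0)
```

A larger `measurement_var` makes the filter trust the model more and react more slowly; a larger `process_var` does the opposite.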
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC), ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure panels: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
Student t-test
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with critical value t(p,df) marked. Two-tailed: regions for rejecting the null hypothesis. One-tailed: region for rejecting the null hypothesis. Sample means X1, X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
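The formulas on this slide did not survive extraction. A minimal sketch of the steps listed, assuming the standard independent two-sample t-test with pooled (combined) variance; the task-time data and function name are hypothetical.

```python
import math
from statistics import mean, variance

def two_sample_t(a, b):
    """Independent two-sample t statistic with pooled variance.

    Returns (t, df), where df = len(a) + len(b) - 2.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Combined variance: sample variances weighted by their degrees of freedom
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    # Standard error of the difference between the two means
    sed = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(a) - mean(b)) / sed, df

# Hypothetical task times (seconds) for two interface designs
old_ui = [5.0, 6.0, 7.0, 8.0, 9.0]
new_ui = [1.0, 2.0, 3.0, 4.0, 5.0]
t, df = two_sample_t(old_ui, new_ui)
```

Here t = 4.0 with df = 8, which exceeds the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would be rejected at that level.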
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
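A minimal sketch of the one-way ANOVA computation described above: means, two variances, and their ratio F. The group data and function name are made up for illustration.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of each observation around its own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical ratings for three interface conditions
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, df_b, df_w = one_way_anova_f(groups)
# Scaling and shifting every observation leaves F unchanged
f_scaled, _, _ = one_way_anova_f([[2.0 * v + 10.0 for v in g] for g in groups])
```

For these groups F = 13.0 with (2, 6) degrees of freedom, above the tabulated critical value of about 5.14 at α = 0.05.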
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: mappings ordered along a time axis]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Systems: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)
Axis: approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input
  - 26+ dimensions
  - constrained subspace
• Output
  - 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Block diagram]
Inputs:
• Right Hand Data
  - x, y, z, roll, pitch, yaw (60 Hz)
  - 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz)
• Contact Switches
• Foot Pedal
Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer
Output: Speech
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0% error
  - Testing
    • 1380 consonants, 234 vowels
    • 0% error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1% error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
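A minimal sketch of a normalized RBF layer with Gaussian units. Only the width of 0.15 echoes the slides; the centres, output weights and function names are hypothetical, and this is not the Glove-TalkII implementation.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations divided by their sum over the group.

    Assumes x is close enough to some centre that the sum does not underflow.
    """
    raw = [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centres]
    total = sum(raw)
    return [r / total for r in raw]

def rbf_network(x, centres, width, weights):
    """Output = normalized activations times per-unit output vectors."""
    hidden = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h * w[o] for h, w in zip(hidden, weights)) for o in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]            # hypothetical x, y hand positions
weights = [[300.0, 2300.0], [700.0, 1200.0]]  # hypothetical formant targets (Hz)
at_centre = rbf_network((0.0, 0.0), centres, 0.15, weights)
midway = rbf_network((0.5, 0.0), centres, 0.15, weights)
```

Near a centre the closest unit dominates, giving the plateau; halfway between two centres the output is their even blend.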
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05% error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1% error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality
18
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
• Electrical
  - Voltage
  - Resistance
  - Impedance
• Optical
  - Color
  - Intensity
• Magnetic
  - Induced Current
  - Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
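A minimal sketch of one common way to do this conversion: a voltage divider read by an ADC. It assumes the sensor sits between the supply and the output node with a fixed resistor to ground, and a 10-bit ADC with a 5 V reference; the component values are illustrative.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a series voltage divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantise a voltage to integer ADC counts (assumed ideal converter)."""
    voltage = min(v_ref, max(0.0, voltage))
    return int(voltage / v_ref * (2 ** bits - 1))

# Hypothetical sensor at 10 kOhm in series with a 10 kOhm fixed resistor
v_out = divider_voltage(5.0, 10_000.0, 10_000.0)
counts = adc_counts(v_out)
```

As the sensor resistance falls (for example when a force-sensing resistor is pressed), `v_out` and the ADC reading rise.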
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
Force-sensing resistors
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD amp CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancelation
• Cheap
Tuesday 22 October 13
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  - i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  - Brushless DC that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

DAQ
• use a data acquisition board plugged into your computer
  - e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples, each plotted against time; multiple channels (for example, microphone arrays)]

2D/3D/ND + time
[Figure: two panels, each with a time axis]
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

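As one concrete instance of a digital filter operating on samples, the sketch below implements a one-pole low-pass filter and derives a high-pass from it. The function names and the smoothing coefficient `alpha` are illustrative assumptions, not part of the tutorial; this is a minimal example rather than a recommended filter design.

```python
def one_pole_lowpass(samples, alpha):
    """One-pole low-pass: y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha (hypothetical parameter name) lies in (0, 1]; smaller values
    attenuate high frequencies more strongly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    y = 0.0  # filter state, starts at rest
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


def one_pole_highpass(samples, alpha):
    """Complementary high-pass: the input minus its low-passed version."""
    low = one_pole_lowpass(samples, alpha)
    return [x - l for x, l in zip(samples, low)]
```

A slowly varying sensor reading passes through the low-pass almost unchanged, while sample-to-sample jitter is strongly attenuated; the high-pass does the opposite.
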
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes

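The variants listed above can be combined in a few lines. The following sketch uses a fixed threshold, a local-maximum test, a minimum spacing constraint, and three-point parabolic interpolation; the function name and parameters are assumptions made for illustration, and real systems usually add adaptive thresholds and more tuning.

```python
def find_peaks(signal, threshold=0.0, min_spacing=1):
    """Return sorted (location, amplitude) pairs for peaks in `signal`.

    A sample is a candidate if it exceeds `threshold` and is a local maximum.
    Location and amplitude are refined by fitting a parabola through the
    sample and its two neighbours.
    """
    candidates = []
    for n in range(1, len(signal) - 1):
        left, mid, right = signal[n - 1], signal[n], signal[n + 1]
        if mid > threshold and mid > left and mid >= right:
            denom = left - 2.0 * mid + right  # negative at a local maximum
            offset = 0.5 * (left - right) / denom if denom != 0 else 0.0
            amplitude = mid - 0.25 * (left - right) * offset
            candidates.append((n + offset, amplitude))
    # Spacing constraint: accept the strongest peaks first and drop
    # weaker ones that fall closer than min_spacing samples.
    kept = []
    for loc, amp in sorted(candidates, key=lambda p: -p[1]):
        if all(abs(loc - k[0]) >= min_spacing for k in kept):
            kept.append((loc, amp))
    return sorted(kept)
```

The output is a list of peak locations (possibly fractional) and amplitudes, matching the last item on the slide.
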
Fourier Transform
[Figure: spectrum]

Short Time Fourier Transform
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: the FFT (fast Fourier transform)

Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
Time-frequency tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)

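The time-frequency tradeoff for the two window sizes on the slide can be made concrete with a short calculation. The helper name below is an assumption; the arithmetic is just window duration = N / fs and bin spacing = fs / N.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    ms, hz = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {ms:6.1f} ms window, {hz:6.2f} Hz per bin")
```

At 22050 Hz, the 256-sample window spans about 11.6 ms with roughly 86 Hz between bins, while the 4096-sample window spans about 186 ms with roughly 5.4 Hz between bins: better frequency resolution is paid for with worse time resolution.
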
Wavelets
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic

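For sensor data, the linear case mentioned in the last bullet is often enough. The sketch below evaluates irregularly timed readings at arbitrary times and resamples them onto a uniform grid; the function names and the example timestamps are assumptions for illustration only.

```python
from bisect import bisect_right


def interpolate_linear(times, values, t):
    """Linearly interpolate (times, values) at time t; times must be increasing."""
    if not times[0] <= t <= times[-1]:
        raise ValueError("t is outside the sampled interval")
    i = bisect_right(times, t) - 1  # index of the sample at or before t
    if i >= len(times) - 1:
        return float(values[-1])
    t0, t1 = times[i], times[i + 1]
    frac = (t - t0) / (t1 - t0)
    return values[i] + frac * (values[i + 1] - values[i])


def resample_uniform(times, values, rate_hz):
    """Resample irregular readings onto a uniform grid at rate_hz."""
    count = int((times[-1] - times[0]) * rate_hz) + 1
    out = []
    for k in range(count):
        # Clamp to guard against floating-point overshoot at the last point.
        t = min(times[0] + k / rate_hz, times[-1])
        out.append(interpolate_linear(times, values, t))
    return out
```

For example, readings taken at 0.0 s, 0.5 s and 2.0 s can be resampled at 2 Hz to give one value every half second.
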
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration

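Calibration and scaling are often chained: first find the range a sensor actually produces, then map it onto a control parameter. The sketch below assumes a simple min/max calibration pass and a power-law mapping; the function names, the exponent and the example readings are illustrative assumptions.

```python
def calibrate_range(readings):
    """Find the observed minimum and maximum of a sensor during a calibration pass."""
    return min(readings), max(readings)


def scale(reading, sensor_min, sensor_max, out_min, out_max, exponent=1.0):
    """Map a sensor reading to [out_min, out_max] with a power-law curve.

    exponent=1.0 is a linear mapping; larger exponents give finer control
    at the low end of the output range.
    """
    span = sensor_max - sensor_min
    if span == 0:
        raise ValueError("calibration range is empty")
    norm = (reading - sensor_min) / span
    norm = min(1.0, max(0.0, norm))  # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent


# Hypothetical example: raw knob readings mapped to a frequency in Hz.
lo, hi = calibrate_range([112, 130, 905, 640])
frequency = scale(500, lo, hi, 20.0, 20000.0, exponent=3.0)
```

Clamping keeps the control parameter in range even when the sensor later drifts beyond what was seen during calibration.
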
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2

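The equations from these slides did not survive in the transcript, so the sketch below uses the standard direct-form autocorrelation as an assumption and picks the lag with the strongest correlation as the period estimate. It is the straightforward O(N x lags) version; the efficient power-of-2 computation mentioned above would go through the FFT instead. Function names are hypothetical.

```python
import math


def autocorrelation(x, max_lag):
    """Direct-form autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical test signal: a sinusoid with a period of 25 samples.
signal = [math.sin(2 * math.pi * n / 25) for n in range(500)]
period = estimate_period(signal, 5, 60)
```

The `min_lag` bound excludes the trivial maximum at lag 0, which is the usual practical concern with correlation-based periodicity detection.
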
Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients
Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filters centered at CF, spanning CF-130 to CF+130 (linear) or CF/1.0718 to CF x 1.0718 (log)]
Pipeline: Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be thrown away
• MFCCs keep the first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

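Energy compaction can be seen directly on a small example. The sketch below implements the unnormalized DCT-II and its inverse (names and the test signal are assumptions for illustration) and reconstructs a smooth signal from only its first few coefficients.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * k * (i + 0.5) / N)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * k * (i + 0.5) / n) for i in range(n))
            for k in range(n)]


def idct_ii(coeffs, n):
    """Invert dct_ii for a length-n signal, using only the coefficients given.

    Passing fewer than n coefficients discards the high ones.
    """
    out = []
    for i in range(n):
        v = coeffs[0] / n
        v += sum(2.0 / n * coeffs[k] * math.cos(math.pi * k * (i + 0.5) / n)
                 for k in range(1, len(coeffs)))
        out.append(v)
    return out


# Hypothetical smooth signal: a ramp of 8 samples.
ramp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
coeffs = dct_ii(ramp)
approx = idct_ii(coeffs[:4], len(ramp))  # keep only the 4 lowest coefficients
```

For this ramp, keeping half of the coefficients already reproduces every sample to within a fraction of a unit, which is the same reasoning behind keeping only the first 13-20 coefficients in MFCCs.
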
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
Projection matrix
PCA: eigenanalysis of correlation matrix

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors

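K-nearest neighbors is the easiest of the listed algorithms to write down in full. The sketch below is a minimal version using Euclidean distance and majority voting; the function name, the two-dimensional feature vectors and the drum labels are made-up illustrations, not data from the tutorial.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    distances = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: 2-D feature vectors with ground truth labels.
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["kick", "kick", "kick", "snare", "snare", "snare"]
prediction = knn_predict(features, labels, (0.5, 0.5))
```

There is no training step beyond storing the labeled feature vectors, which is why the method counts as non-parametric.
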
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based

Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees

Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters

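A minimal sketch of the parametric case: ordinary least squares for a single feature, followed by the coefficient of determination as the goodness-of-fit measure. The function names and the sample data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical noisy measurements of a roughly linear relationship.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```

An R squared close to 1 means the fitted line explains almost all of the variance in the measured values.
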
Surrogate Sensors
• Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Reference: A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real time: the processing is done, with some delay, at the same time as the sensor data is acquired

Dynamic Time Warping

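The slide itself carries only the title, so the following is the textbook dynamic-programming formulation of dynamic time warping, offered as an assumption about what is meant. It aligns two 1-D sequences that may differ in timing and returns the accumulated cost of the best alignment; the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Accumulated cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])  # local distance between elements
            cost[i][j] = d + min(cost[i - 1][j],      # a advances, b repeats
                                 cost[i][j - 1],      # b advances, a repeats
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

A gesture played slowly, such as `[1, 2, 2, 2, 3]`, matches its faster template `[1, 2, 3]` with zero cost, which is what makes the method useful for comparing performances at different tempi. This full-matrix version needs both sequences in advance, so it is not causal.
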
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and

Hidden Markov Models I
[Figure: hidden Markov model. A hidden state (single discrete variable) and observations over time; the model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t)]

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability

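The predict/update cycle is easiest to follow in one dimension. The sketch below assumes the simplest possible model, a scalar state that stays constant apart from Gaussian process noise, so the mean and covariance matrix reduce to a mean and a variance. The function and parameter names are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_mean=0.0, initial_var=1.0):
    """Scalar Kalman filter for the model x[t+1] = x[t] + noise, z[t] = x[t] + noise.

    Returns the (mean, variance) estimate after each measurement.
    """
    mean, var = initial_mean, initial_var
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed unchanged, uncertainty grows.
        var = var + process_var
        # Update step: the gain weighs model reliability against sensor reliability.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates
```

A large `measurement_var` gives a small gain, so the filter trusts its model and moves slowly; a small one makes it follow the sensor closely. This is the trade-off between observation reliability and model reliability in scalar form.
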
Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  - detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor, by IBM (1984)
[Figure: what the user sees; the wizard]

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions around sample means X1 and X2, marking the critical value t(p,df) and the region (one-tailed) or regions (two-tailed) for rejecting the null hypothesis]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation

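The formulas on this slide did not survive in the transcript. The sketch below assumes the standard independent-samples t-test with pooled (combined) variance, which matches the three steps listed; the function name and the sample data are illustrative.

```python
import math


def pooled_t_test(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance.

    Returns (t, df), where df = n_a + n_b - 2.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df                         # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))   # standard error of difference
    t = (mean_a - mean_b) / sed
    return t, df


# Hypothetical scores for a control group and a treatment group.
t, df = pooled_t_test([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
```

For this data t is -2 with 8 degrees of freedom. The two-tailed critical value at α = 0.05 for df = 8 is 2.306, so the null hypothesis cannot be rejected at that level.
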
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g., search on "student's t-

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA

Integration and
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Spectrum of Gesture-to-Speech Mappings
Mapping type              Approximate time/gesture for connected speech (msec)
Artificial Vocal Tract    10-30
Phoneme Generator         100
Finger Spelling           130
Syllable Generator        200
Word Generator            500
Systems placed along the spectrum: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - XY position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
  - constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System
[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error

GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (015)
  - Input: xy values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 01 error
• Testing
  - 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates

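The behaviour described above (plateaus near a centre, interpolation between centres) follows directly from dividing each Gaussian activation by the sum over all units. The sketch below is a generic normalized RBF layer, not the Glove-TalkII implementation; the function names, centres, width and output weights are assumptions chosen for illustration.

```python
import math


def normalized_rbf_activations(x, centres, width):
    """Gaussian activation of each unit, normalized to sum to 1 over all units.

    Assumes x is close enough to some centre that the raw activations
    do not all underflow to zero.
    """
    raw = [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centres]
    total = sum(raw)
    return [r / total for r in raw]


def normalized_rbf_output(x, centres, width, output_weights):
    """Weighted sum of per-unit output vectors, using the normalized activations."""
    acts = normalized_rbf_activations(x, centres, width)
    n_out = len(output_weights[0])
    return [sum(a * w[j] for a, w in zip(acts, output_weights))
            for j in range(n_out)]


# Hypothetical two-unit network with one output value per unit.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[100.0], [200.0]]
near_first = normalized_rbf_output((0.1, 0.0), centres, 0.15, weights)
halfway = normalized_rbf_output((0.5, 0.0), centres, 0.15, weights)
```

Close to the first centre the output stays almost exactly at that unit's value (the plateau), and halfway between the centres it is the average of the two. A larger width makes the transition between plateaus more gradual.
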
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 005 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of
145
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available (www.nime.org)
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together
19
Motivation and Overview
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI’2001
Experience Music Project, Seattle, April 2001
Lectures, Discussions, Demos, Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• Steps 1 and 2 are often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more…
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: voltage, resistance, impedance
Optical: color, intensity
Magnetic: induced current, field direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (a typical sensor has variable resistance)
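One common way to do this conversion is a voltage divider read by an ADC. The sketch below is illustrative only: the 5 V supply, the 10 kOhm fixed resistor, the 10-bit converter, and the sensor resistances are assumed values that do not come from the slides, and `divider_voltage` and `adc_counts` are hypothetical helper names.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a two-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)


def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code, clamped to the valid range."""
    levels = (1 << bits) - 1
    code = round(voltage / v_ref * levels)
    return max(0, min(levels, code))


# Assumed sensor: resistance falls from 100 kOhm (idle) to 1 kOhm (hard press).
for r_sensor in (100_000, 10_000, 1_000):
    v = divider_voltage(5.0, 10_000, r_sensor)
    print(r_sensor, round(v, 3), adc_counts(v))
```

As the sensor resistance drops, more of the supply voltage appears across the fixed resistor, so the ADC reading rises with applied force.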
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• A small voltage is applied to an insulator coated with conductive material. When a conductor (such as a finger) touches the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving the diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancelation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High-performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as a pointing device
• A large diversity of control styles is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - Input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
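As a concrete instance of a digital filter operating on samples, here is a minimal sketch of a one-pole low-pass filter, often used to smooth noisy sensor streams. The function name `one_pole_lowpass` and the smoothing coefficient are illustrative assumptions, not something specified in the tutorial.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1."""
    out = []
    y = 0.0
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


# A slowly varying (constant) input passes through almost unchanged...
steady = one_pole_lowpass([1.0] * 200, alpha=0.1)
# ...while a signal alternating at the highest representable frequency is attenuated.
jitter = one_pole_lowpass([1.0 if n % 2 == 0 else -1.0 for n in range(200)], alpha=0.1)
print(round(steady[-1], 3), round(abs(jitter[-1]), 3))
```

Smaller values of `alpha` smooth more strongly but respond more slowly, which matters for latency in a live instrument.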
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
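The variants listed above can be combined in a few lines. This sketch uses a fixed threshold, a local-maximum test, and a minimum spacing between peaks, keeping the strongest peak when two are too close; it leaves out adaptive thresholds and interpolation. The name `find_peaks` and the tie-breaking policy are assumptions made for illustration.

```python
def find_peaks(x, threshold, min_spacing):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold,
    keeping the strongest peak whenever two are closer than min_spacing samples."""
    candidates = [
        i for i in range(1, len(x) - 1)
        if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]
    # Visit candidates from strongest to weakest so strong peaks win conflicts.
    candidates.sort(key=lambda i: x[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    kept.sort()
    return [(i, x[i]) for i in kept]


signal = [0, 1, 0, 0.5, 3, 2.9, 0, 0.2, 0]
print(find_peaks(signal, threshold=0.4, min_spacing=2))
```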
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
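The trade-off can be made concrete with a little arithmetic. Assuming the two labels above are STFT window sizes at a 22050 Hz sampling rate (an interpretation of the slide, not stated in it), the bin spacing in frequency and the window duration in time are reciprocal. `stft_resolution` is a hypothetical helper name.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency spacing of bins in Hz, window duration in seconds)."""
    freq_res = sample_rate / window_size
    time_res = window_size / sample_rate
    return freq_res, time_res


for n in (256, 4096):
    f, t = stft_resolution(n, 22050)
    print(n, round(f, 2), "Hz per bin,", round(t * 1000, 1), "ms per window")
```

A short window localizes events well in time but blurs nearby frequencies; a long window does the opposite.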
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using a noise profile
  - Median filtering in the time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in the wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
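For sensor data, the linear case is often enough. The sketch below resamples a stream from one rate to another by linear interpolation between neighboring samples; `resample_linear` is a hypothetical name, and the rates in the example are arbitrary. It is not band-limited, so it is a poor choice for audio.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation between neighboring input samples."""
    n_out = int((len(samples) - 1) * dst_rate / src_rate) + 1
    out = []
    for k in range(n_out):
        t = k * src_rate / dst_rate      # output time in units of input samples
        i = int(t)
        frac = t - i
        if i + 1 < len(samples):
            out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
        else:
            out.append(float(samples[-1]))
    return out


# Upsample a 60 Hz stream to 120 Hz.
print(resample_linear([0, 10, 20], src_rate=60, dst_rate=120))
```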
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
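A minimal sketch of calibration followed by scaling: record the observed range of a sensor, then map readings into a control range with an optional power-law curve, or exponentially into frequency. All function names, the 55-880 Hz range, and the sample readings are illustrative assumptions.

```python
def calibrate_range(readings):
    """Observed minimum and maximum of a sensor during a calibration pass."""
    return min(readings), max(readings)


def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a reading into [out_min, out_max]; exponent != 1 gives a power-law curve."""
    norm = (reading - in_min) / (in_max - in_min)
    norm = min(1.0, max(0.0, norm))      # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent


def knob_to_frequency(reading, in_min, in_max, f_min=55.0, f_max=880.0):
    """Exponential mapping, so equal knob movements give equal musical intervals."""
    norm = scale(reading, in_min, in_max, 0.0, 1.0)
    return f_min * (f_max / f_min) ** norm


lo, hi = calibrate_range([12, 500, 1010, 300])
print(scale(511, lo, hi, 0.0, 1.0), knob_to_frequency(511, lo, hi))
```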
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
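The formulas on this slide did not survive extraction. As a stand-in, here is a sketch of the standard unnormalized autocorrelation, computed by direct summation for clarity (quadratic cost) rather than with the FFT-based method the slide alludes to, and used to estimate the period of a signal. The function names and the test signal are assumptions.

```python
import math


def autocorrelation(x):
    """r[lag] = sum over i of x[i] * x[i + lag], by direct summation."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(n)]


def estimate_period(x, min_lag=1):
    """Lag (in samples) of the strongest autocorrelation peak at or above min_lag."""
    r = autocorrelation(x)
    return max(range(min_lag, len(x) // 2), key=lambda lag: r[lag])


# A sinusoid that repeats every 20 samples.
tone = [math.sin(2 * math.pi * n / 20) for n in range(200)]
print(estimate_period(tone, min_lag=5))
```

The `min_lag` argument skips the trivial maximum at lag 0, where every signal correlates perfectly with itself.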
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter centered at CF, with edges at CF-130 and CF/1.0718 below and CF+130 and CF×1.0718 above]
Mel-filtering -> Log -> DCT -> MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
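Energy compaction is easy to check numerically. The sketch below computes an orthonormal DCT-II by direct summation and measures how much of a smooth signal's energy lands in the first few coefficients. The ramp test signal and the helper names `dct2` and `energy_fraction` are illustrative assumptions.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        norm = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(norm * s)
    return out


def energy_fraction(coeffs, keep):
    """Fraction of total energy carried by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total


ramp = [float(i) for i in range(16)]
coeffs = dct2(ramp)
print(round(energy_fraction(coeffs, 4), 4))
```

With orthonormal scaling the transform preserves total energy, so the fraction is a fair measure of how much is lost by discarding the high coefficients.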
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
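To illustrate the eigenanalysis step, this sketch finds the first principal component by power iteration. It works on the covariance matrix of mean-centered data, which is an assumption about what the slide means by correlation matrix; `top_principal_component` is a hypothetical name, and a real system would use a linear algebra library.

```python
import math


def top_principal_component(rows, iterations=100):
    """Unit vector along the direction of largest variance, via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [
        [sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
        for a in range(d)
    ]
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Points lying on the line y = 2x: the component should point along (1, 2).
print(top_principal_component([(0, 0), (1, 2), (2, 4), (3, 6)]))
```

Projecting each centered sample onto this vector gives a one-dimensional feature that keeps most of the variance.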
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
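K-nearest neighbors is the simplest of these to write down. The sketch below classifies a feature vector by majority vote among its k closest training examples; the two-dimensional features and the "kick"/"snare" labels are made-up illustration data, not results from the tutorial.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training vectors.
    `train` is a list of (feature_vector, label) pairs."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set.
train = [
    ((0.10, 0.20), "kick"), ((0.20, 0.10), "kick"), ((0.15, 0.15), "kick"),
    ((0.90, 0.80), "snare"), ((0.80, 0.90), "snare"), ((0.85, 0.85), "snare"),
]
print(knn_predict(train, (0.2, 0.2)), knn_predict(train, (0.8, 0.8)))
```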
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
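A minimal sketch of plain K-means (Lloyd's algorithm): alternate between assigning points to the nearest centroid and moving each centroid to the mean of its points. Initial centroids are passed in by the caller so the example is deterministic; the function name and data are illustrative.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm starting from caller-supplied initial centroids."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    centroids = [tuple(c) for c in centroids]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
print(kmeans(points, centroids=[(0, 0), (10, 10)]))
```

An empty cluster keeps its previous centroid here; other policies, such as re-seeding it, are equally common.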
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies: Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input, output, or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late fusion are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
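The slide content for dynamic time warping was an image. As a reminder of the technique, here is a sketch of the textbook dynamic-programming formulation for one-dimensional sequences, using absolute difference as the local cost and allowing match, insertion, and deletion steps. These choices and the name `dtw_distance` are assumptions, not necessarily what the tutorial presented.

```python
def dtw_distance(a, b):
    """Minimum cumulative cost of aligning sequences a and b under time warping."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b holds
                cost[i][j - 1],      # b advances, a holds
                cost[i - 1][j - 1],  # both advance
            )
    return cost[-1][-1]


# The same gesture performed at two speeds aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

This version needs both complete sequences, so it is not causal; real-time use requires an online variant.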
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor readings, and update models
  - Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
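[Figure: hidden Markov model with nodes labeled 1 to 4. A hidden state (single discrete variable) evolves under transition probabilities P(state at t | state at t-1); observations are produced through emission probabilities p(observation at t | state at t).]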
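The filtering task for an HMM can be sketched as a predict-then-update recursion over the belief in each hidden state. The two-state model and its probabilities below are an invented toy example, and `forward_filter` is a hypothetical name.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the hidden state after each observation is absorbed.
    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)"""
    belief = list(prior)
    n = len(belief)
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
    return belief


# Toy model: two hidden states, two observation symbols.
belief = forward_filter(
    prior=[0.5, 0.5],
    transition=[[0.7, 0.3], [0.3, 0.7]],
    emission=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0, 0],
)
print([round(b, 3) for b in belief])
```

Because each step uses only past observations, this recursion is causal and suits real-time use.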
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea (t -> t+1)
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x|y) = N(a_y y + b_y, σ_y)(c)
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
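The predict/update cycle and the reliability trade-off are easiest to see in one dimension. This sketch assumes the simplest possible model, a scalar state that stays constant apart from process noise, so the mean and variance play the role of the mean and covariance matrix. The name `kalman_1d` and all numeric values are illustrative.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed constant up to process noise."""
    x, p = x0, p0                 # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state carries over, the uncertainty grows.
        p = p + process_var
        # Update: the gain balances model reliability against measurement reliability.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


readings = [5.0] * 50
print(round(kalman_1d(readings, process_var=1e-4, meas_var=1.0)[-1], 3))
```

A large `meas_var` tells the filter to trust its model and move slowly; a small one makes it follow the measurements closely.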
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer’s and the users’
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  - the user is not always right
    • don’t expect them to know what they want
  - conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor, by IBM (1984)
The Wizard
What the user sees
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  - “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  - interprets subject’s input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution curves marking the critical value t(p,df) and the region or regions for rejecting the null hypothesis, with sample means X1 and X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
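The formulas on this slide were images and did not survive extraction. The sketch below follows the three steps using the standard textbook form for the independent two-sample test with equal variances (pooled variance, df = n1 + n2 - 2); this is assumed to match the slide but cannot be confirmed from the text. The sample data are invented.

```python
import math


def t_statistic(a, b):
    """Independent two-sample t statistic with pooled variance. Returns (t, df)."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                        # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))  # standard error of difference
    return (m1 - m2) / sed, df


control = [1, 2, 3, 4, 5]
treatment = [2, 3, 4, 5, 6]
print(t_statistic(control, treatment))
```

The resulting t is then compared against the critical value for the chosen significance level and df.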
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on “student’s t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
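For the one-way case, the ratio of two variances is the between-group mean square divided by the within-group mean square. The sketch below computes that F ratio and checks that rescaling and shifting every observation leaves it unchanged. The function name and group data are illustrative assumptions.

```python
def one_way_anova_f(groups):
    """Return (F, (df_between, df_within)) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, (k - 1, n - k)


groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
rescaled = [[10 * x + 5 for x in g] for g in groups]
print(one_way_anova_f(groups), one_way_anova_f(rescaled))
```

The F value is then compared with the critical value for the two degrees of freedom, in the same way t is compared with t(p,df).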
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: a spectrum of mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems placed along it: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time per gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500.]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture-of-experts model
• Vowel/Consonant decision network
• Vowel Expert network
• Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function, synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
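To make the 10 - 5 - 1 shape concrete, here is a minimal forward-pass sketch. It assumes sigmoid units in both layers and uses random, untrained weights; neither the activation function nor any weight values come from the slides.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """10 inputs -> 5 hidden units -> 1 output read as P(vowel)."""
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)

# Hypothetical, untrained weights, just to show the shapes involved.
rng = random.Random(0)
w_hidden = [[rng.uniform(-1, 1) for _ in range(10)] for _ in range(5)]
b_hidden = [0.0] * 5
w_out = [rng.uniform(-1, 1) for _ in range(5)]

flex_angles = [0.5] * 10  # placeholder for normalized flex-angle readings
p_vowel = forward(flex_angles, w_hidden, b_hidden, w_out, 0.0)
```

In a mixture-of-experts arrangement, a value like `p_vowel` would weight the vowel and consonant expert outputs in the combining function.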
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
• Weights are centre
  - Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
• Closest RBF dominates
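The behaviour described above (the closest unit dominates, with interpolation in between) can be seen in a small sketch of a normalized RBF layer. The centres, output weights and function name below are illustrative assumptions; only the idea of Gaussian units normalized over the group and the fixed width of 0.15 come from the slides.

```python
import math

def normalized_rbf(x, centres, weights, sigma):
    """Return (outputs, activations) for one input vector x.

    centres: one centre vector per hidden unit
    weights: one output vector per hidden unit
    """
    acts = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2 * sigma ** 2))
        for c in centres
    ]
    total = sum(acts)  # note: can underflow to 0 if x is very far from every centre
    acts = [a / total for a in acts]
    n_out = len(weights[0])
    outputs = [sum(a * w[j] for a, w in zip(acts, weights)) for j in range(n_out)]
    return outputs, acts

centres = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical gesture positions
weights = [[1.0], [3.0]]             # hypothetical output parameter per centre
at_centre, _ = normalized_rbf((0.0, 0.0), centres, weights, sigma=0.15)
halfway, acts = normalized_rbf((0.5, 0.0), centres, weights, sigma=0.15)
```

Near a centre the output sits on a plateau at that centre's value; halfway between two centres it is the average of the two.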
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
New Interfaces for Musical Expression (NIME)
20
Motivation and Overview
First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: voltage, resistance, impedance
Optical: color, intensity
Magnetic: induced current, field direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
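One common way to convert a resistance change into a voltage is a voltage divider read by an ADC. The sketch below assumes that circuit; the 5 V supply, 10 kΩ fixed resistor and 10-bit converter are illustrative values, not taken from the slides.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC reading in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * levels)

# A sensor resting at 100 kOhm that drops to 10 kOhm when pressed (made-up values).
v_rest = divider_voltage(100_000, 10_000)
v_pressed = divider_voltage(10_000, 10_000)
```

As the sensor resistance falls, the voltage across the fixed resistor (and so the ADC reading) rises.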
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High-performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as a pointing device
• Large diversity of styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - Input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D/3D/ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
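As one small instance of a digital filter operating sample by sample, here is a sketch of a one-pole low-pass filter (exponential smoothing). The choice of this particular filter and the smoothing coefficient are assumptions for illustration; it is a common first step for smoothing noisy sensor readings.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]), with 0 < alpha <= 1.

    Smaller alpha means heavier smoothing (lower cutoff).
    """
    y = 0.0
    out = []
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

# Step input: the output rises smoothly toward 1.0 instead of jumping.
smoothed = one_pole_lowpass([1.0] * 200, alpha=0.1)
```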
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
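The variants listed above can be combined in a few lines. The sketch below is one possible combination (fixed threshold, local maximum, minimum spacing, parabolic interpolation), not the specific method used in the tutorial; the function name and parameters are made up for illustration.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs with sub-sample locations."""
    peaks = []  # entries are (interpolated location, amplitude, integer index)
    for i in range(1, len(x) - 1):
        is_local_max = x[i - 1] < x[i] >= x[i + 1]
        if x[i] < threshold or not is_local_max:
            continue
        # Parabolic interpolation through the three samples around the maximum.
        a, b, c = x[i - 1], x[i], x[i + 1]
        offset = 0.5 * (a - c) / (a - 2 * b + c)
        amp = b - 0.25 * (a - c) * offset
        # Spacing constraint: keep only the larger of two peaks that are too close.
        if peaks and i - peaks[-1][2] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (i + offset, amp, i)
            continue
        peaks.append((i + offset, amp, i))
    return [(loc, amp) for loc, amp, _ in peaks]

signal = [0, 1, 0, 0, 3, 0, 0.5, 0]
all_peaks = find_peaks(signal, threshold=0.8)
spaced = find_peaks(signal, threshold=0.8, min_spacing=5)
```

An adaptive threshold could be substituted by passing a value derived from a running average of the signal.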
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
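The windowing view can be sketched directly: slide a window over the signal, multiply, and take an FFT of each frame. This is a minimal, unoptimized sketch using a textbook recursive radix-2 FFT (hence the power-of-2 window size) and a Hann window; the window type and function names are assumptions.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def stft_magnitudes(signal, win_size, hop):
    """Return one magnitude spectrum (bins 0..win_size/2) per frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / win_size) for i in range(win_size)]
    frames = []
    for start in range(0, len(signal) - win_size + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(win_size)]
        frames.append([abs(c) for c in fft(frame)[: win_size // 2 + 1]])
    return frames

# A sinusoid with exactly 8 cycles per 64-sample window peaks in bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
frames = stft_magnitudes(tone, win_size=64, hop=32)
```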
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
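The tradeoff for the two window sizes on this slide can be worked out numerically: a window of N samples at sample rate fs spans N/fs seconds and gives FFT bins spaced fs/N Hz apart. The helper name below is made up for illustration.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

short_win = stft_resolution(256, 22050)    # about 11.6 ms, about 86.1 Hz per bin
long_win = stft_resolution(4096, 22050)    # about 185.8 ms, about 5.4 Hz per bin
```

The short window localizes events in time but smears frequency; the long window does the opposite.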
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
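For sensor data, the linear case mentioned in the last bullet might look like the sketch below: irregularly timestamped readings are resampled onto a uniform grid. The function names and the clamping behaviour at the ends are assumptions for illustration.

```python
from bisect import bisect_right

def interp_linear(times, values, t):
    """Linearly interpolate (times, values) at time t; times must be increasing."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    j = bisect_right(times, t)          # times[j-1] <= t < times[j]
    t0, t1 = times[j - 1], times[j]
    frac = (t - t0) / (t1 - t0)
    return values[j - 1] + frac * (values[j] - values[j - 1])

def resample(times, values, rate):
    """Resample onto a uniform grid of `rate` samples per unit time."""
    n = int((times[-1] - times[0]) * rate) + 1
    return [interp_linear(times, values, times[0] + k / rate) for k in range(n)]

# Readings at irregular times 0, 1 and 3 resampled at one sample per time unit.
uniform = resample([0, 1, 3], [0, 10, 30], rate=1)
```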
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
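Calibration and scaling often appear together as two small functions: find the observed range of a sensor, then map readings into the control range, optionally through a power law. The readings, the 220-880 Hz target range and the exponent below are illustrative assumptions.

```python
def calibrate(readings):
    """Return the (minimum, maximum) observed during a calibration pass."""
    return min(readings), max(readings)

def scale(reading, lo, hi, out_lo, out_hi, exponent=1.0):
    """Map a raw reading in [lo, hi] to [out_lo, out_hi] through a power law."""
    norm = (reading - lo) / (hi - lo)
    norm = min(max(norm, 0.0), 1.0)      # clamp readings outside the calibrated range
    return out_lo + (norm ** exponent) * (out_hi - out_lo)

# Hypothetical raw knob readings collected while sweeping the knob end to end.
lo, hi = calibrate([102, 530, 897, 250])
freq_mid = scale(499.5, lo, hi, 220.0, 880.0, exponent=2.0)
```

With `exponent=1.0` the mapping is linear; larger exponents give finer control at the low end of the output range.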
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
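A correlation-based period estimate can be sketched with a direct (time-domain) autocorrelation. This version is O(N × lags) and is only meant to show the idea; the efficient route noted on the slide computes the same quantity via the FFT. Function names and the test signal are assumptions.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A pulse every 10 samples, e.g. onsets of a steady strumming gesture.
pulses = [1.0 if i % 10 == 0 else 0.0 for i in range(100)]
period = estimate_period(pulses, min_lag=2, max_lag=25)
```

Cross-correlation follows the same pattern with two different signals in place of the two copies of `c`.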
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top down)
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
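Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
Filter edges around each centre frequency CF: CF-130 and CF+130 (linear region), CF/1.0718 and CF×1.0718 (log region)
Pipeline: Mel-filtering -> Log -> DCT -> MFCCs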
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
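The last stage of an MFCC computation, a DCT of the log filterbank energies followed by truncation, can be sketched as below. This uses an unnormalized DCT-II written out directly; real implementations usually add a scaling convention and a fast algorithm, and the 40-band input is a made-up placeholder.

```python
import math

def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * k * (2i + 1) / (2N))."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i in range(n))
        for k in range(n)
    ]

def mfcc_from_log_energies(log_mel, n_keep=13):
    """Keep only the first n_keep DCT coefficients of the log mel energies."""
    return dct_ii(log_mel)[:n_keep]

# Energy compaction: a flat input puts everything into coefficient 0.
flat = dct_ii([1.0] * 8)
coeffs = mfcc_from_log_energies([float(b) for b in range(40)])
```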
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
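The non-parametric case is short enough to write out in full. Below is a minimal k-nearest-neighbours sketch with Euclidean distance and majority voting; the two-dimensional feature vectors and the labels are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors for two kinds of drum strokes.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["soft", "soft", "soft", "loud", "loud", "loud"]
label = knn_predict(train_x, train_y, (0.2, 0.3))
```

There is no training step: the "model" is the stored training set, which is what makes the method non-parametric.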
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
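The three measures in the first bullet can be computed from paired lists of true and predicted labels. This is a minimal sketch; the label names and the toy predictions are assumptions.

```python
from collections import Counter

def confusion_matrix(truth, predicted):
    """Counts keyed by (true label, predicted label)."""
    return Counter(zip(truth, predicted))

def accuracy(truth, predicted):
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)

def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    cm = confusion_matrix(truth, predicted)
    tp = cm[(positive, positive)]
    fp = sum(n for (t, p), n in cm.items() if p == positive and t != positive)
    fn = sum(n for (t, p), n in cm.items() if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

truth = ["v", "v", "v", "c", "c", "c"]
predicted = ["v", "v", "c", "c", "c", "v"]
acc = accuracy(truth, predicted)
f_vowel = f_measure(truth, predicted, positive="v")
```

In cross-validation these functions would be applied to each held-out fold and the results averaged.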
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution-based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
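The first algorithm in the list, plain k-means (Lloyd's iteration), fits in a few lines. This is a minimal sketch with random initialization from the data and a fixed number of iterations; the points are made up, and practical versions add convergence checks and smarter initialization.

```python
import math
import random

def k_means(points, k, iterations=20, seed=0):
    """Return (centroids, labels) after a fixed number of Lloyd iterations."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)   # initialize from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(points, k=2)
```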
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
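For a single input feature, ordinary least squares and the coefficient of determination reduce to a handful of sums. The sketch below covers only that one-dimensional case; the data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs, ys = [0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```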
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of input, output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Diagram blocks, each shown twice: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
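[Figure: hidden Markov model unrolled over time steps 1, 2, 3, 4. Model: a hidden state (single discrete variable) with transition probabilities P(state at t | state at t-1), and observations with emission probabilities p(observation at t | state at t).]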
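The filtering task for a discrete HMM can be sketched with the forward recursion: predict with the transition model, weight by the emission probability of the new observation, and normalize. The two-state model and its probabilities below are a standard textbook-style toy example, not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered belief P(state_t | observations up to t) for each t.

    transition[i][j] = P(next state j | current state i)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Toy model: state 0 = "playing", state 1 = "resting"; observation 0 = "onset detected".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, [0, 0])
```

Because each step uses only past and current observations, this recursion is causal and suited to real-time use.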
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y\,y + b_y,\ \sigma_y)(x)$
• Next state is linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
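The predict/update cycle and the reliability trade-off can be seen in the simplest possible case: a scalar state that follows a random walk and is measured directly. This is a sketch under those assumptions; the variances and the function name are illustrative, and a real tracker would use the matrix form with a motion model.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed directly.

    process_var: how much the model allows the state to drift per step
    meas_var:    how noisy each measurement is assumed to be
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: state stays put, uncertainty grows.
        p = p + process_var
        # Update step: the gain balances model and observation reliability.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1 - gain) * p
        estimates.append(x)
    return estimates

estimates = kalman_1d([1.0] * 50, process_var=1e-4, meas_var=0.1)
```

A large `meas_var` relative to `process_var` makes the filter trust its model and respond slowly; the reverse makes it follow the measurements closely.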
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  - detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure panels: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: regions for rejecting the null hypothesis, or region for rejecting the null hypothesis; labels X1, X2 and critical value t(p,df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
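The three computations listed above can be sketched in a few lines of Python. The slide's own formulas were images and did not survive extraction, so this is a minimal illustration under stated assumptions: independent samples with equal variances, and a function name (`independent_t`) and sample values that are made up for the example.

```python
import math
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Return (t, df) for two independent samples, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Combined (pooled) variance: df-weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two sample means.
    sed = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df


# Hypothetical task-completion times (seconds) for two interface designs.
control = [12.1, 11.4, 13.0, 12.6, 11.9]
treatment = [10.2, 10.9, 9.8, 11.1, 10.5]
t, df = independent_t(control, treatment)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared with the tabulated critical value t(p,df) for the chosen significance level.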
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g. httpwwwmedcalcbemanualmpage13-04bhtml13 httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
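The "means, variances, ratio" recipe for the one-way case can be sketched as follows. This is an illustrative sketch only; the function name `one_way_anova_f` and the three groups of scores are hypothetical, and the final table lookup is left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    # F is the ratio of the two variance estimates (mean squares).
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical ratings for three interface variants.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_b, df_w = one_way_anova_f(groups)

# Rescaling and shifting every score leaves F unchanged.
rescaled = [[10 * x + 5 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
print(f, f_rescaled, df_b, df_w)
```

The second call illustrates the first bullet above: the F ratio does not depend on the scale or offset of the measurements.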
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mapping types, left to right: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator]
Systems cited: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)
Axis: approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions
• constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Block diagram]
Inputs: Right Hand Data - x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz); Contact Switches; Foot Pedal
Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function (X), Synthesizer
Output: Speech
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
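The behaviour described above (interpolation near the centres, a plateau where the closest unit dominates) can be reproduced with a small sketch. This is not the Glove-TalkII implementation; the function names, the three centres and the output weights are hypothetical, and only the width of 0.15 echoes the fixed standard deviation mentioned for the vowel network.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations of each hidden unit, normalized to sum to 1."""
    sq_dist = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtracting the smallest distance avoids numerical underflow far from
    # every centre and does not change the normalized result.
    nearest = min(sq_dist)
    acts = [math.exp(-(d - nearest) / (2 * width ** 2)) for d in sq_dist]
    total = sum(acts)
    return [a / total for a in acts]


def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations; weights[j] is unit j's output vector."""
    h = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h[j] * weights[j][k] for j in range(len(centres)))
            for k in range(n_out)]


# Hypothetical 2-D hand positions acting as centres, one per target sound.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
weights = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

print(rbf_output((0.5, 0.0), centres, 0.15, weights))   # between two centres
print(rbf_output((10.0, 0.0), centres, 0.15, weights))  # far away: nearest unit dominates
```

Because of the normalization, a gesture far from every centre still produces the output of the nearest unit instead of fading to silence.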
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Research on HCIMusic
21
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• Steps 1 and 2 are often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music, Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more…
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
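A common way to turn a variable resistance into a voltage is a voltage divider read by an ADC. The sketch below assumes that circuit (the slide's own schematic is not in the transcript); the 10 kΩ fixed resistor, 5 V supply equal to the ADC reference, 10-bit converter and all function names are hypothetical choices for illustration.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor + fixed resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)


def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code in [0, 2**bits - 1]."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))


def sensor_resistance(code, r_fixed=10_000.0, v_ref=5.0, bits=10):
    """Invert the divider: estimate the sensor resistance from an ADC code."""
    voltage = (code + 0.5) * v_ref / 2 ** bits  # centre of the quantization step
    return r_fixed * (v_ref - voltage) / voltage


# Pressing harder lowers the resistance of the (hypothetical) sensor,
# which raises the voltage seen by the ADC.
for r in (100_000.0, 10_000.0, 1_000.0):
    v = divider_voltage(r)
    print(r, round(v, 3), adc_code(v))
```

The fixed resistor is usually chosen near the middle of the sensor's resistance range so that the useful part of the response is spread over as many ADC codes as possible.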
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold; no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D/3D/ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Fundamental building blocks of signal processing
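As an illustration of a digital filter operating sample by sample, here is a one-pole low-pass (exponential smoothing) filter, often used to smooth noisy sensor readings. It is only one possible example and is not taken from the slides; the function name and the default coefficient `alpha` are arbitrary choices.

```python
def one_pole_lowpass(samples, alpha=0.1):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]), starting from y = 0.

    Smaller alpha means heavier smoothing (lower cutoff frequency).
    """
    y = 0.0
    out = []
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out


# A constant (zero-frequency) input passes through almost unchanged...
steady = one_pole_lowpass([1.0] * 200)
# ...while the fastest possible oscillation (+1, -1, +1, ...) is attenuated.
jitter = one_pole_lowpass([1.0 if i % 2 == 0 else -1.0 for i in range(200)])
print(round(steady[-1], 4), round(max(abs(v) for v in jitter[-20:]), 4))
```

The filter only uses the current and previous samples, so it can run as the data arrives.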
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
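Several of the variants listed above can be combined in one short routine. The sketch below is one of many possible designs, not a reference method: it uses a fixed threshold, a local-maximum test, a greedy minimum-spacing rule that keeps the strongest peaks, and parabolic interpolation through three points. The function name and parameters are hypothetical.

```python
def detect_peaks(signal, threshold=0.0, min_spacing=1):
    """Return a list of (location, amplitude) pairs, sorted by location."""
    # Local maxima above a fixed threshold.
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]
    # Spacing constraint: visit the strongest candidates first and drop any
    # candidate closer than min_spacing samples to an already kept peak.
    kept = []
    for i in sorted(candidates, key=lambda i: signal[i], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    # Parabolic interpolation refines location and amplitude between samples.
    peaks = []
    for i in sorted(kept):
        a, b, c = signal[i - 1], signal[i], signal[i + 1]
        offset = 0.5 * (a - c) / (a - 2 * b + c)
        peaks.append((i + offset, b - 0.25 * (a - c) * offset))
    return peaks


signal = [0.0, 1.0, 0.0, 0.2, 0.0, 3.0, 2.0, 0.0]
print(detect_peaks(signal, threshold=0.5, min_spacing=2))
print(detect_peaks(signal, threshold=0.5, min_spacing=5))
```

An adaptive threshold (for example a multiple of a running median) could replace the fixed one without changing the rest of the routine.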
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
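The tradeoff between the two window sizes shown above can be made concrete with a little arithmetic: the spacing between frequency bins is the sample rate divided by the window size, and the time span of one window is the window size divided by the sample rate. The helper name below is hypothetical.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    return sample_rate / window_size, window_size / sample_rate


for n in (256, 4096):
    bin_hz, duration_s = stft_resolution(n, 22050)
    print(f"{n} samples: {bin_hz:.2f} Hz per bin, {duration_s * 1000:.1f} ms per window")
```

The short window localizes events in time but blurs frequency; the long window does the opposite.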
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
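For slowly varying sensor data, the linear case mentioned in the last bullet is often enough. The following sketch resamples a list of readings by a given ratio using linear interpolation; it is a low-quality approximation compared with band-limited (sinc) interpolation, and the function names are hypothetical.

```python
def linear_interpolate(samples, t):
    """Value at fractional index t, for 0 <= t <= len(samples) - 1."""
    i = min(int(t), len(samples) - 2)  # index of the left neighbour
    frac = t - i
    return samples[i] * (1 - frac) + samples[i + 1] * frac


def resample(samples, ratio):
    """Resample by `ratio` (2.0 doubles the rate, 0.5 halves it)."""
    n_out = int((len(samples) - 1) * ratio) + 1
    return [linear_interpolate(samples, k / ratio) for k in range(n_out)]


readings = [0.0, 10.0, 20.0]
print(resample(readings, 2))    # upsample
print(resample(readings, 0.5))  # downsample (no anti-aliasing filter applied)
```

When downsampling signals with significant high-frequency content, a low-pass filter should precede this step to avoid aliasing.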
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  • Power law functions frequently used
• Frequently used in conjunction with calibration
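Range calibration and power-law scaling fit naturally into one small mapping function. The sketch below is an assumption-laden example rather than a prescribed method: the function names, the warm-up readings and the choice of exponent are all hypothetical.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from [lo, hi] to [out_min, out_max] with a power-law curve."""
    x = (reading - lo) / (hi - lo)   # normalize to 0..1
    x = max(0.0, min(1.0, x))        # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * x ** exponent


# Calibrate from a few warm-up readings, then map a slider to volume.
lo, hi = calibrate([37, 512, 980, 120, 15])
print(scale(512, lo, hi, 0.0, 1.0))                # linear mapping
print(scale(512, lo, hi, 0.0, 1.0, exponent=2.0))  # power law: finer control at low values
```

An exponent greater than 1 gives more resolution at the quiet end of the range, which is often closer to how loudness is perceived.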
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
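A correlation-based period estimate can be sketched directly from the definition of autocorrelation. The version below uses the plain O(N x lags) sum rather than the FFT-based computation alluded to above, and its function names and lag limits are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    avg = sum(signal) / n
    centered = [x - avg for x in signal]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(signal, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in [min_lag, max_lag]."""
    ac = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: ac[lag])


# A pulse every 4 samples, e.g. an onset-strength curve of a steady strum.
pulses = [1.0, 0.0, 0.0, 0.0] * 8
print(estimate_period(pulses, min_lag=2, max_lag=10))
```

Restricting the lag range to plausible periods is what keeps the estimate from locking onto lag 0 or onto multiples of the true period.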
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure labels: CF, CF-130, CF+130 (linear spacing); CF scaled by 1.0718 (log spacing)]
Pipeline: Mel-filtering -> Log -> DCT -> MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
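Energy compaction is easy to observe with a direct, unoptimized DCT-II. The sketch below uses the orthonormal scaling so that the total energy is preserved; the function name and the ramp test signal are illustrative choices, and practical implementations would use a fast transform instead of this O(N²) sum.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a list of numbers (direct O(N^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out


# A smooth signal: most of its energy ends up in the first few coefficients.
ramp = [float(i) for i in range(8)]
coeffs = dct_ii(ramp)
total = sum(c * c for c in coeffs)
low = sum(c * c for c in coeffs[:2])
print([round(c, 3) for c in coeffs])
print(f"fraction of energy in first 2 of 8 coefficients: {low / total:.3f}")
```

This is why keeping only the low-order coefficients, as MFCCs do, discards little of a smooth log-spectrum.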
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
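The first two items above, delta features and aggregate statistics, can be sketched with the standard library alone. The frame values and function names are hypothetical, and the delta here is a simple first difference; regression-based deltas over several frames are a common alternative.

```python
from statistics import mean, median, pvariance


def delta(frames):
    """First-order difference between consecutive feature vectors."""
    return [[b - a for a, b in zip(prev, cur)]
            for prev, cur in zip(frames, frames[1:])]


def aggregate(frames):
    """Per-dimension summary statistics over a sequence of feature vectors."""
    columns = list(zip(*frames))
    return {
        "mean": [mean(c) for c in columns],
        "median": [median(c) for c in columns],
        "variance": [pvariance(c) for c in columns],
    }


# Three hypothetical 2-dimensional feature frames (e.g. two MFCCs per frame).
frames = [[1.0, 2.0], [2.0, 4.0], [4.0, 8.0]]
print(delta(frames))
print(aggregate(frames))
```

Aggregating over a window turns a variable-length sequence into one fixed-length vector, which is what most classifiers expect.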
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
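Of the algorithms listed, K-nearest neighbours is the simplest to write down in full, since "training" is just storing the labelled examples. The sketch below assumes Euclidean distance and majority voting; the feature values and the "kick"/"snare" labels are invented for illustration.

```python
import math
from collections import Counter


def knn_classify(query, training_set, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours.

    training_set is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features (e.g. spectral centroid and decay time, normalized).
training_set = [
    ((0.10, 0.20), "kick"), ((0.15, 0.25), "kick"), ((0.20, 0.10), "kick"),
    ((0.80, 0.90), "snare"), ((0.85, 0.80), "snare"), ((0.90, 0.95), "snare"),
]
print(knn_classify((0.12, 0.20), training_set))
print(knn_classify((0.80, 0.85), training_set))
```

Features should be on comparable scales before distances are computed, otherwise one dimension dominates the vote.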
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
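The three measures in the first bullet can be computed from paired lists of true and predicted labels. This is a minimal sketch with hypothetical function names and made-up labels ("v" for vowel, "c" for consonant); cross-validation would wrap these functions in a loop over train/test splits.

```python
from collections import Counter


def confusion_matrix(truth, predicted):
    """Counts of (true label, predicted label) pairs."""
    return Counter(zip(truth, predicted))


def accuracy(truth, predicted):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


truth = ["v", "v", "c", "c", "c"]
predicted = ["v", "c", "c", "c", "v"]
print(accuracy(truth, predicted))
print(f_measure(truth, predicted, positive="v"))
print(confusion_matrix(truth, predicted))
```

Accuracy alone can be misleading when classes are unbalanced, which is one reason the per-class F-measure and the full confusion matrix are reported alongside it.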
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
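A bare-bones K-means loop shows the alternation between assigning points to the nearest centroid and recomputing the centroids. This sketch makes simplifying assumptions that real implementations usually avoid: it initializes the centroids with the first k points instead of a randomized scheme and runs for a fixed number of iterations. The function name and data are hypothetical.

```python
import math


def k_means(points, k, iterations=20):
    """Return (centroids, clusters) after a fixed number of K-means iterations."""
    centroids = [points[i] for i in range(k)]  # naive deterministic initialization
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
    return centroids, clusters


points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

The result depends on the initialization, so it is common to run the algorithm several times from different starting points and keep the best partition.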
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
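An ordinary least squares fit with a single input, together with its coefficient of determination, takes only a few lines. The sketch below is limited to one feature for brevity (the general case uses matrix algebra); the function names and data points are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual variation / total variation."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical sensor readings (x) against a measured control value (y).
xs = [0, 1, 2, 3]
ys = [1, 3, 4, 8]
slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
print(slope, intercept, round(r_squared(ys, predicted), 4))
```

An R squared close to 1 means the line explains most of the variation in the data; it says nothing by itself about whether the fit will generalize to new readings.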
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Reference: A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
93
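As a rough sketch of the multi-rate point above, the following resamples one stream onto another stream's clock by linear interpolation and then concatenates the values into joint feature vectors (an early-fusion arrangement). The function names, the 100 Hz and 60 Hz rates, and the ramp data are assumptions chosen for illustration.

```python
from bisect import bisect_right

def interp(t, times, values):
    """Linearly interpolate values (sampled at sorted times) at time t, clamping at the ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)          # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    w = (t - t0) / (t1 - t0)
    return (1 - w) * values[i - 1] + w * values[i]

def fuse_early(times_a, a, times_b, b):
    """Resample stream b onto stream a's clock and return joint feature vectors."""
    return [(va, interp(t, times_b, b)) for t, va in zip(times_a, a)]

# Hypothetical streams: a sampled at 100 Hz, b at 60 Hz, both simple ramps over 0.1 s.
times_a = [n / 100 for n in range(11)]
times_b = [n / 60 for n in range(7)]
a = [10 * t for t in times_a]
b = [5 * t for t in times_b]
fused = fuse_early(times_a, a, times_b, b)
print(fused[:3])
```

Late fusion would instead run a separate model on each stream at its native rate and only interpolate or combine the model outputs.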
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late fusion are the extremes of a full spectrum of possibilities
[Figure: processing stages shown: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
99
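The filtering task above can be sketched for a discrete hidden state as a predict-then-update recursion. This is a minimal illustration under assumptions: the function name `forward_filter`, the two-state model, and all probabilities are made up for the example and do not come from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each time step t.

    transition[i][j] = P(state_t = j | state_{t-1} = i)
    emission[j][o]   = P(observation = o | state = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight by the likelihood of the evidence, then normalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Hypothetical two-state model: state 0 = "playing", state 1 = "resting";
# observation 0 = "sensor active", observation 1 = "sensor quiet".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, [0, 0])
print([round(b[0], 3) for b in beliefs])
```

Only past and present observations are used, so the recursion is causal and can run as the sensor data arrives.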
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: hidden Markov model. The MODEL consists of a hidden state (a single discrete variable) with transition probabilities P(state at t | state at t-1), and observations with emission probabilities p(observation | state at t)]
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
102
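A scalar version makes the predict/update cycle and the reliability trade-off concrete. The sketch below assumes the simplest possible model (a random-walk state with no velocity term, which is simpler than the constant dx/dt case on the slide); the function name `kalman_1d` and all variances are illustrative choices.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state x_t = x_{t-1} + noise."""
    x, p = x0, p0                       # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed to stay put, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement using the Kalman gain.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates

# Hypothetical noisy readings of a value near 5.
readings = [5.3, 4.8, 5.1, 4.9, 5.2, 5.0, 4.7, 5.1]
print([round(e, 2) for e in kalman_1d(readings, process_var=1e-3, measurement_var=0.1)])
```

A large `measurement_var` makes the filter trust its model and move slowly; a small one makes it follow the measurements closely.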
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: regions (or region) for rejecting the null hypothesis, bounded by the critical value t(p, df); sample means X1 and X2 marked]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
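The formulas for these steps did not survive in this transcript. The sketch below uses the standard pooled-variance (equal variance) independent-samples form, which is an assumption about what the slide showed; the function name `independent_t` and the sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(sample1, sample2):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # Standard error of the difference between the two means.
    sed = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / sed
    return t, df

# Hypothetical task-completion scores for a control and a treatment group.
control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]
t, df = independent_t(control, treatment)
print(t, df)
```

For these numbers the statistic is t = -2 with df = 8. The two-tailed critical value at α = 0.05 for df = 8 is about 2.306, so |t| falls short and the null hypothesis is not rejected.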
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    httpwwwmedcalcbemanualmpage13-04bhtml
    httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
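The "means, variances, ratio" recipe can be sketched for the one-way case as follows. The function name `one_way_anova_f` and the three small groups are assumptions for illustration; looking the resulting F up in a table (or computing its p-value) is left out.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical ratings for three interface variants.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
print(one_way_anova_f(groups))

# Scaling and shifting every value leaves F unchanged.
rescaled = [[10 * v + 5 for v in g] for g in groups]
print(one_way_anova_f(rescaled))
```

With only two groups the same computation gives F equal to the square of the pooled-variance t statistic, which is one way to see ANOVA as a generalization of the t-test.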
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of gesture-to-speech mappings. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
bull 26+ dimensionsbull constrained subspace
bull 10 dimensions
Input Output
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech Output]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
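The behaviour described above (closest unit dominates, smooth interpolation between centres) can be checked with a small sketch of a normalized Gaussian RBF layer. The function name `normalized_rbf`, the two centres, and the width of 0.2 are illustrative assumptions, not the Glove-TalkII parameters.

```python
from math import exp

def normalized_rbf(x, centres, width):
    """Gaussian RBF activations, normalized to sum to 1 across the group of units."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtracting the smallest distance avoids numerical underflow far from all
    # centres and does not change the normalized result.
    nearest = min(d2)
    raw = [exp(-(d - nearest) / (2 * width ** 2)) for d in d2]
    total = sum(raw)
    return [r / total for r in raw]

centres = [(0.0, 0.0), (1.0, 0.0)]            # hypothetical unit centres in X-Y space
print(normalized_rbf((0.0, 0.0), centres, 0.2))   # first unit dominates
print(normalized_rbf((0.5, 0.0), centres, 0.2))   # halfway: equal activation
print(normalized_rbf((-5.0, 0.0), centres, 0.2))  # far away: nearest unit still dominates
```

Because of the normalization, a gesture far from every centre still activates the nearest unit almost fully, which is the plateau effect noted on the slide.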
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca   George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music
22
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort
Effort and pain
If you are lucky
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
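One common way to turn a variable resistance into a voltage is a voltage divider followed by an analog-to-digital converter. The sketch below assumes that arrangement; the 5 V supply, 10 kΩ fixed resistor, 10-bit converter, and both function names are illustrative choices rather than details from the tutorial.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor + fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Ideal ADC: quantize 0..v_ref to an integer code in 0..2**bits - 1."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))

# Hypothetical sensor whose resistance drops as it is pressed harder.
for r_sensor in (100_000, 10_000, 1_000):
    v = divider_voltage(r_sensor, r_fixed=10_000)
    print(r_sensor, "ohm ->", round(v, 3), "V -> code", adc_code(v))
```

The choice of fixed resistor sets where in the sensor's resistance range the output voltage changes fastest, so it is usually picked close to the middle of the range of interest.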
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D/3D/ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
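Several of the variants listed above can be combined in one small routine. The sketch below is one possible arrangement, assuming a fixed threshold, a strict local-maximum test, a greedy strongest-first spacing rule, and parabolic interpolation around each peak; the function name `find_peaks` and the test signal are hypothetical.

```python
def find_peaks(signal, threshold, min_spacing):
    """Local maxima above a fixed threshold, at least min_spacing samples apart.

    Returns (location, amplitude) pairs refined by parabolic interpolation.
    """
    candidates = [
        n for n in range(1, len(signal) - 1)
        if signal[n] > threshold and signal[n - 1] < signal[n] >= signal[n + 1]
    ]
    # Spacing constraint: visit the strongest candidates first and drop
    # any candidate that is too close to one already kept.
    kept = []
    for n in sorted(candidates, key=lambda n: signal[n], reverse=True):
        if all(abs(n - m) >= min_spacing for m in kept):
            kept.append(n)
    peaks = []
    for n in sorted(kept):
        a, b, c = signal[n - 1], signal[n], signal[n + 1]
        denom = a - 2 * b + c           # negative at a strict local maximum
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        peaks.append((n + offset, b - 0.25 * (a - c) * offset))
    return peaks

# Hypothetical onset-strength curve with three bumps.
signal = [0, 1, 0, 0, 3, 0, 0.5, 2, 0.5, 0]
print(find_peaks(signal, threshold=0.8, min_spacing=2))
print(find_peaks(signal, threshold=0.8, min_spacing=4))
```

An adaptive threshold would replace the fixed `threshold` with a value derived from a running mean or median of the recent signal.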
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
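The trade-off between the two window sizes on this slide follows from simple arithmetic: a longer window covers more time but gives narrower frequency bins. A small sketch (the function name `stft_resolution` is an assumption):

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin width in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_width_hz = sample_rate / window_size
    return duration_ms, bin_width_hz

for n in (256, 4096):
    ms, hz = stft_resolution(n, 22050)
    print(f"{n} samples: {ms:.1f} ms window, {hz:.2f} Hz per bin")
```

At 22050 Hz, 256 samples span about 11.6 ms with bins about 86 Hz wide, while 4096 samples span about 186 ms with bins about 5.4 Hz wide.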
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
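Calibration and scaling often appear together as a two-step mapping: first learn the sensor's observed range, then normalize, shape, and map readings to the control range. The sketch below assumes that arrangement; the function names, the knob readings, and the exponent of 2 are illustrative.

```python
def calibrate(readings):
    """Calibration: find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)

def scale(raw, lo, hi, out_min, out_max, exponent=1.0):
    """Scaling: normalize raw to 0..1 over the calibrated range, apply a
    power law, then map to the output range."""
    x = (raw - lo) / (hi - lo)
    x = max(0.0, min(1.0, x))           # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * x ** exponent

# Hypothetical rotary knob mapped to an oscillator frequency in Hz.
lo, hi = calibrate([300, 100, 900, 500])
for raw in (100, 500, 900):
    print(raw, "->", scale(raw, lo, hi, 100.0, 1000.0, exponent=2.0), "Hz")
```

An exponent above 1 gives finer control at the low end of the output range; an exponent of 1 reduces to a plain linear map.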
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
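As a small illustration of correlation-based periodicity detection, the sketch below computes the autocorrelation by direct summation and picks the lag with the strongest peak. It deliberately uses the slow O(N²) sum rather than the FFT route hinted at above; the function names and the sine test signal are assumptions.

```python
from math import sin, pi

def autocorrelation(x):
    """Direct autocorrelation: r[lag] = sum over n of x[n] * x[n + lag]."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(n)]

def estimate_period(x, min_lag, max_lag):
    """Lag with the largest autocorrelation value within [min_lag, max_lag]."""
    r = autocorrelation(x)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Hypothetical sensor signal repeating every 20 samples.
x = [sin(2 * pi * n / 20) for n in range(200)]
print(estimate_period(x, min_lag=5, max_lag=30))
```

Restricting the lag range matters: lag 0 always has the largest value, and multiples of the true period also produce peaks.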
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
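Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: mel filter bank; filter edges labeled relative to the center frequency CF (CF-130 and CF+130; CF/1.0718 and CF x 1.0718)]
Processing chain: Mel-filtering → Log → DCT → MFCCs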
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away high coefficients
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
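A sketch of the energy compaction property, assuming an unnormalized DCT-II and a made-up smooth input frame standing in for a log mel spectrum. The helper name `dct2` and the frame shape are illustrative assumptions.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]


# A smooth, slowly varying 40-point sequence (stand-in for one log mel frame).
frame = [math.exp(-((i - 12) / 8.0) ** 2) for i in range(40)]
coeffs = dct2(frame)

# Fraction of the total squared coefficient magnitude in the first 13 coefficients.
energy = [c * c for c in coeffs]
compaction = sum(energy[:13]) / sum(energy)
```

For a smooth input like this one, nearly all of the squared magnitude sits in the low coefficients, which is why keeping only the first 13-20 loses little.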
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
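A sketch of PCA as eigenanalysis, using the covariance matrix of mean-centered data and power iteration to find only the first principal component. Both choices are simplifying assumptions; a full implementation would compute all eigenvectors, and may standardize the features first to work with the correlation matrix.

```python
import math


def covariance_matrix(data):
    """Covariance matrix and mean of the rows in `data` (equal-length vectors)."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return cov, mean


def leading_eigenvector(matrix, iterations=200):
    """Power iteration: converges to the eigenvector with the largest eigenvalue."""
    d = len(matrix)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(matrix[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


# Toy data lying close to the line y = 2x.
points = [[float(x), 2.0 * x + 0.1 * ((-1) ** i)] for i, x in enumerate(range(10))]
cov, mean = covariance_matrix(points)
pc1 = leading_eigenvector(cov)
# Project each centered point onto the first principal component (1-D features).
projected = [sum((p[j] - mean[j]) * pc1[j] for j in range(2)) for p in points]
```

Here `pc1` plays the role of a one-column projection matrix: it maps each 2-D point to a single coordinate along the direction of greatest variance.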
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
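A minimal sketch of plain k-means (Lloyd's algorithm) with caller-supplied initial centers. The function name, the fixed iteration count and the toy data are assumptions for illustration.

```python
def kmeans(points, centers, iterations=20):
    """Plain k-means (Lloyd's algorithm) on lists of equal-length tuples."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(len(centers)), key=lambda c: dist2(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its assigned points.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep the center of an empty cluster
        centers = new_centers
    return centers, labels


data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, labels = kmeans(data, centers=[(0.0, 0.0), (1.0, 1.0)])
```

The result depends on the initial centers, which is why the number of clusters K and a sensible initialization usually come from domain knowledge.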
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels)
  – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
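A sketch of ordinary least squares with one feature, followed by the coefficient of determination as the goodness-of-fit measure. The humidity and temperature values are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Made-up data for illustration only.
humidity = [30.0, 40.0, 50.0, 60.0, 70.0]
temperature = [29.0, 26.5, 24.0, 22.5, 19.0]
slope, intercept = fit_line(humidity, temperature)
r2 = r_squared(temperature, [slope * h + intercept for h in humidity])
```

An R squared close to 1 means the fitted line explains most of the variance in the target; it says nothing by itself about statistical significance.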
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each Feature Extraction → Dimensionality Reduction → Feature Selection → Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history – typically just previous time slice)
• Transition Model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
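[Figure: hidden Markov model. Hidden state (a single discrete variable) evolves from t-1 to t according to the transition probabilities P(state t | state t-1); the observations are generated from the hidden state according to the emission probabilities p(observation t | state t)]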
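A sketch of the filtering task for a discrete hidden Markov model (the forward algorithm). The two-state model, its probabilities and the observation coding are hypothetical and chosen only for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each time step t.

    prior[i]         : P(state_0 = i)
    transition[i][j] : P(state_t = j | state_{t-1} = i)
    emission[i][o]   : P(observation = o | state = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for step, obs in enumerate(observations):
        if step > 0:
            # Prediction: push the belief through the transition model.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Update: weight by the emission likelihood and renormalize.
        belief = [belief[j] * emission[j][obs] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Hypothetical two-state model: state 0 = "silence", state 1 = "note".
prior = [0.5, 0.5]
transition = [[0.9, 0.1], [0.2, 0.8]]
emission = [[0.8, 0.2], [0.3, 0.7]]   # observation 0 = quiet, 1 = loud
beliefs = forward_filter(prior, transition, emission, [1, 1, 0])
```

The loop only uses past and current observations, so it is causal and can run sample by sample on a live sensor stream.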
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
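A sketch of the predict/update cycle for the simplest case: a scalar state that is assumed to stay constant apart from process noise. The function name, the noise variances and the readings are assumptions for illustration; the general filter replaces the scalars with a mean vector and a covariance matrix.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a constant-state model x_t = x_{t-1} + noise."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged, uncertainty grows by process noise.
        p = p + process_var
        # Update step: the gain balances model and observation reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Noisy readings of a sensor whose true value is about 1.0 (made-up numbers).
readings = [1.2, 0.8, 1.1, 0.9, 1.05, 0.95, 1.0, 1.1, 0.9, 1.0]
smoothed = kalman_1d(readings, process_var=1e-4, measurement_var=0.04)
```

A larger `measurement_var` makes the filter trust the model more and respond more slowly; a larger `process_var` does the opposite.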
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure captions: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
• Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution curves marking the regions (two-tailed) or the region (one-tailed) for rejecting the null hypothesis, the sample means X1 and X2, and the critical value t(p,df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note df computation
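The formulas for these steps did not survive in the transcript. The sketch below uses the standard independent two-sample t statistic with pooled variance, which is assumed to be what the slide showed; the function name and the data are made up for illustration.

```python
import math


def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled (combined) variance."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2                                       # degrees of freedom
    pooled_var = (ss1 + ss2) / df                          # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))    # std. error of difference
    return (m1 - m2) / sed, df


# Made-up task completion times (seconds) for two interface designs.
design_a = [12.0, 14.0, 11.0, 13.0, 15.0]
design_b = [16.0, 18.0, 17.0, 15.0, 19.0]
t, df = pooled_t(design_a, design_b)
```

For these numbers the statistic is t = -4 with df = 8. The two-tailed critical value at α = 0.05 for df = 8 is 2.306, so |t| exceeds it and the null hypothesis can be rejected at that level.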
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of gesture-to-speech mappings. Categories along the axis: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Axis: approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions
  – constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Processing: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function. Output: synthesizer, speech output]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
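A sketch of a normalized Gaussian RBF layer with a linear output, to make the "closest RBF dominates" behaviour concrete. The two-centre network and its weights are hypothetical and are not the Glove-TalkII networks; only the width 0.15 is borrowed from the vowel network slide.

```python
import math


def normalized_rbf(x, centres, width, weights):
    """Output of a normalized Gaussian RBF layer followed by a linear output.

    centres : list of centre vectors (one per hidden unit)
    width   : shared standard deviation of the Gaussians
    weights : weights[h][o] from hidden unit h to output o
    """
    d2 = [sum((a - c) ** 2 for a, c in zip(x, centre)) for centre in centres]
    # Subtracting the smallest distance leaves the normalized result unchanged
    # but avoids dividing by zero when the input is far from every centre.
    nearest = min(d2)
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(acts)
    acts = [a / total for a in acts]  # normalization over all units in the group
    n_out = len(weights[0])
    return [sum(acts[h] * weights[h][o] for h in range(len(centres)))
            for o in range(n_out)]


# Hypothetical 2-input, 2-centre, 1-output network.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[100.0], [200.0]]
near_first = normalized_rbf((0.05, 0.0), centres, 0.15, weights)
midway = normalized_rbf((0.5, 0.0), centres, 0.15, weights)
far_away = normalized_rbf((-50.0, 0.0), centres, 0.15, weights)
```

Near a centre the output plateaus at that unit's weight; between centres it interpolates, with the width parameter controlling how sharp the transition is.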
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
[Figure: the Glove-TalkII block diagram shown earlier, repeated]
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary
23
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure: annotations on the process steps: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
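One common way to turn a resistance change into a voltage is a two-resistor voltage divider read by an analog-to-digital converter. The sketch below assumes that circuit and an ideal converter; the supply voltage, resistor values and 10-bit resolution are hypothetical.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a two-resistor divider."""
    return v_supply * r_fixed / (r_fixed + r_sensor)


def adc_code(voltage, v_ref, bits):
    """Ideal ADC: quantize a voltage in [0, v_ref] to an integer code."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * levels)


# Hypothetical values: 5 V supply, 10 kOhm fixed resistor, 10-bit converter.
v_light_touch = divider_voltage(5.0, 10_000.0, 100_000.0)   # high sensor resistance
v_firm_press = divider_voltage(5.0, 10_000.0, 1_000.0)      # low sensor resistance
codes = (adc_code(v_light_touch, 5.0, 10), adc_code(v_firm_press, 5.0, 10))
```

As the sensor resistance drops, more of the supply voltage appears across the fixed resistor and the digitized reading rises.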
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
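As a minimal sketch of a digital filter operating on samples, the one-pole low-pass below smooths a jittery sensor stream. The function name and the `alpha` coefficient are illustrative assumptions, not something taken from the slides.

```python
def one_pole_lowpass(samples, alpha):
    """Causal low-pass: y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha in (0, 1]; smaller values attenuate high frequencies more.
    """
    out = []
    y = 0.0  # assumed initial filter state
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


# Hypothetical jittery sensor readings alternating between 0 and 1.
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smoothed = one_pole_lowpass(noisy, alpha=0.2)
```

Because each output depends only on the current and past inputs, a filter of this kind can run sample by sample on live data.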
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
– Fixed and adaptive threshold
– Local maximum
– Spacing constraints
– Interpolation
– Output is peak locations and amplitudes
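The variants listed above can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, and a minimum spacing; the function name, parameters, and the example envelope are assumptions made for illustration only.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    Peaks closer than min_spacing samples to the previous accepted peak are
    merged, keeping whichever of the two is larger.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks


# Hypothetical onset-strength envelope.
envelope = [0.0, 0.2, 0.9, 0.3, 0.1, 0.5, 0.4, 0.0, 0.8, 0.1]
peaks = find_peaks(envelope, threshold=0.45, min_spacing=2)
```

An adaptive threshold or sub-sample interpolation of the peak position would be natural extensions, but they are left out to keep the sketch short.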
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
– Simplest approach: LPF
– Spectral subtraction using noise profile
– Median filtering in time-frequency plane
• Image
– Challenge: preserve edge discontinuities
– Gaussian filtering (2D)
– Anisotropic diffusion – preserves edges
– Median filtering
– Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
– L/M: interpolation (L) followed by decimation (M)
– Band-limited interpolation at arbitrary points
– Sinc interpolation results in perfect reconstruction for band-limited continuous signals
– Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
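For sensor data, the linear case mentioned above is often enough. The following is a rough sketch of linear-interpolation resampling, for example to bring a 60 Hz stream onto a 100 Hz grid; the function name and the example values are hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample a uniformly sampled sequence by linear interpolation."""
    n_out = int((len(samples) - 1) * dst_rate / src_rate) + 1
    out = []
    for k in range(n_out):
        t = k * src_rate / dst_rate  # output time in units of source samples
        i = int(t)
        frac = t - i
        if i + 1 < len(samples):
            out.append((1.0 - frac) * samples[i] + frac * samples[i + 1])
        else:
            out.append(samples[-1])  # last output falls on the final sample
    return out


# Hypothetical position readings captured at 60 Hz, resampled to 100 Hz.
position_60hz = [0.0, 0.6, 1.2, 1.8, 2.4, 3.0, 3.6]
position_100hz = resample_linear(position_60hz, src_rate=60, dst_rate=100)
```

Linear interpolation is not band-limited, so it trades some quality for very low computational cost, which matches the trade-off described in the list.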
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
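Calibration and scaling are often chained: first find the sensor's observed range, then map a reading into the control parameter's range, optionally through a power law. The sketch below assumes a knob whose raw readings fall between about 100 and 900; all names and numbers are illustrative.

```python
def calibrate(readings):
    """Range calibration: observed minimum and maximum of a sensor."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a raw reading in [lo, hi] to [out_min, out_max] with a power law."""
    x = (reading - lo) / (hi - lo)
    x = min(1.0, max(0.0, x))  # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * x ** exponent


# Hypothetical calibration sweep of a rotary knob, then mapping to Hz.
lo, hi = calibrate([130, 100, 480, 900, 870, 512])
frequency = scale(500, lo, hi, out_min=100.0, out_max=1000.0, exponent=2.0)
```

With `exponent=1.0` the mapping is linear; larger exponents give finer control near the low end of the output range.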
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
– Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
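A correlation-based periodicity estimate can be sketched directly from the definition of autocorrelation. The note about powers of 2 presumably refers to computing the same quantity with FFTs; the direct sum below costs O(N x max_lag) and is only meant to show the idea. Function names and the test signal are assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of a mean-removed sequence for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical strumming-like signal with a period of 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
period = dominant_period(signal, min_lag=5, max_lag=60)
```

The lower bound `min_lag` skips the trivial maximum at lag 0; multiples of the true period also score highly, which is why practical systems add further heuristics.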
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
– K-means
– Compression-based
– Histogram-based
– Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
– Target representation and location (bottom up)
– Target filtering and data association (top down)
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
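Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filters around a center frequency CF, with edges offset by 130 (CF-130, CF+130) for the linearly-spaced filters and scaled by a factor of 1.0718 for the log-spaced filters]
Processing chain: Mel-filtering -> Log -> DCT -> MFCCs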
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal; can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
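The energy compaction property can be checked numerically. The sketch below implements an orthonormal DCT-II from its definition and measures how much of the energy of a smooth 8-point vector lands in the first two coefficients; the input vector is a made-up stand-in for smooth log filter-bank energies.

```python
import math


def dct2_orthonormal(x):
    """Orthonormal DCT-II computed directly from the definition (O(N^2))."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        coeffs.append(scale * total)
    return coeffs


# Hypothetical smooth vector (a ramp) standing in for log filter-bank outputs.
smooth = [float(i) for i in range(8)]
coeffs = dct2_orthonormal(smooth)
energy = [c * c for c in coeffs]
compaction = sum(energy[:2]) / sum(energy)  # fraction held by first 2 coefficients
```

Because the transform is orthonormal, total energy is preserved, so discarding the high coefficients loses only the small remainder.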
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
– Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
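Delta features and aggregate statistics are simple to compute over a sequence of feature frames. The sketch below uses a central difference for the deltas (one common choice among several) and per-dimension mean and variance; the function names and the toy frames are assumptions.

```python
def delta(frames):
    """Central-difference delta features; edge frames are repeated."""
    last = len(frames) - 1
    deltas = []
    for t in range(len(frames)):
        prev = frames[max(t - 1, 0)]
        nxt = frames[min(t + 1, last)]
        deltas.append([(b - a) / 2.0 for a, b in zip(prev, nxt)])
    return deltas


def mean_and_variance(frames):
    """Per-dimension mean and (population) variance over time."""
    n = len(frames)
    dims = range(len(frames[0]))
    means = [sum(f[d] for f in frames) / n for d in dims]
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n for d in dims]
    return means, variances


# Hypothetical one-dimensional feature track (e.g. one coefficient per frame).
frames = [[0.0], [1.0], [2.0], [3.0]]
frame_deltas = delta(frames)
means, variances = mean_and_variance(frames)
```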
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
– Generative approaches
  • Naïve Bayes
  • Gaussian Mixture Models
– Discriminative approaches
  • Support Vector Machines
  • Decision trees
– Non-parametric
  • K-nearest Neighbors
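Of the algorithms listed, K-nearest neighbors is the shortest to sketch: it stores the training set and votes among the closest examples. The feature vectors and the "kick"/"snare" labels below are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to the query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features for two drum sounds.
train = [
    ((0.0, 0.0), "kick"), ((0.1, 0.2), "kick"), ((0.2, 0.1), "kick"),
    ((1.0, 1.0), "snare"), ((0.9, 1.1), "snare"), ((1.1, 0.8), "snare"),
]
label = knn_predict(train, (0.15, 0.1), k=3)
```

`math.dist` requires Python 3.8 or later.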
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
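The three measures named above are closely related: accuracy and per-class F-measure can both be read off the confusion matrix. A small sketch, with made-up ground truth and predictions:

```python
def confusion_matrix(truth, predicted, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, predicted):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


def f_measure(matrix, i):
    """Harmonic mean of precision and recall for class i."""
    tp = matrix[i][i]
    if tp == 0:
        return 0.0
    fp = sum(row[i] for row in matrix) - tp
    fn = sum(matrix[i]) - tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels for six drum hits.
labels = ["kick", "snare", "hat"]
truth = ["kick", "kick", "kick", "snare", "snare", "hat"]
predicted = ["kick", "kick", "snare", "snare", "snare", "hat"]
cm = confusion_matrix(truth, predicted, labels)
```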
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
– Hierarchical
– Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
– EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
– DBScan
• Graph-based
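A bare-bones K-means loop alternates between assigning points to the nearest centroid and recomputing centroids. The sketch below seeds the centroids with the first K points purely to keep the example deterministic; that initialization, the function name, and the data are assumptions.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain K-means; returns (centroids, cluster index per point)."""
    centroids = [list(p) for p in points[:k]]  # assumed deterministic seeding
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignment


# Hypothetical 2-D feature vectors forming two groups.
points = [(0.0, 0.0), (5.0, 5.0), (0.2, 0.1), (5.1, 4.9), (0.1, 0.3), (4.8, 5.2)]
centroids, assignment = kmeans(points, k=2)
```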
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
– Davies-Bouldin Index
– Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
– F-measure
– Rand measure
– Jaccard Index
– Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
– Linear regression
– Ordinary least squares
• Non-parametric
– Kernel regression
– Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
– Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
– Parametric methods
– F-test
– t-test for individual parameters
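For a single input variable, ordinary least squares and the coefficient of determination both have closed forms. The sketch below fits a line and reports R squared; the data points are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical sensor value (x) against a measured continuous target (y).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```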
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each with the stages Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
– Observable
– Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
– Estimate what transitions occurred and what states generated the sensor reading, and update models
– Updated models provide new estimates and
99
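The filtering task can be sketched for a discrete hidden state: alternate a prediction through the transition model with an update from the sensor model, then renormalize. The two states ("strumming", "resting"), the binary observation, and every probability below are hypothetical.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state given all evidence so far.

    transition[i][j] = P(state_t = j | state_{t-1} = i)
    emission[j][o]   = P(observation = o | state = j)
    """
    n = len(prior)
    belief = list(prior)
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
        # Update: weight by the likelihood of the observation, then normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief


# State 0 = "strumming", state 1 = "resting"; observation 0 = motion detected.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
belief = forward_filter(prior, transition, emission, observations=[0, 0])
```

After two "motion detected" observations the belief in state 0 rises to roughly 0.88 under these assumed numbers.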
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
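[Figure: hidden Markov model. A hidden state (single discrete variable) evolves from t-1 to t according to the transition probabilities; at each time t the hidden state generates an observation according to the emission probabilities. The transition and emission probabilities together form the model.]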
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
– Prior knowledge of state
– Prediction step (based on some model)
– Update step (compare prediction to measurements)
– Readjust model
– Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
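In one dimension the forward step reduces to a few scalar updates. The sketch below assumes the simplest possible state model (the state stays put, plus process noise) rather than the constant-velocity model mentioned above; the noise variances and names are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with an identity state model (assumed).

    x is the state mean, p its variance. Returns the estimate after each update.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction: state unchanged, uncertainty grows by the process noise.
        p = p + process_var
        # Update: the gain balances model reliability against sensor reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical constant true value of 1.0, observed 20 times.
estimates = kalman_1d([1.0] * 20, process_var=0.01, measurement_var=0.1)
```

A large `measurement_var` makes the filter trust its model and respond slowly; a large `process_var` makes it follow the measurements closely.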
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
– comprehensive understanding of current (work) practices
– greater ability to predict the impact of a new or re-designed technology
– possibly greater buy-in for the system
• down side
– principal cost is time, both the ethnographer's and the users'
– could perpetuate negative aspects of current practices
– can produce vast (unmanageable) amount of data
– output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
– active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
– users are excellent at reacting to suggested system designs
  • designs must be concrete and visible
– users bring in important "folk" knowledge of work context
  • knowledge may be otherwise inaccessible to design team
– greater buy-in for the system often results
• down side
– hard to get a good pool of end users
  • expensive, reluctant
– users are not expert designers
  • don't expect them to come up with design ideas from scratch
– the user is not always right
  • don't expect them to know what they want
– conservative bias to perpetuate current practices
  • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
– the voice editor by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
– "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
– interprets subject's input according to an algorithm
– has computer/screen behave in appropriate manner
• good for
– adding simulated and complex vertical functionality
– testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
– Assume normality for relevant distributions; work in parameter space (means and variances)
– Student t-test and ANOVA
• Non-parametric (no normality assumption)
– Kruskal-Wallis
– Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
– p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
– Control group and treatment group (n = participants in each group)
– Same group before and after treatment
– Assumptions: sample size, variance
  • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
1. compute df
2. choose desired significance p (aka α)
3. calculate value of the t statistic
4. compare it to the critical value of t given p, df: t(p, df)
5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: sampling distribution with the critical value t(p, df) marked, showing either two regions or one region for rejecting the null hypothesis]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note df computation
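The formulas for these steps did not survive in the transcript. As a stand-in, the sketch below follows the standard independent-samples t-test with pooled variance, where df = n1 + n2 - 2; this is the textbook form and may differ in notation from the original slide. The sample values are invented.

```python
import math


def independent_t(sample1, sample2):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                        # combined variance
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))      # std. error of difference
    return (m1 - m2) / sed, df


# Hypothetical task scores for a treatment group and a control group.
treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [1.0, 2.0, 3.0, 4.0, 5.0]
t_value, df = independent_t(treatment, control)
```

The resulting `t_value` is then compared with the critical value for the chosen significance level and `df`.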
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
– in back of any statistics textbook
– on web, e.g. httpwwwmedcalcbemanualmpage13-04bhtml, httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
– Matlab or Excel functions
– web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
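The "dividing two variances" step can be made concrete for the one-way case: the F ratio is the between-group mean square over the within-group mean square. The sketch below computes it and also illustrates the invariance to scaling and bias; group values and names are made up.

```python
def one_way_anova_f(groups):
    """F ratio and (between, within) degrees of freedom for one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, (k - 1, n - k)


# Hypothetical ratings of three interface variants.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, dfs = one_way_anova_f(groups)

# Rescaling and shifting every observation leaves F unchanged.
rescaled = [[10.0 * x + 5.0 for x in g] for g in groups]
f_rescaled, _ = one_way_anova_f(rescaled)
```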
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
– Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
– two antennae
– change in electrostatic field translated to pitch control of heterodyne oscillators
– beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine 1940s)
• sensor keyboard
– downward and side-to-side
– potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
– like a musical instrument
• mapping partially learned
• speech is
– intelligible
– expressive
– slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: mapping types along an axis: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator]
Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
Approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
– open configuration of hand represents open vocal tract
– X, Y position determines vowel sounds (like tongue)
• Consonants
– constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• 26+ dimensions
• constrained subspace
• 10 dimensions
Input Output
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
– Mixture of experts model
  • Vowel/Consonant decision network
  • Vowel Expert network
  • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), foot pedal, contact switches. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
– Input: 10 flex angles
  • 4x2 finger joints
  • Thumb abduction and thumb rotation
– Output
  • Probability of vowel
– Training
  • 2600 consonants, 700 vowels
  • 0 error
– Testing
  • 1380 consonants, 234 vowels
  • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
– Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
– Fixed std deviation (015)
– Input: x, y values
– Output: 8 formant parameters
• Training
– 100 examples of each vowel with random noise added
– 01 error
• Testing
– 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
– Gaussian activation
• Weights are centre
– Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
– interpolation according to width parameter
– Plateaus around nearest centre
• Closest RBF dominates
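The plateau behaviour follows from dividing each Gaussian activation by the sum over all units. The sketch below is a generic normalized RBF layer, not the Glove-TalkII implementation; the centres and widths are arbitrary illustrative values.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian RBF activations normalized to sum to 1 across all units."""
    raw = [
        math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centres
    ]
    total = sum(raw)
    return [r / total for r in raw]


# Two hypothetical centres in a 2-D hand-position space.
centres = [(0.0, 0.0), (1.0, 0.0)]
narrow = normalized_rbf((0.1, 0.0), centres, width=0.15)  # closest unit dominates
wide = normalized_rbf((0.1, 0.0), centres, width=1.0)     # smoother interpolation
midpoint = normalized_rbf((0.5, 0.0), centres, width=0.15)
```

With a small width the nearest unit's normalized activation is close to 1 over a broad region around its centre, which gives the plateau; a larger width blends neighbouring units more gradually.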
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
– Input: finger and thumb angles
  • Note: added thumb abduction and rotation later
– Output: formant parameters and voicing
• Training
– 350 approximants, 1510 fricatives, and 740 nasals
– 005 error
• Testing
– 255 approximants, 960 fricatives, 160 nasals
– 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), foot pedal, contact switches. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
– ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
– 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test, and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
24
Sensors and Actuators
[Figure: annotations alongside the process steps: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more ...
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
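A common way to turn a variable resistance into a voltage is a voltage divider followed by an analog-to-digital converter. The sketch below assumes the sensor sits on the supply side with a fixed resistor to ground, and an ideal 12-bit converter; the wiring, component values, and function names are assumptions, not taken from the slide.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a two-resistor divider."""
    return v_supply * r_fixed / (r_fixed + r_sensor)


def adc_counts(voltage, v_ref, bits):
    """Ideal ADC: quantize a voltage in [0, v_ref] to an integer code."""
    levels = 2 ** bits
    return min(levels - 1, int(voltage / v_ref * levels))


# Hypothetical force-sensing resistor at 10 kOhm with a 10 kOhm fixed resistor.
v_out = divider_voltage(5.0, r_fixed=10_000, r_sensor=10_000)
code = adc_counts(v_out, v_ref=5.0, bits=12)
```

In this arrangement the output voltage rises as the sensor's resistance falls, so pressing harder on a force-sensing resistor gives a larger reading.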
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more ...
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
 – lightly packed carbon
 – compression increases conduction
 – needs power supply
• Capacitor (condenser)
 – capacitor between a stationary metal plate and a light metallic diaphragm
 – compression changes capacitance by moving diaphragm
 – needs power supply
• Electret and Piezoelectric
 – mentioned before
 – no external power needed
• Magnetic (moving coil)
 – induction: moving conductor in magnetic field
 – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
 – about 3 transistors (photodiode version)
 – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
 – only viable recently as sub-micron tech gets better
 – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
 – head, arms and feet
 – body geometry
 – 20 joints per person
• face recognition
• RGB camera
 – 30 Hz
• depth sensor
 – infrared projection + camera
• microphone array
 – directional sound localization, speech recognition and noise cancellation
• Cheap
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
 – Brushless DC that divides rotation into equal steps
 – Move and hold, no feedback circuitry required
 – Low cost
• Servo Motor
 – Motor coupled with sensor for feedback
 – High performance alternative to stepper motors
 – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
 – GPS
 – Accelerometer
 – Gyro
 – Temperature
 – Microphone
 – Capacitive display
 – Networking
 – input port for more
• Actuators
 – Speaker
 – Headphones
 – Vibrator
 – High-res display
 – Networking
 – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
 – e.g. National Instruments DAQ
  • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
  • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
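As a concrete instance of a digital filter operating on samples, here is a minimal one-pole low-pass (exponential smoothing) sketch. The function name and the smoothing coefficient `alpha` are illustrative choices, not something the slides prescribe.

```python
def one_pole_lowpass(samples, alpha):
    """Apply y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1.

    Small alpha attenuates fast changes (high frequencies) more strongly.
    The state starts at the first sample to avoid a start-up jump.
    """
    if not samples:
        return []
    y = samples[0]
    out = []
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


# Slow content (a constant) passes; the fastest possible alternation is attenuated.
steady = one_pole_lowpass([1.0] * 10, 0.1)
jitter = one_pole_lowpass([1.0, -1.0] * 50, 0.1)
```

This kind of filter is often enough to smooth noisy sensor readings before mapping them to a control parameter.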
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
 – Fixed and adaptive threshold
 – Local maximum
 – Spacing constraints
 – Interpolation
 – Output is peak locations and amplitudes
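The variants listed above can be combined in a few lines. The following is a sketch under simple assumptions (fixed threshold, strict local maximum, a minimum spacing that keeps the stronger of two close peaks, and parabolic interpolation over three samples); the function and parameter names are made up for illustration.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs for peaks in x."""
    peaks = []
    for i in range(1, len(x) - 1):
        if x[i] < threshold:
            continue  # fixed threshold
        if not (x[i] > x[i - 1] and x[i] >= x[i + 1]):
            continue  # not a local maximum
        # Parabolic interpolation through the three samples around the maximum.
        denom = x[i - 1] - 2.0 * x[i] + x[i + 1]
        offset = 0.5 * (x[i - 1] - x[i + 1]) / denom if denom != 0 else 0.0
        location = i + offset
        amplitude = x[i] - 0.25 * (x[i - 1] - x[i + 1]) * offset
        # Spacing constraint: of two peaks that are too close, keep the larger.
        if peaks and location - peaks[-1][0] < min_spacing:
            if amplitude > peaks[-1][1]:
                peaks[-1] = (location, amplitude)
            continue
        peaks.append((location, amplitude))
    return peaks


signal = [0, 1, 4, 1, 0, 0, 2, 5, 2, 0]
all_peaks = find_peaks(signal, threshold=3)
sparse_peaks = find_peaks(signal, threshold=3, min_spacing=10)
```

An adaptive threshold could replace the fixed one by deriving `threshold` from a running statistic of the signal.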
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
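The time-frequency tradeoff for the two window sizes on this slide can be worked out numerically, and a spectrogram can be sketched as a list of power spectra. This is an illustrative sketch only: it uses a naive DFT for clarity (a real system would use an FFT), a Hann window, and hypothetical function names.

```python
import cmath
import math


def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    return 1000.0 * window_size / sample_rate, sample_rate / window_size


def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(w * cmath.exp(-2j * math.pi * k * i / n)
                  for i, w in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectrogram(signal, window_size, hop):
    """Sequence of power spectra for overlapping frames of the signal."""
    starts = range(0, len(signal) - window_size + 1, hop)
    return [power_spectrum(signal[s:s + window_size]) for s in starts]


short = stft_resolution(256, 22050)    # about 11.6 ms per frame, 86.1 Hz per bin
long = stft_resolution(4096, 22050)    # about 185.8 ms per frame, 5.4 Hz per bin
```

The short window localizes events in time but blurs frequency; the long window does the opposite.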
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
 – Simplest approach: LPF
 – Spectral subtraction using noise profile
 – Median filtering in time-frequency plane
• Image
 – Challenge: preserve edge discontinuities
 – Gaussian filtering 2D
 – Anisotropic diffusion – preserves edges
 – Median filtering
 – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
 – L/M: interpolation (L) followed by decimation (M)
 – Band-limited interpolation at arbitrary points
 – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
 – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
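For sensor data, the linear case mentioned in the last bullet can be sketched as follows. The function name is hypothetical, and the sketch assumes strictly increasing input timestamps and ascending output times.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times.

    Assumes `times` is strictly increasing and `new_times` is ascending.
    Points outside the measured range are held at the nearest end value.
    """
    out = []
    j = 0
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        while times[j + 1] < t:
            j += 1  # advance to the segment that contains t
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append((1.0 - w) * values[j] + w * values[j + 1])
    return out


# Irregularly timed readings resampled onto chosen time points.
resampled = resample_linear([0, 1, 3], [0, 10, 30], [0, 0.5, 2, 3])
```

This is handy when sensors report at irregular or mismatched rates and a common time grid is needed.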
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
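A minimal sketch of range calibration followed by scaling, using the knob-to-frequency and slider-to-volume examples. All names, the frequency range and the exponent are illustrative assumptions; the knob uses an exponential (musical) mapping, while the slider uses a power law.

```python
def calibrate_range(readings):
    """Find the minimum and maximum raw reading seen during a calibration pass."""
    return min(readings), max(readings)


def normalize(reading, lo, hi):
    """Map a raw reading to [0, 1] using the calibrated range, with clamping."""
    if hi == lo:
        return 0.0
    return min(max((reading - lo) / (hi - lo), 0.0), 1.0)


def knob_to_frequency(reading, lo, hi, f_min=55.0, f_max=880.0):
    """Exponential mapping: equal knob steps give equal pitch intervals."""
    return f_min * (f_max / f_min) ** normalize(reading, lo, hi)


def slider_to_gain(reading, lo, hi, exponent=2.0):
    """Power-law mapping of a slider position to a gain in [0, 1]."""
    return normalize(reading, lo, hi) ** exponent


lo, hi = calibrate_range([130, 512, 890])   # hypothetical raw ADC readings
middle_pitch = knob_to_frequency(510, lo, hi)
middle_gain = slider_to_gain(510, lo, hi)
```

Keeping calibration (what range does the sensor produce) separate from scaling (what the parameter needs) lets either be changed independently.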
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
 – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
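The remark about efficiency for power-of-2 sizes refers to computing correlation through the FFT. The sketch below computes an autocorrelation that way with a small recursive radix-2 FFT written out for self-containment; the function names are illustrative, and zero-padding is used so the result is a linear (not circular) autocorrelation.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of 2 (inverse is unscaled)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def autocorrelation(signal):
    """Autocorrelation for lags 0..len(signal)-1 via the power spectrum."""
    n = len(signal)
    size = 1
    while size < 2 * n:      # pad to a power of 2 that avoids wrap-around
        size *= 2
    padded = list(signal) + [0.0] * (size - n)
    power = [abs(c) ** 2 for c in fft(padded)]
    acf = fft(power, inverse=True)
    return [acf[lag].real / size for lag in range(n)]


# A sinusoid with a period of 8 samples: the strongest lag near 8 recovers the period.
tone = [math.sin(2 * math.pi * i / 8) for i in range(64)]
acf = autocorrelation(tone)
period = max(range(4, 13), key=lambda lag: acf[lag])
```

Cross-correlation follows the same pattern with two different spectra multiplied (one conjugated) instead of a single power spectrum.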
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
 – K-means
 – Compression-based
 – Histogram-based
 – Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
 – Target representation and location (bottom up)
 – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
Filter edges relative to center frequency CF: CF-130 and CF+130 (linear), CF/1.0718 and CF×1.0718 (log)
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
 – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
 – Generative approaches
  • Naïve Bayes
  • Gaussian Mixture Models
 – Discriminative approaches
  • Support Vector Machines
  • Decision trees
 – Non-parametric
  • K-nearest Neighbors
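Of the algorithms listed, k-nearest neighbors is the shortest to sketch: it stores the training set and votes among the closest examples. The feature values and the "kick"/"snare" labels below are invented for illustration, as is the function name.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors with ground-truth labels (the training set).
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["kick", "kick", "kick", "snare", "snare", "snare"]

prediction = knn_predict(features, labels, (0.5, 0.5))
```

Because there is no training step beyond storing examples, new labelled examples can be added at any time, at the cost of slower prediction for large training sets.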
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
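The first three measures can all be derived from the confusion matrix. The following sketch uses made-up labels and hypothetical function names; rows are true classes and columns are predicted classes.

```python
def confusion_matrix(truth, predicted, labels):
    """Count matrix m[true][predicted] for the given label order."""
    index = {label: i for i, label in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, predicted):
        m[index[t]][index[p]] += 1
    return m


def accuracy(m):
    """Fraction of examples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total


def f_measure(m, i):
    """F-measure (harmonic mean of precision and recall) for class index i."""
    tp = m[i][i]
    fp = sum(m[r][i] for r in range(len(m))) - tp
    fn = sum(m[i]) - tp
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


truth = ["a", "a", "a", "b", "b"]
predicted = ["a", "a", "b", "b", "b"]
m = confusion_matrix(truth, predicted, ["a", "b"])
```

In cross-validation the same measures are computed on each held-out fold and then averaged.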
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
 – Hierarchical
 – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
 – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
 – DBScan
• Graph-based
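K-means, the first algorithm listed, alternates between assigning points to the nearest centroid and moving each centroid to the mean of its members. This is a bare-bones sketch with hypothetical names; the caller supplies the initial centroids (and thereby K), and the example points are invented.

```python
import math


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]


def kmeans(points, initial_centroids, max_iterations=50):
    """Return (centroids, labels) after iterating until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        labels = assign(points, centroids)
        updated = []
        for j, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                updated.append(tuple(sum(axis) / len(members)
                                     for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)


points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
centroids, labels = kmeans(points, [(0, 0), (10, 10)])
```

Results depend on the initial centroids, which is why variants with smarter or repeated initialization are common.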
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
 – Davies-Bouldin Index
 – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
 – F-measure
 – Rand measure
 – Jaccard Index
 – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
 – Linear regression
 – Ordinary least squares
• Non-parametric
 – Kernel Regression
 – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
 – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
 – Parametric methods
 – F-test
 – t-test for individual parameters
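Goodness of fit can be made concrete with a one-variable ordinary least squares fit and its R squared. The function names and the data points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0, 1, 2, 3]
ys = [1, 3, 4, 8]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```

For this data the slope is 2.2, the intercept 0.7, and R squared is 121/130, roughly 0.93, which equals the squared correlation between `xs` and `ys`.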
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies: Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done with some delay at the same time as the sensor data
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
 – Observable
 – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
 – Estimate what transitions occurred and what states generated the sensor reading, and update models
 – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
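[Figure: hidden Markov model. A hidden state (single discrete variable) evolves over time according to transition probabilities P(state at t | state at t-1); each observation is generated from the hidden state according to emission probabilities p(observation at t | state at t).]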
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
 – Prior knowledge of state
 – Prediction step (based on some model)
 – Update step (compare prediction to measurements)
 – Readjust model
 – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(c)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
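The predict/update cycle and the reliability trade-off can be seen in a scalar sketch. Note the simplification: instead of the constant dx/dt model mentioned above, this assumes a random-walk state (the value is expected to stay put, with some drift), so the mean and covariance reduce to two numbers. All names and noise values are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state; returns the state estimates.

    x is the estimated state (mean), p its variance. A large measurement_var
    means sensors are trusted less; a large process_var means the model is
    trusted less.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed unchanged, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Noisy readings alternating around 5.0 are smoothed toward 5.0.
smoothed = kalman_1d([4.0, 6.0] * 25, process_var=1e-4, measurement_var=1.0, x0=5.0)
```

A constant-velocity version would carry a two-element state (position and velocity) and a 2x2 covariance matrix, with the same two steps.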
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
 – comprehensive understanding of current (work) practices
 – greater ability to predict the impact of a new or re-designed technology
 – possibly greater buy-in for the system
• down side
 – principal cost is time, both the ethnographer's and the users'
 – could perpetuate negative aspects of current practices
 – can produce vast (unmanageable) amount of data
 – output is description of practices rather than specific designs
  • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
 – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
 – users are excellent at reacting to suggested system designs
  • designs must be concrete and visible
 – users bring in important "folk" knowledge of work context
  • knowledge may be otherwise inaccessible to design team
 – greater buy-in for the system often results
• down side
 – hard to get a good pool of end users
  • expensive, reluctant
 – users are not expert designers
  • don't expect them to come up with design ideas from scratch
 – the user is not always right
  • don't expect them to know what they want
 – conservative bias to perpetuate current practices
  • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
 – the voice editor by IBM (1984)
The Wizard
What the user sees
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
 – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
 – interprets subject's input according to an algorithm
 – has computer/screen behave in appropriate manner
• good for
 – adding simulated and complex vertical functionality
 – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
 – Assume normality for relevant distributions; work in parameter space (means and variances)
 – Student t-test and ANOVA
• Non-parametric (no normality assumption)
 – Kruskal-Wallis
 – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
 – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
 – Control group and treatment group (n = participants in each group)
 – Same group before and after treatment
 – Assumptions: sample size, variance
  • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
 1. compute df
 2. choose desired significance p (aka α)
 3. calculate value of the t statistic
 4. compare it to the critical value of t given p, df: t(p, df)
 5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with critical value t(p, df), showing either two regions (two-tailed) or one region (one-tailed) for rejecting the null hypothesis; sample means labelled X1, X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note df computation
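The three computations above can be sketched for the independent two-sample case with a pooled (combined) variance. This is one standard form of the test, assumed here because the slide's own formulas did not survive extraction; the function name and the sample data are illustrative.

```python
import math


def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = na + nb - 2                                   # degrees of freedom
    pooled_var = (ss_a + ss_b) / df                    # combined variance
    sed = math.sqrt(pooled_var * (1 / na + 1 / nb))    # standard error of difference
    return (mean_a - mean_b) / sed, df


control = [1, 2, 3, 4, 5]
treatment = [2, 3, 4, 5, 6]
t, df = pooled_t(control, treatment)
```

For these samples t is -1.0 with df = 8. The two-tailed critical value at α = 0.05 for df = 8 is 2.306, so |t| is too small to reject the null hypothesis.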
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
 – in back of any statistics textbook
 – on web, e.g.
   httpwwwmedcalcbemanualmpage13-04bhtml
   httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
 – Matlab or Excel functions
 – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
 – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
 – two antennae
 – change in electrostatic field translated to pitch control of heterodyne oscillators
 – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
 – downward and side-to-side
 – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
 – like a musical instrument
• mapping partially learned
• speech is
 – intelligible
 – expressive
 – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of gesture-to-speech mappings, ordered as Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Axis: approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Systems labelled: Von Kempelen (1790), Bell & Bell (1880), Dudley et al (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990).]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
 – open configuration of hand represents open vocal tract
 – XY position determines vowel sounds (like tongue)
• Consonants
 – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
bull 26+ dimensionsbull constrained subspace
bull 10 dimensions
Input Output
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
 – Mixture of expert model
  • Vowel/Consonant decision network
  • Vowel Expert network
  • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Processing blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function. Output: synthesizer, speech output.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
 – Input: 10 flex angles
  • 4x2 finger joints
  • Thumb abduction and thumb rotation
 – Output
  • Probability of vowel
 – Training
  • 2600 consonants, 700 vowels
  • 0% error
 – Testing
  • 1380 consonants, 234 vowels
  • 0% error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
 – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
 – Fixed std deviation (0.15)
 – Input: x, y values
 – Output: 8 formant parameters
• Training
 – 100 examples of each vowel with random noise added
 – 0.1% error
• Testing
 – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
 – Gaussian activation
  • Weights are centre
 – Normalized over all units in group
  • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
 – interpolation according to width parameter
 – Plateaus around nearest centre
  • Closest RBF dominates
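The behaviour described above (interpolation between centres, plateaus where the closest unit dominates) can be reproduced with a small normalized RBF layer. This is a generic sketch, not the Glove-TalkII code: the centres, output weights and function names are hypothetical, and only the width of 0.15 echoes the vowel network slide.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations of all hidden units, normalized to sum to 1."""
    sq_dists = [sum((a - c) ** 2 for a, c in zip(x, centre)) for centre in centres]
    nearest = min(sq_dists)
    # Subtracting the smallest distance avoids underflow far from every centre
    # and does not change the normalized result.
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in sq_dists]
    total = sum(acts)
    return [a / total for a in acts]


def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations; weights[j] is unit j's output vector."""
    hidden = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(hidden[j] * weights[j][o] for j in range(len(centres)))
            for o in range(n_out)]


centres = [(0.0, 0.0), (1.0, 0.0)]          # hypothetical unit centres
weights = [[0.0], [2.0]]                    # hypothetical output weights
halfway = rbf_output((0.5, 0.0), centres, 0.15, weights)   # interpolates between units
far_away = rbf_output((10.0, 0.0), centres, 0.15, weights) # plateau: nearest unit wins
```

Because of the normalization, the output never collapses to zero away from the centres, which is useful when a hand position drifts outside the trained region.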
Tuesday 22 October 13
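Note: the normalized RBF behaviour described on the two slides above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Glove-TalkII implementation: the function name `normalized_rbf`, the three example centres and the width value 0.15 are all hypothetical choices made for the example.

```python
import math

def normalized_rbf(x, centres, sigma):
    """Normalized Gaussian RBF activations of input x, one per centre.

    Each unit responds with a Gaussian of the distance to its centre;
    dividing by the sum over the group makes the activations sum to 1.
    """
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    nearest = min(d2)  # subtracting the smallest distance avoids underflow
    raw = [math.exp(-(d - nearest) / (2.0 * sigma ** 2)) for d in d2]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical 2-D centres (for example, three vowel targets in the x-y plane).
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
acts = normalized_rbf((0.1, 0.0), centres, sigma=0.15)
midway = normalized_rbf((0.5, 0.0), centres, sigma=0.15)
```

With a small width the activation plateaus near 1 around the nearest centre (`acts`), while an input halfway between two centres (`midway`) is shared equally between them.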
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 005 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Tuesday 22 October 13
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca   George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more …
25
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting a change in resistance to a voltage (a typical sensor has variable resistance)
Tuesday 22 October 13
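Note: one common way to turn a variable resistance into a voltage is a voltage divider followed by an analog-to-digital converter. The sketch below assumes that arrangement; the 5 V supply, the 10 kΩ fixed resistor, the 10-bit converter and both function names are illustrative assumptions, not values from the slides.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor in series with the sensor.

    As the sensor resistance drops (for example, when a force-sensing
    resistor is pressed), the measured voltage rises.
    """
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage into an integer code, as an ideal ADC would."""
    levels = 2 ** bits
    code = int(voltage / v_ref * (levels - 1) + 0.5)  # round to nearest code
    return max(0, min(levels - 1, code))              # clamp to valid range

# Sensor resistance equal to the fixed resistor gives half the supply voltage.
v_half = divider_voltage(10_000.0, 10_000.0)
code_half = adc_code(v_half)
```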
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10 are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors, i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example, microphone arrays)]
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
[Figure: image and volume data evolving over time]
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
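Note: as a concrete instance of the Filtering slide, a one-pole low-pass filter is probably the simplest digital filter that operates sample by sample. The sketch below is an illustration only; the function name and the smoothing coefficient `alpha` are assumptions chosen for the example.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence with y[n] = y[n-1] + alpha * (x[n] - y[n-1]).

    alpha in (0, 1]: small values attenuate high frequencies strongly,
    alpha = 1 passes the input through unchanged.
    """
    y = 0.0
    out = []
    for x in samples:
        y = y + alpha * (x - y)
        out.append(y)
    return out

# A constant (zero-frequency) input passes through once the filter settles.
steady = one_pole_lowpass([1.0] * 50, alpha=0.2)

# A rapidly alternating input (highest representable frequency) is attenuated.
jitter = one_pole_lowpass([1.0, -1.0] * 25, alpha=0.2)
```

The same recurrence is often used to smooth noisy sensor readings before mapping them to sound parameters.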
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
54
Signals and Features
Tuesday 22 October 13
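Note: the variants listed on the Peak Detection slide can be combined in one small routine. The sketch below assumes a fixed threshold, a strict local-maximum test, a minimum spacing between peaks, and parabolic interpolation of the peak position; the function name and the example signal are hypothetical.

```python
def find_peaks(x, threshold, min_spacing):
    """Return (position, amplitude) pairs for peaks in the sequence x.

    A sample is a candidate if it is at least `threshold` and is a local
    maximum. Position and amplitude are refined by fitting a parabola
    through the candidate and its two neighbours. Candidates closer than
    `min_spacing` samples to the previous peak are merged, keeping the
    larger of the two.
    """
    peaks = []
    for n in range(1, len(x) - 1):
        a, b, c = x[n - 1], x[n], x[n + 1]
        if b < threshold or not (a < b >= c):
            continue
        offset = 0.5 * (a - c) / (a - 2 * b + c)   # vertex of the parabola
        pos = n + offset
        amp = b - 0.25 * (a - c) * offset
        if peaks and pos - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (pos, amp)
            continue
        peaks.append((pos, amp))
    return peaks

signal = [0, 1, 3, 1, 0, 0, 2, 5, 2, 0]
well_spaced = find_peaks(signal, threshold=2.5, min_spacing=3)
merged = find_peaks(signal, threshold=2.5, min_spacing=10)
```

An adaptive threshold (for example, a running median of the signal) could replace the fixed one without changing the rest of the routine.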
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
Tuesday 22 October 13
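Note: the time-frequency tradeoff on the Spectrogram slide follows directly from the window size. The sketch below computes both resolutions and a small spectrogram. To stay dependency-free it uses a naive DFT rather than the FFT, so it is only suitable for short windows; all function names, the Hann window choice and the test tone are assumptions made for the example.

```python
import cmath
import math

def resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    return 1000.0 * window_size / sample_rate, sample_rate / window_size

def power_spectrum(frame):
    """Power spectrum of one Hann-windowed frame via a naive O(N^2) DFT."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(windowed[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectrogram(signal, window_size, hop):
    """Sequence of power spectra taken every `hop` samples."""
    starts = range(0, len(signal) - window_size + 1, hop)
    return [power_spectrum(signal[s:s + window_size]) for s in starts]

# The two window sizes from the slide, at 22050 Hz.
short_ms, short_hz = resolution(256, 22050)    # about 11.6 ms, 86.1 Hz
long_ms, long_hz = resolution(4096, 22050)     # about 185.8 ms, 5.4 Hz

# A tone with exactly 4 cycles per 32-sample window lands in bin 4.
tone = [math.sin(2 * math.pi * 4 * i / 32) for i in range(128)]
spec = spectrogram(tone, window_size=32, hop=16)
```

A longer window separates nearby frequencies better but smears events in time, which is the tradeoff the two spectrograms illustrate.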
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
60
Signals and Features
Tuesday 22 October 13
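Note: for sensor data, the linear case mentioned on the slide above is often enough. The sketch below resamples irregularly timed readings onto a uniform grid by linear interpolation. The function name and the example data are assumptions, and the routine expects at least two samples with strictly increasing timestamps.

```python
def resample_linear(times, values, rate):
    """Resample (times, values) onto a uniform grid at `rate` Hz.

    The grid starts at times[0] and stops at the last timestamp; each
    output value is interpolated between the two surrounding readings.
    """
    step = 1.0 / rate
    out = []
    j = 0  # index of the reading at or before the current grid time
    k = 0  # index of the current grid point
    while True:
        t = times[0] + k * step
        if t > times[-1] + 1e-12:
            break
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + frac * (values[j + 1] - values[j]))
        k += 1
    return out

# Readings at 0 s, 0.5 s and 1 s resampled at 4 Hz.
uniform = resample_linear([0.0, 0.5, 1.0], [0.0, 10.0, 20.0], rate=4)
```

This is also a simple way to bring sensor streams with different rates onto a common time base.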
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
62
Signals and Features
Tuesday 22 October 13
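Note: calibration and scaling usually appear together, as the last two slides suggest. The sketch below tracks the observed range of a sensor, normalizes readings to [0, 1], and then maps them to control parameters. The class and function names, the 20 Hz to 20 kHz range, the exponential frequency mapping and the power-law exponent of 2 are illustrative assumptions.

```python
class RangeCalibrator:
    """Learns the minimum and maximum of a sensor from observed readings."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def observe(self, reading):
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def normalize(self, reading):
        """Map a reading to [0, 1] using the calibrated range."""
        if self.lo is None or self.hi == self.lo:
            return 0.0
        u = (reading - self.lo) / (self.hi - self.lo)
        return max(0.0, min(1.0, u))

def knob_to_frequency(u, f_min=20.0, f_max=20000.0):
    """Exponential mapping: equal knob movements give equal pitch intervals."""
    return f_min * (f_max / f_min) ** u

def slider_to_gain(u, exponent=2.0):
    """Power-law mapping from a normalized slider position to a gain."""
    return u ** exponent

calibrator = RangeCalibrator()
for raw in (100, 900, 450):        # hypothetical raw ADC readings
    calibrator.observe(raw)
position = calibrator.normalize(500)
```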
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
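Note: a correlation-based periodicity estimate can be sketched as follows. For clarity the autocorrelation is computed directly from its definition, which costs O(N x lags); the efficient route mentioned on the slide uses the FFT instead. The function names, the lag search range and the test signal are assumptions chosen for the example.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed sequence x for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    return [sum(centred[i] * centred[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in the range."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sinusoid that repeats every 20 samples.
wave = [math.sin(2 * math.pi * i / 20) for i in range(200)]
period = estimate_period(wave, min_lag=5, max_lag=30)
```

Excluding very small lags (`min_lag`) avoids the trivial maximum at lag 0.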
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
CFCF-130CF 10718
CF+130CF 10718
Pipeline: Mel-filtering -> Log -> DCT -> MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
79
Uncertainty and Time
Tuesday 22 October 13
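Note: of the algorithms listed above, k-nearest neighbours is the easiest to write down in full. The sketch below is a minimal version assuming Euclidean distance and majority voting; the function name, the two-dimensional features and the drum labels are made up for the example.

```python
import math
from collections import Counter

def knn_predict(training_set, query, k=3):
    """Predict the label of `query` by majority vote of its k nearest
    training examples. `training_set` is a list of (features, label)."""
    neighbours = sorted(training_set,
                        key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical (spectral centroid, decay time) features for drum hits.
training_set = [
    ((0.10, 0.90), "kick"), ((0.20, 0.80), "kick"), ((0.15, 0.85), "kick"),
    ((0.80, 0.20), "snare"), ((0.90, 0.10), "snare"), ((0.85, 0.30), "snare"),
]
low_hit = knn_predict(training_set, (0.12, 0.88))
bright_hit = knn_predict(training_set, (0.82, 0.25))
```

There is no training step: the labelled examples themselves are the model, which is what makes the method non-parametric.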
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
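Note: the three measures named on the evaluation slide can be computed from paired lists of true and predicted labels. The sketch below is one straightforward way to do it; the function names and the toy labels are assumptions made for the example.

```python
def accuracy(true, pred):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(true, pred)) / len(true)

def confusion_matrix(true, pred, labels):
    """matrix[i][j] counts items of true class labels[i] predicted as labels[j]."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(true, pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def f_measure(true, pred, positive):
    """Harmonic mean of precision and recall for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(true, pred))
    fp = sum(t != positive and p == positive for t, p in zip(true, pred))
    fn = sum(t == positive and p != positive for t, p in zip(true, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

true_labels = ["snare", "snare", "kick", "kick", "kick"]
pred_labels = ["snare", "kick", "kick", "kick", "snare"]
acc = accuracy(true_labels, pred_labels)
cm = confusion_matrix(true_labels, pred_labels, ["kick", "snare"])
f_snare = f_measure(true_labels, pred_labels, "snare")
```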
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
83
Uncertainty and Time
Tuesday 22 October 13
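Note: K-means, the first algorithm on the list above, alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below is a minimal version; seeding the centroids with the first k points is a simplification chosen to keep the example deterministic, and the function name and data are hypothetical.

```python
import math

def kmeans(points, k, iterations=20):
    """Cluster `points` (tuples) into k groups; returns (centroids, labels)."""
    centroids = [tuple(p) for p in points[:k]]   # simplistic deterministic seed

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        updated = []
        for c, members in enumerate(clusters):
            if members:
                updated.append(tuple(sum(dim) / len(members)
                                     for dim in zip(*members)))
            else:
                updated.append(centroids[c])      # keep an empty cluster's centre
        if updated == centroids:                  # converged
            break
        centroids = updated
    return centroids, [nearest(p) for p in points]

points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centroids, labels = kmeans(points, k=2)
```

In practice the result depends on the initial centroids, so random restarts or a smarter seeding scheme are commonly used.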
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
87
Uncertainty and Time
Tuesday 22 October 13
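Note: ordinary least squares with a single feature, together with the coefficient of determination, fits in a few lines. The sketch below is an illustration only; the function names and the sensor-to-parameter data are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical sensor readings (xs) and the parameter they should predict (ys).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])

# Predicting the mean for every input explains none of the variance.
baseline = r_squared(ys, [4.0] * 4)
```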
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
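[Figure: hidden Markov model. A hidden state (single discrete variable) evolves over time according to transition probabilities P(state at t | state at t-1); each hidden state produces an observation according to emission probabilities p(observation at t | state at t). The example model has states 1 to 4.]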
Tuesday 22 October 13
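Note: the filtering task (posterior over the current hidden state given the evidence so far) can be computed for a discrete hidden Markov model with the forward recursion. The sketch below is a minimal version; the function name is an assumption, and the two-state numbers are the familiar textbook "rain and umbrella" toy example rather than anything from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered state distribution after each observation.

    prior[i]         : P(state_0 = i)
    transition[i][j] : P(state_t = j | state_{t-1} = i)
    emission[j][o]   : P(observation = o | state = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by how well each state explains the observation.
        updated = [predicted[j] * emission[j][obs] for j in range(n)]
        norm = sum(updated)
        belief = [u / norm for u in updated]
        history.append(belief)
    return history

# States: 0 = rain, 1 = no rain. Observations: 0 = umbrella, 1 = no umbrella.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, [0, 0])
```

Each step uses only the previous belief and the newest observation, so the recursion is causal and suitable for real-time use.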
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(c)$
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
102
Tuesday 22 October 13
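Note: the predict and update cycle is easiest to see in one dimension. The sketch below deliberately simplifies the slide's model: it assumes a scalar state that stays constant apart from Gaussian process noise (a random walk), rather than a constant dx/dt model with a covariance matrix. The function name and all variances are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each step.

    x is the state mean and p its variance. The gain k expresses the
    trade-off between trusting the model (small k) and trusting the
    observation (large k).
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# A sensor that keeps reporting 5.0; the estimate moves from x0 towards it.
estimates = kalman_1d([5.0] * 50, process_var=1e-4, measurement_var=0.5)
```

Raising `measurement_var` makes the filter trust the sensor less and respond more slowly; raising `process_var` has the opposite effect.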
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions, work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution curves marked with the critical value t(p, df); labels "regions for rejecting the null hypothesis" and "region for rejecting the null hypothesis"; axis marks X1, X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
Tuesday 22 October 13
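Note: the formulas on the "calculating t" slide did not survive extraction. The sketch below assumes the standard independent two-sample Student t-test with pooled (combined) variance, which matches the steps listed; the function name and the two small samples are invented for the example.

```python
import math

def students_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance.

    Returns (t, df) where df = n_a + n_b - 2.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in sample_a)
    ss_b = sum((v - mean_b) ** 2 for v in sample_b)
    df = n_a + n_b - 2
    pooled_var = (ss_a + ss_b) / df                       # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b)) # std. error of difference
    return (mean_a - mean_b) / sed, df

# Hypothetical task-completion scores for two interface designs.
treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]
t_value, df = students_t(treatment, control)
```

Here t is 4.0 with 8 degrees of freedom, which exceeds the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would be rejected at that level.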
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
Tuesday 22 October 13
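Note: for the one-way case, "dividing two variances" means dividing the between-group mean square by the within-group mean square. The sketch below computes that F ratio; the function name and the three small groups are invented for the example, and looking up the significance of F is left to a table or a statistics package.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation left over inside the groups.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical ratings for three interface variants.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_anova(groups)

# Rescaling and shifting every rating leaves F unchanged.
rescaled = [[10 * v + 3 for v in g] for g in groups]
f_rescaled, _, _ = one_way_anova(rescaled)
```

The second call illustrates the first bullet of the slide: the F ratio does not depend on the scale or offset of the measurements.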
ICASSP 2013 tutorial
Integration and
124
I&I: Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: mapping types placed along an axis of approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions
  • constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
• Vowel/Consonant decision network
• Vowel Expert network
• Consonant Expert network
Tuesday 22 October 13
Glove-TalkII System
[Block diagram]
Inputs:
• Right Hand Data
  – x, y, z, roll, pitch, yaw (60 Hz)
  – 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz)
• Contact Switches
• Foot Pedal
Processing blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function
Output: Synthesizer, Speech Output
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
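The behaviour of normalized RBF units can be sketched as follows. This is an illustrative stand-in, not the Glove-TalkII implementation: the function names, the two example centres and the output weights are all hypothetical, and the only thing borrowed from the slides is the idea of Gaussian units with a shared width whose activations are normalized over the group.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian RBF activations, normalized so they sum to 1 over the group."""
    dist_sq = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    nearest = min(dist_sq)
    # Subtracting the smallest distance keeps at least one exponent at 0,
    # so the normalization never divides by zero far away from all centres.
    raw = [math.exp(-(d - nearest) / (2 * width ** 2)) for d in dist_sq]
    total = sum(raw)
    return [r / total for r in raw]

def rbf_output(x, centres, width, output_weights):
    """Weighted sum of normalized activations; one weight vector per hidden unit."""
    acts = normalized_rbf(x, centres, width)
    n_out = len(output_weights[0])
    return [sum(a * w[k] for a, w in zip(acts, output_weights)) for k in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]
print(normalized_rbf((0.5, 0.0), centres, 0.15))   # halfway: both units share equally
print(normalized_rbf((0.1, 0.0), centres, 0.15))   # near first centre: it dominates
print(rbf_output((0.5, 0.0), centres, 0.15, [[1.0], [3.0]]))
```

With a small width the activations switch sharply between centres (the plateau effect); a larger width gives smoother interpolation between the outputs attached to neighbouring centres.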
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
Glove-TalkII System: synthesizer output
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Transduction and Digitizing
26
Sensors and Actuators
Electrical
  Voltage
  Resistance
  Impedance
Optical
  Color
  Intensity
Magnetic
  Induced Current
  Field Direction
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
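One common way to turn a resistance change into a voltage is a two-resistor voltage divider followed by an analog-to-digital converter. The sketch below assumes that circuit; the 10 kΩ fixed resistor, 5 V supply and 10-bit converter are illustrative values (typical of hobby microcontroller boards), not figures from the slides, and both function names are hypothetical.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_in=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage in [0, v_ref] to an integer ADC code."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)   # out-of-range inputs saturate
    return round(clamped / v_ref * levels)

# As the sensor resistance falls (e.g. more force on an FSR), the reading rises
for r in (100_000.0, 10_000.0, 1_000.0):
    v = divider_voltage(r)
    print(f"R = {r:8.0f} ohm -> {v:.3f} V -> code {adc_code(v)}")
```

Note that the divider response is not linear in resistance, which is one reason calibration and scaling are usually needed downstream.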
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to take their spatial topology into account as well
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
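A few of the variants listed above (fixed threshold, local maximum, spacing constraint) can be combined in a short sketch. The function name `find_peaks` and the test signal are hypothetical, and the "strongest peak wins" rule for spacing conflicts is one possible design choice among many; adaptive thresholds and sub-sample interpolation are left out.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    Candidates closer than min_spacing samples to an already accepted,
    stronger peak are discarded.
    """
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold and signal[i - 1] < signal[i] >= signal[i + 1]
    ]
    accepted = []
    # Visit the strongest candidates first so they win spacing conflicts
    for i in sorted(candidates, key=lambda idx: signal[idx], reverse=True):
        if all(abs(i - j) >= min_spacing for j in accepted):
            accepted.append(i)
    return [(i, signal[i]) for i in sorted(accepted)]

hits = [0, 1, 0, 3, 0, 2.5, 0, 0.2, 0]
print(find_peaks(hits, threshold=0.5))                 # every local maximum above 0.5
print(find_peaks(hits, threshold=0.5, min_spacing=3))  # only the strongest survives
```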
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
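The tradeoff between the two window sizes above can be made concrete with a little arithmetic: the spacing between frequency bins is sample_rate / window_size, while the duration of each analysis window is window_size / sample_rate. The helper name `stft_resolution` is hypothetical.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    return sample_rate / window_size, window_size / sample_rate

for n in (256, 4096):
    bin_hz, window_s = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {bin_hz:7.2f} Hz per bin, {window_s * 1000:6.1f} ms per window")
```

The product of the two numbers is always 1, so a finer frequency grid can only be bought with a longer (less time-precise) window.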
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
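For sensor data, the linear case mentioned in the last bullet is often enough. The sketch below resamples irregularly timed readings onto a uniform grid; the function names and the example readings are hypothetical, and values outside the sampled time range are simply held at the nearest endpoint.

```python
def interpolate_linear(times, values, t):
    """Estimate the signal value at time t from samples sorted by time."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            frac = (t - times[i - 1]) / (times[i] - times[i - 1])
            return values[i - 1] + frac * (values[i] - values[i - 1])

def resample(times, values, rate_hz):
    """Resample irregularly timed readings onto a uniform grid at rate_hz."""
    n = int((times[-1] - times[0]) * rate_hz) + 1
    grid = [times[0] + k / rate_hz for k in range(n)]
    return grid, [interpolate_linear(times, values, t) for t in grid]

grid, resampled = resample([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], rate_hz=1.0)
print(grid, resampled)
```

Linear interpolation is not band-limited, so it is a quality/complexity compromise of the kind listed among the variants, not a substitute for sinc interpolation of audio.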
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  • Power law functions frequently used
• Frequently used in conjunction with calibration
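Range calibration and power-law scaling are often chained, as the last bullet suggests. A minimal sketch, assuming a calibration pass that records the minimum and maximum raw reading and a mapping to a frequency range; the function names, the exponent of 2 and the 20-2000 Hz range are illustrative assumptions.

```python
def calibrate_range(readings):
    """Observed minimum and maximum of a sensor during a calibration pass."""
    return min(readings), max(readings)

def scale_power(reading, lo, hi, out_min, out_max, exponent=2.0):
    """Map a raw reading to [out_min, out_max] through a power-law curve."""
    norm = (reading - lo) / (hi - lo)
    norm = min(max(norm, 0.0), 1.0)   # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent

lo, hi = calibrate_range([120, 300, 900, 512])
for raw in (120, 510, 900):
    print(raw, "->", scale_power(raw, lo, hi, 20.0, 2000.0), "Hz")
```

An exponent above 1 gives finer control at the low end of the output range; an exponent below 1 does the opposite.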
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
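A correlation-based periodicity detector can be sketched directly from the definition. Note that this uses the direct sum over lags rather than the FFT-based computation the slide alludes to, so it is simple but slower; the function names and the synthetic sine input are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Direct autocorrelation of a mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    centred = [v - mean for v in x]
    return [
        sum(centred[n] * centred[n + lag] for n in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]

def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation inside [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sine with a period of 20 samples
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
print(dominant_period(signal, min_lag=5, max_lag=40))
```

The `min_lag` bound keeps the trivial maximum at lag 0 (and its immediate neighbours) from being reported as the period.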
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
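Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter with centre frequency CF; edge labels CF-130 and CF+130 (linear region), and CF scaled by 1.0718 (log region)]
Pipeline: Mel-filtering, Log, DCT, MFCCs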
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
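Energy compaction is easy to see on a small example. The sketch below implements an unnormalized DCT-II straight from its defining sum (real implementations use a fast algorithm and usually an orthonormal scaling); the function name `dct_ii` and the two test signals are hypothetical.

```python
import math

def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum over n of x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k) for n in range(n_samples))
        for k in range(n_samples)
    ]

flat = dct_ii([1.0, 1.0, 1.0, 1.0])
ramp = dct_ii([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
print([round(c, 3) for c in flat])   # only the first coefficient is non-zero
print([round(c, 3) for c in ramp])   # magnitudes fall off quickly with k
```

For smooth inputs such as these, almost everything is captured by the first few coefficients, which is why discarding the high ones loses little.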
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
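Of the algorithms listed, k-nearest neighbours is the shortest to write down. The sketch below uses squared Euclidean distance and a simple majority vote; the function name `knn_predict`, the two-dimensional features and the "snare"/"kick" labels are made up for illustration.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict the majority label among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs.
    """
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

train = [
    ((0.0, 0.0), "snare"), ((0.0, 1.0), "snare"), ((1.0, 0.0), "snare"),
    ((5.0, 5.0), "kick"), ((5.0, 6.0), "kick"), ((6.0, 5.0), "kick"),
]
print(knn_predict(train, (0.2, 0.3)))
print(knn_predict(train, (5.5, 5.2)))
```

There is no training step: the labelled examples themselves are the model, which is what makes the method non-parametric.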
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
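K-means in its basic form (Lloyd's algorithm) alternates two steps: assign each point to its nearest centre, then move each centre to the mean of its points. The sketch below assumes the initial centres are supplied by the caller and runs a fixed number of iterations; the function name and the toy points are hypothetical.

```python
def kmeans(points, centres, iterations=20):
    """Lloyd's algorithm: alternate nearest-centre assignment and centre update."""
    centres = [tuple(c) for c in centres]
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(
                range(len(centres)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centres[j])),
            )
            clusters[nearest].append(p)
        # Move each centre to the mean of its cluster; keep it if the cluster is empty
        centres = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centres[j]
            for j, c in enumerate(clusters)
        ]
    return centres, clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres, clusters = kmeans(points, centres=[(0, 0), (10, 10)])
print(centres)
print(clusters)
```

The result depends on the starting centres, which is why practical variants use several restarts or a smarter initialization.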
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels)
  – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
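Ordinary least squares and the coefficient of determination can be sketched for the single-feature case. The function names `fit_line` and `r_squared` and the data are hypothetical; a multi-feature fit would need a linear-algebra routine instead of these closed-form sums.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - residual variation / total variation."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs, ys = [0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```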
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each labelled Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same rate as the sensor data arrives
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model: P(current state | previous state)
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
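The filtering task can be sketched for a discrete hidden state with the standard forward recursion: predict with the transition model, weight by the likelihood of the new observation, then normalize. The function name and the two-state numbers below are illustrative (a textbook-style rain/umbrella toy model), not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Filtering in a discrete HMM: P(state_t | observations up to t) for each t.

    prior[i]         : P(state = i) at the time of the first observation
    transition[i][j] : P(state_t = j | state_{t-1} = i)
    emission[i][o]   : P(observation = o | state = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Prediction: push the previous belief through the transition model
            belief = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize
        belief = [belief[j] * emission[j][obs] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history

# States: 0 = rain, 1 = no rain. Observations: 0 = umbrella seen, 1 = no umbrella.
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
for belief in forward_filter([0.5, 0.5], transition, emission, [0, 0]):
    print([round(b, 3) for b in belief])
```

The algorithm is causal: each belief uses only observations up to the current time step, so it can run alongside the sensor stream.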
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
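[Figure: hidden Markov model drawn as a chain of time slices 1-4, with a row of hidden states above a row of observed variables]
MODEL
• Hidden state (single discrete variable)
• Observations
• Transition probabilities: P(state at t | state at t-1)
• Emission probabilities: p(observation at t | state at t)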
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y\, y + b_y,\ \sigma_y)(x)$
• Next state is linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
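The predict/update cycle can be sketched in one dimension. This is a deliberately simplified stand-in: it assumes a random-walk state model (the state stays put apart from Gaussian drift) rather than the constant dx/dt model mentioned above, and the function name, noise variances and measurements are all hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed constant apart from random drift."""
    x, p = x0, p0   # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the model leaves the mean unchanged, uncertainty grows
        p = p + process_var
        # Update step: the gain balances model reliability against observation reliability
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1 - gain) * p
        estimates.append(x)
    return estimates

noisy = [1.2, 0.9, 1.1, 0.8, 1.05, 0.95, 1.0, 1.1]
print([round(e, 3) for e in kalman_1d(noisy, process_var=1e-4, measurement_var=0.1)])
```

A large `measurement_var` makes the filter trust its model and smooth heavily; a large `process_var` makes it follow the measurements closely. The full filter replaces the scalars with a mean vector and covariance matrix.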
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC): ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure panels: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at significance p
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: two-tailed case with two regions for rejecting the null hypothesis, or one-tailed case with a single region; axis marks X1, X2 and the critical value t(p, df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
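The formulas on this slide did not survive extraction. As a minimal sketch, assuming the standard independent two-sample t-test with equal variances (the function name `pooled_t` and the sample data are illustrative, not from the tutorial), the three steps could look like this:

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Return (t, df) for an independent two-sample t-test, equal variances assumed."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2  # degrees of freedom
    # combined (pooled) variance of the two samples
    s2 = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # standard error of the difference between the two means
    sed = math.sqrt(s2 * (1.0 / n1 + 1.0 / n2))
    t = (mean(sample1) - mean(sample2)) / sed
    return t, df


# Hypothetical task-completion times for a control and a treatment group
control = [1.0, 2.0, 3.0]
treatment = [4.0, 5.0, 6.0]
t, df = pooled_t(control, treatment)
print(f"t = {t:.4f}, df = {df}")
```

For a two-tailed test, |t| would then be compared with the critical value t(p, df) from a table.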
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
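A minimal sketch of the "dividing two variances" step for the one-way case, assuming the usual between-group / within-group decomposition (the function name `one_way_anova_f` and the data are illustrative). It also checks the claim that the ratio is unaffected by scaling and bias:

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # variation of the samples around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, dfb, dfw = one_way_anova_f(groups)

# Same data after scaling (x10) and adding a bias (+3): the F ratio is unchanged
rescaled = [[10.0 * x + 3.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
print(f_ratio, f_rescaled)
```

The resulting F would be compared against a table of the F distribution for (df_between, df_within); the sketch does not handle the degenerate case of zero within-group variance.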
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and implementation
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings from Artificial Vocal Tract through Phoneme Generator, Finger Spelling and Syllable Generator to Word Generator. Axis: approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Diagram: system block diagram with the following components]
• Inputs
  – Right Hand Data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz)
  – Contact Switches
  – Foot Pedal
• Processing: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function
• Output: Synthesizer, Speech Output
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: xy values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
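The behaviour described above can be illustrated with a small sketch. This is not the Glove-TalkII code: it assumes Gaussian units with a shared width, activations divided by their sum, and a linear output layer, and the names `normalized_rbf` and `rbf_output`, the centres, the width and the weights are all hypothetical.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian RBF activations for input x, normalized to sum to 1."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # subtracting the smallest distance avoids underflow and leaves the
    # normalized result unchanged
    d2_min = min(d2)
    acts = [math.exp(-(d - d2_min) / (2.0 * width ** 2)) for d in d2]
    total = sum(acts)
    return [a / total for a in acts]


def rbf_output(x, centres, width, weights):
    """Linear combination of normalized hidden activations (one row of weights per centre)."""
    hidden = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h * w[k] for h, w in zip(hidden, weights)) for k in range(n_out)]


centres = [(0.0, 0.0), (1.0, 0.0)]   # two hypothetical vowel targets in XY space
weights = [[10.0], [20.0]]           # one hypothetical output parameter per centre

near_first = normalized_rbf((0.05, 0.0), centres, 0.15)   # closest RBF dominates
halfway = rbf_output((0.5, 0.0), centres, 0.15, weights)  # interpolates between targets
print(near_first, halfway)
```

With a small width the output plateaus around the nearest centre and switches quickly in between; a larger width gives a smoother interpolation.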
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Digitizing
27
Sensors and Actuators
• Converting change in resistance to voltage (typical sensor has variable resistance)
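One common way to do this conversion is a voltage divider followed by an analog-to-digital converter. The sketch below assumes that arrangement; the supply voltage, fixed resistor value and 10-bit converter are illustrative assumptions, not values from the slide.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor / fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)


def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code in [0, 2**bits - 1]."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))  # clamp to the valid code range


# Hypothetical sensor whose resistance drops from 100 kOhm to 1 kOhm when pressed
for r_sensor in (100_000.0, 10_000.0, 1_000.0):
    v = divider_voltage(r_sensor, r_fixed=10_000.0)
    print(r_sensor, round(v, 3), adc_code(v))
```

The choice of the fixed resistor sets which part of the sensor's resistance range maps to the most usable part of the voltage range.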
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
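As an illustration of a digital filter operating sample by sample, here is a minimal sketch of a one-pole low-pass filter (exponential smoothing). The function name and the coefficient value are illustrative choices, not taken from the tutorial.

```python
def one_pole_lowpass(samples, alpha):
    """Apply y[n] = alpha * x[n] + (1 - alpha) * y[n-1], starting from y = 0."""
    y = 0.0
    out = []
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


# A slow (constant) input passes through, a fast alternating input is attenuated
slow = one_pole_lowpass([1.0] * 200, alpha=0.1)
fast = one_pole_lowpass([1.0 if n % 2 == 0 else -1.0 for n in range(200)], alpha=0.1)
print(slow[-1], max(abs(v) for v in fast[-10:]))
```

Smaller values of alpha smooth more strongly at the cost of a slower response, which matters when the filtered sensor value drives sound in real time.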
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
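A minimal sketch combining several of the variants listed above: a fixed threshold, a local-maximum test, a spacing constraint and parabolic interpolation. The function name `find_peaks` and the tie-breaking policy (keep the larger of two peaks that are too close) are assumptions made for illustration.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs for peaks in signal."""
    peaks = []
    for n in range(1, len(signal) - 1):
        a, b, c = signal[n - 1], signal[n], signal[n + 1]
        if b < threshold or not (b > a and b >= c):
            continue  # below threshold or not a local maximum
        # parabolic interpolation through the three points around the maximum
        denom = a - 2.0 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        location = n + offset
        amplitude = b - 0.25 * (a - c) * offset
        if peaks and location - peaks[-1][0] < min_spacing:
            # too close to the previous peak: keep only the larger one
            if amplitude > peaks[-1][1]:
                peaks[-1] = (location, amplitude)
            continue
        peaks.append((location, amplitude))
    return peaks


signal = [0.0, 1.0, 0.0, 0.0, 3.0, 0.0, 0.0]
print(find_peaks(signal, threshold=0.5))
print(find_peaks(signal, threshold=0.5, min_spacing=5))
```

An adaptive threshold could be obtained by replacing the fixed `threshold` with a running statistic of the signal, which is one of the places where the "secret sauce" tends to live.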
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
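The trade-off between the two window sizes shown on the slide can be made concrete with a short calculation. The sketch assumes the usual definitions (window duration = N / fs, FFT bin spacing = fs / N); the function name is illustrative.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, frequency bin spacing in Hz)."""
    time_resolution = window_size / sample_rate
    freq_resolution = sample_rate / window_size
    return time_resolution, freq_resolution


for window_size in (256, 4096):
    t_res, f_res = stft_resolution(window_size, 22050)
    print(f"{window_size} samples: {t_res * 1000:.1f} ms per window, {f_res:.1f} Hz per bin")
```

At 22050 Hz, the 256-sample window spans about 11.6 ms with bins about 86 Hz apart, while the 4096-sample window spans about 186 ms with bins about 5.4 Hz apart: better time resolution costs frequency resolution and vice versa.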
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
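For the sensor-data case, a minimal sketch of linear interpolation that resamples irregularly timed readings onto a regular grid. The function names and the policy of holding the first/last value outside the measured range are assumptions for illustration.

```python
from bisect import bisect_right


def interpolate_at(times, values, t):
    """Linearly interpolate values (sampled at increasing times) at time t."""
    if t <= times[0]:
        return values[0]   # hold the first value before the first sample
    if t >= times[-1]:
        return values[-1]  # hold the last value after the last sample
    i = bisect_right(times, t) - 1  # index of the sample at or just before t
    frac = (t - times[i]) / (times[i + 1] - times[i])
    return values[i] + frac * (values[i + 1] - values[i])


def resample(times, values, rate, start, count):
    """Resample onto a regular grid of `count` points at `rate` Hz from `start`."""
    return [interpolate_at(times, values, start + k / rate) for k in range(count)]


# Hypothetical sensor readings with uneven timestamps (seconds)
times = [0.0, 1.0, 3.0]
values = [0.0, 10.0, 30.0]
print(resample(times, values, rate=2.0, start=0.0, count=4))
```

This is the kind of step needed when streams with different or irregular rates must be aligned before further processing.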
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
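A minimal sketch of calibration followed by scaling, assuming range calibration from recorded readings and a power-law mapping onto the control range. The function names, sensor values and output range are illustrative.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from [lo, hi] to [out_min, out_max] with a power law."""
    norm = (reading - lo) / (hi - lo)
    norm = max(0.0, min(1.0, norm))  # clamp readings outside the calibrated range
    return out_min + (norm ** exponent) * (out_max - out_min)


# Hypothetical raw ADC readings collected while the performer sweeps a slider
lo, hi = calibrate([120, 300, 870, 512])

# Map the slider to a frequency range, with a squared curve for finer low-end control
for raw in (120, 495, 870):
    print(raw, scale(raw, lo, hi, 20.0, 20000.0, exponent=2.0))
```

An exponent of 1 gives a linear mapping; other exponents bend the curve so that the control feels more even to the performer.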
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
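A minimal sketch of the correlation-based approach: compute the autocorrelation directly and pick the lag with the largest value in a search range. This is the slow O(N·L) form rather than an FFT-based one, and the function names and test signal are illustrative.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    avg = sum(x) / n
    xc = [v - avg for v in x]
    return [sum(xc[i] * xc[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the strongest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical onset signal: one pulse every 8 samples
pulses = [1.0 if n % 8 == 0 else 0.0 for n in range(64)]
print(estimate_period(pulses, min_lag=2, max_lag=20))
```

Restricting the lag range is a simple way to encode prior knowledge, for example plausible tempo limits for a beat tracker.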
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filters around centre frequency CF; edges at CF-130 and CF+130 (linear), CF/1.0718 and CF×1.0718 (log)]
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
  – Non-parametric
    • K-nearest Neighbors
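Of the algorithms listed, k-nearest neighbours is the simplest to sketch without any library. The example below assumes Euclidean distance and majority voting; the function name, feature values and class labels are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours."""
    # pair each training vector's distance to the query with its label
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors (e.g. two audio features) with ground truth labels
train_x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (5.0, 5.0), (5.0, 6.0), (6.0, 5.0)]
train_y = ["snare", "snare", "snare", "kick", "kick", "kick"]

print(knn_predict(train_x, train_y, (0.2, 0.3)))
print(knn_predict(train_x, train_y, (5.5, 5.2)))
```

There is no training step beyond storing the labelled examples, which makes this a convenient baseline before trying the parametric models.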
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels)
  – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Diagram: two parallel sensor streams, each passing through Feature Extraction, Dimensionality Reduction, Feature Selection and Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Diagram: HMM with hidden states 1 to 4 and observations]
• Hidden state (single discrete variable)
• Observations
• Model
  – Transition probabilities: P(state at t | state at t-1)
  – Emission probabilities: p(observation at t | state at t)
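A minimal sketch of the filtering task for an HMM with a single discrete hidden variable, using the standard forward recursion (predict with the transition model, weight by the emission model, normalize). The two-state model below is a textbook-style toy example with made-up probabilities, not data from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered state distribution after each observation.

    prior[i]         : P(state i) before the first time step
    transition[i][j] : P(state j at t | state i at t-1)
    emission[j][o]   : P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # prediction step: push the belief through the transition model
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # update step: weight by how well each state explains the observation
        updated = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(updated)
        belief = [u / total for u in updated]
        history.append(belief)
    return history


prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]   # rows: hidden state, columns: observation symbol
beliefs = forward_filter(prior, transition, emission, observations=[0, 0])
print(beliefs)
```

Because it only uses past observations, this recursion is causal and can run alongside incoming sensor data.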
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
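A minimal scalar sketch of the predict/update cycle. To stay short it assumes a random-walk state model (the state is expected to stay where it is) rather than the constant-velocity model mentioned above, so the "covariance matrix" reduces to a single variance; the function name and noise values are illustrative.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Filter a stream of noisy scalar measurements; return the state estimates."""
    x, p = x0, p0  # state estimate (mean) and its variance
    estimates = []
    for z in measurements:
        # prediction step: state unchanged, uncertainty grows by the process noise
        p = p + process_var
        # update step: the gain balances model reliability against observation reliability
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical noisy sensor alternating around a true value of 10
noisy = [9.0, 11.0] * 25
estimates = kalman_1d(noisy, process_var=1e-4, meas_var=1.0)
print(estimates[-1])
```

A small `meas_var` makes the filter follow the measurements closely, while a small `process_var` makes it trust the model and smooth more heavily.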
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC), ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become first-class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Image captions: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of a difference at least as large as the observed one arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: sample means X1 and X2 with the critical value t(p,df) marked; two regions for rejecting the null hypothesis, or one region for rejecting the null hypothesis]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note df computation
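The formulas on this slide did not survive extraction. A minimal sketch of the three steps, assuming the standard pooled-variance (independent samples, equal variance) form of the test, is given here; the function name `pooled_t` and the sample data are illustrative only.

```python
import math
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled (combined) variance.

    Returns (t, df). Assumes both samples come from populations
    with equal variance.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Combined variance: sample variances weighted by their degrees of freedom.
    combined_var = (
        (n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)
    ) / df
    # Standard error of the difference between the two sample means.
    sed = math.sqrt(combined_var * (1.0 / n_a + 1.0 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df


# Made-up task completion times (seconds) for two interface variants.
control = [1, 2, 3, 4, 5]
treatment = [2, 3, 4, 5, 6]
t, df = pooled_t(control, treatment)
print(t, df)
```

The resulting |t| is then compared with the critical value t(p,df), or converted to a p-value with a statistics package.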
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
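The "dividing two variances" step for the one-way case can be sketched as below. This is an illustrative implementation with a hypothetical function name (`one_way_anova`) and made-up ratings; looking up the significance of F for the given degrees of freedom is left to a table or a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of measurements."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation explained by the group means.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Residual variation inside each group.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Made-up ratings for three interface designs.
ratings = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova(ratings))

# Rescaling and shifting every measurement leaves F unchanged.
rescaled = [[10 * v + 7 for v in g] for g in ratings]
print(one_way_anova(rescaled))
```

The second call illustrates the independence of scaling and bias stated on the slide: both the numerator and the denominator are multiplied by the same factor, so the ratio is unaffected.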
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (msec), with axis marks at 10-30, 100, 130, 200 and 500. Categories: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990).]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X/Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions
  – constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0% error
  – Testing
    • 1380 consonants, 234 vowels
    • 0% error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1% error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
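The behaviour of normalized RBF units can be seen in a small sketch. The function name `normalized_rbf`, the three centres and the query points are assumptions made for illustration; only the Gaussian activation, the normalization over the group and the width of 0.15 come from the slides.

```python
import math


def normalized_rbf(x, centres, width):
    """Activations of Gaussian RBF units, normalized to sum to 1 over the group."""
    squared = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtracting the smallest distance before exponentiating does not change
    # the normalized result but avoids dividing by zero far from every centre.
    nearest = min(squared)
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in squared]
    total = sum(acts)
    return [a / total for a in acts]


# Hypothetical centres in the x-y plane of the hand position.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

print(normalized_rbf((0.0, 0.0), centres, width=0.15))  # plateau: unit 0 dominates
print(normalized_rbf((0.5, 0.0), centres, width=0.15))  # halfway: units 0 and 1 share
```

With a narrow width the output stays close to one unit over a wide region around its centre and switches quickly near the boundary, which is the plateau behaviour described above; a wider width gives smoother interpolation.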
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Physical Property Sensors
28
Sensors and Actuators
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more...
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD amp CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure bull Motivation and Overview bull Sensors and Actuators bull Signals and Features bull Uncertainty and time bull Human Computer Interaction bull Integration and implementation bull Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
54
Signals and Features
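A basic peak picker combining three of the variants listed above (fixed threshold, local maximum, spacing constraint) might look like the sketch below. The function name `find_peaks` and its parameters are hypothetical; adaptive thresholds and sub-sample interpolation are left out to keep the example short.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return a list of (index, amplitude) pairs for the detected peaks."""
    peaks = []
    for i in range(1, len(signal) - 1):
        value = signal[i]
        # Fixed threshold: ignore small bumps.
        if value < threshold:
            continue
        # Local maximum: higher than the left neighbour, not lower than the right.
        if not (value > signal[i - 1] and value >= signal[i + 1]):
            continue
        # Spacing constraint: within min_spacing samples keep only the larger peak.
        if peaks and i - peaks[-1][0] < min_spacing:
            if value > peaks[-1][1]:
                peaks[-1] = (i, value)
            continue
        peaks.append((i, value))
    return peaks


# Made-up sensor envelope: one weak bump and two close strong hits.
envelope = [0, 1, 0, 0.5, 0, 3, 0, 2.9, 0, 0]
print(find_peaks(envelope, threshold=0.8, min_spacing=3))
```

For this input the bump of 0.5 falls below the threshold and the hit of 2.9 is suppressed because it follows the larger hit of 3 too closely.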
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
60
Signals and Features
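For sensor data, linear interpolation is often enough. The sketch below resamples irregularly timestamped readings onto new time points; the function name `resample_linear` and the example timestamps are assumptions for illustration, and values outside the recorded range are simply held at the nearest endpoint.

```python
from bisect import bisect_right


def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times.

    `times` must be sorted in increasing order.
    """
    resampled = []
    for t in new_times:
        if t <= times[0]:
            resampled.append(values[0])
            continue
        if t >= times[-1]:
            resampled.append(values[-1])
            continue
        # Find the pair of original samples that surrounds t.
        right = bisect_right(times, t)
        t0, t1 = times[right - 1], times[right]
        weight = (t - t0) / (t1 - t0)
        resampled.append(values[right - 1] + weight * (values[right] - values[right - 1]))
    return resampled


# Sensor events at 0 s, 1 s and 2 s, resampled at 0.5 s, 1.5 s and 2 s.
print(resample_linear([0, 1, 2], [0, 10, 20], [0.5, 1.5, 2.0]))
```

This is also one simple way to bring sensor streams with different rates onto a common clock before combining them.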
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
62
Signals and Features
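A power-law mapping from a raw reading to a control parameter can be sketched as follows. The function name `scale_power`, the 0 to 1024 input range and the output ranges are illustrative assumptions; in practice the input range would come from a calibration pass over the sensor.

```python
def scale_power(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a sensor reading to a control range using a power-law curve.

    exponent = 1 gives a linear mapping; exponent > 1 gives finer control
    at the low end, exponent < 1 at the high end.
    """
    normalized = (reading - in_min) / (in_max - in_min)
    # Clamp so that readings outside the calibrated range stay in bounds.
    normalized = min(1.0, max(0.0, normalized))
    return out_min + (out_max - out_min) * normalized ** exponent


# Hypothetical slider with readings from 0 to 1024 mapped to a volume of 0-100.
print(scale_power(512, 0, 1024, 0.0, 100.0))                # linear: 50.0
print(scale_power(512, 0, 1024, 0.0, 100.0, exponent=2.0))  # power law: 25.0
```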
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
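The formula on this slide was an image that did not survive extraction. As a sketch of correlation-based periodicity detection, the code below computes the autocorrelation directly in the time domain and picks the lag with the highest value; the function names and the impulse-train test signal are assumptions. The FFT-based computation mentioned above gives the same values more efficiently for long signals.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean_x = sum(x) / n
    centred = [v - mean_x for v in x]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag (in samples) with the strongest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Made-up onset signal: one impulse every 8 samples.
onsets = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
print(estimate_period(onsets, min_lag=2, max_lag=20))
```

Restricting the search to a plausible lag range (`min_lag`, `max_lag`) avoids the trivial maximum at lag 0 and reduces confusion with multiples of the true period.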
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
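Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filters with centre frequency CF; edges at CF-130 and CF+130 for the linearly-spaced filters, CF/1.0718 and CF×1.0718 for the log-spaced filters]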
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
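Energy compaction is easy to see with a direct implementation of the (unnormalized) DCT-II. This is an illustrative sketch with a hypothetical function name `dct2`; real systems use a fast transform from a signal processing library rather than this O(N²) loop.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


# A constant signal is captured entirely by the first coefficient.
print(dct2([1.0, 1.0, 1.0, 1.0]))

# A smooth ramp still concentrates most of its energy in the low coefficients.
print(dct2([1.0, 2.0, 3.0, 4.0]))
```

Keeping only the first few coefficients of the log mel spectrum is the same idea applied in MFCC extraction.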
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
79
Uncertainty and Time
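Of the algorithms listed, k-nearest neighbours is the simplest to sketch. The function name `knn_predict`, the two-dimensional features and the "kick"/"snare" labels are made up for illustration; they are not results from the tutorial.

```python
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbours.

    training_set is a list of (feature_vector, label) pairs.
    """
    def squared_distance(item):
        features, _ = item
        return sum((a - b) ** 2 for a, b in zip(features, query))

    neighbours = sorted(training_set, key=squared_distance)[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features (e.g. spectral centroid, decay time) for two drum sounds.
training = [
    ((0.0, 0.0), "kick"), ((0.0, 1.0), "kick"), ((1.0, 0.0), "kick"),
    ((5.0, 5.0), "snare"), ((5.0, 6.0), "snare"), ((6.0, 5.0), "snare"),
]
print(knn_predict(training, (0.2, 0.1)))
print(knn_predict(training, (5.5, 5.2)))
```

There is no training step: the labelled examples are the model, which is what makes the method non-parametric.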
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no labeled training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels)
  – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
Feature Extraction
Feature Extraction
Dimensionality Reduction
Dimensionality Reduction
Feature Selection
Feature Selection
Classification
Classification
Tuesday 22 October 13
Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
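[Figure: hidden Markov model. The hidden state (a single discrete variable, states numbered 1 to 4) evolves according to transition probabilities P(state at t | state at t-1); each observation is generated according to emission probabilities p(observation at t | state at t). Panel labels: Hidden, Observed, Model, Observations.]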
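The filtering task for a discrete hidden Markov model can be sketched as a predict-then-update recursion. This is an illustrative sketch only: the function name `hmm_filter` and the table layout are assumptions, and the numbers in the usage example are the textbook "umbrella" toy problem rather than anything from the tutorial.

```python
def hmm_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for every time step t.

    prior[i]         : P(state = i) at the time of the first observation
    transition[i][j] : P(state_t = j | state_{t-1} = i)
    emission[i][o]   : P(observation = o | state = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Prediction step: push the belief through the transition model.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Update step: weight by the likelihood of the observation, normalize.
        belief = [belief[j] * emission[j][obs] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
        history.append(belief)
    return history


# Usage: two hidden states (0 = rain, 1 = no rain), observation 0 = umbrella seen.
beliefs = hmm_filter(
    prior=[0.5, 0.5],
    transition=[[0.7, 0.3], [0.3, 0.7]],
    emission=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0, 0],
)
```

Each step only uses past evidence, so the recursion is causal and can run as data arrives.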
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
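The prediction and update steps can be sketched for the simplest possible case: a single scalar state. This sketch assumes a random-walk motion model (the state is expected to stay where it is, which is simpler than the constant dx/dt model mentioned on the slide) and scalar Gaussian noise; the function name and parameters are hypothetical.

```python
def kalman_1d(measurements, process_var, meas_var, init_mean=0.0, init_var=1.0):
    """Scalar Kalman filter; returns a list of (mean, variance) estimates."""
    mean, var = init_mean, init_var
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed unchanged (random walk),
        # but its uncertainty grows by the process noise.
        var = var + process_var
        # Update step: the gain balances model reliability against
        # observation reliability.
        gain = var / (var + meas_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates


# Usage: a constant true value of 1.0 observed repeatedly.
track = kalman_1d([1.0] * 50, process_var=0.0, meas_var=1.0)
```

A large `meas_var` makes the filter trust its model and respond slowly; a large `process_var` makes it follow the measurements closely. This is the trade-off named in the last bullet.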
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor, by IBM (1984)
[Figure panels: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance, p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution curves marking the critical value t(p,df), with two regions for rejecting the null hypothesis (two-tailed) or one region for rejecting the null hypothesis (one-tailed); axis marks X1 and X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
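The three steps above can be sketched for the independent two-sample case. The slide's own formulas did not survive extraction, so this sketch assumes the standard pooled-variance (equal variance) form with df = n_a + n_b - 2; the function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic assuming equal variances.

    Returns (t, df).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    # Step 1: combined (pooled) variance of the two samples.
    pooled_var = (ss_a + ss_b) / df
    # Step 2: standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    # Step 3: the t statistic.
    return (mean_a - mean_b) / sed, df


# Usage: compare |t| against the critical value t(p, df) from a table.
t_value, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
```

For a two-tailed test the absolute value of `t_value` is compared with the critical value for the chosen significance and `df`.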
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    httpwwwmedcalcbemanualmpage13-04bhtml
    httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
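The "dividing two variances" step can be sketched for one-way ANOVA. This is a minimal illustration under the usual textbook definitions (between-group and within-group mean squares); the function name `one_way_anova_f` is hypothetical, and looking up significance for the resulting ratio is left to a table or a statistics package.

```python
def one_way_anova_f(groups):
    """One-way ANOVA F ratio; returns (F, df_between, df_within)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the samples around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Usage: three groups of three participants each.
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio, df_b, df_w = one_way_anova_f(groups)

# Rescaling and shifting every measurement leaves F unchanged.
rescaled = [[10 * x + 5 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
```

The second call illustrates the first bullet: both sums of squares scale by the same factor, so their ratio is unaffected by scaling and bias.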
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings, from Artificial Vocal Tract through Phoneme Generator, Finger Spelling and Syllable Generator to Word Generator]
Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)
Approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function, synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: xy values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
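The behaviour of normalized RBF units can be sketched directly. This is an illustrative sketch, not the Glove-TalkII implementation: the function names, the two example centres, the width of 0.2 and the output weights are all hypothetical.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations for each centre, normalized to sum to 1."""
    raw = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centres
    ]
    total = sum(raw)  # assumes x is close enough to some centre that total > 0
    return [r / total for r in raw]


def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations; weights[i] is unit i's output vector."""
    acts = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(a * w[k] for a, w in zip(acts, weights)) for k in range(n_out)]


# Usage: two centres in the XY plane, one output parameter per centre.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[10.0], [20.0]]
at_centre = normalized_rbf((0.0, 0.0), centres, width=0.2)
midway = rbf_output((0.5, 0.0), centres, width=0.2, weights=weights)
```

Near a centre the closest unit takes almost all of the normalized activation, which produces the plateau; between centres the output interpolates, and a smaller width makes that transition sharper.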
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
[Figure: system block diagram, repeated]
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
29
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD amp CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
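Three of the variants listed above (fixed threshold, local maximum, spacing constraint) can be combined in a short sketch. The function name `find_peaks` and its parameters are hypothetical, and the tie-breaking rule (keep the larger of two peaks that are too close) is one possible choice among many.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    When two candidates are closer than min_spacing samples, only the
    larger one is kept.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep whichever is larger.
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks


# Usage: a toy sensor trace with a small bump at index 5 below the threshold.
trace = [0, 1, 0, 3, 0, 0.2, 0, 2, 1.9, 0]
loose = find_peaks(trace, threshold=0.5)
spaced = find_peaks(trace, threshold=0.5, min_spacing=3)
```

An adaptive threshold would replace the fixed `threshold` with a value derived from recent signal statistics, and interpolation around each peak would refine its location beyond the sample grid.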
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
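The trade-off can be made concrete with a little arithmetic for the two window sizes on the slide. The helper name `stft_resolution` is hypothetical; it assumes the usual relations that a window of N samples spans N / sample_rate seconds and that FFT bins are spaced sample_rate / N apart.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    time_ms = 1000.0 * window_size / sample_rate
    bin_hz = sample_rate / window_size
    return time_ms, bin_hz


# Usage: the two settings shown on the slide.
short_window = stft_resolution(256, 22050)   # fine in time, coarse in frequency
long_window = stft_resolution(4096, 22050)   # coarse in time, fine in frequency
```

The short window covers roughly 11.6 ms with bins about 86 Hz apart, while the long window covers roughly 186 ms with bins about 5.4 Hz apart, so improving one resolution worsens the other.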
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
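For sensor data, the linear case mentioned in the last bullet can be sketched as follows. The function name `resample_linear` is hypothetical; the sketch assumes both time lists are sorted in increasing order and holds the first or last value outside the measured range.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times.

    times must be strictly increasing and new_times sorted ascending.
    Requests outside the measured range return the nearest end value.
    """
    out = []
    j = 0
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out


# Usage: bring a slow sensor stream onto a faster, regular clock.
resampled = resample_linear([0, 1, 2], [0, 10, 20], [0, 0.5, 1.5, 2, 3])
```

This is the kind of step needed when sensor streams with different rates have to be aligned before they are combined.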
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
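A power-law mapping from a calibrated sensor range to a control range can be sketched in one function. The name `scale_power`, its parameters, and the example ranges are hypothetical; the input minimum and maximum are assumed to come from a prior calibration step.

```python
def scale_power(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a sensor reading to a control value with a power-law curve.

    The reading is normalized to [0, 1] using the calibrated range,
    clamped, raised to `exponent`, then stretched to the output range.
    """
    norm = (reading - in_min) / (in_max - in_min)
    norm = min(1.0, max(0.0, norm))
    return out_min + (out_max - out_min) * norm ** exponent


# Usage: a slider reading 0..100 mapped to a gain of 0..1 with a square law.
half_way = scale_power(50, 0, 100, 0.0, 1.0, exponent=2.0)
clamped = scale_power(250, 0, 100, 0.0, 1.0, exponent=2.0)
```

With an exponent of 1 the mapping is linear; exponents above 1 give finer control at the low end of the output range, which is one reason power laws are popular for volume-like parameters.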
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music, to a large extent, consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
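A correlation-based period estimate can be sketched with a direct (time-domain) autocorrelation. Note that this sketch is not the efficient FFT-based computation the slide refers to; it uses the straightforward sum, which is fine for short signals. The function names and the search-range parameters are hypothetical.

```python
def autocorrelation(signal, max_lag):
    """Direct autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(signal, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    acf = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])


# Usage: an impulse train that repeats every 8 samples.
pulses = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
period = estimate_period(pulses, min_lag=2, max_lag=20)
```

Restricting the search to a plausible lag range avoids the trivial maximum at lag 0 and reduces confusion with multiples of the true period.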
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
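Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filters centred at CF; linearly-spaced filters span CF-130 to CF+130, log-spaced filters span CF/1.0718 to CF×1.0718]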
Mel-filtering
Log
DCT
MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal, so the high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
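As an illustration of the non-parametric end of this list, here is a minimal k-nearest-neighbors sketch. The two-dimensional feature vectors and the "tap"/"shake" gesture labels are hypothetical, and Euclidean distance with majority voting is an assumed (though common) choice.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set: (feature vector, ground truth label).
train = [
    ((0.1, 0.2), "tap"), ((0.0, 0.3), "tap"), ((0.2, 0.1), "tap"),
    ((0.9, 0.8), "shake"), ((1.0, 0.9), "shake"), ((0.8, 1.0), "shake"),
]
label_a = knn_predict(train, (0.15, 0.25))
label_b = knn_predict(train, (0.85, 0.90))
```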
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
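The first three measures all derive from the same table of counts. A minimal sketch with invented drum-sound labels; rows of the confusion matrix are ground truth and columns are predictions (an assumed convention).

```python
def confusion_matrix(truth, predicted, labels):
    """Rows index the true class, columns the predicted class."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, predicted):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)

def f_measure(matrix, i):
    """Harmonic mean of precision and recall for class i."""
    true_positive = matrix[i][i]
    precision = true_positive / sum(row[i] for row in matrix)
    recall = true_positive / sum(matrix[i])
    return 2 * precision * recall / (precision + recall)

labels = ["kick", "snare", "hat"]
truth = ["kick", "kick", "snare", "snare", "snare", "hat"]
predicted = ["kick", "snare", "snare", "snare", "hat", "hat"]
matrix = confusion_matrix(truth, predicted, labels)
```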
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
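A minimal sketch of the basic K-means iteration (Lloyd's algorithm), assuming the initial centers are supplied by the caller so the run is deterministic; the points are invented.

```python
import math

def kmeans(points, centers, iterations=10):
    """Alternate nearest-center assignment and center re-estimation."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else center
            for cluster, center in zip(clusters, centers)
        ]
    return centers, clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
```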
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
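Ordinary least squares with one input, and the R squared goodness-of-fit measure, can be sketched in a few lines. The data points are invented and the single-feature case is an assumption made to keep the closed form short.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [intercept + slope * x for x in xs])
```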
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
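One way to handle multiple rates is to resample the slower stream onto the faster stream's timestamps and then concatenate the values into one feature vector per time step (early fusion). A minimal sketch, assuming linear interpolation and timestamps in seconds; the stream values and rates are invented.

```python
def interpolate_at(times, values, t):
    """Linearly interpolate a sampled stream at time t, clamping at both ends."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            w = (t - times[i - 1]) / (times[i] - times[i - 1])
            return values[i - 1] + w * (values[i] - values[i - 1])

# Hypothetical slow stream (10 Hz) and fast stream (20 Hz).
slow_times, slow_values = [0.0, 0.1, 0.2], [0.0, 10.0, 20.0]
fast_times, fast_values = [0.0, 0.05, 0.10, 0.15, 0.20], [1.0, 2.0, 3.0, 4.0, 5.0]

# Early fusion: one combined feature vector per fast-stream timestamp.
fused = [(v, interpolate_at(slow_times, slow_values, t))
         for t, v in zip(fast_times, fast_values)]
```

The same helper could instead be applied to classifier outputs, which corresponds to interpolating at a later stage of the pipeline.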
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late fusion are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each Feature Extraction → Dimensionality Reduction → Feature Selection → Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
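The difference between a causal and a non-causal algorithm shows up clearly in smoothing. In this sketch (function names and data are illustrative), the causal average uses only current and past samples, while the centered average needs one future sample and so could not run on a live stream without added delay.

```python
def causal_average(samples, width):
    """Average of the current sample and up to width-1 past samples."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - width + 1): n + 1]
        out.append(sum(window) / len(window))
    return out

def centered_average(samples, half_width):
    """Non-causal: the window also covers `half_width` future samples."""
    out = []
    for n in range(len(samples)):
        window = samples[max(0, n - half_width): n + half_width + 1]
        out.append(sum(window) / len(window))
    return out

impulse = [0.0, 0.0, 0.0, 6.0, 0.0, 0.0, 0.0]
causal = causal_average(impulse, width=3)
centered = centered_average(impulse, half_width=1)
```

The centered output already reacts at index 2, one step before the impulse arrives at index 3; the causal output stays at zero until the impulse has been observed.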
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Figure: Hidden Markov Model. Hidden state (single discrete variable) with transition probs P(state at t | state at t-1); observations with emission probs p(observation at t | state at t); model nodes numbered 1 to 4.]
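Filtering in a hidden Markov model alternates a prediction through the transition model with a reweighting by the emission probability of the new observation. A minimal sketch of that forward recursion; the two states ("playing", "silent"), the two observation symbols and all probabilities are invented for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(current state | all observations so far)."""
    n = len(prior)
    belief = list(prior)
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by how well each state explains the observation.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief

# State 0 = "playing", state 1 = "silent" (hypothetical).
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]   # transition[i][j] = P(next = j | current = i)
emission = [[0.1, 0.9], [0.8, 0.2]]     # emission[state][symbol]; symbol 1 = "loud frame"
belief = forward_filter(prior, transition, emission, observations=[1, 1])
```

After two "loud" observations the belief in state 0 rises to roughly 0.88.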
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
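The predict/update cycle is easiest to see in one dimension. This is a minimal sketch under stated assumptions: a scalar state that is modeled as staying constant between steps (a random walk), with invented process and measurement variances. The gain `k` is where the trade-off between model reliability and observation reliability appears.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly drifting value."""
    x, p = x0, p0            # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed unchanged, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement using the Kalman gain.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

estimates = kalman_1d([5.0] * 50, process_var=1e-4, measurement_var=0.25)
```

A large `measurement_var` makes the filter trust its model and respond slowly; a large `process_var` makes it follow the measurements closely.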
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for and implications of introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: "The Wizard" alongside "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
• Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with critical value t(p, df), showing the region for rejecting the null hypothesis (one-tailed) or the regions for rejecting the null hypothesis (two-tailed); sample means X1, X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
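The equations for these steps did not survive in this transcript. The sketch below uses the standard independent-samples form with a pooled (combined) variance, which is an assumption about what the slide showed; the two samples are invented.

```python
import math

def pooled_t(a, b):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2                                     # degrees of freedom
    pooled_var = (ss1 + ss2) / df                        # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))  # standard error of difference
    return (m1 - m2) / sed, df

treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [1.0, 2.0, 3.0, 4.0, 5.0]
t, df = pooled_t(treatment, control)
```

Here t is 4.0 with df = 8. The commonly tabulated two-tailed critical value for α = 0.05 and df = 8 is about 2.306, so the null hypothesis would be rejected at that level.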
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2  0.1  0.05  0.02  0.01  0.002  0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
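The "dividing two variances" step for the one-way case can be sketched as follows. The three groups are invented, and the function covers only the F ratio; the comparison against a table is left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_between, df_within = one_way_anova_f(groups)

# Rescaling and shifting every value leaves the F ratio unchanged.
rescaled = [[10.0 * x + 3.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
```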
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of gesture-to-speech mappings. Categories: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems placed along the spectrum: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), foot pedal, contact switches. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
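The plateau behaviour can be reproduced with a small sketch of normalized Gaussian RBF units. This is an illustration, not the Glove-TalkII implementation: the three centres and the scalar output weights are invented, the width 0.15 is borrowed from the vowel network slide, and the distances are shifted by the nearest one before exponentiating so that the normalization never divides by zero far from all centres.

```python
import math

def normalized_rbf(x, centers, width):
    """Gaussian RBF activations, normalized so they sum to 1."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centers]
    nearest = min(d2)
    # Subtracting the nearest distance rescales all units equally,
    # so the normalized result is unchanged but cannot underflow to 0/0.
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centers, width, weights):
    """Network output: normalized activations weighted by per-unit outputs."""
    return sum(a * w for a, w in zip(normalized_rbf(x, centers, width), weights))

centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
weights = [0.0, 1.0, 2.0]
at_center = normalized_rbf((0.0, 0.0), centers, 0.15)
far_away = normalized_rbf((5.0, 0.2), centers, 0.15)
midway = rbf_output((0.5, 0.0), centers, 0.15, weights)
```

Even at a point far from every centre, the closest unit takes almost all of the normalized activation, which is the plateau effect; halfway between two centres the output interpolates between their weights.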
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Human Action Oriented
30
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs. 1-2 W)
• offer single-chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
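Three of the listed variants (fixed threshold, local maximum, spacing constraint) combine naturally in one pass over the signal. A minimal sketch with an invented signal; the rule of keeping the larger of two peaks that are too close together is one possible choice among many.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return (location, amplitude) pairs for thresholded local maxima."""
    peaks = []
    for n in range(1, len(signal) - 1):
        is_local_max = signal[n - 1] < signal[n] >= signal[n + 1]
        if not is_local_max or signal[n] < threshold:
            continue
        if peaks and n - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger one.
            if signal[n] > peaks[-1][1]:
                peaks[-1] = (n, signal[n])
        else:
            peaks.append((n, signal[n]))
    return peaks

signal = [0.0, 1.0, 0.0, 0.2, 0.0, 3.0, 2.5, 2.8, 0.0, 0.0, 4.0, 0.0]
peaks = find_peaks(signal, threshold=0.5, min_spacing=3)
```

The bump at index 3 fails the threshold and the one at index 7 is suppressed by the spacing constraint, leaving peaks at indices 1, 5 and 10.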
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion (preserves edges)
  – Median filtering
  – Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency, or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
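Calibration and scaling are often chained: first learn the sensor's range, then map the normalized value onto a control parameter. The sketch below assumes a hypothetical raw reading range, a pitch range of 55 to 1760 Hz, and a squared-law gain curve; none of these values come from the tutorial. The knob mapping is exponential rather than a power law, and is named as such.

```python
class RangeCalibrator:
    """Tracks the observed minimum and maximum of a raw sensor reading."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, raw):
        self.lo = min(self.lo, raw)
        self.hi = max(self.hi, raw)

    def normalize(self, raw):
        """Map raw into [0, 1] using the calibrated range, clamping outliers."""
        if self.hi <= self.lo:
            return 0.0
        x = (raw - self.lo) / (self.hi - self.lo)
        return min(1.0, max(0.0, x))

def knob_to_frequency(x, f_min=55.0, f_max=1760.0):
    """Exponential mapping: equal knob steps give equal musical intervals."""
    return f_min * (f_max / f_min) ** x

def slider_to_gain(x, exponent=2.0):
    """Power-law mapping from normalized slider position to linear gain."""
    return x ** exponent

cal = RangeCalibrator()
for reading in (120, 900, 510):        # hypothetical ADC readings seen so far
    cal.observe(reading)
print(knob_to_frequency(cal.normalize(510)), slider_to_gain(cal.normalize(510)))
```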
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
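A correlation-based period estimate can be sketched directly from the definition. This version is the plain O(N·lag) sum; the efficient power-of-2 computation mentioned on the slide would use the FFT instead. The test signal and lag bounds are illustrative assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[k] = sum over n of x[n] * x[n + k]."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])

# Sinusoid with a period of 25 samples.
signal = [math.sin(2 * math.pi * n / 25) for n in range(400)]
print(estimate_period(signal, min_lag=5, max_lag=60))
```

Cross-correlation follows the same pattern with two different input sequences.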
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
CFCF-130CF 10718
CF+130CF 10718
Mel-filtering -> Log -> DCT -> MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; the high ones can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
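Energy compaction can be illustrated with a small DCT-II written from the definition. The 40-band log-energy envelope below is synthetic and purely illustrative, and the transform is left unnormalized; real MFCC code would typically use an optimized, scaled DCT.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum over n of x[n] * cos(pi/N * (n + 0.5) * k)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]

def cepstral_coefficients(log_band_energies, n_keep=13):
    """Keep only the first n_keep DCT coefficients (MFCC-style truncation)."""
    return dct2(log_band_energies)[:n_keep]

# Hypothetical smooth log-energy envelope over 40 bands (13 + 27 filters).
log_energies = [math.log(1.0 + 10.0 * math.exp(-band / 10.0)) for band in range(40)]
coeffs = dct2(log_energies)
low = sum(c * c for c in coeffs[:13])
total = sum(c * c for c in coeffs)
print(f"share of squared coefficient magnitude in the first 13: {low / total:.4f}")
```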
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
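K-nearest neighbors is the simplest of these to write down. The sketch below uses a made-up two-feature training set with hypothetical drum labels; it only illustrates the voting rule.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors nearest to query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors with ground truth labels.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.75)]
train_y = ["snare", "snare", "snare", "kick", "kick", "kick"]
print(knn_predict(train_x, train_y, (0.2, 0.2)))
```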
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
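The three metrics can be computed from paired lists of true and predicted labels. This is a minimal sketch with invented labels; the function names are illustrative.

```python
from collections import Counter

def confusion_matrix(truth, predicted):
    """Counts keyed by (true label, predicted label)."""
    return Counter(zip(truth, predicted))

def accuracy(truth, predicted):
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)

def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

truth = ["a", "a", "a", "b", "b", "b"]
predicted = ["a", "a", "b", "b", "b", "a"]
print(accuracy(truth, predicted), f_measure(truth, predicted, "a"))
print(confusion_matrix(truth, predicted))
```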
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBScan
• Graph-based
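K-means alternates an assignment step and an update step. The sketch below takes the initial centroids from the caller, which is an assumption made to keep the example deterministic; practical implementations usually pick them at random or with a seeding heuristic.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm; centroids is the caller-supplied initial guess."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)   # leave an empty cluster where it was
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans(points, centroids=[(0, 0), (10, 10)]))
```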
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
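Ordinary least squares for a single feature, together with the coefficient of determination, fits in a few lines. The data points are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(ys, [slope * x + intercept for x in xs]))
```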
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
A. Tindale, A. Kapur, G. Tzanetakis, “Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Diagram: two parallel streams, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
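The slide gives only the title, so the following is a generic textbook sketch of dynamic time warping rather than the variant shown in the tutorial. It aligns two one-dimensional sequences using the absolute difference as local cost, with no window or slope constraints.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: skip in a, skip in b, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# The same gesture performed at two different speeds aligns with zero cost.
print(dtw_distance([0, 1, 2, 1, 0], [0, 0, 1, 2, 2, 1, 0]))
```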
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over the current state given the evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of a future state given evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
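[Figure: HMM unrolled over time steps 1, 2, 3, 4. Model: transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t). Hidden state: a single discrete variable. Observed: the observations.]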
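Filtering in an HMM alternates a prediction through the transition model with an update from the emission model. The sketch below uses a toy two-state model whose probabilities are invented for illustration; states and observations are plain integer indices.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for every t."""
    states = range(len(prior))
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in states) for j in states]
        # Update: weight by the emission probability of what was observed.
        unnorm = [predicted[j] * emission[j][obs] for j in states]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]   # transition[i][j] = P(next = j | current = i)
emission = [[0.9, 0.1], [0.2, 0.8]]     # emission[j][o]  = P(observation = o | state = j)
for belief in forward_filter(prior, transition, emission, [0, 0]):
    print([round(b, 3) for b in belief])
```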
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian (LG) conditional distributions represent state and sensor models
• LG: P(x|y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
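A scalar version makes the predict/update cycle and the reliability trade-off concrete. This is a minimal sketch for a constant-value (random walk) state model; the parameter names and default values are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter: x is the state mean, p its variance."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: the state stays put, the uncertainty grows.
        p = p + process_var
        # Update step: the gain balances model and measurement reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

readings = [5.2, 4.8, 5.1, 4.9, 5.3, 5.0]   # hypothetical noisy sensor stream
print(kalman_1d(readings, process_var=1e-3, measurement_var=0.1))
```

A large `measurement_var` makes the filter trust its model and respond slowly; a large `process_var` makes it follow the measurements closely.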
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor, by IBM (1984)
[Figure panels: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
• Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution curves marking the critical value t(p,df); one panel shows two regions for rejecting the null hypothesis, the other a single region; sample means labelled X1 and X2]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
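The formulas on this slide did not survive extraction, so the sketch below uses the standard pooled-variance form of the independent-samples t-test, which matches the three steps listed. Whether the slide used exactly this form is an assumption, and the sample values are invented.

```python
import math

def independent_t(a, b):
    """Student's t for two independent samples, assuming equal variances."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2                                   # degrees of freedom
    pooled_var = (ss1 + ss2) / df                      # combined variance
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))    # standard error of difference
    return (m1 - m2) / sed, df

t, df = independent_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)
```

The magnitude of the returned t is then compared with the critical value for the chosen significance level and df.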
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical bull look t(pdf) up in a pre-computed tablendash in back of any statistics textbookndash on web eg 13 httpwwwmedcalcbemanualmpage13-04bhtml13 httpwwwstatsoftcomtextbookstathomehtml
bull or (now most common) compute the p for your t(df) directlyndash Matlab or Excel functionsndash web calculators (eg search on ldquostudentrsquos t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
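The "divide two variances" recipe for the one-way case can be sketched as follows. The groups are made-up numbers; the second call applies a scale and an offset to every value to show that the F ratio does not change.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(groups))
print(one_way_anova_f([[10 * x + 7 for x in g] for g in groups]))
```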
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mapping granularities with example systems]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Systems cited: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
Approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input
  – 26+ dimensions
  – constrained subspace
• Output
  – 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function, synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
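The behaviour described here (interpolation between centres, with the closest unit dominating) can be reproduced with a small normalized Gaussian RBF layer. This is a generic sketch, not the Glove-TalkII code: the centres, width and output weights below are hypothetical.

```python
import math

def normalized_rbf(x, centres, width, weights):
    """Normalized Gaussian RBF hidden layer followed by a linear output layer."""
    d2 = [math.dist(x, c) ** 2 for c in centres]
    nearest = min(d2)
    # Subtracting the smallest distance avoids underflow far from every centre;
    # it cancels in the normalization below.
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(acts)
    acts = [a / total for a in acts]               # activations now sum to 1
    n_out = len(weights[0])
    return [sum(a * w[k] for a, w in zip(acts, weights)) for k in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]                 # hypothetical hand positions
weights = [[100.0, 10.0], [200.0, 20.0]]           # hypothetical output targets per centre
print(normalized_rbf((0.0, 0.0), centres, 0.2, weights))   # near the first target
print(normalized_rbf((0.5, 0.0), centres, 0.2, weights))   # halfway between the two
```

A smaller width sharpens the plateaus around each centre; a larger width gives smoother interpolation.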
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button
31
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Piezoelectric Sensors
32
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure bull Motivation and Overview bull Sensors and Actuators bull Signals and Features bull Uncertainty and time bull Human Computer Interaction bull Integration and implementation bull Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Signals and Features

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
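As a rough illustration of a digital low-pass filter operating sample by sample, here is a minimal sketch of a one-pole smoother in plain Python. The function name `one_pole_lowpass` and the coefficient `alpha` are illustrative assumptions, not something taken from the tutorial.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence with y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha close to 1 follows the input closely; alpha close to 0
    attenuates fast (high-frequency) changes more strongly.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    output = []
    previous = None
    for x in samples:
        # Initialise the state with the first sample to avoid a start-up jump.
        previous = x if previous is None else alpha * x + (1.0 - alpha) * previous
        output.append(previous)
    return output


# Hypothetical jittery sensor readings, smoothed.
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smoothed = one_pole_lowpass(noisy, alpha=0.25)
```

A filter like this is causal (it only uses past samples), which makes it a common first choice for smoothing live sensor data.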
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
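The variants listed above can be combined in a few lines. The sketch below is one possible arrangement, assuming a fixed threshold, a strict local-maximum test, a minimum spacing between peaks, and parabolic interpolation for sub-sample positions. The function name `find_peaks` and its parameters are illustrative, not part of the tutorial.

```python
def find_peaks(signal, threshold, min_distance=1):
    """Return (position, amplitude) pairs for local maxima above threshold."""
    # Local maximum + fixed threshold.
    candidates = []
    for i in range(1, len(signal) - 1):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        if mid > threshold and mid > left and mid >= right:
            candidates.append(i)

    # Spacing constraint: visit the strongest peaks first and drop
    # any candidate closer than min_distance to one already kept.
    kept = []
    for i in sorted(candidates, key=lambda k: signal[k], reverse=True):
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)

    # Parabolic interpolation through the three samples around each peak.
    peaks = []
    for i in sorted(kept):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        denom = left - 2.0 * mid + right
        offset = 0.0 if denom == 0 else 0.5 * (left - right) / denom
        amplitude = mid - 0.25 * (left - right) * offset
        peaks.append((i + offset, amplitude))
    return peaks


envelope = [0, 1, 0, 0, 3, 0, 2, 0]
all_peaks = find_peaks(envelope, threshold=0.5)
spaced_peaks = find_peaks(envelope, threshold=0.5, min_distance=3)
```

An adaptive threshold (for example, a multiple of a running median) could replace the fixed `threshold` argument without changing the rest of the structure.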
Fourier Transform
[Figure: spectrum]

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: the FFT (fast Fourier transform)

Spectrogram
• Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
• Time-frequency tradeoff
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]

Wavelets
• STFT: fixed time-frequency resolution based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using a noise profile
  - Median filtering in the time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in the wavelet domain
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on the available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
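For sensor data, linear interpolation is often enough. The sketch below resamples irregularly timed readings onto a uniform grid, assuming sorted timestamps and holding the end values outside the measured range. The names `interpolate_at` and `resample_uniform` are illustrative.

```python
from bisect import bisect_right


def interpolate_at(times, values, t):
    """Linearly interpolate the series (times, values) at time t.

    times must be sorted in increasing order; outside the covered
    range the nearest end value is held.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    k = bisect_right(times, t)  # times[k - 1] <= t < times[k]
    t0, t1 = times[k - 1], times[k]
    frac = (t - t0) / (t1 - t0)
    return values[k - 1] + frac * (values[k] - values[k - 1])


def resample_uniform(times, values, rate_hz):
    """Resample onto a uniform grid starting at times[0]."""
    step = 1.0 / rate_hz
    count = int((times[-1] - times[0]) / step) + 1
    grid = [times[0] + n * step for n in range(count)]
    return grid, [interpolate_at(times, values, t) for t in grid]


# Hypothetical asynchronous sensor events resampled at 1 Hz.
grid, resampled = resample_uniform([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], 1.0)
```

This is the kind of step needed when several sensor streams with different rates have to be aligned before further processing.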
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power-law functions frequently used
• Frequently used in conjunction with calibration
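Calibration and scaling are often chained: first find the range of the raw readings, then map the normalized value onto the control parameter. A minimal sketch follows, assuming a power-law curve. The function names, the raw readings, and the 100-500 Hz target range are made up for illustration.

```python
def calibrate_range(readings):
    """Calibration: observed minimum and maximum of raw sensor readings."""
    return min(readings), max(readings)


def normalize(raw, lo, hi):
    """Map a raw reading into [0, 1] using the calibrated range (clamped)."""
    if hi == lo:
        return 0.0
    return min(1.0, max(0.0, (raw - lo) / (hi - lo)))


def scale_power(u, out_min, out_max, exponent=2.0):
    """Scaling: power-law mapping of u in [0, 1] onto [out_min, out_max]."""
    return out_min + (out_max - out_min) * (u ** exponent)


# Hypothetical knob readings gathered while sweeping the full range.
lo, hi = calibrate_range([112, 530, 901, 300])

# Map a new reading to an oscillator frequency in Hz.
frequency = scale_power(normalize(530, lo, hi), 100.0, 500.0)
```

An exponent above 1 gives finer control at the low end of the range; an exponent below 1 does the opposite.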
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

Autocorrelation and Cross-Correlation
• Efficient computation when N is a power of 2
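To make the correlation-based idea concrete, here is a small sketch that estimates a period from the autocorrelation of a signal. It uses the direct sum rather than the FFT route the slide alludes to, so it is only suitable for short signals. The function names and the lag search range are assumptions for illustration.

```python
import math


def autocorrelation(signal, max_lag):
    """Direct autocorrelation of the mean-removed signal for lags 0..max_lag."""
    mean = sum(signal) / len(signal)
    centered = [x - mean for x in signal]
    return [
        sum(centered[n] * centered[n + lag] for n in range(len(centered) - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    acf = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])


# A sinusoid repeating every 20 samples stands in for a periodic gesture.
gesture = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
period = estimate_period(gesture, min_lag=5, max_lag=50)
```

The `min_lag` bound keeps the search away from lag 0, where the autocorrelation is always largest.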
Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top down)

NIME Object tracking

Feature Extraction: Audio
Mel Frequency Cepstral Coefficients
• Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
• Filter edges around the centre frequency CF: CF - 130 and CF + 130 (linearly spaced); CF / 1.0718 and CF × 1.0718 (log-spaced)
• Processing chain: Mel-filtering -> Log -> DCT -> MFCCs
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; the high ones can be thrown away
• MFCCs keep the first 13-20
• MDCT (overlap-based) is used in MP3, AAC, and Vorbis audio compression
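The energy-compaction idea can be tried out with a direct (unnormalized) DCT-II and its inverse. The sketch below is a plain O(N²) version for illustration only. The names `dct_ii` and `idct_ii` and the `keep` parameter are assumptions, and real MFCC code would use an optimized library routine.

```python
import math


def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]


def idct_ii(coeffs, keep=None):
    """Invert dct_ii, optionally using only the first `keep` coefficients."""
    n = len(coeffs)
    keep = n if keep is None else keep
    return [
        coeffs[0] / n
        + (2.0 / n) * sum(
            coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, keep)
        )
        for i in range(n)
    ]


# A smooth, made-up log-energy profile across 8 bands.
log_energies = [1.0, 1.2, 1.5, 1.9, 2.0, 1.8, 1.4, 1.1]
coeffs = dct_ii(log_energies)
approx = idct_ii(coeffs, keep=4)  # reconstruction from the low coefficients only
```

Keeping only the first few coefficients, as MFCCs do, discards fine detail while preserving the overall shape of the spectrum.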
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
• PCA: eigenanalysis of the correlation matrix
• Projection matrix

Self-Organizing Maps
Uncertainty and Time

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground-truth labels

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest neighbors
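As a small worked example of the non-parametric case, here is a sketch of a k-nearest-neighbors classifier using Euclidean distance and majority voting. The feature vectors and the "soft"/"loud" labels are invented for illustration, and `knn_predict` is a hypothetical helper name.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among the k nearest
    training vectors (Euclidean distance)."""
    order = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Made-up 2D feature vectors (for example, two summary statistics per drum hit).
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["soft", "soft", "soft", "loud", "loud", "loud"]

prediction = knn_predict(features, labels, (0.5, 0.5), k=3)
```

There is no training step: the training set itself is the model, which is what makes the method non-parametric.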
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution-based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  - DBSCAN
• Graph-based

Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin index
  - Dunn index
• External evaluation (additional ground-truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard index
  - Confusion matrix
• Various types of user studies
Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given the day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees

Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
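A minimal sketch of ordinary least squares for a single feature, followed by the coefficient of determination, may help make the two slides above concrete. The helper names `fit_line` and `r_squared` and the toy data are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Toy data lying exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```

With noisy data, `fit_quality` drops below 1 in proportion to the variance the line fails to explain.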
Surrogate Sensors
• Use direct sensors to "learn" indirect acquisition
• Use an augmented instrument for training
• Record the acoustic signal
• Train a model to associate the direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Reference: A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems," IEEE Trans. on Multimedia, 2011.

Surrogate Sensing and the Ground Truth problem

Two case studies: Regression, Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
• Early and late fusion are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains: Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Multi-modal Results
• Main idea: use the camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
Dynamic Time Warping

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over the current state given evidence (= likelihood of evidence)
• Prediction: posterior distribution of a future state given evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and

Hidden Markov Models I
[Figure: HMM model. Hidden state (a single discrete variable) with transition probabilities P(state at t | state at t-1); observations with emission probabilities p(observation | state at t)]
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian (LG) conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, i.e., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
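The predict/update cycle is easiest to see in one dimension. The sketch below assumes the simplest possible model, a scalar state that stays constant apart from Gaussian process noise. This is a simplification of the linear model on the slide, and the function name and variance parameters are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_estimate=0.0, initial_var=1.0):
    """Scalar Kalman filter for a (nearly) constant state.

    process_var: how much the state is allowed to drift per step (model reliability).
    measurement_var: sensor noise variance (observation reliability).
    """
    estimate, variance = initial_estimate, initial_var
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed unchanged, uncertainty grows.
        variance += process_var
        # Update step: blend prediction and measurement using the Kalman gain.
        gain = variance / (variance + measurement_var)
        estimate += gain * (z - estimate)
        variance *= 1.0 - gain
        estimates.append(estimate)
    return estimates


# Thirty identical readings of 5.0; the estimate moves from 0.0 toward 5.0.
track = kalman_1d([5.0] * 30, process_var=0.01, measurement_var=1.0)
```

Raising `measurement_var` makes the filter trust the model more (slower, smoother response); raising `process_var` makes it trust the sensor more.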
Case Studies

Multimodal tempo detection for the E-sitar

Human Computer Interaction

Beat tracking
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amounts of data
  - output is a description of practices rather than specific designs
• ethnographers are not trained as designers: taught to "interfere" as little as possible with the community
participatory design
• Users (often musicians) become first-class members in the design process
  - active collaborators vs. passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
• side note: origins in Scandinavia

participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor, by IBM (1984)
[Figure: what the user sees; the wizard]

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with the regions (two-tailed) or region (one-tailed) for rejecting the null hypothesis, bounded by the critical value t(p,df)]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
• note: df computation
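The slide's formulas did not survive in this transcript, so the sketch below follows the standard textbook form of the independent two-sample t-test with a pooled (combined) variance, which matches the equal-variance assumption mentioned earlier. The function name and the sample data are illustrative assumptions.

```python
import math


def pooled_t_statistic(a, b):
    """Independent two-sample t statistic assuming equal variances.

    Returns (t, df) where df = n1 + n2 - 2.
    """
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = (ss1 + ss2) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (m1 - m2) / sed, df


# Made-up task-completion scores for a control and a treatment group.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [3.0, 4.0, 5.0, 6.0, 7.0]
t_value, df = pooled_t_statistic(control, treatment)
```

For these samples |t| is about 2.0 with df = 8. The two-tailed critical value at α = 0.05 for df = 8 is 2.306, so the null hypothesis would not be rejected at that level.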
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g., http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g., search on "student's t-
[Table: critical values of t; columns for two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes the t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
I&I Case studies

Integration and
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Sources: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500]
• Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
• Systems cited: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)
Glove-TalkII Vocabulary
• Loosely based on an articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X,Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
  - constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks
• 3 neural networks used
  - Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network
Glove-TalkII System
[Figure: system block diagram]
• Inputs
  - Right hand data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz)
  - Foot pedal
  - Contact switches
• Processing blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function
• Output: Synthesizer -> Speech output
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0% error
  - Testing
    • 1380 consonants, 234 vowels
    • 0% error

GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std. deviation (0.15)
  - Input: x,y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1% error
• Testing
  - 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
• Weights are centre
  - Normalized over all units in group
• Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
• Closest RBF dominates
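The behaviour described above (plateaus near a centre, interpolation in between) can be reproduced with a few lines of plain Python. This is a generic sketch of a normalized Gaussian RBF layer, not the Glove-TalkII implementation. The centres and output weights are made up, and only the width of 0.15 echoes the value quoted for the vowel network.

```python
import math


def normalized_rbf_activations(x, centres, width):
    """Gaussian RBF activations, normalized to sum to 1 over all units."""
    raw = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centres
    ]
    total = sum(raw)  # may underflow to 0 if x is very far from every centre
    return [r / total for r in raw]


def normalized_rbf_output(x, centres, width, weights):
    """Output layer: activation-weighted sum of each unit's output weights."""
    acts = normalized_rbf_activations(x, centres, width)
    return [
        sum(a * w[k] for a, w in zip(acts, weights))
        for k in range(len(weights[0]))
    ]


# Two hypothetical hand positions (centres), each tied to one output value.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[100.0], [200.0]]

near_first = normalized_rbf_activations((0.1, 0.0), centres, 0.15)
midway = normalized_rbf_output((0.5, 0.0), centres, 0.15, weights)
```

Near a centre the closest unit takes almost all of the normalized activation (the plateau), while halfway between two centres the output is an even blend of their weights.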
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives, and 740 nasals
  - 0.05% error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1% error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: parallel formant speech synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
• Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
• Covered many aspects
  - Sensors and Actuators
  - Signals and Features
  - Uncertainty and Time
  - Human Computer Interaction
  - Integration and Implementation
  - Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test, and evaluate radical ideas for HCI

Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs. 1-2 W)
• offer single-chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  - head, arms, and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition, and noise cancellation
• Cheap
Actuators bull Electromechanical devices that affect
the physical world but are controlled digitally
bull Building blocks of robots and robotic devices
bull Output component of multi-modal interfaces
bull Examples
39
Sensors and Actuators
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High-performance alternative to stepper motors
  - Higher cost
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer (3-axis)
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
Tooka: a simple example (Fels et al
Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]
2D/3D/ND + time
[Figure: image and volume sequences over time]
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Sensors <-> DSP <->
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Fundamental building blocks of signal processing
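A minimal sketch of a digital filter operating on samples, assuming a one-pole (exponential smoothing) low-pass filter and its complementary high-pass. The function names, the `alpha` coefficient and the sample data are illustrative and not from the tutorial.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence: y[n] = alpha * x[n] + (1 - alpha) * y[n-1], 0 < alpha <= 1."""
    if not samples:
        return []
    out = []
    y = samples[0]  # start from the first sample to avoid a start-up transient
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


def one_pole_highpass(samples, alpha):
    """Complementary high-pass: what remains after removing the low-pass output."""
    low = one_pole_lowpass(samples, alpha)
    return [x - l for x, l in zip(samples, low)]


noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]  # hypothetical jittery sensor readings
smooth = one_pole_lowpass(noisy, 0.2)   # smaller alpha means heavier smoothing
```

Smaller values of `alpha` attenuate fast changes more strongly, at the cost of a slower response to real gestures.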
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
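One possible combination of the variants listed for peak detection, sketched under stated assumptions: a fixed threshold, a local-maximum test, a minimum spacing constraint, and parabolic interpolation through the three samples around each peak. The function name and parameters are hypothetical.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs for peaks in x."""
    peaks = []
    for n in range(1, len(x) - 1):
        a, b, c = x[n - 1], x[n], x[n + 1]
        # Fixed threshold plus local maximum test.
        if b < threshold or not (a < b >= c):
            continue
        # Parabolic interpolation refines location and amplitude.
        denom = a - 2.0 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        loc = n + offset
        amp = b - 0.25 * (a - c) * offset
        # Spacing constraint: keep only the larger of two close peaks.
        if peaks and loc - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (loc, amp)
            continue
        peaks.append((loc, amp))
    return peaks


signal = [0, 1, 4, 1, 0, 0, 2, 3, 2, 0]
print(find_peaks(signal, threshold=1.5))
```

An adaptive threshold could replace the fixed one by, for example, comparing each sample against a running average of its neighbourhood.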
Fourier Transform
[Figure: spectrum]
Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz, illustrating the time-frequency tradeoff]
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
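The time-frequency tradeoff for the two window sizes can be worked out directly. A small sketch, assuming the usual relations (window duration = N / fs, bin spacing = fs / N); the function name is illustrative.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    duration_ms, bin_hz = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {duration_ms:7.2f} ms per window, {bin_hz:6.2f} Hz per bin")
```

The short window localizes events to roughly 12 ms but separates frequencies only about 86 Hz apart; the long window resolves about 5 Hz but smears events over roughly 186 ms.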
Wavelets
• STFT: fixed time-frequency resolution based on window size
• DWT: adaptive time-frequency resolution
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
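For sensor data, linear interpolation is often sufficient. The sketch below reads a value at a fractional sample index and uses that to resample a stream to a new rate; the names are hypothetical, and no anti-aliasing filter is applied, so it is only appropriate when the signal is already smooth relative to the target rate.

```python
def linear_interp(samples, t):
    """Value at fractional index t, clamped to the ends of the sequence."""
    if t <= 0:
        return samples[0]
    if t >= len(samples) - 1:
        return samples[-1]
    i = int(t)
    frac = t - i
    return (1.0 - frac) * samples[i] + frac * samples[i + 1]


def resample(samples, ratio):
    """Resample by ratio = new_rate / old_rate using linear interpolation."""
    n_out = int((len(samples) - 1) * ratio) + 1
    return [linear_interp(samples, k / ratio) for k in range(n_out)]


print(resample([0.0, 2.0, 4.0], 2))
```

The same `linear_interp` helper can act as a crude fractional delay: reading at `n - d` for a non-integer delay `d`.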
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
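A sketch of range calibration followed by power-law scaling, for example mapping a knob reading to a frequency. The raw readings, the output range and the exponent are made-up illustration values.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a raw reading in [lo, hi] to [out_min, out_max] with a power law."""
    if hi == lo:
        return out_min
    norm = (reading - lo) / (hi - lo)
    norm = min(1.0, max(0.0, norm))  # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent


# Hypothetical calibration pass: the performer sweeps the knob end to end.
lo, hi = calibrate([100, 612, 900])
print(scale(500, lo, hi, 20.0, 2020.0, exponent=2.0))
```

With `exponent=1.0` the mapping is linear; exponents above 1 give finer control near the bottom of the output range.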
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
Autocorrelation and Cross-Correlation
• Efficient computation when N is a power of 2
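A sketch of why a power-of-2 length helps: the autocorrelation can be computed as the inverse FFT of the power spectrum, and a radix-2 FFT is simple when the length is a power of 2. The recursive FFT and the `dominant_period` helper are illustrative, written with the standard library only.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def autocorrelation(x):
    """r[k] = sum_n x[n] * x[n + k], computed via the power spectrum."""
    n = len(x)
    size = 1
    while size < 2 * n:  # zero-pad so circular correlation equals linear
        size *= 2
    spectrum = fft(list(x) + [0.0] * (size - n))
    power = [abs(c) ** 2 for c in spectrum]
    # The power spectrum is real and even, so a forward FFT divided by
    # size gives the same real part as the inverse FFT.
    return [v.real / size for v in fft(power)[:n]]


def dominant_period(x, min_lag=1):
    """Lag (in samples) with the largest autocorrelation at or beyond min_lag."""
    acf = autocorrelation(x)
    return max(range(min_lag, len(acf)), key=lambda k: acf[k])


print(dominant_period([1.0, 0.0, 0.0, 0.0] * 4))
```

Cross-correlation of two signals follows the same pattern, multiplying one spectrum by the complex conjugate of the other instead of taking the squared magnitude.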
Similarity Matrix
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
NIME Object tracking
Feature Extraction: Audio
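Mel Frequency Cepstral Coefficients
• Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: filter edges relative to the center frequency CF: CF-130 and CF+130 for the linearly-spaced filters; CF scaled by 1.0718 for the log-spaced filters]
• Processing chain: Mel-filtering -> Log -> DCT -> MFCCs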
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal, so the high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
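Energy compaction can be seen on a small example. The sketch below implements an orthonormal DCT-II and its inverse directly from the definition (O(N^2), for illustration only) and measures how much of a smooth signal's energy sits in the first few coefficients; the ramp signal and function names are assumptions for the example.

```python
import math


def dct(x):
    """Orthonormal DCT-II computed directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
                * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
            for i in range(n)]


def energy_fraction(coeffs, keep):
    """Fraction of total energy held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total


ramp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]  # a smooth test signal
coeffs = dct(ramp)
print(round(energy_fraction(coeffs, 2), 4))
```

For this ramp, the first two of eight coefficients carry more than 99% of the energy, which is the property that lets MFCCs and audio codecs discard the higher coefficients.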
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: example image reduced to 256 colors]
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
Principal Component Analysis
[Figure: projection matrix]
• PCA: eigenanalysis of correlation matrix
Self-Organizing Maps
ICASSP 2013 tutorial
Classification Formulationbull Objective given a feature vector
representing something predict the class (a discrete categorical label) it belongs to
bull Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest neighbors
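As a concrete instance of the non-parametric case, a k-nearest-neighbors classifier fits in a few lines. The feature vectors and the "kick"/"snare" labels below are invented for illustration; `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors for two drum sounds.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["kick", "kick", "kick", "snare", "snare", "snare"]
print(knn_predict(train_x, train_y, (0.5, 0.5)))
```

There is no training step: the training set itself is the model, which is convenient when a performer adds new examples on the fly.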
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
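The three evaluation measures are closely related: accuracy and F-measure can both be read off the confusion matrix. A minimal sketch with hypothetical labels and helper names:

```python
def confusion_matrix(truth, predicted, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, predicted):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of examples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def f_measure(matrix, i):
    """F1 score of class i: harmonic mean of precision and recall."""
    tp = matrix[i][i]
    fp = sum(row[i] for row in matrix) - tp
    fn = sum(matrix[i]) - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


truth = ["a", "a", "a", "b", "b"]
predicted = ["a", "a", "b", "b", "a"]
m = confusion_matrix(truth, predicted, ["a", "b"])
print(m, accuracy(m), f_measure(m, 0))
```

In cross-validation the same functions would be applied to the held-out fold of each split and the results averaged.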
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
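A bare-bones k-means sketch, assuming Euclidean distance and a deliberately naive initialisation (the first k points) so that the result is deterministic; practical implementations usually use random or k-means++ initialisation. Names and data are illustrative.

```python
import math


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            for p in points]


def kmeans(points, k, iterations=20):
    """Alternate assignment and mean-update steps until the centers stop moving."""
    centers = [tuple(p) for p in points[:k]]  # naive initialisation
    for _ in range(iterations):
        labels = assign(points, centers)
        new_centers = []
        for c in range(k):
            members = [p for p, a in zip(points, labels) if a == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return centers, assign(points, centers)


points = [(0.0, 0.0), (10.0, 10.0), (0.0, 1.0), (1.0, 0.0), (10.0, 11.0), (11.0, 10.0)]
centers, labels = kmeans(points, 2)
print(centers, labels)
```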
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin index
  - Dunn index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard index
  - Confusion matrix
• Various types of user studies
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
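A sketch of ordinary least squares for a single input feature, together with the coefficient of determination used to judge goodness of fit. The data points and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual variation / total variation."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predicted))
```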
Surrogate Sensors
• Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Reference: A. Tindale, A. Kapur, G. Tzanetakis, “Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011
Surrogate Sensing and the Ground Truth problem
Two case studies: Regression, Classification
Some Results
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
Sensor Fusion
[Figure: two parallel pipelines, each with feature extraction, dimensionality reduction, feature selection and classification]
Early and late are the extremes of a full spectrum of possibilities
Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real time: the processing is done, with some delay, at the same time as the sensor data
Dynamic Time Warping
Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just previous time slice)
• Transition model: P(current state | previous state)
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
Hidden Markov Models I
[Figure: HMM with a hidden state (single discrete variable) and observations; the model consists of transition probabilities P(state at t | state at t-1) and emission probabilities P(observation at t | state at t)]
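The filtering task for an HMM can be sketched as the forward recursion: predict with the transition model, weight by the emission probability of the new observation, and normalize. The two-state model and its numbers below are a textbook-style toy example, not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for every time step.

    transition[i][j] = P(next state j | current state i)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight by the likelihood of the observation, then normalize.
        updated = [emission[j][obs] * predicted[j] for j in range(n)]
        total = sum(updated)
        belief = [u / total for u in updated]
        history.append(belief)
    return history


prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
history = forward_filter(prior, transition, emission, [0, 0])
print([round(b[0], 3) for b in history])
```

The recursion is causal: each step uses only the previous belief and the current observation, which is what makes filtering suitable for real-time use.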
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
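A one-dimensional sketch of the predict/update cycle, assuming the simplest possible model (the state stays where it is, plus Gaussian process noise). The variance parameters and starting values are illustrative; the ratio between them sets the trade-off between trusting the model and trusting the observations.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model."""
    x, p = x0, p0  # state estimate (mean) and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the mean is unchanged, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


estimates = kalman_1d([5.0] * 50, process_var=0.01, measurement_var=1.0)
print(round(estimates[0], 3), round(estimates[-1], 3))
```

A large `measurement_var` relative to `process_var` makes the filter smooth heavily and respond slowly; the opposite makes it follow the sensor closely.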
Multimodal tempo detection for the E-sitar
Beat tracking
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  - detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer’s and the users’
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community
participatory design
• users (often musicians) become first-class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
• side note: origins in Scandinavia
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  - the user is not always right
    • don’t expect them to know what they want
  - conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor, by IBM (1984)
[Figure: the wizard; what the user sees]
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  - “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  - interprets subject’s input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions of the sample means X1 and X2 with the critical value t(p,df) marked, showing either two regions or one region for rejecting the null hypothesis]
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
• note df computation
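The steps for calculating t can be sketched as follows, assuming the independent-samples variant with a pooled (combined) variance and df = n1 + n2 - 2. The function name and the two samples are made up for illustration.

```python
import math


def independent_t(a, b):
    """Student's t for two independent samples with pooled variance.

    Returns (t, df).
    """
    n1, n2 = len(a), len(b)
    mean1, mean2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - mean1) ** 2 for x in a)
    ss2 = sum((x - mean2) ** 2 for x in b)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                         # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))   # standard error of difference
    return (mean1 - mean2) / sed, df


control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]
t, df = independent_t(control, treatment)
print(t, df)
```

Here |t| = 2.0 with df = 8, which is below the two-tailed critical value of about 2.306 at α = 0.05, so the null hypothesis would not be rejected for these made-up samples.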
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g. httpwwwmedcalcbemanualmpage13-04bhtml, httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on “student’s t-
[Table: critical values of t; column headers are two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
Integration and
I&I: Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Spectrum of Gesture-to-Speech Mappings
[Figure: mapping types arranged along an axis of approximate time per gesture for connected speech (msec), with tick values 10-30, 100, 130, 200, 500]
• Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
• Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - XY position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network
Glove-TalkII System
[Block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech]
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel
A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
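The behaviour of normalized RBF units can be sketched numerically: Gaussian activations are divided by their sum, so the unit closest to the input dominates and inputs between centres are interpolated. The centres, width and output weights below are invented for illustration and are not the Glove-TalkII parameters.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations of each unit, normalized to sum to 1."""
    acts = [math.exp(-sum((a - c) ** 2 for a, c in zip(x, centre)) / (2.0 * width ** 2))
            for centre in centres]
    total = sum(acts)
    if total == 0.0:  # input far from every centre: fall back to a uniform response
        return [1.0 / len(centres)] * len(centres)
    return [a / total for a in acts]


def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations; weights[j] is unit j's output vector."""
    h = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h[j] * weights[j][o] for j in range(len(centres))) for o in range(n_out)]


centres = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical hand positions for two vowels
weights = [[100.0], [200.0]]         # hypothetical output parameter per centre
print(rbf_output((0.5, 0.0), centres, 0.15, weights))
```

A smaller width sharpens the transition between centres and widens the plateau around each one; a larger width gives smoother interpolation.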
Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
Glove-TalkII
DIVA in the hands of
Magic Eyes
Leveraging traditional
Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina
Percussion Robots
Tele-operation
Drum sound classification
Self-calibration and mapping based on listening
Physical Modeling
System Architecture
Feedback Loop
Playing a piece
Tuesday 22 October 13
Summary
158
Covered many aspectsbull Sensors and Actuators bull Signals and Features bull Uncertainty and time bull Human Computer Interaction bull Integration and
implementation bull Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Accelerometers
33
Tuesday 22 October 13
ICASSP 2013 tutorial
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
34
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer (3-axis)
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
50
Sensors and Actuators
Time Time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
54
Signals and Features
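The variants listed on the Peak Detection slide can be combined in one small routine. The following is a minimal sketch, not the tutorial's implementation: the function name `find_peaks`, its parameters and the toy signal are assumptions made for illustration. It applies a fixed threshold, a local-maximum test, a minimum-spacing constraint and three-point parabolic interpolation, and returns peak locations and amplitudes.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (location, amplitude) pairs for the peaks in sequence x."""
    # Fixed threshold + local maximum test.
    candidates = [
        i for i in range(1, len(x) - 1)
        if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]
    # Spacing constraint: visit the largest candidates first and reject any
    # candidate closer than min_spacing samples to an accepted peak.
    accepted = []
    for i in sorted(candidates, key=lambda i: x[i], reverse=True):
        if all(abs(i - j) >= min_spacing for j in accepted):
            accepted.append(i)
    peaks = []
    for i in sorted(accepted):
        # Interpolation: fit a parabola through the maximum and its neighbours.
        a, b, c = x[i - 1], x[i], x[i + 1]
        denom = a - 2 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        peaks.append((i + offset, b - 0.25 * (a - c) * offset))
    return peaks


# Toy signal (made up for this example).
signal = [0, 1, 4, 1, 0, 2, 3, 2, 0, 0.5, 0]
peaks = find_peaks(signal, threshold=1.0, min_spacing=3)
```

An adaptive threshold could be obtained by replacing `threshold` with, for example, a running median of the signal; that choice is application specific.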
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
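The two window sizes on this slide make the time-frequency tradeoff concrete. A small sketch (the function name `stft_resolution` is an assumption) that computes the frame duration and the frequency-bin spacing of an N-sample analysis window:

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    ms, hz = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {ms:6.1f} ms per frame, {hz:6.2f} Hz per bin")
```

At 22050 Hz a 256-sample window spans about 11.6 ms with bins about 86 Hz apart, while a 4096-sample window spans about 186 ms with bins about 5.4 Hz apart: the longer window resolves frequency better and time worse.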
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
60
Signals and Features
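For sensor data the linear case mentioned on the slide is often enough. Below is a minimal sketch, assuming readings that arrive at uneven times and need to be placed on a uniform grid; the function name `resample_linear` and the sample values are made up for illustration.

```python
def resample_linear(times, values, rate_hz):
    """Linearly interpolate irregular samples onto a uniform time grid.

    times must be strictly increasing and contain at least two entries.
    """
    step = 1.0 / rate_hz
    out_times, out_values = [], []
    k = 0  # the current segment is [times[k], times[k + 1]]
    n = 0
    while times[0] + n * step <= times[-1]:
        t = times[0] + n * step
        # Advance to the segment that contains t.
        while times[k + 1] < t:
            k += 1
        frac = (t - times[k]) / (times[k + 1] - times[k])
        out_times.append(t)
        out_values.append(values[k] + frac * (values[k + 1] - values[k]))
        n += 1
    return out_times, out_values


# Hypothetical sensor readings with uneven timestamps (seconds).
times = [0.0, 0.3, 1.0]
values = [0.0, 3.0, 10.0]
grid, resampled = resample_linear(times, values, rate_hz=4)
```

The same loop structure works for quadratic interpolation by fitting through three neighbouring samples instead of two.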
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
62
Signals and Features
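Range calibration and scaling are usually chained, as the slide notes. The sketch below is illustrative only: the function names, the ADC counts and the frequency limits are assumptions, and the exponential knob-to-frequency curve is one common choice rather than something prescribed by the tutorial.

```python
def calibrate(readings):
    """Find the usable range of a sensor from readings taken while the
    performer sweeps it through its full travel."""
    return min(readings), max(readings)


def normalize(raw, lo, hi):
    """Map a raw reading to [0, 1], clamping values outside the range."""
    u = (raw - lo) / (hi - lo)
    return min(1.0, max(0.0, u))


def slider_to_gain(raw, lo, hi, exponent=2.0):
    """Power-law mapping from slider position to a linear gain in [0, 1]."""
    return normalize(raw, lo, hi) ** exponent


def knob_to_frequency(raw, lo, hi, f_min=55.0, f_max=3520.0):
    """Exponential mapping: equal knob movements give equal pitch intervals."""
    return f_min * (f_max / f_min) ** normalize(raw, lo, hi)


# Made-up ADC counts recorded during a calibration sweep.
lo, hi = calibrate([112, 530, 901, 340])
mid_frequency = knob_to_frequency((lo + hi) / 2, lo, hi)
```

With these assumed limits the middle of the knob lands on 440 Hz, three octaves above 55 Hz and three below 3520 Hz.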
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
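A correlation-based period estimate can be sketched in a few lines. This version uses the direct sum for clarity, which costs on the order of N times the number of lags; the efficient computation the slide refers to would use the FFT instead. The function names and the synthetic test signal are assumptions for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    centred = [v - mean for v in x]
    return [
        sum(centred[n] * centred[n + lag] for n in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Synthetic signal with a period of 25 samples.
signal = [math.sin(2 * math.pi * n / 25) for n in range(200)]
period = estimate_period(signal, min_lag=5, max_lag=60)
```

Cross-correlation follows the same pattern with two different input sequences; `min_lag` keeps the trivial maximum at lag 0 out of the search.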
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
[Figure: mel filterbank and MFCC pipeline. Mel scale: 13 linearly-spaced filters, 27 log-spaced filters. Filter labels: CF, CF-130, CF+130, and CF scaled by 1.0718. Pipeline: Mel-filtering -> Log -> DCT -> MFCCs]
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal, so the high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
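The eigenanalysis behind PCA can be illustrated without any numerical library. This is a sketch under stated assumptions: it uses the covariance matrix (the correlation matrix is the covariance matrix of standardized features), it finds only the first principal component, and it does so by power iteration rather than a full eigendecomposition. All names and the data are made up.

```python
import math


def covariance_matrix(data):
    """Return (covariance matrix, mean-centred data) for rows of features."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centred


def first_principal_component(cov, iterations=200):
    """Dominant eigenvector of cov, found by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Made-up 2-D feature vectors lying close to the line y = 2x.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8), (5, 10.1)]
cov, centred = covariance_matrix(data)
pc = first_principal_component(cov)
# Projecting onto the component reduces each 2-D vector to a single number.
scores = [sum(a * b for a, b in zip(row, pc)) for row in centred]
```

Stacking several such eigenvectors as rows gives the projection matrix mentioned on the slide.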
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
79
Uncertainty and Time
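Of the algorithms listed, K-nearest neighbours is the simplest to write down. A minimal sketch follows; the function name, the two-dimensional features and the "kick"/"snare" labels are invented for illustration and do not come from the tutorial's data.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote of its k nearest
    training vectors (Euclidean distance)."""
    neighbours = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Made-up two-dimensional feature vectors with ground truth labels.
train_x = [(0.2, 0.1), (0.3, 0.2), (0.25, 0.15),
           (0.8, 0.9), (0.7, 0.85), (0.9, 0.8)]
train_y = ["kick", "kick", "kick", "snare", "snare", "snare"]

prediction = knn_predict(train_x, train_y, (0.75, 0.8))
```

There is no training step: the labelled set is the model, which is what makes the method non-parametric.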
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
83
Uncertainty and Time
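K-means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below is a plain version of that loop (Lloyd's algorithm) with a deliberately naive initialization from the first K points; the function name and the data are assumptions for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points into k groups; returns (centroids, labels)."""
    # Naive initialization: use the first k points as starting centroids.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
              for p in points]
    return centroids, labels


# Made-up 2-D feature vectors forming two well-separated groups.
points = [(1, 1), (5, 5), (1.2, 0.8), (5.2, 4.9), (0.8, 1.1), (4.8, 5.1)]
centroids, labels = kmeans(points, k=2)
```

In practice the result depends on the initialization, so several random restarts (or a smarter seeding scheme) are commonly used.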
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
87
Uncertainty and Time
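Goodness of fit can be checked by hand for a one-variable linear regression. This is a minimal sketch with made-up data; `fit_line` and `r_squared` are names chosen for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up measurements that are close to y = 2x.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```

An R squared close to 1 means the fitted line explains almost all of the variance in the measurements.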
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies: Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late fusion are the extremes of a full spectrum of possibilities
[Figure: two parallel sensor pipelines, each with the stages Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
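Dynamic time warping aligns two sequences that follow the same shape at different speeds. The following is a textbook-style sketch of the dynamic programming recurrence for one-dimensional sequences, with absolute difference as the local cost; the function name is an assumption.

```python
def dtw_distance(a, b):
    """Cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances
                                     cost[i][j - 1],      # b advances
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# The same gesture played at two speeds aligns at zero cost.
same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])
shifted = dtw_distance([1, 2, 3], [2, 3, 4])
```

Note that this formulation needs both complete sequences, so it is not causal; real-time variants restrict the search to a window around the current position.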
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence; also yields the likelihood of the evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor readings, and update models
  - Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
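[Figure: hidden Markov model unrolled over time. Hidden state (single discrete variable) with transition probabilities P(state at t | state at t-1); observations with emission probabilities P(observation at t | state at t). The transition and emission probabilities together form the model.]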
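The filtering task for a hidden Markov model can be sketched as a predict-then-update loop over the observations. The two-state model below is a toy example with made-up probabilities; the function name and the state and observation meanings are assumptions.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for every time step.

    transition[i][j] = P(next state j | current state i)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the emission probability and renormalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


# Toy model: state 0 = "gesture active", state 1 = "idle";
# observation 0 = "sensor above threshold", 1 = "sensor below threshold".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
history = forward_filter(prior, transition, emission, [0, 0])
```

After two "above threshold" observations the belief in the active state rises from 0.5 to about 0.82 and then to about 0.88.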
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea (t -> t+1)
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = \mathcal{N}(a_y y + b_y,\ \sigma_y)(x)$
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
102
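The predict and update steps reduce to a few lines in the scalar case. The sketch below assumes the simplest possible model, a state that stays constant apart from Gaussian process noise (a random walk), so it is an illustration of the mechanics rather than the constant-velocity model mentioned on the slide. All names and noise values are made up.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0  # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed unchanged, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# A sensor that keeps reporting 5.0, filtered under two noise assumptions.
readings = [5.0] * 20
trusting = kalman_1d(readings, process_var=1e-4, measurement_var=0.1)
sceptical = kalman_1d(readings, process_var=1e-4, measurement_var=10.0)
```

The ratio of `process_var` to `measurement_var` is the trade-off between model reliability and observation reliability: the filter that trusts its sensor converges toward 5.0 much faster.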
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers: taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions, work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
Student t-test
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution curves marking the regions (two-tailed) or region (one-tailed) for rejecting the null hypothesis, with sample means X1, X2 and the critical value t(p, df)]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
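The slide's formulas did not survive extraction, so the three steps are sketched here using the standard independent-samples t-test with pooled variance, which is assumed to be what the slide showed. The function name and the sample data are made up.

```python
import math


def independent_t(sample_a, sample_b):
    """Return (t, df) for two independent samples, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in sample_a)
    ss_b = sum((v - mean_b) ** 2 for v in sample_b)
    # Degrees of freedom for two independent groups.
    df = n_a + n_b - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = (ss_a + ss_b) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_a - mean_b) / sed, df


# Made-up scores for a control group and a treatment group.
control = [1, 2, 3, 4, 5]
treatment = [2, 3, 4, 5, 6]
t, df = independent_t(control, treatment)
```

For this data t is -1 with 8 degrees of freedom. The two-tailed critical value at α = 0.05 for df = 8 is 2.306, so the null hypothesis cannot be rejected.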
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
123
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and
124
I&I Case studies
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
125
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
126
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 127
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: gesture-to-speech systems placed along an axis of approximate time per gesture for connected speech (msec)]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Axis values (msec): 10-30, 100, 130, 200, 500
Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
bull Radially centred activation unitsndash Gaussian
activationbull Weights are centre
ndash Normalized over all units in groupbull Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Unitsbull RBF is active as gesture nears centrebull Normalization ndash interpolation according to width
parameterndash Plateaus around nearest centrebull Closest RBF dominates
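A minimal Python sketch of the normalized RBF behaviour described on these slides: Gaussian units centred on their weights, with activations normalized over the group. The function names, centres, output weights, and the width of 0.15 are illustration values chosen here, not the trained Glove-TalkII network.

```python
import math

def normalized_rbf(x, centres, width):
    """Return Gaussian RBF activations for x, normalized to sum to 1 over all units."""
    raw = []
    for c in centres:
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        raw.append(math.exp(-d2 / (2.0 * width ** 2)))
    total = sum(raw)
    if total == 0.0:  # input is numerically far from every centre
        return [1.0 / len(centres)] * len(centres)
    return [r / total for r in raw]

def rbf_output(x, centres, width, out_weights):
    """Weighted sum of normalized activations; out_weights[j] is the output vector of unit j."""
    h = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(h[j] * out_weights[j][k] for j in range(len(h))) for k in range(n_out)]

# Hypothetical two-unit example: each centre "owns" one output value.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[300.0], [700.0]]
near_first = rbf_output((0.1, 0.0), centres, 0.15, weights)   # plateau near centre 0
halfway = rbf_output((0.5, 0.0), centres, 0.15, weights)      # equal blend of both units
```

With a small width the output stays close to the nearest centre's value over a wide region and only interpolates near the midpoint, which is the plateau effect the slide mentions.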
Consonant Network
• 10-14-9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user
• 3 neural nets
• Output: parallel formant speech synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]
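As a rough illustration of the 6-bit quantization, the sketch below maps normalized values to integer codes in [0, 63]. The parameter names come from the slide; the assumption that inputs are normalized to [0.0, 1.0], and the function names, are hypothetical.

```python
SYNTH_PARAMS = ["ALF", "F1", "A1", "F2", "A2", "F3", "A3", "AHF", "V", "F0"]

def quantize_6bit(value):
    """Map a value in [0.0, 1.0] to an integer code in [0, 63], clamping out-of-range input."""
    clamped = min(1.0, max(0.0, value))
    return int(round(clamped * 63))

def make_frame(normalized):
    """Build one synthesizer frame from a dict of normalized parameter values."""
    return {name: quantize_6bit(normalized[name]) for name in SYNTH_PARAMS}

# One frame would be produced every 10 ms at the 100 Hz rate given on the slide.
frame = make_frame({name: 1.0 for name in SYNTH_PARAMS})
```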
Glove-TalkII
DIVA in the hands of
Magic Eyes
Leveraging traditional
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Percussion Robots
Tele-operation
Drum sound classification
Self-calibration and mapping based on listening
Physical Modeling
System Architecture
Feedback Loop
Playing a piece
Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal
Microphones and Microphone Arrays
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
CCD & CMOS Camera
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution
Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• Face recognition
• RGB camera
  - 30 Hz
• Depth sensor
  - infrared projection + camera
• Microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
Stepper and Servo Motors
• Stepper motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold; no feedback circuitry required
  - Low cost
• Servo motor
  - Motor coupled with sensor for feedback
  - High-performance alternative to stepper motors
  - Higher cost
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - Input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port
DAQ
• Use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
Tooka a simple example (Fels et al
Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]
2D/3D/ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Sensors <-> DSP <->
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
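One possible way to combine several of the variants listed for peak detection (fixed threshold, local maximum, spacing constraint, interpolation) is sketched below in Python. The function name, the parabolic interpolation, and the "strongest peak wins" spacing rule are choices made for this illustration, not a method prescribed by the tutorial.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return sorted (location, amplitude) pairs for local maxima above a fixed threshold.

    Locations are fractional sample indices refined by parabolic interpolation.
    A peak closer than min_spacing samples to a stronger kept peak is discarded.
    """
    candidates = []
    for i in range(1, len(signal) - 1):
        left, mid, right = signal[i - 1], signal[i], signal[i + 1]
        if mid >= threshold and mid > left and mid >= right:
            denom = left - 2.0 * mid + right      # strictly negative at a local maximum
            offset = 0.5 * (left - right) / denom
            amplitude = mid - 0.25 * (left - right) * offset
            candidates.append((i + offset, amplitude))
    kept = []
    for loc, amp in sorted(candidates, key=lambda p: -p[1]):  # strongest first
        if all(abs(loc - k[0]) >= min_spacing for k in kept):
            kept.append((loc, amp))
    return sorted(kept)

peaks = find_peaks([0.0, 1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 3.0, 0.0], threshold=0.4, min_spacing=4)
```

An adaptive threshold could replace the fixed one by computing `threshold` from a running mean or median of recent samples.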
Fourier Transform
Spectrum
Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Spectrogram
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-frequency tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
Wavelets
• STFT: fixed time-frequency resolution based on window size
• DWT: adaptive time-frequency resolution
Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
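For sensor data, the linear case can be sketched in a few lines of Python. The function name and the hold-at-the-ends behaviour outside the sampled interval are assumptions made for this example.

```python
from bisect import bisect_right

def resample_linear(times, values, new_times):
    """Linearly interpolate samples (times strictly increasing) at arbitrary new_times.

    Outside the sampled interval the first or last value is held.
    """
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            k = bisect_right(times, t)            # times[k - 1] <= t < times[k]
            t0, t1 = times[k - 1], times[k]
            frac = (t - t0) / (t1 - t0)
            out.append(values[k - 1] + frac * (values[k] - values[k - 1]))
    return out

resampled = resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], [0.5, 2.0, 5.0, -1.0])
```

The same routine can bring two sensor streams with different rates onto a common time grid before they are combined.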
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
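A small Python sketch of scaling used together with calibration: the calibrated minimum and maximum normalize the reading, and a power-law curve maps it to the control range. The function name, argument names, and example ranges are hypothetical.

```python
def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a sensor reading from its calibrated range to a control range with a power-law curve."""
    if in_max == in_min:
        raise ValueError("calibration range is empty")
    x = (reading - in_min) / (in_max - in_min)
    x = min(1.0, max(0.0, x))                     # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * (x ** exponent)

# A 10-bit knob reading mapped linearly to volume and with a squared curve to frequency in Hz.
volume = scale(512, 0, 1024, 0.0, 1.0)
frequency = scale(512, 0, 1024, 100.0, 1000.0, exponent=2.0)
```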
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based
Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
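A correlation-based periodicity estimate can be sketched in Python as follows. This uses the direct sum rather than an FFT-based computation (which is presumably what the power-of-2 remark refers to), and the function names and lag-search range are choices made for this example.

```python
import math

def autocorrelation(x, max_lag):
    """Direct autocorrelation of a mean-removed sequence for lags 0..max_lag."""
    mean = sum(x) / len(x)
    centred = [v - mean for v in x]
    return [sum(centred[n] * centred[n + lag] for n in range(len(x) - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation value."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sinusoid that repeats every 20 samples.
test_signal = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
period = estimate_period(test_signal, 5, 50)
```

Cross-correlation follows the same pattern with two different sequences in the inner product.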
Similarity Matrix
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top
NIME Object tracking
Feature Extraction: Audio
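Mel Frequency Cepstral Coefficients
[Figure: MFCC pipeline: Mel-filtering -> Log -> DCT -> MFCCs. Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters. Filter edge labels around centre frequency CF: CF-130 and CF/1.0718 below, CF+130 and CF×1.0718 above]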
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
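Energy compaction can be seen with a direct DCT-II in Python. This sketch uses the unnormalized DCT-II definition and a made-up smooth test signal; `energy_fraction` measures squared coefficient magnitude, which for this unnormalized form is only a proxy for signal energy.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(x)
    return [sum(x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k)
                for n in range(n_samples))
            for k in range(n_samples)]

def energy_fraction(coeffs, keep):
    """Fraction of total squared coefficient magnitude held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total

ramp = [n / 15.0 for n in range(16)]              # smooth 16-sample test signal
compaction = energy_fraction(dct2(ramp), 4)       # share held by the 4 lowest coefficients
```

For smooth inputs like this ramp nearly all of the magnitude sits in the first few coefficients, which is why truncating to a small number of low coefficients loses little.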
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
Reduced to 256 colors
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
Principal Component Analysis
Projection matrix
PCA: eigenanalysis of correlation matrix
Self-Organizing Maps
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels
Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest neighbors
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised
Clustering Algorithms
• K-means and variants
• Distribution-based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin index
  - Dunn index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard index
  - Confusion matrix
• Various types of user studies
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees
Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", A. Tindale, A. Kapur, G. Tzanetakis, IEEE Trans. on Multimedia, 2011
Surrogate Sensing and the Ground Truth problem
Two case studies
• Regression
• Classification
Some Results
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains: Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]
Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
Dynamic Time Warping
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
Hidden Markov Models I
[Figure: hidden Markov model. Model: hidden state (single discrete variable) with transition probabilities P(state at t | state at t-1); observations with emission probabilities p(observation at t | state at t)]
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state
Kalman Filter
• Linear Gaussian (LG) conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
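The predict/update cycle can be illustrated with a scalar (one-dimensional) Kalman filter in Python. This is a simplified sketch: the state is a single number, so the covariance matrix reduces to one variance, and the function and parameter names are chosen for this example.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var, velocity=0.0, dt=1.0):
    """One predict/update cycle of a scalar Kalman filter; returns the new (mean, variance)."""
    # Prediction step: advance the estimate with the model and grow its uncertainty.
    pred_mean = mean + velocity * dt
    pred_var = var + process_var
    # Update step: the gain weighs model reliability against observation reliability.
    gain = pred_var / (pred_var + measurement_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var

# Equal trust in model and sensor: the estimate lands halfway between them.
balanced = kalman_step(0.0, 1.0, 2.0, process_var=0.0, measurement_var=1.0)
# A very reliable sensor: the estimate follows the measurement almost exactly.
trusting = kalman_step(0.0, 1.0, 2.0, process_var=0.0, measurement_var=1e-9)
```

Running `kalman_step` once per incoming sample, feeding back the returned mean and variance, gives the t -> t+1 recursion outlined on the slide.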
Multimodal tempo detection for the E-sitar
Beat tracking
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  - detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers
ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
participatory design
• users (often musicians) become 1st-class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: what the user sees; the wizard]
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored
Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test
Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance, p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions with critical value t(p,df), showing the regions for rejecting the null hypothesis (two-tailed) or the region for rejecting the null hypothesis (one-tailed)]
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
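The formulas for these steps did not survive in the transcript. As a stand-in, the Python sketch below uses the standard pooled-variance form of the independent two-sample t-test (equal variances assumed); the function name and the sample data are made up for illustration.

```python
import math

def pooled_t(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((v - mean_a) ** 2 for v in sample_a)   # sums of squared deviations
    ss_b = sum((v - mean_b) ** 2 for v in sample_b)
    df = n_a + n_b - 2
    combined_var = (ss_a + ss_b) / df                 # combined (pooled) variance
    sed = math.sqrt(combined_var * (1.0 / n_a + 1.0 / n_b))  # standard error of difference
    return (mean_a - mean_b) / sed, df

t_value, df = pooled_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
```

The magnitude of `t_value` is then compared with the critical value for the chosen significance and `df`; for a two-tailed test at α = 0.05 with 8 degrees of freedom the tabulated critical value is 2.306.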
comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-
[Table residue: two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
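The "divide two variances" idea can be sketched for the one-way case in Python. The function name and the sample groups are hypothetical; the resulting F ratio would still be compared against a table (or converted to a p-value) to decide significance.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over a list of samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df_b, df_w = one_way_anova_f(groups)
# Rescaling and offsetting every observation leaves the F ratio unchanged.
f_scaled, _, _ = one_way_anova_f([[10.0 * v + 5.0 for v in g] for g in groups])
```

With exactly two groups the F ratio equals the square of the pooled-variance t statistic, which is the sense in which ANOVA generalizes the t-test.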
Integration and
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum
Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal
Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (msec): Artificial Vocal Tract (10-30), Phoneme Generator (100), Finger Spelling (130), Syllable Generator (200), Word Generator (500). Systems marked along the axis: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
– open configuration of hand represents open vocal tract
– X, Y position determines vowel sounds (like tongue)
• Consonants
– constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• 26+ dimensions
• constrained subspace
• 10 dimensions
Input / Output
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
– Mixture of experts model
  • Vowel/Consonant decision network
  • Vowel Expert network
  • Consonant Expert network
Tuesday 22 October 13
134
Glove-TalkII System
[Diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Processing blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function. Output: Synthesizer, Speech Output.]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
– Input: 10 flex angles
  • 4x2 finger joints
  • Thumb abduction and thumb rotation
– Output
  • Probability of vowel
– Training
  • 2600 consonants, 700 vowels
  • 0 error
– Testing
  • 1380 consonants, 234 vowels
  • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
– Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
– Fixed std deviation (015)
– Input: x, y values
– Output: 8 formant parameters
• Training
– 100 examples of each vowel with random noise added
– 01 error
• Testing
– 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
– Gaussian activation
  • Weights are centre
– Normalized over all units in group
  • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
– interpolation according to width parameter
– Plateaus around nearest centre
  • Closest RBF dominates
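The behaviour described above (Gaussian units, normalization over the group, the closest centre dominating) can be sketched as follows. This is a minimal illustration, not the Glove-TalkII implementation: the centres, the width value and the output weights are hypothetical, and the function names are invented for this example.

```python
import math

def normalized_rbf(x, centres, width):
    """Normalized Gaussian activations of input x, one per centre."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    nearest = min(d2)
    # Subtracting the smallest distance avoids underflow far from all centres;
    # it cancels out in the normalization below.
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, out_weights):
    """Output vector: normalized activations weighted by per-centre targets."""
    h = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(h[j] * out_weights[j][k] for j in range(len(h)))
            for k in range(n_out)]

# Hypothetical 2-D gesture centres and one output parameter per centre.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
targets = [[1.0], [2.0], [3.0]]
print(rbf_output((0.5, 0.0), centres, 0.15, targets))  # halfway between two centres
```

With a small width the output plateaus at the target of the nearest centre; a larger width gives smoother interpolation between centres.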
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
– Input: finger and thumb angles
  • Note: added thumb abduction and rotation later
– Output: formant parameters and voicing
• Training
– 350 approximants, 1510 fricatives and 740 nasals
– 005 error
• Testing
– 255 approximants, 960 fricatives, 160 nasals
– 1 error
• Dependent on user
Tuesday 22 October 13
143
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
– ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
– 100 Hz, 6-bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
158
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
159
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13
Questions
160
www.nime.org
Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
Microphones and Microphone Arrays
35
Sensors and Actuators
• Carbon
– lightly packed carbon
– compression increases conduction
– needs power supply
• Capacitor (condenser)
– capacitor between a stationary metal plate and a light metallic diaphragm
– compression changes capacitance by moving diaphragm
– needs power supply
• Electret and Piezoelectric
– mentioned before
– no external power needed
• Magnetic (moving coil)
– induction: moving conductor in magnetic field
– diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD amp CMOS Camera
36
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
– about 3 transistors (photodiode version)
– 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
– only viable recently as sub-micron tech gets better
– only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
37
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
38
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
– head, arms and feet
– body geometry
– 20 joints per person
• face recognition
• RGB camera
– 30 Hz
• depth sensor
– Infrared projection + camera
• microphone array
– directional sound localization, speech recognition and noise cancelation
• Cheap
Tuesday 22 October 13
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
39
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
40
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
– i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
41
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper Motor
– Brushless DC that divides rotation into equal steps
– Move and hold, no feedback circuitry required
– Low cost
• Servo Motor
– Motor coupled with sensor for feedback
– High performance alternative to stepper motors
– Higher cost
42
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
43
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)
44
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
– GPS
– Accelerometer
– Gyro
– Temperature
– Microphone
– Capacitive display
– Networking
– input port for more
• Actuators
– Speaker
– Headphones
– Vibrator
– High-res display
– Networking
– Output port
45
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
– e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
46
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial 48
Tooka a simple example (Fels et al
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
49
Sensors and Actuators
Time
Time
Multiple channels (for example microphone arrays)
Asynchronous Events
Synchronous Samples
Tuesday 22 October 13
ICASSP 2013 tutorial
2D3D ND + time
50
Sensors and Actuators
Time Time
Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
51
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
52
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
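As a concrete instance of a digital filter operating on samples, here is a minimal sketch of a one-pole low-pass filter, a common choice for smoothing sensor streams. The function name and the coefficient value are assumptions made for illustration; the slides do not prescribe a particular filter.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence with y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha in (0, 1]: smaller values attenuate high frequencies more strongly.
    The filter state starts at the first sample to avoid a start-up transient.
    """
    out = []
    prev = samples[0] if samples else 0.0
    for x in samples:
        prev = alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out

# A rapidly alternating (high-frequency) input is strongly attenuated,
# while a constant (zero-frequency) input passes through unchanged.
jittery = [1.0 if n % 2 == 0 else -1.0 for n in range(50)]
print(one_pole_lowpass(jittery, 0.2)[-1])
print(one_pole_lowpass([2.0] * 5, 0.2)[-1])
```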
53
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
– Fixed and adaptive threshold
– Local maximum
– Spacing constraints
– Interpolation
– Output is peak locations and amplitudes
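The variants listed above can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing, and parabolic interpolation around each peak; it is one simple combination among many, and the function name and parameters are illustrative assumptions.

```python
def find_peaks(x, threshold, min_spacing):
    """Return (location, amplitude) pairs for local maxima above threshold.

    Peaks closer than min_spacing samples to the previously accepted peak are
    skipped (the earlier peak wins, which is a simplification). Location and
    amplitude are refined by fitting a parabola through the three samples
    around each maximum.
    """
    peaks = []
    for n in range(1, len(x) - 1):
        is_local_max = x[n] > x[n - 1] and x[n] >= x[n + 1]
        if not (is_local_max and x[n] > threshold):
            continue
        if peaks and n - peaks[-1][0] < min_spacing:
            continue
        a, b, c = x[n - 1], x[n], x[n + 1]
        denom = a - 2.0 * b + c          # negative at a strict local maximum
        offset = 0.5 * (a - c) / denom   # fractional shift in (-0.5, 0.5]
        amplitude = b - 0.25 * (a - c) * offset
        peaks.append((n + offset, amplitude))
    return peaks

print(find_peaks([0.0, 2.0, 3.0, 2.5, 0.0, 0.2, 0.1], threshold=0.5, min_spacing=2))
```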
54
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for powers of 2 window sizes
FFT: the fast Fourier Transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples 22050 Hz
4096 samples 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
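The time-frequency tradeoff for the two window sizes shown (256 and 4096 samples at 22050 Hz) follows from simple arithmetic. The helper below is a hypothetical name used only for this worked example.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz

for size in (256, 4096):
    ms, hz = stft_resolution(size, 22050)
    print(size, round(ms, 1), round(hz, 1))
```

The short window spans about 11.6 ms with bins about 86.1 Hz apart; the long window spans about 185.8 ms with bins about 5.4 Hz apart. Finer time resolution costs frequency resolution and vice versa.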
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
– Simplest approach: LPF
– Spectral subtraction using noise profile
– Median filtering in time-frequency plane
• Image
– Challenge: preserve edge discontinuities
– Gaussian filtering 2D
– Anisotropic diffusion: preserves edges
– Median filtering
– Frequently done in wavelet domain
59
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
– L/M: interpolation (L) followed by decimation (M)
– Band-limited interpolation at arbitrary points
– Sinc interpolation results in perfect reconstruction for band-limited continuous signals
– Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
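For sensor data, the linear case mentioned in the last bullet can be sketched as follows: irregularly timed readings are resampled at new time points by interpolating between the two neighbouring samples. The function name and the example timestamps are assumptions for illustration.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each point of new_times.

    Assumes times is strictly increasing with at least two entries, and that
    new_times is sorted and lies within [times[0], times[-1]].
    """
    out = []
    i = 0
    for t in new_times:
        # Advance to the segment [times[i], times[i + 1]] that contains t.
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

# Sensor readings with uneven timestamps, resampled on a regular grid.
print(resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], [0.0, 1.0, 2.0, 3.0]))
```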
60
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
61
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
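Calibration and scaling are often chained: first find the sensor's actual range, then map readings to the control parameter with a power-law curve. The sketch below assumes a hypothetical knob with raw readings and a target frequency range of 100 to 1000 Hz; none of these values come from the slides.

```python
def calibrate(readings):
    """Return the (minimum, maximum) observed during a calibration pass."""
    return min(readings), max(readings)

def scale(reading, lo, hi, out_lo, out_hi, exponent=1.0):
    """Map a raw reading in [lo, hi] to [out_lo, out_hi] with a power law.

    exponent = 1 is a linear mapping; exponent > 1 gives finer control
    near the bottom of the output range.
    """
    norm = (reading - lo) / (hi - lo)
    norm = min(1.0, max(0.0, norm))   # clamp readings outside the calibrated range
    return out_lo + (out_hi - out_lo) * norm ** exponent

lo, hi = calibrate([112, 300, 870, 512])       # hypothetical raw knob values
print(scale(491, lo, hi, 100.0, 1000.0))       # linear mapping
print(scale(491, lo, hi, 100.0, 1000.0, 2.0))  # power-law mapping
```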
62
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
– Correlation based
63
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
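A minimal sketch of correlation-based periodicity detection: compute the autocorrelation directly and pick the lag with the largest value inside a search range. This is the straightforward O(N x lags) form, not the FFT-based computation alluded to above; function names are illustrative assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sinusoid repeating every 20 samples.
signal = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
print(estimate_period(signal, 5, 30))
```

The lower bound on the lag excludes the trivial maximum at lag 0.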
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
65
Signals and Features
Efficient computation when N is a power of 2
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
– K-means
– Compression-based
– Histogram-based
– Edge Detection
67
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
– Target representation and location (bottom up)
– Target filtering and data association (top
68
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: filter centre frequencies. Linear spacing: CF-130, CF, CF+130. Log spacing: CF/1.0718, CF, CF x 1.0718]
Mel-filtering -> Log -> DCT -> MFCCs
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
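Energy compaction can be checked numerically with a direct (unnormalized) DCT-II. The sketch below is an O(N^2) illustration, not the fast transform used in practice, and the smooth test signal is an arbitrary choice.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

def low_coefficient_energy(x, keep):
    """Fraction of squared coefficient magnitude held by the first `keep` terms."""
    coeffs = dct2(x)
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total

ramp = [float(i) for i in range(16)]      # a smooth, slowly varying signal
print(low_coefficient_energy(ramp, 4))    # close to 1: a few coefficients dominate
```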
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
73
Signals and Features
Reduced to 256 colors
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
– Mean, Median, Mean, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
74
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: Eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels
78
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
– Generative approaches
  • Naïve Bayes
  • Gaussian Mixture Models
– Discriminative approaches
  • Support Vector Machines
  • Decision trees
• Non-parametric
– K-nearest Neighbors
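Of the algorithms listed, K-nearest neighbors is the shortest to write out. The sketch below uses squared Euclidean distance and majority voting; the toy feature vectors and labels are made up for illustration.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training vectors (squared Euclidean distance)."""
    scored = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in scored[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors for two gesture classes.
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["tap", "tap", "tap", "strum", "strum", "strum"]
print(knn_predict(features, labels, (0.2, 0.1)))
```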
79
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
81
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
– Hierarchical
– Semi-supervised
82
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution based
– EM algorithm
• Density-based (areas of high density are clusters, noise and border points ignored)
– DBSCAN
• Graph-based
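A minimal K-means sketch (Lloyd's iterations) for the first item above. Initial centroids are passed in explicitly to keep the example deterministic; in practice they are usually chosen at random or by a seeding heuristic. The data points are invented.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and moving
    each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(dim) / len(cl) for dim in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)   # empty clusters keep their centroid
        ]
    return centroids, clusters

pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 10.0), (10.0, 12.0)]
print(kmeans(pts, [(0.0, 0.0), (10.0, 10.0)]))
```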
83
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
– Davies-Bouldin Index
– Dunn Index
• External evaluation (additional ground truth labels)
– similar to classification
– F-measure
– Rand measure
– Jaccard Index
– Confusion matrix
• Various types of user studies
85
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity predict temperature
• Parametric
– Linear regression
– Ordinary least squares
• Non-parametric
– Kernel Regression
– Regression Trees
86
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
– Coefficient of determination R squared (in linear regression, the square of the correlation coefficient between true and predicted)
• Statistical significance of results
– Parametric methods
– F-test
– t-test for individual parameters
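Ordinary least squares for a single feature, together with the coefficient of determination, fits in a few lines. This is a sketch for the one-dimensional case only; the function names and data are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs, ys = [0.0, 1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 4.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```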
87
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
93
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
94
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Diagram: two parallel processing chains, each Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done with some delay at the same time as the sensor data
96
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
– Observable
– Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history, typically just previous time slice)
• Transition Model: P(current state | previous state)
98
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
– Estimate what transitions occurred and what states generated the sensor reading and update models
– Updated models provide new estimates and
99
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
100
Uncertainty and Time
[Diagram: hidden state (single discrete variable) at time steps 1 to 4, each emitting an observation. The model consists of transition probabilities P(state t | state t-1) and emission probabilities p(observation t | state t).]
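The filtering task for a discrete hidden Markov model can be sketched with the forward recursion: predict with the transition probabilities, weight by the emission probability of the new observation, and renormalize. The two-state model below is a made-up example; the matrices are not from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for every time step.

    transition[i][j] = P(next state j | current state i)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history

# Hypothetical two-state model: states 0/1, observation symbols 0/1.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
for step in forward_filter(prior, transition, emission, [0, 0, 1]):
    print([round(p, 3) for p in step])
```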
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
– Prior knowledge of state
– Prediction step (based on some model)
– Update step (compare prediction to measurements)
– Readjust model
– Output estimate of state
• Statistically optimal estimate of system state
101
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
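The predict/update cycle and the trade-off between observation and model reliability are easiest to see in one dimension. The sketch below assumes the simplest possible model (the state stays constant apart from process noise), which is a simplification of the constant dx/dt model mentioned above; q, r and the measurements are hypothetical values.

```python
def kalman_1d(measurements, x0, p0, q, r):
    """Scalar Kalman filter for a (nearly) constant state.

    x0, p0: initial state estimate and its variance
    q: process noise variance (how much we distrust the model)
    r: measurement noise variance (how much we distrust the sensor)
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                # prediction: state unchanged, uncertainty grows
        k = p / (p + r)          # Kalman gain: weight given to the measurement
        x = x + k * (z - x)      # update: move towards the measurement
        p = (1.0 - k) * p        # uncertainty shrinks after the update
        estimates.append(x)
    return estimates

noisy = [5.3, 4.8, 5.1, 4.9, 5.2, 5.0, 4.7, 5.1]   # hypothetical sensor readings
print(kalman_1d(noisy, x0=0.0, p0=1.0, q=1e-4, r=0.1))
```

A large r relative to q makes the filter trust the model and smooth heavily; a small r makes it follow the measurements closely.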
102
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming
106
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing
107
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for and implications of introducing technology
• ethnographic studies most often provide warnings
– detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
– comprehensive understanding of current (work) practices
– greater ability to predict the impact of a new or re-designed technology
– possibly greater buy-in for the system
• down side
– principal cost is time, both the ethnographer's and the users'
– could perpetuate negative aspects of current practices
– can produce vast (unmanageable) amount of data
– output is description of practices rather than specific designs
  • ethnographers are not trained as designers, taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
– active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
Tuesday 22 October 13
ICASSP 2013 tutorial 111
participatory design
• up side
– users are excellent at reacting to suggested system designs
  • designs must be concrete and visible
– users bring in important "folk" knowledge of work context
  • knowledge may be otherwise inaccessible to design team
– greater buy-in for the system often results
• down side
– hard to get a good pool of end users
  • expensive, reluctant
– users are not expert designers
  • don't expect them to come up with design ideas from scratch
– the user is not always right
  • don't expect them to know what they want
– conservative bias to perpetuate current practices
  • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
– the voice editor by IBM (1984)
[Figure panels: The Wizard; What the user sees]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
– "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
– interprets subject's input according to an algorithm
– has computer/screen behave in appropriate manner
• good for
– adding simulated and complex vertical functionality
– testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
114
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
– Assume normality for relevant distributions, work in parameter space (means and variances)
– Student t-test and ANOVA
• Non-parametric (no normality assumption)
– Kruskal-Wallis
– Friedman test
115
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of same population
– p-value: probability of observed arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
– Control group and treatment group (n = participants in each group)
– Same group before and after treatment
– Assumptions: sample size, variance
  • Equal sample size and equal variance
• One and two-tailed variants for p-value, which is determined from t
116
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 117
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
1. compute df
2. choose desired significance p (aka α)
3. calculate value of the t statistic
4. compare it to the critical value of t given p, df: t(p,df)
5. if t > t(p,df), can reject null hypothesis at
Tuesday 22 October 13
ICASSP 2013 tutorial 118
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with the critical value t(p,df) marked, showing either two regions for rejecting the null hypothesis or a single region for rejecting the null hypothesis]
Tuesday 22 October 13
ICASSP 2013 tutorial 119
calculating t
• compute combined variance for the two samples
• compute standard error of difference sed
• compute t
note: df computation
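The formulas for this slide did not survive extraction. As a sketch of the three steps, the code below uses the standard independent-samples form with pooled variance, which assumes equal variances in the two groups; this may differ in detail from the version shown in the original slide, and the sample data are invented.

```python
import math

def independent_t(sample1, sample2):
    """Student's t for two independent samples with pooled (combined) variance.

    Returns (t, df), where df = n1 + n2 - 2.
    """
    n1, n2 = len(sample1), len(sample2)
    mean1, mean2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((v - mean1) ** 2 for v in sample1)
    ss2 = sum((v - mean2) ** 2 for v in sample2)
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df
    # standard error of the difference between the two means
    sed = math.sqrt(pooled_variance * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / sed, df

control = [1.0, 2.0, 3.0, 4.0, 5.0]      # hypothetical task scores
treatment = [2.0, 3.0, 4.0, 5.0, 6.0]
print(independent_t(control, treatment))
```

For these samples |t| = 1 with df = 8, which is below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would not be rejected.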
Tuesday 22 October 13
ICASSP 2013 tutorial 120
comparing t with critical
• look t(p,df) up in a pre-computed table
– in back of any statistics textbook
– on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
– Matlab or Excel functions
– web calculators (e.g. search on "student's t-
Tuesday 22 October 13
ICASSP 2013 tutorial 121
two-tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
122
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
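The "dividing two variances" step can be sketched for the one-way case as follows. This is a minimal illustration, not the tutorial's own code. The function name and the data are hypothetical, and looking the ratio up in an F table is left to the reader.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the samples around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical task-completion scores for three interface designs.
f, dfb, dfw = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({dfb},{dfw}) = {f:.2f}")
```

Adding a constant to every sample, or multiplying every sample by the same factor, leaves F unchanged, which is the independence of scaling and bias mentioned above.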
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Integration and implementation
I&I, Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Tuesday 22 October 13
ICASSP 2013 tutorial
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
Tuesday 22 October 13
ICASSP 2013 tutorial
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension volume 9 issue 6 1977
Canada Science and Technology Museum
Tuesday 22 October 13
ICASSP 2013 tutorial 128
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Tuesday 22 October 13
ICASSP 2013 tutorial 129
Spectrum of Gesture-to-Speech Mappings
[Figure: gesture-to-speech mappings arranged by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]
Tuesday 22 October 13
ICASSP 2013 tutorial 130
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
Tuesday 22 October 13
ICASSP 2013 tutorial 131
GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions
Tuesday 22 October 13
ICASSP 2013 tutorial 132
GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others
Tuesday 22 October 13
ICASSP 2013 tutorial 133
GTII Neural Networks
• 3 neural networks used
  – Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network
Tuesday 22 October 13
Glove-TalkII System
[Block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Processing blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech]
Tuesday 22 October 13
ICASSP 2013 tutorial 136
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
Tuesday 22 October 13
ICASSP 2013 tutorial 138
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
Tuesday 22 October 13
ICASSP 2013 tutorial 139
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are the centres
  – Normalized over all units in group
    • Hidden units
Tuesday 22 October 13
ICASSP 2013 tutorial 140
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
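A minimal sketch of a normalized RBF layer, to make the "closest RBF dominates" behaviour concrete. The Gaussian form and the fixed width of 0.15 follow the slides. The centres, output weights and function names are invented for illustration and are not the Glove-TalkII parameters.

```python
import math


def normalized_rbf(x, centres, sigma):
    """Gaussian activation per centre, normalized so the activations sum to 1.

    Assumes x is close enough to at least one centre that the raw
    activations do not all underflow to zero.
    """
    raw = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * sigma ** 2))
        for c in centres
    ]
    total = sum(raw)
    return [r / total for r in raw]


def nrbf_output(x, centres, sigma, weights):
    """Linear output layer: activation-weighted mix of one weight row per hidden unit."""
    act = normalized_rbf(x, centres, sigma)
    n_out = len(weights[0])
    return [sum(a * w[j] for a, w in zip(act, weights)) for j in range(n_out)]


centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # hypothetical vowel centres in x,y
weights = [[0.0], [1.0], [0.5]]                   # hypothetical single output parameter
print(normalized_rbf((0.05, 0.0), centres, 0.15))  # first unit dominates (plateau)
print(nrbf_output((0.5, 0.0), centres, 0.15, weights))  # halfway: interpolates
```

A smaller width gives flatter plateaus around each centre and sharper transitions between them. A larger width gives smoother interpolation.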
Tuesday 22 October 13
ICASSP 2013 tutorial 142
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Tuesday 22 October 13
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]
Tuesday 22 October 13
144
Glove-TalkII
Tuesday 22 October 13
DIVA in the hands of
145
Tuesday 22 October 13
Magic Eyes
Tuesday 22 October 13
Leveraging traditional
147
Tuesday 22 October 13
Phantom Faders
Use the actual acoustic instrument as a control surface inspired by Marimba Lumina
Tuesday 22 October 13
Phantom Faders
149
Tuesday 22 October 13
Percussion Robots
150
Tuesday 22 October 13
Tele-operation
151
Tuesday 22 October 13
Drum sound classification
152
Tuesday 22 October 13
Self-calibration and mapping based on listening
153
Tuesday 22 October 13
Physical Modeling
154
Tuesday 22 October 13
System Architecture
155
Tuesday 22 October 13
Feedback Loop
156
Tuesday 22 October 13
Playing a piece
157
Tuesday 22 October 13
Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca
Tuesday 22 October 13
ICASSP 2013 tutorial
CCD & CMOS Camera
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution
Tuesday 22 October 13
ICASSP 2013 tutorial
Depth Camera
Sensors and Actuators
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
ICASSP 2013 tutorial
Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Stepper and Servo Motors
• Stepper motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port
Tuesday 22 October 13
ICASSP 2013 tutorial
DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board
Tuesday 22 October 13
ICASSP 2013 tutorial
Tooka a simple example (Fels et al
47
Tuesday 22 October 13
ICASSP 2013 tutorial
Events and Time Series
Sensors and Actuators
[Figure: two timelines, one showing asynchronous events over time and one showing synchronous samples over time; multiple channels are possible (for example microphone arrays)]
Tuesday 22 October 13
ICASSP 2013 tutorial
2D, 3D, ND + time
Sensors and Actuators
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensors <-> DSP <->
Sensors and Actuators
Tuesday 22 October 13
ICASSP 2013 tutorial
Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
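As one concrete instance of a digital filter operating on samples, here is a sketch of a one-pole low-pass filter, a common choice for smoothing sensor readings. The function name and the value of `alpha` are illustrative assumptions. The slide does not prescribe a particular filter.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence with y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha is in (0, 1]; smaller values attenuate high frequencies more.
    The filter state starts at 0.
    """
    out = []
    y = 0.0
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


slow = one_pole_lowpass([1.0] * 50, 0.2)        # constant input passes through
fast = one_pole_lowpass([1.0, -1.0] * 50, 0.2)  # fastest alternation is attenuated
print(round(slow[-1], 3), round(max(abs(v) for v in fast[-10:]), 3))
```

Replacing the output `y` with `x - y` turns the same structure into a simple high-pass filter.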
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
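The variants listed above can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing, and parabolic interpolation around the peak. All names and the test signal are hypothetical, and real systems usually add their own tuning on top.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    If two candidates are closer than min_spacing samples, only the larger
    one is kept.
    """
    peaks = []
    for n in range(1, len(x) - 1):
        is_local_max = x[n] > x[n - 1] and x[n] >= x[n + 1]
        if x[n] > threshold and is_local_max:
            if peaks and n - peaks[-1][0] < min_spacing:
                if x[n] > peaks[-1][1]:
                    peaks[-1] = (n, x[n])
                continue
            peaks.append((n, x[n]))
    return peaks


def refine_parabolic(x, n):
    """Fit a parabola through x[n-1], x[n], x[n+1]; return (location, amplitude)."""
    a, b, c = x[n - 1], x[n], x[n + 1]
    denom = a - 2.0 * b + c
    if denom == 0:
        return float(n), b
    offset = 0.5 * (a - c) / denom
    return n + offset, b - 0.25 * (a - c) * offset


signal = [0, 1, 0, 0, 3, 2.9, 0, 5, 0]
print(find_peaks(signal, threshold=0.5))
print(refine_parabolic([0.0, 2.4375, 3.9375, 3.4375, 0.0], 2))
```

An adaptive threshold could be obtained by replacing `threshold` with a running statistic of the signal, for example a multiple of a local median.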
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Fourier Transform
55
Signals and Features
Spectrum
Tuesday 22 October 13
ICASSP 2013 tutorial
Short Time Fourier Transform
56
Signals and Features
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
Tuesday 22 October 13
ICASSP 2013 tutorial
Spectrogram
57
Signals and Features
256 samples at 22050 Hz
4096 samples at 22050 Hz
Time-frequency tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
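The tradeoff between the two window sizes can be worked out directly: a window of N samples at sample rate fs lasts N/fs seconds and gives FFT bins spaced fs/N Hz apart. The helper name below is made up for illustration.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, FFT bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    ms, hz = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {ms:6.1f} ms per window, {hz:6.2f} Hz per bin")
```

The short window localizes events in time to about 12 ms but smears frequency over roughly 86 Hz bins. The long window resolves about 5 Hz but averages over roughly 186 ms.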
Tuesday 22 October 13
ICASSP 2013 tutorial
Wavelets
58
Signals and Features
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution
Tuesday 22 October 13
ICASSP 2013 tutorial
Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
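For sensor data the linear case is often enough. The sketch below evaluates an irregularly timestamped sensor stream at an arbitrary time, which is the basic operation needed to bring streams with different rates onto a common clock. The function name and data are hypothetical.

```python
from bisect import bisect_right


def interpolate_linear(times, values, t):
    """Linearly interpolate a time series at time t.

    times must be strictly increasing; t outside the range is clamped
    to the first or last value.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)          # times[i-1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    w = (t - t0) / (t1 - t0)
    return (1.0 - w) * values[i - 1] + w * values[i]


timestamps = [0.0, 1.0, 3.0]   # seconds, irregular sensor reports
readings = [0.0, 10.0, 30.0]
print([interpolate_linear(timestamps, readings, k * 0.5) for k in range(7)])
```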
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency, or a slider to volume
• Representation dynamic range is different from the true dynamic range
  – Power law functions frequently used
• Frequently used in conjunction with calibration
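A small sketch of calibration and scaling used together: the calibrator learns the observed minimum and maximum of a raw sensor reading, and a power-law curve then maps the normalized value onto a control range. The class name, the exponent of 2 and the example ranges are assumptions for illustration only.

```python
class RangeCalibrator:
    """Tracks the observed range of a sensor and normalizes readings to [0, 1]."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, reading):
        self.lo = min(self.lo, reading)
        self.hi = max(self.hi, reading)

    def normalize(self, reading):
        if self.hi <= self.lo:      # not calibrated yet
            return 0.0
        u = (reading - self.lo) / (self.hi - self.lo)
        return min(1.0, max(0.0, u))


def power_law_scale(u, out_min, out_max, exponent=2.0):
    """Map u in [0, 1] onto [out_min, out_max] along a power-law curve."""
    return out_min + (out_max - out_min) * u ** exponent


cal = RangeCalibrator()
for raw in (100, 900, 430):          # e.g. sweep the slider during calibration
    cal.observe(raw)
volume = power_law_scale(cal.normalize(500), 0.0, 1.0)
print(volume)
```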
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Autocorrelation and Cross-Correlation
64
Signals and Features
Efficient computation when N is a power of 2
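The formula on this slide was lost in extraction. As a stand-in, the sketch below computes the autocorrelation directly in the time domain and picks the lag with the strongest peak as the period estimate. This direct form costs O(N x max_lag). The efficient version mentioned above would use the FFT instead. Function names and the test signal are illustrative.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    m = sum(x) / n
    xc = [v - m for v in x]
    return [sum(xc[i] * xc[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A sinusoid that repeats every 20 samples, e.g. a strumming gesture sensor.
signal = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
print(estimate_period(signal, min_lag=5, max_lag=50))
```

Cross-correlation follows the same pattern with two different signals, and is useful for estimating the delay between two sensors.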
Tuesday 22 October 13
ICASSP 2013 tutorial
Similarity Matrix
66
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
NIME Object tracking
69
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction Audio
70
Signals and Features
Tuesday 22 October 13
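Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter centred at CF; band edges at CF-130 and CF+130 for linearly-spaced filters, CF/1.0718 and CF×1.0718 for log-spaced filters]
Processing chain: Mel-filtering -> Log -> DCT -> MFCCs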
Tuesday 22 October 13
ICASSP 2013 tutorial
Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
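To illustrate energy compaction, the sketch below implements an unnormalized DCT-II and its inverse directly from the definitions, then reconstructs a signal from only its lowest coefficients. It is O(N^2) and meant for reading, not speed. The function names and the ramp signal are made up for the example.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def idct2(c):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(c)
    return [c[0] / n + (2.0 / n) * sum(c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                                         for k in range(1, n))
            for i in range(n)]


def keep_low(c, count):
    """Zero all but the first `count` coefficients."""
    return [v if k < count else 0.0 for k, v in enumerate(c)]


def squared_error(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))


ramp = [float(i) for i in range(8)]
coeffs = dct2(ramp)
for count in (2, 4, 8):
    approx = idct2(keep_low(coeffs, count))
    print(count, round(squared_error(ramp, approx), 4))
```

The reconstruction error shrinks as more low-order coefficients are kept, which is why MFCCs can discard the higher ones.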
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
Signals and Features
[Figure: example image reduced to 256 colors]
Tuesday 22 October 13
ICASSP 2013 tutorial
Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
Signals and Features
Tuesday 22 October 13
ICASSP 2013 tutorial
Principal Component Analysis
75
Signals and Features
Projection matrix
PCA: eigenanalysis of correlation matrix
Tuesday 22 October 13
ICASSP 2013 tutorial
Self-Organizing Maps
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest neighbors
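Of the algorithms listed, k-nearest neighbors is the easiest to show in full. The sketch below classifies a query feature vector by majority vote among its k closest training examples. The drum-hit labels and feature values are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train is a list of (feature_vector, label) pairs; returns the majority label."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features (e.g. spectral centroid, decay time) for drum hits.
training_set = [
    ((0.0, 0.0), "kick"), ((0.0, 1.0), "kick"), ((1.0, 0.0), "kick"),
    ((5.0, 5.0), "snare"), ((5.0, 6.0), "snare"), ((6.0, 5.0), "snare"),
]
print(knn_predict(training_set, (0.5, 0.5)))
print(knn_predict(training_set, (5.5, 5.5)))
```

There is no training step. The cost is paid at prediction time, which matters for real-time use with large training sets.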
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Algorithms
80
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
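A short sketch of the three measures named above, computed from a list of ground truth labels and a list of predictions. The function names and the toy labels are assumptions for illustration.

```python
from collections import Counter


def accuracy(truth, predicted):
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def confusion_matrix(truth, predicted):
    """Counts keyed by (true label, predicted label)."""
    return Counter(zip(truth, predicted))


def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, predicted))
    fp = sum(t != positive and p == positive for t, p in zip(truth, predicted))
    fn = sum(t == positive and p != positive for t, p in zip(truth, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


truth = ["a", "a", "a", "b", "b"]
predicted = ["a", "a", "b", "b", "a"]
print(accuracy(truth, predicted))
print(dict(confusion_matrix(truth, predicted)))
print(f_measure(truth, predicted, "a"))
```

In cross-validation these measures are computed on each held-out fold and then averaged.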
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
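A minimal k-means sketch: alternate between assigning each point to its nearest centre and moving each centre to the mean of its points, until nothing changes. Initial centres are passed in explicitly to keep the example deterministic. Real implementations usually pick them randomly and restart several times. All names and data are illustrative.

```python
import math


def assign(points, centres):
    """Index of the nearest centre for every point."""
    return [min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
            for p in points]


def kmeans(points, initial_centres, max_iterations=50):
    """Return (centres, labels) after Lloyd-style iterations."""
    centres = [tuple(c) for c in initial_centres]
    for _ in range(max_iterations):
        labels = assign(points, centres)
        new_centres = []
        for j, old in enumerate(centres):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centres.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centres.append(old)   # keep an empty cluster's centre in place
        if new_centres == centres:
            break
        centres = new_centres
    return centres, assign(points, centres)


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres, labels = kmeans(data, [(0, 0), (10, 10)])
print(centres, labels)
```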
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Algorithms
84
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin index
  – Dunn index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard index
  – Confusion matrix
• Various types of user studies
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
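A sketch of the simplest parametric case: an ordinary least squares line fit with one input feature, followed by the coefficient of determination. The function names and the data points are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0, 1, 2, 3]          # e.g. sensor reading
ys = [1, 3, 4, 8]          # e.g. measured control value
slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predicted))
```

An R squared close to 1 means the line explains most of the variance in the target. It says nothing by itself about statistical significance, which is what the F-test and t-tests address.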
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis
Uncertainty and Time
Tuesday 22 October 13
Surrogate Sensing and the Ground Truth problem
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Two case studies
Regression
Classification
Tuesday 22 October 13
ICASSP 2013 tutorial
Some Results
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Sensor Fusion
Uncertainty and Time
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]
Tuesday 22 October 13
Multi-modal Results
Main idea use camera to constrain factorization results taking advantage of uncorrelated errors
Tuesday 22 October 13
ICASSP 2013 tutorial
Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Dynamic Time Warping
97
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model: P(current state | previous state)
Tuesday 22 October 13
ICASSP 2013 tutorial
Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and
Tuesday 22 October 13
ICASSP 2013 tutorial
Hidden Markov Models I
Uncertainty and Time
[Figure: HMM with hidden states 1-4 (a single discrete variable) and observations; the model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t)]
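The filtering task for an HMM can be sketched as a forward recursion: predict the next state distribution with the transition probabilities, weight by the emission probability of the new observation, and renormalize. The two-state model below uses textbook-style rain/umbrella numbers chosen for illustration. They are not from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(current hidden state | all observations so far).

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    belief = list(prior)
    n = len(belief)
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight by how well each state explains the observation.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        z = sum(unnormalized)
        belief = [u / z for u in unnormalized]
    return belief


prior = [0.5, 0.5]                        # states: 0 = rain, 1 = no rain
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]       # observations: 0 = umbrella, 1 = none
print(forward_filter(prior, transition, emission, [0]))
print(forward_filter(prior, transition, emission, [0, 0]))
```

The normalizer `z` at each step is the probability of the new observation given the earlier ones, so the product of the `z` values gives the likelihood of the whole observation sequence.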
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filtering
• Streams of noisy input data
• Basic idea, t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
Uncertainty and Time
Tuesday 22 October 13
ICASSP 2013 tutorial
Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y·y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
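A one-dimensional sketch of the predict/update cycle, assuming the simplest possible model: the state is expected to stay where it is (a random walk), with Gaussian process and measurement noise. The variances, initial values and function name are illustrative assumptions. A real tracker would use vectors and covariance matrices.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Return the sequence of state estimates for a scalar random-walk model."""
    x, p = x0, p0            # state estimate (mean) and its variance
    estimates = []
    for z in measurements:
        # Prediction: state unchanged, uncertainty grows by the process noise.
        p = p + process_var
        # Update: the gain balances model reliability against observation reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


readings = [5.0] * 50        # hypothetical sensor stuck at a constant true value
est = kalman_1d(readings, process_var=1e-4, measurement_var=1.0)
print(round(est[0], 3), round(est[-1], 3))
```

A large `measurement_var` makes the filter trust its model and respond slowly. A large `process_var` makes it follow the measurements closely.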
Tuesday 22 October 13
ICASSP 2013 tutorial
Multimodal tempo detection for the E-sitar
104
Case Studies
Tuesday 22 October 13
ICASSP 2013 tutorial
Beat tracking
105
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial 108
ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful: very early in development; build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers
Tuesday 22 October 13
ICASSP 2013 tutorial 109
ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices, rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community
Tuesday 22 October 13
ICASSP 2013 tutorial 110
participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia
ICASSP 2013 tutorial 111
participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies
Tuesday 22 October 13
ICASSP 2013 tutorial 112
Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor, by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]
Tuesday 22 October 13
ICASSP 2013 tutorial 113
Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons
Tuesday 22 October 13
ICASSP 2013 tutorial
Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test
Human Computer Interaction
Tuesday 22 October 13
ICASSP 2013 tutorial
Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t
the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at
significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions with the critical value t(p, df) marked, showing either two regions or one region for rejecting the null hypothesis; labels X1, X2]
calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note df computation
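The formulas for this slide did not survive in the transcript. As a minimal sketch, assuming the standard pooled-variance form of the independent two-sample t-test (the function name and the sample data below are hypothetical), the three steps could look like this:

```python
import math
from statistics import mean, variance

def independent_t(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # degrees of freedom for two independent groups
    # Combined (pooled) variance: sample variances weighted by their own df.
    pooled_var = ((n_a - 1) * variance(sample_a) +
                  (n_b - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two sample means.
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df

# Hypothetical task-completion times (seconds) for two interface conditions.
control = [12.1, 11.4, 13.0, 12.7, 11.9]
treatment = [10.2, 10.9, 11.1, 9.8, 10.5]
t, df = independent_t(control, treatment)
```

The resulting t is then compared with the critical value t(p, df) for the chosen significance level.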
comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-
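Computing p directly can also be done without a statistics package. The following is only a sketch, assuming a two-tailed test and small to moderate df; it integrates the Student t density numerically with Simpson's rule, and the function names are hypothetical:

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # probability mass inside [-|t|, |t|]
    return max(0.0, 1.0 - central_area)
```

If the returned p is below the chosen α, the null hypothesis can be rejected at that level. In practice a library routine would normally be preferred over hand-rolled integration.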
[Table of critical t values; columns: two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]
Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)
Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
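The "means, variances, ratio" recipe can be illustrated for the one-way case. This is a sketch under the usual textbook definition of the F ratio (between-group mean square over within-group mean square); the function name and data are hypothetical:

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each observation around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical ratings for three interface variants.
ratings = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_b, df_w = one_way_anova_f(ratings)
```

The ratio F is then compared with a tabulated critical value for (df_between, df_within). Rescaling or shifting every observation by the same constants leaves F unchanged, which is the independence of scaling and bias noted above.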
Integration and Implementation
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces
Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing
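The heterodyne principle can be sketched numerically. The example below assumes an idealized mixer that multiplies two sinusoidal oscillators; the oscillator frequencies are illustrative values, not figures from the slides. The product contains a difference (beat) frequency in the audible range and a sum frequency far above it:

```python
import math

def beat_frequency(fixed_hz, variable_hz):
    """Audible difference frequency produced by mixing two oscillators."""
    return abs(fixed_hz - variable_hz)

def mixed_sample(fixed_hz, variable_hz, t):
    """Idealized mixer output at time t (seconds): product of two sinusoids."""
    return (math.sin(2 * math.pi * fixed_hz * t) *
            math.sin(2 * math.pi * variable_hz * t))

# Hypothetical values: the hand near the antenna detunes the variable oscillator.
fixed, variable = 170_000.0, 169_560.0
pitch = beat_frequency(fixed, variable)  # 440.0 Hz
```

Since sin(a)·sin(b) = ½[cos(a − b) − cos(a + b)], low-pass filtering the mixer output leaves only the beat component, which is what gets amplified.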
Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum
Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal
Spectrum of Gesture-to-Speech Mappings
[Figure: mappings ordered by approximate time per gesture for connected speech (msec): Artificial Vocal Tract (10-30), Phoneme Generator (100), Finger Spelling (130), Syllable Generator (200), Word Generator (500). Systems placed along the spectrum: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)]
Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)
GTII Mapping
• Input: 26+ dimensions
• constrained subspace
• Output: 10 dimensions
GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others
GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
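The slides name a mixture of experts and a combining function but do not give its exact form. A minimal sketch, assuming the decision network's vowel probability is used as a soft gate that blends the two experts' outputs (the function name and values are hypothetical):

```python
def combine_experts(p_vowel, vowel_out, consonant_out):
    """Blend two expert outputs using the gate probability p_vowel in [0, 1]."""
    if len(vowel_out) != len(consonant_out):
        raise ValueError("expert outputs must have the same length")
    # Assumed form: convex combination weighted by the decision network output.
    return [p_vowel * v + (1.0 - p_vowel) * c
            for v, c in zip(vowel_out, consonant_out)]

# Hypothetical two-parameter outputs from each expert.
blended = combine_experts(0.5, [0.0, 2.0], [2.0, 4.0])
```

With the gate at 1 the vowel expert passes through unchanged, and with the gate at 0 the consonant expert does; intermediate values give smooth transitions between the two.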
Glove-TalkII System
[Block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech]
Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error
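To make the 10 - 5 - 1 shape concrete, here is a sketch of a forward pass through such a network. The sigmoid activations, the weight initialisation and all names are assumptions for illustration; the weights are random and untrained, so the output is not a meaningful probability:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_in, n_out, rng):
    """One weight row per unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]

def forward_layer(weights, inputs):
    return [sigmoid(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
            for row in weights]

def vowel_probability(flex_angles, hidden, output):
    """10 inputs -> 5 hidden units -> 1 output in (0, 1)."""
    return forward_layer(output, forward_layer(hidden, flex_angles))[0]

rng = random.Random(0)
hidden = make_layer(10, 5, rng)   # 10 flex angles in, 5 hidden units
output = make_layer(5, 1, rng)    # 5 hidden units in, 1 output
p = vowel_probability([0.1] * 10, hidden, output)
```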
GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel
A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units
Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
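The behaviour described above can be sketched in a few lines. This assumes Gaussian units sharing one width and a linear output layer; the function names, centres and output weights are hypothetical:

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations normalized to sum to 1 over all hidden units."""
    exponents = [-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2 * width ** 2)
                 for c in centres]
    # Subtracting the largest exponent avoids underflow far from every centre
    # and does not change the normalized result.
    top = max(exponents)
    raw = [math.exp(e - top) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]

def rbf_network_output(x, centres, width, out_weights):
    """Weighted sum of the normalized activations; one weight row per centre."""
    acts = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(a * w[j] for a, w in zip(acts, out_weights)) for j in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]
near_first = normalized_rbf((0.1, 0.0), centres, width=0.15)
midpoint = rbf_network_output((0.5, 0.0), centres, 0.15, [[1.0, 2.0], [3.0, 4.0]])
```

Close to a centre, that unit's normalized activation approaches 1 (the plateau), and halfway between two centres the output interpolates between their weight rows. A smaller width sharpens the transition.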
Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user
Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
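A sketch of the 6-bit quantization step: the parameter names come from the slide, but the assumption that each network output is normalized to [0, 1] before quantization, and the function names, are illustrative only:

```python
PARAMETERS = ("ALF", "F1", "A1", "F2", "A2", "F3", "A3", "AHF", "V", "F0")

def quantize_6bit(value, lo=0.0, hi=1.0):
    """Map a value in [lo, hi] to an integer level in [0, 63], clamping outliers."""
    clamped = min(max(value, lo), hi)
    return round((clamped - lo) / (hi - lo) * 63)

def quantize_frame(frame):
    """Quantize one synthesizer frame given as a dict of normalized values."""
    return {name: quantize_6bit(frame[name]) for name in PARAMETERS}

# Hypothetical frame: every parameter at 0.25 except full-scale F0.
frame = {name: 0.25 for name in PARAMETERS}
frame["F0"] = 1.0
levels = quantize_frame(frame)
```

At the stated 100 Hz update rate, one such ten-parameter frame would be sent to the synthesizer every 10 ms.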
Glove-TalkII
DIVA in the hands of
Magic Eyes
Leveraging traditional
Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina
Percussion Robots
Tele-operation
Drum sound classification
Self-calibration and mapping based on listening
Physical Modeling
System Architecture
Feedback Loop
Playing a piece
Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca