multi-modal user interfaces - marsyas.info


ICASSP 2013 tutorial

Multi-modal User Interfaces: a new music instruments

Sidney Fels (ssfels@ece.ubc.ca), University of British Columbia, Canada
George Tzanetakis (gtzan@cs.uvic.ca)

TUTORIAL: ICASSP 2013

Tuesday, 22 October 2013

Motivation and Overview

Human-Computer Interaction

Which is more interesting?

This…

Or this…

Multi-modal Interfaces
• Multiple modalities for both input and output
• Information feedback
• Generalize any type of existing interface
• The ultimate multi-modal interface is our body and the physical world
• Blending of the physical and the virtual
• Challenging to design, develop, and adopt
• Huge potential to have impact specifically

Why music?
• Musical instruments are the ultimate multi-modal interfaces (physical predates digital and analog interfaces)
• The complexity and subtlety of the communication of a musician with their instrument, as well as in interactions with other musicians, is staggering
• New musical instruments are a great domain-specific research area to design, test, and evaluate radical ideas for HCI

Discrete Control

Continuous Control

Human to human interaction and music performance

Evolution of output devices

More output devices

SAGE

REACTABLE
Reactable, Music Technology Group (2006)

Smartphones as instruments
iPhone Ocarina from Smule™ (Wang et al. 2009)

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP, or programming
• Integration: putting it all together

New Interfaces for Musical Expression (NIME)
First organized as a workshop of ACM CHI’2001
Experience Music Project, Seattle, April 2001
Lectures, Discussions, Demos, Performances

Research on HCI/Music

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

Sensors and Actuators

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
[Diagram labels: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music Fun and Effort; Effort and pain; If you are lucky]

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more…

Transduction and Digitizing
Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
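
One common way to turn a change in resistance into a voltage is a voltage divider feeding an analog-to-digital converter. The sketch below illustrates that idea only; the 10 kΩ fixed resistor, 5 V supply, and 10-bit converter are assumed values chosen for illustration, and the function names are hypothetical rather than taken from the tutorial.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_supply=5.0):
    """Voltage across the fixed resistor of a two-resistor divider.

    The variable-resistance sensor sits between the supply and the output
    node, so a lower sensor resistance gives a higher output voltage.
    """
    return v_supply * r_fixed / (r_sensor + r_fixed)


def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code, clamped to the valid range."""
    levels = (1 << bits) - 1
    code = round(voltage / v_ref * levels)
    return max(0, min(levels, code))


if __name__ == "__main__":
    # Assumed sensor resistances, from lightly to firmly pressed.
    for r in (100_000.0, 10_000.0, 1_000.0):
        v = divider_voltage(r)
        print(f"R_sensor={r:>8.0f} ohm  V_out={v:.3f} V  code={adc_code(v)}")
```

Choosing the fixed resistor near the middle of the sensor's resistance range spreads the readings over more of the converter's codes.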

Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms, and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition, and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• Use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

2D/3D/ND + time
[Figure: two panels with time axes]
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Signals and Features

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
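
As a small illustration of a digital filter operating sample by sample, here is a sketch of a one-pole low-pass filter, a common choice for smoothing noisy sensor readings. It is not taken from the tutorial; the function name and the default `alpha` are assumptions chosen for illustration.

```python
def one_pole_lowpass(samples, alpha=0.2):
    """Smooth a sequence with y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha lies in (0, 1]; smaller values attenuate high frequencies more
    strongly but respond more slowly to real changes.
    """
    out = []
    # Start from the first sample so a constant input passes through unchanged.
    y = samples[0] if samples else 0.0
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


if __name__ == "__main__":
    noisy = [0.0, 1.2, 0.8, 1.1, 0.9, 1.0, 5.0, 1.0, 0.9, 1.1]
    print([round(v, 3) for v in one_pole_lowpass(noisy)])
```

The filter is causal (it only uses past and present samples), which matters for the real-time constraints discussed later in the tutorial.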

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
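
The variants listed on the Peak Detection slide can be combined in a few lines. The following is a minimal sketch, assuming a fixed threshold, a strict local-maximum test, a minimum spacing between peaks, and parabolic interpolation of each peak position; the function name and parameters are hypothetical, and practical detectors usually add adaptive thresholds on top of this.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (position, amplitude) pairs for local maxima above threshold.

    Peaks closer than min_spacing samples to a stronger peak are dropped.
    Each surviving peak is refined by fitting a parabola through the peak
    sample and its two neighbours.
    """
    candidates = [
        i for i in range(1, len(x) - 1)
        if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]
    # Spacing constraint: visit the strongest candidates first.
    kept = []
    for i in sorted(candidates, key=lambda i: x[i], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    peaks = []
    for i in sorted(kept):
        a, b, c = x[i - 1], x[i], x[i + 1]
        denom = a - 2.0 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        peaks.append((i + offset, b - 0.25 * (a - c) * offset))
    return peaks


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.0, 0.0, 3.0, 0.0, 2.5, 0.0]
    print(find_peaks(signal, threshold=0.5, min_spacing=3))
```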

Fourier Transform
Spectrum

Short Time Fourier Transform
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier transform

Spectrogram
256 samples, 22050 Hz
4096 samples, 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
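
The time-frequency tradeoff for the two window sizes on the Spectrogram slide can be worked out directly: a window of N samples at sampling rate fs spans N/fs seconds and yields bins spaced fs/N Hz apart. The sketch below computes both and builds a tiny spectrogram with a naive DFT. The Hann window, the hop size, and all function names are assumptions for illustration; a real implementation would use an FFT.

```python
import cmath
import math


def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size


def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of a Hann-windowed frame, naive O(N^2) DFT."""
    n = len(frame)
    windowed = [s * 0.5 * (1.0 - math.cos(2.0 * math.pi * i / n))
                for i, s in enumerate(frame)]
    return [
        abs(sum(w * cmath.exp(-2j * math.pi * k * i / n)
                for i, w in enumerate(windowed))) ** 2
        for k in range(n // 2 + 1)
    ]


def spectrogram(signal, window_size, hop):
    """Sequence of power spectra computed on overlapping frames."""
    return [power_spectrum(signal[start:start + window_size])
            for start in range(0, len(signal) - window_size + 1, hop)]


if __name__ == "__main__":
    for n in (256, 4096):
        duration, spacing = stft_resolution(n, 22050)
        print(f"N={n}: {duration * 1000:.1f} ms per window, {spacing:.2f} Hz per bin")
```

At 22050 Hz, 256 samples give roughly 11.6 ms windows with bins about 86 Hz apart, while 4096 samples give roughly 186 ms windows with bins about 5.4 Hz apart.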

Wavelets
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion (preserves edges)
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
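
For sensor data, linear interpolation is often enough. The sketch below resamples irregularly timed readings onto a uniform grid; it assumes strictly increasing timestamps, and the function name and arguments are hypothetical.

```python
def resample_linear(times, values, rate):
    """Resample (time, value) pairs onto a uniform grid by linear interpolation.

    times must be strictly increasing; rate is the output rate in Hz.
    Returns (output times, output values).
    """
    if len(times) < 2 or len(times) != len(values):
        raise ValueError("need at least two samples with matching lengths")
    count = int((times[-1] - times[0]) * rate) + 1
    out_t, out_v = [], []
    j = 0  # index of the input segment [times[j], times[j + 1]]
    for k in range(count):
        t = times[0] + k / rate
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out_t.append(t)
        out_v.append(values[j] + frac * (values[j + 1] - values[j]))
    return out_t, out_v


if __name__ == "__main__":
    t, v = resample_linear([0.0, 0.3, 1.0], [0.0, 3.0, 10.0], rate=4.0)
    print(list(zip(t, v)))
```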

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
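
Calibration and scaling often appear together: first find the range of a sensor, then map readings in that range to a control parameter. The sketch below shows one way this might look. The power-law exponent, the frequency range of 55 to 880 Hz, and the exponential knob-to-frequency map are assumptions chosen for illustration, and all names are hypothetical.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)


def normalize(reading, lo, hi):
    """Map a raw reading to [0, 1] using the calibrated range, with clamping."""
    if hi == lo:
        raise ValueError("calibration range is empty")
    return max(0.0, min(1.0, (reading - lo) / (hi - lo)))


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Power-law mapping of a reading to [out_min, out_max]."""
    return out_min + (out_max - out_min) * normalize(reading, lo, hi) ** exponent


def knob_to_frequency(reading, lo, hi, f_min=55.0, f_max=880.0):
    """Exponential map so equal knob motion gives equal musical intervals."""
    return f_min * (f_max / f_min) ** normalize(reading, lo, hi)


if __name__ == "__main__":
    lo, hi = calibrate([37, 512, 980, 120])
    print(scale(512, lo, hi, 0.0, 1.0, exponent=2.0))
    print(knob_to_frequency(508.5, lo, hi))
```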

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
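
A correlation-based periodicity estimate can be sketched as picking the lag with the largest autocorrelation inside a plausible range. The version below computes the autocorrelation directly in the time domain, which is simple but slow; the efficient computation the slide alludes to goes through the FFT. Function names and the lag bounds are assumptions for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


if __name__ == "__main__":
    # A sinusoid that repeats every 20 samples.
    signal = [math.sin(2.0 * math.pi * i / 20.0) for i in range(200)]
    print(estimate_period(signal, min_lag=5, max_lag=50))
```

The lower lag bound keeps the trivial maximum at lag 0 out of the search.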

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter with center frequency CF; edge labels CF-130 and CF+130, and CF scaled by 1.0718]
Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
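
Energy compaction can be checked numerically: for a smooth input, the first few DCT coefficients carry almost all of the energy. The sketch below uses an orthonormal DCT-II computed directly from its definition; the function names and the ramp test signal are assumptions for illustration, and real MFCC code would apply the transform to log mel-filter energies.

```python
import math


def dct2(x):
    """Orthonormal DCT-II computed directly from the definition, O(N^2)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * math.sqrt((1.0 if k == 0 else 2.0) / n))
    return out


def energy_fraction(coeffs, keep):
    """Fraction of total energy held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total


if __name__ == "__main__":
    ramp = [i / 15.0 for i in range(16)]  # smooth, slowly varying input
    coeffs = dct2(ramp)
    print(f"first 3 of 16 coefficients hold {energy_fraction(coeffs, 3):.4f} of the energy")
```

Because the transform is orthonormal, the coefficient energies sum to the signal energy, so the fraction is meaningful.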

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
Reduced to 256 colors

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
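
Delta features and aggregate statistics are simple to compute from a sequence of feature vectors. The sketch below assumes each frame is a list of floats of equal length; the first-order difference used for the deltas is the simplest possible choice (regression-based deltas over several frames are also common), and the names are hypothetical.

```python
import statistics


def delta(frames):
    """First-order difference between consecutive feature vectors."""
    return [[b - a for a, b in zip(prev, cur)]
            for prev, cur in zip(frames, frames[1:])]


def summarize(frames):
    """Per-dimension mean, median, and variance over the whole sequence."""
    columns = list(zip(*frames))
    return {
        "mean": [statistics.fmean(c) for c in columns],
        "median": [statistics.median(c) for c in columns],
        "variance": [statistics.pvariance(c) for c in columns],
    }


if __name__ == "__main__":
    frames = [[1.0, 10.0], [2.0, 10.0], [4.0, 13.0]]
    print(delta(frames))
    print(summarize(frames))
```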

Principal Component Analysis
Projection matrix
PCA: eigenanalysis of correlation matrix

Self-Organizing Maps

Uncertainty and Time

Classification: Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

Classification: Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors

Classification: Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
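
To make the formulation concrete, here is a sketch of a k-nearest-neighbors classifier together with accuracy and a confusion matrix on a held-out set. The two-dimensional features and the "soft"/"hard" labels are made-up illustration data, not results from the tutorial, and the function names are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training vectors closest to query."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def evaluate(train_x, train_y, test_x, test_y, k=3):
    """Return (accuracy, confusion) where confusion[(truth, predicted)] = count."""
    confusion = Counter()
    for x, truth in zip(test_x, test_y):
        confusion[(truth, knn_predict(train_x, train_y, x, k))] += 1
    correct = sum(n for (truth, pred), n in confusion.items() if truth == pred)
    return correct / len(test_y), dict(confusion)


if __name__ == "__main__":
    # Hypothetical two-feature vectors for soft and hard drum strikes.
    train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
               (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
    train_y = ["soft"] * 3 + ["hard"] * 3
    print(evaluate(train_x, train_y, [(0.0, 0.0), (1.0, 1.0)], ["soft", "hard"]))
```

Evaluating on data that was not used for training is the point of the cross-validation schemes named on the Evaluation slide.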

Clustering: Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering: Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based

Clustering: Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies
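
K-means is the simplest of the clustering algorithms listed above. The sketch below is a plain version of the usual assign-then-update loop, assuming K is given, points are tuples of floats, and a fixed number of iterations is enough; the seeded random initialization and the function name are illustrative choices, not part of the tutorial.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points into k groups; returns (centers, label per point)."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize from k distinct data points
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centers[c]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster.
        centers = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
              for p in points]
    return centers, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    print(kmeans(data, k=2))
```

The result depends on the initialization in general, which is one reason the internal and external evaluation measures above are useful.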

Regression: Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression: Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
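
For a single input feature, ordinary least squares and the coefficient of determination reduce to a few sums. The sketch below fits a line and reports R squared; the data values and function names are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0]
    ys = [1.0, 3.2, 4.8, 7.0]
    slope, intercept = fit_line(xs, ys)
    fitted = [slope * x + intercept for x in xs]
    print(slope, intercept, r_squared(ys, fitted))
```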

Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia 2011, A. Tindale, A. Kapur, G. Tzanetakis

Surrogate Sensing and the Ground Truth problem

Two case studies
Regression
Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done with some delay at the same time as the sensor data

Dynamic Time Warping
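
Dynamic time warping aligns two sequences that follow the same shape at different speeds. The sketch below is the textbook dynamic-programming recurrence for one-dimensional sequences with an absolute-difference local cost; it is offered as an illustration rather than as the variant used in the tutorial, and the function name is hypothetical.

```python
def dtw_distance(a, b):
    """Minimum cumulative |a[i] - b[j]| cost over all monotonic alignments."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


if __name__ == "__main__":
    slow = [1.0, 2.0, 2.0, 2.0, 3.0]
    fast = [1.0, 2.0, 3.0]
    print(dtw_distance(fast, slow))  # same gesture at two speeds
    print(dtw_distance([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

This offline version needs both complete sequences, so it is not causal; real-time use requires an online variant.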

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date; also gives the likelihood of the evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Hidden Markov Models I
[Figure: HMM unrolled over time steps 1 to 4, with a hidden state (single discrete variable) and an observation at each step. Model: transition probabilities P(state at t | state at t-1) and emission probabilities P(observation at t | state at t)]
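
The filtering task for a discrete HMM can be sketched as a predict step using the transition model followed by an update step using the emission model. The example below uses a two-state model with made-up probabilities (state 0 and state 1, observation 0 and observation 1); the numbers and the function name are assumptions for illustration only.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the hidden state after each observation.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Predict: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


if __name__ == "__main__":
    prior = [0.5, 0.5]
    transition = [[0.7, 0.3], [0.3, 0.7]]
    emission = [[0.9, 0.1], [0.2, 0.8]]
    for step, belief in enumerate(forward_filter(prior, transition, emission, [0, 0]), 1):
        print(step, [round(p, 3) for p in belief])
```

The normalizing constant `total` at each step is the likelihood of that observation given the earlier ones, which is how filtering also yields the likelihood of the evidence.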

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x|y) = N(a_y y + b_y, σ_y)(c)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
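
The predict and update steps are easiest to see in one dimension. The sketch below assumes a scalar state that follows a random walk (the state is expected to stay where it is, with some process noise), so the mean and covariance matrix reduce to a single mean and variance. The noise variances, initial values, and function name are assumptions for illustration.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    Returns the state estimate after each measurement.
    """
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed unchanged, but uncertainty grows.
        p = p + process_var
        # Update: the gain balances model reliability against sensor reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    readings = [5.2, 4.7, 5.1, 4.9, 5.3, 4.8, 5.0, 5.1]
    print([round(v, 3) for v in kalman_1d(readings, 1e-3, 0.5)])
```

A large `measurement_var` makes the filter trust its model and smooth heavily; a large `process_var` makes it follow the measurements more closely.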

Case Studies

Multimodal tempo detection for the E-sitar

Human Computer Interaction

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

ICASSP 2013 tutorial 112

Wizard of Ozbull A method of testing a system that does not exist

ndash the voice editor by IBM (1984)

The WizardWhat the user sees

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of a difference at least as large as the one observed arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at significance p

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: two distribution plots, one labelled “regions for rejecting the null hypothesis” and the other “region for rejecting the null hypothesis”, each marking the critical value t(p, df); markers X1 and X2]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
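The slide's own formulas did not survive extraction. A minimal sketch of the three steps, assuming the textbook equal-variance (pooled) form for independent samples; the function name and the sample data are hypothetical:

```python
import math
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Independent two-sample t statistic, assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2  # degrees of freedom for the pooled test
    # combined (pooled) variance: df-weighted average of the sample variances
    var_pooled = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / df
    # standard error of the difference between the two sample means
    sed = math.sqrt(var_pooled * (1 / n1 + 1 / n2))
    t = (mean(sample1) - mean(sample2)) / sed
    return t, df

# hypothetical task-completion times (seconds) for two interface designs
control = [12.1, 11.4, 13.0, 12.7, 11.9]
treatment = [10.2, 10.9, 9.8, 11.1, 10.5]
t, df = pooled_t(control, treatment)
print(f"t = {t:.3f}, df = {df}")
```

The returned t is then compared against the critical value t(p, df) for the chosen significance.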

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g., search on “student’s t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
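As a rough illustration of "dividing two variances", here is a sketch of the one-way ANOVA F ratio (between-group variance over within-group variance). The function name and the data are hypothetical; the final step of looking F up against a table is left out.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# hypothetical ratings for three interface variants
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
print(one_way_anova_f(groups))

# scaling and adding a bias to every observation leaves F unchanged
rescaled = [[10 * x + 5 for x in g] for g in groups]
print(one_way_anova_f(rescaled))
```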

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13

ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: mapping types arranged along an axis of approximate time/gesture for connected speech (msec), with axis ticks 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems cited: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990).]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions
  • constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, and foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]

Tuesday 22 October 13

ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • thumb abduction and thumb rotation
  – Output
    • probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13

ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • weights are centre
  – Normalized over all units in group
    • hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization – interpolation according to width parameter
  – Plateaus around nearest centre
    • closest RBF dominates
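A small sketch of normalized Gaussian RBF activations, to make the "closest RBF dominates" behaviour concrete. This is not the Glove-TalkII code: the centres, the width value, and the function name are hypothetical.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian RBF activations for input x, normalized to sum to 1."""
    # squared Euclidean distance from the input to every centre
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # shift by the smallest distance so the nearest unit has raw activation 1;
    # this avoids underflow and does not change the normalized result
    nearest = min(d2)
    raw = [math.exp(-(d - nearest) / (2 * width ** 2)) for d in d2]
    total = sum(raw)
    return [r / total for r in raw]

# three hypothetical centres in the x, y plane
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]

# between two centres the activations interpolate
print(normalized_rbf((0.5, 0.0), centres, width=0.15))

# far away from every centre, the nearest centre still dominates (plateau)
print(normalized_rbf((5.0, 0.0), centres, width=0.15))
```

Without the normalization, a Gaussian unit would simply fade to zero far from its centre; dividing by the sum is what produces the plateau.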

Tuesday 22 October 13

ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13

DIVA in the hands of

145

Tuesday 22 October 13

Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13

Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13

Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction

2

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Which is more interesting

3

Tuesday 22 October 13

ICASSP 2013 tutorial

This…

4

Tuesday 22 October 13

ICASSP 2013 tutorial

Or this…

5

Tuesday 22 October 13

ICASSP 2013 tutorial

Multi-modal Interfaces
• Multiple modalities for both input and output
• Information feedback
• Generalize any type of existing interface
• The ultimate multi-modal interface is our body and the physical world
• Blending of the physical and the virtual
• Challenging to design, develop and adopt
• Huge potential to have impact, specifically

6

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Why music
• Musical instruments are the ultimate multi-modal interfaces (physical predates digital and analog interfaces)
• The complexity and subtlety of the communication of a musician with their instrument, as well as in interactions with other musicians, is staggering
• New musical instruments are a great domain-specific research area to design, test and evaluate radical ideas for HCI

7

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Control

8

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Continuous Control

9

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Human to human interaction and music performance

10

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13

ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13

ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
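One common way to do this conversion is a voltage divider: the variable-resistance sensor in series with a fixed resistor, with an ADC reading the voltage across the fixed resistor. The sketch below assumes that circuit; the 5 V supply, 10 kΩ resistor, and 10-bit ADC are hypothetical values chosen for illustration.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor of a series voltage divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Ideal ADC reading for a voltage between 0 and v_ref."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    # clamp to the representable range 0 .. levels - 1
    return max(0, min(levels - 1, code))

# as the sensor resistance drops (e.g. pressing harder), the reading rises
for r_sensor in (100_000, 10_000, 1_000):
    v = divider_voltage(r_sensor, r_fixed=10_000)
    print(r_sensor, round(v, 3), adc_counts(v))
```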

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD amp CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs. 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D3D ND + time

50

Sensors and Actuators

Time Time

Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
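As a minimal example of a digital filter operating on samples, here is a one-pole low-pass filter (exponential smoothing), often enough for taming noisy sensor readings. The smoothing coefficient `alpha` and the test signals are hypothetical choices.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1.

    Smaller alpha means heavier smoothing (lower cutoff frequency).
    """
    output = []
    previous = 0.0  # filter state, starts at rest
    for x in samples:
        previous = alpha * x + (1 - alpha) * previous
        output.append(previous)
    return output

# a constant (zero-frequency) input passes through once the filter settles
slow = one_pole_lowpass([1.0] * 100, alpha=0.2)

# the fastest possible alternation (Nyquist frequency) is strongly attenuated
fast = one_pole_lowpass([1.0, -1.0] * 50, alpha=0.2)

print(round(slow[-1], 3), round(max(abs(v) for v in fast[-10:]), 3))
```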

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
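A sketch combining three of the variants listed above: a fixed threshold, a local-maximum test, and a spacing constraint (when two peaks are too close, the larger one wins). The function name, parameters, and test signal are hypothetical; adaptive thresholds and interpolation are left out.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return (index, amplitude) pairs for peaks in a list of samples."""
    # local maxima that exceed the fixed threshold
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i - 1] < signal[i] >= signal[i + 1]
    ]
    # visit candidates from strongest to weakest, keeping a peak only if it
    # is at least min_spacing samples away from every peak already kept
    kept = []
    for i in sorted(candidates, key=lambda i: signal[i], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    return [(i, signal[i]) for i in sorted(kept)]

signal = [0, 1, 0, 0.2, 0, 3, 2.9, 0, 0, 2, 0]
print(find_peaks(signal, threshold=0.5, min_spacing=3))
print(find_peaks(signal, threshold=0.5, min_spacing=5))
```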

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples 22050 Hz

4096 samples 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering, 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
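For sensor data, the linear case is often enough. The sketch below evaluates a sampled signal at an arbitrary fractional sample time and uses that to resample by a rate ratio. Function names are hypothetical, and no anti-aliasing filter is applied, so this is not a substitute for band-limited interpolation when downsampling.

```python
def linear_interp(samples, t):
    """Value at fractional sample time t (0 <= t <= len(samples) - 1)."""
    if not 0 <= t <= len(samples) - 1:
        raise ValueError("t is outside the sampled range")
    i = int(t)
    if i == len(samples) - 1:
        return float(samples[i])
    frac = t - i
    # weighted average of the two neighbouring samples
    return (1 - frac) * samples[i] + frac * samples[i + 1]

def resample(samples, ratio):
    """Resample by ratio = new_rate / old_rate using linear interpolation."""
    last = len(samples) - 1
    n_out = int(last * ratio) + 1
    # min() guards against floating-point overshoot at the final sample
    return [linear_interp(samples, min(k / ratio, last)) for k in range(n_out)]

readings = [0, 10, 20]
print(linear_interp(readings, 0.5))
print(resample(readings, 2))
```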

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
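The range-finding example can be sketched as a small calibrator that tracks the minimum and maximum seen during a calibration phase and then normalizes later readings to 0..1. The class name and the raw readings are hypothetical.

```python
class RangeCalibrator:
    """Learns the min/max of a sensor and maps readings to the range 0..1."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, reading):
        """Update the known range with one reading from the calibration phase."""
        self.lo = min(self.lo, reading)
        self.hi = max(self.hi, reading)

    def normalize(self, reading):
        """Map a reading to 0..1, clamping values outside the learned range."""
        if self.hi <= self.lo:
            return 0.0  # not calibrated yet (or the sensor never moved)
        x = (reading - self.lo) / (self.hi - self.lo)
        return max(0.0, min(1.0, x))

cal = RangeCalibrator()
for raw in (200, 800, 500):  # hypothetical raw ADC readings
    cal.observe(raw)
print(cal.normalize(500), cal.normalize(100), cal.normalize(900))
```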

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  • Power law functions frequently used
• Frequently used in conjunction with calibration
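A sketch of a power-law mapping from a normalized control value (0..1, for example the output of a calibration step) to a parameter range. The exponent and the output ranges are hypothetical and would normally be tuned by ear.

```python
def power_law_map(x, out_min, out_max, exponent=2.0):
    """Map x in 0..1 to out_min..out_max along a power-law curve."""
    x = max(0.0, min(1.0, x))  # clamp out-of-range control values
    return out_min + (out_max - out_min) * x ** exponent

# slider to volume: with exponent 2, the lower half of the slider's travel
# covers only the first quarter of the gain range, giving finer control
# at quiet levels
for position in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(position, power_law_map(position, 0.0, 1.0))

# the same function can map a knob to a frequency range in Hz
print(power_law_map(0.5, 100.0, 2000.0, exponent=3.0))
```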

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
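The slide's formula did not survive extraction. The sketch below computes the autocorrelation directly in the time domain and picks the lag with the largest value as a period estimate. It is a plain O(N × lags) version for illustration, not the FFT-based computation the slide refers to; function names and the test signal are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation of a mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mu = sum(x) / n
    centered = [v - mu for v in x]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]

def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) between min_lag and max_lag with maximum autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# hypothetical periodic gesture signal: a sinusoid repeating every 20 samples
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
print(estimate_period(signal, min_lag=5, max_lag=50))
```

Cross-correlation follows the same pattern with two different signals in place of the two copies of `centered`.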

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

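Mel Frequency Cepstral Coefficients

Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular filter centred at CF; edge labels CF-130 and CF+130 for the linearly-spaced filters, and CF scaled by 1.0718 for the log-spaced filters]

Processing chain: Mel-filtering → Log → DCT → MFCCs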

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13

Self-Organizing Maps

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  • K-nearest Neighbors

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
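The three measures named above can be computed from paired lists of true and predicted labels. A minimal sketch follows; the drum labels are made up and the function names are illustrative.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    """Counts of (true label, predicted label) pairs."""
    return Counter(zip(y_true, y_pred))

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the ground truth."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f_measure(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for one class."""
    cm = confusion_matrix(y_true, y_pred)
    tp = cm[(positive, positive)]
    fp = sum(n for (t, p), n in cm.items() if p == positive and t != positive)
    fn = sum(n for (t, p), n in cm.items() if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

y_true = ["snare", "snare", "kick", "kick", "kick"]
y_pred = ["snare", "kick", "kick", "kick", "snare"]
```

Accuracy alone can be misleading when classes are unbalanced, which is one reason to also report the F-measure and the confusion matrix.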

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  – DBSCAN
• Graph-based
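K-means, the first algorithm listed, alternates between assigning points to the nearest center and recomputing each center as the mean of its points. The sketch below assumes the initial centers are supplied by the caller (real implementations usually pick them randomly or with k-means++); the data are made up.

```python
import math

def nearest_center(point, centers):
    """Index of the center closest to `point`."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(points, centers, iterations=20):
    """Basic k-means with caller-supplied initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            clusters[nearest_center(p, centers)].append(p)
        new_centers = []
        for old, members in zip(centers, clusters):
            if members:
                new_centers.append(tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center in place
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    labels = [nearest_center(p, centers) for p in points]
    return centers, labels

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, [(0, 0), (10, 10)])
```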

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
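Ordinary least squares for a single input, together with the coefficient of determination, can be sketched as follows. The data are made up and the helper names are illustrative; this is the one-dimensional closed form, not a general multivariate solver.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 4.0, 7.0]  # made-up, roughly linear measurements
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```

An R squared close to 1 means the fitted line explains most of the variance of the target values.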

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

“Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies
• Regression
• Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
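The interpolation step can be sketched as below: two streams sampled at different rates are linearly interpolated onto a common clock and concatenated into one vector per time step (an early-fusion arrangement). The stream contents, rates and function names are made-up assumptions.

```python
def interpolate(times, values, t):
    """Linear interpolation of a sampled stream at time t (clamped at both ends)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            w = (t - times[i - 1]) / (times[i] - times[i - 1])
            return values[i - 1] + w * (values[i] - values[i - 1])

def early_fusion(stream_a, stream_b, rate_hz, duration_s):
    """Resample two (times, values) streams to a common rate and pair them up."""
    n = int(duration_s * rate_hz) + 1
    fused = []
    for k in range(n):
        t = k / rate_hz
        fused.append((interpolate(*stream_a, t), interpolate(*stream_b, t)))
    return fused

slow = ([0.0, 1.0], [0.0, 10.0])            # hypothetical 1 Hz sensor
fast = ([0.0, 0.5, 1.0], [1.0, 2.0, 3.0])   # hypothetical 2 Hz sensor
fused = early_fusion(slow, fast, rate_hz=4, duration_s=1.0)
```

Note that interpolating between two samples needs the later sample, so a strictly causal system would either hold the last value or accept a delay of one sample period.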

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late fusion are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each Feature Extraction → Dimensionality Reduction → Feature Selection → Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over Time
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal models
• Filtering: posterior distribution over current state given evidence (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and
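The filtering task can be sketched as a predict-then-update recursion over a discrete hidden state. The two states, the observation symbols and all probabilities below are made up for illustration; `forward_filter` is a hypothetical helper name.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each time step t."""
    belief = dict(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = {s: sum(belief[p] * transition[p][s] for p in belief)
                     for s in belief}
        # Update: weight by the likelihood of the new observation, then normalize.
        unnormalized = {s: predicted[s] * emission[s][obs] for s in belief}
        z = sum(unnormalized.values())
        belief = {s: v / z for s, v in unnormalized.items()}
        history.append(belief)
    return history

prior = {"strumming": 0.5, "resting": 0.5}
transition = {"strumming": {"strumming": 0.7, "resting": 0.3},
              "resting": {"strumming": 0.3, "resting": 0.7}}
emission = {"strumming": {"loud": 0.9, "quiet": 0.1},
            "resting": {"loud": 0.2, "quiet": 0.8}}
history = forward_filter(prior, transition, emission, ["loud", "loud"])
```

The recursion only uses past and present observations, so filtering is causal; smoothing, by contrast, needs later evidence.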

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

[Figure: Hidden Markov Model diagram. Labels: hidden state (single discrete variable), observations, model, transition probabilities between t-1 and t, emission probabilities at t]

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
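A one-dimensional version makes the forward step and the reliability trade-off concrete. The sketch below assumes the simplest possible model, a state that stays constant apart from process noise; `process_var` and `measurement_var` are illustrative parameter names, and the general filter uses matrices instead of scalars.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a (nearly) constant state observed with noise."""
    x, p = x0, p0  # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction step: the mean is unchanged, the uncertainty grows.
        p = p + process_var
        # Update step: the gain weighs the measurement against the prediction.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

estimates = kalman_1d([1.0] * 50, process_var=1e-4, measurement_var=0.1)
```

A large `measurement_var` (unreliable sensor) gives a small gain, so the estimate follows the model; a small one gives a gain near 1, so the estimate follows the measurements.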

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• users (often musicians) become 1st class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: “The Wizard” and “What the user sees”]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution with the critical value t(p,df) marked, showing either two regions for rejecting the null hypothesis or a single region; axis labels X1, X2]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
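The formulas on this slide did not survive extraction, so the three steps are sketched below using the standard pooled-variance form of the independent two-sample t-test, which is assumed to be what the slide showed. The function name and the sample values are illustrative.

```python
import math

def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = sum(sample_a) / na, sum(sample_b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = na + nb - 2                                      # degrees of freedom
    pooled_var = (ss_a + ss_b) / df                       # combined variance
    sed = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))   # standard error of difference
    return (mean_a - mean_b) / sed, df

control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [2.0, 3.0, 4.0, 5.0, 6.0]
t, df = pooled_t(control, treatment)
```

Here |t| is 1 with df = 8, which is below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis cannot be rejected for this made-up data.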

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
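The "dividing two variances" step can be sketched for the one-way case: the F ratio is the between-group mean square over the within-group mean square. The group values are made up and `one_way_anova` is an illustrative name.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, df_b, df_w = one_way_anova(groups)

# Rescaling and shifting every observation leaves F unchanged.
rescaled = [[10.0 * v + 5.0 for v in g] for g in groups]
f_rescaled, _, _ = one_way_anova(rescaled)
```

The resulting F is then compared with the critical value of the F distribution for (df_between, df_within) at the chosen significance level.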

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and Implementation

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

I&I Case studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
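The behaviour described above can be reproduced with a small sketch of a normalized RBF layer: Gaussian activations around each centre, divided by their sum. The centres, output weights and helper names are made up; only the width of 0.15 echoes the fixed standard deviation mentioned for the vowel network, and this is not the Glove-TalkII implementation.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations around each centre, normalized to sum to 1."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    nearest = min(d2)
    # Subtracting the smallest distance cancels out after normalization
    # and avoids a zero denominator far away from every centre.
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations; weights[j] is the output vector of unit j."""
    h = normalized_rbf(x, centres, width)
    return [sum(h[j] * weights[j][k] for j in range(len(h)))
            for k in range(len(weights[0]))]

centres = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical gesture centres in x,y
weights = [[100.0], [200.0]]         # hypothetical one-dimensional output per centre
near_first = normalized_rbf((0.1, 0.0), centres, 0.15)
midway = rbf_output((0.5, 0.0), centres, 0.15, weights)
```

Close to a centre the nearest unit takes almost all of the activation (the plateau), while between centres the output interpolates between the units' weights.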

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13


• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca

Tuesday 22 October 13


ICASSP 2013 tutorial

Continuous Control

9

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Human to human interaction and music performance

10

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13


ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13


ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Figure: annotations on the process steps: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more…

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced Current
  – Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
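A common way to do this conversion is a voltage divider read by an ADC; the slide's circuit is not recoverable from the text, so the arrangement below (sensor on the supply side, fixed resistor to ground, ideal 10-bit ADC) is an assumption, as are the function names and component values.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Ideal ADC: map 0..v_ref onto integer codes 0..2^bits - 1 (truncating)."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return int(clamped / v_ref * levels)

# Hypothetical sensor whose resistance drops from 100 kOhm to 1 kOhm under load.
for r_sensor in (100_000.0, 10_000.0, 1_000.0):
    v = divider_voltage(r_sensor, r_fixed=10_000.0)
    print(r_sensor, round(v, 3), adc_counts(v))
```

The fixed resistor is usually chosen near the middle of the sensor's resistance range so that the output voltage uses as much of the ADC range as possible.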

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) touches the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD amp CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples along a time axis; multiple channels are possible (for example, microphone arrays)]

2D, 3D, ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
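A minimal sketch of a digital filter operating on samples, assuming a one-pole low-pass (exponential smoothing) as the example. The function name and the `alpha` coefficient are illustrative choices, not taken from the slides.

```python
def one_pole_lowpass(samples, alpha):
    """One-pole low-pass: y[n] = alpha * x[n] + (1 - alpha) * y[n-1], 0 < alpha <= 1."""
    y = []
    prev = 0.0                      # assumed initial filter state
    for x in samples:
        prev = alpha * x + (1.0 - alpha) * prev
        y.append(prev)
    return y

smoothed = one_pole_lowpass([1.0, 1.0, 1.0], 0.5)
```

A smaller `alpha` attenuates high frequencies more strongly at the cost of a slower response, which is a common trade-off when smoothing noisy sensor streams.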


Signals and Features

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
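The variants listed above can be combined in a few lines. The following is only a sketch under stated assumptions: a fixed threshold, a three-point local maximum test, a minimum spacing in samples, and parabolic interpolation through the three samples around each maximum. The function and parameter names are hypothetical.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return (position, amplitude) pairs for local maxima above a fixed threshold."""
    peaks = []
    for n in range(1, len(x) - 1):
        # Local maximum above the fixed threshold
        if x[n] > threshold and x[n] > x[n - 1] and x[n] >= x[n + 1]:
            # Parabolic interpolation through the three samples around the maximum
            denom = x[n - 1] - 2.0 * x[n] + x[n + 1]   # strictly negative here
            offset = 0.5 * (x[n - 1] - x[n + 1]) / denom
            amp = x[n] - 0.25 * (x[n - 1] - x[n + 1]) * offset
            pos = n + offset
            # Spacing constraint: of two peaks that are too close, keep the larger
            if peaks and pos - peaks[-1][0] < min_spacing:
                if amp > peaks[-1][1]:
                    peaks[-1] = (pos, amp)
            else:
                peaks.append((pos, amp))
    return peaks

peaks = find_peaks([0, 3, 4, 1, 0], threshold=0.5)
```

The output is a list of peak locations (fractional sample indices) and amplitudes. An adaptive threshold would replace the constant `threshold` with a value derived from recent signal statistics.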

Fourier Transform
[Figure: spectrum]

Short Time Fourier Transform
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: the FFT (fast Fourier transform)

Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
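The time-frequency tradeoff can be made concrete with a small calculation. This sketch assumes the usual definitions: a window of N samples spans N / fs seconds and its FFT bins are fs / N Hz apart. The helper name is illustrative.

```python
def stft_resolution(window_size, sample_rate):
    """Return (time span of one window in seconds, spacing of FFT bins in Hz)."""
    time_res = window_size / sample_rate
    freq_res = sample_rate / window_size
    return time_res, freq_res

for n in (256, 4096):
    t, f = stft_resolution(n, 22050)
    print(f"{n} samples: {t * 1000:.1f} ms per window, {f:.2f} Hz per bin")
```

The product of the two resolutions is always 1, so a longer window buys finer frequency resolution only by blurring events in time.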

Wavelets
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
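For sensor data, the linear case is often enough. The sketch below assumes timestamps that are strictly increasing and query times that are sorted and lie inside the recorded range; the function name is hypothetical.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each instant in new_times.

    Assumes times is strictly increasing and new_times is sorted and
    lies within [times[0], times[-1]].
    """
    out = []
    i = 0
    for t in new_times:
        # Advance to the segment [times[i], times[i + 1]] that contains t
        while i < len(times) - 2 and t > times[i + 1]:
            i += 1
        frac = (t - times[i]) / (times[i + 1] - times[i])
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

resampled = resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 20.0], [0.5, 2.0, 3.0])
```

This is useful when a sensor delivers samples at irregular times and they need to be placed on a uniform grid before further processing.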

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
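Calibration and scaling are often used together, as in this sketch. It assumes that calibration simply records the minimum and maximum raw reading and that scaling applies a power law to the normalized value. All names, and the idea of clamping out-of-range readings, are illustrative assumptions.

```python
def calibrate(readings):
    """Calibration step: find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)

def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a raw reading in [lo, hi] to [out_min, out_max] through a power law."""
    norm = (reading - lo) / (hi - lo)
    norm = min(1.0, max(0.0, norm))      # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent

lo, hi = calibrate([100, 612, 356])                   # e.g. raw knob readings
frequency = scale(356, lo, hi, 110.0, 880.0, exponent=2.0)
```

An exponent of 1 gives a linear mapping; exponents above 1 give finer control at the low end of the output range.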

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
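The equations on this slide did not survive extraction, so the following sketch assumes the standard unnormalized definition r[k] = sum over n of x[n] * x[n + k]. It uses the direct O(N^2) sum for clarity; the efficient route mentioned on the slide goes through the FFT instead. Function names are hypothetical.

```python
def autocorrelation(x):
    """Direct autocorrelation: r[k] = sum_n x[n] * x[n + k] for k = 0 .. N-1."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(n)]

def dominant_period(x, min_lag=1):
    """Lag (in samples) with the largest autocorrelation in [min_lag, N/2)."""
    r = autocorrelation(x)
    return max(range(min_lag, len(x) // 2), key=lambda k: r[k])

pulse_train = [1, 0, 0, 0] * 8          # a pulse every 4 samples
period = dominant_period(pulse_train)
```

Cross-correlation follows the same pattern with two different signals in the product, and its peak indicates the lag at which they align best.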


Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients
[Figure: mel-scale filterbank with 13 linearly-spaced filters and 27 log-spaced filters, drawn around each center frequency CF]
Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; the high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
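Energy compaction is easy to observe numerically. The sketch below assumes the orthonormal DCT-II, so that the sum of squared coefficients equals the signal energy, and a smooth ramp as the test signal. Neither choice comes from the slides.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence."""
    n_samples = len(x)
    out = []
    for k in range(n_samples):
        s = sum(x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k)
                for n in range(n_samples))
        scale = math.sqrt(1.0 / n_samples) if k == 0 else math.sqrt(2.0 / n_samples)
        out.append(scale * s)
    return out

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]    # smooth ramp
energy = [c * c for c in dct2(signal)]
share_of_first_two = sum(energy[:2]) / sum(energy)
```

For this ramp, the first two coefficients carry over 99% of the energy, which is why keeping only the low coefficients (as MFCCs do) loses little for smooth inputs.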

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: example image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
Projection matrix
PCA: eigenanalysis of correlation matrix

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Uncertainty and Time

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
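As a small illustration of the non-parametric case, here is a sketch of a k-nearest-neighbour classifier with Euclidean distance and majority voting. The function name and the toy "soft"/"loud" labels are made up for the example.

```python
import math
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k=3):
    """Label a query feature vector by majority vote among its k nearest neighbours."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: math.dist(train_vectors[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy training set: 2D feature vectors with ground truth labels
vectors = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["soft", "soft", "soft", "loud", "loud", "loud"]
prediction = knn_predict(vectors, labels, (0.2, 0.3))
```

There is no training step beyond storing the labelled feature vectors, which makes this a convenient baseline before trying parametric models.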

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
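A sketch of goodness of fit for the simplest case, assuming a single input feature, an ordinary least squares line, and the definition R squared = 1 - SS_residual / SS_total. Function names and the toy data are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs, ys = [0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 1.0, 2.0]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```

A value of 1 means the predictions match the true values exactly; a value near 0 means the model does no better than predicting the mean.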

Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: fusion architectures built from Feature Extraction, Dimensionality Reduction, Feature Selection and Classification blocks]

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data


Dynamic Time Warping

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Hidden Markov Models I
[Figure: hidden Markov model. A hidden state (single discrete variable) evolves from t-1 to t according to transition probabilities P( | ) and generates observations according to emission probabilities p( | ); together these form the model]

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
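The prediction and update steps can be sketched for the scalar case. This example makes a simplifying assumption that differs from the slide's constant dx/dt model: the state itself is assumed constant apart from Gaussian process noise, so mean and covariance reduce to two numbers. All names are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed constant apart from process noise."""
    x, p = x0, p0                      # prior mean and variance of the state
    estimates = []
    for z in measurements:
        # Prediction step: the model keeps the mean, the uncertainty grows
        p = p + process_var
        # Update step: compare prediction to measurement
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

estimates = kalman_1d([1.0, 1.0, 1.0], process_var=0.0, measurement_var=1.0)
```

The gain expresses the trade-off named above: a large `measurement_var` (unreliable observations) keeps the estimate close to the model prediction, while a large `process_var` (unreliable model) makes it follow the measurements.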


Multimodal tempo detection for the E-sitar

Beat tracking

Human Computer Interaction

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: the wizard; what the user sees]

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: rejection regions for the null hypothesis beyond the critical value t(p,df), shown as two regions or as one region]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
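The formulas on this slide were lost in extraction. The sketch below therefore assumes the standard pooled-variance (equal variance) form of the independent two-sample t-test, with df = n1 + n2 - 2. The function name and sample data are illustrative.

```python
import math

def pooled_t(sample1, sample2):
    """Return (t statistic, degrees of freedom) for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    combined_var = (ss1 + ss2) / df                          # combined (pooled) variance
    sed = math.sqrt(combined_var * (1.0 / n1 + 1.0 / n2))    # standard error of difference
    return (m1 - m2) / sed, df

t, df = pooled_t([1, 2, 3, 4, 5], [3, 4, 5, 6, 7])
```

Here |t| = 2 with df = 8, which is below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would not be rejected for these toy samples.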

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

[Table: critical values of t; columns for two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
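The "dividing two variances" step can be sketched for the one-way case. This assumes the standard F ratio of between-group to within-group mean squares; the function name and data are illustrative.

```python
def one_way_anova_f(groups):
    """F ratio of a one-way ANOVA: between-group variance / within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_ratio = one_way_anova_f(groups)
rescaled = [[10.0 * v + 5.0 for v in g] for g in groups]
f_rescaled = one_way_anova_f(rescaled)
```

Rescaling and shifting every observation leaves the ratio unchanged, which illustrates the independence from scaling and bias noted above. The ratio is then compared against an F table for (df_between, df_within).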

Integration and
I&I, Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: mappings ordered by approximate time per gesture for connected speech (msec), with axis ticks at 10-30, 100, 130, 200 and 500: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
• constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System
[Figure: system block diagram. Right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches and foot pedal feed a preprocessor. Blocks: Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer, Speech Output]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
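The behaviour described above can be sketched as follows, assuming Gaussian units with a shared width and activations divided by their sum over the group. This is an illustration of the general technique, not the Glove-TalkII implementation; the names and the two example centres are made up.

```python
import math

def normalized_rbf(x, centres, width):
    """Activations of Gaussian RBF units, normalized to sum to 1 over the group."""
    d2 = [math.dist(x, c) ** 2 for c in centres]
    nearest = min(d2)
    # Subtracting the smallest distance leaves the normalized result unchanged
    # but avoids dividing by zero when x is far from every centre
    raw = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(raw)
    return [r / total for r in raw]

centres = [(0.0, 0.0), (1.0, 0.0)]
halfway = normalized_rbf((0.5, 0.0), centres, width=0.15)
near_first = normalized_rbf((0.1, 0.0), centres, width=0.15)
```

Halfway between two centres both units respond equally. Close to one centre, that unit takes almost all of the activation, which produces the plateau around the nearest centre.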

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user


• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca


Continuous Control

Human to human interaction and music performance

Evolution of output devices

More output devices

SAGE

REACTABLE
Reactable, Music Technology Group (2006)

Smartphones as instruments
iPhone Ocarina from Smule™ (Wang et al., 2009)

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together

New Interfaces for Musical Expression (NIME)
First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Research on HCI/Music

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort
Effort and pain
If you are lucky

What to measure?
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more…

Transduction and Digitizing
• Electrical: Voltage, Resistance, Impedance
• Optical: Color, Intensity
• Magnetic: Induced Current, Field Direction

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
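One common way to turn a resistance change into a voltage is a two-resistor voltage divider followed by an analog-to-digital converter. The sketch below assumes that circuit; the 5 V supply, the 10 kΩ fixed resistor, the 10-bit converter and the function names are all illustrative assumptions, not values from the tutorial.

```python
def divider_voltage(v_in, r_sensor, r_fixed):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage in [0, v_ref] to an integer ADC code."""
    levels = (1 << bits) - 1            # 1023 for a 10-bit converter
    clipped = min(max(voltage, 0.0), v_ref)
    return round(clipped / v_ref * levels)

# Hypothetical sensor: low resistance when pressed, high resistance when released.
pressed = divider_voltage(5.0, r_sensor=1_000.0, r_fixed=10_000.0)
released = divider_voltage(5.0, r_sensor=100_000.0, r_fixed=10_000.0)
pressed_code = adc_code(pressed)
released_code = adc_code(released)
```

Placing the fixed resistor on the ground side means the measured voltage rises as the sensor resistance falls, so a harder press gives a larger code.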

Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When conductor (such as finger) applied to layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancelation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

2D/3D/ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account.

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
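As a concrete instance of a digital filter operating on samples, here is a minimal sketch of a one-pole low-pass filter, often used to smooth noisy sensor readings. The function name, the smoothing coefficient of 0.3 and the step input are assumptions made for illustration.

```python
def one_pole_lowpass(samples, alpha):
    """Apply y[n] = alpha * x[n] + (1 - alpha) * y[n-1], starting from y[-1] = 0."""
    out = []
    previous = 0.0
    for x in samples:
        previous = alpha * x + (1.0 - alpha) * previous
        out.append(previous)
    return out

# A step input: three zeros followed by twenty ones.
step = [0.0] * 3 + [1.0] * 20
smoothed = one_pole_lowpass(step, alpha=0.3)
```

A smaller alpha attenuates high frequencies more strongly but makes the output lag further behind the input, which matters for real-time control.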


Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
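The fixed-threshold, local-maximum and spacing-constraint variants can be combined in one small routine. This is a sketch under assumed parameter names and made-up readings, not a reference implementation; adaptive thresholds and interpolation are left out.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return sorted (index, amplitude) pairs for local maxima above a fixed threshold.

    A peak closer than min_spacing samples to a stronger peak is dropped.
    """
    candidates = [
        (i, signal[i])
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]
    kept = []
    # Visit the strongest candidates first so the weaker neighbours are the ones discarded.
    for index, amplitude in sorted(candidates, key=lambda p: p[1], reverse=True):
        if all(abs(index - k) >= min_spacing for k, _ in kept):
            kept.append((index, amplitude))
    return sorted(kept)

# Hypothetical sensor readings with three local maxima.
readings = [0.0, 0.2, 0.9, 0.3, 0.5, 0.1, 0.0, 0.7, 0.1]
peaks = find_peaks(readings, threshold=0.4, min_spacing=3)
```

Here the maximum at index 4 passes the threshold but is suppressed because it lies within three samples of the stronger peak at index 2.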


Fourier Transform
[Figure: Spectrum]

Short Time Fourier Transform
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Spectrogram
256 samples @ 22050 Hz
4096 samples @ 22050 Hz
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
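The trade-off for the two window sizes above can be worked out directly: the window duration is the window size divided by the sampling rate, and the spacing between FFT bins is the reciprocal of that duration. The helper name below is an assumption made for this sketch.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

short_window = stft_resolution(256, 22050)     # roughly 11.6 ms and 86.1 Hz
long_window = stft_resolution(4096, 22050)     # roughly 185.8 ms and 5.4 Hz
```

The short window localizes events in time about 16 times better, while the long window separates frequencies about 16 times better.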


Wavelets
• STFT: fixed time-frequency resolution, based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering, 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
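For sensor data, the linear case is often enough. The sketch below resamples irregularly timestamped readings onto a regular grid by linear interpolation; the function name, timestamps and values are hypothetical.

```python
from bisect import bisect_right

def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each point of new_times.

    times must be strictly increasing and every new time must lie in [times[0], times[-1]].
    """
    result = []
    for t in new_times:
        # Index i of the interval [times[i], times[i + 1]] that contains t.
        i = min(max(bisect_right(times, t) - 1, 0), len(times) - 2)
        fraction = (t - times[i]) / (times[i + 1] - times[i])
        result.append(values[i] + fraction * (values[i + 1] - values[i]))
    return result

# Hypothetical sensor stream with uneven timestamps (seconds).
times = [0.0, 0.1, 0.25, 0.4]
values = [0.0, 1.0, 4.0, 1.0]
uniform = resample_linear(times, values, [0.0, 0.1, 0.2, 0.3, 0.4])
```

Linear interpolation is cheap and causal with one sample of look-ahead, but it is not band-limited, so it trades quality for computational simplicity.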


Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
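Range calibration and power-law scaling can be combined as follows. This is an illustrative sketch: the function names, the knob readings, the 100-2000 Hz target range and the exponent of 2 are all assumptions.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)

def scale(reading, sensor_range, out_min, out_max, exponent=1.0):
    """Map a raw reading to [out_min, out_max] through a power-law curve."""
    lo, hi = sensor_range
    # Normalize to [0, 1] and clip readings that fall outside the calibrated range.
    normalized = min(max((reading - lo) / (hi - lo), 0.0), 1.0)
    return out_min + (normalized ** exponent) * (out_max - out_min)

# Hypothetical knob readings gathered while sweeping the knob end to end.
knob_range = calibrate([12, 480, 1003, 250])
frequency = scale(507.5, knob_range, out_min=100.0, out_max=2000.0, exponent=2.0)
```

With an exponent above 1 the lower half of the knob travel covers a small part of the output range, giving finer control at low values.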


Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
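A correlation-based period estimate can be sketched with the direct time-domain sum. This version is deliberately simple and costs O(N × lags); the efficient power-of-2 computation mentioned above would use the FFT instead. The function names and the test tone are assumptions.

```python
import math

def autocorrelation(signal, max_lag):
    """Direct autocorrelation r[lag] = sum_i x[i] * x[i + lag] for lags 0..max_lag."""
    n = len(signal)
    return [
        sum(signal[i] * signal[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]

def estimate_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the highest autocorrelation."""
    r = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Test tone with a period of 25 samples.
tone = [math.sin(2 * math.pi * n / 25) for n in range(200)]
period = estimate_period(tone, min_lag=5, max_lag=40)
```

The lag range has to exclude lag 0, where the autocorrelation is always largest, and should stop short of twice the expected period to avoid octave errors.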


Similarity Matrix


Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients
[Figure: Mel-scale filterbank with 13 linearly-spaced filters and 27 log-spaced filters; filter edge labels CF-130, CF+130 and CF 10718. Processing chain: Mel-filtering, Log, DCT, MFCCs.]

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
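Energy compaction can be seen on a small example. The sketch below uses the unnormalized DCT-II definition, which is one common convention and an assumption here; the slowly varying input vector is made up for illustration.

```python
import math

def dct_ii(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k) for n in range(n_samples))
        for k in range(n_samples)
    ]

# A slowly varying input, similar in spirit to a smooth log-spectrum.
smooth = [1.0, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7]
coefficients = dct_ii(smooth)
energy = [c * c for c in coefficients]
# Fraction of the total squared magnitude held by the two lowest coefficients.
low_fraction = sum(energy[:2]) / sum(energy)
```

For smooth inputs almost all of the squared magnitude sits in the first few coefficients, which is why the higher ones can be discarded.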


Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
• Projection matrix
• PCA: Eigenanalysis of correlation matrix

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
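The non-parametric case is the easiest to sketch: a k-nearest-neighbors classifier stores the training set and votes among the closest examples. The feature values and the "snare"/"kick" labels below are hypothetical, chosen only to echo the drum classification theme.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict the majority label among the k training vectors nearest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional features for two drum sounds.
train = [
    ((0.10, 0.20), "snare"), ((0.20, 0.10), "snare"), ((0.15, 0.25), "snare"),
    ((0.90, 0.80), "kick"), ((0.80, 0.90), "kick"), ((0.85, 0.75), "kick"),
]
first = knn_predict(train, (0.2, 0.2))
second = knn_predict(train, (0.8, 0.8))
```

No model is fitted in advance, so all of the cost is paid at prediction time, which can matter for real-time use with large training sets.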


Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
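Accuracy and the F-measure for one class of interest can be computed from paired label lists as sketched below. The function name and the example labels are assumptions made for illustration.

```python
def evaluate(true_labels, predicted_labels, positive):
    """Return (accuracy, F-measure) with `positive` as the class of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return accuracy, f_measure

truth = ["a", "a", "a", "b", "b", "b"]
guess = ["a", "a", "b", "b", "b", "a"]
accuracy, f_measure = evaluate(truth, guess, positive="a")
```

In this balanced example both numbers come out at two thirds; on imbalanced data they can diverge sharply, which is the reason to report more than accuracy.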


Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
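A one-variable ordinary least squares fit and its coefficient of determination can be written out by hand. The function names and data points below are assumptions for illustration; significance tests are not covered by this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    return slope, mean_y - slope * mean_x

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical, nearly linear measurements.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```

An R squared close to 1 means the line explains nearly all of the variance in the data; it says nothing by itself about whether the fit generalizes.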


Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel chains with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing keeps pace with the incoming sensor data, with some delay

Dynamic Time Warping


Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history – typically just previous time slice)
• Transition Model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
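The filtering task can be sketched for a discrete hidden state with the normalized forward recursion: predict with the transition model, weight by the emission probability of the new observation, then renormalize. The two-state model and its probabilities below are made up for illustration, and the function name is an assumption.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered belief P(state_t | observations up to t) for each t.

    transition[prev][cur] and emission[state][observation] are probability tables.
    """
    belief = list(prior)
    history = []
    n_states = len(prior)
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [
            sum(belief[prev] * transition[prev][cur] for prev in range(n_states))
            for cur in range(n_states)
        ]
        # Update: weight by the emission probability of the observation, then normalize.
        unnormalized = [emission[s][obs] * predicted[s] for s in range(n_states)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history

# Toy two-state model with hypothetical probabilities.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, observations=[0, 0])
```

Seeing observation 0 twice in a row raises the belief in state 0 from about 0.82 to about 0.88, because the second observation reinforces the first through the transition model.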


Hidden Markov Models I
[Figure: HMM unrolled over time slices 1 to 4, with a hidden state (single discrete variable) and observations. The model consists of transition probabilities P( · | · ) between slices t-1 and t and emission probabilities p( · | · ) at slice t.]

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
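The predict/update cycle and the reliability trade-off are easiest to see in one dimension. The sketch below assumes a scalar random-walk state model, which is a simplification of the general matrix form; the noise variances and measurements are made-up values.

```python
def kalman_step(mean, variance, measurement, process_var, measurement_var):
    """One predict/update cycle for a scalar state modelled as a random walk."""
    # Predict: the state is assumed to stay put while uncertainty grows by the process noise.
    predicted_mean = mean
    predicted_var = variance + process_var
    # Update: the gain weighs measurement reliability against model reliability.
    gain = predicted_var / (predicted_var + measurement_var)
    new_mean = predicted_mean + gain * (measurement - predicted_mean)
    new_var = (1.0 - gain) * predicted_var
    return new_mean, new_var

# Hypothetical noisy readings of a quantity whose true value is near 5.
mean, variance = 0.0, 1.0
for reading in [5.2, 4.8, 5.1, 4.9, 5.0]:
    mean, variance = kalman_step(mean, variance, reading,
                                 process_var=0.01, measurement_var=0.25)
```

A large measurement variance keeps the gain small so the estimate trusts the model; a large process variance pushes the gain toward 1 so the estimate follows the measurements.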


Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers: taught to “interfere” as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure panels: The Wizard; What the user sees]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at significance p

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution curves (axis labels X1, X2) marking the regions, or the single region, for rejecting the null hypothesis beyond the critical value t(p, df)]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
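The formulas on this slide did not survive extraction, so the following is only a sketch of the three steps for the independent, equal-variance case, using the textbook pooled-variance definitions. The function name and the sample data are hypothetical, and only the Python standard library is assumed.

```python
import math
from statistics import mean, variance


def independent_t(sample1, sample2):
    """Pooled (equal-variance) two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Combined variance: each sample variance weighted by its own df.
    pooled = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # Standard error of the difference between the two sample means.
    sed = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    t = (mean(sample1) - mean(sample2)) / sed
    return t, df


# Hypothetical task-completion times (seconds) for two interface designs.
design_a = [12.1, 11.4, 13.0, 12.7, 11.9]
design_b = [10.2, 10.9, 9.8, 10.5, 11.1]
t, df = independent_t(design_a, design_b)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared with the critical value t(p, df) for the chosen significance.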

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
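As a rough illustration of "dividing two variances", here is a minimal one-way ANOVA sketch using the standard between-group and within-group mean squares. The function name and the data are made up for the example; the second call checks the claim that the ratio does not change under scaling and bias.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_b, df_w = one_way_anova_f(groups)

# Same data after an arbitrary scaling (x2) and bias (+5).
rescaled = [[2.0 * x + 5.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
print(f_ratio, f_rescaled, df_b, df_w)
```

The F ratio is then compared with a table (or a library function) for the two degrees of freedom.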

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mapping types (Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator) placed along an axis of approximate time per gesture for connected speech (10-30, 100, 130, 200, 500 msec). Systems labelled: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
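The slides name a mixture-of-experts arrangement but do not give the combining function, so the sketch below assumes the common choice: a convex blend of the two expert outputs, weighted by the decision network's vowel probability. The function name and the parameter vectors are hypothetical, not the Glove-TalkII implementation.

```python
def combine_experts(p_vowel, vowel_params, consonant_params):
    """Blend two expert outputs with the gating probability p_vowel (assumed rule)."""
    if not 0.0 <= p_vowel <= 1.0:
        raise ValueError("p_vowel must be a probability in [0, 1]")
    if len(vowel_params) != len(consonant_params):
        raise ValueError("both experts must output the same number of parameters")
    return [
        p_vowel * v + (1.0 - p_vowel) * c
        for v, c in zip(vowel_params, consonant_params)
    ]


# Hypothetical two-parameter outputs from each expert.
blended = combine_experts(0.5, [0.0, 2.0], [2.0, 4.0])
print(blended)  # [1.0, 3.0]
```

With p_vowel at 1 or 0 the output is exactly one expert's parameters; values in between give a smooth transition between vowel and consonant sounds.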

Tuesday 22 October 13

134

Glove-TalkII System
[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
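A small sketch of the behaviour described above, assuming Gaussian units with a shared width whose activations are divided by their sum. The centres, the width and the function name are illustrative choices, not values from Glove-TalkII.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian RBF activations, normalized to sum to 1 over all units."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtracting the smallest squared distance leaves the normalized result
    # unchanged but avoids 0/0 when the input is far from every centre.
    nearest = min(d2)
    raw = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(raw)
    return [r / total for r in raw]


centres = [(0.0, 0.0), (1.0, 0.0)]        # hypothetical hidden-unit centres
near_first = normalized_rbf((0.1, 0.0), centres, width=0.15)
midpoint = normalized_rbf((0.5, 0.0), centres, width=0.15)
print(near_first, midpoint)
```

Near a centre the closest unit takes almost all of the normalized activation (the plateau), while the width sets how sharply the output interpolates between neighbouring centres.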

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]
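To make the 6-bit range concrete, the following sketch maps a continuous parameter onto integer codes 0 to 63. The parameter range passed in is an assumption for illustration; the slides give only the code range, not the per-parameter limits.

```python
def quantize_6bit(value, lo, hi):
    """Map a value in [lo, hi] to an integer code in [0, 63], clipping out-of-range input."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = min(max(value, lo), hi)
    return round((clipped - lo) / (hi - lo) * 63)


# Hypothetical formant-frequency range in Hz, for illustration only.
codes = [quantize_6bit(f, 200.0, 1000.0) for f in (150.0, 200.0, 1000.0, 1200.0)]
print(codes)  # [0, 0, 63, 63]
```

At the 100 Hz rate stated on the slide, one such code per synthesizer parameter would be sent every 10 ms.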

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI
Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca

Tuesday 22 October 13


ICASSP 2013 tutorial

Continuous Control

9

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Human to human interaction and music performance

10

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13


ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13


ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced Current
  – Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
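One common way to turn a resistance change into a voltage is a series voltage divider read by an ADC. The slide's circuit figure is not available, so the sketch below is an assumed setup: a 5 V supply, a 10 kΩ fixed resistor and a 10-bit converter, all chosen for illustration.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a series divider (sensor on the supply side)."""
    return v_supply * r_fixed / (r_fixed + r_sensor)


def adc_code(voltage, v_ref=5.0, bits=10):
    """Integer code an ideal ADC would report for the given voltage."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))


# Hypothetical sensor resistances in ohms: lightly pressed vs. firmly pressed.
for r_sensor in (30_000.0, 10_000.0):
    v = divider_voltage(5.0, 10_000.0, r_sensor)
    print(r_sensor, v, adc_code(v))
```

As the sensor resistance drops, the measured voltage and the ADC code rise, which is the signal the rest of the processing chain works with.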

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap
Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

Time Time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
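As one concrete instance of a digital filter operating on samples, here is a sketch of a one-pole low-pass (exponential smoothing), a common first choice for jittery sensor streams. The coefficient and the input are illustrative; the slides do not prescribe a particular filter.

```python
def one_pole_lowpass(samples, alpha, initial=0.0):
    """Smooth samples with y[n] = alpha * x[n] + (1 - alpha) * y[n-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    y = initial
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


step = [1.0, 1.0, 1.0, 1.0]
smoothed = one_pole_lowpass(step, alpha=0.5)
print(smoothed)  # [0.5, 0.75, 0.875, 0.9375]
```

A smaller alpha attenuates high frequencies more strongly, at the price of a slower response to real changes in the input.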

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
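The variants listed above can be combined in a few lines. This sketch uses a fixed threshold, a local-maximum test and a minimum spacing, keeping the strongest peak when two candidates are too close; the function name, signal and parameter values are made up for the example.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return (index, amplitude) pairs for peaks above threshold, at least min_spacing apart."""
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] >= threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]
    accepted = []
    # Visit candidates from strongest to weakest so that the larger peak
    # wins whenever two candidates violate the spacing constraint.
    for i in sorted(candidates, key=lambda idx: signal[idx], reverse=True):
        if all(abs(i - j) >= min_spacing for j in accepted):
            accepted.append(i)
    return [(i, signal[i]) for i in sorted(accepted)]


signal = [0, 1, 0, 3, 2.5, 2.8, 0, 0.2, 0, 4, 0]
peaks = find_peaks(signal, threshold=0.5, min_spacing=3)
print(peaks)  # [(3, 3), (9, 4)]
```

An adaptive threshold or sub-sample interpolation around each accepted index would be natural extensions of the same structure.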

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples 22050 Hz

4096 samples 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
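For the sensor-data case, linear interpolation is often enough to put irregularly timed readings onto a uniform grid. The sketch below assumes ascending timestamps and holds the first and last values outside the measured range; the names and data are illustrative.

```python
from bisect import bisect_right


def interpolate_linear(times, values, t):
    """Linearly interpolate (times, values) at time t; times must be ascending."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    k = bisect_right(times, t)          # times[k - 1] <= t < times[k]
    t0, t1 = times[k - 1], times[k]
    frac = (t - t0) / (t1 - t0)
    return values[k - 1] + frac * (values[k] - values[k - 1])


# Hypothetical sensor readings with irregular timestamps (milliseconds).
times = [0.0, 10.0, 30.0, 40.0]
values = [0.0, 1.0, 3.0, 2.0]
uniform = [interpolate_linear(times, values, float(t)) for t in range(0, 45, 5)]
print(uniform)
```

Band-limited (sinc-based) interpolation would replace the two-point blend with a weighted sum over many neighbouring samples.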

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
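Calibration and scaling often appear together as a two-step mapping: measure the range a sensor actually covers, then map readings into the control range, optionally through a power law. The sketch below uses hypothetical knob readings and a frequency range picked for illustration.

```python
def calibrate(readings):
    """Return the (minimum, maximum) observed during a calibration sweep."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map reading from [lo, hi] to [out_min, out_max] through a power law."""
    norm = (reading - lo) / (hi - lo)
    norm = min(max(norm, 0.0), 1.0)      # clip readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent


# Hypothetical raw knob readings collected while sweeping the full range.
lo, hi = calibrate([100, 340, 900, 512, 127])
linear = scale(500, lo, hi, 0.0, 1.0)
curved = scale(500, lo, hi, 0.0, 1.0, exponent=2.0)
freq_hz = scale(900, lo, hi, 110.0, 880.0, exponent=2.0)
print(lo, hi, linear, curved, freq_hz)
```

An exponent above 1 gives finer control near the low end of the output range, which is one reason power-law curves are popular for volume and frequency.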

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
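The slide's formulas are not available in this transcript, so here is a direct time-domain sketch of autocorrelation used for periodicity detection. It costs O(N × lags) operations rather than using the FFT-based computation mentioned above, and the test signal is synthetic.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    m = sum(x) / len(x)
    c = [v - m for v in x]
    return [
        sum(c[n] * c[n + lag] for n in range(len(c) - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation inside [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Synthetic gesture-like signal repeating every 20 samples.
x = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
period = estimate_period(x, min_lag=5, max_lag=50)
print(period)  # 20
```

Cross-correlation follows the same pattern with two different signals in the inner sum.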

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

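Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter centred at CF, with neighbouring centre frequencies labelled CF-130 and CF+130 (linear spacing) and CF/1.0718 and CF×1.0718 (log spacing)]
Processing chain: Mel-filtering -> Log -> DCT -> MFCCs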

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
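Energy compaction can be checked numerically. The sketch below implements an orthonormal DCT-II directly from its definition and measures how much of a smooth signal's energy lands in the first two coefficients; the ramp input is only an illustrative example of a smooth signal.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II computed directly from the definition (O(N^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


ramp = [float(v) for v in range(1, 9)]      # smooth 8-sample test signal
coeffs = dct_ii(ramp)
energy = sum(c * c for c in coeffs)
fraction = sum(c * c for c in coeffs[:2]) / energy
print(round(fraction, 4))
```

With the orthonormal scaling the total energy is preserved, so discarding the high coefficients loses only the small remainder.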

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
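The first two items are simple to sketch: delta features as frame-to-frame differences, and aggregate statistics over a feature track. The frame values and function names below are hypothetical.

```python
from statistics import mean, median, pvariance


def delta(frames):
    """First-order differences between consecutive feature frames."""
    return [
        [cur - prev for prev, cur in zip(a, b)]
        for a, b in zip(frames, frames[1:])
    ]


def summarize(track):
    """Aggregate statistics of one feature over time."""
    return {"mean": mean(track), "median": median(track), "variance": pvariance(track)}


frames = [[1.0, 2.0], [3.0, 5.0], [6.0, 9.0]]     # hypothetical 2-D features per frame
deltas = delta(frames)
stats = summarize([frame[0] for frame in frames])
print(deltas, stats)
```

Real systems often compute deltas with a regression over several neighbouring frames instead of a plain difference.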

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix
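As a rough illustration of PCA as eigenanalysis of the correlation matrix, the sketch below standardizes a small made-up data set, builds its correlation matrix, and finds the leading eigenvector by power iteration. Power iteration is a stand-in chosen to keep the example dependency-free; a full eigendecomposition from a numerical library would normally be used.

```python
import math

def standardize(rows):
    """Zero-mean, unit-variance columns, so that Z^T Z / n is the correlation matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def correlation_matrix(z):
    n, d = len(z), len(z[0])
    return [[sum(r[a] * r[b] for r in z) / n for b in range(d)] for a in range(d)]

def leading_eigenpair(m, iters=500):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    d = len(m)
    v = [1.0 / (j + 1) for j in range(d)]  # arbitrary asymmetric start vector
    for _ in range(iters):
        w = [sum(m[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[a] * m[a][b] * v[b] for a in range(d) for b in range(d))
    return eigenvalue, v

# Hypothetical 2-D features that are strongly correlated.
data = [[2.0, 1.9], [0.5, 0.7], [1.0, 1.1], [3.0, 3.2], [4.0, 3.8]]
z = standardize(data)
corr = correlation_matrix(z)
eigenvalue, direction = leading_eigenpair(corr)
explained = eigenvalue / len(corr)  # trace of a correlation matrix equals its dimension
projected = [sum(a * b for a, b in zip(row, direction)) for row in z]  # 1-D projection
```

Here `direction` plays the role of a one-column projection matrix, and `explained` is the fraction of total variance kept by that single component.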

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
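Of the algorithms listed, K-nearest neighbors is the simplest to write down. The sketch below is a minimal version with Euclidean distance and majority voting; the training vectors and the labels "soft" and "hard" are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    neighbours = sorted(zip(train_x, train_y),
                        key=lambda pair: math.dist(pair[0], query))
    votes = Counter(label for _, label in neighbours[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors for two classes of drum strokes.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.0, 0.3), (0.9, 0.8), (1.0, 1.0), (0.8, 0.9)]
train_y = ["soft", "soft", "soft", "hard", "hard", "hard"]

prediction_a = knn_predict(train_x, train_y, (0.15, 0.15))
prediction_b = knn_predict(train_x, train_y, (0.95, 0.90))
```

There is no training step: the "model" is the stored training set, which is what makes the method non-parametric.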

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
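The three measures named first can be computed directly from paired lists of true and predicted labels. The following sketch uses invented labels and the balanced F-measure (F1); the function names are illustrative.

```python
from collections import Counter

def confusion_matrix(truth, predicted):
    """Counts of (true label, predicted label) pairs."""
    return Counter(zip(truth, predicted))

def accuracy(truth, predicted):
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)

def f_measure(truth, predicted, positive):
    """Balanced F-measure (F1) for one class treated as the positive class."""
    cm = confusion_matrix(truth, predicted)
    tp = cm[(positive, positive)]
    fp = sum(c for (t, p), c in cm.items() if p == positive and t != positive)
    fn = sum(c for (t, p), c in cm.items() if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical ground truth and classifier output.
truth = ["a", "a", "a", "b", "b", "b", "b", "a"]
predicted = ["a", "b", "a", "b", "b", "a", "b", "a"]

cm = confusion_matrix(truth, predicted)
acc = accuracy(truth, predicted)
f1_a = f_measure(truth, predicted, "a")
```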

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  – DBSCAN
• Graph-based
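K-means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below is a minimal version of that loop (Lloyd's algorithm), assuming the initial centroids are supplied by the caller; the sample points are invented.

```python
import math

def kmeans(points, centroids, iters=20):
    """Lloyd's algorithm: returns final centroids and a cluster index per point."""
    centroids = [list(c) for c in centroids]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        updated = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
              for p in points]
    return centroids, labels

# Two well-separated hypothetical groups; K = 2 is given, as the formulation assumes.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, [points[0], points[3]])
```

The result depends on the initial centroids, which is one reason variants with smarter initialization exist.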

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
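A small worked example ties ordinary least squares to the coefficient of determination. The sketch below fits a single-feature line in closed form and computes R squared as 1 - SS_res / SS_tot; the data points are invented.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (single feature)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def correlation(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    return sab / math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))

# Hypothetical noisy measurements of a roughly linear relationship.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
r2 = r_squared(ys, predicted)
r = correlation(ys, predicted)  # for a least-squares line, r squared equals R squared
```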

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies: Regression and Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input, output or intermediate stages
• Various possible architectures combining machine learning building blocks
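One way to handle multiple rates is to interpolate the slower stream onto the faster stream's timestamps and then concatenate the values into one feature vector per time step (fusion at the input stage). The sketch below assumes linear interpolation and two invented streams; other interpolation schemes, or fusing at later stages, are equally possible.

```python
def interpolate(times, values, query_times):
    """Linear interpolation at ascending query times; values are clamped at the ends."""
    out = []
    j = 0
    for t in query_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        while times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[j] * (1.0 - w) + values[j + 1] * w)
    return out

# Hypothetical streams: a fast sensor at 100 Hz and a slow sensor at 50 Hz.
fast_t = [0.00, 0.01, 0.02, 0.03, 0.04]
fast_v = [0.0, 0.1, 0.2, 0.3, 0.4]
slow_t = [0.00, 0.02, 0.04]
slow_v = [1.0, 3.0, 2.0]

slow_on_fast = interpolate(slow_t, slow_v, fast_t)
fused = list(zip(fast_v, slow_on_fast))  # one joint feature vector per fast time step
```

Note that interpolating between two samples needs the later sample, so in a live system this adds a delay of up to one period of the slow stream.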

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late fusion are the extremes of a full spectrum of possibilities.

[Diagram: two parallel streams, each with Feature Extraction, Dimensionality Reduction, Feature Selection and Classification stages]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time
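The slide shows dynamic time warping only as a figure. As a reminder of how it works, here is a minimal sketch of the classic dynamic-programming formulation for two 1-D sequences, assuming absolute difference as the local cost and no window constraint; the function name is illustrative.

```python
def dtw_distance(a, b):
    """Cost of the cheapest monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]

same_shape = dtw_distance([1, 2, 3], [1, 2, 2, 3])  # same gesture, played slower
offset = dtw_distance([0, 0], [1, 1])               # constant offset of 1
```

Because the full cost table needs both complete sequences, this basic form is not causal; online variants restrict the search to a moving window.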

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over Time
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal models
• Filtering: posterior distribution over current state given evidence; also gives the likelihood of the evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

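[Diagram: hidden Markov model unrolled over time, with states 1 to 4. Labels: Hidden, Observed, Model, Transition Probs P( · | · ) between time slices t-1 and t, Emission Probs p( · | · ) at time t, Observations, Hidden State (single discrete variable)]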
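Two of the inference tasks, filtering and most likely explanation, are short to write for a discrete HMM once transition and emission probabilities are given. The sketch below uses a made-up two-state weather model (states, observations and all probabilities are assumptions for illustration) with one forward filtering step and the Viterbi algorithm.

```python
STATES = ("rain", "dry")
PRIOR = {"rain": 0.5, "dry": 0.5}
TRANSITION = {"rain": {"rain": 0.7, "dry": 0.3},   # P(state_t | state_t-1)
              "dry": {"rain": 0.3, "dry": 0.7}}
EMISSION = {"rain": {"umbrella": 0.9, "none": 0.1},  # P(observation_t | state_t)
            "dry": {"umbrella": 0.2, "none": 0.8}}

def filter_step(belief, observation):
    """One forward step: predict with the transition model, weight by emission, normalize."""
    predicted = {s: sum(belief[p] * TRANSITION[p][s] for p in STATES) for s in STATES}
    unnormalized = {s: EMISSION[s][observation] * predicted[s] for s in STATES}
    total = sum(unnormalized.values())
    return {s: v / total for s, v in unnormalized.items()}

def viterbi(observations):
    """Most likely hidden state sequence for the given observations."""
    best = {s: EMISSION[s][observations[0]]
               * sum(PRIOR[p] * TRANSITION[p][s] for p in STATES) for s in STATES}
    back = []
    for obs in observations[1:]:
        pointers, nxt = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda p: best[p] * TRANSITION[p][s])
            pointers[s] = prev
            nxt[s] = EMISSION[s][obs] * best[prev] * TRANSITION[prev][s]
        back.append(pointers)
        best = nxt
    state = max(STATES, key=lambda s: best[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

belief = filter_step(PRIOR, "umbrella")
belief2 = filter_step(belief, "umbrella")
path = viterbi(["umbrella", "umbrella", "none", "umbrella", "umbrella"])
```

Filtering is causal and suits live input; Viterbi needs the whole observation sequence before it can commit to a path.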

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t → t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y·y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
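The forward step can be illustrated with a scalar filter, where the mean and covariance matrix reduce to a mean and a variance. This is a minimal sketch under assumed values: `a` is the linear state model, `q` the process (model) noise variance and `r` the measurement noise variance; none of these numbers come from the tutorial.

```python
def kalman_step(mean, var, measurement, a=1.0, q=0.01, r=1.0):
    """One predict/update cycle of a scalar Kalman filter."""
    # Prediction step: push mean and variance through the linear model.
    mean_pred = a * mean
    var_pred = a * a * var + q
    # Update step: the gain weighs the measurement against the prediction.
    gain = var_pred / (var_pred + r)
    mean_new = mean_pred + gain * (measurement - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new, gain

# Hypothetical sensor that keeps reporting 5.0 while the prior belief is 0 +/- 1.
mean, var = 0.0, 1.0
history = []
for z in [5.0] * 20:
    mean, var, gain = kalman_step(mean, var, z)
    history.append((mean, var, gain))
```

Raising `r` relative to `q` makes the filter trust the model more and follow measurements more slowly; lowering it does the opposite, which is the trade-off in the last bullet.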

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure panels: The Wizard; What the user sees]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: one-tailed test (region for rejecting the null hypothesis) and two-tailed test (regions for rejecting the null hypothesis), with sample means X1 and X2 and critical value t(p, df)]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
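The equations on this slide did not survive extraction. The sketch below follows the three listed steps using the standard equal-variance (pooled) two-sample t-test, which is an assumption about what the slide showed: df = n1 + n2 - 2, the combined variance is the pooled sum of squares divided by df, and sed is the square root of the combined variance times (1/n1 + 1/n2). The sample values are invented.

```python
import math

def pooled_t(sample1, sample2):
    """Independent two-sample t statistic assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2                       # degrees of freedom
    combined_var = (ss1 + ss2) / df        # combined (pooled) variance
    sed = math.sqrt(combined_var * (1.0 / n1 + 1.0 / n2))  # standard error of difference
    t = (m1 - m2) / sed
    return t, df

# Hypothetical task-completion times for treatment and control groups.
treatment = [5.1, 4.9, 5.6, 5.8, 6.0]
control = [4.1, 4.4, 4.0, 4.9, 4.6]

t, df = pooled_t(treatment, control)

# Critical value from a standard t table: two-tailed alpha = 0.05, df = 8.
T_CRITICAL = 2.306
reject_null = abs(t) > T_CRITICAL
```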

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

[Table of critical t values; columns for two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Tuesday 22 October 13

ICASSP 2013 tutorial

ANOVA I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

ANOVA II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
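The "dividing two variances" step can be made concrete for the one-way case. The sketch below computes the F ratio as the between-group mean square over the within-group mean square, and checks the claim about scaling and bias by transforming every value; the group data are invented.

```python
def one_way_anova_f(groups):
    """F ratio (between-group / within-group mean square) and its degrees of freedom."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical ratings for three interface variants.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_between, df_within = one_way_anova_f(groups)

# Rescale and shift every observation: the F ratio should not change.
transformed = [[10.0 * x + 3.0 for x in g] for g in groups]
f_transformed, _, _ = one_way_anova_f(transformed)
```

The resulting F value is then compared against an F table (or converted to a p-value) using the two degrees-of-freedom values.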

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and implementation

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of gesture-to-speech mappings, ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Examples: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• 26+ dimensions
• constrained subspace
• 10 dimensions

[Diagram: Input → Output]

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of expert model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Diagram blocks: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz); Contact Switches; Foot Pedal; Preprocessor; Fixed Pitch Mapping; V/C Decision Network; Vowel Network; Consonant Network; Fixed Stop Mapping; Combining Function; Synthesizer; Speech Output]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0% error
  – Testing
    • 1380 consonants, 234 vowels
    • 0% error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std. deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1% error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
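The effect of the width parameter can be seen in a small sketch of normalized Gaussian RBF units. The centres and widths below are made-up values (0.15 echoes the fixed standard deviation quoted for the vowel network); this is not the Glove-TalkII implementation.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations around each centre, normalized to sum to 1."""
    sq = [math.dist(x, c) ** 2 for c in centres]
    nearest = min(sq)
    # Subtracting the smallest squared distance avoids underflow to 0/0
    # and leaves the normalized result unchanged.
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in sq]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, weights):
    """Network output: normalized activations times per-unit output weights."""
    h = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h[i] * weights[i][k] for i in range(len(h))) for k in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]  # hypothetical hidden-unit centres
weights = [[300.0], [700.0]]        # hypothetical output value per unit
x = (0.2, 0.0)                      # input closer to the first centre

narrow = normalized_rbf(x, centres, width=0.15)  # plateau: closest RBF dominates
wide = normalized_rbf(x, centres, width=1.0)     # smooth interpolation between units
out_narrow = rbf_output(x, centres, 0.15, weights)
out_wide = rbf_output(x, centres, 1.0, weights)
```

With the narrow width the output sits almost exactly at the nearest unit's value, while the wide setting blends the two units.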

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user

Tuesday 22 October 13

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13


ICASSP 2013 tutorial

Continuous Control

9

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Human to human interaction and music performance

10

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13


ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13


ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced Current
  – Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
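A common way to turn a variable resistance into a voltage is a voltage divider followed by an analog-to-digital converter. The sketch below assumes that circuit (sensor on the supply side, fixed resistor to ground), a 5 V supply, a 10 kΩ fixed resistor and a 10-bit converter; all of these are illustrative choices, not values from the tutorial.

```python
def divider_voltage(v_in, r_sensor, r_fixed):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code, clamped to the converter's range."""
    full_scale = (1 << bits) - 1
    code = int(voltage / v_ref * full_scale)
    return max(0, min(full_scale, code))

# Hypothetical force-sensing resistor: resistance drops as it is pressed harder.
V_IN = 5.0
R_FIXED = 10_000.0
light_press = divider_voltage(V_IN, 30_000.0, R_FIXED)  # higher resistance
firm_press = divider_voltage(V_IN, 10_000.0, R_FIXED)   # lower resistance

light_code = adc_code(light_press)
firm_code = adc_code(firm_press)
```

The choice of fixed resistor sets where in the sensor's resistance range the output voltage changes most, so it is usually matched to the expected range of the sensor.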

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

CCD & CMOS Camera

CMOS Cameras

• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron technology gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer a single-chip solution

Depth Camera

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
  – face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors

• Synchronous electric motors
  – i.e. frequency synchronized with the frequency of the supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors

• Stepper motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo motor
  – Motor coupled with a sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on the sensor bar (placed on the TV) for triangulation, for use as a pointing device
• A large diversity of control styles is possible in games and music

Kinect

• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets

• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ

• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels are possible (for example, microphone arrays)]

2D/3D ND + time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account.

Sensors <-> DSP <->

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and Time
• Human Computer Interaction
• Integration and Implementation
• Case Studies

Signals and Features

Filtering

• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Fundamental building blocks of signal processing

Peak Detection

• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
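The variants listed for peak detection can be combined in a few lines. The following is a minimal sketch, not the tutorial's own implementation: it assumes a fixed threshold, a simple three-point local-maximum test, a minimum spacing between peaks, and parabolic interpolation to refine location and amplitude. The function name and parameters are hypothetical.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (position, amplitude) pairs for local maxima of the sequence x.

    Positions are refined with parabolic interpolation through the
    three samples around each maximum.
    """
    peaks = []
    for n in range(1, len(x) - 1):
        a, b, c = x[n - 1], x[n], x[n + 1]
        # Fixed threshold and local-maximum test.
        if b < threshold or not (a < b >= c):
            continue
        # Vertex of the parabola through (n-1, a), (n, b), (n+1, c).
        denom = a - 2.0 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0.0 else 0.0
        pos = n + offset
        amp = b - 0.25 * (a - c) * offset
        # Spacing constraint: keep only the larger of two close peaks.
        if peaks and pos - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (pos, amp)
            continue
        peaks.append((pos, amp))
    return peaks


if __name__ == "__main__":
    signal = [0, 1, 0, 0, 3, 0]
    print(find_peaks(signal))
    print(find_peaks(signal, threshold=2))
    print(find_peaks(signal, min_spacing=5))
```

An adaptive threshold (for example a running median of the signal) could replace the fixed one without changing the rest of the loop.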

Fourier Transform

Spectrum

Short Time Fourier Transform

• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes
• FFT: the fast Fourier transform

Spectrogram

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)

Time-frequency tradeoff: 256 samples at 22050 Hz vs 4096 samples at 22050 Hz
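The time-frequency tradeoff for the two window sizes on the spectrogram slide can be worked out directly. This small sketch assumes non-overlapping windows and takes the bin spacing as sample rate divided by window size; the function name is illustrative.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in ms)."""
    freq_res_hz = sample_rate / window_size
    time_res_ms = 1000.0 * window_size / sample_rate
    return freq_res_hz, time_res_ms


if __name__ == "__main__":
    for size in (256, 4096):
        f, t = stft_resolution(size, 22050)
        print(f"{size:5d} samples: {f:7.2f} Hz per bin, {t:7.2f} ms per window")
```

At 22050 Hz, a 256-sample window gives roughly 86 Hz bins over about 11.6 ms, while a 4096-sample window gives roughly 5.4 Hz bins over about 186 ms: better frequency detail is paid for with coarser timing.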

Wavelets

• STFT: fixed time-frequency resolution, based on window size
• DWT: adaptive time-frequency resolution

Denoising

• Audio
  – Simplest approach: LPF
  – Spectral subtraction using a noise profile
  – Median filtering in the time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in the wavelet domain

Interpolation and Resampling

• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on the available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality against computational complexity
• For sensor data, frequently linear or quadratic
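For sensor streams, linear interpolation is often enough to move irregularly timed readings onto a regular grid. The sketch below assumes both time lists are sorted in ascending order and that the new times lie inside the measured range; the function name is hypothetical.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times.

    Assumes times and new_times are sorted ascending and that every
    new time falls within [times[0], times[-1]].
    """
    out = []
    i = 0
    for t in new_times:
        # Advance to the segment [times[i], times[i + 1]] containing t.
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        w = (t - t0) / (t1 - t0)
        out.append((1.0 - w) * values[i] + w * values[i + 1])
    return out


if __name__ == "__main__":
    # Sensor readings that arrived at uneven times, resampled at new instants.
    print(resample_linear([0, 1, 3], [0, 10, 30], [0, 0.5, 2, 3]))
```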

Calibration

• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling

• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency, or a slider to volume
• The dynamic range of the representation differs from the true dynamic range
• Power-law functions are frequently used
• Frequently used in conjunction with calibration
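Calibration and scaling are often chained: first find the observed range of a sensor, then map readings into the control parameter's range, optionally through a power law. A minimal sketch under those assumptions follows; the function names, the example readings and the 100-1000 Hz target range are illustrative, not from the tutorial.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from [lo, hi] to [out_min, out_max] with a power law.

    exponent=1.0 is a linear mapping; other values bend the curve.
    Readings outside the calibrated range are clamped.
    """
    x = (reading - lo) / (hi - lo)
    x = min(max(x, 0.0), 1.0)
    return out_min + (out_max - out_min) * x ** exponent


if __name__ == "__main__":
    lo, hi = calibrate([12, 500, 1010, 37, 998])
    for raw in (12, 511, 1010):
        print(raw, "->", round(scale(raw, lo, hi, 100.0, 1000.0, exponent=2.0), 1), "Hz")
```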

Periodicity Detection

• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation

Efficient computation when N is a power of 2
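As a sketch of correlation-based periodicity detection, the code below computes the autocorrelation directly in the time domain and picks the lag with the largest value. This direct form costs O(N²); the efficient version alluded to on the slide would use the FFT instead. Function names are hypothetical.

```python
def autocorrelation(x):
    """Unnormalized autocorrelation of the mean-removed sequence x."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(n)]


def dominant_period(x, min_lag=1):
    """Lag (in samples) with the strongest autocorrelation, up to half the length."""
    r = autocorrelation(x)
    return max(range(min_lag, len(x) // 2), key=lambda lag: r[lag])


if __name__ == "__main__":
    # A pulse every 4 samples, e.g. a strum detected on every fourth frame.
    pulses = [1, 0, 0, 0] * 8
    print(dominant_period(pulses))
```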

Similarity Matrix

Image Segmentation

• Partition an image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking

• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilizes some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients

Mel scale: 13 linearly-spaced filters, 27 log-spaced filters

Filter edges around centre frequency CF: CF-130 and CF+130 (linear filters); CF/1.0718 and CF×1.0718 (log filters)

Pipeline: Mel-filtering → Log → DCT → MFCCs

Discrete Cosine Transform

• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be thrown away
• MFCCs keep the first 13-20
• MDCT (overlap-based) is used in MP3, AAC and Vorbis audio compression

Feature Extraction: Image

• Color, texture, shape
• Example: color histograms

[Figure: example image reduced to 256 colors]

Feature Extraction: Dynamics

• Delta features
• Aggregate statistics of a time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis

• PCA: eigenanalysis of the correlation matrix
• Projection matrix
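To make the eigenanalysis concrete, here is a small pure-Python sketch that finds only the first principal component. Two simplifications are assumed and are not from the slide: it uses the covariance matrix rather than the correlation matrix (the two agree when features are standardized), and it uses power iteration instead of a full eigendecomposition. All names are illustrative.

```python
import math


def covariance_matrix(rows):
    """Return (sample covariance matrix, mean-centred rows)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centred


def first_component(cov, iterations=200):
    """Dominant eigenvector of cov by power iteration (unit length)."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
    cov, centred = covariance_matrix(data)
    pc1 = first_component(cov)
    # Projecting onto pc1 reduces each 2D point to a single coordinate.
    projected = [sum(a * b for a, b in zip(row, pc1)) for row in centred]
    print(pc1, projected)
```

Stacking several such eigenvectors as columns gives the projection matrix mentioned on the slide.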

Self-Organizing Maps

Uncertainty and Time

Classification Formulation

• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms

• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest neighbors
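Of the algorithms listed, k-nearest neighbors is the shortest to write down. The sketch below assumes Euclidean distance and majority voting; the gesture labels and feature values are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label for query by majority vote among the k nearest
    training examples. train is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical 2D features (e.g. mean and variance of an accelerometer axis).
    train = [
        ((0.1, 0.2), "tap"), ((0.2, 0.1), "tap"),
        ((0.9, 0.8), "shake"), ((0.8, 0.9), "shake"), ((0.85, 0.95), "shake"),
    ]
    print(knn_predict(train, (0.15, 0.15)))
    print(knn_predict(train, (0.9, 0.9)))
```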


Classification Evaluation

• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation

• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms

• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  – DBSCAN
• Graph-based
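A minimal k-means sketch follows, alternating the assignment and update steps for a fixed number of iterations. It assumes the caller supplies the initial centres (so the result is deterministic) and uses Euclidean distance; names and data are illustrative.

```python
import math


def kmeans(points, centres, iterations=10):
    """Plain k-means. Returns (final centres, clusters of points)."""
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        clusters = [[] for _ in centres]
        for p in points:
            idx = min(range(len(centres)), key=lambda i: math.dist(p, centres[i]))
            clusters[idx].append(p)
        # Update step: each centre moves to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        centres = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else ctr
            for cl, ctr in zip(clusters, centres)
        ]
    return centres, clusters


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (10, 10), (10, 11)]
    print(kmeans(pts, [(0, 0), (10, 10)]))
```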


Clustering Evaluation

• Internal evaluation (cluster properties)
  – Davies-Bouldin index
  – Dunn index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard index
  – Confusion matrix
• Various types of user studies

Regression Formulation

• Given a feature vector, predict a continuous value, e.g. given the day of the year and the humidity, predict the temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression Evaluation

• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
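As an illustration of ordinary least squares and the coefficient of determination, the sketch below fits a line to one input feature and reports R squared. The single-feature restriction and the function names are assumptions made to keep the example short.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    xs, ys = [0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8]
    slope, intercept = fit_line(xs, ys)
    print(slope, intercept, r_squared(xs, ys, slope, intercept))
```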

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies: Regression, Classification

Some Results

Advantages

• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion

• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion

Early and late are the extremes of a full spectrum of possibilities.

[Figure: two parallel pipelines, each with Feature Extraction, Dimensionality Reduction, Feature Selection and Classification stages]

Multi-modal Results

Main idea: use the camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time

• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

Dynamic Time Warping
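Dynamic time warping aligns two sequences that trace the same shape at different speeds, for instance two performances of one gesture. The sketch below is the textbook dynamic-programming form, assuming 1D sequences, absolute difference as the local cost and no window constraint; the function name is illustrative.

```python
def dtw_distance(a, b):
    """Cumulative cost of the best monotonic alignment between a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a advances alone
                                 cost[i][j - 1],      # b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]


if __name__ == "__main__":
    # The second gesture lingers on the middle value; DTW absorbs the delay.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
    print(dtw_distance([0, 0], [1, 1]))
```

Note that this offline form needs both complete sequences, so it is not causal; real-time use requires an incremental variant.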

Modeling Uncertainty over Time

• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal models

• Filtering: posterior distribution over the current state given the evidence (also yields the likelihood of the evidence)
• Prediction: posterior distribution of a future state given the evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update the models
  – Updated models provide new estimates, and the process iterates

Hidden Markov Models I

[Figure: a hidden state (single discrete variable) and the observations it emits at time steps 1 to 4. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t).]
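Given transition and emission probabilities, the filtering task can be sketched as a forward pass that alternates prediction and evidence update. The two-state model and its numbers below are an illustrative toy example, not from the tutorial, and the function name is hypothetical.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for every t.

    transition[i][j] = P(next state j | current state i)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the emission probability, then normalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


if __name__ == "__main__":
    # Hidden state 0 = "playing", 1 = "resting"; observation 1 = "sensor active".
    transition = [[0.7, 0.3], [0.3, 0.7]]
    emission = [[0.1, 0.9], [0.8, 0.2]]
    for b in forward_filter([0.5, 0.5], transition, emission, [1, 1]):
        print([round(p, 3) for p in b])
```

The normalizing constants computed at each step multiply together to give the likelihood of the whole observation sequence.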

Kalman Filtering

• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter

• Linear Gaussian conditional distributions represent the state and sensor models
• LG: P(x | y) = N(a y + b, σ)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
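The predict/update cycle is easiest to see in one dimension. The sketch below assumes a scalar random-walk state model (the state stays put apart from process noise) rather than the constant-velocity model mentioned on the slide; the variances, initial values and function name are illustrative.

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=0.1, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    Returns the state estimate after each measurement.
    """
    x, p = x0, p0  # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction step: the mean is unchanged, uncertainty grows.
        p = p + process_var
        # Update step: the gain balances model and observation reliability.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    noisy = [5.2, 4.7, 5.1, 4.9, 5.3, 4.8, 5.0, 5.1]
    print([round(e, 3) for e in kalman_1d(noisy)])
```

A larger meas_var makes the filter trust its model more and respond more slowly; a larger process_var makes it follow the measurements more closely.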


Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce a vast (unmanageable) amount of data
  – output is a description of practices rather than specific designs
• ethnographers are not trained as designers: they are taught to "interfere" as little as possible with the community

participatory design

• Users (often musicians) become 1st-class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of the work context
    • knowledge may be otherwise inaccessible to the design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz

• A method of testing a system that does not exist
  – the voice editor, by IBM (1984)

[Figure: "The Wizard" and "What the user sees"]

Wizard of Oz

• human simulates the system's intelligence and interacts with the user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood

• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

Parametric and non-parametric tests

• Parametric
  – Assume normality for the relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test

• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test

• the point: establish a confidence level in the difference we've found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate the value of the t statistic
  4. compare it to the critical value of t given p and df: t(p,df)
  5. if t > t(p,df), can reject the null hypothesis at significance p

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution with the critical value t(p,df) marked, showing the region (one-tailed) or regions (two-tailed) for rejecting the null hypothesis]

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
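The three steps can be sketched as follows, assuming the standard pooled-variance (equal variance) form of the independent two-sample test with df = n_a + n_b - 2. The function name and sample data are illustrative.

```python
import math
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Combined (pooled) variance of the two samples.
    pooled = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled * (1.0 / n_a + 1.0 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df


if __name__ == "__main__":
    # Hypothetical task-completion times (seconds) for two interface designs.
    control = [12.1, 11.4, 13.0, 12.7, 11.9]
    treatment = [10.2, 10.9, 9.8, 11.1, 10.4]
    t, df = independent_t(control, treatment)
    print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared against the critical value t(p,df) for the chosen significance level.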

comparing t with critical value

• look t(p,df) up in a pre-computed table
  – in the back of any statistics textbook
  – on the web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

[Table: critical values of t for two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I

• Generalizes the t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II

• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA

Integration and Implementation

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: mappings ordered by approximate time per gesture for connected speech (msec): Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator and Word Generator, with axis marks at 10-30, 100, 130, 200 and 500 msec. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary

• Loosely based on an articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like the tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping

• Input: 26+ dimensions (constrained subspace)
• Output: 10 dimensions

GTII Mapping

• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks

• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network

Glove-TalkII System

[Figure: block diagram. Inputs: right-hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches and foot pedal. A preprocessor feeds the V/C decision network, vowel network, consonant network, fixed pitch mapping and fixed stop mapping; a combining function drives the synthesizer, which produces the speech output.]

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network

• Various networks tried
  – Ended up with a normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

ICASSP 2013 tutorial 139

A Normalized RBF Network

bull Radially centred activation unitsndash Gaussian

activationbull Weights are centre

ndash Normalized over all units in groupbull Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization: interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
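
The behaviour of normalized RBF units can be sketched in a few lines. This is a minimal illustration, not the Glove-TalkII implementation: the function name `normalized_rbf`, the three example centres, and the test point are assumptions. Only the Gaussian activation, the normalization over all units, and the fixed width of 0.15 come from the slides.

```python
import math

def normalized_rbf(x, centres, width):
    """Normalized Gaussian RBF activations for input point x (hypothetical helper)."""
    # Exponent of each Gaussian unit: -||x - c||^2 / (2 * width^2).
    exponents = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        exponents.append(-dist_sq / (2.0 * width ** 2))
    # Shift by the largest exponent so the sum never underflows to zero.
    shift = max(exponents)
    raw = [math.exp(e - shift) for e in exponents]
    total = sum(raw)
    # Normalization: activations sum to 1 over all units in the group.
    return [r / total for r in raw]

# Three made-up centres in an x, y input space.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
activations = normalized_rbf((0.9, 0.1), centres, width=0.15)
```

With a small width the unit whose centre is closest to the input takes almost all of the normalized activation, which is the "plateau" behaviour described above; a larger width gives smoother interpolation between centres.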

Tuesday 22 October 13

ICASSP 2013 tutorial 142

Consonant Network
• 10-14-9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives, and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13

DIVA in the hands of

145

Tuesday 22 October 13

Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13

Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13

Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test, and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Continuous Control

9

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Human to human interaction and music performance

10

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13

ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13

ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP, or programming
• Integration: putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCI/Music

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

Sensors + signal processing
Actuators + signal processing
HCI

Engineering and programming
Music
Fun and Effort

Effort and pain

If you are lucky

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical
  Voltage
  Resistance
  Impedance
Optical
  Color
  Intensity
Magnetic
  Induced Current
  Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
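
A common way to turn a variable resistance into a voltage is a two-resistor voltage divider followed by an analog-to-digital converter. The sketch below assumes that circuit; the function names, the 5 V supply, and the 10-bit ideal converter are illustrative assumptions rather than details from the slides.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code, assuming an ideal converter."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)  # keep the input within range
    return int(clamped / v_ref * levels + 0.5)

# As the sensor resistance drops (for example, when force is applied to
# an FSR), the voltage across the fixed resistor and the ADC code both rise.
for r_sensor in (100_000.0, 10_000.0, 1_000.0):
    v = divider_voltage(r_sensor, r_fixed=10_000.0)
    print(f"{r_sensor:9.0f} ohm -> {v:.3f} V -> code {adc_code(v)}")
```

Choosing the fixed resistor near the middle of the sensor's resistance range gives the largest voltage swing for a given change in resistance.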

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs. 1-2 W)
• offer single-chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  • head, arms, and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition, and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer (3-axis)
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels are possible (for example, microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D/3D/ND + time

50

Sensors and Actuators

[Figure: two panels, each with a time axis]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
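
Several of the variants listed above can be combined in a short sketch. This is one possible combination (fixed threshold, local maximum, minimum spacing, parabolic interpolation), not a reference algorithm; `find_peaks` and `refine_peak` are hypothetical helper names.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima at or above threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger one.
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks

def refine_peak(signal, i):
    """Fractional peak location from a parabola through samples i-1, i, i+1."""
    left, mid, right = signal[i - 1], signal[i], signal[i + 1]
    denom = left - 2.0 * mid + right
    if denom == 0.0:
        return float(i)
    return i + 0.5 * (left - right) / denom

envelope = [0.0, 1.0, 0.0, 0.2, 0.0, 3.0, 2.5, 0.0, 4.0, 0.0]
peaks = find_peaks(envelope, threshold=0.5, min_spacing=2)
```

An adaptive threshold could replace the fixed one by computing `threshold` from a running mean or median of recent samples; that step is left out here to keep the sketch short.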

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
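
The time-frequency tradeoff for the two window sizes above follows from simple arithmetic: the window duration is N / fs and the spacing between FFT bins is fs / N. A small sketch (the function name `stft_resolution` is an assumption):

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

for n in (256, 4096):
    duration, bin_hz = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {duration * 1000:6.1f} ms window, {bin_hz:6.2f} Hz per bin")
```

At 22050 Hz, a 256-sample window spans about 11.6 ms with bins about 86 Hz apart, while a 4096-sample window spans about 186 ms with bins about 5.4 Hz apart: better frequency resolution is paid for with worse time resolution.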

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
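
For sensor streams that arrive at irregular times, linear interpolation onto a regular grid is often enough. The following is a minimal sketch of that case only; `resample_linear` is a hypothetical helper, and it assumes strictly increasing timestamps and query times inside the recorded range.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times.

    Assumes times is strictly increasing, new_times is sorted, and every
    new time lies within [times[0], times[-1]].
    """
    out = []
    j = 0
    for t in new_times:
        # Advance to the segment [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

# Irregularly timed sensor readings resampled onto chosen time points.
times = [0.0, 0.1, 0.25, 0.4]
values = [0.0, 1.0, 4.0, 1.0]
resampled = resample_linear(times, values, [0.0, 0.05, 0.1, 0.2, 0.4])
```

Linear interpolation is not band-limited, so for audio-rate signals one of the sinc-based variants listed above would normally be preferred.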

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
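
Calibration and scaling can be combined in one small object: first observe readings to learn the sensor's range, then map new readings through a power law onto the control range. The class below is an illustrative sketch; the name `CalibratedScaler`, the exponent, and the example ranges are assumptions.

```python
class CalibratedScaler:
    """Track a sensor's observed range and map readings to a control value."""

    def __init__(self, out_min, out_max, exponent=1.0):
        self.in_min = None
        self.in_max = None
        self.out_min = out_min
        self.out_max = out_max
        self.exponent = exponent

    def observe(self, reading):
        """Calibration step: widen the known input range."""
        self.in_min = reading if self.in_min is None else min(self.in_min, reading)
        self.in_max = reading if self.in_max is None else max(self.in_max, reading)

    def scale(self, reading):
        """Map a raw reading to [out_min, out_max] through a power law."""
        if self.in_min is None or self.in_max == self.in_min:
            return self.out_min  # not calibrated yet
        norm = (reading - self.in_min) / (self.in_max - self.in_min)
        norm = min(max(norm, 0.0), 1.0)  # clamp readings outside the range
        return self.out_min + (norm ** self.exponent) * (self.out_max - self.out_min)

# Example: a knob whose raw readings span 200..800 controls a frequency
# between 100 Hz and 1000 Hz with a squared response.
knob = CalibratedScaler(out_min=100.0, out_max=1000.0, exponent=2.0)
for raw in (200, 800, 500):
    knob.observe(raw)
```

After calibration, `knob.scale(500)` lands a quarter of the way up the output range (325 Hz) because the normalized reading 0.5 is squared.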

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
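
As a concrete instance of correlation-based periodicity detection, the sketch below computes the autocorrelation directly from its definition and picks the lag with the largest value. It uses the slow O(N * max_lag) form for clarity; the efficient version mentioned above would use the FFT instead. The function names and the test signal are assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation r[lag] = sum_i x[i] * x[i + lag] for lags 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sinusoid that repeats every 25 samples.
signal = [math.sin(2 * math.pi * i / 25) for i in range(200)]
period = estimate_period(signal, min_lag=5, max_lag=60)
```

The `min_lag` bound skips the trivial maximum at lag 0; choosing the search range from the musically plausible tempo or pitch range is a typical design decision.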

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

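[Figure: MFCC computation pipeline: Mel-filtering -> Log -> DCT -> MFCCs. Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters. Filter labels: CF (center frequency), CF-130, CF+130, and a factor of 1.0718 relative to CF.]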

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
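
Energy compaction can be checked numerically with a direct DCT-II. The sketch below is an O(N^2) illustration, not an optimized transform; `dct2` is a hypothetical helper and the smooth test sequence is made up.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) form)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

# A smooth sequence, loosely resembling a log spectral envelope.
smooth = [math.log(1.0 + i) for i in range(32)]
coeffs = dct2(smooth)
total_energy = sum(c * c for c in coeffs)
low_energy = sum(c * c for c in coeffs[:4])
print(f"first 4 of 32 coefficients hold {100 * low_energy / total_energy:.1f}% of the energy")
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the input, so the fraction held by the first few coefficients is a direct measure of compaction.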

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
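
Delta features and aggregate statistics are simple to compute once features are arranged as a sequence of frames. A minimal sketch, assuming each frame is a list of numbers of equal length (the helper names are hypothetical):

```python
def delta_features(frames):
    """First-order differences between consecutive feature frames."""
    return [[b - a for a, b in zip(prev, cur)]
            for prev, cur in zip(frames, frames[1:])]

def mean_and_variance(frames):
    """Per-dimension mean and population variance over a time sequence."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n
                 for d in range(dims)]
    return means, variances

# Three frames of a made-up 2-dimensional feature.
frames = [[1.0, 2.0], [2.0, 2.0], [4.0, 5.0]]
deltas = delta_features(frames)
means, variances = mean_and_variance(frames)
```

In practice the deltas are often appended to the original features, and the statistics are computed over a sliding texture window rather than the whole sequence.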

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
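
Of the algorithms listed, k-nearest neighbors is the shortest to write down. The sketch below is a plain majority-vote version with squared Euclidean distance; the function name `knn_predict`, the feature values, and the labels are all made up for illustration.

```python
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority label among the k nearest training vectors."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = sorted(range(len(train_x)),
                     key=lambda i: sq_dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-dimensional features for two drum sounds.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
           (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
train_y = ["kick", "kick", "kick", "snare", "snare", "snare"]
label = knn_predict(train_x, train_y, (0.2, 0.2), k=3)
```

k-NN has no training phase, which makes it convenient for quick experiments, but every prediction scans the whole training set.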

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
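
Accuracy and F-measure both follow from the entries of the confusion matrix. A minimal two-class sketch (the helper names and the toy vowel/consonant labels are assumptions):

```python
def confusion_counts(truth, predicted, positive):
    """Return (tp, fp, fn, tn) for one class treated as positive."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def f_measure(tp, fp, fn):
    """Harmonic mean of precision and recall (0.0 when undefined)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)

truth = ["v", "v", "v", "c", "c", "c", "c", "c"]
predicted = ["v", "v", "c", "c", "c", "c", "v", "c"]
tp, fp, fn, tn = confusion_counts(truth, predicted, positive="v")
accuracy = (tp + tn) / len(truth)
f1 = f_measure(tp, fp, fn)
```

With unbalanced classes, accuracy alone can look good while the minority class is poorly recognized, which is why the F-measure and stratified cross-validation are listed alongside it.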

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
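
K-means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below is a basic version of Lloyd's algorithm with caller-supplied initial centroids; the function name and the toy data are assumptions, and practical implementations add better initialization and restarts.

```python
def kmeans(points, centroids, iterations=20):
    """Basic Lloyd's algorithm; returns (centroids, assignment)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    centroids = [tuple(c) for c in centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        assignment = [min(range(len(centroids)),
                          key=lambda j: sq_dist(p, centroids[j]))
                      for p in points]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centroids.append(tuple(
                    sum(m[d] for m in members) / len(members)
                    for d in range(len(old))))
            else:
                new_centroids.append(old)  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignment

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignment = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

The result depends on the initial centroids, so in practice the algorithm is usually run several times from different starting points.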

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
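
For a single input variable, ordinary least squares and the coefficient of determination have closed forms. The sketch below fits a line and computes R squared; the helper names and the small data set are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```

R squared computed on the training data is optimistic; evaluating on held-out data, as with classification, gives a more honest estimate.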

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of input, output, or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence (also gives the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and

99

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

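[Figure: HMM unrolled over time slices 1-4. Hidden state (a single discrete variable) with transition probabilities P(state at t | state at t-1); observations with emission probabilities p(observation at t | state at t). Together the transition and emission probabilities form the model.]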
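
The filtering task for an HMM can be written as a short recursive update: predict with the transition probabilities, weight by the emission probability of the new observation, and normalize. The sketch below is a minimal forward filter; the function name, the two-state model, and all probabilities are made-up values for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for each time step."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight each state by how well it explains the observation.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Hypothetical two-state model: state 0 = "playing", state 1 = "resting";
# observation 0 = "onset detected", observation 1 = "no onset".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]   # transition[i][j] = P(j at t | i at t-1)
emission = [[0.9, 0.1], [0.2, 0.8]]     # emission[j][o] = P(o | state j)
beliefs = forward_filter(prior, transition, emission, observations=[0, 0])
```

Two consecutive "onset" observations raise the belief in state 0 from 0.5 to about 0.82 and then to about 0.88; the same recursion run with a max instead of a sum gives the most likely state sequence (Viterbi).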

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is a linear function of current state plus some Gaussian noise, i.e., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
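
The predict/update cycle is easiest to see in the scalar case. The sketch below is a simplification: it tracks a single value with a constant-state (random-walk) model rather than the constant dx/dt model mentioned above, and the function name and noise variances are assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns a list of (estimate, variance) pairs."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged, uncertainty grows by process noise.
        p = p + process_var
        # Update step: the gain balances model and measurement reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append((x, p))
    return estimates

# Noisy readings of a sensor that is actually resting near 1.0.
readings = [1.1, 0.9, 1.05, 0.95, 1.0]
track = kalman_1d(readings, process_var=0.001, measurement_var=0.1)
```

A large `measurement_var` relative to `process_var` makes the filter trust its model and smooth heavily; the opposite setting makes it follow the measurements closely, which is the reliability trade-off noted above.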

102

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman (formerly at Xerox PARC); ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: the wizard; what the user sees]

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference (or a more extreme one) arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: two-tailed (two rejection regions) and one-tailed (one rejection region) cases for rejecting the null hypothesis, bounded by the critical value t(p,df); sample means X1 and X2]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
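The formulas on this slide did not survive extraction. The steps above can be sketched as follows, assuming the standard independent two-sample t-test with pooled variance. The function name `pooled_t` and the sample data are hypothetical, chosen only for illustration.

```python
import math
from statistics import mean, variance

def pooled_t(a, b):
    """Independent two-sample Student t-test with pooled variance.

    Returns (t, df). Assumes both samples come from populations
    with equal variance, as stated on the Student t-test slide.
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2                                   # degrees of freedom
    # combined (pooled) variance of the two samples
    s2 = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    # standard error of the difference between the two means
    sed = math.sqrt(s2 * (1.0 / n1 + 1.0 / n2))
    t = (mean(a) - mean(b)) / sed
    return t, df

# Hypothetical data: five participants per group.
control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [3.0, 4.0, 5.0, 6.0, 7.0]
t, df = pooled_t(treatment, control)
```

With these numbers t is 2.0 at df = 8. The usual two-tailed critical value for α = 0.05 and df = 8 is about 2.306, so the null hypothesis would not be rejected at that level.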

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

[Table: critical values of t; columns are two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
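A minimal sketch of the "dividing two variances" step for the one-way case, assuming the textbook between-group / within-group decomposition. The function name and the three groups are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    values = [x for g in groups for x in g]
    grand_mean = mean(values)
    k, n = len(groups), len(values)
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # variation of each observation around its own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical scores for three interface designs.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f, df_b, df_w = one_way_anova_f(groups)

# The same data after a change of scale (x10) and bias (+5).
rescaled = [[10.0 * x + 5.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
```

Here F is 3.0 with (2, 6) degrees of freedom, and the rescaled data gives the same ratio, which illustrates the independence from scaling and bias noted above. The ratio is then compared against an F table, or converted to a p-value.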

Integration and
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing


Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: mapping types arranged along an axis of approximate time per gesture for connected speech]
• Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
• Systems: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)
• Approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input
  – 26+ dimensions
  – constrained subspace
• Output
  – 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System
[Figure: block diagram of the Glove-TalkII system]
• Inputs: Right Hand Data, Contact Switches, Foot Pedal
  – x, y, z, roll, pitch, yaw (60 Hz)
  – 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz)
• Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer
• Output: Speech


Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
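A small sketch of the normalized hidden layer described above, assuming Gaussian units of the form exp(-d²/(2σ²)) with one shared width σ. The two centres and the function name are hypothetical; only the width 0.15 is taken from the vowel network slide.

```python
import math

def normalized_rbf(x, centres, width):
    """Return one normalized Gaussian activation per centre (they sum to 1)."""
    dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    d_min = min(dists)
    # Subtracting the smallest squared distance leaves the normalized result
    # unchanged, but keeps the sum away from zero when the input is far from
    # every centre and all raw activations would underflow.
    acts = [math.exp(-(d - d_min) / (2.0 * width ** 2)) for d in dists]
    total = sum(acts)
    return [a / total for a in acts]

# Hypothetical 2D hand positions acting as vowel centres.
centres = [(0.0, 0.0), (1.0, 0.0)]
near_first = normalized_rbf((0.1, 0.0), centres, width=0.15)
midpoint = normalized_rbf((0.5, 0.0), centres, width=0.15)
```

Near the first centre its unit takes almost all of the activation (the plateau), while exactly halfway between the centres the two units share it equally. A smaller width makes the transition between plateaus sharper.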


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca


Continuous Control

Human to human interaction and music performance

Evolution of output devices

More output devices

SAGE

REACTABLE
Reactable, Music Technology Group (2006)

Smartphones as instruments
iPhone Ocarina from Smule™ (Wang et al. 2009)

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and

Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together

New Interfaces for Musical Expression (NIME)
First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Research on HCI/Music

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

Transduction and Digitizing
• Electrical: Voltage, Resistance, Impedance
• Optical: Color, Intensity
• Magnetic: Induced Current, Field Direction

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
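One common way to do this conversion is a voltage divider followed by an analog-to-digital converter. The sketch below assumes a 5 V supply, a fixed 10 kΩ resistor and a 10-bit ADC; these values and the function names are assumptions for illustration, not taken from the slides.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC reading in [0, 2**bits - 1]."""
    full_scale = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)   # the ADC cannot read outside its range
    return round(clamped / v_ref * full_scale)

# Hypothetical sensor whose resistance is 30 kOhm at rest.
v_out = divider_voltage(r_sensor=30_000, r_fixed=10_000)
reading = adc_counts(v_out)
```

As the sensor resistance drops (for example when a force-sensing resistor is pressed), the output voltage and the ADC reading rise. The fixed resistor is usually chosen near the middle of the sensor's resistance range so that the useful readings are spread over the ADC scale.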

Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) touches the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example, microphone arrays)]

2D/3D, ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account.

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
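A minimal sketch combining several of the variants listed above: a fixed threshold, a local-maximum test, a minimum spacing between peaks, and parabolic interpolation of location and amplitude. The function name, parameter names and test signal are hypothetical, and the adaptive-threshold variant is not covered.

```python
def detect_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs for the peaks of x."""
    peaks = []  # entries are (sample index, interpolated location, interpolated amplitude)
    for n in range(1, len(x) - 1):
        a, b, c = x[n - 1], x[n], x[n + 1]
        # fixed threshold and local-maximum test
        if b < threshold or not (b > a and b >= c):
            continue
        # fit a parabola through the three points around the maximum
        denom = a - 2.0 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        candidate = (n, n + offset, b - 0.25 * (a - c) * offset)
        # spacing constraint: of two peaks that are too close, keep the larger
        if peaks and n - peaks[-1][0] < min_spacing:
            if b > x[peaks[-1][0]]:
                peaks[-1] = candidate
            continue
        peaks.append(candidate)
    return [(loc, amp) for _, loc, amp in peaks]

signal = [0, 1, 4, 1, 0, 0.5, 0, 3, 5, 3, 0]
peaks = detect_peaks(signal, threshold=1.0, min_spacing=3)
strongest = detect_peaks(signal, threshold=1.0, min_spacing=10)
```

The small bump at index 5 falls below the threshold and is ignored. With the larger spacing only the stronger of the two remaining peaks survives.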

Fourier Transform
[Figure: spectrum]

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
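The tradeoff can be made concrete for the two window sizes on the slide. This sketch only does the arithmetic: window duration is N / fs and FFT bin spacing is fs / N. The function name is hypothetical.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, frequency bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

short_time, short_freq = stft_resolution(256, 22050)
long_time, long_freq = stft_resolution(4096, 22050)
```

The 256-sample window spans about 11.6 ms with bins about 86 Hz apart, while the 4096-sample window spans about 186 ms with bins about 5.4 Hz apart. Better frequency resolution is paid for with coarser time resolution.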

Wavelets
• STFT: fixed time-frequency resolution based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain
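A one-dimensional sketch of median filtering, the simplest case of the technique listed for both audio and images. It assumes a sliding window that shrinks at the ends of the sequence; the function name and data are hypothetical.

```python
from statistics import median

def median_filter(x, size=3):
    """Replace each sample by the median of a window of `size` samples around it."""
    half = size // 2
    out = []
    for n in range(len(x)):
        lo = max(0, n - half)            # window is truncated at the edges
        hi = min(len(x), n + half + 1)
        out.append(median(x[lo:hi]))
    return out

spike = median_filter([1, 1, 9, 1, 1])   # isolated outlier
step = median_filter([0, 0, 0, 5, 5, 5]) # sharp edge
```

The isolated spike is removed, while the step stays exactly where it was. That is the edge-preserving property that makes the median preferable to simple averaging for impulsive noise.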

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data: frequently linear or quadratic
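A minimal sketch of the linear case mentioned for sensor data: evaluating a sampled sequence at fractional indices and using that to change the sampling rate. The function names and the 100 Hz to 200 Hz example are hypothetical. Linear interpolation is an approximation and does not give the perfect reconstruction of sinc interpolation.

```python
def interp_linear(samples, t):
    """Linearly interpolate `samples` at fractional index t, 0 <= t <= len(samples) - 1."""
    i = min(int(t), len(samples) - 2)    # left neighbour, clamped at the last sample
    frac = t - i
    return samples[i] * (1.0 - frac) + samples[i + 1] * frac

def resample_linear(samples, src_rate, dst_rate):
    """Resample a sequence from src_rate to dst_rate by linear interpolation."""
    n_out = int((len(samples) - 1) * dst_rate / src_rate) + 1
    return [interp_linear(samples, k * src_rate / dst_rate) for k in range(n_out)]

sensor = [0.0, 10.0, 20.0, 30.0]               # four readings at 100 Hz
upsampled = resample_linear(sensor, 100, 200)  # the same span at 200 Hz
```

This is often enough to bring a slow sensor stream onto the time grid of a faster one before combining them.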

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
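The range-finding example can be sketched as a small helper that tracks the minimum and maximum seen during a calibration phase and then normalizes later readings to [0, 1]. The class name and the raw readings are hypothetical.

```python
class RangeCalibrator:
    """Learn the min/max of a sensor during calibration, then normalize readings."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, value):
        # called for every reading while the performer exercises the full range
        self.lo = min(self.lo, value)
        self.hi = max(self.hi, value)

    def normalize(self, value):
        if self.hi <= self.lo:            # not calibrated yet, or constant input
            return 0.0
        u = (value - self.lo) / (self.hi - self.lo)
        return min(max(u, 0.0), 1.0)      # clamp readings outside the learned range

cal = RangeCalibrator()
for reading in [200, 800, 500]:           # hypothetical raw ADC values
    cal.observe(reading)
mid = cal.normalize(500)
```

Running one calibrator per sensor gives a common [0, 1] response from several sensors of the same type, even if their raw ranges differ.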

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  • Power law functions frequently used
• Frequently used in conjunction with calibration
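A minimal sketch of power-law scaling from a normalized control value to a parameter range. The function name, the exponent and the 20 Hz to 20 kHz range are assumptions for illustration.

```python
def scale_power(u, out_min, out_max, exponent=2.0):
    """Map a normalized value u in [0, 1] to [out_min, out_max] with a power law."""
    u = min(max(u, 0.0), 1.0)
    return out_min + (out_max - out_min) * u ** exponent

# Knob at its halfway position, mapped to a frequency in Hz.
half_knob = scale_power(0.5, 20.0, 20000.0, exponent=3.0)
```

With an exponent above 1, more of the knob's travel is spent on the low end of the range, which suits parameters such as frequency or loudness that are perceived roughly logarithmically. An exponent of 1 gives a plain linear mapping.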

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
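The formulas on this slide were lost in extraction. As a rough sketch, the autocorrelation can be computed directly from its definition and its largest value over a lag range used as a period estimate. This direct form costs O(N²) and is not the FFT-based computation the slide refers to. The function names and the pulse train are hypothetical.

```python
from statistics import mean

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    m = mean(x)
    c = [v - m for v in x]
    return [sum(c[i] * c[i + lag] for i in range(len(c) - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] with the largest autocorrelation value."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A pulse every 8 samples, e.g. onsets of a steady beat.
pulses = [1.0 if n % 8 == 0 else 0.0 for n in range(64)]
period = estimate_period(pulses, min_lag=2, max_lag=20)
```

The estimate is 8 samples. Multiples of the true period (16 here) also score highly, which is why a minimum and maximum lag are usually chosen from what is musically plausible.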


Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

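Mel Frequency Cepstral Coefficients
• Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter with centre frequency CF; edges at CF-130 and CF+130 for the linearly-spaced filters, and at CF/1.0718 and CF×1.0718 for the log-spaced filters]
• Processing chain: Mel-filtering, Log, DCT, MFCCs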

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
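Energy compaction can be checked numerically. The sketch below assumes the orthonormal DCT-II, computed directly from its definition, and applies it to a smooth 8-point ramp. The function name and the test signal are hypothetical.

```python
import math

def dct2_orthonormal(x):
    """Orthonormal DCT-II of a sequence, computed directly in O(N^2)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        # scaling that makes the transform preserve energy
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * s)
    return coeffs

ramp = [float(i) for i in range(8)]           # a smooth, slowly varying signal
coeffs = dct2_orthonormal(ramp)
total_energy = sum(c * c for c in coeffs)
low_energy_fraction = sum(c * c for c in coeffs[:2]) / total_energy
```

For this ramp the first two of the eight coefficients hold more than 99% of the energy, which is why keeping only the low coefficients loses little for smooth signals such as log mel spectra.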

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
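Delta features and aggregate statistics can be sketched in a few lines. The frame layout (a list of per-frame feature vectors) and the names `delta` and `summarize` are assumptions made for this illustration; the delta here is the simplest first difference rather than a regression-based estimate.

```python
from statistics import mean, median, pvariance

def delta(frames):
    """First-order difference between consecutive feature vectors.

    The first frame has no predecessor, so its delta is set to zeros.
    """
    deltas = [[0.0] * len(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        deltas.append([c - p for p, c in zip(prev, cur)])
    return deltas

def summarize(track):
    """Aggregate statistics of one feature over time."""
    return {"mean": mean(track),
            "median": median(track),
            "variance": pvariance(track)}

frames = [[1.0, 10.0], [2.0, 8.0], [4.0, 8.0]]   # invented example data
first_feature = [f[0] for f in frames]
```

Calling `delta(frames)` gives the frame-to-frame change of each feature, and `summarize(first_feature)` collapses one feature trajectory into a fixed-length description.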

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix
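A minimal sketch of the eigenanalysis step, assuming the covariance matrix of mean-centred data (the correlation matrix is the same computation on standardized features) and using power iteration to find only the first principal direction. The function names are illustrative.

```python
import math

def covariance_matrix(data):
    """Return (covariance matrix, mean-centred data) for rows of features."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return cov, centered

def first_component(cov, iterations=200):
    """Dominant eigenvector by power iteration.

    Assumes the start vector is not orthogonal to the dominant direction.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def project(centered, component):
    """Project each centred row onto the component (a 1-D projection matrix)."""
    return [sum(a * b for a, b in zip(row, component)) for row in centered]

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]   # points on y = 2x
cov, centered = covariance_matrix(data)
pc1 = first_component(cov)
scores = project(centered, pc1)
```

For points lying on a line, the first component aligns with that line and the projected scores preserve all of the variance in a single dimension.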

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps


Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning: a training set is provided, consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
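As an illustration of the non-parametric case, here is a minimal k-nearest-neighbours sketch. It assumes Euclidean distance and majority voting; the training points and the labels "soft" and "loud" are invented for the example.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Label the query by majority vote among its k nearest training points."""
    ranked = sorted((math.dist(x, query), label)
                    for x, label in zip(train_x, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Two well-separated groups of 2-D feature vectors (hypothetical data).
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["soft", "soft", "soft", "loud", "loud", "loud"]
```

No model is fitted: the training set itself is the model, which is why the method is called non-parametric.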

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
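The three measures named above can be computed directly from paired lists of true and predicted labels. This is a small sketch with invented labels; the function names are illustrative, and the F-measure shown is the balanced F1 for a single class treated as positive.

```python
from collections import Counter

def confusion_matrix(truth, predicted):
    """Count of each (true label, predicted label) pair."""
    return Counter(zip(truth, predicted))

def accuracy(truth, predicted):
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)

def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    tp = sum(t == positive and p == positive for t, p in zip(truth, predicted))
    fp = sum(t != positive and p == positive for t, p in zip(truth, predicted))
    fn = sum(t == positive and p != positive for t, p in zip(truth, predicted))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

truth = ["a", "a", "a", "b", "b"]
predicted = ["a", "a", "b", "b", "a"]
```

In cross-validation the same functions would be applied to each held-out fold and the results averaged.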

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
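A minimal k-means sketch follows. To keep it deterministic it takes the first K points as initial centroids, which is an assumption made for the example (random or k-means++ initialization is more common in practice). The data are invented.

```python
import math

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm with the first k points as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        for c, members in enumerate(clusters):
            if members:                       # keep old centroid if cluster is empty
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    assignments = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                   for p in points]
    return centroids, assignments

points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
centroids, assignments = kmeans(points, k=2)
```

Each iteration alternates the two steps of the algorithm: assign every point to its nearest centroid, then move each centroid to the mean of its members.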

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
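A minimal sketch of ordinary least squares for a single input feature, together with the coefficient of determination. The data points and the names `fit_line` and `r_squared` are invented for the illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predictions):
    """1 - (residual sum of squares / total sum of squares)."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 4.0, 8.0]
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
fit_quality = r_squared(ys, predictions)
```

An R squared of 1 means the line passes through every point; values near 0 mean the fit explains little more than the mean of the targets does.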

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate

Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks
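The interpolation step can be sketched as follows, assuming linear interpolation of a slow stream onto the timestamps of a faster one, followed by early fusion (concatenating the aligned values into one feature vector). The stream names, rates and values are hypothetical.

```python
from bisect import bisect_right

def resample(times, values, new_times):
    """Linearly interpolate (times, values) at new_times.

    times must be strictly increasing; queries outside the range are clamped.
    """
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            i = bisect_right(times, t)        # times[i-1] <= t < times[i]
            t0, t1 = times[i - 1], times[i]
            w = (t - t0) / (t1 - t0)
            out.append(values[i - 1] + w * (values[i] - values[i - 1]))
    return out

# Slow stream (e.g. a position tracker) and fast stream (e.g. flex sensors).
slow_t, slow_v = [0.0, 1.0, 2.0], [0.0, 10.0, 20.0]
fast_t, fast_v = [0.0, 0.5, 1.0, 1.5, 2.0], [3.0, 3.1, 3.2, 3.3, 3.4]

aligned = resample(slow_t, slow_v, fast_t)
fused = list(zip(fast_v, aligned))            # early fusion: one vector per time step
```

Late fusion would instead run a separate model on each stream and combine their decisions, so interpolation would be applied to the model outputs rather than to the inputs.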

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late fusion are the extremes of a full spectrum of possibilities

[Figure: two parallel pipelines, each with the stages Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use the camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done at the same time as the sensor data arrives, with some delay

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping
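The slide body is a figure, so the following is only a sketch of the standard dynamic-programming formulation, assuming one-dimensional sequences, absolute difference as the local cost and the three usual step directions. The function name is illustrative.

```python
def dtw_distance(a, b):
    """Minimum cumulative cost of aligning sequence a with sequence b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],       # a advances
                                     cost[i][j - 1],       # b advances
                                     cost[i - 1][j - 1])   # both advance
    return cost[n][m]
```

A gesture performed at a different speed, such as `[1, 2, 2, 3]` against the template `[1, 2, 3]`, aligns with zero cost because the repeated sample is absorbed by the warping.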

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over Time
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal models
• Filtering: posterior distribution over current state given evidence to date (also gives the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor readings, and update models
  - Updated models provide new estimates and …
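Filtering can be sketched as a predict-then-update recursion over a discrete hidden state. The two-state "rain / umbrella" numbers below are a common textbook example, used here as an assumption and not taken from the tutorial; the function name is illustrative.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state after each observation.

    transition[i][j] = P(state_t = j | state_t-1 = i)
    emission[j][o]   = P(observation = o | state = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        updated = [emission[j][obs] * predicted[j] for j in range(n)]
        total = sum(updated)
        belief = [u / total for u in updated]
        history.append(belief)
    return history

# State 0 = rain, state 1 = no rain; observation 1 = umbrella seen.
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.1, 0.9], [0.8, 0.2]]
beliefs = forward_filter([0.5, 0.5], transition, emission, [1, 1])
```

The recursion is causal: each belief depends only on the previous belief and the current observation, which is what makes filtering usable in real time.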

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

[Figure: hidden Markov model. Hidden state (single discrete variable) with transition probabilities P(state t | state t-1); observations with emission probabilities p(observation t | state t). Together these form the model.]

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea (t -> t+1)
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
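A scalar sketch of the forward step, assuming a random-walk state model (the state stays put and only its uncertainty grows between measurements). The variances and the function name are assumptions chosen for the illustration.

```python
def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Return the (mean, variance) estimate after each measurement."""
    estimates = []
    for z in measurements:
        # Prediction: random-walk model, uncertainty grows by the process noise.
        var += process_var
        # Update: the gain balances model reliability against sensor reliability.
        gain = var / (var + measurement_var)
        mean += gain * (z - mean)
        var *= (1 - gain)
        estimates.append((mean, var))
    return estimates

estimates = kalman_1d([1.0, 1.0], process_var=0.0, measurement_var=1.0)
```

A large `measurement_var` makes the gain small, so the filter trusts its model; a large `process_var` does the opposite and makes the estimate follow the sensor closely.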

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming …

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
• ethnographers are not trained as designers: taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)

[Figure: two panels captioned "What the user sees" and "The Wizard"]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
• Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at p

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution around sample mean X1 with X2 beyond the critical value t(p,df); either two regions for rejecting the null hypothesis (two-tailed) or one region (one-tailed)]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
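The slide's formulas were images and did not survive extraction, so the sketch below assumes the standard independent-samples t-test with pooled (combined) variance and df = n1 + n2 - 2. The sample values and the function name are invented.

```python
import math

def independent_t(sample1, sample2):
    """Return (t, df) for two independent samples, assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                         # combined variance
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))       # standard error of difference
    return (m1 - m2) / sed, df

control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [3.0, 4.0, 5.0, 6.0, 7.0]
t, df = independent_t(control, treatment)
```

Here |t| = 2 with df = 8, which is below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would not be rejected for these invented samples.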

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical value
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-…

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
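The "dividing two variances" step can be sketched for the one-way case: the F ratio is the between-group variance over the within-group variance. The groups below are invented and the function name is illustrative.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, means))
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, df_b, df_w = one_way_anova(groups)

# Rescaling and shifting every observation leaves F unchanged.
rescaled = [[10 * x + 3 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova(rescaled)
```

The resulting F is then compared with the critical value of the F distribution for (df_between, df_within) degrees of freedom. With exactly two groups, F equals the square of the pooled-variance t statistic.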

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and Implementation

I&I Case studies

• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: gesture-to-speech mappings ordered by approximate time per gesture for connected speech (msec), with axis ticks 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X,Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches and foot pedal. Blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function, synthesizer. Output: speech.]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (015)
  - Input: x,y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 01 error
• Testing
  - 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
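The normalization behaviour can be sketched with Gaussian units whose activations are divided by their sum. The centres, the width of 0.2 and the output weights are arbitrary values chosen for the illustration and are not the Glove-TalkII parameters.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations around each centre, normalized to sum to 1.

    Inputs extremely far from every centre can underflow to a zero sum;
    this sketch does not guard against that case.
    """
    acts = [math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2)) for c in centres]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, weights):
    """Weighted sum of the normalized hidden activations (one row per unit)."""
    h = normalized_rbf(x, centres, width)
    return [sum(h[i] * weights[i][j] for i in range(len(h)))
            for j in range(len(weights[0]))]

centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[100.0], [200.0]]       # one output parameter per hidden unit

midway = normalized_rbf((0.5, 0.0), centres, width=0.2)
far_left = normalized_rbf((-1.0, 0.0), centres, width=0.2)
```

Halfway between two centres the output interpolates between their weights. Far from both, the nearest unit takes essentially all of the normalized activation, which is the plateau effect described above.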

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 005 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user

Tuesday 22 October 13

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13


ICASSP 2013 tutorial

Continuous Control

9

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Human to human interaction and music performance

10

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13


ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13


ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human to human interaction
  - Require layer of DSP and ML between input and …

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music …

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Figure labels: Sensors + signal processing | Actuators + signal processing | HCI | Engineering and programming | Music | Fun and Effort | Effort and pain | If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
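One common way to turn a resistance change into a voltage is a voltage divider feeding an analog-to-digital converter. The sketch below assumes that circuit, with a 5 V supply, a 10 kΩ fixed resistor and a 10-bit converter; all of these values and both function names are illustrative assumptions.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor when it is in series with the sensor."""
    return v_supply * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer reading, clamped to the top code."""
    return min(2 ** bits - 1, int(voltage / v_ref * 2 ** bits))

# A force-sensing resistor drops in resistance as it is pressed.
light_touch = adc_counts(divider_voltage(5.0, 10_000, 100_000))
firm_press = adc_counts(divider_voltage(5.0, 10_000, 10_000))
```

The fixed resistor sets the operating range: the reading is most sensitive when the sensor resistance is close to the fixed one.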

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras

• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancelation
• Cheap

ICASSP 2013 tutorial

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors

• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors

• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• A wide variety of control styles is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect

• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets

• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ

• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D/3D/ND + time

50

Sensors and Actuators

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering

• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
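As a concrete instance of a digital filter operating on samples, here is a minimal sketch of a one-pole low-pass filter, a common choice for smoothing sensor streams. The function name and the RC-based coefficient formula are choices made for this illustration, not something specified in the tutorial.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Smooth a sequence with y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    dt = 1.0 / sample_rate_hz
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # time constant of the analog RC prototype
    a = dt / (rc + dt)                      # smoothing coefficient, 0 < a < 1
    out, y = [], 0.0
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out


# A constant input passes through; a signal alternating every sample is attenuated.
steady = one_pole_lowpass([1.0] * 200, cutoff_hz=10.0, sample_rate_hz=100.0)
jitter = one_pole_lowpass([(-1.0) ** n for n in range(200)], 10.0, 100.0)
print(steady[-1], max(abs(v) for v in jitter[-20:]))
```

Lowering `cutoff_hz` smooths more but makes the output lag further behind the input, which matters for real-time control.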

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection

• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
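The variants listed above can be combined in a few lines. The following is a minimal sketch, assuming a fixed threshold, a strict local-maximum test, a minimum spacing in samples and parabolic interpolation around each maximum; the function name and its parameters are hypothetical.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima above a fixed threshold."""
    peaks = []  # entries are (interpolated location, amplitude, integer index)
    for n in range(1, len(x) - 1):
        if x[n] <= threshold or not (x[n - 1] < x[n] >= x[n + 1]):
            continue
        # Parabola through the three samples around the maximum.
        denom = x[n - 1] - 2.0 * x[n] + x[n + 1]  # strictly negative at a peak
        offset = 0.5 * (x[n - 1] - x[n + 1]) / denom
        amp = x[n] - 0.25 * (x[n - 1] - x[n + 1]) * offset
        if peaks and n - peaks[-1][2] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (n + offset, amp, n)  # keep the stronger of two close peaks
            continue
        peaks.append((n + offset, amp, n))
    return [(loc, amp) for loc, amp, _ in peaks]


signal = [0, 1, 0, 0, 3, 0]
print(find_peaks(signal))
print(find_peaks(signal, threshold=2.0))
print(find_peaks(signal, min_spacing=5))
```

An adaptive threshold could be obtained by passing a value derived from a running mean or median of the signal instead of a constant.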

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view

Filterbank view

Efficient implementation for power-of-2 window sizes

FFT: the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
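The tradeoff for the two window sizes on this slide can be worked out directly: a longer window covers more time per frame but gives more closely spaced frequency bins. The helper name below is hypothetical; the arithmetic is just window duration = N / fs and bin spacing = fs / N.

```python
def stft_resolution(window_size, sample_rate_hz):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    duration_ms, spacing_hz = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {duration_ms:7.2f} ms per window, {spacing_hz:6.2f} Hz per bin")
```

At 22050 Hz the 256-sample window spans about 11.6 ms with bins about 86 Hz apart, while the 4096-sample window spans about 186 ms with bins about 5.4 Hz apart.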

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising

• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling

• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
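For sensor data, linear interpolation is often enough. The sketch below resamples irregularly timed readings onto a uniform grid; the function names are hypothetical, and values outside the sampled range are simply clamped to the first or last reading.

```python
from bisect import bisect_right


def linear_interp(times, values, t):
    """Value at time t by linear interpolation; clamps outside the sampled range."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    frac = (t - times[i - 1]) / (times[i] - times[i - 1])
    return values[i - 1] + frac * (values[i] - values[i - 1])


def resample(times, values, rate_hz):
    """Resample (time, value) readings onto a uniform grid starting at times[0]."""
    n = int((times[-1] - times[0]) * rate_hz) + 1
    return [linear_interp(times, values, times[0] + k / rate_hz) for k in range(n)]


print(linear_interp([0, 1, 3], [0, 10, 30], 2))
print(resample([0.0, 1.0], [0.0, 1.0], 4))
```

The same helper can bring several sensor streams with different rates onto a common time base before they are combined.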

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration

• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling

• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
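Calibration and scaling are often chained: first learn the range a sensor actually produces, then map the normalized reading onto the control parameter. The following is a minimal sketch of that chain; the class, the function, the exponent of 2 and all the readings are assumptions for illustration.

```python
class RangeCalibrator:
    """Track the minimum and maximum reading seen, then normalize to [0, 1]."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, reading):
        self.lo = min(self.lo, reading)
        self.hi = max(self.hi, reading)

    def normalize(self, reading):
        if self.hi <= self.lo:  # not calibrated yet
            return 0.0
        u = (reading - self.lo) / (self.hi - self.lo)
        return min(1.0, max(0.0, u))


def power_law_map(u, out_min, out_max, exponent=2.0):
    """Map a normalized value u in [0, 1] to [out_min, out_max] with a power law."""
    return out_min + (out_max - out_min) * (u ** exponent)


cal = RangeCalibrator()
for raw in (100, 900, 450):  # hypothetical readings from sweeping a knob
    cal.observe(raw)
print(power_law_map(cal.normalize(500), 0.0, 100.0))
```

An exponent above 1 gives finer control near the low end of the output range, which is one reason power laws are popular for volume and frequency mappings.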

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection

• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
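A correlation-based periodicity estimate can be sketched by evaluating the autocorrelation directly and picking the lag with the largest value. This direct form costs O(N x lags); the efficient computation mentioned on the slide goes through the FFT instead. Function names and the test signal are illustrative assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[lag] = sum_i x[i] * x[i + lag]."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical test signal: a sinusoid that repeats every 20 samples.
signal = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
print(estimate_period(signal, min_lag=5, max_lag=30))
```

Restricting the search to a plausible lag range avoids the trivial maximum at lag 0 and reduces octave errors.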

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation

• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking

• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

CFCF-130CF 10718

CF+130CF 10718

Mel-filtering -> Log -> DCT -> MFCCs

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform

• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
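Energy compaction is easy to see numerically. The sketch below evaluates an orthonormal DCT-II directly and measures how much of a smooth signal's energy sits in its first few coefficients; the direct O(N^2) evaluation and the ramp test signal are choices made for illustration only.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) evaluation)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        coeffs.append(scale * total)
    return coeffs


def energy_fraction(coeffs, keep):
    """Fraction of total energy carried by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total


ramp = [float(i) for i in range(16)]  # a smooth, slowly varying test signal
coeffs = dct2(ramp)
print(round(energy_fraction(coeffs, 4), 4))
```

With the orthonormal scaling the transform preserves total energy, so the fraction is meaningful as a measure of how much of the signal survives truncation.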

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image

• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics

• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation

• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
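K-nearest neighbors is the simplest of these to write down: the training set itself is the model. This is a minimal sketch with Euclidean distance and majority voting; the feature vectors and the labels "a" and "b" are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training vectors.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2D feature vectors with ground truth labels.
train = [((0, 0), "a"), ((0, 1), "a"), ((5, 5), "b"), ((6, 5), "b"), ((5, 6), "b")]
print(knn_predict(train, (0.2, 0.3)))
print(knn_predict(train, (5.0, 5.2)))
```

Because distances are computed on raw feature values, features with very different ranges are usually scaled before being used with this kind of classifier.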

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation

• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
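The three measures named above can be computed from paired lists of true and predicted labels. The sketch below uses a sparse confusion matrix keyed by (true, predicted) and the balanced F-measure (F1); the function names and labels are illustrative assumptions.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Counts keyed by (true label, predicted label)."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(true_labels, predicted_labels):
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def f_measure(true_labels, predicted_labels, positive):
    """Harmonic mean of precision and recall for one class (F1)."""
    cm = confusion_matrix(true_labels, predicted_labels)
    tp = cm[(positive, positive)]
    fp = sum(c for (t, p), c in cm.items() if p == positive and t != positive)
    fn = sum(c for (t, p), c in cm.items() if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


truth = [1, 1, 0, 0]
guess = [1, 0, 0, 0]
print(accuracy(truth, guess), f_measure(truth, guess, positive=1))
```

In cross-validation these measures are computed on each held-out fold and then averaged.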

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation

• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters, noise and border points ignored)
  – DBScan
• Graph-based
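K-means alternates between assigning each point to its nearest center and moving each center to the mean of its assigned points. The following is a minimal sketch with caller-supplied initial centers and a fixed number of iterations, both simplifications made for this illustration; the points are hypothetical.

```python
import math


def kmeans(points, centers, iterations=20):
    """Plain k-means: assign points to nearest center, then recompute centers."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            clusters[nearest].append(p)
        centers = [
            tuple(sum(c) / len(members) for c in zip(*members)) if members else centers[j]
            for j, members in enumerate(clusters)  # an empty cluster keeps its old center
        ]
    return centers, clusters


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
centers, clusters = kmeans(points, centers=[(0, 0), (10, 10)])
print(centers)
```

The result depends on the initial centers, so in practice the algorithm is often run several times from different starting points.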

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation

• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation

• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation

• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
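For a single input feature, ordinary least squares and the coefficient of determination reduce to a few sums. The sketch below fits a line and reports R squared as 1 - SS_res / SS_tot; the function names and the data are illustrative assumptions.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return slope, my - slope * mx


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0, 1, 2, 3]
ys = [1, 3, 5, 7]  # hypothetical, exactly linear data
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(ys, [slope * x + intercept for x in xs]))
```

Predicting the mean of `ys` for every input gives an R squared of 0, which is the usual baseline for judging a fit.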

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages

Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two processing chains, each Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea use camera to constrain factorization results taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time

• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done with some delay at the same time as the sensor data

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over

• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal

• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

99

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

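[Figure: hidden Markov model. Hidden state (single discrete variable) with transition probabilities P(state at t | state at t-1); observations with emission probabilities p(observation at t | state at t). Transition and emission probabilities together form the MODEL.]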
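Filtering in a hidden Markov model can be sketched as a repeated predict-and-update loop over the transition and emission probabilities. The code below is a minimal illustration; the two-state model and all of its probabilities are assumptions chosen for the example (the familiar textbook rain/umbrella setup), not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the hidden state after each observation has been absorbed.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight by how well each state explains the observation, then normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief


# Hypothetical model: state 0 = rain, state 1 = no rain; observation 1 = umbrella seen.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.1, 0.9], [0.8, 0.2]]
belief = forward_filter(prior, transition, emission, observations=[1, 1])
print([round(b, 3) for b in belief])
```

The normalizing constants computed along the way multiply together to give the likelihood of the whole observation sequence.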

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering

• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter

• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a·y + b, σ²)(x)
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
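The predict and update steps are easiest to follow in one dimension, where the mean and covariance are plain numbers. This is a minimal sketch assuming a random-walk state model (the state is expected to stay where it is); the function name, noise variances and measurements are hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    Returns the state estimate after each measurement.
    """
    x, p = x0, p0  # state mean and variance
    estimates = []
    for z in measurements:
        p = p + process_var            # predict: uncertainty grows, mean unchanged
        k = p / (p + measurement_var)  # gain: how much to trust the measurement
        x = x + k * (z - x)            # update the mean toward the measurement
        p = (1.0 - k) * p              # update the variance
        estimates.append(x)
    return estimates


estimates = kalman_1d([5.0] * 50, process_var=1e-4, measurement_var=0.1)
print(round(estimates[0], 3), round(estimates[-1], 3))
```

The gain `k` expresses the trade-off named on the slide: a large `measurement_var` makes the filter lean on its model, a small one makes it follow the observations.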

102

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz

• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: “The Wizard” and “What the user sees”]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz

• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood

• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test

• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution curves with sample means X1 and X2 and the critical value t(p, df) marked; one panel shows two regions for rejecting the null hypothesis, the other a single region]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
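The formulas on this slide did not survive extraction. As a stand-in, here is a minimal sketch of the standard independent two-sample t statistic with pooled (combined) variance, which matches the equal-variance assumption stated earlier; the function name and the two samples are hypothetical.

```python
import math
from statistics import mean, variance


def independent_t(sample1, sample2):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Combined variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (mean(sample1) - mean(sample2)) / sed
    return t, df


control = [1, 2, 3, 4, 5]     # hypothetical scores
treatment = [2, 3, 4, 5, 6]
t, df = independent_t(control, treatment)
print(t, df)
```

For these made-up samples t = -1.0 with df = 8; the two-tailed critical value at α = 0.05 for 8 degrees of freedom is 2.306, so the null hypothesis would not be rejected.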

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical

• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I

• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II

• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
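The "dividing two variances" step can be sketched for the one-way case: the between-group mean square is divided by the within-group mean square to give the F ratio. The function name and the groups are illustrative assumptions.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F ratio for one-way ANOVA; returns (F, df_between, df_within)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = mean(v for g in groups for v in g)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]  # hypothetical measurements
f, df_b, df_w = one_way_anova_f(groups)
print(f, df_b, df_w)

# Rescaling and shifting every measurement leaves the F ratio unchanged.
rescaled = [[10.0 * v + 3.0 for v in g] for g in groups]
print(one_way_anova_f(rescaled)[0])
```

With exactly two groups the F ratio equals the square of the pooled-variance t statistic, which is the sense in which ANOVA generalizes the t-test.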

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13

ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

Mapping type, with approximate time/gesture for connected speech (msec):

• Artificial Vocal Tract: 10-30
• Phoneme Generator: 100
• Finger Spelling: 130
• Syllable Generator: 200
• Word Generator: 500

Systems placed along the spectrum: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping

• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks

• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: system block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech Output.]

Tuesday 22 October 13

ICASSP 2013 tutorial 136

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13

ICASSP 2013 tutorial 138

GTII Vowel Network

• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units

• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
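The behaviour described on the last two slides can be reproduced with a small forward pass: Gaussian units centred on stored points, normalized so that the hidden activations sum to one. This is a sketch only; the centres, the width of 0.15 and the output weights are made-up values, not the trained Glove-TalkII parameters.

```python
import math


def normalized_rbf(x, centers, width):
    """Gaussian activations around each centre, normalized to sum to 1."""
    acts = [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centers]
    total = sum(acts)  # may underflow to 0 if x is very far from every centre
    return [a / total for a in acts]


def rbf_output(x, centers, width, weights):
    """Linear output layer on top of the normalized hidden units.

    weights[j][k] is the contribution of hidden unit j to output k.
    """
    hidden = normalized_rbf(x, centers, width)
    n_out = len(weights[0])
    return [sum(hidden[j] * weights[j][k] for j in range(len(centers))) for k in range(n_out)]


centers = [(0.0, 0.0), (1.0, 0.0)]  # hypothetical hand positions
weights = [[0.0], [10.0]]           # hypothetical one-dimensional output targets
print(normalized_rbf((0.0, 0.0), centers, 0.15))
print(rbf_output((0.5, 0.0), centers, 0.15, weights))
```

Near a centre the closest unit takes almost all of the normalized activation, which produces the plateau; halfway between two centres the output interpolates between their targets.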

Tuesday 22 October 13

Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05% error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1% error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Discrete Control

Continuous Control

Human to human interaction and music performance

Evolution of output devices

More output devices

SAGE

REACTABLE
Reactable, Music Technology Group (2006)

Smartphones as instruments
iPhone Ocarina from Smule™ (Wang et al. 2009)

Beyond direct mapping
• Direct Mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond Direct Mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain and person specific
  - More similar to human to human interaction
  - Require layer of DSP and ML between input and

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI or user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together

New Interfaces for Musical Expression (NIME)
First organized as a workshop of ACM CHI’2001
Experience Music Project, Seattle, April 2001
Lectures, Discussions, Demos, Performances

Research on HCI/Music

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Sensors and Actuators

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more …

Transduction and Digitizing
• Electrical: voltage, resistance, impedance
• Optical: color, intensity
• Magnetic: induced current, field direction

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
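One common way to turn a variable resistance into a voltage is a voltage divider read by an ADC. The sketch below shows the arithmetic only; the 5 V supply, the 10-bit converter, the fixed resistor value and all function names are assumptions for illustration, not details from the slides.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * levels)

def sensor_resistance(code, r_fixed, bits=10):
    """Invert the divider (assuming v_in equals v_ref) to recover resistance."""
    levels = (1 << bits) - 1
    if code <= 0:
        return float("inf")  # no voltage across r_fixed: open circuit
    return r_fixed * (levels - code) / code
```

For example, a 10 kΩ sensor against a 10 kΩ fixed resistor sits at half the supply voltage, which lands near the middle of the ADC range.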

Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancelation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - Input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

DAQ
• Use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

2D, 3D, ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Signals and Features

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
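The peak-detection variants listed above (threshold, local maximum, spacing constraint, interpolation) can be combined in a few lines. The following is a minimal sketch under simple assumptions: a fixed threshold, a greedy strongest-first spacing rule, and parabolic interpolation over three samples. The function name and parameters are hypothetical.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (location, amplitude) pairs for peaks in a list of samples."""
    # 1. Local maxima that reach the fixed threshold.
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] >= threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]
    # 2. Spacing constraint: keep the strongest peaks first and drop
    #    any candidate closer than min_spacing samples to a kept peak.
    kept = []
    for i in sorted(candidates, key=lambda i: signal[i], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    kept.sort()
    # 3. Parabolic interpolation through the peak and its two neighbours
    #    refines the location to a fraction of a sample.
    peaks = []
    for i in kept:
        a, b, c = signal[i - 1], signal[i], signal[i + 1]
        denom = a - 2 * b + c
        offset = 0.0 if denom == 0 else 0.5 * (a - c) / denom
        amplitude = b - 0.25 * (a - c) * offset
        peaks.append((i + offset, amplitude))
    return peaks
```

An adaptive threshold could be substituted by passing, for example, a multiple of a running median instead of a constant; that choice is application specific.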

Fourier Transform
Spectrum

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: the FFT (fast Fourier transform)

Spectrogram
• 256 samples, 22050 Hz
• 4096 samples, 22050 Hz
• Time-Frequency Tradeoff
• Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
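The time-frequency tradeoff for the two window sizes on the spectrogram slide can be worked out directly: a longer window gives finer frequency bins but averages over a longer stretch of time. The helper name below is an assumption; the numbers come from the slide (256 and 4096 samples at 22050 Hz).

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    duration = window_size / sample_rate
    bin_spacing = sample_rate / window_size
    return duration, bin_spacing

for size in (256, 4096):
    duration, spacing = stft_resolution(size, 22050)
    print(f"{size:5d} samples: {duration * 1000:.1f} ms window, {spacing:.2f} Hz per bin")
```

The product of the two quantities is always 1, which is one way to see that improving one resolution necessarily costs the other.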

Wavelets
• STFT: fixed time-frequency resolution based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
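For sensor data, linear interpolation is often enough to bring an irregularly timed stream onto a uniform grid. The sketch below assumes strictly increasing timestamps and holds the edge values outside the measured range; the function name and its parameters are illustrative, not taken from any particular toolkit.

```python
def resample_linear(times, values, rate, start=None, stop=None):
    """Resample (times, values) onto a uniform grid at `rate` samples per second."""
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs")
    start = times[0] if start is None else start
    stop = times[-1] if stop is None else stop
    out = []
    j = 0  # index of the segment [times[j], times[j + 1]] in use
    n = 0
    while True:
        t = start + n / rate
        if t > stop + 1e-12:
            break
        # Advance to the segment that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        frac = min(max(frac, 0.0), 1.0)  # hold edge values outside the range
        out.append(values[j] + frac * (values[j + 1] - values[j]))
        n += 1
    return out
```

Higher-quality band-limited (sinc-based) resampling follows the same interface but replaces the two-point blend with a windowed sum over many neighbours.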

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
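Range calibration and power-law scaling are usually chained: first learn the minimum and maximum of the raw readings, then map the normalized position onto the control parameter. A minimal sketch follows; the class and function names, and the choice of exponent, are assumptions for illustration.

```python
class RangeCalibrator:
    """Tracks the observed minimum and maximum of a raw sensor reading."""

    def __init__(self):
        self.low = None
        self.high = None

    def observe(self, reading):
        self.low = reading if self.low is None else min(self.low, reading)
        self.high = reading if self.high is None else max(self.high, reading)

    def normalize(self, reading):
        """Map a reading to [0, 1] using the calibrated range (clamped)."""
        if self.low is None or self.high == self.low:
            return 0.0
        position = (reading - self.low) / (self.high - self.low)
        return min(max(position, 0.0), 1.0)


def scale_power(position, out_min, out_max, exponent=2.0):
    """Map a normalized position in [0, 1] to [out_min, out_max] with a power law."""
    return out_min + (out_max - out_min) * position ** exponent
```

A typical use is to call `observe` while the performer sweeps the control through its full range, then feed `normalize(reading)` into `scale_power` to drive, say, a frequency or volume parameter.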

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
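A correlation-based period estimate can be sketched by computing the autocorrelation and picking the lag with the largest value inside a plausible range. The version below uses the direct O(N·L) sum for clarity, not the FFT-based computation that the slide refers to for power-of-2 lengths; the function names and the lag bounds are assumptions.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of a mean-removed sequence for lags 0..max_lag."""
    mean = sum(x) / len(x)
    centred = [v - mean for v in x]
    return [
        sum(centred[n] * centred[n + lag] for n in range(len(centred) - lag))
        for lag in range(max_lag + 1)
    ]

def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the strongest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
```

The `min_lag` bound keeps the search away from the trivial maximum at lag 0; for beat tracking it would correspond to the fastest tempo considered.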

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top down)

NIME Object tracking

Feature Extraction Audio

Mel Frequency Cepstral Coefficients
• Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filters around centre frequency CF, with edges at CF-130 and CF+130 (linear) or CF/1.0718 and CF×1.0718 (log)]
• Processing chain: Mel-filtering, Log, DCT, MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
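Energy compaction can be checked numerically with a small orthonormal DCT-II written out directly. This is a teaching sketch, not an efficient implementation (a real system would use an FFT-based routine); the function names and the test signal are assumptions.

```python
import math

def dct_ortho(x):
    """Orthonormal DCT-II of a list of samples (direct O(N^2) evaluation)."""
    n = len(x)
    out = []
    for k in range(n):
        # The k = 0 term needs a different scale for the transform to be orthonormal.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * sum(
            x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n)
        ))
    return out

def energy_fraction(coeffs, keep):
    """Fraction of total energy held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total if total else 0.0
```

For a smooth signal such as a linear ramp, nearly all of the energy ends up in the first couple of coefficients, which is why truncating the DCT of log mel energies loses little information.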

Feature Extraction Image
• Color, texture, shape
• Example: color histograms
Reduced to 256 colors

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
• Projection matrix
• PCA: eigenanalysis of correlation matrix

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels

Uncertainty and Time

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
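Of the algorithms listed, k-nearest neighbours is the simplest to write down and makes the supervised setup concrete: a training set of (feature vector, label) pairs and a query vector to label. The sketch below is a minimal version with Euclidean distance and majority vote; the function name and the toy data are assumptions.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Predict the label of `query` from the k closest training examples.

    `train` is a list of (feature_vector, label) pairs.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional features for two gesture classes.
training_set = [
    ((0.0, 0.0), "tap"), ((0.0, 1.0), "tap"), ((1.0, 0.0), "tap"),
    ((5.0, 5.0), "shake"), ((5.0, 6.0), "shake"), ((6.0, 5.0), "shake"),
]
```

Calling `knn_predict(training_set, (0.2, 0.1))` votes among the three nearest examples. In practice features should be scaled to comparable ranges first, since Euclidean distance is sensitive to units.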

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
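K-means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its assigned points. The sketch below takes the initial centroids as an argument so that the result is deterministic; the function name, the stopping rule and the toy data in the usage note are assumptions.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Cluster `points` starting from the given initial `centroids`.

    Returns (final centroids, cluster index for each point).
    """
    centroids = [tuple(c) for c in centroids]

    def nearest(p):
        return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(iterations):
        # Assignment step: group points by nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # Update step: move each centroid to the mean of its members.
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, [nearest(p) for p in points]
```

For example, six points forming two well-separated groups, started from one centroid in each group, converge after a single update. With random initialization, several restarts are commonly used because k-means only finds a local optimum.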

Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees

Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
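The coefficient of determination can be computed from the residual and total sums of squares. The sketch below pairs it with a one-variable ordinary least squares fit so that the two ideas can be tried together; the function names and the sample numbers in the usage note are assumptions.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Fitting x = [0, 1, 2, 3] against y = [1, 3, 5, 8] and scoring the fitted line on the same data gives a value just below 1; scoring on held-out data is the more honest measure of predictive quality.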

Surrogate Sensors
Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, “Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each with Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

Dynamic Time Warping

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given all evidence to date; also yields the likelihood of the evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM-algorithm
  - Estimate what transitions occurred and what states generated the sensor readings, and update models
  - Updated models provide new estimates and
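Filtering in a discrete hidden-state model reduces to a two-step recursion: push the current belief through the transition model, then reweight by how well each state explains the new observation. The sketch below is a minimal forward filter; the function name and the two-state toy model in the usage note are assumptions for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered state distribution after each observation.

    prior[i]          probability of starting in state i
    transition[i][j]  P(state j at t | state i at t-1)
    emission[j][o]    P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: propagate the belief one step through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight by the emission probability of the observation, then normalize.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)  # likelihood of this observation given the past
        belief = [w / total for w in weighted]
        history.append(belief)
    return history
```

With a hypothetical two-state model, `forward_filter([0.5, 0.5], [[0.7, 0.3], [0.3, 0.7]], [[0.1, 0.9], [0.8, 0.2]], [1, 1])` shows the belief in state 0 growing with each consistent observation. The normalizing totals, multiplied together, give the likelihood of the whole observation sequence.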

Hidden Markov Models I
[Figure: hidden Markov model. A hidden state (a single discrete variable) evolves over time slices 1, 2, 3, 4 and generates the observations. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t).]

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
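The predict/update cycle is easiest to see in one dimension, where the mean and covariance are plain numbers. The sketch below assumes the simplest possible model (the state persists, plus process noise) rather than the constant-velocity model mentioned on the slide; the function name, parameter names and default values are assumptions.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement.

    process_var  variance of the model noise (how much the state may drift)
    meas_var     variance of the sensor noise
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction: the state is assumed to persist, so only uncertainty grows.
        p = p + process_var
        # Update: the gain balances model reliability against sensor reliability.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates
```

A large `meas_var` tells the filter to distrust the sensor, giving smooth but sluggish estimates; a small one makes it follow the measurements closely. That ratio is the observation-versus-model reliability trade-off.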

Multimodal tempo detection for the E-sitar
Case Studies

Beat tracking
Human Computer Interaction

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer’s and the users’
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  - the user is not always right
    • don’t expect them to know what they want
  - conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

ICASSP 2013 tutorial 112

Wizard of Oz

• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: two panels, “What the user sees” and “The Wizard”]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz

• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood

• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test

• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject the null hypothesis at significance p

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution of the difference between sample means X1 and X2, marking the critical value t(p, df) with either two regions for rejecting the null hypothesis (two-tailed) or one region (one-tailed)]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
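The equations on this slide did not survive extraction. As a hedged illustration of the three steps, here is a minimal Python sketch of the independent two-sample t statistic with pooled (combined) variance. The function name and the sample data are made up for illustration, and the pooled-variance form is assumed because the slides mention equal variance.

```python
import math

def pooled_t_test(a, b):
    """Independent two-sample t statistic using the pooled (combined) variance."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    # Sums of squared deviations from each sample mean
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2                                  # degrees of freedom
    pooled_var = (ss1 + ss2) / df                     # combined variance
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))   # standard error of difference
    return (m1 - m2) / sed, df

# Hypothetical task-completion times (seconds) for two interface designs
old_design = [12.1, 11.4, 13.0, 12.6, 11.9]
new_design = [10.2, 10.9, 9.8, 10.5, 11.1]
t, df = pooled_t_test(old_design, new_design)
print(f"t = {t:.2f}, df = {df}")
```

For df = 8 the two-tailed critical value at α = 0.05 is 2.306, so a larger |t| would allow rejecting the null hypothesis at that level.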

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical

• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g., search on “student’s t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

ANOVA I

• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

ANOVA II

• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
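The "dividing two variances" step can be sketched for the one-way case as follows. This is a minimal illustration, not the tutorial's own code; the function name and the numbers are hypothetical.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by group membership
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside each group
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]   # three hypothetical conditions
f, df_b, df_w = one_way_anova(scores)
print(f"F({df_b}, {df_w}) = {f:.2f}")

# Rescaling and shifting every observation leaves F unchanged
rescaled = [[10 * x + 5 for x in g] for g in scores]
print(one_way_anova(rescaled)[0])
```

The F ratio is then compared against a table (or converted to a p-value) using the two degrees of freedom.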

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: gesture-to-speech mappings ordered by approximate time per gesture for connected speech (msec), axis ticks 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions
  – constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping

• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks

• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Diagram blocks. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Processing: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function. Output: Synthesizer, Speech Output]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network

• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units

• RBF is active as gesture nears centre
• Normalization – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
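The behaviour described above (interpolation between centres, plateaus near each centre) can be sketched as a forward pass through a normalized RBF layer. This is a minimal illustration only: the centres, output weights and function name are hypothetical, and the width 0.15 is borrowed from the vowel network slide.

```python
import math

def normalized_rbf(x, centres, weights, sigma):
    """Forward pass: Gaussian units normalized to sum to 1, then a linear output layer."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    nearest = min(d2)
    # Subtracting the smallest distance avoids underflow far from every centre;
    # it cancels out in the normalization below.
    acts = [math.exp(-(d - nearest) / (2 * sigma ** 2)) for d in d2]
    total = sum(acts)
    norm = [a / total for a in acts]
    n_out = len(weights[0])
    return [sum(norm[j] * weights[j][k] for j in range(len(centres)))
            for k in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]      # hypothetical hand positions (x, y)
weights = [[1.0, 2.0], [3.0, 4.0]]      # hypothetical output parameters per centre
print(normalized_rbf((0.1, 0.0), centres, weights, sigma=0.15))  # plateau near first centre
print(normalized_rbf((0.5, 0.0), centres, weights, sigma=0.15))  # halfway: equal blend
```

With a small width, the output stays close to the nearest centre's parameters over a wide region and changes quickly only near the midpoint between centres.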

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network

• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects

• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Continuous Control

9

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Human to human interaction and music performance

10

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13


ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13


ICASSP 2013 tutorial

Beyond direct mapping

• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music

• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges

• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges

• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCI/Music

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives

• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process

1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure

• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced Current
  – Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
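A common way to turn a variable resistance into a voltage is a two-resistor voltage divider followed by an analog-to-digital converter. The sketch below is an assumption about a typical setup (5 V supply, 10-bit ADC, 10 kΩ fixed resistor); the slide's own circuit figure is not available, and the function names are hypothetical.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor when it is in series with the sensor."""
    return v_in * r_fixed / (r_sensor + r_fixed)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code, clamped to the ADC range."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return min(levels - 1, max(0, code))

# A sensor whose resistance drops from 100 kOhm to 1 kOhm as force increases
for r in (100_000, 10_000, 1_000):
    v = divider_voltage(r, r_fixed=10_000)
    print(f"{r:7d} ohm -> {v:.2f} V -> ADC code {adc_counts(v)}")
```

The fixed resistor value sets where in the sensor's range the divider is most sensitive, so it is usually chosen near the middle of the expected resistance range.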

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing

• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction - moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras

• CCDs have to transfer charge, rows and columns, one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors

• Synchronous electric motors
  – i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors

• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold; no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect

• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets

• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ

• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

Time Time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering

• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
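As one concrete instance of a digital filter operating on samples, here is a minimal sketch of a one-pole low-pass filter, often used to smooth noisy sensor readings. The choice of this particular filter and the parameter name `alpha` are illustrative assumptions, not something the slides prescribe.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence: y[n] = y[n-1] + alpha * (x[n] - y[n-1]), with 0 < alpha <= 1."""
    if not samples:
        return []
    y = samples[0]          # start from the first sample to avoid a start-up jump
    out = []
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

step = [0, 0, 1, 1, 1, 1]
print(one_pole_lowpass(step, alpha=0.5))   # the step is smoothed into a gradual rise
```

Smaller `alpha` values attenuate more high-frequency noise but make the output lag further behind the input, which matters for real-time control.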

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection

• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
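The variants listed above can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing between peaks, and parabolic interpolation for sub-sample locations. It is one possible arrangement under those assumptions; the function and parameter names are hypothetical.

```python
def pick_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs for peaks in the sequence x."""
    peaks = []  # entries are (index, interpolated location, interpolated amplitude)
    for i in range(1, len(x) - 1):
        # Fixed threshold and local maximum test
        if x[i] < threshold or not (x[i - 1] < x[i] >= x[i + 1]):
            continue
        # Parabolic interpolation through the three points around the maximum
        left, mid, right = x[i - 1], x[i], x[i + 1]
        offset = 0.5 * (left - right) / (left - 2 * mid + right)
        amp = mid - 0.25 * (left - right) * offset
        # Spacing constraint: of two peaks that are too close, keep the larger
        if peaks and i - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][2]:
                peaks[-1] = (i, i + offset, amp)
            continue
        peaks.append((i, i + offset, amp))
    return [(loc, amp) for _, loc, amp in peaks]

signal = [0, 1, 3, 1, 0, 0, 2, 5, 2, 0]
print(pick_peaks(signal, threshold=1.5))
print(pick_peaks(signal, threshold=1.5, min_spacing=6))
```

An adaptive threshold (for example a running median plus a margin) could replace the fixed one without changing the rest of the structure.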

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
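The tradeoff between the two window sizes can be quantified directly: a longer window gives finer frequency resolution but coarser time resolution. A small sketch (the function name is illustrative):

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

for size in (256, 4096):
    duration, bin_hz = stft_resolution(size, 22050)
    print(f"{size:5d} samples: {duration * 1000:6.1f} ms window, {bin_hz:6.2f} Hz per bin")
```

At 22050 Hz, the 256-sample window spans about 11.6 ms with bins about 86 Hz apart, while the 4096-sample window spans about 186 ms with bins about 5.4 Hz apart.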

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising

• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling

• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
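For sensor data that arrives at irregular times, the linear case can be sketched as resampling onto a uniform grid. This is a minimal illustration that assumes at least two timestamps in strictly increasing order; the function name and data are hypothetical.

```python
from bisect import bisect_right

def resample_linear(times, values, rate):
    """Linearly interpolate irregularly timed samples onto a uniform grid at `rate` Hz."""
    n_out = int((times[-1] - times[0]) * rate) + 1
    out = []
    for k in range(n_out):
        t = times[0] + k / rate
        # Index of the first timestamp after t, clamped so t == times[-1] still works
        j = min(bisect_right(times, t), len(times) - 1)
        t0, t1 = times[j - 1], times[j]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j - 1] + frac * (values[j] - values[j - 1]))
    return out

# Three asynchronous readings resampled at 4 Hz
print(resample_linear([0.0, 0.3, 1.0], [0.0, 3.0, 10.0], rate=4))
```

Linear interpolation is cheap and causal enough for control data, but it is not band-limited, so audio-rate signals usually call for one of the sinc-based variants above.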

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration

• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling

• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
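A minimal sketch of calibration followed by scaling: find the sensor's observed range, normalize readings to [0, 1], then map to a control parameter. The exponential frequency map, the power-law exponent, the frequency range and all function names are illustrative assumptions.

```python
def calibrate(readings):
    """Range calibration: observed minimum and maximum of the raw readings."""
    return min(readings), max(readings)

def normalize(raw, lo, hi):
    """Map a raw reading to [0, 1], clamping values outside the calibrated range."""
    return min(1.0, max(0.0, (raw - lo) / (hi - lo)))

def to_frequency(x, f_min=110.0, f_max=880.0):
    """Exponential map so that equal knob movements give equal musical intervals."""
    return f_min * (f_max / f_min) ** x

def to_gain(x, exponent=2.0):
    """Power-law map from slider position to amplitude."""
    return x ** exponent

lo, hi = calibrate([112, 480, 903, 215])      # hypothetical raw ADC readings
x = normalize(507.5, lo, hi)                  # halfway through the calibrated range
print(x, to_frequency(x), to_gain(x))
```

Re-running `calibrate` at the start of each session is a simple way to cope with sensors whose range drifts between uses.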

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection

• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
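The formulas on this slide were lost in extraction. As a hedged illustration, the sketch below computes the autocorrelation directly in the time domain and picks the lag with the strongest peak as the period estimate. The FFT-based computation the slide alludes to gives the same values faster; the function names and test signal are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Direct autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sinusoid that repeats every 20 samples
signal = [math.sin(2 * math.pi * i / 20) for i in range(200)]
print(estimate_period(signal, min_lag=5, max_lag=50))
```

The `min_lag` bound excludes the trivial maximum at lag 0; cross-correlation follows the same pattern with two different signals.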

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation

• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking

• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

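Mel Frequency Cepstral Coefficients

Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular mel filters centred at CF; labels CF-130 and CF+130 for the linearly-spaced filters, and a factor of 1.0718 applied to CF for the log-spaced filters]

Mel-filtering → Log → DCT → MFCCs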

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform

• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away high coefficients
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
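Energy compaction can be seen with a direct (unnormalized, O(N²)) DCT-II. This is a sketch for illustration only; practical implementations use a fast algorithm and usually a normalization factor, and the function name is an assumption.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi * (n + 0.5) * k / N)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

ramp = list(range(8))                      # a smooth, slowly varying signal
coeffs = dct2(ramp)
print([round(c, 3) for c in coeffs])       # magnitude concentrates in the first coefficients
```

For smooth inputs like this ramp, almost all of the magnitude sits in the first two coefficients, which is why truncating the higher ones loses little.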

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image

• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics

• Delta features
• Aggregate statistics of time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification: Formulation

• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification: Algorithms

• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
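As the simplest non-parametric example, a k-nearest-neighbours classifier fits in a few lines. The feature values and the drum-sound labels below are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label) pairs. Majority vote over the k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical 2-D features (e.g. two spectral descriptors) for two drum sounds
train = [((0.0, 0.0), "kick"), ((0.1, 0.2), "kick"), ((0.2, 0.1), "kick"),
         ((1.0, 1.0), "snare"), ((0.9, 1.1), "snare"), ((1.1, 0.9), "snare")]
print(knn_predict(train, (0.15, 0.10)))
print(knn_predict(train, (0.95, 1.00)))
```

There is no training step: all the work happens at prediction time, so cost grows with the size of the training set.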

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification: Evaluation

• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering: Formulation

• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  – DBSCAN
• Graph-based
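A minimal sketch of K-means (Lloyd's algorithm) on scalar data, assuming the initial centers are supplied by the caller. The data and the fixed iteration count are illustrative choices.

```python
def kmeans_1d(points, centers, iterations=10):
    """Lloyd's algorithm on scalars; `centers` gives the initial K centroids."""
    centers = list(centers)
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            nearest = min(range(len(centers)), key=lambda k: abs(x - centers[k]))
            clusters[nearest].append(x)
        # Update step: move each center to the mean of its cluster
        # (keep the old center if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[k]
                   for k, c in enumerate(clusters)]
    return centers, clusters

points = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centers, clusters = kmeans_1d(points, centers=[0.0, 6.0])
```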

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering: Evaluation

• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression: Formulation

• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression: Evaluation

• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
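To make the regression bullets concrete, here is a sketch of ordinary least squares for a single input variable, followed by the coefficient of determination. The data points are invented for illustration.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(y, predicted):
    """1 - (residual sum of squares / total sum of squares)."""
    my = mean(y)
    ss_res = sum((a - p) ** 2 for a, p in zip(y, predicted))
    ss_tot = sum((a - my) ** 2 for a in y)
    return 1.0 - ss_res / ss_tot

x = [0.0, 1.0, 2.0, 3.0]
y = [1.0, 3.1, 4.9, 7.0]
slope, intercept = fit_line(x, y)
predicted = [slope * a + intercept for a in x]
r2 = r_squared(y, predicted)
```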

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages

• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input, output or intermediate stages
• Various possible architectures combining machine learning building blocks
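One simple way to handle the multiple-rate bullet is to resample the slower stream onto the timestamps of the faster one before combining them. The sketch below uses linear interpolation; the timestamps and values are made up, and clamping outside the known range is an assumption of this example.

```python
from bisect import bisect_right

def resample(times, values, new_times):
    """Linearly interpolate (times, values) at new_times; clamp outside the range."""
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            i = bisect_right(times, t)          # times[i - 1] <= t < times[i]
            t0, t1 = times[i - 1], times[i]
            w = (t - t0) / (t1 - t0)
            out.append(values[i - 1] + w * (values[i] - values[i - 1]))
    return out

slow_t, slow_v = [0.0, 1.0, 2.0], [0.0, 10.0, 0.0]   # slow sensor
fast_t = [0.0, 0.5, 1.0, 1.5, 2.0]                   # timestamps of the fast sensor
aligned = resample(slow_t, slow_v, fast_t)
```

After this step both streams share one time base, so their samples can be concatenated into a single feature vector (early fusion) or classified separately and merged afterwards (late fusion).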

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: block diagram; each stage label appears twice: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea use camera to constrain factorization results taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time

• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
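The difference between causal and non-causal processing can be seen in two moving-average filters. This is an illustrative sketch; the window sizes and the test signal are arbitrary.

```python
def causal_moving_average(samples, window=3):
    """Each output uses only the current and previous samples."""
    out = []
    for n in range(len(samples)):
        past = samples[max(0, n - window + 1): n + 1]
        out.append(sum(past) / len(past))
    return out

def centered_moving_average(samples, half=1):
    """Non-causal: output n needs samples up to n + half, i.e. it looks ahead."""
    out = []
    for n in range(len(samples)):
        nearby = samples[max(0, n - half): n + half + 1]
        out.append(sum(nearby) / len(nearby))
    return out

x = [0.0, 0.0, 3.0, 0.0, 0.0]
causal = causal_moving_average(x)      # reacts only once the spike has arrived
centered = centered_moving_average(x)  # reacts one sample before the spike
```

The centered filter responds at index 1, before the spike at index 2 has been observed, so it can only run offline or with added latency.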

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over

• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal

• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

99

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

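[Figure: hidden Markov model. A single discrete hidden state variable evolves according to the transition probabilities P(state t | state t-1); the observations are generated according to the emission probabilities P(observation t | state t). Transition and emission probabilities together form the MODEL.]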
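A sketch of filtering in a hidden Markov model (the forward algorithm), assuming a discrete hidden state and discrete observations. The two-state probabilities below are illustrative numbers, not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each t.

    transition[i][j] = P(state_t = j | state_{t-1} = i)
    emission[j][o]   = P(observation = o | state = j)
    The prior describes the state one step before the first observation.
    """
    belief = list(prior)
    history = []
    n = len(prior)
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the emission probability and renormalize.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history

prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, observations=[0, 0, 1])
```

Because each step uses only the previous belief and the newest observation, the procedure is causal and can run as the sensor data arrives.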

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering

• Streams of noisy input data
• Basic idea: t → t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter

• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y·y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
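The predict/update cycle can be sketched for a scalar state. This example assumes a random-walk state model (next state equals current state plus Gaussian noise); the variances and readings are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Scalar Kalman filter for a random-walk state: x_t = x_{t-1} + noise."""
    estimates = []
    for z in measurements:
        # Prediction step: the mean is unchanged, the uncertainty grows.
        var = var + process_var
        # Update step: the gain trades model reliability against sensor reliability.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates

readings = [1.1, 0.9, 1.05, 0.95, 1.0]
estimates = kalman_1d(readings, process_var=1e-4, measurement_var=0.04)
final_mean, final_var = estimates[-1]
```

A large `measurement_var` makes the filter trust its model and respond slowly; a large `process_var` makes it follow the measurements closely.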

102

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amounts of data
  – output is a description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become first-class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

Tuesday 22 October 13

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz

• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure panels: The Wizard; What the user sees]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz

• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood

• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of a result at least as extreme as the observed one arising by chance under the null hypothesis
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test

• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: panels labeled "regions for rejecting the null hypothesis" and "region for rejecting the null hypothesis", with the critical value t(p, df) marked]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
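The three steps can be sketched as follows for the independent-samples case with equal variances assumed. The slide's own formulas did not survive extraction, so this is the standard pooled-variance form rather than a transcription; the sample data are invented.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(sample_a, sample_b):
    """Student's t for two independent samples, assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    # Combined (pooled) variance of the two samples.
    pooled = ((na - 1) * variance(sample_a) + (nb - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two means.
    sed = sqrt(pooled * (1.0 / na + 1.0 / nb))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df

control = [10.0, 12.0, 11.0, 13.0, 14.0]
treatment = [14.0, 15.0, 13.0, 16.0, 17.0]
t, df = independent_t(control, treatment)
```

Here |t| is 3.0 with df = 8, which exceeds the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would be rejected at that level.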

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical

• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

ANOVA I

• Generalizes the t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

ANOVA II

• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
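A sketch of the one-way case: the F ratio divides the between-group variance by the within-group variance. The three groups are invented data, and the second call illustrates that scaling and adding a bias leave F unchanged.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F = between-group mean square / within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, (k - 1, n - k)

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, dfs = one_way_anova_f(groups)

# Same data after scaling by 10 and adding a bias of 3.
scaled = [[10 * x + 3 for x in g] for g in groups]
f_scaled, _ = one_way_anova_f(scaled)
```

The resulting F is then compared against an F table (or converted to a p-value) using the two degrees of freedom returned alongside it.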

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13

ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of gesture-to-speech mappings. Categories: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500.]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

bull 26+ dimensionsbull constrained subspace

bull 10 dimensions

Input Output

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping

• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks

• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]

Tuesday 22 October 13

ICASSP 2013 tutorial 136

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13

ICASSP 2013 tutorial 138

GTII Vowel Network

• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units

• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
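A sketch of a normalized RBF layer, assuming Gaussian hidden units with a shared width and a linear output layer. The two centres, the output weights and the width value are illustrative choices and are not the trained Glove-TalkII parameters.

```python
from math import exp

def normalized_rbf(x, centres, width):
    """Gaussian RBF activations, normalized to sum to 1 over all hidden units."""
    raw = [exp(-sum((a - c) ** 2 for a, c in zip(x, centre)) / (2.0 * width ** 2))
           for centre in centres]
    total = sum(raw)
    return [r / total for r in raw]

def network_output(x, centres, width, weights):
    """Weighted sum of the normalized activations (one weight vector per hidden unit)."""
    act = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(a * w[k] for a, w in zip(act, weights)) for k in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical centres in the x, y plane
weights = [[100.0], [200.0]]         # hypothetical output value per centre
near_first = network_output((0.1, 0.0), centres, width=0.15, weights=weights)
midpoint = network_output((0.5, 0.0), centres, width=0.15, weights=weights)
```

Near a centre the closest unit dominates, so the output sits on a plateau at that unit's value; halfway between two centres the output interpolates between them.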

Tuesday 22 October 13

ICASSP 2013 tutorial 142

Consonant Network

• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13

DIVA in the hands of

145

Tuesday 22 October 13

Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13

Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13

Summary

158

Covered many aspects

• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Human to human interaction and music performance

10

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13

ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al. 2009)

Tuesday 22 October 13

ICASSP 2013 tutorial

Beyond direct mapping

• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human to human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music

• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges

• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges

• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project, Seattle, April 2001
Lectures / Discussions / Demos / Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives

• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process

1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort
Effort and pain
If you are lucky

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure

• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced current
  – Field direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
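A common way to do this conversion is a voltage divider read by an analog-to-digital converter. The sketch below assumes a 5 V supply that also serves as the ADC reference, a 10 kΩ fixed resistor and a 10-bit converter; all of these values are illustrative assumptions.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor in a series divider with the sensor."""
    return v_supply * r_fixed / (r_fixed + r_sensor)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage in [0, v_ref] to an integer ADC code."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))

def sensor_resistance(code, v_ref=5.0, bits=10, r_fixed=10_000.0):
    """Invert the divider to estimate sensor resistance (code > 0, supply == v_ref)."""
    voltage = code * v_ref / 2 ** bits
    return r_fixed * (v_ref - voltage) / voltage

v = divider_voltage(5.0, r_fixed=10_000.0, r_sensor=10_000.0)
code = adc_code(v)
estimate = sensor_resistance(code)
```

When the sensor resistance drops (for example, a force-sensing resistor being pressed), the voltage across the fixed resistor rises, and so does the ADC code.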

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing

• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras

• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors

• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors

• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - Input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples, each plotted against time; multiple channels are possible (for example microphone arrays)]

2D/3D/ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account.

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Signals and Features

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
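As an illustration of a digital filter operating sample by sample, here is a minimal sketch of a one-pole low-pass filter. The function name and the smoothing coefficient `alpha` are assumptions chosen for this example; the slides do not prescribe a particular filter design.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence with y[n] = alpha * x[n] + (1 - alpha) * y[n-1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    # Start from the first sample to avoid a start-up transient.
    y = samples[0] if samples else 0.0
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


# A constant (0 Hz) input passes unchanged, while the fastest possible
# alternation (half the sampling rate) is strongly attenuated.
dc = one_pole_lowpass([1.0] * 8, alpha=0.2)
alternating = one_pole_lowpass([1.0, -1.0] * 50, alpha=0.2)
```

Smaller values of `alpha` smooth more heavily but respond more slowly, which matters for noisy but time-critical sensor streams.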

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
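The variants listed above can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, a spacing constraint and parabolic interpolation; the function names and the test signal are hypothetical, and real systems usually add adaptive thresholds on top.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return (index, amplitude) of local maxima above threshold,
    keeping the larger peak when two fall within min_spacing samples."""
    candidates = [
        i for i in range(1, len(x) - 1)
        if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]
    peaks = []
    # Visit candidates from largest to smallest so strong peaks win conflicts.
    for i in sorted(candidates, key=lambda i: x[i], reverse=True):
        if all(abs(i - j) >= min_spacing for j in peaks):
            peaks.append(i)
    return [(i, x[i]) for i in sorted(peaks)]


def refine_parabolic(x, i):
    """Fit a parabola through x[i-1], x[i], x[i+1]; return fractional position."""
    a, b, c = x[i - 1], x[i], x[i + 1]
    denom = a - 2.0 * b + c
    if denom == 0.0:
        return float(i)
    return i + 0.5 * (a - c) / denom


signal = [0.0, 0.2, 1.0, 0.3, 0.1, 0.6, 0.5, 0.0, 0.9, 0.8, 0.1]
peaks = find_peaks(signal, threshold=0.4, min_spacing=3)
wider = find_peaks(signal, threshold=0.4, min_spacing=4)
```

With `min_spacing=4` the weaker middle peak is dropped because it lies too close to its stronger neighbours.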

Fourier Transform
[Figure: spectrum]

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Spectrogram
[Figure: two spectrograms, one with 256-sample windows at 22050 Hz and one with 4096-sample windows at 22050 Hz]
• Time-frequency tradeoff
• Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
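The time-frequency tradeoff for the two window sizes on the slide can be worked out directly: a window of N samples at sampling rate fs lasts N / fs seconds, and its FFT bins are fs / N Hz apart. The helper name below is an assumption made for this sketch.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, spacing between FFT bins in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    ms, hz = stft_resolution(n, 22050)
    print(f"{n} samples: {ms:.1f} ms window, {hz:.2f} Hz between bins")
```

The short window localizes events to roughly 11.6 ms but only resolves frequencies about 86 Hz apart; the long window resolves about 5.4 Hz but smears events over roughly 186 ms.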

Wavelets
• STFT: fixed time-frequency resolution based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
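For sensor data, linear interpolation is often enough. The following sketch resamples irregularly timed readings onto a uniform grid; the function names and the sample readings are hypothetical, and it is not a band-limited (sinc) resampler.

```python
def interpolate_linear(times, values, t):
    """Estimate the value at time t from samples (times must be increasing)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    # Find the pair of samples that brackets t.
    for k in range(1, len(times)):
        if t <= times[k]:
            t0, t1 = times[k - 1], times[k]
            frac = (t - t0) / (t1 - t0)
            return values[k - 1] + frac * (values[k] - values[k - 1])


def resample_uniform(times, values, rate_hz):
    """Resample irregular readings onto a uniform grid starting at times[0]."""
    n = int((times[-1] - times[0]) * rate_hz) + 1
    return [interpolate_linear(times, values, times[0] + i / rate_hz)
            for i in range(n)]


# Hypothetical irregularly timed sensor readings (seconds, arbitrary units).
times = [0.0, 0.1, 0.25, 0.4]
values = [0.0, 1.0, 2.5, 1.0]
uniform = resample_uniform(times, values, rate_hz=10)
```

Outside the observed time range the sketch simply holds the first or last value, which is one of several reasonable choices.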

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
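Calibration and scaling are often implemented together. The sketch below first tracks the observed range of a sensor, then maps readings to a control parameter with an optional power law. The class name, the raw readings and the 100-1000 output range are all assumptions for illustration.

```python
class CalibratedControl:
    """Track the observed sensor range, then map readings to a parameter range."""

    def __init__(self, out_min, out_max, exponent=1.0):
        self.lo = None
        self.hi = None
        self.out_min = out_min
        self.out_max = out_max
        self.exponent = exponent  # 1.0 is linear; other values give a power law

    def observe(self, reading):
        """Calibration: widen the known range to include this reading."""
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def scale(self, reading):
        """Scaling: normalize to [0, 1], apply the power law, map to output."""
        if self.lo is None or self.hi == self.lo:
            raise ValueError("calibrate with at least two distinct readings first")
        unit = (reading - self.lo) / (self.hi - self.lo)
        unit = min(1.0, max(0.0, unit))  # clamp readings outside the range
        return self.out_min + (unit ** self.exponent) * (self.out_max - self.out_min)


# Hypothetical knob mapped to a frequency between 100 and 1000 Hz.
knob = CalibratedControl(out_min=100.0, out_max=1000.0, exponent=2.0)
for raw in (37, 512, 980, 64):
    knob.observe(raw)
```

With `exponent=2.0` the lower half of the knob travel covers only a quarter of the output range, giving finer control at low values.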

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
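A minimal sketch of correlation-based periodicity detection follows. It computes the autocorrelation directly in the time domain, which is simple but slow; the efficient version mentioned on the slide would use the FFT instead. The function names and the synthetic test signal are assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """Direct (O(N * max_lag)) autocorrelation of a mean-removed signal."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    return [
        sum(centered[n] * centered[n + lag] for n in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Synthetic test signal: a sinusoid that repeats every 25 samples.
signal = [math.sin(2.0 * math.pi * n / 25.0) for n in range(200)]
period = estimate_period(signal, min_lag=5, max_lag=60)
```

The `min_lag` bound excludes the trivial maximum at lag 0; for beat tracking it would correspond to the fastest tempo considered.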

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients
• Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: mel filter centred at CF; the surviving labels are CF, CF-130, CF+130 and 1.0718]
• Pipeline: Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal, so the high coefficients can be discarded
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
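Energy compaction can be checked numerically. The sketch below implements an orthonormal DCT-II directly from its definition and measures how much of the energy of a smooth sequence lands in the first two coefficients. The input values are hypothetical stand-ins for log filterbank energies, and the O(N^2) loop is for clarity, not speed.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II, computed directly from the definition."""
    n_samples = len(x)
    coeffs = []
    for k in range(n_samples):
        total = sum(
            x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k)
            for n in range(n_samples)
        )
        # Scaling that makes the transform energy-preserving.
        scale = math.sqrt(1.0 / n_samples) if k == 0 else math.sqrt(2.0 / n_samples)
        coeffs.append(scale * total)
    return coeffs


# A smooth, slowly varying sequence (hypothetical log filterbank energies).
smooth = [1.0, 1.2, 1.5, 1.9, 2.4, 2.8, 3.1, 3.3]
coeffs = dct_ii(smooth)
energy = [c * c for c in coeffs]
share_in_first_two = sum(energy[:2]) / sum(energy)
```

For this input nearly all of the energy sits in the first two coefficients, which is why truncating the DCT loses little information for smooth spectra.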

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
• Projection matrix
• PCA: eigenanalysis of correlation matrix

Self-Organizing Maps

Uncertainty and Time

Classification: Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

Classification: Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
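K-nearest neighbors is the simplest of the listed classifiers to write down. The sketch below is a minimal version using Euclidean distance and majority voting; the two-dimensional features and the "kick"/"snare" labels are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the majority label among the k training points nearest to query."""
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors for two drum sounds.
train_features = [
    (0.10, 0.20), (0.20, 0.10), (0.15, 0.25),   # kick
    (0.80, 0.90), (0.90, 0.80), (0.85, 0.70),   # snare
]
train_labels = ["kick", "kick", "kick", "snare", "snare", "snare"]

low_hit = knn_predict(train_features, train_labels, (0.2, 0.2))
high_hit = knn_predict(train_features, train_labels, (0.7, 0.8))
```

There is no training step beyond storing the labelled examples, which makes the method convenient for quick, per-performer calibration.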

Classification: Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering: Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

Clustering: Algorithms
• K-means and variants
• Distribution-based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
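A minimal sketch of K-means (Lloyd's algorithm) follows. To keep it deterministic the initial centers are passed in explicitly; practical implementations usually choose them randomly or with a seeding heuristic. The function name and sample points are assumptions.

```python
import math


def kmeans(points, centers, iterations=10):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = [tuple(c) for c in centers]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        assignment = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, assignment


points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, assignment = kmeans(points, centers=[points[0], points[3]])
```

Here K = 2 is supplied through the two initial centers, matching the common case where the number of clusters comes from domain knowledge.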

Clustering: Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

Regression: Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees

Regression: Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
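Goodness of fit can be illustrated with an ordinary least squares line and its R squared score. The helper names and the data points below are made up for this sketch.

```python
def fit_line(x, y):
    """Ordinary least squares for y ~ slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(y_true, y_pred):
    """1 - (residual sum of squares) / (total sum of squares)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical measurements that are close to, but not exactly, linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(x, y)
predicted = [slope * v + intercept for v in x]
score = r_squared(y, predicted)
```

A score near 1 means the line explains almost all of the variance; a score near 0 means it does no better than predicting the mean.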

Surrogate Sensors
• Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Reference: A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection and Classification]
Early and late are the extremes of a full spectrum of possibilities

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done with some delay at the same time as the sensor data

Dynamic Time Warping

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence (also gives the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading and update models
  - Updated models provide new estimates and

Hidden Markov Models I
[Figure: HMM unrolled over time slices 1 to 4. Hidden state (single discrete variable) with transition probabilities P(state at t | state at t-1); observations with emission probabilities p(observation at t | state at t)]
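The filtering task for a discrete HMM can be sketched as a predict-then-update loop (the forward algorithm). The two states, the binary observation and all probabilities below are hypothetical numbers chosen for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each t (discrete HMM).

    prior[i]         : belief over states before the first time slice
    transition[i][j] : P(state_t = j | state_t-1 = i)
    emission[j][o]   : P(observation = o | state = j)
    """
    n_states = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        # Update: weight by the likelihood of the observation, then normalize.
        weighted = [predicted[j] * emission[j][obs] for j in range(n_states)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history


# Hypothetical model: state 0 = "playing", state 1 = "idle";
# observation 1 = "sensor reading high", observation 0 = "sensor reading low".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.1, 0.9], [0.8, 0.2]]
history = forward_filter(prior, transition, emission, observations=[1, 1])
```

Two consecutive high readings raise the belief in "playing" from 0.5 to about 0.82 and then about 0.88, showing how evidence accumulates over time.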

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x|y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
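The prediction and update steps can be sketched for the simplest case: a scalar state that follows a random walk. This is a deliberately reduced model (no velocity term, no matrices), and the function name, noise variances and measurements are assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state: x[t+1] = x[t] + noise."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged under the model, uncertainty grows.
        p = p + process_var
        # Update step: the gain balances model and observation reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# A sensor that keeps reporting 5.0 while the filter starts from 0.0.
estimates = kalman_1d([5.0] * 50, process_var=1e-4, measurement_var=0.25)
```

Raising `measurement_var` makes the filter trust its model more and react more slowly; raising `process_var` does the opposite.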


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor, by IBM (1984)
[Figure: the wizard; what the user sees]

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions, work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

Student t-test
• Null hypothesis: both groups are random samples of same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with regions for rejecting the null hypothesis (two-tailed) or a single region (one-tailed) beyond the critical value t(p,df); sample means X1 and X2 marked]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
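The formulas on the "calculating t" slide did not survive extraction, so the sketch below uses the standard pooled-variance form of the independent two-sample t-test, which matches the steps listed (combined variance, standard error of the difference, t, df). Whether the slide used exactly this form is an assumption, and the sample data are invented.

```python
import math


def pooled_t(sample_a, sample_b):
    """Student's t for two independent samples, assuming equal variances."""
    na, nb = len(sample_a), len(sample_b)
    mean_a, mean_b = sum(sample_a) / na, sum(sample_b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in sample_a)
    ss_b = sum((v - mean_b) ** 2 for v in sample_b)
    df = na + nb - 2                                   # degrees of freedom
    pooled_var = (ss_a + ss_b) / df                    # combined variance
    sed = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))  # std. error of difference
    return (mean_a - mean_b) / sed, df


# Hypothetical task completion times (seconds) for two interface designs.
design_a = [12, 14, 11, 13, 15]
design_b = [16, 18, 17, 15, 19]
t_value, df = pooled_t(design_a, design_b)
```

Here |t| = 4.0 with df = 8, which exceeds the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would be rejected at that level.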

comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-

[Table: critical values of t; column headings for two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, comparing ratio to table to determine significance
• Variants: one-way ANOVA, factorial ANOVA

Integration and
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
[Image credits: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum]

Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: mappings ordered by approximate time per gesture for connected speech (msec)]
• Artificial Vocal Tract: 10-30
• Phoneme Generator: 100
• Finger Spelling: 130
• Syllable Generator: 200
• Word Generator: 500
Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X,Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
  - constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network

Glove-TalkII System
[Figure: block diagram with the following labels. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, V/C decision network, vowel network, consonant network, fixed pitch mapping, fixed stop mapping, combining function, synthesizer. Output: speech]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error

GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x,y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
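The behaviour of normalized RBF units (interpolation between centres, plateaus around the nearest centre) can be reproduced with a small sketch. The two centres, the width of 0.15 and the two-value output vectors are hypothetical; this is not the trained Glove-TalkII network.

```python
import math


def normalized_rbf(x, centers, width):
    """Gaussian activations divided by their sum, so the outputs add up to 1."""
    raw = [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]


def rbf_output(x, centers, width, weights):
    """Blend per-centre output vectors using the normalized activations."""
    activations = normalized_rbf(x, centers, width)
    n_out = len(weights[0])
    return [sum(a * w[k] for a, w in zip(activations, weights))
            for k in range(n_out)]


# Two hypothetical centres in the x,y plane, each with a target output vector.
centers = [(0.0, 0.0), (1.0, 0.0)]
weights = [[300.0, 2300.0], [700.0, 1200.0]]

midway = rbf_output((0.5, 0.0), centers, 0.15, weights)
far_left = rbf_output((-1.0, 0.0), centers, 0.15, weights)
```

Halfway between the centres the output is the average of the two targets, while a point far to the left still produces the first target almost exactly because the closest unit dominates after normalization.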


Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: parallel formant speech synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

158

Covered many aspectsbull Sensors and Actuators bull Signals and Features bull Uncertainty and time bull Human Computer Interaction bull Integration and

implementation bull Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Evolution of output devices

11

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13


ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13


ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
– Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
– Easy to learn and interpret
– Expressive, especially for continuous controllers
• Beyond Direct Mapping
– Gesture recognition (pinch to zoom)
– Speech recognition
– Adaptive, possibly domain and person specific
– More similar to human-to-human interaction
– Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
– Supervised and unsupervised learning
– Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
– MATLAB culture
– No HCI or user-centered training
– Focus on algorithms, not programming experience
• CS background
– No DSP
– No circuits
– Focus on programming experience, not algorithms
• Music
– Performance and composition culture
– No HCI, DSP or programming
• Integration: putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI 2001
Experience Music Project, Seattle, April 2001
Lectures, Discussions, Demos, Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCI/Music

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• Steps 1 and 2 are often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
– Heart rate
– Brain waves
– Galvanic skin response
– Muscle activations
• Many more…

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (a typical sensor has variable resistance)
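A common way to turn a variable resistance into a voltage is a two-resistor voltage divider read by an ADC. The sketch below is illustrative only: the 10 kΩ fixed resistor, 5 V supply and 10-bit converter are assumptions, not values from the slides, and the function names are hypothetical.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_in=5.0):
    """Voltage across the fixed resistor of a two-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)


def adc_counts(voltage, v_ref=5.0, bits=10):
    """Ideal ADC reading: quantize 0..v_ref to 0..2**bits - 1."""
    full_scale = 2 ** bits - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * full_scale)


def sensor_resistance(counts, r_fixed=10_000.0, bits=10):
    """Invert the divider to recover the sensor resistance from a reading."""
    full_scale = 2 ** bits - 1
    if counts <= 0:
        return float("inf")  # no voltage across r_fixed: open circuit
    return r_fixed * (full_scale - counts) / counts
```

With the sensor on the supply side, a lower sensor resistance (for example, more force on an FSR) gives a higher reading.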

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving the diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in a magnetic field
  • diaphragm with coil of wire immersed in a magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
– about 3 transistors (photodiode version)
– 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
– only viable recently as sub-micron tech gets better
– only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
– i.e. frequency synchronized with the frequency of the supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
– Brushless DC motor that divides rotation into equal steps
– Move and hold, no feedback circuitry required
– Low cost
• Servo Motor
– Motor coupled with sensor for feedback
– High-performance alternative to stepper motors
– Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer (3-axis)
• Optical sensor
• 10 infrared LEDs on the sensor bar (placed on the TV) for triangulation, for use as a pointing device
• A large diversity of control styles is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
– GPS
– Accelerometer
– Gyro
– Temperature
– Microphone
– Capacitive display
– Networking
– Input port for more
• Actuators
– Speaker
– Headphones
– Vibrator
– High-res display
– Networking
– Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• Use a data acquisition board plugged into your computer
– e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples shown on time axes; multiple channels (for example, microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D/3D/ND + time

50

Sensors and Actuators

[Figure: two panels with time axes]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
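As a concrete instance of a digital filter operating on samples, here is a minimal one-pole low-pass (exponential smoothing) sketch of the kind often used on noisy sensor streams. The function name and the default `alpha` are assumptions chosen for illustration.

```python
def one_pole_lowpass(samples, alpha=0.1):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]), with 0 < alpha <= 1.

    Smaller alpha means heavier smoothing (lower cutoff frequency).
    """
    out = []
    y = samples[0] if samples else 0.0  # start at the first sample
    for x in samples:
        y = y + alpha * (x - y)
        out.append(y)
    return out
```

Slow changes pass almost unchanged, while rapid sample-to-sample alternation is strongly attenuated.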

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
– Fixed and adaptive threshold
– Local maximum
– Spacing constraints
– Interpolation
– Output is peak locations and amplitudes
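The variants listed above can be combined in a few lines. This is one possible sketch, not the tutorial's implementation: a fixed threshold, a local-maximum test, a minimum spacing that favours the larger peak, and parabolic interpolation for sub-sample positions. All names and defaults are illustrative assumptions.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (position, amplitude) pairs for local maxima above a fixed threshold.

    Positions are refined by fitting a parabola through the peak sample
    and its two neighbours. Peaks closer than min_spacing samples to an
    already accepted, larger peak are dropped.
    """
    candidates = [
        i for i in range(1, len(x) - 1)
        if x[i] > threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]
    ]
    # Strongest first, so the spacing constraint keeps the larger peak.
    candidates.sort(key=lambda i: x[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    peaks = []
    for i in sorted(kept):
        a, b, c = x[i - 1], x[i], x[i + 1]
        denom = a - 2 * b + c  # negative at a strict local maximum
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        peaks.append((i + offset, b - 0.25 * (a - c) * offset))
    return peaks
```

An adaptive threshold could be obtained by passing in, for example, a multiple of a running median instead of a constant.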

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
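The tradeoff can be made concrete with a little arithmetic: for a window of N samples at sample rate fs, the frequency bin spacing is fs / N and the window spans N / fs seconds. A small sketch for the two window sizes on the slide (the function name is an assumption):

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    return sample_rate / window_size, window_size / sample_rate


for n in (256, 4096):
    df, dt = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {df:7.2f} Hz per bin, {1000 * dt:6.1f} ms per window")
```

At 22050 Hz, the 256-sample window gives roughly 86 Hz bins over about 12 ms, while the 4096-sample window gives roughly 5.4 Hz bins over about 186 ms: finer in frequency, coarser in time.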

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
– Simplest approach: LPF
– Spectral subtraction using noise profile
– Median filtering in time-frequency plane
• Image
– Challenge: preserve edge discontinuities
– Gaussian filtering (2D)
– Anisotropic diffusion: preserves edges
– Median filtering
– Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
– L/M: interpolation (L) followed by decimation (M)
– Band-limited interpolation at arbitrary points
– Sinc interpolation results in perfect reconstruction for band-limited continuous signals
– Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
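For sensor data, the linear case is usually enough. A minimal sketch, assuming strictly increasing timestamps, that resamples irregularly timed readings onto a uniform grid (the function name and signature are illustrative, not from the tutorial):

```python
from bisect import bisect_right


def resample_linear(times, values, rate_hz):
    """Linearly interpolate irregular (time, value) readings onto a uniform grid.

    Assumes `times` is strictly increasing. Returns a list of (t, value)
    pairs starting at times[0] and spaced 1 / rate_hz apart.
    """
    if len(times) < 2:
        raise ValueError("need at least two samples")
    step = 1.0 / rate_hz
    out = []
    n = 0
    t = times[0]
    while t <= times[-1]:
        # Index of the first reading after t, clamped to the last reading.
        k = min(bisect_right(times, t), len(times) - 1)
        t0, t1 = times[k - 1], times[k]
        w = (t - t0) / (t1 - t0)
        out.append((t, values[k - 1] + w * (values[k] - values[k - 1])))
        n += 1
        t = times[0] + n * step  # avoid accumulating rounding error
    return out
```

The same idea brings streams with different rates onto a common clock before they are combined.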

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
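A sketch of how calibration and scaling typically fit together: a raw reading is first normalized using a calibrated minimum and maximum, then mapped to a control parameter. The frequency range, the exponent and the function names below are illustrative assumptions.

```python
def normalize(raw, raw_min, raw_max):
    """Map a calibrated raw reading to [0, 1], clamping out-of-range values."""
    u = (raw - raw_min) / (raw_max - raw_min)
    return min(max(u, 0.0), 1.0)


def knob_to_frequency(u, f_low=55.0, f_high=880.0):
    """Exponential map, so equal knob movements give equal musical intervals."""
    return f_low * (f_high / f_low) ** u


def slider_to_gain(u, exponent=2.0):
    """Power-law map from slider position to linear amplitude gain."""
    return u ** exponent
```

For example, `knob_to_frequency(normalize(raw, lo, hi))` sends the middle of the knob's travel to the geometric mean of the two frequency limits rather than their arithmetic mean.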

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
– Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
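To show what the autocorrelation computes, here is a direct (non-FFT) sketch that estimates a period as the lag with the largest autocorrelation. It is the slow O(N · lags) form, chosen for readability; the efficient route mentioned above goes through the FFT instead. The names and the test signal are illustrative assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """Direct autocorrelation of a mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    z = [v - mean for v in x]
    return [sum(z[n] * z[n + lag] for n in range(len(z) - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) of the largest autocorrelation value in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A sine wave that repeats every 25 samples.
signal = [math.sin(2 * math.pi * n / 25) for n in range(400)]
period = estimate_period(signal, min_lag=5, max_lag=40)
```

The lower lag bound keeps the search away from the trivial maximum at lag 0.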

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
– K-means
– Compression-based
– Histogram-based
– Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
– Target representation and location (bottom up)
– Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

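Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular filter with centre frequency CF and neighbouring centre frequencies CF - 130 and CF + 130 (linear spacing) or CF / 1.0718 and CF × 1.0718 (log spacing)]

Mel-filtering → Log → DCT → MFCCs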

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
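Energy compaction is easy to check numerically. The sketch below implements an orthonormal DCT-II directly from its definition and measures how much of the energy of a smooth 40-band toy "log spectrum" falls in the first 13 coefficients. The toy spectrum and all names are made up for illustration; this is not the MFCC code used in the tutorial.

```python
import math


def dct2(x):
    """Orthonormal DCT-II computed directly from the definition (O(N^2))."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


# A smooth, slowly varying toy spectrum over 40 bands (illustrative data).
log_spectrum = [math.exp(-((b - 12) / 10.0) ** 2) for b in range(40)]
coeffs = dct2(log_spectrum)
energy = [c * c for c in coeffs]
fraction_in_first_13 = sum(energy[:13]) / sum(energy)
```

Because the transform is orthonormal, the total energy of the coefficients equals that of the input, so the fraction is a fair measure of what is lost by discarding the higher coefficients.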

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
– Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix
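A minimal sketch of the eigenanalysis idea, under two stated simplifications: it uses the covariance matrix of mean-centred data (standardizing each feature first would give the correlation-matrix version), and it finds only the first principal direction, by power iteration rather than a full eigendecomposition. All names are illustrative.

```python
import math


def covariance_matrix(rows):
    """Mean vector and sample covariance matrix of equal-length feature vectors."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    return mean, cov


def first_principal_component(cov, iterations=200):
    """Dominant eigenvector of a symmetric matrix by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d  # arbitrary unit-length starting vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def project(row, mean, direction):
    """Coordinate of one feature vector along the principal direction."""
    return sum((x - m) * u for x, m, u in zip(row, mean, direction))
```

Stacking several such directions as columns gives the projection matrix used for dimensionality reduction.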

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13

Self-Organizing Maps

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
– Generative approaches
  • Naïve Bayes
  • Gaussian Mixture Models
– Discriminative approaches
  • Support Vector Machines
  • Decision trees
• Non-parametric
– K-nearest Neighbors
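K-nearest neighbors is the simplest of these to write down, since "training" is just storing the labelled feature vectors. A minimal sketch with Euclidean distance and majority voting; the function name and the drum labels in the usage note are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

With two well-separated groups of 2D features labelled "kick" and "snare", a query near one group receives that group's label.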

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
– Hierarchical
– Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
– EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
– DBSCAN
• Graph-based
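As an illustration of the first entry, here is a compact sketch of basic K-means (Lloyd's algorithm) with random initialisation from the data points. The fixed iteration count, the seed and the names are assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm; returns (centroids, cluster index of each point)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its points.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels
```

The result depends on the initial centroids in general, which is one reason variants such as multiple restarts exist.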

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
– Davies-Bouldin Index
– Dunn Index
• External evaluation (additional ground truth labels), similar to classification
– F-measure
– Rand measure
– Jaccard Index
– Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
– Linear regression
– Ordinary least squares
• Non-parametric
– Kernel regression
– Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
– Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
– Parametric methods
– F-test
– t-test for individual parameters
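A small sketch tying the formulation to the evaluation: ordinary least squares for a single input feature, followed by the coefficient of determination computed as 1 - SS_residual / SS_total. The function names and the single-feature restriction are simplifications for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

A perfect linear relationship gives R squared equal to 1; noise in the targets pulls it below 1.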

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of input, output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late fusion are the extremes of a full spectrum of possibilities

[Diagram blocks, each shown twice: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing happens, with some delay, at the same time as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
– Observable
– Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
– Estimate what transitions occurred and what states generated the sensor reading, and update models
– Updated models provide new estimates and

99

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

[Figure: hidden Markov model. Labels: Hidden State (single discrete variable) with states 1 to 4; Observed; Model; Transition Probs P( t | t-1 ); Emission Probs p( | ) at time t; Observations]

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
– Prior knowledge of state
– Prediction step (based on some model)
– Update step (compare prediction to measurements)
– Readjust model
– Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y\, y + b_y,\ \sigma_y)(x)$
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
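The forward step is easiest to see in one dimension, where the covariance matrix reduces to a single variance. The sketch below assumes a random-walk state observed directly with additive Gaussian noise; this model, the parameter names and the defaults are illustrative assumptions, not the configuration used in the case studies.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_mean=0.0, initial_var=1.0):
    """Scalar Kalman filter for a random-walk state observed with noise.

    Assumed model: x[t+1] = x[t] + w and z[t] = x[t] + v, with
    w ~ N(0, process_var) and v ~ N(0, measurement_var).
    Returns the list of (mean, variance) estimates after each update.
    """
    mean, var = initial_mean, initial_var
    estimates = []
    for z in measurements:
        # Prediction step: the state is carried forward, uncertainty grows.
        var = var + process_var
        # Update step: the gain weighs the measurement against the prediction.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates
```

The ratio of `process_var` to `measurement_var` sets the trade-off named in the last bullet: a large ratio trusts the observations, a small ratio trusts the model.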

102

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
– Prior knowledge of state
– Prediction step (based on some model)
– Update step (compare prediction to measurements)
– Readjust model
– Output estimate of state
• Statistically optimal estimate of system state

103

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
– comprehensive understanding of current (work) practices
– greater ability to predict the impact of a new or re-designed technology
– possibly greater buy-in for the system
• down side
– principal cost is time, both the ethnographer's and the users'
– could perpetuate negative aspects of current practices
– can produce vast (unmanageable) amount of data
– output is description of practices rather than specific designs
  • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
– active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
– users are excellent at reacting to suggested system designs
  • designs must be concrete and visible
– users bring in important "folk" knowledge of work context
  • knowledge may be otherwise inaccessible to design team
– greater buy-in for the system often results
• down side
– hard to get a good pool of end users
  • expensive, reluctant
– users are not expert designers
  • don't expect them to come up with design ideas from scratch
– the user is not always right
  • don't expect them to know what they want
– conservative bias to perpetuate current practices
  • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
– the voice editor by IBM (1984)

[Figure panels: The Wizard; What the user sees]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
– "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
– interprets subject's input according to an algorithm
– has computer/screen behave in appropriate manner
• good for
– adding simulated and complex vertical functionality
– testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
– Assume normality for relevant distributions, work in parameter space (means and variances)
– Student t-test and ANOVA
• Non-parametric (no normality assumption)
– Kruskal-Wallis
– Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
– p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
– Control group and treatment group (n = participants in each group)
– Same group before and after treatment
– Assumptions: sample size, variance
  • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
1. compute df
2. choose desired significance p (aka α)
3. calculate value of the t statistic
4. compare it to the critical value of t given p, df: t(p, df)
5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution curves marking the region or regions for rejecting the null hypothesis, the critical value t(p, df), and the sample means X1 and X2]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
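The formulas on this slide did not survive extraction. As a hedged illustration, the three steps can be sketched in Python for the independent, equal-variance case; the function name `pooled_t` and the sample data are made up for this example and are not from the tutorial.

```python
import math

def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled (combined) variance.

    Assumes both groups share one variance; returns (t, df).
    """
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    # Sums of squared deviations from each group's own mean.
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2                       # degrees of freedom
    pooled_var = (ss1 + ss2) / df          # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))  # std. error of difference
    return (mean1 - mean2) / sed, df

# Illustrative data only.
control = [5.0, 6.0, 7.0, 8.0, 9.0]
treatment = [1.0, 2.0, 3.0, 4.0, 5.0]
t, df = pooled_t(control, treatment)
print(t, df)
```

For these made-up samples the means are 7 and 3, the pooled variance is 2.5 and sed is 1, so t = 4 with df = 8. The two-tailed critical value for α = 0.05 and df = 8 is about 2.306, so the null hypothesis would be rejected at that level.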

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
– in back of any statistics textbook
– on web, e.g.
  http://www.medcalc.be/manual/mpage13-04b.html
  http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
– Matlab or Excel functions
– web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

[Table of critical values of t; column headings, two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
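A minimal sketch of the one-way case, assuming plain Python lists as groups: it divides the between-group variance by the within-group variance to get the F ratio. The function name `one_way_anova` and the data are illustrative assumptions, not material from the tutorial.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Illustrative data only.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_b, df_w = one_way_anova(groups)

# Rescaling and offsetting every observation leaves F unchanged.
rescaled = [[10.0 * x + 5.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova(rescaled)
print(f_ratio, f_rescaled)
```

Here the between-group mean square is 13 and the within-group mean square is 1, so F = 13 with (2, 6) degrees of freedom. The rescaled data gives the same ratio, which is the scaling and bias independence mentioned above. The F value would then be compared against a table or converted to a p-value.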

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and implementation

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
– Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
– two antennae
– change in electrostatic field translated to pitch control of heterodyne oscillators
– beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
– downward and side-to-side
– potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
– like a musical instrument
• mapping partially learned
• speech is
– intelligible
– expressive
– slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of gesture-to-speech mappings, ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
– open configuration of hand represents open vocal tract
– X, Y position determines vowel sounds (like tongue)
• Consonants
– constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions
• constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
– Mixture-of-experts model
• Vowel/Consonant decision network
• Vowel expert network
• Consonant expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: Glove-TalkII system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function, synthesizer. Output: speech]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
– Input: 10 flex angles
  • 4x2 finger joints
  • Thumb abduction and thumb rotation
– Output
  • Probability of vowel
– Training
  • 2600 consonants, 700 vowels
  • 0 error
– Testing
  • 1380 consonants, 234 vowels
  • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
– Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
– Fixed std deviation (0.15)
– Input: x, y values
– Output: 8 formant parameters
• Training
– 100 examples of each vowel with random noise added
– 0.1 error
• Testing
– 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
– Gaussian activation
• Weights are centre
– Normalized over all units in group
• Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
– interpolation according to width parameter
– Plateaus around nearest centre
• Closest RBF dominates
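The behaviour of normalized RBF units can be sketched as follows. This is an illustrative forward pass only, assuming Gaussian units with a shared width; the function names, centres and output weights are hypothetical and are not the Glove-TalkII parameters.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations, normalized so that they sum to 1 over all units."""
    acts = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-dist_sq / (2.0 * width ** 2)))
    total = sum(acts)  # assumes x is close enough to some centre that total > 0
    return [a / total for a in acts]

def rbf_network(x, centres, width, out_weights):
    """Linear output layer on top of the normalized hidden units."""
    hidden = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(h * w[k] for h, w in zip(hidden, out_weights)) for k in range(n_out)]

# Two hypothetical centres in a 2-D (x, y) input space, one output per unit.
centres = [(0.0, 0.0), (1.0, 0.0)]
out_weights = [[100.0], [200.0]]

near_first = normalized_rbf((0.05, 0.0), centres, 0.15)
midpoint = rbf_network((0.5, 0.0), centres, 0.15, out_weights)
print(near_first, midpoint)
```

Close to the first centre its unit takes almost all of the normalized activation (the plateau), while halfway between the two centres the output is an even blend of the two output weights.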

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
– Input: finger and thumb angles
  • Note: added thumb abduction and rotation later
– Output: formant parameters and voicing
• Training
– 350 approximants, 1510 fricatives and 740 nasals
– 0.05 error
• Testing
– 255 approximants, 960 fricatives, 160 nasals
– 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
– ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
– 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

More output devices

12

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13


ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al. 2009)

Tuesday 22 October 13


ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
– Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
– Easy to learn and interpret
– Expressive, especially for continuous controllers
• Beyond Direct Mapping
– Gesture recognition (pinch to zoom)
– Speech recognition
– Adaptive, possibly domain and person specific
– More similar to human-to-human interaction
– Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
– Supervised and unsupervised learning
– Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
– MATLAB culture
– No HCI, user-centered training
– Focus on algorithms, not programming experience
• CS background
– No DSP
– No circuits
– Focus on programming experience, not algorithms
• Music
– Performance and composition culture
– No HCI, DSP or programming
• Integration: putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music Fun and Effort
Effort and pain
If you are lucky

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

• Electrical: Voltage, Resistance, Impedance
• Optical: Color, Intensity
• Magnetic: Induced Current, Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (a typical sensor has variable resistance)
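One common way to turn a resistance change into a voltage is a voltage divider followed by an ADC. The sketch below assumes the output is taken across a fixed resistor in series with the sensor, a 5 V supply and a 10-bit converter; all of these values and the function names are illustrative assumptions rather than details from the slide.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC reading in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    clamped = min(max(voltage, 0.0), v_ref)
    return round(clamped / v_ref * levels)

# Hypothetical sensor at 30 kOhm in series with a 10 kOhm fixed resistor.
v_out = divider_voltage(30_000.0, 10_000.0)
counts = adc_counts(v_out)
print(v_out, counts)
```

With these assumed values the divider gives 1.25 V, which a 10-bit ADC referenced to 5 V reads as 256. As the sensor resistance drops, the voltage across the fixed resistor and the reading both rise.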

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
– lightly packed carbon
– compression increases conduction
– needs power supply
• Capacitor (condenser)
– capacitor between a stationary metal plate and a light metallic diaphragm
– compression changes capacitance by moving diaphragm
– needs power supply
• Electret and Piezoelectric
– mentioned before
– no external power needed
• Magnetic (moving coil)
– induction: moving conductor in magnetic field
– diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
– about 3 transistors (photodiode version)
– 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
– only viable recently as sub-micron tech gets better
– only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
– head, arms and feet
– body geometry
– 20 joints per person
• face recognition
• RGB camera
– 30 Hz
• depth sensor
– Infrared projection + camera
• microphone array
– directional sound localization, speech recognition and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
– i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
– Brushless DC motor that divides rotation into equal steps
– Move and hold, no feedback circuitry required
– Low cost
• Servo Motor
– Motor coupled with sensor for feedback
– High performance alternative to stepper motors
– Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
– GPS
– Accelerometer
– Gyro
– Temperature
– Microphone
– Capacitive display
– Networking
– input port for more
• Actuators
– Speaker
– Headphones
– Vibrator
– High-res display
– Networking
– Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
– e.g. National Instruments DAQ
  • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
  • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D3D ND + time

50

Sensors and Actuators

Time Time

Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
– Fixed and adaptive threshold
– Local maximum
– Spacing constraints
– Interpolation
– Output is peak locations and amplitudes
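As a hedged sketch of how three of these variants combine, the function below applies a fixed threshold, a local-maximum test and a minimum spacing between peaks. The name `detect_peaks`, its parameters and the test signal are assumptions made for illustration; adaptive thresholds and interpolation are left out.

```python
def detect_peaks(signal, threshold, min_spacing):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    When two candidates are closer than min_spacing samples, only the
    larger one is kept. This is a simplification: a replacement peak is not
    re-checked against earlier peaks.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])   # keep the taller of the two
            continue
        peaks.append((i, signal[i]))
    return peaks

# Made-up sensor trace: the small bump at index 3 is below threshold and the
# bump at index 7 is too close to the larger peak at index 5.
trace = [0.0, 1.0, 0.0, 0.2, 0.0, 3.0, 2.5, 2.8, 0.0, 0.0, 4.0, 0.0]
peaks = detect_peaks(trace, threshold=0.5, min_spacing=3)
print(peaks)
```

The output is a list of peak locations and amplitudes, matching the last bullet above.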

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples 22050 Hz

4096 samples 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
– Simplest approach: LPF
– Spectral subtraction using noise profile
– Median filtering in time-frequency plane
• Image
– Challenge: preserve edge discontinuities
– Gaussian filtering, 2D
– Anisotropic diffusion: preserves edges
– Median filtering
– Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
– L/M: interpolation (L) followed by decimation (M)
– Band-limited interpolation at arbitrary points
– Sinc interpolation results in perfect reconstruction for band-limited continuous signals
– Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
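For sensor streams the linear variant is often enough. The following is a minimal sketch, assuming timestamped readings with strictly increasing times that are resampled onto a uniform grid; the function name and the data are illustrative and not from the tutorial.

```python
def resample_linear(times, values, rate_hz):
    """Linearly interpolate irregular samples onto a uniform grid.

    times must be strictly increasing; returns a list of (time, value) pairs
    starting at times[0] and spaced 1 / rate_hz apart.
    """
    period = 1.0 / rate_hz
    n_out = int((times[-1] - times[0]) / period + 1e-9) + 1
    out = []
    j = 0
    for k in range(n_out):
        t = times[0] + k * period
        # Advance to the input segment [times[j], times[j + 1]] containing t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append((t, values[j] + frac * (values[j + 1] - values[j])))
    return out

# Hypothetical asynchronous readings resampled at 1 Hz.
uniform = resample_linear([0.0, 0.5, 2.0], [0.0, 1.0, 4.0], rate_hz=1.0)
print(uniform)
```

Band-limited (sinc-based) interpolation would replace the straight-line segment with a weighted sum over many neighbouring samples, at higher computational cost.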

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range, units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
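Range calibration and power-law scaling can be combined in a few lines. The sketch below assumes raw integer sensor readings and a control parameter in [0, 1]; the function names, the exponent and the readings are hypothetical choices for illustration.

```python
def calibrate(readings):
    """Find the observed range (minimum and maximum) of a sensor."""
    return min(readings), max(readings)

def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from [lo, hi] to [out_min, out_max] with a power law."""
    norm = (reading - lo) / (hi - lo)
    norm = min(max(norm, 0.0), 1.0)        # clamp readings outside the range
    return out_min + (norm ** exponent) * (out_max - out_min)

# Readings collected while the performer sweeps a slider end to end.
lo, hi = calibrate([112, 530, 948])

# Map the slider to a volume in [0, 1] with a squared response.
volume = scale(530, lo, hi, 0.0, 1.0, exponent=2.0)
print(lo, hi, volume)
```

With the calibrated range 112 to 948, the mid-travel reading 530 normalizes to 0.5 and the squared response gives a volume of 0.25. An exponent of 1 would give a plain linear mapping.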

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
– Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
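A correlation-based periodicity estimate can be sketched as below. For self-containment this uses the direct O(N × lags) sum rather than the FFT-based computation the slide alludes to; the function names and the test signal are assumptions made for illustration.

```python
import math

def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag (direct sum)."""
    mean = sum(x) / len(x)
    centred = [v - mean for v in x]
    return [
        sum(centred[n] * centred[n + lag] for n in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]

def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Synthetic signal repeating every 20 samples.
signal = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
period = dominant_period(signal, min_lag=5, max_lag=50)
print(period)
```

The minimum lag excludes the trivial peak at lag 0. Cross-correlation follows the same pattern with a second signal in place of the shifted copy.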

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
– K-means
– Compression-based
– Histogram-based
– Edge detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
– Target representation and location (bottom up)
– Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

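Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: filter spacing around centre frequency CF: CF-130 and CF+130 for the linear filters, a factor of 1.0718 for the log filters]

Mel-filtering -> Log -> DCT -> MFCCs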

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away high coefficients
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
– Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
– Generative approaches
  • Naïve Bayes
  • Gaussian Mixture Models
– Discriminative approaches
  • Support Vector Machines
  • Decision trees
• Non-parametric
– K-nearest Neighbors

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
– Hierarchical
– Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
– EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
– DBSCAN
• Graph-based

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
– Davies-Bouldin Index
– Dunn Index
• External evaluation (additional ground truth labels), similar to classification
– F-measure
– Rand measure
– Jaccard Index
– Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
– Linear regression
– Ordinary least squares
• Non-parametric
– Kernel regression
– Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
– Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
– Parametric methods
– F-test
– t-test for individual parameters

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: fusion pipeline stages, each shown twice: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use the camera to constrain the factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time

• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
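To make the distinction concrete, here is a small sketch of two smoothing filters. The first is causal (it averages the current and past samples only); the second is centered and therefore needs future samples, so it cannot run on a live stream without added delay. Function names are illustrative.

```python
def causal_average(x, n):
    """y[t] averages x[t-n+1 .. t]: past and present samples only."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - n + 1): t + 1]
        out.append(sum(window) / len(window))
    return out

def centered_average(x, n):
    """y[t] averages x[t-h .. t+h]: requires h future samples (non-causal)."""
    h = n // 2
    out = []
    for t in range(len(x)):
        window = x[max(0, t - h): t + h + 1]
        out.append(sum(window) / len(window))
    return out

impulse = [0, 0, 0, 3, 0, 0, 0]
```

With the impulse at index 3, the centered filter already responds at index 2, before the event has happened, while the causal filter responds only from index 3 onward.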

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97
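The slide content here was a figure. As a hedged sketch of the textbook form of dynamic time warping (not necessarily the variant shown in the talk), the following computes the minimum cumulative alignment cost between two one-dimensional sequences using absolute difference as the local cost.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]
```

A gesture performed at a different speed, such as `[1, 2, 2, 3]` against `[1, 2, 3]`, aligns at zero cost. Note that this offline form needs both full sequences, so it is not causal.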

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over

• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal

• Filtering: posterior distribution over the current state given the evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of a future state given evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and
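The filtering task can be sketched for a discrete hidden state as a predict-then-update recursion. The two-state model and all probabilities below are invented for illustration; state 0 and observation 0 have no meaning beyond this example.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each t."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the observation likelihood, then normalize.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history

prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]   # transition[i][j] = P(next=j | current=i)
emission = [[0.9, 0.1], [0.2, 0.8]]     # emission[j][o] = P(obs=o | state=j)
history = forward_filter(prior, transition, emission, [0, 0])
```

Each step uses only past and present observations, so filtering is causal; smoothing, by contrast, revisits past states using later evidence.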

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

Uncertainty and Time

[Figure: HMM with a hidden state (single discrete variable) and observations; the model consists of transition probabilities P(state t | state t-1) and emission probabilities p(observation t | state t)]

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering

• Streams of noisy input data
• Basic idea, t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state
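A minimal one-dimensional sketch of the predict/update loop, assuming the simplest possible model: the state is expected to stay constant, with process noise variance `process_var` and measurement noise variance `measurement_var`. These names and the constant-state model are assumptions made for illustration.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0            # prior knowledge: state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: state assumed constant, uncertainty grows.
        p = p + process_var
        # Update step: the gain weighs model reliability against the sensor.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

A large `measurement_var` makes the filter trust its model and respond slowly; a large `process_var` makes it follow the measurements closely.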

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter

• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability


Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction


Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

Human Computer Interaction

Tuesday 22 October 13

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

participatory design

• Users (often musicians) become first-class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

Wizard of Oz

• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: what the user sees; the Wizard]

Tuesday 22 October 13

Wizard of Oz

• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood

• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness would protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

Human Computer Interaction

Tuesday 22 October 13

the t-test

• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: regions (or region) for rejecting the null hypothesis, bounded by the critical value t(p,df)]

Tuesday 22 October 13

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
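The formulas on this slide did not survive extraction. As a hedged sketch, the three steps can be written out for the standard independent two-sample t-test with pooled (equal) variance, which is an assumption about which variant the slide showed.

```python
import math

def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    # Step 1: combined (pooled) variance of the two samples.
    pooled_var = (ss1 + ss2) / df
    # Step 2: standard error of the difference between the means.
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    # Step 3: the t statistic.
    return (m1 - m2) / sed, df

t, df = pooled_t([1, 2, 3], [4, 5, 6])
```

The returned `t` is then compared against the critical value for the chosen significance level and `df`.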

Tuesday 22 October 13

comparing t with critical

• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

Tuesday 22 October 13

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I

• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II

• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
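A minimal sketch of the one-way case: the F statistic is the between-group variance divided by the within-group variance. The function name and example data are illustrative.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation left inside each group.
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, means))
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

data = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, dfb, dfw = one_way_anova_f(data)
# Rescaling and shifting every measurement leaves F unchanged.
f_scaled, _, _ = one_way_anova_f([[10 * x + 5 for x in g] for g in data])
```

The second call illustrates the independence from scaling and bias: both variances scale by the same factor, so their ratio is the same.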

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing


Tuesday 22 October 13

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Tuesday 22 October 13


Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

Spectrum of Gesture-to-Speech Mappings

Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Systems: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)
Approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500

Tuesday 22 October 13

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

GTII Mapping

• Input
  – 26+ dimensions
  – constrained subspace
• Output
  – 10 dimensions

Tuesday 22 October 13

GTII Mapping

• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

GTII Neural Networks

• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

Glove-TalkII System

[Figure: system block diagram]
• Inputs
  – Right hand data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz)
  – Contact switches
  – Foot pedal
• Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer
• Output: Speech

Tuesday 22 October 13


Tuesday 22 October 13

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


Tuesday 22 October 13

GTII Vowel Network

• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

Normalized RBF Units

• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
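A minimal sketch of a normalized RBF layer, assuming Gaussian units with a shared width and a linear blend of per-centre target vectors. The centres, width, and targets below are invented for illustration and are not the Glove-TalkII values.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activation per centre, normalized so the activations sum to 1."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtracting the smallest distance avoids underflow far from every
    # centre and does not change the normalized result.
    nearest = min(d2)
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, targets):
    """Blend the per-centre target vectors by the normalized activations."""
    weights = normalized_rbf(x, centres, width)
    dims = len(targets[0])
    return [sum(w * t[d] for w, t in zip(weights, targets))
            for d in range(dims)]

centres = [(0.0, 0.0), (1.0, 0.0)]   # hypothetical hand positions
targets = [[100.0], [200.0]]         # hypothetical output parameter per centre
```

Near a centre, its unit takes almost all of the weight (the plateau); between centres the output interpolates, and a smaller width makes the transition sharper.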

Tuesday 22 October 13


Tuesday 22 October 13

Consonant Network

• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII


Tuesday 22 October 13

DIVA in the hands of

145


Tuesday 22 October 13

Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147


Tuesday 22 October 13

Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149


Tuesday 22 October 13

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157


Tuesday 22 October 13

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

• Many resources available
  www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

SAGE

13

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)


Tuesday 22 October 13

ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)


Tuesday 22 October 13

ICASSP 2013 tutorial

Beyond direct mapping

• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music

• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges

• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges

• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI or user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project, Seattle, April 2001
Lectures/Discussions/Demos/Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives

• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process

1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

Sensors and Actuators

[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure

• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced Current
  – Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
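The circuit on this slide was a figure. One common way to turn a resistance change into a voltage is a two-resistor voltage divider feeding an analog-to-digital converter; the sketch below assumes that arrangement, and the 5 V reference and 10-bit converter are illustrative assumptions rather than values from the tutorial.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor in a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage in [0, v_ref] to an integer ADC reading."""
    levels = 2 ** bits
    return min(levels - 1, int(voltage / v_ref * levels))

# As the sensor resistance drops (e.g. more force), the reading rises.
reading_light = adc_counts(divider_voltage(5.0, 10000.0, 40000.0))
reading_firm = adc_counts(divider_voltage(5.0, 10000.0, 10000.0))
```

The fixed resistor sets where in the sensor's range the output is most sensitive, so it is usually chosen near the sensor's typical resistance.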

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing

• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras

• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

ICASSP 2013 tutorial

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors

• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors

• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect

• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets

• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ

• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47


Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

Time Time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators


Tuesday 22 October 13

ICASSP 2013 tutorial

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering

• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
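As a minimal sketch of a digital filter operating sample by sample, here is a one-pole low-pass (exponential smoother). The coefficient name `alpha` is an illustrative choice; values near 0 smooth heavily and values near 1 pass the input almost unchanged.

```python
def one_pole_lowpass(x, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with y[-1] = 0."""
    y, prev = [], 0.0
    for sample in x:
        prev = alpha * sample + (1.0 - alpha) * prev
        y.append(prev)
    return y
```

Because each output depends only on the current input and the previous output, this filter is causal and cheap enough for noisy sensor streams.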

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection

• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
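A minimal sketch combining three of the variants: a fixed threshold, a local-maximum test, and a minimum spacing between peaks. When two candidates are closer than `min_spacing` samples, the larger one is kept. Parameter names are illustrative.

```python
def find_peaks(x, threshold, min_spacing):
    """Return a list of (index, amplitude) pairs for the detected peaks."""
    peaks = []
    for i in range(1, len(x) - 1):
        is_local_max = x[i] > x[i - 1] and x[i] >= x[i + 1]
        if x[i] >= threshold and is_local_max:
            if peaks and i - peaks[-1][0] < min_spacing:
                # Too close to the previous peak: keep the stronger one.
                if x[i] > peaks[-1][1]:
                    peaks[-1] = (i, x[i])
            else:
                peaks.append((i, x[i]))
    return peaks

signal = [0, 1, 0, 3, 0, 0, 0, 2, 0, 0.2, 0]
```

An adaptive threshold or sub-sample interpolation around each peak would be natural extensions; they are left out to keep the sketch short.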

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
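The tradeoff for the two window sizes on this slide can be worked out directly: a window of N samples at sample rate fs spans N / fs seconds and gives analysis bins spaced fs / N Hz apart. The helper name is illustrative.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

short_dur, short_bin = stft_resolution(256, 22050)    # about 11.6 ms, 86.1 Hz
long_dur, long_bin = stft_resolution(4096, 22050)     # about 185.8 ms, 5.4 Hz
```

The short window localizes events in time but smears frequency; the long window does the opposite. The product of the two quantities is always 1.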

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Denoising

• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling

• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
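For sensor data the linear case is often enough. The following sketch shows one way to evaluate an irregularly timed sensor stream at an arbitrary time and to resample it onto a uniform grid; the function names and the clamping behaviour at the ends are assumptions made for this illustration.

```python
import bisect


def sample_at(times, values, t):
    """Linearly interpolate the series (times, values) at continuous time t.

    times must be strictly increasing. Requests outside the covered range
    are clamped to the first or last value (a choice made for this sketch).
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    hi = bisect.bisect_right(times, t)  # first index with times[hi] > t
    lo = hi - 1
    fraction = (t - times[lo]) / (times[hi] - times[lo])
    return values[lo] + fraction * (values[hi] - values[lo])


def resample(times, values, rate_hz):
    """Resample an irregularly timed stream onto a uniform grid."""
    count = int((times[-1] - times[0]) * rate_hz) + 1
    return [sample_at(times, values, times[0] + k / rate_hz) for k in range(count)]
```

Band-limited (sinc-based) interpolation would replace the straight-line segment with a weighted sum over many neighbouring samples, at a higher computational cost.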

Calibration

• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling

• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency, or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
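Calibration and scaling are usually chained: first find the range of the raw readings, then map the normalized value onto the control parameter. The sketch below assumes a simple running minimum/maximum calibration and a power-law mapping; the class name, function name, and default exponent are hypothetical.

```python
class RangeCalibrator:
    """Tracks the minimum and maximum raw reading seen during calibration."""

    def __init__(self):
        self.low = None
        self.high = None

    def observe(self, reading):
        self.low = reading if self.low is None else min(self.low, reading)
        self.high = reading if self.high is None else max(self.high, reading)

    def normalize(self, reading):
        """Map a raw reading to [0, 1], clamping values outside the range."""
        if self.low is None or self.high == self.low:
            raise ValueError("calibrate with at least two distinct readings")
        unit = (reading - self.low) / (self.high - self.low)
        return min(1.0, max(0.0, unit))


def power_law_scale(unit, out_min, out_max, exponent=2.0):
    """Map a normalized value in [0, 1] to [out_min, out_max] with a power law."""
    return out_min + (out_max - out_min) * unit ** exponent
```

In use, the performer would sweep the sensor through its full range while `observe` is called, after which `power_law_scale(calibrator.normalize(raw), ...)` yields the control value.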

Periodicity Detection

• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation

Efficient computation when N is a power of 2
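A correlation-based periodicity detector can be sketched directly from the definition. The version below computes the autocorrelation in the time domain, which costs O(N · max_lag); the efficient computation mentioned on the slide would go through the FFT instead. Function names and the lag-range parameters are assumptions.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    centred = [v - mean for v in x]
    return [
        sum(centred[n] * centred[n + lag] for n in range(len(centred) - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the strongest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
```

Restricting the search to a plausible lag range (for example the range of tempi of interest) avoids the trivial maximum at lag 0.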

Similarity Matrix

Image Segmentation

• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking

• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)

NIME Object tracking

Feature Extraction: Audio

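Mel Frequency Cepstral Coefficients

• Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular filter centered at CF, with edges at CF-130 and CF+130 (linear spacing) or CF/1.0718 and CF×1.0718 (log spacing)]

Mel-filtering → Log → DCT → MFCCs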

Discrete Cosine Transform

• Strong energy compaction
• For certain types of signals approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

Feature Extraction: Image

• Color, texture, shape
• Example: color histograms

[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics

• Delta features
• Aggregate statistics of time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis

Projection matrix

PCA: eigenanalysis of correlation matrix

Self-Organizing Maps

Uncertainty and Time

Classification Formulation

• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms

• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
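Of the algorithms listed, K-nearest neighbors is the easiest to write down. The sketch below assumes feature vectors are tuples of numbers, uses Euclidean distance, and takes a majority vote; the function name and the data layout are illustrative assumptions.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Predict the label of query by majority vote among the k nearest examples.

    training_set is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

With two hypothetical classes of drum strokes described by two features each, a call would look like `knn_predict(examples, (0.5, 0.5), k=3)`. An odd `k` avoids ties in the two-class case.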

Classification Evaluation

• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation

• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms

• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
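K-means alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its members. The sketch below is a minimal version that, as a simplifying assumption, initializes the centroids from the first `k` points (practical implementations usually use random or k-means++ initialization); names are hypothetical.

```python
import math


def kmeans(points, k, iterations=20):
    """Cluster points (tuples of numbers) into k groups.

    Returns (centroids, labels). Centroids start at the first k points,
    which keeps the sketch deterministic but is not a robust initialization.
    """
    centroids = [list(p) for p in points[:k]]

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    labels = [nearest(p) for p in points]
    return centroids, labels
```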

Clustering Evaluation

• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation

• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression Evaluation

• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
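For the single-feature case, ordinary least squares and the coefficient of determination fit in a few lines. This is a sketch under the assumption of one input variable and a straight-line model; the function names are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the line explains all of the variance in the targets; values near 0 mean the fit is no better than always predicting the mean.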

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, “Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies: Regression and Classification

Some Results

Advantages

• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion

• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with the stages Feature Extraction → Dimensionality Reduction → Feature Selection → Classification]

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time

• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

Dynamic Time Warping

Modeling Uncertainty over

• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal

• Filtering: posterior distribution over current state given evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Hidden Markov Models I

[Figure: HMM over time steps 1, 2, 3, 4. The hidden state (a single discrete variable) emits one observation per step. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t)]
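The filtering task for a discrete HMM can be illustrated with the forward recursion: predict with the transition model, weight by the emission probability of the new observation, and renormalize. The sketch below assumes states and observations are encoded as integer indices; the function name and argument layout are hypothetical.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered belief over hidden states after each observation.

    prior[i]          : P(state_0 = i)
    transition[i][j]  : P(state_t = j | state_{t-1} = i)
    emission[j][o]    : P(observation_t = o | state_t = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight by the evidence and renormalize.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history
```

The normalizing constant `total` at each step is the likelihood of the new observation given the earlier ones, so multiplying these constants together gives the likelihood of the whole evidence sequence.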

Kalman Filtering

• Streams of noisy input data
• Basic idea (t -> t+1)
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter

• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
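The predict/update cycle and the trade-off between observation and model reliability can be seen in a scalar Kalman filter. The sketch below is a deliberate simplification: it assumes a one-dimensional random-walk state model (the state stays put except for process noise) rather than the constant-velocity model mentioned above, and all names are hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_mean=0.0, initial_var=1.0):
    """Scalar Kalman filter; returns a list of (mean, variance) estimates."""
    mean, var = initial_mean, initial_var
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed unchanged, uncertainty grows.
        var = var + process_var
        # Update step: blend prediction and measurement by their reliability.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates
```

A large `measurement_var` makes the gain small, so the filter trusts its model and reacts slowly; a large `process_var` does the opposite and makes the estimate follow the measurements closely.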


Case Studies

Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

Side note: origins in Scandinavia

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz

• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure panels: The Wizard; What the user sees]

Wizard of Oz

• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood

• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test

• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test

• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution with critical value t(p,df), showing the regions or region for rejecting the null hypothesis]

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
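The three steps can be sketched for the independent (two-group) case as follows. This assumes the standard pooled-variance form with df = n1 + n2 - 2, which matches the equal-variance assumption stated earlier; the function name is hypothetical and the slide's own formulas were not preserved in this transcript.

```python
import math


def independent_t(sample_a, sample_b):
    """Student t statistic for two independent samples, pooled variance.

    Returns (t, df) with df = n1 + n2 - 2.
    """
    n1, n2 = len(sample_a), len(sample_b)
    mean1 = sum(sample_a) / n1
    mean2 = sum(sample_b) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample_a)
    ss2 = sum((x - mean2) ** 2 for x in sample_b)
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df                       # combined variance
    sed = math.sqrt(pooled_variance * (1.0 / n1 + 1.0 / n2))  # std. error of difference
    return (mean1 - mean2) / sed, df
```

For the samples [1, 2, 3, 4, 5] and [3, 4, 5, 6, 7] this gives t = -2 with df = 8. The two-tailed critical value at α = 0.05 for df = 8 is about 2.306, so with these numbers the null hypothesis would not be rejected.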

comparing t with critical

• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

[Table of critical values of t; column headings, two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I

• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II

• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
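The "dividing two variances" step is the F ratio. The sketch below computes it for the one-way case, assuming each group is a plain list of numbers; the function name is hypothetical, and looking up the resulting F in a table (or computing its p-value) is left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation left over inside the groups.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

Multiplying every observation by a constant and adding an offset changes both variances by the same factor, so the ratio stays the same, which is the independence of scaling and bias noted above.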

Integration and implementation

I&I, Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977

Canada Science and Technology Museum

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: mapping types arranged by approximate time per gesture for connected speech (msec). Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Axis values: 10-30, 100, 130, 200, 500. Systems marked: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping

• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions

GTII Mapping

• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks

• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, and foot pedal. A preprocessor feeds the V/C decision network, the vowel network, and the consonant network, alongside a fixed pitch mapping and a fixed stop mapping. A combining function drives the synthesizer, which produces the speech output]

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network

• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units

• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
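The interpolation and plateau behaviour follows from dividing each Gaussian activation by the sum over all units. The sketch below is an illustrative forward pass only (no training), assuming a shared width for all units and a linear output layer; function names and the argument layout are hypothetical and not taken from the Glove-TalkII implementation.

```python
import math


def normalized_rbf_activations(x, centres, width):
    """Gaussian activations around each centre, normalized to sum to 1."""
    exponents = [
        -sum((a - b) ** 2 for a, b in zip(x, c)) / (2.0 * width ** 2)
        for c in centres
    ]
    # Subtracting the largest exponent leaves the normalized result unchanged
    # but prevents every exp() from underflowing to zero far from all centres.
    largest = max(exponents)
    raw = [math.exp(e - largest) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]


def rbf_network_output(x, centres, width, output_weights):
    """Weighted sum of the hidden activations; output_weights[j] is a list per unit."""
    hidden = normalized_rbf_activations(x, centres, width)
    n_out = len(output_weights[0])
    return [sum(h * w[k] for h, w in zip(hidden, output_weights)) for k in range(n_out)]
```

Halfway between two centres the output is the average of their weight vectors; far away from every centre the nearest unit takes essentially all of the normalized activation, which gives the plateau.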


Consonant Network

• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders

Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects

• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Motivation and Overview

REACTABLE

Reactable, Music Technology Group (2006)

Smartphones as instruments

iPhone Ocarina from Smule™ (Wang et al., 2009)

Beyond direct mapping

• Direct mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond direct mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and

Relevance beyond music

• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Signal Processing Challenges

• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher-level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

Interdisciplinary Challenges

• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together

New Interfaces for Musical Expression (NIME)

• First organized as a workshop of ACM CHI’2001
• Experience Music Project, Seattle, April 2001
• Lectures / Discussions / Demos / Performances

Research on HCI/Music

Tutorial objectives

• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• Steps 1 and 2 are often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort

Effort and pain

If you are lucky

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more…

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
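A common way to turn a variable resistance into a voltage is a two-resistor voltage divider read by an ADC. The slide does not name a circuit, so the sketch below is an assumption: the 5 V supply, the fixed resistor in the lower leg, the 10-bit converter, and the function names are all illustrative choices.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a series divider.

    Assumed wiring: supply -> sensor -> output node -> fixed resistor -> ground.
    As the sensor resistance drops, the output voltage rises.
    """
    return v_supply * r_fixed / (r_fixed + r_sensor)


def adc_counts(v_out, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code (ideal converter, rounded)."""
    levels = 2 ** bits
    code = int(v_out / v_ref * (levels - 1) + 0.5)
    # Clamp to the valid code range of the converter.
    return max(0, min(levels - 1, code))


# Example: a sensor whose resistance falls from 100 kOhm to 10 kOhm,
# read against a hypothetical 10 kOhm fixed resistor.
light_touch = divider_voltage(5.0, 10_000, 100_000)
firm_press = divider_voltage(5.0, 10_000, 10_000)
```

The choice of fixed resistor sets which part of the sensor's resistance range maps to the most usable voltage swing, so it is usually picked near the middle of the expected sensor range.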

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs. 1-2 W)
• offer single-chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  • head, arms, and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition, and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Motors
• Synchronous electric motors
  – i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold; no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on the sensor bar (placed on the TV) allow triangulation for use as a pointing device
• A large diversity of control styles is possible in games and music

43

Sensors and Actuators

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

DAQ
• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

Time Time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and Time
• Human Computer Interaction
• Integration and Implementation
• Case Studies

52

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Fundamental building blocks of signal processing
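As an illustration of a digital filter operating sample by sample, here is a minimal sketch of a one-pole low-pass filter, a common first choice for smoothing sensor streams. The slides do not prescribe a particular filter; the function name and the cutoff-to-coefficient formula are assumptions made for this example.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate_hz):
    """Smooth a sequence with y[n] = y[n-1] + a * (x[n] - y[n-1]).

    The coefficient a is derived from the cutoff frequency with the usual
    exponential-decay approximation a = 1 - exp(-2*pi*fc/fs).
    """
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    y = 0.0  # filter state starts at zero
    out = []
    for x in samples:
        y += a * (x - y)
        out.append(y)
    return out


# A constant input passes through (after settling), while a signal that
# alternates every sample (the highest representable frequency) is attenuated.
step_response = one_pole_lowpass([1.0] * 2000, cutoff_hz=5.0, sample_rate_hz=1000.0)
alternating = one_pole_lowpass([1.0, -1.0] * 500, cutoff_hz=5.0, sample_rate_hz=1000.0)
```

The filter is causal (it only uses past and present samples), so it can run in real time at the cost of some lag that grows as the cutoff is lowered.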

53

Signals and Features

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
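The variants listed above can be combined in a few lines. The sketch below is one possible combination (fixed threshold, local maximum test, minimum spacing, parabolic interpolation), not the method used by any particular system; the function name and parameters are hypothetical.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs for peaks in x.

    A sample is a candidate if it is at or above the fixed threshold and is
    a local maximum. Location and amplitude are refined by fitting a parabola
    through the candidate and its two neighbours. Candidates closer than
    min_spacing to the previous peak are merged, keeping the larger one.
    """
    peaks = []
    for n in range(1, len(x) - 1):
        is_local_max = x[n] > x[n - 1] and x[n] >= x[n + 1]
        if x[n] < threshold or not is_local_max:
            continue
        # Parabolic interpolation; the denominator is strictly negative
        # for any sample that passed the local-maximum test above.
        denom = x[n - 1] - 2.0 * x[n] + x[n + 1]
        offset = 0.5 * (x[n - 1] - x[n + 1]) / denom
        loc = n + offset
        amp = x[n] - 0.25 * (x[n - 1] - x[n + 1]) * offset
        if peaks and loc - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (loc, amp)
            continue
        peaks.append((loc, amp))
    return peaks


signal = [0, 1, 4, 1, 0, 0, 2, 5, 2, 0]
all_peaks = find_peaks(signal, threshold=3)
spaced_peaks = find_peaks(signal, threshold=3, min_spacing=10)
```

An adaptive threshold could be substituted by computing `threshold` from a running statistic of the signal; that part of the "secret sauce" is left out here.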

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
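The time-frequency tradeoff for the two window sizes on this slide can be checked with simple arithmetic: a window of N samples at sampling rate fs spans N/fs seconds and yields frequency bins spaced fs/N Hz apart. The helper below is a small sketch; its name is made up for this example.

```python
def stft_resolution(window_size, sample_rate_hz):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate_hz
    bin_spacing_hz = sample_rate_hz / window_size
    return duration_ms, bin_spacing_hz


short_window = stft_resolution(256, 22050)    # about 11.6 ms, about 86.1 Hz
long_window = stft_resolution(4096, 22050)    # about 185.8 ms, about 5.4 Hz
```

The short window localizes events in time but blurs nearby frequencies; the long window does the opposite. The product of the two quantities is constant, which is the tradeoff in one line.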

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion (preserves edges)
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
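For sensor data, linear interpolation is often enough to bring irregularly timed readings onto a uniform time grid. The following is a minimal sketch under stated assumptions: timestamps are strictly increasing, the requested times are sorted, and values outside the measured range are held at the nearest endpoint. The function name is hypothetical.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times.

    Assumes times is strictly increasing and new_times is sorted ascending.
    Requests before the first or after the last timestamp return the
    nearest endpoint value (no extrapolation).
    """
    out = []
    i = 0  # index of the segment [times[i], times[i+1]] currently in use
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        while times[i + 1] < t:
            i += 1
        frac = (t - times[i]) / (times[i + 1] - times[i])
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out


# Readings arrived at t = 0, 1 and 3 s; resample onto a 0.5 s / 1 s grid.
resampled = resample_linear([0, 1, 3], [0, 10, 30], [0, 0.5, 1, 2, 3])
```

Band-limited (sinc-based) interpolation would be the choice for audio-rate signals; the linear version trades accuracy for simplicity and low latency.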

60

Signals and Features

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power-law functions frequently used
• Frequently used in conjunction with calibration
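Calibration and scaling are often chained: first record the range a sensor actually produces, then map that range onto the control parameter, optionally through a power law. The sketch below assumes raw integer sensor readings and a frequency range of 20 to 2000 Hz purely for illustration; all names and numbers are hypothetical.

```python
def calibrate_range(readings):
    """Return the (minimum, maximum) observed during a calibration pass."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from [lo, hi] to [out_min, out_max].

    The reading is normalized to 0..1, clamped, raised to `exponent`
    (exponent=1.0 gives a linear map), then stretched to the output range.
    """
    u = (reading - lo) / (hi - lo)
    u = max(0.0, min(1.0, u))
    return out_min + (out_max - out_min) * u ** exponent


# Calibration pass: the performer sweeps the knob through its full travel.
lo, hi = calibrate_range([112, 530, 901, 240])

# Performance: map a raw reading to a frequency with a squared response.
freq_hz = scale(530, lo, hi, 20.0, 2000.0, exponent=2.0)
```

An exponent above 1 gives finer control near the bottom of the range, which is often preferred for perceptual quantities such as loudness or pitch.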

62

Signals and Features

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
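A correlation-based periodicity detector can be sketched directly from the definition of autocorrelation. The version below is the plain O(N x lags) form rather than the FFT-based computation the slide alludes to, and the function names are made up for this example.

```python
import math


def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A sinusoid that repeats every 20 samples, e.g. a strumming gesture
# sampled by a sensor at a fixed rate.
gesture = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
period = dominant_period(gesture, min_lag=5, max_lag=30)
```

The `min_lag` bound excludes the trivial maximum at lag 0; cross-correlation between two different streams follows the same pattern with the second signal substituted for the shifted copy.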

Tuesday 22 October 13


ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

67

Signals and Features

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

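Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular filter centered at CF, with edges at CF-130 and CF+130 (linear spacing) or CF/1.0718 and CF×1.0718 (log spacing)]

Mel-filtering → Log → DCT → MFCCs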

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
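Energy compaction can be seen on a small example. The sketch below implements the orthonormal DCT-II directly from its definition (an O(N^2) loop, not a fast algorithm) and measures how much of a smooth signal's energy lands in the first few coefficients. The function names and the ramp test signal are illustrative choices, not taken from the slides.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II of a real sequence, computed from the definition."""
    n = len(x)
    coeffs = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        # Scaling that makes the transform orthonormal (energy preserving).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


def energy_fraction(coeffs, keep):
    """Fraction of total energy contained in the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total


ramp = [float(i) for i in range(16)]  # a smooth, slowly varying signal
coeffs = dct_ii(ramp)
kept = energy_fraction(coeffs, keep=4)  # most of the energy is in 4 of 16 values
```

This is the same reasoning behind keeping only the first 13-20 MFCCs: the discarded high coefficients carry little of the log-spectrum's energy.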

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
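Of the algorithms listed, k-nearest neighbors is the shortest to write down, since "training" is just storing the labeled feature vectors. The sketch below uses Euclidean distance and a majority vote; the two-dimensional features and the "soft"/"loud" labels are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training vectors (Euclidean distance)."""
    by_distance = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: two features per gesture, two gesture classes.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["soft", "soft", "soft", "loud", "loud", "loud"]

prediction = knn_predict(train_x, train_y, (0.5, 0.5))
```

Because every query scans the whole training set, the cost grows with the amount of stored data, which matters when classification has to run in real time.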

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
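K-means alternates two steps: assign each point to its nearest center, then move each center to the mean of its assigned points. The sketch below takes the initial centers as an argument to keep the example deterministic; practical implementations usually randomize the initialization and test for convergence. Names and data are hypothetical.

```python
import math


def kmeans(points, centers, iterations=20):
    """Run a fixed number of k-means iterations.

    Returns (centers, labels), where labels[i] is the index of the center
    that points[i] was assigned to in the last assignment step.
    """
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: nearest center by Euclidean distance.
        labels = [
            min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # leave a center in place if it attracted no points
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centers, labels


points = [(0, 0), (0, 2), (2, 0), (10, 10), (10, 12), (12, 10)]
centers, labels = kmeans(points, centers=[(0, 0), (10, 10)])
```

The number of clusters is fixed by the length of the initial `centers` list, matching the common case where K comes from domain knowledge.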

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

86

Uncertainty and Time

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
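For a single input feature, ordinary least squares and the coefficient of determination fit in a few lines. The sketch below uses the closed-form slope and intercept for one predictor; the function names and the small data set are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0, 1, 2, 3]
ys = [1.0, 3.5, 4.5, 7.0]
slope, intercept = fit_line(xs, ys)
fit_quality = r_squared(ys, [slope * x + intercept for x in xs])
```

An R squared close to 1 says the line explains most of the variance in the data; it says nothing by itself about statistical significance, which is what the F-test and t-tests address.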

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems," IEEE Trans. on Multimedia, 2011

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel pipelines, each with the stages Feature Extraction → Dimensionality Reduction → Feature Selection → Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing keeps pace with the incoming sensor data, with some delay

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and

99

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

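[Figure: hidden Markov model. The hidden state is a single discrete variable (example model with states 1-4). Transition probabilities P(state at t | state at t-1) govern the hidden state; emission probabilities p(observation at t | state at t) generate the observations.]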

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, e.g., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
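The predict/update cycle is easiest to see in one dimension. The sketch below assumes the simplest possible model (the state is expected to stay where it is, plus Gaussian process noise) and a sensor that measures the state directly; a constant-velocity model would need a two-element state and matrices. The function name and the noise values are hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter with a random-walk state model.

    x is the state estimate (mean) and p its variance. The ratio of
    process_var to measurement_var sets the trade-off between trusting
    the model and trusting the observations.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement by the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# A sensor that reports a constant 5.0; the estimate starts at 0.0
# and moves toward the measurements as evidence accumulates.
estimates = kalman_1d([5.0] * 200, process_var=1e-4, measurement_var=1.0)
```

Raising `process_var` makes the filter follow the measurements more closely (less smoothing, less lag); raising `measurement_var` does the opposite.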

102


Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction


Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st-class members in the design process
  – active collaborators vs. passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure panels: The Wizard; What the user sees]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: sampling distribution marked with the critical value t(p,df), showing the two-tailed case (two regions for rejecting the null hypothesis) and the one-tailed case (one region for rejecting the null hypothesis)]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
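The formulas for these three steps did not survive in this transcript. The sketch below uses the standard textbook form for two independent samples with equal variance (pooled variance, standard error of the difference, then t, with df = n1 + n2 - 2); it is an assumption that the slide showed this same form. The function name and the sample data are made up.

```python
import math


def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic assuming equal variances.

    Returns (t, df).
    """
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    ss_a = sum((v - mean_a) ** 2 for v in sample_a)
    ss_b = sum((v - mean_b) ** 2 for v in sample_b)
    df = na + nb - 2
    combined_var = (ss_a + ss_b) / df                      # pooled variance
    sed = math.sqrt(combined_var * (1.0 / na + 1.0 / nb))  # std. error of difference
    return (mean_a - mean_b) / sed, df


# Hypothetical task-completion scores for a treatment and a control group.
treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]
t, df = pooled_t(treatment, control)
```

With df = 8, the two-tailed critical value at α = 0.05 is 2.306, so a t of this size would lead to rejecting the null hypothesis at that level.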

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g., search on “student’s t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125


Tuesday 22 October 13

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127


Tuesday 22 October 13

ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: mapping types ordered by approximate time per gesture for connected speech (msec)]
Artificial Vocal Tract: 10-30
Phoneme Generator: 100
Finger Spelling: 130
Syllable Generator: 200
Word Generator: 500

Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
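
A minimal sketch of a normalized RBF layer, to make the "closest RBF dominates" behaviour concrete. The two centres and the formant-like output values are made up for illustration; only the width of 0.15 echoes the vowel network slide, and this is not the Glove-TalkII implementation.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activation per centre, divided by the sum over all units."""
    acts = []
    for c in centres:
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-d2 / (2.0 * width ** 2)))
    total = sum(acts)  # note: underflows to 0 for inputs very far from all centres
    return [a / total for a in acts]

def rbf_output(x, centres, width, out_weights):
    """Blend the per-centre output vectors by the normalized activations."""
    h = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(h[j] * out_weights[j][k] for j in range(len(h)))
            for k in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]             # hypothetical hand X, Y targets
targets = [[300.0, 2300.0], [700.0, 1200.0]]   # hypothetical output parameters
near_first = rbf_output((0.05, 0.0), centres, 0.15, targets)
midpoint = rbf_output((0.5, 0.0), centres, 0.15, targets)
```

Close to a centre the output sits on a plateau at that centre's target; halfway between two centres the output is their average. A larger width gives a more gradual interpolation.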

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

REACTABLE

14

Motivation and Overview

Reactable Music Technology Group (2006)

Tuesday 22 October 13

ICASSP 2013 tutorial

Smartphones as instruments

15

Motivation and Overview

iPhone Ocarina from Smule™ (Wang et al., 2009)

Tuesday 22 October 13


ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI or user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCI/Music

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Figure labels: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical
  – Voltage
  – Resistance
  – Impedance
Optical
  – Color
  – Intensity
Magnetic
  – Induced Current
  – Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
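
One common way to turn a resistance change into a voltage is a two-resistor voltage divider read by an ADC. The sketch below assumes a 5 V supply, a 10 kΩ fixed resistor and a 10-bit converter; all three are illustrative choices, not values from the tutorial.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code, clamped to the valid range."""
    levels = (1 << bits) - 1
    code = round(voltage / v_ref * levels)
    return max(0, min(levels, code))

# Hypothetical force sensor: about 100 kOhm under a light touch,
# about 1 kOhm under a hard press.
light_touch = adc_code(divider_voltage(5.0, 10_000.0, 100_000.0))
hard_press = adc_code(divider_voltage(5.0, 10_000.0, 1_000.0))
```

The response is non-linear in the sensor resistance, which is one reason scaling and calibration are usually applied afterwards.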

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction - moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• A large diversity of control styles is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka: a simple example (Fels et al

47

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: two timelines contrasting asynchronous events with synchronous samples; multiple channels are possible (for example microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D/3D/ND + time

50

Sensors and Actuators

Time Time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
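
As a minimal illustration of a digital filter operating on samples, the sketch below implements a one-pole low-pass (exponential smoothing), a common first choice for noisy sensor streams. The smoothing factor 0.2 is an arbitrary illustrative value.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1."""
    out = []
    y = samples[0] if samples else 0.0  # start from the first sample
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out

# A signal alternating at the highest representable frequency is attenuated,
# while a constant (zero-frequency) signal passes through unchanged.
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smooth = one_pole_lowpass(noisy, 0.2)
steady = one_pole_lowpass([3.0] * 5, 0.2)
```

Smaller values of alpha smooth more but respond more slowly, which matters for real-time control.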

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
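
The variants listed above can be combined in a few lines. This sketch uses a fixed threshold, a local-maximum test, a minimum spacing between peaks and parabolic interpolation of the peak position; the function names and the test envelope are illustrative assumptions rather than a reference implementation.

```python
def parabolic_refine(x, n):
    """Fit a parabola through x[n-1], x[n], x[n+1]; return (position, amplitude)."""
    a, b, c = x[n - 1], x[n], x[n + 1]
    denom = a - 2 * b + c
    if denom == 0:
        return float(n), float(b)
    offset = 0.5 * (a - c) / denom
    return n + offset, b - 0.25 * (a - c) * offset

def find_peaks(x, threshold, min_spacing=1):
    """Local maxima above threshold, at least min_spacing samples apart."""
    peaks = []  # list of (index, amplitude)
    for n in range(1, len(x) - 1):
        is_local_max = x[n - 1] < x[n] >= x[n + 1]
        if not is_local_max or x[n] < threshold:
            continue
        if peaks and n - peaks[-1][0] < min_spacing:
            if x[n] > peaks[-1][1]:
                peaks[-1] = (n, x[n])  # keep the stronger of two close peaks
            continue
        peaks.append((n, x[n]))
    return [parabolic_refine(x, n) for n, _ in peaks]

envelope = [0.0, 1.0, 3.0, 1.0, 0.0, 0.5, 0.4, 0.0, 2.0, 4.0, 2.0, 0.0]
peaks = find_peaks(envelope, threshold=1.0, min_spacing=3)
```

An adaptive threshold would replace the constant with, for example, a running median of the signal.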

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
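
The trade-off between the two window sizes on the slide can be worked out directly: a window of N samples at sampling rate fs spans N/fs seconds and gives FFT bins spaced fs/N Hz apart. A small sketch (the function name is an illustrative choice):

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, FFT bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz

short_win = stft_resolution(256, 22050)   # about 11.6 ms, about 86.1 Hz
long_win = stft_resolution(4096, 22050)   # about 185.8 ms, about 5.4 Hz
```

The short window localizes events in time but smears frequency; the long window does the opposite.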

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution, based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering, 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
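
For sensor data, linear interpolation is often enough to bring irregularly timed readings onto a regular grid. A minimal sketch, assuming both the original and the new time stamps are sorted in increasing order (the function name and data are illustrative):

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times.

    Both times and new_times must be increasing; values outside the
    original range are held at the first/last reading.
    """
    out = []
    i = 0
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        while times[i + 1] < t:  # advance to the segment containing t
            i += 1
        frac = (t - times[i]) / (times[i + 1] - times[i])
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

# Irregularly timed readings of a sensor whose value is 100 * t.
times = [0.0, 0.1, 0.25, 0.4]
values = [0.0, 10.0, 25.0, 40.0]
regular = resample_linear(times, values, [0.0, 0.05, 0.2, 0.3, 0.4])
```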

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
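
Range calibration and power-law scaling fit together naturally: first find the minimum and maximum of the raw readings, then normalize and apply an exponent. The readings and the exponent of 2 below are made-up illustrative values.

```python
def calibrate(readings):
    """Range calibration: observed minimum and maximum of a sensor."""
    return min(readings), max(readings)

def scale(reading, lo, hi, out_lo, out_hi, exponent=1.0):
    """Map a raw reading to [out_lo, out_hi] through a power-law curve."""
    norm = (reading - lo) / (hi - lo)
    norm = max(0.0, min(1.0, norm))  # clamp readings outside the calibrated range
    return out_lo + (out_hi - out_lo) * norm ** exponent

# Hypothetical raw slider readings collected while the performer sweeps the range.
lo, hi = calibrate([112, 130, 540, 901, 455])
full_volume = scale(901, lo, hi, 0.0, 1.0, exponent=2.0)
half_travel = scale(506.5, lo, hi, 0.0, 1.0, exponent=2.0)
```

With an exponent of 2, half of the slider travel gives a quarter of the output range, which gives finer control at low volume.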

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
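
A minimal sketch of correlation-based periodicity detection. It computes the autocorrelation directly in the time domain, which is simple but slower than the FFT-based computation the slide alludes to; the lag search range and the test signal are illustrative assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    c = [v - mean for v in x]
    return [sum(c[n] * c[n + lag] for n in range(len(c) - lag))
            for lag in range(max_lag + 1)]

def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Test signal: a sinusoid that repeats every 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
period = dominant_period(signal, 5, 50)
```

The lower lag bound keeps the search away from the trivial maximum at lag 0. Cross-correlation follows the same pattern with two different signals.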

Tuesday 22 October 13


ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

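Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular mel filter centred at CF; edge labels CF-130 and CF+130 for the linearly-spaced filters, and CF with the factor 1.0718 for the log-spaced filters]

Processing chain: Mel-filtering -> Log -> DCT -> MFCCs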

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix
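
A dependency-free sketch of the idea: form the second-order statistics of the data, find the leading eigenvector and project onto it. Two simplifications are assumed here: the covariance matrix of mean-centred data is used rather than a correlation matrix, and power iteration recovers only the first principal component. The data points are made up.

```python
def covariance_matrix(rows):
    """Return (feature means, sample covariance matrix) for a list of rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]
    return means, cov

def leading_eigenvector(matrix, iterations=200):
    """Power iteration: repeatedly multiply and renormalize a start vector."""
    d = len(matrix)
    v = [1.0 / d ** 0.5] * d
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(c * c for c in w) ** 0.5
        v = [c / norm for c in w]
    return v

# Two correlated features lying roughly along y = 2x.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
means, cov = covariance_matrix(data)
pc1 = leading_eigenvector(cov)
projected = [sum((r[j] - means[j]) * pc1[j] for j in range(2)) for r in data]
```

The projection matrix on the slide stacks several such eigenvectors as columns; in practice a linear algebra library would compute all of them at once.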

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
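
The simplest of these to sketch is k-nearest neighbours: store the labelled training vectors and let the k closest ones vote. The two-dimensional features and the drum labels below are invented for illustration.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label); returns the majority label
    among the k training vectors closest to query (squared Euclidean distance)."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D audio features for two drum classes.
train = [
    ((0.10, 0.20), "snare"), ((0.20, 0.10), "snare"), ((0.15, 0.25), "snare"),
    ((0.90, 0.80), "kick"), ((0.80, 0.90), "kick"), ((0.85, 0.75), "kick"),
]
label = knn_predict(train, (0.2, 0.2))
```

There is no training step, but every prediction scans the whole training set, which matters for real-time use.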

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
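
A minimal k-means sketch: alternate between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The initial centroids are passed in explicitly (an assumption made to keep the example deterministic); real implementations usually pick them randomly or with a seeding heuristic.

```python
def kmeans(points, centroids, iterations=20):
    """Return (centroids, clusters) after a fixed number of Lloyd iterations."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[k]
            for k, c in enumerate(clusters)
        ]
    return centroids, clusters

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```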

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (Cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (Additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
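
A small worked example of ordinary least squares with one feature, followed by the coefficient of determination. The data points are made up and lie close to the line y = 2x.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```

An R squared close to 1 means the fitted line explains almost all of the variance in the targets.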

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input, output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing keeps pace with the incoming sensor data, with some delay

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence (also gives the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

99
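
The filtering task can be sketched for a discrete hidden state with the forward recursion: predict with the transition model, weight by the likelihood of the new observation, then renormalize. The two-state "rain/umbrella" numbers are toy values chosen for illustration, not data from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | obs_1..t) for each time step t.

    transition[i][j] = P(next state = j | current state = i)
    emission[i][o]   = P(observation = o | state = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the observation likelihood, then normalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Toy model: state 0 = rain, state 1 = no rain; observation 0 = umbrella seen.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, [0, 0])
```

After two umbrella sightings the belief in rain rises from about 0.82 to about 0.88. The normalizer `total` at each step is the likelihood of that observation given the earlier ones.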

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

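[Figure: hidden Markov model. A single discrete hidden state variable evolves over time steps 1, 2, 3, 4 according to the transition probabilities P(state at t | state at t-1); at each step an observation is generated from the hidden state according to the emission probabilities p(observation at t | state at t). The transition and emission probabilities together form the model.]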

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
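
A one-dimensional sketch of the predict/update cycle, assuming the simplest possible state model (the state stays constant apart from a little process noise). The variances and the initial estimate are illustrative parameters, not values from the tutorial.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0  # state mean and variance
    estimates = []
    for z in measurements:
        # Predict: constant-state model, so only the uncertainty grows.
        p = p + process_var
        # Update: the gain balances model confidence against sensor confidence.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

# A sensor that keeps reporting 1.0 while the filter starts out believing 0.0.
estimates = kalman_1d([1.0] * 50, process_var=1e-4, measurement_var=0.1)
```

A small measurement variance makes the gain approach 1, so the filter trusts the sensor; a small process variance makes it trust the model. This is the trade-off named in the last bullet above.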

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: “The Wizard” and “What the user sees”]

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance, p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions showing the regions (or single region) for rejecting the null hypothesis, with X1, X2 and the critical value t(p,df) marked]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
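The slide's formulas did not survive extraction, so the three steps above are sketched here under a stated assumption: the standard independent two-sample t-test with pooled (equal) variance, which matches the "equal variance" assumption mentioned earlier. The function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled variance (assumed variant)."""
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    # Sums of squared deviations from each sample mean.
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2                               # degrees of freedom
    combined_var = (ss1 + ss2) / df                # combined (pooled) variance
    sed = math.sqrt(combined_var * (1.0 / n1 + 1.0 / n2))  # standard error of difference
    t = (m1 - m2) / sed
    return t, df
```

The returned `t` is then compared against the critical value t(p,df) for the chosen significance, as described on the next slide.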

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

[Table of critical t values; columns for two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
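The "dividing two variances" step can be illustrated for the one-way case. This is a sketch of the textbook F ratio (between-group mean square over within-group mean square), not code from the tutorial; the name `one_way_anova_f` is hypothetical.

```python
def one_way_anova_f(groups):
    """F ratio for one-way ANOVA: between-group variance / within-group variance."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation explained by group membership.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Residual variation inside each group.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

Because F is a ratio of two variances measured in the same units, multiplying every observation by a constant or adding an offset leaves it unchanged, which is the scaling and bias independence noted above.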

Integration and
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing


Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: systems placed along an axis of approximate time/gesture for connected speech (msec), with marks at 10-30, 100, 130, 200 and 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems labelled: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
  • constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System
[Block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Processing blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function. Output: Synthesizer, Speech Output]


Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
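The behaviour of normalized RBF units can be sketched in a few lines. This is an illustration under assumptions, not the Glove-TalkII implementation: the function names, the two example centres and the linear output weights are hypothetical, and only the shared width (0.15) is taken from the vowel network slide.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations around each centre, normalized to sum to 1."""
    # Exponent of each unit: -||x - c||^2 / (2 * width^2)
    exponents = [-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2)
                 for c in centres]
    # Subtracting the largest exponent avoids underflow far from every centre;
    # it cancels out in the normalization.
    top = max(exponents)
    acts = [math.exp(e - top) for e in exponents]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, weights):
    """Linear read-out: weights[j][k] connects hidden unit j to output k."""
    h = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h[j] * weights[j][k] for j in range(len(h))) for k in range(n_out)]
```

With centres at (0, 0) and (1, 0), an input at (0.05, 0) activates the first unit almost exclusively (the plateau around the nearest centre), while an input at (0.5, 0) splits the activation evenly, which is the interpolation the slide refers to.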


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Smartphones as instruments
iPhone Ocarina from Smule™ (Wang et al. 2009)

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and

Relevance beyond music
• Music instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together

New Interfaces for Musical Expression (NIME)
First organized as a workshop of ACM CHI’2001
Experience Music Project - Seattle, April 2001
Lectures/Discussions/Demos/Performances

Research on HCI/Music

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4
[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate, brain waves, galvanic skin response, muscle activations
• Many more …

Transduction and Digitizing
• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced Current
  – Field Direction

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
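One common way to turn a change in resistance into a voltage is a voltage divider followed by an ADC. The sketch below assumes that circuit; the slide does not say which one was shown, and the 10 kΩ fixed resistor, 5 V supply and 10-bit converter are hypothetical values.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC reading, clipping to [0, v_ref]."""
    levels = (1 << bits) - 1                  # 1023 for a 10-bit converter
    clipped = min(max(voltage, 0.0), v_ref)
    return int(clipped / v_ref * levels + 0.5)  # round to nearest count
```

As the sensor resistance falls (for example when a force-sensing resistor is pressed), the divider voltage and therefore the ADC reading rise.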

Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancelation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

2D/3D/ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account
[Figure: two panels with time axes]

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
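As a concrete instance of a digital filter operating on samples, here is a one-pole low-pass (exponential smoothing) filter, often used to tame jittery sensor readings. It is a minimal sketch; the name `one_pole_lowpass` and the smoothing coefficient `alpha` are illustrative choices, not something specified in the tutorial.

```python
def one_pole_lowpass(samples, alpha=0.1):
    """y[n] = y[n-1] + alpha * (x[n] - y[n-1]); smaller alpha smooths more."""
    if not samples:
        return []
    y = samples[0]            # start from the first sample to avoid a start-up ramp
    out = []
    for x in samples:
        y = y + alpha * (x - y)
        out.append(y)
    return out
```

Slow changes pass through almost unchanged, while rapid sample-to-sample alternation is strongly attenuated, at the cost of some lag in following a step.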

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
• Output is peak locations and amplitudes
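Several of the variants listed above can be combined in one small routine. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing and parabolic interpolation around each peak; it is one plausible combination rather than the tutorial's method, and the name `find_peaks` and its parameters are hypothetical.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return a list of (location, amplitude) pairs for peaks in x."""
    peaks = []
    for i in range(1, len(x) - 1):
        if x[i] <= threshold:
            continue                                   # fixed threshold
        if not (x[i] > x[i - 1] and x[i] >= x[i + 1]):
            continue                                   # local maximum test
        # Parabolic interpolation through the three samples around the peak.
        a, b, c = x[i - 1], x[i], x[i + 1]
        offset = 0.5 * (a - c) / (a - 2.0 * b + c)     # denominator < 0 at a local max
        loc = i + offset
        amp = b - 0.25 * (a - c) * offset
        # Spacing constraint: of two peaks that are too close, keep the larger.
        if peaks and loc - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (loc, amp)
            continue
        peaks.append((loc, amp))
    return peaks
```

The interpolation step recovers a fractional peak position between samples, which matters when the peak location is itself the control signal (for example an onset time or a pitch lag).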

Fourier Transform
[Figure: Spectrum]

Short Time Fourier Transform
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the Fast Fourier Transform

Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
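The time-frequency trade-off for the two window sizes on the spectrogram slide can be worked out directly. This small sketch (the function name `stft_resolution` is hypothetical) computes the window duration and the spacing between frequency bins.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, frequency bin spacing in Hz)."""
    time_resolution = window_size / sample_rate
    frequency_resolution = sample_rate / window_size
    return time_resolution, frequency_resolution

for n in (256, 4096):
    t, f = stft_resolution(n, 22050)
    print(f"{n} samples: {t * 1000:.1f} ms window, {f:.2f} Hz per bin")
```

At 22050 Hz a 256-sample window spans about 11.6 ms with bins roughly 86 Hz apart, while a 4096-sample window spans about 186 ms with bins roughly 5.4 Hz apart: the product of the two resolutions is fixed, so improving one worsens the other.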

Wavelets
STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering, 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain
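Median filtering is easy to demonstrate in one dimension, and the same idea extends to 2D images and the time-frequency plane. The sketch below is illustrative only; the name `median_filter` and the choice to shrink the window at the edges are assumptions.

```python
from statistics import median

def median_filter(x, size=3):
    """Replace each sample by the median of a window centred on it."""
    half = size // 2
    # The window is truncated at the start and end of the sequence.
    return [median(x[max(0, i - half): i + half + 1]) for i in range(len(x))]
```

An isolated spike is removed completely, while a step edge stays sharp, which is why median filtering is attractive when edge discontinuities must be preserved.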

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
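For sensor data, linear interpolation is often enough to bring irregularly timed readings onto a common clock. The following is a minimal sketch under that assumption; `resample_linear` is a hypothetical name, `times` is assumed sorted in increasing order, and values outside the measured range are simply held at the nearest reading.

```python
from bisect import bisect_right

def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each instant in new_times."""
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])           # hold first value before the data starts
        elif t >= times[-1]:
            out.append(values[-1])          # hold last value after the data ends
        else:
            k = bisect_right(times, t)      # times[k-1] <= t < times[k]
            t0, t1 = times[k - 1], times[k]
            w = (t - t0) / (t1 - t0)
            out.append(values[k - 1] + w * (values[k] - values[k - 1]))
    return out
```

This is the kind of step needed when an asynchronous event stream has to be aligned with synchronously sampled audio.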

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range, units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
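Range calibration and power-law scaling are often chained: first learn the minimum and maximum a sensor actually produces, then map the normalized reading onto the control parameter. The sketch below illustrates that chain; the class `RangeCalibrator`, the function `scale_power` and the exponent value are hypothetical.

```python
class RangeCalibrator:
    """Tracks the observed minimum and maximum of a sensor reading."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def observe(self, reading):
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def normalize(self, reading):
        """Map a reading to [0, 1] using the observed range, with clipping."""
        if self.lo is None or self.hi == self.lo:
            return 0.0                      # no usable range observed yet
        u = (reading - self.lo) / (self.hi - self.lo)
        return min(max(u, 0.0), 1.0)

def scale_power(u, out_min, out_max, exponent=2.0):
    """Power-law map of a normalized value u in [0, 1] to [out_min, out_max]."""
    return out_min + (out_max - out_min) * u ** exponent
```

For example, a slider could be moved through its full travel during a calibration phase, after which `scale_power(cal.normalize(reading), 0.0, 1.0)` gives a volume control with finer resolution at the quiet end.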

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
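A correlation-based period estimate can be sketched with a direct time-domain autocorrelation. Note that this is the simple O(N · max_lag) form for clarity, not the FFT-based computation the slide alludes to; the function names and the lag search range are illustrative.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in the search range."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
```

Applied to an onset-strength signal, the returned lag corresponds to the beat period; restricting `min_lag` and `max_lag` to plausible tempi keeps the estimate from locking onto lag zero or onto very long multiples.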

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction Audio

Mel Frequency Cepstral Coefficients
Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular mel filters; labels CF, CF-130, CF+130, and CF scaled by 1.0718]
Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
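Energy compaction can be seen with a direct implementation of the orthonormal DCT-II. This is a didactic O(N²) sketch, not how an MFCC front end would compute it in practice, and the function name `dct2` is an illustrative choice.

```python
import math

def dct2(x):
    """Orthonormal DCT-II of a sequence x (direct O(N^2) evaluation)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * s)
    return coeffs

ramp = [float(i) for i in range(8)]
c = dct2(ramp)
energy = sum(v * v for v in c)
print(sum(v * v for v in c[:2]) / energy)   # fraction of energy in the first two coefficients
```

For a smooth input such as this ramp, nearly all of the energy lands in the first two coefficients, which is why keeping only the low coefficients (as MFCCs do after the log mel spectrum) loses little.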

ICASSP 2013 tutorial

Feature Extraction Image bull Color texture shape bull Example color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamicsbull Delta features bull Aggregate statistics of time sequencendashMean Median Mean Variance

bull ARMA bull Statistical models such as GMM bull Modulation features

Principal Component Analysis
[Figure: PCA. Eigenanalysis of the correlation matrix yields the projection matrix]
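A minimal sketch of the idea, assuming a tiny invented data set: standardize the features, form the correlation matrix, and find its dominant eigenvector. Power iteration is used here as a stand-in for a full eigensolver, so only the first principal component is recovered.

```python
import math

def correlation_matrix(data):
    """Return the correlation matrix and the standardized data."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / n)
            for j in range(d)]
    z = [[(row[j] - means[j]) / stds[j] for j in range(d)] for row in data]
    corr = [[sum(r[a] * r[b] for r in z) / n for b in range(d)] for a in range(d)]
    return corr, z

def first_component(matrix, iterations=200):
    """Dominant eigenvector via power iteration (not a full eigenanalysis)."""
    v = [1.0] + [0.5] * (len(matrix) - 1)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Hypothetical, strongly correlated 2-D features.
data = [[1.0, 2.1], [2.0, 3.9], [3.0, 6.2], [4.0, 7.8], [5.0, 10.1]]
corr, z = correlation_matrix(data)
pc1 = first_component(corr)                      # one column of the projection matrix
projected = [sum(a * b for a, b in zip(row, pc1)) for row in z]
```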

Self-Organizing Maps

Classification: Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels

Uncertainty and Time

Classification: Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest neighbors
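The non-parametric case is the easiest to sketch. Below is a small k-nearest-neighbors classifier with majority voting; the two-dimensional training vectors and the "snare"/"kick" labels are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (feature_vector, label); majority vote among the k nearest."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical training set of labeled feature vectors.
train = [([0.1, 0.2], "snare"), ([0.0, 0.3], "snare"), ([0.2, 0.1], "snare"),
         ([0.9, 0.8], "kick"), ([1.0, 0.7], "kick"), ([0.8, 0.9], "kick")]

label_a = knn_predict(train, [0.15, 0.2])
label_b = knn_predict(train, [0.9, 0.9])
```

There is no training step: the "model" is the stored training set, which is what makes the method non-parametric.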

Classification: Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
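The first three measures are closely related, as this sketch tries to show: accuracy and per-class F-measure can both be read off the confusion matrix. The label sequences are invented.

```python
def confusion_matrix(truth, predicted, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, predicted):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)

def f_measure(matrix, i):
    """Harmonic mean of precision and recall for class index i."""
    tp = matrix[i][i]
    predicted_i = sum(row[i] for row in matrix)
    actual_i = sum(matrix[i])
    precision = tp / predicted_i if predicted_i else 0.0
    recall = tp / actual_i if actual_i else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical ground truth and classifier output.
labels = ["kick", "snare", "hat"]
truth = ["kick", "kick", "kick", "snare", "snare", "hat"]
pred = ["kick", "kick", "snare", "snare", "snare", "hat"]
cm = confusion_matrix(truth, pred, labels)
```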

Clustering: Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

Clustering: Algorithms
• K-means and variants
• Distribution-based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  - DBSCAN
• Graph-based
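A bare-bones K-means sketch follows, assuming K = 2 and hand-picked initial centroids; real use would need a proper initialization and a convergence check rather than a fixed iteration count.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Alternate between assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        centroids = [
            [sum(dim) / len(cluster) for dim in zip(*cluster)] if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical 2-D feature vectors forming two obvious groups.
points = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, clusters = kmeans(points, [[0, 0], [10, 10]])
```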

Clustering: Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin index
  - Dunn index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard index
  - Confusion matrix
• Various types of user studies

Regression: Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees

Regression: Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
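To make the goodness-of-fit measure concrete, here is a sketch of a one-variable ordinary least squares fit followed by R squared computed as 1 - SS_res / SS_tot. The data points are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical noisy measurements of a roughly linear relationship.
xs = [1, 2, 3, 4, 5]
ys = [2.2, 4.1, 6.1, 7.9, 10.2]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```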

Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use an augmented instrument for training
• Record the acoustic signal
• Train a model to associate the direct sensor with the acoustic signal
• Evaluate and iterate
• Use the trained model in non-augmented instruments

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• The hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
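One way to handle multiple rates is to interpolate the inputs onto a shared time grid before combining them. The sketch below does this with linear interpolation and then concatenates the resampled values into one feature vector per time step (an "early fusion" arrangement). The stream contents and time grid are invented.

```python
def interpolate(times, values, t):
    """Linear interpolation of a sampled stream at time t (clamped at the ends)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            frac = (t - times[i - 1]) / (times[i] - times[i - 1])
            return values[i - 1] + frac * (values[i] - values[i - 1])

def early_fusion(stream_a, stream_b, common_times):
    """Resample two (times, values) streams onto common_times and concatenate."""
    return [[interpolate(*stream_a, t), interpolate(*stream_b, t)]
            for t in common_times]

# Hypothetical streams sampled at different rates.
stream_a = ([0.0, 0.5, 1.0], [0.0, 10.0, 20.0])
stream_b = ([0.0, 1.0], [1.0, 3.0])
fused = early_fusion(stream_a, stream_b, [0.0, 0.25, 0.5, 0.75, 1.0])
```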

Sensor Fusion
Early and late fusion are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each Feature Extraction → Dimensionality Reduction → Feature Selection → Classification]

Multi-modal Results
Main idea: use the camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
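The difference can be seen with two smoothing filters, sketched here on an invented impulse-like signal. The causal version averages only the current and past samples; the centered version also uses the next sample, so it could not run on live sensor data without buffering.

```python
def causal_moving_average(samples, window=3):
    """Average of the current sample and up to window-1 past samples."""
    out = []
    for n in range(len(samples)):
        past = samples[max(0, n - window + 1): n + 1]
        out.append(sum(past) / len(past))
    return out

def centered_moving_average(samples, half=1):
    """Non-causal: also uses `half` future samples around each position."""
    out = []
    for n in range(len(samples)):
        around = samples[max(0, n - half): n + half + 1]
        out.append(sum(around) / len(around))
    return out

signal = [0, 0, 3, 0, 0]   # hypothetical impulse arriving at index 2
causal = causal_moving_average(signal)
centered = centered_moving_average(signal)
```

The centered output is already non-zero at index 1, one step before the impulse arrives, which is exactly the "look ahead" a causal algorithm cannot do.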

Dynamic Time Warping

Modeling Uncertainty over Time
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markov property (the current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal models
• Filtering: posterior distribution over the current state given the evidence to date (also gives the likelihood of the evidence)
• Prediction: posterior distribution of a future state given the evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor readings, and update the models
  - Updated models provide new estimates and …

Hidden Markov Models I
[Figure: HMM with a hidden state (a single discrete variable) and observations. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities P(observation at t | state at t)]
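As a sketch of how the transition and emission probabilities are used, the filtering task can be implemented with the forward recursion: predict with the transition model, weight by the emission probability of the new observation, and normalize. The two-state numbers below are a toy example chosen for illustration, not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each t."""
    belief = list(prior)
    history = []
    n = len(belief)
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the emission probability and normalize.
        unnorm = [emission[j][obs] * predicted[j] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Hypothetical two-state model: transition[i][j] = P(next=j | current=i),
# emission[j][o] = P(observation=o | state=j).
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
beliefs = forward_filter(prior, transition, emission, [0, 0])
```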

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian (LG) conditional distributions represent the state and sensor models
• LG: $P(x \mid y) = \mathcal{N}(a_y y + b_y, \sigma_y)(x)$
• The next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
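A scalar sketch of the forward step, assuming the simplest possible model (the state is expected to stay put, with process noise `process_var`, and is observed directly with noise `measurement_var`). These parameter names and the constant test signal are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Scalar Kalman filter; returns the (mean, variance) estimate after each step."""
    estimates = []
    for z in measurements:
        # Prediction: the state model leaves the mean unchanged, uncertainty grows.
        var = var + process_var
        # Update: the gain trades off model reliability against observation reliability.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1 - gain) * var
        estimates.append((mean, var))
    return estimates

# Hypothetical constant signal observed 50 times.
track = kalman_1d([1.0] * 50, process_var=0.01, measurement_var=1.0)
```

A large `measurement_var` pulls the gain toward 0 (trust the model); a large `process_var` pulls it toward 1 (trust the measurements).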


Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming …

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce a vast (unmanageable) amount of data
  - output is a description of practices rather than specific designs
• ethnographers are not trained as designers; they are taught to "interfere" as little as possible with the community

participatory design
• users (often musicians) become 1st class members in the design process
  - active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor, by IBM (1984)
[Figure: the Wizard; what the user sees]

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  - Assume normality for the relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject the null hypothesis at significance p

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with either one region or two regions for rejecting the null hypothesis, bounded by the critical value t(p,df); axis marks X1, X2]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
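The three steps can be sketched as follows. The formulas on the original slide did not survive extraction, so this uses the standard equal-variance (pooled) form for independent samples, which is an assumption; the two sample lists are invented.

```python
import math
import statistics

def pooled_t(a, b):
    """Independent two-sample t statistic assuming equal variances."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / sed
    return t, df

# Hypothetical scores for a treatment group and a control group.
treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]
t_value, df = pooled_t(treatment, control)
```

With df = 8, the two-tailed critical value at α = 0.05 is 2.306, so a t statistic above that would reject the null hypothesis at that level.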

comparing t with critical value
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-…")

[Table of critical values of t; columns are two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

ANOVA I
• Generalizes the t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

ANOVA II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
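The "dividing two variances" step can be sketched for the one-way case: the F ratio is the between-group mean square divided by the within-group mean square. The three groups below are invented, and the final lines check that rescaling and shifting all data leaves F unchanged.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical ratings from three groups of participants.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_value, df_b, df_w = one_way_anova(groups)

# Same data after scaling by 10 and adding a bias of 5.
rescaled = [[10 * v + 5 for v in g] for g in groups]
f_rescaled, _, _ = one_way_anova(rescaled)
```

The resulting F would then be compared against a critical value for (df_between, df_within) degrees of freedom.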

Integration and Implementation
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing


Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Images: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: mappings ordered by approximate time per gesture for connected speech (msec), with axis marks 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X,Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
  - constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network

Glove-TalkII System
[Block diagram]
• Inputs
  - Right hand data: x, y, z, roll, pitch, yaw (60 Hz)
  - Right hand data: 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz)
  - Contact switches
  - Foot pedal
• Processing blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function
• Output: Synthesizer → Speech Output


Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0% error
  - Testing
    • 1380 consonants, 234 vowels
    • 0% error


GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std. deviation (0.15)
  - Input: x,y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1% error
• Testing
  - 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
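The plateau behaviour can be sketched with a hidden layer of Gaussian units whose activations are divided by their sum. The three centres below are invented; the width of 0.15 is borrowed from the fixed standard deviation quoted for the vowel network, and the exact Gaussian form is an assumption.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activation per centre, normalized so the activations sum to 1."""
    raw = [math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2)) for c in centres]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical centres in a 2-D (x, y) input space.
centres = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
near_first = normalized_rbf([0.05, 0.0], centres, width=0.15)
far_away = normalized_rbf([3.0, 0.0], centres, width=0.15)
```

Even for the far-away input, where every raw Gaussian is tiny, normalization hands almost all of the activation to the closest centre, so the output plateaus instead of fading to zero.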


Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: thumb abduction and rotation added later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05% error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1% error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and Time
• Human Computer Interaction
• Integration and Implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test, and evaluate radical ideas for HCI

Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Smartphones as instruments
iPhone Ocarina from Smule™ (Wang et al., 2009)

Beyond direct mapping
• Direct mapping
  - Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  - Easy to learn and interpret
  - Expressive, especially for continuous controllers
• Beyond direct mapping
  - Gesture recognition (pinch to zoom)
  - Speech recognition
  - Adaptive, possibly domain- and person-specific
  - More similar to human-to-human interaction
  - Requires a layer of DSP and ML between input and …

Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher-level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together

New Interfaces for Musical Expression (NIME)
• First organized as a workshop of ACM CHI'2001
• Experience Music Project, Seattle, April 2001
• Lectures / Discussions / Demos / Performances

Research on HCI/Music

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music …

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and Time
• Human Computer Interaction
• Integration and Implementation
• Case Studies
• Summary

Sensors and Actuators

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• Steps 1 and 2 are often switched
• Tools to help with steps 1-4

[Figure labels: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  - Heart rate
  - Brain waves
  - Galvanic skin response
  - Muscle activations
• Many more …

Transduction and Digitizing
• Electrical
  - Voltage
  - Resistance
  - Impedance
• Optical
  - Color
  - Intensity
• Magnetic
  - Induced current
  - Field direction

Digitizing
• Converting a change in resistance to a voltage (a typical sensor has variable resistance)
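A common way to do this conversion is a voltage divider: the variable-resistance sensor is placed in series with a fixed resistor, and an ADC samples the voltage across the fixed resistor. The sketch below assumes a 5 V supply and a 10-bit converter; both values, and the function names, are illustrative assumptions rather than part of the tutorial.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor of a series voltage divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code, clamped to the converter's range."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))

# Hypothetical sensor whose resistance drops from 100 kOhm to 1 kOhm when pressed.
released = adc_code(divider_voltage(100_000, 10_000))
pressed = adc_code(divider_voltage(1_000, 10_000))
midpoint = adc_code(divider_voltage(10_000, 10_000))
```

Choosing the fixed resistor near the middle of the sensor's resistance range keeps the output away from either end of the ADC scale.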

Physical Property Sensors
• Piezoelectric sensors
• Force sensing resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric field sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more …

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs. 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  - i.e. frequency synchronized with the frequency of the supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time, including multiple channels (for example microphone arrays)]

2D/3D/ND + time
• Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and Time
• Human Computer Interaction
• Integration and Implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
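The fixed-threshold, local-maximum and spacing-constraint variants listed for peak detection can be sketched as below. This is a minimal illustration, not the tutorial's implementation: the function name `find_peaks` and its parameters are hypothetical, and adaptive thresholds and interpolation are left out.

```python
def find_peaks(signal, threshold=0.0, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    min_spacing is the minimum distance in samples between accepted peaks;
    when two candidates are too close, the larger one is kept.
    """
    # Candidates: greater than the left neighbour, at least as large as the
    # right neighbour, and above the threshold.
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
        and signal[i] > threshold
    ]
    # Enforce the spacing constraint greedily, strongest peaks first.
    accepted = []
    for i in sorted(candidates, key=lambda k: signal[k], reverse=True):
        if all(abs(i - j) >= min_spacing for j in accepted):
            accepted.append(i)
    return [(i, signal[i]) for i in sorted(accepted)]


x = [0.0, 0.2, 1.0, 0.3, 0.9, 0.1, 0.0, 0.6, 0.0]
print(find_peaks(x, threshold=0.5, min_spacing=3))
```

With `min_spacing=3` the peak at index 4 is suppressed because it lies within three samples of the stronger peak at index 2.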

Fourier Transform
[Figure: Spectrum]

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: the FFT (fast Fourier transform)

Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
• Time-frequency tradeoff
• Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
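The time-frequency tradeoff for the two window sizes on the Spectrogram slide can be worked out numerically. A small sketch, assuming the usual relations (bin spacing = sample rate / window size, frame duration = window size / sample rate); the helper name `stft_resolution` is hypothetical.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency resolution in Hz, window duration in seconds)."""
    freq_resolution = sample_rate / window_size  # spacing between FFT bins
    window_duration = window_size / sample_rate  # time span of one frame
    return freq_resolution, window_duration


for n in (256, 4096):
    df, dt = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {df:7.2f} Hz per bin, {dt * 1000:6.1f} ms per frame")
```

The product of the two quantities is always 1, so a longer window buys finer frequency resolution only at the cost of coarser time resolution.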

Wavelets
• STFT: fixed time-frequency resolution, based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion (preserves edges)
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
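For sensor data, the linear case mentioned above is often enough. The sketch below resamples irregularly timed readings onto a uniform grid; the function names and the clamping behaviour at the ends are assumptions made for illustration.

```python
from bisect import bisect_right


def interpolate_linear(times, values, t):
    """Linearly interpolate (times, values) at time t.

    times must be sorted ascending; t outside the range is clamped
    to the first or last value.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    hi = bisect_right(times, t)  # first sample strictly after t
    lo = hi - 1
    frac = (t - times[lo]) / (times[hi] - times[lo])
    return values[lo] + frac * (values[hi] - values[lo])


def resample(times, values, rate):
    """Resample onto a uniform grid at `rate` Hz, starting at times[0]."""
    n = int((times[-1] - times[0]) * rate) + 1
    grid = [times[0] + k / rate for k in range(n)]
    return grid, [interpolate_linear(times, values, t) for t in grid]


grid, out = resample([0.0, 0.5, 2.0], [0.0, 1.0, 4.0], rate=1.0)
print(grid, out)
```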

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
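Range calibration and power-law scaling are typically chained. A minimal sketch, assuming a raw integer sensor reading and a hypothetical knob-to-frequency mapping; the names `calibrate` and `scale` and the example ranges are illustrative only.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of raw sensor readings."""
    return min(readings), max(readings)


def scale(raw, lo, hi, out_min, out_max, exponent=1.0):
    """Map a raw reading to [out_min, out_max] using a power-law curve."""
    span = hi - lo
    x = (raw - lo) / span if span else 0.0
    x = min(1.0, max(0.0, x))  # clamp readings outside the calibrated range
    return out_min + (x ** exponent) * (out_max - out_min)


# Calibration pass: move the knob through its full travel and record readings.
lo, hi = calibrate([112, 130, 890, 512])
# Mapping pass: exponent > 1 gives finer control at the low end of the range.
print(scale(501, lo, hi, 20.0, 2000.0, exponent=2.0))
```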

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation
• Efficient computation when N is a power of 2
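Correlation-based periodicity detection can be illustrated with a direct autocorrelation. The sketch below uses the straightforward O(N x lag) sum for clarity; the efficient route noted above would compute the same quantity through the FFT. Function names and the test signal are assumptions for illustration.

```python
import math


def autocorrelation(x, max_lag):
    """Direct autocorrelation of a mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    centered = [v - mean for v in x]
    return [
        sum(centered[n] * centered[n + lag] for n in range(len(x) - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A sinusoid that repeats every 25 samples.
signal = [math.sin(2 * math.pi * n / 25) for n in range(200)]
print(estimate_period(signal, min_lag=5, max_lag=60))
```

The `min_lag` bound excludes the trivial maximum at lag 0; in practice it would be set from the shortest period of interest.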

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

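Mel Frequency Cepstral Coefficients
• Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
• Filter edges relative to centre frequency CF: CF - 130 and CF + 130 (linear region); CF / 1.0718 and CF × 1.0718 (log region)
• Pipeline: Mel-filtering -> Log -> DCT -> MFCCs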

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
• Projection matrix
• PCA: eigenanalysis of correlation matrix

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
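Of the algorithms listed, K-nearest neighbors is the shortest to write down. A minimal sketch, assuming Euclidean distance and majority voting; the function name `knn_predict` and the toy data are illustrative.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Predict a label for `query` by majority vote among the k nearest
    training examples. `train` is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


train = [
    ((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
    ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b"),
]
print(knn_predict(train, (0.5, 0.5)), knn_predict(train, (5.5, 5.5)))
```

No training step is needed, which is what makes the method non-parametric: the training set itself is the model.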

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
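K-means, the first algorithm listed, alternates between assigning points to the nearest centroid and recomputing centroids. The sketch below is plain Lloyd's iteration with caller-supplied starting centroids; the fixed iteration count and the function name are assumptions for illustration.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm. Returns (centroids, cluster index per point)."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    assignments = [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]
    return centroids, assignments


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans(points, centroids=[(0, 0), (10, 10)]))
```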

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
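Ordinary least squares and the coefficient of determination fit in a few lines for the single-feature case. A minimal sketch with hypothetical function names; the multi-feature case would need a linear solver.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs, ys = [0, 1, 2, 3, 4], [1.1, 2.9, 5.2, 6.8, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(ys, [slope * x + intercept for x in xs]))
```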

Surrogate Sensors
• Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Reference: A. Tindale, A. Kapur, G. Tzanetakis, “Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
[Figure: two parallel chains of Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]
• Early and late are the extremes of a full spectrum of possibilities

Multi-modal Results
• Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing keeps pace with the incoming sensor data, with some delay

Dynamic Time Warping

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Hidden Markov Models I
[Figure: hidden Markov model. The hidden state (a single discrete variable) evolves according to transition probabilities P(state at t | state at t-1); observations are generated according to emission probabilities p(observation | state at t). The model consists of the transition and emission probabilities.]
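Filtering in a hidden Markov model reduces to a predict/update loop over the transition and emission probabilities. The sketch below is the standard forward recursion; the two-state numbers are a textbook-style toy example chosen for illustration, not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for every t.

    prior[i]          = P(initial state = i)
    transition[i][j]  = P(next state = j | current state = i)
    emission[i][o]    = P(observation = o | state = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the emission probability and renormalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history


# States: 0 = rain, 1 = no rain. Observations: 0 = umbrella, 1 = no umbrella.
beliefs = forward_filter(
    prior=[0.5, 0.5],
    transition=[[0.7, 0.3], [0.3, 0.7]],
    emission=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0, 0],
)
print(beliefs)
```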

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
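The prediction and update steps can be sketched for the simplest possible case: a scalar state that follows a random walk. This is an illustrative assumption, not the general matrix form; the function name and noise parameters are hypothetical.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the model x_t = x_(t-1) + process noise,
    observed as z_t = x_t + measurement noise. Returns the state estimates."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the state is unchanged, its uncertainty grows.
        p = p + process_var
        # Update step: the gain balances model against observation reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


noisy = [1.2, 0.8, 1.1, 0.9, 1.05, 0.95]
print(kalman_1d(noisy, process_var=1e-4, measurement_var=0.1))
```

A large `measurement_var` relative to `process_var` makes the filter trust its model and smooth heavily; the opposite makes it follow the measurements closely.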

Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: the Wizard; what the user sees]

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: sketches showing either two regions or one region for rejecting the null hypothesis, bounded by the critical value t(p,df)]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
• note df computation
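The formulas for these steps did not survive on the slide. The sketch below assumes the standard independent-samples form with pooled variance, which may differ in detail from what was shown; the function name is hypothetical.

```python
import math
from statistics import mean, variance


def independent_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance.

    Assumes (as a labeled assumption) the equal-variance form:
      df     = n_a + n_b - 2
      pooled = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
      sed    = sqrt(pooled * (1 / n_a + 1 / n_b))
      t      = (mean_a - mean_b) / sed
    Returns (t, df).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled = ((n_a - 1) * variance(sample_a)
              + (n_b - 1) * variance(sample_b)) / df
    sed = math.sqrt(pooled * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / sed, df


t, df = independent_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)
```

The resulting t is then compared against the critical value for the chosen significance and df, or converted to a p-value with a statistics package.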

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

[Table header: two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA

Integration and
I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
Sources: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: mappings ordered by approximate time per gesture for connected speech (msec)]
Artificial Vocal Tract: 10-30
Phoneme Generator: 100
Finger Spelling: 130
Syllable Generator: 200
Word Generator: 500
Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
  – constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System
[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function, synthesizer. Output: speech.]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
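The normalization behaviour described above (interpolation between centres, plateaus near the closest one) can be seen in a small sketch. It assumes Gaussian units with a shared width and a linear output layer; the function names, centres and output weights are hypothetical and not the Glove-TalkII parameters.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian RBF activations for input x, normalized to sum to 1."""
    raw = [math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2)) for c in centres]
    total = sum(raw)  # assumes x is close enough to a centre to avoid underflow
    return [r / total for r in raw]


def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations; weights[i] is the output
    vector associated with hidden unit i."""
    acts = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(a * w[k] for a, w in zip(acts, weights)) for k in range(n_out)]


centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[1.0], [3.0]]
for x in [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0)]:
    print(x, rbf_output(x, centres, width=0.15, weights=weights))
```

Near a centre the closest unit takes almost all of the normalized activation, so the output sits on that unit's plateau; halfway between two centres the output is their average.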


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
• Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13

Summary

158

Covered many aspects:
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Beyond direct mapping
• Direct Mapping
  – Sensor readings mapped directly to input controls (mouse, trackpad, keyboard)
  – Easy to learn and interpret
  – Expressive, especially for continuous controllers
• Beyond Direct Mapping
  – Gesture recognition (pinch to zoom)
  – Speech recognition
  – Adaptive, possibly domain and person specific
  – More similar to human-to-human interaction
  – Require layer of DSP and ML between input and

16

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

17

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

18

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI or user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration: putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project, Seattle, April 2001
Lectures / Discussions / Demos / Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCI/Music

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

Sensors + signal processing / Actuators + signal processing / HCI

Engineering and programming / Music / Fun and Effort

Effort and pain

If you are lucky

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

• Electrical: Voltage, Resistance, Impedance
• Optical: Color, Intensity
• Magnetic: Induced Current, Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting a change in resistance to a voltage (a typical sensor has variable resistance)
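One common way to perform this conversion is a voltage divider: the sensor sits in series with a fixed resistor and the ADC reads the midpoint. The sketch below assumes a 5 V supply, a 10 kΩ fixed resistor on the ground side and a 10-bit ADC; these values and the function names are illustrative and do not come from the slides.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_supply=5.0):
    """Voltage across the fixed resistor when the sensor is on the supply side."""
    return v_supply * r_fixed / (r_sensor + r_fixed)


def sensor_resistance(adc_count, adc_max=1023, r_fixed=10_000.0):
    """Invert the divider: recover the sensor resistance from an ADC reading."""
    if adc_count <= 0:
        return float("inf")  # no current flows: the sensor looks like an open circuit
    ratio = adc_count / adc_max  # equals v_out / v_supply
    return r_fixed * (1.0 - ratio) / ratio
```

With these assumptions a sensor resistance equal to the fixed resistor gives half the supply voltage, which is where the divider is most sensitive.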

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• A large diversity of control styles is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker/Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: two timelines contrasting asynchronous events with synchronous samples; multiple channels are possible (for example microphone arrays).]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

Time Time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
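A minimal sketch of how several of these variants can be combined: a fixed threshold, a local-maximum test, a spacing constraint and parabolic interpolation of the peak location. The function name, parameters and the strongest-first rule for resolving spacing conflicts are assumptions made for illustration, not the tutorial's own implementation.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima above a fixed threshold.

    Locations are refined with parabolic interpolation through the peak sample
    and its two neighbours; a peak closer than min_spacing samples to an
    already accepted, stronger peak is dropped.
    """
    candidates = []
    for n in range(1, len(x) - 1):
        if x[n] > threshold and x[n] > x[n - 1] and x[n] >= x[n + 1]:
            candidates.append(n)

    # Enforce the spacing constraint, considering the strongest peaks first.
    accepted = []
    for n in sorted(candidates, key=lambda i: x[i], reverse=True):
        if all(abs(n - m) >= min_spacing for m in accepted):
            accepted.append(n)

    peaks = []
    for n in sorted(accepted):
        a, b, c = x[n - 1], x[n], x[n + 1]
        denom = a - 2.0 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        peaks.append((n + offset, b - 0.25 * (a - c) * offset))
    return peaks
```

An adaptive threshold could replace the fixed one by computing `threshold` from a running statistic of the signal before calling the function.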

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

[Figure: two spectrograms, one computed with 256-sample windows at 22050 Hz and one with 4096-sample windows at 22050 Hz, illustrating the time-frequency tradeoff.]

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
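The tradeoff between the two window sizes can be made concrete with a little arithmetic: a longer window covers more time but spaces the FFT bins more closely. The helper below is an illustrative sketch (its name is not from the slides) and ignores the additional smearing caused by the window shape.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, FFT bin spacing in Hz)."""
    duration = window_size / sample_rate
    bin_spacing = sample_rate / window_size
    return duration, bin_spacing


for n in (256, 4096):
    duration, spacing = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {duration * 1000:6.1f} ms per window, {spacing:6.2f} Hz per bin")
```

At 22050 Hz the 256-sample window spans roughly 11.6 ms with bins about 86 Hz apart, while the 4096-sample window spans roughly 186 ms with bins about 5.4 Hz apart; the product of duration and bin spacing is always 1.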

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
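For sensor streams the linear case is often enough. The sketch below resamples irregularly timestamped readings onto new time instants; the function name and the choice to clamp outside the recorded range are assumptions made for illustration.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each instant in new_times.

    times must be strictly increasing and new_times sorted; instants outside
    the recorded range are clamped to the first or last value.
    """
    out = []
    i = 0
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        while times[i + 1] < t:
            i += 1  # advance to the segment that contains t
        frac = (t - times[i]) / (times[i + 1] - times[i])
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out
```

This is one way to bring streams sampled at different rates onto a common clock before combining them.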

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; obtaining a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
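A small sketch of scaling combined with calibration: the calibrated minimum and maximum normalize the reading, and a power-law curve shapes the response. The function names, the default frequency range and the exponential knob-to-frequency mapping are illustrative assumptions.

```python
def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a calibrated sensor reading onto a control range with a power-law curve."""
    span = in_max - in_min
    norm = (reading - in_min) / span if span else 0.0
    norm = min(1.0, max(0.0, norm))  # clamp readings that drift past calibration
    return out_min + (out_max - out_min) * norm ** exponent


def knob_to_frequency(reading, in_min, in_max, f_low=110.0, f_high=880.0):
    """Exponential mapping, so equal knob turns give equal musical intervals."""
    norm = scale(reading, in_min, in_max, 0.0, 1.0)
    return f_low * (f_high / f_low) ** norm
```

An exponent above 1 gives finer control near the bottom of the range; an exponent below 1 gives finer control near the top.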

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
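As a reference point, autocorrelation can be written directly from its definition and used to pick out a dominant period. This direct form costs O(N²); the efficient computation mentioned above is normally done through the FFT instead. The function names are illustrative.

```python
def autocorrelation(x):
    """Unnormalized autocorrelation r[k] = sum_n x[n] * x[n + k] for k = 0..N-1."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(n)]


def estimate_period(x, min_lag=1):
    """Lag (in samples) of the strongest autocorrelation value at or beyond min_lag."""
    r = autocorrelation(x)
    return max(range(min_lag, len(r)), key=lambda k: r[k])
```

Cross-correlation follows the same pattern with a second signal in place of the shifted copy of `x`.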

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Audio

70

Signals and Features

Tuesday 22 October 13

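Mel Frequency Cepstral Coefficients

[Figure: MFCC pipeline: Mel-filtering -> Log -> DCT -> MFCCs. Mel-scale filterbank: 13 linearly-spaced filters and 27 log-spaced filters; filters centered at CF, with edge labels CF-130 and CF+130 for the linear filters and a factor of 1.0718 relative to CF for the log filters.]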

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
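Energy compaction can be checked numerically with a direct DCT-II. The sketch below uses the unnormalized definition, so `energy_fraction` reports the share of squared coefficient magnitude rather than a strictly orthonormal energy; both function names are illustrative.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_n x[n] * cos(pi / N * (n + 0.5) * k)."""
    n_samples = len(x)
    return [
        sum(x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k) for n in range(n_samples))
        for k in range(n_samples)
    ]


def energy_fraction(coeffs, keep):
    """Share of the total squared magnitude held by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total if total else 0.0
```

For a smooth input such as a ramp, almost all of the squared magnitude lands in the first couple of coefficients, which is why the higher ones can be discarded with little loss.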

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
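K-nearest neighbors is the simplest of these to write down, since there is no training step beyond storing the labeled feature vectors. The sketch below assumes Euclidean distance and a plain majority vote; the function name and the example labels in the usage note are hypothetical.

```python
from collections import Counter
import math


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training vectors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

For example, with two-dimensional features for "snare" and "kick" hits, a query is assigned whichever label dominates among its three closest stored examples.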

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
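A minimal sketch of plain K-means (Lloyd's algorithm), assuming Euclidean distance, caller-supplied initial centroids and a fixed number of iterations. Practical variants usually add smarter initialization and a convergence test; the function name is illustrative.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm from caller-supplied initial centroids.

    Returns (centroids, assignments), where assignments[i] is the cluster
    index of points[i].
    """
    centroids = [tuple(c) for c in centroids]
    assignments = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments
```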

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
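For a single input feature, ordinary least squares and the coefficient of determination fit in a few lines. This is a sketch for the one-dimensional case only, with illustrative function names; multi-feature regression needs the matrix form.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

An R squared of 1 means the line explains all of the variance in the targets; values near 0 mean it does no better than predicting the mean.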

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies: Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: fusion diagram in which each stage appears twice: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification.]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and
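The filtering task can be sketched for a discrete hidden state as a predict-then-update recursion. The function and argument names below are assumptions for illustration; states and observations are taken to be integer indices into plain probability tables.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state after each observation.

    prior[i]          P(state_0 = i)
    transition[i][j]  P(state_t = j | state_{t-1} = i)
    emission[j][o]    P(observation = o | state = j)
    """
    belief = list(prior)
    history = []
    n_states = len(prior)
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        # Update: weight by the evidence likelihood, then renormalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n_states)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history
```

The normalizing constants `total` are the per-step evidence likelihoods; multiplying them together gives the likelihood of the whole observation sequence.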

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

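[Figure: hidden Markov model. A hidden state (single discrete variable) evolves over time according to transition probabilities P(state at t | state at t-1); at each step t it produces an observation according to emission probabilities p(observation at t | state at t).]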

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
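The prediction and update steps reduce to a few lines in the scalar case. The sketch below assumes the simplest possible model, a random-walk state observed directly in Gaussian noise, so the mean and covariance matrix become a single mean and variance; the function name and parameters are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed in Gaussian noise.

    Assumed model: x[t+1] = x[t] + w, w ~ N(0, process_var)
                   z[t]   = x[t] + v, v ~ N(0, measurement_var)
    Returns the list of (mean, variance) estimates after each measurement.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: the mean is carried over, uncertainty grows.
        p = p + process_var
        # Update step: the gain trades model reliability against sensor reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append((x, p))
    return estimates
```

A large `measurement_var` makes the filter trust its model and smooth heavily; a large `process_var` makes it follow the sensor more closely.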

102

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: the Wizard; what the user sees]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distributions with the critical value t(p,df) marked; labels: regions for rejecting the null hypothesis, region for rejecting the null hypothesis, X1, X2.]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
• note df computation
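The formulas on this slide did not survive extraction, so the following is only a minimal sketch of the usual equal-variance (pooled) form of the independent two-sample t statistic. The function name `pooled_t` and the sample numbers are illustrative assumptions, not taken from the tutorial.

```python
import math
from statistics import mean, variance


def pooled_t(sample1, sample2):
    """Independent two-sample t statistic, assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    # degrees of freedom for the pooled test
    df = n1 + n2 - 2
    # combined (pooled) variance: each sample variance weighted by its own df
    pooled_var = ((n1 - 1) * variance(sample1)
                  + (n2 - 1) * variance(sample2)) / df
    # standard error of the difference between the two sample means
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (mean(sample1) - mean(sample2)) / sed
    return t, df


# Hypothetical task scores for a control and a treatment group
control = [10.0, 12.0, 11.0, 13.0, 14.0]
treatment = [12.0, 15.0, 14.0, 16.0, 18.0]
t, df = pooled_t(treatment, control)

# Critical value t(p=0.05, df=8), two-tailed, as read from a standard t table
T_CRITICAL = 2.306
reject_null = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, df = {df}, reject null: {reject_null}")
```

With unequal variances or very different group sizes, Welch's variant of the test is usually preferred; it is not shown here.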

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

[Table: critical values of t, with columns for two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

ANOVA I
• Generalizes the t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

ANOVA II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
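As a rough illustration of "dividing two variances", here is a sketch of the one-way ANOVA F ratio (between-group mean square over within-group mean square). The function name and the three groups are made up for the example; the last lines check the scaling-and-bias independence mentioned above.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical measurements for three interface conditions
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, df_b, df_w = one_way_anova_f(groups)

# The same data after scaling by 10 and adding a bias of 5
rescaled = [[10.0 * x + 5.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
print(f, f_rescaled)
```

The F value would then be compared against a table (or a library routine) for the degrees of freedom (df_between, df_within).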

Integration and implementation

I&I: Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: mapping types ordered along an axis: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec), with ticks 10-30, 100, 130, 200, 500]

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
  • constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System

[Figure: block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0% error
  – Testing
    • 1380 consonants, 234 vowels
    • 0% error

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1% error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
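The behaviour described above (closest unit dominates, interpolation in between) can be sketched with a small normalized Gaussian RBF layer. This is only an illustration under stated assumptions: the two centres, the output values 300 and 700, and the function names are hypothetical; only the width 0.15 echoes the fixed standard deviation mentioned for the vowel network.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activation per centre, normalized to sum to 1 over all units."""
    raw = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        raw.append(math.exp(-dist_sq / (2.0 * width ** 2)))
    total = sum(raw)  # can underflow to 0 if x is very far from every centre
    return [r / total for r in raw]


def rbf_output(x, centres, width, weights):
    """Blend the output vector attached to each hidden unit by its activation."""
    acts = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(a * w[k] for a, w in zip(acts, weights)) for k in range(n_out)]


# Two hypothetical hand positions (x, y), each tied to one output parameter
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[300.0], [700.0]]
WIDTH = 0.15

near_first = rbf_output((0.05, 0.0), centres, WIDTH, weights)
midpoint = rbf_output((0.5, 0.0), centres, WIDTH, weights)
print(near_first, midpoint)
```

Near the first centre the output plateaus at that centre's value, and halfway between the two centres the outputs are blended equally. A smaller width sharpens the switch between plateaus.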


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of


Magic Eyes


Leveraging traditional


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina


Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available
  www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Relevance beyond music
• Musical instruments have anticipated many developments in user interfaces, such as the keyboard for typing letters and words
• Similarly, new interfaces for musical expression can anticipate developments in more general computer user interfaces

Signal Processing Challenges
• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  – Supervised and unsupervised learning
  – Time alignment
• Real-time and causality

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together

New Interfaces for Musical Expression (NIME)

First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances

Research on HCI/Music

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

Transduction and Digitizing
• Electrical
  – Voltage
  – Resistance
  – Impedance
• Optical
  – Color
  – Intensity
• Magnetic
  – Induced Current
  – Field Direction

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
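One common way to turn a resistance change into a voltage is a two-resistor voltage divider feeding an analog-to-digital converter. The sketch below assumes a 5 V supply, a 10 kΩ fixed resistor and a 10-bit converter; these values and the function names are illustrative assumptions, not taken from the slides.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor when it is in series with the sensor."""
    return v_in * r_fixed / (r_fixed + r_sensor)


def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code, clipping to the converter range."""
    levels = (1 << bits) - 1
    clipped = min(max(voltage, 0.0), v_ref)
    return round(clipped / v_ref * levels)


# Hypothetical force sensor: resistance drops as it is pressed harder
V_SUPPLY = 5.0
R_FIXED = 10_000.0
for r_sensor in (100_000.0, 30_000.0, 5_000.0):
    v = divider_voltage(V_SUPPLY, R_FIXED, r_sensor)
    print(f"R = {r_sensor:>8.0f} ohm -> {v:.2f} V -> code {adc_code(v)}")
```

Because the divider is non-linear in the sensor resistance, the resulting codes usually still need the calibration and scaling steps covered later in the tutorial.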

Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on the Sensor Bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage->MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series

[Figure: Asynchronous Events and Synchronous Samples, each plotted against Time; multiple channels (for example microphone arrays)]

2D/3D/ND + time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
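A minimal sketch combining several of the variants listed above: a fixed threshold, a local-maximum test, a spacing constraint, and parabolic interpolation of the peak location. The function names, the greedy strongest-first spacing rule and the test signal are assumptions made for illustration; real systems tend to add their own "secret sauce".

```python
def find_peaks(signal, threshold, min_spacing):
    """Return (index, amplitude) for local maxima above threshold,
    keeping peaks at least min_spacing samples apart (strongest first)."""
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]
    kept = []
    for i in sorted(candidates, key=lambda idx: signal[idx], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    return [(i, signal[i]) for i in sorted(kept)]


def refine_location(signal, i):
    """Parabolic interpolation through the peak sample and its two neighbours."""
    a, b, c = signal[i - 1], signal[i], signal[i + 1]
    denom = a - 2.0 * b + c
    return float(i) if denom == 0 else i + 0.5 * (a - c) / denom


# Hypothetical sensor envelope with three clear hits and one shoulder
signal = [0.0, 1.0, 0.0, 0.2, 0.9, 0.3, 0.0, 3.0, 2.5, 2.8, 0.0]
peaks = find_peaks(signal, threshold=0.5, min_spacing=3)
refined = [refine_location(signal, i) for i, _ in peaks]
print(peaks, refined)
```

The shoulder at index 9 is dropped because it sits too close to the stronger peak at index 7. An adaptive threshold (for example a running median) could replace the fixed one.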

Fourier Transform

[Figure: Spectrum]

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Spectrogram

[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz, illustrating the time-frequency tradeoff]

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)

Wavelets
• STFT: fixed time-frequency resolution based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
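For sensor streams, the linear case mentioned in the last bullet is often enough. The following sketch resamples irregularly timed readings onto a regular grid. The function name and data are illustrative assumptions, and both time lists are assumed to be sorted in strictly ascending order.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (time, value) samples at each time in new_times.
    Times outside the sampled range are clamped to the first/last value."""
    out = []
    j = 0
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        # advance to the segment [times[j], times[j + 1]] that contains t
        while times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out


# Hypothetical sensor readings arriving with jittered timestamps (seconds)
times = [0.0, 0.9, 2.1, 3.0]
values = [0.0, 9.0, 21.0, 30.0]
regular = resample_linear(times, values, [0.0, 1.0, 2.0, 3.0])
print(regular)
```

Since the example readings lie on the line value = 10 × time, the resampled values land at approximately 0, 10, 20 and 30.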

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
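The range-finding example can be sketched as a small helper that watches readings during a calibration phase and then normalizes later readings to [0, 1]. The class name `RangeCalibrator` and the raw values are hypothetical.

```python
class RangeCalibrator:
    """Track the minimum and maximum reading seen, then normalize to [0, 1]."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def observe(self, reading):
        """Update the known range with one reading from the calibration phase."""
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def normalize(self, reading):
        """Map a reading into [0, 1], clipping values outside the known range."""
        if self.lo is None or self.hi == self.lo:
            return 0.0
        x = (reading - self.lo) / (self.hi - self.lo)
        return min(max(x, 0.0), 1.0)


# Hypothetical raw ADC codes recorded while the performer sweeps the sensor
calibrator = RangeCalibrator()
for raw in (212, 230, 780, 640):
    calibrator.observe(raw)

print(calibrator.normalize(496), calibrator.normalize(900))
```

Running one calibrator per sensor is a simple way to get a common response from several sensors of the same type.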

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  • Power law functions frequently used
• Frequently used in conjunction with calibration
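The two NIME examples above can be sketched as simple mapping functions applied to an already normalized reading in [0, 1]. The frequency range (110 to 880 Hz), the exponent 2 and the function names are assumptions chosen for illustration.

```python
def clamp01(x):
    """Restrict a normalized reading to the interval [0, 1]."""
    return min(max(x, 0.0), 1.0)


def knob_to_frequency(x, f_min=110.0, f_max=880.0):
    """Exponential map: equal knob movements give equal musical intervals."""
    return f_min * (f_max / f_min) ** clamp01(x)


def slider_to_gain(x, exponent=2.0):
    """Power-law map: more resolution at the quiet end of the slider."""
    return clamp01(x) ** exponent


for position in (0.0, 0.25, 0.5, 1.0):
    print(position, round(knob_to_frequency(position), 1), slider_to_gain(position))
```

With these defaults the knob spans three octaves, so one third of its travel corresponds to one octave.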

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation

Efficient computation when N is a power of 2
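As a sketch of correlation-based periodicity detection, the code below computes the autocorrelation directly in the time domain and picks the lag with the largest value. This direct form is the slow one; the efficient computation mentioned above would go through the FFT instead. Function names and the test signal are illustrative assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """Direct (time-domain) autocorrelation of the mean-removed signal."""
    n = len(x)
    mu = sum(x) / n
    centred = [v - mu for v in x]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in the search range."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical strumming gesture: a sinusoid repeating every 25 samples
PERIOD = 25
gesture = [math.sin(2.0 * math.pi * i / PERIOD) for i in range(200)]
period = estimate_period(gesture, min_lag=5, max_lag=60)
print(period)
```

The minimum lag excludes the trivial peak at lag 0. Multiples of the true period also produce peaks, which is why a bounded search range is used here.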

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)

NIME Object tracking

Feature Extraction: Audio

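Mel Frequency Cepstral Coefficients

Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: filter edges relative to the centre frequency CF: CF-130 and CF+130 for the linearly-spaced filters, CF/1.0718 and CF×1.0718 for the log-spaced filters]

Pipeline: Mel-filtering -> Log -> DCT -> MFCCs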

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis

[Figure: PCA projection matrix obtained by eigenanalysis of the correlation matrix]

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  • K-nearest Neighbors
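Of the algorithms listed, k-nearest neighbors is the easiest to sketch in a few lines: it stores the training set and votes among the k closest feature vectors. The two-dimensional features and the "kick"/"snare" labels below are made-up data for illustration, not results from the tutorial.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """train is a list of (feature_vector, label) pairs.
    Returns the majority label among the k training vectors closest to query."""
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional audio features for drum hits
train = [
    ((0.10, 0.20), "kick"),
    ((0.20, 0.10), "kick"),
    ((0.15, 0.15), "kick"),
    ((0.80, 0.90), "snare"),
    ((0.90, 0.80), "snare"),
    ((0.85, 0.95), "snare"),
]

print(knn_predict(train, (0.20, 0.20)), knn_predict(train, (0.70, 0.90)))
```

There is no training step beyond storing the labelled examples, which is what makes the method non-parametric. The cost is a distance computation against every stored example at prediction time.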


Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based

Clustering Evaluation
• Internal evaluation (Cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (Additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel pipelines, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

Dynamic Time Warping

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history – typically just previous time slice)
• Transition Model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date; also yields the likelihood of the evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

[Figure: hidden Markov model diagram. Labels: Hidden State (single discrete variable), Observations, Transition Probs between steps t-1 and t, Emission Probs at step t, Model]
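The filtering task for a hidden Markov model can be sketched with the standard forward recursion: predict with the transition probabilities, weight by the emission probability of the new observation, then normalize. This is a generic textbook sketch, not code from the tutorial; the function name and the convention that `prior` describes the state before the first observation are assumptions.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state given all observations so far.

    prior[i]         : P(state_0 = i), before any observation
    transition[i][j] : P(state_t = j | state_{t-1} = i)
    emission[j][o]   : P(observation_t = o | state_t = j)
    """
    n = len(prior)
    belief = list(prior)
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight each state by how well it explains the observation.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief
```

Because each step uses only the previous belief and the newest observation, the recursion is causal and has constant cost per time step, which is what makes filtering usable in real time.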

Kalman Filtering

• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state (under linear-Gaussian assumptions)

Kalman Filter

• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = \mathcal{N}(a_y\,y + b_y,\ \sigma_y)(x)$
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
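A minimal one-dimensional sketch of the predict/update cycle. It assumes the simplest possible state model, a random walk (next state = current state + Gaussian noise), rather than the constant-velocity model mentioned on the slide. The function and parameter names are illustrative.

```python
def kalman_step(mean, var, measurement, process_var, measurement_var):
    """One forward step of a scalar Kalman filter.

    (mean, var) describe the Gaussian belief at time t; the return value
    describes the belief at time t+1 after seeing `measurement`.
    """
    # Prediction: the random-walk model keeps the mean and grows the uncertainty.
    predicted_mean = mean
    predicted_var = var + process_var

    # The gain expresses the trade-off between model and observation reliability.
    gain = predicted_var / (predicted_var + measurement_var)

    # Update: move towards the measurement in proportion to the gain.
    new_mean = predicted_mean + gain * (measurement - predicted_mean)
    new_var = (1.0 - gain) * predicted_var
    return new_mean, new_var


def kalman_smooth_stream(samples, process_var, measurement_var, init_mean=0.0, init_var=1.0):
    """Run the filter over a list of noisy samples and return the state estimates."""
    mean, var = init_mean, init_var
    estimates = []
    for z in samples:
        mean, var = kalman_step(mean, var, z, process_var, measurement_var)
        estimates.append(mean)
    return estimates
```

A large `measurement_var` makes the filter trust its model and smooth heavily; a large `process_var` makes it follow the sensor closely.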


Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming ...

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography

• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amounts of data
  - output is a description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

participatory design

• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design

• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz

• A method of testing a system that does not exist
  - the voice editor by IBM (1984)

[Figure: "The Wizard" and "What the user sees"]

Wizard of Oz

• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Eat your own dogfood

• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

Parametric and non-parametric tests

• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

Student t-test

• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test

• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at ...

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distributions showing the regions (two-tailed) or region (one-tailed) for rejecting the null hypothesis, beyond the critical value t(p, df)]

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
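The three steps above can be sketched as follows for the independent-samples variant. The formulas on the original slide did not survive extraction, so this uses the standard pooled-variance (equal variance) form, which is an assumption; the function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(sample1, sample2):
    """Return (t, df) for two independent samples, assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2

    # Combined (pooled) variance: sample variances weighted by their own df.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df

    # Standard error of the difference between the two sample means.
    sed = sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))

    t = (mean(sample1) - mean(sample2)) / sed
    return t, df
```

The returned `t` is then compared against the critical value t(p, df), using its absolute value for a two-tailed test.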

comparing t with critical value

• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-...")

[Table: critical values of t; columns for two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I

• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II

• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
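A small sketch of the one-way case, showing how the observed variation is split into a between-group and a within-group part and how the F ratio is formed. This is the generic textbook computation; the function name is illustrative.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k

    # Ratio of the two variance estimates (mean squares).
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

Multiplying every observation by a constant or adding an offset scales both sums of squares by the same factor, so the F ratio is unchanged. This is the independence from scaling and bias noted above.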

Integration and Implementation

• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII

• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Categories: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Work cited in the figure: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping

• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions

GTII Mapping

• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks

• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches and foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech]

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0% error
  - Testing
    • 1380 consonants, 234 vowels
    • 0% error

GTII Vowel Network

• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1% error
• Testing
  - 50 examples of each vowel

A Normalized RBF Network

• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Normalized RBF Units

• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
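A minimal sketch of a normalized RBF layer as described above: each hidden unit has a Gaussian activation centred on its weight vector, and the activations are divided by their sum. The function names, the toy centres and the output weights are assumptions for illustration; this is not the Glove-TalkII code.

```python
from math import exp


def normalized_rbf_activations(x, centres, width):
    """Gaussian activation of each hidden unit, normalized to sum to 1.

    Assumes x is close enough to at least one centre that the sum of the
    raw activations does not underflow to zero.
    """
    raw = []
    for centre in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
        raw.append(exp(-dist_sq / (2.0 * width ** 2)))
    total = sum(raw)
    return [r / total for r in raw]


def rbf_network_output(x, centres, width, output_weights):
    """Weighted sum of the normalized activations.

    output_weights holds one row of output values per hidden unit.
    """
    hidden = normalized_rbf_activations(x, centres, width)
    n_out = len(output_weights[0])
    return [sum(h * w[k] for h, w in zip(hidden, output_weights)) for k in range(n_out)]
```

With a small width, the output sits on a plateau at the row of the nearest centre and changes quickly only near the midpoint between centres. A larger width gives smoother interpolation between the rows.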


Consonant Network

• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05% error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1% error
• Dependent on user

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders

Use the actual acoustic instrument as a control surface; inspired by Marimba Lumina.

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Signal Processing Challenges

• Noisy sensor readings
• Multiple sampling rates
• Synchronous and asynchronous streams at different rates
• Higher level understanding
  - Supervised and unsupervised learning
  - Time alignment
• Real-time and causality

Interdisciplinary Challenges

• Inherently interdisciplinary field
• ECE background
  - MATLAB culture
  - No HCI, user-centered training
  - Focus on algorithms, not programming experience
• CS background
  - No DSP
  - No circuits
  - Focus on programming experience, not algorithms
• Music
  - Performance and composition culture
  - No HCI, DSP or programming
• Integration: putting it all together

New Interfaces for Musical Expression (NIME)

First organized as a workshop of ACM CHI 2001
Experience Music Project, Seattle, April 2001
Lectures / Discussions / Demos / Performances

Research on HCI/Music

Tutorial objectives

• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music ...

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process

1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

What to measure

• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more ...

Transduction and Digitizing

• Electrical
  - Voltage
  - Resistance
  - Impedance
• Optical
  - Color
  - Intensity
• Magnetic
  - Induced current
  - Field direction

Digitizing

• Converting a change in resistance to a voltage (a typical sensor has variable resistance)
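One common way to turn a resistance change into a voltage is a voltage divider followed by an analog-to-digital converter. The sketch below assumes the sensor sits between the supply and the measurement point with a fixed resistor to ground; the component values and function names are made up for illustration.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a two-resistor divider.

    Assumed wiring: supply -> sensor -> measurement point -> fixed resistor -> ground.
    As the sensor resistance drops, the measured voltage rises.
    """
    return v_supply * r_fixed / (r_fixed + r_sensor)


def adc_code(voltage, v_ref, bits):
    """Quantize a voltage in [0, v_ref] to an integer code with the given bit depth."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))  # clamp to the valid code range


# Example: 5 V supply, 10 kOhm fixed resistor, sensor currently at 10 kOhm, 10-bit ADC.
volts = divider_voltage(5.0, 10_000, 10_000)
code = adc_code(volts, 5.0, 10)
```

Choosing the fixed resistor near the middle of the sensor's resistance range tends to use more of the ADC's input range.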

Physical Property Sensors

• Piezoelectric sensors
• Force sensing resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric field sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more ...

Human Action Oriented

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing

• Trackpads and touch screens
• A small voltage is applied to an insulator coated with conductive material. When a conductor (such as a finger) touches the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays

• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras

• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera

• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors

• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors

• Stepper motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold; no feedback circuitry required
  - Low cost
• Servo motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect

• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets

• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

DAQ

• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al. ...)

Events and Time Series

[Figure: two time axes contrasting asynchronous events with synchronous samples; multiple channels are possible (for example, microphone arrays)]

2D, 3D, ND + time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account.

Sensors <-> DSP <->

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering

• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection

• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
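A small sketch that combines three of the variants listed above: a fixed threshold, a local-maximum test and a minimum spacing between peaks. The function name and the rule of keeping the larger of two peaks that are too close together are assumptions; real systems usually add their own tuning.

```python
def detect_peaks(signal, threshold, min_spacing):
    """Return a list of (index, amplitude) pairs for the detected peaks.

    A sample is a candidate if it is a local maximum and at least `threshold`.
    Candidates closer than `min_spacing` samples to the previously kept peak
    replace it only if they are larger.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks


example = [0, 1, 0, 0.2, 0, 3, 2.5, 2.9, 0]
found = detect_peaks(example, threshold=0.5, min_spacing=3)
```

An adaptive variant would replace the fixed `threshold` with a value derived from a running mean or median of the recent signal.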

Fourier Transform

[Figure: spectrum]

Short Time Fourier Transform

• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Spectrogram

[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]

Time-frequency tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
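The tradeoff can be made concrete with a little arithmetic for the two window sizes on the slide. The helper below is an illustrative sketch; it reports the window duration and the spacing between FFT bins, ignoring the additional smearing caused by the window shape.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, FFT bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


short_window = stft_resolution(256, 22050)   # about 11.6 ms, bins about 86 Hz apart
long_window = stft_resolution(4096, 22050)   # about 186 ms, bins about 5.4 Hz apart
```

The short window localizes events in time well but cannot separate nearby frequencies; the long window does the opposite.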

Wavelets

• STFT: fixed time-frequency resolution, based on window size
• DWT: adaptive time-frequency resolution

Denoising

• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering, 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain

Interpolation and Resampling

• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, linear or quadratic interpolation is frequently used

Calibration

• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; getting a common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling

• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
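A minimal sketch of scaling a calibrated sensor reading to a control parameter, with an optional power-law curve. The function name, the clamping behaviour and the example ranges are assumptions made for illustration.

```python
def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a reading from [in_min, in_max] to [out_min, out_max].

    in_min and in_max would typically come from a calibration pass.
    exponent=1.0 gives a linear mapping; other values apply a power law
    to the normalized reading before it is stretched to the output range.
    """
    normalized = (reading - in_min) / (in_max - in_min)
    normalized = max(0.0, min(1.0, normalized))  # clamp out-of-range readings
    return out_min + (out_max - out_min) * normalized ** exponent


# Example: a 10-bit slider mapped to a gain between 0 and 1 with a squared curve.
gain = scale(512, 0, 1024, 0.0, 1.0, exponent=2.0)
```

With `exponent=2.0` the lower half of the slider covers only the first quarter of the output range, which gives finer control over small values.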

Periodicity Detection

• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

Autocorrelation and Cross-Correlation

Efficient computation when N is a power of 2
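A correlation-based period estimate can be sketched directly from the definition of autocorrelation. The version below is the plain O(N x lags) sum, chosen for readability; the efficient computation mentioned on the slide would go through the FFT instead. The function names and the lag search range are illustrative assumptions.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def dominant_period(signal, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Example: an impulse every 4 samples, e.g. onsets of a repeated strumming gesture.
pulse_train = [1, 0, 0, 0] * 8
period = dominant_period(pulse_train, min_lag=1, max_lag=10)
```

Cross-correlation follows the same pattern with two different signals and is one way to estimate the time offset between two sensor streams.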

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation bull Partition image into sets of pixels bull Pixels in a set share visual

characteristics bull Helps to locate objects and boundariesbull Many approaches ndash K-means ndash Compression-basedndash Histogram-based ndash Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

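Mel Frequency Cepstral Coefficients

Mel-scale filter bank: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular filter with centre frequency CF; neighbouring filters at CF-130 and CF+130 (linear region) or CF/1.0718 and CF×1.0718 (log region)]

Processing chain: Mel-filtering → Log → DCT → MFCCs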

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
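Energy compaction can be checked numerically. The sketch below implements an orthonormal DCT-II directly from its definition (an O(N²) loop, chosen for readability rather than speed) and measures how much of a smooth signal's energy sits in the lowest coefficients. The function names and the test signal are illustrative assumptions.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a sequence, computed directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def energy_in_first(coeffs, m):
    """Fraction of total energy carried by the first m coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:m]) / total


# A smooth ramp: almost all of its energy ends up in two coefficients.
ramp = [float(i) for i in range(8)]
coeffs = dct2(ramp)
```

Because the transform is orthonormal, the total energy of `coeffs` equals that of `ramp`, which is what makes the fraction meaningful.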

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
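Delta features and aggregate statistics are simple to compute over a window of feature values. A minimal sketch, with hypothetical function names and a first-order difference standing in for whatever delta estimator a given system uses:

```python
import statistics


def delta(seq):
    """First-order difference between consecutive frames."""
    return [b - a for a, b in zip(seq, seq[1:])]


def summarize(seq):
    """Aggregate statistics of a time sequence of feature values."""
    return {
        "mean": statistics.fmean(seq),
        "median": statistics.median(seq),
        "variance": statistics.pvariance(seq),
    }
```

For example, `summarize(delta(window))` describes how fast a feature is changing over the window rather than its level.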

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix
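A small sketch of the eigenanalysis step. Two assumptions to flag: it uses the covariance matrix of mean-centred data (standardizing each feature first would give the correlation matrix named on the slide), and it finds only the first principal component, by power iteration. The function names and data are hypothetical.

```python
import math


def covariance(rows):
    """Sample covariance matrix of a list of feature vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(matrix, iterations=100):
    """Dominant eigenvector (unit length) found by power iteration."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical 2-D features lying on the line y = 2x.
data = [(-2.0, -4.0), (-1.0, -2.0), (0.0, 0.0), (1.0, 2.0), (2.0, 4.0)]
component = first_component(covariance(data))
```

Stacking the leading eigenvectors as rows gives the projection matrix; here a single row is enough because the data is one-dimensional in disguise.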

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
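K-nearest neighbors is the easiest of these to show in full. A minimal sketch, assuming Euclidean distance and majority voting; the training data and labels are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Predict a label by majority vote among the k nearest training vectors.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical 2-D features for two kinds of drum strokes.
train = [
    ((0.0, 0.0), "soft"), ((0.0, 1.0), "soft"), ((1.0, 0.0), "soft"),
    ((5.0, 5.0), "loud"), ((5.0, 6.0), "loud"), ((6.0, 5.0), "loud"),
]
```

There is no training step: the "model" is the stored training set, which is what makes the method non-parametric.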

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
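The three measures in the first bullet can be computed from paired lists of true and predicted labels. A sketch with hypothetical function names; the F-measure shown is the balanced F1 for one chosen class.

```python
from collections import Counter


def confusion(truth, predicted):
    """Counts of (true label, predicted label) pairs."""
    return Counter(zip(truth, predicted))


def accuracy(truth, predicted):
    """Fraction of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


truth = ["a", "a", "a", "b", "b"]
predicted = ["a", "a", "b", "b", "a"]
```

In cross-validation these functions would be applied to each held-out fold and the results averaged.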

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
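A sketch of plain K-means (Lloyd's algorithm) for the first bullet. The initial centers are passed in explicitly so the result is reproducible; practical implementations usually pick them at random or with a seeding scheme. Names and data are illustrative.

```python
import math


def kmeans(points, centers, iterations=20):
    """Alternate between assigning points to the nearest center and
    moving each center to the mean of its assigned points."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda c: math.dist(p, centers[c]))
            clusters[idx].append(p)
        centers = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
    return centers, clusters


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 9.0)])
```

An empty cluster keeps its previous center here, which is one simple way to handle that corner case.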

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
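Ordinary least squares for a single input feature, together with R squared, fits in a few lines. A sketch with hypothetical names and data; multi-feature regression would need a matrix solver instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual / total sum of squares."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
```

On this noiseless data the fit is exact, so R squared is 1; measurement noise would pull it below 1.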

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks
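Bringing streams with different rates onto a common time base is often the first fusion step. The sketch below uses linear interpolation and holds the end values outside the sampled interval; both are assumptions for illustration, as is the function name.

```python
from bisect import bisect_right


def resample(times, values, new_times):
    """Linearly interpolate samples (times, values) at new_times.

    times must be strictly increasing; queries outside the sampled
    interval return the nearest end value.
    """
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(float(values[0]))
        elif t >= times[-1]:
            out.append(float(values[-1]))
        else:
            j = bisect_right(times, t)  # times[j - 1] <= t < times[j]
            t0, t1 = times[j - 1], times[j]
            w = (t - t0) / (t1 - t0)
            out.append(values[j - 1] + w * (values[j] - values[j - 1]))
    return out
```

After resampling, feature vectors from each stream can be concatenated frame by frame (early fusion) or classified separately and combined afterwards (late fusion).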

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea use camera to constrain factorization results taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date; also gives the likelihood of the evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
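Filtering for a discrete hidden state can be sketched as a predict-then-update loop (the forward algorithm). The two-state model below, with its state names, observation names and probabilities, is entirely hypothetical and only serves to exercise the function.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state given all observations so far.

    prior[s]           P(state_0 = s)
    transition[p][s]   P(state_t = s | state_{t-1} = p)
    emission[s][o]     P(observation = o | state = s)
    """
    belief = dict(prior)
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = {s: sum(belief[p] * transition[p][s] for p in belief)
                     for s in belief}
        # Update: weight by how well each state explains the observation.
        unnormalized = {s: predicted[s] * emission[s][obs] for s in belief}
        total = sum(unnormalized.values())
        belief = {s: v / total for s, v in unnormalized.items()}
    return belief


prior = {"playing": 0.5, "resting": 0.5}
transition = {"playing": {"playing": 0.7, "resting": 0.3},
              "resting": {"playing": 0.3, "resting": 0.7}}
emission = {"playing": {"loud": 0.9, "quiet": 0.1},
            "resting": {"loud": 0.2, "quiet": 0.8}}
belief = forward_filter(prior, transition, emission, ["loud", "loud"])
```

The loop only ever uses past observations, so it is causal and can run as the sensor data arrives.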

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

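[Figure: HMM diagram. The hidden state (a single discrete variable) evolves over time according to the transition probabilities P(state at t | state at t-1); observations are generated from the hidden state according to the emission probabilities p(observation at t | state at t). Transition and emission probabilities together form the model.]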

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x|y) = N(a_y y + b_y, σ_y)(c)
• Next state is a linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
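The predict and update steps are easiest to see in one dimension. The sketch below assumes a scalar state that is expected to stay constant between samples, so the mean and covariance matrix reduce to a single mean and variance. Parameter names and values are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed constant between samples.

    process_var      how much the true state may drift per step (model trust)
    measurement_var  sensor noise variance (observation trust)
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement by the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


estimates = kalman_1d([5.0, 5.0, 5.0], process_var=0.0, measurement_var=1.0)
```

The gain expresses the trade-off in the last bullet: a noisy sensor (large `measurement_var`) pushes the gain toward 0 and the filter trusts its model, while an unreliable model (large `process_var`) pushes it toward 1.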

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

Tuesday 22 October 13

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure panels: "The Wizard" and "What the user sees"]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution plots marking the critical value t(p,df), with either two regions for rejecting the null hypothesis or a single region for rejecting the null hypothesis]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
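The formulas on this slide did not survive extraction, so the sketch below uses the standard independent two-sample t statistic with pooled variance as a stand-in; it may differ in notation from the original slide. Function name and data are hypothetical.

```python
import math
import statistics


def pooled_t(a, b):
    """Independent two-sample t statistic, assuming equal variances.

    Returns (t, degrees of freedom).
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = ((n1 - 1) * statistics.variance(a)
                  + (n2 - 1) * statistics.variance(b)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (statistics.mean(a) - statistics.mean(b)) / sed
    return t, df


control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [3.0, 4.0, 5.0, 6.0, 7.0]
t, df = pooled_t(control, treatment)
```

The magnitude of `t` is then compared with the critical value for the chosen significance and `df`, or converted to a p-value with a statistics package.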

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
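The "dividing two variances" step for the one-way case can be sketched as follows. The function name and data are illustrative; the resulting F ratio would still need to be compared with a table or converted to a p-value.

```python
def one_way_f(groups):
    """F ratio for one-way ANOVA: between-group over within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

With exactly two groups the F ratio equals the square of the pooled-variance t statistic, which is one way to see ANOVA as a generalization of the t-test.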

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of gesture-to-speech mappings, ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500.
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator.
Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990).]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: system block diagram.
Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal.
Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer.
Output: speech.]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
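The behaviour of normalized RBF units can be sketched directly: each unit has a Gaussian activation around its centre, and the activations are divided by their sum over the group. The centres, width and query points below are made-up values, not the Glove-TalkII parameters, and the fallback for numerical underflow is an added assumption.

```python
import math


def normalized_rbf(x, centres, width):
    """Activations of a group of Gaussian RBF units, normalized to sum to 1."""
    acts = [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2)) for c in centres]
    total = sum(acts)
    if total == 0.0:
        # Far from every centre all activations underflow; fall back to uniform.
        return [1.0 / len(centres)] * len(centres)
    return [a / total for a in acts]


centres = [(0.0, 0.0), (1.0, 0.0)]
near_first = normalized_rbf((0.1, 0.0), centres, width=0.15)
midway = normalized_rbf((0.5, 0.0), centres, width=0.15)
```

Near a centre the closest unit takes almost all of the normalized activation (the plateau), while halfway between two centres the units share it equally, which is where the interpolation happens.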

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Interdisciplinary Challenges
• Inherently interdisciplinary field
• ECE background
  – MATLAB culture
  – No HCI, user-centered training
  – Focus on algorithms, not programming experience
• CS background
  – No DSP
  – No circuits
  – Focus on programming experience, not algorithms
• Music
  – Performance and composition culture
  – No HCI, DSP or programming
• Integration – putting it all together

19

Motivation and Overview

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures / Discussions / Demos / Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

Sensors + signal processing
Actuators + signal processing
HCI
Engineering and programming
Music
Fun and Effort

Effort and pain

If you are lucky

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical
  – Voltage
  – Resistance
  – Impedance
Optical
  – Color
  – Intensity
Magnetic
  – Induced Current
  – Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
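A common way to do this conversion is a voltage divider read by an ADC. The sketch below assumes one particular wiring (sensor between the supply and the ADC pin, a fixed resistor from the pin to ground) and a ratiometric ADC; the function name, resistor value and ADC range are hypothetical.

```python
def adc_to_resistance(adc_value, r_fixed=10_000.0, adc_max=1023):
    """Recover the sensor resistance (ohms) from a voltage-divider reading.

    Assumed circuit: Vcc - sensor - ADC pin - r_fixed - ground, so
    v_out / v_cc = r_fixed / (r_sensor + r_fixed) = adc_value / adc_max.
    """
    if not 0 < adc_value <= adc_max:
        raise ValueError("reading out of range for this circuit")
    ratio = adc_value / adc_max
    return r_fixed * (1.0 - ratio) / ratio
```

A reading at half of full scale means the sensor resistance equals the fixed resistor, so `r_fixed` is usually chosen near the middle of the sensor's working range.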

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10 are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50mW vs 1-2W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  - Brushless DC that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

[Figure: image and volume sequences over time]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to take their spatial topology also into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
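The variants listed above can be combined in a few lines. The following is only a minimal sketch, not the tutorial's method: the function name `find_peaks`, the fixed threshold, the greedy tallest-first spacing rule and the parabolic refinement are all illustrative assumptions.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima above a fixed threshold.

    Hypothetical helper for illustration; an adaptive threshold would
    replace the constant `threshold` with a running estimate.
    """
    # Local maximum test combined with the fixed threshold.
    candidates = [
        n for n in range(1, len(x) - 1)
        if x[n] > threshold and x[n] > x[n - 1] and x[n] >= x[n + 1]
    ]
    # Spacing constraint: visit candidates from tallest to shortest and keep
    # one only if it is at least min_spacing samples from every kept peak.
    kept = []
    for n in sorted(candidates, key=lambda i: x[i], reverse=True):
        if all(abs(n - k) >= min_spacing for k in kept):
            kept.append(n)
    peaks = []
    for n in sorted(kept):
        # Interpolation: fit a parabola through the three samples around
        # the maximum to refine its location and amplitude.
        a, b, c = x[n - 1], x[n], x[n + 1]
        denom = a - 2 * b + c
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        peaks.append((n + offset, b - 0.25 * (a - c) * offset))
    return peaks


signal = [0, 1, 3, 1, 0, 0.5, 2, 4, 2, 0]
print(find_peaks(signal, threshold=0.5, min_spacing=3))
```

Raising `min_spacing` above the distance between the two maxima keeps only the taller one, which is the usual way to suppress double triggers from a single strike.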

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for powers of 2 window sizes
FFT: the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time varying power spectra (3D with animation or 2D)
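The two window sizes quoted on this slide make the time-frequency tradeoff concrete. The short sketch below only does the arithmetic; the function name `stft_resolution` is a hypothetical helper, and bin spacing is used as a simple proxy for frequency resolution (the actual resolution also depends on the window shape).

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, FFT bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for size in (256, 4096):
    duration_ms, spacing_hz = stft_resolution(size, 22050)
    print(f"{size:5d} samples: {duration_ms:7.2f} ms window, {spacing_hz:6.2f} Hz per bin")
```

The short window localizes events to roughly 12 ms but smears frequency over about 86 Hz per bin; the long window resolves about 5 Hz per bin but averages over roughly 186 ms.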

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
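For sensor data, the linear case mentioned in the last bullet is often enough. A minimal sketch, assuming readings arrive with strictly increasing timestamps; the names `interpolate_linear` and `resample` are illustrative, and values outside the recorded span are simply clamped to the nearest reading.

```python
import bisect


def interpolate_linear(times, values, t):
    """Estimate the signal at time t from samples taken at increasing times."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect.bisect_right(times, t)  # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    w = (t - t0) / (t1 - t0)
    return (1 - w) * values[i - 1] + w * values[i]


def resample(times, values, rate_hz):
    """Resample irregularly timed readings onto a uniform grid at rate_hz."""
    n = int((times[-1] - times[0]) * rate_hz) + 1
    return [interpolate_linear(times, values, times[0] + k / rate_hz)
            for k in range(n)]


# Irregularly timed readings (seconds, arbitrary units), resampled at 1 Hz.
print(resample([0.0, 0.5, 2.0], [0.0, 1.0, 4.0], rate_hz=1))
```

Linear interpolation is not band-limited, so it trades reconstruction quality for very low cost, which is usually acceptable for slowly varying control signals.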

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
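Range calibration and power-law scaling fit together naturally. The sketch below is an assumption-laden illustration: `calibrate` and `scale` are hypothetical names, the readings are made-up 10-bit ADC values, and the exponent is a free design parameter rather than a recommended value.

```python
def calibrate(readings):
    """Find the usable range of a sensor from a calibration recording."""
    return min(readings), max(readings)


def scale(raw, lo, hi, out_min, out_max, exponent=1.0):
    """Map a raw reading in [lo, hi] to [out_min, out_max] with a power law."""
    x = (raw - lo) / (hi - lo)
    x = min(1.0, max(0.0, x))  # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * x ** exponent


# The performer sweeps a knob over its full travel during calibration.
lo, hi = calibrate([102, 480, 915, 300])
# Map the knob to a frequency in Hz; exponent > 1 gives finer control at the low end.
print(scale(508.5, lo, hi, 20.0, 2000.0, exponent=2.0))
```

With `exponent=1.0` the mapping is linear; values above 1 compress the low end of the knob travel, values below 1 expand it.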

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
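A correlation-based periodicity estimate can be sketched directly from the definition. This is a minimal illustration, not the tutorial's implementation: it uses the direct sum, whose cost grows with N times the number of lags, whereas the efficient power-of-2 route mentioned on the slide computes the same quantity through the FFT. The names `autocorrelation` and `estimate_period` are assumptions.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Pick the lag (in samples) with the strongest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A pulse every 4 samples, e.g. onsets from a repeated strumming gesture.
pulses = [1, 0, 0, 0] * 8
print(estimate_period(pulses, min_lag=2, max_lag=10))
```

Restricting the search to a plausible lag range (`min_lag`, `max_lag`) is what keeps the trivial peak at lag 0 out of the result.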

Tuesday 22 October 13

ICASSP 2013 tutorial


Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

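Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular filter centred at CF; edges labelled CF-130 and CF+130 (linear spacing) and a factor of 1.0718 around CF (log spacing)]

Mel-filtering -> Log -> DCT -> MFCCs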

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
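Energy compaction is easy to see on a small smooth input. The following is a sketch of an unnormalized DCT-II written straight from its definition; `dct2` and `keep_low` are hypothetical helper names, and the input values stand in for log mel-band energies rather than coming from real audio.

```python
import math


def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def keep_low(x, count):
    """Keep only the first `count` DCT coefficients, as MFCCs do."""
    return dct2(x)[:count]


# A smooth ramp: almost all of the energy lands in the first two coefficients.
coeffs = dct2([0.0, 1.0, 2.0, 3.0])
print([round(c, 3) for c in coeffs])
```

For a real MFCC front end the input would be the log of the mel filterbank outputs for one frame, and `count` would be in the 13 to 20 range given above.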

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
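K-nearest neighbors is the simplest of the listed classifiers to write down. A minimal sketch under stated assumptions: Euclidean distance, majority vote, and made-up two-dimensional feature vectors with hypothetical labels `soft` and `loud`; `knn_predict` is an illustrative name.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Toy training set: (feature 1, feature 2) per example, with ground truth labels.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["soft", "soft", "soft", "loud", "loud", "loud"]
print(knn_predict(train_x, train_y, (0.5, 0.5)))
```

There is no training step: the whole labeled set is the model, which is convenient for quick gesture prototypes but scales poorly with large training sets.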

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
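The three metrics in the first bullet all derive from the confusion matrix. A small sketch with made-up labels; the function names are illustrative, rows index the true class and columns the predicted class, and the F-measure shown is the balanced F1 for one class.

```python
def confusion_matrix(truth, predicted, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    m = [[0] * len(labels) for _ in labels]
    for t, p in zip(truth, predicted):
        m[index[t]][index[p]] += 1
    return m


def accuracy(m):
    """Fraction of examples on the diagonal."""
    return sum(m[i][i] for i in range(len(m))) / sum(map(sum, m))


def f_measure(m, i):
    """Harmonic mean of precision and recall for class index i."""
    tp = m[i][i]
    fp = sum(row[i] for row in m) - tp
    fn = sum(m[i]) - tp
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


truth = ["a", "a", "a", "b", "b", "b"]
predicted = ["a", "a", "b", "b", "b", "a"]
m = confusion_matrix(truth, predicted, ["a", "b"])
print(m, accuracy(m), f_measure(m, 0))
```

In a cross-validation setup the same functions would be applied to the held-out fold of each split and the results averaged.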

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBScan
• Graph-based
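K-means alternates between assigning points to the nearest centre and moving each centre to the mean of its points. The sketch below is a bare-bones illustration: it initializes from the first k points to stay deterministic, which is an assumption made for the example rather than a recommended strategy, and `kmeans` is a hypothetical name.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd iterations; returns (centres, cluster index per point)."""
    centres = [list(p) for p in points[:k]]  # naive init: first k points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: math.dist(p, centres[c]))
            clusters[j].append(p)
        for j, members in enumerate(clusters):
            if members:  # leave an empty cluster's centre where it was
                centres[j] = [sum(col) / len(members) for col in zip(*members)]
    labels = [min(range(k), key=lambda c: math.dist(p, centres[c]))
              for p in points]
    return centres, labels


points = [(0, 0), (10, 10), (0, 1), (1, 0), (10, 11), (11, 10)]
centres, labels = kmeans(points, k=2)
print(centres, labels)
```

In practice the result depends on initialization, so random restarts or a smarter seeding scheme are commonly used.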

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels): similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (squared correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
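Ordinary least squares with one input variable, together with R squared, fits in a few lines. This is a sketch with made-up numbers; `fit_line` and `r_squared` are illustrative names, and the single-feature case is chosen only to keep the algebra visible.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs, ys = [0, 1, 2, 3], [1, 3, 4, 8]
slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predicted))
```

An R squared of 1 means the line explains all of the variance; values near 0 mean it does no better than predicting the mean.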

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea use camera to constrain factorization results taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done with some delay at the same time as the sensor data

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history, typically just previous time slice)
• Transition Model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  - Estimate what transitions occurred and what states generated the sensor reading and update models
  - Updated models provide new estimates and

99

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

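[Figure: HMM unrolled over time steps 1, 2, 3, 4, with a row of hidden states and a row of observations. MODEL: transition probabilities P(hidden state at t | hidden state at t-1) and emission probabilities p(observation at t | hidden state at t). The hidden state is a single discrete variable.]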
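The filtering task for a discrete HMM reduces to a predict step through the transition model followed by a reweighting by the emission model. The sketch below is illustrative only: `forward_filter` is a hypothetical name, and the numbers are the textbook umbrella example (hidden state rain or no rain, observation umbrella or no umbrella), not data from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state given all observations so far.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    belief = list(prior)
    n = len(prior)
    for obs in observations:
        # Predict: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief


# States: 0 = rain, 1 = no rain. Observations: 0 = umbrella, 1 = no umbrella.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
print(forward_filter(prior, transition, emission, [0, 0]))
```

The normalizing constant `total` at each step is the likelihood of the new observation given the past, which is how filtering yields the likelihood of the evidence as a by-product.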

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y·y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
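In one dimension the forward step collapses to a handful of scalar updates, which makes the trade-off between observation reliability and model reliability explicit. This is a minimal sketch assuming a random-walk state model; `kalman_1d`, `process_var` and `measurement_var` are illustrative names, not part of any library.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0  # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction step (random-walk model): the mean stays put and the
        # uncertainty grows by the process noise.
        p = p + process_var
        # Update step: the gain k balances model against measurement.
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates


# A sensor that should read 5.0 but is noisy.
readings = [5.3, 4.8, 5.1, 4.9, 5.2, 5.0, 4.7, 5.1]
print(kalman_1d(readings, process_var=1e-4, measurement_var=0.1))
```

A large `measurement_var` makes the gain small, so the filter trusts its model and smooths heavily; a small one makes it follow the readings closely.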

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for and implications of introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)

[Figure: what the user sees / the Wizard]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  - Assume normality for relevant distributions, work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One and two-tailed variants for p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: two-tailed case (regions for rejecting the null hypothesis) or one-tailed case (region for rejecting the null hypothesis), bounded by the critical value t(p,df); axis marks X1 and X2]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
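The formulas for these three steps did not survive in the slide text, so the sketch below uses the standard pooled-variance form of the independent two-sample t-test as a stand-in; this is an assumption about which variant the slide showed. The function name `t_statistic` and the sample values are made up for illustration.

```python
import math


def t_statistic(a, b):
    """Independent two-sample t with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    m1, m2 = sum(a) / n1, sum(b) / n2
    ss1 = sum((x - m1) ** 2 for x in a)
    ss2 = sum((x - m2) ** 2 for x in b)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                      # combined variance
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))    # standard error of difference
    return (m1 - m2) / sed, df


control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]
t, df = t_statistic(control, treatment)
print(t, df)
```

For these samples |t| is 2 with 8 degrees of freedom, which is below the two-tailed critical value of 2.306 at α = 0.05, so the null hypothesis would not be rejected at that level.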

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two tailed α: 0.2 0.1 0.05 0.02 0.01 0.002 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mappings, from Artificial Vocal Tract through Phoneme Generator, Finger Spelling and Syllable Generator to Word Generator. Systems marked on the figure: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec), with marks at 10-30, 100, 130, 200, 500.]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X,Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: Preprocessor, V/C Decision Network, Vowel Network, Consonant Network, Fixed Pitch Mapping, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (015)
  - Input: x,y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 01 error
• Testing
  - 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
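The plateau behaviour follows directly from dividing each Gaussian activation by the sum over all units. A minimal sketch under stated assumptions: Gaussian units with a shared width, two made-up centres in the x,y plane, and a hypothetical function name `normalized_rbf`; it is not the Glove-TalkII code.

```python
import math


def normalized_rbf(x, centres, width):
    """Activations of Gaussian units at `centres`, normalized to sum to 1.

    Assumes x is close enough to some centre that the activations do not
    all underflow to zero.
    """
    acts = [math.exp(-math.dist(x, c) ** 2 / (2 * width ** 2)) for c in centres]
    total = sum(acts)
    return [a / total for a in acts]


centres = [(0.0, 0.0), (1.0, 0.0)]
# Near the first centre the closest unit dominates (the plateau).
print(normalized_rbf((0.1, 0.0), centres, width=0.15))
# Halfway between the centres the two units share the output equally.
print(normalized_rbf((0.5, 0.0), centres, width=0.15))
```

A network output would then be a weighted sum of these normalized activations, so a narrow width gives wide plateaus with sharp transitions and a broad width gives smoother interpolation between centres.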

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 005 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user

Tuesday 22 October 13


• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13

Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

New Interfaces for Musical Expression (NIME)

20

Motivation and Overview

First organized as a workshop of ACM CHI'2001
Experience Music Project - Seattle, April 2001
Lectures, Discussions, Demos, Performances

Tuesday 22 October 13

ICASSP 2013 tutorial

Research on HCIMusic

21

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

Sensors + signal processing
Actuators + signal processing
HCI

Engineering and programming
Music Fun and Effort

Effort and pain

If you are lucky

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
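One common way to turn a resistance change into a voltage is a two-resistor voltage divider followed by an analog-to-digital converter. The sketch below assumes that circuit; the 10 kΩ fixed resistor, 5 V supply and 10-bit converter are illustrative values, not figures from the tutorial.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_supply=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to the nearest integer code of an ideal ADC."""
    levels = (1 << bits) - 1                   # 1023 for a 10-bit converter
    clamped = min(max(voltage, 0.0), v_ref)    # inputs outside the range saturate
    return round(clamped / v_ref * levels)

# A sensor whose resistance drops under pressure gives a rising voltage.
light_touch = divider_voltage(100_000.0)   # high resistance, low voltage
firm_press = divider_voltage(1_000.0)      # low resistance, high voltage
```

The fixed resistor sets where the output is most sensitive, so it is usually chosen close to the middle of the sensor's resistance range.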

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When conductor (such as finger) applied to layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels are possible (for example, microphone arrays).]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D/3D/ND + time

50

Sensors and Actuators

Time Time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to take their spatial topology also into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
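As a concrete instance of a digital filter operating on samples, here is a minimal sketch of a one-pole low-pass filter, which is often enough for smoothing sensor data. The function name and the smoothing coefficient `alpha` are assumptions chosen for illustration.

```python
def one_pole_lowpass(samples, alpha=0.1):
    """Smooth a sequence with y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    output = []
    state = 0.0                       # filter memory, starts at rest
    for x in samples:
        state += alpha * (x - state)  # move a fraction alpha towards the input
        output.append(state)
    return output

# A constant (zero-frequency) input passes through almost unchanged ...
steady = one_pole_lowpass([1.0] * 200)
# ... while the fastest possible oscillation is strongly attenuated.
jitter = one_pole_lowpass([1.0 if n % 2 == 0 else -1.0 for n in range(200)])
```

Smaller values of `alpha` give heavier smoothing at the cost of a slower response, which matters for the perceived latency of an instrument.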

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
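The variants listed on this slide can be combined in one small routine. The sketch below assumes a fixed threshold, a strict local-maximum test, a minimum spacing between peaks and parabolic interpolation of the three samples around each maximum; the function name and parameters are hypothetical.

```python
def find_peaks(x, threshold=0.0, min_spacing=1):
    """Return (location, amplitude) pairs for the peaks in sequence x."""
    peaks = []
    for i in range(1, len(x) - 1):
        if x[i] < threshold:
            continue                                  # fixed threshold
        if not (x[i] > x[i - 1] and x[i] >= x[i + 1]):
            continue                                  # local maximum test
        # Parabolic interpolation through the three samples around i.
        denom = x[i - 1] - 2.0 * x[i] + x[i + 1]      # negative at a maximum
        offset = 0.5 * (x[i - 1] - x[i + 1]) / denom
        location = i + offset
        amplitude = x[i] - 0.25 * (x[i - 1] - x[i + 1]) * offset
        if peaks and location - peaks[-1][0] < min_spacing:
            # Spacing constraint: keep only the stronger of two close peaks.
            if amplitude > peaks[-1][1]:
                peaks[-1] = (location, amplitude)
            continue
        peaks.append((location, amplitude))
    return peaks

signal = [0, 1, 3, 1, 0, 0, 2, 5, 2, 0]
all_peaks = find_peaks(signal, threshold=0.5, min_spacing=2)
strong_only = find_peaks(signal, threshold=4.0)
skewed = find_peaks([0, 2, 3, 1, 0])
```

An adaptive threshold would replace the constant `threshold` with a value computed from recent signal statistics, for example a running median.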

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes
FFT: the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
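The trade-off between the two window sizes shown on this slide follows from simple arithmetic: the spacing of the frequency bins is the sample rate divided by the window size, and the duration of one analysis window is the window size divided by the sample rate. A minimal sketch, with a hypothetical helper name:

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    bin_spacing_hz = sample_rate / window_size
    window_seconds = window_size / sample_rate
    return bin_spacing_hz, window_seconds

short_bins, short_time = stft_resolution(256, 22050)    # coarse frequency, fine time
long_bins, long_time = stft_resolution(4096, 22050)     # fine frequency, coarse time
```

With 256 samples each bin is about 86 Hz wide and a window lasts about 12 ms; with 4096 samples the bins narrow to about 5.4 Hz but each window spans about 186 ms.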

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
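For sensor data the linear variant is usually the first thing to try, for example to bring irregularly timestamped readings onto a uniform grid. The sketch below assumes strictly increasing input timestamps and ascending output times; the function name is hypothetical.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times."""
    out = []
    j = 0                                   # index of the segment's left sample
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])           # hold the first value before the start
            continue
        if t >= times[-1]:
            out.append(values[-1])          # hold the last value after the end
            continue
        while times[j + 1] < t:             # advance to the segment containing t
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

# Readings arriving at irregular times, resampled at chosen instants.
resampled = resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0],
                            [0.0, 0.5, 2.0, 3.0, 4.0])
```

Linear interpolation is not band-limited, so it is a quality/cost compromise in the sense of the last variant listed above.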

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  – Power law functions frequently used
• Frequently used in conjunction with calibration
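Calibration and scaling are often chained: first learn the range a sensor actually produces, then map the normalized reading onto the control parameter. The sketch below assumes min/max range calibration and a power-law mapping; the class name, function name, exponent and frequency range are illustrative choices only.

```python
class RangeCalibrator:
    """Track the minimum and maximum reading seen during a calibration pass."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def observe(self, reading):
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def normalize(self, reading):
        """Map a reading to [0, 1], clamping values outside the seen range."""
        if self.lo is None or self.hi == self.lo:
            return 0.0
        u = (reading - self.lo) / (self.hi - self.lo)
        return min(max(u, 0.0), 1.0)

def scale_power(u, out_min, out_max, exponent=2.0):
    """Map u in [0, 1] to [out_min, out_max] along a power-law curve."""
    return out_min + (out_max - out_min) * (u ** exponent)

knob = RangeCalibrator()
for raw in (200, 800, 512):           # readings gathered while sweeping the knob
    knob.observe(raw)
frequency_hz = scale_power(knob.normalize(500), 100.0, 1100.0)
```

An exponent above 1 gives finer control at the low end of the output range, which is one way to account for the difference between the sensor's range and the perceptually useful range.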

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
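The remark about powers of 2 refers to computing correlation through the FFT. The sketch below is one way to do it with the standard library only: the signal is zero-padded to a power-of-2 length, transformed, the power spectrum is transformed back, and the first lags are read off. The recursive radix-2 FFT and the function names are illustrative and written for clarity, not speed.

```python
import cmath

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def autocorrelation(signal):
    """Linear autocorrelation for lags 0 .. len(signal) - 1 via the FFT."""
    n = len(signal)
    size = 1
    while size < 2 * n:                  # pad so circular wrap-around cannot occur
        size *= 2
    padded = list(signal) + [0.0] * (size - n)
    power = [abs(v) ** 2 for v in fft(padded)]
    # The power spectrum is real and symmetric, so its inverse transform is
    # the real part of a forward transform divided by the transform size.
    return [v.real / size for v in fft(power)[:n]]

wave = [1.0, 0.0, -1.0, 0.0] * 4          # period of 4 samples
acf = autocorrelation(wave)
period = max(range(1, len(wave)), key=lambda lag: acf[lag])
```

The strongest non-zero lag recovers the period of the test signal; cross-correlation works the same way with the product of one spectrum and the conjugate of the other.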

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

CFCF-130CF 10718

CF+130CF 10718

Mel-filtering → Log → DCT → MFCCs

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
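Energy compaction is easy to check numerically. The sketch below assumes the orthonormal DCT-II, computed directly from its definition, and a smooth ramp as the test signal; the function name is hypothetical and the direct O(N²) sum is for illustration only.

```python
import math

def dct2_orthonormal(x):
    """Orthonormal DCT-II computed directly from the definition."""
    n = len(x)
    coeffs = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs

ramp = [float(i) for i in range(8)]             # a smooth, slowly varying signal
coeffs = dct2_orthonormal(ramp)
energy = sum(c * c for c in coeffs)             # equals the signal energy
low_fraction = sum(c * c for c in coeffs[:2]) / energy
```

For this ramp the first two of eight coefficients hold more than 99% of the energy, which is why the high coefficients can be discarded with little loss.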

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
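K-nearest neighbours is the simplest of the algorithms listed to write down in full. The sketch below assumes Euclidean distance and a majority vote; the function name, the toy feature vectors and the labels are made up for illustration.

```python
import math
from collections import Counter

def knn_predict(training_set, query, k=3):
    """Predict the label of query from its k nearest training examples."""
    # training_set is a list of (feature_vector, label) pairs.
    nearest = sorted(training_set, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy two-dimensional feature vectors with ground-truth labels.
training_set = [
    ((0.0, 0.0), "soft"), ((0.0, 1.0), "soft"), ((1.0, 0.0), "soft"),
    ((5.0, 5.0), "loud"), ((5.0, 6.0), "loud"), ((6.0, 5.0), "loud"),
]
first = knn_predict(training_set, (0.5, 0.5))
second = knn_predict(training_set, (5.5, 5.5))
```

There is no training step at all: the labelled examples are the model, which makes the method convenient for quick prototypes with small gesture vocabularies.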

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (Cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (Additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
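Ordinary least squares with a single feature and the coefficient of determination both fit in a few lines. This is a minimal sketch with hypothetical function names and made-up data, using the closed-form solution for a straight line.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0]
noisy = [1.1, 2.9, 5.2, 6.8]                    # roughly y = 2x + 1 with noise
slope, intercept = fit_line(xs, noisy)
fit_quality = r_squared(noisy, [slope * x + intercept for x in xs])
```

An R squared close to 1 means the line explains most of the variance; it says nothing by itself about statistical significance, which is what the tests on the slide address.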

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011, A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with Feature Extraction, Dimensionality Reduction, Feature Selection and Classification stages.]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done with some delay at the same time as the sensor data

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history – typically just previous time slice)
• Transition Model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and

99
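The filtering task can be made concrete for a model with one discrete hidden variable. The sketch below is a plain forward (filtering) recursion: predict with the transition model, weight by the likelihood of the new observation, then normalize. The two-state model and all of its probabilities are made-up illustration values.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered state distribution after each observation.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the observation likelihood and renormalize.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history

prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]     # states tend to persist
emission = [[0.9, 0.1], [0.2, 0.8]]       # observation 0 is typical of state 0
beliefs = forward_filter(prior, transition, emission, [0, 0])
```

Two consecutive observations of symbol 0 raise the probability of state 0 from 0.5 to about 0.82 and then to about 0.88; prediction is the same recursion with the update step left out.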

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

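[Figure: hidden Markov model unrolled over time steps 1 to 4. The hidden state is a single discrete variable; transition probabilities P(state t | state t-1) link successive hidden states, and emission probabilities p(observation t | state t) link each hidden state to its observation. Together the transition and emission probabilities form the model.]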

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
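The forward step is easiest to see in one dimension, where the mean and covariance are plain numbers. The sketch below assumes the simplest possible model, a nearly constant state (random walk) rather than the constant-velocity model mentioned above, with made-up noise variances; the function name is hypothetical.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a slowly drifting state; returns the estimates."""
    x, p = x0, p0                      # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction: the state is assumed to stay put, uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement according to the gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

estimates = kalman_1d([5.0] * 50, process_var=1e-4, meas_var=0.1)
```

The gain expresses the trade-off named in the last bullet: a large measurement variance keeps the gain small so the model is trusted, while a large process variance pushes the estimate towards each new measurement.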

102

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful: very early in development, build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: two panels, "The Wizard" and "What the user sees".]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One and two-tailed variants for p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distributions of the sample means X1 and X2 with the critical value t(p,df) marked, showing two regions for rejecting the null hypothesis (two-tailed) or one region (one-tailed).]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
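The three steps can be written out for the independent, equal-variance case. The formulas on the original slide are not recoverable from this transcript, so the sketch below uses the standard pooled-variance form as an assumption; the function name and the sample data are made up.

```python
import math

def independent_t(sample_a, sample_b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    na, nb = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / na
    mean_b = sum(sample_b) / nb
    var_a = sum((x - mean_a) ** 2 for x in sample_a) / (na - 1)
    var_b = sum((x - mean_b) ** 2 for x in sample_b) / (nb - 1)
    df = na + nb - 2
    # Step 1: combined (pooled) variance of the two samples.
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / df
    # Step 2: standard error of the difference between the means.
    sed = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    # Step 3: the t statistic.
    t = (mean_a - mean_b) / sed
    return t, df

control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [3.0, 4.0, 5.0, 6.0, 7.0]
t_value, df = independent_t(control, treatment)
```

Here t is -2.0 with 8 degrees of freedom. The two-tailed critical value for α = 0.05 and df = 8 is 2.306, so with these made-up numbers the null hypothesis would not be rejected at that level.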

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, comparing ratio to table to determine significance
• Variants: One-way ANOVA, factorial ANOVA
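For the one-way case, "dividing two variances" means dividing the between-group mean square by the within-group mean square. A minimal sketch with a hypothetical function name and made-up groups:

```python
def one_way_anova_f(groups):
    """Return (F ratio, between-groups df, within-groups df)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_ratio, df_between, df_within = one_way_anova_f(groups)
# Rescaling and shifting every observation leaves the ratio unchanged.
rescaled = [[10.0 * x + 7.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
```

The F ratio is then compared against the critical value for (df_between, df_within) from a table or a statistics package, in the same way as t is compared with t(p,df).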

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I: Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (msec)]
  Artificial Vocal Tract    10-30
  Phoneme Generator         100
  Finger Spelling           130
  Syllable Generator        200
  Word Generator            500
Systems cited along the spectrum: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System

[Figure: system block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech.]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0% error
  – Testing
    • 1380 consonants, 234 vowels
    • 0% error


GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1% error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
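A minimal sketch of a normalized RBF layer, to make the "closest RBF dominates" behaviour concrete. This is not the Glove-TalkII code: the function names, the two example centres, and the output weights are hypothetical; only the fixed width of 0.15 is taken from the vowel network slide.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations of the hidden units, normalized to sum to 1."""
    # Exponent of each Gaussian: -squared distance / (2 * width^2)
    exponents = [
        -sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2 * width ** 2)
        for c in centres
    ]
    # Subtract the largest exponent before exponentiating so that inputs far
    # from every centre do not underflow to an all-zero activation vector
    top = max(exponents)
    acts = [math.exp(e - top) for e in exponents]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, out_weights):
    """Weighted sum of normalized activations; out_weights[j] is unit j's output vector."""
    h = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(h[j] * out_weights[j][k] for j in range(len(h))) for k in range(n_out)]

centres = [(0.0, 0.0), (1.0, 0.0)]
print(normalized_rbf((0.0, 0.0), centres, 0.15))   # first unit dominates
print(normalized_rbf((0.5, 0.0), centres, 0.15))   # halfway: equal activations
print(rbf_output((0.5, 0.0), centres, 0.15, [[0.0], [10.0]]))
```

With a small width the normalized activations form plateaus around each centre and switch quickly between them; a larger width gives smoother interpolation.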


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders

Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Research on HCI/Music

Tutorial objectives
• Broad overview of areas relevant to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds who want to enter this area
• Make connections between the individual topics using new music

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

[Figure annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

Transduction and Digitizing

Electrical
  – Voltage
  – Resistance
  – Impedance
Optical
  – Color
  – Intensity
Magnetic
  – Induced Current
  – Field Direction

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
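The circuit diagram for this slide was not extracted. A common way to convert a resistance change to a voltage is a voltage divider feeding an analog-to-digital converter; the sketch below assumes that arrangement. The 5 V supply, 10 kΩ fixed resistor and 10-bit converter are illustrative values, not taken from the slides.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)

def adc_counts(v_out, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code, clamped to the converter range."""
    levels = 2 ** bits
    code = int(v_out / v_ref * levels)
    return max(0, min(levels - 1, code))

# Sensor resistance falls from 100 kOhm (no pressure) to 1 kOhm (hard press)
for r_sensor in (100_000, 10_000, 1_000):
    v = divider_voltage(5.0, 10_000, r_sensor)
    print(r_sensor, round(v, 3), adc_counts(v))
```

Choosing the fixed resistor near the middle of the sensor's resistance range gives the largest usable voltage swing.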

Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series

[Figure: asynchronous events and synchronous samples along a time axis; multiple channels (for example microphone arrays)]

2D, 3D, ND + time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
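One possible combination of the variants listed above, as a minimal sketch: a fixed threshold, a local-maximum test, a spacing constraint, and parabolic interpolation of location and amplitude. The function name `find_peaks` and its parameters are illustrative; real detectors usually add adaptive thresholds and smoothing.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (location, amplitude) pairs for local maxima above a fixed threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        y0, y1, y2 = signal[i - 1], signal[i], signal[i + 1]
        # Local maximum: strictly above the left neighbour, not below the right
        if y1 < threshold or not (y1 > y0 and y1 >= y2):
            continue
        # Parabolic interpolation through the three samples around the maximum
        offset = 0.5 * (y0 - y2) / (y0 - 2 * y1 + y2)
        location = i + offset
        amplitude = y1 - 0.25 * (y0 - y2) * offset
        # Spacing constraint: of two peaks that are too close, keep the larger
        if peaks and location - peaks[-1][0] < min_spacing:
            if amplitude > peaks[-1][1]:
                peaks[-1] = (location, amplitude)
            continue
        peaks.append((location, amplitude))
    return peaks

print(find_peaks([0, 1, 0, 0, 3, 0, 0], threshold=0.5))
print(find_peaks([0, 1, 0, 3, 0], threshold=0.5, min_spacing=3))
```

The first call reports both peaks; the second keeps only the larger one because the two maxima are closer than `min_spacing` samples.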

Fourier Transform

[Figure: Spectrum]

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Spectrogram
• Time-Frequency Tradeoff
  – [Figure: spectrogram with 256-sample windows at 22050 Hz]
  – [Figure: spectrogram with 4096-sample windows at 22050 Hz]
• Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
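The tradeoff between the two window sizes can be worked out directly: a window of N samples at sample rate fs spans N/fs seconds and gives frequency bins spaced fs/N apart. The helper name `stft_resolution` is illustrative.

```python
def stft_resolution(window_size, sample_rate):
    """Window duration in milliseconds and FFT bin spacing in Hz."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz

for n in (256, 4096):
    duration_ms, bin_hz = stft_resolution(n, 22050)
    print(n, round(duration_ms, 2), "ms", round(bin_hz, 2), "Hz")
```

At 22050 Hz the 256-sample window covers about 11.61 ms with bins about 86.13 Hz apart, while the 4096-sample window covers about 185.76 ms with bins about 5.38 Hz apart: better frequency resolution costs time resolution.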

Wavelets
• STFT: fixed time-frequency resolution, based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
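For sensor data, the linear case can be sketched in a few lines. The function name `resample_linear` is illustrative; it assumes both time lists are sorted in ascending order and holds the end values outside the measured range.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate irregular (times, values) samples at new_times."""
    out = []
    j = 0
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])      # before the first sample: hold first value
            continue
        if t >= times[-1]:
            out.append(values[-1])     # after the last sample: hold last value
            continue
        # Advance to the segment [times[j], times[j + 1]] that contains t
        while times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

# Irregularly timed sensor readings resampled onto a uniform grid
print(resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], [0.0, 0.5, 1.0, 2.0, 3.0]))
```

This is the kind of step needed when sensor streams arrive at different or irregular rates and must be aligned before further processing.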

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power-law functions frequently used
• Frequently used in conjunction with calibration
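Calibration and scaling together, as a minimal sketch: the observed minimum and maximum of a sensor define its range, and readings are then mapped to a control parameter through an optional power law. The function names, the readings and the 100-1100 Hz target range are made up for illustration.

```python
def calibrate(readings):
    """Find the observed range (minimum and maximum) of a sensor."""
    return min(readings), max(readings)

def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from [lo, hi] to [out_min, out_max] with a power-law curve."""
    norm = (reading - lo) / (hi - lo)
    norm = max(0.0, min(1.0, norm))        # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent

lo, hi = calibrate([112, 300, 912, 512])    # readings gathered while exercising the sensor
print(scale(512, lo, hi, 0.0, 1.0))                        # linear slider -> volume
print(scale(512, lo, hi, 100.0, 1100.0, exponent=2.0))     # power-law knob -> frequency in Hz
```

An exponent of 1 gives a linear mapping; exponents above 1 give finer control at the low end of the output range.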

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation-based

Autocorrelation and Cross-Correlation
• Efficient computation when N is a power of 2
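The equations on this slide were not extracted. A minimal sketch of correlation-based periodicity detection follows, using the direct time-domain sum rather than the FFT-based computation the slide alludes to; the function names and the test signal are illustrative.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag (direct sum)."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    return [
        sum(centred[i] * centred[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]

def estimate_period(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] with the largest autocorrelation value."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A pulse every 4 samples, e.g. onsets of a repeated strumming gesture
pulses = [1, 0, 0, 0] * 8
print(estimate_period(pulses, min_lag=2, max_lag=10))
```

The direct sum costs on the order of N operations per lag; for long signals the same values are usually obtained from the inverse FFT of the power spectrum.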


Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients
• Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
• [Figure: filter centred at CF, with edges at CF-130 and CF+130 (linear spacing) or CF/1.0718 and CF×1.0718 (log spacing)]
• Pipeline: Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
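Energy compaction can be seen with a direct, unnormalized DCT-II, the variant commonly used in the final MFCC step. The function name `dct2` is illustrative, and practical code would use a fast library routine rather than this O(N²) sum.

```python
import math

def dct2(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * k * (2i + 1) / (2N))."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i in range(n))
        for k in range(n)
    ]

# A smooth (slowly varying) input concentrates its energy in the low coefficients
coeffs = dct2([1.0, 2.0, 3.0, 4.0])
print([round(c, 4) for c in coeffs])
```

For the ramp input, the first two coefficients carry almost all of the energy, which is why keeping only the low-order coefficients loses little.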

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
• [Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
• [Figure: PCA = eigenanalysis of correlation matrix, yielding the projection matrix]

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
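The non-parametric case is the easiest to sketch: a k-nearest-neighbour classifier stores the training set and takes a majority vote among the closest examples. The function name, the two-dimensional features and the "soft"/"hard" labels are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training vectors closest to the query."""
    # Sort by Euclidean distance to the query and keep the k nearest
    nearest = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Toy training set: two features per drum stroke, two stroke classes
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["soft", "soft", "soft", "hard", "hard", "hard"]
print(knn_predict(train_x, train_y, (0.5, 0.5)))
print(knn_predict(train_x, train_y, (5.5, 5.5)))
```

There is no training step beyond storing the data, but every prediction scans the whole training set.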

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
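A minimal k-means sketch: alternate between assigning points to the nearest centroid and moving each centroid to the mean of its members. The function name is illustrative, and seeding the centroids with the first k points is a simplification chosen here to keep the example deterministic; practical variants use random or k-means++ initialization.

```python
import math

def kmeans(points, k, iterations=20):
    """Basic k-means; returns the final centroids and a cluster label per point."""
    centroids = [tuple(p) for p in points[:k]]   # naive initialization (assumption)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        centroids = [
            tuple(sum(dim) / len(members) for dim in zip(*members)) if members else centroids[c]
            for c, members in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    return centroids, labels

points = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

The number of clusters K has to be supplied, which matches the formulation above where K comes from domain knowledge.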

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
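Ordinary least squares for a single feature, together with the coefficient of determination, can be sketched as follows. The function names and the data are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs, ys = [0, 1, 2, 3], [1, 2, 4, 4]
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predictions))
```

An R squared of 1 means the line explains all of the variance in the targets; values near 0 mean it does little better than predicting the mean.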

Surrogate Sensors
• Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
• Early and late fusion are the extremes of a full spectrum of possibilities
• [Figure: two parallel pipelines, each with Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Multi-modal Results
• Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

Dynamic Time Warping

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading and update models
  – Updated models provide new estimates and
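The filtering task for a discrete hidden state can be sketched as a recursive predict-then-update loop (the forward algorithm). The function name and the two-state model below are illustrative; the numbers are a textbook-style toy example, not data from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state after each observation (filtering).

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    belief = list(prior)
    n = len(prior)
    for obs in observations:
        # Prediction: push the belief through the transition model
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize
        updated = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(updated)
        belief = [u / total for u in updated]
    return belief

prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
print(forward_filter(prior, transition, emission, [0]))
print(forward_filter(prior, transition, emission, [0, 0]))
```

Only the current belief is carried forward, so the cost per time step is constant, which suits real-time use. The normalizer `total` at each step is the likelihood of that observation given the earlier ones.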

Hidden Markov Models I

[Figure: HMM with a hidden state (a single discrete variable) and observations. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t).]

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a·y + b, σ)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
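The predict/update cycle can be sketched in one dimension. This sketch assumes a random-walk state model (the state persists, with added process noise) rather than the constant dx/dt model mentioned on the slide; the function name, parameters and readings are illustrative.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0                  # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction step: state carries over, uncertainty grows by the process noise
        p = p + process_var
        # Update step: the gain balances model reliability against observation reliability
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1 - gain) * p
        estimates.append(x)
    return estimates

noisy = [1.1, 0.9, 1.2, 0.8, 1.0]
print(kalman_1d(noisy, process_var=0.01, meas_var=0.5))
```

A large measurement variance makes the filter trust its model and smooth heavily; a small one makes the estimate follow the measurements closely.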

Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time: both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers: taught to “interfere” as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

Tuesday 22 October 13

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: two panels labelled “What the user sees” and “The Wizard”]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distributions with sample means X1 and X2 and the critical value t(p,df) marked; one panel is labelled “regions for rejecting the null hypothesis”, the other “region for rejecting the null hypothesis”]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
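The slide's formulas did not survive extraction, so the sketch below uses the textbook pooled-variance form of the independent two-sample t-test as an assumption. The function name `pooled_t` is hypothetical; the steps follow the three bullets above (combined variance, standard error of the difference, t) plus the df computation.

```python
import math


def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled (combined) variance."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    # Sums of squared deviations from each sample mean.
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2                      # degrees of freedom
    combined_var = (ss1 + ss2) / df       # combined variance
    sed = math.sqrt(combined_var * (1.0 / n1 + 1.0 / n2))  # standard error of difference
    t = (mean1 - mean2) / sed
    return t, df


if __name__ == "__main__":
    t, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
    print(t, df)
```

The returned `t` and `df` are what would be compared against the critical value t(p,df).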

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

[Table of critical t values; only the column header survives]
two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
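The "dividing two variances" step can be made concrete with a small one-way ANOVA sketch. The function name `one_way_anova_f` is an illustrative assumption; the F ratio it returns would then be compared with a table (or converted to a p-value) for the two degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation explained by differences between group means.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Residual variation inside each group.
    ss_within = sum(sum((x - m) ** 2 for x in g)
                    for g, m in zip(groups, group_means))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


if __name__ == "__main__":
    data = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print(one_way_anova_f(data))
    # Rescaling and shifting every value leaves F unchanged.
    scaled = [[10 * x + 5 for x in g] for g in data]
    print(one_way_anova_f(scaled))
```

The second call illustrates the first bullet: scaling and bias change both variances by the same factor, so their ratio is unaffected.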

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13

ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mapping types, from Artificial Vocal Tract through Phoneme Generator, Finger Spelling and Syllable Generator to Word Generator. Systems placed along it: Von Kempelen (1790), Bell & Bell (1880), Dudley et al (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches and foot pedal. Processing blocks: preprocessor, V/C decision network, vowel network, consonant network, fixed pitch mapping, fixed stop mapping, combining function. Output: synthesizer producing speech]

Tuesday 22 October 13

ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13

137

Glove-TalkII System

[Figure: the Glove-TalkII system block diagram, repeated from the earlier slide]

Tuesday 22 October 13

ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: xy values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
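The behaviour described above (interpolation between centres, plateaus where the closest unit dominates) can be reproduced with a small normalized Gaussian RBF layer. This is a sketch under assumptions, not the Glove-TalkII code: the function name `normalized_rbf` and the example centres are made up, and only the width value 0.15 is borrowed from the vowel network slide.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian RBF activations, normalized to sum to 1 over all units."""
    sq_dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtract the smallest distance before exponentiating so that the
    # nearest unit always has activation 1 and the sum never underflows to 0.
    nearest = min(sq_dists)
    acts = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in sq_dists]
    total = sum(acts)
    return [a / total for a in acts]


if __name__ == "__main__":
    centres = [(0.0, 0.0), (1.0, 0.0)]
    print(normalized_rbf((0.0, 0.0), centres, 0.15))    # first unit dominates
    print(normalized_rbf((0.5, 0.0), centres, 0.15))    # halfway: equal shares
    print(normalized_rbf((100.0, 0.0), centres, 0.15))  # far away: nearest still wins
```

Because of the normalization, an input far from every centre still activates the closest unit fully, which is the plateau effect named on the slide.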

Tuesday 22 October 13

141

Glove-TalkII System

[Figure: the Glove-TalkII system block diagram, repeated from the earlier slide]

Tuesday 22 October 13

ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

[Figure: the Glove-TalkII system block diagram, repeated from the earlier slide]

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13

DIVA in the hands of

145

Tuesday 22 October 13

Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13

Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13

Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Tutorial objectives
• Broad overview of relevant areas to the design and development of multi-modal user interfaces
• Provide pointers and starting concepts for researchers from more traditional DSP backgrounds that want to enter this area
• Make connections between the individual topics using new music

22

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

23

Tuesday 22 October 13

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

24

Sensors and Actuators

[Slide annotations next to the steps: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical: Voltage, Resistance, Impedance
Optical: Color, Intensity
Magnetic: Induced Current, Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
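A common way to turn a resistance change into a voltage is a voltage divider followed by an analog-to-digital converter. The slide's circuit image is not recoverable, so the sketch below is an assumption: the sensor sits on the supply side, a fixed resistor goes to ground, and the 5 V, 10-bit converter and the function names are illustrative.

```python
def divider_voltage(v_in, r_fixed, r_sensor):
    """Voltage across the fixed resistor of a sensor/fixed-resistor divider."""
    return v_in * r_fixed / (r_fixed + r_sensor)


def adc_code(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage into an integer ADC code, clipped to the valid range."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))


if __name__ == "__main__":
    # As the sensor resistance drops (e.g. more force on an FSR), the voltage rises.
    for r_sensor in (100_000, 10_000, 1_000):
        v = divider_voltage(5.0, 10_000, r_sensor)
        print(r_sensor, round(v, 3), adc_code(v))
```

Choosing `r_fixed` close to the middle of the sensor's resistance range spreads the readings over more of the converter's codes.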

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples, each drawn against a time axis; multiple channels (for example microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

[Figure: two panels, each with a Time axis]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
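Three of the variants listed above (fixed threshold, local maximum, spacing constraint) combine into a short peak picker. This is one possible sketch rather than a standard algorithm; the function name `find_peaks` and its parameters are illustrative, and adaptive thresholds and interpolation are left out.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima above a fixed threshold.

    Peaks closer than min_spacing samples to the previously accepted peak
    are merged, keeping whichever of the two is larger.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the last peak: keep only the stronger one.
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks


if __name__ == "__main__":
    envelope = [0, 1, 0, 0.2, 0, 3, 2.5, 2.9, 0]
    print(find_peaks(envelope, threshold=0.5, min_spacing=3))
```

For onset detection on a sensor envelope, `min_spacing` plays the role of a minimum time between two strikes.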

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

[Figure: two spectrograms, one with 256-sample windows and one with 4096-sample windows, both at 22050 Hz]

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution, based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
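For sensor streams the linear case mentioned in the last bullet is often enough. The sketch below resamples irregularly timed readings onto new time points; the function name `resample_linear` is hypothetical, and the input times are assumed to be strictly increasing.

```python
from bisect import bisect_right


def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each point in new_times.

    Points outside the measured range are clamped to the first/last value.
    """
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            i = bisect_right(times, t) - 1   # times[i] <= t < times[i + 1]
            frac = (t - times[i]) / (times[i + 1] - times[i])
            out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out


if __name__ == "__main__":
    # Sensor events arriving at uneven times, resampled for a fixed-rate stage.
    print(resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], [-1.0, 0.5, 2.0, 3.5]))
```

This is also the simplest way to bring two sensor streams with different rates onto a common clock before combining them.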

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  – Power law functions frequently used
• Frequently used in conjunction with calibration
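Calibration and scaling are often chained: first find the sensor's observed range, then map a reading into the control range, optionally through a power law. The sketch below assumes this arrangement; the function names, the example readings and the exponent value are illustrative only.

```python
def calibrate(readings):
    """Range calibration: the observed minimum and maximum of a sensor."""
    return min(readings), max(readings)


def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a reading from [in_min, in_max] to [out_min, out_max].

    exponent = 1 gives a linear map; other values give a power-law curve.
    Readings outside the calibrated range are clamped.
    """
    if in_max == in_min:
        raise ValueError("calibration range is empty")
    norm = (reading - in_min) / (in_max - in_min)
    norm = max(0.0, min(1.0, norm))
    return out_min + (norm ** exponent) * (out_max - out_min)


if __name__ == "__main__":
    lo, hi = calibrate([102, 511, 930, 300])    # e.g. raw knob readings
    # Map the knob to a frequency range with a squared response.
    for raw in (102, 516, 930):
        print(raw, scale(raw, lo, hi, 20.0, 20000.0, exponent=2.0))
```

An exponent above 1 gives finer control at the low end of the range, which is one reason power laws are common for frequency and loudness controls.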

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
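The slide's formula is not recoverable, so the sketch below uses the plain time-domain definition of autocorrelation and picks the lag with the largest value as the period estimate. It is the direct O(N × lags) form, not the FFT-based computation the slide alludes to; the function names are hypothetical.

```python
import math


def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    return [sum(centred[i] * centred[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(signal, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in [min_lag, max_lag]."""
    ac = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: ac[lag])


if __name__ == "__main__":
    # A sinusoid that repeats every 20 samples.
    wave = [math.sin(2 * math.pi * i / 20) for i in range(200)]
    print(estimate_period(wave, min_lag=5, max_lag=30))
```

Restricting the search to a plausible lag range matters in practice, since lag 0 is always the global maximum and multiples of the true period also score highly.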

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

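[Figure: mel-scale filterbank with 13 linearly-spaced filters (each centred at CF, spanning CF-130 to CF+130) and 27 log-spaced filters (spanning CF/1.0718 to CF×1.0718). Processing chain: Mel-filtering, Log, DCT, MFCCs]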

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
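Energy compaction is easy to see on a smooth signal. The sketch below implements the orthonormal DCT-II directly from its definition (an O(N²) loop, chosen for clarity rather than speed); the function name `dct2` and the ramp test signal are illustrative assumptions.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a sequence, computed from the definition."""
    n = len(x)
    coeffs = []
    for k in range(n):
        total = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        # Orthonormal scaling keeps the signal energy unchanged (Parseval).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs


if __name__ == "__main__":
    ramp = [float(i) for i in range(8)]
    c = dct2(ramp)
    energy = sum(v * v for v in c)
    low = sum(v * v for v in c[:2])
    print([round(v, 3) for v in c])
    print("fraction of energy in first 2 coefficients:", low / energy)
```

For the ramp nearly all of the energy lands in the first two coefficients, which is why truncating high coefficients loses little.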

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
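The non-parametric case is the shortest to write down: a k-nearest-neighbours classifier stores the training set and votes among the closest examples. The sketch below is a minimal version; the function name `knn_predict`, the two-dimensional features and the "soft"/"loud" labels are made up for illustration.

```python
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbours."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    # Indices of the training vectors, sorted by distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical 2-D feature vectors with ground truth labels.
    features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    labels = ["soft", "soft", "soft", "loud", "loud", "loud"]
    print(knn_predict(features, labels, (0.5, 0.5)))
    print(knn_predict(features, labels, (5.5, 5.5)))
```

There is no training step at all, which makes it a convenient baseline before trying the parametric models listed above.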

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
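K-means alternates between assigning each point to its nearest centroid and moving each centroid to the mean of its points. The sketch below takes the initial centroids as an argument so that the result is deterministic; the function name `kmeans` and the toy data are assumptions for illustration.

```python
def kmeans(points, centroids, iterations=20):
    """Basic k-means. Returns (centroids, labels) after convergence or iterations."""
    def nearest(p, cents):
        return min(range(len(cents)),
                   key=lambda j: sum((a - b) ** 2 for a, b in zip(p, cents[j])))

    centroids = [tuple(c) for c in centroids]
    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = []
        for old, members in zip(centroids, clusters):
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(old)      # keep an empty cluster's centroid in place
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans(data, centroids=[(0, 0), (10, 10)]))
```

The outcome depends on the starting centroids, so practical variants run several initializations and keep the best partition.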

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
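A small sketch may help connect ordinary least squares with the goodness-of-fit measure above. It uses a single feature (humidity) to predict temperature; the numbers are invented for illustration, and `fit_line` and `r_squared` are hypothetical helper names, not part of any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept with one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Invented measurements: relative humidity (%) and temperature (degrees C).
humidity = [30.0, 40.0, 50.0, 60.0, 70.0]
temperature = [25.0, 23.5, 21.0, 19.5, 17.0]

slope, intercept = fit_line(humidity, temperature)
predicted = [slope * h + intercept for h in humidity]
fit_quality = r_squared(temperature, predicted)
```

An R squared close to 1 means the line explains most of the variance in the invented data; it says nothing about statistical significance, which is what the F-test and t-tests on the slide address.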

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies: Regression, Classification

Some Results

Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation of the input, the output, or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion

Early and late fusion are the extremes of a full spectrum of possibilities

[Figure: two parallel chains of Feature Extraction, Dimensionality Reduction, Feature Selection and Classification]

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
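The distinction between causal and non-causal processing can be made concrete with two smoothing filters. This is an illustrative sketch only: the class and function names and the five-sample stream are assumptions, not material from the tutorial.

```python
from collections import deque

class CausalMovingAverage:
    """Running mean over the current and previous samples only (no look-ahead)."""

    def __init__(self, length):
        self.window = deque(maxlen=length)  # oldest sample drops out automatically

    def process(self, sample):
        # Can be called once per incoming sample, as the sensor data arrives.
        self.window.append(sample)
        return sum(self.window) / len(self.window)

def centered_moving_average(samples, half_width):
    """Non-causal: output i uses up to half_width samples from the future."""
    out = []
    for i in range(len(samples)):
        lo = max(0, i - half_width)
        hi = min(len(samples), i + half_width + 1)
        out.append(sum(samples[lo:hi]) / (hi - lo))
    return out

# A single event at index 2 of an otherwise silent stream.
stream = [0.0, 0.0, 3.0, 0.0, 0.0]

smoother = CausalMovingAverage(length=3)
causal = [smoother.process(s) for s in stream]
non_causal = centered_moving_average(stream, half_width=1)
```

The centered filter already reacts at index 1, before the event at index 2 has happened, so it needs the whole recording (or a buffering delay). The causal filter only reacts from index 2 onward and can run sample by sample.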


Dynamic Time Warping
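The slide itself carries only a figure. As a hedged illustration of the basic idea, the sketch below computes the classic dynamic-programming DTW cost between two 1D sequences, using absolute difference as the local distance. The function name and the toy gesture data are assumptions; practical systems typically add path constraints and work on feature vectors.

```python
def dtw_distance(a, b):
    """Minimum cumulative |a[i] - b[j]| over all monotonic alignments of a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i items of a with the first j of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a, step in b, or step in both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# Hypothetical sensor traces: the same gesture at half speed, and an unrelated one.
gesture = [0.0, 1.0, 2.0, 1.0, 0.0]
slower = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 1.0, 0.0, 0.0]
different = [2.0, 2.0, 2.0, 2.0, 2.0]

same_shape_cost = dtw_distance(gesture, slower)
other_shape_cost = dtw_distance(gesture, different)
```

The slowed-down gesture aligns perfectly despite having twice as many samples, while the flat trace does not, which is why DTW is useful for comparing gestures performed at different tempi.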

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date (the same computation yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and

Hidden Markov Models I

[Figure: HMM with a hidden state (single discrete variable) at time steps 1 to 4 and the corresponding observations. MODEL: transition probabilities P(hidden_t | hidden_t-1) and emission probabilities p(observed_t | hidden_t)]
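The filtering task for a discrete HMM can be sketched in a few lines. The two states, the two observation symbols and all probabilities below are invented for illustration, and `forward_filter` is a hypothetical name. The prior is treated as the belief before the first transition, which is one common convention.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each time step t."""
    states = list(prior)
    belief = dict(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = {
            s: sum(belief[p] * transition[p][s] for p in states) for s in states
        }
        # Update: weight by the emission probability of what was observed, then normalize.
        weighted = {s: predicted[s] * emission[s][obs] for s in states}
        total = sum(weighted.values())
        belief = {s: w / total for s, w in weighted.items()}
        history.append(belief)
    return history

# Invented model: is the performer playing, given a loud/quiet sensor reading?
prior = {"playing": 0.5, "silent": 0.5}
transition = {
    "playing": {"playing": 0.7, "silent": 0.3},
    "silent": {"playing": 0.3, "silent": 0.7},
}
emission = {
    "playing": {"loud": 0.9, "quiet": 0.1},
    "silent": {"loud": 0.2, "quiet": 0.8},
}

history = forward_filter(prior, transition, emission, ["loud", "loud"])
```

After two "loud" readings the belief in "playing" rises from 0.5 to roughly 0.82 and then 0.88: each observation sharpens the estimate, while the transition model keeps it from jumping to certainty.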

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
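A one-dimensional sketch shows the prediction and update steps and the trade-off between trusting the model and trusting the observations. It assumes the simplest possible state model (the state stays constant apart from process noise); the function name, parameter names and numbers are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_mean=0.0, initial_var=1.0):
    """Scalar Kalman filter for a state assumed constant up to process noise."""
    mean, var = initial_mean, initial_var
    estimates = []
    for z in measurements:
        # Prediction step: the model says "no change", but uncertainty grows.
        var = var + process_var
        # Update step: the gain balances model uncertainty against sensor noise.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates

# A constant true value of 1.0 observed 50 times (noise omitted to keep the example deterministic).
estimates = kalman_1d([1.0] * 50, process_var=0.01, measurement_var=0.25)

# Same first measurement, seen through a very unreliable and a very reliable sensor.
distrusted = kalman_1d([1.0], process_var=0.01, measurement_var=100.0)[0][0]
trusted = kalman_1d([1.0], process_var=0.01, measurement_var=0.01)[0][0]
```

With a large `measurement_var` the estimate barely moves toward the measurement; with a small one it follows it almost completely. The pair (mean, variance) at one step is all that is needed to compute the pair at the next step.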

Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time: both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure panels: The Wizard; What the user sees]

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of a difference at least as large as the observed one arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution with sample means X1 and X2 and the critical value t(p, df), showing either two regions for rejecting the null hypothesis or a single region for rejecting the null hypothesis]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
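The formulas on this slide were images and did not survive extraction. The sketch below follows the three listed steps using the standard equal-variance, independent-samples form of the t statistic, which is an assumption about what the slide showed. The function name and the two small samples are invented for illustration.

```python
import math
import statistics

def independent_t(sample_a, sample_b):
    """t statistic and degrees of freedom for two independent samples (pooled variance)."""
    n1, n2 = len(sample_a), len(sample_b)
    df = n1 + n2 - 2
    # Combined (pooled) variance: the sample variances weighted by their own df.
    pooled_var = (
        (n1 - 1) * statistics.variance(sample_a)
        + (n2 - 1) * statistics.variance(sample_b)
    ) / df
    # Standard error of the difference between the two sample means.
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / sed
    return t, df

# Invented scores, e.g. task ratings with a new interface versus a control.
treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [1.0, 2.0, 3.0, 4.0, 5.0]

t_value, df = independent_t(treatment, control)
```

For these samples t is 4.0 with df = 8. The two-tailed critical value for α = 0.05 and df = 8 is 2.306, so the null hypothesis would be rejected at that level.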

comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

[Table: critical values of t; columns for two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
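The "dividing two variances" description can be sketched for the one-way case. The function name and the three small groups are assumptions for illustration. The resulting F ratio would then be compared with a table value (or converted to a p-value) for the two degrees of freedom.

```python
def one_way_anova(groups):
    """F ratio for one-way ANOVA: between-group variance over within-group variance."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Invented measurements for three conditions.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_between, df_within = one_way_anova(groups)

# Rescaling and shifting every observation leaves the F ratio unchanged.
rescaled = [[10.0 * x + 5.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova(rescaled)
```

The second call illustrates the first bullet: scaling and bias multiply both variances by the same factor, so their ratio, and hence the significance result, stays the same.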

Integration and

I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: mapping types arranged along an axis of approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990)

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
Input
• 26+ dimensions
• constrained subspace
Output
• 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System

[Figure: system block diagram. Right Hand Data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz). Other inputs: Foot Pedal, Contact Switches. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer, Speech Output]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0% error
  – Testing
    • 1380 consonants, 234 vowels
    • 0% error

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1% error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
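The behaviour described on these two slides can be sketched as follows. This is not the Glove-TalkII implementation: the function names, the two centres and the output weights are invented, and only the width of 0.15 is borrowed from the vowel network slide.

```python
import math

def normalized_rbf_activations(x, centres, width):
    """Gaussian activation per centre, normalized so the activations sum to 1."""
    exponents = [
        -sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2)
        for c in centres
    ]
    # Subtracting the largest exponent avoids underflow far from every centre
    # and does not change the normalized result.
    top = max(exponents)
    raw = [math.exp(e - top) for e in exponents]
    total = sum(raw)
    return [r / total for r in raw]

def normalized_rbf_output(x, centres, width, output_weights):
    """Output layer: activation-weighted blend of one weight vector per hidden unit."""
    acts = normalized_rbf_activations(x, centres, width)
    n_out = len(output_weights[0])
    return [sum(a * w[k] for a, w in zip(acts, output_weights)) for k in range(n_out)]

# Two hypothetical vowel centres in the x, y plane, each with its own output target.
centres = [(0.0, 0.0), (1.0, 0.0)]
output_weights = [[1.0, 0.0], [0.0, 1.0]]

near_first = normalized_rbf_activations((0.1, 0.0), centres, width=0.15)
midpoint = normalized_rbf_output((0.5, 0.0), centres, 0.15, output_weights)
```

Close to a centre the nearest unit takes almost all of the activation (the plateau), while halfway between two centres the output is an even blend of their targets (the interpolation).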

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05% error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1% error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders

Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies
• Summary

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat
• 1 and 2 often switched
• Tools to help with steps 1-4

[Side annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music Fun and Effort; Effort and pain; If you are lucky]

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more…

Transduction and Digitizing
Electrical
  Voltage
  Resistance
  Impedance
Optical
  Color
  Intensity
Magnetic
  Induced Current
  Field Direction

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
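A common way to turn a variable resistance into a voltage is a two-resistor voltage divider read by an analog-to-digital converter. The slide's circuit figure is not recoverable, so the sketch below is an assumption about the general approach rather than a description of it; the function names, the 5 V supply, the 10 kΩ fixed resistor and the 10-bit converter are illustrative choices.

```python
def divider_voltage(v_supply, r_fixed, r_sensor):
    """Voltage across the fixed resistor when it is in series with the sensor."""
    return v_supply * r_fixed / (r_fixed + r_sensor)

def adc_counts(voltage, v_ref, bits):
    """Ideal ADC: quantize a voltage in 0..v_ref into an integer code of the given width."""
    levels = 2 ** bits
    code = int(voltage / v_ref * levels)
    return max(0, min(levels - 1, code))  # clamp to the representable range

# Hypothetical force sensor whose resistance drops from 30 kOhm to 10 kOhm when pressed.
v_released = divider_voltage(5.0, 10_000, 30_000)
v_pressed = divider_voltage(5.0, 10_000, 10_000)

counts_released = adc_counts(v_released, v_ref=5.0, bits=10)
counts_pressed = adc_counts(v_pressed, v_ref=5.0, bits=10)
```

Pressing the sensor lowers its resistance, which raises the voltage across the fixed resistor and therefore the digitized reading. Note that the relationship between resistance and reading is not linear, which is one reason scaling and calibration matter later on.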

Physical Property Sensors
• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) touches the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

2D/3D/ND + time

[Figure: image and volume data evolving over time]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
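Three of the listed ingredients (fixed threshold, local maximum, spacing constraint) can be combined in a short sketch. The function name, the greedy strongest-first selection and the toy signal are assumptions for illustration; adaptive thresholds and peak interpolation are left out.

```python
def find_peaks(signal, threshold, min_spacing):
    """Local maxima above a fixed threshold, at least min_spacing samples apart."""
    candidates = [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]
    # Visit the strongest candidates first; drop any that crowd an already kept peak.
    kept = []
    for i in sorted(candidates, key=lambda idx: signal[idx], reverse=True):
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    # Output: peak locations and amplitudes, in time order.
    return sorted((i, signal[i]) for i in kept)

# Toy envelope: a weak bump (index 3) and a double hit (indices 5 and 7).
signal = [0, 1, 0, 0.2, 0, 3, 2.5, 2.8, 0, 0.9, 0]
peaks = find_peaks(signal, threshold=0.5, min_spacing=3)
```

The bump at index 3 is rejected by the threshold and the secondary maximum at index 7 by the spacing constraint, leaving three peaks. How the threshold and spacing are chosen is where most of the "secret sauce" lies.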

Fourier Transform

Spectrum

Short Time Fourier Transform

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Spectrogram

[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)

Wavelets

STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic

60

Signals and Features

Tuesday 22 October 13
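For sensor data the slide notes that linear interpolation is often enough. A minimal sketch, assuming irregularly timestamped readings that must be resampled onto a uniform clock; the function names and the hold-at-the-ends behaviour are illustrative choices.

```python
from bisect import bisect_right

def interpolate_linear(times, values, t):
    """Linearly interpolate the sample value at time t.

    times must be sorted in increasing order; outside the recorded
    range the nearest end value is held.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)  # times[i - 1] <= t < times[i]
    t0, t1 = times[i - 1], times[i]
    frac = (t - t0) / (t1 - t0)
    return values[i - 1] + frac * (values[i] - values[i - 1])

def resample_uniform(times, values, rate_hz):
    """Resample irregularly timed readings onto a uniform grid."""
    period = 1.0 / rate_hz
    count = int((times[-1] - times[0]) / period + 1e-9) + 1
    return [interpolate_linear(times, values, times[0] + n * period)
            for n in range(count)]
```

For audio-rate signals a band-limited (sinc-based) interpolator would be the more appropriate choice, as the slide indicates.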

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
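Calibration (finding a sensor's range) and scaling (mapping it to a control parameter) are often chained. The sketch below assumes a hypothetical `RangeCalibrator` class and `scale_power` function; the names and the clamping behaviour are illustrative, not taken from the tutorial.

```python
class RangeCalibrator:
    """Tracks the minimum and maximum reading seen during calibration."""

    def __init__(self):
        self.low = float("inf")
        self.high = float("-inf")

    def observe(self, reading):
        self.low = min(self.low, reading)
        self.high = max(self.high, reading)

    def normalize(self, reading):
        """Map a reading to [0, 1] using the calibrated range."""
        if self.high <= self.low:
            return 0.0  # not calibrated yet
        u = (reading - self.low) / (self.high - self.low)
        return min(1.0, max(0.0, u))

def scale_power(u, out_min, out_max, exponent=1.0):
    """Map normalized u in [0, 1] to [out_min, out_max] with a power law."""
    return out_min + (out_max - out_min) * (u ** exponent)

# Example: a slider read as raw values between 100 and 900,
# mapped to a volume in [0, 100] with a squared response.
calibrator = RangeCalibrator()
for raw in (100, 900, 500):
    calibrator.observe(raw)
volume = scale_power(calibrator.normalize(500), 0.0, 100.0, exponent=2.0)
```

An exponent above 1 gives finer control at the low end of the range, which is one reason power laws are common for volume-like parameters.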

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

Efficient computation when N is a power of 2
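Correlation-based periodicity detection can be sketched as follows. This is an illustrative example with assumed function names; it uses the direct O(N^2) sum for clarity, whereas the efficient computation mentioned on the slide would go through the FFT.

```python
import math

def autocorrelation(signal, max_lag):
    """Unnormalized autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    return [sum(centered[t] * centered[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]

def estimate_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sinusoid that repeats every 8 samples.
example = [math.sin(2 * math.pi * t / 8) for t in range(64)]
period = estimate_period(example, 2, 20)
```

Restricting the lag range (here 2 to 20) avoids the trivial maximum at lag 0 and encodes prior knowledge about plausible periods, for example a tempo range.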

ICASSP 2013 tutorial

Similarity Matrix


ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

ICASSP 2013 tutorial

NIME Object tracking


ICASSP 2013 tutorial

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients

Mel scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular mel filter centred at CF, with band edges labelled CF-130 or CF/1.0718 and CF+130 or CF×1.0718]

Processing chain: Mel-filtering → Log → DCT → MFCCs

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; can throw away high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
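Energy compaction can be seen by transforming a signal and reconstructing it from only its first few coefficients. The sketch below uses an unnormalized DCT-II and its matching inverse; the function names and the scaling convention are assumptions for illustration, and other conventions differ only by constant factors.

```python
import math

def dct(x):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi/N * (i + 0.5) * k)."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi / n * (i + 0.5) * k) for i in range(n))
            for k in range(n)]

def idct(coeffs, n):
    """Inverse of dct(); coefficients beyond len(coeffs) are treated as zero."""
    return [(coeffs[0] + 2 * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5) * k)
                                 for k in range(1, len(coeffs)))) / n
            for i in range(n)]

# A smooth ramp: most of the energy lands in the lowest coefficients.
ramp = [float(i) for i in range(16)]
coefficients = dct(ramp)
approximation = idct(coefficients[:4], len(ramp))  # keep 4 of 16 coefficients
```

Keeping only the first 13 to 20 coefficients of the log mel spectrum, as MFCCs do, is the same truncation applied to a spectral envelope.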

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

Reduced to 256 colors

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features
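Delta features and aggregate statistics are the simplest ways to summarize how a feature evolves over time. A minimal sketch with assumed function names, using a first difference as the delta and the population variance:

```python
from statistics import mean, median, pvariance

def delta(sequence):
    """First-order difference between consecutive feature values."""
    return [b - a for a, b in zip(sequence, sequence[1:])]

def summarize(sequence):
    """Aggregate statistics of a feature trajectory over a time window."""
    return {
        "mean": mean(sequence),
        "median": median(sequence),
        "variance": pvariance(sequence),
    }
```

In practice the delta is often computed over a few neighbouring frames rather than two, which makes it less sensitive to noise.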

ICASSP 2013 tutorial

Principal Component Analysis

Projection matrix

PCA: eigenanalysis of correlation matrix

ICASSP 2013 tutorial

Self-Organizing Maps

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest neighbors
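Of the algorithms listed, k-nearest neighbors is the easiest to write down. The following is a minimal sketch with an assumed function name and Euclidean distance; it needs Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbors."""
    order = sorted(range(len(train_vectors)),
                   key=lambda i: math.dist(train_vectors[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy training set: two well-separated classes of 2-D feature vectors.
vectors = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["a", "a", "a", "b", "b", "b"]
```

The classifier has no training phase: the labelled feature vectors themselves are the model, which is what makes it non-parametric.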

ICASSP 2013 tutorial


Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
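K-means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. A minimal sketch, assuming a hypothetical `kmeans` function that initializes deterministically from the first K points (real implementations usually use random or k-means++ initialization):

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20):
    """Return (centroids, labels) after Lloyd-style k-means iterations."""
    centroids = [tuple(p) for p in points[:k]]  # simplistic initialization
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                updated.append(tuple(sum(col) / len(members)
                                     for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster in place
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels
```

As the formulation slide notes, K has to be supplied by the caller, typically from domain knowledge.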

ICASSP 2013 tutorial


Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin index
  – Dunn index
• External evaluation (additional ground truth labels)
  – similar to classification
  – F-measure
  – Rand measure
  – Jaccard index
  – Confusion matrix
• Various types of user studies

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
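Ordinary least squares for a single input and the coefficient of determination fit in a few lines. A minimal sketch with assumed function names (`fit_line`, `r_squared`), limited to one feature for brevity:

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot
```

An R squared of 1 means the fit explains all the variance in the targets; a value near 0 means it does no better than predicting the mean.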

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, “Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011

Uncertainty and Time

Surrogate Sensing and the Ground Truth problem


ICASSP 2013 tutorial

Two case studies

Regression

Classification

ICASSP 2013 tutorial

Some Results

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input, output, or intermediate stages
• Various possible architectures combining machine learning building blocks

ICASSP 2013 tutorial

Sensor Fusion

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each consisting of Feature Extraction, Dimensionality Reduction, Feature Selection, and Classification]

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing keeps pace with the incoming sensor data, with some delay

ICASSP 2013 tutorial

Dynamic Time Warping


ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

ICASSP 2013 tutorial

Hidden Markov Models I

[Figure: hidden Markov model unrolled over time slices 1 to 4. Hidden state (single discrete variable) and observations, linked by the model: transition probabilities P( | ) between the states at t-1 and t, and emission probabilities p( | ) at time t]
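Filtering in a hidden Markov model alternates a prediction through the transition model with an update from the emission model. The sketch below is illustrative: the function name and table layout are assumptions, and the numbers are the familiar textbook "umbrella" example rather than anything from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered state distribution after each observation.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update step: weight by the emission probability, then normalize.
        unnormalized = [emission[j][obs] * predicted[j] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history

# States: 0 = rain, 1 = dry. Observations: 0 = umbrella, 1 = no umbrella.
beliefs = forward_filter(
    prior=[0.5, 0.5],
    transition=[[0.7, 0.3], [0.3, 0.7]],
    emission=[[0.9, 0.1], [0.2, 0.8]],
    observations=[0, 0],
)
```

The normalization constant `total` at each step is the probability of the new observation given the earlier ones; multiplying these constants together gives the likelihood of the whole evidence sequence.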

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
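The predict and update steps are easiest to see in one dimension. The following is a minimal sketch of a scalar Kalman filter with a constant-state model; the function names and default noise values are assumptions chosen for illustration.

```python
def kalman_step(estimate, variance, measurement, process_var, measurement_var):
    """One predict/update cycle of a scalar Kalman filter."""
    # Prediction step: the state is assumed constant, so only the
    # uncertainty grows (by the process noise).
    predicted = estimate
    predicted_var = variance + process_var
    # Update step: the gain balances model reliability against
    # observation reliability.
    gain = predicted_var / (predicted_var + measurement_var)
    new_estimate = predicted + gain * (measurement - predicted)
    new_variance = (1.0 - gain) * predicted_var
    return new_estimate, new_variance

def kalman_filter(measurements, initial_estimate=0.0, initial_variance=1.0,
                  process_var=1e-3, measurement_var=0.1):
    """Filter a stream of noisy scalar measurements."""
    estimate, variance = initial_estimate, initial_variance
    estimates = []
    for z in measurements:
        estimate, variance = kalman_step(estimate, variance, z,
                                         process_var, measurement_var)
        estimates.append(estimate)
    return estimates
```

A large `measurement_var` makes the filter trust its model and smooth heavily; a large `process_var` makes it follow the sensor closely.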


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ICASSP 2013 tutorial 108

ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ICASSP 2013 tutorial 109

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time: both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

ICASSP 2013 tutorial 110

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Images: “The Wizard”; “What the user sees”]

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

ICASSP 2013 tutorial

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution curves with sample means X1 and X2 and the critical value t(p,df); one panel marks two regions for rejecting the null hypothesis, the other a single region]

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
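The three steps on the slide (combined variance, standard error of the difference, t) can be sketched for the independent-samples case. The function name is an assumption, and the pooled-variance form used here relies on the equal-variance assumption stated earlier.

```python
import math
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df
```

The returned t is then compared against the critical value t(p,df), using its absolute value for a two-tailed test.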

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
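The "dividing two variances" step can be made concrete for the one-way case. The sketch below assumes a hypothetical `one_way_anova_f` function; it computes only the F ratio and degrees of freedom, leaving the table lookup or p-value computation to a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    group_means = [sum(group) / len(group) for group in groups]
    # Variation explained by group membership.
    ss_between = sum(len(group) * (m - grand_mean) ** 2
                     for group, m in zip(groups, group_means))
    # Residual variation inside each group.
    ss_within = sum((v - m) ** 2
                    for group, m in zip(groups, group_means) for v in group)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

Applying the same scale factor and offset to every value leaves F unchanged, which is the independence of scaling and bias noted on the slide.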

ICASSP 2013 tutorial

Integration and

I&I: Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977

Canada Science and Technology Museum


ICASSP 2013 tutorial 128

Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: mapping types arranged along an axis of approximate time per gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems cited: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

ICASSP 2013 tutorial 131

GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network

Glove-TalkII System

[Block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Processing blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function. Output: Synthesizer producing speech output]


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

ICASSP 2013 tutorial 139

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
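The behaviour described on the two normalized RBF slides (Gaussian units, normalization across the hidden layer, the closest centre dominating) can be sketched as a forward pass. This is an illustrative sketch only: the function name, the toy centres and weights, and the width value are assumptions, not the trained Glove-TalkII network.

```python
import math

def normalized_rbf_forward(x, centres, width, output_weights):
    """Forward pass of a normalized RBF network.

    centres[j] is the centre of hidden unit j (its input weights);
    output_weights[j][o] connects hidden unit j to output o.
    Returns (outputs, normalized hidden activations).
    """
    # Gaussian activation: largest when the input is at the unit's centre.
    activations = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
                 / (2.0 * width ** 2))
        for centre in centres
    ]
    # Normalize over all hidden units so the activations sum to 1.
    total = sum(activations)
    normalized = [a / total for a in activations]
    n_outputs = len(output_weights[0])
    outputs = [sum(normalized[j] * output_weights[j][o]
                   for j in range(len(centres)))
               for o in range(n_outputs)]
    return outputs, normalized

# Toy network: 2 inputs, 2 hidden units, 2 outputs (hypothetical values).
toy_centres = [(0.0, 0.0), (1.0, 1.0)]
toy_weights = [[1.0, 0.0], [0.0, 1.0]]
at_centre, _ = normalized_rbf_forward((0.0, 0.0), toy_centres, 0.15, toy_weights)
midway, _ = normalized_rbf_forward((0.5, 0.5), toy_centres, 0.15, toy_weights)
```

Because of the normalization, the output near a centre plateaus at that unit's output weights and interpolates between neighbouring units in between; an input very far from every centre would need a guard against a zero total.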


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

• 3 neural nets
• Output: parallel formant speech synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders

Use the actual acoustic instrument as a control surface; inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and Time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

ICASSP 2013 tutorial

A typical NIME process
1. Pick control space
2. Pick sound space
3. Pick mapping
4. Connect with software
5. Compose and practice
6. Repeat

• 1 and 2 often switched
• Tools to help with steps 1-4

Sensors and Actuators

[Slide annotations: Sensors + signal processing; Actuators + signal processing; HCI; Engineering and programming; Music; Fun and Effort; Effort and pain; If you are lucky]

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals
  – Heart rate
  – Brain waves
  – Galvanic skin response
  – Muscle activations
• Many more …

ICASSP 2013 tutorial

Transduction and Digitizing

Electrical: voltage, resistance, impedance
Optical: color, intensity
Magnetic: induced current, field direction

ICASSP 2013 tutorial

Digitizing
• Converting change in resistance to voltage (typical sensor has variable resistance)
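A common way to convert a change in resistance to a voltage is a voltage divider read by an analog-to-digital converter. The sketch below makes several assumptions that are not in the slides: the sensor sits between the supply and the ADC pin, a fixed resistor goes from the pin to ground, and the ADC is 10-bit (as on many Arduino boards). The function names are hypothetical.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor in a sensor/fixed-resistor divider."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def sensor_resistance(adc_value, r_fixed, adc_max=1023):
    """Recover the sensor resistance from a raw ADC reading.

    Inverts adc_value / adc_max = r_fixed / (r_sensor + r_fixed).
    """
    if adc_value <= 0:
        return float("inf")  # no current flowing: open circuit
    return r_fixed * (adc_max / adc_value - 1.0)
```

With a 10 kΩ fixed resistor, a reading of 341 out of 1023 corresponds to a sensor resistance of 20 kΩ; choosing the fixed resistor near the middle of the sensor's range gives the most usable swing.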

ICASSP 2013 tutorial

Physical Property Sensors
• Piezoelectric sensors
• Force sensing resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric field sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

ICASSP 2013 tutorial

Human Action Oriented

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

ICASSP 2013 tutorial

Piezoelectric Sensors

Accelerometers

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

ICASSP 2013 tutorial

Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

ICASSP 2013 tutorial

CCD & CMOS Camera

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

ICASSP 2013 tutorial

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. rotation frequency synchronized with the frequency of the supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples, each plotted against time; multiple channels (for example microphone arrays)]

2D/3D/ND + time
• Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Signals and Features

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
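A minimal sketch of the peak detection variants listed above, combining a fixed threshold, a local-maximum test and a spacing constraint. The function name `detect_peaks`, its parameters and the sample values are illustrative assumptions, not part of the tutorial; adaptive thresholds and interpolation of the peak position are left out.

```python
def detect_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima at or above a fixed threshold.

    min_spacing is the minimum distance (in samples) allowed between two peaks;
    when two candidates are closer than that, only the larger one is kept.
    """
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger of the two.
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks


# Hypothetical sensor samples, for illustration only.
samples = [0.0, 0.2, 1.0, 0.3, 0.1, 0.8, 0.9, 0.2, 0.05, 0.4, 0.1]
print(detect_peaks(samples, threshold=0.5, min_spacing=3))  # → [(2, 1.0), (6, 0.9)]
```

The output is a list of peak locations and amplitudes, as on the slide. Raising `min_spacing` to 5 would drop the second peak because it is within 5 samples of a larger one.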

Fourier Transform
[Figure: spectrum]

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: the FFT (fast Fourier transform)

Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
• Time-frequency tradeoff
• Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
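The time-frequency tradeoff can be made concrete with a little arithmetic on the two window sizes quoted on the slide. The sketch below assumes the usual relations for an N-sample window at sample rate fs (bin spacing fs/N, window duration N/fs); the function name `stft_resolution` is illustrative.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in ms)."""
    bin_spacing_hz = sample_rate / window_size
    window_duration_ms = 1000.0 * window_size / sample_rate
    return bin_spacing_hz, window_duration_ms


for n in (256, 4096):
    hz, ms = stft_resolution(n, 22050)
    print(f"{n} samples: {hz:.1f} Hz per bin, {ms:.1f} ms per window")
# 256 samples: 86.1 Hz per bin, 11.6 ms per window
# 4096 samples: 5.4 Hz per bin, 185.8 ms per window
```

The short window localizes events in time but smears frequency; the long window does the opposite.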

Wavelets
• STFT: fixed time-frequency resolution, based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using a noise profile
  – Median filtering in the time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion (preserves edges)
  – Median filtering
  – Frequently done in the wavelet domain
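As a small illustration of why median filtering appears in both lists, the sketch below runs a 1D median filter over a hypothetical sensor stream that contains one impulsive spike and one genuine step. The function name, window size and data are assumptions made for this example; the 2D and time-frequency versions mentioned on the slide follow the same idea with a 2D neighbourhood.

```python
from statistics import median


def median_filter(x, window=3):
    """Replace each sample by the median of its neighbourhood (shorter at the ends)."""
    half = window // 2
    return [median(x[max(0, i - half): i + half + 1]) for i in range(len(x))]


# Hypothetical readings: a spike at index 2 and a real step from 1 to 5 at index 5.
noisy = [1, 1, 9, 1, 1, 5, 5, 5]
print(median_filter(noisy))
```

The spike is removed while the step stays sharp, which a low-pass (averaging) filter would blur.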

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
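For sensor data, the linear case mentioned in the last bullet is often enough. The following is a minimal sketch of linear interpolation at a fractional sample index and of resampling built on top of it; `sample_at` and `resample` are illustrative names, and no anti-aliasing filter is applied, so downsampling this way is only reasonable for slowly varying signals.

```python
def sample_at(samples, t):
    """Linearly interpolate uniformly spaced samples at fractional index t."""
    if t <= 0:
        return samples[0]
    if t >= len(samples) - 1:
        return samples[-1]
    i = int(t)
    frac = t - i
    return (1.0 - frac) * samples[i] + frac * samples[i + 1]


def resample(samples, ratio):
    """Resample by ratio = new_rate / old_rate using linear interpolation."""
    n_out = int((len(samples) - 1) * ratio) + 1
    return [sample_at(samples, k / ratio) for k in range(n_out)]


print(resample([0, 10, 20], 2))            # upsample by 2
print(resample([0, 10, 20, 30, 40], 0.5))  # downsample by 2 (no anti-alias filter)
```

The same `sample_at` helper can be used to align two sensor streams that were sampled at different rates onto a common time grid.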

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency, or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
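Calibration and scaling are often used together: first find the range of the raw readings, then map that range onto the control parameter, optionally through a power law. The sketch below is one way to do this; the function names, the example readings and the choice of exponent are assumptions for illustration.

```python
def calibrate(readings):
    """Calibration step: find the observed range of a sensor."""
    return min(readings), max(readings)


def scale(raw, raw_min, raw_max, out_min, out_max, exponent=1.0):
    """Map a raw reading onto [out_min, out_max] with an optional power law."""
    if raw_max == raw_min:
        raise ValueError("calibration range is empty")
    x = (raw - raw_min) / (raw_max - raw_min)
    x = min(1.0, max(0.0, x))  # clamp readings that fall outside the calibrated range
    return out_min + (out_max - out_min) * x ** exponent


# Hypothetical readings collected while the performer sweeps a slider end to end.
lo, hi = calibrate([212, 80, 900, 512])
print(scale(490, lo, hi, 0.0, 1.0))                # linear mapping to a volume in [0, 1]
print(scale(490, lo, hi, 0.0, 1.0, exponent=2.0))  # power-law mapping
```

With `exponent=1.0` the mapping is linear; exponents above 1 give finer control near the low end of the range.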

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation
• Efficient computation when N is a power of 2
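A correlation-based periodicity estimate can be sketched directly from the definition of autocorrelation. The version below uses the straightforward O(N·lag) sum for clarity; the efficient computation referred to on the slide would go through the FFT instead. Function names and the test signal are illustrative assumptions.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag (direct sum)."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# Hypothetical onset signal: one pulse every 4 samples (e.g. a steady strum).
pulses = [1, 0, 0, 0] * 8
print(estimate_period(pulses, min_lag=2, max_lag=10))  # → 4
```

Restricting the search to `[min_lag, max_lag]` avoids the trivial maximum at lag 0 and encodes prior knowledge about plausible periods.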

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients
• Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
• Filter edges around centre frequency CF: CF-130 and CF+130 (linear), CF/1.0718 and CF×1.0718 (log)
• Processing chain: Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal, so the high ones can be thrown away
• MFCCs keep the first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
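Energy compaction is easy to check numerically. The sketch below implements an orthonormal DCT-II straight from its definition (an O(N²) loop, not a fast algorithm) and measures how much of the energy of a smooth ramp ends up in the first two coefficients. The function name and the test signal are assumptions for illustration.

```python
import math


def dct2(x):
    """Orthonormal DCT-II computed directly from the definition."""
    n = len(x)
    coeffs = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * s)
    return coeffs


ramp = list(range(8))  # a smooth, slowly varying signal
c = dct2(ramp)
total_energy = sum(v * v for v in c)  # equals the signal energy for an orthonormal transform
low_energy = c[0] ** 2 + c[1] ** 2
print(f"fraction of energy in first 2 of 8 coefficients: {low_energy / total_energy:.4f}")
```

For this ramp more than 99% of the energy sits in the first two coefficients, which is why truncating the high coefficients loses little.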

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: example image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Mean, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
• Projection matrix
• PCA: eigenanalysis of the correlation matrix

Self-Organizing Maps

Uncertainty and Time

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
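The three evaluation measures named above can all be derived from the confusion matrix. The sketch below is a minimal, dependency-free version; the helper names and the drum-hit labels are assumptions for illustration, and cross-validation is not shown.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) label pairs."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(cm):
    correct = sum(n for (t, p), n in cm.items() if t == p)
    return correct / sum(cm.values())


def f_measure(cm, positive):
    """Harmonic mean of precision and recall for one class."""
    tp = cm[(positive, positive)]
    fp = sum(n for (t, p), n in cm.items() if p == positive and t != positive)
    fn = sum(n for (t, p), n in cm.items() if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical ground truth and classifier output for six drum hits.
truth = ["snare", "snare", "kick", "kick", "kick", "snare"]
predicted = ["snare", "kick", "kick", "kick", "snare", "snare"]
cm = confusion_matrix(truth, predicted)
print(accuracy(cm), f_measure(cm, "kick"))
```

Here four of six hits are classified correctly, and both precision and recall for "kick" are 2/3.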

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters

Surrogate Sensors
• Use direct sensors to “learn” indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
Reference: “Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
• Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel pipelines, each Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Multi-modal Results
• Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data is acquired

Dynamic Time Warping

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence (also gives the likelihood of the evidence)
• Prediction: posterior distribution of a future state given evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and

Hidden Markov Models I
[Figure: hidden Markov model with states 1-4. Hidden state (a single discrete variable) with transition probabilities P(· | ·) between times t-1 and t; observations with emission probabilities p(· | ·) at time t]
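The filtering task for a hidden Markov model can be sketched as a predict-then-update loop over the belief vector. The code below is a minimal forward-filtering sketch with a hypothetical two-state model; the function name and all probabilities are assumptions chosen for illustration, not values from the tutorial.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state given all observations so far.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    belief = list(prior)
    n = len(belief)
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update: weight by the emission probability of what was observed, then normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
    return belief


# Hypothetical model: state 0 = "playing", state 1 = "resting";
# observation 0 = "sensor active", observation 1 = "sensor quiet".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
print(forward_filter(prior, transition, emission, [0, 0]))
```

After two "sensor active" observations the belief in state 0 rises to about 0.88. Each step only needs the previous belief, which is what makes filtering usable causally and in real time.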

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = \mathcal{N}(a_y y + b_y,\ \sigma_y)(x)$
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
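The predict/update cycle and the trade-off between observation reliability and model reliability are easiest to see in one dimension. The sketch below assumes a scalar state that is modelled as constant between steps, with hypothetical variance parameters; it is not the multivariate form (mean vector plus covariance matrix) described on the slide.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a state assumed constant between time steps."""
    x, p = x0, p0  # state estimate and its variance (prior knowledge)
    estimates = []
    for z in measurements:
        # Prediction step: the state is unchanged, its uncertainty grows.
        p = p + process_var
        # Update step: the gain balances model uncertainty against sensor noise.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical noisy readings of a quantity whose true value is 1.0.
print(kalman_1d([1.0, 1.0, 1.0], process_var=0.0, measurement_var=1.0))
```

With these settings the estimates move from the prior 0 toward the measurements as 0.5, 0.667, 0.75. A larger `measurement_var` (less reliable sensor) makes the filter trust its model more and move more slowly.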

Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings
  – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC): ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amounts of data
  – output is a description of practices rather than specific designs
    • ethnographers are not trained as designers; they are taught to “interfere” as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)
[Figure: the Wizard; what the user sees]

Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: two-tailed case with regions for rejecting the null hypothesis on both sides, or one-tailed case with a single rejection region, beyond the critical value t(p,df)]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
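The three steps above can be sketched for the independent two-sample case with a pooled (combined) variance, which assumes equal variances in the two groups. The function name and the example scores are hypothetical; the formulas are the standard pooled-variance ones rather than a transcription of the slide's equations.

```python
import math


def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    ss1 = sum((x - m1) ** 2 for x in sample1)  # sums of squared deviations
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                          # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))    # standard error of difference
    return (m1 - m2) / sed, df


# Hypothetical task-completion scores for a control and a treatment group.
control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]
t, df = pooled_t(control, treatment)
print(t, df)  # t = -2.0 with df = 8
```

For df = 8 the two-tailed critical value at α = 0.05 is 2.306, so |t| = 2.0 would not be enough to reject the null hypothesis at that level.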

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

[Table: critical values of t; columns are two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA

Integration and

I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
[Image credits: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum]

Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: gesture-to-speech systems placed along an axis of approximate time per gesture for connected speech (10-30, 100, 130, 200, 500 msec). Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions, constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network

Glove-TalkII System
[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function, synthesizer. Output: speech]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel, with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
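The behaviour of normalized RBF units (interpolation between centres, plateaus where the closest unit dominates) can be sketched in a few lines. The code below computes only the normalized hidden-unit activations for Gaussian units with a shared width; the centres and inputs are hypothetical, and the output layer of the actual Glove-TalkII networks is not modelled.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian RBF activations for input x, normalized to sum to 1 over all units."""
    exponents = [
        -sum((xi - ci) ** 2 for xi, ci in zip(x, centre)) / (2.0 * width ** 2)
        for centre in centres
    ]
    # Subtracting the largest exponent leaves the normalized result unchanged
    # and avoids dividing by zero when x is far from every centre.
    shift = max(exponents)
    activations = [math.exp(e - shift) for e in exponents]
    total = sum(activations)
    return [a / total for a in activations]


# Two hypothetical vowel centres in the x,y plane; width 0.15 as on the vowel network slide.
centres = [(0.0, 0.0), (1.0, 0.0)]
print(normalized_rbf((0.0, 0.0), centres, 0.15))  # at the first centre
print(normalized_rbf((0.5, 0.0), centres, 0.15))  # halfway between the centres
print(normalized_rbf((2.0, 0.0), centres, 0.15))  # beyond the second centre
```

At a centre, or anywhere closer to one centre than the other by a margin larger than the width, the nearest unit's normalized activation is essentially 1. Halfway between, the two units share the activation equally.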


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: parallel formant speech synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

144

Glove-TalkII

Tuesday 22 October 13

DIVA in the hands of

145

Tuesday 22 October 13

Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13

Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13

Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

What to measure
• Plethora of sensors
• Motion (position, velocity, acceleration, rotation) of body parts
• Torque, forces (isometric and isotonic)
• Pressure
• Proximity
• Temperature
• Light
• Bio-signals: heart rate, brain waves, galvanic skin response, muscle activations
• Many more …

25

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical: voltage, resistance, impedance
Optical: color, intensity
Magnetic: induced current, field direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
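One common way to turn a resistance change into a voltage is a two-resistor voltage divider read by an ADC. The sketch below assumes a 5 V supply, a 10-bit converter and a 10 kΩ fixed resistor; these values and the function names are illustrative, not taken from the tutorial.

```python
def divider_voltage(r_sensor, r_fixed, v_supply=5.0):
    """Voltage across the fixed resistor when it is in series with the sensor."""
    return v_supply * r_fixed / (r_sensor + r_fixed)

def sensor_resistance(voltage, r_fixed, v_supply=5.0):
    """Invert the divider equation to recover the sensor resistance."""
    return r_fixed * (v_supply - voltage) / voltage

def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC code, clipping to [0, v_ref]."""
    levels = (1 << bits) - 1
    clipped = min(max(voltage, 0.0), v_ref)
    return round(clipped / v_ref * levels)

v = divider_voltage(r_sensor=30_000.0, r_fixed=10_000.0)
print(v, adc_counts(v), sensor_resistance(v, 10_000.0))
```

Because the divider is nonlinear in the sensor resistance, the fixed resistor is usually chosen near the middle of the sensor's expected range.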

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
  - face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancelation
• Cheap

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  - Brushless DC that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor band (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D3D ND + time

50

Sensors and Actuators

Time Time

Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
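As a concrete instance of a digital filter operating on samples, here is a sketch of a one-pole low-pass filter, which is often enough to smooth a jittery sensor stream. The function name and the smoothing coefficient are illustrative assumptions.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1.

    Small alpha means heavier smoothing (lower cutoff frequency).
    """
    if not samples:
        return []
    y = samples[0]  # start at the first sample to avoid a start-up transient
    out = []
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out

# A rapidly alternating input (the highest representable frequency) is strongly attenuated.
noisy = [1.0 if n % 2 == 0 else -1.0 for n in range(200)]
print(one_pole_lowpass(noisy, 0.1)[-4:])
```

A constant input passes through unchanged, which is the low-pass behaviour: slow changes are kept, fast ones are attenuated.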

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
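The variants listed above can be combined in one small routine. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing between peaks and parabolic interpolation through three points; the function name and parameter names are assumptions for illustration, and an adaptive threshold is left out.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs, refined by parabolic interpolation."""
    peaks = []
    for n in range(1, len(x) - 1):
        if x[n] < threshold:
            continue
        if not (x[n] > x[n - 1] and x[n] >= x[n + 1]):
            continue  # not a local maximum
        # Fit a parabola through the three samples around the maximum.
        # The denominator is strictly negative because of the local-maximum test.
        denom = x[n - 1] - 2.0 * x[n] + x[n + 1]
        offset = 0.5 * (x[n - 1] - x[n + 1]) / denom
        loc = n + offset
        amp = x[n] - 0.25 * (x[n - 1] - x[n + 1]) * offset
        if peaks and loc - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (loc, amp)  # keep only the stronger of two close peaks
            continue
        peaks.append((loc, amp))
    return peaks

print(find_peaks([0, 3, 4, 1, 0, 0, 2, 0], threshold=0.5))
```

Interpolation moves the first peak slightly toward its larger neighbour, which gives sub-sample accuracy for onset times or spectral peak frequencies.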

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples 22050 Hz

4096 samples 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
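The trade-off for the two window sizes on this slide can be worked out directly: a window of N samples at sample rate fs lasts N / fs seconds and gives frequency bins fs / N Hz apart. The sketch below also builds a tiny spectrogram as a sequence of power spectra. It uses a naive DFT for readability (an FFT would be used in practice), and all function names are illustrative.

```python
import cmath
import math

def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, frequency bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

def power_spectrum(frame):
    """Power spectrum (bins 0..N/2) of one Hann-windowed frame via a naive DFT."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum

def spectrogram(signal, window_size, hop):
    """Sequence of power spectra of overlapping frames."""
    return [power_spectrum(signal[s:s + window_size])
            for s in range(0, len(signal) - window_size + 1, hop)]

for size in (256, 4096):
    duration, spacing = stft_resolution(size, 22050)
    print(f"{size} samples: {duration * 1000:.1f} ms window, {spacing:.1f} Hz per bin")
```

At 22050 Hz the 256-sample window spans about 11.6 ms with bins about 86.1 Hz apart, while the 4096-sample window spans about 185.8 ms with bins about 5.4 Hz apart: better frequency resolution costs time resolution.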

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion (preserves edges)
  - Median filtering
  - Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
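For sensor data that arrives with irregular timestamps, linear interpolation onto a uniform time grid is often sufficient. A minimal sketch follows; the function name and the choice to hold the end values outside the measured range are assumptions for illustration.

```python
from bisect import bisect_right

def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times; times must be increasing."""
    out = []
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])    # hold first value before the first sample
            continue
        if t >= times[-1]:
            out.append(values[-1])   # hold last value after the last sample
            continue
        i = bisect_right(times, t)   # times[i-1] <= t < times[i]
        t0, t1 = times[i - 1], times[i]
        frac = (t - t0) / (t1 - t0)
        out.append(values[i - 1] + frac * (values[i] - values[i - 1]))
    return out

# Irregular sensor timestamps resampled at four query times.
print(resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], [0.0, 0.5, 2.0, 3.0]))
```

This is also the usual first step of sensor fusion when streams run at different rates.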

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
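Calibration and scaling are often combined as follows: record the range of raw readings during a calibration pass, then map later readings into the control range through a power law. The names, the example readings and the exponent below are illustrative assumptions.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor during calibration."""
    return min(readings), max(readings)

def scale(reading, in_lo, in_hi, out_lo, out_hi, exponent=1.0):
    """Map a raw reading to [out_lo, out_hi] using a power-law curve."""
    norm = (reading - in_lo) / (in_hi - in_lo)
    norm = min(max(norm, 0.0), 1.0)   # clip readings outside the calibrated range
    return out_lo + (out_hi - out_lo) * norm ** exponent

# Hypothetical raw knob readings collected while the player sweeps the full range.
lo, hi = calibrate([112, 130, 900, 455])
print(scale(506, lo, hi, 20.0, 2000.0, exponent=2.0))  # knob position mapped to a frequency in Hz
```

An exponent above 1 gives finer control at the low end of the range, which tends to suit perceptual quantities such as frequency and loudness.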

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
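For reference, a direct (quadratic-time) autocorrelation and a simple period estimate can be sketched as below. The efficient route mentioned on the slide computes the same quantity through the FFT; the direct form is shown only because it is easier to read. Function names and the lag search range are illustrative assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """r[lag] = sum over i of x[i] * x[i + lag], for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sinusoid that repeats every 8 samples.
signal = [math.sin(2 * math.pi * n / 8) for n in range(64)]
print(estimate_period(signal, min_lag=2, max_lag=32))
```

Cross-correlation follows the same pattern with two different signals, which is how a lag between two sensor streams can be estimated.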

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

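[Figure: MFCC computation pipeline: Mel-filtering, Log, DCT, MFCCs. Mel-scale filterbank with 13 linearly-spaced filters and 27 log-spaced filters; filter labels CF, CF-130, CF+130 and a log-spacing factor of 1.0718.]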

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
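Energy compaction is easy to see numerically. The sketch below implements an orthonormal DCT-II directly from its definition (slow, for illustration only) and measures how much of a smooth signal's energy sits in the first few coefficients. The function names and the ramp test signal are assumptions.

```python
import math

def dct2(x):
    """Orthonormal DCT-II computed directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def energy_fraction(coeffs, keep):
    """Fraction of total energy contained in the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total

ramp = [float(i) for i in range(16)]   # a smooth, slowly varying signal
coeffs = dct2(ramp)
print(energy_fraction(coeffs, 2))      # nearly all energy is in the first two coefficients
```

This is why MFCCs can discard the higher DCT coefficients of the log mel spectrum with little loss.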

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
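K-nearest neighbors is the simplest of these to write down, since it has no training step beyond storing the labeled feature vectors. The sketch below is illustrative only: the function name, the two-dimensional features and the drum labels are made up for the example.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the majority label among the k training vectors closest to query."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-D feature vectors for two drum sounds.
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["kick", "kick", "kick", "snare", "snare", "snare"]
print(knn_predict(train_x, train_y, (0.5, 0.5)))
```

Features on very different scales should be normalized first, since the Euclidean distance otherwise lets the largest-valued feature dominate.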

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
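The three measures named above can be computed from paired lists of true and predicted labels. This is a sketch with illustrative function names; the F-measure shown is the balanced F1 score for one chosen positive class.

```python
from collections import Counter

def confusion_matrix(true_labels, predicted_labels):
    """Counts indexed by (true label, predicted label)."""
    return Counter(zip(true_labels, predicted_labels))

def accuracy(true_labels, predicted_labels):
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)

def f_measure(true_labels, predicted_labels, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

truth = ["a", "a", "a", "b", "b", "b"]
guess = ["a", "a", "b", "b", "b", "a"]
print(confusion_matrix(truth, guess), accuracy(truth, guess), f_measure(truth, guess, "a"))
```

In cross-validation these numbers are computed on each held-out fold and then averaged, so that no example is scored by a model that saw it during training.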

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
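A bare-bones K-means loop alternates between assigning points to the nearest centroid and recomputing each centroid as the mean of its members. The sketch below takes the initial centroids as an argument to stay deterministic; that choice, the fixed iteration count and the names are assumptions for illustration.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Return (labels, centroids) after a fixed number of K-means iterations."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
print(kmeans(points, centroids=[(0, 0), (10, 10)]))
```

The result depends on the initial centroids, so practical implementations usually run several random restarts and keep the solution with the lowest within-cluster distance.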

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel Regression
  - Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
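For a single input feature, ordinary least squares and the coefficient of determination fit in a few lines. The sketch below uses illustrative names and toy data; multi-feature regression would need a linear algebra routine instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(xs, ys, slope, intercept):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]          # roughly y = 2x + 1 with some noise
slope, intercept = fit_line(xs, ys)
print(slope, intercept, r_squared(xs, ys, slope, intercept))
```

An R squared of 1 means the line explains all of the variance in the target; values near 0 mean it does no better than predicting the mean.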

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages

Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea use camera to constrain factorization results taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done with some delay at the same time as the sensor data

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history, typically just previous time slice)
• Transition Model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  - Estimate what transitions occurred and what states generated the sensor reading and update models
  - Updated models provide new estimates and

99
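The filtering task can be sketched for a discrete hidden state as a repeated predict-then-update loop (the forward algorithm). The two-state rain/umbrella numbers below are the standard textbook toy example, not data from the tutorial, and the function name is an assumption.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the hidden state after each observation.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight by the likelihood of the observation and renormalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        z = sum(unnorm)
        belief = [u / z for u in unnorm]
        history.append(belief)
    return history

# State 0 = rain, state 1 = no rain; observation 0 = umbrella seen, 1 = no umbrella.
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
for belief in forward_filter([0.5, 0.5], transition, emission, [0, 0]):
    print(round(belief[0], 3))
```

Only the previous belief is needed at each step, so the computation is causal and has constant cost per observation, which suits real-time use.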

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

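[Figure: Hidden Markov Model. A single discrete hidden state variable with transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t); the hidden states are labelled 1 to 4 and the observations are shown below them.]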

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
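In one dimension the forward step reduces to a few scalar updates. The sketch below assumes a random-walk state model (the state stays put apart from process noise) and direct noisy measurements of the state; the names and noise variances are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, meas_var, init_mean=0.0, init_var=1.0):
    """Scalar Kalman filter; returns a list of (mean, variance) estimates."""
    mean, var = init_mean, init_var
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed constant, so only uncertainty grows.
        var = var + process_var
        # Update step: the gain balances model reliability against sensor reliability.
        gain = var / (var + meas_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates

readings = [5.2, 4.7, 5.1, 4.9, 5.3, 5.0, 4.8, 5.1]
for mean, var in kalman_1d(readings, process_var=0.01, meas_var=1.0):
    print(round(mean, 3), round(var, 3))
```

A large measurement variance makes the filter trust its model and respond slowly; a large process variance makes it follow the measurements closely.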

102

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful: very early in development; build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)

[Figure: two photographs, captioned "The Wizard" and "What the user sees"]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  - Assume normality for relevant distributions, work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution curves marking the critical value t(p,df), with two regions for rejecting the null hypothesis in one panel and a single region for rejecting the null hypothesis in the other]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
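The three steps can be sketched for the independent-samples case with a pooled (combined) variance. The slide's own formulas did not survive extraction, so the textbook equal-variance form is assumed here; the function name and the sample data are illustrative.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic with pooled variance; returns (t, df)."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Combined variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(sample_a)
                  + (n_b - 1) * statistics.variance(sample_b)) / df
    # Standard error of the difference between the two sample means.
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / sed
    return t, df

control = [1, 2, 3, 4, 5]
treatment = [3, 4, 5, 6, 7]
t, df = pooled_t(control, treatment)
# For df = 8 the two-tailed critical value at alpha = 0.05 is 2.306 (standard t table).
print(t, df, abs(t) > 2.306)
```

Here the magnitude of t is 2.0 with 8 degrees of freedom, which is below 2.306, so the null hypothesis cannot be rejected at the 0.05 level.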

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
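The "dividing two variances" step of a one-way ANOVA can be sketched as below: the between-group mean square is divided by the within-group mean square to give the F ratio. The function name and the example groups are illustrative assumptions.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

print(one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```

With exactly two groups the F ratio equals the square of the pooled-variance t statistic, which is the sense in which ANOVA generalizes the t-test.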

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (msec)]
Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator
Approximate time per gesture (msec): 10-30, 100, 130, 200, 500
Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
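A mixture of experts can be sketched as a soft blend of the expert outputs, weighted by the decision network's output. The sketch below assumes the combining step is a probability-weighted sum of the two experts' synthesizer parameters; the function name `combine_experts` and the numbers are hypothetical, and this is not presented as the exact Glove-TalkII combining function.

```python
def combine_experts(p_vowel, vowel_params, consonant_params):
    """Blend two expert outputs using the probability that the gesture is a vowel."""
    return [
        p_vowel * v + (1.0 - p_vowel) * c
        for v, c in zip(vowel_params, consonant_params)
    ]


# Hypothetical expert outputs (two parameters each, for brevity)
vowel_out = [1.0, 2.0]
consonant_out = [3.0, 4.0]

all_vowel = combine_experts(1.0, vowel_out, consonant_out)
half_half = combine_experts(0.5, vowel_out, consonant_out)
```

With a gate value near 0 or 1 one expert dominates; intermediate values give a smooth transition between vowel and consonant sounds.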

Tuesday 22 October 13

134

Glove-TalkII System

[Block diagram]
Inputs: Right Hand Data: x, y, z, roll, pitch, yaw (60 Hz); 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw (100 Hz); Contact Switches; Foot Pedal
Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer
Output: Speech Output

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
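The behaviour of normalized RBF units can be illustrated with a short sketch. It assumes Gaussian units with a single shared width and a linear output layer; the function names, centres, width, and weights are illustrative assumptions rather than the trained Glove-TalkII values.

```python
import math


def normalized_rbf_activations(x, centres, width):
    """Gaussian activations around each centre, normalized to sum to 1."""
    sq_dists = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    # Subtracting the smallest distance keeps exp() from underflowing to all zeros
    nearest = min(sq_dists)
    raw = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in sq_dists]
    total = sum(raw)
    return [r / total for r in raw]


def normalized_rbf_output(x, centres, width, out_weights):
    """Linear readout: each hidden unit contributes its own output vector."""
    acts = normalized_rbf_activations(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(a * w[j] for a, w in zip(acts, out_weights)) for j in range(n_out)]


# Two hypothetical centres in the x,y plane, each with one output value
centres = [[0.0, 0.0], [1.0, 0.0]]
out_weights = [[10.0], [20.0]]
width = 0.15

at_centre = normalized_rbf_activations([0.0, 0.0], centres, width)
midpoint = normalized_rbf_output([0.5, 0.0], centres, width, out_weights)
far_away = normalized_rbf_activations([-5.0, 0.0], centres, width)
```

Near a centre the closest unit dominates (the plateau), halfway between two centres the outputs are interpolated, and even far from every centre the normalization still hands control to the nearest unit.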

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Transduction and Digitizing

26

Sensors and Actuators

Electrical
  Voltage
  Resistance
  Impedance
Optical
  Color
  Intensity
Magnetic
  Induced Current
  Field Direction

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
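One common way to turn a resistance change into a voltage is a voltage divider feeding an analog-to-digital converter. The sketch below assumes a series divider with the output taken across a fixed resistor, a 5 V supply, and a 10-bit converter; these values and the function names are illustrative assumptions, not a circuit given in the tutorial.

```python
def divider_voltage(r_sensor, r_fixed, v_in=5.0):
    """Voltage across the fixed resistor of a sensor/fixed-resistor series divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)


def adc_code(v_out, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer code in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    clamped = min(max(v_out, 0.0), v_ref)
    return round(clamped / v_ref * levels)


# A sensor whose resistance equals the fixed resistor sits at half the supply
half_supply = divider_voltage(r_sensor=10_000.0, r_fixed=10_000.0)
# As the sensor resistance drops (e.g. more force on an FSR) the voltage rises
pressed = divider_voltage(r_sensor=1_000.0, r_fixed=10_000.0)
full_scale_code = adc_code(5.0)
```

The choice of fixed resistor sets which part of the sensor's resistance range gets the most voltage swing.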

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction: moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker/Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D3D ND + time

50

Sensors and Actuators

Time Time

Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
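As a small illustration of a digital filter operating on samples, here is a sketch of a one-pole low-pass filter, y[n] = alpha * x[n] + (1 - alpha) * y[n-1]. The function name and the value of `alpha` are assumptions chosen for illustration; this is one of many possible low-pass designs.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence: small alpha means heavier smoothing (lower cutoff)."""
    y = 0.0
    out = []
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


# A constant (zero-frequency) input passes through once the filter settles
steady = one_pole_lowpass([1.0] * 200, alpha=0.1)
# The fastest possible alternation (+1, -1, ...) is strongly attenuated
alternating = one_pole_lowpass([1.0 if n % 2 == 0 else -1.0 for n in range(200)], alpha=0.1)
```

For noisy sensor streams this kind of filter is often enough to remove jitter, at the price of some lag.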

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
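Several of these variants can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing constraint, and parabolic interpolation of the peak position; the function name `find_peaks` and its parameters are illustrative assumptions, and real systems typically add adaptive thresholds tuned to the sensor.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs for local maxima above threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        y0, y1, y2 = signal[i - 1], signal[i], signal[i + 1]
        if y1 < threshold or not (y1 > y0 and y1 >= y2):
            continue
        # Fit a parabola through the three points to refine position and height
        denom = y0 - 2.0 * y1 + y2  # strictly negative at a local maximum
        offset = 0.5 * (y0 - y2) / denom
        location = i + offset
        amplitude = y1 - 0.25 * (y0 - y2) * offset
        if peaks and location - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger one
            if amplitude > peaks[-1][1]:
                peaks[-1] = (location, amplitude)
            continue
        peaks.append((location, amplitude))
    return peaks


signal = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 2.0, 5.0, 2.0, 0.0]
two_peaks = find_peaks(signal, threshold=3.0)
one_peak = find_peaks(signal, threshold=3.0, min_spacing=6)
skewed = find_peaks([0.0, 3.0, 4.0, 1.0, 0.0], threshold=2.0)
```

The interpolated location can fall between samples, as in the `skewed` example where the larger left neighbour pulls the estimate to the left of the sampled maximum.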

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples at 22050 Hz

4096 samples at 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
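The tradeoff between the two window sizes can be checked with simple arithmetic: the spacing between frequency bins is the sample rate divided by the window size, and the window duration is the window size divided by the sample rate. The function name below is an illustrative assumption.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    return sample_rate / window_size, window_size / sample_rate


short_bin_hz, short_seconds = stft_resolution(256, 22050)
long_bin_hz, long_seconds = stft_resolution(4096, 22050)
```

The 256-sample window gives bins about 86 Hz apart but spans only about 12 ms, while the 4096-sample window gives bins about 5.4 Hz apart but spans about 186 ms.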

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution, based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion (preserves edges)
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
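A power-law mapping from a calibrated sensor range to a control range might look like the sketch below. The function name, the 0 to 1023 input range, and the exponent are illustrative assumptions; the input minimum and maximum would normally come from a calibration step.

```python
def scale_power(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map reading from [in_min, in_max] to [out_min, out_max] with a power-law curve."""
    x = (reading - in_min) / (in_max - in_min)
    x = min(max(x, 0.0), 1.0)  # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * x ** exponent


# Hypothetical slider read by a 10-bit converter, mapped to a 0..1 volume
linear_mid = scale_power(512, 0, 1024, 0.0, 1.0)
squared_mid = scale_power(512, 0, 1024, 0.0, 1.0, exponent=2.0)
over_range = scale_power(2000, 0, 1024, 0.0, 1.0, exponent=2.0)
```

An exponent above 1 gives finer control at the low end of the output range; an exponent below 1 does the opposite.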

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
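A direct-form sketch of autocorrelation-based period estimation is shown below. It computes the sum of products at each lag, which costs O(N x lags); the efficient computation referred to on the slide would instead go through the FFT. Function names and the test signal are illustrative assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n + lag], for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Pick the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


# A sinusoid that repeats every 20 samples
signal = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
period = estimate_period(signal, min_lag=5, max_lag=30)
```

Restricting the search to a plausible lag range avoids the trivial maximum at lag 0 and octave-type errors at multiples of the true period.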

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

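[Figure: MFCC computation pipeline: Mel-filtering -> Log -> DCT -> MFCCs]
Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters
Filter edges relative to centre frequency CF: CF-130 and CF+130 (linear region), CF/1.0718 and CF×1.0718 (log region)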

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal, so high coefficients can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
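Energy compaction can be seen on a small example. The sketch below implements an orthonormal DCT-II directly from its definition (an O(N^2) loop, not a fast algorithm) and measures how much of a smooth signal's energy lands in the first two coefficients; the function name and the ramp signal are illustrative assumptions.

```python
import math


def dct_ii(x):
    """Orthonormal DCT-II computed directly from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


ramp = [float(i) for i in range(8)]
coeffs = dct_ii(ramp)

total_energy = sum(c * c for c in coeffs)
low_energy = sum(c * c for c in coeffs[:2])
compaction = low_energy / total_energy
```

For this ramp more than 99% of the energy sits in the first two coefficients, which is why truncating the high coefficients loses little for smooth signals such as log mel spectra.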

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  • K-nearest Neighbors

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBScan
• Graph-based

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (squared correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Diagram: two parallel processing chains, each: Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea use camera to constrain factorization results taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history, typically just previous time slice)
• Transition Model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

99

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

[Diagram: HMM with a hidden state (single discrete variable) and observations]
Model: transition probabilities P(hidden_t | hidden_t-1); emission probabilities p(observed_t | hidden_t)

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
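The predict and update cycle can be sketched for the simplest possible case: a single scalar state that is assumed to stay put apart from small random drift. The function name, noise variances, and initial values below are illustrative assumptions; a real tracker would use vector states and matrices.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0  # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction: state persists, uncertainty grows by the process noise
        p = p + process_var
        # Update: the gain weighs model uncertainty against measurement noise
        k = p / (p + measurement_var)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


steady = kalman_1d([5.0] * 50, process_var=1e-4, measurement_var=0.1)
# Same first measurement, different trust in the sensor
distrust_sensor = kalman_1d([10.0], process_var=0.0, measurement_var=100.0)
trust_sensor = kalman_1d([10.0], process_var=0.0, measurement_var=0.01)
```

The gain `k` expresses the trade-off on the slide: a noisy sensor (large `measurement_var`) moves the estimate only slightly, while a reliable sensor pulls it almost all the way to the measurement.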

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful: very early in development, to build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
 – the voice editor by IBM (1984)

[Figure: two panels, “The Wizard” and “What the user sees”]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system’s intelligence and interacts with user

• uses real or mock interface
 – “Pay no attention to the man behind the curtain”

• user uses computer as expected

• “wizard” (sometimes hidden)
 – interprets subject’s input according to an algorithm
 – has computer/screen behave in appropriate manner

• good for
 – adding simulated and complex vertical functionality
 – testing futuristic ideas

• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
 – Assume normality for relevant distributions; work in parameter space (means and variances)
 – Student t-test and ANOVA
• Non-parametric (no normality assumption)
 – Kruskal-Wallis
 – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
 – p-value: probability of the observed result arising by chance

• Devised as a way to monitor the quality of stout (Student was a pen name, used so that Guinness could protect the idea of using statistics)

• Independent and paired variants
 – Control group and treatment group (n = participants in each group)
 – Same group before and after treatment
 – Assumptions: sample size, variance
   • Equal sample size and equal variance

• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means

• the process
 1. compute df
 2. choose desired significance p (aka α)
 3. calculate value of the t statistic
 4. compare it to the critical value of t given p, df: t(p, df)
 5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong

• null hypothesis rejection area

[Figure: distribution curves showing the regions (or single region) for rejecting the null hypothesis, bounded by the critical value t(p, df)]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples

• compute standard error of difference, sed

• compute t

note: df computation
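The equations for this slide did not survive extraction. As a minimal sketch of the three steps above, assuming two independent samples with equal variance (the pooled form), the calculation could look like the following. The function name `pooled_t` is illustrative and not taken from the tutorial.

```python
import math
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic using a combined (pooled) variance."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # degrees of freedom
    # Combined variance: the two sample variances weighted by their own df.
    pooled_var = ((n_a - 1) * variance(sample_a) +
                  (n_b - 1) * variance(sample_b)) / df
    # Standard error of the difference between the two sample means.
    sed = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df


t, df = pooled_t([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(t, df)
```

The absolute value of the returned t is then compared with the critical value t(p, df) for the chosen significance level.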

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p, df) up in a pre-computed table
 – in back of any statistics textbook
 – on web, e.g.
   httpwwwmedcalcbemanualmpage13-04bhtml
   httpwwwstatsoftcomtextbookstathomehtml

• or (now most common) compute the p for your t(df) directly
 – Matlab or Excel functions
 – web calculators (e.g., search on “student’s t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

[Table: critical values of t; columns are two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
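A minimal sketch of the one-way case, under the assumption that each group is a plain list of numbers: the F ratio is the between-group variance divided by the within-group variance. The function name `one_way_anova_f` is illustrative. The second call rescales and shifts every value to show that the ratio is unchanged by scaling and bias.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
rescaled = [[10 * x + 5 for x in g] for g in groups]
print(one_way_anova_f(groups))
print(one_way_anova_f(rescaled))
```

The resulting F is compared with a tabulated critical value for (df_between, df_within) to determine significance.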

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
 – Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
 – two antennae
 – change in electrostatic field translated to pitch control of heterodyne oscillators
 – beat frequency amplified

• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
 – downward and side-to-side
 – potentiometers

• right hand can modulate loudness and pitch

• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
 – like a musical instrument

• mapping partially learned
• speech is
 – intelligible
 – expressive
 – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: gesture-to-speech mappings arranged along an axis of approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
 – open configuration of hand represents open vocal tract
 – X, Y position determines vowel sounds (like tongue)

• Consonants
 – constrictions in hand represent constriction in vocal tract

• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
 – Mixture of experts model
   • Vowel/Consonant decision network
   • Vowel Expert network
   • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
 – Input: 10 flex angles
   • 4x2 finger joints
   • Thumb abduction and thumb rotation
 – Output
   • Probability of vowel
 – Training
   • 2600 consonants, 700 vowels
   • 0 error
 – Testing
   • 1380 consonants, 234 vowels
   • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
 – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
 – Fixed std deviation (0.15)
 – Input: x, y values
 – Output: 8 formant parameters

• Training
 – 100 examples of each vowel with random noise added
 – 0.1 error

• Testing
 – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
 – Gaussian activation
   • Weights are centre
 – Normalized over all units in group
   • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
 – interpolation according to width parameter
 – Plateaus around nearest centre
   • Closest RBF dominates
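A minimal sketch of a normalized RBF forward pass, assuming Gaussian hidden units that share one fixed width and a linear output layer. The function name and the `centres`, `width`, and `weights` arguments are illustrative; only the 0.15 width is taken from the vowel-network slide, and the toy centres and weights are made up.

```python
import math


def normalized_rbf(x, centres, width, weights):
    """Return (normalized hidden activations, outputs) for input vector x."""
    # Gaussian activation of each hidden unit, largest when x is at its centre.
    # Assumes x is close enough to some centre that the sum below is non-zero.
    acts = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2 * width ** 2))
            for c in centres]
    total = sum(acts)
    norm = [a / total for a in acts]  # normalized over all units: sums to 1
    n_out = len(weights[0])
    # Linear output layer: weights[j][k] connects hidden unit j to output k.
    out = [sum(norm[j] * weights[j][k] for j in range(len(centres)))
           for k in range(n_out)]
    return norm, out


centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[1.0], [3.0]]
print(normalized_rbf((0.1, 0.0), centres, 0.15, weights))  # near first centre
print(normalized_rbf((0.5, 0.0), centres, 0.15, weights))  # halfway between
```

Near a centre the closest unit takes almost all of the normalized activation (the plateau), while between centres the output interpolates between the corresponding weights.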

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
 – Input: finger and thumb angles
   • Note: added thumb abduction and rotation later
 – Output: formant parameters and voicing

• Training
 – 350 approximants, 1510 fricatives and 740 nasals
 – 0.05 error

• Testing
 – 255 approximants, 960 fricatives, 160 nasals
 – 1 error

• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

[Figure: Glove-TalkII block diagram, repeated]

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
 – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
 – 100 Hz, 6-bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid George ssfelseceubcca gtzancsuvicca

Tuesday 22 October 13

ICASSP 2013 tutorial

Digitizing

27

Sensors and Actuators

• Converting change in resistance to voltage (typical sensor has variable resistance)
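One common way to do this conversion is a voltage divider: the variable-resistance sensor is placed in series with a fixed resistor and the voltage across the fixed resistor is sampled by an ADC. The sketch below assumes that circuit; the 10 kΩ resistor, 5 V supply and 10-bit converter are illustrative values, not taken from the slides.

```python
def divider_voltage(r_sensor, r_fixed=10_000.0, v_in=5.0):
    """Voltage across the fixed resistor of a sensor + fixed-resistor divider."""
    return v_in * r_fixed / (r_sensor + r_fixed)


def adc_counts(voltage, v_ref=5.0, bits=10):
    """Quantize a voltage to an integer ADC reading in [0, 2**bits - 1]."""
    levels = 2 ** bits - 1
    clamped = min(max(voltage, 0.0), v_ref)  # keep within the converter's range
    return round(clamped / v_ref * levels)


# As the sensor resistance drops, the measured voltage rises.
for r in (100_000.0, 30_000.0, 10_000.0, 1_000.0):
    v = divider_voltage(r)
    print(r, v, adc_counts(v))
```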

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
 – lightly packed carbon
 – compression increases conduction
 – needs power supply

• Capacitor (condenser)
 – capacitor between a stationary metal plate and a light metallic diaphragm
 – compression changes capacitance by moving diaphragm
 – needs power supply

• Electret and Piezoelectric
 – mentioned before
 – no external power needed

• Magnetic (moving coil)
 – induction: moving conductor in magnetic field
 – diaphragm with coil of wire immersed in magnetic field

• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
 – about 3 transistors (photodiode version)
 – 1 transistor (photogate version)

• amplifier circuitry takes up a lot of room
 – only viable recently as sub-micron tech gets better
 – only useful for low-end still

• cheap (<$100), low power (10-50 mW vs. 1-2 W)

• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
 – head, arms and feet
 – body geometry
 – 20 joints per person

• face recognition
• RGB camera
 – 30 Hz
• depth sensor
 – Infrared projection + camera
• microphone array
 – directional sound localization, speech recognition and noise cancellation

• Cheap

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
 – i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
 – Brushless DC motor that divides rotation into equal steps
 – Move and hold, no feedback circuitry required
 – Low cost

• Servo Motor
 – Motor coupled with sensor for feedback
 – High-performance alternative to stepper motors
 – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
 – GPS
 – Accelerometer
 – Gyro
 – Temperature
 – Microphone
 – Capacitive display
 – Networking
 – input port for more

• Actuators
 – Speaker
 – Headphones
 – Vibrator
 – High-res display
 – Networking
 – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
 – e.g., National Instruments DAQ
   • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
   • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters

• Icube (voltage -> MIDI signal)

• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples, each plotted against time; multiple channels are possible (for example, microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

[Figure: image and volume sequences over time]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
 – Fixed and adaptive threshold
 – Local maximum
 – Spacing constraints
 – Interpolation
 – Output is peak locations and amplitudes
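A minimal sketch combining three of the variants listed above: a fixed threshold, a local-maximum test, and a spacing constraint. The function name `find_peaks` and its parameters are illustrative; adaptive thresholds and interpolation of the peak position are left out.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (index, amplitude) pairs for local maxima at or above threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if not is_local_max or signal[i] < threshold:
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger of the two.
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
            continue
        peaks.append((i, signal[i]))
    return peaks


signal = [0, 1, 0, 0.5, 0, 3, 2.9, 0, 2, 0]
print(find_peaks(signal, threshold=0.8))
print(find_peaks(signal, threshold=0.8, min_spacing=4))
```

The output matches the last bullet: a list of peak locations paired with their amplitudes.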

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes
• FFT: the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

[Figure: two spectrograms of the same signal illustrating the time-frequency tradeoff: 256 samples at 22050 Hz, and 4096 samples at 22050 Hz]

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution, based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
 – Simplest approach: LPF
 – Spectral subtraction using noise profile
 – Median filtering in time-frequency plane

• Image
 – Challenge: preserve edge discontinuities
 – Gaussian filtering (2D)
 – Anisotropic diffusion: preserves edges
 – Median filtering
 – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
 – L/M: interpolation (L) followed by decimation (M)
 – Band-limited interpolation at arbitrary points
 – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
 – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
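For sensor streams, the linear case mentioned in the last bullet is often enough. The sketch below assumes uniformly spaced samples and integer sampling rates; `lerp_at` and `resample` are illustrative names. It is the cheap approximation, not band-limited (sinc) interpolation.

```python
def lerp_at(samples, t):
    """Linearly interpolate uniformly spaced samples at fractional index t."""
    if len(samples) < 2 or not 0 <= t <= len(samples) - 1:
        raise ValueError("t is outside the sampled range")
    i = min(int(t), len(samples) - 2)  # left neighbour (clamped at the end)
    frac = t - i
    return (1 - frac) * samples[i] + frac * samples[i + 1]


def resample(samples, rate_in, rate_out):
    """Resample from rate_in to rate_out (integers) by linear interpolation."""
    # Number of output samples that fit inside the input's time span.
    n_out = ((len(samples) - 1) * rate_out) // rate_in + 1
    return [lerp_at(samples, k * rate_in / rate_out) for k in range(n_out)]


print(resample([0, 10, 20], rate_in=1, rate_out=2))
```

The same function could bring streams sampled at different rates (such as 60 Hz and 100 Hz) onto a common time grid before they are combined.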

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
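A minimal sketch of calibration followed by scaling, assuming the calibration step simply records the minimum and maximum raw reading. The function names, the raw readings, and the 20 to 20000 target range are illustrative. An `exponent` other than 1 gives the power-law curve mentioned above.

```python
def calibrate_range(readings):
    """Calibration step: observed minimum and maximum of a sensor."""
    return min(readings), max(readings)


def scale(reading, in_min, in_max, out_min, out_max, exponent=1.0):
    """Map a reading to [out_min, out_max]; exponent != 1 gives a power law."""
    x = (reading - in_min) / (in_max - in_min)  # normalize to 0..1
    x = min(max(x, 0.0), 1.0)                   # clamp out-of-range readings
    return out_min + (out_max - out_min) * x ** exponent


lo, hi = calibrate_range([112, 500, 903])
for raw in (112, 500, 903):
    # Map a knob reading to a frequency-like range with a squared response.
    print(raw, scale(raw, lo, hi, 20.0, 20000.0, exponent=2.0))
```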

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
 – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
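The formulas on this slide did not survive extraction. As a sketch of the underlying idea, assuming the standard definition r[k] = Σ x[n]·x[n+k], a periodic signal produces a large autocorrelation value at the lag equal to its period. The direct sum below costs O(N · max_lag); the efficient computation the slide refers to would use the FFT instead. Function names are illustrative.

```python
def autocorrelation(x, max_lag):
    """r[k] = sum over n of x[n] * x[n + k], for k = 0 .. max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation in [min_lag, max_lag]."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])


# An impulse every 4 samples, e.g. a detected onset on every beat.
pulse_train = [1, 0, 0, 0] * 8
print(estimate_period(pulse_train, min_lag=1, max_lag=10))
```

Excluding lag 0 via `min_lag` matters, because every signal correlates perfectly with itself at zero lag.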

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
 – K-means
 – Compression-based
 – Histogram-based
 – Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs. multiple objects
• Typically utilize some form of motion model
• Typically two stages
 – Target representation and location (bottom up)
 – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

[Figure: MFCC computation pipeline: Mel-filtering -> Log -> DCT -> MFCCs. Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters, with filter edges given relative to the centre frequency (CF)]

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)

• Low coefficients represent most of the signal - can throw high

• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
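A minimal sketch of the transform itself, assuming the unnormalized DCT-II definition X[k] = Σ x[n]·cos(π/N·(n + 0.5)·k). The function names and the stand-in input values are illustrative. A smooth input concentrates its energy in the low-index coefficients, which is why truncating the list loses little.

```python
import math


def dct2(x):
    """Unnormalized DCT-II of a list of numbers (direct O(N^2) evaluation)."""
    n_samples = len(x)
    return [sum(x[n] * math.cos(math.pi / n_samples * (n + 0.5) * k)
                for n in range(n_samples))
            for k in range(n_samples)]


def keep_first(coeffs, m):
    """Keep only the first m coefficients, as done when forming MFCCs."""
    return coeffs[:m]


# Hypothetical log filterbank energies: 40 values (13 + 27 filters).
log_energies = [math.log(1.0 + 0.1 * i) for i in range(40)]
mfcc_like = keep_first(dct2(log_energies), 13)
print(len(mfcc_like))

# A constant input has all of its energy in coefficient 0.
print(dct2([1.0] * 8))
```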

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
 – Mean, Median, Variance

• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
 – Generative approaches
   • Naïve Bayes
   • Gaussian Mixture Models
 – Discriminative approaches
   • Support Vector Machines
   • Decision trees

• Non-parametric
 – K-nearest Neighbors
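As a sketch of the simplest algorithm on this list, K-nearest neighbors predicts the majority label among the k training vectors closest to the query. The function name, the toy feature vectors, and the labels are illustrative, and Euclidean distance is an assumption.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training vectors nearest to query."""
    # Indices of the training vectors, ordered by distance to the query.
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["vowel", "vowel", "vowel", "consonant", "consonant", "consonant"]
print(knn_predict(train_x, train_y, (0.5, 0.5)))
print(knn_predict(train_x, train_y, (5.5, 5.5)))
```

There is no training step: the "model" is the stored training set, which is why the method is called non-parametric.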

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
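A minimal sketch of the three measures in the first bullet, assuming parallel lists of ground truth and predicted labels. The function names are illustrative, and the F-measure shown is the balanced (F1) form for a single class.

```python
from collections import Counter


def confusion_matrix(truth, predicted):
    """Counts keyed by (true label, predicted label)."""
    return Counter(zip(truth, predicted))


def accuracy(truth, predicted):
    """Fraction of items whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    cm = confusion_matrix(truth, predicted)
    tp = cm[(positive, positive)]
    fp = sum(c for (t, p), c in cm.items() if p == positive and t != positive)
    fn = sum(c for (t, p), c in cm.items() if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


truth = ["a", "a", "a", "b", "b"]
predicted = ["a", "a", "b", "b", "a"]
print(accuracy(truth, predicted))
print(f_measure(truth, predicted, "a"))
print(confusion_matrix(truth, predicted))
```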

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
 – Hierarchical
 – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution-based
 – EM algorithm

• Density-based (areas of high density are clusters; noise and border points ignored)
 – DBSCAN

• Graph-based
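A minimal sketch of K-means (Lloyd's algorithm), assuming the starting centres are supplied by the caller and a fixed number of iterations. The function name and the toy points are illustrative; practical implementations add random initialization and a convergence test.

```python
import math


def kmeans(points, centres, iterations=10):
    """Lloyd's algorithm from given starting centres; returns (centres, labels)."""
    centres = [tuple(c) for c in centres]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centre.
        labels = [min(range(len(centres)), key=lambda j: math.dist(p, centres[j]))
                  for p in points]
        # Update step: each centre moves to the mean of its assigned points.
        for j in range(len(centres)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = tuple(sum(axis) / len(members)
                                   for axis in zip(*members))
    return centres, labels


points = [(0, 0), (0, 2), (10, 0), (10, 2)]
print(kmeans(points, centres=[(1, 1), (9, 1)]))
```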

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
 – Davies-Bouldin Index
 – Dunn Index

• External evaluation (additional ground truth labels); similar to classification
 – F-measure
 – Rand measure
 – Jaccard Index
 – Confusion matrix

• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature

• Parametric
 – Linear regression
 – Ordinary least squares

• Non-parametric
 – Kernel Regression
 – Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
 – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)

• Statistical significance of results
 – Parametric methods
 – F-test
 – t-test for individual parameters
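A minimal sketch of goodness of fit for the simplest case, assuming a single input feature and an ordinary least squares line. The function names and the toy data are illustrative. R squared is computed as 1 minus the ratio of residual to total sum of squares.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) /
             sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx


def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


xs, ys = [0, 1, 2, 3], [1, 3, 5, 7]
slope, intercept = fit_line(xs, ys)
predicted = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predicted))
```

A value near 1 means the model explains most of the variance; predicting the mean for every input gives 0.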

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, “Training Surrogate Sensors in Musical Gesture Acquisition Systems,” IEEE Trans. on Multimedia, 2011

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
 – Observable
 – Hidden

• Stationary process (not static)
• Markovian property (current state depends only on finite history; typically just the previous time slice)

• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over the current state given the evidence to date (this also gives the likelihood of the evidence)
• Prediction: posterior distribution of a future state given the evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor readings, and update the models
  - Updated models provide new estimates and
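The filtering task listed above can be sketched for a discrete hidden state as a predict-then-update recursion (the forward algorithm). The function name, the argument layout, and the list-of-lists representation are assumptions chosen for this illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for every time step t.

    transition[i][j] = P(next state = j | current state = i)
    emission[i][o]   = P(observation = o | state = i)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the observation, then normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history
```

The normalizing constant `total` at each step is the probability of the new observation given the earlier ones, which is why filtering also yields the likelihood of the evidence.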

Tuesday 22 October 13

ICASSP 2013 tutorial

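Hidden Markov Models I

Uncertainty and Time

[Figure: hidden Markov model. A hidden state (a single discrete variable, with nodes labelled 1 to 4) moves from time t-1 to time t according to the transition probabilities, and at each time t produces an observation according to the emission probabilities. The transition and emission probabilities together form the model.]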

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian (LG) conditional distributions represent the state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is a linear function of the current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
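The prediction and update steps can be sketched for the simplest possible case: a scalar state. This sketch assumes a random-walk model (the state is expected to stay where it is), which is simpler than the constant dx/dt model mentioned on the slide; the function and parameter names are made up for the example.

```python
def kalman_1d(measurements, process_var, measurement_var, mean0=0.0, var0=1.0):
    """Scalar Kalman filter; returns a list of (mean, variance) estimates."""
    mean, var = mean0, var0
    estimates = []
    for z in measurements:
        # Prediction step: the mean is unchanged, the uncertainty grows.
        var = var + process_var
        # Update step: the gain balances model and observation reliability.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates
```

A large `measurement_var` makes the gain small, so the filter trusts its model and reacts slowly; a large `process_var` does the opposite. This is the trade-off named in the last bullet.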

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial


Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC), ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce a vast (unmanageable) amount of data
  - output is a description of practices rather than specific designs
• ethnographers are not trained as designers; they are taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become first-class members in the design process
  - active collaborators vs. passive participants (e.g. interviewees)
• users are considered subject matter experts
• iterative process: all design stages are subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)

[Figure: what the user sees; the wizard]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with the user
• uses a real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses the computer as expected
• "wizard" (sometimes hidden)
  - interprets the subject's input according to an algorithm
  - has the computer/screen behave in an appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  - Assume normality for the relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p and df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution curves marking the critical value t(p, df), with either two regions for rejecting the null hypothesis or one region for rejecting the null hypothesis; sample means X1 and X2]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
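The formulas on this slide did not survive extraction. The three steps can be sketched as follows for the independent, equal-variance variant of the test, using the standard pooled-variance formulas; the function name is an assumption made for this example.

```python
import math


def pooled_t(sample1, sample2):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample1)   # sum of squared deviations
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2                               # degrees of freedom
    combined_var = (ss1 + ss2) / df                # combined (pooled) variance
    sed = math.sqrt(combined_var * (1.0 / n1 + 1.0 / n2))  # std. error of difference
    t = (mean1 - mean2) / sed
    return t, df
```

For the samples [1, 2, 3, 4, 5] and [3, 4, 5, 6, 7] this gives t = -2 with df = 8. The two-tailed critical value at α = 0.05 for df = 8 is 2.306, so the null hypothesis would not be rejected at that level.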

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

[Table: critical values of t; column headings are two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes the t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
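The "dividing two variances" step can be sketched for the one-way case. The function name is an assumption made for this illustration; the resulting F ratio would still need to be compared against an F table (or converted to a p-value) for the returned degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation explained by differences between the group means.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation left over inside each group.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within
```

Multiplying every value by a constant and adding an offset scales both variances by the same factor, so F is unchanged. This is the independence from scaling and bias stated in the first bullet.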

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

I&I Case studies

• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

[Image credits: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum]

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of gesture-to-speech mappings, ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990).]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions
  - constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

Glove-TalkII System

[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, and foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  - Gaussian activation
    • Weights are the centre
  - Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
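The behaviour of normalized RBF units can be sketched in a few lines. This is an illustration only and not the Glove-TalkII implementation: the function name, the argument layout, and the toy centres used in the usage note are assumptions.

```python
import math


def normalized_rbf_output(x, centres, weights, width):
    """Forward pass of a normalized RBF layer followed by a linear output layer.

    centres[j] is the centre of hidden unit j; weights[j][k] connects hidden
    unit j to output k; width is the shared Gaussian standard deviation.
    """
    # Gaussian activation: largest when the input is close to the centre.
    activations = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centres
    ]
    # Normalize over all hidden units so the activations sum to one.
    total = sum(activations)
    normalized = [a / total for a in activations]
    n_outputs = len(weights[0])
    return [sum(normalized[j] * weights[j][k] for j in range(len(centres)))
            for k in range(n_outputs)]
```

With centres at (0, 0) and (1, 0), output weights 100 and 200, and width 0.15, an input at either centre returns almost exactly that centre's weight (the plateau), while an input halfway between them returns 150 (the interpolation).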

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user

Tuesday 22 October 13

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Physical Property Sensors

28

Sensors and Actuators

• Piezoelectric Sensors
• Force Sensing Resistors
• Accelerometer (Analog Devices ADXL50)
• Biopotential Sensors
• Microphones
• Photodetectors
• CCDs and CMOS cameras
• Electric Field Sensors
• RFID
• Magnetic trackers (Polhemus, Ascension)
• and more…

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to an insulator coated with conductive material. When a conductor (such as a finger) touches the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

Sensors and Actuators

• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold; no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High-performance alternative to stepper motors
  - Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples, each plotted against time; multiple channels (for example microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

Sensors and Actuators

[Figure: two panels with time axes]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
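Several of the variants listed above can be combined in one small peak picker. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing, and parabolic interpolation; the function names and the tie-breaking rule (keep the larger of two peaks that are too close) are assumptions made for illustration.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return a list of (index, amplitude) pairs for the detected peaks."""
    peaks = []
    for n in range(1, len(signal) - 1):
        is_local_max = signal[n] > signal[n - 1] and signal[n] >= signal[n + 1]
        if not is_local_max or signal[n] < threshold:
            continue
        if peaks and n - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger one.
            if signal[n] > peaks[-1][1]:
                peaks[-1] = (n, signal[n])
            continue
        peaks.append((n, signal[n]))
    return peaks


def refine_peak(signal, n):
    """Fit a parabola through samples n-1, n, n+1; return (position, height)."""
    a, b, c = signal[n - 1], signal[n], signal[n + 1]
    denom = a - 2.0 * b + c
    if denom == 0:
        return float(n), b
    offset = 0.5 * (a - c) / denom
    return n + offset, b - 0.25 * (a - c) * offset
```

The interpolation step recovers peak positions that fall between samples, which matters when the sensor sampling rate is low compared with the timing precision needed.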

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
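The trade-off for the two window sizes on the slide can be worked out directly: a longer window gives finer frequency resolution but coarser time resolution. The helper name below is an assumption made for this example.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, spacing between frequency bins in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz
```

At 22050 Hz, a 256-sample window lasts about 11.6 ms with bins about 86.1 Hz apart, while a 4096-sample window lasts about 185.8 ms with bins about 5.4 Hz apart.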

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution, based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data: frequently linear or quadratic
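For sensor data, the linear case mentioned in the last bullet can be sketched as follows. The function name is an assumption; the sketch also assumes that `times` is strictly increasing with at least two entries and that `new_times` is sorted in ascending order.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at each time in new_times."""
    out = []
    i = 0
    for t in new_times:
        # Advance to the segment [times[i], times[i + 1]] that contains t.
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)
        frac = min(max(frac, 0.0), 1.0)   # hold the end values outside the range
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out
```

This is a common way to bring asynchronous or irregularly timed sensor readings onto the regular sample grid expected by the rest of a signal-processing chain.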

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
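The two NIME examples can be sketched as follows. The calibrated range (`lo`, `hi`) is assumed to have been measured beforehand; the function names, the 55 to 1760 Hz range, and the exponent of 2 are illustrative assumptions. Note that the knob mapping is exponential (equal knob steps give equal pitch intervals), while the slider mapping is a power law.

```python
def normalize(reading, lo, hi):
    """Map a raw reading from the calibrated range [lo, hi] to [0, 1]."""
    x = (reading - lo) / (hi - lo)
    return min(max(x, 0.0), 1.0)          # clip readings outside the range


def knob_to_frequency(reading, lo, hi, f_min=55.0, f_max=1760.0):
    """Exponential mapping from a knob position to a frequency in Hz."""
    x = normalize(reading, lo, hi)
    return f_min * (f_max / f_min) ** x


def slider_to_gain(reading, lo, hi, exponent=2.0):
    """Power-law mapping from a slider position to a gain in [0, 1]."""
    return normalize(reading, lo, hi) ** exponent
```

For a 10-bit sensor reading in 0 to 1023, the middle of the knob lands at the geometric mean of the two frequency limits, and the middle of the slider gives a gain of 0.25.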

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
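A direct, time-domain sketch of autocorrelation-based period estimation is given below. It is the simple version, not the efficient one: the cost grows with signal length times the number of lags, whereas the efficient computation referred to on the slide is normally done with the FFT. The function names are assumptions made for this example.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [s - mean for s in signal]
    return [sum(centred[i] * centred[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the strongest autocorrelation."""
    ac = autocorrelation(signal, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: ac[lag])
```

Restricting the search to a plausible lag range (for example, the lags corresponding to musically sensible tempi) avoids picking lag 0 and reduces confusion with multiples of the true period.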

Tuesday 22 October 13


ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

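[Figure: MFCC computation. Mel-scale filterbank with 13 linearly-spaced filters and 27 log-spaced filters; filter edges are labelled relative to each centre frequency CF, using offsets of 130 (CF-130, CF+130) and a factor of 1.0718. Processing chain: Mel-filtering, Log, DCT, MFCCs.]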

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
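Energy compaction can be checked with a direct implementation of the DCT-II. The sketch below is unnormalized and runs in quadratic time; practical implementations use an FFT-based algorithm. The function name is an assumption.

```python
import math


def dct2(signal):
    """Unnormalized DCT-II: X[k] = sum_i x[i] * cos(pi * (i + 0.5) * k / N)."""
    n = len(signal)
    return [sum(signal[i] * math.cos(math.pi * (i + 0.5) * k / n)
                for i in range(n))
            for k in range(n)]
```

For a constant signal, all of the energy ends up in coefficient 0, and for a smooth ramp the first two coefficients carry almost all of it. This is why keeping only the low coefficients, as MFCCs do, loses little of a smooth spectral envelope.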

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of the correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, by providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
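The non-parametric case is the easiest to sketch, because the training set itself is the model. The function name, the use of squared Euclidean distance, and the majority vote are assumptions made for this illustration.

```python
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict the label of query by majority vote among its k nearest neighbours."""
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(vector, query)), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

As a usage example, a handful of labelled two-dimensional feature vectors (say, two features extracted from soft and loud drum hits) is enough to classify a new feature vector by proximity.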

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided, based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

• K-means and variants
• Distribution-based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points are ignored)
  - DBSCAN
• Graph-based
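A minimal k-means sketch, assuming the initial centroids are supplied by the caller (real implementations usually pick them randomly or with a seeding heuristic). The points and names are illustrative only.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points and recomputing centroids."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster))
            if cluster else centroids[i]   # keep an empty cluster's centroid
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters

# Hypothetical 2D feature vectors forming two obvious groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.1),
          (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (6.0, 6.0)])
```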

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation

• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation

• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation

• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
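The sketch below fits a line by ordinary least squares and reports R squared as a goodness-of-fit measure. It handles a single input variable only, and the data points are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(ys, predictions):
    """1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical sensor value (x) against a measured control value (y).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.1, 4.9, 7.2, 8.8]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```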

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems. IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages

• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late fusion are the extremes of a full spectrum of possibilities

[Diagram: two parallel processing chains, each labelled Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use the camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time

• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over

• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on a finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal

• Filtering: posterior distribution over the current state given the evidence to date; also gives the likelihood of the evidence
• Prediction: posterior distribution of a future state given evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor reading, and update models
  - Updated models provide new estimates and
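The filtering task can be sketched with the standard forward recursion: predict with the transition model, weight by the observation likelihood, normalize. The two states, the observation symbols, and all probabilities below are hypothetical values chosen for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for every time step t."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight each state by how well it explains the observation.
        weighted = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(weighted)
        belief = [w / total for w in weighted]
        history.append(belief)
    return history

# Hypothetical model: state 0 = "playing", state 1 = "resting".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]          # P(next state | current state)
emission = [{"loud": 0.9, "quiet": 0.1},       # P(observation | playing)
            {"loud": 0.2, "quiet": 0.8}]       # P(observation | resting)
beliefs = forward_filter(prior, transition, emission, ["loud", "loud"])
```

Each step only uses the previous belief and the newest observation, so the recursion is causal and runs in constant time per sample.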

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

[Figure: HMM diagram. Labels: Hidden; Observed; Model (transition probabilities between the states at t-1 and t, emission probabilities at t); Observations; Hidden state (single discrete variable); states numbered 1 to 4]

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering

• Streams of noisy input data
• Basic idea (t -> t+1)
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter

• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
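A one-dimensional sketch of the forward step, assuming a random-walk state model (the state is expected to stay put, with some process noise). The variances and the measurement values are invented; a real system would use a multi-dimensional state and covariance matrices.

```python
def kalman_1d(measurements, process_var, measurement_var, mean=0.0, var=1.0):
    """Scalar Kalman filter; returns (mean, variance) after each measurement."""
    estimates = []
    for z in measurements:
        # Prediction step: state unchanged, uncertainty grows.
        var = var + process_var
        # Update step: the gain balances model reliability against
        # observation reliability.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates

# Hypothetical noisy readings of a sensor whose true value is about 1.0.
noisy = [1.2, 0.8, 1.1, 0.95, 1.05, 0.9, 1.0, 1.1]
track = kalman_1d(noisy, process_var=0.01, measurement_var=0.5)
```

A large `measurement_var` makes the filter trust its model and smooth heavily; a large `process_var` makes it follow the measurements more closely.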

102

Tuesday 22 October 13

ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time: both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amount of data
  - output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz

• A method of testing a system that does not exist
  - the voice editor, by IBM (1984)

[Figure panels: "The Wizard", "What the user sees"]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz

• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood

• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test

• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test

• the point: establish a confidence level in the difference we've found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution curves marking the critical value t(p, df), showing either two regions or one region for rejecting the null hypothesis; axis labels X1, X2]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
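The formulas on this slide did not survive extraction. The sketch below uses the common textbook form of the independent two-sample t-test with pooled variance, which is an assumption about what the slide showed; the sample values are invented.

```python
import math

def independent_t(sample_a, sample_b):
    """Pooled-variance two-sample t statistic and its degrees of freedom."""
    n1, n2 = len(sample_a), len(sample_b)
    mean1, mean2 = sum(sample_a) / n1, sum(sample_b) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample_a)
    ss2 = sum((x - mean2) ** 2 for x in sample_b)
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df           # combined variance
    sed = math.sqrt(pooled_variance * (1.0 / n1 + 1.0 / n2))
    t = (mean1 - mean2) / sed
    return t, df

# Hypothetical task scores for a treatment group and a control group.
treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [1.0, 2.0, 3.0, 4.0, 5.0]
t, df = independent_t(treatment, control)
```

With these numbers t is about 4.0 with 8 degrees of freedom, which exceeds the two-tailed critical value of 2.306 for α = 0.05, so the null hypothesis would be rejected at that level.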

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical

• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I

• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II

• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
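As a sketch of "dividing two variances", the function below computes the one-way ANOVA F ratio (between-group variance over within-group variance). The group data are invented, and looking up the resulting F in a table is left out.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    values = [v for group in groups for v in group]
    grand_mean = sum(values) / len(values)
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical ratings of three interface variants.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, df_between, df_within = one_way_anova_f(groups)

# Rescaling and shifting every value leaves F unchanged.
rescaled = [[2.0 * v + 10.0 for v in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
```

The second call illustrates the first bullet: both variances scale by the same factor, so their ratio does not change.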

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13

ICASSP 2013 tutorial

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13

ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: a spectrum of mapping types, in order: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems labelled on the figure: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec), with ticks 10-30, 100, 130, 200, 500]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - X, Y position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• 26+ dimensions
• constrained subspace
• 10 dimensions

[Figure labels: Input, Output]

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping

• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks

• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function, synthesizer. Output: speech]

Tuesday 22 October 13

ICASSP 2013 tutorial 136

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13

ICASSP 2013 tutorial 138

GTII Vowel Network

• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units

• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
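The behaviour described above can be sketched as Gaussian activations divided by their sum over the group. The centres, the width, and the function name are hypothetical; this illustrates the normalization idea and is not the Glove-TalkII implementation.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activation per centre, normalized to sum to 1 over the group."""
    raw = [math.exp(-math.dist(x, c) ** 2 / (2.0 * width ** 2))
           for c in centres]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical unit centres in a 2D input space.
centres = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
near_first = normalized_rbf((0.1, 0.05), centres, width=0.15)
halfway = normalized_rbf((0.5, 0.0), centres, width=0.15)
```

Close to a centre the corresponding unit takes almost all of the activation (the plateau); halfway between two centres the activation is shared, and the width controls how sharp that transition is.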

Tuesday 22 October 13

ICASSP 2013 tutorial 142

Consonant Network

• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user

Tuesday 22 October 13

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13

DIVA in the hands of

145

Tuesday 22 October 13

Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13

Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13

Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13

Summary

158

Covered many aspects

• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

29

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Human Action Oriented

30

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors

• Material whose resistance changes when force is applied to it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing

• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD amp CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras

• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancelation
• Cheap

ICASSP 2013 tutorial

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors

• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors

• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect

• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets

• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ

• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

Time Time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering

• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
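One of the simplest digital filters is a one-pole low-pass, sketched below under the assumption of a zero initial state. The coefficient value and test signals are chosen only for illustration.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], starting from y = 0."""
    output, previous = [], 0.0
    for x in samples:
        previous = alpha * x + (1.0 - alpha) * previous
        output.append(previous)
    return output

# A constant (zero-frequency) input passes through almost unchanged...
steady = one_pole_lowpass([1.0] * 50, alpha=0.2)
# ...while a sample-to-sample alternation (highest frequency) is attenuated.
jitter = one_pole_lowpass([1.0, -1.0] * 50, alpha=0.2)
```

Smaller `alpha` values smooth more strongly but respond more slowly, which matters when the filter sits between a sensor and a sound parameter.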

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection

• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
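A minimal sketch combining three of the variants listed above: a fixed threshold, a local-maximum test, and a minimum spacing between peaks. The rule used to resolve peaks that are too close (keep the larger one) is one possible choice among many, and the signal is invented.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return (index, amplitude) pairs for local maxima above the threshold."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if not (is_local_max and signal[i] >= threshold):
            continue
        if peaks and i - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep whichever is larger.
            if signal[i] > peaks[-1][1]:
                peaks[-1] = (i, signal[i])
        else:
            peaks.append((i, signal[i]))
    return peaks

# Hypothetical onset-strength envelope.
envelope = [0.0, 1.0, 0.0, 0.2, 0.0, 3.0, 2.5, 2.8, 0.0, 0.0, 2.0, 0.0]
peaks = find_peaks(envelope, threshold=0.5, min_spacing=3)
```

Here the bump at index 3 falls below the threshold and the secondary maximum at index 7 is suppressed by the spacing constraint.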

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view

Filterbank view

Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples 22050 Hz

4096 samples 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
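The trade-off for the two window sizes shown on the slide can be worked out directly: a longer window covers more time per frame but gives narrower frequency bins. The function name is an assumption for this example.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in seconds, frequency bin spacing in Hz)."""
    return window_size / sample_rate, sample_rate / window_size

short_window = stft_resolution(256, 22050)
long_window = stft_resolution(4096, 22050)
```

At 22050 Hz, 256 samples span about 11.6 ms with bins about 86 Hz apart, while 4096 samples span about 186 ms with bins about 5.4 Hz apart.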

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution, based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising

• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
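To show why median filtering is attractive when discontinuities matter, here is a one-dimensional sketch (the image and time-frequency cases apply the same idea over 2D neighbourhoods). Edge handling by repeating the end values is an assumption of this example.

```python
from statistics import median

def median_filter(signal, width=3):
    """Replace each sample with the median of its neighbourhood."""
    half = width // 2
    padded = [signal[0]] * half + list(signal) + [signal[-1]] * half
    return [median(padded[i:i + width]) for i in range(len(signal))]

# Hypothetical sensor stream: one spurious spike, then a genuine step.
noisy = [0.0, 0.0, 9.0, 0.0, 0.0, 5.0, 5.0, 5.0, 5.0]
cleaned = median_filter(noisy, width=3)
```

The isolated spike disappears while the step stays sharp and in place; a low-pass filter would instead smear both.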

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling

• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
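For sensor data the linear case is often enough. The sketch below resamples an irregularly or slowly sampled stream onto a uniform grid; holding the end values outside the measured range is an assumption, as are the names and data.

```python
def sample_at(times, values, t):
    """Linearly interpolate the stream at time t (times sorted ascending)."""
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    for i in range(1, len(times)):
        if t <= times[i]:
            fraction = (t - times[i - 1]) / (times[i] - times[i - 1])
            return values[i - 1] + fraction * (values[i] - values[i - 1])

def resample(times, values, rate, duration):
    """Evaluate the stream on a uniform grid of `rate` samples per second."""
    count = int(duration * rate) + 1
    return [sample_at(times, values, k / rate) for k in range(count)]

# Hypothetical sensor stream sampled once per second.
times = [0.0, 1.0, 2.0]
values = [0.0, 10.0, 0.0]
uniform = resample(times, values, rate=2, duration=2)
```

The same approach can bring two sensor streams with different rates onto a common time base before they are combined.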

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration

• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling

• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency, or a slider to volume
• Representation dynamic range is different from the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration

62

Signals and Features

Tuesday 22 October 13
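Range calibration and power-law scaling can be combined in a few lines. This is a sketch under stated assumptions: the helper names `calibrate` and `scale`, the raw readings and the 20 to 2000 output range are all invented for illustration.

```python
def calibrate(readings):
    """Find the observed range (minimum, maximum) of a sensor."""
    return min(readings), max(readings)

def scale(raw, lo, hi, out_lo, out_hi, exponent=1.0):
    """Map a raw reading in [lo, hi] to [out_lo, out_hi] with a power law."""
    x = (raw - lo) / (hi - lo)          # normalise to 0..1
    x = min(1.0, max(0.0, x))           # clamp readings outside the calibrated range
    return out_lo + (x ** exponent) * (out_hi - out_lo)

lo, hi = calibrate([112, 480, 903, 250])            # e.g. sweep a knob end to end
bottom = scale(112, lo, hi, 20.0, 2000.0)
top = scale(903, lo, hi, 20.0, 2000.0)
middle = scale(507.5, lo, hi, 20.0, 2000.0, exponent=2.0)
```

With `exponent=2.0` the midpoint of the knob maps to a quarter of the output span, which gives finer control at the low end.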

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation

Efficient computation when N is a power of 2
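A minimal sketch of correlation-based periodicity detection follows. It uses the direct O(N·lags) sum rather than the FFT-based computation the slide alludes to, and the names `autocorrelation` and `dominant_period` are hypothetical.

```python
import math

def autocorrelation(x, max_lag):
    """Unnormalised autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    c = [v - mean for v in x]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]

def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the largest autocorrelation inside the search range."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Synthetic test signal: a sinusoid that repeats every 20 samples.
signal = [math.sin(2 * math.pi * i / 20) for i in range(200)]
period = dominant_period(signal, min_lag=5, max_lag=50)
```

The lower bound `min_lag` keeps the trivial maximum at lag 0 out of the search.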


Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top down)

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients

Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular filter around centre frequency CF; surviving edge labels "CF-130", "CF+130" and "CF 10718"]

Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; the high coefficients can be thrown away
• MFCCs keep the first 13-20
• MDCT (overlap-based) is used in MP3, AAC, Vorbis audio compression
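The last two stages of the MFCC pipeline (log, then DCT, then truncation) can be sketched as follows. This is an illustrative DCT-II written directly from its definition, not an optimised or library implementation. The band energies are made-up numbers standing in for the output of a 40-band mel filterbank, and `cepstral_coefficients` is a hypothetical helper name.

```python
import math

def dct_ii(x):
    """Unnormalised DCT-II computed from the definition."""
    n = len(x)
    return [
        sum(x[i] * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i in range(n))
        for k in range(n)
    ]

def cepstral_coefficients(band_energies, keep=13):
    """Log of the filterbank energies followed by a DCT; keep the low coefficients."""
    log_energies = [math.log(e) for e in band_energies]
    return dct_ii(log_energies)[:keep]

# Energy compaction: a constant input ends up entirely in coefficient 0.
flat = dct_ii([1.0, 1.0, 1.0, 1.0])

band_energies = [1.0 + 0.1 * i for i in range(40)]   # placeholder values
mfcc_like = cepstral_coefficients(band_energies)
```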

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms

Reduced to 256 colors

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features
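Delta features and aggregate statistics can be computed directly from a sequence of per-frame feature vectors. The sketch below uses a plain first-order difference for the deltas, which is one simple choice among several; the helper names and the data are assumptions.

```python
def deltas(frames):
    """First-order difference between consecutive feature vectors."""
    return [[b - a for a, b in zip(prev, cur)] for prev, cur in zip(frames, frames[1:])]

def mean_and_variance(frames):
    """Per-dimension mean and (population) variance over the whole sequence."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    variances = [sum((f[d] - means[d]) ** 2 for f in frames) / n for d in range(dims)]
    return means, variances

frames = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]   # three frames, two features each
frame_deltas = deltas(frames)
means, variances = mean_and_variance(frames)
```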

Principal Component Analysis

Projection matrix

PCA: eigenanalysis of correlation matrix
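To make the eigenanalysis step concrete, here is a small sketch that finds the first principal component by power iteration and projects the data onto it. Two assumptions differ from the slide: it uses the covariance matrix rather than the correlation matrix (no per-feature normalisation), and power iteration is a stand-in for a full eigendecomposition. All names and data are illustrative.

```python
import math

def covariance_matrix(data):
    """Return (covariance matrix, mean-centred data) for rows of observations."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centred = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[a] * r[b] for r in centred) / (n - 1) for b in range(d)] for a in range(d)]
    return cov, centred

def first_principal_component(cov, iterations=200):
    """Dominant eigenvector of cov, found by repeated multiplication and normalisation."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

data = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.0]]   # roughly y = 2x
cov, centred = covariance_matrix(data)
pc1 = first_principal_component(cov)
# Projection of each 2-D point onto the 1-D principal axis.
projected = [sum(r[j] * pc1[j] for j in range(2)) for r in centred]
```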

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
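K-nearest neighbours is the simplest of the listed classifiers to write down. The sketch below is a brute-force version with Euclidean distance and majority voting; the function name `knn_predict`, the labels and the training points are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train is a list of (feature_vector, label) pairs."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

train = [
    ([0.0, 0.0], "soft"), ([0.0, 1.0], "soft"), ([1.0, 0.0], "soft"),
    ([5.0, 5.0], "loud"), ([5.0, 6.0], "loud"), ([6.0, 5.0], "loud"),
]
first = knn_predict(train, [0.5, 0.5])
second = knn_predict(train, [5.5, 5.5])
```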


Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
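The three measures named above can be computed from paired lists of true and predicted labels. A minimal sketch, with hypothetical function names and toy labels:

```python
from collections import Counter

def confusion_matrix(truth, predicted):
    """Counts indexed by (true label, predicted label)."""
    return Counter(zip(truth, predicted))

def accuracy(truth, predicted):
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)

def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

truth = ["a", "a", "a", "b", "b"]
predicted = ["a", "a", "b", "b", "a"]
cm = confusion_matrix(truth, predicted)
acc = accuracy(truth, predicted)
f_a = f_measure(truth, predicted, positive="a")
```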

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
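A bare-bones K-means loop looks like this. The sketch takes the initial centroids as an argument instead of choosing them at random, which keeps the example deterministic; that choice, the function name and the data are assumptions.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Alternate between assigning points to the nearest centroid and re-averaging."""
    def nearest(p):
        return min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            [sum(col) / len(cl) for col in zip(*cl)] if cl else c
            for cl, c in zip(clusters, centroids)
        ]
    labels = [nearest(p) for p in points]
    return centroids, labels

points = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
centroids, labels = kmeans(points, centroids=[[0.0, 0.0], [10.0, 10.0]])
```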


Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin index
  – Dunn index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard index
  – Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given the day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
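For a single input feature, ordinary least squares and R squared fit in a few lines. A minimal sketch with hypothetical names and perfectly linear toy data:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual variation / total variation."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```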

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

"Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Surrogate Sensing and the Ground Truth problem

Two case studies

Regression

Classification

Some Results

Advantages

Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion

Early and late are the extremes of a full spectrum of possibilities

[Diagram: two chains of blocks, each labelled Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

Dynamic Time Warping
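Since only the slide title survives, here is a minimal sketch of the classic dynamic-programming formulation of dynamic time warping for two 1-D sequences, using absolute difference as the local cost. The function name `dtw_distance` and the step pattern (match, insertion, deletion) are the textbook choices, assumed here rather than taken from the slide.

```python
def dtw_distance(a, b):
    """Cost of the cheapest monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

same_shape = dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])   # slower rendition
offset = dtw_distance([0.0, 0.0], [1.0, 1.0])
```

The first pair differs only in timing, so warping absorbs the difference entirely.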

Modeling Uncertainty over Time
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal models
• Filtering: posterior distribution over the current state given evidence (also yields the likelihood of the evidence)
• Prediction: posterior distribution of a future state given evidence to date
• Smoothing: posterior distribution of a past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor readings, and update models
  – Updated models provide new estimates and

Hidden Markov Models I

[Diagram: hidden state (single discrete variable) and observations over time steps 1, 2, 3, 4. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t)]
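Filtering in a hidden Markov model alternates a prediction through the transition model with a reweighting by the emission model. The sketch below implements that forward recursion for discrete states and observations; the two-state model and its probabilities are made-up numbers, and `forward_filter` is a hypothetical name.

```python
def forward_filter(prior, transition, emission, observations):
    """Posterior over the current hidden state after each observation has been seen.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    belief = list(prior)
    n = len(prior)
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]
        # Update step: weight by the likelihood of the observation, then normalise.
        unnormalised = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalised)
        belief = [u / total for u in unnormalised]
    return belief

prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.1, 0.9], [0.8, 0.2]]      # state 0 mostly emits symbol 1
belief = forward_filter(prior, transition, emission, observations=[1, 1])
```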

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
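The predict/update cycle is easiest to see in one dimension. The sketch below assumes a constant-state model (the state is expected to stay where it is, with some process noise), which is a simplification of the general linear Gaussian case. The parameter names and noise values are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0                      # mean and variance of the state estimate
    estimates = []
    for z in measurements:
        # Prediction: state assumed unchanged, uncertainty grows by the process noise.
        p = p + process_var
        # Update: the gain balances trust in the model against trust in the sensor.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

estimates = kalman_1d([5.0] * 50, process_var=0.01, measurement_var=1.0)
```

A large `measurement_var` makes the filter follow its model and react slowly; a large `process_var` makes it follow the sensor.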


Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC): ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

participatory design
• users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure panels: "The Wizard", "What the user sees"]

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure labels: "regions for rejecting the null hypothesis", "region for rejecting the null hypothesis", "X1", "X2", "critical value t(p,df)"]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
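The formulas on this slide did not survive extraction, so the sketch below follows the standard equal-variance (pooled) independent-samples t-test, which matches the three listed steps. The function name `independent_t` and the sample data are assumptions.

```python
import math
from statistics import mean, variance

def independent_t(sample1, sample2):
    """Pooled-variance t statistic and degrees of freedom for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples.
    pooled = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    t = (mean(sample1) - mean(sample2)) / sed
    return t, df

control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [2.0, 3.0, 4.0, 5.0, 6.0]
t, df = independent_t(control, treatment)
```

For these made-up samples |t| = 1.0 with df = 8. The two-tailed critical value at α = 0.05 for df = 8 is 2.306, so the null hypothesis would not be rejected.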

comparing t with critical value
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g. httpwwwmedcalcbemanualmpage13-04bhtml, httpwwwstatsoftcomtextbookstathomehtml
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

[Table of critical t values; surviving column header: two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
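The "dividing two variances" step can be sketched for the one-way case as follows: the between-group mean square is divided by the within-group mean square to give the F ratio. The function name `one_way_anova_f` and the three toy groups are assumptions.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F ratio and its degrees of freedom (between, within) for a one-way ANOVA."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
f_ratio, df_between, df_within = one_way_anova_f(groups)
```

The resulting F is then compared with the critical value for (df_between, df_within) from an F table.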

Integration and implementation
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum


Glove-TalkII
• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: mappings arranged by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500.
Categories: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator.
Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System

[Block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech Output]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
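The behaviour described above (plateaus near a centre, interpolation in between) follows from dividing each Gaussian activation by the sum over all hidden units. The sketch below is a generic normalized RBF layer, not the Glove-TalkII network itself; the centres, output weights and the width value are invented for illustration.

```python
import math

def normalized_rbf(x, centres, weights, width):
    """Output of a normalized RBF layer with one Gaussian unit per centre.

    weights[h] is the output vector contributed by hidden unit h.
    """
    activations = [
        math.exp(-sum((a - c) ** 2 for a, c in zip(x, centre)) / (2.0 * width ** 2))
        for centre in centres
    ]
    total = sum(activations)
    normalised = [a / total for a in activations]      # sums to 1 over the group
    n_out = len(weights[0])
    return [sum(normalised[h] * weights[h][o] for h in range(len(centres)))
            for o in range(n_out)]

centres = [[0.0, 0.0], [1.0, 1.0]]
weights = [[0.0], [10.0]]                               # one output per hidden unit
at_centre = normalized_rbf([0.0, 0.0], centres, weights, width=0.15)
halfway = normalized_rbf([0.5, 0.5], centres, weights, width=0.15)
```

At a centre the closest unit dominates and the output sits on that unit's value; halfway between two centres the outputs are averaged.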


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders

Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Human Action Oriented

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge, rows and columns, one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold; no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47


Tuesday 22 October 13

ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels are possible (for example microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D/3D/ND + time

Sensors and Actuators

[Figure: image and volume data evolving over time]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
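As an illustration of a digital filter operating on samples, here is a minimal sketch of a one-pole low-pass filter. The function name and the smoothing parameter `alpha` are assumptions made for this example, not something taken from the slides.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence with y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha close to 0 attenuates high frequencies strongly,
    alpha = 1 passes the input through unchanged.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    y = None
    for x in samples:
        # Start from the first sample to avoid a start-up transient.
        y = float(x) if y is None else alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out
```

A constant (0 Hz) input passes unchanged, while a signal alternating between +1 and -1 (the highest representable frequency) is strongly attenuated for small `alpha`.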

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
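The variants listed above can be combined in a few lines. The following is a minimal sketch, assuming a fixed threshold, a greedy spacing rule that keeps the highest peaks first, and parabolic interpolation around each local maximum; the function name and parameters are hypothetical.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return (location, amplitude) pairs for peaks in the sequence x."""
    # Local maxima that reach the fixed threshold.
    candidates = [i for i in range(1, len(x) - 1)
                  if x[i] >= threshold and x[i] > x[i - 1] and x[i] >= x[i + 1]]
    # Spacing constraint: visit the highest candidates first and drop
    # any candidate closer than min_spacing to one already kept.
    candidates.sort(key=lambda i: x[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    kept.sort()
    peaks = []
    for i in kept:
        a, b, c = x[i - 1], x[i], x[i + 1]
        denom = a - 2 * b + c
        # Parabolic interpolation through the three points around the maximum.
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        peaks.append((i + offset, b - 0.25 * (a - c) * offset))
    return peaks
```

An adaptive threshold could be obtained by computing `threshold` from a running statistic of the signal before calling the function.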

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform
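The windowing view can be sketched as follows: slide a window over the signal and take an FFT of each windowed frame. This is an illustrative pure-Python version, assuming a Hann window and a recursive radix-2 FFT (hence the power-of-2 restriction); a practical system would use an optimized FFT library.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of 2")
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def stft(signal, window_size, hop):
    """Return one complex spectrum per Hann-windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / window_size)
              for n in range(window_size)]
    frames = []
    for start in range(0, len(signal) - window_size + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(window_size)]
        frames.append(fft(frame))
    return frames
```

Taking the squared magnitude of each frame gives the power spectra that make up a spectrogram.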

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
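The tradeoff for the two window sizes shown can be worked out directly: a window of N samples at sample rate fs spans N/fs seconds and gives frequency bins spaced fs/N Hz apart. The helper below is a small sketch written for this example.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for size in (256, 4096):
    duration_ms, spacing_hz = stft_resolution(size, 22050)
    print(f"{size} samples: {duration_ms:.1f} ms per window, {spacing_hz:.2f} Hz per bin")
```

At 22050 Hz, 256 samples give about 11.6 ms windows with bins roughly 86 Hz apart, while 4096 samples give about 186 ms windows with bins roughly 5.4 Hz apart: better frequency resolution is paid for with worse time resolution.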

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data frequently linear or quadratic
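For sensor data, the linear case is often enough. The sketch below evaluates a sampled sequence at fractional indices and uses that to resample by an arbitrary ratio; the function names are assumptions for this example, and no anti-aliasing filter is applied, so it is only suitable for slowly varying signals.

```python
import math


def linear_interpolate(samples, t):
    """Value of the sequence at fractional index t (0 <= t <= len - 1)."""
    if not 0 <= t <= len(samples) - 1:
        raise ValueError("t is outside the sampled range")
    i = int(math.floor(t))
    if i == len(samples) - 1:
        return float(samples[i])
    frac = t - i
    return (1.0 - frac) * samples[i] + frac * samples[i + 1]


def resample(samples, ratio):
    """Resample by ratio = output rate / input rate using linear interpolation."""
    last = len(samples) - 1
    n_out = int(last * ratio) + 1
    # Clamp to guard against floating-point overshoot at the last sample.
    return [linear_interpolate(samples, min(k / ratio, last)) for k in range(n_out)]
```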

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading, common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
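Calibration and scaling are often chained: first find the range of the raw sensor reading, then map the normalized value to the control parameter. A minimal sketch follows, assuming a min/max range calibration and a power-law curve; the class and function names are hypothetical.

```python
class RangeCalibrator:
    """Tracks the minimum and maximum raw reading seen during calibration."""

    def __init__(self):
        self.lo = None
        self.hi = None

    def observe(self, reading):
        self.lo = reading if self.lo is None else min(self.lo, reading)
        self.hi = reading if self.hi is None else max(self.hi, reading)

    def normalize(self, reading):
        """Map a raw reading to [0, 1], clamping values outside the range."""
        if self.lo is None or self.hi == self.lo:
            raise ValueError("calibrate with at least two distinct readings first")
        value = (reading - self.lo) / (self.hi - self.lo)
        return min(1.0, max(0.0, value))


def power_law_scale(normalized, out_min, out_max, exponent=2.0):
    """Map a value in [0, 1] to [out_min, out_max] along a power-law curve."""
    return out_min + (out_max - out_min) * normalized ** exponent
```

For a slider mapped to volume, the performer would sweep the slider once while `observe` is called, after which `power_law_scale(calibrator.normalize(raw), 0.0, 1.0)` gives the gain.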

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based
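A correlation-based detector can be sketched by computing the autocorrelation of the signal and picking the lag with the largest value inside a plausible range. This version uses the direct time-domain sum for clarity (an FFT-based computation is the usual choice for long signals); the function names and the lag-range parameters are assumptions for this example.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed sequence for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    centred = [v - mean for v in x]
    return [sum(centred[i] * centred[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag (in samples) with the strongest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])
```

Restricting the search to `min_lag..max_lag` avoids the trivial maximum at lag 0 and encodes prior knowledge such as a plausible tempo range.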

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: mel filterbank; the filter-edge labels relative to the center frequency CF are garbled in the source]

Mel-filtering -> Log -> DCT -> MFCCs

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
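Energy compaction can be checked numerically. The sketch below implements an orthonormal DCT-II directly from its definition (an O(N^2) illustration, not an efficient implementation) and measures how much of the energy of a smooth signal falls into the first few coefficients; the helper names are assumptions for this example.

```python
import math


def dct2(x):
    """Orthonormal DCT-II of a real sequence, computed from the definition."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                                  for i in range(n)))
    return coeffs


def energy_fraction(coeffs, keep):
    """Fraction of total energy carried by the first `keep` coefficients."""
    total = sum(c * c for c in coeffs)
    return sum(c * c for c in coeffs[:keep]) / total
```

For a smooth ramp of 16 samples, the first four coefficients already carry more than 99% of the energy, which is why truncating high coefficients loses little.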

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features
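The first two items can be sketched in a few lines: delta features are frame-to-frame differences, and aggregate statistics summarize a feature trajectory over a texture window. The function names are assumptions for this example, and the population variance is one of several reasonable choices.

```python
import statistics


def delta(sequence):
    """First-order differences between consecutive feature values."""
    return [b - a for a, b in zip(sequence, sequence[1:])]


def summarize(sequence):
    """Aggregate statistics of a feature trajectory."""
    return {
        "mean": statistics.mean(sequence),
        "median": statistics.median(sequence),
        "variance": statistics.pvariance(sequence),
    }
```

For multi-dimensional features the same functions would be applied to each dimension separately.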

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps


Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
  – Non-parametric
    • K-nearest Neighbors
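Of the algorithms listed, k-nearest neighbors is the shortest to write down. The following is a minimal sketch, assuming Euclidean distance and a simple majority vote; the function name and the data layout (a list of `(feature_vector, label)` pairs) are choices made for this example.

```python
import math
from collections import Counter


def knn_predict(training_set, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest neighbors.

    training_set is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(training_set,
                     key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]
```

There is no training step: the labeled examples themselves are the model, which is what makes the method non-parametric.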

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
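The three measures in the first bullet can be computed from paired lists of true and predicted labels. This is a minimal sketch with hypothetical function names; the F-measure shown is the balanced F1 score for one chosen positive class.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Counts indexed by (true label, predicted label)."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(true_labels, predicted_labels):
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def f_measure(true_labels, predicted_labels, positive):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```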

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
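K-means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below is a plain Lloyd-style iteration; initializing the centroids with the first k points is an assumption made to keep the example deterministic, whereas practical implementations use random or k-means++ initialization.

```python
import math


def kmeans(points, k, iterations=20):
    """Return (centroids, labels) for a list of equal-length tuples."""
    centroids = [tuple(p) for p in points[:k]]

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p)].append(p)
        updated = []
        for c, members in enumerate(clusters):
            if members:
                # New centroid is the per-dimension mean of the members.
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, [nearest(p) for p in points]
```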

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees
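For a single input feature, ordinary least squares has a closed-form solution. The sketch below fits a line and uses it for prediction; the function names are assumptions for this example, and the multi-feature case would solve the normal equations with a linear algebra library instead.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    return slope * x + intercept
```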

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea use camera to constrain factorization results taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

99
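The filtering task from the list above can be sketched for a discrete-state model: each step first predicts the next state distribution with the transition model, then weights it by how well each state explains the new observation, and normalizes. The function name and the table layout (`transition[i][j]` = P(next state j | current state i), `emission[j][o]` = P(observation o | state j)) are assumptions for this example.

```python
def forward_filter(prior, transition, emission, observations):
    """Return the filtered state distribution after each observation."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction step: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update step: weight by the observation likelihood and normalize.
        unnormalized = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalized)
        belief = [u / total for u in unnormalized]
        history.append(belief)
    return history
```

The normalizing constants `total` multiply together to give the likelihood of the evidence sequence, which is why filtering and likelihood computation share the same recursion.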

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

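[Figure: hidden Markov model unrolled over time. A hidden state (single discrete variable) at each time step t emits an observation. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities P(observation at t | state at t).]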

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: $P(x \mid y) = N(a_y y + b_y, \sigma_y)(x)$
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
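The forward step and the reliability trade-off can be seen in the scalar case. The sketch below assumes the simplest possible model, a state that stays constant apart from Gaussian process noise (a random walk rather than the constant-velocity model mentioned above), so the mean and covariance reduce to two numbers. The function name and parameters are assumptions for this example.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter; returns the state estimate after each measurement."""
    x, p = x0, p0  # state mean and variance
    estimates = []
    for z in measurements:
        # Prediction: the state is assumed unchanged, uncertainty grows.
        p = p + process_var
        # Update: the gain balances model reliability against sensor reliability.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates
```

A large `measurement_var` makes the filter trust its model and smooth heavily; a large `process_var` makes it follow the measurements closely.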

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography

• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC: ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: the wizard; what the user sees]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distributions of the sample means X1 and X2 with the critical value t(p, df) marked; one panel shows two regions for rejecting the null hypothesis, the other a single region]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
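The three steps can be sketched for the independent-samples case. The formulas below are the standard pooled-variance (equal variance) form of the test, used here because the equations on the slide did not survive extraction; the function name is an assumption for this example.

```python
import math


def independent_t(sample1, sample2):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    # Combined (pooled) variance of the two samples.
    pooled_var = (ss1 + ss2) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (mean1 - mean2) / sed
    return t, df
```

For the samples `[1, 2, 3, 4, 5]` and `[3, 4, 5, 6, 7]` this gives t = -2.0 with df = 8. Since |t| is below the two-tailed critical value of 2.306 for df = 8 at α = 0.05, the null hypothesis would not be rejected at that level.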

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
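The "dividing two variances" step can be sketched for the one-way case: the F statistic is the between-group variance divided by the within-group variance. The function name is an assumption for this example, and looking up the significance of F for the returned degrees of freedom is left to a table or a statistics library.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of observations."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

Multiplying every observation by a constant or adding an offset changes both variances by the same factor, so F is unchanged; this is the independence of scaling and bias noted above.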

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of gesture-to-speech mappings, ordered by approximate time per gesture for connected speech (msec): 10-30, 100, 130, 200, 500]

Mappings: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator

Systems cited: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X/Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

bull 26+ dimensionsbull constrained subspace

bull 10 dimensions

Input Output

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, and foot pedal. Processing blocks: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function. Output: synthesizer producing speech.]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (015)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 01 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
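The behaviour of normalized RBF units can be sketched as follows: each hidden unit has a Gaussian activation centred on a stored point, and the activations are divided by their sum so that they add up to 1. The function name and the example centres are assumptions made for illustration, not the trained Glove-TalkII network.

```python
import math


def normalized_rbf(x, centres, width):
    """Normalized Gaussian activations of RBF units for the input vector x."""
    activations = []
    for centre in centres:
        squared_distance = sum((a - b) ** 2 for a, b in zip(x, centre))
        activations.append(math.exp(-squared_distance / (2.0 * width ** 2)))
    total = sum(activations)
    # Note: total can underflow to 0 if x is very far from every centre.
    return [a / total for a in activations]
```

With a small width the unit nearest to the input takes almost all of the normalized activation (the plateau around the nearest centre), while midway between two centres the activation is shared, which gives the interpolation.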

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 005 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca, George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Force-sensing resistors
• Material whose resistance changes when force is applied on it
• Thin film, low cost, easy to interface
• Measurements are not very consistent (differences of 10% are frequently observed)
• An easy force-sensitive button

31

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Piezoelectric Sensors

32

Tuesday 22 October 13

ICASSP 2013 tutorial

Accelerometers

33

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

Microphones and Microphone Arrays

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras

• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution

Depth Camera

• Kinect is probably best known
• Motion tracking with body model
  – head, arms, and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – infrared projection + camera
• microphone array
  – directional sound localization, speech recognition, and noise cancellation
• Cheap

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors

• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors

• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer (3-axis)
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect

• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets

• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – Input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ

• Use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
• Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
• Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example, microphone arrays)]

2D/3D/ND + time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to take their spatial topology into account as well

Sensors <-> DSP <->

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering

• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
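As a concrete instance of a digital low-pass filter operating on samples, here is a minimal sketch of a one-pole smoother, a common first choice for noisy sensor streams. The function name and the `alpha` parameter are illustrative assumptions, not something taken from the tutorial.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth a sequence with y[n] = alpha * x[n] + (1 - alpha) * y[n-1].

    alpha close to 0 smooths heavily; alpha = 1 passes the input through.
    The first output sample is initialized to the first input sample.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    y = None
    for x in samples:
        y = x if y is None else alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


# Hypothetical jittery sensor reading that alternates between 0 and 1.
noisy = [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0]
smooth = one_pole_lowpass(noisy, alpha=0.25)
print(smooth)
```

The fast alternation (a high frequency) is attenuated, while a constant input (zero frequency) passes unchanged.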

Peak Detection

• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
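The variants listed above can be combined in a few lines. The following is only a sketch under simple assumptions: a fixed threshold, a three-point local-maximum test, a minimum spacing that favours the stronger peak, and parabolic interpolation of the peak position. The function and parameter names are hypothetical.

```python
def find_peaks(signal, threshold, min_spacing=1):
    """Return (position, amplitude) pairs for local maxima above threshold.

    Peaks closer than min_spacing samples to a stronger accepted peak are
    discarded. Positions and amplitudes are refined by fitting a parabola
    through each peak sample and its two neighbours.
    """
    candidates = [
        i for i in range(1, len(signal) - 1)
        if signal[i] > threshold
        and signal[i] > signal[i - 1]
        and signal[i] >= signal[i + 1]
    ]
    # Visit the strongest candidates first so the spacing rule keeps them.
    candidates.sort(key=lambda i: signal[i], reverse=True)
    kept = []
    for i in candidates:
        if all(abs(i - j) >= min_spacing for j in kept):
            kept.append(i)
    peaks = []
    for i in sorted(kept):
        a, b, c = signal[i - 1], signal[i], signal[i + 1]
        denom = a - 2.0 * b + c          # negative at a strict local maximum
        offset = 0.5 * (a - c) / denom if denom != 0 else 0.0
        amplitude = b - 0.25 * (a - c) * offset
        peaks.append((i + offset, amplitude))
    return peaks


envelope = [0.0, 1.0, 3.0, 1.0, 0.0, 0.5, 2.5, 0.5, 0.0]
print(find_peaks(envelope, threshold=1.0))
print(find_peaks(envelope, threshold=1.0, min_spacing=5))
```

An adaptive threshold could be substituted by passing a value derived from a running average of the signal; that choice is left open here.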

Fourier Transform

[Figure: spectrum]

Short Time Fourier Transform

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Spectrogram

[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz, illustrating the time-frequency tradeoff]

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
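The time-frequency tradeoff can be made concrete with a little arithmetic for the two window sizes quoted on the spectrogram slide (256 and 4096 samples at 22050 Hz). This is a small illustrative calculation; the helper name is an assumption.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, spacing between FFT bins in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz


for n in (256, 4096):
    dur, df = stft_resolution(n, 22050)
    print(f"{n:5d} samples: {dur:6.1f} ms window, {df:6.2f} Hz per bin")
```

The short window localizes events to about 12 ms but only resolves frequencies roughly 86 Hz apart; the long window resolves about 5 Hz but smears events over roughly 186 ms.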

Wavelets

STFT: fixed time-frequency resolution, based on window size
DWT: adaptive time-frequency resolution

Denoising

• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering (2D)
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling

• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
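For sensor data, linear interpolation is often enough. The sketch below resamples irregularly timed readings onto a uniform grid; it assumes strictly increasing timestamps, and the function name and example data are hypothetical.

```python
from bisect import bisect_right


def resample_linear(times, values, rate_hz):
    """Resample irregularly timed readings onto a uniform grid at rate_hz.

    times must be strictly increasing. The grid starts at times[0] and
    stops at the last grid point that does not exceed times[-1].
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two (time, value) pairs")
    step = 1.0 / rate_hz
    out = []
    k = 0
    t = times[0]
    while t <= times[-1]:
        # Index of the reading at or just before t (clamped to a valid pair).
        i = min(bisect_right(times, t) - 1, len(times) - 2)
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[i] + frac * (values[i + 1] - values[i]))
        k += 1
        t = times[0] + k * step
    return out


# Hypothetical readings that arrived at 0 s, 1 s and 3 s.
print(resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], rate_hz=1.0))
```

Higher-quality band-limited (sinc-based) interpolation follows the same pattern but replaces the two-point blend with a windowed sum over many neighbours.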

Calibration

• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling

• Mapping of the sensor readings to a desired control parameter with a different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
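Calibration and scaling are typically used together: first find the sensor's observed range, then map readings into the control parameter's range, optionally through a power law. The following sketch assumes a hypothetical knob whose raw readings are mapped to a frequency in Hz; all names and numbers are illustrative.

```python
def calibrate(readings):
    """Return the (minimum, maximum) observed during a calibration pass."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a raw reading into [out_min, out_max] with a power-law curve.

    exponent = 1 gives a linear map; exponent > 1 gives finer control
    near the low end of the output range.
    """
    if hi == lo:
        raise ValueError("calibration range is empty")
    x = (reading - lo) / (hi - lo)
    x = min(1.0, max(0.0, x))            # clamp readings outside the range
    return out_min + (x ** exponent) * (out_max - out_min)


# Hypothetical raw values recorded while sweeping a knob end to end.
lo, hi = calibrate([112, 130, 512, 903, 871])
freq = scale(512, lo, hi, 110.0, 880.0, exponent=2.0)
print(lo, hi, round(freq, 1))
```

Clamping keeps the control parameter in range if the sensor later drifts beyond what was seen during calibration.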

Periodicity Detection

• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation

Efficient computation when N is a power of 2
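A correlation-based periodicity detector can be sketched directly from the definition of autocorrelation. Note that this version uses the direct sum, which costs on the order of N times the number of lags; the efficient power-of-2 computation mentioned on the slide would go through the FFT instead. Function names and the test signal are assumptions.

```python
import math


def autocorrelation(x, max_lag):
    """Direct autocorrelation r[k] = sum_n x[n] * x[n + k], k = 0..max_lag."""
    n = len(x)
    return [
        sum(x[i] * x[i + k] for i in range(n - k))
        for k in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda k: r[k])


# Synthetic gesture signal that repeats every 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
period = estimate_period(signal, min_lag=5, max_lag=60)
print(period)
```

The lower bound on the lag excludes the trivial maximum at lag 0; cross-correlation follows the same pattern with two different input sequences.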

Similarity Matrix

Image Segmentation

• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking

• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients

[Figure: mel-scale filterbank with 13 linearly-spaced filters and 27 log-spaced filters; filter edges are labelled relative to the centre frequency CF, as CF-130 and CF+130 for the linear spacing and with a factor of 1.0718 applied to CF for the log spacing]

Processing chain: Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform

• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
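Energy compaction is easy to see on a smooth signal. The sketch below implements an orthonormal DCT-II directly from its definition (a slow O(N^2) loop, chosen for clarity rather than speed) and measures how much of a ramp's energy lands in the first two coefficients. The test signal is an assumption made for illustration.

```python
import math


def dct2_ortho(x):
    """Orthonormal DCT-II of a sequence, computed directly from the definition."""
    n_samples = len(x)
    out = []
    for k in range(n_samples):
        s = sum(
            x[n] * math.cos(math.pi * (n + 0.5) * k / n_samples)
            for n in range(n_samples)
        )
        norm = math.sqrt(1.0 / n_samples) if k == 0 else math.sqrt(2.0 / n_samples)
        out.append(norm * s)
    return out


ramp = [float(n) for n in range(16)]      # a smooth test signal
coeffs = dct2_ortho(ramp)
energy = [c * c for c in coeffs]
share = sum(energy[:2]) / sum(energy)
print(f"first 2 of 16 coefficients hold {100 * share:.1f}% of the energy")
```

With the orthonormal scaling, the total energy of the coefficients equals that of the signal, so discarding high coefficients has a directly measurable cost.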

Feature Extraction: Image

• Color, texture, shape
• Example: color histograms

[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics

• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis

• Projection matrix
• PCA: eigenanalysis of correlation matrix

Self-Organizing Maps

Classification Formulation

• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms

• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
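Of the algorithms listed, K-nearest neighbors is the simplest to write down. The sketch below classifies a feature vector by majority vote among its nearest labelled training vectors, using Euclidean distance. The two-dimensional features and the drum labels are made up for illustration.

```python
import math
from collections import Counter


def knn_predict(train_vectors, train_labels, query, k=3):
    """Predict the majority label among the k nearest training vectors."""
    order = sorted(
        range(len(train_vectors)),
        key=lambda i: math.dist(train_vectors[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical training set: two features per drum hit, with ground truth labels.
train_vectors = [(0.10, 0.20), (0.20, 0.10), (0.15, 0.25),
                 (0.90, 0.80), (0.80, 0.90), (0.85, 0.75)]
train_labels = ["kick", "kick", "kick", "snare", "snare", "snare"]

print(knn_predict(train_vectors, train_labels, (0.2, 0.2)))
print(knn_predict(train_vectors, train_labels, (0.7, 0.8)))
```

There is no training step beyond storing the labelled examples, which is why the method is called non-parametric.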

Classification Evaluation

• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
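The three evaluation measures can be computed from paired lists of true and predicted labels. This is a minimal sketch with hypothetical labels; in practice the predictions would come from held-out folds of a cross-validation run.

```python
from collections import Counter


def confusion_matrix(truth, predicted):
    """Count occurrences of each (true label, predicted label) pair."""
    return Counter(zip(truth, predicted))


def accuracy(truth, predicted):
    """Fraction of items whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for the class `positive`."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


truth = ["kick", "kick", "kick", "snare", "snare"]
predicted = ["kick", "kick", "snare", "snare", "kick"]
print(confusion_matrix(truth, predicted))
print(accuracy(truth, predicted), f_measure(truth, predicted, "kick"))
```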

Clustering Formulation

• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms

• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
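K-means alternates between assigning each vector to its nearest centre and moving each centre to the mean of its members. The sketch below is a plain version with random initialization from the data and a fixed number of iterations; these choices, the names, and the example points are assumptions.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points (tuples of floats) into k groups.

    Returns (centers, assignment), where assignment[i] is the index of
    the cluster that points[i] belongs to.
    """
    rng = random.Random(seed)
    centers = rng.sample(points, k)       # initial centres drawn from the data
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centre.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centers, assignment


points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centers, assignment = kmeans(points, k=2)
print(sorted(centers), assignment)
```

The cluster indices themselves are arbitrary; only which points share an index is meaningful.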

Clustering Evaluation

• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation

• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression Evaluation

• Goodness of fit
  – Coefficient of determination, R squared (correlation coefficient in linear regression between true and predicted)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
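For a single input feature, ordinary least squares has a closed form, and R squared follows from the residuals. The sketch below is restricted to that one-feature case; the function names and the data are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 2.0, 2.0, 3.0]
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predictions))
```

An R squared near 1 means the fitted line explains most of the variance in the target; it says nothing by itself about statistical significance, which is what the F-test and t-tests address.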

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Surrogate Sensing and the Ground Truth problem

Two case studies: Regression, Classification

Some Results

Advantages

• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion

• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

[Figure: two parallel chains of processing blocks: Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Early and late are the extremes of a full spectrum of possibilities

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time

• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data

Dynamic Time Warping

Modeling Uncertainty over

• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal

• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Hidden Markov Models I

[Figure: hidden Markov model unrolled over time steps 1 to 4. The model consists of a hidden state (single discrete variable) with transition probabilities P(state at t | state at t-1), and observations with emission probabilities p(observation at t | state at t)]
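Filtering in a hidden Markov model needs only the transition and emission probabilities. The sketch below runs the forward (filtering) recursion on a tiny two-state model; the numbers follow the well-known textbook "umbrella" example and are not taken from the tutorial, and the function name is an assumption.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for each time step.

    prior[i]          probability of state i before any observation
    transition[i][j]  probability of moving from state i to state j
    emission[j][o]    probability of seeing observation o in state j
    """
    n_states = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        # Update: weight by the observation likelihood, then normalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n_states)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


# States: 0 = rain, 1 = no rain. Observations: 0 = umbrella, 1 = no umbrella.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
history = forward_filter(prior, transition, emission, [0, 0])
print([round(b[0], 3) for b in history])
```

Each step is causal: it uses only the previous belief and the newest observation, which is what makes filtering suitable for real-time use.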

Kalman Filtering

• Streams of noisy input data
• Basic idea, t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter

• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(c)
• Next state is a linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
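The predict/update cycle is easiest to follow in one dimension. The sketch below tracks a scalar state under a random-walk model (state assumed constant apart from process noise), which is a deliberately simplified assumption; the names and noise values are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var,
              initial_mean=0.0, initial_var=1.0):
    """Track a scalar state from noisy measurements.

    Returns a list of (mean, variance) estimates, one per measurement.
    """
    mean, var = initial_mean, initial_var
    estimates = []
    for z in measurements:
        # Prediction step: random-walk model, so the mean is unchanged
        # and the uncertainty grows by the process noise.
        var += process_var
        # Update step: the Kalman gain balances model and observation reliability.
        gain = var / (var + measurement_var)
        mean += gain * (z - mean)
        var *= (1.0 - gain)
        estimates.append((mean, var))
    return estimates


estimates = kalman_1d([1.0, 1.0, 1.0, 1.0], process_var=0.0, measurement_var=1.0)
for mean, var in estimates:
    print(round(mean, 3), round(var, 3))
```

A small `measurement_var` makes the gain approach 1, so the estimate follows the sensor; a large one makes the filter trust its model instead.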

Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

Side note: origins in Scandinavia

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz

• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: the Wizard; what the user sees]

• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood

• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test

• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test

• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

significance p

• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: sample means X1 and X2, the critical value t(p,df), and the region(s) for rejecting the null hypothesis]

calculating t

• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
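The three steps for calculating t can be sketched as follows for the independent-samples case. The formulas on the original slide did not survive extraction, so this uses the standard pooled-variance form, which assumes equal variances in the two groups; the function name and the data are hypothetical.

```python
import math


def pooled_t_statistic(sample_a, sample_b):
    """Independent two-sample t statistic with pooled (combined) variance.

    Returns (t, df) where df = n_a + n_b - 2.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = sum(sample_a) / n_a
    mean_b = sum(sample_b) / n_b
    ss_a = sum((x - mean_a) ** 2 for x in sample_a)
    ss_b = sum((x - mean_b) ** 2 for x in sample_b)
    df = n_a + n_b - 2
    # Step 1: combined variance for the two samples.
    pooled_var = (ss_a + ss_b) / df
    # Step 2: standard error of the difference between the means.
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    # Step 3: the t statistic.
    return (mean_a - mean_b) / sed, df


control = [1.0, 2.0, 3.0, 4.0, 5.0]
treatment = [3.0, 4.0, 5.0, 6.0, 7.0]
t, df = pooled_t_statistic(control, treatment)
print(t, df)
```

Here |t| = 2.0 with df = 8. The two-tailed critical value at α = 0.05 for 8 degrees of freedom is 2.306, so in this made-up example the null hypothesis would not be rejected at that level.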

comparing t with critical

• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Anova I

• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II

• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA

Integration and

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977
Canada Science and Technology Museum

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: mapping types arranged by approximate time per gesture for connected speech (msec), with axis values 10-30, 100, 130, 200, 500. Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – XY position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping

• Input: 26+ dimensions
  – constrained subspace
• Output: 10 dimensions

• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks

• 3 neural networks used
  – Mixture of expert model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, and foot pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech]

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

137

SpeechOutput

Glove-TalkII System

Foot Pedal

xyz roll pitch yaw(60 Hz)

10 flex angles4 abduction angles

thumb and pinkie rotationwrist pitch and yaw

(100Hz)

Right Hand Data

ContactSwitches

Preprocessor

Fixed PitchMapping

VC DecisionNetwork

VowelNetwork

ConsonantNetwork

Fixed StopMapping

Synthesizer

CombiningFunction

X

Tuesday 22 October 13

ICASSP 2013 tutorial 138

GTII Vowel Networkbull Various networks tried

ndash Ended up with normalized RBF networkbull 2 - 11 - 8 normalized RBF network

ndash Fixed std deviation (015)ndash Input xy valuesndash Output 8 formant parameters

bull Trainingndash 100 examples of each vowel with random noise addedndash 01 error

bull Testingndash 50 examples of each vowel

Tuesday 22 October 13

A Normalized RBF Network
• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
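The behaviour of normalized RBF units can be illustrated with a minimal sketch. This is not the Glove-TalkII implementation: the function names, the two example centres, and the output weights are hypothetical, and only the width of 0.15 is borrowed from the vowel network slide. Each hidden unit has a Gaussian activation around its centre, and the activations are divided by their sum so that the closest centre dominates.

```python
import math

def normalized_rbf(x, centres, sigma):
    """Return normalized Gaussian activations of input x, one per centre."""
    acts = []
    for c in centres:
        d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        acts.append(math.exp(-d2 / (2.0 * sigma ** 2)))
    total = sum(acts)  # assumes x is not so far from every centre that this underflows
    return [a / total for a in acts]

def rbf_output(x, centres, sigma, out_weights):
    """Blend the per-centre output vectors by the normalized activations."""
    h = normalized_rbf(x, centres, sigma)
    n_out = len(out_weights[0])
    return [sum(h[j] * out_weights[j][k] for j in range(len(centres)))
            for k in range(n_out)]

# Hypothetical 2-D input space with two centres and 2-D outputs.
centres = [(0.0, 0.0), (1.0, 0.0)]
targets = [[1.0, 0.0], [0.0, 1.0]]
h = normalized_rbf((0.1, 0.0), centres, sigma=0.15)    # near the first centre
y = rbf_output((0.5, 0.0), centres, 0.15, targets)     # halfway between centres
```

Near a centre the corresponding unit takes almost all of the normalized activation (the plateau), while halfway between the two centres the output is an even blend of the two target vectors.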


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user


• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions
www.nime.org
Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Piezoelectric Sensors

Accelerometers

Capacitive Sensing
• Trackpads and touch screens
• A small voltage is applied to an insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays
• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single-chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
  – face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  – i.e., frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on the sensor bar (placed on the TV) for triangulation, for use as a pointing device
• A large diversity of control styles is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ
• use a data acquisition board plugged into your computer
  – e.g., National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example, microphone arrays)]

2D, 3D, ND + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
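Several of the peak detection variants listed above can be combined in one short routine. The sketch below is only one possible combination, with a fixed threshold, a local-maximum test, a minimum spacing, and parabolic interpolation; the function name, parameters, and test signal are made up for illustration.

```python
def find_peaks(x, threshold, min_spacing):
    """Return (position, amplitude) pairs for local maxima above threshold."""
    peaks = []
    for n in range(1, len(x) - 1):
        # Local maximum test combined with a fixed threshold.
        if x[n] < threshold or not (x[n - 1] < x[n] >= x[n + 1]):
            continue
        # Parabolic interpolation through the three samples around the maximum.
        a, b, c = x[n - 1], x[n], x[n + 1]
        denom = a - 2.0 * b + c          # strictly negative at a local maximum
        offset = 0.5 * (a - c) / denom
        pos = n + offset
        amp = b - 0.25 * (a - c) * offset
        if peaks and pos - peaks[-1][0] < min_spacing:
            # Too close to the previous peak: keep only the larger of the two.
            if amp > peaks[-1][1]:
                peaks[-1] = (pos, amp)
            continue
        peaks.append((pos, amp))
    return peaks

signal = [0, 1, 3, 1, 0, 0.5, 0.2, 0, 2, 4, 2, 0]
peaks = find_peaks(signal, threshold=1.0, min_spacing=3)
```

The small bump at index 5 is rejected by the threshold, so the output holds the two remaining peak locations with their amplitudes. An adaptive threshold would replace the constant with a running statistic of the signal.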

Fourier Transform
[Figure: spectrum]

Short Time Fourier Transform
Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Spectrogram
Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
Time-frequency tradeoff
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]

Wavelets
STFT: fixed time-frequency resolution based on window size
DWT: adaptive time-frequency resolution
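The time-frequency tradeoff for the two window sizes on the spectrogram slide can be worked out directly. In this small sketch (the function name is an assumption) the bin spacing is the sample rate divided by the window size, and the window duration is the window size divided by the sample rate.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    return sample_rate / window_size, window_size / sample_rate

for n in (256, 4096):
    df, dt = stft_resolution(n, 22050)
    print(f"N={n}: bin spacing {df:.2f} Hz, window length {dt * 1000:.1f} ms")
```

At 22050 Hz the 256-sample window gives bins about 86 Hz apart over roughly 12 ms, while the 4096-sample window gives bins about 5.4 Hz apart over roughly 186 ms: finer frequency detail is paid for with coarser timing.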

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering, 2D
  – Anisotropic diffusion: preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
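For sensor data, linear interpolation is often enough. As a minimal sketch, assuming timestamps that are strictly increasing and at least two samples (the function name and the data are invented), irregularly timed readings can be resampled onto a uniform grid like this:

```python
def resample_linear(times, values, rate):
    """Linearly interpolate irregular (time, value) samples onto a uniform grid."""
    out = []
    i = 0
    n_out = int((times[-1] - times[0]) * rate) + 1
    for k in range(n_out):
        t = times[0] + k / rate
        # Advance to the input segment [times[i], times[i + 1]] that contains t.
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

times = [0.0, 0.5, 2.0]       # seconds, irregularly spaced
values = [0.0, 1.0, 4.0]
uniform = resample_linear(times, values, rate=2)   # 2 samples per second
```

This is not band-limited interpolation: it trades reconstruction quality for simplicity, which is usually acceptable for slowly varying control signals.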

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with a different range and units
• NIME examples: mapping a rotary knob to frequency, or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power-law functions frequently used
• Frequently used in conjunction with calibration
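Range calibration and power-law scaling are often used together, as the slides note. The following sketch uses hypothetical names and made-up 10-bit ADC readings: it finds the observed range of a sensor, then maps a reading to a frequency range through a power-law curve.

```python
def calibrate(readings):
    """Find the observed range of a sensor from a calibration recording."""
    return min(readings), max(readings)

def scale(raw, lo, hi, out_lo, out_hi, exponent=1.0):
    """Map a raw reading in [lo, hi] to [out_lo, out_hi] with a power-law curve."""
    x = (raw - lo) / (hi - lo)
    x = min(1.0, max(0.0, x))      # clamp readings outside the calibrated range
    return out_lo + (x ** exponent) * (out_hi - out_lo)

lo, hi = calibrate([112, 130, 507, 890, 901])    # made-up knob readings
freq = scale(506.5, lo, hi, 110.0, 880.0, exponent=2.0)
```

With an exponent of 2 the midpoint of the knob lands a quarter of the way into the output range, so more of the knob's travel is spent on the low end. The exponent here is an arbitrary choice for illustration.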

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
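Correlation-based periodicity detection can be sketched as follows. Note that this example computes the autocorrelation by direct summation for clarity, not with the FFT-based method the slide alludes to; the function names and the test signal are assumptions.

```python
import math

def autocorrelation(x, max_lag):
    """Direct autocorrelation of a mean-removed signal for lags 0..max_lag."""
    mean = sum(x) / len(x)
    y = [v - mean for v in x]
    return [sum(y[n] * y[n + lag] for n in range(len(y) - lag))
            for lag in range(max_lag + 1)]

def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Test signal: a sinusoid that repeats every 20 samples.
signal = [math.sin(2 * math.pi * n / 20) for n in range(200)]
period = estimate_period(signal, min_lag=5, max_lag=60)
```

The minimum lag excludes the trivial maximum at lag 0. For long signals the same result is obtained far more cheaply by multiplying the FFT of the zero-padded signal with its conjugate and inverting.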

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients
Mel scale: 13 linearly-spaced filters, 27 log-spaced filters
[Figure: triangular filter centred at CF, with edges at CF-130 and CF+130 (linear spacing) or CF/1.0718 and CF x 1.0718 (log spacing)]
Mel-filtering -> Log -> DCT -> MFCCs

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
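The energy compaction property can be checked numerically. This sketch applies an unnormalized DCT-II to a smooth, made-up vector standing in for 40 log filterbank energies, then measures how much of the non-DC energy falls in the first 12 coefficients after DC. The data and names are illustrative only, not taken from any MFCC implementation.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a real sequence."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]

# A smooth, made-up "log filterbank energy" vector with 40 bands.
log_energies = [10.0 - 0.1 * i + math.cos(math.pi * i / 40) for i in range(40)]
coeffs = dct2(log_energies)

total = sum(c * c for c in coeffs[1:])        # energy excluding the DC term
first12 = sum(c * c for c in coeffs[1:13])    # coefficients 1..12
compaction = first12 / total
```

For a smooth input like this one nearly all of the energy sits in the low coefficients, which is why the higher ones can be discarded with little loss.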

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, median, variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
PCA: eigenanalysis of correlation matrix
Projection matrix

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest neighbors

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based

Clustering Evaluation
• Internal evaluation (cluster properties)
  – Davies-Bouldin index
  – Dunn index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard index
  – Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g., given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel regression
  – Regression trees

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
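Ordinary least squares and the coefficient of determination can be illustrated on a tiny made-up dataset. The sketch below fits a single-feature line predicting temperature from humidity; the numbers are invented purely to exercise the formulas.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

humidity = [30.0, 40.0, 50.0, 60.0, 70.0]        # made-up data
temperature = [25.1, 23.9, 22.0, 20.2, 18.8]
slope, intercept = fit_line(humidity, temperature)
r2 = r_squared(temperature, [slope * h + intercept for h in humidity])
```

An R squared close to 1 means the fitted line explains almost all of the variance in the target. It says nothing by itself about statistical significance, which is what the F-test and t-test address.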

Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies
• Regression
• Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e., they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

Dynamic Time Warping
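The slide gives only the title, so as a reminder of what dynamic time warping computes, here is a minimal textbook sketch for 1-D sequences with an absolute-difference local cost. The gesture templates are made up, and no windowing or path constraints are applied.

```python
def dtw_distance(a, b):
    """Dynamic time warping cost between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],       # a[i-1] repeats
                                 cost[i][j - 1],       # b[j-1] repeats
                                 cost[i - 1][j - 1])   # both advance
    return cost[n][m]

template = [0, 1, 2, 1, 0]
slow = [0, 0, 1, 1, 2, 2, 1, 1, 0, 0]    # same gesture at half speed
other = [2, 2, 2, 2, 2]                  # a different gesture
```

The half-speed version aligns with the template at zero cost, whereas the different gesture does not. This version needs both complete sequences, so it is not causal; real-time use requires an online variant.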

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date; also yields the likelihood of the evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Hidden Markov Models I
[Figure: chain of hidden states (a single discrete variable, states 1 to 4) with observations. The model consists of transition probabilities P(state at t | state at t-1) and emission probabilities p(observation at t | state at t).]
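Filtering in a hidden Markov model alternates a prediction through the transition probabilities with an update by the emission probabilities. The sketch below uses a hypothetical two-state model with invented probabilities; the function name and the state labels are assumptions for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state at t | observations up to t) for every t."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the emission probability and renormalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Hypothetical two-state model: 0 = "resting", 1 = "playing".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]    # transition[previous][current]
emission = [[0.8, 0.2], [0.1, 0.9]]      # emission[state][observation]
beliefs = forward_filter(prior, transition, emission, [1, 1])
```

After two consecutive observations of symbol 1, the belief in state 1 rises from about 0.82 to about 0.88. The normalizing constants, multiplied together, give the likelihood of the evidence.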

Kalman Filtering
• Streams of noisy input data
• Basic idea, t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, e.g., constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
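The predict and update cycle can be shown in its simplest form with a scalar state. This sketch assumes a random-walk state model (next state equals current state plus Gaussian noise) and made-up sensor readings; the function name and the variance values are illustrative assumptions.

```python
def kalman_1d(measurements, process_var, meas_var, x0=0.0, p0=1.0):
    """Scalar Kalman filter for a random-walk state observed in noise."""
    x, p = x0, p0                      # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the state model is x_t = x_{t-1} + noise.
        p = p + process_var
        # Update step: blend prediction and measurement using the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

noisy = [1.2, 0.8, 1.1, 0.9, 1.05, 0.95]     # made-up readings around 1.0
est = kalman_1d(noisy, process_var=1e-4, meas_var=0.04)
```

The gain expresses the trade-off between observation reliability and model reliability: a large measurement variance pushes the gain towards 0 so that the model is trusted, and a small one pushes it towards 1 so that the sensor is trusted.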


Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g., Lucy Suchman (formerly at Xerox PARC): ethnography of air traffic controllers

ethnography
• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer's and the users'
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
• ethnographers are not trained as designers: taught to "interfere" as little as possible with the community

participatory design
• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g., interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don't expect them to come up with design ideas from scratch
  – the user is not always right
    • don't expect them to know what they want
  – conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor, by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  – "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  – interprets subject's input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored

Parametric and non-parametric tests
• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout (Student was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distribution with critical value t(p, df), showing the regions for rejecting the null hypothesis (two-tailed) or the region for rejecting the null hypothesis (one-tailed), with sample means X1 and X2]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
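The formulas on this slide did not survive extraction, so the following is a sketch of the standard independent two-sample t statistic with pooled (combined) variance, following the three steps named above. The data are made up. The critical value 2.306 is the standard two-tailed table value for α = 0.05 and df = 8.

```python
import math

def pooled_t(sample1, sample2):
    """Independent two-sample t statistic assuming equal variances."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = sum(sample1) / n1, sum(sample2) / n2
    ss1 = sum((v - m1) ** 2 for v in sample1)
    ss2 = sum((v - m2) ** 2 for v in sample2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                          # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))    # standard error of difference
    return (m1 - m2) / sed, df

control = [12.0, 14.0, 11.0, 13.0, 15.0]      # made-up task times in seconds
treatment = [9.0, 10.0, 8.0, 11.0, 12.0]
t, df = pooled_t(control, treatment)
T_CRITICAL = 2.306                            # two-tailed, alpha = 0.05, df = 8
reject_null = abs(t) > T_CRITICAL
```

Here the sample means differ by 3 with a standard error of 1, so t exceeds the critical value and the null hypothesis would be rejected at the 0.05 level.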

comparing t with critical
• look t(p, df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g., search on "student's t-

[Table header: two-tailed α = 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

Anova II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
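The "dividing two variances" step can be made concrete with a one-way ANOVA. This sketch computes the F ratio of between-group to within-group variance on made-up scores. Looking the ratio up against a critical value is left out, and the function name is an assumption.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Partition the variation into between-group and within-group parts.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]    # made-up scores
f, dfb, dfw = one_way_anova_f(groups)
rescaled = [[10.0 * v + 5.0 for v in g] for g in groups]
f_rescaled = one_way_anova_f(rescaled)[0]
```

Applying the same scale and offset to every score leaves F unchanged, which is the independence of scaling and bias noted on the slide.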

Integration and

I&I: Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, openFrameworks, Max/MSP, Pure Data, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin

• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension, volume 9, issue 6, 1977

Canada Science and Technology Museum

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech]

Mapping types: Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator

Approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500

Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)

Glove-TalkII Vocabulary

• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X,Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping

Input
• 26+ dimensions
• constrained subspace

Output
• 10 dimensions

GTII Mapping

• Ad hoc
• PCA
• Neural Networks
• others

GTII Neural Networks

• 3 neural networks used
  – Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Glove-TalkII System

[System diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech Output.]

Vowel/Consonant Network

• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network

• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x,y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
• Weights are centre
  – Normalized over all units in group
• Hidden units

Normalized RBF Units

• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
• Closest RBF dominates
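The behaviour described on the Normalized RBF slides (Gaussian units, normalization over the group, plateaus around the nearest centre) can be sketched in a few lines. This is an illustrative sketch only, not the Glove-TalkII implementation; the two centres, the output weights and the function names are hypothetical, and only the width of 0.15 is taken from the slide.

```python
import math


def normalized_rbf(x, centres, width):
    """Return the normalized Gaussian activation of each hidden unit for input x."""
    raw = []
    for c in centres:
        dist_sq = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        raw.append(math.exp(-dist_sq / (2.0 * width ** 2)))
    # Normalize over all units in the group so activations sum to 1.
    # (Assumes x is close enough to some centre that the sum does not underflow.)
    total = sum(raw)
    return [r / total for r in raw]


def rbf_output(x, centres, width, weights):
    """Weighted sum of normalized activations; weights[j] is unit j's output vector."""
    act = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(a * w[k] for a, w in zip(act, weights)) for k in range(n_out)]


# Hypothetical 2-input network with two hidden units and one output.
centres = [(0.0, 0.0), (1.0, 0.0)]
weights = [[100.0], [200.0]]
print(rbf_output((0.1, 0.0), centres, 0.15, weights))  # close to unit 0's value
print(rbf_output((0.5, 0.0), centres, 0.15, weights))  # halfway between the two
```

Near a centre the closest unit dominates and the output sits on a plateau at that unit's value; between centres the output interpolates, with the width parameter controlling how sharp the transition is.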


Consonant Network

• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Glove-TalkII System

[System diagram]

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders

Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Accelerometers

Capacitive Sensing

• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

Microphones and Microphone Arrays

• Carbon
  – lightly packed carbon
  – compression increases conduction
  – needs power supply
• Capacitor (condenser)
  – capacitor between a stationary metal plate and a light metallic diaphragm
  – compression changes capacitance by moving diaphragm
  – needs power supply
• Electret and Piezoelectric
  – mentioned before
  – no external power needed
• Magnetic (moving coil)
  – induction: moving conductor in magnetic field
  – diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras

• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera

• Kinect is probably best known
• Motion tracking with body model
  – head, arms and feet
  – body geometry
  – 20 joints per person
• face recognition
• RGB camera
  – 30 Hz
• depth sensor
  – Infrared projection + camera
• microphone array
  – directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators

• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids

• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors

• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors

• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High-performance alternative to stepper motors
  – Higher cost

Wii Remote

• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) used for triangulation, for use as a pointing device
• Large diversity of different styles of control is possible in games and music

Kinect

• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronics device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets

• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker
  – Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

DAQ

• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

2D, 3D, ND + time

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure

• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering

• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
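As a concrete instance of a digital filter operating on samples, the sketch below implements a one-pole low-pass (exponential smoothing), a common first choice for noisy sensor readings. It is an illustrative example rather than something taken from the slides; the function name and the smoothing coefficient `alpha` are assumptions made for the example.

```python
def one_pole_lowpass(samples, alpha):
    """Smooth samples with y[n] = alpha * x[n] + (1 - alpha) * y[n - 1].

    alpha is in (0, 1]; smaller values attenuate high frequencies more.
    The filter state starts at the first sample to avoid a start-up jump.
    """
    out = []
    y = samples[0] if samples else 0.0
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out


# A slowly varying (constant) input passes through; a signal alternating at
# the highest representable frequency is strongly attenuated.
steady = one_pole_lowpass([3.0] * 10, 0.1)
jitter = one_pole_lowpass([1.0 if n % 2 == 0 else -1.0 for n in range(400)], 0.1)
print(steady[-1], max(abs(v) for v in jitter[-10:]))
```

A high-pass version can be obtained by subtracting this output from the input, which is one way the filter variants listed above relate to each other.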

Peak Detection

• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
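Several of the variants on the Peak Detection slide can be combined in one small routine: a fixed threshold, a local-maximum test, a spacing constraint, and parabolic interpolation of the peak position. The sketch below is one possible arrangement under those assumptions, not the tutorial's own code; `find_peaks` and its parameters are names chosen for the example.

```python
def find_peaks(x, threshold, min_spacing=1):
    """Return a list of (location, amplitude) pairs for peaks in x."""
    peaks = []
    for n in range(1, len(x) - 1):
        # Local maximum above a fixed threshold.
        if x[n] < threshold or x[n] <= x[n - 1] or x[n] < x[n + 1]:
            continue
        # Parabolic interpolation through the three samples around the maximum.
        denom = x[n - 1] - 2.0 * x[n] + x[n + 1]
        offset = 0.5 * (x[n - 1] - x[n + 1]) / denom if denom != 0 else 0.0
        amp = x[n] - 0.25 * (x[n - 1] - x[n + 1]) * offset
        loc = n + offset
        # Spacing constraint: within min_spacing samples keep only the larger peak.
        if peaks and loc - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (loc, amp)
            continue
        peaks.append((loc, amp))
    return peaks


signal = [0, 1, 0, 0, 3, 0, 0]
print(find_peaks(signal, threshold=0.5))
print(find_peaks(signal, threshold=0.5, min_spacing=5))
```

An adaptive threshold could replace the fixed one by deriving `threshold` from a running mean or median of the signal; that choice is usually where the application-specific tuning goes.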

Fourier Transform

[Figure: spectrum]

Short Time Fourier Transform

• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: the FFT (fast Fourier transform)

Spectrogram

[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz, illustrating the time-frequency tradeoff]

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
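The time-frequency tradeoff for the two window sizes on the Spectrogram slide can be made concrete with a little arithmetic: a longer window gives finer frequency bins but coarser time resolution. The helper below is an illustrative sketch; its name is made up for the example.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_hz = sample_rate / window_size
    return duration_ms, bin_hz


for size in (256, 4096):
    duration_ms, bin_hz = stft_resolution(size, 22050)
    print(f"{size} samples: {duration_ms:.2f} ms window, {bin_hz:.2f} Hz per bin")
```

At 22050 Hz the 256-sample window spans about 11.6 ms with bins about 86 Hz apart, while the 4096-sample window spans about 186 ms with bins about 5.4 Hz apart.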

Wavelets

• STFT: fixed time-frequency resolution based on window size
• DWT: adaptive time-frequency resolution

Denoising

• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

Interpolation and Resampling

• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
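For sensor data the linear case is often enough. The sketch below resamples irregularly timed readings onto new time points by linear interpolation; it is a minimal illustration, and the function name, the clamping at the ends, and the requirement that both time lists are increasing are assumptions of this example.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times.

    Both times and new_times must be in increasing order. Points outside the
    measured range are clamped to the first or last value.
    """
    out = []
    i = 0
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        # Advance to the segment [times[i], times[i + 1]] that contains t.
        while times[i + 1] < t:
            i += 1
        frac = (t - times[i]) / (times[i + 1] - times[i])
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out


# Hypothetical sensor readings at irregular times (ms), resampled to a grid.
print(resample_linear([0, 10, 30], [0, 100, 300], [0, 5, 10, 20, 30]))
```

Band-limited (sinc) interpolation would replace the two-point blend with a weighted sum over many neighbouring samples, at higher computational cost.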

Calibration

• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling

• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
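The two steps on the Calibration and Scaling slides are often used together: first find the range a sensor actually produces, then map that range onto a control parameter, optionally through a power law. The sketch below assumes a hypothetical sensor and a frequency-like output range; all names and numbers are made up for illustration.

```python
def calibrate(readings):
    """Return the (minimum, maximum) observed during a calibration gesture."""
    return min(readings), max(readings)


def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from [lo, hi] to [out_min, out_max] with a power law."""
    norm = (reading - lo) / (hi - lo)
    norm = min(1.0, max(0.0, norm))  # clamp readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent


# Hypothetical raw readings collected while the performer sweeps a slider.
lo, hi = calibrate([120, 130, 900, 512])
print(scale(510, lo, hi, 100.0, 1000.0))                # linear mapping
print(scale(510, lo, hi, 100.0, 1000.0, exponent=2.0))  # power-law mapping
```

An exponent above 1 gives finer control at the low end of the range and a coarser response near the top; an exponent below 1 does the opposite.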

Periodicity Detection

• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

Autocorrelation and Cross-Correlation

Efficient computation when N is a power of 2
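A correlation-based periodicity detector can be sketched directly from the definition of autocorrelation: correlate the signal with a delayed copy of itself and pick the lag with the strongest match. The version below computes the sum directly in the time domain, which is simple but slow; the efficient power-of-2 computation mentioned on the slide would use the FFT instead. Function names and the test signal are assumptions made for this example.

```python
import math


def autocorrelation(x, max_lag):
    """Return r[lag] = sum over n of x[n] * x[n + lag], for lag = 0..max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


def sine_wave(period, length):
    """Helper for the demo: a sine with the given period in samples."""
    return [math.sin(2.0 * math.pi * n / period) for n in range(length)]


print(estimate_period(sine_wave(25, 200), min_lag=5, max_lag=60))
```

The `min_lag` bound keeps the search away from lag 0, where every signal correlates perfectly with itself. Cross-correlation follows the same pattern with two different signals in the product.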

Similarity Matrix

Image Segmentation

• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge detection

Object tracking

• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

NIME Object tracking

Feature Extraction: Audio

Mel Frequency Cepstral Coefficients

• Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters
• [Figure: filter shape annotated with centre frequency CF, edge labels CF-130 and CF+130, and a factor 1.0718]
• Processing chain: Mel-filtering, Log, DCT, MFCCs

Discrete Cosine Transform

• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal; the high ones can be thrown away
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

Feature Extraction: Image

• Color, texture, shape
• Example: color histograms

[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics

• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis

[Figure: PCA as eigenanalysis of the correlation matrix; projection matrix]

Self-Organizing Maps

Classification Formulation

• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms

• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors

Classification Evaluation

• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
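The three evaluation measures named on the Classification Evaluation slide can all be computed from paired lists of ground truth and predicted labels. The sketch below is a plain-Python illustration; the drum labels and function names are invented for the example.

```python
from collections import Counter


def confusion_matrix(truth, predicted):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(truth, predicted))


def accuracy(truth, predicted):
    """Fraction of items whose predicted label matches the ground truth."""
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)


def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    pairs = list(zip(truth, predicted))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical drum-hit labels.
truth = ["snare", "snare", "kick", "kick", "kick", "snare"]
predicted = ["snare", "kick", "kick", "kick", "snare", "snare"]
print(accuracy(truth, predicted))
print(f_measure(truth, predicted, "snare"))
print(confusion_matrix(truth, predicted))
```

In cross-validation these measures would be computed on each held-out fold and then averaged; stratification keeps the class proportions similar across folds.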

Clustering Formulation

• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

Clustering Algorithms

• K-means and variants
• Distribution-based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based

Clustering Evaluation

• Internal evaluation (cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

Regression Formulation

• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

Regression Evaluation

• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
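For a single input feature, ordinary least squares and the coefficient of determination fit in a few lines. The sketch below is illustrative only; the function names and the data points are made up for the example.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
predictions = [slope * x + intercept for x in xs]
print(slope, intercept, r_squared(ys, predictions))
```

An R squared of 1 means the line passes through every point; values near 0 mean the fit explains little more than predicting the mean would.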

Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Surrogate Sensing and the Ground Truth problem

Two case studies: Regression, Classification

Some Results

Advantages

• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion

• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion

Early and late are the extremes of a full spectrum of possibilities

[Figure: two parallel processing chains, each with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time

• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing keeps pace with the incoming sensor data, with some delay

Dynamic Time Warping
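Dynamic time warping aligns two sequences that follow the same shape at different speeds by finding the cheapest monotonic path through a table of pairwise distances. The sketch below is the textbook offline dynamic-programming form for 1D sequences, offered as an illustration; the function name and the absolute-difference cost are choices made for this example.

```python
def dtw_distance(a, b):
    """Return the minimum cumulative alignment cost between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i items of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: repeat b's item, repeat a's item, or advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# The same gesture performed at two speeds aligns with zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
print(dtw_distance([1, 2, 3], [2, 3, 4]))
```

This form needs both complete sequences, so it is not causal in the sense of the previous slide; online use requires a windowed or incremental variant.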

Modeling Uncertainty over

• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history – typically just previous time slice)
• Transition model: P(current state | previous state)

Inference tasks in temporal

• Filtering: posterior distribution over current state given evidence to date (also gives the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

Hidden Markov Models I

[Figure: hidden Markov model with states 1-4; a hidden state (single discrete variable) and observations at each time step; the model consists of transition probabilities and emission probabilities]
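The filtering task for a hidden Markov model alternates a prediction through the transition model with an update by the emission probability of the new observation. The sketch below is a minimal forward-filtering illustration, not code from the tutorial; the two-state model, its probabilities, and the convention that the prior describes the state one step before the first observation are all assumptions of this example.

```python
def hmm_filter(prior, transition, emission, observations):
    """Return the filtered state distribution after each observation.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by how well each state explains the observation.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history


# Hypothetical two-state model with binary observations (0 or 1).
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.1, 0.9], [0.8, 0.2]]
for belief in hmm_filter(prior, transition, emission, [1, 1]):
    print(belief)
```

The normalizing constants computed along the way multiply to the likelihood of the evidence, which is why filtering and evidence likelihood come from the same pass.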

Kalman Filtering

• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

Kalman Filter

• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x|y) = N(a_y y + b_y, σ_y)(x)
• Next state is linear function of current state plus some Gaussian noise, i.e. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
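The predict and update cycle on the Kalman Filtering slide reduces to a few lines in the scalar case. The sketch below tracks a single slowly changing value from noisy measurements; it is an illustrative simplification (a constant-state model with scalar variances), and the function name and parameter values are assumptions of this example.

```python
def kalman_1d(measurements, process_var, measurement_var, x0=0.0, p0=1.0):
    """Return the state estimate after each measurement.

    x is the estimated state, p its variance. process_var expresses how much
    the model is trusted; measurement_var how much the sensor is trusted.
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: the state is assumed unchanged, uncertainty grows.
        p = p + process_var
        # Update step: blend prediction and measurement using the Kalman gain.
        gain = p / (p + measurement_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates


# Hypothetical sensor stuck at 5.0, starting from a poor initial guess.
print(kalman_1d([5.0] * 50, process_var=1e-4, measurement_var=1.0, p0=1000.0)[-1])
```

The gain makes the trade-off between observation reliability and model reliability explicit: a noisy sensor (large `measurement_var`) yields a small gain, so each new measurement moves the estimate only slightly.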

Multimodal tempo detection for the E-sitar

Beat tracking

Human-Computer Interaction

• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

Quality of Multimedia

• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful: very early in development; build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings – detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers

ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community

participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies

Wizard of Oz

• A method of testing a system that does not exist
  – the voice editor by IBM (1984)

[Figure: The Wizard; What the user sees]

Wizard of Oz

• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons

Eat your own dogfood

• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions, work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed difference arising by chance if the null hypothesis holds
• Devised as a way to monitor the quality of stout ("Student" was a pen name used so that Guinness could protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance, p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution around X1 with the critical value t(p,df) marked; either one region for rejecting the null hypothesis (one-tailed) or two regions (two-tailed), with X2 falling in a rejection region]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
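The formulas on this slide did not survive extraction. The three steps can be sketched as follows, assuming the standard pooled-variance form of the independent two-sample test (equal variances). The function name `pooled_t` and the sample data are hypothetical; the critical value is the usual tabulated two-tailed value for α = 0.05 and df = 10.

```python
import math
from statistics import mean, variance

def pooled_t(sample_a, sample_b):
    """Independent two-sample t statistic, assuming equal variances."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2  # degrees of freedom
    # combined (pooled) variance of the two samples
    pooled_var = ((n_a - 1) * variance(sample_a) +
                  (n_b - 1) * variance(sample_b)) / df
    # standard error of the difference between the two means
    sed = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / sed
    return t, df

# Hypothetical task-completion times (seconds) for two interface variants.
control = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5]
treatment = [10.2, 10.9, 11.1, 10.4, 10.8, 10.0]

t, df = pooled_t(control, treatment)
T_CRIT = 2.228  # tabulated two-tailed critical value for alpha = 0.05, df = 10
reject_null = abs(t) > T_CRIT
```

With these made-up numbers t is well above the critical value, so the null hypothesis would be rejected at the 0.05 level.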

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2  0.1  0.05  0.02  0.01  0.002  0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned to different sources of variation
• ANOVA – widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
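The "dividing two variances" step can be sketched for the one-way case as below. This is a minimal illustration, not the tutorial's own code; the function name `one_way_anova_f` and the three groups are hypothetical. The last lines check the claim that the result does not change under scaling and bias.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    # variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # variation of the observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical ratings for three interface designs.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f, df_b, df_w = one_way_anova_f(groups)

# Same data after a change of units (scale by 10) and an offset (add 5).
rescaled = [[10 * x + 5 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
```

The ratio F is then compared to the tabulated critical value for (df_between, df_within), here (2, 6).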

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies
• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13


ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of gesture-to-speech mappings, ordered by approximate time per gesture for connected speech (10-30, 100, 130, 200, 500 msec): Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems placed on the spectrum: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990)]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

• Input: 26+ dimensions
  • constrained subspace
• Output: 10 dimensions

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network

Tuesday 22 October 13

134

Glove-TalkII System

[Block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech Output]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
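The behaviour described above can be illustrated with a minimal sketch of a normalized Gaussian RBF layer. It is not the Glove-TalkII implementation: the function names, the two centres and the output vectors are hypothetical, and only the fixed width of 0.15 echoes the vowel network slide. Subtracting the smallest squared distance before exponentiating leaves the normalized result unchanged and avoids a zero denominator far from every centre.

```python
import math

def normalized_rbf(x, centres, width):
    """Normalized Gaussian activations, one per hidden unit (they sum to 1)."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in centres]
    d2_min = min(d2)  # shift for numerical safety; cancels in the ratio
    raw = [math.exp(-(d - d2_min) / (2.0 * width ** 2)) for d in d2]
    total = sum(raw)
    return [r / total for r in raw]

def rbf_output(x, centres, width, out_weights):
    """Blend the per-unit output vectors by the normalized activations."""
    act = normalized_rbf(x, centres, width)
    n_out = len(out_weights[0])
    return [sum(a * w[j] for a, w in zip(act, out_weights))
            for j in range(n_out)]

# Two hypothetical centres in the x, y plane, each with its own output vector.
centres = [(0.0, 0.0), (1.0, 0.0)]
out_weights = [(300.0, 2300.0), (700.0, 1200.0)]
WIDTH = 0.15

near_first = normalized_rbf((0.1, 0.0), centres, WIDTH)   # closest RBF dominates
halfway = normalized_rbf((0.5, 0.0), centres, WIDTH)      # equal blend
far_away = normalized_rbf((-50.0, 0.0), centres, WIDTH)   # still a plateau
blend = rbf_output((0.5, 0.0), centres, WIDTH, out_weights)
```

Near a centre the output sits on a plateau at that unit's output vector; between centres the output interpolates, and the width parameter controls how sharp the transition is.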

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Tuesday 22 October 13


• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Tuesday 22 October 13

Questions

160

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca

Tuesday 22 October 13

ICASSP 2013 tutorial

Capacitive Sensing
• Trackpads and touch screens
• Small voltage applied to insulator coated with conductive material. When a conductor (such as a finger) is applied to the layer, a capacitor is dynamically formed
• Capacitance is typically measured indirectly, by using it to control the frequency of an oscillator or to vary the level of coupling (or attenuation) of an AC signal

34

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Microphones and Microphone Arrays

35

Sensors and Actuators

• Carbon
  • lightly packed carbon
  • compression increases conduction
  • needs power supply
• Capacitor (condenser)
  • capacitor between a stationary metal plate and a light metallic diaphragm
  • compression changes capacitance by moving diaphragm
  • needs power supply
• Electret and Piezoelectric
  • mentioned before
  • no external power needed
• Magnetic (moving coil)
  • induction - moving conductor in magnetic field
  • diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD amp CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  – about 3 transistors (photodiode version)
  – 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  – only viable recently as sub-micron tech gets better
  – only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  • head, arms and feet
  • body geometry
  • 20 joints per person
• face recognition
• RGB camera
  • 30 Hz
• depth sensor
  • Infrared projection + camera
• microphone array
  • directional sound localization, speech recognition and noise cancellation
• Cheap

Tuesday 22 October 13

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  – i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  – Brushless DC motor that divides rotation into equal steps
  – Move and hold, no feedback circuitry required
  – Low cost
• Servo Motor
  – Motor coupled with sensor for feedback
  – High performance alternative to stepper motors
  – Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer, 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone Array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  – GPS
  – Accelerometer
  – Gyro
  – Temperature
  – Microphone
  – Capacitive display
  – Networking
  – input port for more
• Actuators
  – Speaker/Headphones
  – Vibrator
  – High-res display
  – Networking
  – Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  – e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

[Figure: asynchronous events and synchronous samples plotted against time; multiple channels are possible (for example microphone arrays)]

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

[Figure axes: Time]

Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
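As a concrete instance of a digital filter operating on samples, here is a minimal sketch of a one-pole low-pass filter (exponential smoothing). The function name, the coefficient 0.2 and the test signals are assumptions chosen for illustration.

```python
def one_pole_lowpass(samples, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1."""
    out = []
    y = samples[0] if samples else 0.0  # start from the first sample
    for x in samples:
        y = alpha * x + (1.0 - alpha) * y
        out.append(y)
    return out

# A constant (zero-frequency) input passes through unchanged ...
steady = one_pole_lowpass([1.0] * 50, alpha=0.2)
# ... while the fastest possible oscillation (+1, -1, +1, ...) is attenuated.
jitter = one_pole_lowpass([(-1.0) ** n for n in range(50)], alpha=0.2)
```

Smaller values of `alpha` smooth more strongly but respond more slowly, which matters when the filtered sensor drives a control parameter in real time.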

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  – Fixed and adaptive threshold
  – Local maximum
  – Spacing constraints
  – Interpolation
  – Output is peak locations and amplitudes
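Several of the variants listed above can be combined in a few lines. The following is a sketch under stated assumptions: a fixed threshold, a three-point local maximum test, a minimum spacing in samples, and parabolic interpolation through the three points around each maximum. The function name and the test signal are hypothetical.

```python
def find_peaks(signal, threshold, min_spacing):
    """Return (location, amplitude) pairs for local maxima above a threshold."""
    peaks = []
    for n in range(1, len(signal) - 1):
        a, b, c = signal[n - 1], signal[n], signal[n + 1]
        if b < threshold or not (b > a and b >= c):
            continue  # below threshold or not a local maximum
        # parabolic interpolation; the denominator is negative at a maximum
        offset = 0.5 * (a - c) / (a - 2.0 * b + c)
        loc = n + offset
        amp = b - 0.25 * (a - c) * offset
        if peaks and loc - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][1]:
                peaks[-1] = (loc, amp)  # keep the larger of two close peaks
            continue
        peaks.append((loc, amp))
    return peaks

signal = [0.0, 1.0, 4.0, 1.0, 0.0, 0.0, 2.0, 5.0, 2.0, 0.0]
peaks = find_peaks(signal, threshold=3.0, min_spacing=3)
skewed = find_peaks([0.0, 2.0, 3.0, 1.0, 0.0], threshold=1.0, min_spacing=1)
```

For a symmetric peak the interpolated location coincides with the sample index; for the skewed example it shifts toward the larger neighbour.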

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

Windowing view
Filterbank view
Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

[Figure: two spectrograms; window of 256 samples at 22050 Hz vs. window of 4096 samples at 22050 Hz]

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation, or 2D)
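The tradeoff for the two window sizes on this slide is simple arithmetic: the window duration is N / fs and the spacing between frequency bins is fs / N. A short sketch (the function name is hypothetical):

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, frequency bin spacing in Hz)."""
    duration_ms = 1000.0 * window_size / sample_rate
    bin_spacing_hz = sample_rate / window_size
    return duration_ms, bin_spacing_hz

short_ms, short_hz = stft_resolution(256, 22050)    # about 11.6 ms, 86.1 Hz
long_ms, long_hz = stft_resolution(4096, 22050)     # about 185.8 ms, 5.4 Hz
```

The short window localizes events in time but blurs frequency; the long window does the opposite.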

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution, based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  – Simplest approach: LPF
  – Spectral subtraction using noise profile
  – Median filtering in time-frequency plane
• Image
  – Challenge: preserve edge discontinuities
  – Gaussian filtering, 2D
  – Anisotropic diffusion – preserves edges
  – Median filtering
  – Frequently done in wavelet domain

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete time samples
• Fractional delay filters
• Variants
  – L/M: interpolation (L) followed by decimation (M)
  – Band-limited interpolation at arbitrary points
  – Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  – Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
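For the sensor-data case in the last bullet, linear interpolation is often enough. A minimal sketch, assuming irregularly timestamped readings that need to be resampled onto a regular grid; the function name and the data are hypothetical.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (time, value) readings at new_times.

    Assumes times is strictly increasing and new_times is sorted and
    lies within [times[0], times[-1]].
    """
    out = []
    i = 0
    for t in new_times:
        # advance to the segment [times[i], times[i + 1]] containing t
        while i < len(times) - 2 and times[i + 1] < t:
            i += 1
        t0, t1 = times[i], times[i + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[i] + frac * (values[i + 1] - values[i]))
    return out

# Readings arrived at t = 0, 1 and 3 s; resample at 0, 0.5, 1, 2, 3 s.
resampled = resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0],
                            [0.0, 0.5, 1.0, 2.0, 3.0])
```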

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different from the true dynamic range
  • Power law functions frequently used
• Frequently used in conjunction with calibration
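Calibration and scaling fit together naturally: first find the sensor's range, then map it to the control range. A minimal sketch with a power-law curve; the function names, the raw readings and the 100-4000 Hz target range are all hypothetical.

```python
def calibrate(readings):
    """Find the observed range (minimum and maximum) of a sensor."""
    return min(readings), max(readings)

def scale(reading, lo, hi, out_min, out_max, exponent=1.0):
    """Map a reading from [lo, hi] to [out_min, out_max] with a power law."""
    norm = (reading - lo) / (hi - lo)
    norm = min(1.0, max(0.0, norm))  # clip readings outside the calibrated range
    return out_min + (out_max - out_min) * norm ** exponent

# Raw knob readings collected while the player sweeps the full range.
lo, hi = calibrate([512, 12, 230, 1003, 640, 88])

# Map the knob to a frequency between 100 and 4000 Hz with a square-law curve.
freq_low = scale(lo, lo, hi, 100.0, 4000.0, exponent=2.0)
freq_mid = scale((lo + hi) / 2.0, lo, hi, 100.0, 4000.0, exponent=2.0)
freq_high = scale(hi, lo, hi, 100.0, 4000.0, exponent=2.0)
```

With an exponent of 2 the midpoint of the knob lands a quarter of the way up the frequency range, which gives finer control at the low end.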

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  – Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
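The formula on this slide was lost in extraction. As an illustration of correlation-based periodicity detection, here is a sketch that uses the direct (unnormalized) autocorrelation sum rather than the efficient FFT-based route the slide refers to. The function names and the impulse-train input are hypothetical.

```python
def autocorrelation(x, max_lag):
    """r[lag] = sum over n of x[n] * x[n + lag], computed directly."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def dominant_period(x, min_lag, max_lag):
    """Lag (in samples) with the strongest autocorrelation in the range."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# Impulse train with one onset every 8 samples, e.g. a steady strum.
onsets = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
period = dominant_period(onsets, min_lag=2, max_lag=32)
```

The `min_lag` bound skips the trivial peak at lag 0; multiples of the true period also produce peaks, but weaker ones for a finite signal.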

Tuesday 22 October 13


ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  – K-means
  – Compression-based
  – Histogram-based
  – Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  – Target representation and location (bottom up)
  – Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

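Mel Frequency Cepstral Coefficients

Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: filter shapes around a centre frequency CF; labels CF-130 and CF+130 (linear spacing) and a factor of 1.0718 (log spacing)]

Mel-filtering → Log → DCT → MFCCs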

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
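Energy compaction can be seen on a small example. The sketch below computes an orthonormal DCT-II directly (an O(N²) sum, chosen for clarity rather than speed) and measures how much of the energy of a smooth ramp falls in the first two coefficients. The function name and the input are hypothetical.

```python
import math

def dct2_orthonormal(x):
    """Orthonormal DCT-II of a real sequence (direct O(N^2) sum)."""
    n = len(x)
    coeffs = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                                  for i in range(n)))
    return coeffs

ramp = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
coeffs = dct2_orthonormal(ramp)

# With orthonormal scaling the total energy is preserved.
total_energy = sum(c * c for c in coeffs)
low_energy_fraction = sum(c * c for c in coeffs[:2]) / total_energy
```

For this ramp more than 99% of the energy sits in the first two coefficients, which is why the higher ones can be discarded with little loss.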

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  – Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: Eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  – Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  – Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  – K-nearest Neighbors
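The simplest entry in the list, K-nearest neighbors, needs no training step beyond storing the labeled feature vectors. A minimal sketch; the function name, the two-dimensional features and the drum labels are hypothetical.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical two-dimensional audio features for two drum sounds.
train = [
    ((0.10, 0.20), "kick"), ((0.20, 0.10), "kick"), ((0.15, 0.25), "kick"),
    ((0.80, 0.90), "snare"), ((0.90, 0.80), "snare"), ((0.85, 0.95), "snare"),
]
label_a = knn_predict(train, (0.20, 0.20))
label_b = knn_predict(train, (0.80, 0.85))
```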

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  – Hierarchical
  – Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  – EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  – DBSCAN
• Graph-based
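K-means, the first entry above, alternates between assigning points to the nearest centre and moving each centre to the mean of its points. A minimal sketch of that loop (Lloyd's algorithm) with caller-supplied initial centres; the function name and the data are hypothetical.

```python
import math

def kmeans(points, centres, iterations=10):
    """Lloyd's algorithm; returns (centres, clusters)."""
    clusters = [[] for _ in centres]
    for _ in range(iterations):
        # assignment step: each point goes to its nearest centre
        clusters = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)),
                          key=lambda j: math.dist(p, centres[j]))
            clusters[nearest].append(p)
        # update step: move each centre to the mean of its cluster
        centres = [tuple(sum(c) / len(cl) for c in zip(*cl)) if cl else centre
                   for cl, centre in zip(clusters, centres)]
    return centres, clusters

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centres, clusters = kmeans(points, centres=[(0, 0), (10, 10)])
```

The result depends on the initial centres, which is one reason the number of clusters and a sensible initialization usually come from domain knowledge.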

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (Cluster properties)
  – Davies-Bouldin Index
  – Dunn Index
• External evaluation (Additional ground truth labels) – similar to classification
  – F-measure
  – Rand measure
  – Jaccard Index
  – Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  – Linear regression
  – Ordinary least squares
• Non-parametric
  – Kernel Regression
  – Regression Trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  – Coefficient of determination, R squared (in linear regression, the square of the correlation coefficient between true and predicted values)
• Statistical significance of results
  – Parametric methods
  – F-test
  – t-test for individual parameters
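Ordinary least squares with a single feature and the coefficient of determination can be sketched in a few lines. The function names and the data are hypothetical; a real system would typically use several features and a linear algebra library.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys)) /
             sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

# Hypothetical sensor reading (x) against a measured continuous value (y).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```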

87

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Surrogate Sensors

Use direct sensors to "learn" indirect acquisition

• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-

Training Surrogate Sensors in Musical Gesture Acquisition Systems, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis

Uncertainty and Time

Tuesday 22 October 13

Surrogate Sensing and the Ground Truth problem

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Two case studies

Regression

Classification

Tuesday 22 October 13

ICASSP 2013 tutorial

Some Results

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

93

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensor Fusion

94

Uncertainty and Time

Early and late are the extremes of a full spectrum of possibilities

[Figure: two pipelines with the stages Feature Extraction, Dimensionality Reduction, Feature Selection, Classification]

Tuesday 22 October 13

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Tuesday 22 October 13

ICASSP 2013 tutorial

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real time performance
• Real-time: the processing is done, with some delay, at the same rate as the sensor data arrives

96

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Dynamic Time Warping

97

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Modeling Uncertainty over
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian Property (current state depends only on finite history – typically just previous time slice)
• Transition Model: P(current state | previous state)

98

Tuesday 22 October 13

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence to date (also yields the likelihood of the evidence)
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given sequence of observations, most likely sequence of states that has generated them
• EM-algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and

99
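The filtering task can be made concrete with the forward recursion for a discrete hidden state. This is a minimal sketch: the function name, the two-state model and all the probabilities are hypothetical, chosen only so the numbers are easy to follow.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each time step.

    transition[i][j] = P(state_t = j | state_{t-1} = i)
    emission[j][o]   = P(observation = o | state = j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # prediction: push the belief through the transition model
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # update: weight by the likelihood of the evidence, then normalize
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Hypothetical model: state 0 = "playing", state 1 = "resting";
# observation 0 = sensor active, observation 1 = sensor quiet.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.9, 0.1], [0.2, 0.8]]
history = forward_filter(prior, transition, emission, observations=[0, 0])
```

After two "active" readings the belief in the "playing" state rises from 0.5 to roughly 0.82 and then 0.88. The recursion is causal, so it can run as the sensor data arrives.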

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

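[Figure: hidden Markov model. The hidden state (a single discrete variable, nodes labelled 1 to 4) evolves according to the transition probabilities P(state t | state t-1); the observations at time t are generated according to the emission probabilities p(observation | state t). Together these form the MODEL]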

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea, t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a·y + b, σ)(x)
• Next state is linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
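The forward step can be sketched in one dimension, where the mean and covariance matrix reduce to a scalar estimate and a scalar variance. This is an illustration under stated assumptions: a random-walk state model, and hypothetical names, variances and measurements.

```python
def kalman_1d(measurements, x0, p0, process_var, meas_var):
    """Scalar Kalman filter with a random-walk state model."""
    x, p = x0, p0  # state estimate and its variance
    estimates = []
    for z in measurements:
        # prediction step: the state is assumed to stay put, uncertainty grows
        p = p + process_var
        # update step: the gain trades off model against observation reliability
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

# A sensor that keeps reporting 5.0 while the filter starts out believing 0.0.
estimates = kalman_1d([5.0] * 50, x0=0.0, p0=1.0,
                      process_var=0.01, meas_var=0.1)
```

A large `meas_var` makes the filter trust its model and react slowly; a large `process_var` makes it follow the measurements more closely.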

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC): ethnography of air traffic controllers

ethnography
• upside
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• downside
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce vast (unmanageable) amounts of data
  - output is a description of practices rather than specific designs
    • ethnographers are not trained as designers; they are taught to "interfere" as little as possible with the community

participatory design
• Users (often musicians) become first-class members in the design process
  - active collaborators vs. passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision
side note: origins in Scandinavia

participatory design
• upside
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• downside
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)
[Figure: "The Wizard" and "What the user sees"]

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

Parametric and non-parametric tests
• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student's t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process:
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area
[Figure: distributions of the sample means X1 and X2 with the critical value t(p, df) marked; two-tailed case (regions for rejecting the null hypothesis) or one-tailed case (region for rejecting the null hypothesis)]

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t
note: df computation
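The formulas on this slide did not survive extraction, so the following sketch assumes the standard independent two-sample t-test with pooled (combined) variance and df = n1 + n2 - 2. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(sample1, sample2):
    """Return (t, df) for an independent two-sample t-test with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    m1 = sum(sample1) / n1
    m2 = sum(sample2) / n2
    # Sums of squared deviations from each sample mean.
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    # Combined variance for the two samples.
    pooled_var = (ss1 + ss2) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (m1 - m2) / sed
    return t, df


if __name__ == "__main__":
    control = [1, 2, 3, 4, 5]
    treatment = [2, 3, 4, 5, 6]
    print(pooled_t(control, treatment))
```

The returned |t| is then compared against the critical value t(p, df) for the chosen significance level.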

comparing t with critical
• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-

[Table: critical values of t; column headings, two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

ANOVA I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

ANOVA II
• ANOVA statistical significance is independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
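The "dividing two variances" step can be sketched for the one-way case. This is a minimal illustration under the usual one-way ANOVA definitions (between-group and within-group mean squares); the function name `one_way_anova_f` is hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


if __name__ == "__main__":
    data = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
    print(one_way_anova_f(data))
    # Rescaling and shifting every observation leaves F unchanged.
    print(one_way_anova_f([[10 * x + 5 for x in g] for g in data]))
```

The resulting F ratio is compared against a table entry for (df_between, df_within) at the chosen significance level.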

Integration and Implementation
• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

Electronic Sackbut (Le Caine, 1940s)
• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform
[Image credits: Science Dimension, volume 9, issue 6, 1977; Canada Science and Technology Museum]

Glove-TalkII
• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Spectrum of Gesture-to-Speech Mappings
[Figure: spectrum of mappings ordered by approximate time per gesture for connected speech (10-30, 100, 130, 200, 500 msec): Artificial Vocal Tract, Phoneme Generator, Finger Spelling, Syllable Generator, Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990).]

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - XY position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

GTII Mapping
• Input: 26+ dimensions
  - constrained subspace
• Output: 10 dimensions

GTII Mapping
• Ad hoc
• PCA
• Neural networks
• others

GTII Neural Networks
• 3 neural networks used
  - Mixture-of-experts model
    • Vowel/Consonant decision network
    • Vowel expert network
    • Consonant expert network

Glove-TalkII System
[Figure: block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, foot pedal. Processing: preprocessor, fixed pitch mapping, V/C decision network, vowel network, consonant network, fixed stop mapping, combining function. Output: synthesizer, speech output.]


Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: x, y values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel

A Normalized RBF Network
• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
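The behaviour of normalized RBF units (plateaus near a centre, interpolation between centres) can be sketched in a few lines. This is an illustrative sketch only, not the Glove-TalkII code: the function names, the two example centres and the output weights are hypothetical, and a single shared Gaussian width is assumed.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian activations of all hidden units, normalized to sum to 1."""
    exponents = [
        -sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2)
        for c in centres
    ]
    # Subtracting the largest exponent avoids underflow far from every centre.
    top = max(exponents)
    acts = [math.exp(e - top) for e in exponents]
    total = sum(acts)
    return [a / total for a in acts]


def rbf_output(x, centres, width, weights):
    """Linear output layer on top of the normalized hidden activations."""
    h = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h[j] * weights[j][k] for j in range(len(h))) for k in range(n_out)]


if __name__ == "__main__":
    centres = [(0.0, 0.0), (1.0, 0.0)]       # hypothetical vowel positions
    weights = [[100.0], [200.0]]             # hypothetical output parameter
    for pos in [(0.0, 0.0), (0.1, 0.0), (0.5, 0.0), (1.0, 0.0)]:
        print(pos, rbf_output(pos, centres, 0.15, weights))
```

With a narrow width the output stays almost constant around each centre and changes quickly only near the midpoint, which matches the "closest RBF dominates" bullet.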


Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6-bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders
Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary
Covered many aspects:
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary
• Many resources available: www.nime.org
• Many educational programs available
• Musical instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test, and evaluate radical ideas for HCI

Questions
www.nime.org
Sid (ssfels@ece.ubc.ca), George (gtzan@cs.uvic.ca)

Microphones and Microphone Arrays
• Carbon
  - lightly packed carbon
  - compression increases conduction
  - needs power supply
• Capacitor (condenser)
  - capacitor between a stationary metal plate and a light metallic diaphragm
  - compression changes capacitance by moving diaphragm
  - needs power supply
• Electret and Piezoelectric
  - mentioned before
  - no external power needed
• Magnetic (moving coil)
  - induction: moving conductor in magnetic field
  - diaphragm with coil of wire immersed in magnetic field
• Check out Kinect™

CCD & CMOS Camera

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put an amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

Depth Camera
• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

Stepper and Servo Motors
• Stepper Motor
  - Brushless DC motor that divides rotation into equal steps
  - Move and hold; no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High-performance alternative to stepper motors
  - Higher cost

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• Large diversity of different styles of control is possible in games and music

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest-selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

Tooka: a simple example (Fels et al

Events and Time Series
[Figure: asynchronous events and synchronous samples plotted against time; multiple channels (for example microphone arrays)]

2D, 3D, N-D + time
Each pixel/voxel can be viewed as an individual 1D time series, but for applications it is important to also take their spatial topology into account

Sensors <-> DSP <->

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
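Several of the variants listed above can be combined in one small routine. The sketch below is one possible combination, not a reference implementation: it assumes a fixed threshold, a three-point local-maximum test, a minimum spacing between peaks, and parabolic interpolation for sub-sample locations. The name `find_peaks` and its parameters are hypothetical.

```python
def find_peaks(signal, threshold=0.0, min_spacing=1):
    """Return a list of (location, amplitude) pairs for peaks in signal."""
    peaks = []
    for i in range(1, len(signal) - 1):
        y0, y1, y2 = signal[i - 1], signal[i], signal[i + 1]
        # Fixed threshold and local-maximum test.
        if y1 < threshold or not (y1 > y0 and y1 >= y2):
            continue
        # Parabolic interpolation through the three points around the maximum.
        denom = y0 - 2.0 * y1 + y2
        offset = 0.5 * (y0 - y2) / denom if denom != 0 else 0.0
        location = i + offset
        amplitude = y1 - 0.25 * (y0 - y2) * offset
        # Spacing constraint: within min_spacing, keep only the larger peak.
        if peaks and location - peaks[-1][0] < min_spacing:
            if amplitude > peaks[-1][1]:
                peaks[-1] = (location, amplitude)
            continue
        peaks.append((location, amplitude))
    return peaks


if __name__ == "__main__":
    onset_strength = [0, 1, 3, 1, 0, 0, 2, 5, 2, 0]
    print(find_peaks(onset_strength, threshold=0.5, min_spacing=2))
```

An adaptive threshold (for example a running median of the signal) could replace the fixed `threshold` argument without changing the rest of the loop.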

Fourier Transform
[Figure: spectrum]

Short Time Fourier Transform
• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier transform

Spectrogram
[Figure: spectrograms computed with 256-sample and 4096-sample windows at 22050 Hz]
Time-Frequency Tradeoff
Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
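The time-frequency trade-off for the two window sizes above can be worked out directly: a window of N samples at sample rate fs spans N / fs seconds, and its FFT bins are fs / N Hz apart. The helper name `stft_resolution` is hypothetical.

```python
def stft_resolution(window_size, sample_rate):
    """Return (window duration in ms, FFT bin spacing in Hz)."""
    time_resolution_ms = 1000.0 * window_size / sample_rate
    freq_resolution_hz = sample_rate / window_size
    return time_resolution_ms, freq_resolution_hz


if __name__ == "__main__":
    for n in (256, 4096):
        t_ms, f_hz = stft_resolution(n, 22050)
        print(f"{n} samples: {t_ms:.2f} ms per window, {f_hz:.2f} Hz per bin")
```

At 22050 Hz the short window gives about 11.61 ms and 86.13 Hz per bin, while the long window gives about 185.76 ms and 5.38 Hz per bin: better frequency detail is paid for with coarser timing.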

Wavelets
• STFT: fixed time-frequency resolution, based on window size
• DWT: adaptive time-frequency resolution

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering (2D)
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
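For sensor streams that arrive at irregular times, the linear variant mentioned in the last bullet is often enough. The sketch below resamples timestamped readings onto a uniform grid; the name `resample_linear` is hypothetical, and strictly increasing timestamps are assumed.

```python
def resample_linear(times, values, rate_hz):
    """Resample (times, values) onto a uniform grid starting at times[0]."""
    if len(times) < 2:
        raise ValueError("need at least two samples to interpolate")
    period = 1.0 / rate_hz
    out = []
    j = 0                      # index of the segment [times[j], times[j + 1]]
    n = 0
    while True:
        t = times[0] + n * period
        if t > times[-1]:
            break
        # Advance to the segment that contains t.
        while times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        frac = (t - t0) / (t1 - t0)
        out.append(values[j] + frac * (values[j + 1] - values[j]))
        n += 1
    return out


if __name__ == "__main__":
    # Readings at 0 s, 1 s and 3 s, resampled at 1 Hz.
    print(resample_linear([0.0, 1.0, 3.0], [0.0, 10.0, 30.0], 1.0))
```

Linear interpolation is cheap but not band-limited; for audio-rate signals the sinc-based variants above are the usual choice.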

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
  - Power law functions frequently used
• Frequently used in conjunction with calibration
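Calibration and scaling usually appear together: first find the usable range of the raw reading, then map the normalized value to the control parameter. The sketch below is illustrative only; the function names, the raw readings, the exponential knob-to-frequency map and the square-law slider-to-gain map are all assumptions, not values from the tutorial.

```python
def calibrate(readings):
    """Find the observed range of a sensor from a calibration recording."""
    return min(readings), max(readings)


def normalize(raw, lo, hi):
    """Map a raw reading to [0, 1] using the calibrated range, with clipping."""
    if hi == lo:
        return 0.0
    x = (raw - lo) / (hi - lo)
    return min(1.0, max(0.0, x))


def to_frequency(x, f_min=110.0, f_max=880.0):
    """Exponential map: equal knob movements give equal pitch intervals."""
    return f_min * (f_max / f_min) ** x


def to_gain(x, exponent=2.0):
    """Power-law map from slider position to linear gain."""
    return x ** exponent


if __name__ == "__main__":
    lo, hi = calibrate([112, 530, 901, 340])   # hypothetical ADC readings
    for raw in (112, 506.5, 901):
        x = normalize(raw, lo, hi)
        print(raw, round(to_frequency(x), 2), round(to_gain(x), 3))
```

Keeping calibration (`lo`, `hi`) separate from the mapping functions lets the same mapping be reused when a sensor is replaced or drifts.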

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

Autocorrelation and Cross-Correlation
Efficient computation when N is a power of 2
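A correlation-based periodicity estimate can be sketched with a direct (time-domain) autocorrelation. This is the slow O(N x lags) form chosen for readability; the efficient computation mentioned on the slide would use the FFT instead. The function names and the impulse-train test signal are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation of the mean-removed signal for lags 0..max_lag."""
    n = len(x)
    mean = sum(x) / n
    xc = [v - mean for v in x]
    return [
        sum(xc[i] * xc[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def estimate_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the strongest autocorrelation."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])


if __name__ == "__main__":
    # An onset every 8 frames, e.g. a steady strumming gesture.
    onsets = [1.0 if i % 8 == 0 else 0.0 for i in range(64)]
    print(estimate_period(onsets, min_lag=2, max_lag=20))
```

Restricting the search to a plausible lag range (`min_lag`, `max_lag`) avoids the trivial maximum at lag 0 and reduces confusion with multiples of the true period.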

Similarity Matrix

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge detection

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top down)

NIME Object tracking

Feature Extraction: Audio

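Mel Frequency Cepstral Coefficients
[Figure: MFCC pipeline: Mel-filtering -> Log -> DCT -> MFCCs. Mel-scale filterbank: 13 linearly-spaced filters, 27 log-spaced filters. Filters centred at CF with neighbouring centres at CF-130 and CF+130 (linear spacing) or CF/1.0718 and CF*1.0718 (log spacing).]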

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals, approximates the KL transform (optimal)
• Low coefficients represent most of the signal; can throw away the high ones
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression

Feature Extraction: Image
• Color, texture, shape
• Example: color histograms
[Figure: image reduced to 256 colors]

Feature Extraction: Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

Principal Component Analysis
• PCA: eigenanalysis of correlation matrix
• Projection matrix

Self-Organizing Maps

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning, through providing a training set consisting of feature vectors and the associated ground truth labels

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
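The non-parametric entry, K-nearest neighbors, is short enough to sketch in full. This is a minimal illustration with Euclidean distance and majority voting; the function name `knn_classify`, the two-dimensional features and the "soft"/"loud" labels are hypothetical.

```python
import math
from collections import Counter


def knn_classify(query, training_set, k=3):
    """Predict the label of query from (feature_vector, label) training pairs."""
    neighbours = sorted(
        training_set, key=lambda item: math.dist(query, item[0])
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    training = [
        ((0, 0), "soft"), ((0, 1), "soft"), ((1, 0), "soft"),
        ((5, 5), "loud"), ((5, 6), "loud"), ((6, 5), "loud"),
    ]
    print(knn_classify((0.5, 0.5), training))
    print(knn_classify((5.5, 5.5), training))
```

There is no training step beyond storing the labelled examples, which makes this classifier convenient for quick gesture-recognition prototypes.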

Classification Evaluation
• Accuracy, F-measure, confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification, but no training data is provided
• Frequently the number of clusters K is provided based on domain-specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
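The basic K-means loop alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. The sketch below takes the initial centroids as an argument so the result is deterministic; the name `kmeans` and the example points are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Return (centroids, labels) after a fixed number of K-means iterations."""
    centroids = [tuple(c) for c in centroids]

    def nearest(p):
        return min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))

    for _ in range(iterations):
        # Assignment step: group points by their nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[j]
            for j, cluster in enumerate(clusters)
        ]
    labels = [nearest(p) for p in points]
    return centroids, labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans(pts, centroids=[(0, 0), (10, 10)]))
```

In practice the initial centroids are usually chosen at random or with a seeding heuristic, and the loop stops when assignments no longer change rather than after a fixed count.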

Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees

Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
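Ordinary least squares with a single feature, together with the coefficient of determination, can be sketched as follows. The function names `fit_line` and `r_squared` and the example data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def r_squared(ys, predictions):
    """Coefficient of determination: 1 - residual SS / total SS."""
    my = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    xs = [0, 1, 2, 3]
    ys = [1, 3, 4, 7]
    slope, intercept = fit_line(xs, ys)
    preds = [slope * x + intercept for x in xs]
    print(slope, intercept, r_squared(ys, preds))
```

An R squared close to 1 means the fitted line explains most of the variance in the target; it says nothing by itself about statistical significance, which is what the F-test and t-tests address.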

Surrogate Sensors
Use direct sensors to "learn" indirect acquisition
• Use augmented instrument for training
• Record acoustic signal
• Train model to associate direct sensor with the acoustic signal
• Evaluate and iterate
• Use trained model in non-
A. Tindale, A. Kapur, G. Tzanetakis, "Training Surrogate Sensors in Musical Gesture Acquisition Systems", IEEE Trans. on Multimedia, 2011

Surrogate Sensing and the Ground Truth problem

Two case studies: Regression, Classification

Some Results

Advantages
• Hard-to-build augmented instrument is only used for training
• No modifications required
• Unlimited supply of training data for the machine learning model
• TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING

Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input or output or intermediate stages
• Various possible architectures combining machine learning building blocks

Sensor Fusion
Early and late are the extremes of a full spectrum of possibilities
[Figure: two parallel processing chains, each Feature Extraction -> Dimensionality Reduction -> Feature Selection -> Classification]

Multi-modal Results
Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. they cannot "look" ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives

Dynamic Time Warping
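Dynamic time warping aligns two sequences that follow the same shape at different speeds, for example the same gesture performed slowly and quickly. The sketch below is the textbook dynamic-programming form for one-dimensional sequences with an absolute-difference local cost; the name `dtw_distance` and the example sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Minimum cumulative alignment cost between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a, advance in b, or advance in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


if __name__ == "__main__":
    fast = [0, 1, 2, 3]
    slow = [0, 0, 1, 2, 2, 3]      # the same contour, stretched in time
    print(dtw_distance(fast, slow))
    print(dtw_distance([0, 1, 2], [0, 1, 3]))
```

This offline form needs both sequences in full, so it is not causal; real-time use requires an online variant that extends the alignment as samples arrive.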

Modeling Uncertainty over Time
• Time series of snapshots of the world "state" we are interested in, represented as a set of random variables (RVs)
  - Observable
  - Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)

ICASSP 2013 tutorial

Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  - Estimate what transitions occurred and what states generated the sensor readings, and update models
  - Updated models provide new estimates and
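The filtering task can be sketched for a discrete hidden state as a predict-then-update recursion. This is a minimal illustration; the two-state model and all probabilities below are made-up numbers, and `forward_filter` is a hypothetical helper name.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for each t."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the belief through the transition model.
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update: weight by the likelihood of the evidence, then normalize.
        unnorm = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnorm)
        belief = [u / total for u in unnorm]
        history.append(belief)
    return history

# Toy model: state 0 = "playing", state 1 = "resting"; observation 0 = "sensor active".
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]      # P(next state | current state)
emission = [[0.9, 0.1], [0.2, 0.8]]        # P(observation | state)
history = forward_filter(prior, transition, emission, [0, 0])
```

Seeing the same evidence twice in a row raises the belief in state 0 further, from about 0.82 to about 0.88.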

Tuesday 22 October 13

ICASSP 2013 tutorial

Hidden Markov Models I

100

Uncertainty and Time

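[Figure: hidden Markov model. The hidden state (a single discrete variable) at times t-1 and t is linked by transition probabilities P(state t | state t-1); the observations are generated from the hidden state through emission probabilities p(observation t | state t).]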

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  - Prior knowledge of state
  - Prediction step (based on some model)
  - Update step (compare prediction to measurements)
  - Readjust model
  - Output estimate of state
• Statistically optimal estimate of system state

101

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of the current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
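A scalar version of the predict/update cycle can be sketched as follows. It assumes the simplest possible model (the state stays constant apart from Gaussian process noise), and the noise variances are illustrative values rather than anything from the tutorial.

```python
def kalman_1d(measurements, x0, p0, process_var, meas_var):
    """Scalar Kalman filter for the model x[t+1] = x[t] + noise."""
    x, p = x0, p0                      # state estimate and its variance
    estimates = []
    for z in measurements:
        # Prediction step: the state stays put, the uncertainty grows.
        p = p + process_var
        # Update step: the gain trades model reliability against
        # observation reliability.
        k = p / (p + meas_var)
        x = x + k * (z - x)
        p = (1 - k) * p
        estimates.append(x)
    return estimates, p

readings = [1.0] * 50                  # a constant signal, for simplicity
estimates, final_var = kalman_1d(readings, x0=0.0, p0=1.0,
                                 process_var=1e-4, meas_var=0.1)
```

A large `meas_var` relative to `process_var` makes the filter trust its model and smooth heavily; the opposite setting makes it follow the measurements closely.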

102

Tuesday 22 October 13


ICASSP 2013 tutorial

Multimodal tempo detection for the E-sitar

104

Case Studies

Tuesday 22 October 13

ICASSP 2013 tutorial

Beat tracking

105

Human Computer Interaction

Tuesday 22 October 13


ICASSP 2013 tutorial

Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming

106

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing

107

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 108

ethnography
• origin: anthropology
• basic idea: studying people "in the wild"
• research: ethnographers attempt to understand a workplace through immersion, extended contact, and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman (formerly at Xerox PARC): ethnography of air traffic controllers

Tuesday 22 October 13

ICASSP 2013 tutorial 109

ethnography
• up side
  - comprehensive understanding of current (work) practices
  - greater ability to predict the impact of a new or re-designed technology
  - possibly greater buy-in for the system
• down side
  - principal cost is time, both the ethnographer's and the users'
  - could perpetuate negative aspects of current practices
  - can produce a vast (unmanageable) amount of data
  - output is a description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to "interfere" as little as possible with the community

Tuesday 22 October 13

ICASSP 2013 tutorial 110

participatory design
• Users (often musicians) become 1st class members in the design process
  - active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

ICASSP 2013 tutorial 111

participatory design
• up side
  - users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  - users bring in important "folk" knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  - greater buy-in for the system often results
• down side
  - hard to get a good pool of end users
    • expensive, reluctant
  - users are not expert designers
    • don't expect them to come up with design ideas from scratch
  - the user is not always right
    • don't expect them to know what they want
  - conservative bias to perpetuate current practices
    • don't expect them to fully exploit the potential of new technologies

Tuesday 22 October 13

ICASSP 2013 tutorial 112

Wizard of Oz
• A method of testing a system that does not exist
  - the voice editor by IBM (1984)

[Figure panels: The Wizard; What the user sees]

Tuesday 22 October 13

ICASSP 2013 tutorial 113

Wizard of Oz
• human simulates the system's intelligence and interacts with user
• uses real or mock interface
  - "Pay no attention to the man behind the curtain"
• user uses computer as expected
• "wizard" (sometimes hidden)
  - interprets subject's input according to an algorithm
  - has computer/screen behave in appropriate manner
• good for
  - adding simulated and complex vertical functionality
  - testing futuristic ideas
• possible cons

Tuesday 22 October 13

ICASSP 2013 tutorial

Eat your own dogfood
• Frequently programmers don't use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces but frequently ignored

114

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Parametric and non-parametric tests

• Parametric
  - Assume normality for relevant distributions; work in parameter space (means and variances)
  - Student t-test and ANOVA
• Non-parametric (no normality assumption)
  - Kruskal-Wallis
  - Friedman test

115

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Student t-test
• Null hypothesis: both groups are random samples of the same population
  - p-value: probability of the observed difference arising by chance
• Devised as a way to monitor the quality of stout ("Student" was a pen name, used so that Guinness could protect the idea of using statistics)
• Independent and paired variants
  - Control group and treatment group (n = participants in each group)
  - Same group before and after treatment
  - Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t

116

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial 117

the t-test
• the point: establish a confidence level in the difference we've found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p, df)
  5. if t > t(p, df), can reject null hypothesis at

Tuesday 22 October 13

ICASSP 2013 tutorial 118

significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution with the critical value t(p, df) marked, showing either two regions for rejecting the null hypothesis (two-tailed) or one region (one-tailed)]

Tuesday 22 October 13

ICASSP 2013 tutorial 119

calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
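The formulas on this slide did not survive extraction. The steps can be sketched with the standard pooled-variance form of the independent two-sample t-test (an assumption about which variant the slide showed); `independent_t` is a hypothetical helper and the data are made up.

```python
import math
from statistics import mean, variance

def independent_t(a, b):
    """Two-sample t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2                                   # degrees of freedom
    # Combined (pooled) variance of the two samples.
    pooled_var = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    # Standard error of the difference between the two means.
    sed = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(a) - mean(b)) / sed, df

treatment = [5, 6, 7, 8, 9]
control = [1, 2, 3, 4, 5]
t, df = independent_t(treatment, control)
```

Here t comes out at about 4.0 with df = 8, which would then be compared with the critical value for the chosen significance level.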

Tuesday 22 October 13

ICASSP 2013 tutorial 120

comparing t with critical
• look t(p, df) up in a pre-computed table
  - in back of any statistics textbook
  - on web, e.g. http://www.medcalc.be/manual/mpage13-04b.html, http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  - Matlab or Excel functions
  - web calculators (e.g. search on "student's t-

Tuesday 22 October 13

ICASSP 2013 tutorial 121

two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)

122

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
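For the one-way case, the "dividing two variances" step can be sketched as below, with made-up data. The last lines rescale and shift every value to illustrate that the F ratio does not change under scaling and bias.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, (df_between, df_within)) for a one-way ANOVA."""
    all_values = [v for g in groups for v in g]
    grand = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Variation of each value around its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within, (k - 1, n - k)

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, dfs = one_way_anova_f(groups)

# Same data with a scale factor and a bias applied to every value.
rescaled = [[2 * v + 10 for v in g] for g in groups]
f_rescaled, _ = one_way_anova_f(rescaled)
```

The resulting F is then looked up against the F distribution with the returned degrees of freedom.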

123

Human Computer Interaction

Tuesday 22 October 13

ICASSP 2013 tutorial

Integration and

124

I&I Case studies

• Typically several different software packages
• Laundry list of names
  - Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces

Tuesday 22 October 13

ICASSP 2013 tutorial

Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  - two antennae
  - change in electrostatic field translated to pitch control of heterodyne oscillators
  - beat frequency amplified
• Clara Rockmore playing

125

Tuesday 22 October 13


ICASSP 2013 tutorial

Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  - downward and side-to-side
  - potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

126

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13

ICASSP 2013 tutorial 127

Tuesday 22 October 13

ICASSP 2013 tutorial 128

Glove-TalkII

• Translates hand gestures to speech
  - like a musical instrument
• mapping partially learned
• speech is
  - intelligible
  - expressive
  - slower than normal

Tuesday 22 October 13

ICASSP 2013 tutorial 129

Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mappings running from Artificial Vocal Tract through Phoneme Generator, Finger Spelling, and Syllable Generator to Word Generator. Systems shown: Von Kempelen (1790), Bell & Bell (1880), Dudley et al. (1939), Fels & Hinton (1998), Kramer & Leifer (1989), Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec), with ticks at 10-30, 100, 130, 200, 500.]

Tuesday 22 October 13

ICASSP 2013 tutorial 130

Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  - open configuration of hand represents open vocal tract
  - XY position determines vowel sounds (like tongue)
• Consonants
  - constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)

Tuesday 22 October 13

ICASSP 2013 tutorial 131

GTII Mapping

bull 26+ dimensionsbull constrained subspace

bull 10 dimensions

Input Output

Tuesday 22 October 13

ICASSP 2013 tutorial 132

GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others

Tuesday 22 October 13

ICASSP 2013 tutorial 133

GTII Neural Networks
• 3 neural networks used
  - Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
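One common way to combine a decision network with two experts is a soft gate: the decision network's probability weights the outputs of the experts. The sketch below assumes that form. The slides do not spell out the actual combining function, and the parameter values are invented.

```python
def combine_experts(p_vowel, vowel_out, consonant_out):
    """Blend expert outputs, weighted by the decision network's probability."""
    return [p_vowel * v + (1.0 - p_vowel) * c
            for v, c in zip(vowel_out, consonant_out)]

# Hypothetical formant-parameter vectors produced by the two expert networks.
vowel_params = [700.0, 1200.0]
consonant_params = [300.0, 2200.0]
blended = combine_experts(0.75, vowel_params, consonant_params)
```

With the probability at 1.0 the output is purely the vowel expert's; intermediate values give a smooth transition between the two.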

Tuesday 22 October 13

134

Glove-TalkII System

[Figure: system block diagram. Inputs: right hand data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), contact switches, and foot pedal. Blocks: Preprocessor, V/C Decision Network, Vowel Network, Consonant Network, Fixed Pitch Mapping, Fixed Stop Mapping, Combining Function, Synthesizer. Output: speech.]

Tuesday 22 October 13


ICASSP 2013 tutorial 136

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  - Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  - Output
    • Probability of vowel
  - Training
    • 2600 consonants, 700 vowels
    • 0 error
  - Testing
    • 1380 consonants, 234 vowels
    • 0 error

Tuesday 22 October 13


ICASSP 2013 tutorial 138

GTII Vowel Network
• Various networks tried
  - Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  - Fixed std deviation (0.15)
  - Input: xy values
  - Output: 8 formant parameters
• Training
  - 100 examples of each vowel with random noise added
  - 0.1 error
• Testing
  - 50 examples of each vowel

Tuesday 22 October 13

ICASSP 2013 tutorial 139

A Normalized RBF Network

• Radially centred activation units
  - Gaussian activation
    • Weights are centre
  - Normalized over all units in group
    • Hidden units

Tuesday 22 October 13

ICASSP 2013 tutorial 140

Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  - interpolation according to width parameter
  - Plateaus around nearest centre
    • Closest RBF dominates
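A forward pass through a normalized RBF layer can be sketched as follows. The centres, output weights, and the two-unit layout are illustrative only; the width of 0.15 echoes the fixed standard deviation mentioned for the vowel network.

```python
import math

def normalized_rbf(x, centres, width):
    """Gaussian activations, normalized so that they sum to 1 over all units."""
    acts = [math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c))
                     / (2 * width ** 2))
            for c in centres]
    total = sum(acts)
    return [a / total for a in acts]

def rbf_output(x, centres, width, weights):
    """Linear read-out of the normalized hidden activations."""
    h = normalized_rbf(x, centres, width)
    n_out = len(weights[0])
    return [sum(h[j] * weights[j][k] for j in range(len(centres)))
            for k in range(n_out)]

centres = [[0.0, 0.0], [1.0, 0.0]]      # hypothetical hand positions
weights = [[300.0], [700.0]]            # hypothetical target parameter per centre
near_first = rbf_output([0.1, 0.0], centres, 0.15, weights)
halfway = rbf_output([0.5, 0.0], centres, 0.15, weights)
```

Near a centre the closest unit dominates and the output plateaus at that centre's target; halfway between the two centres the output is the even blend.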

Tuesday 22 October 13


ICASSP 2013 tutorial 142

Consonant Network
• 10 - 14 - 9 normalized RBF network
  - Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  - Output: formant parameters and voicing
• Training
  - 350 approximants, 1510 fricatives and 740 nasals
  - 0.05 error
• Testing
  - 255 approximants, 960 fricatives, 160 nasals
  - 1 error
• Dependent on user

Tuesday 22 October 13

143

Glove-TalkII System

• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  - ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  - 100 Hz, 6 bit quantization [0, 63]

Tuesday 22 October 13

144

Glove-TalkII

Tuesday 22 October 13


DIVA in the hands of

145

Tuesday 22 October 13


Magic Eyes

Tuesday 22 October 13

Leveraging traditional

147

Tuesday 22 October 13


Phantom Faders

Use the actual acoustic instrument as a control surface inspired by Marimba Lumina

Tuesday 22 October 13

Phantom Faders

149

Tuesday 22 October 13


Percussion Robots

150

Tuesday 22 October 13

Tele-operation

151

Tuesday 22 October 13

Drum sound classification

152

Tuesday 22 October 13

Self-calibration and mapping based on listening

153

Tuesday 22 October 13

Physical Modeling

154

Tuesday 22 October 13

System Architecture

155

Tuesday 22 October 13

Feedback Loop

156

Tuesday 22 October 13

Playing a piece

157

Tuesday 22 October 13


Summary

158

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Tuesday 22 October 13

Summary

159

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

160

www.nime.org

Sid George ssfelseceubcca gtzancsuvicca

Tuesday 22 October 13

ICASSP 2013 tutorial

CCD & CMOS Camera

36

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

CMOS Cameras
• CCDs have to transfer charge rows and columns one at a time
• CMOS photodiode arrays put amplifier at each pixel
  - about 3 transistors (photodiode version)
  - 1 transistor (photogate version)
• amplifier circuitry takes up a lot of room
  - only viable recently as sub-micron tech gets better
  - only useful for low-end still
• cheap (<$100), low power (10-50 mW vs 1-2 W)
• offer single chip solution

37

Tuesday 22 October 13

ICASSP 2013 tutorial

Depth Camera

38

Sensors and Actuators

• Kinect is probably best known
• Motion tracking with body model
  - head, arms and feet
  - body geometry
  - 20 joints per person
• face recognition
• RGB camera
  - 30 Hz
• depth sensor
  - Infrared projection + camera
• microphone array
  - directional sound localization, speech recognition and noise cancellation
• Cheap

ICASSP 2013 tutorial

Actuators
• Electromechanical devices that affect the physical world but are controlled digitally
• Building blocks of robots and robotic devices
• Output component of multi-modal interfaces
• Examples

39

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Solenoids
• Electromagnetic coil wound around a movable steel or iron rod
• Linear and rotary variants
• Easy to control but not very precise

40

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Motors
• Synchronous electric motors
  - i.e. frequency synchronized with frequency of supplied current in steady state
• Brushed and brushless
• Characterized by speed and torque
• Typically DC

41

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Stepper and Servo Motors
• Stepper Motor
  - Brushless DC that divides rotation into equal steps
  - Move and hold, no feedback circuitry required
  - Low cost
• Servo Motor
  - Motor coupled with sensor for feedback
  - High performance alternative to stepper motors
  - Higher cost

42

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Wii Remote
• Introduced in 2005 by Nintendo
• Accelerometer: 3-axis
• Optical sensor
• 10 infrared LEDs on sensor bar (placed on TV) for triangulation, for use as pointing device
• A large diversity of control styles is possible in games and music

43

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Kinect
• Launched in 2010
• No controller other than the body
• Webcam form factor
• Guinness record: fastest selling consumer electronic device
• RGB camera
• Depth sensor based on infrared structured light
• Microphone array (acoustic source localization and ambient noise suppression)

44

Sensors and Actuators

Tuesday 22 October 13

ICASSP 2013 tutorial

Mobile phones and tablets
• Sensors embedded
  - GPS
  - Accelerometer
  - Gyro
  - Temperature
  - Microphone
  - Capacitive display
  - Networking
  - input port for more
• Actuators
  - Speaker
  - Headphones
  - Vibrator
  - High-res display
  - Networking
  - Output port

45

Tuesday 22 October 13

ICASSP 2013 tutorial

DAQ
• use a data acquisition board plugged into your computer
  - e.g. National Instruments DAQ
    • Up to 16 analog inputs, 12-bit resolution, up to 500 kS/s sampling rate
    • Two 12-bit analog outputs, 8 digital I/O lines, two 24-bit counters
• Icube (voltage -> MIDI signal)
• Arduino board

46

Tuesday 22 October 13

ICASSP 2013 tutorial

Tooka a simple example (Fels et al

47

Tuesday 22 October 13

ICASSP 2013 tutorial 48

Tooka a simple example (Fels et al

Tuesday 22 October 13


ICASSP 2013 tutorial

Events and Time Series

49

Sensors and Actuators

Time

Time

Multiple channels (for example microphone arrays)

Asynchronous Events

Synchronous Samples

Tuesday 22 October 13

ICASSP 2013 tutorial

2D, 3D, ND + time

50

Sensors and Actuators

Time Time

Each pixelvoxel can be viewed as an individual 1D time series but for applications it is important to take their spatial topology also into account

Tuesday 22 October 13

ICASSP 2013 tutorial

Sensors <-> DSP <->

51

Sensors and Actuators

Tuesday 22 October 13


ICASSP 2013 tutorial

Structure
• Motivation and Overview
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

52

Tuesday 22 October 13

ICASSP 2013 tutorial

Filtering
• Selective boosting/attenuation of different frequencies present in a signal
• Digital filters operate on samples
• Variants: low-pass, high-pass, all-pass
• Basic fundamental blocks of signal processing
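As a minimal illustration of a digital filter operating on samples, here is a one-pole low-pass filter. It is a generic textbook example and is not taken from the tutorial's software.

```python
def one_pole_lowpass(x, alpha):
    """y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with 0 < alpha <= 1."""
    y, prev = [], 0.0
    for sample in x:
        prev = alpha * sample + (1 - alpha) * prev
        y.append(prev)
    return y

constant = one_pole_lowpass([1.0] * 200, alpha=0.1)            # lowest frequency (DC)
alternating = one_pole_lowpass([1.0, -1.0] * 100, alpha=0.1)   # highest frequency
```

The constant input passes through almost unchanged once the filter settles, while the sample-to-sample alternation is strongly attenuated. A smaller `alpha` lowers the cutoff and increases the smoothing.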

53

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Peak Detection
• Very common operation in DSP
• A lot of secret sauce
• Common variants
  - Fixed and adaptive threshold
  - Local maximum
  - Spacing constraints
  - Interpolation
  - Output is peak locations and amplitudes
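Several of these variants can be combined in a few lines. The sketch below uses a fixed threshold, a local-maximum test, a minimum spacing between peaks, and parabolic interpolation to refine location and amplitude. The thresholds and data are illustrative.

```python
def find_peaks(x, threshold, min_spacing):
    """Return (location, amplitude) pairs for local maxima above threshold."""
    peaks = []   # entries: (sample index, refined location, refined amplitude)
    for i in range(1, len(x) - 1):
        a, b, c = x[i - 1], x[i], x[i + 1]
        if b < threshold or not (a < b >= c):
            continue
        # Parabolic interpolation through the three samples around the maximum.
        offset = 0.5 * (a - c) / (a - 2 * b + c)
        amp = b - 0.25 * (a - c) * offset
        if peaks and i - peaks[-1][0] < min_spacing:
            if amp > peaks[-1][2]:
                peaks[-1] = (i, i + offset, amp)   # keep the stronger close peak
            continue
        peaks.append((i, i + offset, amp))
    return [(loc, amp) for _, loc, amp in peaks]

signal = [0, 1, 3, 2, 0, 0.2, 0.1, 0, 4, 4.5, 1]
peaks = find_peaks(signal, threshold=0.5, min_spacing=3)
crowded = find_peaks([0, 2, 1, 3, 0], threshold=0.5, min_spacing=3)
```

An adaptive threshold would replace the fixed `threshold` with a running statistic of the signal, such as a scaled local median.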

54

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Fourier Transform

55

Signals and Features

Spectrum

Tuesday 22 October 13

ICASSP 2013 tutorial

Short Time Fourier Transform

56

Signals and Features

• Windowing view
• Filterbank view
• Efficient implementation for power-of-2 window sizes: FFT, the fast Fourier Transform

Tuesday 22 October 13

ICASSP 2013 tutorial

Spectrogram

57

Signals and Features

256 samples, 22050 Hz

4096 samples, 22050 Hz

Time-Frequency Tradeoff

Spectrogram = sequence of time-varying power spectra (3D with animation or 2D)
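The tradeoff between the two window sizes shown can be put in numbers with a short calculation. `stft_resolution` is a hypothetical helper, and the resolution figures ignore the additional smearing caused by the window shape.

```python
def stft_resolution(window_size, sample_rate):
    """Return (frequency bin spacing in Hz, window duration in seconds)."""
    freq_res = sample_rate / window_size
    time_res = window_size / sample_rate
    return freq_res, time_res

short_f, short_t = stft_resolution(256, 22050)
long_f, long_t = stft_resolution(4096, 22050)
```

The 256-sample window covers about 11.6 ms with bins about 86 Hz apart. The 4096-sample window narrows the bins to about 5.4 Hz but covers about 186 ms, so fast events are smeared in time.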

Tuesday 22 October 13

ICASSP 2013 tutorial

Wavelets

58

Signals and Features

STFT: fixed time-frequency resolution based on window size

DWT: adaptive time-frequency resolution

Tuesday 22 October 13

ICASSP 2013 tutorial

Denoising
• Audio
  - Simplest approach: LPF
  - Spectral subtraction using noise profile
  - Median filtering in time-frequency plane
• Image
  - Challenge: preserve edge discontinuities
  - Gaussian filtering 2D
  - Anisotropic diffusion: preserves edges
  - Median filtering
  - Frequently done in wavelet domain
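Median filtering is easy to sketch in one dimension; the 2D and time-frequency versions apply the same idea over a neighbourhood. The data are made up: a single impulse followed by a genuine step.

```python
from statistics import median

def median_filter(x, size):
    """Replace each sample by the median of a window centred on it."""
    half = size // 2
    return [median(x[max(0, i - half): i + half + 1]) for i in range(len(x))]

noisy = [1, 1, 9, 1, 1, 2, 2, 2]      # impulse at index 2, step at index 5
cleaned = median_filter(noisy, 3)
```

The impulse is removed while the step stays sharp, which a low-pass or Gaussian filter would blur.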

59

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Interpolation and Resampling
• Extensive applications in DSP
• Correctly compute sample values at arbitrary continuous times based on available discrete-time samples
• Fractional delay filters
• Variants
  - L/M: interpolation (L) followed by decimation (M)
  - Band-limited interpolation at arbitrary points
  - Sinc interpolation results in perfect reconstruction for band-limited continuous signals
  - Various approximations trading quality and computational complexity
• For sensor data, frequently linear or quadratic
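Sinc interpolation can be sketched directly from its definition. With a finite block of samples the sum is truncated, so the result is only approximate; the test signal below is an arbitrary low-frequency cosine.

```python
import math

def sinc(t):
    """Normalized sinc: sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def sinc_interpolate(samples, t):
    """Estimate the underlying signal at fractional sample index t."""
    return sum(x * sinc(t - n) for n, x in enumerate(samples))

samples = [math.cos(2 * math.pi * 0.05 * n) for n in range(200)]
midpoint = sinc_interpolate(samples, 100.5)      # halfway between two samples
exact = math.cos(2 * math.pi * 0.05 * 100.5)
```

This direct form costs one multiply per available sample for every output point, which is why windowed or polynomial approximations are used in practice.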

60

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Calibration
• Comparison and adjustment between two measurements (standard and test)
• Classic examples: gravity-based scales with fixed weights, tuning instruments
• Examples from NIME: finding the range (minimum and maximum) of some sensor reading; common response from multiple sensors or actuators of the same type
• Machine learning and control feedback are great tools for calibration
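The range-finding example can be sketched as a small calibrator that tracks the minimum and maximum seen during a calibration phase and then normalizes readings to [0, 1]. `RangeCalibrator` and the raw values are hypothetical.

```python
class RangeCalibrator:
    """Learn the observed range of a sensor, then map readings to [0, 1]."""

    def __init__(self):
        self.lo = float("inf")
        self.hi = float("-inf")

    def observe(self, reading):
        self.lo = min(self.lo, reading)
        self.hi = max(self.hi, reading)

    def normalize(self, reading):
        if self.hi <= self.lo:          # not calibrated yet
            return 0.0
        value = (reading - self.lo) / (self.hi - self.lo)
        return min(1.0, max(0.0, value))    # clamp readings outside the range

cal = RangeCalibrator()
for raw in [212, 540, 873, 300]:        # e.g. raw ADC values while the player moves
    cal.observe(raw)
```

After this, `cal.normalize(542.5)` gives 0.5, and readings beyond the calibrated range are clamped.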

61

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Scaling
• Mapping of the sensor readings to a desired control parameter with different range/units
• NIME examples: mapping a rotary knob to frequency or a slider to volume
• Representation dynamic range is different than the true dynamic range
• Power law functions frequently used
• Frequently used in conjunction with calibration
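Two typical mappings from a normalized control value u in [0, 1] can be sketched as follows: an exponential knob-to-frequency mapping, so that equal knob movements give equal pitch intervals, and a power-law slider-to-gain mapping. The frequency range and the exponent are illustrative choices.

```python
def knob_to_frequency(u, f_min=20.0, f_max=20000.0):
    """Exponential mapping: equal steps in u multiply the frequency equally."""
    return f_min * (f_max / f_min) ** u

def slider_to_gain(u, exponent=3.0):
    """Power-law mapping: finer control at the quiet end of the slider."""
    return u ** exponent

mid_frequency = knob_to_frequency(0.5)
mid_gain = slider_to_gain(0.5)
```

The midpoint of the knob lands near 632 Hz rather than at about 10 kHz, as a linear mapping would give. This matches how pitch is perceived much more closely.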

62

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Periodicity Detection
• Music to a large extent consists of sounds arranged at multiple time periodicities
• Examples: beats, notes, repeated gestures like strumming, melodies, chords
• Large variety of techniques differing in accuracy and computational cost
  - Correlation based

63

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

64

Signals and Features

Efficient computation when N is a power of 2
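The formula on this slide was lost in extraction. Below is a direct time-domain sketch of autocorrelation used for period estimation. It costs on the order of N times the number of lags; the efficient route mentioned above goes through the FFT instead. The test signal is synthetic.

```python
import math

def autocorrelation(x, max_lag):
    """r[lag] = sum over i of x[i] * x[i + lag], for lag = 0 .. max_lag."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def dominant_period(x, min_lag, max_lag):
    """Lag in [min_lag, max_lag] at which the autocorrelation is largest."""
    r = autocorrelation(x, max_lag)
    return max(range(min_lag, max_lag + 1), key=lambda lag: r[lag])

# A sine wave that repeats every 25 samples.
signal = [math.sin(2 * math.pi * n / 25) for n in range(500)]
period = dominant_period(signal, min_lag=5, max_lag=40)
```

Cross-correlation has the same form with the second factor taken from a different signal, which makes it useful for estimating the delay between two sensor streams.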

Tuesday 22 October 13

ICASSP 2013 tutorial

Autocorrelation and Cross-Correlation

65

Signals and Features

Efficient computation when N is a power of 2

Tuesday 22 October 13

ICASSP 2013 tutorial

Similarity Matrix

66

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Image Segmentation
• Partition image into sets of pixels
• Pixels in a set share visual characteristics
• Helps to locate objects and boundaries
• Many approaches
  - K-means
  - Compression-based
  - Histogram-based
  - Edge Detection

67

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Object tracking
• Follow the movement of interest points or objects in an image sequence
• Single vs multiple objects
• Typically utilize some form of motion model
• Typically two stages
  - Target representation and location (bottom up)
  - Target filtering and data association (top

68

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

NIME Object tracking

69

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Audio

70

Signals and Features

Tuesday 22 October 13

Mel Frequency Cepstral Coefficients

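Mel-scale: 13 linearly-spaced filters, 27 log-spaced filters

[Figure: triangular filters with centre frequency CF; edges at CF-130 and CF+130 for the linearly-spaced filters, CF/1.0718 and CF x 1.0718 for the log-spaced filters]

Mel-filtering -> Log -> DCT -> MFCCs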

Tuesday 22 October 13

ICASSP 2013 tutorial

Discrete Cosine Transform
• Strong energy compaction
• For certain types of signals approximates KL transform (optimal)
• Low coefficients represent most of the signal - can throw high
• MFCCs keep first 13-20
• MDCT (overlap-based) used in MP3, AAC, Vorbis audio compression
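Energy compaction can be checked numerically with a direct (slow) orthonormal DCT-II. The smooth ramp used as input is an arbitrary example. Real implementations use a fast transform.

```python
import math

def dct2_orthonormal(x):
    """Orthonormal DCT-II computed straight from the definition."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

signal = [float(i) for i in range(16)]          # a smooth ramp
coeffs = dct2_orthonormal(signal)
energy = sum(c * c for c in coeffs)             # equals the signal energy
low_energy = sum(c * c for c in coeffs[:4])     # first four coefficients only
```

For this input the first four of the sixteen coefficients carry more than 99% of the energy, which is why the higher ones can be discarded with little loss.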

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Image
• Color, texture, shape
• Example: color histograms

73

Signals and Features

Reduced to 256 colors

Tuesday 22 October 13

ICASSP 2013 tutorial

Feature Extraction Dynamics
• Delta features
• Aggregate statistics of time sequence
  - Mean, Median, Variance
• ARMA
• Statistical models such as GMM
• Modulation features

74

Signals and Features

Tuesday 22 October 13

ICASSP 2013 tutorial

Principal Component Analysis

75

Signals and Features

Projection matrix

PCA: eigenanalysis of correlation matrix

Tuesday 22 October 13

ICASSP 2013 tutorial

Self-Organizing Maps

Tuesday 22 October 13


ICASSP 2013 tutorial

Classification Formulation
• Objective: given a feature vector representing something, predict the class (a discrete categorical label) it belongs to
• Typically supervised learning through providing a training set consisting of feature vectors and the associated ground truth labels

78

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms
• Parametric
  - Generative approaches
    • Naïve Bayes
    • Gaussian Mixture Models
  - Discriminative approaches
    • Support Vector Machines
    • Decision trees
• Non-parametric
  - K-nearest Neighbors
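The non-parametric case is the shortest to sketch: K-nearest neighbours stores the training set and votes among the closest examples. The 2D feature vectors and drum labels below are invented for illustration.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to the query."""
    nearest = sorted(range(len(train_x)),
                     key=lambda i: math.dist(train_x[i], query))[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]

train_x = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
           (0.9, 0.8), (0.8, 0.9), (0.85, 0.85)]
train_y = ["kick", "kick", "kick", "snare", "snare", "snare"]

first = knn_predict(train_x, train_y, (0.2, 0.2))
second = knn_predict(train_x, train_y, (0.7, 0.9))
```

There is no training step at all, at the price of comparing against every stored example at prediction time.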

79

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Algorithms

80

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Classification Evaluation
• Accuracy, F-measure, Confusion matrix
• Cross-validation and bootstrapping
• Stratified cross-validation
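The three measures can be computed from paired lists of true and predicted labels, as in this minimal sketch with made-up labels.

```python
from collections import Counter

def confusion_matrix(truth, predicted):
    """Count (true label, predicted label) pairs."""
    return Counter(zip(truth, predicted))

def accuracy(truth, predicted):
    return sum(t == p for t, p in zip(truth, predicted)) / len(truth)

def f_measure(truth, predicted, positive):
    """Harmonic mean of precision and recall for one class."""
    cm = confusion_matrix(truth, predicted)
    tp = cm[(positive, positive)]
    fp = sum(n for (t, p), n in cm.items() if p == positive and t != positive)
    fn = sum(n for (t, p), n in cm.items() if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

truth = ["snare", "snare", "kick", "kick", "kick", "snare"]
predicted = ["snare", "kick", "kick", "kick", "snare", "snare"]
cm = confusion_matrix(truth, predicted)
acc = accuracy(truth, predicted)
f_snare = f_measure(truth, predicted, "snare")
```

In cross-validation these measures are computed on each held-out fold and then averaged. Stratification keeps the class proportions similar across folds.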

81

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Formulation
• Given a set of unlabeled feature vectors, partition them into sets (clusters) that contain similar items
• Similar to classification but no training data is provided
• Frequently the number of clusters K is provided based on domain specific knowledge
• Variations
  - Hierarchical
  - Semi-supervised

82

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms
• K-means and variants
• Distribution based
  - EM algorithm
• Density-based (areas of high density are clusters; noise and border points ignored)
  - DBSCAN
• Graph-based
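K-means alternates between assigning points to the nearest centroid and moving each centroid to the mean of its points. In the sketch below the initial centroids are passed in explicitly so the run is deterministic; the points are invented.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain Lloyd iteration; returns (centroids, labels)."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(len(centroids)),
                      key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

points = [(0, 0), (0, 1), (1, 0), (9, 9), (9, 10), (10, 9)]
centroids, labels = kmeans(points, centroids=[(0, 0), (1, 0)])
```

Even though both starting centroids lie in the same group, the iteration separates the two groups here. In general the result depends on the initialization, so several restarts are commonly used.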

83

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Algorithms

84

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Clustering Evaluation
• Internal evaluation (cluster properties)
  - Davies-Bouldin Index
  - Dunn Index
• External evaluation (additional ground truth labels), similar to classification
  - F-measure
  - Rand measure
  - Jaccard Index
  - Confusion matrix
• Various types of user studies

85

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Formulation
• Given a feature vector, predict a continuous value, e.g. given day of the year and humidity, predict temperature
• Parametric
  - Linear regression
  - Ordinary least squares
• Non-parametric
  - Kernel regression
  - Regression trees

86

Uncertainty and Time

Tuesday 22 October 13

ICASSP 2013 tutorial

Regression Evaluation
• Goodness of fit
  - Coefficient of determination, R squared (in linear regression, the squared correlation coefficient between true and predicted values)
• Statistical significance of results
  - Parametric methods
  - F-test
  - t-test for individual parameters
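Goodness of fit can be illustrated with an ordinary least squares line through a handful of invented points, followed by R squared computed from the residuals.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

def r_squared(ys, predicted):
    """1 - (residual sum of squares) / (total sum of squares)."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predicted))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(ys, [slope * x + intercept for x in xs])
```

For these points the fitted line is roughly y = 0.6 x + 2.2 with an R squared of about 0.6, meaning the line explains about 60% of the variance in y.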


Surrogate Sensors

Use direct sensors to “learn” indirect acquisition

Use augmented instrument for training
Record acoustic signal
Train model to associate direct sensor with the acoustic signal
Evaluate and iterate
Use trained model in non-

“Training Surrogate Sensors in Musical Gesture Acquisition Systems”, IEEE Trans. on Multimedia, 2011. A. Tindale, A. Kapur, G. Tzanetakis


Surrogate Sensing and the Ground Truth problem

Two case studies: Regression, Classification

Some Results

Advantages
Hard-to-build augmented instrument is only used for training
No modifications required
Unlimited supply of training data for the machine learning model
TRAIN BY PLAYING is much more fun than TRAIN BY ANNOTATING


Sensor Fusion
• Multiple sensor streams need to be combined to make a decision
• Multiple rates might require interpolation, either of input, output or intermediate stages
• Various possible architectures combining machine learning building blocks
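The multi-rate point above can be illustrated by resampling a slow sensor stream onto the time grid of a faster one. This is a minimal sketch assuming linear interpolation, sorted timestamps and clamping outside the recorded range; the sample values and rates are made up.

```python
def resample_linear(times, values, new_times):
    """Linearly interpolate (times, values) at new_times.

    Assumes times is strictly increasing and new_times is sorted ascending.
    Values outside the recorded range are clamped to the first/last sample.
    """
    out = []
    i = 0
    for t in new_times:
        if t <= times[0]:
            out.append(values[0])
            continue
        if t >= times[-1]:
            out.append(values[-1])
            continue
        while times[i + 1] < t:
            i += 1  # advance to the segment that contains t
        t0, t1 = times[i], times[i + 1]
        w = (t - t0) / (t1 - t0)
        out.append(values[i] + w * (values[i + 1] - values[i]))
    return out


# Slow sensor sampled every 100 ms, resampled onto a 50 ms grid.
slow_times_ms = [0, 100, 200]
slow_values = [0.0, 1.0, 3.0]
fast_times_ms = [0, 50, 100, 150, 200]
resampled = resample_linear(slow_times_ms, slow_values, fast_times_ms)
```

Note that interpolating between two samples needs the later one, so in a live system this adds a delay of up to one slow-sensor period.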


Sensor Fusion

Early and late are the extremes of a full spectrum of possibilities

[Figure: block diagram with two parallel chains of Feature Extraction, Dimensionality Reduction, Feature Selection and Classification]

Multi-modal Results

Main idea: use camera to constrain factorization results, taking advantage of uncorrelated errors

Causality and Real Time
• Causal algorithms only need knowledge of the past to operate, i.e. cannot “look” ahead
• Causality is a necessary but not sufficient condition for real-time performance
• Real-time: the processing is done, with some delay, at the same time as the sensor data arrives
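The difference between a causal and a non-causal algorithm can be seen with two moving-average smoothers. This is an illustrative sketch, not taken from the slides: the causal version only looks at the current and past samples, while the centred version needs future samples and therefore cannot run as the data arrives.

```python
def causal_moving_average(x, n):
    """y[t] averages x[t-n+1 .. t]: past and present samples only."""
    out = []
    for t in range(len(x)):
        window = x[max(0, t - n + 1): t + 1]
        out.append(sum(window) / len(window))
    return out


def centered_moving_average(x, n):
    """Non-causal: y[t] averages x[t-h .. t+h], so it needs h future samples."""
    h = n // 2
    out = []
    for t in range(len(x)):
        window = x[max(0, t - h): t + h + 1]
        out.append(sum(window) / len(window))
    return out


# A single impulse at t = 2.
x = [0.0, 0.0, 3.0, 0.0, 0.0]
causal = causal_moving_average(x, 3)
centered = centered_moving_average(x, 3)
```

The causal output reacts only from t = 2 onwards, whereas the centred output already responds at t = 1, before the impulse has occurred.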


Dynamic Time Warping
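A minimal sketch of the classic dynamic-programming formulation of dynamic time warping follows. It assumes one-dimensional sequences and absolute difference as the local cost, with no path constraints; the slide itself does not specify these details.

```python
def dtw_distance(a, b):
    """Cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: best alignment cost of a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# The same gesture played at two different speeds aligns with zero cost.
slow = [1.0, 1.0, 2.0, 3.0, 3.0, 4.0]
fast = [1.0, 2.0, 3.0, 4.0]
same_shape = dtw_distance(fast, slow)
shifted = dtw_distance([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
```

This version needs both complete sequences, so it is not causal; online variants exist but are not shown here.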


Modeling Uncertainty over
• Time series of snapshots of the world “state” we are interested in, represented as a set of random variables (RVs)
  – Observable
  – Hidden
• Stationary process (not static)
• Markovian property (current state depends only on finite history, typically just the previous time slice)
• Transition model: P(current state | previous state)


Inference tasks in temporal
• Filtering: posterior distribution over current state given evidence = likelihood of evidence
• Prediction: posterior distribution of future state given evidence to date
• Smoothing: posterior distribution of past state given all evidence up to the present
• Most likely explanation: given a sequence of observations, the most likely sequence of states that has generated them
• EM algorithm
  – Estimate what transitions occurred and what states generated the sensor reading, and update models
  – Updated models provide new estimates and
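The filtering task above can be sketched for a small discrete model. This is a minimal illustration assuming a two-state model with made-up transition and emission probabilities, and assuming the prior describes the state one step before the first observation.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations up to t) for every t.

    transition[i][j] = P(next state j | current state i)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction: push the current belief through the transition model.
        predicted = [
            sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
        ]
        # Update: weight by the likelihood of the observation, then normalise.
        unnormalised = [predicted[j] * emission[j][obs] for j in range(n)]
        total = sum(unnormalised)
        belief = [u / total for u in unnormalised]
        history.append(belief)
    return history


# Illustrative numbers only.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.1, 0.9], [0.8, 0.2]]  # state 0 mostly emits observation 1
beliefs = forward_filter(prior, transition, emission, [1, 1])
```

Only the previous belief is needed at each step, so the computation is causal and its cost per time slice is constant.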


Hidden Markov Models I

[Figure: hidden Markov model unrolled over time slices 1 to 4. Labels: Hidden State (single discrete variable), Observations, and the MODEL, consisting of Transition Probs P( · | · ) between slices t-1 and t and Emission Probs p( · | · ) at slice t]
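Given transition and emission probabilities like those in the figure, the most likely hidden state sequence can be found with the Viterbi algorithm. The sketch below is an assumption-laden illustration: the probabilities are made up, and the prior is taken to apply directly to the first hidden state.

```python
def viterbi(prior, transition, emission, observations):
    """Most likely hidden state sequence for a discrete HMM.

    transition[i][j] = P(state j at t | state i at t-1)
    emission[j][o]   = P(observation o | state j)
    """
    n = len(prior)
    # prob[j]: probability of the best path so far that ends in state j.
    prob = [prior[j] * emission[j][observations[0]] for j in range(n)]
    backpointers = []
    for obs in observations[1:]:
        step_prob, step_back = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: prob[i] * transition[i][j])
            step_prob.append(prob[best_i] * transition[best_i][j] * emission[j][obs])
            step_back.append(best_i)
        prob = step_prob
        backpointers.append(step_back)
    # Trace the best path backwards from the most probable final state.
    state = max(range(n), key=lambda j: prob[j])
    path = [state]
    for step_back in reversed(backpointers):
        state = step_back[state]
        path.append(state)
    path.reverse()
    return path


# Illustrative numbers only.
prior = [0.5, 0.5]
transition = [[0.7, 0.3], [0.3, 0.7]]
emission = [[0.1, 0.9], [0.8, 0.2]]  # state 0 mostly emits 1, state 1 mostly emits 0
path = viterbi(prior, transition, emission, [1, 1, 0, 0, 1])
```

For long sequences the products underflow, so practical implementations usually work with log probabilities.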


Kalman Filtering
• Streams of noisy input data
• Basic idea: t -> t+1
  – Prior knowledge of state
  – Prediction step (based on some model)
  – Update step (compare prediction to measurements)
  – Readjust model
  – Output estimate of state
• Statistically optimal estimate of system state


Kalman Filter
• Linear Gaussian conditional distributions represent state and sensor models
• LG: P(x | y) = N(a_y y + b_y, σ_y)(x)
• Next state is a linear function of current state plus some Gaussian noise, e.g. constant dx/dt
• Forward step: mean + covariance matrix at t produces mean + covariance matrix at t+1
• Trade-off between observation reliability and model reliability
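The predict/update cycle can be sketched for the simplest possible case. This is a scalar filter under an assumed random-walk state model (next state = current state + noise) rather than the constant-velocity model mentioned above; the parameter names `q` (process noise variance) and `r` (measurement noise variance) are chosen for this example.

```python
def kalman_1d(measurements, x0, p0, q, r):
    """Scalar Kalman filter for x[t+1] = x[t] + process noise.

    x0, p0: initial state estimate and its variance
    q: process noise variance (how much the model is distrusted)
    r: measurement noise variance (how much the sensor is distrusted)
    """
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Prediction step: the state estimate stays put, uncertainty grows.
        p = p + q
        # Update step: the gain k balances model against measurement.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates


estimates = kalman_1d([1.0, 1.0, 1.0], x0=0.0, p0=1.0, q=0.0, r=1.0)
```

A large `r` relative to `q` makes the gain small, so the filter trusts its model; a small `r` makes it follow the measurements closely, which is the trade-off named in the last bullet.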


Multimodal tempo detection for the E-sitar


Beat tracking


Human-Computer Interaction
• The discipline that studies the interaction between humans and machines
• Fundamental concept: everything should be user-centered
• Evaluation is not as straightforward, and a variety of different techniques have been proposed
• Typically not familiar to those coming


Quality of Multimedia
• New subfield in multimedia
• Methods for evaluating multimedia quality and user experience
• User-centered approach
• Combines objective metrics and subjective testing


ethnography

• origin: anthropology
• basic idea: studying people “in the wild”
• research: ethnographers attempt to understand a workplace through immersion, extended contact and subsequent analysis
• most useful very early in development: build an understanding of existing (work) practices thorough enough to illuminate the possibilities for, and implications of, introducing technology
• ethnographic studies most often provide warnings: detailed descriptions of work practices that new technology may disrupt
• e.g. Lucy Suchman, formerly at Xerox PARC; ethnography of air traffic controllers


ethnography

• up side
  – comprehensive understanding of current (work) practices
  – greater ability to predict the impact of a new or re-designed technology
  – possibly greater buy-in for the system
• down side
  – principal cost is time, both the ethnographer’s and the users’
  – could perpetuate negative aspects of current practices
  – can produce vast (unmanageable) amount of data
  – output is description of practices rather than specific designs
    • ethnographers are not trained as designers; taught to “interfere” as little as possible with the community


participatory design

• Users (often musicians) become 1st class members in the design process
  – active collaborators vs passive participants (e.g. interviewees)
• users considered subject matter experts
• iterative process: all design stages subject to revision

side note: origins in Scandinavia

participatory design

• up side
  – users are excellent at reacting to suggested system designs
    • designs must be concrete and visible
  – users bring in important “folk” knowledge of work context
    • knowledge may be otherwise inaccessible to design team
  – greater buy-in for the system often results
• down side
  – hard to get a good pool of end users
    • expensive, reluctant
  – users are not expert designers
    • don’t expect them to come up with design ideas from scratch
  – the user is not always right
    • don’t expect them to know what they want
  – conservative bias to perpetuate current practices
    • don’t expect them to fully exploit the potential of new technologies


Wizard of Oz
• A method of testing a system that does not exist
  – the voice editor, by IBM (1984)

[Figure: “The Wizard” and “What the user sees”]


Wizard of Oz
• human simulates the system’s intelligence and interacts with user
• uses real or mock interface
  – “Pay no attention to the man behind the curtain”
• user uses computer as expected
• “wizard” (sometimes hidden)
  – interprets subject’s input according to an algorithm
  – has computer/screen behave in appropriate manner
• good for
  – adding simulated and complex vertical functionality
  – testing futuristic ideas
• possible cons


Eat your own dogfood
• Frequently programmers don’t use the software they write
• Dogfooding is the process of regularly using the software you write and providing feedback for improving it
• Very helpful in designing multi-modal interfaces, but frequently ignored


Parametric and non-parametric tests

• Parametric
  – Assume normality for relevant distributions; work in parameter space (means and variances)
  – Student t-test and ANOVA
• Non-parametric (no normality assumption)
  – Kruskal-Wallis
  – Friedman test


Student t-test
• Null hypothesis: both groups are random samples of the same population
  – p-value: probability of the observed result arising by chance
• Devised as a way to monitor the quality of stout (“Student” was a pen name, used so that Guinness would protect the idea of using stats)
• Independent and paired variants
  – Control group and treatment group (n = participants in each group)
  – Same group before and after treatment
  – Assumptions: sample size, variance
    • Equal sample size and equal variance
• One- and two-tailed variants for the p-value, which is determined from t


the t-test
• the point: establish a confidence level in the difference we’ve found between 2 sample means
• the process
  1. compute df
  2. choose desired significance p (aka α)
  3. calculate value of the t statistic
  4. compare it to the critical value of t given p, df: t(p,df)
  5. if t > t(p,df), can reject null hypothesis at


significance p
• measure of the area of the normal distribution occupied by the null hypothesis = the chance you might be wrong
• null hypothesis rejection area

[Figure: distribution with sample means X1 and X2, the critical value t(p,df), and the region(s) for rejecting the null hypothesis]


calculating t
• compute combined variance for the two samples
• compute standard error of difference, sed
• compute t

note: df computation
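The formulas on this slide did not survive extraction, so the steps above are sketched here using the standard pooled-variance form of the independent two-sample t-test (equal variances assumed, df = n1 + n2 − 2). The sample values are made up, and whether the slide used exactly this form is an assumption.

```python
import math


def pooled_t(sample1, sample2):
    """Independent two-sample t statistic with pooled variance."""
    n1, n2 = len(sample1), len(sample2)
    mean1 = sum(sample1) / n1
    mean2 = sum(sample2) / n2
    ss1 = sum((x - mean1) ** 2 for x in sample1)
    ss2 = sum((x - mean2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df                          # combined variance
    sed = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))    # standard error of difference
    t = (mean1 - mean2) / sed
    return t, df


# Made-up scores for a treatment group and a control group.
treatment = [5.0, 6.0, 7.0, 8.0, 9.0]
control = [3.0, 4.0, 5.0, 6.0, 7.0]
t, df = pooled_t(treatment, control)
```

Here t is 2.0 with df = 8. The two-tailed critical value for α = 0.05 and df = 8 in standard tables is about 2.306, so the null hypothesis would not be rejected at that level.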


comparing t with critical
• look t(p,df) up in a pre-computed table
  – in back of any statistics textbook
  – on web, e.g.
    http://www.medcalc.be/manual/mpage13-04b.html
    http://www.statsoft.com/textbook/stathome.html
• or (now most common) compute the p for your t(df) directly
  – Matlab or Excel functions
  – web calculators (e.g. search on “student’s t-

[Table: critical values of t; column headings for two-tailed α: 0.2, 0.1, 0.05, 0.02, 0.01, 0.002, 0.001]

Anova I
• Generalizes t-test to more than 2 groups
• Observed variance is partitioned into different sources of variation
• ANOVA: widely used (and probably abused) technique in psychological research
• Variants (models I, II, III)


Anova II
• ANOVA statistical significance results are independent of scaling and bias
• It boils down to computing various means and variances, dividing two variances, and comparing the ratio to a table to determine significance
• Variants: one-way ANOVA, factorial ANOVA
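The "dividing two variances" step can be sketched for the one-way case. This is a minimal illustration of the standard F statistic (between-group mean square over within-group mean square); the group data are made up.

```python
def one_way_anova_f(groups):
    """F statistic for one-way ANOVA, with its two degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    df_between = k - 1
    df_within = n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f, df_b, df_w = one_way_anova_f(groups)

# Rescaling and shifting every observation leaves F unchanged.
rescaled = [[10.0 * x + 5.0 for x in g] for g in groups]
f_rescaled, _, _ = one_way_anova_f(rescaled)
```

The resulting F is then compared against a table (or a computed p-value) for the pair of degrees of freedom.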


Integration and

• Typically several different software packages
• Laundry list of names
  – Arduino, Processing, OpenFrameworks, Max/MSP, PureData, OpenCV, Marsyas, ChucK, SuperCollider
• Integration is not trivial
• Case studies illustrate how the various topics covered in the tutorial can be combined into coherent multi-modal interfaces


Theremin
• Leon Theremin, 1928
• senses hand position relative to antennae
  – two antennae
  – change in electrostatic field translated to pitch control of heterodyne oscillators
  – beat frequency amplified
• Clara Rockmore playing


Electronic Sackbut (Le Caine 1940s)

• sensor keyboard
  – downward and side-to-side
  – potentiometers
• right hand can modulate loudness and pitch
• left hand modulates waveform

Science Dimension volume 9 issue 6 1977

Canada Science and Technology Museum


Glove-TalkII

• Translates hand gestures to speech
  – like a musical instrument
• mapping partially learned
• speech is
  – intelligible
  – expressive
  – slower than normal


Spectrum of Gesture-to-Speech Mappings

[Figure: spectrum of mappings, from Artificial Vocal Tract through Phoneme Generator, Finger Spelling and Syllable Generator to Word Generator. Systems shown: Von Kempelen (1790); Bell & Bell (1880); Dudley et al. (1939); Fels & Hinton (1998); Kramer & Leifer (1989); Fels & Hinton (1990). Axis: approximate time/gesture for connected speech (msec): 10-30, 100, 130, 200, 500]


Glove-TalkII Vocabulary
• Loosely based on articulatory model of speech
• Vowels
  – open configuration of hand represents open vocal tract
  – X, Y position determines vowel sounds (like tongue)
• Consonants
  – constrictions in hand represent constriction in vocal tract
• Stop consonants produced with ContactGlove
• Volume controlled with foot pedal (air pressure)
• Pitch controlled by Z position of hand (vocal cord tension)


GTII Mapping

• 26+ dimensions
• constrained subspace
• 10 dimensions

Input Output


GTII Mapping
• Ad hoc
• PCA
• Neural Networks
• others


GTII Neural Networks
• 3 neural networks used
  – Mixture of experts model
    • Vowel/Consonant decision network
    • Vowel Expert network
    • Consonant Expert network
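One common way to combine two experts with a gating decision is a soft, probability-weighted blend. The sketch below assumes that form; the slides name a "Combining Function" but do not spell it out, so the function `combine_experts` and the numbers are hypothetical.

```python
def combine_experts(p_vowel, vowel_out, consonant_out):
    """Blend two expert outputs, weighted by the gating probability.

    p_vowel: output of the vowel/consonant decision network, in [0, 1]
    vowel_out, consonant_out: parameter vectors of equal length
    """
    return [
        p_vowel * v + (1.0 - p_vowel) * c
        for v, c in zip(vowel_out, consonant_out)
    ]


# Made-up two-parameter outputs from each expert.
vowel_params = [0.0, 2.0]
consonant_params = [2.0, 4.0]

pure_vowel = combine_experts(1.0, vowel_params, consonant_params)
half_way = combine_experts(0.5, vowel_params, consonant_params)
```

A soft blend like this changes smoothly as the hand moves between vowel and consonant configurations, instead of switching abruptly.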

Glove-TalkII System

[Figure: system block diagram. Inputs: Right Hand Data (x, y, z, roll, pitch, yaw at 60 Hz; 10 flex angles, 4 abduction angles, thumb and pinkie rotation, wrist pitch and yaw at 100 Hz), Contact Switches, Foot Pedal. Blocks: Preprocessor, Fixed Pitch Mapping, V/C Decision Network, Vowel Network, Consonant Network, Fixed Stop Mapping, Combining Function, Synthesizer. Output: Speech Output]

Vowel/Consonant Network
• 10 - 5 - 1 layer network
  – Input: 10 flex angles
    • 4x2 finger joints
    • Thumb abduction and thumb rotation
  – Output
    • Probability of vowel
  – Training
    • 2600 consonants, 700 vowels
    • 0 error
  – Testing
    • 1380 consonants, 234 vowels
    • 0 error


GTII Vowel Network
• Various networks tried
  – Ended up with normalized RBF network
• 2 - 11 - 8 normalized RBF network
  – Fixed std deviation (0.15)
  – Input: x, y values
  – Output: 8 formant parameters
• Training
  – 100 examples of each vowel with random noise added
  – 0.1 error
• Testing
  – 50 examples of each vowel


A Normalized RBF Network

• Radially centred activation units
  – Gaussian activation
    • Weights are centre
  – Normalized over all units in group
    • Hidden units


Normalized RBF Units
• RBF is active as gesture nears centre
• Normalization
  – interpolation according to width parameter
  – Plateaus around nearest centre
    • Closest RBF dominates
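The behaviour described above can be sketched for a hidden layer of normalized Gaussian units. This is an illustrative sketch, not the original implementation: the two centres are made up, and the shared width of 0.15 is borrowed from the fixed standard deviation quoted for the vowel network.

```python
import math


def normalized_rbf(x, centres, width):
    """Gaussian RBF activations, normalised to sum to 1 over the layer."""
    activations = [
        math.exp(-sum((xi - ci) ** 2 for xi, ci in zip(x, c)) / (2.0 * width ** 2))
        for c in centres
    ]
    total = sum(activations)  # note: can underflow to 0 far from every centre
    return [a / total for a in activations]


# Two made-up centres in the x, y plane.
centres = [(0.0, 0.0), (1.0, 0.0)]

near_first = normalized_rbf((0.1, 0.0), centres, width=0.15)
midpoint = normalized_rbf((0.5, 0.0), centres, width=0.15)
```

Near a centre the closest unit takes almost all of the normalised activation (the plateau), and halfway between two centres they share it equally; a network output would then be a weighted sum of these hidden activations.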


Consonant Network
• 10 - 14 - 9 normalized RBF network
  – Input: finger and thumb angles
    • Note: added thumb abduction and rotation later
  – Output: formant parameters and voicing
• Training
  – 350 approximants, 1510 fricatives and 740 nasals
  – 0.05 error
• Testing
  – 255 approximants, 960 fricatives, 160 nasals
  – 1 error
• Dependent on user

Glove-TalkII System
• 3 neural nets
• Output: Parallel Formant Speech Synthesizer
  – ALF, F1, A1, F2, A2, F3, A3, AHF, V, F0
  – 100 Hz, 6 bit quantization [0, 63]

Glove-TalkII

DIVA in the hands of

Magic Eyes

Leveraging traditional

Phantom Faders

Use the actual acoustic instrument as a control surface, inspired by Marimba Lumina

Percussion Robots

Tele-operation

Drum sound classification

Self-calibration and mapping based on listening

Physical Modeling

System Architecture

Feedback Loop

Playing a piece

Summary

Covered many aspects
• Sensors and Actuators
• Signals and Features
• Uncertainty and time
• Human Computer Interaction
• Integration and implementation
• Case Studies

Summary

• Many resources available: www.nime.org
• Many educational programs available
• Musical Instruments are the ultimate multi-modal interfaces
• Learning to play music is a lifelong pursuit
• NIMEs are a great domain to design, test and evaluate radical ideas for HCI

Questions

www.nime.org

Sid: ssfels@ece.ubc.ca
George: gtzan@cs.uvic.ca