

Buenos Aires – 5 to 9 September, 2016 Acoustics for the 21st Century…

PROCEEDINGS of the 22nd International Congress on Acoustics

General Musical Acoustics: Paper ICA2016-692

Hidden melody in music playing motion: Music recording using optical motion tracking system

Min-Ho Song(a)

(a) fourMs group, Department of Musicology, University of Oslo, Norway

Abstract

This paper presents a feasibility study of recording sound with optical marker-based motion tracking cameras. An optical marker-based motion tracking system records the motion of a moving object using multiple high-speed infrared (IR) cameras. Recent developments of these devices enable capturing detailed motion with a high spatial precision of 0.01m and a high sampling rate of up to 10kHz. Therefore, not only the global movements of the human body or handheld instruments but also the local acoustic vibrations can be recorded within the motion data, and these vibrations can be transformed into the actual sound radiating from the acoustic instrument.

To evaluate the feasibility, several lightweight reflective markers were attached to various positions on string instruments. Several musical excerpts were selected with the cameras’ Nyquist limit in mind. Professional players played the instruments, varying the loudness of the excerpts, and the playing motions were recorded with a high-quality optical motion tracking system. Since the global motion trajectory is relatively slow, with frequency components below 10Hz, an audible signal could be retrieved from the motion tracking data with a high-pass filter. Although the current professional motion tracking system requires a significantly high signal-to-noise ratio and can only retrieve sound well below 5kHz, the results of the experiment show that an optical marker-based motion tracking system can be useful for recording sound information from the visual domain.

Keywords: Optical Motion Tracking, Sonification, Music Playing Motion, Sound Retrieval


1 Introduction

An optical motion tracking system is one of the most widely used motion data acquisition methods in studies of musical gesture and motion. The system consists of multiple high-speed infrared (IR) cameras that record the trajectories of moving points (retro-reflective markers) in three-dimensional space over time. Recent improvements in these cameras make it possible to record at HD resolution and at high frame rates (up to 10kHz), which enables capturing detailed, fast motion that could not be seen before.

In this study, we try to retrieve the sound radiated by a musical instrument from the local acoustic vibrations that are recorded, together with the musician’s global movements, by optical motion tracking cameras.

Related works use high-speed cameras to recover sound from visual data [1, 2]. Since these works rely on image-processing techniques rather than physical markers, they have the advantage of a simpler data acquisition process. However, we decided to use marker-based recording in this study because it has some advantages over marker-less methods. First, marker-based measurement is more accurate than marker-less techniques. Second, fast-varying data can be retrieved even from a moving object, because the 3D trajectory of the object is also known.

The method is straightforward: the trajectories of the attached reflective markers are recorded at a high frame rate, the global movement caused by the musician’s motion is removed, and the remaining vibration is transformed into sound.

2 Constraints

2.1 Nyquist-Shannon criterion

Table 1: Frequency range of conventional motion capture cameras

                  Optitrack (fps)   ViCon (fps)   Qualisys (fps)
Normal mode       100 - 360         60 - 420      180 - 484
High-speed mode   unknown           unknown       10000

According to the Nyquist-Shannon criterion [3, 4], only sound with a bandwidth lower than half of the optical camera’s maximum sampling rate (frames per second, fps) can be reconstructed. Table 1 shows the frame-rate ranges of several conventional optical motion tracking cameras. It should be noted that, in normal mode, only sound with a spectrum below roughly 200Hz can theoretically be retrieved. However, the frame rate is limited by the data processing capacity of the device, and if we sacrifice the field of view (FOV), the measurable bandwidth can increase up to 5kHz (half of 10kHz): not perfect, but a reasonable range for retrieving sound.
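As a quick check of these figures, the following minimal sketch computes the theoretical bandwidth for the maximum frame rates listed in Table 1. The function name `nyquist_bandwidth_hz` and the dictionary keys are illustrative names chosen here, not part of any camera vendor's software.

```python
def nyquist_bandwidth_hz(frame_rate_fps: float) -> float:
    """Highest frequency that can be reconstructed at a given frame rate."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    return frame_rate_fps / 2.0


# Maximum frame rates taken from Table 1 (fps).
max_frame_rates = {
    "Optitrack (normal)": 360,
    "ViCon (normal)": 420,
    "Qualisys (normal)": 484,
    "Qualisys (high-speed)": 10000,
}

# Theoretical bandwidth per camera and mode, in Hz.
bandwidths = {
    name: nyquist_bandwidth_hz(fps) for name, fps in max_frame_rates.items()
}

for name, bandwidth in bandwidths.items():
    print(f"{name}: up to {bandwidth:.0f} Hz")
```

The normal-mode limits fall between 180Hz and 242Hz, which is where the figure of roughly 200Hz comes from.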

2.2 Detectability

To detect the position of a marker in space, optical motion capture cameras emit IR light and receive the light reflected back from the marker. If the reflected light energy ($E_{\mathrm{refl}}$) is lower than the detectability threshold ($E_{\mathrm{thr}}$) of the photo sensor, the marker cannot be detected. To ensure that the marker is detected, the emitted IR light should have a high intensity ($I_{\mathrm{emit}}$), or the surface of the marker ($S$) should be large enough to reflect sufficient energy. The condition for successful detection is shown in Eq. (1), where $f$ is the operating frequency (frame rate) of the camera.

$$E_{\mathrm{refl}} \propto \frac{I_{\mathrm{emit}}}{f} \cdot S \gg E_{\mathrm{thr}} \qquad (1)$$

When the camera operates at a high frequency, the emitted light energy per frame decreases because the shutter time becomes shorter. In other words, the camera does not have enough time to emit adequate light energy. For example, when operating at 200Hz, the camera can ideally keep the shutter open for 5ms, but at 10kHz it has only 0.1ms. This causes a detectability problem and creates a need for larger markers, as Eq. (1) shows. However, attaching large markers to a musical instrument can be a problem because they can restrict the musician’s freedom of movement or even change the sound quality of the instrument. Therefore, when recording music playing motion, the size of the reflective marker should be selected first, and the retrievable frequency bandwidth is then determined by that choice.
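The shutter-time arithmetic can be sketched as follows. This is an illustration only: it takes the reflected energy to scale as intensity times marker surface divided by frame rate, as described in the text, and the function names and unit-free values are assumptions made for this example.

```python
def max_exposure_ms(frame_rate_hz: float) -> float:
    """Upper bound on the shutter-open time per frame, in milliseconds."""
    if frame_rate_hz <= 0:
        raise ValueError("frame rate must be positive")
    return 1000.0 / frame_rate_hz


def relative_reflected_energy(frame_rate_hz: float,
                              marker_surface: float,
                              intensity: float = 1.0) -> float:
    """Quantity proportional to the reflected energy: I / f * S (arbitrary units)."""
    return intensity / frame_rate_hz * marker_surface


exposure_200hz = max_exposure_ms(200.0)      # shutter budget at 200 Hz
exposure_10khz = max_exposure_ms(10000.0)    # shutter budget at 10 kHz

# Factor by which the marker surface (or the light intensity) would have to
# grow to keep the reflected energy unchanged when going from 200 Hz to 10 kHz.
surface_factor = (relative_reflected_energy(200.0, 1.0)
                  / relative_reflected_energy(10000.0, 1.0))

print(exposure_200hz, exposure_10khz, surface_factor)
```

Under this simple proportionality, moving from 200Hz to 10kHz costs a factor of about 50 in reflected energy, which has to be recovered through a larger marker surface or a higher light intensity.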

2.3 Residual

Several factors influence the error in estimating the positions of markers [5]. Since calculating the position of a marker requires a matrix inversion, it naturally involves numerical error. In addition, the intrinsic and extrinsic parameters of the motion capture cameras are not free from errors [6]. These errors mean that a marker position can only be determined with limited resolution, because of random noise. If the small signal of interest has a low signal-to-noise ratio (SNR), it either cannot be retrieved or needs a dedicated noise reduction filter to increase the SNR.
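To make the SNR constraint concrete, the sketch below estimates the SNR of a small sinusoidal vibration buried in random position noise. The vibration amplitude, the noise level, and the sampling rate are hypothetical values chosen for illustration; they are not measurements from this study.

```python
import numpy as np


def snr_db(signal: np.ndarray, noise: np.ndarray) -> float:
    """Signal-to-noise ratio in dB, computed from mean powers."""
    signal_power = np.mean(np.square(signal))
    noise_power = np.mean(np.square(noise))
    return float(10.0 * np.log10(signal_power / noise_power))


rng = np.random.default_rng(0)      # fixed seed so the sketch is repeatable
sample_rate_hz = 8000               # assumed camera frame rate
t = np.arange(sample_rate_hz) / sample_rate_hz   # one second of samples

# Hypothetical marker vibration: 0.05 mm amplitude at 100 Hz.
vibration_mm = 0.05 * np.sin(2.0 * np.pi * 100.0 * t)
# Hypothetical random position noise with 0.01 mm standard deviation.
noise_mm = rng.normal(0.0, 0.01, size=t.size)

estimated_snr_db = snr_db(vibration_mm, noise_mm)
print(f"SNR: {estimated_snr_db:.1f} dB")
```

If the vibration amplitude drops toward the noise standard deviation, the ratio approaches 0dB and the vibration is no longer distinguishable from the residual without additional filtering.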

3 Experiments

3.1 Pre-test: Loudspeaker


In this pre-test, we checked whether a fast vibration (with no global movement) at frequencies above the range of the normal mode can be recorded with the high-speed mode of the cameras. A linear sine sweep (100Hz to 4kHz, duration of 10 seconds, Figure 1 left-bottom) was fed to a loudspeaker (Marantz LD20), and the vibration of the diaphragm was recorded with four motion capture cameras (Oqus 400, Qualisys Ltd.).
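A test signal of this kind can be generated as in the following sketch. The 44.1kHz audio sample rate and the function name `linear_sine_sweep` are assumptions for illustration; the paper does not state how the sweep was produced.

```python
import numpy as np


def linear_sine_sweep(f_start_hz: float, f_end_hz: float,
                      duration_s: float, sample_rate_hz: int):
    """Return (t, x) for a sine whose frequency rises linearly over time."""
    n_samples = int(round(duration_s * sample_rate_hz))
    t = np.arange(n_samples) / sample_rate_hz
    sweep_rate = (f_end_hz - f_start_hz) / duration_s   # Hz per second
    # The phase is the integral of the instantaneous frequency
    # f(t) = f_start + sweep_rate * t, multiplied by 2*pi.
    phase = 2.0 * np.pi * (f_start_hz * t + 0.5 * sweep_rate * t ** 2)
    return t, np.sin(phase)


# Sweep parameters from the pre-test: 100 Hz to 4 kHz over 10 seconds.
t, sweep = linear_sine_sweep(100.0, 4000.0, 10.0, sample_rate_hz=44100)
print(sweep.size, sweep.min(), sweep.max())
```

Because the sweep ends at 4kHz, a camera frame rate of 8kHz is the lowest rate at which the whole sweep stays within the Nyquist limit.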

Recording at a sampling rate of 10kHz was not possible with a 4mm lightweight half-spherical marker because of the detectability (darkening) problem. At a sampling rate of 8kHz, we could obtain the marker trajectory shown in Figure 1 (middle-top). Comparing the spectrogram of the original linear sine sweep with that of the retrieved signal (Figure 1 middle-bottom), we can see the rising frequency component in the low-frequency region, but it disappears at around 2kHz (the retrieved sound is audible up to 2kHz). This is due to a characteristic of the moving-coil loudspeaker: its mechanical impedance increases at higher frequencies [7]. As the mechanical impedance increases, the vibration amplitude of the diaphragm decreases and is masked by the random noise of the cameras.

When we increased the input signal by 60dB, the retrieved sound was audible up to around 3kHz. This implies that the motion capture cameras can record sound up to several kHz, a range that contains important features of speech and music signals. The result shows that if the random measurement noise is low, or the sound source has high electroacoustic efficiency, sound at higher frequencies could be retrieved.

Figure 1: Linear sine sweep signal is retrieved from the loudspeaker diaphragm.


3.2 Bass guitar playing

In this experiment, we tried to retrieve sound from a real music-playing situation. We tested various string instruments, but on thin strings the attached marker acted as an external load on the string; therefore, a 6-string bass guitar was selected as the test instrument. Six 4mm half-spherical markers were attached to the strings, at positions directly above the piezo transducers. The musician played a chromatic scale on the ‘A’ string (3rd string) and was asked to move freely while playing. In Figure 2, the marker trajectory of the ‘A’ string (3rd yellow point in the middle) shows the slow trend of the playing movement together with a small high-frequency vibration. To remove the player’s movement, a high-pass filter with a cutoff frequency of 20Hz was applied, and the result was transformed into a sound signal. The result is given in Figure 3. Although there is some loss in the high-frequency components, the chromatic scale is clearly retrieved and clearly audible.
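The filtering step can be illustrated with a minimal sketch on synthetic data. The paper does not specify the filter design, so a simple FFT-based brick-wall high-pass is used here as a stand-in. The 8kHz frame rate, the 2Hz body motion, the 55Hz string vibration, and all amplitudes are hypothetical values chosen for this example.

```python
import numpy as np


def highpass_fft(x: np.ndarray, sample_rate_hz: float,
                 cutoff_hz: float = 20.0) -> np.ndarray:
    """Brick-wall high-pass: zero every frequency bin below cutoff_hz."""
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sample_rate_hz)
    spectrum[freqs < cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=x.size)


def to_audio(x: np.ndarray) -> np.ndarray:
    """Scale a signal to the range [-1, 1] for playback or export."""
    peak = np.max(np.abs(x))
    return x / peak if peak > 0 else x


# Synthetic stand-in for one coordinate of a string marker, in mm.
sample_rate_hz = 8000                                     # assumed frame rate
t = np.arange(sample_rate_hz) / sample_rate_hz            # one second
body_motion = 30.0 * np.sin(2.0 * np.pi * 2.0 * t)        # slow player movement
string_vibration = 0.2 * np.sin(2.0 * np.pi * 55.0 * t)   # small fast vibration
trajectory = body_motion + string_vibration

vibration_only = highpass_fft(trajectory, sample_rate_hz, cutoff_hz=20.0)
audio = to_audio(vibration_only)
print(np.max(np.abs(vibration_only)))
```

In this synthetic case the slow movement is over a hundred times larger than the vibration, yet it is removed completely because it lies entirely below the 20Hz cutoff. A real recording would also contain measurement noise above the cutoff, which this filter leaves untouched.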

Figure 2: 6-string bass guitar playing for the experiment of sound recording using motion capture cameras. The motion of the ‘A’ string (yellow point in right-top figure) contains both the musician’s motion and the string vibration (right-bottom figure).


Figure 3: Chromatic scale is played and retrieved using motion capture cameras.

4 Conclusions

The experimental results show that improvements in motion capture cameras now let us see fast and small movements that we could not see before. The method has several disadvantages: it requires special cameras, and a reflective marker must be attached to the sound-producing position, which may alter the target sound. However, if we can overcome the three constraints, there is a chance of using smaller markers at higher frame rates and thus detecting very small movements with optical motion capture cameras.

Acknowledgments

The author would like to thank Anders Tveit for participating in the motion recording.

References

[1] Davis, A.; Rubinstein, M.; Wadhwa, N.; Mysore, G. J.; Durand, F.; Freeman, W. T. The visual microphone: passive recovery of sound from video. ACM Transactions on Graphics, 33(4), 2014, pp. 79(1)-79(10).

[2] Akutsu, M.; Oikawa, Y.; Yamasaki, Y. Extract voice information using high-speed camera. Proceedings of Meetings on Acoustics, 19(1), 055019 (2013).

[3] Nyquist, H. Certain topics in telegraph transmission theory. Transaction in AIEE. 1928; 47: 617-44 (Reprint as classic paper in: Proceedings of the IEEE. 2002 Feb; 90(2)).

[4] Shannon, C. Communication in the presence of noise. Proceedings in Institute of Radio Engineers. 1949; 37(1): 10-21 (Reprint as classic paper in: Proceedings of the IEEE. 1998 Feb; 86(2)).

[5] Jensenius, A. R.; Nymoen, K.; Skogstad, S. A.; Voldsund, A. A Study of the Noise-Level in Two Infrared Marker-Based Motion Capture Systems. In Proceedings of the 9th Sound and Music Computing Conference - "Illusions". Logos Verlag Berlin. ISBN 9783832531805. 2012, pp. 258-263.

[6] Pollefeys, M.; Koch, R.; Van Gool, L. Self-calibration and metric reconstruction inspite of varying and unknown intrinsic camera parameters. International Journal of Computer Vision, 32(1), 1999, pp. 7-25.

[7] Kinsler, L. E.; Frey, A. R.; Coppens, A. B.; Sanders, J. V. Fundamentals of Acoustics, 4th Edition. ISBN 0-471-84789-5. Wiley-VCH, 1999, pp. 406-411.