Natsuki Akaishi, Kohei Yatabe, Yasuhiro Oikawa
Abstract: Time stretching of music signals suffers from a crucial problem: the smearing of percussive sounds. Some time stretching algorithms address this problem by detecting percussive components and processing them differently from the other components. However, these conventional methods cause artifacts. In this paper, we propose a preprocessing step for time stretching that prevents percussion smearing. The proposed algorithm aims to preserve the time scale of percussive components while stretching the remaining components in the usual way. To do so, time-frequency bins dominated by percussive components are squeezed in the time direction so as to preserve the spectrogram shape of the percussive components. Our experiment showed that the proposed method can improve sound quality for large stretching factors.
Fig. Illustration of the proposed time-directional squeezing. In the bottom figures, the percussive component is shown in blue, the sinusoidal component in red, and the mixed component in purple.
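The notion of "time-frequency bins dominated by percussive components" can be illustrated with a small sketch. The code below is not the detection used in the paper; it swaps in median-filtering-based harmonic-percussive masking, a commonly used technique, and applies it to a toy magnitude spectrogram. The function names `median_filter_1d` and `percussive_mask`, the window size of 3, and the toy data are all assumptions made for illustration.

```python
from statistics import median


def median_filter_1d(values, size=3):
    """Median-filter a list, replicating the edge values as padding."""
    half = size // 2
    n = len(values)
    return [
        median(values[min(max(j, 0), n - 1)] for j in range(i - half, i + half + 1))
        for i in range(n)
    ]


def percussive_mask(spec, size=3):
    """Return mask[f][t] == True where a bin looks percussion-dominated.

    spec[f][t] is a magnitude spectrogram (rows: frequency, columns: time).
    Assumption: a bin counts as percussive when the median taken along
    frequency exceeds the median taken along time.
    """
    n_freq, n_time = len(spec), len(spec[0])
    # Harmonic estimate: sustained tones survive smoothing along time.
    harm = [median_filter_1d(row, size) for row in spec]
    # Percussive estimate: broadband clicks survive smoothing along frequency.
    cols = [
        median_filter_1d([spec[f][t] for f in range(n_freq)], size)
        for t in range(n_time)
    ]
    return [
        [cols[t][f] > harm[f][t] for t in range(n_time)]
        for f in range(n_freq)
    ]


# Toy spectrogram (hypothetical data): a sustained tone in frequency row 1
# and a broadband click in time frame 3.
N_FREQ, N_TIME = 5, 7
spec = [
    [1.0 if (f == 1 or t == 3) else 0.0 for t in range(N_TIME)]
    for f in range(N_FREQ)
]

mask = percussive_mask(spec)
for row in mask:
    print("".join("P" if is_perc else "." for is_perc in row))
```

In this toy case the click frame is marked as percussive everywhere except in the row where it overlaps the sustained tone, which corresponds to a mixed bin and is left unmarked by this simple tie-breaking rule.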
We applied eight algorithms to seven audio excerpts taken from [1]. Each excerpt was stretched by every algorithm with a moderate stretching factor α = 1.6 and a large stretching factor α = 3.2.
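The size of this comparison can be sketched in a few lines. The snippet below enumerates every (factor, excerpt, algorithm) combination using the labels from the tables on this page, and computes the nominal output duration under the assumption that a stretching factor α multiplies the signal duration by α. The helper `stretched_duration` and the 10-second input length are hypothetical and not taken from the paper.

```python
from itertools import product

# Labels as they appear in the audio tables on this page.
ALGORITHMS = ["PV", "PVL", "PVDR", "NW", "HP", "prop. PV", "prop. PVL", "prop. PVDR"]
EXCERPTS = ["Bongo", "CastanetsViolin", "DrumSolo", "Glockenspiel", "Jazz", "Pop", "Stepdad"]
FACTORS = [1.6, 3.2]


def stretched_duration(duration_s, alpha):
    """Nominal output duration, assuming alpha scales duration linearly."""
    return alpha * duration_s


# Every stretched signal in the comparison: one per factor, excerpt, and algorithm.
conditions = list(product(FACTORS, EXCERPTS, ALGORITHMS))
print(len(conditions))

# Hypothetical 10-second excerpt, used only to illustrate the arithmetic.
for alpha in FACTORS:
    print(alpha, stretched_duration(10.0, alpha))
```

The unstretched signal (the α=1.0 column) is not counted among the conditions, since it is the reference rather than an algorithm output.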
[Audio examples table. Columns: α=1.0, PV [2], PVL [3], PVDR [4], NW [5], HP [6], prop. PV, prop. PVL, prop. PVDR. Rows: Bongo, CastanetsViolin, DrumSolo, Glockenspiel, Jazz, Pop, Stepdad.]
[Audio examples table. Columns: α=1.0, PV [2], PVL [3], PVDR [4], NW [5], HP [6], prop. PV, prop. PVL, prop. PVDR. Rows: Bongo, CastanetsViolin, DrumSolo, Glockenspiel, Jazz, Pop, Stepdad.]
References:
[1] J. Driedger and M. Müller, “TSM Toolbox: MATLAB implementations of time-scale modification algorithms,” in Proc. Int. Conf. Digit. Audio Eff. (DAFx), 2014.
[2] M. Portnoff, “Time-scale modification of speech based on short-time Fourier analysis,” IEEE Trans. Acoust., Speech, Signal Process., 1981.
[3] J. Laroche and M. Dolson, “Improved phase vocoder timescale modification of audio,” IEEE Trans. Speech Audio Process., 1999.
[4] N. Holighaus and Z. Průša, “Phase vocoder done right,” in Proc. Eur. Signal Process. Conf. (EUSIPCO), 2017.
[5] F. Nagel and A. Walther, “A novel transient handling scheme for time stretching algorithms,” J. Audio Eng. Soc., 2009.
[6] J. Driedger, M. Müller, and S. Ewert, “Improving time-scale modification of music signals using harmonic-percussive separation,” IEEE Signal Process. Lett., 2014.