Audio Morphing for Percussive Sound Generation

Audio Morphing
Proposed Algorithm
Results
Conclusion
Audio Morphing for Percussive Hybrid Sound
Generation
Andrea Primavera1
, Francesco Piazza1
and Joshua D. Reiss2
1
A3lab - DIBET - Universita’ Politecnica delle
Marche - Ancona - ITALY
2
C4DM - Queen Mary University of London -
London - UK
Andrea Primavera Audio Morphing for Percussive Hybrid Sound Generation 1/21

Audio Morphing
Proposed Algorithm
Results
Conclusion
Introduction
State of the Art
Audio Morphing is typically employed to accomplish two diﬀerent
tasks:
• Obtaining a smooth transition between two sounds;
• Generating a new intermediate sound which has the
characteristics of the two given samples (e.g., timbre and
duration).
Topic
An automatic audio morphing technique applied to percussive musical
instruments has been developed in this work.
Proposed Innovation
Creating new hybrid percussive sounds perceptually interesting (e.g.,
morphing among diﬀerent brands of the same instrument).
AIM

Audio Morphing
Proposed Algorithm
Results
Conclusion
Introduction
State of the Art
During the last decades several algorithms have been presented to morph
audio signals: linear interpolation, sinusoidal plus residual [1] [2] and
spectral envelope using various features [3].
One of the first approaches used to
implement audio morphing.
y(t) = αs1(t) + (1 − α)s2(t)
PRO: Low computational cost;
CONS: When the timbre of the in-
put signals is slightly different, both
the references could be individually
perceivable.
Linear Interpolation (LI) Based on harmonic modeling [4] [5]:
s(t) =
N
n=1
An(t)cos[θn(t)] + e(t)
The morphed sound is obtained in-
terpolating the fundamental frequen-
cy, the harmonics timbre, and the re-
sidual envelope.
PRO: Capability of morphing sounds
with different timbre;
CONS: Higher computational co-
st than LI, some problems with
percussive sounds.
Sinusoidal Plus Noise (SMS)

Audio Morphing
Proposed Algorithm
Results
Conclusion
Proposed Algorithm
PreProcessing
Processing
An automatic audio morphing algorithm employed to generate new
hybrid percussive sounds is presented here. It is based on:
• PreProcessing (frequency domain);
• Processing exploiting linear interpolation (time domain).
Proposed Algorithm
Fig.1 Block diagram of the proposed morphing algorithm: yellow indicates the
frequency domain while blue represents the time domain.

Audio Morphing
Proposed Algorithm
Results
Conclusion
Proposed Algorithm
PreProcessing
Processing
The PreProcessing operation is performed on both the signal references
in order to obtain hybrid sounds that change in a linear fashion when the
interpolation factor varies linearly.
It allows to align the samples before executing the morphing exploiting
cross-correlation.
1. Time Align

Audio Morphing
Proposed Algorithm
Results
Conclusion
Proposed Algorithm
PreProcessing
Processing
• The AR model is employed instead of the ADSR;
• Attack and release are determined using the Amplitude Centroid
Trajectory [6] [7] approach and evaluating the T60 [8] for each
audio sample.
Fig. 2: Time evolution of the spectral
centroid for a percussive sample.
SpectralCentroid[t] =
M
b=1 fb[t]ab[t]
M
b=1 ab[t]
where:
f is the frequency and a is the amplitude of
the b up to M bands, evaluated using FFT.
2. AR Model Analysis

Audio Morphing
Proposed Algorithm
Results
Conclusion
Proposed Algorithm
PreProcessing
Processing
The release portions are time stretched or compressed as a function
of the interpolation factor value:
tr = αtr1 + (1 − α)tr2
3. Time Stretching

Audio Morphing
Proposed Algorithm
Results
Conclusion
Proposed Algorithm
PreProcessing
Processing
The Processing operation is executed on the preprocessed signal
references using the linear interpolation approach.
Since percussive sounds are typically characterized by a noisy spectrum
and “similar” timbre this approach is preferable to additive synthesis.
Linear Interpolation

Audio Morphing
Proposed Algorithm
Results
Conclusion
Results
Analysis of Time and Spectral Features
1 Listening Test (MOS)
2 Listening Test (MDS)
Several tests have been carried out to evaluate the eﬀectiveness of the
proposed approach through objective and subjective comparisons.
Real recordings of drum kits
have been employed using the
BFD2 sound library [9].

Audio Morphing
Proposed Algorithm
Results
Conclusion
Results
Spectral centroid evolution and release time have been evaluated as a
function of the interpolation factor.
Analyzed algorithms:
• Linear Interpolation (LI);
• Linear Interpolation with Preprocessing (LIP);
• Sinusoidal plus Residual (SMS).
Morphing examples:
• Different kind of snare drums (i.e, electronic and classic snare);
• Toms of different size and brand;
• Hi-hats of different brand.
Analysis of Time and Spectral Features (Objective Measure)

Audio Morphing
Proposed Algorithm
Results
Conclusion
Results
In order to evaluate the audio perception of the produced hybrid sounds
two diﬀerent listening tests [10] have been performed:
• Mean Opinion Score (MOS) −→ Audio quality, estimated
interpolation factor;
• MultiDimensional Scaling (MDS) −→ Perceptual space.
Analyzed algorithms:
• Linear Interpolation (LI);
• Linear Interpolation with Preprocessing (LIP);
• Sinusoidal plus Residual (SMS).
Morphing examples:
• Diﬀerent kind of snare drums (i.e, electronic and classic snare);
Listening test (Subjective Measure)

Audio Morphing
Proposed Algorithm
Results
Conclusion
Results
SNARE TOM HI-HAT
0 0.2 0.4 0.6 0.8 1
0.8
1
1.2
1.4
1.6
1.8
Interpolation Factor
ReleaseTime(sec)
LI
LIP
SMS
0 0.2 0.4 0.6 0.8 1
2.8
3
3.2
3.4
3.6
3.8
4
ReleaseTime(sec)
LI
LIP
SMS
0 0.2 0.4 0.6 0.8 1
1.4
1.6
1.8
2
2.2
2.4
2.6
2.8
ReleaseTime(sec)
LI
LIP
SMS
Fig. 3 Release time as a function of different interpolation factor.
0 0.2 0.4 0.6 0.8 1
2000
2500
3000
3500
Spectralcentroid(Hz)
LI
LIP
SMS
0 0.2 0.4 0.6 0.8 1
820
840
860
880
900
920
940
960
980
LI
LIP
SMS
0 0.2 0.4 0.6 0.8 1
8000
8500
9000
9500
LI
LIP
SMS
Fig.4 Spectral centroid evolution as a function of different interpolation factor.
• Release time changes linearly with the change of the interp. factor;
• The spectral centroid does not seem affected by the preprocessing.
Considerations

Audio Morphing
Proposed Algorithm
Results
Conclusion
Results
Based on MOS, the estimated interpolation factor and the perceived
audio quality for LI, LIP and SMS are evaluated in the morphing between
a classic and electronic snare.
0 0.2 0.4 0.6 0.8 1
0
0.2
0.4
0.6
0.8
1
Est.Interp.Factor
LI
LIP
SMS
Fig. 5 Estimated interpolation factor.
LI LIP SMS
Quality 0.90 0.88 0.10
STD 0.06 0.07 0.08
Tab. 1 Percived audio quality expressed in
a scale from 0 to 1 (0 bad - 1 excellent).
• For the proposed approach (LIP) the estimated interpolation factor
changes roughly linearly with the change of the interpolation factor.
• For linear interpolation (LI) the factor 0.5 was perceived as near 0.8.
• Samples generated using SMS sound unnatural, this conﬁrm the
harmonic models limitation in the reproduction of percussive sound.
Considerations

Audio Morphing
Proposed Algorithm
Results
Conclusion
Results
Based on MDS [11] [12], the test aims to recreate the corresponding
perceptual spaces of the hybrid sound generated in morphing between a
classic and electronic snare with LI and LIP.
−0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
1
2
3
4
5
6
7
8
1 Dimension
2Dimension
Fig.6 MDS analysis obtained with LI.
−0.8 −0.6 −0.4 −0.2 0 0.2 0.4 0.6
−0.8
−0.6
−0.4
−0.2
0
0.2
0.4
0.6
1
2 3 4
5
6
7
8
1 Dimension
2Dimension
Fig.7 MDS analysis obtained with LIP.
LI LIP
CORR 0.986 0.980
P-VAL 7.2*10−6 1.9*10−5
Tab. 2 Correlation between ﬁrst
dimension and release time.
LI LIP
CORR 0.66 0.77
P-VAL 0.07 0.02
Tab. 3 Correlation between second
dimension and audio quality.

Audio Morphing
Proposed Algorithm
Results
Conclusion
Results
Fig.8 Correlation between first
dimension and release time for LI and
LIP.
The curves have been shifted on
the y-axis in order to enhance the
readability.
• High correlation between the
release time and the
movement along the first
dimension.
• Release time is an important
factor in the evaluation of
percussive audio morphing
algorithms.
• Although the correlation
between the second
dimension and the audio
quality is not zero, the values
obtained are not sufficient to
establish a strong relationship
between them.
Considerations

Audio Morphing
Proposed Algorithm
Results
Conclusion
Conclusion
Future Works
Questions
Bibliography
In conclusion:
• An automatic audio morphing technique applied to percussive
musical instruments has been presented in this work.
• The presented technique is based on preprocessing of the sound and
linear interpolation time domain.
• Different tests have been carried out in order to evaluate the
proposed approach, in terms of subjective evaluation and objective
measures comparing it with the previous state of the art.
• Objective analysis confirms that release time changes linearly with
the change of the interpolation factor while spectral centroid does
not seem affected by the preprocessing.
• Listening tests confirm a linear evolution in the perception of the
hybrid sounds using the presented approach.
• MDS multidimensional scaling suggests that release time is the most
important feature in the perception of hybrid percussive sounds.

Audio Morphing
Proposed Algorithm
Results
Conclusion
Conclusion
Future Works
Questions
Bibliography
Future work will be oriented toward:
• Further investigation of the proposed approach analyzing the
obtained performance for more dissimilar sounds.
• Real time implementation of the approach, considering an embedded
platform.
• Reﬁnement of the preprocessing phase, extending it with the ADSR
model.

Audio Morphing
Proposed Algorithm
Results
Conclusion
Conclusion
Future Works
Questions
Bibliography
QUESTIONS?

Audio Morphing
Proposed Algorithm
Results
Conclusion
Conclusion
Future Works
Questions
Bibliography
M. H. Serra, D Rubine, and R. Dannenberg,
“Analysis and synthesis of tones by spectral interpolation,”
J. Audio Eng. Soc., vol. 38, no. 3, pp. 111–128, March. 1990.
N Osaka,
“Timbre interpolation of sounds using a sinusoidal model,”
in Proc. Int. Computer Music Conference, Banﬀ, Canada, Sep 1995,
vol. 34.
M. Slaney, M. Covell, and B. Lassiter,
“Automatic Audio Morphing,”
in Proc. IEEE International Conference on Acoustics, Speech and
Signal Processing, Atlanta, May. 1996.
X. Serra and J. Smith,
“Spectral Modeling Synthesis: A Sound Analysis/Synthesis System
Based on a Deterministic Plus Stochastic Decomposition,”
J. Computer Music, vol. 14, no. 4, pp. 12–24, 1990.

Audio Morphing
Proposed Algorithm
Results
Conclusion
Conclusion
Future Works
Questions
Bibliography
F. Boccardi and C. Drioli,
“Sound morphing with Gaussian mixture models,”
in Proc. International Conference on Digital Audio Eﬀects),
Limerick, Ireland, December 2001.
John Hajda,
“A New Model for Segmenting the Envelope of Musical Signals: The
Relative Salience of Steady State versus Attack, Revisited,”
in Proc. 101th Audio Engineering Society Convention (AES’96), Los
Angeles, USA, Nov. 1996.
J.J. Burred, X. Rodet, and M. Caetano,
“Automatic Segmentation of the Temporal Evolution of Isolated
Acoustic Musical Instrument Sounds Using Spectro-Temporal Cues,”
in Proc. International Conference on Digital Audio Eﬀects
(DAFX’10), Graz, AU, Sep. 2010.
Sabine, W.C.,
“Collected Papers on Acoustics,” 1964, reprinted by Dover, New
York.

Audio Morphing
Proposed Algorithm
Results
Conclusion
Conclusion
Future Works
Questions
Bibliography
“BFD2 version 2.0.1 Acoustic Drum Production Environment,
fxpansion,” .
ITU-R BS. 1534,
“Method for subjective listening tests of intermediate audio quality,”
2001.
D. Williams and T. Brookes,
“Perceptually motivated audio morphing: warmth,”
in Proc. of 128th Audio Engineering Society Convention (AES’10),
London, UK, May. 2010.
A. Zacharakis and J. Reiss,
“An additive synthesis technique for independent modiﬁcation of the
auditory perceptions of brightness and warmth,”
in Proc. of 130th Audio Engineering Society Convention (AES’11),
London, UK, May. 2011.

Audio Morphing for Percussive Sound Generation

Recomendados

Recomendados

Mais conteúdo relacionado

Mais procurados

Mais procurados (19)

Semelhante a Audio Morphing for Percussive Sound Generation

Semelhante a Audio Morphing for Percussive Sound Generation (20)

Mais de a3labdsp

Mais de a3labdsp (13)

Último

Último (20)

Audio Morphing for Percussive Sound Generation