Tonic Identification System for Hindustani and Carnatic Music

Master Thesis Presentation 2012
Universitat Pompeu Fabra
Barcelona, Spain
Supervisor: Xavier Serra
Co-Supervisor: Justin Salamon

Outline
 Introduction
 Tonic in Indian art music
 Tonal structure of Tanpura
 Motivation and Goals
 Relevant Work
 Methodology
 System Overview
 Tonic Identification approach
 Evaluation
 Results
 Pending & Future Work

Indian art music
 Hindustani and Carnatic music (quite different from each
other!!!)
[Figure labels: Hindustani (North Indian) music; Carnatic music]

Introduction: Tonic in Indian art music
 The base pitch chosen by a vocal performer, which allows the
performer to explore the full pitch range comfortably [1]
 Anchored as ‘Sa’ swara in a performance
 All the other notes used in the raga exposition derive their
meaning in relation to this pitch value
 All other accompanying instruments are tuned using this
pitch as reference
[Figure: pitch vs. time plot with the tonic marked]

Role of Drone Instrument
 Performer needs to hear this pitch throughout the
concert
 Reinforces the tonic and establishes all harmonic and
melodic relationships
 Helps the performer to stay in tune
[Images: Tanpura, Sitar, Surpeti (Shrutibox), Electronic Tanpura]

Introduction: Tonal structure of Tanpura
 Four strings
 Tunings
 Sa-Sa’-Sa’-Pa
 Sa-Sa’-Sa’-Ma
 Sa-Sa’-Sa’-Ni
 Special bridge with thread
inserted (Jvari)
 Violates Helmholtz's law [2]
 Rich overtones [1]
[Image: tanpura bridge]

Introduction: Goals and Motivation
 Automatic labeling of the tonic in large databases of
Indian art music
 Devise a system for identification of
 Tonic pitch for vocal excerpts
 Tonic pitch class profile for instrumental excerpts
 Use all the available data (audio + metadata) to
achieve maximum accuracy
 Confidence measure for each output from the system

Introduction: Goals and Motivation
 Tonic identification: fundamental problem, a crucial
input for:
 Intonation analysis
 Raga recognition
 Melodic motivic analysis
 Develop culture specific computational models [3]
 Bias of MIR community towards western music
 Make the understanding of musical concepts more universal

Relevant work: Tonic Identification
 Very little work done in the past
 Based on melody [4, 5]
 Ranjani et al. take advantage of melodic characteristics
of Carnatic music [4]

Relevant work: Summary
 Utilized only the melodic aspects
 Used monophonic pitch trackers for heterophonic data
 Limited diversity in database
 Special raga categories, aalap sections, solo vocal
recordings
 Unexplored aspects:
 Utilizing background audio content comprising drone sound
 Taking advantage of different types of available data, like
audio and metadata

Methodology: System Overview
[Flowchart: Audio and Metadata inputs; Yes/No decision nodes; outputs: Tonic or Manual annotation]

Methodology: System Overview
 Culture specific characteristics for tonic identification
 Presence of drone
 Culture specific melodic characteristics*
 Raga knowledge*
 Use a variable amount of data, just enough to
identify the tonic with maximum confidence
 Audio data
 Metadata (Male/Female, Hindustani/Carnatic, Raga etc.)

Methodology: Tonic Identification
 Audio example:
 Utilizing drone sound
 Chroma or multi-pitch analysis
[Figure: multi-pitch analysis [7] of the audio example, with vocal and drone pitches labeled]

Methodology: Tonic Identification
Why Multi-pitch processing?
 No concept of chords or functional harmonies
 Less noisy as compared to Chroma features
 Avoid dominant contribution from melody
 Does not combine the information across different
octaves
 Tonic PCP is present in two consecutive octaves

Tonic Identification: Signal Processing
Audio -> Sinusoid extraction -> Sinusoids -> Salience function computation -> Time-frequency salience -> Tonic candidate generation -> Tonic candidates

 STFT
 Hop size: 11 ms
 Window length: 46 ms
 Window type: Hamming
 FFT size: 8192 points
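The STFT settings above can be sketched as follows. This is a minimal illustration, not the thesis implementation: the 44.1 kHz sample rate, the rounding of millisecond values to samples, and the function name `stft_magnitude` are assumptions that do not come from the slides.

```python
import numpy as np

def stft_magnitude(x, sr=44100, win_ms=46.0, hop_ms=11.0, n_fft=8192):
    """Return (magnitudes, bin_frequencies_hz) for a mono signal x.

    sr and the ms-to-sample rounding are assumptions; window length,
    hop size, window type and FFT size follow the slide.
    """
    win_len = int(round(sr * win_ms / 1000.0))  # 46 ms -> 2029 samples at 44.1 kHz
    hop = int(round(sr * hop_ms / 1000.0))      # 11 ms -> 485 samples at 44.1 kHz
    if len(x) < win_len:
        raise ValueError("signal is shorter than one analysis window")
    window = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * window
                       for i in range(n_frames)])
    # rfft zero-pads every windowed frame up to n_fft points
    mags = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    return mags, freqs
```

With these assumed values the zero-padded FFT gives a bin spacing of about 5.4 Hz, which is why the later frequency correction step matters at low pitches.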

 Spectral peak picking
 Absolute threshold: -60 dB
 Relative threshold: -40 dB

 Frequency/amplitude correction
 Parabolic interpolation
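Peak picking with the two thresholds, followed by parabolic interpolation, might look like the sketch below. The slides do not say what the absolute threshold is measured against, so the `ref` parameter (the magnitude treated as 0 dB) is an assumption, as is the function name `pick_peaks`.

```python
import numpy as np

def pick_peaks(mag, freqs, abs_db=-60.0, rel_db=-40.0, ref=1.0):
    """Return (peak_freqs_hz, peak_levels_db) for one magnitude spectrum.

    ref is an assumed 0 dB reference magnitude; the thresholds follow the slide.
    """
    db = 20.0 * np.log10(np.maximum(mag, 1e-12) / ref)
    k = np.arange(1, len(db) - 1)
    is_max = (db[k] > db[k - 1]) & (db[k] >= db[k + 1])
    # keep peaks above the absolute level and within 40 dB of the frame maximum
    keep = is_max & (db[k] > abs_db) & (db[k] > db.max() + rel_db)
    k = k[keep]
    a, b, c = db[k - 1], db[k], db[k + 1]
    # vertex of the parabola through the three dB values around each peak
    offset = 0.5 * (a - c) / (a - 2.0 * b + c)
    bin_hz = freqs[1] - freqs[0]
    return freqs[k] + offset * bin_hz, b - 0.25 * (a - c) * offset
```

The interpolation refines both the frequency and the level of each peak, which corresponds to the frequency/amplitude correction named on the slide.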

 Harmonic summation [7]
 Spectrum considered: 55-7200 Hz
 Frequency range: 55-1760 Hz
 Base frequency: 55 Hz
 Bin resolution: 10 cents per bin (120 per
octave)
 N octaves: 5
 Maximum harmonics: 20
 Alpha: 1
 Beta: 0.8
 Square cosine window across 50 cents
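A rough sketch of harmonic summation with these settings follows. Several points are assumptions rather than facts from the slides: `alpha` is treated as the per-harmonic weight and `beta` as the magnitude compression exponent (the roles used in [7]), the 50 cents is read as the half-width of the squared-cosine window, peak magnitudes are taken as linear values, and the names `hz_to_bin` and `salience` are hypothetical.

```python
import math
import numpy as np

REF_HZ = 55.0                  # base frequency
BINS_PER_OCTAVE = 120          # 10 cents per bin
N_BINS = 5 * BINS_PER_OCTAVE   # 5 octaves: 55-1760 Hz

def hz_to_bin(f_hz):
    """Continuous bin index of a frequency relative to 55 Hz."""
    return BINS_PER_OCTAVE * math.log2(f_hz / REF_HZ)

def salience(peak_freqs, peak_mags, n_harmonics=20, alpha=1.0, beta=0.8,
             half_width_cents=50.0):
    """Harmonic summation over spectral peaks (assumed parameter roles)."""
    s = np.zeros(N_BINS)
    half_width_bins = half_width_cents / 10.0
    for f, a in zip(peak_freqs, peak_mags):
        for h in range(1, n_harmonics + 1):
            centre = hz_to_bin(f / h)   # f0 candidate if f is its h-th harmonic
            if centre < 0:
                break                   # higher h only moves further below 55 Hz
            if centre >= N_BINS:
                continue
            lo = max(0, math.ceil(centre - half_width_bins))
            hi = min(N_BINS - 1, math.floor(centre + half_width_bins))
            for b in range(lo, hi + 1):
                d = abs(b - centre) / half_width_bins
                w = math.cos(0.5 * math.pi * d) ** 2
                s[b] += w * alpha ** (h - 1) * a ** beta
    return s
```

Under this reading, `alpha = 1` means sub-harmonic candidates receive the same weight as the peak frequency itself, so a single peak at 220 Hz contributes equally to the bins for 220 Hz and 110 Hz.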

 Tonic candidate generation
 Number of salience peaks
per frame: 5
 Frequency range: 110-550
Hz
 After candidate selection
salience is no longer
considered!!!!
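Candidate generation and the resulting multi-pitch histogram can be sketched as below. The sketch assumes a salience matrix with one row per frame and 10-cent bins referenced to 55 Hz; the function names and the local-maximum test are illustrative choices, not taken from the slides.

```python
import numpy as np

def frame_candidates(sal, n_peaks=5, fmin=110.0, fmax=550.0,
                     ref_hz=55.0, bins_per_octave=120):
    """Bin indices of the n_peaks most salient peaks of one frame in [fmin, fmax]."""
    lo = int(round(bins_per_octave * np.log2(fmin / ref_hz)))
    hi = int(round(bins_per_octave * np.log2(fmax / ref_hz)))
    b = np.arange(max(lo, 1), min(hi, len(sal) - 2) + 1)
    peaks = b[(sal[b] > sal[b - 1]) & (sal[b] >= sal[b + 1])]
    order = np.argsort(sal[peaks])[::-1][:n_peaks]
    return peaks[order]

def multipitch_histogram(salience_frames, **kwargs):
    """Count how often each bin is selected as a candidate across frames."""
    hist = np.zeros(salience_frames.shape[1])
    for sal in salience_frames:
        # each selected bin adds 1; the salience value itself is discarded
        hist[frame_candidates(sal, **kwargs)] += 1.0
    return hist
```

Because every selected candidate counts equally, a steady drone pitch that is present in most frames accumulates a tall histogram peak even when the melody is louder.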

Tonic Identification : Approach1
 Identifying tonic in correct octave using multi-pitch
histogram (only for vocal excerpts)
 Classification based template learning
 Class of an instance is the rank of the tonic
 20 features: f1-f10, a1-a10
[Figure: multi-pitch histogram with peaks labeled f2-f5]
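The slides name the 20 features (f1-f10, a1-a10) but do not define them. One plausible encoding, shown here purely as an assumption, takes the ten highest histogram peaks and expresses each as a pitch distance in semitones from the highest peak (f) and an amplitude ratio to the highest peak (a). The function name and the zero padding for missing peaks are hypothetical.

```python
import numpy as np

def histogram_features(hist, n_peaks=10, bins_per_semitone=10):
    """Return (f, a): assumed features of the n_peaks highest histogram peaks.

    f[i]: distance in semitones from the highest peak (assumption).
    a[i]: amplitude ratio to the highest peak (assumption).
    Unused slots stay at zero when fewer than n_peaks peaks exist.
    """
    b = np.arange(1, len(hist) - 1)
    peaks = b[(hist[b] > hist[b - 1]) & (hist[b] >= hist[b + 1])]
    peaks = peaks[np.argsort(hist[peaks], kind="stable")[::-1][:n_peaks]]
    f = np.zeros(n_peaks)
    a = np.zeros(n_peaks)
    if len(peaks):
        f[:len(peaks)] = (peaks - peaks[0]) / bins_per_semitone
        a[:len(peaks)] = hist[peaks] / hist[peaks[0]]
    return f, a
```

Under this encoding f1 is always 0 and a1 is always 1, and the class label for training would be the rank of the peak that corresponds to the annotated tonic.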

Tonic Identification : Approach1
 Decision Tree:
[Figure: decision tree with split thresholds (<=5 / >5, <=-7 / >-7, <=-11 / >-11, <=5 / >5, <=-6 / >-6), shown with two salience vs. frequency plots whose peaks are labeled Sa, Sa, Pa]

Tonic Identification: Insights

Tonic Identification : Approach 2
 Caters to both vocal and instrumental excerpts
 Identify tonic PCP using multi-pitch histogram
 Estimate the correct octave using predominant melody
 Use predominant melody extraction approach proposed
by Justin Salamon et al. [6]
 Tonic PCP
 Peak Picking + Machine learning (Similar to approach 1)
 Tonic octave estimation
 Rule based method + Classification based approach

 Tonic PCP identification: Classification based template
learning
 Two kinds of class mappings
 Rank of the highest tonic PCP
 Highest peak as tonic or non-tonic
 Features extracted from histogram (same as approach 1)
[Figure: multi-pitch histogram with peaks labeled f2-f5]

 Tonic octave
 Rule based method
 Classification based approach
 25 Features: a1-a25
[Figure: Predominant Melody Histogram; x-axis: frequency bins (1 bin = 10 cents, ref: 55 Hz); y-axis: normalized salience]


Evaluation: Database
 Subset of CompMusic database (>300 CDs) [3]
Approach 1: 364 excerpts, 3 min each
Approach 2: 540 excerpts, 3 min each (PCP) + 238 full recordings (octave)

Evaluation: Database
 Tonic distribution
 Statistics (for 364 vocal excerpts)
 Male (80%), Female (20%), Hindustani (38%), Carnatic
(62%), 36 unique artists
 Statistics (for 540 vocal and instrumental excerpts)
 Hindustani (36%), Carnatic (64%), 55 unique artists

Evaluation: Annotations
 Annotations done by me [Verified by a musician]
 Extracted 5 tonic candidates from multi-pitch histograms
between 110-370 Hz
 Matlab GUI to speed up the annotation procedure

Evaluation: Accuracy measures
 Output counted as correct if within 20 cents of the ground truth
 10-fold cross-validation + rule-based classification (1 fold)
 Weka: data mining tool
 Feature selection: CfsSubsetEval (features selected in > 80% of folds)
 Classifier: J48 decision tree
 Performs better than
 SVM-polynomial kernel (6% difference in accuracy)
 K* classifier (5% difference in accuracy)
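The 20-cent criterion can be made concrete with the sketch below, which also sorts wrong outputs into the categories used in the results table. Reading the "5th" and "4th" columns as estimates that land on Pa (about 700 cents) or Ma (about 500 cents) in any octave is an assumption, and the function names are hypothetical.

```python
import math

def cents(f_hz, ref_hz):
    """Signed distance from ref_hz to f_hz in cents."""
    return 1200.0 * math.log2(f_hz / ref_hz)

def classify_estimate(est_hz, true_hz, tol=20.0):
    """Return 'tonic pitch', 'tonic PCP', '5th', '4th' or 'other'."""
    d = cents(est_hz, true_hz)
    if abs(d) <= tol:
        return "tonic pitch"            # right pitch class and right octave
    folded = d % 1200.0                 # pitch-class distance in [0, 1200)
    for target, label in ((0.0, "tonic PCP"), (700.0, "5th"), (500.0, "4th")):
        diff = abs(folded - target)
        if min(diff, 1200.0 - diff) <= tol:
            return label
    return "other"
```

For the pitch-class experiments, both "tonic pitch" and "tonic PCP" outcomes would count as correct, since octave errors are ignored there.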

Results
(Tonic pitch, Tonic PCP, 5th, 4th and Other columns are in %)
Approach   Map  #folds  Class EQ  #Features  Tonic pitch  Tonic PCP  5th    4th    Other
AP1        M1   10      no        3, S1      93           -          2.5    2.8    1.7
AP2_EXP1   -    -       -         -          -            85         10.7   0.93   3.3
AP2_EXP2   M2   1       no        1, S2      -            93.7       1.48   8.9    0.9
AP2_EXP3   M2   10      no        4, S3      -            92.9       1.9    3.5    1.7
AP2_EXP4   M2   10      yes       4, S4      -            74.2       11     7.6    6.7
AP2_EXP5   M3   1       no        1, S2      -            91         3.3    3      2.7
AP2_EXP6   M3   10      no        2, S5      -            91.8       2.2    3      3
AP2_EXP7   M3   10      yes       2, S5      -            87.8       4.2    4      3.9
 M1: tonic rank, M2 : tonic PCP rank, M3 : highest peak tonic or non-tonic
 S1: [f2, f3, f5], S2: [f2], S3: [f2, f4, f6, a5], S4: [f2, f3, a3, a5], S5: [f2, f3]

Results
 Approach 2, Octave identification
 Rule based approach – 99 %
 Classification based approach – 100%

Discussion: Approach 1 & 2
 AP-1: Performance for male singers (95%), female singers
(88%)
 Error cases
 Mostly Ma tuning songs
 More female singers
 Sensitive to selected frequency range for tonic
candidates; a range of 110-370 Hz works best
[Figure: three salience vs. frequency plots; two with peaks labeled Sa, Sa, Pa and one with peaks labeled Sa, Sa, Ma]

Discussion : Approach 2, Octave Id.
 Challenges faced by rule based approach
 Hindustani musicians can go up to 500 cents below the tonic
 Carnatic musicians generally don’t go below tonic
 Melody estimation errors at low frequency
 Concept of Madhyam shruti

Conclusions and Future Work
 Drone sound in the background provides an important
cue for the identification of tonic and can be utilized to
automatically perform this task
 System should be fed with more information to
differentiate between ‘Pa’ and ‘Ma’ tuning
 Future Work:
 Exploring melodic characteristics for tonic identification
 Deeper analysis of confidence measure concept
 Study influence of cultural background on human
performance for this task

Pending Work
 July:
 Try a couple more approaches for tonic identification (using
concepts similar to source separation to incorporate pitch
salience into the system)
 Devise a confidence measure for best performing approach
 Define the order of input data to the system
 August:
 Increase the database used for evaluation: more
annotations, plan to reach 1000 excerpts!!!
 Write Thesis

Contributions
 Publications:
 J. Salamon, S. Gulati and X. Serra, A Multipitch Approach to Tonic
Identification in Indian Classical Music. Proc. of ISMIR 2012
 S. Gulati, J. Salamon and X. Serra, Tonic Identification System for Tonic
Identification for Hindustani and Carnatic music. CompMusic
Workshop, Istanbul 2012 (To be submitted today)
 Database
 Uploaded >125 CDs to MusicBrainz as a part of CompMusic project
(taking assistance of team at IIT Bombay)
 Annotated ~350 audio excerpts (aim is to reach 1000)
 Code:
 C++ code for multi-pitch representation made public (GPL)
 Other:
 Created two packages in Freesound with >30 Tanpura recordings with
different tuning combinations, instrument settings etc.

REFERENCES
1. B. C. Deva. The Music of India: A Scientific Study. Munshiram Manoharlal
Publishers, Delhi, 1980.
2. C. V. Raman. On some Indian stringed instruments. In Indian Association for the
Cultivation of Science, volume 33, pages 29-33, 1921.
3. X. Serra. A multicultural approach in music information research. In 12th Int. Soc. for
Music Info. Retrieval Conf., Miami, USA, Oct. 2011.
4. R. Sengupta, N. Dey, D. Nag, A. K. Datta, and A. Mukerjee. Automatic tonic (SA)
detection algorithm in Indian classical vocal music. In National Symposium on
Acoustics, pages 1-5, 2005.
5. H. G. Ranjani, S. Arthi, and T. V. Sreenivas. Carnatic music analysis: Shadja, swara
identification and rAga verification in AlApana using stochastic models. In IEEE
Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA),
pages 29-32, 2011.
6. J. Salamon and E. Gómez. Melody extraction from polyphonic music signals using
pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language
Processing, 20(6):1759-1770, Aug. 2012.
7. J. Salamon, E. Gómez, and J. Bonada. Sinusoid extraction and salience function
design for predominant melody estimation. In Proc. 14th Int. Conf. on Digital Audio
Effects (DAFX-11), pages 73-80, Paris, France, Sep. 2011.
