A vast amount of musical knowledge has been gathered for centuries by musicologists and music enthusiasts. Most of this knowledge is implicitly expressed in artist biographies, reviews, facsimile editions, etc. Music Digital Libraries make this information available and searchable. Documents are indexed and keyword-based search is generally provided. However, implicit knowledge present in text is not understood by machines, so complex queries cannot be answered.
As a first step towards bringing machine understanding closer to the accumulated musicological knowledge, documents stored in Digital Libraries must be semantically annotated. Current descriptive metadata and markup annotations provide some structured information, but it is insignificant compared with the epistemic potential of the source text. Once documents are properly annotated, complex structures and meaningful relations between pieces of information may emerge. This supports a paradigm shift from keyword-based systems to knowledge-based systems, enabling musicologists to formulate more complex queries. As a consequence, Digital Libraries will turn into real knowledge environments instead of mere searchable repositories.
Manual annotation of documents is very expensive and sometimes unwieldy, so reliable automatic processes are crucial for building knowledge environments. In the last few years, several studies and approaches on the use of Semantic Web technologies in Digital Libraries have been proposed. The result of this intersection has been termed Semantic Digital Libraries. Most of the related work focuses on the adoption of Semantic Web methodologies for knowledge representation, which, among other advantages, facilitates information exchange between multiple knowledge bases. In the case of Music Digital Libraries, though, tools and methodologies developed around the Semantic Web for automatic knowledge acquisition have received less attention.
In our work, we provide an extensive survey of the applicability and performance of state-of-the-art semantic technologies for knowledge acquisition from Music Digital Libraries. These technologies are adapted to the requirements and specificity of the music domain. The analyzed tools are evaluated on artist biographies and Flamenco music articles gathered from different sources (e.g. The Grove Dictionary, Wikipedia). An exhaustive overview of the possibilities that a knowledge layer may offer to Music Digital Libraries is presented. Finally, some guidelines for future work in this research direction are provided.
8. Musical Libraries (diagram): Current DL; digital recording, scan; OCR, manual transcription; Information Extraction, semantic annotation
9. Musical Libraries (diagram): Current DL; Web search; digital recording, scan; OCR, manual transcription; Information Extraction, semantic annotation
11. Semantic Web
• The Semantic Web aims at converting the current web, dominated by unstructured and semi-structured documents, into a web of linked data.
• Achievements useful for Digital Libraries
– Common framework for data representation and interconnection (RDF, ontologies)
– Semantic technologies to annotate texts (Entity Linking)
– Language for complex queries (SPARQL)
12. Wikipedia and DBpedia
• Wikipedia
– Digital Encyclopedia
– Unstructured
– Keyword search
• DBpedia
– Knowledge Base
– Structured
– Query search
15. DBpedia
• DBpedia example queries
– Composers born in Vienna in the 18th century
– American jazz musicians that have written songs recorded by RCA Records
• DBpedia graph applications
– Entity Relevance
– Entity Similarity
– Entity Recommendation
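To make the contrast with keyword search concrete, the first example query on the DBpedia slide can be sketched over a tiny in-memory set of triples. This is only an illustration in plain Python, not the DBpedia API or SPARQL: the predicate names (`type`, `birthPlace`, `birthYear`) and the helper `composers_born_in` are hypothetical, and the three hand-entered composers stand in for a real knowledge base.

```python
# Toy knowledge base as (subject, predicate, object) triples.
# Predicate names are illustrative, not actual DBpedia properties.
TRIPLES = [
    ("Franz Schubert", "type", "Composer"),
    ("Franz Schubert", "birthPlace", "Vienna"),
    ("Franz Schubert", "birthYear", 1797),
    ("Anton Webern", "type", "Composer"),
    ("Anton Webern", "birthPlace", "Vienna"),
    ("Anton Webern", "birthYear", 1883),
    ("Wolfgang Amadeus Mozart", "type", "Composer"),
    ("Wolfgang Amadeus Mozart", "birthPlace", "Salzburg"),
    ("Wolfgang Amadeus Mozart", "birthYear", 1756),
]


def objects(subject, predicate):
    """Return all objects stored for a given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def composers_born_in(city, first_year, last_year):
    """Composers born in `city` with a birth year in [first_year, last_year]."""
    subjects = sorted({s for s, p, o in TRIPLES if p == "type" and o == "Composer"})
    return [
        s for s in subjects
        if city in objects(s, "birthPlace")
        and any(first_year <= y <= last_year for y in objects(s, "birthYear"))
    ]


# "Composers born in Vienna in the 18th century"
print(composers_born_in("Vienna", 1701, 1800))
```

The point of the sketch is that the question is answered by combining several structured facts per entity, which a keyword index over biography text cannot do directly.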
20. Dataset: The New Grove
• Encyclopedic dictionary
• One of the largest reference works in Western music
• Artist biographies crawled from the Grove Music Online
– 16,707 biographies (1st paragraph)
– From pre-medieval to contemporary
21. Dataset: The New Grove
• What are the most relevant music schools?
• What are the artists most similar to Schoenberg?
• Which are the most represented roles in the Grove?
• Is there a migration tendency in artists? To which cities?
• What is the best city to die for a musician?
23. Information Extraction
Anton Webern (b Vienna, 3 Dec 1883; d Mittersill, 15 Sept 1945). Austrian composer and conductor.
Webern, who was probably Schoenberg's first private pupil, and Alban Berg, who came to him a few weeks later […]
Boulez and Stockhausen and other integral serialists of the Darmstadt School […]
27. Information Extraction: Entity Linking
Anton Webern (b Vienna, 3 Dec 1883; d Mittersill, 15 Sept 1945). Austrian composer and conductor.
Webern, who was probably Schoenberg's first private pupil, and Alban Berg, who came to him a few weeks later […]
Boulez and Stockhausen and other integral serialists of the Darmstadt School […]
[diagram: entity mentions linked to a Domain Knowledge Base]
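The entity linking step on this slide can be approximated, in its simplest form, by looking up surface forms in a dictionary built from the domain knowledge base. The sketch below assumes exactly that: the `KNOWLEDGE_BASE` mapping, the `kb:` identifiers and the function `link_entities` are hypothetical, and the slides do not say which linking tool was actually used. Real entity linking also has to disambiguate mentions that match several entities, which this sketch does not attempt.

```python
import re

# Hypothetical domain knowledge base: surface form -> entity identifier.
KNOWLEDGE_BASE = {
    "Anton Webern": "kb:Anton_Webern",
    "Webern": "kb:Anton_Webern",
    "Schoenberg": "kb:Arnold_Schoenberg",
    "Alban Berg": "kb:Alban_Berg",
    "Boulez": "kb:Pierre_Boulez",
    "Stockhausen": "kb:Karlheinz_Stockhausen",
    "Darmstadt School": "kb:Darmstadt_School",
    "Vienna": "kb:Vienna",
    "Mittersill": "kb:Mittersill",
}


def link_entities(text):
    """Return (start, surface form, identifier) for every mention found in text."""
    # Try longer surface forms first so "Anton Webern" wins over "Webern".
    forms = sorted(KNOWLEDGE_BASE, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(f) for f in forms) + r")\b")
    return [(m.start(), m.group(1), KNOWLEDGE_BASE[m.group(1)])
            for m in pattern.finditer(text)]


text = ("Anton Webern (b Vienna, 3 Dec 1883; d Mittersill, 15 Sept 1945). "
        "Webern, who was probably Schoenberg's first private pupil, and Alban Berg")
for start, surface, entity in link_entities(text):
    print(start, surface, entity)
```

Both "Anton Webern" and the later bare "Webern" resolve to the same identifier, which is what lets facts from different sentences attach to one node in the knowledge graph.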
28. Information Extraction: Relation Extraction
• Webern, who was probably Schoenberg's first private pupil
29. Information Extraction: Relation Extraction
• Webern, who was probably Schoenberg's first private pupil
(Webern, pupil_of, Schoenberg)
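One simple way to obtain a triple such as Webern, pupil_of, Schoenberg from this sentence is a hand-written pattern over the text. The sketch below uses a single regular expression as an assumed, illustrative rule; the name `PUPIL_RULE` and the function `extract_pupil_relations` are hypothetical, and the slides do not specify the extraction method at this level of detail.

```python
import re

# One hand-written rule: "<X>, who was [probably] <Y>'s [...] pupil"
# yields the triple (X, pupil_of, Y). The pattern is illustrative only.
PUPIL_RULE = re.compile(
    r"(?P<pupil>[A-Z]\w+), who was (?:probably )?"
    r"(?P<teacher>[A-Z]\w+)'s (?:\w+ )*pupil"
)


def extract_pupil_relations(sentence):
    """Return (subject, relation, object) triples matched by the rule."""
    return [(m.group("pupil"), "pupil_of", m.group("teacher"))
            for m in PUPIL_RULE.finditer(sentence)]


sentence = "Webern, who was probably Schoenberg's first private pupil"
print(extract_pupil_relations(sentence))
```

A rule like this is precise but brittle: it misses paraphrases such as "studied with", which is why rule sets are usually combined with entity linking and broader patterns.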
35. Knowledge Graph: Data Analytics

Country          Births   Deaths   Difference
United States    2317     2094     -10%
Italy            1616     1279     -21%
Germany          1270     1292     2%
France           991      1058     7%
United Kingdom   882      877      -1%

City             Births   Deaths   Difference
London           322      507      57%
Paris            304      720      137%
New York         266      501      88%
Vienna           177      292      65%
Rome             159      256      61%
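The Difference column appears to be the relative change from births to deaths, that is (deaths - births) / births expressed as a percentage. This reading is an assumption, since the slide does not define the column, but it reproduces the rounded figures shown. A short sketch using the city counts from the slide (the names `CITY_COUNTS` and `relative_difference` are illustrative):

```python
# (births, deaths) per city, copied from the slide.
CITY_COUNTS = {
    "London": (322, 507),
    "Paris": (304, 720),
    "New York": (266, 501),
    "Vienna": (177, 292),
    "Rome": (159, 256),
}


def relative_difference(births, deaths):
    """Percentage change from births to deaths, rounded to a whole percent."""
    return round(100 * (deaths - births) / births)


differences = {city: relative_difference(b, d) for city, (b, d) in CITY_COUNTS.items()}
for city, diff in sorted(differences.items(), key=lambda item: item[1], reverse=True):
    print(f"{city}: {diff}%")
```

Sorted this way, Paris comes first, with more than twice as many recorded deaths as births.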
36. Knowledge Graph: Entity Relevance
• City
– Paris, London, Vienna, Rome, Venice, Berlin, Paris, New York
• Venue
– Covent Garden Theatre, King's Theatre, Drury Lane, Carnegie Hall, Théâtre de la Monnaie, Stadttheater, Theatre Royal, …
• Educational Institution
– Paris Conservatoire, Moscow Conservatory, Juilliard School, St Petersburg Conservatory, Bmus, Prague Conservatory, Leipzig Conservatory, Vienna Hochschule für Musik, ...
37. Knowledge Graph: Entity Relevance
• Biography subject
– Haydn, Claude Debussy, Arnold Schoenberg, Robert Stevenson, Paul Hindemith, Giovanni Pierluigi da Palestrina, Gustav Mahler, Maurice Ravel, Jean-Philippe Rameau
– Mozart?, Bach?, Wagner?
• Genre
– chamber music, cappella, jazz, folk music, avant garde, baroque music, electronic music, musical theatre, plainchant
38. Knowledge Graph: Entity Similarity
• PageRank algorithm and Maximal Common Subgraph
• Arnold Schoenberg: Anton Webern, Paul Hindemith, Gustav Mahler, Alban Berg, Claude Debussy
• Guido Adler: Heinrich Jalowetz, Eusebius Mandyczewski, Robert Fuchs, Karl Weigl, Anton Wranitzky
• Manuel de Falla: Ricardo Viñes, Juan Vicente Lecuna, Enrique Granados, Miguel Llobet Soles, Luigi Russolo
• Miles Davis: Dizzy Gillespie, Herbie Hancock, Paul Chambers, Tony Williams, Cannonball Adderley
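The slide names PageRank without detailing how it is applied to the knowledge graph. As a rough sketch of the algorithm itself, the following plain-Python power iteration ranks the nodes of a small directed graph. The graph is a toy: an edge from A to B is assumed to mean "the biography of A mentions B", and the edges, the function name `pagerank` and the parameter values are illustrative choices rather than the authors' setup.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict: node -> list of nodes it links to."""
    nodes = sorted(set(graph) | {t for targets in graph.values() for t in targets})
    count = len(nodes)
    rank = {n: 1.0 / count for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread over all nodes.
        dangling = sum(rank[n] for n in nodes if not graph.get(n))
        new_rank = {}
        for n in nodes:
            incoming = sum(rank[src] / len(targets)
                           for src, targets in graph.items() if n in targets)
            new_rank[n] = (1 - damping) / count + damping * (incoming + dangling / count)
        rank = new_rank
    return rank


# Toy graph: an edge A -> B means "the biography of A mentions B" (made-up edges).
MENTIONS = {
    "Webern": ["Schoenberg", "Berg"],
    "Berg": ["Schoenberg"],
    "Boulez": ["Webern", "Schoenberg"],
    "Schoenberg": [],
}

ranks = pagerank(MENTIONS)
for entity, score in sorted(ranks.items(), key=lambda item: item[1], reverse=True):
    print(f"{entity}: {score:.3f}")
```

In this toy graph Schoenberg is mentioned by every other node and therefore receives the highest score, while Boulez, mentioned by no one, receives the lowest.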
44. FlaBase: Flamenco Knowledge Base
• Data gathered
– 1,174 Artists (text biography)
– 76 Palos (flamenco genres)
– 2,913 Albums
– 14,078 Tracks
– 771 Andalusian locations
• Knowledge Extracted
– Place of birth
– Date of birth
– Entity mentions in text
45. FlaBase: Data Analytics
• Number of artists by year of birth
48. Conclusions
• Music Digital Libraries can benefit from semantic approaches
• Music Digital Libraries are still in an early stage of development compared to the Web (Linked Open Data, Google Knowledge Graph)
• Knowledge acquisition from Digital Libraries can help musicologists not only to search content, but also to discover new knowledge
49. Acknowledgments
• This work was partly funded by the COFLA2 research project (Proyectos de Excelencia de la Junta de Andalucía, FEDER P12-TIC-1362).
51. Knowledge Acquisition from Music Digital Libraries
Sergio Oramas, Mohamed Sordo
sergio.oramas@upf.edu
@sergiooramas
Thanks!