Soviet Popular Music Landscape
Community Structure and Success Predictors
Dmitry Zinoviev
Department of Mathematics and Computer Science
Suffolk University, Boston
Dmitry Zinoviev * IC S * Suffolk University  2
Research Question
Who Rocks and Why?
Real Research Questions
● Does sharing performers with other groups
influence the groups' eventual success?
● If so, is the success predictable from the
performers' sharing network?
● What is the linguocultural and genre structure
of the ex-Soviet music universe?
Research Strategy
● Collect data about sharing and success
● Build a network based on shared musicians
● Define “success”
● Correlate network measures (such as centralities)
with success measures
● Attempt to predict success from the network
measures using machine learning techniques
● Look into genres/languages and communities
DATA
Data Set
● 4,560 non-academic music groups performing in
the USSR and post-Soviet countries in 1960–2015
● 17,000 performers (at least 3,600 shared)
● 275 coded genres (rock, pop, disco, jazz, folk, etc.)
● Wikipedia pages in 122 languages
New Groups by Year
2,216 Groups on Wikipedia
● Russia
● Estonia
● Ukraine
● Latvia
● Lithuania
● Belarus
● Moldova
NETWORK
Network Construction
● Group → node; labels in the original language
● Two nodes connected if the groups shared at least
one musician over their lifetime
● Undirected, unweighted, disconnected graph with
no self-loops and no parallel edges
● For each node, calculate degree, average neighbor
degree, closeness, betweenness, and eigenvector
centrality, and clustering coefficient
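The construction above can be sketched in plain Python. This is a minimal illustration, not the code used for the study: the `members` mapping and every name in it are hypothetical toy data, and only three of the listed measures (degree, average neighbor degree, clustering coefficient) are computed. The remaining centralities would normally come from a graph library.

```python
from itertools import combinations

# Hypothetical toy data: group name -> set of musicians who ever played in it.
members = {
    "A": {"ann", "bob", "cy"},
    "B": {"bob", "dee"},
    "C": {"cy", "dee", "eve"},
    "D": {"eve"},
    "E": {"zed"},  # shares nobody, so it stays isolated
}

def build_sharing_graph(members):
    """Return an adjacency dict: two groups are linked if they share a musician."""
    adj = {group: set() for group in members}
    for g, h in combinations(members, 2):
        if members[g] & members[h]:  # at least one musician in common
            adj[g].add(h)
            adj[h].add(g)
    return adj

def clustering(adj, node):
    """Fraction of a node's neighbor pairs that are themselves linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

def avg_neighbor_degree(adj, node):
    """Mean degree of a node's neighbors (0.0 for an isolated node)."""
    nbrs = adj[node]
    return sum(len(adj[n]) for n in nbrs) / len(nbrs) if nbrs else 0.0

adj = build_sharing_graph(members)
degree = {group: len(nbrs) for group, nbrs in adj.items()}
```

Using sets for adjacency makes the graph unweighted and free of parallel edges by construction, and `combinations` never pairs a group with itself, so no self-loops appear.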
Network Overview
● Node size represents degree (number of shares)
Network Description
● 80% of the groups (3,602) are in the giant
connected component; all other connected
components have <13 groups each
● Excellent community structure (modularity m=0.76), 43
communities; each of the 25 largest communities
has 20+ groups
● Community = groups that have a lot of mutual
musician sharing
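Both figures on this slide, the share of groups in the giant component and the modularity of a partition, can be computed from an adjacency dict. The sketch below assumes that m denotes Newman's modularity and uses a hypothetical seven-node toy graph. It only scores a partition that is already given; the slides do not name the community detection algorithm, so none is implemented here.

```python
from collections import deque

# Hypothetical toy graph: two triangles joined by one edge, plus an isolated node.
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
    7: set(),
}

def connected_components(adj):
    """Return the connected components as sets, largest first."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:  # breadth-first search from `start`
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    comp.add(nbr)
                    queue.append(nbr)
        comps.append(comp)
    return sorted(comps, key=len, reverse=True)

def modularity(adj, communities):
    """Newman modularity: sum over communities of L_c/m - (d_c/2m)^2."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of edges
    q = 0.0
    for comm in communities:
        # Every internal edge is seen from both endpoints, hence the / 2.
        internal = sum(1 for u in comm for v in adj[u] if v in comm) / 2
        deg_sum = sum(len(adj[u]) for u in comm)
        q += internal / m - (deg_sum / (2 * m)) ** 2
    return q

giant = connected_components(adj)[0]
giant_share = len(giant) / len(adj)  # fraction of nodes in the giant component
q_split = modularity(adj, [{1, 2, 3}, {4, 5, 6}, {7}])
```

Splitting the toy graph into its two triangles gives a positive score, while putting all connected nodes into one community gives zero, which is the sense in which a higher value indicates clearer community structure.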
SUCCESS
What's “Success”?
● No sales data!
● No charts!
● Informal/semi-legal/illegal status
● Proxies for long-term success (we still remember them!):
– Wikipedia page(s) visit frequency within last 3 years (collected
from http://stats.grok.se)
– Wikipedia page(s) Google PageRank
– Available for 2,000 groups
PageRank (PR) Correlations
Visit Frequency (VF) Correlations
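The slides do not say which correlation coefficient was used. As one possible choice, the sketch below computes Spearman's rank correlation between a network measure and visit frequency, which suits heavily skewed page-visit counts. The two input lists are hypothetical toy values, and constant inputs (zero variance) are not handled.

```python
def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical toy data: node degree and page visits for five groups.
degree = [1, 2, 3, 4, 5]
visits = [10, 40, 30, 200, 1000]
rho = spearman(degree, visits)
```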
Prediction (1)
● Random Decision Forest (RDF) machine learning
predictor
● Predict above-median VF vs below-median VF:
accuracy 69% (expected by chance: 50%)
● Predict Google PR: accuracy 50% (expected by
chance: 17%); 95% if 1 error allowed
● Quite poor, but not hopeless
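The two accuracy figures above rest on two small steps that can be sketched without the classifier itself: labeling groups by a median split of visit frequency, and scoring predictions with an optional tolerance. "1 error allowed" is read here as a prediction that misses the true PageRank by at most one level; that reading, the toy values, and the function names are all assumptions. The random forest is deliberately left out.

```python
from statistics import median

def median_split(values):
    """Label each value 1 if it is above the median, else 0."""
    cut = median(values)
    return [1 if v > cut else 0 for v in values]

def accuracy(true, predicted, tolerance=0):
    """Share of predictions within `tolerance` classes of the true class."""
    hits = sum(1 for t, p in zip(true, predicted) if abs(t - p) <= tolerance)
    return hits / len(true)

# Hypothetical toy data.
visits = [10, 40, 30, 200, 1000, 5]
vf_labels = median_split(visits)

pr_true = [3, 4, 2, 5, 1, 3]   # made-up PageRank levels
pr_pred = [3, 3, 2, 3, 2, 4]   # made-up predictions
exact = accuracy(pr_true, pr_pred)
within_one = accuracy(pr_true, pr_pred, tolerance=1)
```

A median split produces two equally sized classes, which is why guessing would be right about half the time in the binary task.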
Prediction (2)
● But isn't visit frequency affected by group size?
(More performers—more search queries?)
● Add group size as a control variable
● Predict above-median VF vs below-median VF:
accuracy 69% (was: 69%)
● No difference!
GENRES
Genres and Sharing
● Build a network of similar genres (recursive
generalized similarity):
– Two genres are similar if used by similar groups
– Two groups are similar if they play similar genres
● Genre → node; two nodes are connected if the
genres are “very similar”
● Community structure (modularity m=0.3):
– Punk/jazz, metal, disco/pop, blues/hip-hop, light rock
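The idea of linking "very similar" genres can be illustrated with a much simpler stand-in: a single pass of cosine similarity between genres, based on which groups play them. This is not the recursive generalized similarity named on the slide, which iterates between genre and group similarities. The toy data, the function names, and the 0.6 threshold are hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data: group -> set of genres it plays.
group_genres = {
    "g1": {"rock", "metal"},
    "g2": {"rock", "metal"},
    "g3": {"metal"},
    "g4": {"pop", "disco"},
    "g5": {"pop", "disco", "rock"},
}

def genre_groups(group_genres):
    """Invert the mapping: genre -> set of groups that play it."""
    inverted = {}
    for group, genres in group_genres.items():
        for genre in genres:
            inverted.setdefault(genre, set()).add(group)
    return inverted

def cosine(a, b):
    """Cosine similarity of two non-empty sets viewed as binary vectors."""
    return len(a & b) / sqrt(len(a) * len(b))

def genre_network(group_genres, threshold=0.6):
    """Link two genres when their similarity reaches the (assumed) threshold."""
    by_genre = genre_groups(group_genres)
    edges = set()
    for g, h in combinations(sorted(by_genre), 2):
        if cosine(by_genre[g], by_genre[h]) >= threshold:
            edges.add((g, h))
    return edges

edges = genre_network(group_genres)
```

In the toy data, pop and disco are played by exactly the same groups and rock and metal overlap heavily, so the network falls into two genre pairs.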
Genre Network
● Labels in the figure: Metal, Light rock, Punk, Soul, Folk/jazz/hh, Disco, Ethno
● Some genres are hierarchical (rock/metal/black metal). TODO: Assign them to different levels.
Musicians Prefer Similar Genres
LINGUOCULTURAL STRUCTURE
Languages, Genres, and Sharing
● Group sharing network has 25 communities with
20+ groups in each
● Preferred language = language of the most
frequently visited Wikipedia page
● Look into genres and preferred languages within
each community: Are they homo- or
heterogeneous?
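The definition of preferred language and the homogeneity check can be sketched as follows. The page-visit counts, the community assignment, and the function names are hypothetical; the 50% majority criterion is taken from the slides that follow.

```python
from collections import Counter

# Hypothetical toy data: group -> {Wikipedia language code: page visits}.
page_visits = {
    "g1": {"ru": 900, "en": 300},
    "g2": {"ru": 120},
    "g3": {"uk": 500, "ru": 450},
    "g4": {"et": 80, "en": 20},
}
# Hypothetical community assignment of the same groups.
communities = [{"g1", "g2", "g3"}, {"g4"}]

def preferred_language(visits_by_language):
    """Language of the most frequently visited Wikipedia page."""
    return max(visits_by_language, key=visits_by_language.get)

def majority_share(community, attribute):
    """Most common attribute value in a community and the share of groups having it."""
    counts = Counter(attribute[group] for group in community)
    value, count = counts.most_common(1)[0]
    return value, count / len(community)

preferred = {group: preferred_language(v) for group, v in page_visits.items()}
summary = [majority_share(comm, preferred) for comm in communities]
# A community counts as homogeneous here if more than half share one value.
homogeneous = [share > 0.5 for _, share in summary]
```

The same `majority_share` helper works for genres if `attribute` maps each group to a single genre label; groups with several genres would need a rule for picking one, which the slides do not specify.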
Genres per Community
● In 9 communities, >50% of groups perform the same genre.
● In 23 communities, >50% of groups perform in no more than 2 genres.
● 71% of all shares are homogeneous.
Preferred Languages per Community
● In 24 communities, >50% of groups have the same preferred language!
● 84% of all shares are homogeneous.
Language and Genre Homogeneity: Either or Both?
● Labels in the figure: Language-defined, Genre-defined, Mixed
● Not very convincing?
Conclusion
● Musician sharing networks of non-academic music
groups in the USSR and post-Soviet countries have
community structure inspired by preferred
language and musical genre
● Centrality and clustering measures of this network
are correlated with long-term success of groups in
terms of popularity on Wikipedia and to some
extent can serve as success predictors
Dataset Available
● https://github.com/dzinoviev/sovietmusic
Made in Pythonia
Get your copy of “Data Science Essentials in Python” at
https://pragprog.com/book/dzpyds/data-science-essentials-in-python
