Soviet Popular Music Landscape: Community Structure and Success Predictors
Dmitry Zinoviev
Department of Mathematics and Computer Science
Suffolk University, Boston
Dmitry Zinoviev * IC S * Suffolk University 3
Real Research Questions
● Does sharing performers with other groups influence the groups' eventual success?
● If so, is the success predictable from the performers' sharing network?
● What is the linguocultural and genre structure of the ex-Soviet music universe?
Research Strategy
● Collect data about sharing and success
● Build a network based on shared musicians
● Define “success”
● Correlate network measures (such as centralities) with success measures
● Attempt to predict success from the network measures using machine learning techniques
● Look into genres/languages and communities
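The correlation step could be sketched as follows. The slides do not say which correlation coefficient was used; this sketch assumes a Spearman rank correlation, a common choice for skewed quantities such as page visits. The `degree` and `visits` values are hypothetical and not taken from the data set.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical network measure and success proxy for five groups
degree = [1, 4, 2, 9, 3]
visits = [120, 900, 150, 4000, 700]
rho = spearman(degree, visits)  # monotone relationship, so rho is 1 up to rounding
```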
Data Set
● 4,560 non-academic music groups performing in the USSR and post-Soviet countries in 1960–2015
● 17,000 performers (at least 3,600 shared)
● 275 coded genres (rock, pop, disco, jazz, folk, etc.)
● Wikipedia pages in 122 languages
2,216 Groups on Wikipedia
● Russia
● Estonia
● Ukraine
● Latvia
● Lithuania
● Belarus
● Moldova
Network Construction
● Group → node; labels in the original language
● Two nodes are connected if the groups shared at least one musician over their lifetime
● Undirected, unweighted, disconnected graph with no loops and no parallel edges
● For each node, calculate degree, average neighbor degree, closeness, betweenness, and eigenvector centralities, and the clustering coefficient
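A minimal sketch of this construction, using only the Python standard library. The `rosters` dictionary and its group and musician names are hypothetical stand-ins for the real data, and only the three simplest node measures are computed here.

```python
from itertools import combinations

# Hypothetical rosters: group name -> all musicians over the group's lifetime
rosters = {
    "Group A": {"m1", "m2", "m3"},
    "Group B": {"m3", "m4"},
    "Group C": {"m2", "m4", "m5"},
    "Group D": {"m5"},
    "Group E": {"m6"},  # shares nobody, so it stays an isolated node
}

def build_sharing_graph(rosters):
    """Adjacency sets: two groups are linked if their rosters intersect."""
    adj = {g: set() for g in rosters}
    # combinations() yields distinct pairs only, so there are no loops;
    # sets store each neighbor once, so there are no parallel edges.
    for a, b in combinations(rosters, 2):
        if rosters[a] & rosters[b]:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def clustering(adj, node):
    """Local clustering coefficient: share of neighbor pairs that are linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

def average_neighbor_degree(adj, node):
    """Mean degree of the node's neighbors (0.0 for an isolated node)."""
    nbrs = adj[node]
    return sum(len(adj[n]) for n in nbrs) / len(nbrs) if nbrs else 0.0

graph = build_sharing_graph(rosters)
degree = {g: len(nbrs) for g, nbrs in graph.items()}
```

Closeness, betweenness, and eigenvector centralities are more involved and are usually taken from a graph library such as NetworkX rather than written by hand.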
Network Overview
● Node size represents degree (number of shares)
Network Description
● 80% of the groups (3,602) are in the giant connected component; all other connected components have <13 groups each
● Excellent community structure (m=0.76), 43 communities; each of the largest 25 communities has 20+ groups
● Community = groups that have a lot of mutual musician sharing
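The quality of a community partition can be scored as sketched below. This assumes that the slide's m is the Newman modularity, which the slides do not state explicitly. The toy graph (two triangles joined by one edge) and its partition are hypothetical.

```python
def modularity(adj, community_of):
    """Newman modularity: sum over communities of L_c/E - (d_c/(2E))**2,
    where E is the number of edges, L_c the number of edges inside
    community c, and d_c the total degree of its nodes."""
    n_edges = sum(len(nbrs) for nbrs in adj.values()) / 2
    if n_edges == 0:
        return 0.0
    internal = {}    # L_c
    degree_sum = {}  # d_c
    for node, nbrs in adj.items():
        c = community_of[node]
        degree_sum[c] = degree_sum.get(c, 0) + len(nbrs)
        for n in nbrs:
            if community_of[n] == c:
                # Each internal edge is visited from both ends
                internal[c] = internal.get(c, 0) + 0.5
    return sum(
        internal.get(c, 0) / n_edges - (d / (2 * n_edges)) ** 2
        for c, d in degree_sum.items()
    )

# Hypothetical toy graph: two triangles joined by the edge 3-4
adj = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
community_of = {1: "a", 2: "a", 3: "a", 4: "b", 5: "b", 6: "b"}
q = modularity(adj, community_of)
```

Putting every node into a single community gives a score of 0, so values well above 0 indicate that edges fall inside communities more often than a degree-preserving random graph would suggest.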
What's “Success”?
● No sales data!
● No charts!
● Informal/semi-legal/illegal status
● Proxies for long-term success (we still remember them!):
– Wikipedia page(s) visit frequency within last 3 years (collected from http://stats.grok.se)
– Wikipedia page(s) Google PageRank
– Available for 2,000 groups
Prediction (1)
● Random Decision Forest (RDF) machine learning predictor
● Predict above-median vs below-median visit frequency (VF): accuracy 69% (expected by chance: 50%)
● Predict Google PageRank (PR): accuracy 50% (expected by chance: 17%); 95% if 1 error allowed
● Quite poor, but not hopeless
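The two accuracy figures can be read as sketched below. The 17% chance level is close to 1/6, which would be consistent with six PageRank classes; that class count is an inference, not something the slides state. "1 error allowed" is interpreted here as a prediction at most one class away from the true one, which is also an assumption. The labels are hypothetical, and the classifier itself is not reproduced.

```python
from statistics import median

def median_split(values):
    """Label each value 1 if it is strictly above the median, else 0."""
    mid = median(values)
    return [1 if v > mid else 0 for v in values]

def accuracy(true, predicted, tolerance=0):
    """Share of predictions within `tolerance` classes of the true class."""
    hits = sum(1 for t, p in zip(true, predicted) if abs(t - p) <= tolerance)
    return hits / len(true)

# Hypothetical ordinal PageRank classes for eight groups (not real data)
true_pr = [0, 1, 2, 3, 4, 5, 2, 3]
pred_pr = [0, 2, 2, 3, 3, 5, 4, 3]

exact = accuracy(true_pr, pred_pr)                     # strict match
within_one = accuracy(true_pr, pred_pr, tolerance=1)   # off-by-one allowed

# Binary target for the visit-frequency task (hypothetical visit counts)
vf_labels = median_split([10, 20, 30, 40])
```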
Prediction (2)
● But isn't visit frequency affected by group size? (More performers, more search queries?)
● Add group size as a control variable
● Predict above-median VF vs below-median VF: accuracy 69% (was: 69%)
● No difference!
Genres and Sharing
● Build a network of similar genres (recursive generalized similarity):
– Two genres are similar if used by similar groups
– Two groups are similar if they play similar genres
● Genre → node; two nodes are connected if the genres are “very similar”
● Community structure (m=0.3):
– Punk/jazz, metal, disco/pop, blues/hip-hop, light rock
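The mutual recursion between genre similarity and group similarity can be illustrated with a simplified fixed-point iteration. This is not the generalized similarity measure used in the study; it is a stand-in that alternates between the two similarity matrices with a cosine-style normalization. The group-by-genre matrix `A` is hypothetical.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(X):
    return [list(col) for col in zip(*X)]

def normalize(S):
    """Scale S so that every nonzero diagonal entry becomes 1."""
    n = len(S)
    d = [S[i][i] ** 0.5 for i in range(n)]
    return [
        [S[i][j] / (d[i] * d[j]) if d[i] and d[j] else 0.0 for j in range(n)]
        for i in range(n)
    ]

def co_similarity(A, iterations=10):
    """A[g][k] = 1 if group g plays genre k.
    Groups are similar if they play similar genres, and genres are
    similar if they are played by similar groups."""
    At = transpose(A)
    n_genres = len(A[0])
    # Start from "each genre is similar only to itself"
    genre_sim = [[1.0 if i == j else 0.0 for j in range(n_genres)]
                 for i in range(n_genres)]
    for _ in range(iterations):
        group_sim = normalize(matmul(matmul(A, genre_sim), At))
        genre_sim = normalize(matmul(matmul(At, group_sim), A))
    return group_sim, genre_sim

# Hypothetical data: groups 0 and 1 play genres 0 and 1; group 2 plays genre 2
A = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
group_sim, genre_sim = co_similarity(A)
```

A genre network in the spirit of the slide would then link two genres whenever their similarity exceeds a chosen threshold; the slides give no numeric value for "very similar".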
Genre Network
[Figure: genre similarity network; community labels: Metal, Light rock, Punk, Soul, Folk/jazz/hh, Disco, Ethno]
Some genres are hierarchical (rock/metal/black metal). TODO: Assign them to different levels.
Languages, Genres, and Sharing
● Group sharing network has 25 communities with 20+ groups in each
● Preferred language = language of the most frequently visited Wikipedia page
● Look into genres and preferred languages within each community: Are they homo- or heterogeneous?
Genres per Community
In 9 communities, >50% of groups perform one genre.
In 23 communities, >50% of groups perform in no more than 2 genres.
71% of all shares are homogeneous
Preferred Languages per Community
In 24 communities, >50% of groups have the same preferred language!
84% of all shares are homogeneous
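Both kinds of homogeneity figures can be computed as sketched below. The sketch assumes one label per group (a preferred language, or a single genre) and reads "homogeneous share" as a sharing edge whose two groups carry the same label; the slides do not spell out either detail. All data are hypothetical.

```python
from collections import Counter

def dominant_share(labels):
    """Return the most common label and the fraction of items carrying it."""
    label, count = Counter(labels).most_common(1)[0]
    return label, count / len(labels)

def homogeneous_edge_share(edges, label_of):
    """Fraction of edges whose two endpoints carry the same label."""
    same = sum(1 for u, v in edges if label_of[u] == label_of[v])
    return same / len(edges)

# Hypothetical preferred languages, communities, and sharing edges
language = {"A": "ru", "B": "ru", "C": "ru", "D": "et", "E": "et"}
communities = {0: ["A", "B", "C", "D"], 1: ["E"]}
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "C")]

shares = {
    c: dominant_share([language[g] for g in members])
    for c, members in communities.items()
}
# Communities in which more than half of the groups share one label
majority = [c for c, (_, share) in shares.items() if share > 0.5]
edge_share = homogeneous_edge_share(edges, language)
```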
Language and Genre Homogeneity: Either or Both?
[Figure: communities labeled as language-defined, genre-defined, or mixed; annotation: “Not very convincing?”]
Conclusion
● Musician sharing networks of non-academic music groups in the USSR and post-Soviet countries have community structure inspired by preferred language and musical genre
● Centrality and clustering measures of this network are correlated with long-term success of groups in terms of popularity on Wikipedia and to some extent can serve as success predictors
Dataset Available
● https://github.com/dzinoviev/sovietmusic