collaborative classificacion of popular music_rose marie santini_2011


    Collaborative classification of popular music on the internet and its social implications

    Rose Marie Santini

    Department of Information Science, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, Brazil

    Abstract

    Purpose: This paper aims to discuss how collaborative classification works in online music information retrieval systems and its impacts on the construction, fixation and orientation of the social uses of popular music on the internet.

    Design/methodology/approach: Using a comparative method, the paper examines the logic behind music classification in Recommender Systems by studying the case of Last.fm, one of the most popular web sites of this type. Data collected about users' ritual classifications are compared with the classification used by the music industry, represented by the AllMusic web site.

    Findings: The paper identifies the differences between the criteria used for the collaborative classification of popular music, which is defined by users, and the traditional standards of commercial classification, used by the cultural industries, and discusses why commercial and non-commercial classification methods vary.

    Practical implications: Collaborative ritual classification reveals a shift in the demand for cultural information that may affect the way in which this demand is organized, as well as the classification criteria for works on the digital music market.

    Social implications: Collective creation of a music classification in recommender systems represents a new model of cultural mediation that might change the way of building new uses, tastes and patterns of musical consumption in online environments.

    Originality/value: The paper highlights the way in which the classification process might influence the behavior of the users of music information retrieval systems, and vice versa.

    Keywords: Collaborative classification, Commercial classification, Popular music, Music information retrieval systems, Recommender systems, Last.fm, Social action, Social networks

    Paper type: Case study

    Introduction

    The aim of this paper is to study the type of relationship that can be established between the collaborative construction of an artistic classification system (in this case, music) and patterns of social organization that cause new social uses of culture to emerge, uses different from those hitherto made known and oriented by the cultural industries.

    The conceptual issue underlying the hypothesis addressed in this research is the existence of a dual movement. On the one side, there are the construction, stabilization and orientation of the social uses of music by means of its classification. On the other, simultaneously, there is the collaborative ritual classification as a way to reveal the uses and values that music acquires in a dynamic and transitory way in the social field.

    The current issue and full text archive of this journal is available at www.emeraldinsight.com/1065-075X.htm

    Received January 2011. Revised March 2011. Accepted March 2011.

    OCLC Systems & Services: International digital library perspectives, Vol. 27 No. 3, 2011, pp. 210-247. © Emerald Group Publishing Limited, 1065-075X. DOI 10.1108/10650751111164579


    The paper deals specifically with the collaborative classification processes of music in Recommender Systems (RS), which represent a new model of cultural mediation that was born and consolidated on the Internet. Recommender Systems are computer-based systems of classification, organization and recommendation of cultural goods, based on user practices and tastes. These systems run on a technology known as Collaborative Filtering, which is also used as a synonym for RS when the intention is to refer to a specific type of software in which the information filtering is performed with human help, that is, with the collaboration of a network of users.

    With the emergence of collaborative filtering technologies, users are now able to build a folksonomy for the songs available online, and this classification has a different logic of functioning and use when compared to traditional taxonomies or controlled vocabulary. Therefore, the main goal of this research is to analyze specifically what the differences and similarities are between the criteria used for the collaborative, user-defined classification of music and traditional patterns of commercial classification, used by the cultural industries. In order to reach this goal, this research draws on a case study of the Last.fm recommender system, one of the most popular systems of this kind on the internet.

    In the case of Last.fm, the classification and organization of information is actively produced and reproduced by the users, through the compilation of individual folksonomies. Furthermore, the categories by which the works are organized give them visibility in relation to other users. Thus, the classification acts as a mediation process between the music supply and demand that revolves around the system. Likewise, the genre-based classification system created by the music industry has always worked as a mediation strategy that guides the consumption of cultural products, according to production availability and commercial distribution.

    In this regard, the issues raised by genre definition become a fundamental theme in the complex conflict between the different types of theory and empiricism that might be applied in this case. These conflicts are built into the opposition that exists between a theory of fixed genres (that defines rules for each genre) and an opposing theory based on empiricism, which shows the impossibility or ineffectiveness of reducing all the real and possible works to these genres (Williams, 1977).

    The challenge faced by the social sciences is to understand the processes by which the similarities are perceived, and how the genres are created and enforced. In this paper the genres are believed to represent constructed principles of social organization, which impregnate the works with significance according to their thematic content, but also according to their utility and contextual uses. According to DiMaggio (1987), the genres also respond to the creation of a structural demand for cultural information and affiliation to social groups.

    Therefore, the expression "music genre" is used in this paper to make reference to a set of works of music classified by groups, based on the similarities perceived both by users and by the music industry. The concept of music genre enables the comparison and identification of the differences between the principles of classification used by common listeners and those used by the music industry, principles that reflect, respectively, the structure of the users' tastes and the production and distribution structure of cultural goods.

    This perspective is even more important when considering the goal of this paper and the context of analysis: the Recommender System known as Last.fm and the processes of collaborative classification carried out by users. Connected to the internet, this group of listeners constitutes a social network with a high level of interaction between its users, and the internet is the place where participation, classification and the uses of content are organized in a non-centralized, non-hierarchical manner.

    Bearing this new context in mind, a context in which online collaboration is carried out within a global and heterogeneous network of users, it is necessary to identify the consequences that this collective creation of a music classification system might have on the construction of new uses, tastes and patterns of cultural consumption.

    If the artistic classification systems reflect changes in the social organization and vice versa, the decentralized and shared environment provided by the internet, understood here as a socio-structural and organizational factor, might reveal a shift in the demand for cultural information, affect the way in which this demand is organized and how the preferences are built, as well as reflecting the patterns of work classification.

    The issues of vocabulary control and classifications of popular music

    According to Library and Information Science, the classification and categorization of music archives is a controversial process, not only because music genres are notoriously difficult to identify, but also because the concept of genre itself is a difficult one to deal with (Aucouturier and Pachet, 2003; McKay and Fujinaga, 2006).

    For collections housed in traditional libraries, which consist mainly of classical music catalogs and Western artistic tradition, there is a substantial quantity of literature and guidelines that help the classification and categorization of this material. Western classical music is the focus of attention of most researchers in this field, being thoroughly studied in the academic world. On the other hand, academic research dealing with the classification of popular music is still under-represented in the scientific field, in spite of its significant growth in recent decades (Thompson, 2008).

    Therefore, there are fewer means available to catalog and classify popular music collections than classical music, if any can be found at all. Popular music is still regarded as a spurious area of knowledge in the bibliographic field[1]. Qualifying and classifying popular music is made even more complex due to its fluid and transitory nature, and the constant growth of the field.

    The development of different types of structures and categories from the formal vocabulary is a time-consuming process. For example, when certain popular music classification structures begin to become established at a given moment, they are inevitably faced with the difficulties imposed by their own constant processes of mutation. These transformations refer both to the recurrent aesthetic-musical changes and to the many facets that popular music acquires socially, in each place and time.

    Moreover, the indexation, description, classification and retrieval of music files face complications that are common to visual documents: their non-textual qualities further intensify the role played by language and vocabulary in the music classification systems.

    From as early as the 1960s, founding texts such as Kassler's (1966) anticipated the preoccupation with automatic retrieval techniques of music information. But it was only in the 1990s that the number of researchers and studies interested in music indexation and classification based on documental content analysis (as opposed to traditional analysis based on editorial information) actually grew. This phenomenon owes a lot to the rapidly increasing number of collections of works of music available as audio files, which was intensified by the development of new technologies and social networks.


    The description and classification of multimedia files available on the internet cannot possibly be achieved without the participation of an active army of users/volunteers. On the internet, after the popularization of the MP3, a huge portion of music information began to be annotated manually, in a non-structured way (as opposed to taxonomy and controlled vocabulary), through a collaborative, dynamic and dis-intermediated process, that is, through a direct user-work relationship. Nowadays, millions of users connect daily to web sites and recommender systems (like Last.fm) in order to classify the music they like by the use of free labels ("tags"). Each Last.fm user-generated tag is made available to all members of the community. By doing this, tag-based recommender systems are able to build a collaborative depository of musical knowledge. According to Aucouturier and Pampalk (2008), this collaborative classification has a direct influence on the way users of a given community understand, recognize, describe and listen to the songs.

    In spite of the continuous efforts deployed by scientists and companies in the development of technologies that can automatically annotate audio files, the usefulness of this method is still restricted to a few music domains such as computerized instrument recognition. Until now, the automatic annotation methods available are not mature enough to classify and label sounds and digital audio files (Sordo et al., 2007).

    Furthermore, machines face limitations when classifying cultural goods (they cannot substitute for human cognitive processes), which makes automatic annotation methods only complementary to the manual methods, not substitutive. These systems are incapable of interpreting the works' social value, the constitution of music genres and the social dynamics in question. At the same time, manual annotation performed by experts is expensive and time consuming given the number of songs available on the web, which runs into the billions.

    Despite all the breakthroughs in research on popular music classification in the most recent information retrieval systems, there are very few parameters available to guide the processes of representing the material. According to Elaine Menard (2007), although visual (image) and audio (music) documents share the same issues of information representation because of their non-textual characteristics, there are more formal structures available for image description than for music description, for example the Art and Architecture Thesaurus (AAT), the Thesaurus for Graphic Materials I and II (TGM I and TGM II) and the ICONCLASS vocabulary.

    The music field still lacks important tools with which to build a structured vocabulary, be it in the classical music domain or in the popular music one. The Library of Congress Subject Headings (LCSH), which is a commonly used reference for access to themes and catalogs of musical works in most of the libraries around the world, is obviously influenced by and focused on Western Art Music, as well as being insufficient in many regards (Thompson, 2008). In this way, different authors[2] agree that the organization of music in classification schemes (verbal, numeric or alphanumeric) has been a problematic area for cataloguers and librarians, and that the various schemes available are incapable of capturing all types of music.

    Although the Dewey Decimal Classification (DDC) and Library of Congress Classification (LCC) are widely used to classify music resources, new music and other musical experimentations are difficult to accommodate in those schemes. In both classification systems, numbers are provided primarily for the music that comes from North American and European traditions, not sufficiently covering other kinds of music.


    Lorraine Nero (2006), studying the case of Trinidad and Tobago music classification in Caribbean libraries, analyzes the adequacy of those systems in covering Latin-American popular music. The author examines the solutions adopted by these libraries and provides practical examples on how to accommodate world music in the DDC and LCC schemes. By adding new numbers to differentiate the music by each territory or country, some suggestions are made to distinguish various and changing forms of music listed under the label "popular music". Nero (2006) purports to show that, considering the dynamism of the cultural environment, classification schemes need to be equally dynamic to assist in adequately classifying the resources, and to enable exploration of the process of incorporating new genres into DDC and LCC schemes.

    These adaptations of the DDC and LCC can eliminate some of the problems, but it should be noted that, in a bibliographic record-sharing environment, these numbers would not be understood by all participants. This has made it difficult to assess collections and for users to browse and locate items. Cataloguers also have to reclassify items and use adequate cross-references from shared services to fit into adaptations that serve a broader public.

    At the time of formulating and developing this research, the closest thing available to a controlled vocabulary for popular music was on the web site AllMusic.com (a kind of IMDB, the Internet Movie Database, for music). The web site was opened in 1995 by the All Music Guide business group (AMG), which today belongs to the US company Macrovision Corporation.

    The vocabulary and facets created and published by the AllMusic.com group for categorizing and classifying popular music are used worldwide as the recording industry standard for the classification of cultural product catalogs. AllMusic's thesaurus is a reference for the organization of catalogs within major record labels (EMI, Sony/BMG, Universal and Warner). It is also used by international publications and consulting companies in the music field (for example, Billboard magazine and Nielsen Business Solution Consulting Company), as well as by companies that work in online music distribution and sales, such as Microsoft, AOL, Yahoo!, Amazon, Barnes & Noble, Best Buy, Ticketmaster, Musicmatch, iTunes and Napster[3].

    The construction of the AllMusic taxonomy depends on the work of a permanent editorial staff and on the contribution of hundreds of experts in the popular music field in order to classify a catalog of over one million artists and 13 million songs. AllMusic defines itself as "the most comprehensive music reference source on the planet"[4]. A variety of data about works of music, song-writers, singers etc., both popular and classical, is available free of charge on the internet, to be consulted by whoever may be interested.

    One of the limitations of AllMusic is the fact that it does not classify the songs directly. The search for a particular song indicates only its corresponding writer(s), singer(s) and album. Therefore, only the artists and their albums are classified, a situation that differs greatly from the collaborative classification deployed by Last.fm users. Table I shows the facets used by AllMusic for the classification of the music content available in its database.

    Both AllMusic and Last.fm describe related contents or similar contents. For example, in each detailed description of a given artist, the web site indicates other similar artists, thus classifying and relating all the music content about an artist or album by genres, styles and other similarities. Table II shows the statistics relating to the AllMusic catalog by the number of information entries in its database.

    The concepts developed on Web 2.0 (like folksonomy, social tags and collaborative classification) are not part of this web site's categorization method. On the contrary, AllMusic.com supervises and controls the classification and the description of its material through an editorial staff, which justifies its hierarchical model of organization and classification of music, as well as its controlled vocabulary.

    Table I. Facets used by AllMusic.com

    Facets                         Examples
    Music genre                    Jazz; Pop-Rock; Electronic; Classical
    Music style (or sub-genre)     Free Jazz; Folk-Rock; Choro
    Types of musical instrument    Sax (Soprano); Piano; Guitar; Voice
    City/Country/Locale            French; Bahia-Brazil; US; NYC
    Mood                           Drama; Funny; Delicate; Sophisticated
    Themes                         Birthday; Christmas Party; Relaxing
    Similar artists                Tom Jobim; Billie Holiday; Stan Getz

    Note: See AllMusic web site (www.allmusic.com), section "About cover stats", available at: www.allmusicguide.com/cg/amg.dll?p=amg&sql=32:amg/info_pages/a_about_cover_stats.html (accessed 4 February 2009)
    Source: Adapted from AllMusic.com (www.allmusic.com)

    Table II. AllMusic.com statistics

    Categories                              No. of entries on the database
    Albums                                  1,580,326
    Songs                                   13,526,702
    Classical compositions                  311,833
    Artists (all names)                     1,166,997
    Composers                               275,846
    Album reviews                           341,627
    Description of classical composition    27,519
    Biography                               96,314
    Album credits                           19,145,765
    Music genres                            9
    Music styles                            919
    Theme                                   86
    Mood                                    184
    Instruments                             5,045
    Related links                           13,246,506
    Cover images                            1,062,661
    Artist images                           76,574

    Note: See AllMusic web site (www.allmusic.com), section "Coverage Statistics", available at: www.allmusic.com/cg/amg.dll?p=amg&sql=32:amg/info_pages/a_about_cover_stats.html (accessed 20 February 2009)
    Source: AllMusic.com (www.allmusic.com)

    Besides searching for artists, albums and songs on its database, users also have the option of browsing the categories used in the classification, like genre, style, sub-genre, types of instruments, places, mood and themes. In relation to music genres, the web site classifies the songs in only 11 categories:

    (1) pop/rock;

    (2) jazz;

    (3) R&B;

    (4) rap;

    (5) country;

    (6) blues;

    (7) electronic;

    (8) latin;

    (9) reggae;

    (10) world musical; and

    (11) classical.

    Nevertheless, it can be observed that the categories of "mood" and "theme" represent a significant change in the formal description of musical items. Unlike traditional cataloguing and categorization, which consider only tangible aspects of the data, AllMusic starts to consider less stable categories when classifying music: categories that indicate subjective perceptions such as emotion and context of the musical content. This type of subjective description is abundant in the folksonomy structure.

    The classification of music by genres is arbitrary, and the categories are random and imprecise (Gjerdingen and Perrott, 2008), although historically determined. With the growth of the cultural industry in the twentieth century, popular music classification systems began to be determined and cultivated socially, under commercial pressure from the recording industry.

    Therefore, if music genres can be considered as socially constructed categories, developed mainly by the music industry, it becomes necessary to examine the differences between collaborative classification, carried out by users, and the ruling classification system used by the Industry. Having said that, the next section will analyze the ways in which users describe their collections of music and their own musical taste at Last.fm. It will also examine in which aspects the users' music-cognitive perception differs from the classification categories used and propagated by the market, as well as to what extent the commercial categories can be incorporated or corrupted by social use.

    Collaborative classification of popular music at Last.fm

    Last.fm was created in 2003 as an online radio, but with the idea of being a music-based social network. In August 2005 the web site started using software called Audioscrobbler (which registers the listeners' habits) as an interface for access to the songs, by means of user-created tags and inspired by the innovations developed on web sites such as Flickr and Del.icio.us.

    Last.fm encourages its users to create tags[5] in order to:

    • assemble playlists of songs based on the collaborative classification of music;

    • categorize the profile and the taste of each listener; and

    • improve the classification system of the biggest music catalog in the world.

    Therefore, Last.fm allows its listeners to classify artists, albums and songs that they listen to and enjoy within its recommendation system. The RS uses the collaborative classification based on tags to group artists and songs. Grouping the material through tags determines the performance of the system's recommendation, which is also based on the listening habits of each user registered on its database. In the "Frequently Asked Questions" section of the web site, Last.fm describes the uses of the tags:

    [...] like keywords or labels that you can use to classify music artists, albums or tracks. They are simply short descriptions. You can assign as many tags as you like to any track, album, or artist. Tags are a great way to label items by genre ("rock", "electropop", "alt-country", and so on), but the possibilities are endless (idem note # 5).

    Thus, Last.fm actively promotes users' classifications that surpass the traditional frontiers of the industry-standard music genres. That is, listeners are encouraged to use tags as a way of organizing their tastes and preferences, and to categorize the content available on the web site based on their individual musical perception and on personal/collective uses.
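
    How tags can group artists may be easier to see in a small example. The sketch below is an illustration only and not Last.fm's actual recommendation algorithm, which is not described here: it assumes that two artists belong together when their tag sets overlap, measured with the Jaccard index. The artist names, tag sets, the `threshold` value and the function names are all hypothetical.

```python
from itertools import combinations

# Hypothetical tag sets, invented for illustration only.
artist_tags = {
    "Artist A": {"rock", "indie", "seen live"},
    "Artist B": {"rock", "indie", "alternative"},
    "Artist C": {"jazz", "piano", "relaxing"},
}


def jaccard(a, b):
    """Share of tags that two tag sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_pairs(tags_by_artist, threshold=0.3):
    """Return (artist, artist, score) for pairs whose overlap meets the threshold."""
    pairs = []
    for x, y in combinations(sorted(tags_by_artist), 2):
        score = jaccard(tags_by_artist[x], tags_by_artist[y])
        if score >= threshold:
            pairs.append((x, y, round(score, 2)))
    return pairs


print(similar_pairs(artist_tags))  # [('Artist A', 'Artist B', 0.5)]
```

    With these invented data, only Artist A and Artist B share enough tags to be grouped. A free tag such as "seen live" enters the comparison on the same footing as a genre label, which is the kind of freedom the FAQ passage quoted above describes.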

    The contents classified on Last.fm are mainly songs and artists. Albums, recording labels, videos, user communities, chats and other things can also be classified, but these are only minor classifications. The percentage of collaborative classification that takes place on Last.fm is distributed as shown in Tables III and IV. These tables reveal that only a small percentage of the Last.fm catalog is classified. That is, only 3.8 percent of the items available in their collection had actually been tagged in a universe of 100 million songs registered on the web site by 2008[6].

    At the same time, tags are concentrated on a few items: the most popular artists receive the largest number of tags, and one artist alone can receive thousands. Figure 1 shows the concentration of tags in relation to the artists' popularity.

    Table III. Distribution of tags on Last.fm

    Total of tags      > 50 million
    Tagged songs       > 50% (> 25 million)
    Tagged artists     > 40% (> 20 million)
    Tagged albums      < 5% (> 2.5 million)
    Tagged labels      < 1% (< 500,000)
    Others             < 4% (< 2 million)

    Source: Lamere and Pampalk (2008)

    Table IV. Statistics of the use of tags on Last.fm

    Single tags                                   > 1.2 million
    Single tags applied to more than ten items    > 130,000
    Tagged items                                  > 3.8 million
    Tags created per month                        > 2.5 million
    Taggers per month                             > 300,000

    Source: Lamere and Pampalk (2008)


    Table IV shows that the number of single tags created and used by just one user is very significant and related to the users classifying contents based on personal criteria. On the other hand, the last line of the same table indicates that there are 300,000 taggers per month, out of a total of 30 million Last.fm users, which, at first glance, would suggest a monthly participation of only 1 percent of the listeners[7]. Nevertheless, the number of users involved should not be measured by its monthly frequency, for there is a risk of interpretative distortions.

    According to a study carried out by MIR Research (2008), the total number of people who created some sort of tag in order to classify songs on Last.fm's system corresponds to approximately 60 percent of its user network. This means that the majority of users classify, or have classified, music content on Last.fm at a given moment. Table V shows the percentage of non-taggers that use Last.fm and the average vocabulary of these users per age.
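
    The participation figures discussed above measure different things, and a short calculation makes the contrast explicit. The sketch below only reproduces the arithmetic from the numbers quoted in the text (3.8 percent of a catalog of 100 million songs, 300,000 monthly taggers out of 30 million users, and the cumulative share of about 60 percent reported by MIR Research); the variable and function names are illustrative.

```python
def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Figures quoted in the text (rounded by the sources).
tagged_items = 3_800_000      # items tagged at least once
catalog_size = 100_000_000    # songs registered by 2008
monthly_taggers = 300_000     # users tagging in a given month
total_users = 30_000_000      # registered Last.fm users
ever_tagged_share = 60.0      # percent, approximate (MIR Research, 2008)

coverage = percent(tagged_items, catalog_size)
monthly = percent(monthly_taggers, total_users)

print(f"Catalog items tagged: {coverage:.1f}%")
print(f"Users tagging in a given month: {monthly:.1f}%")
print(f"Users who have ever tagged: about {ever_tagged_share:.0f}%")
```

    The monthly rate (1 percent) and the cumulative share (about 60 percent) describe the same community; the first counts activity within one month, the second counts anyone who has tagged at any time, which is why the monthly figure alone understates participation.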

    Figure 1. Number of tags versus artists' ranking

    Table V. Age and users' behavior on Last.fm

                       Non-taggers       Taggers' average
    Age                n       %         vocabulary size
    14-19 years old    41.3    48.1      6 tags
    19-22              37.5    43.6      7
    22-25              40.6    47.3      9
    25-30              34.8    41.4      8
    30-60              28.4    36.0      13
    Total average              39.4      8.6 tags

    Source: MIR Research (2008)


    Table V shows that older users have a broader vocabulary when categorizing songs with tags. This phenomenon could be related to the fact that the older the users, the greater the chance of their having the knowledge and cultural experience required to classify and retrieve information.

    Although there are endless possibilities for creating tags, some categories of classification tend to become more popular and legitimate than others among the user community. The following table introduces the 20 most common tags used on Last.fm to classify and search songs within its recommender system.

    Table VI indicates that most of the popular tags correspond to a music sub-genre. Although the data it displays represent only a small portion of the number of tags existent on Last.fm, the tag "seen live" is an example of the social use and values acquired by music among a user community. This means that the second most popular tag on the web site is a category that has very little chance of being incorporated into the Industry-controlled vocabulary.

    Another important point is that some of the most popular tags among Last.fm listeners represent genres that are underrepresented or secondary in the music industry's catalogs, for example the tags "metal", "punk", "ambient" and "experimental".

    Table VI. Ranking of most used tags on Last.fm

    Ranking   Tag                  Users who created the tag  Times the tag was used on Last.fm
    1         Rock                 220,754                    2,310,160
    2         Seen live            77,839                     1,274,563
    3         Alternative          144,219                    1,171,277
    4         Indie                139,013                    1,080,697
    5         Electronic           121,320                    975,884
    6         Pop                  104,846                    875,596
    7         Metal                90,804                     676,244
    8         Female vocalists     76,169                     616,712
    9         Classic rock         72,047                     548,693
    10        Alternative rock     88,148                     546,356
    11        Jazz                 75,862                     539,684
    12        Punk                 7,207                      513,106
    13        Indie rock           75,088                     467,458
    14        Folk                 70,082                     399,981
    15        Singer-songwriter    53,493                     387,446
    16        Ambient              66,232                     379,526
    17        Hip-hop              9,042                      367,942
    18        Experimental         62,467                     366,729
    19        Hard rock            58,966                     365,068
    20        Dance                62,312                     357,844

    Note: Information retrieved at Last.fm's web site (www.lastfm.com), section "Charts", sub-section "Top tags". Data collected on February 8, 2009, available at: www.lastfm.com.br/charts/toptags
    Source: Last.fm (www.lastfm.com)

    Last.fm users tend to classify in broader terms (especially when it comes to genre) than they search: information seeking tends to be more specific, relying on tags that represent styles or sub-genres, as demonstrated in Tables VII and VIII. Nevertheless, when the tag distribution is closely examined, the importance of both the music genre and the sub-genre is less striking. Over 38 percent of the categories attributed to songs correspond to users' mood or opinion about these songs, as indicated in Table IX.

    Table IX indicates that the tags corresponding to subjective categories (such as opinion and mood) are constantly created by users in order to categorize songs. However, when comparing Tables VII-IX, it is possible to infer that whereas artist classification is determined mainly by a cognitive perception of music genres, the classification of songs is strongly influenced by contexts of use and feeling.

    Table VII. Frequency of artists' classification on Last.fm

    Type of tag      Frequency of classification (%)  Examples
    Music genre      68                               Heavy metal, punk
    Locale           12                               French, Seattle
    Mood             5                                Chill, party
    Opinion          4                                Love, favorite
    Instrumentation  4                                Piano, female vocal
    Style            3                                Political, humor, psychedelic
    Other            2                                Coldplay, composers
    Personal         1                                Seen live, I own it
    Organizational   1                                Check out, to buy
    Total            100

    Source: Lamere (2008)

    Table VIII. Classification versus frequency of search hits per artist on Last.fm

    Type of tag      Frequency of classification (%)  Frequency of search per tag (%)
    Music genre      68                               51
    Locale           12                               7
    Mood             5                                4
    Opinion          4                                2
    Instrumentation  4                                5
    Style/sub-genre  3                                26
    Personal         1                                0
    Organizational   1                                0
    Period           1                                3
    Other            1                                2

    Source: Bosteels et al. (2008)

    Table IX. Frequency of classification of songs on Last.fm

    Type of tag   Frequency (%)  Examples
    Music genre   23.8           Heavy metal, punk
    Locale        3.9            French, Seattle
    Mood/opinion  38.8           Party, Love, favorite, Relaxing
    Style         10.7           Piano, female vocal
    Other         22.8

    Source: Thompson (2008)


    Last.fm versus AllMusic: comparative analysis of popular musical domains

    This paper applies the comparative methodology known as domain analysis (Hjørland, 2002), inspired by the method used by Thompson (2008), in order to evaluate the differences and similarities between a collaborative classification of popular music and the controlled vocabulary patterns used by the recording industry.

    The domain analysis was carried out using classification-related data from two web sites, Last.fm and AllMusic, with the intention of comparing the user-created folksonomy (represented by Last.fm) with the controlled vocabulary used by the Music Industry (represented by AllMusic).

    This comparative method was considered the most adequate to fulfill the goals of the research due to the huge amount of data available and its unobtrusive quality; that is, the decisions about vocabulary and the use of tags can be observed without any interaction or interference by the researcher.

    AllMusic.com

    AllMusic was chosen to represent a popular music classification system based on controlled vocabulary, which was built and organized by an editorial staff of experts and music critics.

    The music classification available on AllMusic is broadly used as reference for the organization of catalogs and publications within the Music Industry, as mentioned before. As a result, its vocabulary is invested with legitimacy and authenticity in the sphere of commercial categorization. Browsing through AllMusic allows interested users to search for songs, artists and albums through links such as popular genres, instruments, country, mood and theme.

    The method used at AllMusic for classifying songs introduces considerable differences in relation to a traditional categorization system such as the Library of Congress Subject Headings (LCSH). The LCSH focuses mainly on differentiating genres and providing a certain degree of geographic information. The AllMusic web site has a broader vocabulary, which includes the subjective information created by editors (like the mood and themes tags).

    The categories used by AllMusic (see Table I) were grouped in four major facets in order to simplify the comparison with the Last.fm vocabulary. Thus, "music genres" and "music styles and sub-genres" compose only one facet, named "genre".

    The categories known as "mood" and "themes" were put together under the facet called "opinion", since they are based on the editors' subjective perception when classifying music. Therefore, in order to carry out a comparative analysis between the domains, the AllMusic facets were organized as follows:

    (1) Genre: vocabulary terms that describe the music genre or sub-genre to which a song or an artist belongs.

    (2) Audio attributes: vocabulary terms that describe the predominant instrument in a song or in an artist's work.

    (3) Place/geography: vocabulary terms that associate the work of music or artist with a country, region or any other geographic indicator.

    (4) Opinion: vocabulary terms that represent subjective data or the AllMusic editors' opinions (including mood and themes).
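
    The grouping of AllMusic categories into four facets can be sketched as a simple lookup. This is only an illustration: the dictionary keys, the function name `facet_for` and the handling of the separately studied "similar artists" category are assumptions made for the example, not part of the original study's tooling.

```python
from typing import Optional

# Hypothetical lookup from AllMusic category names to the four facets
# described above; the key spellings are illustrative assumptions.
FACET_OF_CATEGORY = {
    "music genre": "Genre",
    "music style": "Genre",
    "sub-genre": "Genre",
    "instrument": "Audio attributes",
    "instrumentation": "Audio attributes",
    "city": "Place/geography",
    "country": "Place/geography",
    "locale": "Place/geography",
    "mood": "Opinion",
    "theme": "Opinion",
    "themes": "Opinion",
}

# The paper examines this category separately, so it gets no facet here.
EXCLUDED = {"similar artists"}


def facet_for(category: str) -> Optional[str]:
    """Return the facet for a category, None if excluded, else the fallback."""
    key = category.strip().lower()
    if key in EXCLUDED:
        return None
    return FACET_OF_CATEGORY.get(key, "Others/No category")


print(facet_for("Mood"))             # Opinion
print(facet_for("Similar artists"))  # None
```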


    In Table X, the category "similar artists" (mentioned in Table I) was excluded from the group of facets in order to be studied separately in section 4.2.

    The vocabulary terms that do not represent any of the above-mentioned categories were classified as "Others/No category". The four-facet system is used in the empirical research as the base for measuring the differences between AllMusic's controlled vocabulary and Last.fm's folksonomy, which can be explained as follows.

    Last.fm

    With the intention of attaining a broad understanding of the classification activities carried out by users, four sets of data were gathered from the Last.fm web site. Data Set #1 (DS-1) consists of the 150 tags most used by users according to the Last.fm database[8].

    In order to further investigate the classification activities performed on Last.fm, a second set of data was required to examine the uses of tags at the solo artist level. Therefore, 11 artists were chosen among the most popular on Last.fm. For each of these artists, the five main tags associated with them were collected. This set of five tags per artist is called Data Set #2 (DS-2)[9].

    The selection process of the artists to be analyzed was deliberate. Artists were chosen instead of songs because on AllMusic the songs are classified only by music genre, leaving the other categories aside. The artist classification, by contrast, is richer and more robust, including all the existing facets in its controlled vocabulary, a setting that enables a broader and more complex comparison with the Last.fm folksonomy.

    Before identifying the artists to be used in the comparative analysis, 11 tags were chosen among the most popular tags on Last.fm. Each of these tags corresponds to an industry-standard music genre, representing a different segment of world popular music: Rock, Folk, Pop, Electronic, Country, Hip-Hop, R&B, Jazz, Blues, Hardcore and Latin. For each of these tags/genres, the most popular artist in March 2009 (according to Last.fm) was selected (see Table XI).

    In order to avoid a narrow analysis, and considering that the most popular artists usually have over twenty thousand tags on Last.fm (see Figure 1), the analysis was broadened to include more than just the five most used tags per artist, which normally correspond to music genres and thus exclude other facets of the analysis. Therefore, the 60 most used tags to classify each one of these 11 artists were also considered[10]. This set of 60 tags per artist constitutes Data Set #3 (DS-3). The nature of the first three data sets to be analyzed is indicated in Table XII.

    The first methodological procedure carried out in order to compare the vocabularies was the organization of the tags on data sets DS-1, DS-2 and DS-3 according to the four

    Table X. Organizing the AllMusic facets

    AllMusic facets                             Grouping of the four major facets
    Music genre/music styles (sub-genres)       Genre
    Types of music instrument/instrumentation   Audio attributes
    City/country/locale                         Place/geography
    Mood/themes                                 Opinion
    Similar artists                             [disregarded for separate examination in section 4.2]

    Source: Adapted from AllMusic (2009c)


    facets available on AllMusic (see Table I). After matching the tags and the facets, a comparison was carried out in order to establish the similarities and differences between the vocabulary employed by Last.fm users and the AllMusic commercial vocabulary. The next section explains the codification method and the comparative analysis of the domains, presenting the results found.

    4.1 Method 1: comparison between the users' and the industry's classification categories

    The first method of codification involves comparing the three data sets with the vocabulary of the four AllMusic facets. Each tag in each of the three sets (DS-1, DS-2 and DS-3) was interpreted and assigned to one of the AllMusic facets: Genre, Audio Attributes, Place/Geography or Opinion.

    When a tag could not be placed under any of those facets, it was allocated to the "Others/No Category" facet. After matching the tags and the facets, one of the following labels was attributed to each tag in order to assess the degree of compatibility between each domain:

    • (Y)es: The tag taken from Last.fm appears in the AllMusic vocabulary exactly as it is written.

    • (P)artial: The tag taken from Last.fm is partially found in the AllMusic vocabulary. The criteria for this label include variations in spelling, word-combining, synonyms and other similarities.

    • (N)o: The tag taken from Last.fm does not appear in the AllMusic vocabulary.
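
    The three labels can be approximated programmatically, although the study assigned them by interpretation. The sketch below is an assumption-laden illustration: the function names, the 0.8 similarity cutoff and the use of Python's `difflib` for spelling variants are choices made for this example, and synonyms (which the paper also counts as partial matches) are not handled.

```python
import difflib
import re


def normalize(term: str) -> str:
    """Lowercase and collapse punctuation so 'Hip-Hop' and 'hip hop' compare equal."""
    return re.sub(r"[^a-z0-9&]+", " ", term.lower()).strip()


def compatibility_label(tag: str, vocabulary: list[str], cutoff: float = 0.8) -> str:
    """Return 'Y', 'P' or 'N' for a tag against a controlled vocabulary.

    Illustrative heuristic only; the cutoff is an arbitrary assumption.
    """
    if tag in vocabulary:
        return "Y"  # appears exactly as written
    norm_tag = normalize(tag)
    if not norm_tag:
        return "N"
    norm_vocab = [normalize(term) for term in vocabulary]
    # Spelling or punctuation variants, including near matches.
    if norm_tag in norm_vocab or difflib.get_close_matches(
        norm_tag, norm_vocab, n=1, cutoff=cutoff
    ):
        return "P"
    # Word-combining: one term's words are all contained in the other's.
    tag_words = set(norm_tag.split())
    for term in norm_vocab:
        term_words = set(term.split())
        if term_words and (term_words <= tag_words or tag_words <= term_words):
            return "P"
    return "N"


vocabulary = ["Rock", "Hip-Hop", "Jazz"]  # hypothetical controlled vocabulary
for tag in ["Rock", "hip hop", "indie rock", "seen live"]:
    print(tag, compatibility_label(tag, vocabulary))
```

    A heuristic of this kind would only pre-sort tags; borderline cases and synonyms would still need the manual judgment the method describes.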

    It is important to stress that most of the tags on Last.fm are written in English, which is the primary language on the web site and on the internet as a whole. Even with a global

    Table XI. Most popular artists by genre on Last.fm

    Genre       Most popular artist  Number of listeners  Number of plays
    Rock        Coldplay             2,162,574            105,313,971
    Folk        Bob Dylan            1,194,990            48,898,452
    Pop         Madonna              1,185,293            36,849,612
    Electronic  Depeche Mode         1,155,188            43,290,636
    Country     Johnny Cash          1,050,251            34,419,422
    Hip-Hop     Kanye West           986,279              39,072,857
    R&B         Rihanna              953,443              21,596,786
    Jazz        Norah Jones          875,199              21,866,624
    Blues       Tom Waits            600,978              27,033,292
    Hardcore    Rise Against         554,635              38,242,584
    Latin       Manu Chao            535,291              15,757,151

    Note: The data were extracted on March 29, 2009 from www.lastfm.com.br/music/+tag/
    Source: Last.fm (2009)

    Table XII. Nature of the data sets extracted from Last.fm

    Data set  Spreadsheet no.       Nature of data
    DS-1      Spreadsheet 1         150 most used tags on Last.fm
    DS-2      Spreadsheets 2 to 7   Five most used tags per most popular artist in each genre
    DS-3      Spreadsheets 8 to 20  60 most used tags per most popular artist in each genre


    user network, most of Last.fm's listeners are geographically concentrated in the USA and Great Britain[11]. Therefore, all the tags written in English were included in the analysis and compared to the AllMusic vocabulary written in the same language. Tags annotated in other languages were discarded.

    Result 1. The first three sets of data collected presented different results when matched to the AllMusic facets. As indicated in Table XIII, in DS-1 the 150 most popular tags correspond mainly to the "music genre" facet, a result that coincides approximately with the statistics found by Lamere (2008)[12].

    In DS-2, composed of the five most used tags per artist, the presence of the music genre is even stronger, representing 81.80 percent. This means that users classify the artists firstly by "genre" (including here the music style or sub-genre), and secondly by "Audio Attributes", a result that concurs with the statistics presented by Bosteels et al. (2008), as shown in Table VIII.

    However, when the analysis is broadened to include DS-3, which is composed of the 60 most used tags, it can be observed that the categories corresponding to the "opinion" and "others/no category" facets account for a substantial portion of the artist classification.

    Result 2. In relation to the compatibility of the tags with the controlled vocabulary on AllMusic, some considerable differences were found between the Industry-standard commercial classification and the user-defined classification. Table XIV shows the degree of compatibility between the data sets collected from Last.fm and the AllMusic vocabulary.

    Analysis of DS-1 suggests that the combination of the first two lines ("yes" and "partial") indicates a high degree of compatibility between the domains. That is, 72.6 percent of the 150 most popular tags according to Last.fm users also exist in an identical or similar way in the AllMusic vocabulary.

    Combining the total and the partial compatibility levels of tags in DS-2, it is noticeable that 60 percent of the vocabulary used by listeners to define the music genres coincides with the vocabulary established by AllMusic for the same artists. It is important to note that the cases of partial compatibility occur due to the fact that the categories used by the Industry are more generic, whereas the categories used by listeners are more specific and usually make use of more than one word to define a genre.

    Table XIV. Compatibility between AllMusic's controlled vocabulary and Last.fm's folksonomy

    Degree of compatibility  DS-1 (%)  DS-2 (%)  DS-3 (%)
    (Y)es                    42.0      20        18.18
    (P)artial                30.6      40        9.43
    (N)o                     27.3      40        72.39

    Table XIII. Analysis of the facets matching the data extracted from Last.fm

    Facets              DS-1 (%)  DS-2 (%)  DS-3 (%)
    Genre               62.0      81.80     42.87
    Audio attributes    5.3       12.70     8.93
    Place/geography     6.6       0         8.03
    Opinions            17.3      0         25.63
    Others/no category  8.6       5.50      14.54


    However, DS-3 shows that when all the categories available on AllMusic are considered, especially the facet known as "opinion", only 27.61 percent of the tags match the AllMusic vocabulary (combining total and partial compatibility, that is, 18.18 plus 9.43 percent in Table XIV). This means that the subjective categories used by Last.fm listeners in order to classify the same artists differ considerably when compared to the classification used by the music industry, with a degree of incompatibility of 72.39 percent between the two vocabularies.

    Therefore, the comparison between the three groups shown in Table XIV leads us to a dual interpretation. On the one hand, DS-1 and DS-2 indicate that most of the tags used on Last.fm exist or have a similar annotation on AllMusic, the music industry standard. This result reinforces the idea that the metadata found in the folksonomy for representation of music genres are more reliable in terms of music classification than previously speculated in other studies. Consequently, if the appropriate codification and analysis is carried out, collaborative classification can offer cohesive metadata and information for the organization of popular music, especially as regards music genre.

    However, the degree of compatibility between the vocabularies on Last.fm and AllMusic falls when the data selected for analysis are examined in more detail. For example, DS-1 corresponds to the set of the 150 most used tags on Last.fm, and it differs from the AllMusic classification by 27.3 percent. DS-2 is composed of 55 tags (5 tags x 11 artists/genres), but it calls for a more detailed comparison as regards the artists, where there is a 40 percent difference in relation to the Industry vocabulary, more than in DS-1. Similarly, DS-3 corresponds to a group of 660 tags (60 tags x 11 artists/genres), with the aim of establishing an even more rigorous analysis between the folksonomy and the controlled vocabulary, resulting in a degree of incompatibility of 72.39 percent.

    Thus, the distance between the controlled vocabulary and the folksonomy increases when the corpus for the analysis is broadened. Consequently, the comparison between the three sets of data suggests that the more data considered and the more meticulous the methodology, the greater the difference between user-classification and industry-classification is going to be.
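
    The arithmetic behind these comparisons can be checked directly from the values reported in Table XIV. The following is a small worked sketch; the dictionary layout and function name are illustrative assumptions, while the percentages are those given in the table.

```python
# Percentages copied from Table XIV (Y = yes, P = partial, N = no).
TABLE_XIV = {
    "DS-1": {"Y": 42.0, "P": 30.6, "N": 27.3},
    "DS-2": {"Y": 20.0, "P": 40.0, "N": 40.0},
    "DS-3": {"Y": 18.18, "P": 9.43, "N": 72.39},
}


def combined_compatibility(row: dict) -> float:
    """Total plus partial compatibility, rounded to two decimals."""
    return round(row["Y"] + row["P"], 2)


for name, row in TABLE_XIV.items():
    print(name, "compatible:", combined_compatibility(row), "incompatible:", row["N"])

# Incompatibility rises from DS-1 to DS-2 to DS-3.
incompatibility = [TABLE_XIV[name]["N"] for name in ("DS-1", "DS-2", "DS-3")]
print(incompatibility == sorted(incompatibility))
```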

    This discrepancy between social use and commercial criteria for classification, which tends to grow at a similar pace to the growth and popularization of the folksonomy on the internet, represents a crisis for the music representation models shaped by the recording industry. There are heated disputes between the symbolic system of commercial categorization and listeners' cultural practices, especially as regards the subjective classification present in both vocabularies.

    As the users start to classify and represent the information according to their perceptions, affections and music habits, classification of this type of work tends to distance itself from the commercial patterns that guide the listeners' cognitive processes and the social uses of the music. On the one hand, the differences between vocabularies can be interpreted as "lack of knowledge" or "lack of cultural competence" on the users' part. On the other, such differences may reveal a subversive resistance to the commercial classification categories or a multiplicity of social uses that does not fit the music industry standards of classification. This multiplicity has always existed at the level of social dynamics, but is often neglected or hidden by commercial interests and limitations.


    Result 3. In this section, the same results from DS-2 and DS-3 shown in Table XIV are presented again, this time broken down by each of the eleven genres investigated (see Table XI). The aim of this analysis is to identify the key mechanisms involved in the constitution of the Last.fm genres.

    Anand and Peterson (2000) suggest that when it comes to competitive areas in the field of popular music, the market works as a magnetic force around which groups with the same interests are consolidated, and that the cognitive perception of genres occurs through the creation, dissemination and interpretation of market information by social groups. This argument is based on the production-of-culture perspective, a theoretical approach of cultural sociology consolidated in the 1970s that focuses on how the symbolic elements of culture are shaped by the systems in which they are created, distributed, evaluated and preserved (Peterson and Berger, 1975; DiMaggio, 1977; Blau, 1989; Crane, 1992; Bourdieu, 1993; Fligstein, 1996; Peterson and Berger, 1996; Du Gay, 1997; Anand and Peterson, 2000; Peterson, 2001; Peterson and Anand, 2004; Lena and Peterson, 2008).

    This sociological perspective has been successfully applied to a range of quite different situations, especially those in which the manipulation of symbols is a by-product (as in the case of popular music) rather than the purpose of collective activity (Crane, 1992; Peterson, 2001; Peterson and Anand, 2004). The production-of-culture perspective gives rise to two explanations for the degree of proximity between the genre-based classification employed by users and the industry standard classification system.

    First condition: the compatibility between the classification of genres on Last.fm and the categorization established by the industry is determined according to two variables:

    (1) The greater the market penetration and size of a given genre for the cultural industries, the stronger the influence of commercial classification.

    (2) When the limits of the genre in question are socially defended by its public, they tend to be equally defended by the industry as a market segment. That is, when the genre limits are strongly associated with cultural practices, the commercial classification tends to reproduce the socially generated genre classification. In this case, it is the industry that adapts its classification system to match the social use, not the other way round.

    Second condition: the difference between what users and the industry understand when classifying a music genre might be related to one or more of the three factors listed below. The genre with the lowest degree of compatibility between Last.fm and AllMusic might represent:

    (1) A smaller market, when compared with the other genres analyzed.

    (2) A large market, but the Industry's classification is too broad and/or lacks differentiation, while the folksonomy considers the genre as broken down into more precise and specific categories, based on real social uses.

    (3) A ritual classification at the social level that tries to invert or distinguish itself from the commercial classification. That is, an anti-commercial classification cultivated by markedly distinct groups that intend to demarcate their taste boundaries in relation to other groups that might be subject to industry influence.


    Figure 2 shows the results of the compatibility between Last.fm and AllMusic per music genre, found in the analysis of DS-2. It should be underlined that the tags that compose DS-2 refer only to music genres and audio attributes, and not to the place/geography and opinion facets.

    Figure 2 is presented in decreasing order, from the genre with the highest degree of incompatibility to the genre with the lowest degree. The white area indicates incompatibility between the vocabularies:

    Based on a comparative interpretation, Figure 2 was analyzed by taking the two extreme cases, that is, the genres with the highest and the lowest degrees of compatibility in DS-2. At the left end of the graph is the genre "Latin music", which shows the highest degree of incompatibility between DS-2 and AllMusic. The genre known as "Latin (American) music", often abbreviated to "Latin music", refers to music produced in all Latin American countries (including the Caribbean). This music category covers a broad variety of styles, such as the rural music from the northeast of Mexico, Cuban habanera, Argentinean tango and even symphonies by the Brazilian classical composer Heitor Villa-Lobos, among many others (Starr and Waterman, 2006).

    According to the industry's commercial criteria, Latin-American music includes songs sung in Spanish, Portuguese and in Creole languages from Haiti. This music classification category used by the industry is excessively vague and confusing, referring more to markets identified by language and geography than to the musical styles themselves.

    For example, even the popular music genres from European countries such as Spain and Portugal end up under the "Latin music" label, because of the language in which they are sung (Starr and Waterman, 2006). Despite the language criteria, instrumental music composed by Latin Americans also falls under this category, according to the music industry (Negus, 1999; Gebesmair, 2001).

    Therefore, the category named "Latin music" attempts to cover a huge heterogeneity at both the supply level (artists, songs, styles) and the demand level (different audiences and tastes). It could be said that this is a fictional category, created to satisfy the needs and the boundaries created by the industry, and that consequently it bears little correspondence to the categories created by Last.fm users.

    Figure 2. Compatibility of DS-2 with the AllMusic vocabulary by genre


    When compared to rock, R&B and country, which together represent a considerable percentage of the music market, the category known as "Latin music" is a relatively small market niche in commercial terms; this is the case even in the USA, the biggest music market in the world, which has a substantial Hispanic community. Considering that corporations have a limited amount of resources to distribute among all the genres in their portfolios, this music category receives little industry investment (Negus, 1998).

    Until 1997, the RIAA (Recording Industry Association of America) had never published the official US sales figures for Latin music. From 1998 onwards, Latin music was included in the statistics under the label "others" (Negus, 1998), and incorporated by the recording industry as a music genre in subsequent years, once this niche market began to grow.

    At the other end of Figure 2 is the genre "hardcore", which has a very specific public with a very strong sense of identity, based on their music taste. The social groups that identify themselves as "hardcore" use different forms of cultural expertise to define themselves and to recognize their peers and outsiders. The social boundaries of hardcore fans are expressed by details that range from their physical appearance (hairstyle, clothing) to their social practices and interaction.

    The boundaries between these communities of taste are socially ritualized through genre barriers. Since the emergence of hardcore, the strength of this style's boundaries was intensively cultivated by small social groups around a cultural identity, while the music industry still considered it a sub-genre or a variation of rock music (Bryson, 1996; Peterson, 2004).

    This social phenomenon forced the industry to set hardcore aside as a separate genre with its own target audience. That is, the power of the boundaries established by hardcore music is cultivated by its consumers, who pressure the industry to adapt the commercial classification system to this new market, which is why the degree of compatibility between the vocabularies is higher in this case.

    Figure 3 shows the results of the degrees of compatibility found in the analysis of DS-3 for each of the chosen genres. Unlike DS-2, DS-3 includes all the facets for the analysis of the tags (genres, audio attributes, place/geography, and opinion). As mentioned in relation to Table XIII, this set of data shows a significant amount of tags that correspond to the categories "opinion" and "others".

    Figure 3. Compatibility of DS-3 with the AllMusic vocabulary


    Figure 3 is organized in decreasing fashion, from the genre with the highest degree of compatibility to the genre with the lowest degree of compatibility. The organization of the results, which privileges compatibility over differences, was intentional. For this comparative method the similarities are more important than the differences between the Last.fm classification and the AllMusic vocabulary, as the number of tags analyzed (60 per artist) constitutes a much larger set of data than the one found in the AllMusic categories (which has around 30 categories for each artist).

    The main aim in considering more tags for this analysis was precisely to enhance the probability of finding more compatible vocabularies. When analyzing the differences, there was a risk of distorting the results, seeing as the number of categories in relation to the domains analyzed is disproportional. The compatibility between the vocabularies is represented by the black areas in the graph.

    On the far right, again, is the hardcore genre. Contrary to Figure 2, where the genre shows more compatibility with the AllMusic vocabulary, in this set of data the exact opposite occurs. Proportionally, this is the genre where the tags differ most from the industry-generated categories.

    In order to analyze such a huge difference, it is necessary to remember that DS-3 is composed of a significant amount of subjective tags, whereas DS-2 is composed only of music genres and audio attributes. In DS-2 (see Figure 2) none of the tags that apply to hardcore artists are entirely compatible with the AllMusic vocabulary; in other words, all the tags are only partially similar. The results for this data set reveal that those who listen to hardcore value a more precise classification than that established by the industry, which defines broader genres.

    Listeners of hardcore classify artists with more specific tags, expressing their knowledge and cultural competence. This means that the users/taggers of hardcore are those who "best" classify by genre. On the other hand, they cultivate music opinions that are opposed to those spread to the masses by the music market.

    In relation to the more subjective categories in DS-3 (Figure 3), the hardcore public is characterized by having strong social barriers. This means that aficionados of this type of music usually want to set themselves apart from other communities of taste (Peterson, 2004). This explicit distinction between Last.fm users is related to a subjective classification that is intentionally "anti-commercial", especially as regards the opinion facet.

    Therefore, while belonging to this group requires in-depth knowledge about artists in terms of genres, sub-genres and styles, its members also attempt to maintain distance from the industry-promoted commercial classification. These are strongly differentiated groups who cultivate their boundaries of taste alongside those of other similar groups[13].

    On the far left of Figure 3 is the genre called "rhythm and blues", which is abbreviated to R&B. The abbreviated form is commonly used both by the industry and by users, though many market reports refer to the genre as "urban music", "urban contemporary" or even "contemporary R&B".

    Despite its African origin, R&B has been incorporated by the industry as a western popular music genre used to classify songs and artists that combine elements and instruments from other genres such as soul, funk, dance and hip hop (Starr and Waterman, 2006).


    Together with country, alternative and rap, "urban music" or R&B accounted for 50 percent of all record sales in 2008; 247 million records were sold that year (Nielsen Company, 2009, p. 35). Although record sales have been falling since the year 2000, as shown in Figure 4, R&B is still one of the most profitable music genres for the recording industry[14].

    Figures 3 and 4 reinforce the hypothesis that the greater the market influence and size of a given genre, the stronger the influence of commercial classification on the users' sensorial-cognitive perception. The best-selling genre for the industry (R&B) is also the genre with the highest percentage of tags on Last.fm that are compatible with the commercial classification.

    4.2 Method 2: differences between user recommendation and Industry recommendation

    Both Last.fm and AllMusic indicate or recommend artists similar to those selected by the user. The list of artists considered similar on Last.fm is based on two methods. The first criterion is based on the listeners' music habits. If many users listen to artist X and also to artists Y and Z, then artists Y and Z can be identified as being similar to artist X[15]. Therefore, as the uses change, similarities may vary too.
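
    The first criterion can be illustrated with a co-listening count. This is a minimal sketch under stated assumptions: the data layout (a mapping from user to the set of artists they listen to), the function names and the raw-count ranking are hypothetical, and Last.fm's actual computation is not described in the paper.

```python
from collections import Counter
from itertools import combinations


def co_listening_counts(listening: dict[str, set[str]]) -> Counter:
    """Count, for every pair of artists, how many users listen to both."""
    pair_counts: Counter = Counter()
    for artists in listening.values():
        for a, b in combinations(sorted(artists), 2):
            pair_counts[(a, b)] += 1
    return pair_counts


def similar_to(artist: str, listening: dict[str, set[str]], top_n: int = 5) -> list[str]:
    """Rank other artists by the number of listeners shared with `artist`."""
    scores: Counter = Counter()
    for (a, b), n in co_listening_counts(listening).items():
        if a == artist:
            scores[b] += n
        elif b == artist:
            scores[a] += n
    return [name for name, _ in scores.most_common(top_n)]


# Hypothetical listening data: users u1 to u3 and artists W, X, Y, Z.
listening = {"u1": {"X", "Y", "Z"}, "u2": {"X", "Y"}, "u3": {"X", "W"}}
print(similar_to("X", listening, top_n=1))  # ['Y']
```

    Because the scores depend only on current listening data, the ranking shifts whenever listening habits change, which matches the observation that similarities vary with use.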

    A second function is added to this similarity equation to make Last.fm's recommender system more precise. An important method used to link similar artists within a system is the use of the same tags to qualify different artists. Therefore, when different artists have a considerably high frequency (of both classification and access) through certain common tags, those artists are considered similar on Last.fm.
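
    Tag-based similarity of this kind is often expressed as a cosine similarity between tag-frequency vectors. The sketch below uses that measure purely as an illustrative assumption; the paper does not state which formula Last.fm applies, and the tag counts shown are invented for the example.

```python
import math


def tag_cosine(tags_a: dict[str, int], tags_b: dict[str, int]) -> float:
    """Cosine similarity between two artists' tag-frequency profiles."""
    shared = set(tags_a) & set(tags_b)
    dot = sum(tags_a[t] * tags_b[t] for t in shared)
    norm_a = math.sqrt(sum(v * v for v in tags_a.values()))
    norm_b = math.sqrt(sum(v * v for v in tags_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an artist without tags is similar to no one
    return dot / (norm_a * norm_b)


# Hypothetical tag counts for three artists.
artist_a = {"rock": 30, "indie": 10}
artist_b = {"rock": 25, "alternative": 5}
artist_c = {"jazz": 40}

print(tag_cosine(artist_a, artist_b) > tag_cosine(artist_a, artist_c))  # True
```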

    AllMusic also indicates similar artists (divided into three groups: similar artists, followed and influenced by). Musical similarity or proximity between artists is defined by the web site's editorial staff and music critics, who are hired to analyze the contents

    Figure 4. Worldwide record sales in 2008


    and indicate who is related to whom[16]. This means that the similarity indicated by Last.fm's RS is based on users' listening habits, whereas on AllMusic it is based on the organization defined by specialists hired by the Industry.

    The intention of applying Method 2 is to identify whether similar artists indicated by AllMusic are also present on the similarities list shown on Last.fm. The aim is to verify whether the consumption profiles or market segments considered by the industry correspond to the users' tastes and uses. Therefore, the same artists mentioned in Method 1 (see Table XI) were compared in terms of the Last.fm and AllMusic recommendations[17].

    For each artist analyzed, Last.fm shows 200 similar artists, indicating the degree of similarity at five levels: "super high", "very high", "high", "medium" and "lower". The first three degrees of similarity were considered ("super high", "very high" and "high"), while on AllMusic the three categories of similar artists were analyzed ("similar artist", "followed" and "influenced by"), in order to broaden the spectrum of analysis as much as possible.

    Result 4. In DS-4, comparing the artist similarity indicated by AllMusic with the similarity indicated by the Last.fm system reveals a high discrepancy, as shown in Figure 5.

    Among the 822 artists analyzed in DS-4 (which correspond to 100 percent of the artists analyzed in this data set), only 92 are similar in both domains (10.5 percent). In total, 447 artists are similar only on Last.fm, whereas 283 are indicated as similar only on AllMusic. This means that the list of artists considered similar by the industry, which belong to the same market niche and/or audience segment, differs significantly from the artists associated by the users and appreciated by the same audiences.

    Figure 5 shows that, on average, 89.5 percent of the AllMusic recommendations differ from those created by the users' listening habits on Last.fm, showing a compatibility of only 10.5 percent between both domains. This result indicates that the similarity perceived by Last.fm users is based primarily on their music taste and less on the industry's market niche criteria, which organize artists by audiences that do not necessarily correspond to the real social uses.
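    The comparison behind Method 2 amounts to a set-overlap count between the two lists of similar artists. The sketch below shows that count for a single seed artist; the function name and the artist labels are hypothetical, and the choice to express the shared artists as a percentage of all distinct artists recommended by either source is an assumption, since the paper does not spell out how its averages were computed.

```python
def recommendation_overlap(industry_similar, user_similar):
    """Compare two lists of similar artists for the same seed artist.

    Returns how many artists appear in both lists, in only one of them,
    and the shared percentage of all distinct artists recommended.
    """
    industry = set(industry_similar)
    users = set(user_similar)
    both = industry & users
    union = industry | users
    return {
        "both": len(both),
        "users_only": len(users - industry),
        "industry_only": len(industry - users),
        "shared_pct": 100 * len(both) / len(union) if union else 0.0,
    }


# Hypothetical similar-artist lists for one seed artist.
allmusic_list = ["A", "B", "C"]
lastfm_list = ["C", "D", "E", "F"]
overlap = recommendation_overlap(allmusic_list, lastfm_list)
```

    Repeating the count for every artist in a data set and averaging the shared percentage gives one figure per domain pair, which is the kind of summary reported in Figure 5.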

    Result 5. In this section, the results of DS-4 are again presented, but this time broken down by genre. Figure 6 shows the comparison between the sets of similar artists indicated by Last.fm and AllMusic, and is organized in decreasing order: from the genre with the highest number of similar artists in common to the genre with the lowest number. The intersection between the two domains (artists in common) is represented by the gray area on the graph.

    On the far left of the graph is the genre country, followed by R&B. As indicated in Figure 4, country, R&B, alternative and rap accounted for 50 percent of global record sales in 2008 (Nielsen Company, 2009, p. 35). Considering that this study did not analyze alternative and rap genres, the two best-selling genres are also those with the highest degree of compatibility between the similar artist sets generated by the users and by the industry.

    Figure 5. Compatibility between Last.fm and AllMusic recommendations

    While record sales for most genres have been declining since 2000, country music experienced one of its best years for sales in 2006, with a record market growth of 126 percent, the highest sales increase among all genres. Meanwhile, R&B was the genre with the fastest growing sales in 2007, with a 54 percent rise (Nielsen Company, 2009).

    Interpretation of the data in DS-4 reinforces the hypothesis that the greater the influence of a given genre on the market, the stronger the influence of the commercial classification in terms of supply and demand. On the far right of Figure 6 is the genre electronic. Of all the genres, electronic music reports the lowest commercial profitability, considering sales of both physical records and songs available online, accounting for just a 3 percent market share. Alongside categories such as children, gospel, classical and new age, this genre is one of the minority segments of the music industry, as shown by the figures in Figure 7.

    Figure 6. Comparison of similar artists between Last.fm and AllMusic in DS-4

    Figure 7. Worldwide music sales (physical and digital) by genre in 2008


    Thus, when analyzing the data, one is faced with a paradoxical situation: although electronic music is the fourth most listened-to genre on Last.fm (see Table VI)[18], it is one of the industry's smallest markets, a situation that offers a few clues as to why the organization of similar artists, grouped within this genre by Last.fm users, does not correspond to the set of artists indicated as similar by AllMusic.

    One possible explanation for this phenomenon is that the electronic music production and distribution chain goes beyond the exclusive field of the traditional recording industry. For example, the electronic genre boasts the highest number of independent artists who make their songs available on Last.fm[19].

    One last observation is needed here, about the rock genre. As Figure 7 shows, rock is the best-selling genre both in physical and digital sales when considered alone. However, the comparative analysis carried out over the course of this research shows that rock occupies neither of the extremes in the figures presented; in other words, it has neither the highest nor the lowest degree of compatibility between the domains studied.

    This genre involves two contradictory factors that, when combined, might explain its position in the middle of the figure when comparing user classification with industry classification. Rock is the best-selling genre, which, according to the argument developed in this research, should mean that its characteristics favor similarity between the classifications of both domains. Yet rock is also an incredibly generic musical category according to the Industry's classification and to the statistics presented by the music market.

    According to the listeners, rock represents a music style with limited sub-genres; that is, it has highly differentiated ritual classifications when it comes to cultural practices, especially when it forms hybrid combinations with other genres, such as pop-rock, hard-rock, punk-rock etc.

    Crossing these factors generates a clash between two trends: while the greater market penetration brings user classification and commercial classification closer, the delicate boundaries that define this genre tend to make it highly differentiated, with multiple sub-divisions and hybrid associations with other genres. The combination of these two variables explains why rock occupies a median position in the results of the comparative analysis.

    The social dimensions of music genres: why commercial and non-commercial classification systems vary

    Popular music classification, whether defined by the industry or by internet users, has important implications for cultural practices that come and go in the social field. The challenge is to understand the process by which the similarities are perceived and the music genres defined. That is, the intention is to analyze why the commercial principles of categorization of music vary when compared with the criteria used by Last.fm listeners, and which social consequences this phenomenon might point to for the social uses of music emerging on the web.

    According to DiMaggio (1987), the procedures by which different genres are created and inserted into public habits, or deconstructed, are entirely related to the processes by which tastes are produced: firstly, as part of the construction of meaning for cultural products, and secondly, as structuring mechanisms for the activities that define the boundaries between social groups. This means that the social classification of genres is implied in the process by which music is classified and, at the same time, classifies.

    These two social processes come together in what DiMaggio (1987) calls an Artistic Classification System (ACS). The ACS refers to the way in which works of music or artists are divided, both in terms of cognitive perception and of consumer habits, by institutions involved in the music market that organize and set boundaries for the production and distribution of isolated genres.

    An ACS indicates the principles of relationship established between the genres and also between the artists in a particular environment. By doing so, an ACS reflects both the structure of the taste of a particular community and also the production and distribution of cultural goods (DiMaggio, 1987, p. 441).

    Four aspects of the ACS stand out in the relationship that exists between social organization and classification systems:

    (1) differentiation;

    (2) hierarchy;

    (3) universality; and

    (4) ritual classification.

    First of all, classification systems vary according to the extent to which the music is differentiated into the established genres. Second, they differ in the degree to which the genres are classified in a hierarchical manner, according to prestige. Third, the ACS varies according to the extent to which the classification can be considered universal, or differs between sub-groups and/or their members. Finally, the systems vary according to the power of the boundaries inside which the genres are ritualized, that is, socially cultivated. The ritualization of boundaries between genres is followed by the formation of groups or communities of taste whose social confines hinder the free circulation of music between genres and of genres between groups (DiMaggio, 1987).

    Each one of the dimensions has a cognitive and an organizational component. The highly differentiated ACSs are characterized by the identification of a broad variety of genres and, consequently, by the intense fragmentation of the music offered. In highly hierarchical ACSs, the genres vary according to social prestige, which is equivalent to the inequality of resources one may have for accessing and consuming music. The genres with the highest degrees of prestige are those that demand equally high economic power and/or cultural competence, thus excluding the less favored strata of society. Consequently, the genres with little prestige are those considered to be less important by those with better economic resources.

    Universal ACSs are characterized by the homogeneous way people recognize and classify the works. Finally, ACSs that have genres strongly determined by boundaries set by ritual classifications are characterized by agglomerations or social groups based on taste. The boundaries between these groups are socially ritualized through the maintenance of such boundaries between the genres, making it more difficult for songs, artists and consumers to move between them. This means that the power of the ritual classifications varies according to the rigidity of the boundaries between genres, and these boundaries correspond to a stratification of public taste in well-defined social groups.

    These four dimensions are closely related, but it is important to highlight the relationship between the first and the last: differentiation and ritual classification.


    Differentiation refers to the existence of a multiplicity and a variety of distinct genres, that is, a fragmented supply. However, according to DiMaggio (1987), the more differentiated the ACS, the weaker the genre division and, consequently, the social groups based on taste.

    Ritual classification is connected to the strength of the boundaries in the social field. A well-defined division between genres is cultivated by social groups that limit and are limited by taste. The boundaries between these groups correspond to a segmented demand, which in turn reinforces the limits of the genres at the production level. Therefore, the ritual strength of classification intensifies the fixation of the boundaries between works of music and people.

    As a consequence, the dimensions known as differentiation and ritual classifications are inversely proportional. The more differentiated the genres are, the smoother the circumscription between them and the weaker the ritual classifications, which in turn correspond to the weakest boundaries between the social groups based on taste (DiMaggio, 1987).

    These four dimensions (differentiation, hierarchy, universality and ritual classifications) are used as comparative parameters between the classifications found in the AllMusic and Last.fm domains. In order to adapt the nomenclature to the aims of this article, the expression Artistic Classification System (ACS) will be adopted to refer to the classification systems specifically in the popular music domain.

    The ACS found on AllMusic is called commercial ACS, for it represents a system that is broadly used by the music industry. The classification found on Last.fm is called social or non-commercial ACS, created and used by a network of listeners of this RS and therefore based on the social uses of online music.

    Having defined these concepts and the parameters for the analysis of the ACSs, one question comes to mind: why is it that the prospect of social uses of music (collective efforts based on cognitive and sensorial principles that are common to the cultural perception) implies an artistic classification theory? In this case, it can be argued that in the same way that people can be divided based on the music they like, the songs available for the public can also be separated into groups or genres, based on the people that choose and consume them.

    Classification into genres allows customers to invest in specialized knowledge and artists to allocate their work in the correct market. As demonstrated by Becker (1982), artists work based on kindred areas that form institutionalized art worlds, both in terms of supply and demand, with conventions that make production possible. Therefore, genre classifications socialize the infrastructure costs of artistic production (DiMaggio, 1987, p. 445).

    According to DiMaggio (1987), commercial interests usually strengthen ritual classifications, dividing society into groups, segments or niches, which aids the organization and social constitution of the genres. The most relevant example is that of the recording industry, with its age/class/race strata. The invention of music categories such as adult contemporary, tastemakers and Latin music by the music market has little to do with the need for new music genres and much more to do with commercial strategies of segmentation (or classification) of the audience.

    Although the arguments on which the genres are organized have various implications, the dynamic that springs from the ritual classifications is the relationship established between a socio-structural factor that influences cultural demand, the ways in which this demand is organized and how the cultural goods are classified by genres bestowed with social meaning.

    Blau (1977) and Schwartz (1981) have many propositions for each dimension of the artistic classification system (differentiation, hierarchy, universality and ritual classifications) which establish relationships between the ACSs and the formal aspects of the social structure. These propositions, organized by DiMaggio (1987), will be used to examine the variations between AllMusic's commercial ACS and Last.fm's social ACS. Table XV shows the results of the comparative analysis of the four ACS dimensions between the two domains.

    The results found in the analysis of the tags on Last.fm indicate that its ACS is highly differentiated, but not very hierarchical or universal, factors that reduce the strength of the ritual classifications between the genres and the social groups created around tastes.

    The high degree of differentiation of the categories found in the folksonomy is also related to the categories fashioned by Last.fm listeners: global, diversified, and culturally and socially heterogeneous. According to Blau and Schwartz (1997), the high degree of differentiation of an artistic classification system corresponds to cultural heterogeneity, which is associated with social heterogeneity. This means that the greater the degree of social heterogeneity and status diversity in a social system, the more differentiated its ACS (DiMaggio, 1987, p. 447).

    The classification systems also vary according to the hierarchical organization of the genres by prestige. Conversely, the ACSs that do not show a hierarchy are characterized by the perception of the genres as different, although they have the same value. The degree of hierarchy determines the value of the cultural capital (Bourdieu, 1984) attributed to the cultural goods with higher prestige, and is related to the cultural authority of some social segments that consume such symbolic products.

    Hierarchical distinction of genres happens when producers and commercial distributors control the means of access to the works with the highest level of prestige. This scenario is typical of a market economy in which access to culture is mediated by the cultural industries. The influence of such resources tends to be greater when there is higher inequality of purchasing power. However, in a context of free access to all the works and styles, as occurs on the Last.fm RS, the hierarchical distinction between genres tends to be non-existent.

    The buying conditions of music and the development of certain musical tastes have changed radically in the modern world. In new contexts of use such as the internet, music consumption is no longer an adequate predictive indicator of social status, as suggested by Pierre Bourdieu in the late 1970s (Peterson and Ryan, 2003).

    The process of weakening the value and status of music accelerates with the rapidly growing access to all kinds of music on the web. The provision of music free of charge eliminates any inequality of resources between internet users. It should also be considered that a high degree of differentiation between genres reduces the hierarchy or the competition for prestige between them (DiMaggio, 1987, p. 447).

    Table XV. Comparison of the ACS dimensions of Last.fm and AllMusic

    Dimensions of ACS        AllMusic/music industry   Last.fm/users
    Differentiation          Low                       High
    Hierarchy                Tends to be low           Low
    Universality             High                      Low
    Ritual classifications   Medium/low                Low

    On the other hand, ritualized classifications (produced, reproduced and reinforced as regards habits and social uses) can be shared either universally or restricted to certain groups. The classification needs to be broadly cultivated in order to be universally understood.

    The complexity of the classification system bears direct influence on the possibility of a common understanding about genres between users. Where there are few and strongly limited genres, the classification system is broadly shared. When the genres are in greater number and have undefined boundaries, the classification is less universal. This means that differentiation and universality are inversely proportional.

    Ritual classifications vary according to the intensity with which the boundaries are socially defended, both in terms of artistic production and consumption. In order for the boundaries between genres to be defended, they must first be completely understood. The more differentiated the classification system, the weaker the ritual strength of classifications (DiMaggio, 1987, p. 449).

    Meanwhile, the symbolic resources allow individuals to communicate more easily (Collins, 1979, pp. 65-71). The increased offer and availability of diversified cultural products stimulate the process of cultural displacement between groups with more flexible boundaries, and drive social demand for different cultural forms (DiMaggio, 1987).

    With this outlook in mind and considering the set of propositions organized by DiMaggio (1987), it may seem implicit that the ACSs perfectly reflect the existing divisions in society. However, this is not a consistent correspondence due to the fact that organization of musical works within society is mediated by the commercial classification system, which operates at the level of production and distribution of cultural products.

    Conversely, the commercial ACSs are to a certain extent subject to the actual processes of ritualization of the genres at a social level (DiMaggio, 1987). The cultural industries strive to reproduce and stabilize the previously existing boundaries between social groups in order to maximize their profit margins in the market and reduce their business risk.

    Therefore, the commercial principles of cultural categorization differ from the social principles in a fundamental way: the ritual classifications answer to the socio-structural demand of the consumers at the social usage level. Meanwhile, the commercial classification reflects the production and distribution structure of the cultural industries.

    The effectiveness of the commercial ACSs depends on their correspondence with the social circuits of use and, in parallel, with the workings of their production and distribution system. Therefore, the music market adapts, as much as possible, the updates of its music classification system according to the flux of social uses, provided that this process is aligned with the interests and limitations of the cultural industries.

    When this alignment fails to occur, the commercial classification may be usurped by social groups for their own purposes, as happens with the classification rationale found on Last.fm, which corrupts some of the traditional market principles. On the other hand, the cultural industries constantly usurp, deactivate and promote ritual classifications opposed or contradictory to the social uses, according to their own commercial purposes (Hebdige, 1979, pp. 92-99).


    The commercial artistic classification systems correspond to the boundaries imposed by profit-driven companies in a market economy. The commercial ACSs emerge through the process of identification and segmentation of the markets based on the profit maximization strategies executed by the companies in question. With the aid of advertising and specialized channels for disseminating information that serves the market, the cultural industries create different levels of perception and access to the genres among distinct segments of the public (DiMaggio, 1977). The groups with higher social status tend to monopolize the symbolic goods in order to intensify their rituals of inclusion and differentiation (Bernstein, 1973). And as DiMaggio (1987) demonstrated, under some circumstances, the commercial classification reinforces such ritual classifications.

    In contrast and at the same time, the commercial systems of classification frequently try to break away from the ritual classifications, that is, from the cultivation of boundaries between taste groups (Bernstein, 1973). Since commercial producers search for large markets and economies of scale, the less differentiated the genres, the broader their markets will be and the more lucrative the business.

    Commercial producers intend to expand their markets to the maximum, even at the risk of reducing the ritualistic and social values of the products they sell. Usually, works that attain high percentages of large consumer audiences are more profitable than those that attract small groups of loyal fans. Thus, the discrepancy between the commercial and the symbolic value creates a clash between the principles of the socially ritualized classification and the commercial criteria in the competition for markets and cultural status (Weber, 1968; Peterson, 1978; Bourdieu, 1985; DiMaggio, 1987).

    Based on their industrial business model, the major record companies tend to distribute