    Growing Data, Changing Journalism

    An Explorative Inquiry Into The Rise Of Data Journalism

    8 July 2011

    Eric R. Alberts (3485595)

    Coding Culture (200600075)

    MA New Media & Digital Culture

    Mirko T. Schäfer & Nikos Overheul

    2010-2011

    1 Introduction

    Hal Varian, chief economist at Google, stated in an interview with McKinsey

    Quarterly in 2009 that in the next ten years statisticians will have the sexiest job around. He motivated this statement by arguing that "the ability to take data – to be

    able to understand it, to process it, to extract value from it, to visualize it, to

    communicate it [...] is going to be a hugely important skill in the next decades"

    (Manika 2009). When looking at Varian's employer it is obvious why he came to

    this conclusion. A company that benefits greatly from vast amounts of data is

    Google. Every day enormous amounts of data are collected as a by-product of user

    interactions with Google's services. From this data new economic value is created.

    Google, however, is but one example of how important large data sets have become during the last decade. In an attempt to map the current explosion of data and the

    challenges that derive from it, The Economist published a special report in

    February of 2010. In this special issue, Joe Hellerstein, a computer scientist at the

    University of California, is quoted, naming the current age "the industrial

    revolution of data" (Cukier 2010, 3). The Economist continues by stating that "[t]he

    effect is being felt everywhere, from business to science, from government to the

    arts" (ibid.), giving this phenomenon the label of "big data".

    In this paper an exploratory inquiry is conducted into one specific area in which big

    data is also becoming ever more important: journalism. Several interesting news

    reports and cases have already emerged, which ensued directly from delving

    through large quantities of data. A prime example is the large-scale investigation

    into British politicians' expenses, invigorating the debate on government

    expenditure. It is likely that in the near future similar data-driven news reports

    will see the light of day, as large data sets and sophisticated tools, which allow for

    journalists to make sense of this data, are becoming more and more dispersed over

    the Internet. Access to and the use of databases are no longer the preserve of IT

    specialists or investigative journalists conducting expensive research.

    According to the European Journalism Centre (EJC), which organized a roundtable

    conference on data journalism in Amsterdam last year, "[d]eveloping the know-how

    to use the available data more efficiently, to understand it, communicate and

    generate stories based on it, could be a huge opportunity to breathe new life into

    journalism" ("Data-driven Journalism" 2010, 6). The role of the reporter can even

    be expanded or changed to that of sense-maker by digging deep into data, making

    journalism more socially relevant (ibid.). This paper critically analyses these

    opportunities but also discusses the challenges that derive from data journalism.

    By critically examining matters such as algorithms, data visualisation and

    participatory journalism this paper makes an effort to further contextualise this

    new phenomenon.

    The inquiry into the properties of data journalism will be conducted on two levels.

    First, this paper discusses data journalism on a material level, as databases are

    becoming the new valuable sources for journalists. The issue of materiality can be

    placed within an already active debate on how the materiality of books is

    transforming in the digital age. Companies like Google and Amazon are digitizing

    books into data objects, bringing on changes to our relationship with books and

    databases. Materiality in this paper also relates to the use of software tools and the

    choice of specific algorithms to structure these large amounts of data. In addition,

    visualisation is becoming a widely accepted way of comprehending vast and abstract

    data, but it requires decisions that have consequences for how the data is displayed.

    Second, data journalism will be discussed on a social level as news organisations

    are actively involving their readers in analysing large data sets. The practice of

    crowdsourcing, for example, has the potential to change traditional producer-

    consumer relationships within journalism. The inquiry into the properties of data

    journalism on a material and social level will be related to various recently

    published examples of data journalism. An inquiry into the opportunities and

    challenges that derive from data journalism, however, cannot take place before

    giving a theoretical framework. This paper will therefore first embed data

    journalism in a larger context, describing how the rise of this new phenomenon can

    be seen within an already changing world of journalism under the processes of

    media convergence and participatory culture.

    This paper is intended to offer an insight into a new journalistic practice, which is

    drawing more and more attention as digital information is rapidly expanding,

    becoming widely available and publicly accessible. This paper contextualizes data

    journalism and critically engages assumptions and consequences in an attempt to

    offer a nuanced overview of this emerging branch in the already changing field of

    journalism in the digital age, which, according to Adam Westbrook, author of The

    Next Generation Journalist, is "one of the big potential growth areas in the future

    of journalism" (ctd. in "Data-driven Journalism" 2010, 3).

    2 Theoretical framework of data journalism

    As stated above, before discussing the opportunities and challenges it is

    important to first get a grip on what data journalism is and how this practice relates to journalism in general and the trends in today's digital culture. Because

    data journalism is only recently taking shape it has not been fully theorised yet.

    There is, however, extensive theory to be found on participatory journalism, the

    blurring of traditional relationships between news media and consumers of media.

    In this chapter, data journalism is understood in terms of participatory journalism

    but with a much more prominent technological aspect: its reliance on databases.

    Data journalism as a practice blends technology and culture in such a way that it is not

    feasible to separate the technological aspects from its social context. Before this

    argument is further elaborated, it is, for the sake of contextualizing, necessary to first give an overview of the technological and cultural processes from which

    participatory journalism and, thus, data journalism have emerged.

    We live in a world that is witnessing a revolution in information technology, a

    converging set of technologies, which is penetrating all domains of human activity

    (Castells 2000). This revolution, however, is not just a technological process

    amplified through digitization, or what Nicholas Negroponte referred to as the

    transformation of atoms into bytes (Negroponte 1995). The digital age is also

    becoming a unified environment in which computer hardware and software define

    possibilities for action and conditions of expression (Rieder & Schäfer 2008, 2).

    The new human condition is characterized by what Henry Jenkins refers to as

    convergence culture, enabling new forms of participation and collaboration

    (Jenkins 2006, 245). According to Jenkins, "[c]onvergence is both a top-down

    corporate driven process and a bottom-up consumer-driven process" (Jenkins

    2004, 37) and is taking place on a global scale in acts of media production and

    consumption. Previously set borders between making media and using media, but

    also between media industries, continue to blur.

    This idea of convergence, the blurring of boundaries set by the conditions of

    digitization (Jenkins 2006, 11), was initially enthusiastically welcomed in the field

    of journalism by authors like Dan Gillmor (2004), who named it "grassroots" or

    participatory journalism. The proliferation of networked communication

    technologies enables people to launch independent news organisations as a direct

    response to what were perceived as shortcomings in mainstream news coverage

    (Deuze 2008). A few of these alternative websites produced by amateurs/citizens

    are Indymedia, Wikinews and Ohmynews, the latter being an alternative to the

    highly conservative mainstream press in South Korea (Kahney 2003). Studies on

    these cases show how citizen media offer interesting bottom-up alternatives to

    conventional top-down practices of news making (Paulussen & Ugille 2008, 26).

    At first glance these success stories back the more utopian beliefs of Dan Gillmor. If

    this trend were to continue, independent participatory journalism might be able to

    replace top-down news media and its traditional news media-user relationships.

    Closer analysis of participatory journalism, which is still a rather ill-defined term

    (Hermida 2008), shows, however, that it is fair to say that the impact of weblogs

    and citizen media on traditional, professional journalism has thus far been rather

    limited (Paulussen & Ugille 2008, 26). This is partially due to the prevailing

    tendency among journalists to see themselves as the defining actors in the process

    of making news (Heinonen 2011). The main conclusion of a study on the development of participatory journalism on a global scale conducted by David

    Domingo et al. reveals that professional newsrooms appear to be rather reluctant to

    open up the news production process to the active involvement of citizens

    (Domingo et al. 2008). The primary question posed by researchers such as Wilson

    Lowrey (2005) of whether participatory journalism is in a way substituting

    professional journalism is thus losing relevance. Instead, the focal point of research

    on participatory journalism has shifted towards how mainstream news media are

    adopting citizen contributions in the process of news production (Paulussen &

    Ugille 2008, 26).

    In 2003 J. D. Lasica, senior editor of the Online Journalism Review, also stated that

    "readers want to be part of the news process" (Lasica 2003, 74). But Lasica

    supplemented this statement by noting that "instead of looking at participatory

    journalism and traditional journalism as rivals for readers' eyeballs, we should

    recognize that we're entering an era in which they complement each other,

    intersect with each other, play off one another" (73). He continued by stating that "we

    are starting to see a mixture of commentary and analysis from grassroots as

    ordinary people find their voices and contribute to the media mix. Blogs won't

    replace traditional news media, but they will supplement them in important ways"

    (74). Although Lasica wrote this essay almost ten years ago, we can see now that

    traditional news media indeed continue to dominate the news media landscape and

    are becoming ever more capable of harvesting the potential of an active audience. For

    instance, in the Netherlands the news organisation NRC integrates a successful

    weblog with the physical newspaper NRC Next and the Dutch public news

    organisation NOS invites young people to contribute to news on its website NOS op

    3 (formerly known as NOS Headlines).

    Mark Deuze, a Dutch professor in communication sciences who has shed light on

    the professional identity of journalists in the context of convergence culture, makes

    an observation similar to that of J. D. Lasica. According to Deuze, convergence culture-based

    participatory journalism is best understood as some kind of co-creative,

    commons-based news platform that is produced when a professional media

    organisation (top-down) partners with or deliberately taps into the emerging

    participatory media culture online (bottom-up) (Deuze 2008, 109). Furthermore,

    participatory journalism is "very much under construction" (ibid.). The

    convergence of top-down and bottom-up journalism is a work in progress with

    more or less traditional makers and users of news cautiously embracing its

    potential, an embrace that is not without problems for both the producers and the

    consumers (ibid.). Mark Deuze and J. D. Lasica offer a far more nuanced

    consideration of participatory journalism in the context of convergence culture. This contrasts with early utopian-like considerations, which were initially all-too-easily

    taken for granted (Domingo 2008, 680).

    Early studies on participatory journalism have also been criticized because of

    underlying technological determinism (Paulussen & Ugille 2008, 28). Changes in

    journalism were explained as caused by technological developments influencing

    the work of journalists from the outside (Deuze 2008, 110). Pablo Boczkowski

    underscores the limitations of a sole focus on the effects of new technologies by

    showing that although technologies do produce effects, they can only be

    understood in the dynamics of technology adoption processes (Boczkowski 2004,

    208). Technology must be seen in terms of its implementation, and therefore how

    it extends and amplifies previous ways of doing things (Deuze 2008, 110). Changes

    occurring in the field of journalism are therefore better understood as a mutual

    shaping of technological and social developments rather than as the effects of

    technological processes (Paulussen & Ugille 2008, 28).

    At the beginning of this chapter I stated that data journalism as a practice blends

    technology and culture in such a way that it is not feasible to separate the technological

    aspects from its social context. With the use of the theory on participatory

    journalism I would like to argue that data journalism is the gradual outcome of a

    converging culture, which introduces a constantly changing mix of features,

    contexts, processes and ideas into the work of individual news workers (Deuze

    2008, 112). This means that convergence culture in this particular context is

    neither merely technologically (Negroponte 1995) nor solely socially driven. Data

    journalism should rather be seen, in line with Paulussen and Ugille, as an outcome of

    the mutual shaping of technological and social developments.

    Critical analysis, however, shows that convergence is a slow and problematic

    process and that its true effects are rather limited. Independent weblogs have not

    replaced news corporations and professional journalists retain control

    over the news story. In extending Paulussen and Ugille's line of thought I would

    therefore argue that convergence regarding participatory and data journalism is

    taking place horizontally (between technological and social aspects) rather than

    vertically (between top and bottom, between professionals and amateurs). In other

    words, if convergence culture is generally seen as the process of blurring borders

    then the borders regarding data journalism are blurring between the technological

    and social contexts. The process of convergence in the case of data journalism

    should be captured by a lens that emphasizes actors' agency as much as

    technology's capabilities (Boczkowski 2004, 210).

    At the end of this chapter I would like to extend this lens metaphor by Boczkowski

    somewhat further. I believe that this lens can be viewed in terms of Actor-Network-

    Theory (Latour 1999) in which human and non-human actants combine to form

    hybrid actors. When applying this view to data journalism we do in fact see that in

    general the technologies, the data and the software tools, are responsible for larger

    parts of the action chains, rendering actions intrinsically hybrid (Rieder & Schäfer

    2008, 161). The digital environment of the database together with the software

    tools that enable access to and structure this environment define possibilities for

    action and conditions of expression (160). According to Rieder and Schäfer,

    software is responsible for extending [...] the role that technology plays in the

    everyday practices that make up modern life (161). In other words, through the

    lens of Actor-Network-Theory data journalism can be seen as a network that

    consists of linkages between technological, social and cultural actors, making data

    journalism a hybrid practice.

    3 Exploring the core properties of data journalism

    In the context of data journalism as a hybrid practice, a network of human and

    non-human actors, we can now explore some of the core elements of data

    journalism through the use of different case studies. This chapter tries to flesh out

    the elements of data journalism by largely following the chain of value creation (see

    illustration 1 below). This chain consists of four properties: raw data, structuring or

    filtering data, visualising data and storytelling. To close this chapter, and as a

    supplement to these four elements, the aspect of participation, for which the

    theoretical background was given in the previous chapter, will be discussed. It will become

    clear that data journalism seems to open up to participatory possibilities in specific

    ways.

    3.1 Properties of data

    It is a truism that the amount of digital data is currently growing faster than

    anything else. A 2008 study by marketing research firm International Data

    Corporation (IDC) revealed that around 1200 exabytes (1 exabyte is 1 million

    terabytes) of digital data was produced that year (Cukier 2010, 5). The majority of

    this data consists of photos, logs, phone calls and other database-to-database

    information, of which only 5% of the information [...] is structured, meaning it

    comes in a standard format of words and numbers that can be read by computers

    (ibid.). Data and information are epistemologically different, as information is made up of a collection of data, but data and information are increasingly difficult

    to tell apart. Raw data is interwoven with today's algorithms and powerful

    computers, which can reveal new insights that would previously have remained

    hidden (Cukier 2010, 3-4).
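
    As a rough illustration of the scale mentioned above, the following Python sketch converts the IDC estimate from exabytes to terabytes, using the decimal ratio given in the text (1 exabyte is 1 million terabytes). The function and constant names are hypothetical and chosen for this example only.

```python
# Decimal (SI) ratio stated in the text: 1 exabyte = 1 million terabytes.
TERABYTES_PER_EXABYTE = 1_000_000


def exabytes_to_terabytes(exabytes: int) -> int:
    """Convert a volume in exabytes to terabytes using the decimal ratio."""
    return exabytes * TERABYTES_PER_EXABYTE


# IDC estimate as cited in the text (Cukier 2010, 5).
idc_estimate_eb = 1200
print(exabytes_to_terabytes(idc_estimate_eb))  # 1200000000
```

    In other words, the cited estimate corresponds to 1.2 billion terabytes.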

    Gannett, the holding company behind the newspapers USA Today and The Indianapolis Star,

    has been a leader in the area of database applications. Gannett realised early on that

    data should be a driving force in online journalism, for a number of reasons. First,

    data is evergreen content so its value to users does not end after twenty-four

    hours. Second, because of its sheer size, data can be best delivered in a medium

    without space constraints. The data is much more valuable if it is accessible and

    searchable at the user's convenience. Third, Gannett realised that data is much

    more applicable to interactive media than, say, in print form. Data is suited for

    research and interaction, not so much for passive activities like reading or viewing

    (Gordon 2007). Supplementary to Gannett's list of data properties, which are

    relevant to journalism, data in general is transmitted and shared in the form of

    text, sound, or images without tangible loss. Because of its freedom from physical

    constraints, however, data is easy to manipulate. With a simple click large amounts

    of (personal) information can be copied or permanently deleted (Rieder & Schäfer

    2008, 163).

    3.2 Structuring data

    In 2002 Wiebke Loosen, assistant lecturer at the Institute of Journalism and

    Communications at the University of Hamburg, concluded that the abundance of

    information on the Internet, in terms of its storage, management, multiple use and

    unlimited possibilities, is challenging journalism regarding its own processes of

    rationalizing information (Loosen 2002, 5). In structuring the vast amounts of

    data lies the biggest challenge for businesses, governments and journalists alike.

    When structured, data is a potential goldmine. Google is probably one of the most

    obvious examples of a company that knows how to generate economic value from

    large amounts of data. This is largely the reason why companies like Google and

    Amazon choose to also transform physical objects into data objects. Books, for

    instance, are being scanned so that various statistical properties can be analysed for

    other purposes. Bernhard Rieder calls this "computational potential", or the value

    of the data of millions of scanned books. According to Rieder the book in the age

    of the database adds a contemporary wave of new embedded practices and logistics

    of what do we read and how we read it (Yudin 2011).

    In Rieder's view three new practices emerge when books are translated into data objects. First, the whole text can be statistically projected, which allows various

    explorations of the catalogue's content. Second, books can be connected with other

    books through data, and books can also be connected to other data like the Internet

    or Google Scholar. Third, user gestures and practices, such as tagging, clicking,

    number of reads, sales and reviews, can be captured through the use of digital

    books. In the latter case, user data can be used to create navigational experiences

    and opportunities leading to the personalization of reading. In other words, Google

    and Amazon, with their systems to digitize books, transform books into

    information, and then unbind and rebind it again as an interactive, social and

    semantic interface (Yudin 2011).

    The transformation of physical books into data objects by Google and Amazon

    paves the way for structuring information and generating new value from it. In the field

    of journalism a similar underlying motive can be found when we look at the recent

    investigation by many news organisations into the emails of former governor of

    Alaska, Sarah Palin i. The state of Alaska released the emails following a two-and-a-

    half year freedom of information process. The emails date from her inauguration as

    governor in 2006 through to 2008 and were released in printed form to the news

    organisations. The emails had to be digitized in order to successfully structure

    them. This is also the case with the documents of British politicians' expenses ii. On

    The Guardian's specially made homepage it says they have 458,832 pages of

    documents in their possession and 234,877 pages are yet to be analysed. All of the

    nearly 460 thousand pages of receipts and claim forms were uploaded onto The

    Guardian's servers as images, which then could be structured in the form of

    tagging. Yet tagging alone is inadequate to distil a news story out of data.

    3.3 Visualising data

    Mirko Lorenz, EJC-member and project leader of the Data Driven Journalism

    initiative (DDJ), states that raw data needs to be transformed into something

    meaningful. As a result the value to the public grows, especially when complex facts

    are boiled down into a clear story that people can easily understand and

    remember ("Data-driven Journalism" 2010, 12). Illustration 1 shows that, besides

    structuring or filtering the raw data, visualisations play an important role in

    generating value as well.

    Illustration 1: Data-driven journalism as a process.

    According to mash-up artist Tony Hirst an important thing to remember about

    data is that it can be used to tell stories, and that it may hide a great many patterns.

    Some of these patterns are self-evident if the data is visualised appropriately

    (Townend 2009). For data journalism visualisation is an important link in the chain of value creation.

    An example of how raw data can be visualised and contribute to journalism is The

    New York Times' visualisation of President Obama's 2011 budget proposal and how

    it is spent iii. The interactive squares on their website immediately show how the

    Obama administration has planned to spend its budget and how each part of the

    budget relates to other parts. Another example is a visualisation by David

    McCandless for The Guardian, depicting the emergency budget proposal by British

    Chancellor George Osborne iv. McCandless also did a visualisation of the data

    gathered from opinion polls during the general election of 2010 in Britain v.

    Another visualisation that has received a lot of attention is the so-called Homicide

    Map by The LA Times, showing Los Angeles County homicide victims vi. The Google

    Maps mash-up shows groups of homicides based on the number of homicides in an

    area. When a specific homicide is clicked, the reader is automatically referred to

    the article in the LA Times reporting on the murder. The British MPs' expenses

    are of course also being visualised by The Guardian, offering a clear-cut overview of

    what the newspaper has found so far. It is likely that The Guardian will do the same

    when more data about the Sarah Palin emails trickles in.

    The importance of visualisation within data journalism raises the question of what

    exactly visualisation is and what risks it brings along. Lev Manovich defines

    information visualisation as "a mapping between discrete data and a visual representation" (Manovich 2010, 2). He does, however, state that this definition

    does not cover all aspects of information visualisation such as the distinctions

    between static, dynamic (i.e. animated) and interactive visualization [sic] (ibid.).

    While these differences are very important, I would like to follow Manovich in his

    argument that the core idea of visualisation has not changed when we switched

    from pencils to computers (Manovich 2010, 5). So whether the visualisation is

    static or interactive, the core idea still revolves around mapping some properties of

    the data into a visual representation (ibid.).

    With the use of present-day software it is possible to generate visualisations of

    much larger data sets than previously possible. As stated above, this does not mean

    that visualisations have changed at their core over the last three hundred years.

    Manovich defines two key principles underlying commonplace information

    visualisations: reduction and space. Reduction includes the use of graphical

    primitives, such as points and lines, to reveal patterns and structures in the data.

    The price being paid for this extreme schematization is the loss of %99 of what is

    specific about each object to represent only %1 in the hope of revealing patterns

    across this %1 of objects' characteristics (Manovich 2010, 5-6). The use of spatial

    variables, such as position, size and shape, is another core element typical of

    information visualisation. These spatial variables have long been preferred over

    other symbols such as color, tone and transparency.

    Edward R. Tufte's book Visual Explanations (1997) reveals a case that exemplifies

    how reduction and spatial preferences in visualisations can be problematic. The

    case is about the cholera epidemic in London in 1854 and shows how the choice of

    different intervals to display the data gathered by Dr. John Snow gives very different

    representations of this data. If Snow had chosen a different interval, or had

    been less aware of the data and less thorough in his logical thinking, he might

    never have discovered the origin of the epidemic. This case also shows how popular

    journalism's choice to aggregate or over-compress data can lead to misleading

    graphical representations. In their article "How Not To Lie With Visualisations"

    Bernice Rogowitz and Lloyd Treinish demonstrate how different representations of

    an MRI scan of a human head can influence the interpretation of the data. They

    argue that [i]n order to accurately represent the structure in the data, it is

    important to understand the relationship between data structure and visual

    representation (Rogowitz & Treinish 1995, 4). They conclude by stating that

    although nowadays non-experts can create meaningful representations of their data

    it is still not easy enough because the visual effects are not well understood by the

    user (Rogowitz & Treinish 1995, 14).
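
    The sensitivity to interval choice described in the Snow case can be illustrated with a minimal Python sketch. The daily counts below are invented for illustration and are not Snow's or Tufte's figures; the function name aggregate is likewise hypothetical. The sketch only shows how one and the same series looks when it is summed over bins of different widths.

```python
from typing import List


def aggregate(counts: List[int], width: int) -> List[int]:
    """Sum consecutive daily counts into bins of `width` days."""
    return [sum(counts[i:i + width]) for i in range(0, len(counts), width)]


# Hypothetical daily counts, invented for illustration only.
daily = [1, 1, 2, 30, 60, 55, 20, 10, 4, 2, 1, 1]

print(aggregate(daily, 1))  # the daily series itself, with a sharp peak
print(aggregate(daily, 3))  # [4, 145, 34, 4]
print(aggregate(daily, 6))  # [149, 38]
```

    With one-day bins the sharp rise and fall is visible; with six-day bins the same data collapses into two totals and the shape of the outbreak largely disappears.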

    Lev Manovich, however, emphasises that new visualisation techniques and projects

    developed since the middle of the 1990s seem to no longer strictly take data that is

    not visual and map it into a visual domain (Manovich 2010, 11). According to

    Manovich the development of computers and the progress in their media capacities

    has made it possible to visualise data without reduction: "While graphical

    reduction will continue to be used, this no longer [sic] the only possible method"

    (23). This new method of visualisation, or "direct visualisation", can be exemplified

    by the use of tag clouds. The tag cloud is an example of a reorganisation of data into

    a new representation that preserves its original form: text remains text (12). A good

    example of a tag cloud used in journalism is the word cloud by John Schwenkler, at

    the time a graduate student in philosophy at the University of California, which got

    published in The Boston Globe vii. The cloud revealed that the official weblog of

    John McCain, the Republican candidate for the presidency, used the word "Obama"

    more often than any other word, and even more often than Obama's own official blog did.
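
    The counting step behind such a word cloud can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Schwenkler's actual method: the sample text is an invented placeholder, and the function name word_frequencies and the stop-word set are hypothetical.

```python
import re
from collections import Counter


def word_frequencies(text, stop_words=frozenset()):
    """Count how often each word occurs, ignoring case and listed stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in stop_words)


# Invented placeholder text, not taken from any real weblog.
sample = "Obama on the economy, Obama on taxes, McCain on the economy, Obama again"
freqs = word_frequencies(sample, stop_words={"on", "the", "again"})

# A word cloud would scale each word's font size by its count.
print(freqs.most_common(2))  # [('obama', 3), ('economy', 2)]
```

    The text itself is preserved as text; only the size of each word would vary with its count, which is what makes the tag cloud an instance of direct visualisation in Manovich's sense.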

    With the use of direct visualisation patterns in the data can be highlighted without

    having to reduce or spatially arrange the data with the use of abstract graphical

    elements. However, in the case of information visualisation, direct visualisation is

    still not as common as in scientific, medical and geovisualisation. During the

    1990s and 2000s the speed and processing power of personal computers

    progressively increased, but information visualisation still continued to depend on

    static vector graphics. Only very recently have sophisticated tools appeared that

    allow for the interactive construction of direct visualisations. Manovich concludes

    that the ability to show artefacts in full detail is crucial to humanities, as it helps

    the researcher to understand meaning and/or cause behind the pattern she may

    observe, as well as discover additional patterns (23).

    One can say that this ability is crucial to journalism as well. Visualisation is a key

    element for revealing patterns in raw or structured data and making it

    understandable for a large audience. For journalists and their employers it is, for

    the sake of objectively informing their audience, crucial that these visualisations

    display the actual facts. As Manovich has shown, however, information

    visualisation is not the same as scientific visualisation and it has a long history of

    reducing data to graphical primitives and specific spatial preference. Incorrect

    visualisations, which give a distorted view of the actual data, could have large-scale

    negative consequences. Direct visualisation, as introduced by Lev Manovich, seems

    to offer a solution to this problem. It is now possible to visualise large quantities of

    data without reduction, and the software tools that make these direct visualisations

    possible are spreading rapidly across the Internet. For instance, ManyEyes,

    Tableau, Yahoo Pipes, the University of Amsterdam, Open Calais, and of course

    Google offer (free) tools for data visualisation, paving the road for objective data

    journalism.

    3.4 Storytelling with data

    A large part of the EJC roundtable conference in Amsterdam focused on how to tell

    stories with data. Surprisingly, none of the speakers really questioned whether data

    necessarily needs to tell a story at all. Adrian Holovaty, a pioneer in data-driven

    journalism with a background in both journalism and computer programming,

    does question that, suggesting newspapers need to make an important shift and

    "stop the story-centric worldview" (Holovaty 2006). Holovaty claims the daily

    processes of journalists are, in practical terms, inefficient, wasting too much of the

    powerful raw data at the root of the stories. Instead, news should be orientated

    toward computers, in the hope that journalists and data will meet in the middle

    (Kiss 2008). If so, structured data remains structured and no longer has to be

    deconstructed for the purpose of writing a traditional news story. From his

    experience as a journalist Holovaty knows that newspaper organisations

    traditionally already collect lots of information, which is relentlessly structured. It

    just takes somebody to start storing it in a structured format (ibid.).
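
    What storing information in a structured format can mean in practice is sketched
    below in Python. The record type, its field names and the sample records are all
    invented for illustration; they are assumptions, not a description of any database
    Holovaty or a newspaper actually built.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Incident:
    """One structured record; the field names are illustrative assumptions."""
    category: str
    neighbourhood: str
    occurred_on: date


# Invented sample records standing in for facts a newsroom already gathers.
INCIDENTS = [
    Incident("burglary", "Northside", date(2011, 5, 2)),
    Incident("theft", "Southside", date(2011, 5, 3)),
    Incident("burglary", "Northside", date(2011, 6, 14)),
]


def find_incidents(records, category=None, neighbourhood=None):
    """Return the records that match every filter that was supplied."""
    return [
        r for r in records
        if (category is None or r.category == category)
        and (neighbourhood is None or r.neighbourhood == neighbourhood)
    ]


def count_by_month(records):
    """Count records per (year, month) pair."""
    totals = {}
    for r in records:
        key = (r.occurred_on.year, r.occurred_on.month)
        totals[key] = totals.get(key, 0) + 1
    return totals


northside_burglaries = find_incidents(INCIDENTS, "burglary", "Northside")
monthly = count_by_month(INCIDENTS)
```

    Once the same facts are kept as fields rather than as sentences in a story, they can
    be searched, filtered and counted without anyone having to reread the articles.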

    Holovaty's argument is best understood through the use of examples. For instance,

    Faces of the Fallen is a public and searchable database of all the U.S. service

    members who died in Operation Iraqi Freedom and Operation Enduring

    Freedom viii. Reporters at The Washington Post were already keeping a detailed

    database of the deceased service members, but most of the time this data was sitting

    around unused. In two weeks' time Holovaty and his co-workers built the data into

    a powerful tool for the public; it became a catalyst for further reporting and was used by

    activist groups to protest against the war (Kiss 2008). Holovaty also created a

    public and searchable database named Everyblock which made it possible to find

    crimes committed in the city of Chicago ix. The data comes from CLEARMap, the

    crime mapping website of the Chicago Police Department and includes information

    on where and when each crime occurred, thereby again using available but unused

    data.

    Although Holovaty's manifesto for computer-orientated journalism has inspired

    many, including the founder of the Pulitzer Prize-winning website Politifact x, which

    compares political statements with actual facts (Waite 2007), there are examples

    where news organisations use data more in the classical journalistic tradition.

    Examples are the news stories based on the Afghanistan War Logs xi, which were

    made available by independent organisation WikiLeaks to several news

    organisations. Meanwhile the documents have all been structured and are available

    through news organisation websites. The New York Times, however, primarily uses

    this data to bring regular news stories. Reporters Cynthia O'Murchu and Carola

    Hoyos of The Financial Times seem to have stayed somewhat closer to Holovaty's

    view, as they produced several interactive graphics, including an interactive chart

    on oil and gas chief executives and their salaries xii. In turn, the graphics serve as the

    basis for traditional (follow-up) news stories. These examples show that there seem

    to be different views among large news organisations when it comes to

    implementing and using data. Holovaty-like data journalism is praised and

    sometimes pursued, but also often questioned. Given the fact that Holovaty-like

    examples are quite scarce it is fair to say data mainly stands in service of the news

    story.

    3.5 Participatory aspects of data journalism

    Chapter 2 elaborated on the potential change of traditional news media-user

    relationships under the process of convergence, the blurring of boundaries set by

    the conditions of digitization. Whether this potential is called citizen, grassroots or

    participatory journalism, it all boils down to the emergence of bottom-up initiatives

    as a counterweight to the large top-down news organisations. Comparison between

    early writings on participatory journalism and its current status reveals that

    participatory journalism should not be regarded as a replacement of top-down news

    organisations but rather as a collaboration between these organisations and their

    audiences. News organisations have learned and continue to learn to optimally

    utilise new media affordances and to tap into the desire of readers to be part of the

    news making process. Data journalism can be considered as an outcome of this

    utilisation and as a testing ground for further collaboration between news

    organisations and consumers. In the specific case of data journalism citizens are

    not replacing journalists but they are adding to the chain of value creation as they

    channel raw material, such as documents, videos or photos, and help journalists

    tackle the problem of structuring the vast amounts of data.

    Crowdsourcing, a term coined by Jeff Howe in an article for Wired, is something

    with which news organisations are increasingly experimenting and can best be

    described as tapping into the latent talent of the crowd (Howe 2006) or using

    the crowds as an investigative ancillary force (Howe 2009, xxiv). For instance, in

    April of 2009 The New York Times issued a press release in which it invited its

    readers to comb through the full schedules of Timothy F. Geithner when he was

    president of the Federal Reserve Bank of New York xiii. Also, in February of that year

    The Huffington Post called upon its readers to help dig through the U.S. Senate

    stimulus bill xiv. Other prime examples are, again, the British politicians' expenses

    scandal and most recently the investigation of the Sarah Palin emails. In all of these

    examples news organisations use the combined analytical strength of their

    audience with the aim to generate stories out of large data sets. News organisations

    use their audience for investigative work, to sift through piles of documentation.

    The journalist's role in this process is to collate and analyse the findings, making

    the journalist the central point of direction.

    As Alfred Hermida points out there are also examples of crowdsourcing without

    central direction (Hermida 2010). One of these examples is the Kenyan open source

    platform Ushahidi xv, which was founded in 2008 by a group of bloggers who

    wanted to give a response to the wave of ethnic violence sweeping the country in

    the wake of elections (Bunting 2011). Ushahidi's next project, Huduma (Swahili

    for service), will use crowdsourcing in Kenya to monitor the effectiveness of

    services such as health and education (ibid.). Hermida also refers to social network

    Twitter allowing crowdsourcing to happen on a distributed, asynchronous

    manner, with individuals acting independently yet collectively at the same time

    (Hermida 2010). An example where the mass collaboration of total strangers on

    the web (ibid.) worked was when multinational Trafigura legally banned The

    Guardian from reporting on the alleged dumping of toxic waste off the shores of

    Ivory Coast. Trafigura became a trending topic on Twitter as the topic was widely

    discussed and in less than 24 hours Trafigura backed down (ibid.).

    The latter example shows how crowdsourcing can be beneficial for journalism in

    other ways than for investigative work but does not necessarily apply to data

    journalism specifically. When looking at the given data-driven examples, audiences

    are primarily used to contribute to information structuring. The Guardian does

    however also outsource visualisation tasks. In a specially made group on photo

    community Flickr, users can post graphical translations of large data sets, which

    can be downloaded from The Guardian's Datastore xvi. The question remains,

    though, whether crowdsourcing data and data visualisations means data journalism is

    intrinsically participatory. News organisations are increasingly adding so-

    called data desks to the work floor as an extension of the editorial office. Eric Ulken, a

    former reporter of The LA Times, published an article in which he describes the

    process of assembling the data desk. According to Ulken the data desk can be seen

    as a cross-functional team of journalists responsible for collecting, analysing and

    presenting data online and in print (Ulken 2008). Furthermore, the report of the

    EJC roundtable conference shows data journalism can profit greatly from applying

    the know-how of graphic designers and IT-specialists (Data-driven Journalism

    2010). Adding multiple disciplines to the data desk may imply that participation of

    the public in the process of creating news stories is just as likely to stagnate.

    4 Conclusion

    Whether data journalism is intrinsically participatory is one of many

    questions still open for debate concerning a new form of journalism, which is

    slowly taking form under continuously changing conditions set in a world that is

    increasingly relying on technology and digital information. This paper has tried to

    give context to this new phenomenon and has explored its core properties using a

    variety of examples. At this point, however, it is too early to tell what the

    implications of data journalism will be. The assumption that a website such as

    Politifact, which checks whether U.S. politicians' statements are based on facts, will

    increase the public's trust in, say, journalism, politics or democracy, is yet to be

    proven. For instance, researchers at the University of Michigan found in a series of studies that

    misinformed people, who were exposed to corrected facts in news stories, rarely

    changed their minds. Political partisans in particular became even more strongly set

    in their beliefs. Facts can make misinformation even stronger (Keohane 2010).

    Instead of focussing on possible implications, this paper has tried to place data

    journalism within a broader context and has tried to flesh out its core properties in

    order to further comprehend this new phenomenon. The theoretical framework

    tells us that data journalism can be placed against the background of journalism

    wherein traditional borders have continued to blur over the last decade. Set by the

    conditions of digitization, readers have also become users that are able to add value

    in the process of news making. This is, however, a slow process and, contrary to

    the ideas posed at the beginning of this century, participatory journalism has not

    yet been able to crumble the power of large news organisations. This does not mean

    the voice of the public is not being heard. The dispersion of increasingly

    sophisticated and free-to-use software and data sets enables people to contribute to

    journalism in a new way. Top-down journalism is in some aspects meeting bottom-

    up, grassroots journalism but it remains a work in progress that often offers

    more questions than answers.

    Against the backdrop of convergence culture I have argued data journalism can be

    regarded as the outcome of the mutual shaping of technological and social

    developments. Besides vertical top-down-meets-bottom-up, convergence is also

    taking place on a horizontal axis, between the technological and social contexts of

    journalism. Technological aspects are becoming inseparably intertwined with social

    aspects, as reporters are coming to rely on databases as fertile soil for the creation

    of news stories. In terms of Actor-Network-Theory these human and non-human

    actants combine to form hybrid actors. In general the technologies, the data and

    the software tools, are responsible for larger parts of the action chains, rendering

    actions intrinsically hybrid. Data journalism can therefore be regarded as a hybrid

    practice.

    Exploration of the core properties of data journalism highlights its relation with

    data and shows the path journalists have to take in order to distil a story out of

    data. On the one hand structuring and visualising data can be crucial links in

    getting from raw data to story. Sophisticated software tools make it easier than ever

    to structure large quantities of data and to visualise data without reducing crucial

    data. On the other hand these links are not part of a fixed chain or the only

    road to deriving news stories from data. Moreover, journalist and computer

    specialist Adrian Holovaty argues that nowadays making sense of complicated data

    for an audience is in itself just as important as telling a story.

    Whatever path journalists will walk, whether it is through visualisations, telling

    stories, crowdsourcing or building databases, it almost goes without saying that the

    future for journalism lies in analysing big data. This is a standpoint shared by Sir

    Tim Berners-Lee, inventor of the World Wide Web. According to Berners-Lee the

    responsibility lies with journalists to hold governments, or anyone else,

    accountable, as information increasingly is made available on the Internet (Arthur

    2010). How long it will take before the interdisciplinary data desk, with computer

    specialists, graphic designers and journalists working together, becomes a full-

    grown and respected part of the editorial office remains to be seen. Whatever the

    implications will be, as databases keep on growing, culture and technology keep

    on converging and audiences keep on participating, there will be a role for data

    journalism out there, somewhere.

    References

    Arthur, Charles. Analysing Data is the Future for Journalists, Says Tim Berners-

    Lee. The Guardian 22 Nov. 2010. 3 Jul. 2011

    .

    Boczkowski, Pablo J. The Processes of Adopting Multimedia and Interactivity in

    Three Online Newsrooms. Journal of Communication 54.2 (2004): 197-213.

    Bunting, Madeleine. Crowdsourcing Put to Good Use in Africa. The Guardian 19

    May 2011. 3 Jun. 2011 < http://www.guardian.co.uk/global-

    development/poverty-matters/2011/may/19/crowdsourcing-good-use-in-

    africa >.

    Castells, Manuel. The Information Age: Economy, Society and Culture. Malden

    MA: Blackwell, 3 volumes, first published in 1996.

    Cukier, Kenneth N. Data, Data Everywhere. The Economist Special Report 27

    Feb. 2010: 3-18.

    Data-driven Journalism: What is there to Learn? Amsterdam: European

    Journalism Centre, 2010.

    Deuze, Mark. The Professional Identity of Journalists in the Context of

    Convergence Culture. Observatorio Journal 7 (2008): 103-117.

    Domingo, David. Interactivity in the Daily Routines of Online Newsrooms:

    Dealing with an Uncomfortable Myth. Journal of Computer-Mediated

    Communication 13.3 (2008): 680-704.

    Domingo, David et al. Participatory Journalism Practices in the Media and

    Beyond: An International Comparative Study of Initiatives in Online

    Newspapers. Journalism Practice 2.3 (2008): 326-342.

    Gillmor, Dan. We the Media: Grassroots Journalism by the People, for the People.

    Sebastopol, CA: O'Reilly Media, 2004.

    Gordon, Rich. Data as Journalism, Journalism as Data. Readership Institute 14

    Nov. 2007. 3 Jul. 2011 < http://getsmart.readership.org/2007/11/data-as-

    journalism-journalism-as-data.html >.

    Heinonen, Ari. The Journalist's Relationship with Users: New Dimensions to

    Conventional Roles. Participatory Journalism: Guarding Open Gates at Online

    Newspapers, Eds. Jane B. Singer et al. Malden, MA: Wiley-Blackwell,

    2011.

    Hermida, Alfred. How the MSM is Tackling Participatory Journalism. Reportr.net

    24 May 2008. 3 Jul. 2011 < http://www.reportr.net/2008/05/24/how-the-

    msm-is-tackling-participatory-journalism/ >.

    Hermida, Alfred. The Impact of Crowdsourcing on Journalism. Reportr.net 15

    Oct. 2010. 3 Jun. 2011 < http://www.reportr.net/2010/10/15/impact-

    crowdsourcing-journalism/ >.

    Holovaty, Adrian. A Fundamental Way Newspaper Sites Need to Change.

    Holovaty.com 6 Sep. 2006. 3 Jul. 2011

    .

    Howe, Jeff. The Rise of Crowdsourcing. Wired 14 Jun. 2006. 3 Jul. 2011

    .

    Howe, Jeff. Crowdsourcing: Why the Power of the Crowd is Driving the Future of

    Business. New York: Three Rivers Press, 2008.

    Jenkins, Henry. The Cultural Logic of Media Convergence. International Journal

    of Cultural Studies 7.1 (2004): 33-43.

    Jenkins, Henry. Convergence Culture: Where Old and New Media Collide. New

    York: New York UP, 2006.

    Kahney, Leander. Citizen Reporters Make the News. Wired 17 May 2003. 3 Jul.

    2011 < http://www.wired.com/culture/lifestyle/news/2003/05/58856 >.

    Keohane, Joe. How Facts Backfire. Boston.com 11 Jul. 2010. 3 Jul. 2011

    .

    Kiss, Jemima. Future of Journalism: Adrian Holovaty's Vision for Data-friendly

    Journalists. The Guardian 6 Jun. 2008. 3 Jul. 2011

    .

    Lasica, J. D. Blogs and Journalism Need Each Other. Nieman Reports 57 (2003):

    70-74.

    Latour, Bruno. Pandora's Hope: Essays on the Reality of Science Studies.

    Cambridge, MA: Harvard UP, 1999.

    Loosen, Wiebke. The Second-Level Digital Divide of the Web and Its Impact on

    Journalism. First Monday 7.8 5 Aug. 2002. 3 Jul. 2011

    .

    Lowrey, Wilson and William Anderson. The Journalist Behind the Curtain:

    Participatory Functions on the Internet and their Impact on Perceptions of the

    Work of Journalism. Journal of Computer-Mediated Communication 10.3

    (2005).

    Manika, James. Hal Varian on How the Web Challenges Managers. McKinsey

    Quarterly January 2009. 3 Jul. 2011

    .

    Manovich, Lev. What is Visualization? Manovich.net 25 Oct. 2010. 3 Jul. 2011

    .

    Negroponte, Nicholas P. Being Digital. New York: Vintage Books, 1995.

    Paulussen, Steve and Pieter Ugille. User Generated Content in the Newsroom:

    Professional and Organisational Constraints on Participatory Journalism.

    Westminster Papers in Communication and Culture 5.2 (2008): 24-41.

    Rieder, Bernhard and Mirko Tobias Schäfer. Beyond Engineering: Software

    Design as Bridge over the Culture/Technology Dichotomy. Philosophy and

    Design. Eds. Pieter E. Vermaas et al. Springer, 2008.

    Rogowitz, Bernice E. and Lloyd A. Treinish. How Not to Lie with Visualization.

    IBM Research 1995. 3 Jul. 2011

    .

    Townsend, Judith. #DataJourn Part 2: Q&A with Data Juggler Tony Hirst.

    Journalism.co.uk 8 Apr. 2009. 3 Jul. 2011

    .

    Tufte, Edward R. Visual Explanations: Images and Quantities, Evidence and

    Narrative. Cheshire, Connecticut: Graphics Press, 1997.

    Ulken, Eric. Building the Data Desk: Lessons from the L.A. Times. The Online

    Journalism Review 21 Nov. 2008. 3 Jul. 2011.

    .

    Waite, Matt. Announcing Politifact. Matt Waite 22 Aug. 2007. 3 Jul. 2011

    .

    Yudin, Ekaterina. Bernhard Rieder: 81,498 Words: the Book as Data Object. The

    Unbound Book 21 May. 2011. 3 Jul. 2011. < http://e-

    boekenstad.nl/unbound/index.php/bernhard-rieder-81498-words-the-book-as-

    data-object/ >.

    Examples of data journalism used in this paper

    i The Guardian Crowdsourcing the Sarah Palin emails

    .

    ii The Guardian British MP expenses

    .

    iii The New York Times Obama's budget and how it is spent

    .

    iv The Guardian - Emergency budget proposal 2010

    .

    v The Guardian General election opinion polls 2010

    .

    vi The LA Times The Homicide Report

    .

    vii The Boston Globe Portrait of the candidate as a pile of words

    .

    viii The Washington Post Faces of the fallen

    .

    ix Everyblock Make your block a better place

    .

    x Politifact Sorting out the truth in politics

    .

    xi The New York Times The Afghan war logs

    .

    xii The Financial Times Oil and gas chief executives

    .

    xiii The New York Times The schedules of Timothy F. Geithner

    .

    xiv The Huffington Post The Senate stimulus bill

    .

    xv Ushahidi - information collection, visualization and interactive mapping .

    xvi The Guardian Data store on Flickr

    .