GitHub as Transparency Device in Data Journalism, Open Data and Data Activism Digital Methods Initiative Summer School 2015 Liliana Bounegru, Jonathan Gray & Stefania Milan

Upload: lilianabounegru

Post on 14-Aug-2015

Page 1: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

GitHub as Transparency Device in Data Journalism, Open Data and Data Activism

Digital Methods Initiative Summer School 2015
Liliana Bounegru, Jonathan Gray & Stefania Milan

Page 2: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Part of a broader research collaboration:
• Data Journalism (Liliana Bounegru)
• Open Data (Jonathan Gray)
• Data Activism (Stefania Milan)
• Digital Methods (Richard Rogers & Erik Borra)

Page 3: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

How is GitHub reconfiguring… data journalism? open data? data activism?

Page 5: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Our work on data journalism includes…

Page 6: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Witnessing and Auditing Journalism in the Making with GitHub

Page 7: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

1. What is open data journalism and what does GitHub have to do with it? (the advocates)

2. How has openness been studied as a political concept? (the critics)

3. Research design: Mapping open data journalism with GitHub (our project)

Page 9: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

An example of the role of GitHub in open data journalism.

Page 10: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

New York Times (2014) “War Gear Flows to Police Departments”
http://www.nytimes.com/2014/06/09/us/war-gear-flows-to-police-departments.html

Page 11: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

New York Times (2014) “Mapping the Spread of the Military’s Surplus Gear”
http://www.nytimes.com/interactive/2014/08/15/us/surplus-military-equipment-map.html

Page 12: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

New York Times (2014) “What Military Gear Your Local Police Department Bought”
http://www.nytimes.com/2014/08/20/upshot/data-on-transfer-of-military-gear-to-police-departments.html

Page 13: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

The Upshot on GitHub: https://github.com/TheUpshot/Military-Surplus-Gear

Page 15: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

http://earino.shinyapps.io/Military-Surplus/

Page 16: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

https://github.com/cinquemb/1033-program-quick-drill-down

Page 17: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Charleston Daily Mail (2014) “Federal program sends military equipment to WV law enforcement”
http://www.charlestondailymail.com/article/20140819/DM01/140819135/1420

Page 18: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

GitHub as a device for multiplying witnessing around police acquisition of military equipment.

Page 19: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

What does openness mean in the context of journalism?

Page 20: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Alex Howard, The Art and Science of Data-Driven Journalism

“The embrace of open source software and agile development practices, coupled with a growing open data movement, have breathed new life into traditional computer-assisted reporting.”

Page 21: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Advocates discuss openness in terms of:
• Transparency
• Collaboration and participation

Page 22: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Simon Rogers, “Hey Wonk Reporters, Liberate Your Data!”

“Data journalism only matters when it's transparent.”

Page 23: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Mathew Ingram, “Open journalism also means opening up your data, so others can use and improve it”

“Open journalism … means opening up your data, so others can use and improve it.”

Page 24: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Simon Rogers, “Journalist datastores: where can you find them? A list”

“It’s a pretty core tenet of open journalism that you share your sources; i.e., you write a story about data then you make numbers available to download ”

Page 25: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Simon Rogers, “Hey Wonk Reporters, Liberate Your Data!”

“Journalism today is at least as much about working with the community as it is telling the world what you think happened. The ethos of open journalism is that reporting becomes better by gathering the expertise of the world and helping to curate it.”

Page 26: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Openness in the service of what?

Page 27: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Advocates associate openness with:
• Trust, credibility and accountability
• Fact-checking and optimisation (“many eyes make shallow bugs”)
• Innovation and reusability
• Democratising data and levelling the playing field

Page 28: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

– Nicolas Kayser-Bril in Scott Nesbitt’s “Is open data living up to the hype? One data journalist weighs in”

“Open source makes an organization more transparent and, therefore, more trustworthy. Newsrooms are moving towards open source; just look at the number of journalists using GitHub now!”

Page 29: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Mathew Ingram, “Open journalism also means opening up your data, so others can use and improve it”

“As with the code behind software programs — the original use for things like GitHub — there are a host of benefits to opening up the data that provides the foundation for news stories, including the fact that more eyeballs on the data means a greater likelihood of finding errors and/or misinterpretations of that data.”

Page 30: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Alex Plough, “The Evolution of Data Journalism: from CAR to fivethirtyeight”

“GitHub lets users duplicate others’ code and re-purpose it for their own needs. This feature lets data journalism teams across the world quickly replicate each other’s projects, spurring innovation with increasingly sophisticated news applications.”

Page 31: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Alex Salkever, “Open Source Journalism: Data and the New News”

“Open source journalism levels the playing field. Every neighborhood blogger in California or New York or London can now post visualization using the very same data that the biggest news organizations in the world have used. And the blogger can focus that data down on the local impact.”

Page 32: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

What is the role of GitHub in open data journalism?

Page 33: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Alex Plough, “The Evolution of Data Journalism: from CAR to fivethirtyeight”

“Another trend is the use of software code-hosting platform Github by news organizations. Typically used by the open-source software development community to store and share their code online (in “repositories”), GitHub lets users duplicate others’ code and re-purpose it for their own needs.”

Page 34: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Tom Giratikanon, Erin Kissane, Jeremy Singer-Vine, “When the news calls for raw data”

“Why post it on GitHub? … As journalists marshall more data than ever, collect it from a wider range of sources, and analyze it in increasingly complex ways, it’s important (and interesting!) to be transparent about those processes. I think about it in three ways: verifiability …, reproducibility …, reusability.”

Page 35: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

– Emily Ferber, “Getting GitHub: Why journalists should know and use the social coding site”

“As more journalists embrace GitHub as a way to improve stories, they’ll develop a new kind of news community, centered around collaboration and code – truly a news nerd’s nirvana.”

Page 36: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

1. What is open data journalism and what does GitHub have to do with it? (the advocates)

2. How has openness been studied as a political concept? (the critics)

3. Research design: Mapping open data journalism with GitHub (our project)

Page 38: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

To understand what is at stake, we turn to studies of openness and transparency in the context of government and activism.

Page 39: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Some points this research raises about the openness and transparency programmes it studies:
• Uncoupling of data and code from politics
• Witnessing data publics/subjects
• Presumption of absence of trust
• Anticipation of moral failings

Page 40: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Clare Birchall in “Data.gov-in-a-box” on Obama’s data-driven transparency programme:
• “post-political solution”
• “data in lieu of politics”
• emphasis on individual rather than collective political agency
• “only reveals that which is conducive of maintaining the status-quo.”

Page 41: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Clare Birchall, “‘Data.gov-in-a-box’: Delimiting transparency”

“The openness of all this data is obviously meaningless until it is witnessed.”

Page 42: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Clare Birchall, “‘Data.gov-in-a-box’: Delimiting transparency”

“What kind of publics, subjects, and indeed, politics it [data-driven transparency model] will produce?”

Page 43: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

The “auditor–entrepreneurial–consumer subjectivity”

Page 44: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Clare Birchall, “‘Data.gov-in-a-box’: Delimiting transparency”

“The data subject is therefore called upon to be auditor (to monitor the granular transactions of the state in the name of accountability), entrepreneur (to make data profitable through apps and visualizations) and consumer (as the market for such apps and visualizations).”

Page 45: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Clare Birchall, “Data.gov-in-a-box”:

• The burden of monitoring the state moves from the state to the citizens.

• “A subject who is monitored while being asked to monitor; acted upon as data while being asked to act on data.”

• Agency is reliant on technological competence.

Page 46: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Because visibility is about gaining trust, a transparency device presumes that there is an absence of trust in the first place (Harvey, Reeves & Ruppert, 2012).

Page 47: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Penny Harvey, Madeleine Reeves & Evelyn Ruppert, “Anticipating failure”

“It is to past moral failures of wrongdoing, conflict or corruption that these [transparency] devices react and consequently it is the anticipation of future moral failings towards which they are then oriented.”

Page 48: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

–Penny Harvey, Madeleine Reeves & Evelyn Ruppert, “Anticipating failure”

“As such rather than alleviating uncertainty they come to amplify it.”

Page 49: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

1. What is open data journalism and what does GitHub have to do with it? (the advocates)

2. How has openness been studied as a political concept? (the critics)

3. Research design: Mapping open data journalism with GitHub (our project)

Page 51: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Questions.

Page 52: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

How can we use these studies to make sense of the move to make journalism more trustworthy and accountable through the opening up of data and code?

Page 53: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

To be meaningful, journalistic data and code need to be witnessed.

Page 54: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

What kinds of publics are mobilised around open journalism data and code through GitHub?

Page 55: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

What forms of trust and accountability are produced by the opening up of data and code?

Page 56: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

How does GitHub mobilise and format engagement with journalism and with what effects?

Page 57: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Research design

Page 58: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

How to map “open data journalism” with GitHub?

Page 59: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

GitHub not only as device for witnessing and auditing journalism in the making

Page 60: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

But also as a source of data about such practices

Page 64: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Over 200 programmer-journalist GitHub accounts.
Over 60 journalism organisation GitHub accounts.

Page 65: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Five studies:

1. Situating GitHub in the journalism ecology
2. Mapping journalism data publics with GitHub
3. Profiling journalism practices and product repertoires through the “distant reading” of code
4. Mapping open data on GitHub
5. Mapping data activism on GitHub

Page 67: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

1. Situating GitHub in the journalism ecology

This study will locate GitHub in the data journalism space in terms of its resonance. It will trace the issues associated with it, particularly exemplary projects, programming languages, tools, analytical techniques, visions and values.

The data journalism space will be demarcated through a three-year collection of tweets containing related keywords and hashtags, as well as through associated mailing lists and events.
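As a rough illustration of the keyword and hashtag demarcation described for study 1, the sketch below filters a list of tweet texts down to those inside a query-defined space and counts which other hashtags appear alongside mentions of GitHub. The query terms, the sample tweets and the function names are assumptions made for illustration; the slides do not give the project's actual keyword list or tooling.

```python
import re
from collections import Counter

# Hypothetical query terms; the project's real keyword list is not given in the slides.
QUERY_TERMS = {"#ddj", "#datajournalism", "data journalism"}
HASHTAG = re.compile(r"#\w+")


def in_space(text):
    """Return True if the tweet text contains any of the query terms."""
    lowered = text.lower()
    return any(term in lowered for term in QUERY_TERMS)


def hashtags_alongside_github(tweets):
    """Count hashtags (other than the query terms) in in-space tweets that mention GitHub."""
    counts = Counter()
    for text in tweets:
        lowered = text.lower()
        if in_space(text) and "github" in lowered:
            tags = HASHTAG.findall(lowered)
            counts.update(tag for tag in tags if tag not in QUERY_TERMS)
    return counts


# Invented sample tweets, for illustration only.
tweets = [
    "Our scraper is on GitHub #ddj #opendata",
    "New data journalism handbook chapter out",
    "GitHub pages tutorial #webdev",
    "Reproducible analysis on github #datajournalism #rstats #opendata",
]
counts = hashtags_alongside_github(tweets)
print(counts.most_common())
```

Plain substring matching is a deliberately simple choice here; a real collection would likely need tokenisation and handling of retweets and duplicates.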

Page 68: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

2. Mapping journalism data publics with GitHub

This study profiles the journalism publics, practices and product repertoires active on GitHub. The focus is on functions, modes of engagement, and trust and accountability mechanisms, and on how they are mediated and reconfigured through GitHub, open code and data.

To do so, it uses custom-made GitHub scrapers to extract data about users and repositories, and analyses that data manually and with network analysis tools.
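The scrape-then-network workflow of study 2 might look something like the sketch below. It is not the project's scraper: `fetch_public_repos` is a hypothetical helper that reads one page from the public GitHub REST endpoint for a user's repositories (it is defined but not called here), and `co_contribution_network` is an assumed, simplified way of turning (user, repository) pairs into a weighted user-to-user network. The sample pairs are invented.

```python
import json
import urllib.request
from collections import defaultdict
from itertools import combinations


def fetch_public_repos(username):
    """Fetch one page of a user's public repositories from the GitHub REST API.

    Unauthenticated requests are rate-limited and pagination is not handled,
    so this is only a starting point for a real scraper.
    """
    url = f"https://api.github.com/users/{username}/repos"
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def co_contribution_network(contributions):
    """Project (user, repository) pairs onto a weighted user-user network.

    Two users gain one unit of edge weight for every repository they share.
    """
    users_by_repo = defaultdict(set)
    for user, repo in contributions:
        users_by_repo[repo].add(user)

    edge_weights = defaultdict(int)
    for users in users_by_repo.values():
        for a, b in combinations(sorted(users), 2):
            edge_weights[(a, b)] += 1
    return dict(edge_weights)


# Invented sample data, not taken from the project.
sample = [
    ("alice", "newsroom/police-gear"),
    ("bob", "newsroom/police-gear"),
    ("carol", "newsroom/police-gear"),
    ("alice", "newsroom/election-map"),
    ("bob", "newsroom/election-map"),
]
network = co_contribution_network(sample)
print(network)
```

The resulting edge list can be exported to whichever network analysis tool is in use; the projection onto users is one design choice among several, and a bipartite user-repository graph would preserve more detail.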

Page 69: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

3. Profiling journalism practices and product repertoires through the “distant reading” of code

This study scopes out possibilities for using digital traces of code from GitHub to inform a “distant reading” of the ideals and practices of emergent data publics in journalism and civil society.

In addition to tracing actor networks and their modes of engagement through the analysis of GitHub metadata in study 2, this study enquires into the possibilities and methods for analysing the actual code in the journalism repositories, in order to examine the epistemological commitments, horizons, styles of reasoning and action repertoires of journalism data publics.
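One very crude form a “distant reading” of code could take is tallying file types and imported packages across a repository's files, as a proxy for tools and analytical techniques. The sketch below assumes the files have already been downloaded into a dictionary of path to text; the extension labels, the regular expressions and the sample repository are all illustrative assumptions, not the method proposed in the slides.

```python
import os
import re
from collections import Counter

# Hypothetical mapping from file extension to a label; extend as needed.
EXTENSION_LABELS = {
    ".py": "Python",
    ".r": "R",
    ".js": "JavaScript",
    ".csv": "data (CSV)",
}

# Rough patterns: top-level package in Python imports, package in R library() calls.
PY_IMPORT = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)", re.MULTILINE)
R_LIBRARY = re.compile(r"^\s*library\(\s*([A-Za-z][A-Za-z0-9.]*)\s*\)", re.MULTILINE)


def profile_repository(files):
    """Return (file-type counts, package counts) for a dict of path -> file text."""
    labels = Counter()
    packages = Counter()
    for path, text in files.items():
        extension = os.path.splitext(path)[1].lower()
        labels[EXTENSION_LABELS.get(extension, "other")] += 1
        if extension == ".py":
            packages.update(PY_IMPORT.findall(text))
        elif extension == ".r":
            packages.update(R_LIBRARY.findall(text))
    return labels, packages


# Invented sample repository contents, for illustration only.
files = {
    "analysis/clean.py": "import pandas as pd\nfrom collections import Counter\n",
    "analysis/plot.R": "library(ggplot2)\nlibrary(dplyr)\n",
    "data/transfers.csv": "state,item\nWV,rifle\n",
    "README.md": "# Notes\n",
}
labels, packages = profile_repository(files)
print(labels)
print(packages)
```

Regular expressions will miss or misread some imports (for example, imports inside strings or comments), so counts like these are better treated as prompts for closer reading than as measurements.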

Page 70: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

The same approach will be used to study data activism and open data.

Page 71: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

The Team

Facilitators:
• Liliana Bounegru (@bb_liliana / lilianabounegru.org)
• Jonathan Gray (@jwyg / jonathangray.org)
• Stefania Milan (@annliffey / stefaniamilan.net)

Programmer-analyst:
• Sam Leon (@noel_mas)

Page 72: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Who should join us?
• Anyone active around or interested in data journalism, data activism and/or open data.
• GitHub users or people familiar with the platform.
• Designers and programmers.

Page 73: GitHub as Transparency Device in  Data Journalism, Open Data and Data Activism

Join Us!