data driven journalism
Carol Perruso
Journalism Librarian
Feb. 12, 2013
DATA-DRIVEN JOURNALISM: THE BASICS
WHAT IS DATA-DRIVEN JOURNALISM?
• “Data-driven journalism enables reporters to tell untold stories, find new angles or complete stories via a workflow of finding, processing and presenting significant amounts of data….”
• Henk van Ess, Dutch reporter
ANOTHER WAY OF LOOKING AT IT
FIRST: DATA OR STORY IDEA?
• “Data journalism begins in one of two ways: either you have a question that needs data, or a dataset that needs questioning.” –Paul Bradshaw
WHAT’S INVOLVED?
• Data has to be found, which may involve computer research skills, good old-fashioned reporting or FOI requests.
• Reporter has to get to know the data.
• Analysis: What story does the data tell?
• Make the data accessible and understandable to readers: story and graphics
FINDING THE DATA
• Bradshaw outlines the ways you might get data. They might be:
• Supplied by an organization (“how long until we see ‘data releases’ alongside press releases?”)
• “Found through using advanced search techniques to plough into the depths of government websites”
• “Compiled by scraping databases hidden behind online forms or pages of results” using specialized tools.
• Converted from documents into a form that can be analyzed
• Pulled from APIs (application programming interfaces)
• Collected by the reporter
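A minimal sketch of the "pulled from APIs" and "converted into a form that can be analyzed" steps, using only Python's standard library. The JSON payload, its field names and its values are invented for illustration; a real API will have its own structure, and this sketch does not make a network request.

```python
import csv
import io
import json

# Hypothetical JSON payload, shaped like what a public-data API might return.
# The field names and values are invented for illustration.
SAMPLE_RESPONSE = """
{"results": [
  {"name": "A. Lopez", "title": "Engineer", "pay": 98000},
  {"name": "B. Chen", "title": "Librarian", "pay": 61000}
]}
"""


def json_to_rows(payload, fields):
    """Flatten the 'results' list of a JSON payload into a list of rows."""
    records = json.loads(payload)["results"]
    # Missing fields become empty strings so every row has the same width.
    return [[record.get(field, "") for field in fields] for record in records]


def rows_to_csv(rows, header):
    """Write rows to CSV text that a spreadsheet tool can open."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


FIELDS = ["name", "title", "pay"]
rows = json_to_rows(SAMPLE_RESPONSE, FIELDS)
csv_text = rows_to_csv(rows, FIELDS)
print(csv_text)
```

Saving the CSV text to a file gives a table that can be opened in Excel or loaded into the other tools a reporter is using.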
GETTING TO KNOW THE DATA
• CLEAN IT UP:
• Removing human error:
• Removing duplicate entries;
• Deleting blanks
• Converting descriptions to a uniform format/language (e.g. BBC or B.B.C or British Broadcasting Corporation)
• Converting the data into a format that is consistent with other data you are using.
• TOOLS: Find and Replace in Excel, or Google Refine
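The clean-up steps above can also be sketched in Python, one of the tools reporters are using. This is an illustration under stated assumptions, not a prescribed method: the sample rows, the CANONICAL_NAMES lookup and the clean_rows helper are all hypothetical.

```python
# Hypothetical raw rows: (organization, amount). Values are invented.
RAW_ROWS = [
    ("BBC", "1200"),
    ("B.B.C", "1200"),
    ("British Broadcasting Corporation", "450"),
    ("", ""),
    ("  Reuters ", "300"),
    ("Reuters", "300"),
]

# Assumed lookup that maps known spelling variants to one uniform name.
CANONICAL_NAMES = {
    "bbc": "BBC",
    "b.b.c": "BBC",
    "british broadcasting corporation": "BBC",
}


def clean_rows(rows):
    """Drop blanks, normalize names, convert amounts, remove duplicates."""
    cleaned = []
    seen = set()
    for name, amount in rows:
        name = name.strip()
        amount = amount.strip()
        if not name or not amount:
            continue  # delete blank entries
        # Convert descriptions to a uniform format.
        name = CANONICAL_NAMES.get(name.lower(), name)
        row = (name, int(amount))  # make amounts numeric for later analysis
        if row in seen:
            continue  # remove duplicate entries
        seen.add(row)
        cleaned.append(row)
    return cleaned


cleaned_rows = clean_rows(RAW_ROWS)
print(cleaned_rows)
```

Treating identical rows as duplicates is itself an assumption: two genuine entries with the same name and amount would be collapsed, so it is worth checking against a unique ID or the original source before deleting anything.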
INTERVIEW THE DATA
• Do you speak the same language?
• Where do you come from?
• Who created you?
• How were you gathered?
• What are your goals?
• Do they match yours?
ANALYSIS: SOME EXAMPLES
• Sort by scale: e.g. public employees ranked from highest to lowest paid
• Adding it up: e.g. total salaries paid to the players of a professional baseball team
• Average: e.g. average pay for employees in a certain job category
• Geographical groupings and distribution
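A short Python sketch of the four kinds of analysis listed above: sorting, adding up, averaging and grouping by place. The employee records are invented for illustration, and the same steps could be done in Excel or any of the other tools.

```python
from collections import Counter, defaultdict
from statistics import mean

# Hypothetical records: (name, job category, city, annual pay).
EMPLOYEES = [
    ("A. Lopez", "Engineer", "Sacramento", 98000),
    ("B. Chen", "Librarian", "Long Beach", 61000),
    ("C. Patel", "Engineer", "Long Beach", 104000),
    ("D. Kim", "Librarian", "Sacramento", 59000),
]

# Sort by scale: highest to lowest paid.
by_pay = sorted(EMPLOYEES, key=lambda record: record[3], reverse=True)

# Adding it up: total pay across all records.
total_pay = sum(record[3] for record in EMPLOYEES)

# Average: mean pay within each job category.
pay_by_category = defaultdict(list)
for _name, category, _city, pay in EMPLOYEES:
    pay_by_category[category].append(pay)
average_by_category = {
    category: mean(pays) for category, pays in pay_by_category.items()
}

# Geographical grouping: how many records fall in each city.
count_by_city = Counter(record[2] for record in EMPLOYEES)

print(by_pay[0][0], total_pay, average_by_category, dict(count_by_city))
```

Each result is a possible starting point for a story question rather than a finding in itself, which is where interviewing the data comes back in.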
TOOLS: WHAT ARE REPORTERS USING?
• Excel
• Google Fusion
• SPSS
• Access
• Google Refine
• Social Explorer www.socialexplorer.com
• Python
• Tableau Public
VISUALIZATION: EXAMPLES
• New York Times: The 2012 Budget, How $3.7 trillion is spent.
• Immigration trends: New York Times
• Netflix rental patterns: New York Times
• Pay patterns: Sacramento Bee
• Gas prices: Los Angeles Times
DATA TO PLAY WITH
• Earthquake data
• Earthquakes
• Survey on gun ownership vs. gun control
• Rights to own guns survey