Enhancing C-Span Video Archive with Practice Capital Metadata and Data Journalism APIs

ENHANCING THE C-SPAN ARCHIVE WITH COMMUNICATIVE METADATA: A PRACTICE CAPITAL PROPOSAL. Sorin Adam Matei, Associate Professor, Discovery Park and Polytechnic Institute Fellow, Director of Research for Computational Social Science, CyberCenter, BRIAN LAMB SCHOOL OF COMMUNICATION

Upload: sorin-adam-matei

Post on 21-Jun-2015

Category:

News & Politics


DESCRIPTION

The presentation argues that the C-Span archive is not a mere repository of moving pictures. It can also be seen as a one-of-a-kind “big data” repository. If processed from a “practice capital” perspective with quantitative and network-analytic tools, such data can significantly extend the capabilities of the C-Span archives by identifying the central actors in a debate and their ability to sway it. The proposed approach may serve the public interest through API tools that support third-party development of visualization and analytic apps, which can lead to more informed debates and new forms of data-driven journalism.

TRANSCRIPT

Page 1: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

ENHANCING THE C-SPAN ARCHIVE WITH COMMUNICATIVE METADATA: A PRACTICE CAPITAL PROPOSAL

Sorin Adam Matei
Associate Professor
Discovery Park and Polytechnic Institute Fellow
Director of Research for Computational Social Science, CyberCenter

BRIAN LAMB SCHOOL OF COMMUNICATION

Page 2: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

DATA EVERYWHERE

• The C-Span Archive is a Big Data repository

• Social and Political Big Data

• Captures not just words or moving images but INTERACTIONS

Page 3: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

AN INTERACTION REPOSITORY

• The C-Span archive captures who said what to whom

• Sender → Message → Receiver

• Concatenated, such chains of interaction become SOCIAL NETWORKS OF DEBATE

Page 4: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

COMMUNICATIVE META-DATA

• Each member of the network can be evaluated for his or her role, importance, and impact

• The role, importance and impact can be turned into search and visualization criteria both for the speakers and for what was said

• Meta-data is data that describes the context of the speech-act and can extend the search past tags, keywords, author, or time

Page 5: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

SOCIAL NETWORKS – THE BRIEFEST INTRO

• Mapping people as members of a network reveals things that are not immediately apparent

• What is important is not how much you talk to other people, but how central you are in the debate

Page 6: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

THE IMPORTANCE OF BEING CENTRAL

• Centrality

– Simple

• How many conversation partners you have

• Follow the distribution of contributions

– Complex and subtle

• How important are you in the network of communications?

• If you were not there, would the network be poorer?

Page 7: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

THE MAGIC OF BETWEENNESS CENTRALITY

Node 1 is the most central node in the figure, although it is not the most directly connected

By its attributes it might even be a very unimportant node, or one that is ignored

It is potentially a bridge maker and connector
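The point of this slide can be checked on a small example. What follows is a minimal sketch, not the author's code: it computes betweenness centrality with Brandes' algorithm on a hypothetical seven-node network (the edge list and node names are invented for illustration) in which node "1" links two tightly knit groups.

```python
from collections import deque


def build_graph(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes' algorithm, unweighted graph)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # In an undirected graph every pair is visited from both ends.
    return {v: score / 2 for v, score in bc.items()}


# Hypothetical toy network: two triangles joined only through node "1".
edges = [
    ("a", "b"), ("a", "c"), ("b", "c"),
    ("a", "1"), ("1", "d"),
    ("d", "e"), ("d", "f"), ("e", "f"),
]
adj = build_graph(edges)
scores = betweenness(adj)
degrees = {v: len(neighbors) for v, neighbors in adj.items()}

for node in sorted(scores, key=scores.get, reverse=True):
    print(node, "betweenness:", scores[node], "degree:", degrees[node])
```

In this toy network node "1" has only two direct ties, fewer than "a" and "d" with three each, yet it scores 9 on betweenness against 8 for each of them, because every shortest path between the two groups passes through it.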

Page 8: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

PRACTICE CAPITAL

• Practice: working together within a human space

• Co-work ties are practice ties, not necessarily communicative

• Practice ties can be detected via network analysis

• High betweenness in practice space = high practice capital

Page 9: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

HOW DOES THIS MATTER?

• Mapping social conversations as networks

• Reveals the unseen powerbrokers or bridgemakers

• Suggests new information cues and selection criteria for browsing the videos

• Facilitates a new kind of “data journalism”

Page 10: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

AN EXAMPLE: JOINT SELECT COMMITTEE ON BUDGET DEFICIT REDUCTION HEARINGS

• October and November 2011

• 17 speakers: representatives, senators, and former presidential administration staffers/players

• 280 minutes of conversation

• Over 115 turns of speech

http://c-spanvideo.org/topic/85

Page 11: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

TURNING CONVERSATIONS INTO NETWORKS

• Analyze who is speaking to whom

• Create conversation ties that decay as more time passes between turns of speech

• Speakers that are closest to each other are the most connected, those more distant are exponentially less connected

• The higher the connection, as defined by centrality in practice space, the higher the practice capital
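The tie-building step described on this slide might be sketched as follows. This is an illustration under stated assumptions, not the released code: the exponential form `exp(-decay_rate * gap)`, the `decay_rate` and `max_gap` parameters, and the speaker labels are all hypothetical, since the slides only say that ties weaken exponentially with the time between turns.

```python
import math
from collections import defaultdict


def conversation_ties(turns, decay_rate=0.5, max_gap=10.0):
    """Turn a time-ordered list of (speaker, start_minute) into weighted ties.

    Each pair of turns by different speakers adds exp(-decay_rate * gap) to
    the tie between them, where gap is the time in minutes between the turns.
    Both parameters are assumed values chosen for illustration.
    """
    ties = defaultdict(float)
    for i, (speaker_i, time_i) in enumerate(turns):
        for speaker_j, time_j in turns[i + 1:]:
            gap = time_j - time_i
            if gap > max_gap:
                break  # turns are sorted by time, so later ones are farther still
            if speaker_i == speaker_j:
                continue  # no tie from a speaker to themselves
            pair = tuple(sorted((speaker_i, speaker_j)))
            ties[pair] += math.exp(-decay_rate * gap)
    return dict(ties)


# Hypothetical turns of speech: (speaker, start time in minutes).
turns = [("A", 0.0), ("B", 1.0), ("C", 5.0), ("A", 6.0)]
ties = conversation_ties(turns)

for pair, weight in sorted(ties.items(), key=lambda item: item[1], reverse=True):
    print(pair, round(weight, 3))
```

Speakers A and B exchange turns one minute apart, so their tie comes out stronger than the tie between B and C, whose turns are four minutes apart. The resulting weighted edge list is the kind of input a centrality calculation would then take.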

Page 12: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

TECHNOLOGY WAS TESTED

• Methodology already applied to Wikipedia

• We created a network of 3 million nodes

• Code is written in Java, is open source, and will be released soon

Page 13: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

TEST ANALYSIS APPLIED TO A C-SPAN DEBATE

Two groups, several central talkers. Solid lines mark the strongest relationships.

[Network diagram; node labels: Baucus, Becerra, Bowles, Camp, Clyburn, Domenici, Elmendorf, Hensarling, Kerry, Kyl, Murray, Portman, Rivlin, Simpson, Toomey, Upton, Van Hollen]

Page 14: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

HOW DOES CENTRALITY CHANGE THE STORY?

[Chart: Betweenness Centrality and Speech Minutes, by speaker (Clyburn, Bowles, Domenici, Rivlin, Elmendorf); vertical axis 0 to 100]

[Chart: Speech Minutes, by speaker (Clyburn, Domenici, Rivlin, Bowles, Elmendorf); vertical axis 0 to 80; data labels as extracted: 6, 32.5, 39.25, 49.86, 72.3]

The speakers who talk the most are not the most central (highest practice capital) members of the debate

Page 15: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

THE MODEST PROPOSAL

Add search criteria for centrality, verbosity (amount), and persistence (turns of speech)
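One way such search criteria could work is sketched below. Everything here is hypothetical: the `SpeakerStats` record, the `summarize` and `search` helpers, and the sample numbers are invented for illustration and do not describe an existing C-Span interface. Verbosity is taken as total minutes spoken and persistence as the number of turns, following the slide.

```python
from dataclasses import dataclass


@dataclass
class SpeakerStats:
    name: str
    centrality: float   # e.g. betweenness in practice space, computed elsewhere
    verbosity: float    # total minutes spoken
    persistence: int    # number of turns of speech


def summarize(turns, centrality):
    """Aggregate (speaker, minutes) turns into one SpeakerStats per speaker."""
    totals = {}
    for speaker, minutes in turns:
        spoken, count = totals.get(speaker, (0.0, 0))
        totals[speaker] = (spoken + minutes, count + 1)
    return [
        SpeakerStats(name, centrality.get(name, 0.0), spoken, count)
        for name, (spoken, count) in totals.items()
    ]


def search(stats, min_centrality=0.0, min_verbosity=0.0,
           min_persistence=0, sort_by="centrality"):
    """Filter speakers by the three criteria and sort by one, highest first."""
    if sort_by not in ("centrality", "verbosity", "persistence"):
        raise ValueError(f"unknown sort criterion: {sort_by}")
    hits = [
        s for s in stats
        if s.centrality >= min_centrality
        and s.verbosity >= min_verbosity
        and s.persistence >= min_persistence
    ]
    return sorted(hits, key=lambda s: getattr(s, sort_by), reverse=True)


# Hypothetical debate: (speaker, minutes spoken in that turn).
turns = [
    ("Speaker A", 4.0), ("Speaker B", 1.5), ("Speaker A", 6.0),
    ("Speaker C", 2.0), ("Speaker B", 0.5),
]
centrality = {"Speaker A": 1.0, "Speaker B": 7.0, "Speaker C": 0.0}
stats = summarize(turns, centrality)

by_centrality = [s.name for s in search(stats)]
persistent_by_verbosity = [
    s.name for s in search(stats, min_persistence=2, sort_by="verbosity")
]
print(by_centrality)
print(persistent_by_verbosity)
```

In this invented example Speaker A talks the most but Speaker B ranks first by centrality, which is the kind of difference the added criteria are meant to surface.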

Page 16: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

LOOKING FORWARD

• Analyze the entire C-Span video corpus; generate centrality, verbosity, and persistence for each debater

• Store the results and create a service that serves them alongside other metadata

• Allow third-parties to create visualization tools and apps that indicate degree of connectedness of speakers in practice space

• Visualize practice capital
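To make the idea of serving these measures alongside other metadata concrete, here is a small sketch of what one record might look like as JSON. The field names, the `video_id` value, and both helper functions are assumptions made for illustration; the slides do not specify a schema or an API.

```python
import json


def speaker_metadata_record(video_id, speaker, centrality,
                            verbosity_minutes, persistence_turns):
    """Bundle one speaker's practice capital measures for one video.

    The layout is hypothetical; a real service would define its own schema.
    """
    return {
        "video_id": video_id,
        "speaker": speaker,
        "practice_capital": {
            "betweenness_centrality": centrality,
            "verbosity_minutes": verbosity_minutes,
            "persistence_turns": persistence_turns,
        },
    }


def to_json(records):
    """Serialize records the way a third-party app might receive them."""
    return json.dumps({"speakers": records}, indent=2, sort_keys=True)


# Invented values, not taken from any actual hearing.
records = [
    speaker_metadata_record("example-video-1", "Speaker A", 9.0, 12.5, 4),
    speaker_metadata_record("example-video-1", "Speaker B", 2.0, 30.0, 7),
]
payload = to_json(records)
print(payload)
```

A visualization tool could read such a payload and size or color each speaker by any of the three measures without needing access to the underlying video or transcript.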

Page 17: Enhancing C-Span Video Archive with Practice Capital Metadata and data journalism APIs

QUESTIONS? COMMENTS?

Thank you!