Enhancing C-Span Video Archive with Practice Capital Metadata and Data Journalism APIs
DESCRIPTION
The presentation argues that the C-Span archive is not a mere repository of moving pictures. It can also be seen as a one-of-a-kind “big data” repository. If processed from a “practice capital” perspective with quantitative and network-analytic tools, such data can significantly extend the capabilities of the C-Span archive by identifying the central actors in a debate and their ability to sway it. The proposed approach may serve the public interest through API tools that support third-party development of visualization and analytic apps, which can lead to more informed debates and new forms of data-driven journalism.
TRANSCRIPT
ENHANCING THE C-SPAN ARCHIVE WITH COMMUNICATIVE METADATA: A PRACTICE CAPITAL PROPOSAL
Sorin Adam Matei
Associate Professor
Discovery Park and Polytechnic Institute Fellow
Director of Research for Computational Social Science, CyberCenter
BRIAN LAMB SCHOOL OF COMMUNICATION
DATA EVERYWHERE
• The C-Span Archive is a Big Data repository
• Social and Political Big Data
• Captures not just words or moving images but INTERACTIONS
AN INTERACTION REPOSITORY
• The C-Span archive captures who said what to whom
• Sender → Message → Receiver
• Concatenated, such chains of interaction become SOCIAL NETWORKS OF DEBATE
COMMUNICATIVE META-DATA
• Each member of the network can be evaluated for his or her role, importance, and impact
• The role, importance and impact can be turned into search and visualization criteria both for the speakers and for what was said
• Meta-data is data that describes the context of the speech-act and can extend the search past tags, keywords, author, or time
SOCIAL NETWORKS – THE BRIEFEST INTRO
• Mapping people as members of a network reveals things that are not immediately apparent
• What is important is not how much you talk to other people, but how central you are in the debate
THE IMPORTANCE OF BEING CENTRAL
• Centrality
  – Simple
    • How many conversation partners you have
    • Follow the distribution of contributions
  – Complex and subtle
    • How important are you in the network of communications
    • If you were not there, would the network be poorer?
THE MAGIC OF BETWEENNESS CENTRALITY
1 is the most central node, although it is not the most directly connected
It might even be a very unimportant (by attributes) node or even ignored
It is potentially a bridge maker and connector
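The point above can be checked on a toy network. The sketch below is only an illustration, not the presenter's code: it assumes a made-up graph of two tightly knit groups (a, b, c, d and e, f, g, h) joined by a single bridge node x, and computes betweenness centrality with Brandes' algorithm for unweighted, undirected graphs. All node names are hypothetical.

```python
from collections import deque

def betweenness(graph):
    """Betweenness centrality (Brandes' algorithm), unweighted undirected graph.

    graph: dict mapping each node to the set of its neighbours.
    """
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                               # nodes in order of discovery
        preds = {v: [] for v in graph}           # predecessors on shortest paths
        paths = {v: 0 for v in graph}            # number of shortest paths from source
        dist = {v: -1 for v in graph}
        paths[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:                             # breadth-first search
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    paths[w] += paths[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):                # accumulate dependencies backwards
            for v in preds[w]:
                delta[v] += paths[v] / paths[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: s / 2 for v, s in score.items()}

def clique(nodes):
    """All pairwise ties inside one tightly knit group."""
    return [(a, b) for i, a in enumerate(nodes) for b in nodes[i + 1:]]

# Hypothetical toy network: two groups linked only through "x".
edges = clique(["a", "b", "c", "d"]) + clique(["e", "f", "g", "h"])
edges += [("a", "x"), ("x", "e")]

graph = {}
for u, v in edges:
    graph.setdefault(u, set()).add(v)
    graph.setdefault(v, set()).add(u)

centrality = betweenness(graph)
degree = {v: len(neighbours) for v, neighbours in graph.items()}
most_central = max(centrality, key=centrality.get)
```

In this toy graph, x has only two ties, fewer than any other node, yet it scores 16 on betweenness against 15 for a and e, because every path between the two groups runs through it.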
PRACTICE CAPITAL
• Practice: working together within a human space
• Co-work ties are practice ties, not necessarily communicative
• Practice ties can be detected via network analysis
• High betweenness in practice space = high practice capital
HOW DOES THIS MATTER?
• Mapping social conversations as networks
• Reveals the unseen powerbrokers or bridgemakers
• Suggests new information cues and selection criteria for browsing the videos
• Facilitates a new kind of “data journalism”
AN EXAMPLE: JOINT SELECT COMMITTEE ON BUDGET DEFICIT REDUCTION HEARINGS
• October and November 2011
• 17 speakers: representatives, senators, and former presidential administration staffers/players
• 280 minutes of conversation
• Over 115 turns of speech
http://c-spanvideo.org/topic/85
TURNING CONVERSATIONS INTO NETWORKS
• Analyze who is speaking to whom
• Create conversation ties that decay as more time passes between turns of speech
• Speakers whose turns are closest to each other are the most connected; those more distant are exponentially less connected
• The higher a speaker's centrality in practice space, the higher that speaker's practice capital
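The steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method used in the study: it assumes each turn is recorded as a (speaker, start minute) pair, that tie strength falls off as exp(-decay_rate × gap in minutes), and that turns more than max_gap minutes apart are not linked at all. The function name, both parameters, and the sample turns are hypothetical.

```python
import math
from collections import defaultdict

def conversation_ties(turns, decay_rate=0.5, max_gap=10.0):
    """Build weighted, undirected ties between speakers from ordered turns.

    turns: list of (speaker, start_minute), sorted by start_minute.
    decay_rate, max_gap: illustrative values, not taken from the presentation.
    """
    weights = defaultdict(float)
    for i, (speaker_a, time_a) in enumerate(turns):
        for speaker_b, time_b in turns[i + 1:]:
            gap = time_b - time_a
            if gap > max_gap:
                break                      # later turns are even further away
            if speaker_a != speaker_b:
                pair = tuple(sorted((speaker_a, speaker_b)))
                # Exponential decay: the longer the gap, the weaker the tie.
                weights[pair] += math.exp(-decay_rate * gap)
    return dict(weights)

# Hypothetical turns: speaker label and the minute at which the turn starts.
sample_turns = [("A", 0.0), ("B", 2.0), ("C", 6.0), ("A", 7.0)]
ties = conversation_ties(sample_turns)
strongest_pair = max(ties, key=ties.get)
```

With these sample turns the strongest tie is between A and C, whose turns at minutes 6 and 7 are only one minute apart. The resulting weighted ties could then feed a centrality calculation.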
TECHNOLOGY WAS TESTED
• Methodology already applied to Wikipedia
• We created a network of 3 million nodes
• Code is written in Java, is open source, and will be released soon
TEST ANALYSIS APPLIED TO A C-SPAN DEBATE
Two groups, several central talkers. Solid lines mark the strongest relationships.
Baucus, Becerra, Bowles, Camp, Clyburn, Domenici, Elmendorf, Hensarling, Kerry, Kyl, Murray, Portman, Rivlin, Simpson, Toomey, Upton, Van Hollen
HOW DOES CENTRALITY CHANGE THE STORY?
[Chart: Betweenness Centrality and Speech Minutes for Clyburn, Bowles, Domenici, Rivlin, and Elmendorf; axis from 0 to 100]
[Chart: Speech Minutes for Clyburn, Domenici, Rivlin, Bowles, and Elmendorf; axis from 0 to 80; data labels 6, 32.5, 39.25, 49.86, 72.3]
The speakers who talk the most are not the most central members of the debate, that is, not those with the highest practice capital
THE MODEST PROPOSAL
Add search criteria for centrality, verbosity (amount), and persistence (turns of speech)
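As a rough illustration of the three proposed criteria, the sketch below derives verbosity (total minutes) and persistence (number of turns) from a list of turns, attaches a centrality score supplied from elsewhere, and filters speakers on any combination of the three. The class, function, and field names are hypothetical, and the sample values are invented for the example rather than taken from the hearings.

```python
from dataclasses import dataclass

@dataclass
class SpeakerStats:
    name: str
    centrality: float    # e.g. betweenness in the debate network
    verbosity: float     # total minutes spoken
    persistence: int     # number of turns of speech

def summarize(turns, centrality):
    """turns: list of (speaker, minutes_spoken); centrality: dict speaker -> score."""
    stats = {}
    for speaker, minutes in turns:
        entry = stats.setdefault(
            speaker, SpeakerStats(speaker, centrality.get(speaker, 0.0), 0.0, 0)
        )
        entry.verbosity += minutes
        entry.persistence += 1
    return list(stats.values())

def search(stats, min_centrality=0.0, min_verbosity=0.0,
           min_persistence=0, sort_by="centrality"):
    """Return speakers meeting every threshold, highest sort_by value first."""
    hits = [
        s for s in stats
        if s.centrality >= min_centrality
        and s.verbosity >= min_verbosity
        and s.persistence >= min_persistence
    ]
    return sorted(hits, key=lambda s: getattr(s, sort_by), reverse=True)

# Invented sample data: (speaker, minutes spoken in that turn).
sample_turns = [("A", 4.0), ("B", 1.5), ("A", 3.0), ("C", 0.5), ("B", 2.0)]
sample_centrality = {"A": 2.0, "B": 9.0, "C": 0.0}

stats = summarize(sample_turns, sample_centrality)
by_centrality = [s.name for s in search(stats, min_persistence=2)]
by_verbosity = [s.name for s in search(stats, sort_by="verbosity")]
```

Sorting the same speakers by centrality and by verbosity gives different orderings in this sample, which is the contrast the proposal wants to make searchable.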
LOOKING FORWARD
• Analyze the entire C-Span video corpus and generate centrality, verbosity, and persistence scores for each debater
• Store the results and create a service that serves them alongside other metadata
• Allow third-parties to create visualization tools and apps that indicate degree of connectedness of speakers in practice space
• Visualize practice capital
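One way such a service might hand data to third-party apps is as a JSON record per video. The sketch below is purely hypothetical: the presentation does not specify a schema, so the field names ("video_id", "speakers", "rank") and the sample identifier and scores are assumptions made for illustration.

```python
import json

def build_metadata_record(video_id, speakers):
    """Package per-speaker scores as a JSON-serializable record.

    speakers: dict mapping name -> {"centrality", "verbosity", "persistence"}.
    The schema is an assumption, not an existing C-Span format.
    """
    ranked = sorted(
        speakers.items(), key=lambda item: item[1]["centrality"], reverse=True
    )
    return {
        "video_id": video_id,
        "speakers": [
            {"name": name, "rank": rank, **scores}
            for rank, (name, scores) in enumerate(ranked, start=1)
        ],
    }

# Invented sample scores for three hypothetical speakers.
sample_speakers = {
    "A": {"centrality": 2.0, "verbosity": 7.0, "persistence": 2},
    "B": {"centrality": 9.0, "verbosity": 3.5, "persistence": 2},
    "C": {"centrality": 0.0, "verbosity": 0.5, "persistence": 1},
}

record = build_metadata_record("example-hearing", sample_speakers)
payload = json.dumps(record, indent=2)   # what a client app would receive
```

A visualization app could read the ranked list directly, for example to size each speaker's node by centrality rather than by minutes spoken.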
QUESTIONS? COMMENTS?
Thank you!