INTRODUCTION TO COMPUTATIONAL SOCIAL SCIENCE (CSS01)
LECTURE 1, 1.9.2015
LAURI ELORANTA
@LAURIELORANTA
DATA MINING
DATA AND SOCIETY
BIG DATA
PREDICTIVE ANALYSIS
DIGITAL METHODS
DIGITAL HUMANITIES
SOCIAL NETWORK ANALYSIS
PROGRAMMING IN SOCIAL SCIENCE
IT IS A JUNGLE OUT THERE
COMPLEX SYSTEMS
DATA SCIENCE
HADOOP / MAP REDUCE
REACTIVE PROGRAMMING
PERSONAL DATA
MY DATA
OPEN DATA
IOT WEARABLES
BUZZ
HYPE
BUZZ
HYPE
BUZZ
HYPE
THE BACKGROUND IMAGE “JUNGLE” BY LUKE JONES IS UNDER CREATIVE COMMONS LICENSE
NOT THAT MUCH TALKING AND EVEN LESS DOING: ONLY A FEW PIONEERS IN THE DESERTED CSS SCENE IN FINLAND
THE BACKGROUND IMAGE “DESERT” BY MOYAN BRENN IS UNDER CREATIVE COMMONS LICENSE
• Practicalities
• What is computational social science?
• Areas of Computational Social Science
  • (Big) Data & automated information extraction
  • Social Networks
  • Social Complexity
  • Simulation
• Research examples
• Lecture 1 Reading
LECTURE 1: OVERVIEW
PRACTICALITIES
• The slides and all materials will be online at
  http://blogs.helsinki.fi/computationalsocialscience
• Course consists of
  • 8 lectures
  • A Research Plan Assignment (required if you want study credits, 5 op)
• Any questions?
  • Contact lecturer Lauri Eloranta at firstname dot lastname @ helsinki.fi
PRACTICALITIES: GENERAL
• LECTURE 1: Introduction to Computational Social Science [TODAY]
  • Tuesday 01.09., 16:00–18:00, U35, Seminar room 114
• LECTURE 2: Basics of Computation and Modeling
  • Wednesday 02.09., 16:00–18:00, U35, Seminar room 113
• LECTURE 3: Big Data and Information Extraction
  • Monday 07.09., 16:00–18:00, U35, Seminar room 114
• LECTURE 4: Network Analysis
  • Monday 14.09., 16:00–18:00, U35, Seminar room 114
• LECTURE 5: Complex Systems
  • Tuesday 15.09., 16:00–18:00, U35, Seminar room 114
• LECTURE 6: Simulation in Social Science
  • Wednesday 16.09., 16:00–18:00, U35, Seminar room 113
• LECTURE 7: Ethical and Legal Issues in CSS
  • Monday 21.09., 16:00–18:00, U35, Seminar room 114
• LECTURE 8: Summary
  • Tuesday 22.09., 17:00–19:00, U35, Seminar room 114
LECTURES: SCHEDULE
• Course Book
  • Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
• Further Reading
LITERATURE: COURSE BOOK
• The full eBook is available via Helsinki University Library:
  https://helka.linneanet.fi/cgi-bin/Pwebrecon.cgi?BBID=2753081
LITERATURE: COURSE BOOK
LITERATURE: ADDITIONAL READING
• There will be additional reading given for each lecture
  • Research articles on the topic at hand; some will be given for “homework reading”
• The full list of articles can be found at
  http://blogs.helsinki.fi/computationalsocialscience
• Write a short research plan where you apply a computational social science method to a research problem
• Length: 8 pages for Master's students, 10 pages for PhD students
• Focus on research method <-> research data <-> research problem
• How to write a research plan: general instructions
  • http://www.uta.fi/cmt/en/doctoralstudies/apply/Tutkimussuunnitelmaohjeet_EN%5B1%5D.pdf
  • https://into.aalto.fi/display/endoctoraltaik/Research+Plan
ASSIGNMENT: GENERAL
• The assignment deadline is Friday 2.10.2015 at end of day (midnight)
• All assignments are returned in PDF format
  • How to save your work in PDF format: you can “Save as PDF” or “Print to PDF” in MS Word
• Include your name, student ID and contact details
• Assignments are returned to the lecturer, Lauri Eloranta, via email: firstname dot lastname @ helsinki.fi
• Grading is done in one month's time, and you will receive the study credits on or before 30.10.2015
ASSIGNMENT: HOW TO RETURN THE ASSIGNMENT
• Contains six courses covering different aspects of computational social science
• Full study block: 25-30 op
• Basic courses (mandatory)
  • Introduction to Computational Social Science (5 op) (I period)
  • Introduction to Programming in Social Science (5 op) (II period)
• Special courses
  • Data extraction (5 op) (IV period)
  • Network Analysis (5 op) (in 2016–2017)
  • Complex Systems (5 op) (III period)
  • Simulation (5 op) (in 2016–2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE?
“In short, a computational social science is emerging [field] that leverages the capacity to collect and analyze data with an unprecedented breadth and depth and scale” (Lazer et al. 2009)
Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
LAZER ET AL. 2009
• “The increasing integration of technology into our lives has created unprecedented volumes of data on society's everyday behaviour. Such data opens up exciting new opportunities to work towards a quantitative understanding of our complex social systems, within the realms of a new discipline known as Computational Social Science. Against a background of financial crises, riots and international epidemics, the urgent need for a greater comprehension of the complexity of our interconnected global society and an ability to apply such insights in policy decisions is clear.” (Conte et al. 2012)
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
CSS MANIFESTO (CONTE ET AL. 2012)
• “Computational social science refers to the academic sub-disciplines concerned with computational approaches to the social sciences. Fields include computational economics and computational sociology. It is a multi-disciplinary and integrated approach to social survey focusing on information processing by means of advanced information technology. The computational tasks include the analysis of social networks and social geographic systems.”
• (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science)
WIKIPEDIA
• “The new field of Computational Social Science can be defined as the interdisciplinary investigation of the social universe of many scales, ranging from individual actors to the largest groupings, through the medium of computation.” (Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S)” BY RODRIGO CARVALHO IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
[Chart: change over time, vertical axis from less to more]
• Speed and performance of IT (CPU, RAM, network)
• Access to IT, Internet
• Amount of data generated
• Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science offers revolutionary opportunities for the social sciences, but it still faces challenges in relation to methods, interdisciplinary cooperation and research ethics.
1. Solving increasingly complex problems: the problems of a global world are complex, and computational methods might be able to solve these complex issues
2. The rise of data: the amount of data has exploded during the 21st century
3. IT and instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, …)
8. Many problems and challenges, especially regarding research ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation to CSS:
  1. Information processing is substantive to the complex systems of society that CSS researches. This means that information processing takes part in the formation and evolution of complex systems.
  2. Information processing is methodological in the sense that it serves as the core instrument of CSS.
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
  1. (Big) Data & automated data extraction
     • Generate, retrieve, sort, modify, transform, … data
  2. Social Networks
     • Network analysis and social networks
  3. Social Complexity
     • Social complexity, complex adaptive systems, complex systems modeling
  4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation for the other areas of CSS
• Raw data can be used as
  1. Data for its own sake as research data -> data is the subject of research
  2. Data for modeling or validating other phenomena via e.g. network analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, … for research purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
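As a rough illustration of the retrieve, transform and sort steps listed above, here is a minimal Python sketch. The posts in `RAW_POSTS`, the hashtag pattern and the function names are all assumptions made up for this example; in real research the raw data would come from an actual source, and extraction would need far more care.

```python
import re
from collections import Counter

# Hypothetical raw posts, invented for illustration only.
RAW_POSTS = [
    "Flu season again, staying home #flu #sick",
    "Election debate tonight #vaalit2015",
    "Got my flu shot today #Flu",
]

# A hashtag is taken to be '#' followed by word characters (an assumption).
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(posts):
    """Retrieve and transform: pull hashtags out of each post, lower-cased."""
    return [tag.lower() for post in posts for tag in HASHTAG.findall(post)]


def count_tags(posts):
    """Sort: count the hashtags, ordered from most to least frequent."""
    return Counter(extract_hashtags(posts)).most_common()


if __name__ == "__main__":
    for tag, count in count_tags(RAW_POSTS):
        print(tag, count)
```

The resulting counts could serve either as research data in their own right or as input for network analysis or simulation.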
• A long tradition in network analysis (a much older field than CSS)
• Social networks (Facebook, Twitter etc.) are just one part of network analysis
• Many other social interactions can be modeled as networks -> thus social networks are not technology dependent as such
  • -> e.g. modeling a family as a network
  • -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
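To make "modeling a family as a network" concrete, the sketch below stores a small, invented family as an undirected graph and computes two basic network measures: a person's degree and the shortest distance between two people. The names in `TIES` and the helper functions are hypothetical and only serve the example; dedicated network analysis libraries offer much more.

```python
from collections import deque

# Hypothetical family ties (undirected edges), invented for illustration.
TIES = [
    ("Anna", "Ben"),
    ("Anna", "Clara"),
    ("Ben", "Clara"),
    ("Clara", "David"),
    ("David", "Eva"),
]


def build_network(ties):
    """Turn a list of ties into an adjacency mapping: person -> set of neighbours."""
    network = {}
    for a, b in ties:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def degree(network, person):
    """Number of direct ties a person has."""
    return len(network[person])


def distance(network, start, goal):
    """Shortest number of ties between two people (breadth-first search).

    Returns None if the two people are not connected.
    """
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in network[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


if __name__ == "__main__":
    family = build_network(TIES)
    print(degree(family, "Clara"))
    print(distance(family, "Anna", "Eva"))
```

The same structure works for a project organisation: only the list of ties changes, which is why the network view is not tied to any particular technology.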
• Society seen as a complex adaptive system
  • Phase transitions
• Adaptation (multi-stage process)
  • Need -> intent -> capacity -> implementation
  • Goal
• Information processing in many parts of complex adaptive systems
  • To help adaptation, allocating resources, coordination, …
• Family as a complex adaptive system
  • Development, hardships, births, deaths, successes, failures
  • Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
  1. Natural systems
  2. Human systems
  3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
  • Tangible (e.g. roads, buildings)
  • Intangible (e.g. organisations, social structures)
SIMON'S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
  1. Variable-Oriented Models
     • System Dynamics Models (e.g. modeling a nuclear plant)
     • Queuing Models (e.g. modeling how a box office line behaves)
  2. Object-Oriented Models
     • Cellular automata (e.g. Game of Life, http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life, http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
     • Agent-based models (e.g. modeling the communication of a project organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
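The cellular automaton mentioned above, Conway's Game of Life, is small enough to sketch in a few lines of Python. The representation chosen here (a set of live cell coordinates on an unbounded grid) and the function name `step` are choices made for this example, not something prescribed by the course book.

```python
from collections import Counter


def step(live):
    """Compute the next generation of Conway's Game of Life.

    `live` is a set of (row, col) coordinates of live cells on an unbounded grid.
    """
    # Count, for every cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive next round if it has exactly 3 live neighbours,
    # or if it has exactly 2 and is already alive.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


if __name__ == "__main__":
    # The "blinker": three cells in a row that flip between horizontal and vertical.
    blinker = {(1, 0), (1, 1), (1, 2)}
    print(sorted(step(blinker)))
    print(step(step(blinker)) == blinker)
```

Each cell follows the same simple local rule, yet patterns emerge at the level of the whole grid, which is what makes cellular automata a useful entry point to object-oriented simulation.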
• 4 main areas of Computational Social Science
  1. Big data and automatic information extraction
  2. Social networks
  3. Social complexity
  4. Simulation
• Typically all of these work together
• CSS has a lot of problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science fields or methods. Instead, CSS complements other research fields and methods.
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on flu-related search queries
• For example:
  • Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu, Benyuan Liu 2011. Predicting Flu Trends using Twitter data. Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
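A very simplified way to see how network-based disease modeling works is to let an infection spread step by step over a contact network. The sketch below is not the method of Google Flu Trends or of the cited paper: it assumes that every contact transmits the infection and that nobody recovers, and the network in `CONTACTS` is invented. Real epidemiological models add transmission probabilities and recovery.

```python
# Hypothetical contact network: person -> list of people they meet.
CONTACTS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}


def spread(contacts, initially_infected, rounds):
    """Return the number of infected people after each round.

    Simplifying assumptions: every contact of an infected person becomes
    infected in the next round, and nobody recovers.
    """
    infected = set(initially_infected)
    history = [len(infected)]
    for _ in range(rounds):
        newly_infected = {
            neighbour
            for person in infected
            for neighbour in contacts.get(person, ())
            if neighbour not in infected
        }
        infected |= newly_infected
        history.append(len(infected))
    return history


if __name__ == "__main__":
    print(spread(CONTACTS, {"A"}, rounds=3))
```

Even this toy version shows why network structure matters: who is reached, and how fast, depends on who is connected to whom rather than on the population size alone.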
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of the news cycle. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pp. 497-506.
  • “Tracking new topics, ideas, and ‘memes’ across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt spikes in the appearance of particular named entities. However, these approaches are less well suited to the identification of content that spreads widely and then fades over time scales on the order of days - the time scale at which we perceive news and events.”
  • “We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis.”
MODELING NEWS CYCLE DYNAMICS
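The idea of clustering textual variants of a phrase can be sketched with a simple word-overlap measure. This is only an illustration and not the algorithm of Leskovec et al.: it assumes that two phrases are variants when the Jaccard similarity of their word sets reaches a threshold, and the example phrases, the threshold of 0.5 and the function names are all chosen for this example.

```python
def words(phrase):
    """Lower-cased set of words in a phrase."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Share of words two word sets have in common (0.0 to 1.0)."""
    return len(a & b) / len(a | b)


def cluster_phrases(phrases, threshold=0.5):
    """Greedy clustering: put each phrase into the first cluster whose
    representative (its first phrase) is similar enough, else start a new one.
    Assumes every phrase contains at least one word.
    """
    clusters = []
    for phrase in phrases:
        for cluster in clusters:
            if jaccard(words(phrase), words(cluster[0])) >= threshold:
                cluster.append(phrase)
                break
        else:
            clusters.append([phrase])
    return clusters


if __name__ == "__main__":
    # Hypothetical phrase variants, invented for illustration.
    phrases = [
        "lipstick on a pig",
        "you can put lipstick on a pig",
        "yes we can",
        "yes we can change",
    ]
    for cluster in cluster_phrases(phrases):
        print(cluster)
```

Counting how often each cluster appears per day would then give a rough picture of how a phrase rises and fades over the news cycle.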
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based Model for Estimating Residential Water Demand. SIMULATION, March 2005, 81, 175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data & Society, July-December 2014, 1, 2053951714543804, first published on August 5, 2014. doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big data? Information Technology, Volume 56, Issue 5, Pages 246–253, ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046, September 2014.
  • “Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections of these socially generated footprints, often known as big data, could help us to re-investigate different aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through the analysis of socially generated data on the web during electoral campaigns. Such data offer considerable possibility for improving our awareness of popularity dynamics. However, they also suffer from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss potential ways around such problems, suggesting the nature of different political systems and contexts might lend differing levels of predictive power to certain types of data source. We offer an initial exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google search queries. On the basis of this data, we present popularity dynamics from real case examples of recent elections in three different countries.”
PREDICTING ELECTIONS
• DIGIVAALIT 2015
  • http://www.hiit.fi/digivaalit-2015
  • Researching the 2015 parliamentary elections in Finland, focusing on digital media data (Twitter, Facebook)
  • Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
  • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
  • “Diving deep into the unscoped virtual territories of a nation's collective consciousness may reveal something remarkable. The Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors, who use the online town square to talk about anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn about Finnish society? A team of media professionals at the forum's owner company Aller and researchers at the National Consumer Research Center plan to make use of this immense database.”
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
  • http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank you!
Questions and comments?
Twitter: @laurieloranta
LAURIELORANTA
DATA MINING
DATA AND SOCIETY
BIG DATA
PREDICTIVE ANALYSIS
DIGITAL METHODS
DIGITAL HUMANITIES
SOCIAL NETWORK ANALYSIS
PROGRAMMING IN SOCIAL SCIENCE
IT IS A JUNGLE OUT THERE
COMPLEX SYSTEMS
DATA SCIENCE
HADOOPMAP REDUCE
REACTIVE PROGRAMMING
PERSONAL DATA
MY DATA
OPEN DATA
IOT WEARABLES
BUZZ
HYPE
BUZZ
HYPE
BUZZ
HYPE
THE BACKGROUND IMAGE ldquoJUNGLErdquo BY LUKE JONESIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
NOT THAT MUCH TALKING ANDEVEN LESS DOINGONLY A FEW PIONEERS IN THE DESERTED CSS SCENE IN FINLAND
THE BACKGROUND IMAGE ldquoDESERTrdquo BY MOYAN BRENNIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
bull Practicalities
bull What is computational social science
bull Areas of Computational Social Science
bull (Big) Data amp automated information extraction
bull Social Networks
bull Social Complexity
bull Simulation
bull Research examples
bull Lecture 1 Reading
LECTURE 1OVERVIEW
PRACTICALITIES
bull The slides and all materials will be online at
httpblogshelsinkificomputationalsocialscience
bull Course consists of
bull 8 Lectures
bull A Research Plan Assignment (required if you want study credits 5op)
bull Any questions
bull Contact lecturer Lauri Eloranta at firstname dot lastname helsinkifi
PRACTICALITIESGENERAL
bull LECTURE 1 Introduction to Computational Social Science [TODAY]
bull Tuesday 0109 1600 ndash 1800 U35 Seminar room114
bull LECTURE 2 Basics of Computation and Modeling
bull Wednesday 0209 1600 ndash 1800 U35 Seminar room 113
bull LECTURE 3 Big Data and Information Extraction
bull Monday 0709 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 4 Network Analysis
bull Monday 1409 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 5 Complex Systems
bull Tuesday 1509 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 6 Simulation in Social Science
bull Wednesday 1609 1600 ndash 1800 U35 Seminar room 113
bull LECTURE 7 Ethical and Legal issues in CSS
bull Monday 2109 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 8 Summary
bull Tuesday 2209 1700 ndash 1900 U35 Seminar room 114
LECTURESSCHEDULE
bull Course Book
bull Cioffi-Revilla Claudio (2014) Introduction to
Computational Social Science Springer-
Verlag London
bull Further
Reading
LITERATURECOURSE BOOK
bull The full eBook is available via Helsinki
University Library
httpshelkalinneanetficgi-
binPwebreconcgiBBID=2753081
LITERATURECOURSE BOOK
LITERATUREADDITIONAL READING
bull There will be additional reading given for each lecture
bull Research articles on the topic at hand some will be given for ldquohomework
readingrdquo
bull The full list of articles can be found at
httpblogshelsinkificomputationalsocialscience
bull Write a short research plan where you apply a computational social
science method to a research problem
bull Length 8 pages for Masterrsquos students 10 pages for PhD students
bull Focus on research method lt-gt research data lt-gt research problem
bull How to write a research plan general instructions
bull httpwwwutaficmtendoctoralstudiesapplyTutkimussuunnitelmaohje
et_EN5B15Dpdf
bull httpsintoaaltofidisplayendoctoraltaikResearch+Plan
ASSIGNMENTGENERAL
bull Assignment DL is Friday 2102015 at EODMidnight
bull All assignments are returned in PDF-format
bull How to save my work in pdf-format You can rdquoSave as PDFrdquo or rdquoPrint to PDFrdquo in MS
Word
bull Include your name student ID and contact details
bull Assignments are returned to the lecturer Lauri Eloranta via email
firstname dot lastname helsinkifi
bull Grading is done in one monthrsquos time and you will receive the study
credits on or before 30102015
ASSIGNMENTHOW TO RETURN THE ASSIGNMENT
bull Contains six course covering different aspects of computational social
science
bull Full stydy block 25-30 op
bull Basic courses (mandatory)
bull Introduction to Computational Social Science (5 op) (I period)
bull Introduction to Programming in Social Science (5 op) (II period)
bull Special courses
bull Data extraction (5 op) (IV period)
bull Network Analysis (5 op) (in 2016 ndash 2017)
bull Complex Systems (5 op) (III period)
bull Simulation (5 op) (in 2016 ndash 2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE
ldquoIn short a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scalerdquo (Lazer et al 2009)
Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull ldquoIn short a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scalerdquo
bull Lazer D et al 2009 Computational Social Science Science 6 February
2009 Vol 323 no 5915 pp 721-723
LAZER ET AL 2009
bull ldquoThe increasing integration of technology into our lives has created
unprecedented volumes of data on societyrsquos everyday behaviour Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems within the realms of a
new discipline known as Computational Social Science Against a
background of financial crises riots and international epidemics the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear (Conte et al 2012)
bull Conte R 2012 Manifesto of Computational Social Science The
European Physical Journal Special Topics November 2012 Vol 214
Issue 1 pp 325-346
CSS MANIFESTO(CONTE ET AL 2012)
bull ldquoComputational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences Fields
include computational economics and computational sociology
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology The computational tasks include the analysis of social
networks and social geographic systemsrdquo
bull (Wikipedia 2015 httpenwikipediaorgwikiComputational_social_science)
WIKIPEDIA
bull ldquoThe new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales ranging from individual actors to
the largest groupings through the medium of computationrdquo
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla Claudio (2014) Introduction to Computational Social Science Springer-Verlag London
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE ldquoPOINT AND LINE TO (MULTIPLE) PLANE(S)rdquo RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE ldquoTATEL TELESCOPErdquo BY EP_JHUIS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
Time
More
Less
bull Speed and performance of IT (CPU RAM Network)
bull Access to IT Internet
bull Amount of data generated
bull Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE ldquoHOME VISITrdquo BY NICOLAS NOVAIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS THE BACKGROUND IMAGE ldquoCAMEacuteRA DE SURVEILLANCErdquo BY TRISTAN NITOT
IS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
NOT A SILVER BULLET
COMPUTATIONAL SOCIAL SCIENCE IS
THE BACKGROUND IMAGE ldquo9MM BULLET BWrdquo BY AN NGUYENIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
Computational Social Science
proposes revolutionary opportunities
for the social sciences but it has still
some challenges in relation to
methods interdisciplinary
cooperation and research ethics
1 Solving increasingly complex problems The problems of global
world are complex computational methods might be able to solve
these complex issues
2 The rise of data The amounts of data has exploded during the 21st
century
3 IT and Instrumental revolution all the new tools and possibilities
4 Complex systems modeling our dynamic organisations and societies
5 Social networks modeling human behavior as networks
6 Making predictions and simulations predicting future from the past
7 Interdisciplinary field (social sciences math computer sciencehellip)
8 Many problems and challenges especially regarding research
ethics
CSS COMPONENTS
bull Information processing paradigm has two aspects in relation
to CSS
1 Information processing is substantive to the complex
systems of society that CSS researches This means that
information processing is takes part in forming and
evolution of complex systems
2 Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
BIG DATA amp AUTOMATED INFROMATION EXTRACTION
SOCIAL NETWORK ANALYSIS
COMPLEX SYSTEMS amp MODELING
SIMULATION
1
2
3
4THE MAIN AREAS OF CSS
bull Areas of Computational Social Science
1 (Big) Data amp automated data extraction
bull Generate retrieve sort modify transform hellip data
2 Social Networks
bull Network analysis and social networks
3 Social Complexity
bull Social complexity complex adaptive systems complex
systems modeling
4 Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
bull Data and automated information extraction can be seen as foundation
for the other areas of CSS
bull Raw data can be used as
1 Data for its own sake as research data -gt data is the subject of
research
2 Data for modeling or validating other phenomena via eg network
analysis complex systems analysis or simulation
bull Data is generated retrieved modified transformedhellip for research
purposes via computational automation
BIG DATA amp AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
bull A long tradition in network analysis (much older field than CSS)
bull Social Networks (Facebook Twitter etc) just one part of network
analysis
bull Many other social interactions can be modeled as networks -gt thus
social networks are not technology dependent as such
bull -gt eg modeling family as network
bull -gt eg modeling a project as network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
bull Society seen as a complex adaptive system
bull Phase transitions
bull Adaptation (multi stage process)
bull Need -gt intent -gt capacity -gt implementation
bull Goal
bull Information processing in many parts of Complex adaptive systems
bull To help adaptation allocating resources coordination hellip
bull Family as and complex adaptive system
bull Development hardships births deaths successes failures
bull Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Three types of systems
1 Natural systems
2 Human systems
3 Artificial systems
bull Artificial systems (or artifacts) exist because they have a function they
serve as adaptive buffers between humans and nature
bull Humans pursue the strategy of building artifacts to achieve goals
bull Two kinds of artificial systems working in synergy
bull Tanglible (eg roads buildings)
bull Intanglibe ( eg organisations social structures)
SIMONrsquoS THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Large (and old) research field
bull Two main areas of simulation
1 Variable-Oriented Models
bull System Dynamics Models (eg modeling a nuclear plant)
bull Queuing Models (eg modeling how a box office line behaves)
2 Object-Oriented Models
bull Cellular automate (eg Game of life httpenwikipediaorgwikiConway27s_Game_of_Life
httppmaveustuffjavascript-game-of-life-v311)
bull Agent based models (eg Modeling the communication of a project
organisation of many individuals)
bull Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
bull 4 main areas of Computational Social Science
1 Big data and automatic information extraction
2 Social networks
3 Social complexity
4 Simulation
bull Typically all of these working together
bull CSS has a lot of problems especially concerning privacy and ethics
bull CSS is not a silver bullet and it does not replace other social science
fields or methods Instead CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
bull Tracking and predicting how flu or other contagious diseases spread
bull Based on network and social media analysis and modeling
bull Many different variations one of the first Google Flu Trends based on
flu related search queries
bull For example
bull Achrekar H Gandhe A Lazarus R Ssu-Hsin Yu Benyuan Liu 2011 Predicting Flu
Trends using Twitter data Computer Communications Workshops (INFOCOM
WKSHPS) 2011 IEEE Conference on vol no pp702707 10-15 April 2011
MODELING THE SPREAD OF DISEASESALREADY AN EPIDEMOLOGY CLASSIC
bull httpwwwgoogleorgflutrendsintlen_us
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L. and Kleinberg, J. 2009. Meme-tracking and the dynamics of the news cycle. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 497–506.
• Tracking new topics, ideas, and “memes” across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt spikes in the appearance of particular named entities. However, these approaches are less well suited to the identification of content that spreads widely and then fades over time scales on the order of days - the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
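The idea of grouping textual variants of a phrase can be sketched with a much simpler stand-in than the scalable algorithms of the paper: a greedy clustering based on word overlap (Jaccard similarity). The function names, the threshold of 0.5 and the example phrases below are all made up for illustration and are not taken from Leskovec et al.

```python
def tokens(phrase):
    """Lower-cased set of words in a phrase."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Share of words that two word sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_phrases(phrases, threshold=0.5):
    """Greedy clustering: a phrase joins the first cluster whose first
    phrase (its representative) is similar enough, else starts a new one."""
    clusters = []
    for phrase in phrases:
        for cluster in clusters:
            if jaccard(tokens(phrase), tokens(cluster[0])) >= threshold:
                cluster.append(phrase)
                break
        else:
            clusters.append([phrase])
    return clusters


# Invented example phrases, standing in for quotes found in news text.
examples = [
    "the economy is strong",
    "the economy is very strong",
    "we need change now",
    "we need real change now",
    "hello world",
]
for cluster in cluster_phrases(examples):
    print(cluster)
```

A greedy pass like this depends on the order of the input and compares every phrase with every cluster, so it illustrates the concept but would not scale to web-sized text collections.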
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A. and Mylopoulos, Y. A. 2005. A Hybrid Agent-Based Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81, pp. 175–187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109–118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V. and De Pryck, K. 2014. Three maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data & Society, July–December 2014, vol. 1, 2053951714543804, first published on August 5, 2014. doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big data? Information Technology, Volume 56, Issue 5, Pages 246–253, ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046, September 2014.
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections of these socially generated footprints, often known as big data, could help us to re-investigate different aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through the analysis of socially generated data on the web during electoral campaigns. Such data offer considerable possibility for improving our awareness of popularity dynamics. However, they also suffer from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss potential ways around such problems, suggesting the nature of different political systems and contexts might lend differing levels of predictive power to certain types of data source. We offer an initial exploratory test of these ideas, focussing on two data streams: Wikipedia page views and Google search queries. On the basis of this data, we present popularity dynamics from real case examples of recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
  • http://www.hiit.fi/digivaalit-2015
  • Researching the 2015 parliamentary elections in Finland, focusing on digital media data (Twitter, Facebook)
  • Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
  • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
  • Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The hugely popular Finnish Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn about Finnish society? A team of media professionals at the forum’s owner company Aller and researchers at the National Consumer Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
  • http://www.radiolab.org/story/trust-engineers
• Think about and discuss the different ethical research issues in relation to what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721–723.
• Conte, R. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325–346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719–721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank you!
Questions and comments?
Twitter: @laurieloranta
• LECTURE 1: Introduction to Computational Social Science [TODAY]
  • Tuesday 01.09. 16:00–18:00, U35, Seminar room 114
• LECTURE 2: Basics of Computation and Modeling
  • Wednesday 02.09. 16:00–18:00, U35, Seminar room 113
• LECTURE 3: Big Data and Information Extraction
  • Monday 07.09. 16:00–18:00, U35, Seminar room 114
• LECTURE 4: Network Analysis
  • Monday 14.09. 16:00–18:00, U35, Seminar room 114
• LECTURE 5: Complex Systems
  • Tuesday 15.09. 16:00–18:00, U35, Seminar room 114
• LECTURE 6: Simulation in Social Science
  • Wednesday 16.09. 16:00–18:00, U35, Seminar room 113
• LECTURE 7: Ethical and Legal Issues in CSS
  • Monday 21.09. 16:00–18:00, U35, Seminar room 114
• LECTURE 8: Summary
  • Tuesday 22.09. 17:00–19:00, U35, Seminar room 114
LECTURES: SCHEDULE
• Course book:
  • Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
• Further reading
LITERATURE: COURSE BOOK
• The full eBook is available via Helsinki University Library:
  https://helka.linneanet.fi/cgi-bin/Pwebrecon.cgi?BBID=2753081
LITERATURE: COURSE BOOK
LITERATURE: ADDITIONAL READING
• Additional reading will be given for each lecture
• Research articles on the topic at hand; some will be given as “homework reading”
• The full list of articles can be found at
  http://blogs.helsinki.fi/computationalsocialscience
• Write a short research plan where you apply a computational social science method to a research problem
• Length: 8 pages for Master’s students, 10 pages for PhD students
• Focus on research method <-> research data <-> research problem
• How to write a research plan, general instructions:
  • http://www.uta.fi/cmt/en/doctoralstudies/apply/Tutkimussuunnitelmaohjeet_EN%5B1%5D.pdf
  • https://into.aalto.fi/display/endoctoraltaik/Research+Plan
ASSIGNMENT: GENERAL
• The assignment deadline is Friday 2.10.2015 at end of day (midnight)
• All assignments are returned in PDF format
  • How do I save my work in PDF format? You can “Save as PDF” or “Print to PDF” in MS Word
• Include your name, student ID and contact details
• Assignments are returned to the lecturer, Lauri Eloranta, via email: firstname dot lastname @ helsinki.fi
• Grading is done within one month, and you will receive the study credits on or before 30.10.2015
ASSIGNMENT: HOW TO RETURN THE ASSIGNMENT
• Contains six courses covering different aspects of computational social science
• Full study block: 25–30 op
• Basic courses (mandatory)
  • Introduction to Computational Social Science (5 op) (I period)
  • Introduction to Programming in Social Science (5 op) (II period)
• Special courses
  • Data extraction (5 op) (IV period)
  • Network Analysis (5 op) (in 2016–2017)
  • Complex Systems (5 op) (III period)
  • Simulation (5 op) (in 2016–2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE?
• “In short, a computational social science is emerging [field] that leverages the capacity to collect and analyze data with an unprecedented breadth and depth and scale.” (Lazer et al. 2009)
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721–723.
LAZER ET AL. 2009
• “The increasing integration of technology into our lives has created unprecedented volumes of data on society’s everyday behaviour. Such data opens up exciting new opportunities to work towards a quantitative understanding of our complex social systems, within the realms of a new discipline known as Computational Social Science. Against a background of financial crises, riots and international epidemics, the urgent need for a greater comprehension of the complexity of our interconnected global society and an ability to apply such insights in policy decisions is clear.” (Conte et al. 2012)
• Conte, R. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325–346.
CSS MANIFESTO (CONTE ET AL. 2012)
• “Computational social science refers to the academic sub-disciplines concerned with computational approaches to the social sciences. Fields include computational economics and computational sociology. It is a multi-disciplinary and integrated approach to social survey focusing on information processing by means of advanced information technology. The computational tasks include the analysis of social networks and social geographic systems.”
• (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science)
WIKIPEDIA
• “The new field of Computational Social Science can be defined as the interdisciplinary investigation of the social universe of many scales, ranging from individual actors to the largest groupings, through the medium of computation.” (Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S)” BY RODRIGO CARVALHO IS UNDER NON-COMMERCIAL CREATIVE COMMONS LICENSE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON-COMMERCIAL CREATIVE COMMONS LICENSE
IT IS FOREMOST AN
[Figure labels: COMPUTER SCIENCE, SOCIAL SCIENCE, STATISTICS, COMPUTATIONAL SOCIAL SCIENCE]
[Figure: trends over time; axis labels: Time, More, Less]
• Speed and performance of IT (CPU, RAM, network)
• Access to IT and the Internet
• Amount of data generated
• Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational social science offers revolutionary opportunities for the social sciences, but it still faces challenges in relation to methods, interdisciplinary cooperation and research ethics.
1. Solving increasingly complex problems: the problems of a global world are complex, and computational methods might be able to help solve them
2. The rise of data: the amount of data has exploded during the 21st century
3. IT and the instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, …)
8. Many problems and challenges, especially regarding research ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation to CSS:
  1. Information processing is substantive to the complex systems of society that CSS researches. This means that information processing takes part in the formation and evolution of complex systems.
  2. Information processing is methodological in the sense that it serves as the core instrument of CSS.
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of computational social science:
  1. (Big) data & automated data extraction
    • Generate, retrieve, sort, modify, transform, … data
  2. Social networks
    • Network analysis and social networks
  3. Social complexity
    • Social complexity, complex adaptive systems, complex systems modeling
  4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation for the other areas of CSS
• Raw data can be used as:
  1. Data for its own sake, as research data -> the data is the subject of research
  2. Data for modeling or validating other phenomena, e.g. via network analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, … for research purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
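As a small illustration of what "retrieve, transform, aggregate via computational automation" can mean in practice, the sketch below extracts hashtags from short messages and counts them. The messages are invented for this example; in a real project they would be retrieved from a data source such as a platform API or a discussion forum archive.

```python
import re
from collections import Counter

# A hashtag is a "#" followed by one or more word characters.
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(messages):
    """Count hashtags across messages, ignoring upper/lower case."""
    counts = Counter()
    for text in messages:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


# Invented example messages (not real data).
messages = [
    "Voting today! #Election #vote",
    "Long queue at the polling station #election",
    "No tags here",
]
print(extract_hashtags(messages).most_common())
```

The resulting counts can serve either as research data in their own right or as input for network analysis or simulation.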
• A long tradition in network analysis (a much older field than CSS)
• Online social networks (Facebook, Twitter etc.) are just one part of network analysis
• Many other social interactions can be modeled as networks -> thus social networks are not technology dependent as such
  • -> e.g. modeling a family as a network
  • -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
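Modeling a family as a network needs no special technology. The sketch below stores a hypothetical family as an undirected network and computes two basic measures: the number of ties of each member (degree) and the number of steps between two members. The family members and their ties are invented for illustration.

```python
from collections import deque


def build_network(ties):
    """Build an undirected network {node: set(neighbours)} from pairs."""
    network = {}
    for a, b in ties:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def degree(network):
    """Number of direct ties of each node."""
    return {node: len(neighbours) for node, neighbours in network.items()}


def shortest_path_length(network, source, target):
    """Fewest ties between two nodes (breadth-first search), or None."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, distance = queue.popleft()
        for neighbour in network[node]:
            if neighbour == target:
                return distance + 1
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


# Hypothetical family: who interacts directly with whom.
family_ties = [
    ("grandmother", "mother"),
    ("mother", "father"),
    ("mother", "child_a"),
    ("mother", "child_b"),
    ("father", "child_a"),
    ("father", "child_b"),
    ("child_a", "child_b"),
]
family = build_network(family_ties)
print(degree(family))
print(shortest_path_length(family, "grandmother", "child_a"))
```

The same representation works for a project organisation or for online ties; only the meaning of a node and a tie changes.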
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (a multi-stage process)
• Need -> intent -> capacity -> implementation
• Goal
• Information processing takes place in many parts of complex adaptive systems
• To help adaptation: allocating resources, coordination, …
• Family as a complex adaptive system
• Development, hardships, births, deaths, successes, failures
• Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems:
  1. Natural systems
  2. Human systems
  3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy:
  • Tangible (e.g. roads, buildings)
  • Intangible (e.g. organisations, social structures)
SIMON’S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation:
  1. Variable-oriented models
    • System dynamics models (e.g. modeling a nuclear plant)
    • Queuing models (e.g. modeling how a box office line behaves)
  2. Object-oriented models
    • Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life, http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
    • Agent-based models (e.g. modeling the communication of a project organisation of many individuals)
• Also: evolutionary models
SIMULATION
(Cioffi-Revilla 2014)
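Conway's Game of Life, mentioned above as an example of a cellular automaton, fits in a few lines of code. The sketch below represents the live cells as a set of (row, column) pairs on an unbounded grid and applies the standard rules: a cell is alive in the next generation if it has exactly three live neighbours, or if it is alive now and has exactly two. The representation is one choice among many and is not taken from the course book.

```python
from collections import Counter


def life_step(alive):
    """Return the next generation for a set of live (row, col) cells."""
    # Count, for every cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (row + d_row, col + d_col)
        for (row, col) in alive
        for d_row in (-1, 0, 1)
        for d_col in (-1, 0, 1)
        if (d_row, d_col) != (0, 0)
    )
    # Birth with exactly 3 neighbours, survival with 2 or 3.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in alive)
    }


# The "blinker": three cells in a row that flip between
# horizontal and vertical every generation.
blinker = {(1, 0), (1, 1), (1, 2)}
print(sorted(life_step(blinker)))
```

Simple local rules like these can produce complex global patterns, which is the main reason cellular automata are used as a starting point for simulating social systems.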
NOT THAT MUCH TALKING ANDEVEN LESS DOINGONLY A FEW PIONEERS IN THE DESERTED CSS SCENE IN FINLAND
THE BACKGROUND IMAGE ldquoDESERTrdquo BY MOYAN BRENNIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
bull Practicalities
bull What is computational social science
bull Areas of Computational Social Science
bull (Big) Data amp automated information extraction
bull Social Networks
bull Social Complexity
bull Simulation
bull Research examples
bull Lecture 1 Reading
LECTURE 1OVERVIEW
PRACTICALITIES
bull The slides and all materials will be online at
httpblogshelsinkificomputationalsocialscience
bull Course consists of
bull 8 Lectures
bull A Research Plan Assignment (required if you want study credits 5op)
bull Any questions
bull Contact lecturer Lauri Eloranta at firstname dot lastname helsinkifi
PRACTICALITIESGENERAL
bull LECTURE 1 Introduction to Computational Social Science [TODAY]
bull Tuesday 0109 1600 ndash 1800 U35 Seminar room114
bull LECTURE 2 Basics of Computation and Modeling
bull Wednesday 0209 1600 ndash 1800 U35 Seminar room 113
bull LECTURE 3 Big Data and Information Extraction
bull Monday 0709 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 4 Network Analysis
bull Monday 1409 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 5 Complex Systems
bull Tuesday 1509 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 6 Simulation in Social Science
bull Wednesday 1609 1600 ndash 1800 U35 Seminar room 113
bull LECTURE 7 Ethical and Legal issues in CSS
bull Monday 2109 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 8 Summary
bull Tuesday 2209 1700 ndash 1900 U35 Seminar room 114
LECTURESSCHEDULE
bull Course Book
bull Cioffi-Revilla Claudio (2014) Introduction to
Computational Social Science Springer-
Verlag London
bull Further
Reading
LITERATURECOURSE BOOK
bull The full eBook is available via Helsinki
University Library
httpshelkalinneanetficgi-
binPwebreconcgiBBID=2753081
LITERATURECOURSE BOOK
LITERATUREADDITIONAL READING
bull There will be additional reading given for each lecture
bull Research articles on the topic at hand some will be given for ldquohomework
readingrdquo
bull The full list of articles can be found at
httpblogshelsinkificomputationalsocialscience
bull Write a short research plan where you apply a computational social
science method to a research problem
bull Length 8 pages for Masterrsquos students 10 pages for PhD students
bull Focus on research method lt-gt research data lt-gt research problem
bull How to write a research plan general instructions
bull httpwwwutaficmtendoctoralstudiesapplyTutkimussuunnitelmaohje
et_EN5B15Dpdf
bull httpsintoaaltofidisplayendoctoraltaikResearch+Plan
ASSIGNMENTGENERAL
bull Assignment DL is Friday 2102015 at EODMidnight
bull All assignments are returned in PDF-format
bull How to save my work in pdf-format You can rdquoSave as PDFrdquo or rdquoPrint to PDFrdquo in MS
Word
bull Include your name student ID and contact details
bull Assignments are returned to the lecturer Lauri Eloranta via email
firstname dot lastname helsinkifi
bull Grading is done in one monthrsquos time and you will receive the study
credits on or before 30102015
ASSIGNMENTHOW TO RETURN THE ASSIGNMENT
bull Contains six course covering different aspects of computational social
science
bull Full stydy block 25-30 op
bull Basic courses (mandatory)
bull Introduction to Computational Social Science (5 op) (I period)
bull Introduction to Programming in Social Science (5 op) (II period)
bull Special courses
bull Data extraction (5 op) (IV period)
bull Network Analysis (5 op) (in 2016 ndash 2017)
bull Complex Systems (5 op) (III period)
bull Simulation (5 op) (in 2016 ndash 2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE
ldquoIn short a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scalerdquo (Lazer et al 2009)
Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull ldquoIn short a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scalerdquo
bull Lazer D et al 2009 Computational Social Science Science 6 February
2009 Vol 323 no 5915 pp 721-723
LAZER ET AL 2009
bull ldquoThe increasing integration of technology into our lives has created
unprecedented volumes of data on societyrsquos everyday behaviour Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems within the realms of a
new discipline known as Computational Social Science Against a
background of financial crises riots and international epidemics the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear (Conte et al 2012)
bull Conte R 2012 Manifesto of Computational Social Science The
European Physical Journal Special Topics November 2012 Vol 214
Issue 1 pp 325-346
CSS MANIFESTO(CONTE ET AL 2012)
bull ldquoComputational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences Fields
include computational economics and computational sociology
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology The computational tasks include the analysis of social
networks and social geographic systemsrdquo
bull (Wikipedia 2015 httpenwikipediaorgwikiComputational_social_science)
WIKIPEDIA
bull ldquoThe new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales ranging from individual actors to
the largest groupings through the medium of computationrdquo
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla Claudio (2014) Introduction to Computational Social Science Springer-Verlag London
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE ldquoPOINT AND LINE TO (MULTIPLE) PLANE(S)rdquo RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE ldquoTATEL TELESCOPErdquo BY EP_JHUIS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
Time
More
Less
bull Speed and performance of IT (CPU RAM Network)
bull Access to IT Internet
bull Amount of data generated
bull Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE ldquoHOME VISITrdquo BY NICOLAS NOVAIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS THE BACKGROUND IMAGE ldquoCAMEacuteRA DE SURVEILLANCErdquo BY TRISTAN NITOT
IS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
NOT A SILVER BULLET
COMPUTATIONAL SOCIAL SCIENCE IS
THE BACKGROUND IMAGE ldquo9MM BULLET BWrdquo BY AN NGUYENIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
Computational Social Science
proposes revolutionary opportunities
for the social sciences but it has still
some challenges in relation to
methods interdisciplinary
cooperation and research ethics
1 Solving increasingly complex problems The problems of global
world are complex computational methods might be able to solve
these complex issues
2 The rise of data The amounts of data has exploded during the 21st
century
3 IT and Instrumental revolution all the new tools and possibilities
4 Complex systems modeling our dynamic organisations and societies
5 Social networks modeling human behavior as networks
6 Making predictions and simulations predicting future from the past
7 Interdisciplinary field (social sciences math computer sciencehellip)
8 Many problems and challenges especially regarding research
ethics
CSS COMPONENTS
bull Information processing paradigm has two aspects in relation
to CSS
1 Information processing is substantive to the complex
systems of society that CSS researches This means that
information processing is takes part in forming and
evolution of complex systems
2 Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
BIG DATA amp AUTOMATED INFROMATION EXTRACTION
SOCIAL NETWORK ANALYSIS
COMPLEX SYSTEMS amp MODELING
SIMULATION
1
2
3
4THE MAIN AREAS OF CSS
bull Areas of Computational Social Science
1 (Big) Data amp automated data extraction
bull Generate retrieve sort modify transform hellip data
2 Social Networks
bull Network analysis and social networks
3 Social Complexity
bull Social complexity complex adaptive systems complex
systems modeling
4 Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
bull Data and automated information extraction can be seen as foundation
for the other areas of CSS
bull Raw data can be used as
1 Data for its own sake as research data -gt data is the subject of
research
2 Data for modeling or validating other phenomena via eg network
analysis complex systems analysis or simulation
bull Data is generated retrieved modified transformedhellip for research
purposes via computational automation
BIG DATA amp AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
• A long tradition in network analysis (a much older field than CSS)
• Social networks (Facebook, Twitter etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
• -> e.g. modeling a family as a network
• -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
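To make "modeling a family as a network" concrete, here is a small Python sketch under stated assumptions: the family members, their ties and the helper names (`build_network`, `degree`, `distance`) are all hypothetical. People are nodes, ties are undirected edges, and two basic network measures are computed: the number of direct ties per person and the shortest path between two people.

```python
from collections import deque

# Hypothetical family: each pair is an undirected tie.
ties = [
    ("Anna", "Ben"),     # spouses
    ("Anna", "Clara"),   # parent-child
    ("Ben", "Clara"),    # parent-child
    ("Clara", "David"),  # spouses
    ("Clara", "Eva"),    # parent-child
    ("David", "Eva"),    # parent-child
]

def build_network(edges):
    """Return an adjacency mapping: node -> set of neighbouring nodes."""
    network = {}
    for a, b in edges:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network

def degree(network):
    """Number of direct ties per person."""
    return {node: len(neighbours) for node, neighbours in network.items()}

def distance(network, start, goal):
    """Shortest number of ties between two people (breadth-first search)."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, steps = queue.popleft()
        if node == goal:
            return steps
        for neighbour in network[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, steps + 1))
    return None  # no path: the two people are not connected

family = build_network(ties)
print(degree(family))
print(distance(family, "Anna", "Eva"))
```

Nothing in the sketch depends on any technology platform, which is the point of the slide: the same structure would describe a project team or any other set of social ties.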
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
• Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
• To help adaptation, allocating resources, coordination…
• Family as a complex adaptive system
• Development: hardships, births, deaths, successes, failures
• Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
• Tangible (e.g. roads, buildings)
• Intangible (e.g. organisations, social structures)
SIMON'S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
• System Dynamics Models (e.g. modeling a nuclear plant)
• Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
• Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life
http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
• Agent-based models (e.g. modeling the communication of a project
organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
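The Game of Life mentioned above is small enough to sketch in a few lines of Python. This version assumes an unbounded grid and represents the world as a set of live `(x, y)` cells; the function name `step` and the starting pattern are illustrative choices, not taken from the linked pages. The rules are the standard ones: a dead cell with exactly three live neighbours becomes alive, and a live cell with two or three live neighbours survives.

```python
from collections import Counter

def step(live_cells):
    """Advance a set of live (x, y) cells by one generation."""
    # Count, for every cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly 3 neighbours; survival with 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live_cells)
    }

# A "blinker": three cells in a row that oscillate with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
after_one = step(blinker)
after_two = step(after_one)
print(sorted(after_one))
print(after_two == blinker)
```

Each cell follows only local rules, yet the grid as a whole shows patterns such as oscillators, which is why cellular automata are a common first example of object-oriented simulation.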
• 4 main areas of Computational Social Science
1. Big data and automated information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has a lot of problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science
fields or methods. Instead, CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on
flu-related search queries
• For example:
• Achrekar, H., Gandhe, A., Lazarus, R., Yu, S.-H. and Liu, B. (2011). Predicting Flu
Trends using Twitter data. 2011 IEEE Conference on Computer Communications Workshops
(INFOCOM WKSHPS), pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L. and Kleinberg, J. (2009). Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, pp. 497-506.
• "Tracking new topics, ideas, and 'memes' across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis."
MODELING NEWS CYCLE DYNAMICS
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A. and Mylopoulos, Y. A. (2005). A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, 81,
pp. 175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. (1979). The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V. and De Pryck, K. (2014). Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, 1, 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014
• "Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams: Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries."
PREDICTING ELECTIONS
• DIGIVAALIT 2015
• http://www.hiit.fi/digivaalit-2015
• Researching the 2015 parliamentary elections in Finland, focusing on
digital media data (Twitter, Facebook)
• Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
• http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
• "Diving deep into the unscoped virtual territories of a nation's collective consciousness may reveal something remarkable. The
Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about
anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
about Finnish society? A team of media professionals at the forum's owner company Aller and researchers at the National Consumer
Research Center plan to make use of this immense database."
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to "The Trust Engineers" podcast by Radiolab
• http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
• Lazer, D. et al. (2009). Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. et al. (2012). Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. (2008). The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. (2014). The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. (2011). Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. (2014). Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank you!
Questions and comments?
Twitter: @laurieloranta
PRACTICALITIES
bull The slides and all materials will be online at
httpblogshelsinkificomputationalsocialscience
bull Course consists of
bull 8 Lectures
bull A Research Plan Assignment (required if you want study credits 5op)
bull Any questions
bull Contact lecturer Lauri Eloranta at firstname dot lastname helsinkifi
PRACTICALITIESGENERAL
bull LECTURE 1 Introduction to Computational Social Science [TODAY]
bull Tuesday 0109 1600 ndash 1800 U35 Seminar room114
bull LECTURE 2 Basics of Computation and Modeling
bull Wednesday 0209 1600 ndash 1800 U35 Seminar room 113
bull LECTURE 3 Big Data and Information Extraction
bull Monday 0709 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 4 Network Analysis
bull Monday 1409 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 5 Complex Systems
bull Tuesday 1509 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 6 Simulation in Social Science
bull Wednesday 1609 1600 ndash 1800 U35 Seminar room 113
bull LECTURE 7 Ethical and Legal issues in CSS
bull Monday 2109 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 8 Summary
bull Tuesday 2209 1700 ndash 1900 U35 Seminar room 114
LECTURESSCHEDULE
bull Course Book
bull Cioffi-Revilla Claudio (2014) Introduction to
Computational Social Science Springer-
Verlag London
bull Further
Reading
LITERATURECOURSE BOOK
bull The full eBook is available via Helsinki
University Library
httpshelkalinneanetficgi-
binPwebreconcgiBBID=2753081
LITERATURECOURSE BOOK
LITERATUREADDITIONAL READING
bull There will be additional reading given for each lecture
bull Research articles on the topic at hand some will be given for ldquohomework
readingrdquo
bull The full list of articles can be found at
httpblogshelsinkificomputationalsocialscience
bull Write a short research plan where you apply a computational social
science method to a research problem
bull Length 8 pages for Masterrsquos students 10 pages for PhD students
bull Focus on research method lt-gt research data lt-gt research problem
bull How to write a research plan general instructions
bull httpwwwutaficmtendoctoralstudiesapplyTutkimussuunnitelmaohje
et_EN5B15Dpdf
bull httpsintoaaltofidisplayendoctoraltaikResearch+Plan
ASSIGNMENTGENERAL
bull Assignment DL is Friday 2102015 at EODMidnight
bull All assignments are returned in PDF-format
bull How to save my work in pdf-format You can rdquoSave as PDFrdquo or rdquoPrint to PDFrdquo in MS
Word
bull Include your name student ID and contact details
bull Assignments are returned to the lecturer Lauri Eloranta via email
firstname dot lastname helsinkifi
bull Grading is done in one monthrsquos time and you will receive the study
credits on or before 30102015
ASSIGNMENTHOW TO RETURN THE ASSIGNMENT
bull Contains six course covering different aspects of computational social
science
bull Full stydy block 25-30 op
bull Basic courses (mandatory)
bull Introduction to Computational Social Science (5 op) (I period)
bull Introduction to Programming in Social Science (5 op) (II period)
bull Special courses
bull Data extraction (5 op) (IV period)
bull Network Analysis (5 op) (in 2016 ndash 2017)
bull Complex Systems (5 op) (III period)
bull Simulation (5 op) (in 2016 ndash 2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE
ldquoIn short a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scalerdquo (Lazer et al 2009)
Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull ldquoIn short a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scalerdquo
bull Lazer D et al 2009 Computational Social Science Science 6 February
2009 Vol 323 no 5915 pp 721-723
LAZER ET AL 2009
bull ldquoThe increasing integration of technology into our lives has created
unprecedented volumes of data on societyrsquos everyday behaviour Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems within the realms of a
new discipline known as Computational Social Science Against a
background of financial crises riots and international epidemics the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear (Conte et al 2012)
bull Conte R 2012 Manifesto of Computational Social Science The
European Physical Journal Special Topics November 2012 Vol 214
Issue 1 pp 325-346
CSS MANIFESTO(CONTE ET AL 2012)
bull ldquoComputational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences Fields
include computational economics and computational sociology
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology The computational tasks include the analysis of social
networks and social geographic systemsrdquo
bull (Wikipedia 2015 httpenwikipediaorgwikiComputational_social_science)
WIKIPEDIA
bull ldquoThe new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales ranging from individual actors to
the largest groupings through the medium of computationrdquo
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla Claudio (2014) Introduction to Computational Social Science Springer-Verlag London
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE ldquoPOINT AND LINE TO (MULTIPLE) PLANE(S)rdquo RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE ldquoTATEL TELESCOPErdquo BY EP_JHUIS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
Time
More
Less
bull Speed and performance of IT (CPU RAM Network)
bull Access to IT Internet
bull Amount of data generated
bull Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in relation to
methods, interdisciplinary
cooperation and research ethics.
1. Solving increasingly complex problems: the problems of a global
world are complex, and computational methods might be able to solve
them
2. The rise of data: the amount of data has exploded during the 21st
century
3. IT and the instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science…)
8. Many problems and challenges, especially regarding research
ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation
to CSS:
1. Information processing is substantive to the complex
systems of society that CSS researches. This means that
information processing takes part in the formation and
evolution of complex systems.
2. Information processing is methodological in the sense
that it serves as the core instrument of CSS.
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
  • Generate, retrieve, sort, modify, transform … data
2. Social Networks
  • Network analysis and social networks
3. Social Complexity
  • Social complexity, complex adaptive systems, complex
  systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as
1. Data for its own sake, as research data -> the data is the subject of
research
2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed… for research
purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
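A minimal sketch of what "retrieve, modify, transform" can look like in practice, assuming a handful of invented text posts as the raw data. The names RAW_POSTS, tokenize and term_counts are hypothetical and do not come from the course book; real projects would retrieve the posts from an actual data source.

```python
import re
from collections import Counter

# Hypothetical raw posts; in real research these would be retrieved
# from a data source such as a forum archive or an API.
RAW_POSTS = [
    "Flu season is here again!",
    "Got my flu shot today.",
    "Elections are coming, who to vote for?",
]

def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def term_counts(posts):
    """Transform raw posts into a table of term frequencies."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts

counts = term_counts(RAW_POSTS)
print(counts.most_common(3))
```

The resulting counts could be studied for their own sake or passed on as input to a network analysis or a simulation.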
• A long tradition in network analysis (a much older field than CSS)
• Social Networks (Facebook, Twitter etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
• -> e.g. modeling a family as a network
• -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
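As an illustration of modeling a family as a network, here is a small sketch that stores invented family ties as an undirected graph and counts each person's direct ties (degree). The people, the ties and the function names are assumptions made up for this example.

```python
from collections import defaultdict

# Hypothetical family ties (undirected): each pair is one relationship.
FAMILY_TIES = [
    ("Anna", "Ben"),     # spouses
    ("Anna", "Clara"),   # parent and child
    ("Ben", "Clara"),    # parent and child
    ("Anna", "David"),   # parent and child
    ("Ben", "David"),    # parent and child
    ("Clara", "David"),  # siblings
    ("Clara", "Emil"),   # spouses
]

def build_network(ties):
    """Build an undirected adjacency structure from a list of ties."""
    network = defaultdict(set)
    for a, b in ties:
        network[a].add(b)
        network[b].add(a)
    return network

def degree(network):
    """Number of direct ties per person (degree)."""
    return {person: len(neighbours) for person, neighbours in network.items()}

family = build_network(FAMILY_TIES)
degrees = degree(family)
print(sorted(degrees.items(), key=lambda item: -item[1]))
```

Nothing here depends on a social media platform: the same structure would work for the members of a project.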
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
• Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
• To help adaptation: allocating resources, coordination …
• Family as a complex adaptive system
• Development, hardships, births, deaths, successes, failures
• Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
• Tangible (e.g. roads, buildings)
• Intangible (e.g. organisations, social structures)
SIMON’S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
  • System Dynamics Models (e.g. modeling a nuclear plant)
  • Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
  • Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
  http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
  • Agent-based models (e.g. modeling the communication of a project
  organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
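To make the cellular automaton example concrete, the following sketch implements one generation of Conway's Game of Life using the standard rules (a cell is born with exactly three live neighbours and survives with two or three). Representing the grid as a set of live coordinates is a choice made for this example, not something prescribed by the course book.

```python
from collections import Counter

def step(live_cells):
    """Advance Conway's Game of Life by one generation.

    live_cells is a set of (x, y) coordinates on an unbounded grid.
    """
    # Count, for every cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive next turn with exactly 3 live neighbours,
    # or with 2 live neighbours if it is already alive.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live_cells)
    }

# A "blinker": three cells in a row that oscillate with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
after_one = step(blinker)
after_two = step(after_one)
print(sorted(after_one))  # [(1, 0), (1, 1), (1, 2)]
```

The horizontal row flips to a vertical one and back again, which shows how a simple local rule produces a pattern at the level of the whole system.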
• 4 main areas of Computational Social Science
1. Big data and automated information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has many problems, especially concerning privacy and ethics
• CSS is not a silver bullet, and it does not replace other social science
fields or methods. Instead, CSS complements other research fields and
methods.
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on
flu-related search queries
• For example
• Achrekar H., Gandhe A., Lazarus R., Yu S.-H., Liu B. 2011. Predicting Flu
Trends using Twitter data. Computer Communications Workshops (INFOCOM
WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
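A rough sketch of network-based disease modeling, assuming a tiny invented contact network and a simple SIR-style rule (each infected person gets one chance to infect each susceptible contact, then recovers). This is a generic textbook-style illustration and not the method of Google Flu Trends or of Achrekar et al.; CONTACTS and simulate_outbreak are hypothetical names.

```python
import random

# Hypothetical contact network: who meets whom.
CONTACTS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
    "F": [],  # isolated person, never exposed
}

def simulate_outbreak(contacts, first_case, p_transmit, rng):
    """Run a simple SIR-style outbreak and return everyone ever infected.

    Each infected person gets one chance to infect each susceptible
    contact (with probability p_transmit) and then recovers.
    """
    infected = {first_case}
    ever_infected = {first_case}
    while infected:
        newly_infected = set()
        for person in infected:
            for contact in contacts[person]:
                if contact not in ever_infected and rng.random() < p_transmit:
                    newly_infected.add(contact)
        ever_infected |= newly_infected
        infected = newly_infected
    return ever_infected

rng = random.Random(1)
certain = simulate_outbreak(CONTACTS, "A", 1.0, rng)
no_spread = simulate_outbreak(CONTACTS, "A", 0.0, rng)
print(sorted(certain), sorted(no_spread))
```

With a transmission probability between 0 and 1, repeated runs give a distribution of outbreak sizes, which is the kind of output such models are typically used to study.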
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec J., Backstrom L., Kleinberg J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD international conference
on Knowledge discovery and data mining, pp. 497-506.
• Tracking new topics, ideas, and memes across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
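The idea of clustering textual variants of a phrase can be sketched as follows. This is not the algorithm from Leskovec et al.; it is a deliberately simple greedy grouping that uses the string similarity ratio from Python's standard difflib module, with an assumed threshold of 0.6 and invented example phrases.

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Similarity ratio between two phrases, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def cluster_phrases(phrases, threshold=0.6):
    """Greedy clustering: a phrase joins the first cluster whose
    representative (its first phrase) is similar enough, otherwise
    it starts a new cluster."""
    clusters = []
    for phrase in phrases:
        for cluster in clusters:
            if similarity(phrase, cluster[0]) >= threshold:
                cluster.append(phrase)
                break
        else:
            clusters.append([phrase])
    return clusters

# Invented phrase variants, for illustration only.
PHRASES = [
    "you can put lipstick on a pig",
    "you can put lipstick on a pig but it is still a pig",
    "lipstick on a pig",
    "yes we can",
    "yes, we can!",
]
clusters = cluster_phrases(PHRASES)
for cluster in clusters:
    print(cluster)
```

Comparing every phrase with every cluster does not scale to web-sized data, which is why the paper emphasises scalable algorithms.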
• Athanasiadis I. N., Mentes A. K., Mitkas P. A., Mylopoulos Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, Vol. 81,
pp. 175-187. doi:10.1177/0037549705053172
• Picardi A. C. and Saeed K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, Vol. 33, No. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini T., Laffite N. B., Cointet J.-P., Gray I., Zabban V., De Pryck K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, Vol. 1, 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
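As a toy illustration of turning page-view data into popularity indicators, the sketch below computes each party's share of attention and its trend over a period. The numbers are invented, the party names are placeholders, and the two indicators are assumptions chosen for this example rather than the measures used in the paper. Shares of attention are not vote shares, which is exactly the representativeness problem the abstract warns about.

```python
# Invented daily Wikipedia page-view counts per party, for illustration only.
PAGE_VIEWS = {
    "Party A": [120, 150, 180, 240],
    "Party B": [200, 190, 210, 200],
    "Party C": [80, 60, 70, 60],
}

def attention_shares(views_by_party):
    """Each party's share of total page views over the whole period."""
    totals = {party: sum(views) for party, views in views_by_party.items()}
    grand_total = sum(totals.values())
    return {party: total / grand_total for party, total in totals.items()}

def trend(views):
    """Relative change from the first to the last observation."""
    return (views[-1] - views[0]) / views[0]

shares = attention_shares(PAGE_VIEWS)
trends = {party: trend(views) for party, views in PAGE_VIEWS.items()}
for party in PAGE_VIEWS:
    print(party, round(shares[party], 3), trends[party])
```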
• DIGIVAALIT 2015
• http://www.hiit.fi/digivaalit-2015
• Researching the parliamentary elections 2015 in Finland, focusing on
digital media data (Twitter, Facebook)
• Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
• http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
• Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The
Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about
anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
about Finnish society? A team of media professionals at the forum’s owner company Aller and researchers at the National Consumer
Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
• http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
• Lazer D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav L. and Levin J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You!
Questions and comments?
Twitter: @laurieloranta
bull LECTURE 1 Introduction to Computational Social Science [TODAY]
bull Tuesday 0109 1600 ndash 1800 U35 Seminar room114
bull LECTURE 2 Basics of Computation and Modeling
bull Wednesday 0209 1600 ndash 1800 U35 Seminar room 113
bull LECTURE 3 Big Data and Information Extraction
bull Monday 0709 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 4 Network Analysis
bull Monday 1409 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 5 Complex Systems
bull Tuesday 1509 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 6 Simulation in Social Science
bull Wednesday 1609 1600 ndash 1800 U35 Seminar room 113
bull LECTURE 7 Ethical and Legal issues in CSS
bull Monday 2109 1600 ndash 1800 U35 Seminar room 114
bull LECTURE 8 Summary
bull Tuesday 2209 1700 ndash 1900 U35 Seminar room 114
LECTURESSCHEDULE
bull Course Book
bull Cioffi-Revilla Claudio (2014) Introduction to
Computational Social Science Springer-
Verlag London
bull Further
Reading
LITERATURECOURSE BOOK
bull The full eBook is available via Helsinki
University Library
httpshelkalinneanetficgi-
binPwebreconcgiBBID=2753081
LITERATURECOURSE BOOK
LITERATUREADDITIONAL READING
bull There will be additional reading given for each lecture
bull Research articles on the topic at hand some will be given for ldquohomework
readingrdquo
bull The full list of articles can be found at
httpblogshelsinkificomputationalsocialscience
bull Write a short research plan where you apply a computational social
science method to a research problem
bull Length 8 pages for Masterrsquos students 10 pages for PhD students
bull Focus on research method lt-gt research data lt-gt research problem
bull How to write a research plan general instructions
bull httpwwwutaficmtendoctoralstudiesapplyTutkimussuunnitelmaohje
et_EN5B15Dpdf
bull httpsintoaaltofidisplayendoctoraltaikResearch+Plan
ASSIGNMENTGENERAL
bull Assignment DL is Friday 2102015 at EODMidnight
bull All assignments are returned in PDF-format
bull How to save my work in pdf-format You can rdquoSave as PDFrdquo or rdquoPrint to PDFrdquo in MS
Word
bull Include your name student ID and contact details
bull Assignments are returned to the lecturer Lauri Eloranta via email
firstname dot lastname helsinkifi
bull Grading is done in one monthrsquos time and you will receive the study
credits on or before 30102015
ASSIGNMENTHOW TO RETURN THE ASSIGNMENT
bull Contains six course covering different aspects of computational social
science
bull Full stydy block 25-30 op
bull Basic courses (mandatory)
bull Introduction to Computational Social Science (5 op) (I period)
bull Introduction to Programming in Social Science (5 op) (II period)
bull Special courses
bull Data extraction (5 op) (IV period)
bull Network Analysis (5 op) (in 2016 ndash 2017)
bull Complex Systems (5 op) (III period)
bull Simulation (5 op) (in 2016 ndash 2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE
ldquoIn short a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scalerdquo (Lazer et al 2009)
Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull ldquoIn short a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scalerdquo
bull Lazer D et al 2009 Computational Social Science Science 6 February
2009 Vol 323 no 5915 pp 721-723
LAZER ET AL 2009
• “The increasing integration of technology into our lives has created
unprecedented volumes of data on society’s everyday behaviour. Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems, within the realms of a
new discipline known as Computational Social Science. Against a
background of financial crises, riots and international epidemics, the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear.” (Conte et al. 2012)
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The
European Physical Journal Special Topics, November 2012, Vol. 214,
Issue 1, pp. 325-346.
CSS MANIFESTO (CONTE ET AL. 2012)
• “Computational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences. Fields
include computational economics and computational sociology.
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology. The computational tasks include the analysis of social
networks and social geographic systems.”
• (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science)
WIKIPEDIA
• “The new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales, ranging from individual actors to
the largest groupings, through the medium of computation.”
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S)” BY RODRIGO CARVALHO
IS UNDER NON-COMMERCIAL CREATIVE COMMONS LICENSE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON-COMMERCIAL CREATIVE COMMONS LICENSE
IT IS FOREMOST AN
[Diagram: COMPUTATIONAL SOCIAL SCIENCE shown in relation to COMPUTER SCIENCE, SOCIAL SCIENCE and STATISTICS]
[Chart: trends over time, on a scale from “Less” to “More”]
• Speed and performance of IT (CPU, RAM, network)
• Access to IT / Internet
• Amount of data generated
• Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT
IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS
NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in relation to
methods, interdisciplinary
cooperation and research ethics.
1. Solving increasingly complex problems: the problems of a global
world are complex; computational methods might be able to solve
these complex issues
2. The rise of data: the amount of data has exploded during the 21st
century
3. IT and instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, …)
8. Many problems and challenges, especially regarding research
ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation
to CSS
1. Information processing is substantive to the complex
systems of society that CSS researches. This means that
information processing takes part in the formation and
evolution of complex systems
2. Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
  • Generate, retrieve, sort, modify, transform, … data
2. Social Networks
  • Network analysis and social networks
3. Social Complexity
  • Social complexity, complex adaptive systems, complex
  systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as
1. Data for its own sake, as research data -> data is the subject of
research
2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, … for research
purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
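As a rough illustration of the retrieve and transform steps listed above, the sketch below turns a handful of raw text posts into daily keyword counts. The posts, dates and keyword list are invented for this example; a real project would read them from a collected data set.

```python
from collections import Counter
import re

# Hypothetical raw data: (date, text) pairs, invented for this sketch.
posts = [
    ("2015-09-01", "I think I have the flu, staying home today"),
    ("2015-09-01", "Great lecture on social networks!"),
    ("2015-09-02", "Flu season again... fever and cough"),
    ("2015-09-02", "Half the office is out with fever"),
]

KEYWORDS = {"flu", "fever", "cough"}  # assumed list of terms of interest


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def daily_keyword_counts(records, keywords):
    """Count how many keyword tokens appear in the posts of each date."""
    counts = Counter()
    for date, text in records:
        counts[date] += sum(1 for tok in tokenize(text) if tok in keywords)
    return dict(counts)


counts = daily_keyword_counts(posts, KEYWORDS)
print(counts)
```

The resulting counts could serve either as research data in their own right or as input for validating a model, which are the two uses of raw data named on the slide.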
• A long tradition in network analysis (a much older field than CSS)
• Social network services (Facebook, Twitter, etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
  • -> e.g. modeling a family as a network
  • -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
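To make the idea of modeling a family as a network concrete, here is a minimal sketch in plain Python. The family members and their ties are made up for illustration, and degree centrality is only one of many possible network measures.

```python
from collections import defaultdict

# Hypothetical family ties (undirected edges), invented for this sketch.
ties = [
    ("Anna", "Ben"),     # spouses
    ("Anna", "Clara"),   # parent and child
    ("Ben", "Clara"),    # parent and child
    ("Anna", "David"),   # parent and child
    ("Ben", "David"),    # parent and child
    ("Clara", "Eva"),    # parent and child
]


def build_network(edges):
    """Return an adjacency mapping: person -> set of directly tied people."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def degree_centrality(network):
    """Share of the other members each person is directly tied to."""
    n = len(network)
    return {person: len(nbrs) / (n - 1) for person, nbrs in network.items()}


family = build_network(ties)
centrality = degree_centrality(family)
for person, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{person}: {score:.2f}")
```

Nothing here depends on an online platform: the same adjacency structure could describe a project team, which is the point the slide makes about social networks not being technology dependent.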
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
  • Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
  • To help adaptation: allocating resources, coordination, …
• Family as a complex adaptive system
  • Development, hardships, births, deaths, successes, failures
  • Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
  • Tangible (e.g. roads, buildings)
  • Intangible (e.g. organisations, social structures)
SIMON’S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
  • System Dynamics Models (e.g. modeling a nuclear plant)
  • Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
  • Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
  http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
  • Agent-based models (e.g. modeling the communication of a project
  organisation of many individuals)
• Also: Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
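The Game of Life mentioned above is small enough to sketch in a few lines. The version below is one possible implementation, assuming an unbounded grid represented as a set of live-cell coordinates; the function name `step` and the example pattern are choices made for this sketch.

```python
from collections import Counter


def step(live_cells):
    """Advance Conway's Game of Life by one generation.

    live_cells is a set of (x, y) coordinates of live cells on an
    unbounded grid.
    """
    # Count, for every cell next to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive in the next generation if it has exactly 3 live
    # neighbours, or if it is alive now and has exactly 2.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live_cells)
    }


# A "blinker": three cells in a row that oscillate with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
next_gen = step(blinker)
print(sorted(next_gen))  # [(1, 0), (1, 1), (1, 2)]
```

Each cell follows the same simple local rule, yet patterns such as the blinker emerge at the level of the whole grid. Agent-based models build on the same idea with richer individual behaviour.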
• 4 main areas of Computational Social Science
1. Big data and automated information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has a lot of problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science
fields or methods. Instead, CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on
flu-related search queries
• For example
  • Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu, Benyuan Liu. 2011. Predicting Flu
  Trends using Twitter data. Computer Communications Workshops (INFOCOM
  WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, pp. 497-506.
• Tracking new topics, ideas, and “memes” across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
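To give a feel for what clustering textual variants of a phrase can mean, here is a deliberately simple sketch. It is not the algorithm from the paper: it greedily groups phrases by word overlap (Jaccard similarity), and the threshold and example phrases are assumptions made for illustration.

```python
import re


def words(phrase):
    """Return the set of lowercase word tokens in a phrase."""
    return set(re.findall(r"[a-z']+", phrase.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_phrases(phrases, threshold=0.5):
    """Greedily add each phrase to the first cluster whose seed phrase is similar enough."""
    clusters = []  # each cluster is a list of phrases; its first item is the seed
    for phrase in phrases:
        for cluster in clusters:
            if jaccard(words(phrase), words(cluster[0])) >= threshold:
                cluster.append(phrase)
                break
        else:
            clusters.append([phrase])
    return clusters


# Invented example phrases, not data from the paper.
phrases = [
    "we will cut taxes for working families",
    "we will cut taxes for all working families",
    "taxes for working families will be cut",
    "the game starts at noon",
    "the game starts at noon today",
]
clusters = cluster_phrases(phrases)
for cluster in clusters:
    print(cluster)
```

A bag-of-words comparison like this ignores word order and would not scale to web-sized text collections, which is part of why dedicated, scalable methods are needed.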
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, 81,
175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, 1: 2053951714543804. First published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014.
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams: Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
  • http://www.hiit.fi/digivaalit-2015
  • Researching the 2015 parliamentary elections in Finland, focusing on
  digital media data (Twitter, Facebook)
  • Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
  • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
  • Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The
  Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors, who use the online town square to talk about
  anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
  about Finnish society? A team of media professionals at the forum’s owner company Aller and researchers at the National Consumer
  Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
• http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments
Twitter: @laurieloranta
• Course Book
  • Cioffi-Revilla, Claudio (2014). Introduction to
  Computational Social Science. Springer-Verlag,
  London.
• Further Reading
LITERATURE: COURSE BOOK
• The full eBook is available via Helsinki
University Library:
https://helka.linneanet.fi/cgi-bin/Pwebrecon.cgi?BBID=2753081
LITERATURE: COURSE BOOK
LITERATURE: ADDITIONAL READING
• Additional reading will be given for each lecture
• Research articles on the topic at hand; some will be given as “homework
reading”
• The full list of articles can be found at
http://blogs.helsinki.fi/computationalsocialscience/
• Write a short research plan where you apply a computational social
science method to a research problem
• Length: 8 pages for Master’s students, 10 pages for PhD students
• Focus on research method <-> research data <-> research problem
• How to write a research plan: general instructions
  • http://www.uta.fi/cmt/en/doctoralstudies/apply/Tutkimussuunnitelmaohjeet_EN%5B1%5D.pdf
  • https://into.aalto.fi/display/endoctoraltaik/Research+Plan
ASSIGNMENT: GENERAL
• The assignment deadline is Friday 2.10.2015 at end of day (midnight)
• All assignments are returned in PDF format
  • How do I save my work in PDF format? You can “Save as PDF” or “Print to PDF” in MS
  Word
• Include your name, student ID and contact details
• Assignments are returned to the lecturer, Lauri Eloranta, via email:
firstname dot lastname at helsinki.fi
• Grading is done within one month, and you will receive the study
credits on or before 30.10.2015
ASSIGNMENT: HOW TO RETURN THE ASSIGNMENT
bull The full eBook is available via Helsinki
University Library
httpshelkalinneanetficgi-
binPwebreconcgiBBID=2753081
LITERATURECOURSE BOOK
LITERATUREADDITIONAL READING
bull There will be additional reading given for each lecture
bull Research articles on the topic at hand some will be given for ldquohomework
readingrdquo
bull The full list of articles can be found at
httpblogshelsinkificomputationalsocialscience
bull Write a short research plan where you apply a computational social
science method to a research problem
bull Length 8 pages for Masterrsquos students 10 pages for PhD students
bull Focus on research method lt-gt research data lt-gt research problem
bull How to write a research plan general instructions
bull httpwwwutaficmtendoctoralstudiesapplyTutkimussuunnitelmaohje
et_EN5B15Dpdf
bull httpsintoaaltofidisplayendoctoraltaikResearch+Plan
ASSIGNMENTGENERAL
bull Assignment DL is Friday 2102015 at EODMidnight
bull All assignments are returned in PDF-format
bull How to save my work in pdf-format You can rdquoSave as PDFrdquo or rdquoPrint to PDFrdquo in MS
Word
bull Include your name student ID and contact details
bull Assignments are returned to the lecturer Lauri Eloranta via email
firstname dot lastname helsinkifi
bull Grading is done in one monthrsquos time and you will receive the study
credits on or before 30102015
ASSIGNMENTHOW TO RETURN THE ASSIGNMENT
bull Contains six course covering different aspects of computational social
science
bull Full stydy block 25-30 op
bull Basic courses (mandatory)
bull Introduction to Computational Social Science (5 op) (I period)
bull Introduction to Programming in Social Science (5 op) (II period)
bull Special courses
bull Data extraction (5 op) (IV period)
bull Network Analysis (5 op) (in 2016 ndash 2017)
bull Complex Systems (5 op) (III period)
bull Simulation (5 op) (in 2016 ndash 2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE
ldquoIn short a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scalerdquo (Lazer et al 2009)
Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull ldquoIn short a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scalerdquo
bull Lazer D et al 2009 Computational Social Science Science 6 February
2009 Vol 323 no 5915 pp 721-723
LAZER ET AL 2009
bull ldquoThe increasing integration of technology into our lives has created
unprecedented volumes of data on societyrsquos everyday behaviour Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems within the realms of a
new discipline known as Computational Social Science Against a
background of financial crises riots and international epidemics the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear (Conte et al 2012)
bull Conte R 2012 Manifesto of Computational Social Science The
European Physical Journal Special Topics November 2012 Vol 214
Issue 1 pp 325-346
CSS MANIFESTO(CONTE ET AL 2012)
• "Computational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences. Fields
include computational economics and computational sociology.
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology. The computational tasks include the analysis of social
networks and social geographic systems."
• (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science)
WIKIPEDIA
• "The new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales, ranging from individual actors to
the largest groupings, through the medium of computation."
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE "POINT AND LINE TO (MULTIPLE) PLANE(S)" BY RODRIGO CARVALHO
IS UNDER NON-COMMERCIAL CREATIVE COMMONS LICENSE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE "TATEL TELESCOPE" BY EP_JHU IS UNDER NON-COMMERCIAL CREATIVE COMMONS LICENSE
IT IS FOREMOST AN
[Figure: COMPUTATIONAL SOCIAL SCIENCE shown together with COMPUTER SCIENCE, SOCIAL SCIENCE and STATISTICS]
[Figure: trends plotted over Time on an axis from Less to More]
• Speed and performance of IT (CPU, RAM, network)
• Access to IT, Internet
• Amount of data generated
• Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE "HOME VISIT" BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE "CAMÉRA DE SURVEILLANCE" BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS NOT A SILVER BULLET
THE BACKGROUND IMAGE "9MM BULLET BW" BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in relation to
methods, interdisciplinary
cooperation and research ethics.
1. Solving increasingly complex problems: the problems of a global world are complex, and computational methods might be able to solve these complex issues
2. The rise of data: the amount of data has exploded during the 21st century
3. IT and instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, ...)
8. Many problems and challenges, especially regarding research ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation to CSS
1. Information processing is substantive to the complex systems of society that CSS researches. This means that information processing takes part in the formation and evolution of complex systems
2. Information processing is methodological, in the sense that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
  • Generate, retrieve, sort, modify, transform, ... data
2. Social Networks
  • Network analysis and social networks
3. Social Complexity
  • Social complexity, complex adaptive systems, complex systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation for the other areas of CSS
• Raw data can be used as
1. Data for its own sake, as research data -> data is the subject of research
2. Data for modeling or validating other phenomena via e.g. network analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, ... for research purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
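The retrieve, modify, transform and sort steps named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the posts in RAW_POSTS are invented, and the helper names tokenize and term_frequencies are hypothetical, not taken from Cioffi-Revilla (2014). A real study would retrieve the texts from an archive or API instead of a hard-coded list.

```python
import re
from collections import Counter

# Hypothetical, hand-written posts standing in for retrieved raw data.
RAW_POSTS = [
    "Flu season is here again!",
    "Is the FLU vaccine available yet?",
    "Election debate tonight, who is watching?",
]


def tokenize(text):
    """Modify step: lowercase the text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(posts):
    """Transform step: turn a list of raw posts into token counts."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts


frequencies = term_frequencies(RAW_POSTS)

# Sort step: most frequent terms first, ties broken alphabetically.
top_terms = sorted(frequencies.items(), key=lambda kv: (-kv[1], kv[0]))[:3]
print(top_terms)
```

The resulting counts could then serve either as research data in their own right or as input to a network model or simulation, matching the two uses of raw data listed above.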
• A long tradition in network analysis (a much older field than CSS)
• Social networks (Facebook, Twitter etc.) are just one part of network analysis
• Many other social interactions can be modeled as networks -> thus social networks are not technology dependent as such
• -> e.g. modeling a family as a network
• -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
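As a rough illustration of "modeling a family as a network", the sketch below stores family ties as an undirected graph and computes two basic measures: degree and shortest path length. The family members, the ties in FAMILY_TIES and the function names are all made up for this example, and plain dictionaries are used instead of a dedicated network library.

```python
from collections import deque

# Hypothetical family ties; each pair is an undirected relationship.
FAMILY_TIES = [
    ("grandmother", "mother"),
    ("mother", "father"),
    ("mother", "daughter"),
    ("father", "daughter"),
    ("mother", "son"),
    ("father", "son"),
]


def build_network(ties):
    """Build an adjacency mapping: person -> set of directly tied persons."""
    network = {}
    for a, b in ties:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def degree(network, person):
    """Number of direct ties a person has."""
    return len(network[person])


def shortest_path_length(network, start, goal):
    """Breadth-first search for the number of ties separating two persons."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, distance = queue.popleft()
        if node == goal:
            return distance
        for neighbour in network[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None  # no path: the two persons are in different components


family = build_network(FAMILY_TIES)
print(degree(family, "mother"))
print(shortest_path_length(family, "grandmother", "son"))
```

The same representation would work for a project organisation: only the node names and the meaning of a tie change, which is the sense in which social networks are not technology dependent.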
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
• Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
• To help adaptation, allocating resources, coordination, ...
• Family as a complex adaptive system
• Development, hardships, births, deaths, successes, failures
• Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
  • Tangible (e.g. roads, buildings)
  • Intangible (e.g. organisations, social structures)
SIMON'S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
  • System Dynamics Models (e.g. modeling a nuclear plant)
  • Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
  • Cellular automata (e.g. Game of Life, http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life, http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
  • Agent-based models (e.g. modeling the communication of a project organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
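To make the cellular automaton example concrete, here is a minimal Python sketch of one generation of Conway's Game of Life, using the standard rules: a live cell survives with two or three live neighbours, and a dead cell becomes alive with exactly three. Representing the grid as a set of live (x, y) coordinates and the function name step are choices made for this sketch only.

```python
from collections import Counter


def step(live_cells):
    """Advance the Game of Life by one generation.

    live_cells is a set of (x, y) tuples on an unbounded grid.
    """
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3 neighbours.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live_cells)
    }


# The "blinker" pattern: a row of three cells that oscillates with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
next_generation = step(blinker)
print(sorted(next_generation))
```

Each cell follows the same simple local rule, yet patterns such as oscillators and gliders emerge at the level of the whole grid, which is the property that makes object-oriented models attractive for studying social systems.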
• 4 main areas of Computational Social Science
1. Big data and automatic information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has a lot of problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science fields or methods. Instead, CSS complements other research fields and methods.
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on flu-related search queries
• For example:
• Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu, Benyuan Liu. 2011. Predicting Flu Trends using Twitter data. Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
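A very small network-based contagion model can illustrate the idea of modeling disease spread. The sketch below is not the method of Achrekar et al. (2011) or of Google Flu Trends; it is a simplified susceptible-infected model on a hypothetical contact network, where the names in CONTACTS, the function spread and the parameter transmission_prob are all assumptions made for illustration.

```python
import random

# Hypothetical contact network: person -> set of people they meet.
CONTACTS = {
    "ann": {"ben", "cat"},
    "ben": {"ann", "dan"},
    "cat": {"ann", "dan"},
    "dan": {"ben", "cat", "eve"},
    "eve": {"dan"},
}


def spread(contacts, initially_infected, steps, transmission_prob=1.0, seed=0):
    """Simulate infection spreading over a contact network.

    At every step each infected person infects each susceptible contact
    with probability transmission_prob. Nobody recovers in this sketch.
    Returns the number of infected people after each step.
    """
    rng = random.Random(seed)
    infected = set(initially_infected)
    history = [len(infected)]
    for _ in range(steps):
        newly_infected = set()
        # Sorting keeps the order of random draws reproducible across runs.
        for person in sorted(infected):
            for contact in sorted(contacts[person]):
                if contact not in infected and rng.random() < transmission_prob:
                    newly_infected.add(contact)
        infected |= newly_infected
        history.append(len(infected))
    return history


history = spread(CONTACTS, {"ann"}, steps=3)
print(history)
```

With transmission_prob below 1.0 the outcome depends on the random seed, so in practice such a model would be run many times and the results averaged or compared against observed case counts.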
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of the news cycle. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 497-506, 2009.
• Tracking new topics, ideas, and "memes" across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt spikes in the appearance of particular named entities. However, these approaches are less well suited to the identification of content that spreads widely and then fades over time scales on the order of days - the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
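The idea of "clustering textual variants" of a phrase can be illustrated with a deliberately simple sketch. This is not the algorithm of Leskovec et al. (2009); it swaps in a greedy, single-pass clustering based on Jaccard similarity of word sets. The example phrases, the threshold of 0.5 and the function names are assumptions chosen for illustration.

```python
def tokens(phrase):
    """Split a phrase into a set of lowercase words."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Share of words two phrases have in common (0.0 to 1.0)."""
    words_a, words_b = tokens(a), tokens(b)
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def cluster_variants(phrases, threshold=0.5):
    """Greedily group phrases whose similarity to a cluster's first phrase
    meets the threshold; otherwise start a new cluster."""
    clusters = []
    for phrase in phrases:
        for cluster in clusters:
            if jaccard(phrase, cluster[0]) >= threshold:
                cluster.append(phrase)
                break
        else:
            clusters.append([phrase])
    return clusters


# Invented example phrases with two underlying "memes".
PHRASES = [
    "lipstick on a pig",
    "you can put lipstick on a pig",
    "yes we can",
    "yes we can change",
]
clusters = cluster_variants(PHRASES)
print(clusters)
```

A greedy pass like this is order dependent and does not scale to web-sized corpora, which is one reason the original paper develops dedicated scalable algorithms.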
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based Model for Estimating Residential Water Demand. SIMULATION, March 2005, 81, 175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data & Society, July-December 2014, 1, 2053951714543804, first published on August 5, 2014. doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big data? it - Information Technology, Volume 56, Issue 5, Pages 246–253, ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046, September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections of these socially generated footprints, often known as big data, could help us to re-investigate different aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through the analysis of socially generated data on the web during electoral campaigns. Such data offer considerable possibility for improving our awareness of popularity dynamics. However, they also suffer from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss potential ways around such problems, suggesting the nature of different political systems and contexts might lend differing levels of predictive power to certain types of data source. We offer an initial exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google search queries. On the basis of this data, we present popularity dynamics from real case examples of recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
• http://www.hiit.fi/digivaalit-2015
• Researching the parliamentary elections 2015 in Finland, focusing on digital media data (Twitter, Facebook)
• Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
• http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
• Diving deep into the unscoped virtual territories of a nation's collective consciousness may reveal something remarkable. The Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn about Finnish society? A team of media professionals at the forum's owner company Aller and researchers at the National Consumer Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to "The Trust Engineers" podcast by Radiolab
• http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments
Twitter: @laurieloranta
LITERATURE: ADDITIONAL READING
• There will be additional reading given for each lecture
• Research articles on the topic at hand; some will be given as "homework reading"
• The full list of articles can be found at http://blogs.helsinki.fi/computationalsocialscience
• Write a short research plan where you apply a computational social science method to a research problem
• Length: 8 pages for Master's students, 10 pages for PhD students
• Focus on research method <-> research data <-> research problem
• How to write a research plan, general instructions:
• http://www.uta.fi/cmt/en/doctoralstudies/apply/Tutkimussuunnitelmaohjeet_EN%5B1%5D.pdf
• https://into.aalto.fi/display/endoctoraltaik/Research+Plan
ASSIGNMENT: GENERAL
• Assignment deadline (DL) is Friday 2.10.2015 at end of day (midnight)
• All assignments are returned in PDF format
• How to save my work in PDF format? You can "Save as PDF" or "Print to PDF" in MS Word
• Include your name, student ID and contact details
• Assignments are returned to the lecturer Lauri Eloranta via email: firstname dot lastname @ helsinki.fi
• Grading is done in one month's time and you will receive the study credits on or before 30.10.2015
ASSIGNMENT: HOW TO RETURN THE ASSIGNMENT
bull Contains six course covering different aspects of computational social
science
bull Full stydy block 25-30 op
bull Basic courses (mandatory)
bull Introduction to Computational Social Science (5 op) (I period)
bull Introduction to Programming in Social Science (5 op) (II period)
bull Special courses
bull Data extraction (5 op) (IV period)
bull Network Analysis (5 op) (in 2016 ndash 2017)
bull Complex Systems (5 op) (III period)
bull Simulation (5 op) (in 2016 ndash 2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE
ldquoIn short a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scalerdquo (Lazer et al 2009)
Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull ldquoIn short a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scalerdquo
bull Lazer D et al 2009 Computational Social Science Science 6 February
2009 Vol 323 no 5915 pp 721-723
LAZER ET AL 2009
bull ldquoThe increasing integration of technology into our lives has created
unprecedented volumes of data on societyrsquos everyday behaviour Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems within the realms of a
new discipline known as Computational Social Science Against a
background of financial crises riots and international epidemics the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear (Conte et al 2012)
bull Conte R 2012 Manifesto of Computational Social Science The
European Physical Journal Special Topics November 2012 Vol 214
Issue 1 pp 325-346
CSS MANIFESTO(CONTE ET AL 2012)
bull ldquoComputational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences Fields
include computational economics and computational sociology
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology The computational tasks include the analysis of social
networks and social geographic systemsrdquo
bull (Wikipedia 2015 httpenwikipediaorgwikiComputational_social_science)
WIKIPEDIA
• “The new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales, ranging from individual actors to
the largest groupings, through the medium of computation.”
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S)” BY RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
IT IS FOREMOST AN
[Figure: COMPUTATIONAL SOCIAL SCIENCE shown in relation to COMPUTER SCIENCE, SOCIAL SCIENCE and STATISTICS]
[Figure: trends over time, on a scale from less to more]
• Speed and performance of IT (CPU, RAM, network)
• Access to IT / Internet
• Amount of data generated
• Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT
IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in relation to
methods, interdisciplinary
cooperation and research ethics.
1. Solving increasingly complex problems: the problems of a global
world are complex, and computational methods might be able to solve
these complex issues
2. The rise of data: the amount of data has exploded during the 21st
century
3. IT and the instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science…)
8. Many problems and challenges, especially regarding research
ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation
to CSS
1. Information processing is substantive to the complex
systems of society that CSS researches. This means that
information processing takes part in the formation and
evolution of complex systems
2. Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1 BIG DATA & AUTOMATED INFORMATION EXTRACTION
2 SOCIAL NETWORK ANALYSIS
3 COMPLEX SYSTEMS & MODELING
4 SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
  • Generate, retrieve, sort, modify, transform … data
2. Social Networks
  • Network analysis and social networks
3. Social Complexity
  • Social complexity, complex adaptive systems, complex
  systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as
1. Data for its own sake, as research data -> data is the subject of
research
2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed… for research
purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
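The retrieve-and-transform step described on this slide can be sketched in a few lines. This is only an illustration, assuming a tiny hand-written list of posts; the function name `word_counts` and the sample texts are hypothetical and do not come from the lecture or from Cioffi-Revilla (2014).

```python
import re
from collections import Counter


def word_counts(posts):
    """Transform raw text posts into a table of word frequencies."""
    counts = Counter()
    for post in posts:
        # Lower-case the text and keep only alphabetic tokens.
        tokens = re.findall(r"[a-z]+", post.lower())
        counts.update(tokens)
    return counts


# Hypothetical raw data standing in for automatically retrieved posts.
posts = ["Flu season is here", "Is the flu spreading?"]
counts = word_counts(posts)
print(counts.most_common(3))
```

The resulting counts could then serve either as research data in their own right or as input for network analysis or simulation, as the slide suggests.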
• A long tradition in network analysis (a much older field than CSS)
• Social networks (Facebook, Twitter etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
  • -> e.g. modeling a family as a network
  • -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
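Modeling a family as a network can be sketched as follows. This is a minimal illustration under assumed data: the names and ties are invented, and the helper functions `build_network` and `degree` are hypothetical, not part of any library mentioned in the lecture.

```python
def build_network(ties):
    """Build an undirected network as a dict mapping each person to a set of neighbours."""
    network = {}
    for a, b in ties:
        network.setdefault(a, set()).add(b)
        network.setdefault(b, set()).add(a)
    return network


def degree(network):
    """Count how many direct ties each person has."""
    return {person: len(neighbours) for person, neighbours in network.items()}


# Hypothetical family ties (e.g. regular contact between members).
family_ties = [("Anna", "Ben"), ("Anna", "Cleo"), ("Ben", "Cleo"), ("Cleo", "Dan")]
network = build_network(family_ties)
degrees = degree(network)
print(degrees)
```

Nothing here depends on Facebook or Twitter data: any set of social relations that can be listed as pairs can be analysed the same way.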
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
  • Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
  • To help adaptation, allocating resources, coordination …
• Family as a complex adaptive system
  • Development, hardships, births, deaths, successes, failures
  • Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
  • Tangible (e.g. roads, buildings)
  • Intangible (e.g. organisations, social structures)
SIMON'S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
  • System Dynamics Models (e.g. modeling a nuclear plant)
  • Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
  • Cellular automata (e.g. Game of Life, http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life
  http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
  • Agent-based models (e.g. modeling the communication of a project
  organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
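As an illustration of the cellular automata mentioned above, one generation of Conway's Game of Life can be sketched as below. The rules are the standard ones (a dead cell with exactly three live neighbours is born; a live cell with two or three live neighbours survives); representing the grid as a set of live coordinates is just one possible design choice, picked here for brevity.

```python
from collections import Counter


def step(live):
    """Compute the next generation from a set of live (x, y) cells."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly 3 neighbours; survival with 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# The "blinker": three cells in a row that oscillate with period 2.
blinker = {(0, 1), (1, 1), (2, 1)}
after_one = step(blinker)
after_two = step(after_one)
print(sorted(after_one))
```

Each cell follows only local rules, yet patterns emerge at the level of the whole grid, which is why such models are grouped with the object-oriented simulations rather than the variable-oriented ones.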
• 4 main areas of Computational Social Science
1. Big data and automatic information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has a lot of problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science
fields or methods. Instead, CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on
flu-related search queries
• For example
  • Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu, Benyuan Liu 2011. Predicting Flu
  Trends using Twitter data. Computer Communications Workshops (INFOCOM
  WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
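To give a feel for what modeling the spread of a disease can mean computationally, here is a minimal sketch of a textbook SIR (susceptible, infected, recovered) compartmental model in discrete daily steps. It is not the method of Google Flu Trends or of Achrekar et al. (2011), which work from search queries and Twitter data; the function name `simulate_sir` and all parameter values are assumptions chosen only for illustration.

```python
def simulate_sir(population, initially_infected, beta, gamma, days):
    """Simulate a simple discrete-time SIR epidemic and return daily (S, I, R) values."""
    s = float(population - initially_infected)
    i = float(initially_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        # New infections depend on contacts between susceptible and infected people.
        new_infections = beta * s * i / population
        # A fixed share of the infected recover each day.
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Assumed illustrative parameters, not estimates for any real disease.
history = simulate_sir(population=1000, initially_infected=1,
                       beta=0.3, gamma=0.1, days=160)
peak_day = max(range(len(history)), key=lambda day: history[day][1])
print(peak_day, round(history[peak_day][1]))
```

In a data-driven study, parameters such as `beta` would be estimated from observed data (for example search queries or tweets) rather than set by hand.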
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, pp. 497-506, 2009 (dl.acm.org)
• Tracking new topics, ideas, and “memes” across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, 81:
175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, 1: 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
  • http://www.hiit.fi/digivaalit-2015
  • Researching the 2015 parliamentary elections in Finland, focusing on
  digital media data (Twitter, Facebook)
  • Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
  • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
  • Diving deep into the unscoped virtual territories of a nation's collective consciousness may reveal something remarkable. The
  hugely popular Finnish Suomi24 discussion forum has 19 million monthly visitors who use the online town square to talk about
  anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
  about Finnish society? A team of media professionals at the forum's owner company, Aller, and researchers at the National Consumer
  Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
  • http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments
Twitter: @laurieloranta
• Assignment deadline is Friday 2.10.2015 at end of day (midnight)
• All assignments are returned in PDF format
  • How do I save my work in PDF format? You can “Save as PDF” or “Print to PDF” in MS
  Word
• Include your name, student ID and contact details
• Assignments are returned to the lecturer, Lauri Eloranta, via email:
firstname dot lastname @ helsinki.fi
• Grading is done within one month, and you will receive the study
credits on or before 30.10.2015
ASSIGNMENT: HOW TO RETURN THE ASSIGNMENT
• Contains six courses covering different aspects of computational social
science
• Full study block: 25-30 op
• Basic courses (mandatory)
  • Introduction to Computational Social Science (5 op) (I period)
  • Introduction to Programming in Social Science (5 op) (II period)
• Special courses
  • Data extraction (5 op) (IV period)
  • Network Analysis (5 op) (in 2016 – 2017)
  • Complex Systems (5 op) (III period)
  • Simulation (5 op) (in 2016 – 2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE
ldquoIn short a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scalerdquo (Lazer et al 2009)
Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull ldquoIn short a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scalerdquo
bull Lazer D et al 2009 Computational Social Science Science 6 February
2009 Vol 323 no 5915 pp 721-723
LAZER ET AL 2009
bull ldquoThe increasing integration of technology into our lives has created
unprecedented volumes of data on societyrsquos everyday behaviour Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems within the realms of a
new discipline known as Computational Social Science Against a
background of financial crises riots and international epidemics the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear (Conte et al 2012)
bull Conte R 2012 Manifesto of Computational Social Science The
European Physical Journal Special Topics November 2012 Vol 214
Issue 1 pp 325-346
CSS MANIFESTO(CONTE ET AL 2012)
bull ldquoComputational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences Fields
include computational economics and computational sociology
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology The computational tasks include the analysis of social
networks and social geographic systemsrdquo
bull (Wikipedia 2015 httpenwikipediaorgwikiComputational_social_science)
WIKIPEDIA
bull ldquoThe new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales ranging from individual actors to
the largest groupings through the medium of computationrdquo
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla Claudio (2014) Introduction to Computational Social Science Springer-Verlag London
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE ldquoPOINT AND LINE TO (MULTIPLE) PLANE(S)rdquo RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE ldquoTATEL TELESCOPErdquo BY EP_JHUIS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
Time
More
Less
bull Speed and performance of IT (CPU RAM Network)
bull Access to IT Internet
bull Amount of data generated
bull Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE ldquoHOME VISITrdquo BY NICOLAS NOVAIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS THE BACKGROUND IMAGE ldquoCAMEacuteRA DE SURVEILLANCErdquo BY TRISTAN NITOT
IS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
NOT A SILVER BULLET
COMPUTATIONAL SOCIAL SCIENCE IS
THE BACKGROUND IMAGE ldquo9MM BULLET BWrdquo BY AN NGUYENIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
Computational Social Science
proposes revolutionary opportunities
for the social sciences but it has still
some challenges in relation to
methods interdisciplinary
cooperation and research ethics
1 Solving increasingly complex problems The problems of global
world are complex computational methods might be able to solve
these complex issues
2 The rise of data The amounts of data has exploded during the 21st
century
3 IT and Instrumental revolution all the new tools and possibilities
4 Complex systems modeling our dynamic organisations and societies
5 Social networks modeling human behavior as networks
6 Making predictions and simulations predicting future from the past
7 Interdisciplinary field (social sciences math computer sciencehellip)
8 Many problems and challenges especially regarding research
ethics
CSS COMPONENTS
bull Information processing paradigm has two aspects in relation
to CSS
1 Information processing is substantive to the complex
systems of society that CSS researches This means that
information processing is takes part in forming and
evolution of complex systems
2 Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
BIG DATA amp AUTOMATED INFROMATION EXTRACTION
SOCIAL NETWORK ANALYSIS
COMPLEX SYSTEMS amp MODELING
SIMULATION
1
2
3
4THE MAIN AREAS OF CSS
bull Areas of Computational Social Science
1 (Big) Data amp automated data extraction
bull Generate retrieve sort modify transform hellip data
2 Social Networks
bull Network analysis and social networks
3 Social Complexity
bull Social complexity complex adaptive systems complex
systems modeling
4 Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
bull Data and automated information extraction can be seen as foundation
for the other areas of CSS
bull Raw data can be used as
1 Data for its own sake as research data -gt data is the subject of
research
2 Data for modeling or validating other phenomena via eg network
analysis complex systems analysis or simulation
bull Data is generated retrieved modified transformedhellip for research
purposes via computational automation
BIG DATA amp AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
bull A long tradition in network analysis (much older field than CSS)
bull Social Networks (Facebook Twitter etc) just one part of network
analysis
bull Many other social interactions can be modeled as networks -gt thus
social networks are not technology dependent as such
bull -gt eg modeling family as network
bull -gt eg modeling a project as network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
bull Society seen as a complex adaptive system
bull Phase transitions
bull Adaptation (multi stage process)
bull Need -gt intent -gt capacity -gt implementation
bull Goal
bull Information processing in many parts of Complex adaptive systems
bull To help adaptation allocating resources coordination hellip
bull Family as and complex adaptive system
bull Development hardships births deaths successes failures
bull Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Three types of systems
1 Natural systems
2 Human systems
3 Artificial systems
bull Artificial systems (or artifacts) exist because they have a function they
serve as adaptive buffers between humans and nature
bull Humans pursue the strategy of building artifacts to achieve goals
bull Two kinds of artificial systems working in synergy
bull Tanglible (eg roads buildings)
bull Intanglibe ( eg organisations social structures)
SIMONrsquoS THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Large (and old) research field
bull Two main areas of simulation
1 Variable-Oriented Models
bull System Dynamics Models (eg modeling a nuclear plant)
bull Queuing Models (eg modeling how a box office line behaves)
2 Object-Oriented Models
bull Cellular automate (eg Game of life httpenwikipediaorgwikiConway27s_Game_of_Life
httppmaveustuffjavascript-game-of-life-v311)
bull Agent based models (eg Modeling the communication of a project
organisation of many individuals)
bull Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
bull 4 main areas of Computational Social Science
1 Big data and automatic information extraction
2 Social networks
3 Social complexity
4 Simulation
bull Typically all of these working together
bull CSS has a lot of problems especially concerning privacy and ethics
bull CSS is not a silver bullet and it does not replace other social science
fields or methods Instead CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
bull Tracking and predicting how flu or other contagious diseases spread
bull Based on network and social media analysis and modeling
bull Many different variations one of the first Google Flu Trends based on
flu related search queries
bull For example
bull Achrekar H Gandhe A Lazarus R Ssu-Hsin Yu Benyuan Liu 2011 Predicting Flu
Trends using Twitter data Computer Communications Workshops (INFOCOM
WKSHPS) 2011 IEEE Conference on vol no pp702707 10-15 April 2011
MODELING THE SPREAD OF DISEASESALREADY AN EPIDEMOLOGY CLASSIC
bull httpwwwgoogleorgflutrendsintlen_us
GOOGLE FLU TRENDS
bull Leskovec J Backstrom L Kleinberg J 2009 Meme-tracking and the dynamics of
the news cycle Proceedings of the 15th ACM ACM SIGKDD international conference
on Knowledge discovery and data mining Pages 497-506 2009 - dlacmorg
bull Tracking new topics ideas and memes across the Web has been an issue of considerable interest
Recent work has developed methods for tracking topic shifts over long time scales as well as abrupt
spikes in the appearance of particular named entities However these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events
bull We develop a framework for tracking short distinctive phrases that travel relatively intact through on-line
text developing scalable algorithms for clustering textual variants of such phrases we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis
MODELING NEWS CYCLE DYNAMICS
bull Athanasiadis I N Mentes A K Mitkas P A Mylopoulos Y A 2005 A Hybrid Agent-
Based Model for Estimating Residential Water Demand SIMULATION March 2005 81
175-187 doi1011770037549705053172
bull Picardi C and Saeed K 1979The dynamics of water policy in southwestern Saudi
Arabia Anthony SIMULATION October 1979 vol 33 4 pp 109-118
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
bull Venturini T Laffite N B Cointet J-P Gray I Zabban V De Pryck K 2014Three
maps and three misunderstandings A digital mapping of climate diplomacy Big Data
amp Society July-December 2014 1 2053951714543804 first published on August 5 2014
doi1011772053951714543804
CLIMATE DIPLOMACY MAPPING
bull Can electoral popularity be predicted using socially generated big
data Information Technology Volume 56 Issue 5 Pages 246ndash253
ISSN (Online) 2196-7032 ISSN (Print) 1611-2776 DOI 101515itit-
2014-1046 September 2014
bull Today our more-than-ever digital lives leave significant footprints in cyberspace Large scale collections
of these socially generated footprints often known as big data could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework In this contribution we discuss one
such possibility the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns Such data offer
considerable possibility for improving our awareness of popularity dynamics However they also suffer
from significant drawbacks in terms of representativeness and generalisability In this paper we discuss
potential ways around such problems suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source We offer an initial
exploratory test of these ideas focussing on two data streams Wikipedia page views and Google
search queries On the basis of this data we present popularity dynamics from real case examples of
recent elections in three different countries
PREDICTING ELECTIONS
bull DIGIVAALIT 2015
bull httpwwwhiitfidigivaalit-2015
bull Researching the parliamentary elections 2015 in Finland focusing on
digital media data (Twitter Facebook)
bull Trying to understand how media is used and how public agenda is set
bull CITIZEN MINDSCAPES
bull httpchallengehelsinkifiblogcitizen-mindscapes-kansakunnan-
mielentilabull Diving deep into the unscoped virtual territories of a nationrsquos collective consciousness may reveal something remarkable The
Finnish hugely popular Suomi24 discussion forum has 19 million monthly visitors who use the online town square to talk about
anything and everything close to their hearts If this data could be harnessed into research use what amazing things could we learn
about Finnish society A team of media professionals at the forums owner company Aller and researchers at the National Consumer
Research Center plan to make use of this immense database
DIGIVAALIT 2015 amp CITIZENMINDSCAPES
bull Listen the ldquoThe Trust Engineersrdquo podcast by Radiolab
bull httpwwwradiolaborgstorytrust-engineers
bull Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
bull Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull Conte R 2012 Manifesto of Computational Social Science The European Physical Journal Special Topics November 2012 Vol 214 Issue 1 pp 325-346
bull Anderson C 2008 The End of Theory The Data Deluge Makes the Scientific Method Obsolete Wired httparchivewiredcomsciencediscoveriesmagazine16-07pb_theory
bull Einav L and Levin J 2014 The Data Revolution and Economic Analysis In Innovation Policy and the Economy edited by Josh Lerner and Scott Stern httpwebstanfordedu~leinavpubsIPE2014pdf
bull King G 2011 Ensuring the Data-Rich Future of the Social Sciences Science 11 February 2011 Vol 331 no 6018 pp 719-721
bull Wallach H 2014 Big Data Machine Learning and the Social Sciences Fairness Accountability and Transparency Mediumcom httpsmediumcomhannawallachbig-data-machine-learning-and-thesocial-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments
twitter laurieloranta
bull Contains six course covering different aspects of computational social
science
bull Full stydy block 25-30 op
bull Basic courses (mandatory)
bull Introduction to Computational Social Science (5 op) (I period)
bull Introduction to Programming in Social Science (5 op) (II period)
bull Special courses
bull Data extraction (5 op) (IV period)
bull Network Analysis (5 op) (in 2016 ndash 2017)
bull Complex Systems (5 op) (III period)
bull Simulation (5 op) (in 2016 ndash 2017)
COMPUTATIONAL SOCIAL SCIENCE STUDY BLOCK
WHAT IS COMPUTATIONAL SOCIAL SCIENCE
ldquoIn short a computational social science is
emerging [field] that leverages the capacity
to collect and analyze data with an
unprecedented breadth and depth and
scalerdquo (Lazer et al 2009)
Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull ldquoIn short a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scalerdquo
bull Lazer D et al 2009 Computational Social Science Science 6 February
2009 Vol 323 no 5915 pp 721-723
LAZER ET AL 2009
bull ldquoThe increasing integration of technology into our lives has created
unprecedented volumes of data on societyrsquos everyday behaviour Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems within the realms of a
new discipline known as Computational Social Science Against a
background of financial crises riots and international epidemics the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear (Conte et al 2012)
bull Conte R 2012 Manifesto of Computational Social Science The
European Physical Journal Special Topics November 2012 Vol 214
Issue 1 pp 325-346
CSS MANIFESTO(CONTE ET AL 2012)
bull ldquoComputational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences Fields
include computational economics and computational sociology
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology The computational tasks include the analysis of social
networks and social geographic systemsrdquo
bull (Wikipedia 2015 httpenwikipediaorgwikiComputational_social_science)
WIKIPEDIA
bull ldquoThe new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales ranging from individual actors to
the largest groupings through the medium of computationrdquo
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla Claudio (2014) Introduction to Computational Social Science Springer-Verlag London
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE ldquoPOINT AND LINE TO (MULTIPLE) PLANE(S)rdquo RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE ldquoTATEL TELESCOPErdquo BY EP_JHUIS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
Time
More
Less
bull Speed and performance of IT (CPU RAM Network)
bull Access to IT Internet
bull Amount of data generated
bull Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE ldquoHOME VISITrdquo BY NICOLAS NOVAIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS THE BACKGROUND IMAGE ldquoCAMEacuteRA DE SURVEILLANCErdquo BY TRISTAN NITOT
IS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
NOT A SILVER BULLET
COMPUTATIONAL SOCIAL SCIENCE IS
THE BACKGROUND IMAGE ldquo9MM BULLET BWrdquo BY AN NGUYENIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
Computational Social Science
proposes revolutionary opportunities
for the social sciences but it has still
some challenges in relation to
methods interdisciplinary
cooperation and research ethics
1 Solving increasingly complex problems The problems of global
world are complex computational methods might be able to solve
these complex issues
2 The rise of data The amounts of data has exploded during the 21st
century
3 IT and Instrumental revolution all the new tools and possibilities
4 Complex systems modeling our dynamic organisations and societies
5 Social networks modeling human behavior as networks
6 Making predictions and simulations predicting future from the past
7 Interdisciplinary field (social sciences math computer sciencehellip)
8 Many problems and challenges especially regarding research
ethics
CSS COMPONENTS
bull Information processing paradigm has two aspects in relation
to CSS
1 Information processing is substantive to the complex
systems of society that CSS researches This means that
information processing is takes part in forming and
evolution of complex systems
2 Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
  • Generate, retrieve, sort, modify, transform, … data
2. Social Networks
  • Network analysis and social networks
3. Social Complexity
  • Social complexity, complex adaptive systems, complex systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation for the other areas of CSS
• Raw data can be used as
1. Data for its own sake, as research data -> the data is the subject of research
2. Data for modeling or validating other phenomena via, e.g., network analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, … for research purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
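The "retrieve, modify, transform" chain on this slide can be illustrated with a minimal sketch. It is not taken from Cioffi-Revilla (2014); the example posts and the function names `tokenize` and `term_counts` are hypothetical and serve only to show raw text being turned into a small term-frequency table.

```python
import re
from collections import Counter

# Hypothetical raw posts; in a real study these would be retrieved from a data source.
RAW_POSTS = [
    "Flu season is here. Half the office has the flu!",
    "Election debate tonight: who won the debate?",
    "Got my flu shot today.",
]


def tokenize(text):
    """Modify step: lowercase the text and keep only alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(posts):
    """Transform step: turn a list of raw posts into a term-frequency table."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts


counts = term_counts(RAW_POSTS)
print(counts.most_common(3))
```

The resulting table could be studied for its own sake, or fed into a network model or a simulation, which matches the two uses of raw data listed above.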
• Network analysis has a long tradition (it is a much older field than CSS)
• Online social networks (Facebook, Twitter, etc.) are just one part of network analysis
• Many other social interactions can be modeled as networks -> thus social networks are not technology dependent as such
  • -> e.g. modeling a family as a network
  • -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
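As a minimal sketch of the "family as a network" idea, a family can be written as a list of ties and turned into an adjacency structure. The names, the ties, and the helpers `build_network` and `degrees` are hypothetical; only the Python standard library is used.

```python
from collections import defaultdict

# Hypothetical family ties; each pair is one undirected relationship.
FAMILY_TIES = [
    ("Anna", "Ben"),    # spouses
    ("Anna", "Clara"),  # parent and child
    ("Ben", "Clara"),
    ("Anna", "David"),
    ("Ben", "David"),
    ("Clara", "Erik"),  # Clara's partner
]


def build_network(edges):
    """Build an undirected network as a mapping from node to its set of neighbours."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degrees(adjacency):
    """Degree of each node: the number of direct ties a person has."""
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


network = build_network(FAMILY_TIES)
family_degrees = degrees(network)
for person in sorted(family_degrees):
    print(person, family_degrees[person])
```

Nothing here depends on an online platform: the same structure would describe a project team if the ties were, for example, "works with" relations.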
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (a multi-stage process)
  • Need -> intent -> capacity -> implementation
  • Goal
• Information processing in many parts of complex adaptive systems
  • To help adaptation, allocating resources, coordination, …
• Family as a complex adaptive system
  • Development, hardships, births, deaths, successes, failures
  • Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
  • Tangible (e.g. roads, buildings)
  • Intangible (e.g. organisations, social structures)
SIMON’S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
  • System Dynamics Models (e.g. modeling a nuclear plant)
  • Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
  • Cellular automata (e.g. Game of Life, http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life, http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
  • Agent-based models (e.g. modeling the communication of a project organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
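To make the cellular automaton example concrete, here is a minimal sketch of one generation of Conway's Game of Life under its standard rules: a live cell survives with two or three live neighbours, and a dead cell becomes live with exactly three. The representation (a set of live `(x, y)` coordinates on an unbounded grid) and the function name `step` are choices made for this illustration.

```python
from collections import Counter


def step(live_cells):
    """Advance the Game of Life by one generation.

    live_cells is a set of (x, y) coordinates of live cells on an unbounded grid.
    """
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly 3 neighbours; survival with 2 or 3 neighbours.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live_cells)
    }


# A "blinker": three cells in a row that flip between horizontal and vertical.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(step(blinker)))
```

Each cell follows the same local rule, yet patterns such as the blinker emerge at the level of the whole grid, which is why the Game of Life is a common first example of object-oriented simulation.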
• 4 main areas of Computational Social Science
1. Big data and automated information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has a lot of problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science fields or methods. Instead, CSS complements other research fields and methods
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on flu-related search queries
• For example
  • Achrekar H., Gandhe A., Lazarus R., Ssu-Hsin Yu, Benyuan Liu. 2011. Predicting Flu Trends using Twitter data. Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec J., Backstrom L., Kleinberg J. 2009. Meme-tracking and the dynamics of the news cycle. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 497-506
• Tracking new topics, ideas, and memes across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt spikes in the appearance of particular named entities. However, these approaches are less well suited to the identification of content that spreads widely and then fades over time scales on the order of days - the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
• Athanasiadis I. N., Mentes A. K., Mitkas P. A., Mylopoulos Y. A. 2005. A Hybrid Agent-Based Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81, pp. 175-187. doi:10.1177/0037549705053172
• Picardi A. C. and Saeed K. 1979. The dynamics of water policy in southwestern Saudi Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini T., Laffite N. B., Cointet J.-P., Gray I., Zabban V., De Pryck K. 2014. Three maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data & Society, July-December 2014, 1, 2053951714543804, first published on August 5, 2014. doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big data? Information Technology, Volume 56, Issue 5, Pages 246–253, ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046, September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections of these socially generated footprints, often known as big data, could help us to re-investigate different aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through the analysis of socially generated data on the web during electoral campaigns. Such data offer considerable possibility for improving our awareness of popularity dynamics. However, they also suffer from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss potential ways around such problems, suggesting the nature of different political systems and contexts might lend differing levels of predictive power to certain types of data source. We offer an initial exploratory test of these ideas, focussing on two data streams: Wikipedia page views and Google search queries. On the basis of this data, we present popularity dynamics from real case examples of recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
  • http://www.hiit.fi/digivaalit-2015
  • Researching the 2015 parliamentary elections in Finland, focusing on digital media data (Twitter, Facebook)
  • Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
  • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
  • Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The hugely popular Finnish Suomi24 discussion forum has 1.9 million monthly visitors, who use the online town square to talk about anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn about Finnish society? A team of media professionals at the forum's owner company, Aller, and researchers at the National Consumer Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
  • http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to what you heard
ETHICS
• Lazer D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723
• Conte R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346
• Anderson C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav L. and Levin J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721
• Wallach H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You!
Questions and comments?
Twitter: @laurieloranta
WHAT IS COMPUTATIONAL SOCIAL SCIENCE?
“In short, a computational social science is emerging [field] that leverages the capacity to collect and analyze data with an unprecedented breadth and depth and scale.” (Lazer et al. 2009)
Lazer D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723
LAZER ET AL 2009
• “The increasing integration of technology into our lives has created unprecedented volumes of data on society’s everyday behaviour. Such data opens up exciting new opportunities to work towards a quantitative understanding of our complex social systems, within the realms of a new discipline known as Computational Social Science. Against a background of financial crises, riots and international epidemics, the urgent need for a greater comprehension of the complexity of our interconnected global society and an ability to apply such insights in policy decisions is clear.” (Conte et al. 2012)
• Conte R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346
CSS MANIFESTO (CONTE ET AL 2012)
• “Computational social science refers to the academic sub-disciplines concerned with computational approaches to the social sciences. Fields include computational economics and computational sociology. It is a multi-disciplinary and integrated approach to social survey focusing on information processing by means of advanced information technology. The computational tasks include the analysis of social networks and social geographic systems.”
• (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science)
WIKIPEDIA
• “The new field of Computational Social Science can be defined as the interdisciplinary investigation of the social universe of many scales, ranging from individual actors to the largest groupings, through the medium of computation.” (Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S)” BY RODRIGO CARVALHO IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
bull Can electoral popularity be predicted using socially generated big
data Information Technology Volume 56 Issue 5 Pages 246ndash253
ISSN (Online) 2196-7032 ISSN (Print) 1611-2776 DOI 101515itit-
2014-1046 September 2014
bull Today our more-than-ever digital lives leave significant footprints in cyberspace Large scale collections
of these socially generated footprints often known as big data could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework In this contribution we discuss one
such possibility the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns Such data offer
considerable possibility for improving our awareness of popularity dynamics However they also suffer
from significant drawbacks in terms of representativeness and generalisability In this paper we discuss
potential ways around such problems suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source We offer an initial
exploratory test of these ideas focussing on two data streams Wikipedia page views and Google
search queries On the basis of this data we present popularity dynamics from real case examples of
recent elections in three different countries
PREDICTING ELECTIONS
bull DIGIVAALIT 2015
bull httpwwwhiitfidigivaalit-2015
bull Researching the parliamentary elections 2015 in Finland focusing on
digital media data (Twitter Facebook)
bull Trying to understand how media is used and how public agenda is set
bull CITIZEN MINDSCAPES
bull httpchallengehelsinkifiblogcitizen-mindscapes-kansakunnan-
mielentilabull Diving deep into the unscoped virtual territories of a nationrsquos collective consciousness may reveal something remarkable The
Finnish hugely popular Suomi24 discussion forum has 19 million monthly visitors who use the online town square to talk about
anything and everything close to their hearts If this data could be harnessed into research use what amazing things could we learn
about Finnish society A team of media professionals at the forums owner company Aller and researchers at the National Consumer
Research Center plan to make use of this immense database
DIGIVAALIT 2015 amp CITIZENMINDSCAPES
bull Listen the ldquoThe Trust Engineersrdquo podcast by Radiolab
bull httpwwwradiolaborgstorytrust-engineers
bull Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
bull Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull Conte R 2012 Manifesto of Computational Social Science The European Physical Journal Special Topics November 2012 Vol 214 Issue 1 pp 325-346
bull Anderson C 2008 The End of Theory The Data Deluge Makes the Scientific Method Obsolete Wired httparchivewiredcomsciencediscoveriesmagazine16-07pb_theory
bull Einav L and Levin J 2014 The Data Revolution and Economic Analysis In Innovation Policy and the Economy edited by Josh Lerner and Scott Stern httpwebstanfordedu~leinavpubsIPE2014pdf
bull King G 2011 Ensuring the Data-Rich Future of the Social Sciences Science 11 February 2011 Vol 331 no 6018 pp 719-721
bull Wallach H 2014 Big Data Machine Learning and the Social Sciences Fairness Accountability and Transparency Mediumcom httpsmediumcomhannawallachbig-data-machine-learning-and-thesocial-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments
twitter laurieloranta
• “In short, a computational social science is emerging [field] that
leverages the capacity to collect and analyze data with an
unprecedented breadth and depth and scale.”
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February
2009, Vol. 323, no. 5915, pp. 721-723.
LAZER ET AL. 2009
• “The increasing integration of technology into our lives has created
unprecedented volumes of data on society’s everyday behaviour. Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems, within the realms of a
new discipline known as Computational Social Science. Against a
background of financial crises, riots and international epidemics, the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear.” (Conte et al. 2012)
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The
European Physical Journal Special Topics, November 2012, Vol. 214,
Issue 1, pp. 325-346.
CSS MANIFESTO (CONTE ET AL. 2012)
• “Computational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences. Fields
include computational economics and computational sociology.
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology. The computational tasks include the analysis of social
networks and social geographic systems.”
• (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science)
WIKIPEDIA
• “The new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe on many scales, ranging from individual actors to
the largest groupings, through the medium of computation.”
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S)” BY RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
[Chart axes: Time; More / Less]
• Speed and performance of IT (CPU, RAM, network)
• Access to IT, Internet
• Amount of data generated
• Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in relation to
methods, interdisciplinary
cooperation and research ethics.
1. Solving increasingly complex problems: the problems of a global
world are complex; computational methods might be able to solve
these complex issues
2. The rise of data: the amount of data has exploded during the 21st
century
3. IT and instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, …)
8. Many problems and challenges, especially regarding research
ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation
to CSS:
1. Information processing is substantive to the complex
systems of society that CSS researches. This means that
information processing takes part in the formation and
evolution of complex systems.
2. Information processing is methodological in the sense
that it serves as the core instrument of CSS.
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
  • Generate, retrieve, sort, modify, transform, … data
2. Social Networks
  • Network analysis and social networks
3. Social Complexity
  • Social complexity, complex adaptive systems, complex
  systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as
1. Data for its own sake, as research data -> data is the subject of
research
2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, … for research
purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
• A long tradition in network analysis (a much older field than CSS)
• Social network services (Facebook, Twitter, etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
  • -> e.g. modeling a family as a network
  • -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
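The idea of modeling a family as a network can be sketched in a few lines of Python. This is only a minimal illustration, not a method from the lecture or from Cioffi-Revilla (2014): the names and ties are invented, and the helper functions `build_network` and `degree` are hypothetical names chosen for this example.

```python
from collections import defaultdict

# Hypothetical family ties (all names are invented for illustration).
ties = [
    ("Anna", "Ben"),     # spouses
    ("Anna", "Clara"),   # parent and child
    ("Ben", "Clara"),    # parent and child
    ("Anna", "David"),   # parent and child
    ("Ben", "David"),    # parent and child
    ("Clara", "Frank"),  # spouses
    ("Clara", "Eve"),    # parent and child
    ("Frank", "Eve"),    # parent and child
]

def build_network(edges):
    """Return an undirected adjacency map: person -> set of neighbours."""
    network = defaultdict(set)
    for a, b in edges:
        network[a].add(b)
        network[b].add(a)
    return dict(network)

def degree(network):
    """Count the direct ties of each person (unnormalised degree)."""
    return {person: len(neighbours) for person, neighbours in network.items()}

family = build_network(ties)
degrees = degree(family)
# The person with the most direct ties links the two households.
most_connected = max(degrees, key=degrees.get)
print(most_connected, degrees[most_connected])
```

No social media platform is involved: the network is just people and the ties between them, which is the sense in which social networks are not technology dependent.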
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
  • Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
  • To help adaptation, allocating resources, coordination, …
• Family as a complex adaptive system
  • Development, hardships, births, deaths, successes, failures
  • Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
  • Tangible (e.g. roads, buildings)
  • Intangible (e.g. organisations, social structures)
SIMON’S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
  • System Dynamics Models (e.g. modeling a nuclear plant)
  • Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
  • Cellular automata (e.g. Game of Life, http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
  http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
  • Agent-based models (e.g. modeling the communication of a project
  organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
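As a concrete instance of a cellular automaton, the sketch below runs Conway's Game of Life with its standard rules: a live cell survives with two or three live neighbours, and a dead cell becomes live with exactly three. The set-of-coordinates representation and the function name `step` are choices made for this example, not something prescribed by the lecture.

```python
from collections import Counter

def step(live):
    """Advance one generation; `live` is a set of (row, col) live cells."""
    # Count, for every cell next to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3 neighbours.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }

# A "blinker": three cells in a row that oscillate with period 2.
blinker = {(1, 0), (1, 1), (1, 2)}
after_one = step(blinker)
after_two = step(after_one)
print(sorted(after_one))
```

Each cell follows the same local rule, yet patterns such as oscillators emerge at the level of the whole grid, which is why cellular automata are grouped with the object-oriented models above.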
• 4 main areas of Computational Social Science
1. Big data and automatic information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has many open problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science
fields or methods. Instead, CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on
flu-related search queries
• For example:
  • Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu, Benyuan Liu. 2011. Predicting Flu
  Trends using Twitter data. Computer Communications Workshops (INFOCOM
  WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, pp. 497-506.
• Tracking new topics, ideas, and “memes” across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
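To give a feel for what clustering textual variants of a phrase can mean, here is a deliberately simple sketch. It is not the algorithm of Leskovec et al. (2009). It swaps in a greedy grouping by Jaccard similarity of word sets, and the function names, the 0.6 threshold and the example phrases are all assumptions made for this illustration.

```python
import re

def tokens(phrase):
    """Lower-case a phrase and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", phrase.lower()))

def jaccard(a, b):
    """Share of tokens two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_variants(phrases, threshold=0.6):
    """Greedily put each phrase in the first cluster whose seed phrase is similar enough."""
    clusters = []  # list of (seed_tokens, member_phrases)
    for phrase in phrases:
        toks = tokens(phrase)
        for seed, members in clusters:
            if jaccard(toks, seed) >= threshold:
                members.append(phrase)
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append((toks, [phrase]))
    return [members for _, members in clusters]

# Illustrative phrases only; not data from the paper.
phrases = [
    "you can put lipstick on a pig",
    "You can put lipstick on a pig, but it's still a pig",
    "the fundamentals of our economy are strong",
    "fundamentals of the economy are strong",
]
groups = cluster_variants(phrases)
for group in groups:
    print(group)
```

A greedy pass like this depends on input order and on the threshold, so it only hints at the problem; the paper's point is that doing this at Web scale requires scalable algorithms.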
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, 81,
175-187. doi:10.1177/0037549705053172
• Picardi, Anthony C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, 1: 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
  • http://www.hiit.fi/digivaalit-2015
  • Researching the 2015 parliamentary elections in Finland, focusing on
  digital media data (Twitter, Facebook)
  • Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
  • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
  • Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The
  Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about
  anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
  about Finnish society? A team of media professionals at the forum’s owner company Aller and researchers at the National Consumer
  Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
  • http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. et al. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You!
Questions and comments?
Twitter: @laurieloranta
bull ldquoThe increasing integration of technology into our lives has created
unprecedented volumes of data on societyrsquos everyday behaviour Such
data opens up exciting new opportunities to work towards a quantitative
understanding of our complex social systems within the realms of a
new discipline known as Computational Social Science Against a
background of financial crises riots and international epidemics the
urgent need for a greater comprehension of the complexity of our
interconnected global society and an ability to apply such insights in
policy decisions is clear (Conte et al 2012)
bull Conte R 2012 Manifesto of Computational Social Science The
European Physical Journal Special Topics November 2012 Vol 214
Issue 1 pp 325-346
CSS MANIFESTO(CONTE ET AL 2012)
bull ldquoComputational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences Fields
include computational economics and computational sociology
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology The computational tasks include the analysis of social
networks and social geographic systemsrdquo
bull (Wikipedia 2015 httpenwikipediaorgwikiComputational_social_science)
WIKIPEDIA
bull ldquoThe new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales ranging from individual actors to
the largest groupings through the medium of computationrdquo
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla Claudio (2014) Introduction to Computational Social Science Springer-Verlag London
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE ldquoPOINT AND LINE TO (MULTIPLE) PLANE(S)rdquo RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE ldquoTATEL TELESCOPErdquo BY EP_JHUIS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
Time
More
Less
bull Speed and performance of IT (CPU RAM Network)
bull Access to IT Internet
bull Amount of data generated
bull Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE ldquoHOME VISITrdquo BY NICOLAS NOVAIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS THE BACKGROUND IMAGE ldquoCAMEacuteRA DE SURVEILLANCErdquo BY TRISTAN NITOT
IS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
NOT A SILVER BULLET
COMPUTATIONAL SOCIAL SCIENCE IS
THE BACKGROUND IMAGE ldquo9MM BULLET BWrdquo BY AN NGUYENIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
Computational Social Science
proposes revolutionary opportunities
for the social sciences but it has still
some challenges in relation to
methods interdisciplinary
cooperation and research ethics
1 Solving increasingly complex problems The problems of global
world are complex computational methods might be able to solve
these complex issues
2 The rise of data The amounts of data has exploded during the 21st
century
3 IT and Instrumental revolution all the new tools and possibilities
4 Complex systems modeling our dynamic organisations and societies
5 Social networks modeling human behavior as networks
6 Making predictions and simulations predicting future from the past
7 Interdisciplinary field (social sciences math computer sciencehellip)
8 Many problems and challenges especially regarding research
ethics
CSS COMPONENTS
bull Information processing paradigm has two aspects in relation
to CSS
1 Information processing is substantive to the complex
systems of society that CSS researches This means that
information processing is takes part in forming and
evolution of complex systems
2 Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
BIG DATA amp AUTOMATED INFROMATION EXTRACTION
SOCIAL NETWORK ANALYSIS
COMPLEX SYSTEMS amp MODELING
SIMULATION
1
2
3
4THE MAIN AREAS OF CSS
bull Areas of Computational Social Science
1 (Big) Data amp automated data extraction
bull Generate retrieve sort modify transform hellip data
2 Social Networks
bull Network analysis and social networks
3 Social Complexity
bull Social complexity complex adaptive systems complex
systems modeling
4 Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
bull Data and automated information extraction can be seen as foundation
for the other areas of CSS
bull Raw data can be used as
1 Data for its own sake as research data -gt data is the subject of
research
2 Data for modeling or validating other phenomena via eg network
analysis complex systems analysis or simulation
bull Data is generated retrieved modified transformedhellip for research
purposes via computational automation
BIG DATA amp AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
bull A long tradition in network analysis (much older field than CSS)
bull Social Networks (Facebook Twitter etc) just one part of network
analysis
bull Many other social interactions can be modeled as networks -gt thus
social networks are not technology dependent as such
bull -gt eg modeling family as network
bull -gt eg modeling a project as network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
bull Society seen as a complex adaptive system
bull Phase transitions
bull Adaptation (multi stage process)
bull Need -gt intent -gt capacity -gt implementation
bull Goal
bull Information processing in many parts of Complex adaptive systems
bull To help adaptation allocating resources coordination hellip
bull Family as and complex adaptive system
bull Development hardships births deaths successes failures
bull Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Three types of systems
1 Natural systems
2 Human systems
3 Artificial systems
bull Artificial systems (or artifacts) exist because they have a function they
serve as adaptive buffers between humans and nature
bull Humans pursue the strategy of building artifacts to achieve goals
bull Two kinds of artificial systems working in synergy
bull Tanglible (eg roads buildings)
bull Intanglibe ( eg organisations social structures)
SIMONrsquoS THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Large (and old) research field
bull Two main areas of simulation
1 Variable-Oriented Models
bull System Dynamics Models (eg modeling a nuclear plant)
bull Queuing Models (eg modeling how a box office line behaves)
2 Object-Oriented Models
bull Cellular automate (eg Game of life httpenwikipediaorgwikiConway27s_Game_of_Life
httppmaveustuffjavascript-game-of-life-v311)
bull Agent based models (eg Modeling the communication of a project
organisation of many individuals)
bull Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
bull 4 main areas of Computational Social Science
1 Big data and automatic information extraction
2 Social networks
3 Social complexity
4 Simulation
bull Typically all of these working together
bull CSS has a lot of problems especially concerning privacy and ethics
bull CSS is not a silver bullet and it does not replace other social science
fields or methods Instead CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
bull Tracking and predicting how flu or other contagious diseases spread
bull Based on network and social media analysis and modeling
bull Many different variations one of the first Google Flu Trends based on
flu related search queries
bull For example
bull Achrekar H Gandhe A Lazarus R Ssu-Hsin Yu Benyuan Liu 2011 Predicting Flu
Trends using Twitter data Computer Communications Workshops (INFOCOM
WKSHPS) 2011 IEEE Conference on vol no pp702707 10-15 April 2011
MODELING THE SPREAD OF DISEASESALREADY AN EPIDEMOLOGY CLASSIC
bull httpwwwgoogleorgflutrendsintlen_us
GOOGLE FLU TRENDS
bull Leskovec J Backstrom L Kleinberg J 2009 Meme-tracking and the dynamics of
the news cycle Proceedings of the 15th ACM ACM SIGKDD international conference
on Knowledge discovery and data mining Pages 497-506 2009 - dlacmorg
bull Tracking new topics ideas and memes across the Web has been an issue of considerable interest
Recent work has developed methods for tracking topic shifts over long time scales as well as abrupt
spikes in the appearance of particular named entities However these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events
bull We develop a framework for tracking short distinctive phrases that travel relatively intact through on-line
text developing scalable algorithms for clustering textual variants of such phrases we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis
MODELING NEWS CYCLE DYNAMICS
bull Athanasiadis I N Mentes A K Mitkas P A Mylopoulos Y A 2005 A Hybrid Agent-
Based Model for Estimating Residential Water Demand SIMULATION March 2005 81
175-187 doi1011770037549705053172
bull Picardi C and Saeed K 1979The dynamics of water policy in southwestern Saudi
Arabia Anthony SIMULATION October 1979 vol 33 4 pp 109-118
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
bull Venturini T Laffite N B Cointet J-P Gray I Zabban V De Pryck K 2014Three
maps and three misunderstandings A digital mapping of climate diplomacy Big Data
amp Society July-December 2014 1 2053951714543804 first published on August 5 2014
doi1011772053951714543804
CLIMATE DIPLOMACY MAPPING
bull Can electoral popularity be predicted using socially generated big
data Information Technology Volume 56 Issue 5 Pages 246ndash253
ISSN (Online) 2196-7032 ISSN (Print) 1611-2776 DOI 101515itit-
2014-1046 September 2014
bull Today our more-than-ever digital lives leave significant footprints in cyberspace Large scale collections
of these socially generated footprints often known as big data could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework In this contribution we discuss one
such possibility the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns Such data offer
considerable possibility for improving our awareness of popularity dynamics However they also suffer
from significant drawbacks in terms of representativeness and generalisability In this paper we discuss
potential ways around such problems suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source We offer an initial
exploratory test of these ideas focussing on two data streams Wikipedia page views and Google
search queries On the basis of this data we present popularity dynamics from real case examples of
recent elections in three different countries
PREDICTING ELECTIONS
bull DIGIVAALIT 2015
bull httpwwwhiitfidigivaalit-2015
bull Researching the parliamentary elections 2015 in Finland focusing on
digital media data (Twitter Facebook)
bull Trying to understand how media is used and how public agenda is set
bull CITIZEN MINDSCAPES
bull httpchallengehelsinkifiblogcitizen-mindscapes-kansakunnan-
mielentilabull Diving deep into the unscoped virtual territories of a nationrsquos collective consciousness may reveal something remarkable The
Finnish hugely popular Suomi24 discussion forum has 19 million monthly visitors who use the online town square to talk about
anything and everything close to their hearts If this data could be harnessed into research use what amazing things could we learn
about Finnish society A team of media professionals at the forums owner company Aller and researchers at the National Consumer
Research Center plan to make use of this immense database
DIGIVAALIT 2015 amp CITIZENMINDSCAPES
bull Listen the ldquoThe Trust Engineersrdquo podcast by Radiolab
bull httpwwwradiolaborgstorytrust-engineers
bull Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
bull Lazer D et al 2009 Computational Social Science Science 6 February 2009 Vol 323 no 5915 pp 721-723
bull Conte R 2012 Manifesto of Computational Social Science The European Physical Journal Special Topics November 2012 Vol 214 Issue 1 pp 325-346
bull Anderson C 2008 The End of Theory The Data Deluge Makes the Scientific Method Obsolete Wired httparchivewiredcomsciencediscoveriesmagazine16-07pb_theory
bull Einav L and Levin J 2014 The Data Revolution and Economic Analysis In Innovation Policy and the Economy edited by Josh Lerner and Scott Stern httpwebstanfordedu~leinavpubsIPE2014pdf
bull King G 2011 Ensuring the Data-Rich Future of the Social Sciences Science 11 February 2011 Vol 331 no 6018 pp 719-721
bull Wallach H 2014 Big Data Machine Learning and the Social Sciences Fairness Accountability and Transparency Mediumcom httpsmediumcomhannawallachbig-data-machine-learning-and-thesocial-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments
twitter laurieloranta
• “Computational social science refers to the academic sub-disciplines
concerned with computational approaches to the social sciences. Fields
include computational economics and computational sociology.
It is a multi-disciplinary and integrated approach to social survey
focusing on information processing by means of advanced information
technology. The computational tasks include the analysis of social
networks and social geographic systems.”
• (Wikipedia 2015, http://en.wikipedia.org/wiki/Computational_social_science)
WIKIPEDIA
• “The new field of Computational Social Science can be
defined as the interdisciplinary investigation of the social
universe of many scales, ranging from individual actors to
the largest groupings, through the medium of computation.”
(Cioffi-Revilla 2014)
CIOFFI-REVILLA 2014
Cioffi-Revilla, Claudio (2014). Introduction to Computational Social Science. Springer-Verlag, London.
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE “POINT AND LINE TO (MULTIPLE) PLANE(S)” BY RODRIGO CARVALHO IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
IT IS FOREMOST AN
[Diagram labels: COMPUTER SCIENCE, SOCIAL SCIENCE, STATISTICS, COMPUTATIONAL SOCIAL SCIENCE]
[Chart: trends over time, on a scale from less to more]
• Speed and performance of IT (CPU, RAM, network)
• Access to IT and the Internet
• Amount of data generated
• Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in relation to
methods, interdisciplinary
cooperation and research ethics.
1. Solving increasingly complex problems: the problems of the global
world are complex, and computational methods might be able to solve
these complex issues
2. The rise of data: the amount of data has exploded during the 21st
century
3. IT and the instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, …)
8. Many problems and challenges, especially regarding research
ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation
to CSS:
1. Information processing is substantive to the complex
systems of society that CSS researches. This means that
information processing takes part in the formation and
evolution of complex systems.
2. Information processing is methodological in the sense
that it serves as the core instrument of CSS.
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
• Generate, retrieve, sort, modify, transform, … data
2. Social Networks
• Network analysis and social networks
3. Social Complexity
• Social complexity, complex adaptive systems, complex
systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as:
1. Data for its own sake, as research data -> data is the subject of
research
2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, … for research
purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
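The retrieve, modify and transform steps listed above can be sketched in a few lines of Python. This is only an illustration: the three forum posts, the stopword list and the function names are invented for the example and do not come from the lecture or from any real dataset.

```python
import re
from collections import Counter

# Hypothetical raw data: invented posts standing in for a real corpus.
RAW_POSTS = [
    "The election debate was on TV last night!",
    "Election results: who won the debate?",
    "I prefer talking about the weather.",
]

# A tiny, made-up stopword list; real projects use much longer ones.
STOPWORDS = {"the", "was", "on", "who", "i", "about", "a", "an"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_frequencies(posts):
    """Transform raw posts into a term-frequency table, skipping stopwords."""
    counts = Counter()
    for post in posts:
        counts.update(t for t in tokenize(post) if t not in STOPWORDS)
    return counts


frequencies = term_frequencies(RAW_POSTS)
for term, n in frequencies.most_common(2):
    print(term, n)
```

The resulting table can be the subject of research in itself (what do people talk about?) or an input to network analysis or simulation.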
• A long tradition in network analysis (a much older field than CSS)
• Social networks (Facebook, Twitter, etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
• -> e.g. modeling a family as a network
• -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
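As a minimal sketch of "modeling a family as a network", the example below stores an invented five-person family as an undirected graph and computes two standard measures: degree centrality and shortest path length. The family members, the ties between them and the function names are assumptions made for illustration only.

```python
from collections import deque

# Hypothetical family: each pair is an undirected tie (e.g. regular contact).
TIES = [
    ("grandmother", "mother"),
    ("mother", "father"),
    ("mother", "daughter"),
    ("mother", "son"),
    ("father", "daughter"),
    ("father", "son"),
    ("daughter", "son"),
]


def build_graph(ties):
    """Return an adjacency mapping: person -> set of directly tied persons."""
    graph = {}
    for a, b in ties:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degree_centrality(graph):
    """Share of the other members that each person is directly tied to."""
    n = len(graph)
    return {node: len(nbrs) / (n - 1) for node, nbrs in graph.items()}


def shortest_path_length(graph, source, target):
    """Breadth-first search; number of ties on the shortest path, or None."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nbr in graph[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None


family = build_graph(TIES)
centrality = degree_centrality(family)
print(max(centrality, key=centrality.get))
print(shortest_path_length(family, "grandmother", "son"))
```

Nothing here depends on Facebook or Twitter: the same two functions work for any set of ties, which is the point of the slide.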
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
• Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
• To help adaptation, allocating resources, coordination, …
• Family as a complex adaptive system
• Development, hardships, births, deaths, successes, failures
• Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
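One simple way to reason about a multi-stage process such as need -> intent -> capacity -> implementation is to treat it as a chain in which every stage must succeed. The sketch below assumes, purely for illustration, that the stages are independent and that each has a known success probability; the numbers and the function name are invented and are not taken from the lecture.

```python
from math import prod

# Stage names follow the slide; the independence assumption is ours.
STAGES = ["need", "intent", "capacity", "implementation"]


def adaptation_success(stage_probabilities):
    """Probability that all stages succeed, assuming independent stages."""
    missing = [s for s in STAGES if s not in stage_probabilities]
    if missing:
        raise ValueError(f"missing stage probabilities: {missing}")
    return prod(stage_probabilities[s] for s in STAGES)


# Hypothetical example: every stage succeeds 90% of the time.
example = {stage: 0.9 for stage in STAGES}
print(round(adaptation_success(example), 4))
```

Even with fairly reliable stages, the chance of completing the whole chain is noticeably lower than the chance of completing any single stage, which is one reason adaptation is hard.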
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
• Tangible (e.g. roads, buildings)
• Intangible (e.g. organisations, social structures)
SIMON’S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
• System Dynamics Models (e.g. modeling a nuclear plant)
• Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
• Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
• Agent-based models (e.g. modeling the communication of a project
organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
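To make the cellular automaton example concrete, here is a minimal sketch of one generation of Conway's Game of Life in Python. It uses the standard rules (a dead cell with exactly three live neighbours is born; a live cell with two or three live neighbours survives); representing the board as a set of live coordinates is our own choice for brevity.

```python
from collections import Counter


def step(live_cells):
    """Compute the next generation from a set of live (x, y) cells."""
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours; survival on 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live_cells)
    }


# A "blinker": three cells in a row that flip between horizontal and vertical.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(step(blinker)))
```

The pattern illustrates the object-oriented idea: simple local rules applied to many cells produce behaviour at the level of the whole grid.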
• 4 main areas of Computational Social Science
1. Big data and automatic information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has a lot of problems, especially concerning privacy and ethics
• CSS is not a silver bullet and it does not replace other social science
fields or methods. Instead, CSS complements other research fields and
methods.
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on
flu-related search queries
• For example:
• Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu and Benyuan Liu 2011. Predicting Flu
Trends using Twitter data. Computer Communications Workshops (INFOCOM
WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
• http://www.google.org/flutrends/intl/en_us
GOOGLE FLU TRENDS
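As background for disease-spread modeling, the sketch below implements a classic discrete-time SIR (susceptible, infected, recovered) model. This is a textbook compartment model swapped in for illustration; it is not the method used by Google Flu Trends or by Achrekar et al., and the population size and the rates `beta` and `gamma` are invented values.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Return a list of (susceptible, infected, recovered) tuples, one per day.

    beta  : assumed infection rate per day
    gamma : assumed recovery rate per day
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical outbreak: 1000 people, one initial case.
history = simulate_sir(population=1000, initial_infected=1,
                       beta=0.3, gamma=0.1, days=120)
peak_day = max(range(len(history)), key=lambda d: history[d][1])
print("peak on day", peak_day)
```

Search-query or Twitter data would enter such a workflow as evidence for estimating or validating the curve, not as a replacement for the model.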
• Leskovec, J., Backstrom, L. and Kleinberg, J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD international conference
on Knowledge discovery and data mining, pp. 497-506.
• Tracking new topics, ideas, and “memes” across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
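The idea of clustering textual variants of a phrase can be illustrated with a very small sketch. The code below is not the algorithm of Leskovec et al.; it is a simple greedy grouping based on Jaccard similarity of word sets, with an arbitrary threshold of 0.5 and invented example phrases.

```python
def words(phrase):
    """Lowercased set of whitespace-separated words in the phrase."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two phrases' word sets (0.0 if both are empty)."""
    wa, wb = words(a), words(b)
    union = wa | wb
    if not union:
        return 0.0
    return len(wa & wb) / len(union)


def cluster_phrases(phrases, threshold=0.5):
    """Greedily attach each phrase to the first cluster whose seed is similar."""
    clusters = []
    for phrase in phrases:
        for cluster in clusters:
            if jaccard(phrase, cluster[0]) >= threshold:
                cluster.append(phrase)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append([phrase])
    return clusters


# Hypothetical phrases, as they might appear in slightly different forms online.
PHRASES = [
    "we will lower taxes for every family",
    "will lower taxes for every family",
    "we will lower the taxes for every family",
    "the weather is lovely today",
    "weather is lovely today",
]
clusters = cluster_phrases(PHRASES)
for cluster in clusters:
    print(len(cluster), cluster[0])
```

Counting how often each cluster appears per day would then give a rough time series of a meme's rise and fade.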
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A. and Mylopoulos, Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81,
pp. 175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V. and De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, 1: 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Yasseri, T. and Bright, J. 2014. Can electoral popularity be predicted using socially generated big
data? it - Information Technology, Volume 56, Issue 5, Pages 246-253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014.
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
INCREASINGLY COMPLEX SOCIETY
THE BACKGROUND IMAGE ldquoPOINT AND LINE TO (MULTIPLE) PLANE(S)rdquo RODRIGO CARVALHO
IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE ldquoTATEL TELESCOPErdquo BY EP_JHUIS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
IT IS FOREMOST AN
COMPUTER SCIENCE
SOCIAL SCIENCE
STATISTICS
COMPUTATIONAL SOCIAL SCIENCE
Time
More
Less
bull Speed and performance of IT (CPU RAM Network)
bull Access to IT Internet
bull Amount of data generated
bull Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE ldquoHOME VISITrdquo BY NICOLAS NOVAIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS THE BACKGROUND IMAGE ldquoCAMEacuteRA DE SURVEILLANCErdquo BY TRISTAN NITOT
IS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
NOT A SILVER BULLET
COMPUTATIONAL SOCIAL SCIENCE IS
THE BACKGROUND IMAGE ldquo9MM BULLET BWrdquo BY AN NGUYENIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
Computational Social Science
proposes revolutionary opportunities
for the social sciences but it has still
some challenges in relation to
methods interdisciplinary
cooperation and research ethics
1 Solving increasingly complex problems The problems of global
world are complex computational methods might be able to solve
these complex issues
2 The rise of data The amounts of data has exploded during the 21st
century
3 IT and Instrumental revolution all the new tools and possibilities
4 Complex systems modeling our dynamic organisations and societies
5 Social networks modeling human behavior as networks
6 Making predictions and simulations predicting future from the past
7 Interdisciplinary field (social sciences math computer sciencehellip)
8 Many problems and challenges especially regarding research
ethics
CSS COMPONENTS
bull Information processing paradigm has two aspects in relation
to CSS
1 Information processing is substantive to the complex
systems of society that CSS researches This means that
information processing is takes part in forming and
evolution of complex systems
2 Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
BIG DATA amp AUTOMATED INFROMATION EXTRACTION
SOCIAL NETWORK ANALYSIS
COMPLEX SYSTEMS amp MODELING
SIMULATION
1
2
3
4THE MAIN AREAS OF CSS
bull Areas of Computational Social Science
1 (Big) Data amp automated data extraction
bull Generate retrieve sort modify transform hellip data
2 Social Networks
bull Network analysis and social networks
3 Social Complexity
bull Social complexity complex adaptive systems complex
systems modeling
4 Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
bull Data and automated information extraction can be seen as foundation
for the other areas of CSS
bull Raw data can be used as
1 Data for its own sake as research data -gt data is the subject of
research
2 Data for modeling or validating other phenomena via eg network
analysis complex systems analysis or simulation
bull Data is generated retrieved modified transformedhellip for research
purposes via computational automation
BIG DATA amp AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
bull A long tradition in network analysis (much older field than CSS)
bull Social Networks (Facebook Twitter etc) just one part of network
analysis
bull Many other social interactions can be modeled as networks -gt thus
social networks are not technology dependent as such
bull -gt eg modeling family as network
bull -gt eg modeling a project as network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
bull Society seen as a complex adaptive system
bull Phase transitions
bull Adaptation (multi stage process)
bull Need -gt intent -gt capacity -gt implementation
bull Goal
bull Information processing in many parts of Complex adaptive systems
bull To help adaptation allocating resources coordination hellip
bull Family as and complex adaptive system
bull Development hardships births deaths successes failures
bull Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Three types of systems
1 Natural systems
2 Human systems
3 Artificial systems
bull Artificial systems (or artifacts) exist because they have a function they
serve as adaptive buffers between humans and nature
bull Humans pursue the strategy of building artifacts to achieve goals
bull Two kinds of artificial systems working in synergy
bull Tanglible (eg roads buildings)
bull Intanglibe ( eg organisations social structures)
SIMONrsquoS THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Large (and old) research field
bull Two main areas of simulation
1 Variable-Oriented Models
bull System Dynamics Models (eg modeling a nuclear plant)
bull Queuing Models (eg modeling how a box office line behaves)
2 Object-Oriented Models
bull Cellular automate (eg Game of life httpenwikipediaorgwikiConway27s_Game_of_Life
httppmaveustuffjavascript-game-of-life-v311)
bull Agent based models (eg Modeling the communication of a project
organisation of many individuals)
bull Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
bull 4 main areas of Computational Social Science
1 Big data and automatic information extraction
2 Social networks
3 Social complexity
4 Simulation
bull Typically all of these working together
bull CSS has a lot of problems especially concerning privacy and ethics
bull CSS is not a silver bullet and it does not replace other social science
fields or methods Instead CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
bull Tracking and predicting how flu or other contagious diseases spread
bull Based on network and social media analysis and modeling
bull Many different variations one of the first Google Flu Trends based on
flu related search queries
bull For example
bull Achrekar H Gandhe A Lazarus R Ssu-Hsin Yu Benyuan Liu 2011 Predicting Flu
Trends using Twitter data Computer Communications Workshops (INFOCOM
WKSHPS) 2011 IEEE Conference on vol no pp702707 10-15 April 2011
MODELING THE SPREAD OF DISEASESALREADY AN EPIDEMOLOGY CLASSIC
bull httpwwwgoogleorgflutrendsintlen_us
GOOGLE FLU TRENDS
bull Leskovec J Backstrom L Kleinberg J 2009 Meme-tracking and the dynamics of
the news cycle Proceedings of the 15th ACM ACM SIGKDD international conference
on Knowledge discovery and data mining Pages 497-506 2009 - dlacmorg
bull Tracking new topics ideas and memes across the Web has been an issue of considerable interest
Recent work has developed methods for tracking topic shifts over long time scales as well as abrupt
spikes in the appearance of particular named entities However these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events
bull We develop a framework for tracking short distinctive phrases that travel relatively intact through on-line
text developing scalable algorithms for clustering textual variants of such phrases we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis
MODELING NEWS CYCLE DYNAMICS
• Athanasiadis I. N., Mentes A. K., Mitkas P. A., Mylopoulos Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81,
pp. 175-187. doi:10.1177/0037549705053172
• Picardi A. C. and Saeed K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini T., Laffite N. B., Cointet J.-P., Gray I., Zabban V., De Pryck K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, vol. 1, 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014.
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
• http://www.hiit.fi/digivaalit-2015
• Researching the 2015 parliamentary elections in Finland, focusing on
digital media data (Twitter, Facebook)
• Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
• http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
• Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The
hugely popular Finnish Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about
anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
about Finnish society? A team of media professionals at the forum’s owner company, Aller, and researchers at the National Consumer
Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
• http://www.radiolab.org/story/trust-engineers/
• Think about and discuss the different research ethics issues raised by
what you heard
ETHICS
• Lazer D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte R. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav L. and Levin J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You!
Questions and comments?
Twitter: @laurieloranta
IT IS FOREMOST AN
INSTRUMENTAL REVOLUTION
THE BACKGROUND IMAGE “TATEL TELESCOPE” BY EP_JHU IS UNDER NON COMMERCIAL CREATIVE COMMONS LICENSE
[Diagram labels: Computer Science, Social Science, Statistics, Computational Social Science]
[Chart axes: Time; Less to More]
• Speed and performance of IT (CPU, RAM, network)
• Access to IT / Internet
• Amount of data generated
• Cost of IT
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS
NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in relation to
methods, interdisciplinary
cooperation, and research ethics
1. Solving increasingly complex problems: the problems of a global
world are complex, and computational methods might be able to solve
these complex issues
2. The rise of data: the amount of data has exploded during the 21st
century
3. IT and instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, …)
8. Many problems and challenges, especially regarding research
ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation
to CSS:
1. Information processing is substantive to the complex
systems of society that CSS researches. This means that
information processing takes part in the formation and
evolution of complex systems.
2. Information processing is methodological in the sense
that it serves as the core instrument of CSS.
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
• Generate, retrieve, sort, modify, transform, … data
2. Social Networks
• Network analysis and social networks
3. Social Complexity
• Social complexity, complex adaptive systems, complex
systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as:
1. Data for its own sake, as research data -> data is the subject of
research
2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis, or simulation
• Data is generated, retrieved, modified, transformed, … for research
purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
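A minimal Python sketch of automated information extraction: raw text messages are retrieved, transformed, and aggregated into counts that could serve as research data. The regular expression, the function name extract_hashtags, and the sample messages are assumptions made for this example, not part of the lecture material.

```python
import re
from collections import Counter

# Matches a '#' followed by letters, digits, or underscores.
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(messages):
    """Count hashtags across messages, ignoring upper/lower case."""
    counts = Counter()
    for text in messages:
        counts.update(tag.lower() for tag in HASHTAG.findall(text))
    return counts


# Made-up messages standing in for data collected from a social media service.
sample_messages = [
    "Voting today #Election2015 #vote",
    "Results are in! #election2015",
    "No hashtags in this one",
]
print(extract_hashtags(sample_messages).most_common())
```

The resulting counts can be studied for their own sake or fed into network analysis or simulation, matching the two uses of raw data listed above.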
• A long tradition in network analysis (a much older field than CSS)
• Social networks (Facebook, Twitter, etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
• -> e.g. modeling a family as a network
• -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
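To make the point that a family can be modeled as a network, here is a small Python sketch that stores ties between family members as an undirected graph and computes degree centrality (the share of other members each person is directly tied to). The member names, the ties, and the function names are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def build_network(ties):
    """Build an undirected network (node -> set of neighbours) from pairs."""
    network = defaultdict(set)
    for a, b in ties:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def degree_centrality(network):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(network)
    if n <= 1:
        return {node: 0.0 for node in network}
    return {node: len(neighbours) / (n - 1) for node, neighbours in network.items()}


# Hypothetical family ties (e.g. "talks weekly with"), invented for this example.
family_ties = [
    ("parent_a", "parent_b"),
    ("parent_a", "child_1"),
    ("parent_a", "child_2"),
    ("parent_b", "child_1"),
    ("parent_b", "child_2"),
    ("child_1", "child_2"),
    ("grandparent", "parent_a"),
]
family = build_network(family_ties)
for member, score in sorted(degree_centrality(family).items()):
    print(member, round(score, 2))
```

The same two functions work unchanged for a project team or any other set of social ties, which is the sense in which network analysis is not technology dependent.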
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
• Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
• To help adaptation, allocating resources, coordination, …
• Family as a complex adaptive system
• Development, hardships, births, deaths, successes, failures
• Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
• Tangible (e.g. roads, buildings)
• Intangible (e.g. organisations, social structures)
SIMON’S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
• System Dynamics Models (e.g. modeling a nuclear plant)
• Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
• Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
• Agent-based models (e.g. modeling the communication of a project
organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
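The cellular automaton mentioned above, Conway's Game of Life, fits in a few lines of Python. The sketch below represents the board as a set of live cell coordinates on an unbounded grid; that representation and the function name life_step are choices made for this example. The update rule is the standard one: a live cell survives with two or three live neighbours, and a dead cell becomes alive with exactly three.

```python
from collections import Counter


def life_step(live_cells):
    """Advance Conway's Game of Life by one generation.

    live_cells: set of (x, y) tuples for the cells that are alive.
    """
    # Count, for every cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth with exactly 3 neighbours; survival with 2 or 3.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live_cells)
    }


# A "blinker": three cells in a row that flip between horizontal and vertical.
blinker = {(0, 1), (1, 1), (2, 1)}
generation = blinker
for step in range(3):
    print(step, sorted(generation))
    generation = life_step(generation)
```

Agent-based models follow the same pattern of many simple local rules producing global behaviour, but with agents that carry their own state and decision logic instead of fixed grid cells.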
• 4 main areas of Computational Social Science
1. Big data and automated information extraction
2. Social networks
3. Social complexity
4. Simulation
• Typically all of these work together
• CSS has many problems, especially concerning privacy and ethics
• CSS is not a silver bullet, and it does not replace other social science
fields or methods. Instead, CSS complements other research fields and
methods.
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on
flu-related search queries
• For example:
  • Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu, Benyuan Liu. 2011. Predicting Flu
Trends using Twitter Data. Computer Communications Workshops (INFOCOM
WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
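Network-based disease modeling can be illustrated with a small susceptible-infected-recovered (SIR) simulation. This is a generic textbook-style sketch, not the method of the cited paper. The contact network `chain`, the function name `simulate_sir`, and the probabilities are hypothetical values chosen for illustration.

```python
import random


def simulate_sir(contacts, initially_infected, p_transmit, p_recover, steps, seed=0):
    """Run a discrete-time SIR epidemic on a contact network.

    contacts maps each person to a list of neighbours. Returns a list of
    (susceptible, infected, recovered) counts, one entry per time step.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    state = {person: "S" for person in contacts}
    for person in initially_infected:
        state[person] = "I"

    def counts(current):
        values = list(current.values())
        return (values.count("S"), values.count("I"), values.count("R"))

    history = [counts(state)]
    for _ in range(steps):
        next_state = dict(state)
        for person, status in state.items():
            if status != "I":
                continue
            # An infected person may infect each susceptible contact...
            for other in contacts[person]:
                if state[other] == "S" and rng.random() < p_transmit:
                    next_state[other] = "I"
            # ...and may recover at the end of the step.
            if rng.random() < p_recover:
                next_state[person] = "R"
        state = next_state
        history.append(counts(state))
    return history


# Hypothetical three-person chain: a knows b, b knows c.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
history = simulate_sir(chain, ["a"], p_transmit=1.0, p_recover=1.0, steps=3)
```

With both probabilities set to 1.0 the infection simply walks down the chain one person per step. Lower probabilities and a larger, more realistic network give the stochastic outbreaks that such models are normally used to study.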
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, pp. 497-506.
• Tracking new topics, ideas, and memes across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
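The idea of clustering textual variants of a phrase can be sketched with a simple word-overlap measure. This is not the algorithm from the cited paper. It is a deliberately naive greedy grouping based on Jaccard similarity, and the function names, the threshold of 0.5, and the example phrases are all assumptions made for illustration.

```python
import re


def tokens(phrase):
    """Lower-case a phrase and return its set of words."""
    return set(re.findall(r"[a-z']+", phrase.lower()))


def jaccard(a, b):
    """Share of words two token sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_phrases(phrases, threshold=0.5):
    """Greedily group phrases whose words overlap with a cluster's first member."""
    clusters = []
    for phrase in phrases:
        for cluster in clusters:
            if jaccard(tokens(phrase), tokens(cluster[0])) >= threshold:
                cluster.append(phrase)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            clusters.append([phrase])
    return clusters


# Hypothetical phrases: two variants each of two different "memes".
phrases = [
    "the quick brown fox jumps over the lazy dog",
    "quick brown fox jumps over a lazy dog",
    "computational social science is not a silver bullet",
    "social science is not a silver bullet",
]
clusters = cluster_phrases(phrases)
```

A greedy pass like this compares every phrase with every cluster, so it would not scale to web-sized text collections. It only shows what grouping variants means in practice.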
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81,
pp. 175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, vol. 1, 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints (often known as big data) could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams: Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
  • http://www.hiit.fi/digivaalit-2015
  • Researching the 2015 parliamentary elections in Finland, focusing on
digital media data (Twitter, Facebook)
  • Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
  • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
  • Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The
hugely popular Finnish Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about
anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
about Finnish society? A team of media professionals at the forum’s owner company, Aller, and researchers at the National Consumer
Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
  • http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments?
Twitter: @laurieloranta
FUNDAMENTAL CHANGES IN RESEARCH SETUP
THE BACKGROUND IMAGE “HOME VISIT” BY NICOLAS NOVA IS UNDER CREATIVE COMMONS LICENSE
MAJOR QUESTIONS REGARDING RESEARCH ETHICS
THE BACKGROUND IMAGE “CAMÉRA DE SURVEILLANCE” BY TRISTAN NITOT IS UNDER CREATIVE COMMONS LICENSE
COMPUTATIONAL SOCIAL SCIENCE IS
NOT A SILVER BULLET
THE BACKGROUND IMAGE “9MM BULLET BW” BY AN NGUYEN IS UNDER CREATIVE COMMONS LICENSE
Computational Social Science
offers revolutionary opportunities
for the social sciences, but it still
faces challenges in relation to
methods, interdisciplinary
cooperation, and research ethics
1. Solving increasingly complex problems: the problems of the global
world are complex, and computational methods might be able to solve
these complex issues
2. The rise of data: the amount of data has exploded during the 21st
century
3. IT and instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, …)
8. Many problems and challenges, especially regarding research
ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation
to CSS
  1. Information processing is substantive to the complex
systems of society that CSS researches. This means that
information processing takes part in the formation and
evolution of complex systems.
  2. Information processing is methodological in the sense
that it serves as the core instrument of CSS.
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
  1. (Big) Data & automated data extraction
    • Generate, retrieve, sort, modify, transform, … data
  2. Social Networks
    • Network analysis and social networks
  3. Social Complexity
    • Social complexity, complex adaptive systems, complex
systems modeling
  4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as
  1. Data for its own sake as research data -> data is the subject of
research
  2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis, or simulation
• Data is generated, retrieved, modified, transformed, … for research
purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
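A very small example of automated information extraction is pulling hashtags out of raw text and turning them into counts. The posts below are hypothetical, and the names `HASHTAG` and `extract_hashtags` are made up for this sketch. Real projects would retrieve the raw data from a platform or an archive first.

```python
import re
from collections import Counter

# A hashtag is taken here to be "#" followed by letters, digits or underscores.
HASHTAG = re.compile(r"#(\w+)")


def extract_hashtags(posts):
    """Retrieve hashtags from raw posts and transform them into counts."""
    counts = Counter()
    for post in posts:
        # Lower-casing merges variants such as #BigData and #bigdata.
        counts.update(tag.lower() for tag in HASHTAG.findall(post))
    return counts


# Hypothetical raw data standing in for collected social media posts.
posts = [
    "Lecture 1 of #CSS01 starts today #bigdata",
    "Reading about #BigData and research ethics",
    "No tags here",
]
hashtag_counts = extract_hashtags(posts)
```

The resulting counts could serve either as research data in their own right or as input for network analysis and simulation, which are the two uses of raw data listed on the slide.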
• A long tradition in network analysis (a much older field than CSS)
• Social networks (Facebook, Twitter, etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
  • -> e.g. modeling a family as a network
  • -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
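Modeling a family as a network needs no social media data at all. The sketch below assumes a hypothetical five-person family, stores the ties as an undirected graph, and computes two basic network measures: the number of ties per person (degree) and the number of steps between two people. All names are illustrative.

```python
from collections import deque

# Hypothetical family ties; each pair is an undirected relationship.
family_ties = [
    ("grandmother", "mother"),
    ("mother", "child_a"),
    ("mother", "child_b"),
    ("father", "child_a"),
    ("father", "child_b"),
]


def build_graph(edges):
    """Turn a list of ties into an adjacency mapping: person -> set of contacts."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degrees(graph):
    """Number of direct ties for each person."""
    return {person: len(contacts) for person, contacts in graph.items()}


def shortest_path_length(graph, start, goal):
    """Breadth-first search: fewest ties linking start to goal, or None."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        person, distance = queue.popleft()
        if person == goal:
            return distance
        for contact in graph[person]:
            if contact not in seen:
                seen.add(contact)
                queue.append((contact, distance + 1))
    return None


family = build_graph(family_ties)
family_degrees = degrees(family)
steps = shortest_path_length(family, "grandmother", "father")
```

The same structure works for a project organisation: replace the family members with team members and the ties with, for example, who reports to or communicates with whom.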
NOT A SILVER BULLET
COMPUTATIONAL SOCIAL SCIENCE IS
THE BACKGROUND IMAGE ldquo9MM BULLET BWrdquo BY AN NGUYENIS UNDER CREATIVE COMMONS LICENSE
SEE ORIGINAL IMAGE HERE SEE LICENSE TERMS HERE
Computational Social Science
proposes revolutionary opportunities
for the social sciences but it has still
some challenges in relation to
methods interdisciplinary
cooperation and research ethics
1 Solving increasingly complex problems The problems of global
world are complex computational methods might be able to solve
these complex issues
2 The rise of data The amounts of data has exploded during the 21st
century
3 IT and Instrumental revolution all the new tools and possibilities
4 Complex systems modeling our dynamic organisations and societies
5 Social networks modeling human behavior as networks
6 Making predictions and simulations predicting future from the past
7 Interdisciplinary field (social sciences math computer sciencehellip)
8 Many problems and challenges especially regarding research
ethics
CSS COMPONENTS
bull Information processing paradigm has two aspects in relation
to CSS
1 Information processing is substantive to the complex
systems of society that CSS researches This means that
information processing is takes part in forming and
evolution of complex systems
2 Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
BIG DATA amp AUTOMATED INFROMATION EXTRACTION
SOCIAL NETWORK ANALYSIS
COMPLEX SYSTEMS amp MODELING
SIMULATION
1
2
3
4THE MAIN AREAS OF CSS
bull Areas of Computational Social Science
1 (Big) Data amp automated data extraction
bull Generate retrieve sort modify transform hellip data
2 Social Networks
bull Network analysis and social networks
3 Social Complexity
bull Social complexity complex adaptive systems complex
systems modeling
4 Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
bull Data and automated information extraction can be seen as foundation
for the other areas of CSS
bull Raw data can be used as
1 Data for its own sake as research data -gt data is the subject of
research
2 Data for modeling or validating other phenomena via eg network
analysis complex systems analysis or simulation
bull Data is generated retrieved modified transformedhellip for research
purposes via computational automation
BIG DATA amp AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
bull A long tradition in network analysis (much older field than CSS)
bull Social Networks (Facebook Twitter etc) just one part of network
analysis
bull Many other social interactions can be modeled as networks -gt thus
social networks are not technology dependent as such
bull -gt eg modeling family as network
bull -gt eg modeling a project as network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
bull Society seen as a complex adaptive system
bull Phase transitions
bull Adaptation (multi stage process)
bull Need -gt intent -gt capacity -gt implementation
bull Goal
bull Information processing in many parts of Complex adaptive systems
bull To help adaptation allocating resources coordination hellip
bull Family as and complex adaptive system
bull Development hardships births deaths successes failures
bull Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Three types of systems
1 Natural systems
2 Human systems
3 Artificial systems
bull Artificial systems (or artifacts) exist because they have a function they
serve as adaptive buffers between humans and nature
bull Humans pursue the strategy of building artifacts to achieve goals
bull Two kinds of artificial systems working in synergy
bull Tanglible (eg roads buildings)
bull Intanglibe ( eg organisations social structures)
SIMONrsquoS THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
  1. Variable-Oriented Models
    • System Dynamics Models (e.g. modeling a nuclear plant)
    • Queuing Models (e.g. modeling how a box office line behaves)
  2. Object-Oriented Models
    • Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life, http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
    • Agent-based models (e.g. modeling the communication of a project organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
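To make the cellular automaton idea concrete, here is a minimal sketch of one generation of Conway's Game of Life using the standard rules (a dead cell with exactly 3 live neighbours is born; a live cell with 2 or 3 live neighbours survives). Representing the grid as a set of live (row, col) cells is an illustrative choice, not something prescribed by the lecture.

```python
def step(live_cells):
    """Advance Conway's Game of Life by one generation.

    live_cells is a set of (row, col) tuples on an unbounded grid.
    """
    neighbour_counts = {}
    for (r, c) in live_cells:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if dr == 0 and dc == 0:
                    continue  # a cell is not its own neighbour
                cell = (r + dr, c + dc)
                neighbour_counts[cell] = neighbour_counts.get(cell, 0) + 1
    # Birth with exactly 3 neighbours; survival with 2 or 3.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live_cells)
    }


# The "blinker" pattern oscillates between a horizontal and a vertical bar.
blinker = {(1, 0), (1, 1), (1, 2)}
next_gen = step(blinker)
print(sorted(next_gen))
```

Each cell follows the same local rule, yet patterns such as oscillators emerge at the level of the whole grid, which is why cellular automata are counted among the object-oriented models.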
• 4 main areas of Computational Social Science
  1. Big data and automatic information extraction
  2. Social networks
  3. Social complexity
  4. Simulation
• Typically all of these work together
• CSS has many open problems, especially concerning privacy and ethics
• CSS is not a silver bullet, and it does not replace other social science fields or methods. Instead, CSS complements other research fields and methods.
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on flu-related search queries
• For example:
  • Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu, Benyuan Liu. 2011. Predicting Flu Trends using Twitter Data. Computer Communications Workshops (INFOCOM WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
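A rough sketch of network-based spread modeling, assuming a simple discrete-time SIR (susceptible, infected, recovered) process on a contact network. This is a generic textbook-style model chosen for illustration, not the method of the cited paper or of Google Flu Trends; the ring network, the infection probability and the function name simulate_sir are all hypothetical.

```python
import random


def simulate_sir(contacts, patient_zero, p_infect, steps, seed=0):
    """Discrete-time SIR spread on a contact network.

    contacts: {person: set of neighbours}. In each step every infected
    person tries to infect each susceptible neighbour with probability
    p_infect and then recovers.
    Returns a list of (susceptible, infected, recovered) counts,
    recorded at the start of each step.
    """
    rng = random.Random(seed)
    state = {person: "S" for person in contacts}
    state[patient_zero] = "I"
    history = []
    for _ in range(steps):
        counts = tuple(sum(1 for s in state.values() if s == k) for k in "SIR")
        history.append(counts)
        infected = [p for p, s in state.items() if s == "I"]
        newly_infected = set()
        for person in infected:
            for neighbour in contacts[person]:
                if state[neighbour] == "S" and rng.random() < p_infect:
                    newly_infected.add(neighbour)
        for person in infected:
            state[person] = "R"
        for person in newly_infected:
            state[person] = "I"
    return history


# Hypothetical contact network: a ring of 10 people.
ring = {i: {(i - 1) % 10, (i + 1) % 10} for i in range(10)}
history = simulate_sir(ring, patient_zero=0, p_infect=0.5, steps=8)
for t, (s, i, r) in enumerate(history):
    print(t, s, i, r)
```

In a real study the contact network and the infection probability would be estimated from data (for example from social media or search activity), which is where the data extraction and network analysis areas meet simulation.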
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of the news cycle. Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 497-506. dl.acm.org
  • Tracking new topics, ideas, and “memes” across the Web has been an issue of considerable interest. Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt spikes in the appearance of particular named entities. However, these approaches are less well suited to the identification of content that spreads widely and then fades over time scales on the order of days - the time scale at which we perceive news and events.
  • We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81, pp. 175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data & Society, July-December 2014, vol. 1, 2053951714543804, first published on August 5, 2014. doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big data? Information Technology, Volume 56, Issue 5, Pages 246-253, ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046, September 2014.
  • Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections of these socially generated footprints, often known as big data, could help us to re-investigate different aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through the analysis of socially generated data on the web during electoral campaigns. Such data offer considerable possibility for improving our awareness of popularity dynamics. However, they also suffer from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss potential ways around such problems, suggesting the nature of different political systems and contexts might lend differing levels of predictive power to certain types of data source. We offer an initial exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google search queries. On the basis of this data, we present popularity dynamics from real case examples of recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
  • http://www.hiit.fi/digivaalit-2015
  • Researching the 2015 parliamentary elections in Finland, focusing on digital media data (Twitter, Facebook)
  • Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
  • http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
  • Diving deep into the unscoped virtual territories of a nation’s collective consciousness may reveal something remarkable. The Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn about Finnish society? A team of media professionals at the forum’s owner company Aller and researchers at the National Consumer Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to “The Trust Engineers” podcast by Radiolab
  • http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank you!
Questions and comments?
Twitter: @laurieloranta
Computational Social Science offers revolutionary opportunities for the social sciences, but it still faces challenges in relation to methods, interdisciplinary cooperation, and research ethics.
1. Solving increasingly complex problems: the problems of a global world are complex, and computational methods might be able to solve these complex issues
2. The rise of data: the amount of data has exploded during the 21st century
3. IT and instrumental revolution: all the new tools and possibilities
4. Complex systems: modeling our dynamic organisations and societies
5. Social networks: modeling human behavior as networks
6. Making predictions and simulations: predicting the future from the past
7. Interdisciplinary field (social sciences, math, computer science, …)
8. Many problems and challenges, especially regarding research ethics
CSS COMPONENTS
• The information processing paradigm has two aspects in relation to CSS
  1. Information processing is substantive to the complex systems of society that CSS researches. This means that information processing takes part in the formation and evolution of complex systems.
  2. Information processing is methodological in the sense that it serves as the core instrument of CSS.
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
1. BIG DATA & AUTOMATED INFORMATION EXTRACTION
2. SOCIAL NETWORK ANALYSIS
3. COMPLEX SYSTEMS & MODELING
4. SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
  1. (Big) Data & automated data extraction
    • Generate, retrieve, sort, modify, transform, … data
  2. Social Networks
    • Network analysis and social networks
  3. Social Complexity
    • Social complexity, complex adaptive systems, complex systems modeling
  4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation for the other areas of CSS
• Raw data can be used as
  1. Data for its own sake, as research data -> data is the subject of research
  2. Data for modeling or validating other phenomena via e.g. network analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed, … for research purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
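As a small illustration of automated information extraction, the sketch below transforms raw text into term counts that could then feed further analysis. The posts are invented stand-ins for retrieved data, and the tokenizer is deliberately naive (ASCII letters only); both are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical raw posts standing in for retrieved data.
raw_posts = [
    "Flu season is here!! Stay home if you have the FLU.",
    "Elections next week - who are you voting for?",
    "Got my flu shot today.",
]


def tokenize(text):
    """Lower-case a post and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def term_counts(posts):
    """Count how often each token appears across all posts."""
    counts = Counter()
    for post in posts:
        counts.update(tokenize(post))
    return counts


counts = term_counts(raw_posts)
print(counts["flu"])
print(counts.most_common(3))
```

The resulting counts are data in the second sense listed above: an input for modeling or validating something else, such as tracking how often flu is mentioned over time.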
bull Information processing paradigm has two aspects in relation
to CSS
1 Information processing is substantive to the complex
systems of society that CSS researches This means that
information processing is takes part in forming and
evolution of complex systems
2 Information processing is methodological in the sense
that it serves as the core instrument of CSS
COMPUTATIONAL PARADIGM OF SOCIETY
(Cioffi-Revilla 2014)
BIG DATA amp AUTOMATED INFROMATION EXTRACTION
SOCIAL NETWORK ANALYSIS
COMPLEX SYSTEMS amp MODELING
SIMULATION
1
2
3
4THE MAIN AREAS OF CSS
bull Areas of Computational Social Science
1 (Big) Data amp automated data extraction
bull Generate retrieve sort modify transform hellip data
2 Social Networks
bull Network analysis and social networks
3 Social Complexity
bull Social complexity complex adaptive systems complex
systems modeling
4 Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
bull Data and automated information extraction can be seen as foundation
for the other areas of CSS
bull Raw data can be used as
1 Data for its own sake as research data -gt data is the subject of
research
2 Data for modeling or validating other phenomena via eg network
analysis complex systems analysis or simulation
bull Data is generated retrieved modified transformedhellip for research
purposes via computational automation
BIG DATA amp AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
bull A long tradition in network analysis (much older field than CSS)
bull Social Networks (Facebook Twitter etc) just one part of network
analysis
bull Many other social interactions can be modeled as networks -gt thus
social networks are not technology dependent as such
bull -gt eg modeling family as network
bull -gt eg modeling a project as network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
bull Society seen as a complex adaptive system
bull Phase transitions
bull Adaptation (multi stage process)
bull Need -gt intent -gt capacity -gt implementation
bull Goal
bull Information processing in many parts of Complex adaptive systems
bull To help adaptation allocating resources coordination hellip
bull Family as and complex adaptive system
bull Development hardships births deaths successes failures
bull Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Three types of systems
1 Natural systems
2 Human systems
3 Artificial systems
bull Artificial systems (or artifacts) exist because they have a function they
serve as adaptive buffers between humans and nature
bull Humans pursue the strategy of building artifacts to achieve goals
bull Two kinds of artificial systems working in synergy
bull Tanglible (eg roads buildings)
bull Intanglibe ( eg organisations social structures)
SIMONrsquoS THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
bull Large (and old) research field
bull Two main areas of simulation
1 Variable-Oriented Models
bull System Dynamics Models (eg modeling a nuclear plant)
bull Queuing Models (eg modeling how a box office line behaves)
2 Object-Oriented Models
bull Cellular automate (eg Game of life httpenwikipediaorgwikiConway27s_Game_of_Life
httppmaveustuffjavascript-game-of-life-v311)
bull Agent based models (eg Modeling the communication of a project
organisation of many individuals)
bull Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
bull 4 main areas of Computational Social Science
1 Big data and automatic information extraction
2 Social networks
3 Social complexity
4 Simulation
bull Typically all of these working together
bull CSS has a lot of problems especially concerning privacy and ethics
bull CSS is not a silver bullet and it does not replace other social science
fields or methods Instead CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
bull Tracking and predicting how flu or other contagious diseases spread
bull Based on network and social media analysis and modeling
bull Many different variations one of the first Google Flu Trends based on
flu related search queries
bull For example
bull Achrekar H Gandhe A Lazarus R Ssu-Hsin Yu Benyuan Liu 2011 Predicting Flu
Trends using Twitter data Computer Communications Workshops (INFOCOM
WKSHPS) 2011 IEEE Conference on vol no pp702707 10-15 April 2011
MODELING THE SPREAD OF DISEASESALREADY AN EPIDEMOLOGY CLASSIC
bull httpwwwgoogleorgflutrendsintlen_us
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD international conference
on Knowledge discovery and data mining, pp. 497-506.
• Tracking new topics, ideas, and "memes" across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events.
• We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis.
MODELING NEWS CYCLE DYNAMICS
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-
Based Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81,
pp. 175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, vol. 1, 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246-253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
• DIGIVAALIT 2015
• http://www.hiit.fi/digivaalit-2015
• Researching the 2015 parliamentary elections in Finland, focusing on
digital media data (Twitter, Facebook)
• Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
• http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
• Diving deep into the unscoped virtual territories of a nation's collective consciousness may reveal something remarkable. The
Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about
anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
about Finnish society? A team of media professionals at the forum's owner company Aller and researchers at the National Consumer
Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to the "The Trust Engineers" podcast by Radiolab
• http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments
Twitter: @laurieloranta
1 BIG DATA & AUTOMATED INFORMATION EXTRACTION
2 SOCIAL NETWORK ANALYSIS
3 COMPLEX SYSTEMS & MODELING
4 SIMULATION
THE MAIN AREAS OF CSS
• Areas of Computational Social Science
1. (Big) Data & automated data extraction
• Generate, retrieve, sort, modify, transform … data
2. Social Networks
• Network analysis and social networks
3. Social Complexity
• Social complexity, complex adaptive systems, complex
systems modeling
4. Simulation
FOUR MAIN AREAS OF CSS
(Cioffi-Revilla 2014)
• Data and automated information extraction can be seen as the foundation
for the other areas of CSS
• Raw data can be used as
1. Data for its own sake, as research data -> data is the subject of
research
2. Data for modeling or validating other phenomena via e.g. network
analysis, complex systems analysis or simulation
• Data is generated, retrieved, modified, transformed … for research
purposes via computational automation
BIG DATA & AUTOMATED INFORMATION EXTRACTION
(Cioffi-Revilla 2014)
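The "retrieve, modify, transform" chain can be sketched in a few lines of Python. The example below is only a toy under stated assumptions: the three posts, the stopword list and the helper names `extract_terms` and `term_counts` are all invented for illustration, and real projects would read data from an API or a file instead.

```python
import re
from collections import Counter

# Hypothetical raw data: three short, made-up forum posts.
posts = [
    "Flu season is here again",
    "Got the flu, staying home",
    "Staying home with a cold",
]

# Assumed stopword list, valid for this toy example only.
STOPWORDS = {"is", "here", "again", "got", "the", "with", "a"}


def extract_terms(text):
    """Lowercase the text and return its words, minus stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def term_counts(documents):
    """Count how often each remaining term occurs across all documents."""
    counts = Counter()
    for doc in documents:
        counts.update(extract_terms(doc))
    return counts


counts = term_counts(posts)
print(counts.most_common(3))
```

The resulting counts could be studied for their own sake (use 1 above) or fed into a network, complexity or simulation model (use 2 above).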
• A long tradition in network analysis (a much older field than CSS)
• Social networks (Facebook, Twitter etc.) are just one part of network
analysis
• Many other social interactions can be modeled as networks -> thus
social networks are not technology dependent as such
• -> e.g. modeling a family as a network
• -> e.g. modeling a project as a network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
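A small Python sketch can show what "modeling a family as a network" looks like in code. The people and ties below are entirely hypothetical, and the helper names `build_graph`, `degrees` and `shortest_path_length` are assumptions made for this illustration; a dedicated network library would normally be used for real data.

```python
from collections import deque

# Hypothetical family ties: each pair is an undirected relationship.
ties = [
    ("Anna", "Ben"), ("Anna", "Cara"), ("Ben", "Cara"),
    ("Cara", "Dan"), ("Dan", "Eva"),
]


def build_graph(edges):
    """Build an adjacency mapping: person -> set of directly tied persons."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def degrees(graph):
    """Number of direct ties per person (degree centrality, unnormalised)."""
    return {node: len(neighbours) for node, neighbours in graph.items()}


def shortest_path_length(graph, source, target):
    """Breadth-first search; returns the number of ties on the shortest path."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for neighbour in graph[node] - seen:
            seen.add(neighbour)
            queue.append((neighbour, dist + 1))
    return None  # no path between the two people


family = build_graph(ties)
print(degrees(family))
print(shortest_path_length(family, "Anna", "Eva"))
```

The same structure works unchanged for a project organisation: only the meaning of a node and a tie changes, which is why network analysis is not technology dependent.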
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
• Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
• To help adaptation, allocating resources, coordination …
• Family as a complex adaptive system
• Development, hardships, births, deaths, successes, failures
• Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
1. Natural systems
2. Human systems
3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
• Tangible (e.g. roads, buildings)
• Intangible (e.g. organisations, social structures)
SIMON'S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
• System Dynamics Models (e.g. modeling a nuclear plant)
• Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
• Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
• Agent-based models (e.g. modeling the communication of a project
organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
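Conway's Game of Life, the cellular automaton named on this slide, fits in a few lines of Python. The sketch below stores only the live cells as (row, column) pairs on an unbounded grid; the function name `life_step` and the "blinker" starting pattern are choices made for this illustration.

```python
from collections import Counter


def life_step(live):
    """Return the next generation for a set of live (row, col) cells."""
    # For every cell next to a live cell, count its live neighbours.
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for (r, c) in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # A cell is alive in the next generation if it has exactly 3 live
    # neighbours, or if it is alive now and has exactly 2.
    return {
        cell
        for cell, n in neighbour_counts.items()
        if n == 3 or (n == 2 and cell in live)
    }


# The "blinker": three cells in a row that flip between horizontal and vertical.
blinker = {(1, 0), (1, 1), (1, 2)}
after_one = life_step(blinker)
after_two = life_step(after_one)
print(sorted(after_one))
print(after_two == blinker)
```

Each cell follows the same simple local rule, yet patterns such as oscillators and gliders emerge at the level of the whole grid, which is the property that makes cellular automata interesting for social simulation.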
bull A long tradition in network analysis (much older field than CSS)
bull Social Networks (Facebook Twitter etc) just one part of network
analysis
bull Many other social interactions can be modeled as networks -gt thus
social networks are not technology dependent as such
bull -gt eg modeling family as network
bull -gt eg modeling a project as network
SOCIAL NETWORKS
(Cioffi-Revilla 2014)
• Society seen as a complex adaptive system
• Phase transitions
• Adaptation (multi-stage process)
  • Need -> intent -> capacity -> implementation
• Goal
• Information processing in many parts of complex adaptive systems
  • To help adaptation, allocating resources, coordination, …
• Family as a complex adaptive system
  • Development, hardships, births, deaths, successes, failures
  • Adaptation over decades
SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Three types of systems
  1. Natural systems
  2. Human systems
  3. Artificial systems
• Artificial systems (or artifacts) exist because they have a function: they
serve as adaptive buffers between humans and nature
• Humans pursue the strategy of building artifacts to achieve goals
• Two kinds of artificial systems working in synergy
  • Tangible (e.g. roads, buildings)
  • Intangible (e.g. organisations, social structures)
SIMON'S THEORY OF ARTIFACTS AND SOCIAL COMPLEXITY
(Cioffi-Revilla 2014)
• Large (and old) research field
• Two main areas of simulation
  1. Variable-Oriented Models
    • System Dynamics Models (e.g. modeling a nuclear plant)
    • Queuing Models (e.g. modeling how a box office line behaves)
  2. Object-Oriented Models
    • Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
    • Agent-based models (e.g. modeling the communication of a project
organisation of many individuals)
• Also Evolutionary Models
SIMULATION
(Cioffi-Revilla 2014)
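As a small illustration of the cellular automata mentioned above, one generation of Conway's Game of Life can be computed as follows. This is a minimal sketch, assuming an unbounded grid and representing the board as a set of live `(x, y)` cells; the function name `step` and the example pattern are choices made here, not part of the lecture material.

```python
from collections import Counter


def step(live_cells):
    """Advance Conway's Game of Life by one generation.

    live_cells is a set of (x, y) coordinates of live cells on an unbounded grid.
    """
    # Count, for every cell next to a live cell, how many live neighbours it has.
    neighbour_counts = Counter(
        (x + dx, y + dy)
        for x, y in live_cells
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # A cell is alive in the next generation if it has exactly 3 live
    # neighbours, or if it is alive now and has exactly 2.
    return {
        cell
        for cell, count in neighbour_counts.items()
        if count == 3 or (count == 2 and cell in live_cells)
    }


# The "blinker": three cells in a row that flip between horizontal and vertical.
blinker = {(0, 1), (1, 1), (2, 1)}
after_one = step(blinker)
after_two = step(after_one)
print(sorted(after_one))
```

Each cell follows the same local rule, yet patterns such as the blinker emerge at the level of the whole grid, which is the property that makes object-oriented models attractive for social simulation.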
• 4 main areas of Computational Social Science
  1. Big data and automatic information extraction
  2. Social networks
  3. Social complexity
  4. Simulation
• Typically all of these work together
• CSS faces many open problems, especially concerning privacy and ethics
• CSS is not a silver bullet, and it does not replace other social science
fields or methods. Instead, CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
• Tracking and predicting how flu or other contagious diseases spread
• Based on network and social media analysis and modeling
• Many different variations; one of the first was Google Flu Trends, based on
flu-related search queries
• For example:
  • Achrekar, H., Gandhe, A., Lazarus, R., Ssu-Hsin Yu, Benyuan Liu. 2011. Predicting Flu
Trends using Twitter data. Computer Communications Workshops (INFOCOM
WKSHPS), 2011 IEEE Conference on, pp. 702-707, 10-15 April 2011.
MODELING THE SPREAD OF DISEASES: ALREADY AN EPIDEMIOLOGY CLASSIC
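To give a feel for what modeling the spread of a disease can look like in code, here is a minimal discrete-time SIR (susceptible, infected, recovered) sketch. It is a textbook compartment model, not the Twitter-based method of Achrekar et al. or the approach used by Google Flu Trends, and the population size and the `beta` and `gamma` rates are assumed values chosen only for illustration.

```python
def simulate_sir(population, initially_infected, beta, gamma, days):
    """Discrete-time SIR model; returns a list of (S, I, R) tuples, one per day.

    beta is the daily transmission rate and gamma the daily recovery rate.
    """
    s = float(population - initially_infected)
    i = float(initially_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Assumed parameters: 1000 people, one initial case, beta=0.4, gamma=0.1.
history = simulate_sir(population=1000, initially_infected=1,
                       beta=0.4, gamma=0.1, days=120)
peak_day = max(range(len(history)), key=lambda day: history[day][1])
print("infections peak on day", peak_day)
```

Data-driven approaches such as search-query or social media analysis can be seen as ways to estimate the current state of such a model from observed behaviour rather than from clinical reports.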
• http://www.google.org/flutrends/intl/en_us/
GOOGLE FLU TRENDS
• Leskovec, J., Backstrom, L., Kleinberg, J. 2009. Meme-tracking and the dynamics of
the news cycle. Proceedings of the 15th ACM SIGKDD International Conference
on Knowledge Discovery and Data Mining, pp. 497-506.
• "Tracking new topics, ideas, and memes across the Web has been an issue of considerable interest.
Recent work has developed methods for tracking topic shifts over long time scales, as well as abrupt
spikes in the appearance of particular named entities. However, these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events."
• "We develop a framework for tracking short, distinctive phrases that travel relatively intact through on-line
text; developing scalable algorithms for clustering textual variants of such phrases, we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis."
MODELING NEWS CYCLE DYNAMICS
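The abstract above speaks of clustering textual variants of a phrase. The sketch below shows the general idea with a deliberately simple stand-in: greedy grouping by word overlap (Jaccard similarity). This is not the algorithm of Leskovec et al.; the threshold value, the function names, and the example phrases are all assumptions made for illustration.

```python
def tokens(phrase):
    """Lowercase a phrase and split it into a set of words."""
    return set(phrase.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two word sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_phrases(phrases, threshold=0.5):
    """Greedily put each phrase into the first cluster whose seed is similar enough."""
    clusters = []  # each cluster is a list of phrases; the first item is its seed
    for phrase in phrases:
        for cluster in clusters:
            if jaccard(tokens(phrase), tokens(cluster[0])) >= threshold:
                cluster.append(phrase)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append([phrase])
    return clusters


# Made-up example phrases, not data from the paper.
phrases = [
    "the city council approved the new budget",
    "city council approved the new budget",
    "council gives green light to budget",
    "heavy snow closes schools on monday",
    "heavy snow closes all schools on monday",
]
clusters = cluster_phrases(phrases)
for cluster in clusters:
    print(cluster)
```

Note that the third phrase ends up in its own cluster even though it reports the same event: word overlap only catches near-verbatim variants, which is why real meme-tracking needs more robust methods.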
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81,
pp. 175-187. doi:10.1177/0037549705053172
• Picardi, A. C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, 1, 2053951714543804. First published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014.
• "Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries."
PREDICTING ELECTIONS
bull 4 main areas of Computational Social Science
1 Big data and automatic information extraction
2 Social networks
3 Social complexity
4 Simulation
bull Typically all of these working together
bull CSS has a lot of problems especially concerning privacy and ethics
bull CSS is not a silver bullet and it does not replace other social science
fields or methods Instead CSS complements other research fields and
methods
SUMMARY
SOME RESEARCH EXAMPLES
bull Tracking and predicting how flu or other contagious diseases spread
bull Based on network and social media analysis and modeling
bull Many different variations one of the first Google Flu Trends based on
flu related search queries
bull For example
bull Achrekar H Gandhe A Lazarus R Ssu-Hsin Yu Benyuan Liu 2011 Predicting Flu
Trends using Twitter data Computer Communications Workshops (INFOCOM
WKSHPS) 2011 IEEE Conference on vol no pp702707 10-15 April 2011
MODELING THE SPREAD OF DISEASESALREADY AN EPIDEMOLOGY CLASSIC
bull httpwwwgoogleorgflutrendsintlen_us
GOOGLE FLU TRENDS
bull Leskovec J Backstrom L Kleinberg J 2009 Meme-tracking and the dynamics of
the news cycle Proceedings of the 15th ACM ACM SIGKDD international conference
on Knowledge discovery and data mining Pages 497-506 2009 - dlacmorg
bull Tracking new topics ideas and memes across the Web has been an issue of considerable interest
Recent work has developed methods for tracking topic shifts over long time scales as well as abrupt
spikes in the appearance of particular named entities However these approaches are less well suited to
the identification of content that spreads widely and then fades over time scales on the order of days -
the time scale at which we perceive news and events
bull We develop a framework for tracking short distinctive phrases that travel relatively intact through on-line
text developing scalable algorithms for clustering textual variants of such phrases we identify a broad
class of memes that exhibit wide spread and rich variation on a daily basis
MODELING NEWS CYCLE DYNAMICS
• Athanasiadis, I. N., Mentes, A. K., Mitkas, P. A., Mylopoulos, Y. A. 2005. A Hybrid Agent-Based
Model for Estimating Residential Water Demand. SIMULATION, March 2005, vol. 81,
pp. 175-187. doi:10.1177/0037549705053172
• Picardi, C. and Saeed, K. 1979. The dynamics of water policy in southwestern Saudi
Arabia. SIMULATION, October 1979, vol. 33, no. 4, pp. 109-118.
SUSTAINABLE WATER DEMAND MANAGEMENT MODELING
• Venturini, T., Laffite, N. B., Cointet, J.-P., Gray, I., Zabban, V., De Pryck, K. 2014. Three
maps and three misunderstandings: A digital mapping of climate diplomacy. Big Data
& Society, July-December 2014, 1, 2053951714543804, first published on August 5, 2014.
doi:10.1177/2053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big
data? Information Technology, Volume 56, Issue 5, Pages 246–253,
ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046,
September 2014.
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections
of these socially generated footprints, often known as big data, could help us to re-investigate different
aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one
such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through
the analysis of socially generated data on the web during electoral campaigns. Such data offer
considerable possibility for improving our awareness of popularity dynamics. However, they also suffer
from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss
potential ways around such problems, suggesting the nature of different political systems and contexts
might lend differing levels of predictive power to certain types of data source. We offer an initial
exploratory test of these ideas, focussing on two data streams, Wikipedia page views and Google
search queries. On the basis of this data, we present popularity dynamics from real case examples of
recent elections in three different countries.
PREDICTING ELECTIONS
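As a rough illustration of what "popularity dynamics" from page views could look like in practice, the Python sketch below turns raw daily page-view counts into each candidate's daily share of attention. This is an assumed, simplified measure, not the method of the paper above; the function name, the candidate names, and all counts are made up for the example.

```python
def daily_shares(page_views):
    """Convert raw daily page-view counts per candidate into shares
    that sum to 1 for each day (or 0.0 when a day has no views)."""
    candidates = list(page_views)
    n_days = len(page_views[candidates[0]])
    shares = {name: [] for name in candidates}
    for day in range(n_days):
        total = sum(page_views[name][day] for name in candidates)
        for name in candidates:
            share = page_views[name][day] / total if total else 0.0
            shares[name].append(share)
    return shares


# Made-up counts for two hypothetical candidates over three days.
views = {
    "candidate_a": [100, 300, 200],
    "candidate_b": [300, 100, 200],
}
shares = daily_shares(views)
for name, series in shares.items():
    print(name, series)
```

Shares like these only describe online attention. As the abstract notes, such data have serious representativeness problems, so they should not be read as vote predictions without further validation.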
• DIGIVAALIT 2015
• http://www.hiit.fi/digivaalit-2015
• Researching the 2015 parliamentary elections in Finland, focusing on
digital media data (Twitter, Facebook)
• Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
• http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
• Diving deep into the unscoped virtual territories of a nation's collective consciousness may reveal something remarkable. The
hugely popular Finnish Suomi24 discussion forum has 1.9 million monthly visitors, who use the online town square to talk about
anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn
about Finnish society? A team of media professionals at the forum's owner company, Aller, and researchers at the National Consumer
Research Center plan to make use of this immense database.
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to "The Trust Engineers" podcast by Radiolab
• http://www.radiolab.org/story/trust-engineers/
• Think about and discuss different ethical research issues in relation to
what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723.
• Conte, R. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346.
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721.
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments?
Twitter: @laurieloranta
bull Venturini T Laffite N B Cointet J-P Gray I Zabban V De Pryck K 2014Three
maps and three misunderstandings A digital mapping of climate diplomacy Big Data
amp Society July-December 2014 1 2053951714543804 first published on August 5 2014
doi1011772053951714543804
CLIMATE DIPLOMACY MAPPING
• Can electoral popularity be predicted using socially generated big data? Information Technology, Volume 56, Issue 5, Pages 246–253, ISSN (Online) 2196-7032, ISSN (Print) 1611-2776, DOI: 10.1515/itit-2014-1046, September 2014
• Today, our more-than-ever digital lives leave significant footprints in cyberspace. Large scale collections of these socially generated footprints, often known as big data, could help us to re-investigate different aspects of our social collective behaviour in a quantitative framework. In this contribution we discuss one such possibility: the monitoring and predicting of popularity dynamics of candidates and parties through the analysis of socially generated data on the web during electoral campaigns. Such data offer considerable possibility for improving our awareness of popularity dynamics. However, they also suffer from significant drawbacks in terms of representativeness and generalisability. In this paper we discuss potential ways around such problems, suggesting the nature of different political systems and contexts might lend differing levels of predictive power to certain types of data source. We offer an initial exploratory test of these ideas, focussing on two data streams: Wikipedia page views and Google search queries. On the basis of this data, we present popularity dynamics from real case examples of recent elections in three different countries.
PREDICTING ELECTIONS
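To make the idea of "popularity dynamics" from page views concrete, here is a minimal sketch. It is not the method of the paper cited above: the party names, the view counts, and the helper functions `attention_shares` and `trend` are all made up for illustration. It assumes daily view counts per party are already collected, and that every party has the same number of days.

```python
from typing import Dict, List


def attention_shares(daily_views: Dict[str, List[int]]) -> Dict[str, List[float]]:
    """Convert raw daily view counts into each party's share of the day's total."""
    if not daily_views:
        return {}
    parties = list(daily_views)
    # Assumption: all parties have series of equal length.
    n_days = len(daily_views[parties[0]])
    shares: Dict[str, List[float]] = {party: [] for party in parties}
    for day in range(n_days):
        total = sum(daily_views[p][day] for p in parties)
        for p in parties:
            # A day with no views at all gives every party a share of 0.0.
            shares[p].append(daily_views[p][day] / total if total else 0.0)
    return shares


def trend(series: List[float], window: int = 3) -> float:
    """Mean of the last `window` values minus the mean of the first `window` values."""
    head, tail = series[:window], series[-window:]
    return sum(tail) / len(tail) - sum(head) / len(head)


# Made-up page-view counts for two hypothetical parties over six days.
views = {
    "party_a": [100, 120, 110, 150, 170, 180],
    "party_b": [100, 80, 90, 50, 30, 20],
}
shares = attention_shares(views)
for party, series in shares.items():
    print(party, round(trend(series), 3))
```

A rising share only shows that a party attracts more attention relative to the others. As the abstract notes, such data suffer from drawbacks in representativeness, so a trend like this is a signal to examine, not a forecast.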
• DIGIVAALIT 2015
• http://www.hiit.fi/digivaalit-2015
• Researching the 2015 parliamentary elections in Finland, focusing on digital media data (Twitter, Facebook)
• Trying to understand how media is used and how the public agenda is set
• CITIZEN MINDSCAPES
• http://challenge.helsinki.fi/blog/citizen-mindscapes-kansakunnan-mielentila
• "Diving deep into the unscoped virtual territories of a nation's collective consciousness may reveal something remarkable. The Finnish hugely popular Suomi24 discussion forum has 1.9 million monthly visitors who use the online town square to talk about anything and everything close to their hearts. If this data could be harnessed into research use, what amazing things could we learn about Finnish society? A team of media professionals at the forum's owner company Aller and researchers at the National Consumer Research Center plan to make use of this immense database."
DIGIVAALIT 2015 & CITIZEN MINDSCAPES
• Listen to the podcast "The Trust Engineers" by Radiolab
• http://www.radiolab.org/story/trust-engineers
• Think about and discuss the ethical research issues raised by what you heard
ETHICS
• Lazer, D. et al. 2009. Computational Social Science. Science, 6 February 2009, Vol. 323, no. 5915, pp. 721-723
• Conte, R. 2012. Manifesto of Computational Social Science. The European Physical Journal Special Topics, November 2012, Vol. 214, Issue 1, pp. 325-346
• Anderson, C. 2008. The End of Theory: The Data Deluge Makes the Scientific Method Obsolete. Wired. http://archive.wired.com/science/discoveries/magazine/16-07/pb_theory
• Einav, L. and Levin, J. 2014. The Data Revolution and Economic Analysis. In Innovation Policy and the Economy, edited by Josh Lerner and Scott Stern. http://web.stanford.edu/~leinav/pubs/IPE2014.pdf
• King, G. 2011. Ensuring the Data-Rich Future of the Social Sciences. Science, 11 February 2011, Vol. 331, no. 6018, pp. 719-721
• Wallach, H. 2014. Big Data, Machine Learning, and the Social Sciences: Fairness, Accountability, and Transparency. Medium.com. https://medium.com/@hannawallach/big-data-machine-learning-and-the-social-sciences-927a8e20460d
LECTURE 1 READING
Thank You
Questions and comments
Twitter: @laurieloranta