simulation in social sciences - lecture 6 in introduction to computational social science
SIMULATION IN SOCIAL SCIENCES
LECTURE 6, 16.9.2015
INTRODUCTION TO COMPUTATIONAL SOCIAL SCIENCE (CSS01)
LAURI ELORANTA
• LECTURE 1: Introduction to Computational Social Science [DONE]
• Tuesday 01.09. 16:00 – 18:00, U35, Seminar room 114
• LECTURE 2: Basics of Computation and Modeling [DONE]
• Wednesday 02.09. 16:00 – 18:00, U35, Seminar room 113
• LECTURE 3: Big Data and Information Extraction [DONE]
• Monday 07.09. 16:00 – 18:00, U35, Seminar room 114
• LECTURE 4: Network Analysis [DONE]
• Monday 14.09. 16:00 – 18:00, U35, Seminar room 114
• LECTURE 5: Complex Systems [DONE]
• Tuesday 15.09. 16:00 – 18:00, U35, Seminar room 114
• LECTURE 6: Simulation in Social Science [TODAY]
• Wednesday 16.09. 16:00 – 18:00, U35, Seminar room 113
• LECTURE 7: Ethical and Legal issues in CSS
• Monday 21.09. 16:00 – 18:00, U35, Seminar room 114
• LECTURE 8: Summary
• Tuesday 22.09. 17:00 – 19:00, U35, Seminar room 114
LECTURES SCHEDULE
• PART 1: Simulation Methodology
• PART 2: Variable-Oriented Simulation
• System Dynamics Models
• Queuing Models
• PART 3: Object-Oriented Simulation
• Cellular Automata Models
• Agent Based Models
• PART 4: Simulation Software
LECTURE 6 OVERVIEW
SIMULATION METHODOLOGY
• 1. “Simulation is the imitation of the operation of a real-world process or
system over time.
• 2. The act of simulating something first requires that a model be
developed;
• 3. This model represents the key characteristics or behaviors/functions
of the selected physical or abstract system or process.
• 4. The model represents the system itself, whereas the simulation
represents the operation of the system over time.”
• (Wikipedia 2015, Simulation)
• “A computer simulation is a simulation, run on a single computer, or a
network of computers, to reproduce behavior of a system. The simulation
uses an abstract model (a computer model, or a computational model) to
simulate the system.” (Wikipedia 2015, Computer Simulation)
SIMULATION DEFINED
• Large (and old) research field
• Two main areas of simulation
1. Variable-Oriented Models
• System Dynamics Models (e.g. modeling a nuclear plant)
• Queuing Models (e.g. modeling how a box office line behaves)
2. Object-Oriented Models
• Cellular automata (e.g. Game of Life: http://en.wikipedia.org/wiki/Conway%27s_Game_of_Life,
http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/)
• Agent based models (e.g. modeling the communication of a project
organisation of many individuals)
SIMULATION
(Cioffi-Revilla, 2014.)
• To investigate social complexity in ways that are not possible with other methods
• Versatility: systems that cannot be modeled mathematically can be simulated
• High dimensionality: simulations enable us to research systems and interactions with a high number of variables (states)
• Non-linearities: simulations are better at handling complex nonlinear dynamics
• Coupled systems: simulations offer a way to couple human, natural and artificial systems under research
• Stochasticity: simulations enable us to research probability / stochastic dynamics and how they create complex phenomena
• Incompleteness: social science is incomplete in the sense that not all parts of the social world are known in the same depth. Simulation gives us the possibility to research and try out alternative social universes.
• Experimentation: it is hard and ethically questionable to experiment with a real-life social system. With simulations of social systems, experimentation is possible.
• Policy analysis: simulation enables us to research the effects of policies in ways that traditional policy analysis cannot
WHY SIMULATE?
(Cioffi-Revilla, 2014.)
SIMULATION OVERVIEW
[Diagram. Elements: referent / target system in the real world; empirical data; conceptual model of the target system; formal model; simulation model; simulation system (software). Transitions: observation, abstraction, formalization, computational implementation, testable predictions, feedback. Groupings: “The Real World”, “The Model”, “The Simulation”.]
(Cioffi-Revilla, 2014.)
• 1. Highly abstract simulations: only minor
qualitative resemblance, no quantitative
resemblance.
• 2. Applied abstract models: convincing
qualitative fit and some quantitative fit to the real
world. Mostly theoretical models, but they provide
some applied insight.
• 3. Medium fidelity simulations: extensive
qualitative and quantitative fit to the real
world system.
• 4. High fidelity simulations: the qualitative
and quantitative level of modeling is closest to the
real world system.
FIDELITY OF REPRESENTATION
[Axis: level of fidelity and closeness to the real world system, from abstract to “realistic”]
(Cioffi-Revilla, 2014.)
Research question → Conceptual design: abstraction → Simulation implementation (in software) → Verification → Validation → Analysis
SIMULATION RESEARCH PROCESS
(Cioffi-Revilla, 2014.)
• The research question and its relationship to the referent system define
1. Borders of your referent system
2. Level of fidelity of the simulation
3. Simulation approach (variable vs. object orientation)
• Practical effects of research questions
• How does the empirical data relate to the research question
(so that it is not just an abstract simulation)?
• How computationally demanding is the problem in relation to the available
computing power?
• Research questions are typically quite interdisciplinary
1. RESEARCH QUESTION AND THE REFERENT SYSTEM
(Cioffi-Revilla, 2014.)
• The process of selecting a given set of features from a referent system for
modeling purposes is called abstraction. It produces a simplified
conceptual representation.
• Guided solely by the research question, and done in a way that does not
yet consider technical simulation details.
• Typical pitfall: in practice the particular simulation implementation may
guide the abstraction in some way (as with any other method)
2. ABSTRACTION
(Cioffi-Revilla, 2014.)
• Major decisions on how to map the referent system and formal
model into software and code
• Variable vs. agent based modeling
• Selection of software systems vs. coding from scratch
• Expertise of the team in relation to the needs of the simulation
• Potential future use of the simulation model and implications of this use
(e.g. should the simulation model be maintained and with what
process)
• The end result is the first version of the simulation model
• Implementation is an iterative process where one improves the
simulation model from version to version
3. SIMULATION IMPLEMENTATION
(Cioffi-Revilla, 2014.)
• A good summary on how to write, refactor and manage code and data:
• Gentzkow, Matthew and Jesse M. Shapiro. 2014. Code and Data for the
Social Sciences: A Practitioner’s Guide. University of Chicago mimeo,
http://faculty.chicagobooth.edu/matthew.gentzkow/research/CodeAndData.pdf
• Handles matters such as:
• Automation
• Version Control!
• Directories
• Data Keys
• Abstraction
• Documentation!
• Management
REMEMBER: MANAGING AND REFACTORING CODE & DATA
• The process of making sure that the simulation model works as planned,
i.e. it follows the conceptual model / research questions (internal validity)
• In practice this means making sure that the implementation is bug and
error free (debugging)
• Methods for achieving high quality software simulations
• Pair coding
• Automated (unit) testing
• Code-walkthroughs
• Profiling
4. VERIFICATION OF THE SIMULATION
(Cioffi-Revilla, 2014.)
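As a rough illustration of automated unit testing during verification, the sketch below tests a single update rule against what the conceptual model says it should do. The `step` function and its "stock never goes negative" rule are hypothetical and stand in for whatever rule your own model defines.

```python
import unittest


def step(stock, inflow, outflow):
    """Hypothetical one-step update: add inflow, subtract outflow, never go below zero."""
    return max(0.0, stock + inflow - outflow)


class StepTest(unittest.TestCase):
    def test_balanced_flows_leave_stock_unchanged(self):
        self.assertEqual(step(10.0, 3.0, 3.0), 10.0)

    def test_stock_never_negative(self):
        self.assertEqual(step(1.0, 0.0, 5.0), 0.0)

    def test_inflow_accumulates(self):
        self.assertEqual(step(0.0, 2.5, 0.0), 2.5)


def run_tests():
    """Run the test case and return True if every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(StepTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()


print("all tests passed:", run_tests())
```

Each test encodes one expectation from the conceptual model, so a failing test points at a mismatch between design and implementation.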
• Do the results of the simulation match empirical data?
• Pattern matching between simulation output and observed real world
system patterns
• Histograms, distribution moments, time series…
• Some simulation software already has features for model validation
testing
• Requires data that can be compared with the simulation results: sometimes this means
that new data needs to be gathered for simulation validation
• May also be part of the iterated calibration of the simulation model
5. VALIDATION OF THE SIMULATION
(Cioffi-Revilla, 2014.)
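One simple way to compare distribution moments is sketched below: it measures the gap in mean and standard deviation between simulated and observed samples. The data values, the function names and the tolerance are all made up for illustration; a real validation would use your own data and a justified criterion.

```python
from statistics import mean, pstdev


def moment_gap(simulated, observed):
    """Return absolute differences in mean and standard deviation between two samples."""
    return {
        "mean_gap": abs(mean(simulated) - mean(observed)),
        "std_gap": abs(pstdev(simulated) - pstdev(observed)),
    }


def is_close_enough(simulated, observed, tolerance):
    """Crude pattern-matching check: both moment gaps must be within the tolerance."""
    gaps = moment_gap(simulated, observed)
    return all(gap <= tolerance for gap in gaps.values())


# Hypothetical data: waiting times (minutes) from a simulation run and from field observation.
simulated_waits = [4.0, 5.0, 6.0, 5.0, 5.0]
observed_waits = [4.5, 5.5, 5.0, 6.0, 4.0]
print(moment_gap(simulated_waits, observed_waits))
print(is_close_enough(simulated_waits, observed_waits, tolerance=0.5))
```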
• Finding out what the simulation is able to tell about the referent system
• Typically consists of
• Formal analysis (e.g. sensitivity analysis)
• What-if-questions (what happens when we change some parts,
attributes or rules of the simulation)
• Scenario analysis (analysis of larger scenarios that combine the effect
of many parts of the system)
6. ANALYSIS OF THE SIMULATION
(Cioffi-Revilla, 2014.)
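A what-if or sensitivity analysis can be as simple as re-running the model while changing one parameter at a time. The sketch below assumes a hypothetical toy model (`final_stock`, a stock growing at a fixed rate) purely to show the shape of such a sweep.

```python
def final_stock(growth_rate, steps=10, initial=100.0):
    """Hypothetical toy model: a stock that grows by a fixed rate each step."""
    stock = initial
    for _ in range(steps):
        stock += growth_rate * stock
    return stock


def sweep(model, values):
    """One-at-a-time sensitivity analysis: run the model once per parameter value."""
    return {value: model(value) for value in values}


results = sweep(final_stock, [0.00, 0.01, 0.02, 0.05])
for rate, outcome in results.items():
    print(f"growth_rate={rate:.2f} -> final stock {outcome:.1f}")
```

Comparing the outcomes across parameter values shows how strongly the result depends on that one assumption.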
• Openly describing and then evaluating the process is the key!
• Especially the reasons behind different research decisions
• You should evaluate each step of the process in some way
• Formulation: clarity, originality and significance of the research
question. Motivation based on earlier research.
• Implementation: technical quality, tools selection and use, code
quality, algorithms used…
• Verification: is the model verified and in what way?
• Validation: is the model validated and in what way?
• Analysis: what is the overall level of model reliability and “confidence”
based on the process and verification and validation?
• Dissemination: how well is the model communicated, and how well can it be
communicated for different purposes?
ASSESSING THE SIMULATION PHASES
(Cioffi-Revilla, 2014.)
• “Truth-beauty-justice” (TBJ) evaluation criterion given by Lave and
March (1993):
• Truth: the model’s ability to produce causal understanding of the real world
(internal and external validation)
• Beauty: how well the model is presented (formal style, parsimony,
simplicity, syntactical structure, elegance)
• Justice: does the model contribute to a better world? As such it is a
normative criterion.
TBJ-EVALUATION CRITERION FOR SIMULATION MODELS
(Lave & March, 1993.)
Formulation → Implementation → Verification → Validation
BUILDING SIMULATION MODELS IS AN ITERATIVE AND INCREMENTAL PROCESS
(Cioffi-Revilla, 2014.)
VARIABLE-ORIENTED MODELS
SYSTEM DYNAMICS
• “System dynamics is an approach to understanding the nonlinear
behaviour of complex systems over time using stocks and flows,
internal feedback loops and time delays.” (Wikipedia 2015, System
Dynamics)
• “System dynamics (SD) is a methodology and mathematical modeling
technique for framing, understanding, and discussing complex issues
and problems. Originally developed in the 1950s to help corporate
managers improve their understanding of industrial processes, system
dynamics is currently being used throughout the public and private
sector for policy analysis and design.” (Wikipedia 2015, System
Dynamics)
1. SYSTEM DYNAMICS MODELS
• Both deterministic and stochastic implementations
• A lot of applications, traditionally in business management domain
• For example, Forrester (1998) mentions several generic applied system
dynamics models:
• Modeling stability and fluctuation in distribution systems,
• Modeling pricing and capital investment as they determine growth,
• Modeling promotion chains showing evolution into a top-heavy
distribution of management personnel when growth slows,
• Modeling imbalances between design, production, marketing, and
service as they influence market share.
SYSTEM DYNAMICS APPLICATIONS
• System dynamics research focuses typically on some or many of the
following aspects of the referent system:
• Variables and their dynamic change through time
• Causal relations between the variables of a system
• The role of noise in the system
• Macro level (variable) trajectories of change
• Emergent properties of the system caused by micro level variable
interactions
SYSTEM DYNAMICS RESEARCH FOCUS
(Cioffi-Revilla, 2014.)
• System dynamics modeling typically starts from modeling the
relationships of the system variables in Causal Loop Diagrams
• A Causal Loop Diagram describes the relationships between
variables and how their values change in relation to other variables
• The relationship can be positive, meaning that when variable x increases
or decreases, variable y does the same
• The relationship can be negative, meaning that when variable x
increases or decreases, variable y does the opposite
CAUSAL LOOP DIAGRAMS
(Cioffi-Revilla, 2014.)
• Closed connections in Causal Loop Diagrams are called feedback loops
• Feedback loops can be positive or negative
• Positive feedback loops reinforce change in the variable
• increasing and amplifying the change
• Negative feedback loops counteract change in the variable
• decreasing and dampening the change, which balances the system
• When two variables change in relation to each other but there needs
to be a certain period of time between these changes, there is a delay
in the relationship. Delay can also be modeled in Causal
Loop Diagrams with the notation ||
FEEDBACK LOOPS & DELAY
(Cioffi-Revilla, 2014.)
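A common rule of thumb for reading a Causal Loop Diagram is that a loop is reinforcing (positive) when the product of its link signs is positive and balancing (negative) otherwise. The sketch below applies that rule; the function name and the two example loops are illustrative assumptions, not taken from the lecture.

```python
def loop_polarity(link_signs):
    """Return 'reinforcing' or 'balancing' for a closed loop of signed causal links.

    Each link sign is +1 (same direction) or -1 (opposite direction).
    """
    product = 1
    for sign in link_signs:
        if sign not in (1, -1):
            raise ValueError("link signs must be +1 or -1")
        product *= sign
    return "reinforcing" if product > 0 else "balancing"


# Hypothetical loops: savings -> interest -> savings (both links positive),
# and room temperature -> heating -> room temperature (one negative link).
print(loop_polarity([+1, +1]))  # reinforcing
print(loop_polarity([-1, +1]))  # balancing
```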
Diagram is from blog “Systems and Us”: “Learn to Read Causal Loop Diagrams”. By Jenny Zhou 2012. http://systemsandus.com/2012/08/15/learn-to-read-clds/
• The second step in system dynamics modeling is producing a stock and
flow diagram of the system, which is a more quantitative way of representing
the variables (stocks) and their relationships (flows)
• Stock: a variable is represented by a stock (rectangle) that models the
cumulative amount of the variable at a given point in time
• For example, a bank balance can be understood as a “stock”
• Flow: the rate of change of a variable is represented by a flow (bow tie)
that models the change of the variable over a finite period of time
(second, minute, hour, day, month, year…)
• For example, deposits and interest are inbound flows for a bank balance
and withdrawals are outbound flows
STOCKS AND FLOWS
(Cioffi-Revilla, 2014.)
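The bank balance example can be sketched as a stock updated once per time step by its inbound and outbound flows. This is a minimal sketch; the monthly time step, the parameter values and the function name are assumptions chosen for illustration.

```python
def simulate_balance(initial, deposit, withdrawal, interest_rate, months):
    """Stock update per step: balance(t+1) = balance(t) + inflows - outflows."""
    balance = initial
    history = [balance]
    for _ in range(months):
        inflow = deposit + interest_rate * balance   # deposits and interest
        outflow = withdrawal                         # withdrawals
        balance = balance + inflow - outflow
        history.append(balance)
    return history


history = simulate_balance(initial=1000.0, deposit=100.0, withdrawal=50.0,
                           interest_rate=0.01, months=3)
print(history)
```

Because the interest flow depends on the stock itself, the model already contains a small positive feedback loop.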
Causal Loop Diagram
Stock and Flow Diagram
Diagram is from blog “MetaSD”: “Are causal loop diagrams useful?”. By Tom Fiddaman 2010. http://blog.metasd.com/2010/04/are-causal-loop-diagrams-useful/
VARIABLE-ORIENTED MODELS
QUEUING MODELS
• “Queueing theory is the mathematical study of waiting lines, or queues.
In queuing theory a model is constructed so that queue lengths and
waiting time can be predicted. Queueing theory is generally considered a
branch of operations research because the results are often used when
making business decisions about the resources needed to provide a
service.” (Wikipedia 2015, Queuing Theory)
• “Queue: A System consisting of one or more units or stations that service
or process a stream of incoming demands or requests is called a queue.
Formally, using Kendall’s notation, a given queue Q is denoted by a
triplet A/S/C, where A describes time between arrivals to the queue, S
describes servicing or processing, and C is the number of processors,
where C= 1,2,3…”
2. QUEUING MODELS
(Cioffi-Revilla, 2014.)
• There are many applications for queuing models, for example:
• A bank or airport check-in queue (or any other queue for that matter)
• Polity: public issues arise and they are addressed with policies
• Legislative body: introduction of bills and passing of laws
• Human information processing
QUEUING MODEL APPLICATIONS
(Cioffi-Revilla, 2014.)
• The key in any queuing model is to understand and model the dynamics between
(A) the arrival time/probability of new objects, (S) the servicing time of an
object and (C) the number of servicing components
• In a bank queue, A is the rate of new people coming into the queue, S is
the time needed for servicing one bank customer and C is the number of
bank counters.
• Real world referent systems may consist of many queues working in a
bigger queue network (objects moving from one queue to another), so
the whole system is formed of many subsystems.
QUEUING MODEL STRUCTURE
(Cioffi-Revilla, 2014.)
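A quick way to see how A, S and C interact is the utilization of the servers: mean arrival rate times mean service time, divided by the number of servers. The sketch below assumes A and S are summarized by their means; the bank figures are invented for illustration.

```python
def utilization(arrival_rate, mean_service_time, servers):
    """Fraction of total service capacity in use: rho = arrival_rate * mean_service_time / C."""
    if servers < 1:
        raise ValueError("need at least one server")
    return arrival_rate * mean_service_time / servers


# Hypothetical bank: 20 customers per hour, 5 minutes (1/12 hour) per customer.
for counters in (1, 2, 3):
    rho = utilization(arrival_rate=20, mean_service_time=1 / 12, servers=counters)
    status = "stable" if rho < 1 else "queue keeps growing in the long run"
    print(f"C={counters}: utilization={rho:.2f} ({status})")
```

When utilization is 1 or above, customers arrive faster than they can be served on average, so adding counters is the only way to keep the line bounded.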
• 1. Determine arrival time/probability A. It is a continuous random
variable defined by a probability density function p(t).
• 2. Determine service time S. It is a continuous random variable defined
by a probability density function p(s).
• 3. Determine the number of service components C, which is a
discrete variable with finite integer values (1,2,3…)
• A and S are typically estimated from empirical data.
ABSTRACTING A QUEUING MODEL
(Cioffi-Revilla, 2014.)
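Once A, S and C are fixed, the queue can be simulated directly. The sketch below assumes exponentially distributed inter-arrival and service times and first-in-first-out service; these are common textbook assumptions, not something the lecture prescribes, and in practice the distributions would be estimated from empirical data.

```python
import heapq
import random


def simulate_queue(arrival_rate, service_rate, servers, customers, seed=0):
    """Simulate a FIFO queue and return the mean waiting time before service starts.

    Assumes exponential inter-arrival and service times.
    """
    rng = random.Random(seed)
    free_at = [0.0] * servers          # time at which each server next becomes free
    heapq.heapify(free_at)
    clock = 0.0
    total_wait = 0.0
    for _ in range(customers):
        clock += rng.expovariate(arrival_rate)        # A: time of the next arrival
        earliest_free = heapq.heappop(free_at)
        start = max(clock, earliest_free)             # wait if all servers are busy
        total_wait += start - clock
        service_time = rng.expovariate(service_rate)  # S: service duration
        heapq.heappush(free_at, start + service_time)
    return total_wait / customers


for c in (1, 2, 3):
    print(c, round(simulate_queue(0.9, 1.0, c, customers=2000), 2))
```

Running the same arrival stream against different values of C shows how quickly waiting time falls as service components are added.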
• When modeling a queue, one can decide between different scheduling
policies for the queue, meaning in which order the objects in the queue
get served.
• Best-known scheduling policies:
• First-in-first-out (FIFO)
• First-in-last-out (FILO), the same service order as LIFO
• Last-in-first-out (LIFO)
• Last-in-last-out (LILO), the same service order as FIFO
• There are many others: shared processing, priority scheduling, fastest
job first…
SCHEDULING POLICIES
(Cioffi-Revilla, 2014.)
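The difference between the two basic service orders can be shown in a few lines. The sketch assumes that all objects are already waiting when service starts; `service_order` is an illustrative helper, not part of any simulation package.

```python
from collections import deque


def service_order(arrivals, policy):
    """Return the order in which queued objects are served under FIFO or LIFO."""
    waiting = deque(arrivals)
    served = []
    while waiting:
        if policy == "FIFO":
            served.append(waiting.popleft())   # oldest arrival first
        elif policy == "LIFO":
            served.append(waiting.pop())       # newest arrival first
        else:
            raise ValueError(f"unknown policy: {policy}")
    return served


arrivals = ["a", "b", "c", "d"]
print(service_order(arrivals, "FIFO"))  # ['a', 'b', 'c', 'd']
print(service_order(arrivals, "LIFO"))  # ['d', 'c', 'b', 'a']
```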
OBJECT-ORIENTED MODELS
CELLULAR AUTOMATA
• “A cellular automaton simulation is an object-oriented computational model for analyzing complex systems consisting of neighboring entities (x,y) called cells, that change their state S as they interact in a (typically two dimensional) grid-like landscape L using some rule set R.” (Cioffi-Revilla 2014.)
• “A cellular automaton consists of a regular grid of cells, each in one of a finite number of states, such as on and off (in contrast to a coupled map lattice). The grid can be in any finite number of dimensions. For each cell, a set of cells called its neighborhood is defined relative to the specified cell. An initial state (time t = 0) is selected by assigning a state for each cell. A new generation is created (advancing t by 1), according to some fixed rule (generally, a mathematical function) that determines the new state of each cell in terms of the current state of the cell and the states of the cells in its neighborhood.” (Wikipedia 2015, Cellular automaton)
CELLULAR AUTOMATA
(Cioffi-Revilla, 2014.)
• Array of cells, each of which is in one of a finite number of states. Thus a cellular automaton is a discrete system.
• Neighborhoods, meaning the cell’s neighbors with which the cell is interacting
• Interaction topologies describing which cells are interacting with each other
• Rules by which the cells change their state based on the states of their neighbors
• Typically global static rules, the same for all cells during the whole simulation
• There are also stochastic and asynchronous cellular automata implementations
CELLULAR AUTOMATA STRUCTURE
• There are many “famous” cellular automata models such as:
• Schelling’s Urban Racial Segregation Model
• Conway’s Game of Life
• Sakoda’s Group Attitude Model
• Hegselmann’s Group Attitude Model
CELLULAR AUTOMATA EXAMPLES
Vinković, D., & Kirman, A. (2006). A physical analogue of the Schelling model. Proceedings of the National Academy of Sciences, 103(51), 19261-19265.
• Cellular automata research typically focuses on research questions that relate to:
• Effects of local level rules to global level phenomena
• Effects of different interaction topologies
• Dynamic behaviour of emergent patterns (stationary, fluctuating,
chaotic)
• What determines the time period for convergence or periodicity of
fluctuations
• Emergent patterns of diffusion and their dynamics
CELLULAR AUTOMATA RESEARCH FOCUS
(Cioffi-Revilla, 2014.)
• 1. Define what a cell represents (tessellation).
• Define the shape of cells (square, hexagonal, triangular…)
• Define the attributes of cells (typically only a few)
• 2. Define the interaction topology of cells and the radius of a cell’s
neighborhood (typically quite local neighborhoods)
• 3. Define the rules of cell behaviour
• How do a cell’s state/attributes change based on the neighboring cells’
states/attributes?
ABSTRACTING A CELLULAR AUTOMATA MODEL
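Step 2 can be made concrete with a small helper that lists a cell's neighbors for a given radius. The sketch assumes square cells on a bounded grid and a neighborhood that includes diagonals (often called a Moore neighborhood); other shapes and topologies would need a different function.

```python
def moore_neighbors(row, col, rows, cols, radius=1):
    """Return the cells within `radius` of (row, col) on a bounded square grid."""
    neighbors = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            inside = 0 <= r < rows and 0 <= c < cols
            if inside and (r, c) != (row, col):
                neighbors.append((r, c))
    return neighbors


print(len(moore_neighbors(1, 1, rows=3, cols=3)))  # 8 neighbours for an interior cell
print(len(moore_neighbors(0, 0, rows=3, cols=3)))  # 3 neighbours for a corner cell
```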
• Landscape: 2 dimensional grid with cells having two states (live or dead)
• Each cell represents a living organism
• Interaction topology: cells interact with 8 neighboring cells that are directly linked to the cell
• Neighborhood radius is 1.
• Rules of the simulation:
1. Any live cell with fewer than two live neighbours dies, as if caused by under-population.
2. Any live cell with two or three live neighbours lives on to the next generation.
3. Any live cell with more than three live neighbours dies, as if by overcrowding.
4. Any dead cell with exactly three live neighbours becomes a live cell, as if by reproduction.
(Wikipedia 2015, Conway's Game of Life)
http://pmav.eu/stuff/javascript-game-of-life-v3.1.1/
EXAMPLE: GAME OF LIFE
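The four rules can be sketched in a few lines of Python. This version assumes an unbounded grid and stores only the live cells as a set of coordinates; `life_step` is an illustrative name, not code from the linked implementation.

```python
def life_step(live_cells):
    """Advance one generation; `live_cells` is a set of (row, col) tuples."""
    counts = {}
    for (r, c) in live_cells:
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                if (dr, dc) != (0, 0):
                    key = (r + dr, c + dc)
                    counts[key] = counts.get(key, 0) + 1   # live neighbours of `key`
    # Rules 1-3: a live cell survives only with two or three live neighbours.
    survivors = {cell for cell in live_cells if counts.get(cell, 0) in (2, 3)}
    # Rule 4: a dead cell with exactly three live neighbours becomes live.
    births = {cell for cell, n in counts.items() if n == 3 and cell not in live_cells}
    return survivors | births


blinker = {(1, 0), (1, 1), (1, 2)}               # horizontal bar of three live cells
print(sorted(life_step(blinker)))                 # [(0, 1), (1, 1), (2, 1)]
print(life_step(life_step(blinker)) == blinker)   # True: the pattern has period 2
```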
OBJECT-ORIENTED MODELS
AGENT BASED MODELS
• “A social agent based model (ABM) is an object oriented computational
model for analyzing a social system consisting of autonomous,
interacting, goal-oriented, bounded rational set of Actors A that use
a given rule set R and are situated in an environment E.” (Cioffi-Revilla
2014.)
• “Agent-based models are a kind of microscale model that simulate the
simultaneous operations and interactions of multiple agents in an
attempt to re-create and predict the appearance of complex phenomena.
The process is one of emergence from the lower (micro) level of systems
to a higher (macro) level. As such, a key notion is that simple behavioral
rules generate complex behavior.” (Wikipedia 2015, Agent Based
Modeling)
AGENT BASED MODELS
• ABM can be understood as a much more complex form of cell based
modeling (agent based models also typically have similar features, such as
interaction topologies, vision range…)
• More complex features:
• Bounded rationality: agents may make biased and non-informed
decisions
• Decision based behavior: agents behave based on some sort of
reasoning and goals
• Artifacts and artifact systems: agent based models can also include
artifact systems
• Social or physical spaces: models may contain organizational or
physical spaces
AGENT BASED MODEL FEATURES
(Cioffi-Revilla, 2014.)
• Agent based modeling typically focuses on research questions that
relate to:
• Effects of local agent based rules on emergent macro level phenomena
• How do the assumptions behind the agents’ behaviour rules change
the emergent macro level phenomena?
• How do different interaction topologies and neighborhoods affect the
model?
• How do emergent macro level patterns behave over time (stationary,
fluctuating, periodic, chaotic)?
• What determines the time period for fluctuation or convergence?
• Are there patterns of diffusion?
• All of these are quite similar to cellular automata
research topics
AGENT BASED MODELS RESEARCH FOCUS
(Cioffi-Revilla, 2014.)
• 1. Define what is represented as agents (humans, groups etc.)
• Self-aware
• Autonomous
• Makes decisions based on internal state, external inputs, goals…
• Can communicate with other agents
• Is situated in an environment
• 2. Define the environment where agents are situated
• Natural environments
• Artificial environments
• 3. Define the rules of the model
• Inter-agent rules (rules governing an agent’s interaction with other agents)
• Agent-environment rules (how an agent interacts with its environment)
• Intra-environment rules (how different parts of the environment interact)
ABSTRACTING AN AGENT BASED MODEL
(Cioffi-Revilla, 2014.)
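The three kinds of rules can be seen side by side in a very small model. Everything in the sketch below is hypothetical (agents holding "wealth", a shared resource that regrows, the specific thresholds); it only shows where agent-environment, inter-agent and intra-environment rules would sit in code.

```python
import random


class Agent:
    """Hypothetical agent with an internal state (wealth) and a simple decision rule."""

    def __init__(self, name, wealth):
        self.name = name
        self.wealth = wealth

    def act(self, others, environment, rng):
        # Agent-environment rule: harvest one unit if the shared resource is not empty.
        if environment["resource"] > 0:
            environment["resource"] -= 1
            self.wealth += 1
        # Inter-agent rule: give one unit to a randomly chosen agent if it is clearly poorer.
        partner = rng.choice(others)
        if self.wealth > partner.wealth + 1:
            self.wealth -= 1
            partner.wealth += 1


def run(steps, seed=0):
    """Run the model and return the agents and the environment after `steps` rounds."""
    rng = random.Random(seed)
    environment = {"resource": 20, "regrowth": 1}
    agents = [Agent(f"agent{i}", wealth=i) for i in range(4)]
    for _ in range(steps):
        for agent in rng.sample(agents, len(agents)):        # random activation order
            others = [a for a in agents if a is not agent]
            agent.act(others, environment, rng)
        environment["resource"] += environment["regrowth"]   # intra-environment rule
    return agents, environment


agents, environment = run(steps=10)
print([(a.name, a.wealth) for a in agents], environment["resource"])
```

The macro level pattern of interest here would be the wealth distribution that emerges from the local rules.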
IMPLEMENTING MODELS WITH
SIMULATION SOFTWARE
• System Dynamics
• VENSIM, NetLogo, Repast…
• Queuing Models
• Queuing packages in GNU Octave and Java Modeling Tools
• Cellular Automata & Agent Based Models
• Swarm, NetLogo, Repast, MASON…
• There are also simulation packages for many programming languages
• SimPy in Python
• simecol and RNetLogo in R
• NetLogo is a good (and simple) simulation environment to start with
• Comparison of agent based modeling software
• http://en.wikipedia.org/wiki/Comparison_of_agent-based_modeling_software
THERE ARE MANY OPTIONS FOR SIMULATION SOFTWARE
• 1. Read Thomas Schelling’s article:
• Schelling, T. C. (1971). Dynamic models of segregation. Journal of
mathematical sociology, 1(2), 143-186.
• 2. Simulate Schelling’s model in NetLogo
• 1. Install NetLogo simulation software
• https://ccl.northwestern.edu/netlogo/download.shtml
• 2. Open Models Library → Social Sciences → Segregation
• 3. Play around with the model. Read Info and Coding sections and try
to understand how the model is implemented.
LECTURE ASSIGNMENT
• Schelling, T. C. (1971). Dynamic models of segregation. Journal of mathematical sociology, 1(2), 143-186.
• Bonabeau, E. (2002). Agent-based modeling: Methods and techniques for simulating human systems. Proceedings of the National Academy of Sciences, 99(suppl 3), 7280-7287.
• Railsback, S. F., Lytinen, S. L., & Jackson, S. K. (2006). Agent-based simulation platforms: Review and development recommendations. Simulation, 82(9), 609-623.
• Forrester, J. W. (1994). System dynamics, systems thinking, and soft OR. System Dynamics Review, 10(2‐3), 245-256.
• Forrester, J. W. (1993). System dynamics and the lessons of 35 years. In A systems-based approach to policymaking (pp. 199-240). Springer US.
• Wolfram, S. (1984). Universality and complexity in cellular automata. Physica D: Nonlinear Phenomena, 10(1), 1-35.
LECTURE 6 READING
• Cioffi-Revilla, C. (2014). Introduction to Computational Social Science.
Springer-Verlag, London
• Lave, C. A., & March, J. G. (1993). An introduction to models in the
social sciences. University Press of America.
• Forrester, Jay (1998) Designing the Future. A presentation given at
Universidad de Sevilla, Sevilla, Spain.
REFERENCES
Thank You!
Questions and comments?
twitter: @laurieloranta