basics of computation and modeling - lecture 2 in introduction to computational social science

BASICS OF COMPUTATION AND MODELING

LECTURE 2, 2.9.2015

INTRODUCTION TO COMPUTATIONAL SOCIAL SCIENCE (CSS01)

LAURI ELORANTA

• LECTURE 1: Introduction to Computational Social Science [DONE]

• Tuesday 01.09. 16:00 – 18:00, U35, Seminar room 114

• LECTURE 2: Basics of Computation and Modeling [TODAY]

• Wednesday 02.09. 16:00 – 18:00, U35, Seminar room 113

• LECTURE 3: Big Data and Information Extraction

• Monday 07.09. 16:00 – 18:00, U35, Seminar room 114

• LECTURE 4: Network Analysis


• LECTURE 5: Complex Systems

• Tuesday 15.09. 16:00 – 18:00, U35, Seminar room 114

• LECTURE 6: Simulation in Social Science

• Wednesday 16.09. 16:00 – 18:00, U35, Seminar room 113

• LECTURE 7: Ethical and Legal issues in CSS


• LECTURE 8: Summary

• Tuesday 22.09. 17:00 – 19:00, U35, Seminar room 114

LECTURES SCHEDULE

• PART 1: COMPUTATION

• Role of Computation

• How Computers Work

• What is Programming

• PART 2: MODELING

• What is Modeling

• Unified Modeling Language (UML)

LECTURE 2 OVERVIEW

• Understanding computers as information processing systems helps us understand complex systems as information processing systems as well

• Knowing how computers, programs and programming languages work, and what you are able to do with them, helps us grasp how to approach a research problem on a practical level (e.g. which tools to choose)

• The selection of computers, programs and programming languages has practical consequences for the research problem we are trying to solve (e.g. the selection of tools affects the answers we are able to get)

MOTIVATION: WHY UNDERSTANDING COMPUTERS MATTERS

(Cioffi-Revilla 2014.)

• Computation is used in CSS as a language to formalize (1) theory and (2) empirical research in order to study social complexity.

• Computation is in most cases applied computation: computation itself is rarely the object of research (as it is in computer science).

• Information processing paradigm

1. Computing as a fundamental part of complex social systems

2. Computing as tools for research

ROLE OF COMPUTATION IN CSS


HOW COMPUTERS WORK

• Computers are formed of hardware and software

• Hardware: the physical parts of the computer that enable computing

• Micro-processor, physical memory (= electronic physical machines)

• Hardware provides the physical means for information processing

• Software: the (textual) instructions that tell the hardware what to do

• The non-physical parts: e.g. MS Word is a software program

• Example: your physical iPhone is hardware, the apps you run on it are software.

HARDWARE & SOFTWARE

(Hennessy & Patterson 2013.)

COMPUTER ARCHITECTURE

[Diagram: CPU, Main Memory (RAM), Secondary Memory (e.g. Hard Disk), Input Device, Output Device]


• CPU = Central Processing Unit, does all the computing work

• Processes the instructions of a program

• Controller, Registers, Arithmetic & Logic Unit

• Main Memory (RAM, Random Access Memory)

• Fast memory, close to the CPU (so that the CPU and memory can work well together)

• Instructions for computing are loaded to Main memory and executed from there by the CPU

• Secondary memory (Hard Disk)

• Slower memory with big volume

• Able to store big amounts of data, but the access is much slower

• Programs are first fetched from secondary memory into main memory before they are run by the CPU

• Input and Output, I/O devices

• Screen, mouse, keyboard, network connections, …

COMPUTER ARCHITECTURE


• Computers run programs according to the instruction cycle, also called the fetch-execute cycle (or fetch-decode-execute cycle)

• Basically it is about cycling through two steps

• 1. Fetch the next instruction of the program from main memory to CPU

• 2. Execute that instruction in CPU

• Repeat steps 1 & 2

INSTRUCTION CYCLE
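The two-step cycle above can be sketched as a toy simulation. This is only an illustration, not how any real CPU is built: the three-instruction set (`LOAD`, `ADD`, `HALT`), the single `accumulator` register and the function name `run` are made up for this example.

```python
def run(memory):
    """Run a toy program stored in `memory` and return the final accumulator."""
    accumulator = 0       # a single made-up CPU register
    program_counter = 0   # address of the next instruction in main memory

    while True:
        # Step 1: fetch the next instruction from main memory.
        opcode, operand = memory[program_counter]
        program_counter += 1

        # Step 2: execute it (decoding is folded into this if/elif chain).
        if opcode == "LOAD":
            accumulator = operand
        elif opcode == "ADD":
            accumulator += operand
        elif opcode == "HALT":
            return accumulator
        else:
            raise ValueError(f"unknown instruction: {opcode}")


# A hypothetical program that computes 5 + 2.
program = [("LOAD", 5), ("ADD", 2), ("HALT", None)]
print(run(program))
```

The `while` loop is the "repeat steps 1 & 2" part; the program stops only when it fetches the `HALT` instruction.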


• In relation to the information processing paradigm, computers can be seen as quite similar to complex adaptive social systems

• A computer consists of an information processing system (CPU, memory) and its environment (reached via I/O devices)

• Can a social system be seen as an information processing “computer”?

AN ANALOGY BETWEEN COMPUTERS AND COMPLEX SYSTEMS


WHAT IS PROGRAMMING

• Programming is the act of writing the instructions for the computer/CPU to execute

• A program is a set of those instructions

• An iPhone app is a program

• The textual form of those instructions is called CODE, and it is kept separate from the DATA, which is the information the CODE processes on the CPU

• Programs are written in special languages called programming languages

PROGRAMMING


• The Central Processing Unit (CPU) can only understand instructions that are written in its “native” language

• This CPU language is called machine code, and it varies from CPU to CPU, based on make and model

• For example, ARM and Intel x86 processors use different machine codes

• Machine language is not (or is hardly) human readable. The closest human-readable counterpart is low-level assembly language

• Machine code or machine language is a set of instructions executed directly by a computer's central processing unit (CPU). Each instruction performs a very specific task, such as a load, a jump, or an ALU operation on a unit of data in a CPU register or memory. Every program directly executed by a CPU is made up of a series of such instructions. (Wikipedia 2015, Machine code)

CPU HAS ITS OWN LANGUAGE


• People write code in “human readable programming languages” (or semi-human-readable ones, such as assembly)

• One is able to see what the program does from the code

• The CPU does not understand human readable languages and code, as it only understands machine code

• Human readable programming languages need to be translated to machine code so that the CPU is able to execute the code

• There are two ways to do this:

1. Compiling

2. Interpreting

PROGRAMMING IS DONE IN HUMAN READABLE LANGUAGES


• Compiling code: The human readable code is transformed (= compiled) once to machine code. After this the machine code program can be run many times.

• -> This is equivalent to translating a book into a foreign language (machine code). After the translation, the book can be read many times.

• Interpreting code: The human readable code is interpreted to machine code at the same time as it is executed by the CPU. This means that the interpretation/translation happens while the instructions are being executed.

• -> This is equivalent to having a real-life conversation via a human interpreter.

• Whether a language is compiled or interpreted has practical effects

• Speed, how variables are resolved, etc.

HELPING MACHINES READ CODE: COMPILING AND INTERPRETING
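The "translate once, run many times" idea can be sketched in Python with the built-in `compile()` and `exec()` functions. This is an analogy only: Python translates source text to bytecode for its own virtual machine, not to CPU machine code. The variable names `price`, `quantity` and `total` and both helper functions are made up for this example.

```python
source = "total = price * quantity"

# Translate once: compile() turns the source text into a code object
# (Python bytecode, not CPU machine code).
compiled = compile(source, "<example>", "exec")


def run_compiled(price, quantity):
    """Run the already-translated program; no re-translation is needed."""
    namespace = {"price": price, "quantity": quantity}
    exec(compiled, namespace)
    return namespace["total"]


def run_from_source(price, quantity):
    """Translate and run in one go, every time the function is called."""
    namespace = {"price": price, "quantity": quantity}
    exec(source, namespace)
    return namespace["total"]


print(run_compiled(3, 4))
print(run_from_source(3, 4))
```

Both functions give the same answer; the difference is when the translation work is done, which is one reason compiled programs tend to start and run faster.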


• The abstraction level of a programming language depends on how “far” it is from machine code and from hardware-related specifics (such as memory management)

• Languages can be compiled/interpreted to other languages

ABSTRACTION LEVEL OF THE LANGUAGE

Low level language → High level language

Machine Code → Assembly → C / C++ → Java / Scala → R → Visual Programming

• There are hundreds of programming languages

• http://en.wikipedia.org/wiki/List_of_programming_languages

• Languages differ in

• Syntax = how they are written, rules of writing instructions

• Semantics = what different words and concepts mean

• Pragmatics = what the language is used for

• Languages also differ in whether they are compiled or interpreted to machine code

PROGRAMMING LANGUAGES

IN C LANGUAGE:

    #include <stdio.h>

    int main(void)
    {
        /* Print the greeting followed by a newline. */
        printf("Hello World\n");
        return 0;
    }

SYNTAX & SEMANTICS: HELLO WORLD EXAMPLE

IN JAVA LANGUAGE:

    public class HelloWorld {
        public static void main(String[] args) {
            System.out.println("Hello, World");
        }
    }

IN SCHEME LANGUAGE:

    ; Define a procedure that prints the symbol once
    (define hello-world
      (lambda ()
        (begin
          (write 'Hello-World)
          (newline))))

    ; Call the procedure
    (hello-world)

IN PYTHON LANGUAGE:

    print("Hello, World!")

• Data types:

• Most basic type of information in the language

• integer, real, boolean…

• Data structures:

• More complex structures of data.

• list, stack, array, tree

• Variables: named places to store data (and, in many languages, functions)

• Assignments: a way to tie a certain value to a certain variable

• X = 5 + 2;

• Functions:

• A command that performs certain functionality

• Takes arguments and returns a value

• Print(“Hello World”) → outputs “Hello World”

• Control Structures:

• Control the flow of the program

• Loop, skip, iterate, do something while certain conditions hold

PROGRAMMING LANGUAGES INCLUDE
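The building blocks listed above can be seen together in a few lines of Python. This is a minimal sketch; the variable names, the `average` function and the sample ages are invented for illustration.

```python
# Data types: the most basic kinds of values
count = 7            # integer
share = 0.25         # real (floating point) number
is_member = True     # boolean

# Data structure: a list holding several values
ages = [34, 29, 51]

# Variable and assignment: tie a value to a name
x = 5 + 2


# Function: takes arguments and returns a value
def average(values):
    return sum(values) / len(values)


# Control structures: loop over the data, act only when a condition holds
over_30 = 0
for age in ages:
    if age > 30:
        over_30 += 1

print(x, average(ages), over_30)
```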


• There are many different paradigms for writing programs; below are the three most common:

• Procedural / Imperative Programming

• Line-by-line telling what the program should do:

• 1 Do This

• 2 Do that

• 3 Do those things

• Object-Oriented Programming (OOP)

• Based on objects that contain functions and data

• Objects preserve state

• Functional Programming (FP)

• Functions as first class citizens

PARADIGMS OF PROGRAMMING
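To make the three paradigms concrete, here is the same small task (summing the squares of a list of numbers) written in each style. This is a sketch for comparison only; the class name `SquareSummer` and the sample numbers are invented for this example.

```python
from functools import reduce

numbers = [1, 2, 3]

# Procedural / imperative: step-by-step statements that change a variable
total = 0
for n in numbers:
    total += n * n


# Object-oriented: an object bundles data (state) with functions (methods)
class SquareSummer:
    def __init__(self):
        self.total = 0          # state preserved between calls

    def add(self, n):
        self.total += n * n


summer = SquareSummer()
for n in numbers:
    summer.add(n)

# Functional: functions are values that can be passed to other functions
functional_total = reduce(lambda acc, n: acc + n * n, numbers, 0)

print(total, summer.total, functional_total)
```

All three versions compute the same result; what differs is how the program is organized.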

• An algorithm is a self-contained set of step-by-step operations to achieve a desired result.

• There are different algorithms for different purposes

• Search algorithms

• Sort algorithms

• Image processing algorithms

• Etc..

• A real life algorithm might be: how people get study credits

• Sign up for a course

• Participate in lectures

• Do lecture assignments and final work

• Return lecture assignments and final work

• If your work passes the grading, you get study credits

ALGORITHMS
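As an example of a search algorithm, here is a minimal sketch of binary search, which finds a value in an already sorted list by repeatedly halving the range it looks at. The function name and the sample data are chosen for illustration.

```python
def binary_search(sorted_values, target):
    """Return the index of `target` in `sorted_values`, or -1 if absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_values[middle] == target:
            return middle
        if sorted_values[middle] < target:
            low = middle + 1      # target can only be in the upper half
        else:
            high = middle - 1     # target can only be in the lower half
    return -1


birth_years = [1950, 1962, 1975, 1988, 1991, 2003]
print(binary_search(birth_years, 1988))
print(binary_search(birth_years, 1970))
```

Like the study-credit example, it is a fixed sequence of steps with a condition that decides what happens next.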


• The way one writes code matters, because you or someone else needs to be able to easily understand and modify the code

• This may happen after long periods of time (after one has forgotten how the program works)

• Good coding style produces code that is simple, readable, understandable, concise and well structured

• Code is also a way to communicate how the program works

• Documenting your code is a crucial part of programming!

• General principles according to Cioffi-Revilla 2014

• Readability

• Commenting

• Modularity

• Defensive coding

CODING STYLE
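The four principles can be illustrated with a short sketch. The functions `mean_age` and `describe_sample` and the sample data are hypothetical; the point is the style, not the calculation.

```python
def mean_age(ages):
    """Return the arithmetic mean of a non-empty list of ages."""
    # Defensive coding: fail early with a clear message on bad input.
    if not ages:
        raise ValueError("ages must not be empty")
    if any(age < 0 for age in ages):
        raise ValueError("ages must be non-negative")
    return sum(ages) / len(ages)


def describe_sample(ages):
    """Build a one-line summary; kept separate from the calculation (modularity)."""
    return f"n={len(ages)}, mean age={mean_age(ages):.1f}"


print(describe_sample([34, 29, 51]))
```

Descriptive names and docstrings cover readability and commenting, the split into two small functions covers modularity, and the input checks are a simple form of defensive coding.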

• A good summary on how to write, refactor and manage code and data:

• Gentzkow, Matthew and Jesse M. Shapiro. 2014. Code and Data for the Social Sciences: A Practitioner’s Guide. University of Chicago mimeo, http://web.stanford.edu/~gentzkow/research/CodeAndData.pdf

• Handles matters such as:

• Automation

• Version Control

• Directories

• Data Keys

• Abstraction

• Documentation

• Management

MANAGING AND REFACTORING CODE&DATA


• You learn programming by doing!

• Start with something small

• University of Helsinki: Many Computer Science Courses

• CSS02 – Introduction to Programming in Social Sciences (II period, 2015).

• MOOC Courses Online:

• Coursera

• Data Science Specialization (highly recommended): https://www.coursera.org/specialization/jhudatascience/1?utm_medium=catalog

• Codecademy

• http://www.codecademy.com/learn

• MIT OpenCourseWare

• http://ocw.mit.edu/courses/intro-programming/

• Udemy

• https://www.udemy.com/courses/Development/

WHERE TO LEARN PROGRAMMING


WHAT IS MODELING

• A model is a formal and purposeful representation and abstraction of reality

• Scientific Modeling is a scientific activity, the aim of which is to make a particular part or feature of the world easier to understand, define, quantify, visualize, or simulate by referencing it to existing and usually commonly accepted knowledge. It requires selecting and identifying relevant aspects of a situation in the real world and then using different types of models for different aims, such as conceptual models to better understand, operational models to operationalize, mathematical models to quantify, and graphical models to visualize the subject. (Wikipedia 2015, Scientific Modeling)

• Reality → Abstraction → Model of the Phenomena

MODEL

1. Models of Phenomena: model based on real world phenomena (e.g. how ants collect food)

2. Models of Data: modeling based on raw data (e.g. plotting)

3. Models of Theory: model is the structural and formal presentation of a textual theory

• Different Modeling Perspectives (Ontological)

• Physical models (e.g. miniature buildings)

• Fictional models (e.g. the Bohr model of the atom)

• Mathematical models: set-theory models, equations..

• Descriptions

• Mixed models

• A good summary on scientific modeling:

• http://plato.stanford.edu/entries/models-science/

MODELS AS REPRESENTATIONS

(Stanford Encyclopedia 2015.)

http://plato.stanford.edu/entries/models-science/#ModThe

• Ontology is the philosophical study of the nature of being, becoming, existence, or reality, as well as the basic categories of being and their relations. Traditionally listed as a part of the major branch of philosophy known as metaphysics, ontology deals with questions concerning what entities exist or can be said to exist, and how such entities can be grouped, related within a hierarchy, and subdivided according to similarities and differences. (Wikipedia 2015, Ontology)

• In computer science and information science, an ontology is a formal naming and definition of the types, properties, and interrelationships of the entities that really or fundamentally exist for a particular domain of discourse. It is thus a practical application of philosophical ontology, with a taxonomy. (Wikipedia 2015, Ontology (information science))

ONTOLOGY

• The entire social world consists of social systems and their environments

• These systems consist of

• Classes

• Objects (of a certain class, called instances)

• Associations between classes and objects (e.g. relationships between entities)

• Real World (Referent Social System) → Model (Abstracted Social System)

ONTOLOGY & SOCIAL SYSTEMS
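The three building blocks map directly onto object-oriented code. The sketch below is an illustration only; the class `Person`, the names "Anna" and "Ben" and the `add_child` method are invented for this example.

```python
class Person:
    """A class: the general category."""

    def __init__(self, name):
        self.name = name
        self.children = []      # holds the parent-child associations

    def add_child(self, child):
        self.children.append(child)


# Objects (instances) of the class Person
parent = Person("Anna")
child = Person("Ben")

# An association between two objects
parent.add_child(child)

print([c.name for c in parent.children])
```

The class is the abstraction, the two objects are concrete members of it, and the list of children records the relationship between them.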


CAN YOU FIND CLASSES, OBJECTS AND ASSOCIATIONS?

This Image is Public Domain. From: http://www.publicdomainpictures.net.

FAMILY, PARENT, CHILD, GENDER/SEX, PARENT-CHILD-RELATIONSHIP, HETERONORMATIVITY, PHOTO STUDIO…


• Modeling raises deep epistemological and philosophy-of-science questions, which are not unproblematic

• What is the true relationship between the model and reality?

• What can actually be researched with models?

• What questions are the models actually able to answer?

• Modeling also takes a certain stance on the philosophy of science, leaning towards empiricism and positivism, or at least critical realism.

MODELING IS PROBLEMATIC

• A really good primer on model thinking is the course given by Scott E. Page at the University of Michigan. One can take the course for free on Coursera: https://www.coursera.org/course/modelthinking

• Why Model?

• To be an intelligent citizen of the world

• To be a clearer thinker

• To understand and use data

• To better decide, strategize, and design

• Course videos are also freely available on YouTube:

• https://www.youtube.com/watch?v=K-gxhxGwJ38&index=2&list=PLGqc26s6O0E2P2BnK73JWXk4YYTgl3dmb

MODEL THINKING

MODELING WITH UNIFIED MODELING LANGUAGE (UML)

• The Unified Modeling Language (UML) is a general-purpose modeling language in the field of software engineering, which is designed to provide a standard way to visualize the design of a system. (Wikipedia 2015, UML)

• UML is a standardized notational system for graphically representing complex systems consisting of classes, objects, associations among them, dynamic interactions and other scientifically important features. (Cioffi-Revilla 2014)

• Developed during the 1990s

• Is published as an ISO standard

• Static Modeling: Models the static structure of the system

• Dynamic Modeling: Models the dynamic behavior of the system

UNIFIED MODELING LANGUAGE (UML)

• Use Case Diagrams

• Class Diagrams

• Sequence Diagrams

• State Diagrams

• Component Diagrams

• Deployment Diagrams

• Most useful for Social Science modeling might be the Class, State and Sequence diagrams

MAIN TYPES OF UML MODELS

(Bell 2003.)

• A class diagram represents the static structure of a complex system

• A class diagram consists of

• Rectangles representing classes and objects (name on top)

• Classes and objects can have

• Attributes (e.g. age, sex)

• Methods = a certain function the class or object is able to perform (e.g. getMarried())

• Links between rectangles representing associations between classes and objects

CLASS DIAGRAM
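A class rectangle with attributes and methods corresponds closely to a class definition in code. The sketch below uses the slide's examples (age, sex, getMarried()); the `spouse` attribute and the sample values are assumptions added to make the method do something visible.

```python
class Person:
    def __init__(self, age, sex):
        # Attributes (the middle compartment of the rectangle)
        self.age = age
        self.sex = sex
        self.spouse = None      # assumed attribute, not on the slide

    # Method (the bottom compartment): something the object is able to do
    def get_married(self, other):
        self.spouse = other
        other.spouse = self


a = Person(age=31, sex="female")
b = Person(age=33, sex="male")
a.get_married(b)
print(a.spouse is b, b.spouse is a)
```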


CLASSES

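[Diagram: a class is drawn as a rectangle with three compartments: nameOfClass, Attributes (optional), Methods (optional). Examples: class Family; class Person with attributes -age, -weight, -height.]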

• Four types of associations represented by different arrowhead-links:

• Inheritance/generalization (empty arrowhead)

• Aggregation (empty diamond)

• Composition (black diamond)

• Generic association (plain link / directional arrow symbol)

CLASS DIAGRAM & ASSOCIATIONS
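The four association types can be approximated in code as follows. This is a rough sketch: Python does not enforce the difference between aggregation and composition, so the distinction here lies only in who creates the parts. All class names are invented for this example.

```python
class Person:                           # base class
    def __init__(self, name):
        self.name = name


class Student(Person):                  # inheritance: a Student is a Person
    pass


class Household:                        # aggregation: members are created
    def __init__(self, members):        # elsewhere and exist independently
        self.members = list(members)


class Room:
    def __init__(self, label):
        self.label = label


class Home:                             # composition: the rooms are created
    def __init__(self, room_labels):    # by, and belong only to, this Home
        self.rooms = [Room(label) for label in room_labels]


class Employer:                         # generic association: one object
    def __init__(self, employee):       # simply refers to another
        self.employee = employee


anna = Student("Anna")
household = Household([anna])
home = Home(["kitchen", "bedroom"])
employer = Employer(anna)
print(isinstance(anna, Person), len(household.members),
      len(home.rooms), employer.employee.name)
```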


(Image from: http://www.javacodegeeks.com/2013/01/quick-summary-object-associations.html)

ASSOCIATIONS

[Diagram: class Person (-age, -weight, -height) linked to class Family by the association "belongs to".]

• Multiples (multiplicities) represent the quantities involved in an association

• E.g. how many children a parent has in the particular model

• There are many different range options

• 0..1 = between 0 and 1

• 1 = exactly 1

• 0..* or * = between 0 and unspecified many

• 1..* = between 1 and unspecified many

• 0..N or N = between 0 and unspecified many

• 1..N = between 1 and unspecified many

CLASS DIAGRAM & MULTIPLES
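Multiplicities can be read as constraints that a model should check. The sketch below assumes, for illustration only, that a family has 1..* persons and that each person belongs to 0..1 families; the class and method names are hypothetical.

```python
class Person:
    def __init__(self, name):
        self.name = name
        self.family = None              # 0..1: no family, or exactly one


class Family:
    def __init__(self, members):
        if len(members) < 1:            # 1..*: at least one member
            raise ValueError("a family needs at least one member")
        self.members = []
        for person in members:
            self.add(person)

    def add(self, person):
        if person.family is not None:   # enforce the 0..1 end
            raise ValueError(f"{person.name} already belongs to a family")
        person.family = self
        self.members.append(person)


anna = Person("Anna")
ben = Person("Ben")
family = Family([anna])
family.add(ben)
print(len(family.members), ben.family is family)
```

Trying to create an empty family, or to add a person who already has a family to a second one, raises an error instead of silently breaking the model.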


MULTIPLES

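[Diagram: class Person (-age, -weight, -height) linked to class Family by the association "belongs to", with the multiplicities 1..* and 0..1 marked at the ends of the link.]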

HOW TO MODEL THIS IN A UML CLASS DIAGRAM?


• Sketch a UML Class Diagram model that represents elections

• What are the main classes, objects and relationships between the

classes?

• Do you find the model useful?

ASSIGNMENT

• Gentzkow, M.; Shapiro, J. M. 2014. Code and Data for the Social Sciences: A Practitioner’s Guide. University of Chicago mimeo, http://faculty.chicagobooth.edu/matthew.gentzkow/research/CodeAndData.pdf

• Granger, C. 2015. Coding is not the new literacy. http://www.chris-granger.com/2015/01/26/coding-is-not-the-new-literacy/

• Epstein, J. M. 2008. Why Model? Keynote address to the Second World Congress on Social Simulation. George Mason University.

• Page, S. E. 2012. The Model Thinker: Prologue, Introduction and Chapter 1. Link provided by University of Michigan & Coursera:

• http://vserver1.cscs.lsa.umich.edu/~spage/ONLINECOURSE/R1Page.pdf

• Stanford Encyclopedia of Philosophy, 2012. Models in Science. http://plato.stanford.edu/entries/models-science/

• Bell, D. 2003. UML basics: An introduction to the Unified Modeling Language. The Rational Edge. https://www.ibm.com/developerworks/rational/library/content/RationalEdge/sep03/f_umlbasics_db.pdf

LECTURE 2 READING

• Cioffi-Revilla, C. 2014. Introduction to Computational Social Science. Springer-Verlag, London

• Gentzkow, M.; Shapiro, J. M. 2014. Code and Data for the Social Sciences: A Practitioner’s Guide. University of Chicago mimeo, http://faculty.chicagobooth.edu/matthew.gentzkow/research/CodeAndData.pdf

• Hennessy, J. L.; Patterson, D. A. 2013. Computer Organization and Design. Elsevier, Waltham.

• Stanford Encyclopedia of Philosophy, 2012. Models in Science. http://plato.stanford.edu/entries/models-science/

• Bell, D. 2003. UML basics: An introduction to the Unified Modeling Language. The Rational Edge. https://www.ibm.com/developerworks/rational/library/content/RationalEdge/sep03/f_umlbasics_db.pdf

• Wikipedia 2015, Scientific Modeling. http://en.wikipedia.org/wiki/Scientific_modelling

• Wikipedia 2015, Ontology. http://en.wikipedia.org/wiki/Ontology

• Wikipedia 2015, Ontology (information science). http://en.wikipedia.org/wiki/Ontology_(information_science)

REFERENCES

Thank You!

Questions and comments?

twitter: @laurieloranta
