workshop on programming in python - day ii
Programming in Python: A Two Day Workshop
Satyaki Sikdar
Vice Chair, ACM Student Chapter
Heritage Institute of Technology
April 23 2016
Satyaki Sikdar© Programming in Python April 23 2016 1 / 62
hour 6: let’s get rich!
table of contents
1 hour 6: let’s get rich! (an elaborate example, inheritance, file handling 101)
2 hour 7: algo design 101
3 hour 8: data viz 101
4 hours 9 - 11: SNA 101
hour 6: let’s get rich! an elaborate example
another example
- There are 52 cards in a deck, in 4 suits: Spades, Hearts, Diamonds and Clubs
- Each suit has 13 cards: Ace, 2, ..., 10, Jack, Queen and King
- We’ll create a Card class. Attributes can be strings, like 'Spade' for suits and 'Queen' for ranks, but then we’ll have trouble comparing the cards
- We use integers to encode the ranks and suits
  - Spades → 3, Hearts → 2, Diamonds → 1 and Clubs → 0
  - Ace → 1, Jack → 11, Queen → 12 and King → 13
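To see why string attributes make comparison awkward, here is a small sketch. It is written for Python 3, and the variable names are made up for illustration. Strings are compared alphabetically, which does not match the order of card ranks, while the integer codes sort the way we want.

```python
# Rank names as strings: sorting is alphabetical, not by card value.
by_name = sorted(['King', 'Queen', 'Jack', '10', '2'])
print(by_name)   # '10' lands before '2', and 'King' before 'Queen'

# The same ranks with the integer encoding from the slide
# (Jack = 11, Queen = 12, King = 13): sorting follows card order.
by_code = sorted([13, 12, 11, 10, 2])
print(by_code)
```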
the class definition
class Card:
    '''Represents a standard playing card'''
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
    rank_names = [None, 'Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10',
                  'Jack', 'Queen', 'King']

    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    def __str__(self):
        return '%s of %s' % (Card.rank_names[self.rank],
                             Card.suit_names[self.suit])

>>> two_of_clubs = Card()
>>> queen_of_diamonds = Card(1, 12)
class and instance attributes
class attribute                   instance attribute
Defined outside any method        Defined inside methods
Referred by class.class_attr      Referred by inst.inst_attr
One copy per class                One copy per instance
Eg: suit_names and rank_names     Eg: suit and rank

Figure: Class and instance attributes
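The distinction in the table can be checked directly. The following is a minimal sketch for Python 3 using a cut-down Card class (only suit_names is kept as a class attribute); the names a and b are illustrative.

```python
class Card:
    # Class attribute: defined outside any method, one copy for the class.
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']

    def __init__(self, suit=0, rank=2):
        # Instance attributes: defined inside a method, one copy per object.
        self.suit = suit
        self.rank = rank

a = Card(0, 2)
b = Card(3, 12)

print(a.suit, b.suit)                    # each card keeps its own suit
print(a.suit_names is b.suit_names)      # both cards see the same shared list
print(vars(a))                           # only suit and rank live on the instance
```

Because suit_names is shared, changing it through the class would affect every card, which is why it is a good place for lookup tables and a poor place for per-card data.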
comparing cards
- For built-in types, there are relational operators (<, >, ==, etc.) that compare two things to produce a boolean
- For user-defined types, we need to override the __cmp__ method. It takes two parameters, self and other, and returns
  - a positive number if the first object is greater
  - a negative number if the second object is greater
  - zero if they are equal
- The ordering for cards is not obvious. Which is better, the 3 of Clubs or the 2 of Diamonds? One has a higher rank, but the other has a higher suit
- We arbitrarily choose that suit is more important, so all the Spades outrank all the Diamonds and so on.
writing the __cmp__ method
#inside Card class
def __cmp__(self, other):
    if self.suit > other.suit:    #check the suits
        return 1
    elif self.suit < other.suit:
        return -1
    elif self.rank > other.rank:  #check the ranks
        return 1
    elif self.rank < other.rank:
        return -1
    else:                         #both the suits and the ranks are the same
        return 0
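Note that __cmp__ is only used by Python 2. Python 3 removed it and relies on rich comparison methods such as __eq__ and __lt__ instead. Below is a minimal sketch of the same suit-then-rank ordering for Python 3. It assumes the functools.total_ordering decorator, which derives the remaining operators from __eq__ and __lt__. The _key helper and the tuple comparison are shortcuts introduced here, not the code from the slides.

```python
from functools import total_ordering

@total_ordering
class Card:
    '''A card ordered by suit first, then by rank (Python 3 sketch).'''

    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    def _key(self):
        # Tuples compare element by element: suit first, then rank.
        return (self.suit, self.rank)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()

three_of_clubs = Card(0, 3)
two_of_diamonds = Card(1, 2)
print(three_of_clubs < two_of_diamonds)   # the suit decides before the rank
```

With these two methods in place, list.sort() and sorted() work on lists of cards without any further changes.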
decks
- Now that we have Cards, we define Decks. A Deck contains a list of Cards
- The init method creates the entire deck of 52 cards

class Deck:
    '''Represents a deck of cards'''

    def __init__(self):
        self.cards = []
        for suit in range(4):
            for rank in range(1, 14):
                card = Card(suit, rank)
                self.cards.append(card)
import random  #needed by shuffle; goes at the top of the module

#inside class Deck
def __str__(self):
    res = []
    for card in self.cards:
        res.append(str(card))
    return '\n'.join(res)

def shuffle(self):
    random.shuffle(self.cards)

def pop_card(self):
    return self.cards.pop()

def add_card(self, card):
    self.cards.append(card)

def sort(self):
    self.cards.sort()

>>> deck = Deck()
>>> print deck.pop_card()
King of Spades
hour 6: let’s get rich! inheritance
inheritance
- The language feature most often associated with object-oriented programming is inheritance
- It’s the ability to define a new class that’s a modified version of an existing class
- The existing class is called the parent and the new class is called the child
- We want a class to represent a hand, that is, the set of cards held by a player
- A hand is similar to a deck: both are made up of a set of cards, and both require operations like adding and removing cards
- A hand is also different from a deck; there are operations we want for hands that don’t make sense for a deck
- The definition of a child class is like other class definitions, but the name of the parent class appears in parentheses

class Hand(Deck):
    '''Represents a hand of playing cards'''

- This definition indicates that Hand inherits from Deck; that means we can use methods like pop_card and add_card for Hands as well as Decks
- Hand also inherits __init__ from Deck, but it doesn’t really do what we want: the init method for Hands should initialize cards with an empty list
- We can provide an init method, overriding the one in Deck

#inside class Hand
def __init__(self, label=''):
    self.cards = []
    self.label = label
- So when you create a Hand, Python invokes its own init

>>> hand = Hand('new hand')
>>> print hand.cards
[]
>>> print hand.label
new hand

- But the other methods are inherited from Deck

>>> deck = Deck()
>>> card = deck.pop_card()
>>> hand.add_card(card)  #add_card is inherited from Deck
>>> print hand           #__str__ is inherited from Deck
King of Spades

- A natural next step is to encapsulate this code in a method called move_cards

#inside class Deck
def move_cards(self, hand, num):
    for i in xrange(num):
        hand.add_card(self.pop_card())

- move_cards takes two arguments, a Hand object and the number of cards to deal. It modifies both self and hand
#inside class Deck
def deal_hands(self, num_hands, cards_per_hand):
    hands = []
    self.shuffle()  #shuffling the deck
    for i in range(num_hands):
        hand = Hand('player %d' % (i))
        for j in range(cards_per_hand):
            hand.add_card(self.pop_card())
        hands.append(hand)
    return hands

- Now you have a proper framework for a card game, be it poker, blackjack or bridge!
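Putting the pieces together, here is one way the framework could look as a single runnable file. This is a condensed sketch for Python 3 (print as a function, a list comprehension in Deck.__init__), not a verbatim copy of the slides. The choice of 4 players and 5 cards each is arbitrary.

```python
import random

class Card:
    suit_names = ['Clubs', 'Diamonds', 'Hearts', 'Spades']
    rank_names = [None, 'Ace', '2', '3', '4', '5', '6', '7', '8', '9', '10',
                  'Jack', 'Queen', 'King']

    def __init__(self, suit=0, rank=2):
        self.suit = suit
        self.rank = rank

    def __str__(self):
        return '%s of %s' % (Card.rank_names[self.rank],
                             Card.suit_names[self.suit])

class Deck:
    def __init__(self):
        # All 52 combinations of 4 suits and 13 ranks.
        self.cards = [Card(suit, rank)
                      for suit in range(4) for rank in range(1, 14)]

    def shuffle(self):
        random.shuffle(self.cards)

    def pop_card(self):
        return self.cards.pop()

    def add_card(self, card):
        self.cards.append(card)

    def deal_hands(self, num_hands, cards_per_hand):
        hands = []
        self.shuffle()
        for i in range(num_hands):
            hand = Hand('player %d' % i)
            for j in range(cards_per_hand):
                hand.add_card(self.pop_card())
            hands.append(hand)
        return hands

class Hand(Deck):
    def __init__(self, label=''):
        # Overrides Deck.__init__: a hand starts out empty.
        self.cards = []
        self.label = label

deck = Deck()
hands = deck.deal_hands(4, 5)
for hand in hands:
    print(hand.label, [str(card) for card in hand.cards])
print(len(deck.cards), 'cards left in the deck')
```

The hands differ from run to run because the deck is shuffled, but the number of cards dealt and the number left over are always the same.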
hour 6: let’s get rich! file handling 101
the need for file handling
- Most of the programs we have seen so far are transient in the sense that they run for a short time and produce some output, but when they end, their data disappears. If you run the program again, it starts with a clean slate
- Other programs are persistent: they run for a long time (or all the time); they keep at least some of their data in permanent storage (a hard drive, for example); if they shut down and restart, they pick up where they left off
- Some inputs and outputs are simply too big to fit in the main memory
- Examples of persistent programs are operating systems, which run pretty much whenever a computer is on, and web servers, which run all the time, waiting for requests to come in on the network.
- One of the simplest ways for programs to maintain their data is by reading and writing text files.

fp_read = open('input.txt', 'r')
fp_write = open('output.txt', 'w')
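Files opened this way should also be closed once we are done with them. Here is a minimal sketch for Python 3, using a temporary directory so that it does not touch any real files (the file name and its contents are made up for illustration). The with statement closes the file automatically when its block ends, even if an error occurs inside it.

```python
import os
import tempfile

# Work in a throwaway directory; 'output.txt' is just an illustrative name.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, 'output.txt')

with open(path, 'w') as fp_write:
    fp_write.write('hello, file\n')
# Leaving the with block has closed the file for us.
print(fp_write.closed)

with open(path, 'r') as fp_read:
    text = fp_read.read()
print(repr(text))
```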
reading from files
- The built-in function open takes the name of the file as a parameter and returns a file object you can use to read the file

>>> fin = open('input.txt', 'r')
>>> print fin
<open file 'input.txt', mode 'r' at 0xb7eb2410>

- A few things to note:
  - The file opened must exist. An IOError is raised otherwise.
  - The path to the file must be exact, including the correct filename with its extension (if any)
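A program can guard against a missing file instead of crashing. The sketch below assumes Python 3 and a deliberately non-existent path (the file name is hypothetical). Catching IOError works in both Python 2 and Python 3.

```python
import os
import tempfile

# A path that does not exist: a made-up name inside a fresh, empty directory.
missing = os.path.join(tempfile.mkdtemp(), 'no_such_file.txt')

try:
    fin = open(missing, 'r')
except IOError as err:
    # In Python 3, IOError is an alias of OSError, and the specific
    # exception raised for a missing file is FileNotFoundError.
    outcome = 'could not open file'
    print(outcome, '-', err.strerror)
else:
    outcome = 'opened'
    fin.close()
```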
reading files
- The file object provides several methods for reading, including readline, which reads characters from the file until it gets to a newline and returns the result as a string:

>>> fin.readline()
'the first line \n'

If you keep on calling fin.readline(), you’d end up reading the whole file, one line at a time. Let’s see a few examples of reading files.
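As one possible set of examples, the sketch below reads the same file in three ways. It is written for Python 3 and creates its own small sample file first. The file contents and the names lines_a, lines_b and lines_c are illustrative.

```python
import os
import tempfile

# Create a small sample file to read from.
path = os.path.join(tempfile.mkdtemp(), 'input.txt')
with open(path, 'w') as f:
    f.write('the first line\nthe second line\nthe third line\n')

# 1. Repeated readline(): an empty string signals the end of the file.
lines_a = []
with open(path, 'r') as fin:
    line = fin.readline()
    while line != '':
        lines_a.append(line)
        line = fin.readline()

# 2. Iterating over the file object yields the same lines, one at a time.
lines_b = []
with open(path, 'r') as fin:
    for line in fin:
        lines_b.append(line)

# 3. readlines() returns all the lines at once, as a list.
with open(path, 'r') as fin:
    lines_c = fin.readlines()

print(lines_a == lines_b == lines_c)
print(lines_a[0].strip())   # strip() drops the trailing newline
```

The for loop in the second variant is usually the most convenient, and like readline it never holds more than one line in memory at a time.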
writing to files
>>> fout = open('output.txt', 'w')
>>> print fout
<open file 'output.txt', mode 'w' at 0xb7eb2410>

- If the file already exists, opening it in write mode clears out the old data and starts fresh, so be careful! If the file doesn’t exist, a new one is created

>>> line1 = 'He left yesterday behind him, you might say he was born again,\n'
>>> fout.write(line1)

Again, the file object keeps track of where it is, so if you call write again, it adds the new data to the end

>>> line2 = 'you might say he found a key for every door.\n'
>>> fout.write(line2)
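The difference between starting fresh and adding to the end can be seen in a short experiment. This is a sketch for Python 3 that writes to a temporary directory. The append mode 'a' is not covered on the slide and is shown here as the usual way to keep existing contents.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'output.txt')

# Two consecutive writes land one after the other in the file.
fout = open(path, 'w')
fout.write('first line\n')
fout.write('second line\n')
fout.close()   # closing makes sure the data is written out

# Reopening with 'w' discards the old contents...
with open(path, 'w') as fout:
    fout.write('fresh start\n')

# ...whereas 'a' (append) keeps them and adds to the end.
with open(path, 'a') as fout:
    fout.write('appended line\n')

with open(path, 'r') as fin:
    contents = fin.read()
print(contents)
```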
using files for something meaningful
Let’s combine the knowledge of file handling with dictionaries to do some basic lexical analysis

import string

def char_freq(filename):
    counter = dict()
    with open(filename, 'r') as f:
        raw_text = f.read()
    for c in raw_text:
        c = c.lower()
        if c in string.ascii_lowercase:
            if c in counter:
                counter[c] += 1
            else:
                counter[c] = 1
    return counter

def normalize(counter):
    sum_values = float(sum(counter.values()))
    for key in counter:
        counter[key] /= sum_values
    return counter
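The same counting can be written more compactly with collections.Counter from the standard library. The sketch below is a variation for Python 3, not the code from the slide: the helper names char_freq_text and normalized are hypothetical, it works on a string rather than a filename so that it can be tried without a file, and it returns a new dictionary instead of modifying the counter in place.

```python
import string
from collections import Counter

def char_freq_text(text):
    '''Count the ASCII letters in a string, ignoring case.'''
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    return Counter(letters)

def normalized(counter):
    '''Return relative frequencies; the input counter is left unchanged.'''
    total = float(sum(counter.values()))
    return {key: count / total for key, count in counter.items()}

counts = char_freq_text('Hello, World!')
freqs = normalized(counts)
print(counts.most_common(2))   # the two most frequent letters
print(freqs['l'])              # share of the letter 'l' among all letters
```

To apply it to a file, read the text first with f.read() inside a with block, as char_freq does, and pass the result to char_freq_text.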
hour 7: algo design 101
table of contents
1 hour 6: let’s get rich!
2 hour 7: algo design 101 (merge sort, modules)
3 hour 8: data viz 101
4 hours 9 - 11: SNA 101
hour 7: algo design 101 merge sort
algorithm design in Python
- One of the strong points of Python is the ease of expression
- Turning pseudocode into actual code is not difficult
- Let’s try to implement the Merge Sort algorithm in Python

A high level idea of the algorithm

- Divide: Divide the n-element sequence into two subsequences of n/2 elements each
- Conquer: Sort the subsequences recursively
- Combine: Merge the two sorted subsequences to produce the sorted answer
Algorithm 1: MERGE(left, right)
begin
    Append ∞ to left and right
    i ← 0, j ← 0
    merged ← new list
    while len(merged) < len(left) + len(right) - 2 do
        if left[i] < right[j] then
            merged.append(left[i])
            i ← i + 1
        else
            merged.append(right[j])
            j ← j + 1
    return merged

Algorithm 2: MERGE-SORT(A)
begin
    if len(A) < 2 then
        return A
    else
        n ← len(A)
        left ← first ⌊n/2⌋ elements of A
        right ← last ⌈n/2⌉ elements of A
        left ← MERGE-SORT(left)
        right ← MERGE-SORT(right)
        return MERGE(left, right)
the core idea
I The algorithm is naturally recursive
I The MERGE method takes two sorted lists and merges them into a single sorted list
I MERGE-SORT sorts the list recursively by breaking it into two (nearly) equal-sized halves and sorting them
I A list having fewer than 2 elements is trivially sorted - base case
I Smaller sorted lists are merged to form the overall sorted list
MERGE(left, right) in Python
def merge(left, right):
    # Sentinels: appended to copies so the caller's lists are not modified
    left = left + [float('inf')]
    right = right + [float('inf')]
    i = 0
    j = 0
    merged = []
    # The two sentinels are never copied over, hence the "- 2"
    while len(merged) < len(left) + len(right) - 2:
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged
MERGE-SORT(A) in Python
def merge_sort(A):
    # Base case: a list with fewer than 2 elements is already sorted
    if len(A) < 2:
        return A
    else:
        # Integer division, so mid is a valid index in Python 2 and 3
        mid = len(A) // 2
        left = A[:mid]
        right = A[mid:]
        left = merge_sort(left)
        right = merge_sort(right)
        return merge(left, right)
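A quick way to convince yourself that the two functions work is to run them on a small list and compare the result with Python's built-in sorted. The sketch below restates merge and merge_sort so that it runs on its own. It assumes Python 3 and numeric data (the infinity sentinel cannot be compared with strings), and the sample list is made up for illustration.

```python
def merge(left, right):
    """Merge two sorted lists into a single sorted list."""
    # Sentinels go on copies so the inputs are left untouched.
    left = left + [float('inf')]
    right = right + [float('inf')]
    i, j, merged = 0, 0, []
    while len(merged) < len(left) + len(right) - 2:
        if left[i] < right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged


def merge_sort(A):
    """Return the elements of A in ascending order."""
    if len(A) < 2:          # base case: trivially sorted
        return A
    mid = len(A) // 2       # divide
    return merge(merge_sort(A[:mid]), merge_sort(A[mid:]))  # conquer + combine


data = [38, 27, 43, 3, 9, 82, 10]       # illustrative input
result = merge_sort(data)
print(result)                            # [3, 9, 10, 27, 38, 43, 82]
print(result == sorted(data))            # True
```

Because merge works on copies and merge_sort slices its input, the original list data is unchanged after the call.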
hour 7: algo design 101 modules
modules: extending functionalities beyond basic Python
I Modules are external files and libraries that provide additional functions and classes on top of bare-bones Python
I Modules are files containing Python definitions and statements (e.g. name.py)
I The interface is very simple. Definitions can be imported into other modules by using "import name"
I To access a module's functions, type "name.function()"
I Each module is loaded only once per interpreter session; later imports reuse the loaded module
I Give nicknames to modules by using as, e.g. "import name as nm"
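The points above can be tried out with modules from the standard library. This is a small sketch, assuming Python 3; the nicknames rnd and math_again are arbitrary names chosen for the example.

```python
import math                # import the whole module
import random as rnd       # "as" gives the module a nickname
import sys

# Access a module's functions with name.function()
root = math.sqrt(16)
print(root)                # 4.0

rnd.seed(0)                # seed the generator so the run is repeatable
roll = rnd.randint(1, 6)   # a die roll between 1 and 6, both inclusive
print(roll)

# A module is loaded once per session: importing it again simply
# hands back the module object already cached in sys.modules.
import math as math_again
print(math_again is math)              # True
print(sys.modules['math'] is math)     # True
```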
I Python has a lot of predefined modules - sys, __future__, math, random, re, ...
I To read The Zen of Python, do import this
I Each module is highly specialized
I You have various choices when importing things from a module
    I Import the whole module, but preserve the namespace - important when dealing with a lot of modules and keeping track of things
        import module_name
    I Import the whole module, but bring everything to the current namespace
        from module_name import *
    I Import only specific things - saves typing the module prefix and makes name lookup slightly faster, although the whole module is still loaded
        from math import pi, sin, cos
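To see why preserving the namespace matters, here is a short sketch (assuming Python 3) that mixes the import styles. It relies only on the standard math and builtins modules; importing pow from math is picked purely to show how an imported name can hide a built-in one.

```python
import builtins
import math                         # style 1: names stay behind "math."
from math import pi, sin, cos       # style 3: only the names we need
from math import pow                # careful: this hides the built-in pow

print(math.sqrt(2))                 # prefix makes the origin obvious

# sin^2 + cos^2 should be 1 (up to floating point rounding)
identity = sin(pi / 6) ** 2 + cos(pi / 6) ** 2
print(round(identity, 10))          # 1.0

# The bare name pow now refers to math.pow, which always returns a float
print(pow(2, 3))                    # 8.0
# The built-in version is still reachable through the builtins module
print(builtins.pow(2, 3))           # 8
```

The same kind of silent name clash is the usual argument against "from module_name import *", which can overwrite many names at once.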
the sys module
I This module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter
I sys.argv - The list of command line arguments passed to a Python script
    I argv[0] is the script name
    I Further command line args are stored in argv[1] onwards. E.g. for
        python test_prog.py arg1 arg2 arg3
      argv = ['test_prog.py', 'arg1', 'arg2', 'arg3']
I sys.getrecursionlimit() - Return the current value of the recursion limit, the maximum depth of the Python interpreter stack
I sys.setrecursionlimit(limit)
    I Set the maximum depth of the Python interpreter stack to limit
    I This limit prevents infinite recursion from causing an overflow of the C stack and crashing Python
    I The highest possible limit is platform-dependent
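The following sketch exercises the three items above. It assumes Python 3.5 or newer (for RecursionError); the helper depth and the offsets 100 and 500 are made up for the example, and the limit is restored at the end so the rest of the session is unaffected.

```python
import sys

# argv[0] is the script name; everything after it is a user argument
script_name = sys.argv[0] if sys.argv else ''
extra_args = sys.argv[1:]
print(script_name, extra_args)

original_limit = sys.getrecursionlimit()
print(original_limit)            # often 1000, but this may vary


def depth(n):
    """Recurse n levels deep and return n."""
    return 0 if n == 0 else 1 + depth(n - 1)


# Going past the limit raises RecursionError instead of crashing Python
try:
    depth(original_limit + 100)
    overflowed = False
except RecursionError:
    overflowed = True
print(overflowed)                # True

# Raising the limit lets the same call succeed
sys.setrecursionlimit(original_limit + 500)
reached = depth(original_limit + 100)
print(reached == original_limit + 100)   # True

sys.setrecursionlimit(original_limit)    # put things back
```

Raising the limit by a small amount is safe on common platforms, but very large values can still overflow the C stack, which is why the highest possible limit is platform-dependent.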
make your own module
I Making modules is very easy - at least the basic ones anyway
I Create a script in IDLE or in a decent text editor
I Write the functions, classes and variables you want the module to have (say, three functions f1, f2, f3 and two variables v1 and v2)
I Save the script as my_mod.py
I Create another Python script, in the same directory, where you'll use the module
I Write import my_mod before the first use of the module and you're done!
I dir(my_mod) gives a sorted list of strings naming everything defined in the module
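The steps above can be sketched in a single runnable script. To keep the example self-contained, it writes my_mod.py into a temporary directory and adds that directory to sys.path, instead of saving the file next to the script by hand. The bodies of f1, f2, f3 and the values of v1 and v2 are invented for illustration.

```python
import importlib
import os
import sys
import tempfile

# Contents of the module: three functions and two variables
module_source = '''
v1 = 10
v2 = "hello"

def f1(x):
    return x + v1

def f2(x):
    return x * 2

def f3():
    return v2.upper()
'''

# Save the script as my_mod.py (here: in a throwaway directory)
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "my_mod.py"), "w") as handle:
    handle.write(module_source)

# Make the directory importable, then import the new module
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
import my_mod

print(my_mod.f1(5))      # 15
print(my_mod.f3())       # HELLO

# dir() also lists names such as __name__; filter them out
public_names = [name for name in dir(my_mod) if not name.startswith("_")]
print(public_names)      # ['f1', 'f2', 'f3', 'v1', 'v2']
```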
hours 8: data viz 101
table of contents
1 hour 6: let’s get rich!
2 hour 7: algo design 101
3 hours 8: data viz 101: plotting, matplotlib, making plots prettier
4 hours 9 - 11 SNA 101
hours 8: data viz 101 plotting
data visualization
I Data visualization turns numbers and letters into aesthetically pleasing visuals, making it easy to recognize patterns and find exceptions
Figure: US Census data (2010)
I It is easy to see some general settlement patterns in the US
I The East Coast has a much greater population density than the rest of America - racial homophily
love in the time of cholera
Figure: Tufte’s Cholera Map
I Anscombe's quartet comprises four datasets that have nearly identical simple statistical properties, yet appear very different when graphed
I Constructed in 1973 by Francis Anscombe to demonstrate both the importance of graphing data before analyzing it and the effect of outliers on statistical properties
Property                 Value
Mean of x                9
Variance of x, σ²(x)     11
Mean of y                7.50
Variance of y, σ²(y)     4.122
Correlation              0.816
Regression line          y = 3 + 0.5x
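The summary values in the table can be recomputed by hand. The sketch below does so for the first two of Anscombe's four datasets using plain Python; the helper summarize is a made-up name, and the numbers are the commonly published values of the quartet, typed in here rather than taken from the slides.

```python
from math import sqrt

# Datasets I and II share the same x values
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]


def summarize(xs, ys):
    """Return mean, sample variance, correlation and regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    slope = sxy / sxx
    return {
        "mean_x": mean_x,
        "var_x": sxx / (n - 1),
        "mean_y": mean_y,
        "var_y": syy / (n - 1),
        "correlation": sxy / sqrt(sxx * syy),
        "slope": slope,
        "intercept": mean_y - slope * mean_x,
    }


stats_1 = summarize(x, y1)
stats_2 = summarize(x, y2)
for key in stats_1:
    # The two columns agree to about two decimal places
    print(key, round(stats_1[key], 2), round(stats_2[key], 2))
```

The printed columns are nearly identical even though dataset I looks like a noisy line and dataset II like a smooth curve, which is exactly why the data should be plotted before it is analyzed.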
plotting the four datasets
more reasons to visualize the data
I Visualization is the highest bandwidth channel into the human brain
I The visual cortex is the largest system in the human brain; it's wasteful not to make use of it
I As data volumes grow, visualization becomes a necessity rather than a luxury
"A picture is worth a thousand words"
hours 8: data viz 101 matplotlib
matplotlib and pylab
I Matplotlib is a 3rd party module that provides an interface to make plots in Python
I Inspired by Matlab’s plotting library and hence the name
I matplotlib.pyplot is the module defined by matplotlib that is used to make plots; pylab is a convenience module that bundles pyplot together with numpy
I I'll cover three of the most used types of plots in some detail
    I line plots
    I scatter plots
    I histograms
line plots
# lineplot.py
import pylab as pl
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
pl.plot(x, y)
pl.show() # show the plot on the screen
scatter plots
# scatterplot.py
import pylab as pl
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
pl.scatter(x, y)
pl.show() # show the plot on the screen
hours 8: data viz 101 making plots prettier
tinkering parameters
Matplotlib offers a lot of customizations. Let’s look at the key ones.
I Changing the line color - different datasets can have different colors

# at lineplot.py
# pl.plot(x, y)
pl.plot(x, y, c='r')

character  color
b          blue
g          green
r          red
c          cyan
m          magenta
y          yellow
k          black
w          white
tinkering parameters
I Changing the marker - marks the data points

# at lineplot.py
pl.plot(x, y, c='b', marker='*') # gives a blue line with star shaped markers
pl.plot(x, y, 'b*-') # same plot as above, written as a format string

character  marker shape
's'        square
'o'        circle
'p'        pentagon
'*'        star
'h'        hexagon
'+'        plus
'D'        diamond
'd'        thin diamond
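The color and marker characters from the two tables can be combined into a single format string, optionally with a line style such as '-' or '--'. A short sketch, assuming Matplotlib is installed and using the off-screen Agg backend; the data are made up, and matplotlib.pyplot is imported under the name pl so the calls read like the pylab ones on the slides.

```python
import matplotlib
matplotlib.use("Agg")                    # assumption: no display available
import matplotlib.pyplot as pl           # same plotting calls as pylab

x = [1, 2, 3, 4, 5]
squares = [v ** 2 for v in x]
doubles = [2 * v for v in x]

pl.figure()
# 'r' + 's' + '--': red square markers joined by a dashed line
line1, = pl.plot(x, squares, 'rs--')
# 'g' + 'o' with no line style: green circles only, no connecting line
line2, = pl.plot(x, doubles, 'go')
# keyword form of the first call
line3, = pl.plot(x, squares, c='r', marker='s', linestyle='--')
```

Note that a format string with a marker but no line style draws the markers alone, which is why 'go' produces unconnected circles.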
tinkering parameters
I Plot and axis titles and limits - It is very important to always label plots and the axes of plots to tell the viewers what they are looking at

pl.xlabel('put label of x axis')
pl.ylabel('put label of y axis')
pl.title('put title here')

I You can change the x and y ranges displayed on your plot by:

pl.xlim(x_low, x_high)
pl.ylim(y_low, y_high)
tinkering parameters
#lineplotAxis.py
import pylab as pl
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
pl.plot(x, y)
pl.title('Plot of y vs. x')
pl.xlabel('x axis')
pl.ylabel('y axis')
# set axis limits
pl.xlim(0.0, 7.0)
pl.ylim(0.0, 30.)
pl.show()
plotting more than one plot
#lineplot2Plots.py
import pylab as pl
x1 = [1, 2, 3, 4, 5]
y1 = [1, 4, 9, 16, 25]
x2 = [1, 2, 4, 6, 8]
y2 = [2, 4, 8, 12, 16]
pl.plot(x1, y1, 'r')
pl.plot(x2, y2, 'g')
pl.title('Plot of y vs. x')
pl.xlabel('x axis')
pl.ylabel('y axis')
pl.xlim(0.0, 9.0)
pl.ylim(0.0, 30.)
pl.show()
legen.. wait for it.. dary!
I It’s very useful to add legends to plots to differentiate between the different lines or quantities being plotted

pl.legend([plot1, plot2], ('label1', 'label2'), loc='best')

I The first parameter is a list of the plots you want labeled
I The second parameter is the list / tuple of labels
I The third parameter, loc, is where you would like matplotlib to place your legend. Options include 'upper right', 'upper left', 'center', 'lower left', 'lower right' and 'best'
#lineplotFigLegend.py
import pylab as pl
x1 = [1, 2, 3, 4, 5]
y1 = [1, 4, 9, 16, 25]
x2 = [1, 2, 4, 6, 8]
y2 = [2, 4, 8, 12, 16]
# pl.plot returns a list of lines; the trailing comma unpacks the single line
plot1, = pl.plot(x1, y1, 'r')
plot2, = pl.plot(x2, y2, 'g')
pl.title('Plot of y vs. x')
pl.xlabel('x axis')
pl.ylabel('y axis')
pl.xlim(0.0, 9.0)
pl.ylim(0.0, 30.)
pl.legend([plot1, plot2], ('red line', 'green line'), loc='best')
pl.show()
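An alternative that avoids keeping track of plot handles is to label each line when it is drawn and then call legend() without a handle list. A sketch, assuming Matplotlib is installed and using the off-screen Agg backend; the label texts are illustrative.

```python
import matplotlib
matplotlib.use("Agg")                    # assumption: no display available
import matplotlib.pyplot as pl           # same plotting calls as pylab

x1 = [1, 2, 3, 4, 5]
y1 = [1, 4, 9, 16, 25]
x2 = [1, 2, 4, 6, 8]
y2 = [2, 4, 8, 12, 16]

pl.figure()
pl.plot(x1, y1, 'r', label='red line')
pl.plot(x2, y2, 'g', label='green line')
legend = pl.legend(loc='best')           # picks up the label= text of each line
```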
histograms
I They are very useful to plot distributions
I In Matplotlib you use the hist command to make a histogram

import pylab as pl
from numpy import random
# mean, sigma, number of points
data = random.normal(5.0, 3.0, 1000)
pl.hist(data)
pl.title('a sample histogram')
pl.xlabel('data')
pl.show()
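hist also returns the bin counts and bin edges, and accepts a bins argument to control the resolution. A sketch, assuming NumPy and Matplotlib are installed; the seed and the choice of 20 bins are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")                        # assumption: no display available
import matplotlib.pyplot as pl               # same plotting calls as pylab
import numpy as np

rng = np.random.default_rng(0)               # fixed seed so the run is repeatable
data = rng.normal(5.0, 3.0, 1000)            # mean, sigma, number of points

pl.figure()
counts, edges, patches = pl.hist(data, bins=20)
pl.title('a sample histogram with 20 bins')
pl.xlabel('data')

print(len(counts), len(edges), int(counts.sum()))
```

There is always one more edge than there are bins, and by default the bins span the full range of the data, so the counts add up to the number of points.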
subplots
I Matplotlib is reasonably flexible about allowing multiple plots per canvas and it is easy to set this up
I You need to first make a figure and then specify subplots as follows

fig1 = pl.figure(1)
pl.subplot(211)

I pl.subplot(211) - a figure with 2 rows, 1 column, and the top plot (1)
I pl.subplot(212) - a figure with 2 rows, 1 column, and the bottom plot (2)
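Putting the two subplot calls together, a complete two-panel figure might look like the sketch below. It assumes Matplotlib is installed, uses the off-screen Agg backend, and reuses the small x and y lists from the earlier examples; the panel titles are made up.

```python
import matplotlib
matplotlib.use("Agg")                # assumption: no display available
import matplotlib.pyplot as pl       # same plotting calls as pylab

x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]

fig1 = pl.figure()                   # a new, empty figure

top = pl.subplot(211)                # 2 rows, 1 column, first (top) plot
pl.plot(x, y, 'r')
pl.title('line plot')

bottom = pl.subplot(212)             # 2 rows, 1 column, second (bottom) plot
pl.scatter(x, y, c='g')
pl.title('scatter plot')

pl.tight_layout()                    # keep titles and axes from overlapping
```

Each call to subplot makes that panel the current one, so the plotting calls that follow it draw into that panel.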
handling data
I So far, we have been hard coding the data sets
I Actual datasets might be very large! We use file handling

import pylab as pl

def read_data(filename):
    X = []
    Y = []
    with open(filename, 'r') as f:
        for line in f.readlines():
            x, y = line.split()
            X.append(float(x))
            Y.append(float(y))
    return X, Y

def plot_data(filename):
    X, Y = read_data(filename)
    pl.scatter(X, Y, c='g')
    pl.xlabel('x')
    pl.ylabel('y')
    pl.title('y vs x')
    pl.show()
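To try read_data without a real dataset, one can first write a few points to a temporary file. The sketch below repeats the reader with two small changes that are not in the slides: it iterates over the file object directly instead of calling readlines(), and it skips blank lines. The sample values and the file name are made up.

```python
import os
import tempfile


def read_data(filename):
    """Read whitespace-separated x y pairs, one pair per line."""
    X, Y = [], []
    with open(filename, 'r') as f:
        for line in f:               # reads one line at a time
            if not line.strip():     # assumption: blank lines should be ignored
                continue
            x, y = line.split()
            X.append(float(x))
            Y.append(float(y))
    return X, Y


# hypothetical sample file with three points and one blank line
path = os.path.join(tempfile.mkdtemp(), 'points.txt')
with open(path, 'w') as f:
    f.write('1 1\n2 4\n\n3 9\n')

X, Y = read_data(path)
print(X, Y)
```

Iterating over the file object keeps only one line in memory at a time, which matters once the dataset is large.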
hours 9 - 11 SNA 101
table of contents
1 hour 6: let’s get rich!
2 hour 7: algo design 101
3 hours 8: data viz 101
4 hours 9 - 11 SNA 101
    Introduction to SNA
    Modelling - Introduction and Importance
    Representing Networks
hours 9 - 11 SNA 101 Introduction to SNA
Social Network Analysis
I Investigation of social structures through the use of network and graph theories
I Characterizes ties such as friendships, links between webpages, and disease transmission
I Analysis is crucial for understanding the flow of influence or disease, and for investigating patterns such as voting patterns and food preferences
Citation and Email networks
Figure: Citation network
Figure: Enron email network (n = 33,696 nodes, m = 180,811 edges)
Telecommunication and Protein networks
Friendship and Les Misérables
Blackout Aug ’96
I A hot summer day. A single transmission line fails in Portland, Oregon
I The power line fails. The load is distributed over the remaining lines which were operating at almost max capacity
I The system collapses. Much like a stack of dominoes.
  I OR => WA
  I WA => CA
  I CA => ID
  I ID => UT
  I UT => CO
  I CO => AZ
  I AZ => NM
  I NM => NV
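The domino effect can be illustrated with a toy model. This is only a sketch with invented numbers, not a model of the 1996 grid: every line carries some load, no line may exceed a common capacity, and when a line fails its load is split equally among the lines still running. The function name cascade and the equal-split rule are assumptions made for the example.

```python
def cascade(loads, capacity, first_failure):
    """Return the order in which lines fail after `first_failure` trips.

    loads: dict mapping line name -> current load (invented units)
    capacity: maximum load any single line can carry
    """
    loads = dict(loads)                    # work on a copy
    failed = []
    to_fail = [first_failure]
    while to_fail:
        shed = 0.0
        for line in to_fail:
            shed += loads.pop(line)        # load that has to go somewhere else
            failed.append(line)
        if not loads:                      # no line left to take the load
            break
        share = shed / len(loads)
        for line in loads:
            loads[line] += share           # equal redistribution (assumption)
        to_fail = [line for line, load in loads.items() if load > capacity]
    return failed


# four hypothetical lines, all running close to a capacity of 100
lines = {'A': 90.0, 'B': 92.0, 'C': 95.0, 'D': 85.0}
print(cascade(lines, capacity=100.0, first_failure='A'))

# the same failure in a lightly loaded system stays local
spare = {'A': 10.0, 'B': 10.0, 'C': 10.0}
print(cascade(spare, capacity=100.0, first_failure='A'))
```

In the first system one failure pushes every remaining line over capacity, so all four fail; in the second the neighbours absorb the extra load and only the first line is lost.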
The Aftermath
I The skyline of San Francisco was dark
I A total of 175 generating units failed
I Some of the nuclear reactors took days to restart
I Total cost of $2 billion

What caused this catastrophe?

I Sloppy maintenance
I Insufficient attention to warning signs
I Pure chance - bad luck
I Inadequate understanding of the interdependencies in the system
Complex networks
Complex networks in real life
I Advances in gene sequencing reveal that the human genome consists of about 30,000 genes
I The complexity arises from the interactions between different genes expressing different characteristics
I The parts making up the whole don’t sum up in any simple fashion
I The building blocks interact with one another thus generating bewildering behavior
Open problems in SNA
I Small outbreaks of diseases becoming epidemics
I Resilience of networks - the internet, power grids
I What makes some videos go viral?
I How to find clusters of nodes that are similar to each other?
I How to find a seed set of nodes to maximize influence?
I Key question: How does individual behavior aggregate to collective behavior?
I Modelling and pattern finding are critical. Random sampling (Leskovec et al.) is one way. The other way is to form synthetic statistical models of networks
hours 9 - 11 SNA 101 Modelling - Introduction and Importance
The Need for Modelling
I The models discussed in the talk are simplified
I Starting off simple is an essential stage of understanding anything complex
I Results from the simple models are often intriguing and fascinating
I The cost is abstraction - the results are often hard to apply in real life
I Models provide a simple framework for experimentation
I Models look to emulate the properties of actual networks to some extent
hours 9 - 11 SNA 101 Representing Networks
Network representation
Networks portray the interactions between different actors. Graphs hand us a valuable tool to process and handle networks.
I Actors or individuals are nodes in the graph
I If there’s interaction between two nodes, there’s an edge between them
I The links can have weights or intensities signifying connection strength
I The links can be directed, like in the web graph. There’s a directed link between two nodes (pages) A and B if there’s a hyperlink to B from A
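The points above can be sketched in plain Python with a dictionary of dictionaries, where graph[a][b] holds the weight of the directed link from a to b. This is only one possible representation and does not use any graph library; the helper names add_edge and add_undirected_edge, the node names, and the weights are hypothetical and chosen for illustration.

```python
def add_edge(graph, source, target, weight=1.0):
    """Add a directed edge source -> target, creating nodes as needed."""
    graph.setdefault(source, {})[target] = weight
    graph.setdefault(target, {})  # the target is a node even with no out-links


def add_undirected_edge(graph, a, b, weight=1.0):
    """Store an undirected link as a pair of directed edges."""
    add_edge(graph, a, b, weight)
    add_edge(graph, b, a, weight)


# A tiny web graph: page A links to B, B links to C, C links back to A.
web = {}
add_edge(web, "A", "B")
add_edge(web, "B", "C")
add_edge(web, "C", "A")

# A tiny undirected, weighted network (made-up names and tie strength).
friends = {}
add_undirected_edge(friends, "asha", "ravi", weight=3.0)

print("B" in web["A"])          # True: there is a hyperlink to B from A
print("A" in web["B"])          # False: the link is directed
print(friends["ravi"]["asha"])  # 3.0
```

Storing an undirected link as two directed edges keeps a single data structure for both kinds of network, at the cost of recording each weight twice.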
Please move to the PDF named tutorial_networkx for the rest of the slides. Thanks!