zelle - chapter 11 · data collections •sequences - 11.2 •lists / arrays - 11.2 •dictionaries...

38
Data Collections Zelle - Chapter 11 Charles Severance - www.dr-chuck.com Textbook: Python Programming: An Introduction to Computer Science, John Zelle (www.si182.com)

Upload: others

Post on 11-Jul-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Data CollectionsZelle - Chapter 11

Charles Severance - www.dr-chuck.com

Textbook: Python Programming: An Introduction to Computer Science, John Zelle (www.si182.com)

Page 2: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Data Collections

• Sequences - 11.2

• Lists / Arrays - 11.2

• Dictionaries - 11.6

• For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes and objects - which we will cover later

Page 3: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Sequences

• We already are familiar with sequences

• Strings are a sequence of characters

• The string.split(“hello there bob”) returns a sequence of three strings [‘hello’, ‘there’, ‘bob]

• The range(5) gives us a sequence [0, 1, 2 3, 4]

• Sequences can be indexed seq[4] and we know their length len(seq)

Page 4: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

>>> zzz = "Hello Bob">>> print len(zzz)9>>> print zzz[4]o>>> import string>>> www = string.split(zzz)>>> print len(www)2>>> print www['Hello', 'Bob']>>> print www[0]Hello

>>> yyy = range(4)>>> print yyy[0, 1, 2, 3]>>> print len(yyy)4>>> print yyy[2]2

Sequences are structures that have lengths and we can use [position] to

get at a particular element of the sequence.

Page 5: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

• We store data in a sequence and then we can use the index [0] to retrieve the contents stored in a particular position in the list

• A sequence is like a little file cabinet or spreadsheet with numbers as the labels to retrieve data or to look up a row

• Always remember that the index labels start at zero, not 1

>>> import string>>> www = string.split(zzz)>>> print len(www)2>>> print www['Hello', 'bob']>>> print www[0]Hello

[0] Hello

[1] Bob

Index Contents

Length

Page 6: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

A List of Characters

>>> zzz = "Hello Bob">>> print len(zzz)9>>> print zzz[4]o

[0] H

[1] e

[2] l

[3] l

[4] o

[5]

[6] B

[7] o

[8] b

Index Contents

Length

Page 7: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

A List of Strings

• The string.split() function returns a list of strings

• We can find the first word on the line by retrieving the first element of the list that came back from split()

>>> zzz = "Hello Bob">>> import string>>> www = string.split(zzz)>>> print len(www)2>>> print www['Hello', 'Bob']>>> print www[0]Hello

[0] Hello

[1] Bob

Index Contents

Page 8: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

A List of Numbers[0] 0

[1] 1

[2] 2

[3] 3

Index Contents

Length>>> yyy = range(4)>>> print yyy[0, 1, 2, 3]>>> print len(yyy)4>>> print yyy[2]2

The range() function returns a list of integer numbers from

zero up to range.

Page 9: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Sequence Operators

• Sequences operations apply to all sequences

• Not just strings

z-341

Page 10: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Slicing a List of Numbers[0] 3

[1] 41

[2] 12

[3] 9

Index Contents

>>> sss = [3, 41, 12, 9, 74, 15] >>> ttt = sss[2:5]>>> print ttt[12, 9, 74]

Slicing goes from beginning to one before end.

[4] 74

[5] 15

[2] 12

[3] 9

[4] 74[0] 12

[1] 9

[2] 74

ttt [2:5]

Page 11: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Slicing Strings

• We can also look at any continuous section of a string using a colon

• The second number is one beyond the end of the slice - “up to but not including”

• If a number is omitted it is assumed to be the the beginning or end

>>> greet = "Hello Bob">>> greet[0:3]'Hel'>>> greet[5:9]' Bob'>>> greet[:5]'Hello'>>> greet[5:]' Bob'>>> greet[:]'Hello Bob' Z-81

Page 12: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Lists

Page 13: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Lists and Sequences

• A List is a Sequence that you can Edit and Modify

• Some times we actually have been using Lists without knowing it

• As an example, you can change an entry in a List but not in a Sequence

>>> import string>>> str = "Hello There Bob">>> www = string.split(str)>>> print www['Hello', 'There', 'Bob']>>> www[0] = 'Howdy'>>> print www['Howdy', 'There', 'Bob']>>> str[2] = 'x'Traceback (most recent call last): File "<stdin>", line 1, in <module>TypeError: 'str' object does not support item assignment>>>

Page 14: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

List Operations

z-343

Page 15: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

>>> lll = [ 21, 14, 4, 3, 12, 18]>>> print lll[21, 14, 4, 3, 12, 18]>>> print 18 in lllTrue>>> print 24 in lllFalse>>> lll.append(50)>>> print lll[21, 14, 4, 3, 12, 18, 50]>>> lll.remove(4)>>> print lll[21, 14, 3, 12, 18, 50]

>>> print lll[21, 14, 3, 12, 18, 50]>>> print lll.index(18)4>>> lll.reverse()>>> print lll[50, 18, 12, 3, 14, 21]>>> lll.sort()>>> print lll[3, 12, 14, 18, 21, 50]>>> lll[5] = 33>>> print lll[3, 12, 14, 18, 21, 33]

z-343

Page 16: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

>>> print lll[3, 12, 14, 18, 21, 33]>>> for xval in lll :... print xval... 31214182133>>>

z-343

Looping through Lists

Page 17: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionaries - Python’s Built-In “Database”

Page 18: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionaries

• Dictionaries are Python’s most powerful data collection

• Dictionaries allow us to do fast database-like operations in Python

• Dictionaries have different names in different languages

• Associative Arrays - Perl / Php

• Map or HashMap - Java

http://en.wikipedia.org/wiki/Associative_array

Page 19: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Keys and Values

• Dictionaries are like Lists except that they use keys instead of numbers to look up values

>>> lll = [ ]>>> lll.append(21)>>> lll.append(183)>>> print lll[21, 183]>>> lll[0] = 23>>> print lll[23, 183]

>>> ddd = { }>>> ddd["age"] = 21>>> ddd["course"] = 182>>> print ddd{'course': 182, 'age': 21}>>> ddd["age"] = 23>>> print ddd{'course': 182, 'age': 23}

Page 20: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

>>> lll = [ ]>>> lll.append(21)>>> lll.append(183)>>> print lll[21, 183]>>> lll[0] = 23>>> print lll[23, 183]

>>> ddd = { }>>> ddd["age"] = 21>>> ddd["course"] = 182>>> print ddd{'course': 182, 'age': 21}>>> ddd["age"] = 23>>> print ddd{'course': 182, 'age': 23}

[0] 21

[1] 183lll

Index Contents

[course] 183

[age] 21ddd

Index Contents

List

Dictionary

23

23

Page 21: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Operations

z-369

Page 22: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Literals

• Dictionary literals use curly braces and have a list of key : value pairs

• You can make an empty dictionary using empty curly braces

>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}>>> print jjj{'jan': 100, 'chuck': 1, 'fred': 42}>>> ooo = { }>>> print ooo{}>>>

Page 23: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Patterns

• One common use of dictionary is counting how often we “see” something

Key Value

>>> ccc = { }>>> ccc["csev"] = 1>>> ccc["cwen"] = 1>>> print ccc{'csev': 1, 'cwen': 1}>>> ccc["cwen"] = ccc["cwen"] + 1>>> print ccc{'csev': 1, 'cwen': 2}

Page 24: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Patterns

• It is an error to reference a key which is not in the dictionary

• We can use the in operator to see if a key is in the dictionary

>>> ccc = { }>>> print ccc["csev"]Traceback (most recent call last): File "<stdin>", line 1, in <module>KeyError: 'csev'>>> print "csev" in cccFalse

Page 25: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Counting

• Since it is an error to reference a key which is not in the dictionary

• We can use the dictionary get() operation and supply a default value if the key does not exist to avoid the error and get our count started.

>>> ccc = { }>>> print ccc.get("csev", 0)0>>> ccc["csev"] = ccc.get("csev",0) + 1>>> print ccc{'csev': 1}>>> print ccc.get("csev", 0)1>>> ccc["csev"] = ccc.get("csev",0) + 1>>> print ccc{'csev': 2}

dict.get(key, defaultvalue)

Page 26: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Retrieving lists of Keys and Values

• You can get a list of keys or values from a dictionary

>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}>>> print jjj.keys()['jan', 'chuck', 'fred']>>> print jjj.values()[100, 1, 42]>>>

Page 27: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Looping Through Dictionaries

• We loop through the key-value pairs in a dictionary using *two* iteration variables

• Each iteration, the first variable is the key and the the second variable is the corresponding value

>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}>>> for aaa,bbb in jjj.items() :... print aaa, bbb... jan 100chuck 1fred 42>>>

[chuck] 1

[fred] 42

aaa bbb

[jan] 100

Page 28: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Maximum Loop

$ cat dictmax.py jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}print jjj

maxcount = Nonefor person, count in jjj.items() : if maxcount == None or count > maxcount : maxcount = count maxperson = person

print maxperson, maxcount

$ python dictmax.py {'jan': 100, 'chuck': 1, 'fred': 42}jan 100

None is a special value in Python. It is like the “absense” of a value.

Like “nothing” or “empty”.

Page 29: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionaries are not Ordered

• Dictionaries use a Computer Science technique called “hashing” to make them very fast and efficient

• However hashing makes it so that dictionaries are not sorted and they are not sortable

• Lists and sequences maintain their order and a list can be sorted - but not a dictionary

http://en.wikipedia.org/wiki/Hash_function

Page 30: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionaries are not Ordered

http://en.wikipedia.org/wiki/Hash_function

>>> dict = { "a" : 123, "b" : 400, "c" : 50 }>>> print dict{'a': 123, 'c': 50, 'b': 400}

>>> lst = [ ]>>> lst.append("one")>>> lst.append("and")>>> lst.append("two")>>> print lst['one', 'and', 'two']>>> lst.sort()>>> print lst['and', 'one', 'two']>>>

Dictionaries have no order and cannot be sorted. Lists

have order and can be sorted.

Page 31: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

<advanced-trick-stuff>

Page 32: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Tricky / Clever Code>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}>>> print jjj.keys()[jjj.values().index(max(jjj.values()))]jan

Find and print the key which corresponds to the maximum value.

Page 33: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Tricky / Clever Code>>> jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}>>> print jjj.keys()[jjj.values().index(max(jjj.values()))]jan>>> print jjj.values()[100, 1, 42]>>> print max(jjj.values())100>>> print jjj.values().index(100)0>>> print jjj.keys()[0]jan

Page 34: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Maximum Loop

$ cat dictmax.py jjj = { 'chuck' : 1 , 'fred' : 42, 'jan': 100}print jjj

maxcount = Nonefor person, count in jjj.items() : if maxcount == None or count > maxcount : maxcount = count maxperson = person

print maxperson, maxcount

$ python dictmax.py {'jan': 100, 'chuck': 1, 'fred': 42}jan 100

Do it this way - it is clearer and makes more sense to someone

reading your code.

Page 35: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

## You can put anything into dictionaries, including lists and other dictionaries##Make a list of lucky numbers by personlotto = {'botimer': [1, 7, 13, 37, 42, 47]}

##The lists inside a dictionary work just like other lists##Print my fourth lotto numberprint lotto['botimer'][3]

##Set up a little address book by uniqnmaepeople = {'botimer': {'firstname': 'Noah', 'lastname': 'Botimer'}, 'csev': {'firstname': 'Charles', 'lastname': 'Severance'} }

## When you have multiple levels, you can "drill" into them by using multiple indexes## Print my first name: Noahprint people['botimer']['firstname']

Page 36: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

</advanced-trick-stuff>

Page 37: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Dictionary Operations

z-369

Page 38: Zelle - Chapter 11 · Data Collections •Sequences - 11.2 •Lists / Arrays - 11.2 •Dictionaries - 11.6 •For this lecture - avoid 11.3, 11.4, and 11.5 - these deal with classes

Summary

• Sequences

• Lists - Editable Sequences

• Dictionaries - Simple database within Python - maps keys to values