
Lesson 7: How to Use Help, List and Dictionary Methods

Fundamentals of Text Processing for Linguists

Na-Rae Han


Objectives

Learning on your own

dir(), help()

Python IDLE tooltips

Using online references

List methods

Dictionary methods

2/19/2014 2

Teaching yourself new tricks

Python built-in helper functions: dir(), help()

Python IDLE tooltips: type a function name and an opening parenthesis to see its call signature, e.g.

>>> range(

Online references

Python 2.7 Quick Reference: http://rgruet.free.fr/PQR27/PQR2.7.html

dir() and help()

dir(obj) returns a list of attributes (__xyz__) and methods that are available for the given object.

>>> dir(str)
['__add__', '__class__', '__contains__', '__delattr__', ... '__subclasshook__', '_formatter_field_name_split', 'capitalize', 'center', 'count', 'decode', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'index', 'isalnum', 'isalpha', 'isdigit', 'islower', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']


help(obj.method) prints out information on the object's method.

>>> help(str.find)
Help on method_descriptor:

find(...)
    S.find(sub [,start [,end]]) -> int

    Return the lowest index in S where substring sub is found,
    such that sub is contained within S[start:end].  Optional
    arguments start and end are interpreted as in slice notation.

    Return -1 on failure.
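As a rough sketch of how dir() and help() can be combined, the snippet below filters the dir(str) listing down to the ordinary methods and reads a method's help text from its __doc__ attribute, which is the text help() displays. The names public_methods and find_doc are illustrative and not from the slides; the code is written so that it should run under Python 3 as well as the Python 2.7 used in this lesson.

```python
# Keep only the ordinary string methods, skipping names that start with
# an underscore (the __xyz__ attributes and private helpers).
public_methods = [name for name in dir(str) if not name.startswith('_')]
print(public_methods)

# help(str.find) prints the method's docstring; the same text is
# available directly as the __doc__ attribute.
find_doc = str.find.__doc__
print(find_doc)
```

The exact list differs between Python versions (for example, 'decode' appears in Python 2.7 but not in Python 3).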

Self-learn!

Using the various sources, find out what the string methods listed by dir(str) do. (5 minutes)

Try help(str.strip)

Additional string operations (1)

.isalpha() returns True only if all characters are alphabetic

.isalnum() returns True only if all characters are letters or digits

.isdigit() returns True only if all characters are digits

.isspace() returns True only if all characters are whitespace characters

>>> 'co-operate'.isalpha()
False
>>> 'Exercise2'.isalnum()
True
>>> '2013'.isdigit()
True
>>> ' \n\t'.isspace()
True
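One way these tests might be put to work is to label tokens by type. The function classify() and its labels below are hypothetical names chosen for illustration, not part of the lesson; this is a minimal sketch rather than a complete token classifier.

```python
def classify(token):
    """Return a rough label for a token using the str.is...() tests."""
    if token.isalpha():
        return 'word'          # letters only
    if token.isdigit():
        return 'number'        # digits only
    if token.isalnum():
        return 'alphanumeric'  # a mix of letters and digits
    if token.isspace():
        return 'space'         # whitespace only
    return 'other'             # anything else, e.g. contains a hyphen

tokens = ['rose', '2013', 'Exercise2', ' \n\t', 'co-operate']
labels = [classify(t) for t in tokens]
print(labels)
```

The order of the tests matters: .isalnum() is also True for 'rose' and '2013', so the more specific tests come first.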

Additional string operations (2)

.strip() returns a copy of the string with whitespace removed from both ends

>>> ' green ideas \n'.strip()
'green ideas'

.find() searches for the given string within str, and returns the first index where it begins. Returns -1 if not found.

>>> 'green ideas'.find('e')
2
>>> 'green ideas'.find('ea')
8
>>> 'green ideas'.find('t')
-1

.count() searches for the given string and returns the number of non-overlapping occurrences

>>> 'green ideas'.count('e')
3
>>> 'green ideas sleep'.count('ee')
2
>>> 'The thirty-three thieves thought that'.count('th')
5
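.find() reports only the first match, but its optional start argument (shown in the help text for str.find) lets you keep searching from where you left off. The helper find_all() below is a hypothetical name used for illustration; it is a sketch of that idea, not a method that exists on strings.

```python
def find_all(text, sub):
    """Return a list of every index in text where sub begins."""
    positions = []
    start = text.find(sub)
    while start != -1:                     # -1 means "no further match"
        positions.append(start)
        start = text.find(sub, start + 1)  # resume just past the last hit
    return positions

e_positions = find_all('green ideas', 'e')
t_positions = find_all('green ideas', 't')
print(e_positions)
print(t_positions)
```

Because the search resumes one character after each hit, this version also reports overlapping matches, whereas .count() counts only non-overlapping ones.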

List operations

List methods

Functions that are defined on the list datatype

Called on a list object, with this syntax: listobj.method()

Lists are mutable, which means many list methods (such as .append() and .remove()) modify the caller object (the list) in place rather than returning a new list.

Lists are mutable

>>> li = [8, 'abc', 4.5, 11]
>>> li[2]
4.5
>>> li[2] = 1000
>>> li
[8, 'abc', 1000, 11]

We can change individual list elements.

These elements are changed in place: the rest of the list is not affected.

The list name 'li' still points to the same list object in memory when we're done.

Because lists are mutable, they carry some overhead and are generally not quite as fast as tuples.
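A small sketch of what "changed in place" means: the built-in id() identifies an object, and a second name bound to the same list sees the change. The names alias, before and same_object are illustrative only.

```python
li = [8, 'abc', 4.5, 11]
before = id(li)   # identity of the list object
alias = li        # a second name for the SAME list, not a copy

li[2] = 1000      # modify one element in place

same_object = (id(li) == before)
print(same_object)  # li still refers to the same object
print(alias)        # the change is visible through the other name too
```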

Tuples are immutable

You can't change a tuple.

>>> tu = ('Spring', 'Summer', 'Fall', 'Winter')
>>> tu[2]
'Fall'
>>> tu[2] = 'Autumn'
Traceback (most recent call last):
  File "<pyshell#20>", line 1, in <module>
    tu[2] = 'Autumn'
TypeError: 'tuple' object does not support item assignment

Instead, what you should do is make a fresh new tuple and reassign the name.
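One possible way to "make a fresh new tuple and reassign the name" is to build the new tuple from slices of the old one. This is a sketch of the idea, not necessarily the approach the original slide showed.

```python
tu = ('Spring', 'Summer', 'Fall', 'Winter')

# Build a NEW tuple: everything before index 2, the replacement item
# (note the trailing comma that makes a one-item tuple), everything after.
tu = tu[:2] + ('Autumn',) + tu[3:]
print(tu)
```

The original tuple is never modified; the name tu is simply rebound to the new one.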

Adding to a list

>>> li = [1,2,3]

>>> li

[1, 2, 3]

>>> li.append(4)

>>> li

[1, 2, 3, 4]

>>> li.extend([5,6,7])

>>> li

[1, 2, 3, 4, 5, 6, 7]

>>> li.insert(1, 1.5)

>>> li

[1, 1.5, 2, 3, 4, 5, 6, 7]

Try dir(li) and help() to find out what each of these methods does!

List methods

.append(x) adds a single item at the end

.extend(list) adds a list of items at the end

.insert(i,x) inserts an item at index i

>>> li = [3, 9, 'ab', 3.5]
>>> li.append('a')
>>> li
[3, 9, 'ab', 3.5, 'a']
>>> li.extend([9, 11, 'c'])
>>> li
[3, 9, 'ab', 3.5, 'a', 9, 11, 'c']
>>> li.insert(2, 'x')
>>> li
[3, 9, 'x', 'ab', 3.5, 'a', 9, 11, 'c']

.append() vs. .extend()

>>> li = [1, 2]
>>> li.append(3)
>>> li
[1, 2, 3]
>>> li.extend([4,5])
>>> li
[1, 2, 3, 4, 5]
>>> li.append([6,7])
>>> li
[1, 2, 3, 4, 5, [6, 7]]
>>> len(li)
6

List inside a list! [6,7] is appended as a single element. li has length 6.
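The same contrast shows up when collecting words from several sentences. In this sketch (the sentences and the names flat and nested are made up for illustration), .extend() yields one flat word list while .append() yields a list of per-sentence lists.

```python
sentences = ['green ideas sleep', 'colorless green ideas']

flat = []     # one long list of words
nested = []   # one sub-list per sentence

for s in sentences:
    words = s.split()
    flat.extend(words)    # adds each word as its own item
    nested.append(words)  # adds the whole word list as a single item

print(flat)
print(nested)
```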


.extend() vs. +

+ also extends a list, but it creates and returns a NEW list. li is NOT affected.

>>> li = [1, 2, 3]
>>> li.extend([4, 5, 6])
>>> li
[1, 2, 3, 4, 5, 6]
>>> li + [7, 8, 9]
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> li
[1, 2, 3, 4, 5, 6]
>>> li = li + [7, 8, 9]
>>> li
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> li += [10, 11]
>>> li
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

To extend li itself using +, reassign it to the new, returned list. The shortcut li += [10, 11] extends li in place, much like .extend().
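The difference between modifying in place and creating a new list becomes visible when a second name refers to the same list. The name alias below is illustrative; this is only a sketch of the behavior described above.

```python
li = [1, 2, 3]
alias = li              # second name for the same list object

li.extend([4, 5, 6])    # in place: alias sees the new items
li = li + [7, 8, 9]     # new list: li is rebound, alias is left behind

print(alias)
print(li)
```

After the reassignment, li and alias are two different list objects.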

List methods based on item value

.index(x) index of first occurrence

.count(x) number of occurrences

.remove(x) remove first occurrence only

>>> li = ['a', 'b', 'c', 'b']

>>> li.index('b')

1

>>> li.count('b')

2

>>> li.remove('b')

>>> li

['a', 'c', 'b']

Careful: .index() and .remove() raise a ValueError if the item is not found in the list (.count() simply returns 0).

Use them in conjunction with the in operator:

if 'b' in li :
    li.remove('b')
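Two common ways to guard against a missing item can be sketched as follows: a while loop with in removes every occurrence, and try/except catches the ValueError that .index() raises. The variable pos and the use of -1 as a "not found" marker are assumptions made for this example.

```python
li = ['a', 'b', 'c', 'b']

# .remove() deletes only the first occurrence, so loop until none remain.
while 'b' in li:
    li.remove('b')
print(li)

# Alternatively, attempt the call and handle the error if it occurs.
try:
    pos = li.index('z')
except ValueError:
    pos = -1   # our own marker meaning "not found"
print(pos)
```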

.pop()

>>> li = ['a', 'b', 'c', 'd', 'e']
>>> li.pop()
'e'
>>> li
['a', 'b', 'c', 'd']
>>> li.pop(2)
'c'
>>> li
['a', 'b', 'd']

.pop() removes the last item from the list and returns it

.pop(i) removes the item at index i and returns it

.pop() removes an item from the list; the list no longer contains the item

Because the popped item is returned, you can assign a name to it: after x = li.pop(2) on ['a', 'b', 'c', 'd'], x's value is 'c'.
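Because .pop() both shrinks the list and hands back the item, it can drive a loop that processes a list from the end until it is empty. The names stack and reversed_items are illustrative; this is a sketch of one use, not the only one.

```python
stack = ['a', 'b', 'c']
reversed_items = []

# An empty list counts as False, so the loop stops when nothing is left.
# (Calling .pop() on an empty list would raise an IndexError.)
while stack:
    item = stack.pop()          # remove and capture the last item
    reversed_items.append(item)

print(reversed_items)
print(stack)
```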

.pop() vs. .append()

>>> li = ['a', 'b', 'c']
>>> li.append('x')
>>> li
['a', 'b', 'c', 'x']
>>> li.pop()
'x'
>>> li
['a', 'b', 'c']

.append('x') & .pop() undo each other

.pop() vs. .insert()

>>> li = ['a', 'b', 'c']
>>> li.insert(2, 'x')
>>> li
['a', 'b', 'x', 'c']
>>> li.pop(2)
'x'
>>> li
['a', 'b', 'c']

.insert(i,'x') & .pop(i) undo each other

Practice

add 'thou' to the list

change 'i' to 'I'

add 'we' and 'they'

remove 'thou'

add pron2 to pron

add 'yinz' between 'we' and 'they'

Try it yourself

2 minutes

dict: a dictionary data type

Dictionaries store a mapping between a set of keys and a set of values.

Keys can be any immutable type: string, integer, tuple

Values can be any type (can also be mixed types)

There is no inherent order (unlike lists and tuples)

You can define, modify, view, look up, and delete the key-value pairs in the dictionary.

A dictionary of the Simpson family members' ages:

{'Homer':36, 'Marge':36, 'Bart':10, 'Lisa':8, 'Maggie':1}

A dictionary of verb past tense forms:

{'go':'went', 'eat':'ate', 'see':'saw', 'say':'said'}
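To illustrate the rule about key types, the sketch below uses tuples as keys (here, made-up word-pair counts) and shows that a list, being mutable, is rejected as a key with a TypeError. The dictionary contents and the name list_key_ok are invented for this example.

```python
# Tuples are immutable, so they can serve as dictionary keys.
bigram_counts = {('green', 'ideas'): 2, ('ideas', 'sleep'): 1}
print(bigram_counts[('green', 'ideas')])

# Lists are mutable, so using one as a key fails.
try:
    bad = {['green', 'ideas']: 2}
    list_key_ok = True
except TypeError:
    list_key_ok = False
print(list_key_ok)
```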

Looking up a dictionary

>>> en2es = {'cat':'gato', 'dog':'perro', 'tiger':'tigre'}

>>> en2es['cat']

'gato'

>>> en2es['dog']

'perro'

>>> en2es['gato']

Traceback (most recent call last):

File "<pyshell#2>", line 1, in <module>

en2es['gato']

KeyError: 'gato'

A dictionary is one-way: you cannot look up a key based on its value. The mapping can be many-to-one.
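If lookups in the other direction are needed, one option is to build a second, reversed dictionary. The name es2en is hypothetical, and this sketch assumes every value is itself immutable and unique; with a many-to-one mapping, later keys would overwrite earlier ones.

```python
en2es = {'cat': 'gato', 'dog': 'perro', 'tiger': 'tigre'}

# Build the reverse mapping: each value becomes a key, and vice versa.
es2en = {}
for en in en2es:
    es = en2es[en]
    es2en[es] = en

print(es2en['gato'])
```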

Adding and deleting an entry

>>> en2es = {'cat':'gato', 'dog':'perro'}

>>> en2es['cat']

'gato'

>>> en2es['tiger'] = 'tigre'

>>> en2es

{'tiger': 'tigre', 'dog': 'perro', 'cat': 'gato'}

>>> del en2es['dog']

>>> en2es

{'tiger': 'tigre', 'cat': 'gato'}

Assigning to a key that does not exist yet, as in en2es['tiger'] = 'tigre', creates a new key and maps it to the value.

There is no order in a dictionary!

del deletes a key and its value from a dictionary.

Checking if something's in a dictionary

>>> en2es

{'tiger': 'tigre', 'dog': 'perro', 'cat': 'gato'}

>>> 'fox' in en2es

False

>>> 'cat' in en2es

True

>>> 'gato' in en2es

False

"in" tests if a key is in a dictionary

"in" does not work with values

Finding out what's in

>>> en2es

{'tiger': 'tigre', 'wolf': 'lobo', 'cat': 'gato'}

>>> en2es.keys()

['tiger', 'wolf', 'cat']

>>> en2es.values()

['tigre', 'lobo', 'gato']

>>> en2es.items()

[('tiger', 'tigre'), ('wolf', 'lobo'), ('cat', 'gato')]

.keys() returns a list of keys, .values() returns a list of values. The orders match!

.items() returns a list of (key, value) TUPLES ('pairs')
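A quick sketch of what "the orders match" buys you: pairing up .keys() and .values() position by position reproduces .items(), and .values() gives a way to test whether something occurs as a value. The list() calls are there because in Python 3 these methods return view objects rather than lists; in the Python 2.7 used in the slides they are harmless. The names pairs and has_gato are illustrative.

```python
en2es = {'tiger': 'tigre', 'wolf': 'lobo', 'cat': 'gato'}

keys = list(en2es.keys())
values = list(en2es.values())

# zip() pairs the first key with the first value, and so on.
pairs = list(zip(keys, values))
print(pairs == list(en2es.items()))

# "in" on the dictionary checks keys; check values explicitly instead.
has_gato = 'gato' in en2es.values()
print(has_gato)
```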

Iterating through dictionary

>>> en2es

{'tiger': 'tigre', 'wolf': 'lobo', 'cat': 'gato'}

>>> 'tiger' in en2es

True

>>> for k in en2es :
        print k, 'is', en2es[k]

tiger is tigre
wolf is lobo
cat is gato

"in" tests if a key is in a dictionary; in a for loop, it steps through the keys.

for k in en2es.keys() also works

Iterating through key:value tuples

>>> en2es

{'tiger': 'tigre', 'wolf': 'lobo', 'cat': 'gato'}

>>> en2es.items()

[('tiger', 'tigre'), ('wolf', 'lobo'), ('cat', 'gato')]

>>> for (k,v) in en2es.items() :
        print k, 'is', v

tiger is tigre

wolf is lobo

cat is gato
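The slides use the Python 2 print statement. As a hedged sketch, the same loop can be written so that it should run under both Python 2.7 and Python 3 by building each line with % formatting and passing a single string to print(). Wrapping the items in sorted() is an addition for this example, used to get a predictable order since a dictionary has no inherent one.

```python
en2es = {'tiger': 'tigre', 'wolf': 'lobo', 'cat': 'gato'}

lines = []
for (k, v) in sorted(en2es.items()):   # pairs sorted alphabetically by key
    lines.append('%s is %s' % (k, v))

for line in lines:
    print(line)
```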

Common dict application: counting

>>> tally = {'gold':1, 'bronze':3}

>>> tally['bronze']

3

>>> medals = ['bronze', 'silver', 'gold', 'gold', 'silver']

>>> for m in medals:
        tally[m] += 1

Traceback (most recent call last):
  File "<pyshell#80>", line 2, in <module>
    tally[m] += 1
KeyError: 'silver'

Error: 'silver' is not yet in the dictionary as a key

Common dict application: counting

>>> tally = {'gold':1, 'bronze':3}

>>> tally['bronze']

3

>>> medals = ['bronze', 'silver', 'gold', 'gold', 'silver']

>>> for m in medals:
        if m not in tally :
            tally[m] = 1
        else :
            tally[m] += 1

>>> tally
{'bronze': 4, 'gold': 3, 'silver': 2}

Make sure to account for the initial key creation & assignment
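An alternative that some find more compact is the dictionary method .get(key, default), which returns the default instead of raising a KeyError when the key is missing. This is a sketch of that variant, not the approach shown on the slide.

```python
tally = {'gold': 1, 'bronze': 3}
medals = ['bronze', 'silver', 'gold', 'gold', 'silver']

for m in medals:
    # .get(m, 0) yields the current count, or 0 for a medal not seen yet,
    # so key creation and incrementing happen in one statement.
    tally[m] = tally.get(m, 0) + 1

print(tally)
```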

Word count

sent = 'Rose is a rose is a rose is a rose.'
words = sent.split()
counts = {}
for w in words :
    if w in counts :
        counts[w] += 1
    else :
        counts[w] = 1
print counts

>>>
{'a': 3, 'Rose': 1, 'is': 3, 'rose.': 1, 'rose': 2}
>>>

'Rose' and 'rose' are counted separately: fold case.

'rose.' is counted separately too: we really must start tokenizing punctuation.

Word count and tokenization

sent = 'Rose is a rose is a rose is a rose.'
words = sent.lower().replace('.',' .').split()
counts = {}
for w in words :
    if w in counts :
        counts[w] += 1
    else :
        counts[w] = 1
print counts

>>>
{'a': 3, 'rose': 4, 'is': 3, '.': 1}
>>>

'.' is its own word. 4 tokens of 'rose'
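The same steps can be wrapped in a function so they can be reused on other sentences. The function name count_words is hypothetical, and like the script above this sketch only separates periods; other punctuation is left attached to words.

```python
def count_words(sent):
    """Return a dict mapping each lowercased token in sent to its count."""
    # Fold case, split off periods as their own tokens, then tokenize.
    words = sent.lower().replace('.', ' .').split()
    counts = {}
    for w in words:
        if w in counts:
            counts[w] += 1
        else:
            counts[w] = 1
    return counts

counts = count_words('Rose is a rose is a rose is a rose.')
print(counts)
```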

2 minutes

Wrap-up

Next class

Sorting

File IO: reading from and writing to a file

Exercise5

Due Tuesday midnight

Not yet online – will be up tonight