session 2 wharton summer tech camp 1: basic python 2: start regex

Post on 28-Dec-2015

Session 2
Wharton Summer Tech Camp

1: Basic Python
2: Start Regex

Announcement

If you did not get an email from me saying that the slides have been uploaded, please email me and I’ll add you to the list

Python Packaged Distribution

• Download this packaged version
• Enthought Canopy or EPD
  – Company that maintains a great compiled version of Python.
  – Has many packages included.
  – Alternative is to download Python and install countless packages yourself -> can be a nightmare due to compiler incompatibility etc.
  – https://www.enthought.com/products/canopy/academic/
• Free for people with EDU email

Why?
• Has many great packages useful for us (Scientific computing, Machine Learning, NLP, Scraping etc)
• One of the easiest and most concise languages, yet powerful
  – Memory consumption was often "better than Java and not much worse than C or C++"
• Has IDLE ("Integrated DeveLopment Environment")
  – Read-Eval-Print-Loop
• Great OOP (Compared to other comparable languages, say PERL. bless() those who use it)
• Highly scalable
• Easy incorporation of other languages (Cython, Jython)
• Named after Monty Python

Used by many companies as a prototyping and "duct-tape" language as well as a main language: Wall Street, Yahoo, CERN, NASA, Con Edison, Google, etc. Also, YouTube is written in Python!

Bit More Background on Python
• Does a few things EXCELLENTLY (OOP, Sci Comp, etc) and is generally good for a lot of things
• Guido van Rossum – late 1980s
• Programmer oriented (easy to write and read). Use of white space.
• Automatic memory management
• Can be interpreted or compiled (PyPy – Just-in-time compiler)
• Direct opposite of PERL when it comes to programming philosophy

– PERL "there is more than one way to do it" -> Super fun when writing your own code. Rage when you debug other people’s PERL code (there is even an Obfuscated Perl Contest)

– Python "there should be one—and preferably only one—obvious way to do it" -> Writing your own & Reading others’ = Fun

• Would you like to know more?
  – http://www.youtube.com/watch?v=ugqu10JV7dk
  – Van Rossum talks about the history of Python for 110 min!

Let’s start coding in Python!
Fire up your IDLE.

Load the file called basicpython.py from the camp website

Basic Data Types

• All the standard types
  – Integers, floating point
    • 2, 2.2, 3.14 etc
  – Strings
    • "Hi, I am a string"
  – Booleans
    • True
    • False

Hello World & Arithmetic

Helloworld.py
>>> print "hello, world!"  #that's it
# <- used for commenting

Simple Arithmetic (+ - * ** / %)
>>> 1+1
>>> 5**2

Booleans (operators: and, or, not, >, <, <=, ==, !=, etc)
>>> True
>>> False
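A minimal sketch of the operators listed above. It uses print() so that it also runs on Python 3 (the slides use the Python 2 print statement), and the variable names are only for illustration. One thing to watch: / between two integers truncates in Python 2 but returns a float in Python 3, while // floors in both.

```python
# Arithmetic operators from the slide
total = 1 + 1          # addition
square = 5 ** 2        # exponentiation
remainder = 7 % 2      # modulo (remainder after division)
floor_div = 7 // 2     # floor division, same result in Python 2 and 3

# Boolean operators and comparisons
both = True and False
either = True or False
negated = not True
compared = (3 <= 4) and (3 != 4)

print(total, square, remainder, floor_div)
print(both, either, negated, compared)
```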

Strings

string="hello"
string+string
string*3
string[0]
string[-1]
string[1:4]
len(string)
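The string expressions above, one per line with a comment on what each does. This is a sketch for trying in IDLE; the names on the left are made up for illustration.

```python
string = "hello"

doubled = string + string   # concatenation
tripled = string * 3        # repetition
first = string[0]           # indexing starts at 0
last = string[-1]           # negative indices count from the end
middle = string[1:4]        # slice: indices 1, 2, 3 (index 4 is excluded)
length = len(string)        # number of characters

print(doubled, tripled, first, last, middle, length)
```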

Lists, Tuples, and Dictionaries

Data structures – there are many, but these 4 are the most commonly used. Each has pros and cons.

• List – list of values
• Sets – set(list). You can do set operations, which can be faster than going through a list one element at a time.
• Tuples – just like lists but immutable and fixed size. Also, style-wise, lists usually consist of homogeneous items while tuples can consist of heterogeneous items and form some sort of structure, e.g. (firstname, lastname) or (name, age)
• Dictionaries – Hash lookup table. Index of stuff. Basic bookkeeping "Key->Value". Fast lookup, O(1) on average.
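A short sketch of the two structures the later slides do not demonstrate: sets and tuple immutability. The lists and the (name, age) tuple are hypothetical values chosen only for illustration.

```python
# Hypothetical player lists, used only for illustration
grass = ["Federer", "Nadal", "Murray", "Federer"]
clay = ["Nadal", "Djokovic"]

unique_grass = set(grass)              # duplicates collapse
on_both = unique_grass & set(clay)     # intersection
on_either = unique_grass | set(clay)   # union

# Tuples group heterogeneous fields and cannot be modified in place
person = ("Ada", 36)                   # (name, age), made-up values
try:
    person[1] = 37
    tuple_is_mutable = True
except TypeError:
    tuple_is_mutable = False

print(sorted(on_both), sorted(on_either), tuple_is_mutable)
```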

Lists, Tuples, and Dictionaries

• List – []
>>> TPlayersList=["Federer","Nadal","Murray", "Djokovic"]
range(), append(), pop(), insert(), reverse(), sort()
e.g. TPlayersList.sort()

• Tuples – ()
>>> TPlayersTuple=("Federer","Nadal","Murray", "Djokovic")

• Dictionaries – {}
>>> TPlayersDict={ "Federer": 5, "Nadal": 4, "Murray":2, "Djokovic":1}
>>> TPlayersDict["Ferrer"]=3
>>> TPlayersDict["Ferrer"]
>>> del TPlayersDict["Ferrer"]
Let d be a dictionary; then d.keys(), d.values(), d.items()
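The list methods and dictionary operations named above can be tried together as follows. This is a sketch reusing the slide's TPlayersList and TPlayersDict; the names removed and ferrer_value are added here for illustration.

```python
TPlayersList = ["Federer", "Nadal", "Murray", "Djokovic"]
TPlayersList.append("Ferrer")      # add to the end
removed = TPlayersList.pop()       # remove and return the last item
TPlayersList.insert(0, "Ferrer")   # insert at index 0
TPlayersList.sort()                # sort in place, alphabetically

TPlayersDict = {"Federer": 5, "Nadal": 4, "Murray": 2, "Djokovic": 1}
TPlayersDict["Ferrer"] = 3         # add a key
ferrer_value = TPlayersDict["Ferrer"]
del TPlayersDict["Ferrer"]         # remove it again

# sorted() gives a stable order for display; dictionary order should not be relied on
print(TPlayersList)
print(sorted(TPlayersDict.keys()), sorted(TPlayersDict.values()))
```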

• When you are first reading in data
  – Think carefully about what you want to do with the data
  – Then decide what data structures to use
  – It is common to have things like
    • Array of arrays
    • Array of tuples
    • Dictionary of arrays
    • Dictionary of dictionaries
    • Dictionary made of (tuple keys)
  – However, once you need things like dictionary of dictionary of dictionary of arrays or similar ridiculous structures, consider using object-oriented programming
    • Look up Python classes (http://docs.python.org/2/tutorial/classes.html)
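To make the combinations above concrete, here is a sketch of two nested structures and a small class that could replace deeper nesting. All data, the Player class, and its method names are hypothetical; they are one possible design, not the one used in the camp files.

```python
# Dictionary of lists: one key maps to many values (made-up data)
titles_by_player = {"Federer": ["Open A", "Open B"], "Nadal": ["Open C"]}

# Dictionary with tuple keys: look up by (player, year)
wins = {("Federer", 2012): 1, ("Nadal", 2012): 1}

# When nesting gets deep, a small class can name the pieces instead
class Player(object):
    def __init__(self, name):
        self.name = name
        self.titles = []           # list of (tournament, year) tuples

    def add_title(self, tournament, year):
        self.titles.append((tournament, year))

    def titles_in(self, year):
        return [t for t, y in self.titles if y == year]

p = Player("Federer")
p.add_title("Open A", 2012)
p.add_title("Open B", 2013)
print(titles_by_player["Federer"], wins[("Federer", 2012)], p.titles_in(2012))
```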

Basic Control Flow

• Boils down to
  – If (elif, else)
  – While
  – For
• Python has better syntactic sugar for control flow to iterate through different data structures

Basic Control Flow

• True Things
  – True
  – Any non-zero number
  – Any non-empty string or data structure
• False Things
  – False
  – 0
  – ""
  – Empty data structures
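The rules above can be checked with bool(), which reports how an if statement would treat a value. A small sketch; the sample values are chosen for illustration.

```python
values = [True, 1, -3.5, "text", [0], False, 0, "", [], {}, ()]

for v in values:
    # bool(v) is what `if v:` evaluates
    print(repr(v), bool(v))

truthy = [v for v in values if v]
falsy = [v for v in values if not v]
```

Note that [0] counts as true: the list is non-empty even though its only element is 0.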

If and while

if True:
    print "everything is good"
else:
    print "?! HUHHHHH?"

i=1
while (i<=5):
    print "Hellodoctornamecontinueyesterdaytomorrow"
    i+=1
    if i>5:
        print "good morning dr. chandra"
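The slide's example does not show elif, so here is a sketch that combines if / elif / else with a while loop. The function describe and the list seen are hypothetical names, and print() is used so the code also runs on Python 3.

```python
def describe(n):
    """Classify a number using if / elif / else."""
    if n < 0:
        return "negative"
    elif n == 0:
        return "zero"
    else:
        return "positive"

# while loop: repeat until the condition becomes false
i = 1
seen = []
while i <= 5:
    seen.append(describe(i - 3))
    i += 1

print(seen)
```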

Basic Control Flow - for

for player in TPlayersList:
    print player

for player in sorted(TPlayersList):
    print player

for index, player in enumerate(TPlayersList):
    print index, player

for i in xrange(1,10,2):
    print i

for key, value in TPlayersDict.iteritems():
    print key, value
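xrange() and iteritems() exist only in Python 2. If you are on Python 3, the same loops can be sketched with range() and items(), as below; odds and pairs are names added for illustration.

```python
TPlayersList = ["Federer", "Nadal", "Murray", "Djokovic"]
TPlayersDict = {"Federer": 5, "Nadal": 4, "Murray": 2, "Djokovic": 1}

for index, player in enumerate(sorted(TPlayersList)):
    print(index, player)

odds = [i for i in range(1, 10, 2)]    # range replaces xrange in Python 3
print(odds)

pairs = sorted(TPlayersDict.items())   # items replaces iteritems in Python 3
for key, value in pairs:
    print(key, value)
```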

continue and break

• While running loops, you may need to skip or stop at some point; look up
  – continue
  – break
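A minimal sketch of both keywords in one loop: continue skips the rest of the current iteration, break leaves the loop entirely. The conditions are arbitrary, picked only to show the effect.

```python
visited = []
for n in range(1, 10):
    if n % 2 == 0:
        continue      # skip even numbers, go to the next iteration
    if n > 7:
        break         # leave the loop entirely
    visited.append(n)

print(visited)
```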

Defining a function

def fib(n):    # write Fibonacci series up to n
    """Print a Fibonacci series up to n."""
    a, b = 0, 1
    while a < n:
        print a,
        a, b = b, a+b
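fib() above prints its results, so the caller cannot reuse them. A variant that returns a list is sketched below; fib_list is a hypothetical name, and the loop is otherwise the same.

```python
def fib_list(n):
    """Return a list of the Fibonacci numbers below n."""
    result = []
    a, b = 0, 1
    while a < n:
        result.append(a)
        a, b = b, a + b
    return result

print(fib_list(100))
```

Returning a value instead of printing makes the function usable inside other expressions, e.g. len(fib_list(100)).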

Importing Libraries

• Import library
• E.g. "import sys"
• Some useful libraries
  – sys
  – re
  – csv
  – scipy
  – numpy
• http://wiki.python.org/moin/UsefulModules#Useful_Modules.2C_Packages_and_Libraries
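A quick sketch of importing and using two of the standard-library modules listed above. The sample text and variable names are made up for illustration; scipy and numpy are left out because they must be installed separately.

```python
import sys
import re

# sys exposes interpreter details such as the running version
major_version = sys.version_info[0]

# re provides regular expressions; findall returns every match as a string
numbers = re.findall(r"\d+", "Federer 5, Nadal 4, Murray 2")

print(major_version, numbers)
```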

File IO

• Reading data files into memory
• open() – returns a file object which can read or write files
• open(filename, mode)
• filehandle = open(filename, mode)
• filehandle.readline()

Mode
• r = read, w = write, a = append, rb = read in binary (Windows makes that distinction)
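The modes above can be tried with a throwaway file. This sketch writes to a temporary directory so it does not depend on any existing data; the file name players.txt is hypothetical.

```python
import os
import tempfile

# Write to a temporary location so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "players.txt")

filehandle = open(path, "w")         # w = write (creates or truncates)
filehandle.write("Federer\nNadal\n")
filehandle.close()

filehandle = open(path, "a")         # a = append to the end
filehandle.write("Murray\n")
filehandle.close()

filehandle = open(path, "r")         # r = read
first_line = filehandle.readline()   # one line, newline included
rest = filehandle.readlines()        # remaining lines as a list
filehandle.close()

print(repr(first_line), rest)
```

A `with open(path, "r") as filehandle:` block does the same reading and closes the file automatically, which is usually the safer habit.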

Python Example 1

• Reading a CSV and saving each row as an array
  – Dealing with CSV can be very painful.
  – Sometimes different character encodings cause problems when reading CSV
  – If CSV reading just doesn’t work, suspect that you have an encoding issue. Look up encodings (ISO-8859-1/latin1 to UTF-8)
  – This is why no serious programs really use CSV as a storage mechanism
• Fire up csvRead.py
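csvRead.py itself is not reproduced in these slides, so the following is only a sketch of the idea: read a CSV with an explicit encoding and keep each row as a list. It assumes Python 3, where open() accepts an encoding argument; the file name and contents are made up.

```python
import csv
import os
import tempfile

# Build a small Latin-1 encoded CSV so the example is self-contained
path = os.path.join(tempfile.mkdtemp(), "players.csv")
with open(path, "w", encoding="latin-1", newline="") as f:
    f.write("name,rank\nFederer,5\nJos\u00e9,7\n")

# Read it back, naming the encoding explicitly, one list per row
rows = []
with open(path, "r", encoding="latin-1", newline="") as f:
    for row in csv.reader(f):
        rows.append(row)

print(rows)
```

If the accented name comes back garbled or the read raises a decode error, the encoding passed to open() does not match the file, which is the issue the slide warns about.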

Lab

Do interactive tutorials at
http://www.codecademy.com/courses/
http://www.learnpython.org/
