Data Types and Flow Control June 25, 2015

Upload: lenard-hardy

Post on 12-Jan-2016


Page 1: Data Types and Flow Control June 25, 2015. Didn’t we learn this already? Most of these topics have been introduced, but this lecture gives you the details

Data Types and Flow Control

June 25, 2015


Didn’t we learn this already?

Most of these topics have been introduced, but this lecture gives you the details you need to use them:
– Which data type is appropriate for a situation
  • Advantages and disadvantages of each
– Compare/contrast methods for flow control
– Writing and evaluating logical expressions
– Common pitfalls and unexpected behavior


PART 1: DATA TYPES


Part 1 Overview

• Primitive types
  – Integers
  – Floating point numbers
  – Boolean
  – Characters
• Ordered types
  – Array/list
  – String
• Unordered types
  – Set
  – Hash/map/dictionary

• Teaser for structs and objects


Integers

• Number with no fractional part
• Stores a bit to indicate if the number is positive or negative + a binary representation of the number
• Stores the exact value (i.e. 2 is stored as exactly 2), so == comparisons are safe
• Absolute size limit (system-dependent)
  – E.g. a 32-bit system could store integers within about ±(2^31 - 1)
• Converting floating point numbers to integers truncates the fractional part (2.1 → 2, -2.1 → -2)
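As a minimal sketch of the truncation rule above, assuming Python 3 (the slides themselves use Python 2 syntax): `int()` drops the fractional part, so it rounds toward zero, which differs from `math.floor` for negative values. The variable names are made up for illustration.

```python
import math

# int() truncates toward zero: the fractional part is simply dropped.
truncated_pos = int(2.1)
truncated_neg = int(-2.1)

# math.floor() rounds toward negative infinity instead.
floored_neg = math.floor(-2.1)

# Integers are stored exactly, so == comparisons are safe.
exact = (1 + 1 == 2)

print(truncated_pos, truncated_neg, floored_neg, exact)
```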


Special types of integers

• Long integers
  – Take up more space in memory than integers, but can store larger values
• Short integers
  – Take up less space, but have smaller range
• Unsigned integers
  – Same amount of space as integers
  – Only store integers >= 0
  – Useful for values that can never be negative (count, class size, etc.)


Integer Overflow

• Overflow: Integer exceeds its maximum/minimum size
  – Some systems may give you an overflow error
  – In other cases, values may wrap:
      <largest possible positive integer> + 1 = <most negative possible integer>
  – Others (e.g. Python) automatically convert to a long integer
• Unsigned integers may cause unexpected behavior if you try to store a negative value:

    >>> from ctypes import c_uint as uint   # assumed; the slide does not show where uint comes from
    >>> y = uint(12)
    >>> print y
    c_uint(12L)             12 is still 12
    >>> x = uint(-12)
    >>> print x
    c_uint(4294967284L)     But -12 is this giant number!
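A minimal Python 3 sketch of the behaviors listed above, assuming the standard `ctypes` module as a stand-in for fixed-width integers (plain Python integers never overflow). The names `wrapped`, `unsigned`, and `big` are illustrative only.

```python
from ctypes import c_int32, c_uint32

INT32_MAX = 2**31 - 1

# Fixed-width signed integer: ctypes does no overflow checking, so the value wraps.
wrapped = c_int32(INT32_MAX + 1).value

# Storing a negative number in an unsigned type also wraps (to 2**32 - 12).
unsigned = c_uint32(-12).value

# Plain Python integers grow as needed instead of overflowing.
big = INT32_MAX + 1

print(wrapped, unsigned, big)
```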


Floating point numbers

• Numbers stored in scientific notation:
  – Sign
  – Power of ten (real hardware uses a power of two, but the idea is the same)
  – Binary representation of fractional part
    • i.e. 10.2 = +1.02 × 10^1
• Special values
  – Inf (infinity) and -Inf (negative infinity)
  – NaN (not a number): produced from operations without a definite answer (Inf/Inf, etc.)
• Double precision: Uses additional space, so it can store larger numbers and more decimal places
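The special values above can be explored with a short sketch, assuming Python 3, where `float("inf")` builds an infinity and dividing infinity by infinity yields NaN rather than an error:

```python
import math

inf = float("inf")
neg_inf = -inf

# An operation with no definite answer produces NaN.
nan = inf / inf

print(inf > 1e308)        # infinity exceeds every finite float
print(math.isnan(nan))    # the reliable way to detect NaN
print(nan == nan)         # NaN is not equal to anything, even itself
```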


Problems with floating point numbers

Since floating point numbers are stored as fractions in binary, not all numbers can be stored precisely, so math may not give exactly the answer you expect:

    >>> x = 3.2
    >>> y = 1.1
    >>> x+y == 4.3
    False                   Hmmmm, that's not right . . .
    >>> x+y
    4.300000000000001

Therefore, it is best not to check floating point numbers for exact equality; compare them within a small tolerance instead.
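One common alternative to exact equality is a tolerance check. The sketch below assumes Python 3, where `math.isclose` is available in the standard library; the `tolerance` value is an arbitrary choice for illustration, not a recommendation from the slides.

```python
import math

x = 3.2
y = 1.1
total = x + y

exact_match = (total == 4.3)              # fails because of rounding error
close_match = math.isclose(total, 4.3)    # equal within a relative tolerance

# Equivalent manual check with an explicit (assumed) tolerance.
tolerance = 1e-9
manual_match = abs(total - 4.3) < tolerance

print(exact_match, close_match, manual_match)
```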


Boolean

• Used to store true/false values
• Can be converted from any other type.
  – False values (may vary by language):
    • Numeric types: 0
    • Empty aggregate types ([], “”, {}, etc.)
  – All other values are true
• Commonly used in loops and control structures (coming up!)
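A small sketch of these conversion rules as they apply in Python 3 specifically (other languages may differ, as the slide notes). The example values are arbitrary.

```python
falsy_values = [0, 0.0, [], "", {}, set()]
truthy_values = [1, -2.5, [0], "False", {"key": 0}]

# Zero and empty containers convert to False.
all_falsy = all(bool(v) is False for v in falsy_values)

# Everything else converts to True, including the non-empty string "False".
all_truthy = all(bool(v) is True for v in truthy_values)

print(all_falsy, all_truthy)
```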


Arrays, vectors, and lists

• Ordered collection of values or variables
• Typically mutable: you can change the values in the array after you have created it
• Various implementations
  – All one type or mixed type (lists typically mixed, others not)
  – Fixed length or variable length
• Often indexed from 0 (the first element is 0, second is 1, etc.)
  – Language dependent (e.g. Matlab indexes from 1)


Basic array operations

• Accessing data: Use brackets or parentheses (language-dependent) and the index number of the element you want to access

    >>myArray = [2, 4, 6, 8, 10]
    >>myArray[3]
    8

• Setting values: Same as accessing data, but set it equal to a value

    >>myArray[3] = 100
    >>print myArray
    [2, 4, 6, 100, 10]

• Other operations vary (appending, deleting entries, etc.)
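As one example of how the "other operations" look in a particular language, here is a minimal Python 3 sketch using a list; `append` and `del` are Python's spelling, and other languages will differ.

```python
my_array = [2, 4, 6, 8, 10]

fourth = my_array[3]        # indexing starts at 0, so index 3 is the fourth element
my_array[3] = 100           # lists are mutable, so elements can be replaced in place
my_array.append(12)         # add a value to the end
del my_array[0]             # remove the first value; later elements shift down

print(fourth, my_array)
```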


Strings

• Sequence of characters enclosed in single or double quotation marks (language-dependent)

• Similar operations to arrays:
  – Usually indexed from 0 (language-dependent)
  – Access individual characters by indexing
    • In many languages, strings are immutable, meaning they can’t be changed in place (i.e. myString[3] = ‘a’ won’t work).
  – Concatenation: “race” + “car” = “racecar”
    • Spaces aren’t added automatically; you have to tell the computer where they go!
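A short sketch of indexing, immutability, and concatenation, assuming Python 3, where assigning to a string index raises `TypeError`:

```python
my_string = "race"

first_char = my_string[0]

# Python strings are immutable: assigning to an index raises TypeError.
try:
    my_string[3] = "a"
    changed_in_place = True
except TypeError:
    changed_in_place = False

# Build a new string instead; spaces must be added explicitly.
joined = my_string + "car"
spaced = my_string + " " + "car"

print(first_char, changed_in_place, joined, spaced)
```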


Sets

• Values stored and accessed in no particular order
• Each value appears only once in the set
• Very fast to check whether a value is in the set
• Objects can be added or removed, but are not placed at a particular index
• Common set operations (for sets A and B)
  – Union: items found in A, B, or both
  – Intersect: items found in both A and B
  – Difference: e.g. B-A = set of items in B but not in A
  – Symmetric difference = set of items in A or B but not both
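The four set operations above map directly onto Python's built-in `set` operators. This is a sketch with made-up contents for A and B:

```python
A = {"ACGT", "GGCC", "TTAA"}
B = {"GGCC", "TTAA", "CATG"}

union = A | B                  # in A, B, or both
intersection = A & B           # in both A and B
difference = B - A             # in B but not in A
sym_difference = A ^ B         # in exactly one of A and B

A.add("GGCC")                  # already present, so the set is unchanged
has_value = "TTAA" in A        # fast membership test

print(sorted(union), sorted(intersection), sorted(difference), sorted(sym_difference))
```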


Hash tables, maps, dictionaries

• Set of keys (constant value, like a word in a dictionary) paired with values
  – Keys must be immutable
  – Values can be any type including numeric types, strings, arrays, or even other dictionaries

• Keys stored in no particular order


Using dictionaries

• Values are indexed by their keys for accessing/setting

    >>myDictionary = {'first':1, 'second':2, 'third':3}
    >>myDictionary['first']
    1
    >>myDictionary['first'] = 25
    >>print myDictionary
    {'second': 2, 'third': 3, 'first': 25}      Note that they aren't in their original order

• For a dictionary of lists, you can index sequentially

    >>> myDictionary = {'first':[1,2,3], 'second':[10,20,30], 'third':[100,200,300]}
    >>> myDictionary['first'][2]
    3

**This also works for lists of lists, dictionaries of dictionaries, etc.
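A minimal Python 3 sketch of sequential indexing and of the rule that keys must be immutable. The dictionary contents are illustrative; the tuple key is an added example, not something from the slides.

```python
my_dictionary = {"first": [1, 2, 3], "second": [10, 20, 30]}

# Index by key first, then by position within the list.
third_of_first = my_dictionary["first"][2]

# Setting a value uses the same syntax as reading one.
my_dictionary["third"] = [100, 200, 300]

# Keys must be immutable: a list cannot be used as a key.
try:
    my_dictionary[[1, 2]] = "oops"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False

# A tuple is immutable, so it works as a key.
my_dictionary[(1, 2)] = "fine"

print(third_of_first, list_key_allowed, len(my_dictionary))
```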


When will I use unordered structures?

Sets
• Keep track of what and how many unique sequences are in a file (but not # of replicates)

    mySet = set()
    for sequence in file:
        mySet.add(sequence)
        #If sequence is already in mySet, nothing happens

• For two files, quickly find which sequences appear in both:

    inBoth = mySet1.intersection(mySet2)

Dictionaries
• Map codons to amino acids for mRNA translation

    myCodons = {"AUG":"M", "AAA":"K", . . . "UUU":"F"}

• Keep running counts of how often each unique sequence appears in a file
• You could even store lists of the line numbers where they appear:

    myDict[sequence].append(lineNumber)
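The counting and line-number ideas above could be sketched as follows in Python 3. The `lines` list is a hypothetical stand-in for a file with one sequence per line, and `setdefault` is one of several ways to create the list the first time a key is seen.

```python
# Hypothetical file contents: one sequence per line.
lines = ["ACGT", "GGCC", "ACGT", "TTAA", "ACGT", "GGCC"]

unique_sequences = set()
counts = {}          # sequence -> number of times seen
line_numbers = {}    # sequence -> list of line numbers where it appears

for line_number, sequence in enumerate(lines, start=1):
    unique_sequences.add(sequence)
    counts[sequence] = counts.get(sequence, 0) + 1
    # setdefault creates the empty list the first time a sequence is seen.
    line_numbers.setdefault(sequence, []).append(line_number)

print(len(unique_sequences), counts, line_numbers)
```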


What if I want something more complicated/specific?

• Some languages allow you to define structures (struct) which store some set of pre-specified variables

• Object-oriented languages let you define classes, which contain both data specific to the class and special functions (methods) that belong to that class


PART 2: CONTROL STRUCTURES


Flow control topics to cover

• Logic: and, or, not, order of priority, De Morgan’s law
• If / elif (or else if, or elsif) / else
• Loops
  – For
  – While (pre-test)
  – Do-while (post-test)
• Statements to affect flow control
  – Break
  – Continue
  – Pass
• Functions
• Try/Except
• Don’t use goto


Basic Logical Comparisons

• Mathematical comparisons
    <, >, <=, >=, ==, !=
    **Don’t confuse equality (==) and assignment (=)**
• Logical operations
  – x OR y (x || y): Inclusive or (x is true or y is true or both)
  – x AND y (x && y): x and y are both true
  – NOT x (!x): x is not true


Logical order of operations

• Not has higher precedence than And, which has higher precedence than Or
• Parentheses can be used to give a grouping priority (like in mathematical order of operations)

Examples
  x and y or y and z: Either both x and y are true OR both y and z are true (or they are all true); (x and y) or (y and z)
  not x and y or y and z: Either 1) x is false and y is true or 2) y and z are both true (or both); ((not x) and y) or (y and z)

De Morgan’s laws
  not (x or y) = not x and not y
  not (x and y) = not x or not y
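Both the precedence rule and the two De Morgan identities can be checked by brute force over every combination of truth values. This is a sketch in Python 3; `precedence_ok` and `de_morgan_ok` are illustrative names.

```python
from itertools import product

precedence_ok = True
de_morgan_ok = True

# Try every combination of truth values for x, y, and z.
for x, y, z in product([False, True], repeat=3):
    # not binds tighter than and, which binds tighter than or.
    if (not x and y or y and z) != (((not x) and y) or (y and z)):
        precedence_ok = False
    # De Morgan's laws.
    if (not (x or y)) != ((not x) and (not y)):
        de_morgan_ok = False
    if (not (x and y)) != ((not x) or (not y)):
        de_morgan_ok = False

print(precedence_ok, de_morgan_ok)
```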


If statements

if( x ):
    some code                  [red]
    more code                  [red]
else:
    some different code        [blue]
end

x can be a variable, comparison, or any combination of variables/comparisons with and/or/not

The truth value of x is evaluated

If x is true, the code in red (the if block) is executed, and the code in blue (the else block) is skipped

If x is false, the code in red is skipped, and the code in blue is executed


Else-if

You can use this structure to test multiple cases:

if( x ):
    some code                      [red]
    more code                      [red]
else if (y):
    some different code            [blue]
else if (z):
    still more code                [green]
else if (x and z):
    this code will never run       [purple]
else:
    code if all else fails         [orange]

• Code in red runs if x is true
  – All subsequent “else” and “else if” blocks are skipped; their conditions are never tested
• Code in blue runs if x is false and y is true
• Code in green runs if x and y are false and z is true
• Code in purple will never run, because that test is only reached when x and z are both false
• Code in orange runs if x, y, and z are false
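The chain above can be written in Python 3 with `elif`. In this sketch the hypothetical function `which_branch` returns the color label of the branch that runs, which makes it easy to confirm that the purple branch is unreachable.

```python
def which_branch(x, y, z):
    """Return the name of the branch that runs for the given truth values."""
    if x:
        return "red"
    elif y:
        return "blue"
    elif z:
        return "green"
    elif x and z:
        return "purple"   # unreachable: x and z are both false by this point
    else:
        return "orange"

print(which_branch(True, True, True))
print(which_branch(False, True, True))
print(which_branch(False, False, True))
print(which_branch(False, False, False))
```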


If statements don’t have to have an else

if (x):
    code                       [red]
if (x or z):
    code                       [blue]
else if (y):
    code                       [green]
code that always runs

• Code in red runs if x is true
• Code in blue runs if x is true OR z is true
  – Note this is independent of the first test
• Code in green runs if y is true and x and z are false
  – This else if belongs to the second if statement


Loops

• A loop is a section of code that will run repeatedly until some condition is met

• Types of loops you may see
  – Pre-test loops (“while” or “until”)
  – Post-test loops (“do … while” or “do … until”)
  – For loops


Pre-test loops

while(x):
    code
    more code
    code that affects x
code that runs once x is no longer true

• x is tested before the loop runs
  – If x is false to begin with, the loop never executes
  – Each time the loop completes, x is tested again to determine whether the loop will run again
• What about until(x)?
  – Same as while(not x)
• What if x never changes?
  – Infinite loop!


Post-test loops

do:
    code
    more code
    code that affects x
while(x)
code that runs once x is no longer true

• x is tested after the loop executes
• What difference does this make?
  – Both can accomplish the same thing
  – If x is false the first time you reach the loop, a while loop will not run at all, but a do-while loop will run at least once
• Pre-test loops are much more common
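Python has no built-in do-while, so the sketch below emulates a post-test loop with `while True` and a `break` at the end of the body. That emulation is an assumption of this example, not something shown on the slides. The two hypothetical functions count how many times each loop body runs.

```python
def pre_test(x):
    """while loop: the condition is checked before the first pass."""
    passes = 0
    while x > 0:
        passes += 1
        x -= 1
    return passes

def post_test(x):
    """Emulated do-while: the body runs once before the condition is checked."""
    passes = 0
    while True:
        passes += 1
        x -= 1
        if not (x > 0):
            break
    return passes

print(pre_test(3), post_test(3))   # same number of passes when the condition starts true
print(pre_test(0), post_test(0))   # they differ when the condition starts false
```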


For Loops

x = [1, 2, 3, 4, 5, 6, 7, 8]

For loop version:
    for value in x:
        print value

While loop version:
    index = 0
    while (index < len(x)):
        print x[index]
        index = index + 1

• For loops are essentially specialized while loops designed to loop through a data structure (lists, strings, etc.)

• Number of iterations generally known before the loop begins

• Much harder to write an infinite for loop (but still possible)
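To make the equivalence concrete, this Python 3 sketch collects the values visited by each version into a list (instead of printing them) and compares the results. The list names are illustrative.

```python
x = [1, 2, 3, 4, 5, 6, 7, 8]

# For loop version: visits each element directly.
for_output = []
for value in x:
    for_output.append(value)

# While loop version: manages the index by hand.
while_output = []
index = 0
while index < len(x):
    while_output.append(x[index])
    index = index + 1    # forgetting this line would create an infinite loop

print(for_output == while_output)
```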


Infinite loops

while(true):
    This code will run forever
    Forever and ever
    Over and over again
    Until you manually kill it (Control-C)
    Unless you have another way out
code that will never run


Break and Continue statements

Break
• Use break when you want to completely leave (‘break out’ of) a loop

    while (True):
        x = x + 1
        if (x > 6):
            break
        y = y / 2
        x = x + y
    code that runs once x > 6

Continue
• Use continue when you want to skip to the next iteration of a loop
• The loop condition is always tested before the next iteration begins

    while (x > 0):
        if (y > 5):
            x = x - 1
            continue
        x = x - 2    #This only executes if y <= 5
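A runnable Python 3 sketch of both statements. The input lists and the limit of 6 are made up for illustration; the `continue` half follows the same shape as the slide's example, with a different hypothetical value of y supplied on each pass.

```python
# break: leave the loop entirely as soon as a value exceeds the (assumed) limit.
values = [1, 3, 8, 2, 9]
before_limit = []
for v in values:
    if v > 6:
        break
    before_limit.append(v)

# continue: skip the rest of this pass and go back to the condition test.
x = 7
y_values = [6, 2, 9, 1]     # hypothetical y for each pass
steps = []
i = 0
while x > 0 and i < len(y_values):
    y = y_values[i]
    i += 1
    if y > 5:
        x = x - 1
        steps.append("minus 1")
        continue
    x = x - 2               # only runs when y <= 5
    steps.append("minus 2")

print(before_limit, x, steps)
```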


Functions

function addTheseNumbers(x, y, z):
    return x + y + z
function printMessage():
    print "This function",
    print "returns nothing."
function appendToList(list):
    list.append('more')

a = 2 #We actually start here!!
b = 7
c = 1000
myList = ['f', 'g', 'h']
z = addTheseNumbers(a, b, c)
print z
#This should print 1009
printMessage()
appendToList(myList)
#myList now contains ['f', 'g', 'h', 'more']

• A function
  – Often takes in values and returns (outputs) a different value (like a mathematical function)
  – May take no input and/or produce no output
• When the program reaches a function call (the lower part of the code), it gets its instructions from the function definition (the upper part)
• x, y, and z are the parameters of the function
• a, b, and c are the arguments
  – The contents of the arguments are assigned to the parameters
• The function ends when it reaches the return statement (or the end of its body, if it has none)
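The same three functions, written as a runnable Python 3 sketch (`def` instead of the pseudocode keyword `function`, and snake_case names chosen for this example). It also shows that a function with no return statement evaluates to `None`.

```python
def add_these_numbers(x, y, z):
    # x, y, and z are parameters; they receive the values of the arguments.
    return x + y + z

def print_message():
    # No parameters and no return statement, so the call evaluates to None.
    print("This function", "returns nothing.")

def append_to_list(items):
    # Lists are mutable, so the caller's list is changed in place.
    items.append("more")

a, b, c = 2, 7, 1000
my_list = ["f", "g", "h"]

total = add_these_numbers(a, b, c)   # a, b, and c are the arguments
result = print_message()
append_to_list(my_list)

print(total, result, my_list)
```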


Exception handling (sneak preview)

try:
    some code                  [red]
except:
    some different code        runs if & only if the red code throws an exception

• Example:

    x = 0
    y = 2
    try:
        z = y/x
    except:
        print "You can't do that!"
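A Python 3 sketch of the same example, wrapped in a hypothetical helper `safe_divide`. Unlike the bare `except:` on the slide, it catches only `ZeroDivisionError`, which is a design choice of this sketch so that unrelated errors are not hidden.

```python
def safe_divide(y, x):
    """Return y / x, or None if x is zero."""
    try:
        return y / x
    except ZeroDivisionError:
        # Catching a specific exception avoids hiding unrelated errors.
        print("You can't do that!")
        return None

print(safe_divide(2, 0))   # the except block runs
print(safe_divide(6, 3))   # no exception, so the except block is skipped
```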


What will be the output?

def translate(mySequence):
    #Assume this contains all the codons
    myCodons = {"AAA":"K", "AAC":"N", . . . "UUU":"F"}
    myProtein = ""
    start = mySequence.find("AUG")
    for i in range(start, len(mySequence)-2, 3):
        #HINT: range(start, stop, step) makes a list of numbers
        #from start to (stop - 1) in increments of step
        if myCodons[ mySequence[ i:i+3 ] ] == "STOP":
            break
        myProtein += myCodons[ mySequence[ i:i+3 ] ]
    return myProtein

transcript_1 = "AUGAUCCCUUUAUAGAG"
transcript_2 = "ACUUAUGCAUGATCAUUGACAAAAAA"
print "Transcript 1 translates to", translate(transcript_1)
print "Transcript 2 translates to", translate(transcript_2)

• What is the first thing this program does?
• What does the function do?
• What is myCodons and what is it used for?
• What is the purpose of the break statement?
• What happens if the sequence contains a stop codon? If it doesn’t contain a stop codon?
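The HINT about `range(start, stop, step)` can be explored separately from the exercise. The sketch below, in Python 3, uses a hypothetical sequence (not one of the transcripts above) and only shows how `find`, `range`, and slicing cut the sequence into codons; it does not translate anything.

```python
# Hypothetical sequence, not one of the transcripts from the slide.
sequence = "GGAUGAAAUUUUAA"

start = sequence.find("AUG")   # index of the first start codon (-1 if absent)

codons = []
# Stop at len(sequence) - 2 so every slice has a full three letters.
for i in range(start, len(sequence) - 2, 3):
    codons.append(sequence[i:i + 3])

print(start, codons)
```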