The Application of Machine Learning and Artificial Neural Networks in Geosciences
Lecture 1 - Python
University of Toronto, Department of Earth Sciences
Hosein Shahnas
ESS391H1


Post on 20-Jun-2020




Outline of the Course

Basics of Python

Perceptron Rule

Linear and Nonlinear Models

Support Vector Machine

Kernel Methods

Random Forest

Overfitting and Regularization

Validation

Neural Networks

Convolutional Networks

Recurrent Neural networks

Regression Learning

Multi Class Learning


Programming

Procedural Programming

In procedural programming (PP), a computer program is simply a list of instructions that tells the computer to do certain things in a certain order.

Object-oriented Programming

In object-oriented programming (OOP), computer programs are built from objects that talk to one another and change the data held in those objects in order to do what the user wants. Each object bundles data with the methods that operate on it. This design allows the code to be easily reused by other parts of the program.
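As a rough illustration of the two styles described above, the sketch below computes a mean first with a plain function and then with a small class. The names `mean_of` and `Sample` and the numbers are hypothetical, chosen only for this example.

```python
# Procedural style: the data and the function that acts on it are separate.
def mean_of(values):
    return sum(values) / len(values)

# Object-oriented style: the object bundles the data with its methods.
class Sample:
    def __init__(self, values):
        self.values = list(values)   # keep a private copy of the data

    def add(self, value):
        # Change the data held by this object.
        self.values.append(value)

    def mean(self):
        return sum(self.values) / len(self.values)

depths = [2.0, 4.0, 6.0]             # hypothetical measurements
procedural_result = mean_of(depths)

sample = Sample(depths)
sample.add(8.0)
oop_result = sample.mean()

print(procedural_result, oop_result)
```

The class version can be reused wherever a running collection of values is needed, without passing the list around explicitly.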


Programming

Python

Python is an a) interpreted, b) object-oriented, c) high-level programming language with d) dynamic semantics. It can be used for both procedural and object-oriented programming.

Interpreted: Python code is read and executed by another program, called the interpreter. This differs from compiled languages, where the source code must first be converted into machine code before it can run.

Dynamic semantics: A variable is not tied to a declared type. The same name can be assigned values of different types while the program runs, and types are checked at run time, unlike in a statically typed language.

High-level: Programming languages that are closer to human languages and further from machine languages are called high-level programming languages.


Programming

[Figure omitted; image source: https://www.quora.com]


Why Python

Python is Free

Python is one of today's most popular programming languages. It is developed under an OSI-approved open-source license, which makes it free to install and use.

Python is Interpreted

Many languages are compiled, where the source code needs to be translated into

machine code, before it can be run. Programs written in an interpreted language are

passed straight to an interpreter to be run directly.

Python is Portable

Because Python code is interpreted, code written on one platform will generally work on any other platform that has a compatible Python interpreter installed.

Python is Simple

Python is relatively simple and uncluttered with clear syntax and readability, and the

developers have deliberately kept it that way.

Python


Simple but Powerful

Despite its syntactic simplicity, Python supports most constructs (e.g., functions, classes as objects) that would be expected in a very high-level language, and it supports structured, functional, and object-oriented programming. In addition, a very extensive library of classes and functions provides capabilities well beyond what is built into the language. Python is a powerful programming language for machine learning (ML) and artificial neural networks (ANN), with a rich suite of libraries.

Python


Variables

Variables are containers for storing data values.

Variable Names

a) A variable name must start with a letter or the underscore character.

b) A variable name cannot start with a number.

c) A variable name can only contain alphanumeric characters and underscores (A-Z, a-z, 0-9, and _).

d) Variable names are case-sensitive.
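The naming rules above can be checked programmatically. The sketch below uses the built-in `str.isidentifier()` together with the standard `keyword` module. The helper `is_valid_name` and the candidate names are hypothetical. Two points go beyond the slide: reserved words such as `for` satisfy the character rules but still cannot be used as names, and Python 3 also accepts many non-ASCII letters.

```python
import keyword

def is_valid_name(name):
    # str.isidentifier() checks the character rules; keywords such as
    # "for" pass that check but are reserved, so they are excluded here.
    return name.isidentifier() and not keyword.iskeyword(name)

candidates = ["depth", "_depth", "depth_2", "2depth", "my-depth", "for"]
results = {name: is_valid_name(name) for name in candidates}
for name, ok in results.items():
    print(name, ok)

# Case sensitivity: these are two different variables.
depth = 1
Depth = 2
```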

Standard Data Types

a) Numbers

b) String

c) List

d) Tuple

e) Dictionary

Python Variables


Python Numbers

Number data types store numeric values.

Ex.

x = 1

y = 10.3

Python Strings

A string in Python is a contiguous sequence of characters enclosed in quotation marks (single or double).

Ex.

x = 'Hello'

Standard Data Types


Python Lists

A list is a collection which is ordered and changeable. In Python lists are written with

square brackets.

Ex.

my_list = ['apple', 'banana', 'cherry']

my_list = [1.2, 4, 66]

Python Tuple

A tuple is a collection which is ordered and unchangeable. In Python tuples are

written with round brackets.

Ex.

my_tuple = ('apple', 'banana', 'cherry')

my_tuple = (1.2, 4, 66)

Standard Data Types

Lists | Tuples
Lists are mutable | Tuples are immutable
Iteration is comparatively slower | Iteration is comparatively faster
Lists consume more memory | Tuples consume less memory than lists
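The mutability and memory rows of the comparison above can be checked with a short sketch. It assumes the standard CPython interpreter, where `sys.getsizeof` reports the size of the container object itself in bytes. The exact numbers vary between Python versions, so none are quoted here.

```python
import sys

my_list = [1.2, 4, 66]
my_tuple = (1.2, 4, 66)

my_list[0] = 9.9            # lists are mutable: this works

try:
    my_tuple[0] = 9.9       # tuples are immutable: this raises TypeError
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Size of the container objects themselves, in bytes (CPython-specific).
list_size = sys.getsizeof(my_list)
tuple_size = sys.getsizeof(my_tuple)
print(tuple_changed, list_size, tuple_size)
```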


Dictionary

A dictionary is a changeable collection of key-value pairs, indexed by key. Since Python 3.7 a dictionary keeps its items in insertion order; in earlier versions it was unordered. In Python, dictionaries are written with curly brackets.

Ex.

my_dict = {

'brand': 'Ford',

'model': 'Mustang',

'year': 1964

}
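A brief sketch of how the dictionary above is typically read and changed. The keys `'color'` and `'owner'` are hypothetical additions used only for illustration.

```python
my_dict = {'brand': 'Ford', 'model': 'Mustang', 'year': 1964}

model = my_dict['model']            # look up a value by its key
my_dict['year'] = 2020              # change an existing value
my_dict['color'] = 'red'            # add a new key-value pair
missing = my_dict.get('owner')      # .get() returns None for a missing key

# Iterate over the key-value pairs.
for key, value in my_dict.items():
    print(key, value)
```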

Basic Operations

Standard Data Types

Operation | Result
x + y | sum of x and y
x - y | difference of x and y
x * y | product of x and y
x / y | quotient of x and y
x // y | (floored) quotient of x and y
x % y | remainder of x / y
x ** y | x to the power y
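The operations in the table can be tried directly. The sketch below uses the arbitrary values x = 7 and y = 2, which are not from the slides, and also shows how floor division behaves for a negative operand.

```python
x, y = 7, 2

results = {
    'x + y': x + y,
    'x - y': x - y,
    'x * y': x * y,
    'x / y': x / y,      # true division always returns a float
    'x // y': x // y,    # floor division rounds toward negative infinity
    'x % y': x % y,
    'x ** y': x ** y,
}
for expression, value in results.items():
    print(expression, '=', value)

negative_floor = -7 // 2   # floors to -4, not -3
```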


New line

print('First line Second line')

Output

First line Second line

print('First line \n Second line')

Output

First line
 Second line

(The second output line starts with a space, because a space follows \n in the string.)

print('First line \nSecond line')

Output

First line
Second line

Format in Python


Format in Python

Numbers

Integers:

X = 1234

Old style | New style | Output
'%d' % X | '{:d}'.format(X) | '1234'
'%10d' % X | '{:10d}'.format(X) | '      1234' (width 10, right-aligned)
'%+7d' % X | '{:+7d}'.format(X) | '  +1234' (width 7, with sign)


Format in Python

Numbers

Floats:

X = 1234

Old style | New style | Output
'%f' % X | '{:f}'.format(X) | '1234.000000'
'%12f' % X | '{:12f}'.format(X) | ' 1234.000000' (width 12)
'%10.3f' % X | '{:10.3f}'.format(X) | '  1234.000' (width 10, 3 decimals)
'%+10.3f' % X | '{:+10.3f}'.format(X) | ' +1234.000' (width 10, 3 decimals, with sign)


Format in Python

Numbers

Exponential:

X = 5684

Old style | New style | Output
'%e' % X | '{:e}'.format(X) | '5.684000e+03'
'%10.2e' % X | '{:10.2e}'.format(X) | '  5.68e+03' (width 10, 2 decimals)
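One way to confirm that the old `%` style and the new `str.format` style give the same result is to compare them pair by pair. The sketch below does this for a selection of the number formats from the preceding slides. The sign-and-width float case is written as `{:+10.3f}`. Printing with `repr` makes the padding spaces visible.

```python
X = 1234

pairs = [
    ('%d' % X,        '{:d}'.format(X)),
    ('%10d' % X,      '{:10d}'.format(X)),
    ('%+7d' % X,      '{:+7d}'.format(X)),
    ('%10.3f' % X,    '{:10.3f}'.format(X)),
    ('%+10.3f' % X,   '{:+10.3f}'.format(X)),
    ('%10.2e' % 5684, '{:10.2e}'.format(5684)),
]

# True only if every old-style result equals its new-style counterpart.
all_match = all(old == new for old, new in pairs)

for old, new in pairs:
    print(repr(old), repr(new))
print('all match:', all_match)
```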


Format in Python

Strings

X = ['one', 'two']

Old style | New style | Output
'%s' % X | '{}'.format(X) | ['one', 'two']
'%s' % X[0] | '{}'.format(X[0]) | one
(none) | '{1}'.format('one', 'two') | two
(none) | '{1} {0}'.format('one', 'two') | two one


Format in Python

Strings

X = 'characters'

Old style | New style | Output
'%s' % X | '{}'.format(X) | characters
'%.5s' % X | '{:.5}'.format(X) | chara

See more examples in format.py.
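The string cases can be collected into one short runnable sketch. The variable names on the left are hypothetical. Note that a positional index such as `{1}` selects an argument of `format`, while a precision such as `.5` truncates the string.

```python
X = ['one', 'two']
word = 'characters'

whole_list = '{}'.format(X)                  # formats the list itself
first_item = '{}'.format(X[0])               # formats one element
by_index = '{1} {0}'.format('one', 'two')    # indices reorder the arguments
truncated = '{:.5}'.format(word)             # precision truncates a string
old_truncated = '%.5s' % word                # old-style equivalent

print(whole_list, first_item, by_index, truncated)
```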


Declaring Python Variables

a = 10
b = 'hello'

print ('a is an int type variable because it has an int value in it. a = ', a)

print ('b is a string type variable as it has a string value in it. b = ', b)

Output

a is an int type variable because it has an int value in it. a = 10

b is a string type variable as it has a string value in it. b = hello

Re-declaring Python Variable

a = 1

print ('a = ', a)

a = 'hello'

print ('a = ', a)

Output

a = 1

a = hello

Variables in Python


Local Variables

a = 100
print('a outside function = ', a)

def my_function():
    a = 'hello'
    print('a inside function = ', a)

my_function()
print('a outside function = ', a)

Output

a outside function = 100

a inside function = hello

a outside function = 100

Variables in Python

Global Variables

a = 100
print('a outside function = ', a)

def my_function():
    global a
    print('a inside function = ', a)
    a = 'hello'
    print('a inside function = ', a)

my_function()
print('a outside function = ', a)

Output

a outside function = 100

a inside function = 100

a inside function = hello

a outside function = hello


Deleting Variables

a = 10
print('a before del = ', a)
del a
print('a after del = ', a)

Output

a before del = 10
NameError: name 'a' is not defined
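The error raised after `del` can be caught rather than left to stop the program. The following is a minimal sketch using `try`/`except`. The flag `error_caught` is a hypothetical name introduced for illustration.

```python
a = 10
del a                      # the name 'a' no longer exists

error_caught = False
try:
    print('a after del = ', a)
except NameError as err:
    # Reaching this branch confirms that the name was removed.
    error_caught = True
    print('Caught:', err)
```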

Concatenating Python Variables

a = 'hello'
b = 100
c = a + str(b)

print('a + b = ', c)

Output

a + b = hello100

Variables in Python

Assigning Multiple Values

a, b, c = 15, 32.2, 'Hello'
print('a, b , c = ', a, b, c)

Output

a, b , c = 15 32.2 Hello

Assigning the Same Value

a = b = c = 'same'

print('a, b , c = ', a, b, c)

Output

a, b , c = same same same

https://intellipaat.com/blog/tutorial/python-tutorial/python-variables/#_Creating_and_Declaring


Boolean Values

A comparison in Python evaluates to one of two Boolean values, True or False. When you compare two values, Python evaluates the expression and returns the Boolean result. Because True and False behave like the integers 1 and 0, they can also be used in arithmetic:

Variables in Python

x = 10 > 9

y = 1 == 9

z = 10 < 9

q = 1 == True

p = 1 == False

a = True + 4

b = False + 10

print('x = ', x)

print('y = ', y)

print('z = ', z)

print('q = ', q)

print('p = ', p)

print('a = ', a)

print('b = ', b)

Output

x = True

y = False

z = False

q = True

p = False

a = 5

b = 10
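The results for `q`, `p`, `a` and `b` above follow from the fact that `bool` is a subclass of `int` in Python. The sketch below checks this and shows one common use, counting how many values satisfy a condition. The list `magnitudes` holds hypothetical values.

```python
is_subclass = issubclass(bool, int)   # bool is a subclass of int
as_ints = (int(True), int(False))     # True is 1, False is 0

# Summing Booleans counts how many comparisons are True.
magnitudes = [4.1, 5.6, 6.3, 3.9]     # hypothetical values
count_above_5 = sum(m > 5 for m in magnitudes)

print(is_subclass, as_ints, count_above_5)
```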


Some Useful References

Learning from Data, Y. S. Abu-Mostafa
Python Machine Learning, S. Raschka, PACKT, Open Source
Building Machine Learning Systems with Python, W. Richert & L. P. Coelho, PACKT, Open Source
Mastering Machine Learning with scikit-learn, G. Hackeling, PACKT, Open Source

Look at the other relevant books published by PACKT.


Open-source Python Data Science Platform

The open source Anaconda Distribution is the fastest and easiest way to do Python and R

data science and machine learning on Linux, Windows, and Mac OS X. It's the industry

standard for developing, testing, and training on a single machine.

https://anaconda.org/

https://www.anaconda.com/download/

Applications and Packages for Practical Work


Open-source software library

Canopy is tailor-made for the workflows of scientists and engineers, combining a streamlined integrated analysis environment with over 450 proven scientific and analytic Python packages from the trusted Enthought Python Distribution. Canopy provides a complete, self-contained installer that gets you up and running quickly with Python and a library of scientific and analytic tools.

https://www.enthought.com/product/canopy/

Applications and Packages for Practical Work


Open-source software library

TensorFlow™ is an open source software library for high performance numerical

computation. Its flexible architecture allows easy deployment of computation across a

variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to

mobile and edge devices. Originally developed by researchers and engineers from the

Google Brain team within Google’s AI organization, it comes with strong support for

machine learning and deep learning and the flexible numerical computation core is used

across many other scientific domains.

https://www.tensorflow.org/

https://www.tensorflow.org/install/

conda create --name tensorflow python=3.5

activate tensorflow

conda install jupyter

conda install scipy

pip install tensorflow-gpu

Applications and Packages for Practical Work


Open-source software library


https://anaconda.org/conda-forge/keras

conda install -c conda-forge keras

Applications and Packages for Practical Work


Open-source web application

The Jupyter Notebook is an open-source web application that allows you to create and

share documents that contain live code, equations, visualizations and narrative text. Uses

include: data cleaning and transformation, numerical simulation, statistical modeling,

data visualization, machine learning, and much more.

https://jupyter.org/

http://jupyter.org/install

Applications and Packages for Practical Work