The Application of Machine Learning and Artificial
Neural Networks in Geosciences
Lecture 1 - Python
University of Toronto, Department of Earth Sciences,
Hosein Shahnas
ESS391H1
Outline of the Course
Basics of Python
Perceptron Rule
Linear and Nonlinear Models
Support Vector Machine
Kernel Methods
Random Forest
Overfitting and Regularization
Validation
Neural Networks
Convolutional Networks
Recurrent Neural networks
Regression Learning
Multi Class Learning
Programming
Procedural Programming
A computer program written as a list of instructions that tell the computer what to
do, step by step and in a fixed order, follows the procedural programming (PP)
style.
Object-oriented Programming
In object-oriented programming (OOP), a program is built from objects that interact
with one another and change the data they hold, so that the program works the way
the user wants. Each object bundles data together with the methods that act on that
data. This design allows the code to be easily reused by other parts of the
program.
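The contrast between the two styles can be sketched with one small task done both ways. This is only an illustration: the names rock_density and RockSample, and the numbers used, are hypothetical and not part of the course material.

```python
# Procedural style: the data and the function acting on it are kept separate.
def rock_density(mass_kg, volume_m3):
    return mass_kg / volume_m3

procedural_result = rock_density(5400.0, 2.0)

# Object-oriented style: the data and the method acting on it live in one object.
class RockSample:
    def __init__(self, mass_kg, volume_m3):
        self.mass_kg = mass_kg
        self.volume_m3 = volume_m3

    def density(self):
        return self.mass_kg / self.volume_m3

sample = RockSample(5400.0, 2.0)
oop_result = sample.density()

print(procedural_result, oop_result)  # 2700.0 2700.0
```

Both versions compute the same number. The class version becomes useful when many samples, each with its own data, have to be passed around a larger program.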
Python
Python is an a) interpreted, b) object-oriented, c) high-level programming language
with d) dynamic semantics that can be used for both procedural programming and
object-oriented programming.
Interpreted: Python code is read and executed by another computer program, called
the interpreter. This differs from compiled programming languages, where the
source code must first be converted into machine-readable code.
Dynamic semantics: A variable is not bound to a fixed type. The same name can be
reassigned to values of different types while the program runs, unlike in a
statically typed language, where the type of each variable is fixed in advance.
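A short sketch of dynamic typing, using only built-in functions. The variable names are chosen for illustration.

```python
# The same name is bound to values of three different types in turn.
value = 42
seen_types = [type(value).__name__]

value = 'forty-two'
seen_types.append(type(value).__name__)

value = [4, 2]
seen_types.append(type(value).__name__)

print(seen_types)  # ['int', 'str', 'list']
```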
High-level: Programming languages that are closer to human languages and further
from machine language are called high-level programming languages.
Why Python
Python is Free
Python is one of today's most popular programming languages. It is developed
under an OSI-approved open-source license, making it free to install.
Python is Interpreted
Many languages are compiled, where the source code needs to be translated into
machine code, before it can be run. Programs written in an interpreted language are
passed straight to an interpreter to be run directly.
Python is Portable
Because Python code is interpreted, code written for one platform will generally
work on any other platform that has a Python interpreter installed.
Python is Simple
Python is relatively simple and uncluttered with clear syntax and readability, and the
developers have deliberately kept it that way.
Simple but Powerful
Despite its syntactic simplicity, Python supports most constructs (e.g., functions
and classes as objects) that would be expected in a very high-level language,
including structured, functional, and object-oriented programming.
In addition, a very extensive library of classes and functions provides
capability well beyond what is built into the language. Python is a powerful
programming language for machine learning (ML) and artificial neural networks
(ANN), with a rich suite of libraries.
Variables
Variables are containers for storing data values.
Variable Names
a) A variable name must start with a letter or the underscore character.
b) A variable name cannot start with a number.
c) A variable name can only contain alphanumeric characters and underscores
(A-Z, a-z, 0-9, and _).
d) Variable names are case-sensitive.
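The naming rules above can be checked with the built-in str.isidentifier() method and the standard keyword module. The candidate names are made up for illustration. Note that isidentifier() also accepts non-ASCII letters, so it is slightly more permissive than rule c), and that reserved words such as class are rejected separately.

```python
import keyword

# Hypothetical candidate names, some valid and some not.
candidates = ['depth', '_depth', 'depth_2', '2depth', 'depth-2', 'class']

# A usable variable name must be an identifier and must not be a reserved word.
validity = {name: name.isidentifier() and not keyword.iskeyword(name)
            for name in candidates}

for name, ok in validity.items():
    print('{!r:12} -> {}'.format(name, ok))

# Names are case-sensitive: Depth and depth are two different variables.
Depth = 1
depth = 2
print(Depth != depth)  # True
```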
Standard Data Types
a) Numbers
b) String
c) List
d) Tuple
e) Dictionary
Python Variables
Python Numbers
Number data types store numeric values.
Ex.
x = 1
y = 10.3
Python Strings
A string in Python is a contiguous sequence of characters enclosed in quotation
marks.
Ex.
x = 'Hello'
Python Lists
A list is a collection which is ordered and changeable. In Python lists are written with
square brackets.
Ex.
my_list = ['apple', 'banana', 'cherry']
my_list = [1.2, 4, 66]
Python Tuple
A tuple is a collection which is ordered and unchangeable. In Python tuples are
written with round brackets.
Ex.
my_tuple = ('apple', 'banana', 'cherry')
my_tuple = (1.2, 4, 66)
Lists                          Tuples
Lists are mutable              Tuples are immutable
Iteration is slower            Iteration is comparatively faster
Lists consume more memory      Tuples consume less memory than lists
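The mutability difference can be seen directly. The sketch below reuses the numeric examples from this slide. The sizes printed by sys.getsizeof are implementation details of the interpreter and vary between Python versions, so no specific numbers are claimed here.

```python
import sys

my_list = [1.2, 4, 66]
my_tuple = (1.2, 4, 66)

# A list element can be replaced in place.
my_list[0] = 9.9

# The same operation on a tuple raises TypeError.
tuple_error = None
try:
    my_tuple[0] = 9.9
except TypeError as err:
    tuple_error = err
print('tuple assignment failed:', tuple_error)

# Container sizes in bytes (interpreter-dependent).
print(sys.getsizeof(my_list), sys.getsizeof(my_tuple))
```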
Dictionary
A dictionary is a changeable collection of key-value pairs, indexed by key.
Dictionaries keep insertion order in Python 3.7 and later; earlier versions treated
them as unordered. In Python dictionaries are written with curly brackets.
Ex.
my_dict = {
    'brand': 'Ford',
    'model': 'Mustang',
    'year': 1964
}
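A few common dictionary operations, sketched on the same example. The keys 'color' and 'owner' are added here for illustration only.

```python
my_dict = {'brand': 'Ford', 'model': 'Mustang', 'year': 1964}

# Look up a value by its key.
print(my_dict['model'])  # Mustang

# Change an existing value and add a new key-value pair.
my_dict['year'] = 2020
my_dict['color'] = 'red'

# get() returns a default instead of raising KeyError for a missing key.
owner = my_dict.get('owner', 'unknown')
print(owner)  # unknown

# Loop over all key-value pairs.
for key, value in my_dict.items():
    print(key, value)
```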
Basic Operations
Operation    Result
x + y        sum of x and y
x - y        difference of x and y
x * y        product of x and y
x / y        quotient of x and y
x // y       (floored) quotient of x and y
x % y        remainder of x / y
x ** y       x to the power y
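A worked example of the operations in the table, with x = 7 and y = 2 chosen for illustration. The last lines show that // rounds toward negative infinity, so the result for negative numbers may differ from simple truncation.

```python
x, y = 7, 2

print(x + y)    # 9
print(x - y)    # 5
print(x * y)    # 14
print(x / y)    # 3.5
print(x // y)   # 3
print(x % y)    # 1
print(x ** y)   # 49

# Floored division with a negative operand.
print(-7 // 2)  # -4
print(-7 % 2)   # 1

# divmod returns the floored quotient and the remainder together.
quotient, remainder = divmod(x, y)
print(quotient, remainder)  # 3 1
```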
New line
print ('First line Second line')
Output
First line Second line
print ('First line \n Second line')
Output
First line
 Second line
print ('First line \nSecond line')
Output
First line
Second line
Format in Python
Numbers
Integers:
X = 1234
Old style              New style                  Output of new style (quoted to show padding)
("%d " % (X))          ('{:d}'.format(X))         '1234'
("%10d " % (X))        ('{:10d}'.format(X))       '      1234'
("%+7d " % (X))        ('{:+7d}'.format(X))       '  +1234'
Floats:
X = 1234
Old style              New style                  Output of new style (quoted to show padding)
("%f " % (X))          ('{:f}'.format(X))         '1234.000000'
("%12f " % (X))        ('{:12f}'.format(X))       ' 1234.000000'
("%10.3f " % (X))      ('{:10.3f}'.format(X))     '  1234.000'
("%+10.3f " % (X))     ('{:+10.3f}'.format(X))    ' +1234.000'
Exponential:
X = 5684
Old style              New style                  Output of new style (quoted to show padding)
("%e " % (X))          ('{:e}'.format(X))         '5.684000e+03'
("%10.2e " % (X))      ('{:10.2e}'.format(X))     '  5.68e+03'
Strings
X = ['one', 'two']
Old style              New style                            Output of new style
("%s " % (X))          ('{}'.format(X))                     ['one', 'two']
("%s " % (X[0]))       ('{}'.format(X[0]))                  one
                       ('{1}'.format('one', 'two'))         two
                       ('{1} {0}'.format('one', 'two'))     two one
X = 'characters'
Old style              New style                  Output of new style
("%s " % (X))          ('{}'.format(X))           characters
("%.5s " % (X))        ('{:.5}'.format(X))        chara
See more examples in format.py.
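The same format specifications also work inside f-strings, which are available in Python 3.6 and later. The sketch below is an addition to the slides, not part of format.py. It checks that the three styles give the same result for one of the float examples above.

```python
X = 1234

# Old style, str.format, and f-string with the same specification (+10.3f).
old_style = '%+10.3f' % X
new_style = '{:+10.3f}'.format(X)
f_string = f'{X:+10.3f}'

# repr() keeps the quotes so that the leading padding is visible.
print(repr(old_style))  # ' +1234.000'
print(repr(new_style))  # ' +1234.000'
print(repr(f_string))   # ' +1234.000'

# String truncation works the same way.
word = 'characters'
print(f'{word:.5}')     # chara
```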
Declaring Python Variables
a = 10
b = 'hello'
print ('a is an int type variable because it has an int value in it. a = ', a)
print ('b is a string type variable as it has a string value in it. b = ', b)
Output
a is an int type variable because it has an int value in it. a =  10
b is a string type variable as it has a string value in it. b =  hello
Re-declaring Python Variable
a = 1
print ('a = ', a)
a = 'hello'
print ('a = ', a)
Output
a =  1
a =  hello
Variables in Python
Local Variables
a = 100
print('a outside function = ', a)
def my_function():
    a = 'hello'  # a new local variable; the outer a is not changed
    print('a inside function = ', a)
my_function()
print('a outside function = ', a)
Output
a outside function =  100
a inside function =  hello
a outside function =  100
Global Variables
a = 100
print('a outside function = ', a)
def my_function():
    global a  # refer to the module-level a instead of creating a local one
    print('a inside function = ', a)
    a = 'hello'
    print('a inside function = ', a)
my_function()
print('a outside function = ', a)
Output
a outside function =  100
a inside function =  100
a inside function =  hello
a outside function =  hello
Deleting Variables
a = 10
print('a before del = ', a)
del a
print('a after del = ', a)  # raises NameError because a no longer exists
Output
a before del =  10
Err . . . .
Concatenating Python Variables
a = 'hello'
b = 100
c = a + str(b)  # str() is needed: a string and an int cannot be added directly
print('a + b = ', c)
Output
a + b =  hello100
Assigning Multiple Values
a, b, c = 15, 32.2, 'Hello'
print('a, b , c = ', a, b, c)
Output
a, b , c =  15 32.2 Hello
Assigning the Same Value
a = b = c = 'same'
print('a, b , c = ', a, b, c)
Output
a, b , c =  same same same
https://intellipaat.com/blog/tutorial/python-tutorial/python-variables/#_Creating_and_Declaring
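Two consequences of these assignment forms are worth a short sketch. The examples are additions to the slide, with variable names chosen for illustration. Multiple assignment allows two variables to be swapped in one line, and chained assignment binds all names to the same object, which matters when that object is mutable.

```python
# Swap two values without a temporary variable.
a, b = 1, 2
a, b = b, a
print(a, b)  # 2 1

# Chained assignment: x and y refer to one and the same list.
x = y = []
x.append(1)
print(y)        # [1]
print(x is y)   # True

# Separate lists need separate list expressions.
p, q = [], []
p.append(1)
print(q)        # []
```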
Boolean Values
Python can evaluate a comparison and return one of two answers, True or False. When
you compare two values, the expression is evaluated and Python returns the Boolean answer.
In arithmetic, True and False behave as the integers 1 and 0:
x = 10 > 9
y = 1 == 9
z = 10 < 9
q = 1 == True
p = 1 == False
a = True + 4
b = False + 10
print('x = ', x)
print('y = ', y)
print('z = ', z)
print('q = ', q)
print('p = ', p)
print('a = ', a)
print('b = ', b)
Output
x =  True
y =  False
z =  False
q =  True
p =  False
a =  5
b =  10
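The results for a and b follow from the fact that bool is a subclass of int in Python. A short sketch of this, and of how other values convert to Booleans, is given below. The list name flags is chosen for illustration.

```python
# bool is a subclass of int, so True counts as 1 and False as 0.
print(isinstance(True, int))  # True
print(True + True)            # 2

# Summing a list of comparisons counts how many of them are true.
flags = [3 > 1, 2 == 2, 5 < 4]
n_true = sum(flags)
print(n_true)                 # 2

# Zero and empty containers convert to False; other values convert to True.
print(bool(0), bool(''), bool([]))      # False False False
print(bool(-1), bool('no'), bool([0]))  # True True True
```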
Some Useful References
Learning from Data, Y. S. Abu-Mostafa
Python Machine Learning, S. Raschka, PACKT, Open Source
Building Machine Learning Systems with Python, W. Richert & L. P. Coelho, PACKT, Open Source
Mastering Machine Learning with scikit-learn, G. Hackeling, PACKT, Open Source
Look at the other relevant books published by PACKT
Open-source Python Data Science Platform
The open source Anaconda Distribution is the fastest and easiest way to do Python and R
data science and machine learning on Linux, Windows, and Mac OS X. It's the industry
standard for developing, testing, and training on a single machine.
https://anaconda.org/
https://www.anaconda.com/download/
Applications and Packages for Practical Work
Open-source high-level programming language
Python is an interpreted, object-oriented, high-level programming language. Since there
is no compilation step, the edit-test-debug cycle is incredibly fast.
https://www.python.org/
https://www.python.org/downloads/
https://docs.python.org/3.6/index.html (Documentation)
Open-source software library
Canopy is tailor-made for the workflows of scientists and engineers, combining a
streamlined integrated analysis environment with over 450 proven scientific and analytic
Python packages from the trusted Enthought Python Distribution. Canopy provides a
complete, self-contained installer that gets you up and running with Python and a library
of scientific and analytic tools, fast.
https://www.enthought.com/product/canopy/
Open-source software library
TensorFlow™ is an open source software library for high performance numerical
computation. Its flexible architecture allows easy deployment of computation across a
variety of platforms (CPUs, GPUs, TPUs), and from desktops to clusters of servers to
mobile and edge devices. Originally developed by researchers and engineers from the
Google Brain team within Google’s AI organization, it comes with strong support for
machine learning and deep learning and the flexible numerical computation core is used
across many other scientific domains.
https://www.tensorflow.org/
https://www.tensorflow.org/install/
conda create --name tensorflow python=3.5
activate tensorflow
conda install jupyter
conda install scipy
pip install tensorflow-gpu
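After running the commands above, one way to check which packages the active environment can see is the standard importlib module. This is a minimal sketch and not an official installation test. The helper name is_installed is hypothetical, and the list of package names is taken from the commands on this slide.

```python
import importlib.util

def is_installed(package_name):
    """Return True if the top-level package can be found by the interpreter."""
    return importlib.util.find_spec(package_name) is not None

# Report the status of each package without importing it.
for name in ['scipy', 'tensorflow', 'jupyter']:
    status = 'installed' if is_installed(name) else 'missing'
    print(name, status)
```

The check only looks the package up and does not import it, so it is fast and does not fail if a package is absent.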
Open-source software library
https://anaconda.org/conda-forge/keras
conda install -c conda-forge keras
Open-source web application
The Jupyter Notebook is an open-source web application that allows you to create and
share documents that contain live code, equations, visualizations and narrative text. Uses
include: data cleaning and transformation, numerical simulation, statistical modeling,
data visualization, machine learning, and much more.
https://jupyter.org/
http://jupyter.org/install