Machine Learning - Wichita State University
(sinha/teaching/fall14/cs697ab/slide/intro.pdf)
TRANSCRIPT
Introduction
✤ Instructor: Asst. Prof. Kaushik Sinha
✤ 2 lectures per week MW 12:30-1:45 pm
✤ Office Hours MW 11:30-12:30 Jabara Hall 243
Study Groups (2-3 people)
✤ This course will cover non-trivial material; learning in a group makes it easier and more fun!
✤ Study groups are recommended (but not required).
Prerequisites
✤ Three pillars of ML:
✤ Statistics / Probability
✤ Linear Algebra
✤ Multivariate Calculus
✤ You should be confident in at least one of the three, ideally two.
Homework
✤ You may discuss homework with your peers, but your submitted answers must be your own!
✤ Make an honest attempt at every question.
Exams
✤ Exams will be (to some degree) based on homework assignments
✤ Best preparation: Make sure you really really understand the homework assignments
✤ 2 exams: midterm + final
✤ Together worth 40% of your grade.
Final Project
✤ 40% of your grade.
✤ 4 page writeup. Joint effort of 2-3 people.
✤ Come up with your own ideas.
✤ Application: Some interesting application of machine learning
✤ In-depth study: Reproduce results from a high-acclaimed paper
✤ Research: Incorporate ML into a research project
✤ Extra credit is given for working systems (e.g. an iPhone or web app)
✤ Details will be posted on course website later
What is Machine Learning?
✤ Formally (Mitchell, 1997): A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
✤ Informally: Algorithms that improve on some task with experience.
When should we use ML?
✤ Not ML problems: Traveling Salesman, 3-SAT, etc.
✤ ML Problems: Hard to formalize, but human expert can provide examples / feedback.
✤ Computer needs to learn from feedback.
✤ Is there a sign of cancer in this fMRI scan?
✤ What will the Dow Jones be tomorrow?
✤ Teach a robot to ride a unicycle.
Sometimes easy for humans, hard for computers
✤ Even one-year-old children can identify gender pretty reliably
✤ Easy to come up with examples.
✤ But impossible to formalize as a CS problem.
✤ You need machine learning!
Male or Female?
Example
Problem: Given an image of a handwritten digit, which digit is it?
✤ The traditional approach: write a "clever algorithm" that maps the input (an image) to the output (e.g. "2").
✤ Problem: you have absolutely no idea how to do this!
✤ Good news: you have examples, labeled images of each digit 0-9.
The Machine Learning Approach:
✤ Training: feed the labeled examples into a machine learning algorithm, which produces a learned algorithm.
✤ Testing: the learned algorithm maps a new input image to the output digit (e.g. "2").
Handwritten Digit Recognition
✤ (1990-1995) Pretty much solved in the mid-nineties (LeCun et al.)
✤ Convolutional Neural Networks
✤ Now used by USPS for ZIP codes, by ATMs for automatic check cashing, etc.
TD-Gammon (1994)
✤ Gerald Tesauro (IBM) teaches a neural network to play backgammon. The net plays 100K+ games against itself and beats the world champion. [Neural Computation, 1994]
✤ The algorithm teaches itself to play that well!
Deep Blue (1997)
✤ IBM’s Deep Blue wins against Kasparov in chess. A crucial winning move is attributed to machine learning (G. Tesauro).
Watson (2011)
✤ IBM’s Watson wins the game show Jeopardy! against former winners Brad Rutter and Ken Jennings.
✤ Extensive machine learning techniques were used.
Face Detection (2001)
✤ Viola and Jones "solve" face detection
✤ Previously a very hard problem in computer vision
✤ Now a commodity feature in off-the-shelf cellphones and cameras
Grand Challenge (2005)
✤ DARPA Grand Challenge: the vehicle must drive 150 miles autonomously through the desert along a difficult route.
✤ The 2004 DARPA Grand Challenge was a huge disappointment; the best team made 11.78 of 150 miles.
✤ The 2005 DARPA Grand Challenge 2 was completed by several ML-powered teams.
Speech, Netflix, ...
✤ iPhone ships with built-in speech recognition
✤ Google mobile search is speech-based (very reliable)
✤ Automatic translation
✤ ....
ML is the engine for many fields...
✤ Computer Vision
✤ Robotics
✤ Computational Biology
✤ Natural Language Processing
Internet companies
✤ Collecting massive amounts of data
✤ Hoping that some smart Machine Learning person makes money out of it.
✤ Your future job!
Example: Webmail
✤ Spam filtering: Given an email, predict whether it is spam or not.
✤ Ad matching: Given user info, predict which ad will be clicked on.
Example: Web Search
✤ Ad matching: Given a query, predict which ad will be clicked on.
✤ Web-search ranking: Given a query, predict which document will be clicked on.
Example: Google News
✤ Document clustering: Given news articles, automatically identify topics and sort the articles by topic.
When will it stop?
✤ The human brain is one big learning machine
✤ We know that we can still do a lot better!
✤ However, it is hard. Very few people can design new ML algorithms.
✤ But many people can use them!
What types of ML are there?
✤ Supervised learning: Given labeled examples, learn to make the right prediction for an unlabeled example. (e.g. given annotated images, learn to detect faces)
✤ Unsupervised learning: Given unlabeled data, try to discover similar patterns, structure, or low-dimensional representations. (e.g. automatically cluster news articles by topic)
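To make the supervised setting concrete, here is a minimal sketch of the simplest possible supervised learner, a 1-nearest-neighbor classifier. The data and labels are hypothetical, not from the slides:

```python
# Toy labeled data (hypothetical, for illustration): (feature, label) pairs.
train = [(1.0, "cat"), (1.2, "cat"), (4.8, "dog"), (5.1, "dog")]

def predict_1nn(x, examples):
    """Supervised learning at its simplest (1-nearest-neighbor):
    predict the label of the closest labeled example."""
    nearest = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest[1]

print(predict_1nn(1.1, train))  # nearest training examples are labeled "cat"
```

The learner never sees a rule for "cat" vs. "dog"; it generalizes purely from the labeled examples, which is exactly the supervised setup described above.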
As far as this course is concerned:
Basic Setup
✤ Pre-processing: Clean up the data. Boring but necessary.
✤ Feature Extraction: Use expert knowledge to get a representation of the data.
✤ Learning: The focus of this course.
✤ (Post-processing): Whatever you do when you are done.
Feature Extraction
✤ Represent real-world data in terms of vectors: Real World → Data → Vector Space.
✤ Features are statistics that describe the data; each dimension of the vector space is one feature.
✤ Features are statistics that describe the data
✤ Feature: width/height
✤ Pretty good for "1" vs. "2"
✤ Not so good for "2" vs. "3"
✤ Feature: raw pixels (a 16x16 image becomes a 256x1 vector)
✤ Works for digits (to some degree)
✤ Does not work for trickier stuff
(Illustration: handwritten digits)
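The width/height feature can be sketched in a few lines. This is a hypothetical illustration (the helper name and the toy binary images are assumptions, not from the slides): a skinny "1" has a much smaller width/height ratio than a wide "2":

```python
def aspect_feature(img):
    """Width/height of the bounding box of the "on" pixels in a
    binary image (a list of rows of 0/1). Hypothetical helper."""
    rows = [r for r, row in enumerate(img) if any(row)]
    cols = [c for row in img for c, v in enumerate(row) if v]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return width / height

# A skinny "1"-like stroke vs. a wide "2"-like shape (crude sketches):
one = [[0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 0]]
two = [[1, 1, 1], [0, 0, 1], [1, 1, 1], [1, 0, 0]]
print(aspect_feature(one), aspect_feature(two))  # the "1" scores lower
```

The same single number cannot separate a "2" from a "3", though, since both fill a similarly wide box, which is why richer features (e.g. raw pixels) are needed.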
Bag of Words for Images
✤ Extract interest points and represent the image as a bag of interest points.
✤ Given a dictionary of possible interest points, the image becomes a sparse count vector, e.g. [0, 1, 0, 0, 0, 3, 0, 0, 0, 0].
Text (Bag of Words)
✤ Take a dictionary with n words (in, into, ..., is, ...). Represent a text document as an n-dimensional vector, where the i-th dimension contains the number of times word i appears in the document.
✤ e.g. a document might map to the sparse count vector [0, 1, 0, 0, 0, 2, 0, 0, 0, 0].
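The bag-of-words mapping is short enough to sketch directly. The tiny dictionary below is a hypothetical stand-in for a real n-word dictionary:

```python
from collections import Counter

# Hypothetical tiny dictionary; a real one would have thousands of words.
dictionary = ["in", "into", "is", "learning", "machine"]

def bag_of_words(text, vocab):
    """Map a document to an n-dimensional count vector:
    entry i = number of times vocab[i] occurs in the document."""
    counts = Counter(text.lower().split())
    return [counts[w] for w in vocab]

print(bag_of_words("Machine learning is fun and machine learning is useful",
                   dictionary))  # -> [0, 0, 2, 2, 2]
```

Words outside the dictionary are simply dropped, and most dictionary words never appear in any one document, which is why these vectors come out sparse.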
Audio? Movies?
✤ Audio: use a sliding window and the Fast Fourier Transform.
✤ Movies: treat them as a sequence of images.
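The sliding-window idea for audio can be sketched as follows. This is an assumption-laden toy (a naive DFT stands in for the FFT, and the window/hop sizes are arbitrary); in practice one would use `numpy.fft` on real audio:

```python
import cmath
import math

def dft_mag(frame):
    """Magnitude spectrum of one window (naive DFT; use an FFT in practice)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) for k in range(n)]

def spectrogram(signal, win=8, hop=4):
    """Slide a window over the signal and transform each window,
    turning raw audio into a sequence of spectral feature vectors."""
    return [dft_mag(signal[i:i + win])
            for i in range(0, len(signal) - win + 1, hop)]

# A pure tone: each frame's energy concentrates in the tone's frequency
# bin (and its mirror bin), so the feature vectors are very informative.
tone = [math.sin(2 * math.pi * 2 * t / 8) for t in range(32)]
frames = spectrogram(tone)
print(len(frames), frames[0])
```

Each window becomes one feature vector, so a variable-length audio clip turns into a sequence of fixed-length vectors, the same trick the slide suggests for movies via frames.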
Feature Space
✤ Everything that can be stored on a computer can be stored as a vector
✤ Representation is critical for successful learning. [Not in this course, though.]
✤ Throughout this course we will assume data is just points in a Feature Space
✤ Important distinction: dense (every feature is present) vs. sparse (most features are zero)
Mini-Quiz
✤ T/F: Every traditional CS problem is also an ML problem. FALSE
✤ T/F: Image features are always dense. FALSE
✤ T/F: The feature space can be very high-dimensional. TRUE
✤ T/F: Bag of words features are sparse. TRUE