STIN2063 Machine Learning Chapter 1 (Part 2)
Introduction to Machine Learning
Dr. Mohd Shamrie Sainin School of Computing
College of Arts and Sciences Universiti Utara Malaysia
Chapter Objective
• At the end of this chapter, students must be able to:
  • Design a simple learning system.
  • Describe the basics of problem solving using Machine Learning techniques.
  • Discuss issues related to Machine Learning.
Chapter 1 Part 2 Outline
• Defining a learning system
• Training Experience
• Target Function
• Target Function Representation
• Training Procedure and Algorithm
• Architecture of a Learner
• Issues in Machine Learning
Defining a Learning System
• Learning Tasks:
  • Improve over task T, with respect to performance measure P, based on experience E.
• Examples:
  • T: Playing checkers
    P: Percent of games won against opponents
    E: Playing practice games against itself
  • T: Recognizing hand-written words
    P: Percent of words correctly classified
    E: Database of classified handwritten words
Design Choices for Learning to Play Checkers
• Determine Type of Training Experience: games against experts, games against self, or a table of correct moves
• Determine Target Function: Board → value, or Board → move
• Determine Representation of Learned Function: polynomial, linear function of six features, or artificial neural network
• Determine Learning Algorithm: gradient descent, or linear programming
• Completed Design
Designing a Learning System
• Steps:
  • Choosing the Training Experience
  • Choosing the Target Function
  • Choosing a Representation for the Target Function
  • Choosing a Function Approximation Algorithm
Defining the Learning System
• Learning Task
  • Task: Learning to classify/predict student grades
  • Performance: Percentage of correct classifications of the student grade
  • Experience: Previous data about students' grades
Step 1
• Step 1: Choosing the training experience
  • Student data normally consists of quizzes, assignments, projects, attendance and presentations.
  • The training experience should be taken from the items directly related to our future target.
  • Here we assume the training data must be direct training examples (supervised).
Step 2
• Step 2: Choosing the target function
  • The target function is the type of knowledge that will be learned.
  • Here, our target is to know the grade of the student, G(s).
  • Therefore we can define our target values:
    • G(s) ≥ 70, then HIGH
    • G(s) < 70, then LOW
  • We can simplify our target function as:
    • G: Mark → ℜ
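The thresholding above can be written as a one-line helper. This is a hedged sketch: the function name `grade_label` is illustrative, while the 70-mark cutoff comes from the slide.

```python
def grade_label(g_s):
    """Discretize the target function G(s) into the two target values.

    The 70-mark cutoff is the threshold given in the slide:
    G(s) >= 70 -> HIGH, G(s) < 70 -> LOW.
    """
    return "HIGH" if g_s >= 70 else "LOW"
```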
Step 3
• Step 3: Choosing the representation of the target function
  • The representation of the target function is the form the learning program will use to learn.
  • We have many options:
    • Represent G using rules
    • Represent G using boolean features
    • Etc.
  • Note: The more expressive the representation, the more training data the program will require to choose among the alternatives it can represent.
Step 3: Contd
• We choose a simple representation: for any given grade, the function G will be calculated as a discretized combination of the following features:
  • x1: Quiz 1 mark
  • x2: Quiz 2 mark
  • x3: Assignment 1 mark
  • x4: Assignment 2 mark
  • x5: Project mark
  • x6: Presentation mark
  • x7: Attendance
• Using this combination, our target function representation can be formulated as:
  • X → G(X), where G(X) ∈ {HIGH, LOW}
  • X is a combination of the features x1..x7
Step 3: Contd
• Thus, our learning program will represent G(s) as a discrete linear neural network function of the form:
  • G(s) = w0 + w1x1 + w2x2 + w3x3 + w4x4 + w5x5 + w6x6 + w7x7
• where the weights w0..w7 are chosen by the learning program
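As a small illustration of this representation (the helper name and the sample weights below are assumptions, not from the slides), the weighted sum can be computed as:

```python
def g_linear(x, w):
    """Linear representation G(s) = w0 + w1*x1 + ... + w7*x7.

    x: the seven feature values x1..x7 (quiz, assignment, project,
       presentation and attendance marks)
    w: the eight weights w0..w7, where w0 acts as a bias term
    """
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
```

With w0 = 0 and all other weights set to 1, G(s) reduces to the plain total of the marks; the learning program's job is to find better weights than this naive choice.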
Step 4
• Step 4: Choosing a Function Approximation Algorithm
  • In order to learn the target function G, we require a set of training examples, each describing a specific mark m and the training value Gtrain(m) for m.
  • Example representation of training examples:
    • <<x1=5, x2=5, x3=10, x4=10, x5=20, x6=10, x7=10>, HIGH>
    • <<x1=3, x2=2, x3=4, x4=2, x5=10, x6=4, x7=2>, LOW>
Step 4 Contd
• With this representation of the target function and the training data, we can use a function approximation algorithm: the Perceptron.
• We use a perceptron because it:
  • can solve linearly separable problems
  • uses a one-layer network with a binary step activation function
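The two bullet points above can be made concrete with a minimal sketch. This is an illustrative implementation of the standard perceptron formulation, not code from the slides; the function names and the learning rate are assumptions.

```python
# Minimal perceptron sketch: a one-layer network with a binary step
# activation, trained with the classic update rule
#   w <- w + lr * (target - output) * x

def step(z):
    """Binary step activation: fires 1 when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

def predict(w, x):
    """w[0] is the bias weight w0; x holds the feature values x1..x7."""
    return step(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))

def train(examples, n_features, lr=0.1, epochs=50):
    """Run the perceptron learning rule over the training examples."""
    w = [0.0] * (n_features + 1)
    for _ in range(epochs):
        for x, target in examples:
            error = target - predict(w, x)
            w[0] += lr * error                    # update the bias weight
            for i in range(n_features):
                w[i + 1] += lr * error * x[i]     # update the feature weights
    return w

# The two training examples from Step 4, with HIGH encoded as 1 and LOW as 0
examples = [
    ([5, 5, 10, 10, 20, 10, 10], 1),  # HIGH
    ([3, 2, 4, 2, 10, 4, 2], 0),      # LOW
]
w = train(examples, n_features=7)
```

Because the two example points are linearly separable, the perceptron convergence theorem guarantees this loop eventually stops making updates.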
Final Design
[Figure: Final design — the Experiment Generator supplies a new problem to the Performance System; the Performance System produces a solution trace for the Critic; the Critic produces training examples for the Generalizer; the Generalizer outputs hypothesis G back to the Experiment Generator.]
Final Design
• The final design of the grade learning system can be described by four distinct program modules:
  • The Performance System
    • The module that must solve the performance task. It takes new input (grades) and produces an output (a classification).
    • Performance is expected to improve over time.
  • The Critic
    • Takes as input the history trace of the data and produces a set of training examples.
  • The Generalizer
    • Takes as input the training examples and produces as output a hypothesis that estimates the target function.
  • The Experiment Generator
    • Takes as input the current hypothesis (learned function) and outputs a new problem.
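The four modules can be wired into a simple training loop. The function below is a hedged sketch: the slides only name each module's inputs and outputs, so the exact interfaces here (plain callables, a fixed number of cycles) are illustrative assumptions.

```python
# Sketch of the four-module loop from the final design. Each module is
# passed in as a callable; the data flow mirrors the figure:
# experiment generator -> performance system -> critic -> generalizer.

def run_learning_cycle(performance_system, critic, generalizer,
                       experiment_generator, hypothesis, n_cycles=3):
    """One pass per cycle: solve a problem, extract training examples,
    re-estimate the hypothesis, then propose the next problem."""
    problem = experiment_generator(hypothesis)
    history = []
    for _ in range(n_cycles):
        trace = performance_system(problem, hypothesis)  # solution trace
        examples = critic(trace)                         # training examples
        hypothesis = generalizer(examples)               # new hypothesis G
        problem = experiment_generator(hypothesis)       # new problem
        history.append(hypothesis)
    return hypothesis, history
```

Plugging in trivial stand-ins for the four modules is enough to exercise the loop; in the grade-learning system the generalizer would be the perceptron trainer and the hypothesis its weight vector.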
Summary
• Determine Type of Training Experience: list of marks, set of rules, …
• Determine Target Function: Marks → value, Marks → rules, …
• Determine Representation of Target Function: neural network, polynomial, …
• Determine Learning Algorithm: perceptron, backpropagation, …
• Completed design
Perspectives in ML
• “Learning as search in a space of possible hypotheses”
• Representations for hypotheses:
  • Linear functions
  • Logical descriptions
  • Decision trees
  • Neural networks
• Learning methods are characterized by their search strategies and by the underlying structure of the search spaces.
Issues in ML
• Algorithms
  • What generalization methods exist?
  • When (if ever) will they converge?
  • Which are best for which types of problems and representations?
• Amount of Training Data
  • How much is sufficient?
  • Confidence, data, and size of the hypothesis space
• Prior knowledge
  • When and how can it help?
  • Is it helpful even when approximate?
Issues in ML
• Choosing experiments
  • Are there good strategies?
  • How do choices affect complexity?
• Reducing problems
  • Learning task → function approximation
• Flexible representations
  • Automatic modification?
• Biological learning systems
  • Any clues there? E.g. ABC, ACO, GWO, etc.
• Noise
  • Influence on accuracy
Future of ML
• Current Directions
  • Feature Selection & Extraction
  • Biologically-inspired solutions (Genetic Algorithms)
  • Multiple models, hybrid models
  • ML & Intelligent Agents – distributed models
  • Web/Text/Multimedia Mining
  • ML in emerging data-intensive areas: Bioinformatics, Intrusion Detection
  • Philosophical and social aspects of ML
Future of ML
• Currently, most ML is on stationary flat tables
• Richer data sources
• text, links, web, images, multimedia, knowledge bases
• Advanced methods• Link mining, Stream mining, …
• Applications• Web, Bioinformatics, Customer modeling, …
Future of ML
• Technical
  • tera-bytes and peta-bytes – the data flood!
  • complex, multimedia, semi-structured data
  • integration with domain (expert) knowledge
• Business
  • finding good new application areas/tasks
• Societal
  • privacy/ethical issues – many issues still unsolved!
Reading
• Machine Learning for Science: State of the Art and Future Prospects
  • http://www-aig.jpl.nasa.gov/public/mls/papers/emj/emj-science-01.pdf
Summary
• Defining a Learning System
• Issues of ML
• Future of ML