
Lec1: Introduction to Pattern Recognition 1

Introduction to

Pattern Recognition

Prof. Daniel Yeung

School of Computer Science and Engineering

South China University of Technology

Lecture 1: Pattern Recognition

Lec1: Introduction to Pattern Recognition 2

A Cyber Security Example

Lec1: Introduction to Pattern Recognition 3

Cyberinfrastructure Vulnerability

According to the 2012 Norton Study:

• Global cybercrime annual cost: US$114 billion
• More than one million victims a day
• Time lost due to cybercrime: an additional US$274 billion
• Cybercrime costs the world significantly more than the global black market in marijuana, cocaine and heroin combined (US$288 billion)
• 69% of online adults have been a victim of cybercrime at least once
• Every second, 14 adults fall victim to cybercrime
• The number of cybercrime incidents doubled in 3 years

“Norton Study Calculates Cost of Global Cybercrime: US$114 Billion Annually”, Symantec Corp, 2012
http://www.symantec.com/about/news/release/article.jsp?prid=20110907_02

Lec1: Introduction to Pattern Recognition 4

Cyberinfrastructure Vulnerability

• The average cost of cybercrime in the five countries covered by the ‘2012 Cost of Cybercrime Study’
• Source: HP/Ponemon Institute

“Cybercrime Attacks Double in Three Years”, Computer Fraud & Security, 2012

Lec1: Introduction to Pattern Recognition 5

Cyberinfrastructure Vulnerability

• In May 2011, more than 77 million Sony PlayStation accounts were hacked
• 24 million customers’ information at an Amazon fashion e-retailer was hacked
• Q4 2009 to Q4 2010: 8% increase in DDoS attacks against e-commerce companies
• Q4 2010 to Q4 2011: attacks up by 153%
• 94 billion spam messages sent daily, costing society US$20 billion
• Global Payments reported a credit card data leakage in 2012, and its share price dropped 9% immediately
• Heartland Payment Systems paid more than US$110 million to Visa, MasterCard, American Express and other companies to settle claims

“Sony warns of almost 25 million extra user detail theft”
http://www.bbc.co.uk/news/technology-13256817

Lec1: Introduction to Pattern Recognition 6

Cyberinfrastructure Vulnerability

• Steganography made news headlines
• The U.S. charged 11 individuals with acting as unlawful agents of a foreign nation by using steganography to embed messages in more than 100 image files posted on public websites
• Rumor has it that the 9/11 terrorists used stego media to communicate their attack plan

http://www.nij.gov/nij/topics/forensics/evidence/digital/analysis/steganography.htm
http://www.justice.gov/opa/pr/2010/June/10-nsd-753.html

Lec1: Introduction to Pattern Recognition 7

Challenge of Cybersecurity

• Misuse Detection
  – Stealing information, unauthorized control
• Anomaly Detection
  – Denial-of-service attacks
• Scan Detection
  – Scanning for weak points in the cyber-infrastructure
• Network Profiling
  – Mimicking normal network flow patterns
• Steganography
  – Hiding information in JPEG files on websites

“Data Mining and Machine Learning in Cybersecurity”, CRC Press, 2011

Lec1: Introduction to Pattern Recognition 8

Challenge of Cybersecurity

• Machine Learning for Cybersecurity
  – Misuse Detection
    ◦ Classify abnormal behavior or command sequences
  – Anomaly Detection
    ◦ Classify abnormal bursts of flow and increases in flow traffic
  – Scan Detection
    ◦ Classify incoming requests as malicious or not
  – Network Profiling
    ◦ Learn the patterns of network flow to facilitate misuse and anomaly detection, understand network patterns, and discover abnormal flows
  – Steganography
    ◦ Classify whether a JPEG image contains a hidden message or not

Lec1: Introduction to Pattern Recognition 9

Cybersecurity Solutions

• Policy-driven approach
• Simulation approach
• Data mining approach
• Machine learning approach
• Hybrid approach

Lec1: Introduction to Pattern Recognition 10

Challenge of Cybersecurity

• Big data
  – 2.5 quintillion (2.5×10^18) bytes of data created daily in 2012
  – The 3-V phenomenon: volume (data size), velocity (speed of data in and out), and variety (images, maps, Facebook, video, email, websites)
• Imbalance between attack and normal patterns
  – Among billions of TCP packets, only a few may be malicious
  – Attacks and abnormal behaviors are a minority compared with normal usage and packets
• Fast response
  – Required by the characteristics of the Internet and local area networks
• Robustness
  – With only training samples, it is unreasonable to expect correct decisions for all possible future attacks, which may be very different from the training samples
  – Robustness refers to the ability to correctly classify patterns similar to the training samples

Lec1: Introduction to Pattern Recognition 11

Steganography

• Steganography hides information in carriers such as images, audio or video files so that no one, except the sender and the recipient, suspects the existence of the message

[Figure: original image + secret messages → stego image]

Lec1: Introduction to Pattern Recognition 12

Steganography for Secured Communication

“Send a lovely cat to Mary”

Lec1: Introduction to Pattern Recognition 13

Steganography for Secured Communication

Network Admin: “Looks good and nothing suspicious”

Lec1: Introduction to Pattern Recognition 14

Steganography for Secured Communication

“Wow! What a lovely cat!”

Lec1: Introduction to Pattern Recognition 15

How to hide the cat? Simple LSB

• Every pixel is represented by three values in [0, 255] (RGB)
• The secret image’s pixels, e.g. 12,20,30; 12,21,30; 15,21,31; … 250,230,248
• In binary, the first pixel 12,20,30 is: 00001100 00010100 00011110

Lec1: Introduction to Pattern Recognition 16

How to hide the cat? Simple LSB

• We do the same conversion for the cover image
• Every pixel is represented by three values in [0, 255] (RGB)
• The cover image’s pixels, e.g. 245,232,230; 245,233,230; 245,233,231; … 240,222,218
• In binary, the first two pixels are: 11110101 11101000 11100110 11110101 11101001 11100110

Lec1: Introduction to Pattern Recognition 17

How to hide the cat? Simple LSB

• Secret bits: 00001100 00010100 00011110
• Cover bytes: 11110101 11101000 11100110 11110101 11101001 11100110
• Replace the least-significant bit of each cover byte with the next secret bit (0 0 0 0 1 1):
• Stego bytes: 11110100 11101000 11100110 11110100 11101001 11100111

Lec1: Introduction to Pattern Recognition 18

How to hide the cat? Simple LSB

• Every pixel is still represented by three values in [0, 255] (RGB)
• The stego image’s pixels, e.g. 244,232,230; 244,233,231; 245,233,231; … 240,222,218, differ from the cover by at most the least-significant bit
• In binary: 11110100 11101000 11100110 11110100 11101001 11100111
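The bit-replacement step above can be sketched in a few lines of Python. This is a toy illustration of simple LSB embedding, not the exact routine from the lecture; the byte and bit values are taken from the slides’ example:

```python
def embed_lsb(cover, bits):
    """Replace the least-significant bit of each cover byte with a message bit."""
    if len(bits) > len(cover):
        raise ValueError("cover too small for message")
    stego = list(cover)
    for i, b in enumerate(bits):
        stego[i] = (stego[i] & ~1) | b   # clear the LSB, then set it to the message bit
    return stego

cover = [245, 232, 230, 245, 233, 230]   # two cover pixels from the slide
bits = [0, 0, 0, 0, 1, 1]                # first six secret bits
print(embed_lsb(cover, bits))            # [244, 232, 230, 244, 233, 231]
```

Each value changes by at most 1, which is why the stego image is visually indistinguishable from the cover.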

Lec1: Introduction to Pattern Recognition 19

Steganography & Steganalysis

Raw image → quantization → JPEG image → steganography → stego JPEG → Internet → steganalysis (feature extraction → classification) → result (stego or not)

Lec1: Introduction to Pattern Recognition 20

Steganalysis

• Train a classifier to decide whether an image contains a hidden message or not
• The basic idea is to identify changes in the statistics of the transition probabilities of different DCT coefficients in the JPEG file after compression
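As a toy illustration of this idea, the sketch below estimates transition probabilities between neighboring quantized DCT coefficients, clipped to a small range as Markov-feature steganalysis methods commonly do. The function name and the clipping threshold are illustrative, not from the lecture:

```python
from collections import Counter

def transition_probs(coeffs, T=2):
    """Toy Markov feature: P(next = j | current = i) for adjacent quantized
    DCT coefficients, with values clipped to [-T, T]."""
    clip = lambda v: max(-T, min(T, v))
    # Count (current, next) pairs over horizontally adjacent coefficients.
    pairs = Counter((clip(a), clip(b)) for a, b in zip(coeffs, coeffs[1:]))
    totals = Counter()
    for (i, _), n in pairs.items():
        totals[i] += n
    # Normalize each row of the transition matrix.
    return {(i, j): n / totals[i] for (i, j), n in pairs.items()}

probs = transition_probs([0, 0, 1, -1, 0, 2, 3, 0, 0])
```

LSB embedding slightly perturbs these statistics, so a classifier trained on such features can separate cover images from stego images.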

Lec1: Introduction to Pattern Recognition 21

Robust Steganalysis

• Why are current ML methods not effective?
  – The performance of current steganalysis methods drops significantly when the training and testing images are different and/or compressed with different quantization tables
  – Differences in quantization tables are unavoidable owing to the large variety of digital cameras and editing software
  – Differences between training and testing images are natural, since no training set covers all possibilities
• So, how to design a robust steganalysis method is a key issue for discovering stego images

Lec1: Introduction to Pattern Recognition 22

Examples of standard and non-standard quantization tables at quality factor 75

• JPEG compression uses a Quantization Table (QT)
• There are 100 standard JPEG Quantization Tables (QTs)
• A QT is an 8×8 integer matrix

75s (standard):
 8  6  5  8 12 20 26 31
 6  6  7 10 13 29 30 28
 7  7  8 12 20 29 35 28
 7  9 11 15 26 44 40 31
 9 11 19 28 34 55 52 39
12 18 28 32 41 52 57 46
25 32 39 44 52 61 60 51
36 46 48 49 56 50 52 50

75ns (non-standard):
 8  6  5  8 12 19 26 31
 6  5  7  9 13 29 29 27
 6  7  7 11 19 28 35 28
 6  9 11 15 25 43 40 30
 9 10 18 27 34 54 51 39
11 18 27 31 40 51 57 45
24 31 39 43 51 61 59 50
36 46 48 49 56 49 51 50

75ns (non-standard):
 8  5  4  8 11 20 25 30
 5  6  7  9 13 28 29 27
 6  6  8 11 20 29 35 27
 7  9 11 14 26 44 40 30
 8 11 19 28 34 54 52 39
12 17 27 31 41 51 56 45
25 31 38 44 52 60 60 51
36 46 48 48 55 49 52 49
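The standard table can be reproduced from the IJG (libjpeg) base luminance table and its quality-factor scaling rule; a minimal sketch, assuming the base table and rounding convention of the IJG reference implementation:

```python
# Standard IJG luminance quantization table (the quality-50 baseline).
BASE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

def scale_qt(quality):
    """Scale the base table to a given quality factor (IJG convention)."""
    s = 5000 // quality if quality < 50 else 200 - 2 * quality
    return [[max(1, (v * s + 50) // 100) for v in row] for row in BASE]

print(scale_qt(75)[0])  # [8, 6, 5, 8, 12, 20, 26, 31], the 75s table's first row
```

Non-standard tables (75ns) deviate slightly from this formula, which is exactly what makes steganalysis features shift between cameras.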

Lec1: Introduction to Pattern Recognition 23

Difference in steganalysis features of the SAME IMAGE compressed by different QTs of different digital cameras

Sony DSC-H9 uses dynamic QTs

Robust to Quantization Table Changes

Lec1: Introduction to Pattern Recognition 24

Training Image

Very Similar Images

Similar Images

Dissimilar Images

Totally Different Images

Testing Images

Lec1: Introduction to Pattern Recognition 25

Robust to Different Testing Images

[Figure: detection accuracies of Chen’s method and LGEM on one training image and on increasingly dissimilar testing images; Δx denotes the Manhattan distance]

Lec1: Introduction to Pattern Recognition 26

The Case of

Salmon and Sea Bass

Lec1: Introduction to Pattern Recognition 27

Pattern Recognition – An Example: Salmon / Sea Bass

• Real-life example
  – A fish-packing plant wants to automate the process of sorting incoming fish (salmon / sea bass) on a belt according to species

[Figure: fish on a conveyor belt: sea bass or salmon?]

Lec1: Introduction to Pattern Recognition 28

PR: Salmon / Sea Bass – Process

• Steps in the sorting process:

Object → Sensing (camera) → Image → Preprocessing (isolate fish, reduce noise, …) → Refined image → Feature Extraction (take measurements) → Input features → Classification → Output class (salmon / sea bass)

Lec1: Introduction to Pattern Recognition 29

PR: Salmon / Sea Bass – Process

• Sensing
  – Digitize the object into a format that machines can handle
• Preprocessing
  – Refine the data
  – What can cause problems during sensing?
    ◦ E.g. lighting conditions, position of the fish on the conveyor belt, camera noise, etc.

Lec1: Introduction to Pattern Recognition 30

PR: Salmon / Sea Bass – Process

• Feature Extraction
  – What kind of information can distinguish one species of fish from the other?
    ◦ E.g. length, width, weight, number and shape of fins, tail shape, etc.
  – Experts may help
• Classification
  – Many classification techniques (classifiers) are available
  – Discussed in detail later

Lec1: Introduction to Pattern Recognition 31

PR: Salmon / Sea Bass – Example of Feature Extraction

• Fisherman (the expert): a salmon is usually shorter than a sea bass
• Length is chosen (as a feature) as the decision criterion
• But what is the decision threshold?

Lec1: Introduction to Pattern Recognition 32

PR: Salmon / Sea Bass – Length as Feature

[Figure: histograms of the length feature for the two types of fish in the training samples, with threshold l*]

• 15 is selected as the threshold
• Although sea bass are longer than salmon in general, there are many exceptions
• The experts may be wrong!
• How about other features? E.g. lightness

Lec1: Introduction to Pattern Recognition 33

PR: Salmon / Sea Bass – Lightness as Feature

[Figure: histograms of the lightness feature for the two types of fish in the training samples]

• 5.5 is selected as the threshold
• Using “lightness” as a feature is much better than using “length”
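A one-feature threshold classifier like the one described can be sketched in a couple of lines. Which species falls on which side of the 5.5 threshold is an assumption here (the histograms suggest sea bass tend to be lighter); the input values are illustrative:

```python
def classify_by_lightness(lightness, threshold=5.5):
    """Single-feature threshold rule from the slides.
    Assumption: sea bass tend to have higher lightness than salmon."""
    return "sea bass" if lightness > threshold else "salmon"

print(classify_by_lightness(7.1))  # sea bass
print(classify_by_lightness(3.2))  # salmon
```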

Lec1: Introduction to Pattern Recognition 34

PR: Salmon / Sea Bass – Cost Consideration

• Besides accuracy, the “costs of different errors” should also be considered
• For example:
  – Case 1: Company’s view
    ◦ Salmon is more expensive than sea bass; selling salmon at the price of sea bass is a loss
    ◦ “A salmon is classified as sea bass” → HIGH cost
    ◦ “A sea bass is classified as salmon” → LOW cost
  – Case 2: Customer’s view
    ◦ Customers who buy salmon will be very upset if they get sea bass; customers who buy sea bass will not be upset if they get the more expensive salmon
    ◦ “A salmon is classified as sea bass” → LOW cost
    ◦ “A sea bass is classified as salmon” → HIGH cost

Lec1: Introduction to Pattern Recognition 35

PR: Salmon / Sea Bass – Cost Consideration

• How would these cost considerations affect our decision?

[Figure: lightness histograms with the threshold shifted. Case 1: the threshold moves so that more sea bass are mistaken as salmon. Case 2: the threshold moves so that more salmon are mistaken as sea bass.]

Lec1: Introduction to Pattern Recognition 36

PR: Salmon / Sea Bass – Multiple Features

• If using only ONE feature is not good enough, more features can be used
• Two features:
  – Lightness: x1
  – Width: x2
• A fish is represented by a point in a two-dimensional feature space: x = (x1, x2)

Lec1: Introduction to Pattern Recognition 37

PR: Salmon / Sea Bass – Simple Classifier

[Figure: the two features (lightness and width) for sea bass and salmon in the training samples, with a straight-line decision boundary and an unseen fish marked “?”]

• A decision boundary can be drawn to divide the feature space into two regions (salmon / sea bass)
• Is it (a linear classifier) too simple?
• What is this unseen fish?

Lec1: Introduction to Pattern Recognition 38

PR: Salmon / Sea Bass – Complex Classifier

• Will other classifiers be better than a linear classifier (a straight line)?
• A more complex classifier:
  – It classifies the training samples perfectly
  – However, the ultimate objective is to classify unseen fish correctly
  – Can it generalize to unseen samples?
• What is this unseen fish?

Lec1: Introduction to Pattern Recognition 39

PR: Salmon / Sea Bass – Appropriate Classifier

• Simple classifier
  – Performance on the training samples is not good
• Complex classifier
  – Cannot generalize to unseen samples
• Tradeoff between training-sample accuracy and complexity
• An appropriate classifier:
  – Looks more reasonable
  – Is not too complex
  – Is good at classifying the training samples
• What is this unseen fish?

Lec1: Introduction to Pattern Recognition 40

Pattern Recognition Systems

Classifying phase:
Environment → Sensing → Preprocessing → Feature Extraction → Classification → Action / Decision

Learning phase:
Training Samples → Learning → Model (the model is used by the Classification step)

Lec1: Introduction to Pattern Recognition 41

Pattern Recognition Systems

• Sensing
  – Use of a transducer, e.g. a camera or a microphone
  – Depends on:
    ◦ Bandwidth
    ◦ Resolution
    ◦ Sensitivity
    ◦ Distortion of the transducer
    ◦ Cost

Lec1: Introduction to Pattern Recognition 42

Pattern Recognition Systems

• Pre-processing (segmentation)
  – Patterns should be well separated and should not overlap, e.g.:
    ◦ Fish recognition: fish are often abutting or overlapping; the system must determine where one fish ends and the next begins
    ◦ Speech recognition: clear boundaries between two consecutive words (difficult for speech, because we only receive a sequence of waveforms)

Lec1: Introduction to Pattern Recognition 43

Pattern Recognition Systems

• Feature Extraction
  – The choice of features is vital to the success of a pattern recognition system
  – Problem- and domain-dependent; requires domain knowledge
  – Criteria: the recognition result should be
    ◦ Invariant to translation (the location of the fish on the conveyor belt is irrelevant)

Lec1: Introduction to Pattern Recognition 44

Pattern Recognition Systems

    ◦ Invariant to rotation (the rotation of the fish is irrelevant)
    ◦ Invariant to scale (the size of the fish is irrelevant; this is why the length of the fish is not a good feature)
    ◦ Invariant to occlusion (parts of the object hidden by other parts are irrelevant; the eye of the fish may not be captured by the camera when the fish is rotated; face recognition with and without sunglasses)

Lec1: Introduction to Pattern Recognition 45

Pattern Recognition Systems

    ◦ Invariant to projection distortion (distortion caused by camera angle or distance is irrelevant)
    ◦ Invariant to rate (in speech recognition, the duration of the word is irrelevant; different people speak at different speeds)
    ◦ Invariant to deformation (particularly significant for handwritten character recognition; different people write the same word differently, and even the same person does at different times)

Lec1: Introduction to Pattern Recognition 46

Pattern Recognition Systems

• Example: fish recognition
  – The features should be invariant to rotation
  – However, for character recognition, a good feature should NOT be invariant to rotation
• Example: classification of horses vs. dogs
  – Number of legs is not a good feature
  – Body color is not a good feature
  – Height may be a good feature
• Example: classification of dog breeds
  – Body color is important
  – Height may or may not be a good feature

Lec1: Introduction to Pattern Recognition 47

Pattern Recognition Systems

• Classification
  – Use the feature vector provided by the feature extractor to assign an object to a category
  – Two factors decide the degree of difficulty of the classification:
    ◦ Variability of feature values within the same category (small within-class variation is preferred)
    ◦ Variability of feature values across different categories (large inter-class variation is preferred)

Lec1: Introduction to Pattern Recognition 48

Pattern Recognition Systems

• Action / Decision (post-processing)
  – Exploit context (input-dependent information other than the target pattern itself) to improve performance
  – After classification, we can perform actions (with associated costs) based on the classification result

Lec1: Introduction to Pattern Recognition 49

Pattern Recognition Systems

• We may also measure the performance of classification:
  – Error rate
    ◦ Percentage of patterns being incorrectly classified
  – Risk
    ◦ Different misclassifications may carry different penalties
    ◦ Salmon is more expensive than sea bass, so there is a higher penalty for misclassifying salmon as sea bass
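These two measures can be sketched directly. The cost values below are illustrative, chosen so that misclassifying salmon as sea bass carries the higher penalty:

```python
def error_rate(y_true, y_pred):
    """Fraction of patterns misclassified."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)

def risk(y_true, y_pred, cost):
    """Average penalty, where cost[(true, predicted)] gives the loss."""
    return sum(cost[(t, p)] for t, p in zip(y_true, y_pred)) / len(y_true)

# Illustrative cost matrix: misclassifying salmon as sea bass costs more.
cost = {("salmon", "salmon"): 0, ("salmon", "sea bass"): 5,
        ("sea bass", "salmon"): 1, ("sea bass", "sea bass"): 0}
y_true = ["salmon", "salmon", "sea bass", "sea bass"]
y_pred = ["salmon", "sea bass", "sea bass", "salmon"]
print(error_rate(y_true, y_pred))  # 0.5
print(risk(y_true, y_pred, cost))  # 1.5
```

Note how the two misclassifications contribute equally to the error rate but very differently to the risk.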

Lec1: Introduction to Pattern Recognition 50

Pattern Recognition Design Cycle

Start → Data Collection → Feature Selection → Model Selection → Training → Evaluation → End

Lec1: Introduction to Pattern Recognition 51

Pattern Recognition Design Cycle

• Data Collection
  – Collect samples from the real environment
  – Separate the samples into two mutually exclusive sets:
    ◦ Training samples (for training)
    ◦ Testing samples (for evaluation)

Lec1: Introduction to Pattern Recognition 52

Pattern Recognition Design Cycle

• How do we decide whether a sample set is adequately large and representative?
  – The more the better?
    ◦ Maybe, depending on the quality of the samples collected; time and cost can be constraints, e.g. medical data are very expensive to collect, and stock market data are time-dependent
    ◦ Not really; sometimes too much data can be confusing, e.g. Internet traffic data
  – Only representative samples are useful
    ◦ Try to collect samples randomly, without bias

Lec1: Introduction to Pattern Recognition 53

Pattern Recognition Design Cycle

• Feature Selection
  – Depends on the characteristics of the problem domain
  – Prior information, e.g. experts’ advice
  – Computational cost and feasibility: features should be simple to extract

Lec1: Introduction to Pattern Recognition 54

Pattern Recognition Design Cycle

  – Discriminative
    ◦ Similar values for patterns in the same class
    ◦ Different values for patterns in different classes
  – Invariant to transformation, e.g. translation, rotation and scale
  – Robust to noise, e.g. occlusion, distortion, deformation, and variations in the environment

Lec1: Introduction to Pattern Recognition 55

Pattern Recognition Design Cycle

• Model Selection
  – Many different models, e.g. neural networks and decision trees (discussed in detail later)
  – Domain-dependent:
    ◦ Two-class problem?
    ◦ Many features?
    ◦ Scattered data?
  – How close is it to the true model?
    ◦ Classification performance
    ◦ Complexity

Lec1: Introduction to Pattern Recognition 56

Pattern Recognition Design Cycle

• Training
  – “Knowledge” is learned from the training samples; the parameters of the classifier are determined
  – Only samples in the training set are used
  – Supervised learning: a teacher provides a class label for each pattern in the training set
  – Unsupervised learning: no teacher is available, so no class labels are provided; input patterns are grouped “naturally”

Lec1: Introduction to Pattern Recognition 57

Pattern Recognition Design Cycle

• Evaluation
  – Can the trained model generalize the knowledge from the training samples to future unseen samples?
  – Ultimate objective: performance on unseen samples (generalization ability)
    ◦ Cannot be calculated directly

Lec1: Introduction to Pattern Recognition 58

Pattern Recognition Design Cycle

• Measurable criteria
  – Performance on the training samples (training accuracy)
    ◦ Perfect performance on the training samples → over-fitting problem (slide 38)
  – Performance on the testing samples (testing accuracy)
    ◦ A better evaluation criterion than training accuracy, since the testing samples are not involved in the training process
    ◦ Good performance on the testing samples may still be very bad on unseen samples (no guarantee!)
    ◦ Repeating the experiments may help

Lec1: Introduction to Pattern Recognition 59

Comparing Classifiers

• For a classification problem, we are given:
  – Dataset D
  – Classifiers A and B
• How can we measure which classifier, A or B, is better for D?

Lec1: Introduction to Pattern Recognition 60

Comparing Classifiers

• Method
  – Randomly separate D into a training set and a testing set
  – Use the training set to train A and B
  – Use the testing set to measure the performance of the trained A and B
  – Select the better-performing classifier
• Any problem with this proposed approach?

Lec1: Introduction to Pattern Recognition 61

Comparing Classifiers

• Problem: the winner may just be lucky, performing better on that specific testing set; there is no guarantee for different testing sets
• Two re-sampling techniques are introduced to reduce the bias of the testing set:
  – Independent runs
  – Cross-validation

Lec1: Introduction to Pattern Recognition 62

Comparing Classifiers: Independent Runs

• A statistical method, also called bootstrap or jackknifing
• Repeat the experiment n times independently:
  – For each run i = 1, …, n:
    ◦ Randomly separate D into Training Set_i and Testing Set_i
    ◦ Use Training Set_i to train A_i and B_i
    ◦ Use Testing Set_i to calculate the accuracy of the trained A_i and B_i
• Select the classifier with the higher mean (average) accuracy
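The procedure can be sketched as follows; `train_fn` and `eval_fn` are hypothetical placeholders for a real classifier’s training and accuracy routines, and the toy majority-vote classifier below is only there to make the sketch runnable:

```python
import random

def independent_runs(data, labels, train_fn, eval_fn, n=10, train_frac=0.7):
    """Repeat a random train/test split n times; return the mean test accuracy."""
    accs = []
    idx = list(range(len(data)))
    for _ in range(n):
        random.shuffle(idx)
        cut = int(train_frac * len(idx))
        tr, te = idx[:cut], idx[cut:]
        model = train_fn([data[i] for i in tr], [labels[i] for i in tr])
        accs.append(eval_fn(model, [data[i] for i in te], [labels[i] for i in te]))
    return sum(accs) / n

# Toy classifier: always predict the majority training label.
majority_train = lambda X, y: max(set(y), key=y.count)
majority_eval = lambda model, X, y: sum(model == t for t in y) / len(y)

data, labels = list(range(10)), [1] * 10
print(independent_runs(data, labels, majority_train, majority_eval, n=3))  # 1.0
```

Running the same procedure for both A and B and comparing the mean accuracies is what the slide describes.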

Lec1: Introduction to Pattern Recognition 63

Comparing Classifiers: Cross-Validation

• m-fold cross-validation
  – Dataset D is randomly divided into m disjoint sets Di of equal size n/m, where n is the number of samples in the dataset
  – The classifier is trained m times, each time with a different set Di held out as the testing set
  – Select the classifier with the higher mean accuracy

[Figure: D randomly partitioned into D1–D5; in run k (k = 1, …, 5), Dk is the testing set and the remaining four sets form the training set]
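A minimal sketch of m-fold cross-validation; `train_fn` and `eval_fn` are hypothetical placeholders for a real classifier, and the toy majority-vote classifier is only there to make the sketch runnable:

```python
import random

def m_fold_cv(data, labels, train_fn, eval_fn, m=5):
    """m-fold cross-validation: each disjoint fold is held out once for testing."""
    idx = list(range(len(data)))
    random.shuffle(idx)                    # random partition, as on the slide
    folds = [idx[i::m] for i in range(m)]  # m disjoint sets of roughly equal size
    accs = []
    for k in range(m):
        te = folds[k]
        tr = [i for j, fold in enumerate(folds) if j != k for i in fold]
        model = train_fn([data[i] for i in tr], [labels[i] for i in tr])
        accs.append(eval_fn(model, [data[i] for i in te], [labels[i] for i in te]))
    return sum(accs) / m

# Toy classifier: always predict the majority training label.
majority_train = lambda X, y: max(set(y), key=y.count)
majority_eval = lambda model, X, y: sum(model == t for t in y) / len(y)

data, labels = list(range(10)), [1] * 10
print(m_fold_cv(data, labels, majority_train, majority_eval))  # 1.0
```

Unlike independent runs, every sample is used for testing exactly once, which makes the accuracy estimate less dependent on one lucky split.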

Lec1: Introduction to Pattern Recognition 64

Pattern Recognition

• Three types of learning:
  – Supervised learning (Lectures 01–07)
  – Unsupervised learning (Lecture 08)
  – Reinforcement learning (not covered in this course)

Lec1: Introduction to Pattern Recognition 65

Pattern Recognition: Supervised Learning

• Needs a teacher
• A class label for each training sample is known
• Any mistake made by the model during training is known
• Examples: neural networks and decision trees

Lec1: Introduction to Pattern Recognition 66

Pattern Recognition: Unsupervised Learning

• No teacher is available
• We don’t know whether the model is correct or not
• Patterns are grouped “naturally”
• Example: clustering

Lec1: Introduction to Pattern Recognition 67

Pattern Recognition: Reinforcement Learning

• Training examples are input-output pattern pairs, with evaluative output provided by a critic (a “lazy” teacher)
  – We just know the answer is incorrect
  – But we do not know the correct answer
• Example: learning to play chess