infsy540 information resources in management

Post on 19-Jan-2016


INFSY540 Information Resources in Management

Lesson 10

Chapter 10

Artificial Neural Networks and Genetic Algorithms

Chapter 10 Slide 4

Learning from Observations

Learning can be viewed as trying to determine the representation of a function.

Examples: input/output pairs with two points, and then with three points.

Ockham’s razor: The most likely hypothesis is the simplest one that is consistent with all observations.

Chapter 10 Slide 5

Cognitive vs Biological AI

Cognitive-based Artificial Intelligence
• Top Down approach
• Attempts to model psychological processes
• Concentrates on what the brain gets done
• Expert System approach

Biological-based Artificial Intelligence
• Bottom Up approach
• Attempts to model biological processes
• Concentrates on how the brain works
• Artificial Neural Network approach

Chapter 10 Slide 6

Introduction to Neural Networks

As a biological model, Neural Nets seek to emulate how the human brain works.

How does the brain work?
• The human receives input from independent nerves.
• The brain receives these independent signals and interprets them based on past experiences.

Much brain reasoning is based on pattern recognition.
• Patterns of impulses from the skin identify simple sensations such as pain or pressure.
• The brain decides how to react to these impulses and sends output signals to the muscles and other organs.

Chapter 10 Slide 7

What are ANNs?

Rough Definition: an adaptive information processing system designed to mimic the brain’s vast web of massively interconnected neurons.

Attributes:
• system of highly interconnected processors, each operating independently and in parallel
• trained (not programmed) for an application
• learns by example
• processing ability is stored in connection weights, which are obtained by a process of adaptation or learning

Chapter 10 Slide 9

Biological Neuron

Figure: a biological neuron with labeled parts: dendrites, node of Ranvier, end brush, myelin sheath, axon, nucleus, cell body, cytoplasm.

Chapter 10 Slide 12

A Model of an Artificial Neuron

Single Node (neuron)
• Inputs X1, X2, ..., Xn (dendrites) are stimulation levels
• Output (axon) is the response of the neuron

Chapter 10 Slide 13

A Model of an Artificial Neuron

Single Node (neuron) with Sum of Weighted Inputs
• Inputs X1, X2, ..., Xn (dendrites)
• Weights W1, W2, ..., Wn (synapses) are synaptic strengths (local memory stores previous computations, modifies weights)
• $S = W_1 X_1 + W_2 X_2 + \dots + W_n X_n = \sum_{i=1}^{n} W_i X_i$
• Output (axon)

Chapter 10 Slide 14

A Model of an Artificial Neuron

Single Node with Sum of Weighted Inputs, compared to a threshold to determine output
• Inputs X1, X2, ..., Xn with weights W1, W2, ..., Wn
• Output = f(S)
• The transfer function f determines the output (based on comparison of S to the threshold)

Figure: plots of f(S) against S for step functions and for the sigmoid function.
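The single-node model on slides 13 and 14 can be sketched in a few lines of Python. This is only an illustration: the function names, the example weights and inputs, and the default threshold of 0 are assumptions, not values from the slides.

```python
import math

def weighted_sum(weights, inputs):
    """S = W1*X1 + W2*X2 + ... + Wn*Xn."""
    return sum(w * x for w, x in zip(weights, inputs))

def step(s, threshold=0.0):
    """Step transfer function: output 1 only when S exceeds the threshold."""
    return 1.0 if s > threshold else 0.0

def sigmoid(s):
    """Sigmoid transfer function: a smooth output between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-s))

def neuron_output(weights, inputs, transfer):
    """Output = f(S) for a single node."""
    return transfer(weighted_sum(weights, inputs))

# Illustrative values only (assumed, not from the slides).
weights = [0.5, -1.0, 2.0]
inputs = [1.0, 0.5, 0.25]

print("S =", weighted_sum(weights, inputs))
print("step output =", neuron_output(weights, inputs, step))
print("sigmoid output =", neuron_output(weights, inputs, sigmoid))
```

The same sum feeds both transfer functions; only the shape of the response differs (all-or-nothing for the step, graded for the sigmoid).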

Chapter 10 Slide 15

Outputs continue to spread the signal: $in_j = \sum_i w_{ij}\, out_i$

Types of connections:
• Excitatory: positive
• Inhibitory: negative
• Lateral: within same layer
• Self: connection from a neuron back to itself

Figure: neurons i1 to i5, each with transfer function ƒ, send their outputs (out_i) along axons through weighted connections w (artificial synapses) to the dendrites (in_j) of other neurons; lateral and self connections are marked.

Chapter 10 Slide 16

This is one example of how the nodes in a network can be connected. It is typically used with “backpropagation”.

Another example is for every node to be connected to every other node.

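Feedforward ANN

Figure: input signals x1 to x5 enter the input layer (distribution, k = 0), pass through the hidden layers (processing, k = 1 and k = 2), and reach the output layer (processing, k = 3 = L), which produces output signals y1 and y2. Each processing node applies a transfer function ƒ.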
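A forward pass through a feedforward network like the one pictured can be sketched as follows. The layer sizes (5-3-3-2), the weight values, and the choice of a sigmoid at every processing node are assumptions made for illustration; the slides do not give them.

```python
import math

def sigmoid(s):
    return 1.0 / (1.0 + math.exp(-s))

def layer_forward(inputs, weight_rows):
    """Each row holds one node's incoming weights; node output = f(sum of w * x)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weight_rows]

def feedforward(inputs, layers):
    """Pass the signal through each processing layer in turn."""
    signal = inputs  # the input layer only distributes the signal
    for weight_rows in layers:
        signal = layer_forward(signal, weight_rows)
    return signal

# Hypothetical weights: 5 inputs, two hidden layers of 3 nodes, 2 outputs.
hidden1 = [[0.1] * 5, [-0.2] * 5, [0.05] * 5]
hidden2 = [[0.4, -0.3, 0.2], [0.1, 0.1, -0.5], [-0.2, 0.6, 0.3]]
output = [[0.3, -0.1, 0.2], [-0.4, 0.6, 0.1]]

y = feedforward([1.0, 0.0, 1.0, 0.0, 1.0], [hidden1, hidden2, output])
print(y)  # two output signals, each between 0 and 1
```

Signals only move forward, layer by layer; there are no lateral or self connections in this arrangement.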

Chapter 10 Slide 17

How ANNs Work

First, data must be obtained.

Second, the network architecture and training mechanism must be chosen.

Third, the network must be trained.

Fourth, the network must be tested.

Chapter 10 Slide 18

How ANNs Work

Network Training:
• Begin with a random set of weights.
• The net is provided with a series of inputs and corresponding outputs (one pair of inputs/outputs at a time).
• The net calculates its own solution and compares it to the correct one.
• The net then adjusts the weights to reduce error.
• Training continues until the net is good enough or time runs out.

Network Testing (i.e. Validation):
• The net is tested with cases not included in the training set.
• The net output and desired output are compared.
• If enough test set cases are incorrect, then the net must be retrained and retested.

Chapter 10 Slide 19

Example: Tree Classification

Chapter 10 Slide 20

Classification System

Sensing System: imaging system, spectrometer, sensor array, etc.

Measurements (Features): wavelength, color, voltage, temperature, pressure, intensity, shape, etc.

Figure: environment → sensing system → feature values (X1, X2) → neural network → labeled pattern (classes 1 to 4).

Chapter 10 Slide 21

Example: Tree Classifier

Two Features (INPUTS): Cone Length, Needle Length

Four Classes (OUTPUTS): Black Spruce (BS), Western Hemlock (WH), Western Larch (WL), White Spruce (WS)

Sensor: Ruler

Figure: the two features feed the classifier, which outputs one of the four classes.

Chapter 10 Slide 22

Tree Classifier: Data

Cone Needle Tree BS WH WL WS

25 mm 11 mm Black Spruce 1 0 0 0

26 mm 11 mm Black Spruce 1 0 0 0

26 mm 10 mm Black Spruce 1 0 0 0

24 mm 9 mm Black Spruce 1 0 0 0

20 mm 13 mm Western Hemlock 0 1 0 0

21 mm 14 mm Western Hemlock 0 1 0 0

19 mm 8 mm Western Hemlock 0 1 0 0

21 mm 20 mm Western Hemlock 0 1 0 0

28 mm 30 mm Western Larch 0 0 1 0

37 mm 31 mm Western Larch 0 0 1 0

33 mm 33 mm Western Larch 0 0 1 0

32 mm 28 mm Western Larch 0 0 1 0

51 mm 19 mm White Spruce 0 0 0 1

50 mm 20 mm White Spruce 0 0 0 1

52 mm 20 mm White Spruce 0 0 0 1

51 mm 21 mm White Spruce 0 0 0 1
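The training and testing steps described under "How ANNs Work" can be tried on the table above. The sketch below is deliberately simpler than the network in the slides: it trains a single node with the perceptron learning rule to separate White Spruce from the other three species, whereas the slides' classifier has four outputs and reports mean square error. The feature scaling (dividing by 60), the learning rate, the random seed, and the choice of which rows to hold out for testing are all assumptions.

```python
import random

# (cone length mm, needle length mm, species) copied from the table above.
DATA = [
    (25, 11, "BS"), (26, 11, "BS"), (26, 10, "BS"), (24, 9, "BS"),
    (20, 13, "WH"), (21, 14, "WH"), (19, 8, "WH"), (21, 20, "WH"),
    (28, 30, "WL"), (37, 31, "WL"), (33, 33, "WL"), (32, 28, "WL"),
    (51, 19, "WS"), (50, 20, "WS"), (52, 20, "WS"), (51, 21, "WS"),
]

def encode(cone, needle, species):
    """Scale features to roughly 0..1, add a constant bias input; target is 1 for WS."""
    return [cone / 60.0, needle / 60.0, 1.0], (1 if species == "WS" else 0)

def predict(weights, x):
    s = sum(w * xi for w, xi in zip(weights, x))
    return 1 if s > 0 else 0

def train(cases, rate=0.1, max_epochs=2000, seed=0):
    rng = random.Random(seed)
    weights = [rng.uniform(-0.5, 0.5) for _ in range(3)]  # random starting weights
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in cases:              # one input/output pair at a time
            error = target - predict(weights, x)
            if error != 0:                   # adjust weights only when the output is wrong
                mistakes += 1
                weights = [w + rate * error * xi for w, xi in zip(weights, x)]
        if mistakes == 0:                    # "good enough": every training case is right
            break
    return weights

def accuracy(weights, cases):
    return sum(predict(weights, x) == t for x, t in cases) / len(cases)

cases = [encode(*row) for row in DATA]
test_cases = cases[3::4]                                  # hold out the 4th row of each species
train_cases = [c for i, c in enumerate(cases) if i % 4 != 3]

weights = train(train_cases)
print("training accuracy:", accuracy(weights, train_cases))
print("test accuracy:", accuracy(weights, test_cases))
```

The held-out rows play the role of the test set: they are never shown to the node during training, so their accuracy indicates how well the learned weights generalize.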

Chapter 10 Slide 24

Tree Classifier: Training Process

Figure: plots of needle length (mm, 0 to 40) against cone length (mm, 0 to 60), with the four classes (western hemlock, white spruce, western larch, black spruce) labeled, at successive stages of training:

Iterations = 0, MSE = 0.754
Iterations = 1000, MSE = 0.235
Iterations = 2000, MSE = 0.046
Iterations = 3000, MSE = 0.009

Chapter 10 Slide 25

Tree Classifier: Results

Figure: needle length (mm, 0 to 40) against cone length (mm, 0 to 60), with the four classes (western hemlock, white spruce, western larch, black spruce) labeled.

Figure: Mean Square Error (MSE, 0.0 to 0.8) against iteration (0 to 4000), showing training error and validation error.

Figure: the network, with inputs Cone Length and Needle Length and outputs Black Spruce, Western Hemlock, Western Larch, and White Spruce.

Chapter 10 Slide 26

Modeling Example

Function Approximation: On the interval [0,1]

$f(x) = 0.02\,(12 + 3x - 3.5x^2 + 7.2x^3)(1 + \cos 4\pi x)(1 + 0.8 \sin 3\pi x)$

Data (many hundred points)

x f(x)

0.0 0.480

0.1 0.529

0.2 0.084

0.3 0.061

0.4 0.181

0.5 0.108

0.6 0.195

0.7 0.071

0.8 0.107
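The tabulated values can be checked against the target function with a short script. It assumes the trigonometric arguments are 4πx and 3πx; with that reading the computed values agree with every row of the table to within 0.001.

```python
import math

def f(x):
    """Target function on [0, 1], assuming arguments 4*pi*x and 3*pi*x."""
    poly = 12 + 3 * x - 3.5 * x**2 + 7.2 * x**3
    return (0.02 * poly
            * (1 + math.cos(4 * math.pi * x))
            * (1 + 0.8 * math.sin(3 * math.pi * x)))

# Values listed in the table above.
table = {
    0.0: 0.480, 0.1: 0.529, 0.2: 0.084, 0.3: 0.061, 0.4: 0.181,
    0.5: 0.108, 0.6: 0.195, 0.7: 0.071, 0.8: 0.107,
}

for x, listed in table.items():
    print(f"x={x:.1f}  computed={f(x):.4f}  listed={listed:.3f}")
```

A network trained on such (x, f(x)) pairs is being asked to reproduce this curve from examples alone, without ever seeing the formula.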

Figure: a network with a single input x, hidden nodes (each applying ƒ), and a single output f(x).

Chapter 10 Slide 27

Modeling: Results

Figure: the function and the ANN model plotted together (horizontal axis 0 to 1.2, vertical axis 0 to 1).

Error: RMS = 0.0117, MSE = 0.000137

Chapter 10 Slide 28

Character Recognition Example

The green circles of the input nodes represent 1, the dark ones 0. The green boxes of the output nodes represent 1, the white ones 0.

Figure: a network with input nodes, hidden nodes, and output nodes (example output 0 1 0; legend entries = 1, = 2, = 3).

Chapter 10 Slide 29

Ways to Categorize ANNs

Architecture of Nodes and Arcs (i.e. how are nodes connected?)
• There are many different architectures
• We will show the Feedforward and Recurrent

General Training Schemes
• Supervised or Unsupervised (will discuss later)

Specific Training Approaches
• Many different types (will discuss later)

Chapter 10 Slide 35

Some Tasks Performed by ANNs

• Prediction/Forecasting: Recognize Trends in Time Series Data
• Decision: Recognize Key Components in a Given Situation
• Classification: Recognize Objects and Assign to Appropriate Classes
• Modeling: Recognize Similar Conditions to those in the Model

Chapter 10 Slide 36

Neural Net Application: Diagnosis

Breast Cancer Diagnosis
• Developed from Neural Nets trained to identify tanks

Headache Diagnostic System
• There are over 130 different types of headaches (believe it or not), and each has separate causes or combinations of causes (dietary, environmental, etc.).
• A neural net can help classify the headache based on the location, severity, and type (constant, throbbing, ...) of pain present.

Chapter 10 Slide 37

Neural Net Applications: Diagnosis & Repair

Shock Absorber Testing
• Determining what particular portion of a shock absorber is going to fail is a difficult task.
• Similar to the TED* expert system, neural networks can be used to identify faults in mechanical equipment.
• Some researchers are examining neural network systems used to analyze shock response patterns (force applied vs. displacement of shock cylinder).

* Work is being done on developing a neural network to improve the Turbine Engine Diagnostic (TED) expert system.

Chapter 10 Slide 38

Strengths of Neural Nets

Generally efficient, even for complex problems.

Remarkably consistent, given a good set of training cases.

Adaptability

Parallelism

Chapter 10 Slide 39

Why Use Neural Networks?

• Mature field, widely accepted
• Consistent
• Efficient
• Use existing historical data to make decisions

Chapter 10 Slide 40

Limitations of Neural Nets

• Amount of training data needed:
  - Training cases must be plentiful.
  - Training cases should be consistent.
  - Training cases must be sufficiently diverse.
• Outcomes must be known in advance (for supervised training).
• Scaling up the net is difficult given new outcomes:
  - No satisfactory mathematical model exists for this process, yet.
  - The net must be retrained from scratch if the set of desired outcomes changes.

Chapter 10 Slide 43

Some Good ANN References

• A Practical Guide to Neural Nets, W. Illingworth & M. Nelson, 1991 (A Very Easy Read!!)
• Artificial Neural Systems, J. Zurada, 1992

Some ANN WWW Sites

• Pacific Northwest Laboratory: http://www.emsl.pnl.gov:2080/docs/cie/neural/
• Applets for Neural Networks and Artificial Life: http://www.aist.go.jp/NIBH/~b0616/Lab/Links.html#BL
• Function Approximation Applet: http://neuron.eng.wayne.edu/bpFunctionApprox/bpFunctionApprox.html
• Web Applets for Interactive Tutorials on ANNs: http://home.cc.umanitoba.ca/~umcorbe9/anns.html#Applets

Chapter 10 Slide 44

More ANN References

• Function Approximation Using Neural Networks: neuron.eng.wayne.edu/bpFunctionApprox/bpFunctionApprox.htm
• Artificial Neural Networks Tutorial: www.fee.vutbr.cz/UIVT/research/neurnet/bookmarks.html.iso-8859-1
• Mini-Tutorial on Artificial Neural Networks: http://www.imagination-engines.com/anntut.htm
• Artificial Neural Networks Lab on the Web: www.dcs.napier.ac.uk/coil/rec_resources/Software_and_demos25.html
• Software Examples. The Html Neural Net Consulter: nastol.astro.lu.se/~henrik/neuralnet1.html
• MICI Neural Network Tutorials and Demos: www.glue.umd.edu/~jbr/NeuralTut/tutor.html
• Sites using neural network applets: http://www.aist.go.jp/NIBH/~b0616/Lab/Links.html

Chapter 10 Slide 45

Chapter 10 Slide 46

Chapter 10 Slide 47

ANN Questions

• What will the inputs be?

• What will the outputs be?

• Will signals be discrete or continuous?

• What if the inputs aren’t numeric?

• How should you organize the network?

• How many hidden layers should there be?

• How many nodes per hidden layer?

• Should weights be fixed, or is there any need to adapt as circumstances change?

• Should you have a hardware or software ANN solution? (i.e. do you need a neural net chip?)

These are questions that only a technologist would need to know. Managers would not generally need to know the answers to these questions.

Chapter 10 Slide 48

Questions about artificial neural networks?

Did we cover the math of backpropagation?

Chapter 10 Slide 49

Genetic Algorithms

Optimization and search are difficult problems:
• Domains are complex
• They require heavy computation
• Getting the best solution is nearly impossible

Operations Research has developed techniques for them
• e.g. linear programming, goal programming

The AI community has developed alternative techniques

Chapter 10 Slide 50

What is an Optimization Problem?

To optimize is to “make the most effective use of”, according to Webster’s Dictionary.

Optimization can mean:
• Maximize effective use of resources
• Minimize costs
• Minimize risks
• Maximize crop yield
• Minimize casualties

Chapter 10 Slide 51

Typical Optimization Problems

The VP of Operations wants to visit all company sites while minimizing transportation costs.

Find a series of moves in a chess game that guarantees a victory.

Find a maximum value for the function

f(x, y) = [formula garbled in the source; the surviving fragments are the constants 0.5, 0.5, 1, and 0.001, a sin term, and squared terms in x and y]

subject to −1 < x < 1 and −1 < y < 1

Chapter 10 Slide 53

Optimization Problems = Search Problems

Types of Search Problems
• To find the top of Mount Everest
• To find the South Pole
• To find the deepest part of the ocean (aka the Mariana Trench)

Which is the easiest?

How do you know when to stop?

Chapter 10 Slide 54

Illustration of Search Problem

x

y

z

Chapter 10 Slide 55

Difference between Prediction and Optimization

Prediction: What is the nutrition content of a McDonald’s Happy Meal?

Optimization: What is the most nutritious meal at McDonald’s?

Solving optimization problems typically requires solving many iterations of smaller prediction problems.

Chapter 10 Slide 56

Problems with Searching

Domains are complex

They require heavy computation

Getting the best solution may be impossible

• 10! = 3,628,800 possible combinations. If a computer can perform 1,000,000 evaluations per second, this takes about 3.6 seconds.

• 25! ≈ 15,500,000,000,000,000,000,000,000 possible combinations. At the same rate, this takes roughly 490 billion years.
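The arithmetic behind these search times is easy to reproduce. The sketch below assumes the rate of 1,000,000 evaluations per second given for the 10! case and a year of 365.25 days; at that rate, 25! orderings work out to several hundred billion years.

```python
import math

EVALS_PER_SECOND = 1_000_000          # rate stated for the 10! case
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def search_time_seconds(n):
    """Time to evaluate all n! orderings one after another."""
    return math.factorial(n) / EVALS_PER_SECOND

print(math.factorial(10), "orderings:", search_time_seconds(10), "seconds")
years = search_time_seconds(25) / SECONDS_PER_YEAR
print(math.factorial(25), "orderings:", f"{years:.3g}", "years")
```

Going from 10 items to 25 multiplies the work by a factor of about 4 × 10^18, which is why exhaustive search stops being an option so quickly.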

Chapter 10 Slide 57

Sample Optimization Problem

Think of each possible combination of characteristics of a “zebra” in the Serengeti

The strength of the combination corresponds to how well the zebra evades the lions.

Chapter 10 Slide 58

The “Zebra Model”

A gene is a single characteristic of an individual zebra. Some examples of zebra genes are listed below.

In GA terms, a gene is a parameter in the solution.

Genes of Zebra

#1. Heart Size

#2. Leg Length

#3. Forelimb Strength

...

#n. Lung Capacity

Chapter 10 Slide 59

The “Zebra Model”

The combination of genes is called a chromosome (genome): the genetic makeup of a zebra.

Think of each chromosome as a “potential alternative solution”.

Chromosome

Genes of Zebra

#1. Heart Size

#2. Leg Length

#3. Forelimb Strength

...

#n. Lung Capacity

Chapter 10 Slide 60

The “Zebra Model”

The fitness describes how well a zebra evades lions.

In Genetic Algorithms, the Fitness Function is a function that calculates how well a chromosome performs.

Chromosome

Genes of Zebra

#1. Heart Size

#2. Leg Length

#3. Forelimb Strength

...

#n. Lung Capacity

Fitness = 37

Chapter 10 Slide 61

The “Zebra Model”

A generation describes a herd of zebras.

The GA evaluates a population of chromosomes at once rather than one solution at a time.

Figure: a herd of zebras with fitness values 65, 51, 75, 57, 68, 77, 61, 55, 48, 44, 42, 36, 30.

Chapter 10 Slide 62

The “Zebra Model”

In each generation, the weakest zebras are caught by the lions.

Figure: the same herd, with fitness values 65, 51, 75, 57, 68, 77, 61, 55, 48, 44, 42, 36, 30.

Chapter 10 Slide 63

The “Zebra Model”

To make up for lost comrades, the surviving zebras reproduce.

Some will be stronger than their parents, others weaker.

Figure: the next herd, with fitness values 68, 51, 75, 57, 65, 77, 61, 55, 48, 44, 83, 38, 66.

Chapter 10 Slide 64

The “Zebra Model”

Occasionally, a child has a mutation:
• Usually these mutant children are weaker than their parents and die.
• Occasionally these children have some new characteristic that makes them stronger than previous generations.

This mutation allows the GA to search new regions of the search space and examine new types of zebras.

Figure: the herd, with fitness values 68, 51, 75, 57, 65, 77, 61, 55, 48, 44, 83, 38, 66.

Chapter 10 Slide 65

The “Zebra Model”

Eventually, the overall population of zebras gets better.
• The best possible zebra may be found.
• But that is not guaranteed.

The process could take hundreds or thousands of generations.

Figure: a later herd, with fitness values 236, 197, 244, 213, 225, 243, 217, 208, 190, 178, 253, 229, 166.

Chapter 10 Slide 66

Interdisciplinary Terms

Genetic Algorithm: A mathematical search process based on the theory of evolution.

Biological Term     GA Term       Engineering/OR Term
Gene                Gene          Parameter or Variable
Chromosome          Chromosome    Alternative Solution
Herd (of Zebras)    Generation    Solution Search Space

• NOT a random search algorithm
• Based on Darwin’s Theory of Evolution: changes in genetic composition that favor survival of the individual
• Finds good solutions for a variety of problems

Chapter 10 Slide 67

Natural Evolutionary Process

Preconditions
• Entity must have ability to reproduce
• Population of these entities must exist
• Variety of entities
• Difference in ability to survive based on variety

Chapter 10 Slide 68

GA Algorithm

1. Determine Representation (Genes and Fitness)
2. Create Initial Population (Random)
3. Evaluate Individual (Decode and Determine Fitness)
4. Perform Selection (For Reproduction)
5. Apply Crossover (Exchange Genetic Materials)
6. Apply Mutation (Randomly)
7. Apply Replacement Scheme (Kill Parents?)
8. Termination Criteria Met? (No: repeat; Yes: stop)
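The steps above can be sketched as a small program. Everything specific here is an assumption chosen for illustration: chromosomes are 5-bit strings scored by the square of their decoded value, selection is fitness-proportionate, children fully replace their parents, and the run stops after a fixed number of generations. Other representations, selection schemes, and stopping rules fit the same loop.

```python
import random

BITS = 5

def decode(chromosome):
    """Decode a bit string into an integer."""
    return int(chromosome, 2)

def fitness(chromosome):
    """Assumed fitness function: the square of the decoded value."""
    return decode(chromosome) ** 2

def select(population, rng):
    """Fitness-proportionate selection (+1 so an all-zero population still works)."""
    weights = [fitness(c) + 1 for c in population]
    return rng.choices(population, weights=weights, k=2)

def crossover(a, b, rng):
    """Swap the tails of two parents at a random cut point."""
    point = rng.randint(1, BITS - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

def mutate(chromosome, rng, rate=0.02):
    """Flip each bit independently with a small probability."""
    return "".join(("1" if bit == "0" else "0") if rng.random() < rate else bit
                   for bit in chromosome)

def run_ga(pop_size=8, generations=30, seed=1):
    rng = random.Random(seed)
    population = ["".join(rng.choice("01") for _ in range(BITS))
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):                      # termination: fixed generation count
        children = []
        while len(children) < pop_size:
            a, b = select(population, rng)
            for child in crossover(a, b, rng):
                children.append(mutate(child, rng))
        population = children[:pop_size]              # replacement: children replace parents
        best = max(population + [best], key=fitness)  # remember the best seen so far
    return best

best = run_ga()
print(best, decode(best), fitness(best))
```

Only `fitness` knows what a good solution looks like; the rest of the loop never needs to know how to construct one.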

Chapter 10 Slide 87

Example

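max $f(x) = x^2$ such that 0 <= x <= 31

Representation: finite-length string of 5 bits (1 = ON, 0 = OFF) with place values 16 8 4 2 1. For example, 0 1 0 0 1 decodes to 8 + 1 = 9.

String  Initial (G0)  x   f(x)  p_select = f_i/Σf  Expected = f_i/f_avg  Actual  Mating Pool  K  G1     x
1       01101         13  169   .14                .58                   1       01101        4  01100  12
2       11000         24  576   .49                1.97                  2       11000        4  11001  25
3       01000         8   64    .06                .22                   0       11000        2  11011  27
4       10011         19  361   .31                1.23                  1       10011        2  10000  16

Σf = 1170 (column totals 1.00 and 4.00), f_avg = 292.5

K is the crossover point applied to each pair in the mating pool (strings 1 and 2, strings 3 and 4).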

Only technologists would be interested in how this actually works. If you are interested, let me know after class. I’d be glad to explain it.
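For readers who do want to see how the example works, the first generation can be recomputed directly. The sketch assumes that K is the position at which two strings from the mating pool swap tails, which is consistent with the G1 column; the function names are illustrative.

```python
population = ["01101", "11000", "01000", "10011"]   # initial strings (G0)

def fitness(s):
    """f(x) = x squared, where x is the decoded 5-bit string."""
    return int(s, 2) ** 2

values = [fitness(s) for s in population]
total = sum(values)
average = total / len(values)

for s, v in zip(population, values):
    # string, x, f(x), selection probability, expected number of copies
    print(s, int(s, 2), v, f"{v / total:.3f}", f"{v / average:.2f}")
print("sum =", total, "average =", average)

def crossover(a, b, k):
    """Swap the tails of a and b after the first k bits."""
    return a[:k] + b[k:], b[:k] + a[k:]

# Mating pool pairs and crossover points taken from the table.
g1 = list(crossover("01101", "11000", 4)) + list(crossover("11000", "10011", 2))
print(g1, [int(s, 2) for s in g1])
```

The weakest string (01000) receives no copies in the mating pool while the strongest (11000) receives two, and the new generation's best value rises from 24 to 27.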

Chapter 10 Slide 88

Advantages of GAs

• A GA always returns a solution in a known search time.
• You don’t describe how to find a solution, only that you recognize a good one when you see it.
  - Results in novel solutions
  - Solves problems you don’t know how to solve
• Requires low-level access to experts
  - Good if you only know how to describe a solution
• Very flexible
  - Only change the fitness function?

Chapter 10 Slide 89

Problems with GAs

• Important limitation/research issue with GAs: it is impossible to predict the optimal population size, crossover method, etc.
• A GA might still plateau at A solution, not THE solution.
  - One helpful approach: mutation. Kick-start the process in a new direction.
• Requires a GOOD fitness function.
• No explanation subsystem

Chapter 10 Slide 91

Questions about Genetic Algorithms?
