
Page 1: Artificial Neural Networks

Artificial Neural Networks

Lyle N. Long
Professor of Aerospace Engineering
Director, Institute for Computational Science
http://www.personal.psu.edu/lnl
[email protected]

The Pennsylvania State University, University Park, PA 16802

Seminar for the Institute for Computational Science Seminar Series
Oct. 2005

Page 2: Artificial Neural Networks

Introduction

• There is increasing interest in neuroscience, intelligent systems, artificial intelligence, robots, ...
• Neural networks are just one approach in this area.
• Can we build neural networks the size of the human brain, or larger?
• For neuroscience, can we develop systems that emulate the human brain?
• For intelligent systems, can we build enormous neural networks (for pattern recognition, nonlinear function approximation, etc.)?
• If we can develop them, how long would it take to train them? After all, it takes about 18 years to train a human...
• Could these become "conscious"?

Page 3: Artificial Neural Networks

Uses of Neural Networks

• Pattern recognition
• Function approximation
• Scientific classification
• Control
• Cognitive models
• ...

For linear applications or linear equations you don't really need ANNs, but for nonlinear applications or equations they are quite valuable. Most things are nonlinear, and linear theory and linear algebra have been relied on for too long.

Page 4: Artificial Neural Networks

Neural Networks

[Diagram: two branches of neural network research]
• Engineering applications: intelligent systems (emphasis on algorithm efficiency and accuracy)
• Brain modeling / simulation: cognitive modeling and neuroscience (emphasis on biological plausibility)

Page 5: Artificial Neural Networks

Aston Martin DB9

“The 2005 DB9 contains the first onboard neural network in an engine control module. Unlike traditional computer systems that need to be programmed for each step, neural networks are programs modeled on the way human brains learn and adapt. The DB9's module keeps tabs on engine combustion performance with a sophisticated software program that compares actual engine performance to the design specifications.”

http://media.ford.com/newsroom/feature_display.cfm?release=18677

Page 6: Artificial Neural Networks

Types of ANNs

• Multi-layer perceptrons (MLP)
• Spiking
• Adaptive Resonance Theory (ART)
• Recurrent
• ...

Page 7: Artificial Neural Networks

Introduction (cont.)

• Massively parallel computers are approaching the power of the human brain.
• Can we develop neural networks that work well on these machines?
• It is well known that artificial neural networks do not scale well.

Page 8: Artificial Neural Networks

Creatures and Technology

[Chart comparing the computational capacity of computers and animals, from H. Moravec; IBM BlueGene/L marked near the top]

Page 9: Artificial Neural Networks

Human Brain vs Supercomputers

Human brain:
• ~100 billion neurons (10^11) (for comparison, a rat cortex has about 30 million)
• ~1000 synapses per neuron (10^14 synapses total)
• This is roughly 100 terabytes (10^14 bytes) of data storage
• It is also capable of roughly 1000 teraflops (10^15 operations per second)

IBM BlueGene/L computer (DOE):
• 65,536 processors (~10^13 transistors?)
• 33 terabytes RAM (~10^13 bytes)
• 137 teraflops (~10^14 operations per second)

NASA's Columbia computer:
• 10,240 Itanium processors (1.5 GHz)
• 10 terabytes RAM (10^13 bytes)
• 40 teraflops (~10^13 operations per second)
• Theoretically more capable than the brain of a monkey ... and near human capability ...

Page 10: Artificial Neural Networks

Brains

Page 11: Artificial Neural Networks

Human Neocortex

• Outer layer of the brain, about the size and shape of a wrinkled napkin
• Neocortex exists primarily in mammals (somewhat in birds and reptiles)
• Responsible for perception, language, imagination, mathematics, art, music, planning, ...
• Six layers and a columnar structure
• Approximately 30 billion neurons
• Size = 50 cm x 50 cm x 2 mm
• About 50,000 neurons / mm^2 of sheet

from: T. Dean

Page 12: Artificial Neural Networks

Neocortex Area

Human: 2500 cm^2
Monkey: 250 cm^2
Cat: 83 cm^2
Rat: 6 cm^2

Page 13: Artificial Neural Networks

Neurons: Dendrites and Axons
(Information travels through electrical signals)

[Image of a neuron showing dendrites and axons (scale: about 3 microns)]

Page 14: Artificial Neural Networks

Synapses

[Diagram of a neuron and a synapse, showing neurotransmitters]

• In dendrites and axons, signals propagate electrically.
• In synapses, information propagates via chemical neurotransmitters.
• There are roughly 10^14 synapses in the human brain.

Page 15: Artificial Neural Networks

Synapses

• "You are your synapses" (LeDoux)
• Your memories, emotions, etc. are stored in your synapses.
• Learning occurs via changes to the synapses.
• Some of the synapses are set at birth, while others are trained.
• Human memory is a product of the synapses, and the process is often referred to as Hebbian learning (after D.O. Hebb, a Canadian neuroscientist; The Organization of Behavior, Wiley, 1949).

Page 16: Artificial Neural Networks

Synapses (cont.)

• Synapses can be inhibitory or excitatory.
• Glutamate (excitatory) and GABA (gamma-aminobutyric acid) (inhibitory) are the primary neurotransmitters in the synapse.
• The human ability to hear, remember, fear, or desire is all related to glutamate and GABA in the synapses.
• Drugs can directly affect synapses:
  • Valium works by enhancing GABA
  • Prozac prevents the removal of serotonin from the synaptic space
  • LSD acts on serotonin receptors
  • Cocaine and amphetamines affect norepinephrine and dopamine levels in synapses

Page 17: Artificial Neural Networks

Synapses

• Some artificial neural networks (ANNs) (e.g. spiking NNs) are more biologically plausible than others.
• Some ANNs (e.g. those that use backpropagation) are better suited to engineering applications (e.g. nonlinear function approximation, pattern recognition, nonlinear control, etc.).
• The complex chemistry that occurs in human synapses is very crudely approximated in ANNs (e.g. as weights).

Page 18: Artificial Neural Networks

Consciousness

• What is consciousness? Can a computer become conscious?
• See the book by LeDoux (Synaptic Self):
  • "Consciousness can be thought of as the product of underlying cognitive processes"
  • "... we are never aware of processing, but only of the consequences of processing."
  • "You are your synapses."
• See the book by Hawkins and Blakeslee. Your neocortex is reading this text!

Page 19: Artificial Neural Networks

Consciousness

See the book by Dennett (Consciousness Explained):

“Human consciousness is itself a huge complex of memes (or more exactly, meme-effects in brains) that can best be understood as the operation of a “Von Neumannesque” virtual machine implemented in the parallel architecture of a brain that was not designed for any such activities. The powers of this virtual machine vastly enhance the underlying powers of the organic hardware on which it runs...”

Wikipedia.org: “In casual use, the term meme often refers to any piece of information passed from one mind to another.”

Page 20: Artificial Neural Networks

Moore's Law

• For about 50 years we have seen computer performance double every 18 months.
• Intel expects this to continue until at least 2011.
• The number of transistors per chip has doubled every 18 months:
  • 1979, Intel 8088: 29,000 transistors
  • 2000, Intel Pentium 4: 42,000,000 transistors
  • 2011, Intel: 20 billion transistors/chip expected (maybe 128 processors per chip)

Page 21: Artificial Neural Networks

Cycle Times

• Typical cycle times in the brain are on the order of 20 milliseconds (2x10^-2 s).
• The IBM BlueGene/L computer uses 700 MHz chips, which corresponds to 1.4 nanoseconds (1.4x10^-9 s).
• Cognitive neuroscientists are trying to emulate the human brain (and often limit their cycle times to 20 ms).
• Engineers are trying to build intelligent systems, and would be very happy to have performance and cycle times better than the human brain's!

Page 22: Artificial Neural Networks

IBM BlueGene/L (2005)

• 65,536 processors
• 33 terabytes RAM (~10^13 bytes)
• 137 teraflops (~10^14 operations per second)

http://www.research.ibm.com/journal/rd49-23.html

Page 23: Artificial Neural Networks

Future Computers

• Intel expects the first petaflop computer (10^15 operations per second) to appear by about 2009; this is human-brain-level computing power (it might cost $150M and require 8 MW of power).
• By 2011 there could be several petaflop computers.
• By 2011 there could be supercomputers several times more powerful than the human brain.

S. Wheat, AIAA Paper No. 2005-7148, InfoTech@Aerospace Conference, Sept. 2005

Page 24: Artificial Neural Networks

Very Rough Number of Inputs to the Brain

Item                        Number
Hair cells in cochlea       O(10^4)
Skin nerve endings          O(10^6)
Retinal rods                O(10^8)
Retinal cones               O(10^6)
Olfactory receptor cells    O(10^6)
Taste buds                  O(10^4)
Fibers in optic nerve       O(10^6)

Page 25: Artificial Neural Networks

Artificial Neural Networks

[Side-by-side diagram: artificial neuron vs. actual neuron]

Crude model:
• Inputs ~ dendrites
• Outputs ~ axons
• Weights ~ synapses

Page 26: Artificial Neural Networks

Forward Propagation: A Single Artificial Neuron

[Diagram: inputs y_1 ... y_4 with weights w_1 ... w_4 feeding a neuron with net input x_j and output y = f(x_j)]

x_j = \sum_i w_{ij} y_i

y_j = f(x_j),  where  f(x_j) = \frac{2}{1 + e^{-x_j}} - 1   (sigmoid function)

E = 0.5 \sum_i (y_i - d_i)^2   (d is the desired output)

The weights (w_j) store the "knowledge" and need to be trained.
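A minimal C++ sketch of this computation (illustrative only, not code from the talk; the bipolar sigmoid matches the formula above):

```cpp
#include <cmath>
#include <vector>

// Bipolar sigmoid activation from this slide: f(x) = 2/(1 + e^-x) - 1.
double sigmoid(double x) { return 2.0 / (1.0 + std::exp(-x)) - 1.0; }

// Forward pass for one neuron: weighted sum of the inputs, then activation.
double forward(const std::vector<double>& y, const std::vector<double>& w) {
    double x = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i)
        x += w[i] * y[i];       // x_j = sum_i w_ij * y_i
    return sigmoid(x);          // y_j = f(x_j)
}
```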

Page 27: Artificial Neural Networks

Neural Network

• A 7-3-3 network has: no. of weights = 3*7 + 3*3 = 30
• Imagine a huge network, e.g. 1,000,000-100-10: no. of weights = 1,000,000*100 + 100*10 = 100,001,000 (~1 gigabyte)
• Each neuron is typically connected to every neuron in the adjacent layers (a counting sketch follows below).
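As a quick illustration (a hypothetical helper, not from the paper), counting the weights of a fully connected layered network:

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

// Count weights in a fully connected feed-forward network given its layer
// sizes: each neuron in one layer connects to every neuron in the next.
std::int64_t countWeights(const std::vector<std::int64_t>& layers) {
    std::int64_t total = 0;
    for (std::size_t i = 0; i + 1 < layers.size(); ++i)
        total += layers[i] * layers[i + 1];
    return total;
}

int main() {
    std::cout << countWeights({7, 3, 3}) << "\n";           // 30
    std::cout << countWeights({1000000, 100, 10}) << "\n";  // 100001000
}
```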

Page 28: Artificial Neural Networks

Backward Propagation

w_{ij} = w'_{ij} + (1 - \alpha)\, \eta\, e_j x_i + \alpha (w'_{ij} - w''_{ij})

e_j = y_j (1 - y_j)(d_j - y_j)

where \alpha = momentum factor (e.g. 0.5), \eta = learning rate (e.g. 0.1), and the primes denote weight values from the previous two updates.

Given a set of inputs and corresponding outputs, we use backpropagation to adjust the weights (to learn the dataset). A sketch of this update follows below.
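A minimal sketch of this update rule in C++ (hypothetical names; the momentum term needs the two previous weight values, as in the equation above):

```cpp
// One backpropagation weight update with momentum, per the slide:
//   w_ij = w'_ij + (1 - alpha)*eta*e_j*x_i + alpha*(w'_ij - w''_ij)
// where e_j = y_j*(1 - y_j)*(d_j - y_j) is the error term.
struct Weight {
    double w  = 0.0;  // current value, w'_ij
    double wp = 0.0;  // previous value, w''_ij
};

void update(Weight& wt, double ej, double xi,
            double alpha = 0.5, double eta = 0.1) {
    double wNew = wt.w + (1.0 - alpha) * eta * ej * xi
                       + alpha * (wt.w - wt.wp);
    wt.wp = wt.w;   // shift history for the next momentum term
    wt.w  = wNew;
}
```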

Page 29: Artificial Neural Networks

Scalable Character Test Set (4 output values)

As a small example, let's assume we want an ANN to detect which of four letters (A, B, C, or D) is being displayed, and let's represent each letter using 15 pixels (3x5).

[Images: the four letters on 3x5 pixel grids; each pixel is 1 (on) or 0 (off)]

Page 30: Artificial Neural Networks

Digitizing Character Pixels (an "A")

[3x5 pixel image of an "A"] = 0 1 0 1 0 1 1 1 1 1 0 1 1 0 1

Note: you don't have to use just 0 or 1; you could use values from 0.0 to 1.0 (i.e. shades of grey) or colors. A sketch of this encoding follows below.
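For concreteness, a sketch of this encoding as a 15-element input vector (assuming row-by-row pixel ordering):

```cpp
#include <array>

// The letter "A" on a 3x5 grid, flattened row by row into 15 inputs:
//   0 1 0
//   1 0 1
//   1 1 1
//   1 0 1
//   1 0 1
const std::array<double, 15> letterA = {
    0, 1, 0,
    1, 0, 1,
    1, 1, 1,
    1, 0, 1,
    1, 0, 1
};
```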

Page 31: Artificial Neural Networks

Small Character Test Set Example (4 output values)

[Diagram of a 15-5-4 network]
• 15 inputs (3x5 pixels)
• 75 weights (i.e. 15x5) between the input and hidden layers
• 20 weights (i.e. 5x4) between the hidden and output layers
• 4 outputs (A, B, C, and D)

(Note: 95 weights = 380 bytes)

Page 32: Artificial Neural Networks

ANN Training Process

• Set all weights to random values.
• Using a large training set, show the network one example at a time (where we know what the output should be) and adjust the weights each time using forward and backward propagation (a sketch of this loop follows below).
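A deliberately tiny, fully concrete stand-in for this loop (illustrative only, not the authors' code: a single-layer 15-input, 4-output network trained one example at a time, using the error term from the backpropagation slide without the momentum term):

```cpp
#include <cmath>
#include <cstdlib>
#include <vector>

struct Example {
    std::vector<double> pixels;  // 15 inputs (3x5 grid)
    std::vector<double> target;  // 4 desired outputs (A, B, C, D)
};

struct TinyNet {
    static const int NIN = 15, NOUT = 4;
    double w[NOUT][NIN];
    TinyNet() {                               // random initial weights
        for (auto& row : w)
            for (double& wij : row)
                wij = (std::rand() / (double)RAND_MAX - 0.5) * 0.1;
    }
    std::vector<double> forward(const std::vector<double>& in) const {
        std::vector<double> out(NOUT);
        for (int j = 0; j < NOUT; ++j) {
            double x = 0.0;
            for (int i = 0; i < NIN; ++i) x += w[j][i] * in[i];
            out[j] = 1.0 / (1.0 + std::exp(-x));  // logistic sigmoid
        }
        return out;
    }
    void update(const std::vector<double>& in, const std::vector<double>& y,
                const std::vector<double>& d, double eta = 0.1) {
        for (int j = 0; j < NOUT; ++j) {
            double ej = y[j] * (1 - y[j]) * (d[j] - y[j]);  // error term
            for (int i = 0; i < NIN; ++i) w[j][i] += eta * ej * in[i];
        }
    }
};

void train(TinyNet& net, const std::vector<Example>& data, int epochs) {
    for (int e = 0; e < epochs; ++e)
        for (const Example& ex : data)        // one example at a time
            net.update(ex.pixels, net.forward(ex.pixels), ex.target);
}
```

A real MLP would insert a hidden layer and propagate the error terms backward through it, as on the previous slides.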

Page 33: Artificial Neural Networks

Example: Training for an "A"

Input (pixels):   0 1 0 1 0 1 1 1 1 1 0 1 1 0 1
Desired output:   1 0 0 0

• Feed forward: determine the output values for this input.
• Back propagation: adjust the weights to reduce the error.

Page 34: Artificial Neural Networks

Example: Use of Network after Training

Given some inputs, the trained network produces outputs, e.g.: 0.85 0.19 0.23 0.54

Since output 1 is large, this input is likely to be an "A".
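A one-function sketch of that decision step (winner-take-all over the output units):

```cpp
#include <algorithm>
#include <vector>

// Pick the letter whose output unit is largest ("winner take all").
char classify(const std::vector<double>& out) {
    const char letters[] = {'A', 'B', 'C', 'D'};
    auto it = std::max_element(out.begin(), out.end());
    return letters[it - out.begin()];
}
// classify({0.85, 0.19, 0.23, 0.54}) returns 'A'.
```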

Page 35: Artificial Neural Networks

Scalable Character Test Set (48 output values)

• Inputs = 15 pixels
• Outputs = 48 characters
• If we use 10 hidden neurons, then: weights = 48*10 + 10*15 = 630 (2520 bytes)

Page 36: Artificial Neural Networks

How Long to Train? How Large Should the Training Set Be?

• Very difficult to say ... it depends on the problem.
• How many hidden neurons to use? Two common guesses (sketched in code below):
  • log2(no. of inputs)?
  • the average of the no. of inputs and outputs?
• How many hidden layers?
  • Need at least one to capture nonlinear effects.
  • People seldom use more than one.
  • The human brain has roughly six layers in the neocortex.
• Need to avoid overtraining and undertraining the network.
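Those two sizing heuristics are easy to encode; a small illustrative sketch:

```cpp
#include <cmath>

// Two common rules of thumb for choosing a hidden-layer size.
int hiddenByLog2(int nInputs)              { return (int)std::ceil(std::log2(nInputs)); }
int hiddenByAverage(int nInputs, int nOut) { return (nInputs + nOut) / 2; }
// e.g. for the 15-input, 48-output character set:
//   hiddenByLog2(15) == 4,  hiddenByAverage(15, 48) == 31
```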

Page 37: Artificial Neural Networks

Training an ANN

[Plot: training error (0 to 4) vs. iterations (0 to 3000)]

Page 38: Artificial Neural Networks

Recurrent ANN

Feedback is used in the human brain, and recurrent ANNs are very valuable for time-series applications (a minimal sketch of one recurrent step follows below).
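As a hedged illustration (an Elman-style recurrence; the slide's exact variant isn't specified), the hidden state is fed back in with the next input:

```cpp
#include <cmath>
#include <vector>

// One step of a simple recurrent layer: the new hidden state depends on
// the current input AND the previous hidden state (the feedback path).
std::vector<double> recurrentStep(const std::vector<double>& input,
                                  const std::vector<double>& hPrev,
                                  const std::vector<std::vector<double>>& Win,
                                  const std::vector<std::vector<double>>& Wrec) {
    std::vector<double> h(hPrev.size());
    for (std::size_t j = 0; j < h.size(); ++j) {
        double x = 0.0;
        for (std::size_t i = 0; i < input.size(); ++i) x += Win[j][i] * input[i];
        for (std::size_t i = 0; i < hPrev.size(); ++i) x += Wrec[j][i] * hPrev[i];
        h[j] = std::tanh(x);  // squashing activation
    }
    return h;  // becomes hPrev at the next time step
}
```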

Page 39: Artificial Neural Networks

Our Parallel ANN Approach

• Inputs feed into each column.
• In the hidden layers, not all neurons are connected to all other neurons.

Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
http://www.personal.psu.edu//lnl/papers/aiaa20057168.pdf
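A rough sketch of the column idea under stated assumptions (this is not the paper's implementation: assume each MPI rank owns one column of hidden neurons, every rank sees the full input, and only the output sums need communication):

```cpp
#include <mpi.h>
#include <vector>

// Hypothetical column-parallel forward pass: every rank receives the whole
// input vector, computes the activations of only its own column of hidden
// neurons, and the partial output sums are then combined across ranks.
std::vector<double> parallelForward(const std::vector<double>& input,
                                    int nOutputs) {
    std::vector<double> partialOut(nOutputs, 0.0);
    // ... compute this rank's hidden-column activations from `input`,
    //     and accumulate their contributions into partialOut ...

    std::vector<double> out(nOutputs, 0.0);
    MPI_Allreduce(partialOut.data(), out.data(), nOutputs,
                  MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);  // sum over columns
    return out;  // every rank now holds the full output vector
}
```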

Page 40: Artificial Neural Networks

Object-Oriented (C++) Artificial Neural Network (ANN)

• Serial code: C++
• Parallel code: C++ combined with MPI

Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
http://www.personal.psu.edu//lnl/papers/aiaa20057168.pdf

Page 41: Artificial Neural Networks

Serial ANN Training Time

Table 1. Training time required for the serial ANN

Resolution   Total Weights   Iterations   Training time (sec)
1            1,890           37,824       5.4
2            3,240           64,800       13.1
4            8,640           172,800      76
5            12,690          253,776      157
8            30,240          604,800      842
9            37,890          757,776      1313

Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
http://www.personal.psu.edu//lnl/papers/aiaa20057168.pdf

Page 42: Artificial Neural Networks

ANN Training (serial code)

[Plot: percent correct vs. number of trainings (0 to 20,000), for networks with 2016, 5184, and 27072 weights]

Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
http://www.personal.psu.edu//lnl/papers/aiaa20057168.pdf

Page 43: Artificial Neural Networks

ANN Training (parallel code)

[Plot: training results for the parallel code]

Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
http://www.personal.psu.edu//lnl/papers/aiaa20057168.pdf

Page 44: Artificial Neural Networks

Training Time (scaling processors with weights)

[Plot: training time (200 to 800 sec, left axis) and number of weights (1.0E+05 to 3.0E+07, right axis) vs. number of processors (0 to 600)]

Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
http://www.personal.psu.edu//lnl/papers/aiaa20057168.pdf

Page 45: Artificial Neural Networks

Forward Propagation Time (scaling processors with weights)

[Plot: forward propagation time (0 to 5.0E-04 sec, left axis) and number of weights (1.0E+05 to 3.0E+07, right axis) vs. number of processors (0 to 600)]

Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
http://www.personal.psu.edu//lnl/papers/aiaa20057168.pdf

Page 46: Artificial Neural Networks

Large ANN Cases

Processors   Inputs    Neurons   Neurons per    Weights          Percent   Memory      CPU Time
                                 Hidden Layer                    Correct   Used (GB)   (sec)
16           37,500    1,584     256            9,613,376        100 %     0.08        246
64           150,000   6,272     1,024          153,652,480      100 %     1.20        2,489
500          600,000   25,000    4,000          2,400,106,384    89 %      19.0        6,238

The largest case has ~10 times fewer synapses than a rat, and trained in under two hours.

Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
http://www.personal.psu.edu//lnl/papers/aiaa20057168.pdf

Page 47: Artificial Neural Networks

Conclusions

• ANNs are widely used in practical applications.
• Large ANNs will be quite interesting.
• We have developed parallel object-oriented C++ software to simulate very large neural networks.
• More tests are needed, but the performance results are very promising.

Page 48: Artificial Neural Networks

References

1. Long, L.N. and Gupta, A., "Scalable Massively Parallel Artificial Neural Networks," AIAA Paper No. 2005-7168, Sept. 2005.
2. Mitchell, T.M., Machine Learning, McGraw-Hill, NY, 1997.
3. Rumelhart, D.E. and McClelland, J.L., Parallel Distributed Processing, MIT Press, 1986.
4. LeDoux, J., Synaptic Self, Penguin, New York, 2002.
5. Hawkins, J. and Blakeslee, S., On Intelligence, Times Books, New York, 2004.
6. Dean, T., "A Computational Model of the Cerebral Cortex," AAAI Conference, Pittsburgh, 2005.
7. Mountcastle, V.B., "Introduction to the special issue on computation in cortical columns," Cerebral Cortex, Vol. 13, No. 1, 2003.
8. Mumford, D., "On the computational architecture of the neocortex II: The role of cortico-cortical loops," Biological Cybernetics, Vol. 66, 1992.
9. Dennett, D.C., Consciousness Explained, Back Bay, Boston, 1991.
10. Kurzweil, R., The Age of Spiritual Machines, Penguin, NY, 1999.
11. Moravec, H., Robot: Mere Machine to Transcendent Mind, Oxford University Press, 1998.
12. Gerstner, W. and Kistler, W.M., Spiking Neuron Models, Cambridge Univ. Press, Cambridge, 2002.
13. http://www.nas.nasa.gov/Resources/Systems/columbia.html
14. Haykin, S., Neural Networks: A Comprehensive Foundation, 2nd Ed., Prentice-Hall, 1999.
15. Werbos, P.J., "Backpropagation: Basics and New Developments," in The Handbook of Brain Theory and Neural Networks, MIT Press, 1995.

Page 49: Artificial Neural Networks

Thank You. Questions?

Lyle N. Long
[email protected]
http://www.personal.psu.edu/lnl