
Page 1: Object-Oriented Rosenblatt Perceptron Using C++

Object-Oriented Artificial Neural Network with C++

SSE550 Object-Oriented Programming I

Project I (Chapter 1-5)

February 13, 2012

Samuel Bixler


Table of Contents

Introduction

Basic Perceptron Theory

NeuralNet Class

• UML Diagram

• Headers

• Interface

• Implementation

Main Program

• Headers

• Instantiation

• Control Loop

Output

• Figure 4 - Options Menu

• Figure 5 - Initialize Weights

• Figure 6 - Refresh Menu

• Figure 7 - Display Weights

• Figure 8 - Input Training Set

• Figure 9 - Train Net Default Learning Rate

• Figure 10 - Train Net Learning Rate 10.0

• Figure 11 - Display Weights after Training

• Figure 12 - Test Net Boolean Inputs

• Figure 13 - Test Net Float Inputs

• Figure 14 - Weights Plotted on Input Space

• Figure 15 - Set Activation Function

• Figure 16 - Set Learning Rate

• Figure 17 - Exit

Conclusion


Index of Topics Covered

Chapter 1 - Introduction to Computers and C++

Chapter 2 - Introduction to C++ Programming

• Compiler directives
• The main function
• Input statements
• Output statements
• Stream insertion operator
• Escape sequences
• Return statement
• Variable declarations
• Fundamental types
• Identifiers
• Memory
• Arithmetic
• Operator precedence
• Relational operators

Chapter 3 - Introduction to Classes, Objects and Strings

• User-defined classes
• Creating and using objects
• Declaring data members
• Defining member functions
• Calling member functions
• Passing data as arguments
• Local variables vs. data members
• Initial values via constructor
• Separating interface from implementation
• UML class diagrams
• Data member set methods

Chapter 4

• Constructing an algorithm in pseudocode
• Selection statements
• Repetition statements
• Assignment operators

Chapter 5

• More control statements
• Logical operators


Introduction

This project explores the application of object-oriented programming techniques to the construction of a single-neuron artificial neural network (ANN). The framework is designed in a scalable way so that it can represent more complex networks. The focus of the project was the construction of an easy-to-use NeuralNet class with member functions to perform common ANN operations. The class can be used to create and manipulate complex network architectures, which would be useful for real-world applications. This paper first addresses the theory of operation of a single-neuron ANN, then covers the implementation of the NeuralNet class, and finally presents the results of using the class to perform the logical AND operation.

Basic Perceptron Theory

Artificial neural networks are mathematical models of biological neurons, usually used to perform functions that are not easily achieved with traditional algorithms. Pattern recognition and classification are two tasks for which neural networks are especially well suited.

Figure 1 - Two Input/Single Output Neural Network

The simplest example of an ANN is the Rosenblatt Perceptron, the name given to a single-neuron ANN and to the algorithm used to train it. Figure 1 is a graphical representation of the functions and data that make up the Rosenblatt Perceptron, and it is the model explored with the NeuralNet class in this project. The mathematical neuron functions much like a biological one: inputs, either from the environment (the user) or from other neurons (hidden layers), are summed in the body of the neuron, and if a certain threshold, called the activation potential, is reached, the output changes. The function that maps the weighted sum of the inputs to the output is called the activation function. Several functions can be used for this step, depending on the specific network architecture and data. The Rosenblatt Perceptron can also be viewed mathematically as a line in the 2D "input space" that is adjusted to divide the inputs according to the class each belongs to. In the general case of n inputs, the weights represent an n-dimensional hyperplane that can perfectly classify any linearly separable sets of inputs. Unfortunately, the Rosenblatt Perceptron performs very poorly at classifying inputs that are not linearly separable, and more advanced networks and training algorithms are needed for such problems.

Figure 2 - Two Input Decision Boundary

To understand exactly how the Rosenblatt Perceptron classifies inputs, it is helpful to graph the line on the input space: if an input lies above the line it belongs to one class, and to the other if it lies below. There are several ways to train a neural network; the method this project uses is called supervised training. A training set of inputs and the correct outputs is shown to the perceptron, and the weights are modified according to a learning rule, discussed later.


NeuralNet Class

The NeuralNet class is composed of a set of private data members that store the architecture parameters required to initialize and train the network. The class contains methods to initialize, train and test the network, several mutators, and a function required by the learning algorithm. A UML diagram of the NeuralNet class is shown below.

Figure 3 - UML Diagram NeuralNet Class

NeuralNet
---------
- numInput: int
- numHidden: int
- numOutput: int
- numTrainSets: int
- activationSelect: int
- learnRate: float
- sigmoidCoef: float
- weightMatrix: Eigen::MatrixXf
- trainingInputs: Eigen::MatrixXf
- trainingOutputs: Eigen::MatrixXf
- testInputs: Eigen::VectorXf
---------
<<constructor>> + NeuralNet()
+ refreshScreen(): void
+ initializeWeights(): void
+ displayWeights(): void
+ inputTrainSet(): void
+ trainNet(): void
+ testNet(): void
+ setActivationFunction(): void
+ setLearningRate(): void
+ activationFunction(float): float


Interface

#include <Eigen/Dense>

class NeuralNet
{
public:
    NeuralNet();
    void refreshScreen();
    void initializeWeights();
    void displayWeights();
    void inputTrainSet();
    void trainNet();
    void testNet();
    void setActivationFunction();
    void setLearningRate();
    float activationFunction(float);

private:
    // Private data //
    // Network architecture parameters
    int numInput, numHidden, numOutput, numTrainSets;
    int activationSelect;
    float learnRate, sigmoidCoef;
    // Eigen matrices and vectors
    Eigen::MatrixXf weightMatrix;
    Eigen::MatrixXf trainingInputs;
    Eigen::MatrixXf trainingOutputs;
    Eigen::VectorXf testInputs;
};

The contents of the header file NeuralNet.h, where the interface of the NeuralNet class is defined, are shown above. The implementation is in NeuralNet.cpp and will be covered piece by piece in the next section. The data members numInput, numHidden and numOutput are integers used during the instantiation of a NeuralNet object to specify the architecture of the network. These parameters determine the size of the weight matrix and also control the for loops that initialize the weights and train the network. The numHidden parameter is not used in this project, but it is included for flexibility; its purpose is to specify the number of hidden layers of neurons in the network. In the single-neuron case there are no hidden layers (i.e., no layers other than the input and output). The floating-point learnRate parameter is used by the training algorithm to vary the amount of adjustment made to the weight matrix after each training iteration. For the remaining data members, data types from the Eigen matrix library were used. Eigen is an open source template library that makes it easy to create, manipulate and display matrices and vectors. The weightMatrix data member is a dynamically allocated single-precision floating-point matrix that stores the neural network weights.


Headers

// NeuralNet.cpp
#include <iostream>
#include "NeuralNet.h"
#include <cmath>
#include "stdlib.h"
using namespace std;

NeuralNet.cpp uses the #include compiler directive to include several required external libraries. The iostream header provides access to the system input/output. The NeuralNet.h header must be included because it contains the NeuralNet class interface definition as well as the member function prototypes. The cmath library provides the exponential function exp(), which the activationFunction method needs to generate the sigmoid and hyperbolic tangent outputs. The stdlib.h header is included so that system("cls") can be used to clear the console in the refreshScreen method.

Implementation

NeuralNet::NeuralNet()
{
    // Default parameters
    numInput = 2;
    numHidden = 0;   // no hidden layers in the single-neuron case
    numOutput = 1;
    numTrainSets = 4;
    learnRate = 0.1;
    activationSelect = 1;
    sigmoidCoef = 4.0;

    // Matrix and vector sizing
    weightMatrix.resize(numInput + 1, numOutput);
    trainingInputs.resize(numTrainSets, numInput + 1);
    trainingOutputs.resize(numTrainSets, numOutput);
    testInputs.resize(numInput + 1);

    // Define training set for AND function (default)
    trainingInputs << 1, 0, 0,
                      1, 0, 1,
                      1, 1, 0,
                      1, 1, 1;
    trainingOutputs << 0, 0, 0, 1;
}


The NeuralNet class has a default constructor that initializes the private data members to the values seen in the code above. The constructor also resizes the matrices and vectors based on these defaults; this code will eventually need to move once functionality is added that lets the user define the network architecture. The trainingInputs and trainingOutputs matrices are populated with the data needed to teach the perceptron the logical AND function.

void NeuralNet::refreshScreen()
{
    system("cls");
    cout << endl;
    cout << " [1] Main menu" << endl;
    cout << " [2] Initialize weight matrix " << endl;
    cout << " [3] Display weights matrix " << endl;
    cout << " [4] Input training set " << endl;
    cout << " [5] Train network " << endl;
    cout << " [6] Test the net " << endl;
    cout << " [7] Set activation function " << endl;
    cout << " [8] Set learning rate " << endl;
    cout << " [9] Exit " << endl;
}

The refreshScreen method clears the console output using the system("cls") call mentioned in the headers section, then reprints the main menu options by sending output to the cout stream with the stream insertion operator. The user can call this method from the main program whenever the screen is cluttered or a reminder of the options is needed.

void NeuralNet::initializeWeights()
{
    // Initialize bias weight to +1
    for (int b = 0; b < numOutput; b++)
        weightMatrix(0, b) = 1;

    // Initialize weights to random values in (-1, +1) with mean 0
    for (int out = 0; out < numOutput; out++)
    {
        for (int in = 1; in <= numInput; in++)
            weightMatrix(in, out) = (float)rand() / (float)RAND_MAX * 2 - 1;
    }
    cout << " The weight matrix has been initialized with random values.\n";
}

The initializeWeights method uses two for loops to initialize the synaptic weights to uniformly distributed pseudorandom numbers between -1 and +1 (mean 0), and the bias to +1. The C++ standard library provides the random number generator rand(), but because its return type is int, the initializeWeights method casts the result to float, divides by RAND_MAX, scales by 2 and offsets by -1 to obtain the desired range and mean. In order for the Rosenblatt Perceptron to properly classify inputs that are not centered about the origin of the input space, a bias is used to shift the decision boundary up or down. This bias is part of the weight matrix and is initialized to +1.

void NeuralNet::displayWeights()
{
    cout << " The weights are: \n\n";
    cout << fixed;
    cout.precision(2);
    for (int r = 0; r <= numInput; r++)
    {
        for (int c = 0; c < numOutput; c++)
        {
            if (weightMatrix(r, c) < 0)
                cout << " " << weightMatrix(r, c);
            else
                cout << "  " << weightMatrix(r, c);  // extra space aligns with the minus sign
        }
        cout << endl;
    }
}

It is interesting to see what is actually stored in the weight matrix, so the displayWeights method was created. The output format and precision are set to fixed with two decimal places, then two for loops print each value in the weight matrix. The insertion operator could have printed weightMatrix directly, since the Eigen::MatrixXf type supports this, but because the values may be negative or positive, an if statement was included to keep the decimal points aligned for readability.

Time did not permit implementation of the inputTrainSet method; it was planned, but not a top priority. The method should allow the user to input the training data either from a file or by entering it manually. For now, calling this method prints a message telling the user that the functionality is not yet implemented.


void NeuralNet::trainNet()
{
    // Local variables
    float activation, product, error;
    int epoch = 0, sumMisclass;

    // Loop until the neural network doesn't misclassify any of the training inputs
    do
    {
        // Initialize training metrics variables
        epoch++;
        sumMisclass = 0;

        // Calculate output and error for each set of training inputs
        for (int i = 0; i < numTrainSets; i++)
        {
            // Calculate error
            product = trainingInputs.row(i).dot(weightMatrix.transpose().row(0));
            activation = activationFunction(product);
            error = trainingOutputs(i, 0) - activation;

            // Update weight matrix
            weightMatrix += trainingInputs.row(i).transpose() * learnRate * error;

            // Sum misclassified inputs
            if (error != 0.0)
                sumMisclass++;
        }
        cout << " " << sumMisclass << " misclassified inputs for epoch "
             << epoch << endl;
    } while (sumMisclass > 0);
    cout << " The network has finished training.\n";
}

The trainNet method is the most complex segment of code in the NeuralNet class. It executes the Rosenblatt Perceptron supervised training algorithm, manipulating the weight matrix using the trainingInputs and trainingOutputs data to teach the neuron the AND operator. The numTrainSets integer controls looping in the algorithm. Local variables store intermediate values (float activation, product), the output error (float error), the epoch number (int epoch) and the number of misclassified inputs in a given epoch (int sumMisclass).


The pseudocode algorithm is:

• Set epoch count to 0.
• While the number of misclassified inputs is greater than 0:
   o Set misclassified inputs to 0.
   o Increment epoch by 1.
   o For every input in the training set:
      - Compute the weighted sum of the inputs using the current weights.
      - Compute the hard-limited output of the weighted sum.
      - Compute the error as the difference between the training output and the neuron output.
      - Update the weight matrix using the training rule: the new weight equals the old weight plus the product of the learning rate, the current input and the error.
      - If the error is not equal to 0, increment misclassified inputs by 1.
   o Display the epoch number and the number of misclassified inputs.
• Return to the calling function.

float NeuralNet::activationFunction(float x)
{
    float result;
    switch (activationSelect)
    {
    case 1: // Threshold function
        if (x >= 0)
            result = 1;
        else
            result = 0;
        break;
    case 2: // Sigmoid function
        result = 1 / (1 + exp(-sigmoidCoef * x));
        break;
    case 3: // Hyperbolic tangent function
        result = (exp(x) - exp(-x)) / (exp(x) + exp(-x));
        break;
    }
    return result;
}

The activationFunction method by default performs a threshold operation on its floating-point input and returns a floating-point result. Several other functions, such as the sigmoid and hyperbolic tangent, can also serve as the activation function, but the threshold works best for the Rosenblatt Perceptron.

void NeuralNet::testNet()
{
    float activation, product; // Local variables

    // Set bias input
    testInputs(0) = 1;

    // Loop to fill the test input vector with user values
    for (int i = 1; i < testInputs.rows(); i++)
    {
        cout << " Enter input " << i << ": ";
        cin >> testInputs(i);
    }

    // Compute the neuron's output given the test inputs
    product = testInputs.dot(weightMatrix.transpose().row(0));
    activation = activationFunction(product);
    cout << "\n Given the inputs you entered,\n the Rosenblatt Perceptron";
    cout << " says the correct answer is: " << activation << endl;
}

The testNet method fills an Eigen VectorXf with a user-specified set of inputs. It then shows the perceptron the input set and computes the intermediate product and activation value using the weight matrix produced by the trainNet method. The result is sent to the console.


The NeuralNet class contains two set methods that let the user change the learning rate and the activation function.

void NeuralNet::setActivationFunction()
{
    int activationSelectTemp;

    // Activation function selection menu
    cout << " [1] Threshold" << endl;
    cout << " [2] Sigmoid" << endl;
    cout << " [3] Hyperbolic Tangent" << endl;
    cout << " Select an activation function: ";
    cin >> activationSelectTemp;

    switch (activationSelectTemp)
    {
    case 1:
        activationSelect = activationSelectTemp;
        cout << "\n The threshold function has been selected.\n";
        break;
    case 2:
        activationSelect = activationSelectTemp;
        cout << "\n The sigmoid function has been selected.\n";
        cout << "\n Enter the exponential coefficient (positive real): ";
        cin >> sigmoidCoef;
        if (sigmoidCoef > 0.0)
            cout << "\n The coefficient has been set to: " << sigmoidCoef << endl;
        else
        {
            sigmoidCoef = 4.0;
            cout << "\n Invalid entry!";
            cout << "\n The coefficient has been set to the default (4.0)\n";
        }
        break;
    case 3:
        activationSelect = activationSelectTemp;
        cout << "\n The hyperbolic tangent function has been selected.\n";
        break;
    default:
        activationSelect = 1;
        cout << "\n Invalid entry!";
        cout << "\n The activation function has been set to the default (Threshold).\n";
        break;
    }
}


void NeuralNet::setLearningRate()
{
    // Set a new learning rate
    cout << " Enter the new learning rate (positive real): ";
    cin >> learnRate;
    if (learnRate > 0.0)
    {
        cout << " The learning rate has been set to " << learnRate << endl;
    }
    else
    {
        learnRate = 0.1;
        cout << " Invalid entry!\n";
        cout << " The learning rate has been set to the default (0.1)\n";
    }
}

Main Program

The main.cpp file begins with #include directives to access the required libraries. The iostream header provides input/output and formatting capabilities. The ctime library is required to generate a seed for the rand function. The compiler is informed that the std namespace is being used.

#include <iostream>
#include <ctime>
#include "NeuralNet.h"
using namespace std;

The main function takes no arguments and, as the C++ standard requires, has return type int; it returns 0 to indicate normal termination. The program begins by seeding the pseudorandom number generator with the current system time via srand(). The next step is creating an instance of NeuralNet called myNet. After the myNet object is created, a call to the refreshScreen method clears the screen and displays the options to the user. The option variable holds the user's choice and is used as the switch variable in the option selection statement. The boolean exit is used to leave the do-while loop, and the program, if the user chooses to do so.

int main()
{
    // Seed the random number generator
    srand((unsigned)time(0));

    // Create a NeuralNet object
    NeuralNet myNet;

    // Clear the console and display the options
    myNet.refreshScreen();

    int option;
    bool exit = false;


Execution now enters a do-while loop and prompts the user to select an option. The input is saved in option, and an if statement tests whether the entry is a value between 1 and 9, the valid options.

    do
    {
        // User interface
        cout << "\n Enter your selection: ";
        cin >> option;

        // Invalid input check
        if ((option > 0) && (option < 10))
        {
            cout << endl;
            // Menu choice selection switch
            switch (option)
            {
            case 1:
                myNet.refreshScreen();
                break;
            case 2:
                myNet.initializeWeights();
                break;
            case 3:
                myNet.displayWeights();
                break;
            case 4:
                myNet.inputTrainSet();
                break;
            case 5:
                myNet.trainNet();
                break;
            case 6:
                myNet.testNet();
                break;
            case 7:
                myNet.setActivationFunction();
                break;
            case 8:
                myNet.setLearningRate();
                break;
            case 9:
                exit = true;
                break;
            default:
                cout << " Invalid input! Please enter an option (1 - 9):\n";
            }
        }
        else
        {
            cout << " Invalid entry!\n";
            cout << " Enter a number corresponding to one of the 9 options.";
        }
    } while (exit == false);

    return 0;
}


NeuralNet methods are called based on the user input using a switch control statement. If the user chooses option 9, exit is set to true, so when the do-while condition is tested the loop is exited and the program returns. If invalid data is entered for option, the else block executes and a message informs the user.

Output

The following pages show screenshots of the program's response to user inputs and demonstrate its capability to learn the AND function.

Figure 4 - Options Menu


Figure 5 - Initialize Weights


Figure 6 - Refresh Menu


Figure 7 - Display Weights


Figure 8 - Input Training Set


Figure 9 - Train Net, Default Learning Rate (0.1)


Figure 10 - Train Net, Learning Rate Set to (10.0)


Figure 11 - Display Weights After Training


Figure 12 - Test Net, Boolean Inputs


Figure 13 - Test Net, Float Inputs

The final set of screenshots shows the perceptron's response to inputs that are not 0 or 1; Figure 13 shows several examples. Even though the hyperplane is trained to separate the four inputs shown to it in the training set, it finds only one of the infinitely many solutions to the problem. The results the neural network generates are fuzzy, and the neuron learns only as much as it needs to meet the learning criterion.


Figure 14 - Weights Graphed in Input Space

The figure above shows the training set of inputs along with the untrained and trained decision boundaries plotted on the 2D input space. The lines are plotted from weight matrices observed in the program's output.


Figure 15 - Set Activation Function


Figure 16 - Set Learning Rate

Figure 17 - Exit


Conclusion

The goal of this project was to design an artificial neural network class using object-oriented C++ techniques and to verify the NeuralNet class's interface and implementation by creating and testing the Rosenblatt Perceptron. As the results indicate, this was successfully accomplished. The class is very simple at this point and would need much more work to classify patterns that are not linearly separable and to make use of the more advanced activation functions. The program has only minimal user input validation and exception handling, which would also need to be improved in the future.