optimization of the hidden layer of a multilayer … of the hidden... · 2016-11-22 · network....

OPTIMIZATION OF THE HIDDEN LAYER OF A MULTILAYER PERCEPTRON WITH BACKPROPAGATION (BP) NETWORK USING HYBRID K-MEANS-GREEDY ALGORITHM (KGA) FOR TIME SERIES PREDICTION James Tan Yiaw Beng Master of Engineering 2012


Page 1: OPTIMIZATION OF THE HIDDEN LAYER OF A MULTILAYER … of the Hidden... · 2016-11-22 · network. The proposed KGA model combines greedy algorithm withk-means++ clustering in this

OPTIMIZATION OF THE HIDDEN LAYER OF A MULTILAYER PERCEPTRON WITH BACKPROPAGATION (BP) NETWORK USING

HYBRID K-MEANS-GREEDY ALGORITHM (KGA) FOR TIME SERIES PREDICTION

James Tan Yiaw Beng

Master of Engineering 2012


To my parents, family, relatives, friends and everyone else too numerous to name here

who have worked hard and tirelessly to assist in the completion of this research project


ACKNOWLEDGEMENT

First of all I would like to take this opportunity to give my sincerest appreciation to

my academic supervisors Ir. David Bong Boon Liang and Assoc. Prof. Ir. Dr. Andrew Ragai

Henry Rigit for their patience in providing their support, knowledge, guidance and feedback

throughout this research. I would also like to thank the late Mdm. Irene Lim and Mr. Law

Kok Heng of Sarawak Energy Bhd. for their consent and assistance in providing the necessary

data to be utilized in this research, as well as their comments and ideas during the research.

Special thanks also go out to my family, relatives and friends for their motivation and support

through the good times and the bad times I have experienced for the duration of this research.

I would also like to thank the Ministry of Science, Technology and Innovation

(MOSTI) and Universiti Malaysia Sarawak (UNIMAS) for financial support and the

necessary facilities in performing the activities necessary for this research. I would also like to

express my gratitude to Solar Influences Data Analysis Center (SIDC) of the Royal

Observatory of Belgium and the Department of Statistics, Malaysia for their consent in

providing the additional data to be utilized to evaluate the performance and effectiveness of

the proposed model developed during this research.

Last but not least, I would like to express my heartfelt thanks to all other individuals

who are too numerous to be named here and who were involved either directly or indirectly

for the duration of this research.


ABSTRACT

Research into the field of artificial neural networks (ANN) has been gaining interest rapidly in recent years, as ANNs are fast becoming a popular tool of choice for predicting time series trends. This recent surge in popularity can be attributed to the fact that an ANN, especially a multilayer perceptron with backpropagation (BP) network that has the optimal number of neurons in its hidden layer, is able to predict unknown values of the time series it is trained with more accurately than other methods implemented to predict the same time series. The drawback of using BP networks in time series prediction is that it is difficult and time-consuming to find the optimal number of neurons in the hidden layer that minimizes the prediction error. We propose a model known as the K-means-Greedy Algorithm (KGA) model in this research to overcome this serious drawback of the BP network. The proposed KGA model combines the greedy algorithm with k-means++ clustering to assist users in automating the search for the optimal number of neurons inside the hidden layer of the BP network. The evaluation results of the proposed KGA model using several time series, namely the sunspot data, the Mackey-Glass time series, and electrical load forecasting using data from several econometric factors as well as historical electricity demand data, show that the proposed KGA model is effective in finding the optimal number of neurons for the hidden layer of a BP network that is used to perform time series prediction.


ABSTRAK

Research on artificial neural networks (ANN) is attracting growing interest, because ANNs are increasingly popular among researchers for forecasting time series. This rise in popularity is because an ANN, particularly a multilayer perceptron with backpropagation algorithm (BP) network, that has the optimal number of neurons in its hidden layer can predict the unknown values of a time series more accurately than other methods. The notable weakness of the BP network is that the process of obtaining the optimal BP network, which minimizes the error between the predicted values and the actual values of the time series, is difficult and time-consuming. Therefore, a model that combines two algorithms, namely the k-means++ clustering algorithm and the greedy algorithm, named the K-means-Greedy Algorithm (KGA) model, has been developed to help users obtain the optimal number of neurons in the hidden layer of the BP network. The model was evaluated using several time series, namely the sunspot time series, the Mackey-Glass time series, and forecasts of future electricity consumption using factors that can influence electricity consumption as well as past electricity consumption data. The results obtained from these evaluations show that the KGA model developed is able to find the optimal number of neurons for a BP network used to predict the values in the time series that was used to train that BP network.


TABLE OF CONTENTS

DEDICATION
ACKNOWLEDGEMENT
ABSTRACT
ABSTRAK
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
LIST OF ABBREVIATIONS

CHAPTER 1: INTRODUCTION
1.1 INTRODUCTION
1.2 PROBLEM STATEMENT
1.3 RESEARCH OBJECTIVES
1.4 THESIS OVERVIEW

CHAPTER 2: LITERATURE REVIEW
2.1 INTRODUCTION
2.1.1 Definition of an ANN
2.1.2 The basic building block of an ANN: A neuron
2.1.3 Using neurons to construct different ANN models

2.1.4 Training an ANN
2.1.5 Stopping the training of an ANN
2.1.6 Applications of ANN
2.2 MULTILAYER PERCEPTRON WITH BACKPROPAGATION (BP) ALGORITHM
2.2.1 Architecture and algorithm of the BP network
2.2.2 Training a BP network and ending the training of the BP network
2.2.3 Using BP networks in prediction of time series trends
2.2.4 The importance of having hidden layers in BP networks
2.2.5 The importance of having the right size of the hidden layer
2.2.6 Optimizing the hidden layer in the BP network
2.3 OVERVIEW OF THE PROPOSED KGA MODEL
2.3.1 Parameters of the BP network to be optimized
2.3.2 Overall design of the proposed KGA model
2.3.3 Greedy algorithm
2.3.4 K-means++ clustering for search space reduction
2.4 SUMMARY

CHAPTER 3: HYBRID K-MEANS-GREEDY ALGORITHM (KGA) MODEL
3.1 INTRODUCTION
3.2 OVERVIEW OF THE HYBRID KGA MODEL
3.3 IMPLEMENTATION DETAILS OF THE HYBRID KGA MODEL
3.3.1 The need for a suitable number of neurons in the hidden layer of the BP network

3.3.2 Optimization criteria and parameters
3.3.3 The algorithm of the proposed KGA model
3.4 BENCHMARK TIME SERIES DATA USED TO TEST THE PROPOSED KGA MODEL
3.4.1 Annual sunspot data
3.4.2 Mackey-Glass time series
3.5 LOAD FORECAST OF THE TOTAL ELECTRICITY DEMAND IN SARAWAK (1970-2005)
3.5.1 Econometric load forecast of the total electricity demand in Sarawak (1970-2005)
3.5.2 The total electricity demand in Sarawak (1970-2005) based on historical data
3.6 IMPLEMENTING THE PROPOSED KGA MODEL
3.6.1 The architecture, training and testing of the BP network
3.6.2 Executing the proposed KGA model
3.7 EVALUATING THE PERFORMANCE OF THE PROPOSED KGA MODEL
3.7.1 Verification of the choice made by the proposed KGA model
3.7.2 Performance of the BP network optimized by the proposed KGA model versus performance of other methods
3.8 SUMMARY

CHAPTER 4: RESULTS, ANALYSIS AND DISCUSSION FOR APPLICATIONS ON BENCHMARK DATA
4.1 INTRODUCTION

4.2 SUNSPOT DATA
4.2.1 Method of evaluation of the proposed KGA model
4.2.2 Criteria of evaluation of the proposed KGA model
4.2.3 Evaluation results of the proposed KGA model
4.2.4 Verification of results obtained using the proposed KGA model
4.2.5 Comparison between the results obtained using the proposed KGA model with results obtained using another method
4.3 MACKEY-GLASS TIME SERIES DATA
4.3.1 Evaluating the prediction made by the BP network
4.3.2 Criteria of evaluation of the proposed KGA model
4.3.3 Method of evaluation of the proposed KGA model
4.3.4 Results of the evaluation of the proposed KGA model
4.3.5 Verification of the results obtained by the proposed KGA model
4.3.6 Comparison of the results obtained by the proposed KGA model with results obtained using other methods
4.4 SUMMARY

CHAPTER 5: RESULTS, ANALYSIS AND DISCUSSION FOR APPLICATIONS ON ELECTRICAL LOAD FORECASTING
5.1 INTRODUCTION
5.1.1 The experiments to be conducted
5.1.2 The criteria used for evaluation of the proposed KGA model
5.1.3 Structure of the results and discussion of the experiments

5.2 ECONOMETRIC LOAD FORECASTING USING 2 FACTORS
5.2.1 Introduction
5.2.2 Correlation between each of the factors and the total electricity demand
5.2.3 Architecture of the BP network and the error metric used
5.2.4 Performance of the proposed KGA model on predicting the total electricity demand in Sarawak using 2 factors
5.2.5 Verification of the results obtained using the proposed KGA model
5.2.6 Effect of finding the optimal number of neurons in the hidden layer of the BP network on the prediction of total electricity demand for individual years based on 2 factors
5.3 USING 4 FACTORS TO PREDICT THE TOTAL ELECTRICITY DEMAND
5.3.1 Architecture of the BP network and the error metric used
5.3.2 Performance of the proposed KGA model in using 4 factors to predict the total electricity demand in Sarawak
5.3.3 Verification of the performance of the proposed KGA model
5.3.4 Effect on finding of optimal number of neurons in the hidden layer of the BP network on the prediction of total electricity demand for individual years based on 4 factors
5.4 COMPARISON OF MAPE ERROR BETWEEN USING 4 FACTORS TO PREDICT THE TOTAL ELECTRICITY DEMAND USING BP NETWORK, USING 2 FACTORS TO PREDICT THE TOTAL ELECTRICITY DEMAND USING BP NETWORK, AND USING MULTIPLE REGRESSION
5.5 PREDICTING THE TOTAL ELECTRICITY DEMAND IN SARAWAK BASED ON PAST HISTORICAL DATA (1970-2005)

5.6 USING DATA FOR THE PAST 2 YEARS TO PREDICT THE TOTAL ELECTRICITY DEMAND
5.6.1 Architecture of the BP network and the error metric used
5.6.2 Results of using the proposed KGA model
5.6.3 Verification of results obtained using the proposed KGA model
5.6.4 Effect on finding of optimal number of neurons in the hidden layer of the BP network on the prediction of total electricity demand for individual years based on historical data for 2 years
5.7 USING DATA FROM THE PAST 5 YEARS TO PREDICT THE TOTAL ELECTRICITY DEMAND
5.7.1 Architecture of the BP network and the error metric used
5.7.2 Results obtained from implementing the proposed KGA model
5.7.3 Verification of the results obtained from implementing the proposed KGA model
5.7.4 Effect on finding of optimal number of neurons in the hidden layer of the BP network on the prediction of total electricity demand for individual years based on historical data for 5 years
5.8 COMPARISON BETWEEN THE PERFORMANCES OF THE BP NETWORKS OPTIMIZED BY THE PROPOSED KGA MODEL AND THE REGRESSION METHOD
5.9 SUMMARY

CHAPTER 6: CONCLUSION AND FUTURE WORK
6.1 BACKGROUND OF THE RESEARCH
6.2 SUMMARY OF THE RESEARCH
6.3 RESEARCH FINDINGS AND CONTRIBUTIONS
6.4 SUGGESTIONS FOR FUTURE WORK

REFERENCES

APPENDIX 1: SUNSPOT DATA
APPENDIX 2: MACKEY-GLASS TIME SERIES DATA
APPENDIX 3: GROSS DOMESTIC PRODUCT (GDP) OF SARAWAK (1970-2005)
APPENDIX 4: POPULATION OF SARAWAK (1970-2005)
APPENDIX 5: CONSUMER PRICE INDEX (CPI) OF SARAWAK (1970-2005)
APPENDIX 6: COMPARISON BETWEEN MULTILAYER PERCEPTRON WITH BACKPROPAGATION ALGORITHM AND RADIAL BASIS FUNCTION NETWORKS TO PERFORM FORECAST OF ELECTRICITY DEMAND
APPENDIX 7: APPLICATION OF MULTILAYER PERCEPTRON WITH BACKPROPAGATION ALGORITHM AND REGRESSION ANALYSIS FOR LONG-TERM FORECAST OF ELECTRICITY DEMAND: A COMPARISON
APPENDIX 8: USING HYBRID K-MEANS-GREEDY ALGORITHM TO OPTIMIZE THE HIDDEN LAYER OF A BACKPROPAGATION NETWORK FOR TIME SERIES PREDICTION

LIST OF FIGURES

2.1 A neuron with R inputs and a bias (Hippert et al., 2001)
2.2 Flow chart of the supervised training process of an ANN
2.3 Reinforcement learning process (Kumar, 2004)
2.4 A multilayer perceptron with backpropagation algorithm (BP) network (Basheer and Hajmeer, 2000)
2.5 The pseudocode of the backpropagation algorithm
2.6 Linearly-separable problems and nonlinearly-separable problems (Basheer and Hajmeer, 2000)
2.7 The effect of different size of the hidden layer on network generalization (Basheer and Hajmeer, 2000)
2.8 Pseudocode of the greedy algorithm
2.9 The flow chart of the proposed KGA model
3.1 How the proposed KGA model is to be used to optimize the number of hidden neurons in the BP network
3.2 Pseudocode of the algorithm of the proposed KGA model
3.3 The process of the selection of initial values of centroids after the first centroid is chosen uniformly at random among the observations x_i
3.4 The step-by-step process of clustering using k-means++ clustering (Wikipedia, 2009)
3.5 The process of repeated clustering on the database of correlations between errors and the number of neurons in the hidden layer

3.6 The process of evaluation of values of the number of neurons in the hidden layer of the BP network by greedy algorithm
3.7 Annual sunspot number (1770-1869)
3.8 (a) When r ≤ 17 (in this case, r = 6), the Mackey-Glass time series is oscillating at a period of 20 units. (b) However, when r ≥ 17 (in this case, r = 20), the Mackey-Glass time series follows a chaotic pattern. This figure is adapted from Mackey and Glass (1977)
3.9 Total demand for electricity in Sarawak from the years 1970-2005
3.10 Gross Domestic Product of Sarawak at constant price with the 1987 base year from the years 1970-2005
3.11 Population of Sarawak from the years 1970-2005
3.12 Consumer Price Index (CPI) of Sarawak from the years 1970-2005 with the year 2000 as the base year
3.13 Number of customers of electricity in Sarawak from the years 1970-2005
3.14 The method implemented to evaluate the effectiveness of the proposed KGA model implemented to optimize the number of neurons in the hidden layer of the BP network
3.15 The method implemented to compare the performance of the BP network optimized by the proposed KGA model against methods proposed by other researchers
4.1 The architecture of the BP network that is used to predict the sunspot number
4.2 Clustering of the errors and the corresponding number of neurons in the hidden layer
4.3 The values of the points in cluster 2 in Figure 4.2

4.4 Clustering result of the values that are inside cluster 2 in Figure 4.2
4.5 Clustering of the range of values represented by cluster 3 in Figure 4.3
4.6 Plot of MSE vs. the number of neurons in the hidden layer of the BP network
4.7 Clustering of the errors and the corresponding number of neurons in the hidden layer
4.8 The results of clustering the red cluster shown in Figure 4.7
4.9 Plot of the number of neurons in the hidden layer of the BP network compared to RMSE error
5.1 The results of using the k-means++ algorithm to partition the list of guesses attempted into 3 clusters
5.2 The result of clustering of the guesses made within the range of the values shown in Table 5.3
5.3 The effect of changing the number of neurons in the hidden layer of the BP network on MAPE when the BP network is considering 2 factors to predict the future electricity demand
5.4 The effects of having different number of neurons in the hidden layer of the BP network on individual differences between the actual and predicted values of total electricity demand for the years 2000 through year 2005
5.5 The MAPE between the actual values and the predicted values of total electricity demand for each of the years 2000 through 2005
5.6 Individual MAPE values for 2000-2005 predictions of total electricity demand produced by setting the number of neurons in the hidden layer of the BP network to be between 16 and 24 neurons

5.7 The results of using the k-means++ algorithm to partition the list of guesses attempted into 3 clusters
5.8 The result of clustering of the guesses made within the range of the values shown in Table 5.10
5.9 The effect of having different number of neurons in the hidden layer of the BP network on MAPE
5.10 The MAPE errors achieved by setting the number of neurons in the hidden layer of the BP network to values between 113 and 132
5.11 The effects of having different number of neurons in the hidden layer of the BP network on differences between the actual and predicted values of total electricity demand for each of the years 2000 through 2005
5.12 The MAPE between the actual values and the predicted values of total electricity demand for each of the years 2000 through 2005
5.13 Individual MAPE values for 2000-2005 predictions of total electricity demand produced by setting the number of neurons in the hidden layer of the BP network to be between 16 and 24 neurons
5.14 Comparison between the actual total electricity demand for the years 2000-2005 and the predicted values of the total electricity demand by the optimal BP networks and the multiple regression model
5.15 Inputting the electricity demand for the years 1970 and 1971 to predict the electricity demand for the year 1972
5.16 The results of using the k-means++ algorithm to partition the attempted guesses into 3 clusters

5.17 The result of clustering of the guesses made within the range of the value shown in Table 5.16
5.18 The MAPE error achieved by changing the number of neurons in the hidden layer of the BP network
5.19 The MAPE error resulting from having between 11 and 51 neurons in the hidden layer of the BP network
5.20 The individual MAPE for the years 2000-2005 with different number of neurons in the hidden layer of the BP network that were considered by the proposed KGA model to be the most accurate
5.21 The inputs to the BP network and the output of the BP network, with I representing the year the particular data is taken
5.22 The results of using the k-means++ algorithm to partition the list of guesses attempted into 3 clusters
5.23 The result of clustering of the guesses made within the range of the values shown in Table 5.22
5.24 The effects of having different number of neurons in the hidden layer on MAPE error
5.25 The average MAPE achieved when there are between 90 and 95 neurons in the hidden layer of the BP network
5.26 Individual MAPE values for the years 2000 through 2005 when there are between 90 and 95 neurons in the hidden layer of the BP network

LIST OF TABLES

2.1 Several transfer functions used in the ANN
4.1 Coordinates of the cluster centroids of the values shown in Figure 4.3
4.2 The values guessed from cluster 1 shown in Figure 4.4
4.3 The number of neurons in the hidden layer and their respective MSE evaluated by the greedy algorithm
4.4 The number of the hidden neurons that produce smallest MSE errors
4.5 MSE errors obtained by using the optimal BP network and other methods cited by Park et al. (1996)
4.6 The values of the points in the red cluster in Figure 4.7
4.7 Coordinates of the cluster centroids of the values shown in Table 4.7
4.8 The number of neurons in the hidden layer and their respective RMSE evaluated by the greedy algorithm
4.9 Number of the hidden neurons that produce smallest RMSE errors
4.10 RMSE errors obtained by using the optimal BP network and other methods cited by reference Chen et al. (2006)
5.1 The correlation between 4 factors and the total electricity demand
5.2 Cluster centroids of the clusters shown in Figure 5.1
5.3 Values of numbers of neurons in the hidden layer of the BP network in the cluster 3 shown in Table 5.2
5.4 The coordinates of cluster centroids shown in Figure 5.2
5.5 The values encompassed within cluster 1 shown in Figure 5.2

5.6 The result of using greedy algorithm to evaluate the MAPE caused by having a certain number of neurons in the hidden layer of the BP network
5.7 The values of the MAPE when there are not more than 29 neurons in the hidden layer of the BP network
5.8 The individual MAPE values compared with the number of neurons in the hidden layer of the BP network
5.9 Cluster centroids of the clusters shown in Figure 5.7
5.10 Values of numbers of neurons in the hidden layer of the BP network in the cluster 3
5.11 The coordinates of cluster centroids shown in Figure 5.8
5.12 The values encompassed within cluster 1 shown in Figure 5.8
5.13 The result of using greedy algorithm to evaluate the MAPE caused by having a certain number of neurons in the hidden layer of the BP network
5.14 The individual MAPE values compared with the number of neurons in the hidden layer of the BP network
5.15 Cluster centroids of the clusters shown in Figure 5.16
5.16 Values of numbers of neurons in the hidden layer of the BP network in the cluster 3
5.17 The coordinates of cluster centroids shown in Figure 5.17
5.18 The values encompassed within cluster 1 shown in Figure 5.17
5.19 The result of using greedy algorithm to evaluate the MAPE caused by having a certain number of neurons in the hidden layer of the BP network

5.20 MAPE achieved using the number of neurons in the hidden layer of the BP network
5.21 Cluster centroids of the clusters shown in Figure 5.22
5.22 Value of numbers of neurons in the hidden layer of the BP network in the cluster 3 shown in Table 5.21
5.23 The coordinates of cluster centroids shown in Figure 5.23
5.24 The values encompassed within cluster 2 shown in Figure 5.23
5.25 The result of using greedy algorithm to evaluate the MAPE caused by having a certain number of neurons in the hidden layer of the BP network
5.26 The MAPE achieved using the BP network optimized by the proposed KGA model
5.27 A comparison between the actual and predicted total electricity demand for the years 2000 through 2005 using the regression model described using Equation (5.6)
5.28 A comparison between the actual and predicted total electricity demand for the years 2000 through 2005 using several different methods
5.29 The MAPE errors achieved by the optimal BP networks as well as the MAPE errors achieved using the regression model

LIST OF ABBREVIATIONS

Abbreviation    Description
ANN             Artificial Neural Network
BP              Multilayer Perceptron with Backpropagation Network
CPI             Consumer Price Index
GDP             Gross Domestic Product
KGA             K-means-Greedy Algorithm
MAPE            Mean Absolute Percentage Error
MSE             Mean Squared Error
RMSE            Root Mean Squared Error

CHAPTER 1

INTRODUCTION

1.1 INTRODUCTION

Artificial neural networks (ANN) are information processing tools inspired by the way that a human brain works. ANNs have been successfully implemented to solve various tasks in recent years, such as forecasting of electricity demand (Hippert et al., 2001; Al-Shareef et al., 2008), character and image recognition, credit evaluation, insurance (Huang, 2009), pattern recognition and classification (Karayiannis and Behnke, 2000; Pham et al., 2006a), and daily water level estimation (Bustami et al., 2006). Numerous types of ANNs have been developed over the years, such as the radial basis function (RBF) network (Chen et al., 1991), Kohonen's self-organizing map (SOM) (Kohonen, 1990) and perceptron networks (Rosenblatt, 1958). Each of these types has its own set of strengths and weaknesses, and each is suited to solving certain types of problems. For instance, RBF network models are suitable for applications that contain a lot of training data, since they take less time to train (Hagan et al., 1996; Bong and Tan, 2007). The most popular type of network is the multilayer perceptron with backpropagation algorithm (BP) network (Rumelhart et al., 1986), which was developed as a method to overcome the inability of perceptrons to solve problems that are not linearly separable (Huang, 2009). It consists of one or more layers of neurons, known as hidden layers, sandwiched between an input layer that relays data from the external environment to the network and an output layer that displays the results of the information processed by the entire network.
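The layered structure described above can be illustrated with a minimal sketch of a forward pass through a network with a single hidden layer. This is only an illustration under stated assumptions: the function names (`init_layer`, `forward`), the sigmoid hidden activation, the linear output, the layer sizes and the weight range are all hypothetical choices, not the configuration used in this research.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Create one layer: weights[j][i] links input i to neuron j (assumed small random values)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def sigmoid(z):
    """Logistic transfer function, one common choice for hidden neurons."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, hidden, output):
    """Relay the inputs through the hidden layer, then through a linear output layer."""
    w_h, b_h = hidden
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w_h, b_h)]
    w_o, b_o = output
    y = [sum(w * hj for w, hj in zip(row, h)) + b for row, b in zip(w_o, b_o)]
    return h, y


rng = random.Random(0)
hidden = init_layer(3, 5, rng)   # 3 inputs feeding 5 hidden neurons (arbitrary sizes)
output = init_layer(5, 1, rng)   # 5 hidden neurons feeding 1 output neuron
h, y = forward([0.2, -0.1, 0.7], hidden, output)
print(len(h), len(y))
```

The number of hidden neurons (5 here) is exactly the kind of design parameter whose choice is discussed in the rest of this chapter.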


The BP network is a popular tool among researchers due to its ability to be easily generated and to generalize relatively well (Zhang et al., 1998), in addition to its relatively straightforward implementation (Kumar, 2004). The main drawback of using BP networks is that while it is easy to create and train a BP network, it is difficult to obtain the BP network with the appropriate size and parameters that will provide the most accurate results for the problem at hand. This is particularly true when it comes to determining the number of hidden layers and the number of neurons in each hidden layer of the BP network. In fact, the number of hidden layers and the number of neurons in each hidden layer are so important that they exert a lot of influence on the final output and hence on the network performance. This is because the BP network must have enough neurons in the hidden layer to form a decision region that is as complex as that required by the problem at hand in order to solve the problem with desirable results (Kumar, 2004).

Modifying the number of neurons in the hidden layer of the BP network will have a profound impact on the processing of information that the network receives. It has been discovered that the network does not train well if there are not enough neurons in the hidden layer during training. An insufficient number of hidden neurons may lead to bad error tolerance and poor generalization due to inadequate detection of the patterns underlying the data given to the BP network (Mehta and Gohel, 2005). On the other hand, the time taken by the BP network to learn will increase when there are too many neurons in the hidden layer of the BP network (Zheng and He, 2004). In addition to that, the BP network will simply memorize the data set that is given to it during training. As a result, the BP network is able to perform well in producing the expected outputs when it is given data that is exactly the same as the data used during training, but produces very poor results when it is given data that is different from the data that was used to train it (Reed, 1991).
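The effect of the hidden layer size on training and on unseen data can be explored with a small experiment such as the sketch below. It is a hedged illustration only: the toy noisy sine series, the candidate sizes (1, 4, 16), the learning rate, the epoch count and the helper names (`forward`, `train_bp`, `mse`) are assumptions made for this example, and the code is a plain stochastic-gradient backpropagation loop rather than the training setup used in this research. The exact error values depend on the seed and the toy data.

```python
import math
import random


def forward(x, w1, b1, w2, b2):
    """Forward pass: tanh hidden layer followed by a single linear output neuron."""
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    y = sum(w * hj for w, hj in zip(w2, h)) + b2
    return h, y


def train_bp(samples, n_hidden, epochs=200, lr=0.05, seed=0):
    """Train a one-hidden-layer network with per-sample backpropagation; return a predictor."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, t in samples:
            h, y = forward(x, w1, b1, w2, b2)
            err = y - t  # gradient of 0.5 * (y - t)^2 with respect to y
            for j in range(n_hidden):
                # Error signal at hidden neuron j, using the output weight before its update.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return lambda x: forward(x, w1, b1, w2, b2)[1]


def mse(model, data):
    """Mean squared error of a predictor over (input, target) pairs."""
    return sum((model(x) - t) ** 2 for x, t in data) / len(data)


# Toy series: predict the next value from the two previous values.
rng = random.Random(1)
series = [math.sin(0.3 * k) + rng.gauss(0.0, 0.05) for k in range(82)]
samples = [([series[k - 2], series[k - 1]], series[k]) for k in range(2, 82)]
train, test = samples[:60], samples[60:]

results = {}
for n_hidden in (1, 4, 16):
    model = train_bp(train, n_hidden)
    results[n_hidden] = (mse(model, train), mse(model, test))
    print(n_hidden, results[n_hidden])
```

Comparing the training error with the error on the held-out samples for each candidate size is one simple way to observe whether a given hidden layer generalizes or merely fits the training data.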

This problem of finding the optimal number of neurons in the hidden layer of a BP network has a grave impact on the overall design of the BP network and its suitability to solve the problem at hand. Firstly, since the number of neurons in the hidden layer is not optimal, it means that the BP network is not at its full potential to analyze and solve the problem, as the decision region formed by the network is not at the required complexity to solve the task at hand. This also means that the results that are produced by the network are not the best possible, since better results could be obtained from the network. Thus, not only does this mean that the BP network is poorly designed, it also shows that BP networks, and ANN in [...]tications. In fact, Hippert et al. (2001) noted that the poor design of the number of neurons in the hidden layer is the main reason that researchers are not entirely convinced of the [...]ility of the ANN in forecasting, although the implementation of ANN as forecasting tools is promising, and much work needs to be done before the ANN can be accepted as standard forecasting tools. This view is also supported by Adya and Collopy (1998).

In view of the impact of the problem of optimizing the size of the hidden layer in the BP network, several optimization methods have been proposed by researchers in recent years. These methods can be broadly classified into 3 groups, namely pruning-based methods that [...] parameters inside the BP network that do not contribute to better results (Sietsma [...] 1988; Reed, 1991), methods based on exploitation of statistical knowledge of the [...] such as the model developed by Salazar Aguilar et al. (2006), and methods