research article hybrid models based on singular values and...

15
Research Article Hybrid Models Based on Singular Values and Autoregressive Methods for Multistep Ahead Forecasting of Traffic Accidents Lida Barba 1,2 and Nibaldo Rodríguez 1 1 Escuela de Ingenier´ ıa Inform´ atica, Pontificia Universidad Cat´ olica de Valpara´ ıso, 2362807 Valpara´ ıso, Chile 2 Facultad de Ingenier´ ıa, Universidad Nacional de Chimborazo, 060102 Riobamba, Ecuador Correspondence should be addressed to Lida Barba; [email protected] Received 29 December 2015; Accepted 5 May 2016 Academic Editor: Giovanni Falsone Copyright © 2016 L. Barba and N. Rodr´ ıguez. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. e traffic accidents occurrence urges the intervention of researchers and society; the human losses and material damage could be abated with scientific studies focused on supporting prevention plans. In this paper prediction strategies based on singular values and autoregressive models are evaluated for multistep ahead traffic accidents forecasting. ree time series of injured people in traffic accidents collected in Santiago de Chile from 2000:1 to 2014:12 were used, which were previously classified by causes related to the behavior of drivers, passengers, or pedestrians and causes not related to the behavior as road deficiencies, mechanical failures, and undetermined causes. A simplified form of Singular Spectrum Analysis (SSA), combined with the autoregressive linear (AR) method, and a conventional Artificial Neural Network (ANN) are proposed. Additionally, equivalent models that combine Hankel Singular Value Decomposition (HSVD), AR, and ANN are evaluated. e comparative analysis shows that the hybrid models SSA-AR and SSA-ANN reach the highest accuracy with an average MAPE of 1.5% and 1.9%, respectively, from 1- to 14-step ahead prediction. However, it was discovered that HSVD-AR shows a higher accuracy in the farthest horizons, from 12- to 14-step ahead prediction, which reaches an average MAPE of 2.2%. 1. Introduction Traffic accidents with fatalities and severely injured people are a socioeconomic problem of focus. According to the WHO [1], the traffic accidents cost between 1% and 3% of the GDP of a country, regardless of invaluable emotional damage for the victims and their families. Several studies have been developed to explain the nature of the problem, most through classification; diverse methods have been used to detect the key factors that influence the incidents severity. Abell´ an et al. (2013) used decision trees to extract some key patterns of severe accidents in Granada; the rules were defined in function of the variables related to atmospheric factors, driver characteristics, road conditions, or a factor combination [2]. Chang and Chien applied a method based on nonparametric classification and regression tree to establish the empirical relationship between injury severity and driver/vehicle char- acteristics in truck-involved accidents [3]. De O˜ na et al. (2013) use Latent Class Clustering and Bayesian networks to identify the variables involved in traffic accidents; the accident type and sight distance were detected in all the traffic accidents on rural highways in Granada [4]. Recently, Shiau et al. (2015) presented Fuzzy Robust Principal Component Analysis (FRPCA) combined with Backpropagation Neural Network (BPNN) to identify the relationships between the variables of road designs, rule-violation items, and accident types; the results showed an 85.89% classification accuracy [5]. Lin et al. (2015) used real time traffic accidents in a highway in Virginia; they found that the best model was based on Frequent Pattern and a Bayesian network; the model predicted 61.1% of accidents while having a false alarm rate of 38.16% [6]. e risk indicating variables of traffic accidents have been identified taking into consideration diverse factors; some factors are related to road design parameters [7, 8], environment conditions [9], traffic signs, or interactions among some factors [10]. e data provided by the Chilean Commission of Traffic Safety (CONASET) [11] shows a high and increasing rate of Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2016, Article ID 2030647, 14 pages http://dx.doi.org/10.1155/2016/2030647

Upload: others

Post on 25-Sep-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

Research ArticleHybrid Models Based on Singular Values and AutoregressiveMethods for Multistep Ahead Forecasting of Traffic Accidents

Lida Barba12 and Nibaldo Rodriacuteguez1

1Escuela de Ingenierıa Informatica Pontificia Universidad Catolica de Valparaıso 2362807 Valparaıso Chile2Facultad de Ingenierıa Universidad Nacional de Chimborazo 060102 Riobamba Ecuador

Correspondence should be addressed to Lida Barba lbarbaunacheduec

Received 29 December 2015 Accepted 5 May 2016

Academic Editor Giovanni Falsone

Copyright copy 2016 L Barba and N Rodrıguez This is an open access article distributed under the Creative Commons AttributionLicense which permits unrestricted use distribution and reproduction in any medium provided the original work is properlycited

The traffic accidents occurrence urges the intervention of researchers and society the human losses and material damage could beabated with scientific studies focused on supporting prevention plans In this paper prediction strategies based on singular valuesand autoregressive models are evaluated for multistep ahead traffic accidents forecasting Three time series of injured people intraffic accidents collected in Santiago de Chile from 20001 to 201412 were used which were previously classified by causes relatedto the behavior of drivers passengers or pedestrians and causes not related to the behavior as road deficiencies mechanical failuresand undetermined causes A simplified form of Singular Spectrum Analysis (SSA) combined with the autoregressive linear (AR)method and a conventional Artificial Neural Network (ANN) are proposed Additionally equivalent models that combine HankelSingular Value Decomposition (HSVD) AR and ANN are evaluated The comparative analysis shows that the hybrid modelsSSA-AR and SSA-ANN reach the highest accuracy with an average MAPE of 15 and 19 respectively from 1- to 14-step aheadprediction However it was discovered that HSVD-AR shows a higher accuracy in the farthest horizons from 12- to 14-step aheadprediction which reaches an average MAPE of 22

1 Introduction

Traffic accidents with fatalities and severely injured peopleare a socioeconomic problem of focus According to theWHO [1] the traffic accidents cost between 1 and 3 of theGDP of a country regardless of invaluable emotional damagefor the victims and their families Several studies have beendeveloped to explain the nature of the problemmost throughclassification diverse methods have been used to detect thekey factors that influence the incidents severity Abellan etal (2013) used decision trees to extract some key patternsof severe accidents in Granada the rules were defined infunction of the variables related to atmospheric factors drivercharacteristics road conditions or a factor combination [2]Chang and Chien applied a method based on nonparametricclassification and regression tree to establish the empiricalrelationship between injury severity and drivervehicle char-acteristics in truck-involved accidents [3] De Ona et al(2013) use Latent Class Clustering and Bayesian networks

to identify the variables involved in traffic accidents theaccident type and sight distance were detected in all the trafficaccidents on rural highways in Granada [4] Recently Shiauet al (2015) presented Fuzzy Robust Principal ComponentAnalysis (FRPCA) combined with Backpropagation NeuralNetwork (BPNN) to identify the relationships between thevariables of road designs rule-violation items and accidenttypes the results showed an 8589 classification accuracy[5] Lin et al (2015) used real time traffic accidents in ahighway in Virginia they found that the best model wasbased on Frequent Pattern and a Bayesian network themodelpredicted 611 of accidents while having a false alarm rateof 3816 [6]The risk indicating variables of traffic accidentshave been identified taking into consideration diverse factorssome factors are related to road design parameters [7 8]environment conditions [9] traffic signs or interactionsamong some factors [10]

The data provided by the Chilean Commission of TrafficSafety (CONASET) [11] shows a high and increasing rate of

Hindawi Publishing CorporationMathematical Problems in EngineeringVolume 2016 Article ID 2030647 14 pageshttpdxdoiorg10115520162030647

2 Mathematical Problems in Engineering

fatalities and injuries in traffic accidents from 2000 to 2014Santiago is the most populated Chilean region whose timeseries of injured people is used in this analysisThe abundanceof information and research about traffic accidents severitiesrisks and occurrence factors among others might appearsomewhat unapproachable in terms of the prevention plans

In this case study with categorical causes defined byCONASET and the ranking method the potential causesof injuries in traffic accidents which can be counteractedthrough campaigns directed to the change of attitude indrivers passengers and pedestrians toward road safety havebeen identified Two groups have been created based on theprimary and secondary causes which are present in around75 of injured people and a third group was created withthe remaining 25 At first nonstationary and nonlinearcharacteristics were found in the time series turning theforecasting into a difficult task Some researches exploit thepotentialities of the SSA technique to extract components ina time series SSA commonly has been used to extract trendseasonality andor noise [12] the extracted components areused to explain the complex behavior of some time series indiverse ambit from nature [13 14] to industrial process [15]or economic indicators [16]

On the other hand the combination of SSA and autore-gressive linear and nonlinear methods is a recent alternativewhich has demonstrated robustness and universal capacitiesin terms of short-term forecasting [17ndash19] SSA combinedwith artificial intelligence techniques was used by Xiao et al(2014) for monthly air transport demand forecasting [20] asimilar combination was made by Abdollahzade et al (2015)with a nonlinear and chaotic time series [21]

To our knowledge [22] one-step ahead forecasting basedon HSVD combined with ARIMA and neural networksreaches high accuracy in traffic accidents prediction HSVDhas similarities with respect to SSA in the steps of embeddingdecomposition and grouping The main difference is inthe last step previous to the component extraction HSVDavoids diagonal averaging instead a particular extractionprocess is proposed Although this difference exists in termsof computational complexity both SSA and HSVD haveequal floating point operations which is due to the R-bidiagonalization (RSVD) [23] implemented in both casesFor that reason comparisons in terms of implementation andforecasting results are presented

This forecasting approach is described in two stagespreprocessing and prediction In the first stage SingularSpectrumAnalysis and SingularValueDecomposition (SVD)of Hankel are implemented to obtain the components oflow and high frequency from the observed time series anadditive component of low frequency is extracted whereasthe component of high frequency is computed by subtractionThe result is a pair of smoothed time series which can bepredicted robustly by computing low-order autoregressivemethods in the second stage The direct method is appliedin the second stage to develop multistep ahead predictionwith multi-input and single-output The inputs are the com-ponents lagged values whose optimal number is identifiedthrough the Autocorrelation Function

Conventional SSA is put into practice in four stepsembedding decomposition grouping and diagonal averag-ing over all the elementary matrices [24] In this work theSSA implementation is simplified in three steps embeddingdecomposition and diagonal averaging only one elementarymatrix is needed and it is computed with the first SVDeigentriple The time series of length 119901 is embedded in atrajectory matrix of dimension 119903 times 119902 where 119903 is the windowlengthThe general rule used to delimit the windows length is2 le 119903 le 1199012 a large decomposition is given with a high valueof 119903 while a short decomposition is the opposite During theliterature review some strategies have been used to select thewindow length some instances are 119903 = 1199014 [25] weightedcorrelation [26] and extreme of autocorrelation [27] Themethod used in this work to select the optimal windowlength is based on the Shannon entropy of the singular valuesThe second step of SSA is Singular Value Decompositionof the matrix obtained in the embedding with the firsteigentriple an elementary matrix is computed Finally bydiagonal averaging over the elementary matrix the elementsof low frequency component are extracted

The contribution is an accurate multistep ahead forecast-ing methodology based on singular values decompositionand autoregressive models through the comparison of fourhybrid models The prediction is focused on causes of thetraffic accidents with injured people which is oriented tosupport prevention plans of the government and policeThe paper is organized as follows Section 2 describes theMethodology Section 3 shows efficiency criteria to evaluatethe prediction accuracy Section 4 characterizes the CaseStudy Section 5 presents the Empirical Research ResultFinally Section 6 concludes the paper

2 Methodology

Initially the ranking technique is applied to find the potentialcauses of at least 75 of the events registered in the historicaltime series of injured people in traffic accidents in Santiagode Chile The causes related to drivers pedestrians andpassengers behavior were prioritized

The forecasting methodology applied in the analyzedhybrid models is described in two stages preprocessing andprediction as Figure 1 illustrates In the preprocessing stageSingular Spectrum Analysis and Singular Value Decompo-sition of Hankel are used to extract an additive componentof low frequency from the observed time series and bysimple subtraction between the observed time series andthe component of low frequency the component of highfrequency is obtained In the prediction stage linear andnonlinear models are implemented

21 Singular Spectrum Analysis Singular Spectrum Analy-sis extracts the component of low frequency 119888

119871from the

observed time series The component of high frequency 119888119867

is obtained by simple subtraction between the observed timeseries 119909 and the component 119888

119871 Consider

119888119867= 119909 minus 119888

119871 (1)

Mathematical Problems in Engineering 3

x r

Y

A

A

+

120582UV

120582UVcL

cLcL

cL cH

cH

x

Prep

roce

ssin

g SS

A

Embedding

Singular valuedecomposition

Diagonalaveraging

Preprocessing HSVD

Extraction HSVD

ARor

ANNPrediction

cL

cH

Figure 1 Traffic accidents forecasting methodology

Conventional SSA is implemented in four steps embeddingdecomposition grouping and diagonal averaging [24] Inthis work SSA is simplified in three steps embeddingdecomposition and diagonal averaging

The embedding step maps the time series 119909 of length 119901 toa sequence of multidimensional lagged vectors the Hankelmatrix structure is used in the embedding

119884 =(

(

11990911199092sdot sdot sdot 119909

119902

11990921199093sdot sdot sdot 119909119902+1

119909119903119909119903+1

sdot sdot sdot 119909119901

)

)

(2)

where the elements 119884(119894 119895) = 119909119894+119895minus1

The window length 119903 hasan important role in the forecasting model it has an initialvalue of 119903 = 1199012 and 119902 is computed as follows

119902 = 119901 minus 119903 + 1 (3)

The Singular Value Decomposition of the real matrix 119884 hasthe form

119884 =

119903

sum

119894=1

radic120582119894119880119894119881⊤

119894 (4)

where each 120582119894is the 119894th eigenvalue of the matrix 119878 = 119884119884⊤

arranged in decreasing order of magnitudes 1198801 119880

119903is

the corresponding orthonormal eigenvectors system of thematrix 119878

Standard SVD terminology calls radic120582119894the 119894th singular

value of the matrix 119884 119880119894is the 119894th left singular vector and

119881119894is the 119894th right singular vector of 119884 The collectionradic120582

119894119880119894119881119894

is called 119894th eigentriple of the SVDThe computation of the optimal window length 119903 is based

on the eigenvalues differential entropyThe first 119905 eigenvaluesobtained with 119903 = 1199012 contain a high spread of energytherefore the window length is evaluated in the range [119903 =2 119905] by means of the differential entropy as follows

Δ119867119894= 119867119894+1minus 119867119894 119894 = 1 119905 minus 1 (5a)

119867119894= minus

119903

sum

119895=1

120582119895log2120582119895 (5b)

120582119895=

120582119895

sum119903

119896=1120582119895

(5c)

where Δ119867119894is the 119894th differential entropy 119867

119894is the 119894th

Shannon entropy and120582119895is the 119895th normalized eigenvalue also

known as eigenvalue energyThe embedding step is executedagain with this decomposition that reaches a high energyspread and lower differential entropy With the optimal 119903 theembedding and decomposition are computed again

The first eigentriple is used to obtain the elementarymatrix 119860 which will be used in the extraction of the lowfrequency component

119860 = radic12058211198801119881⊤

1 (6)

The step of diagonal averaging is applied over119860 to extract theelements of the component 119888

119871 the process is shown below

119888119871(119894 119895)

=

1

119896 minus 1

119896

sum

119898=1

119860 (119898 119896 minus 119898) 2 le 119896 le 119903

1

119903

119903

sum

119898=1

119860 (119898 119896 minus 119898) 119903 lt 119896 le 119902 + 1

1

119902 + 119903 minus 119896 + 1

119903

sum

119898=119896minus119902

119860 (119898 119896 minus 119898) 119902 + 2 le 119896 le 119902 + 119903

(7)

Once 119888119871is obtained the component 119888

119867is computed with (1)

22 Hankel Singular Value Decomposition The preprocess-ing based on Singular Value Decomposition of Hankel isimplemented in three steps embedding decomposition andextraction HSVD implements the steps of embedding anddecomposition as SSA (presented in Section 21) The ele-mentary matrix 119860 is also computed with the first eigentripleobtained in the decomposition step (as (6))

In the extraction step the elements of the low frequencycomponent 119888

119871are obtained from the first row and the last

column of the matrix 119860 which has the same structure asmatrix 119884 (trajectory matrix) therefore the elements of 119888

119871are

119888119871= [119860 (1 1) 119860 (1 2) sdot sdot sdot 119860 (1 119902) 119860 (2 119902) 119860 (3 119902) sdot sdot sdot 119860 (119903 119902)] (8)

where 119860 is a 119903 times 119902matrix

4 Mathematical Problems in Engineering

23 Prediction with the Autoregressive Method The predic-tion is the second stage of the traffic accidents forecastingmethodology (illustrated in Figure 1) In order to obtain thetraffic accidents prediction during the preprocessing stagethe low frequency component 119888

119871and the high frequency

component 119888119867were obtainedThe components are estimated

through the autoregressive method and the addition of thecomponents is computed to obtain the prediction as follows

119899+ℎ= 119888119871

119899+ℎ+ 119888119867

119899+ℎ (9)

where 119899 represents the time instant and ℎ represents thehorizon with values ℎ = 1 120591 The component 119888

119871is used

as exogenous variable in the computation of the 119888119867 due to a

high influence of 119888119871over 119888

119867

The predicted components via ARmodel are definedwith

119888119871

119899+ℎ=

119898minus1

sum

119894=0

120572119894119888119899minus119894

119871 (10a)

119888119867

119899+ℎ=

119898minus1

sum

119894=0

120573119894119888119899minus119894

119867+

119898minus1

sum

119894=0

120573119898+119894119888119899minus119894

119871 (10b)

where119898 is the number of lagged values and 120572119894and 120573

119894are the

coefficients of 119888119871and 119888119867 respectively

The coefficients estimation is based on linear Least SquareMethod (LSM) the components 119888

119871and 119888119867are defined with

the linear relationship expressed in matrix form

119888119871= 120572119877 (11a)

119888119867= 120579119885 (11b)

where 119877 and 119885 are the regressor matrices of 119888119871and 119888

119867

respectively 120572 and 120579 are the coefficients vectors of 119877 and 119885respectively The coefficients are computed with the Moore-Penrose pseudoinverse matrices 119877dagger and 119885dagger as follows

120572 = 119877dagger119888119871 (12a)

120579 = 119885dagger119888119867 (12b)

24 Prediction with the Autoregressive Neural Network Inthis case study a single hidden layer Autoregressive NeuralNetwork is used to approach each component the ANN hasa standard multilayer perceptron (MLP) structure of threelayers [28]The training subset is iteratively used to adjust theconnections weights via learning algorithm the ANN withthe lowest error is selected to implement the solution withthe testing subset

The nonlinear inputs are the lagged terms which arecontained in the regressor matrix119877 at hidden layer is appliedthe sigmoid transfer function and at output layer is obtainedthe prediction The ANN output is

119899+ℎ= V119895ℎ119899+ℎ (13a)

ℎ119899+ℎ= 119891[

119898minus1

sum

119894=0

119908119895119894119877119899+ℎminus119894

119894] (13b)

where is the predicted value 119899 is the time instant V119895and119908

119895119894

are the linear and nonlinear weights of the ANN connectionsrespectively the sigmoid transfer function is computed with

119891 (119909) =1

1 + 119890minus119909 (14)

The ANN structure for 119888119871

prediction is denoted withANN(119898 119895 1) with 119898 inputs 119895 hidden nodes and 1 out-put 119888119871while the ANN structure for 119888

119867is denoted with

ANN(2119898 119895 1) with 2119898 inputs 119895 hidden nodes and 1 output119888119867 Levenberg-Marquardt is the learning algorithm applied

for weight updating in both neural networks [29]

3 Efficiency Criteria

The forecasting accuracy is evaluated with conventionalmetrics and an improved evaluationmetricThe conventionalmetrics are Mean Absolute Percentage Error (MAPE) RootMean Square Error (RMSE) Determination Coefficient 1198772and Relative Error (RE) The Modified Nash-Sutcliffe Effi-ciency (MNSE) metric is computed in order to improve theevaluation criteria which is sensitive to significant overfittingor underfitting [30] Consider

MAPE = 1119901119905

119901119905

sum

119894=1

10038161003816100381610038161003816100381610038161003816

119909119894minus 119894

119909119894

10038161003816100381610038161003816100381610038161003816

100

RMSE = radic 1119901119905

119901119905

sum

119894=1

(119909119894minus 119894)2

1198772= [1 minus

var (119909 minus )var (119909)

] 100

RE = [(119909 minus )119909

] 100

MNSE = [1 minussum119901119905

119894=1

1003816100381610038161003816119909119894 minus 1198941003816100381610038161003816

sum119901119905

119894=1

1003816100381610038161003816119909119894 minus 1199091003816100381610038161003816

] 100

(15)

where 119909 is the observed signal is the predicted signal 119901119905is

the testing sample size and var is the varianceFurthermore two statistical tests are computed to eval-

uate the differences and superiorities of either model theWilcoxon test and Pitmanrsquos correlation test respectively

TheWilcoxon (119882) signed rank test evaluates the pairwisedifferences in the squares of each multistep ahead residualsthe differences are ranked in ascending order with no regardto the sign and the ranks are assigned fromone to the numberof the forecast errors available for comparison The sum ofthe ranks of positive differences is then computed to obtain119882 [31] The probability 119901 of finding a test statistic as or moreextreme than the observed value under the null hypothesis isfound using the 119885-statistics given by

119885 =119882 minus 119901

119905(119901119905+ 1) 4

radic119901119905(119901119905+ 1) (2119901

119905+ 1) 24

(16)

Mathematical Problems in Engineering 5

Pitmanrsquos correlations test is applied to identify the superiorityof a model in pairwise comparisons [32] the test is basedon the computation of the correlation 119877 between Υ and Ψ asfollows

119877 =cov (Υ Ψ)

var (Υ) var (Ψ) (17a)

Υ = 1198641 (119894) + 1198642 (119894) (17b)

Ψ = 1198641 (119894) minus 1198642 (119894) (17c)

where cov is the covariance and 119894 = 1 1199011199051198641is the residual

vector obtained with model 1 and 1198642is the residual vector

obtained with model 2 Model 1 would show superiority infront of model 2 at 5 significance level if |119877| gt 196radic119901119905

4 Case Study

The Chilean police (Carabineros de Chile) collects the fea-tures of the traffic accidents and CONASET records the dataThe regional population is estimated in 6061185 inhabitantsequivalent to 401 of the national population Santiagoshows a high rate of the events with severe injuries from 2000to 2014 with 260074 injured people

The entire series of 783 registers is shown in Figure 2which have been collected in the mentioned period fromJanuary to December with weekly sampling The highestnumber of injured people was observed in weeks 84 184254 and 280 while the lowest number of injured people wasobserved in weeks 344 365 426 500 and 756

One hundred causes of traffic accidents have been definedby CONASET and grouped into categories In this casestudy three analysis groups have been created In groups 1and 2 those causes directly related to behavior of driverspassengers or pedestrians have been prioritized Group 3involves the rest of the causes Figures 3(a) 4(a) and 5(a)show the observed time series of the groups Injured-G1Injured-G2 and Injured-G3 respectively the values havebeen normalized via division by the maximum value in eachtime series

The categories of groups 1 and 2 are as follows recklessdriving recklessness in passenger recklessness in pedestrianalcohol in driver alcohol in pedestrian and disobedience tosignal In Table 1 are shown 20 causes of groups 1 and 2 whichcover 75 of the events with injured people The causes arelisted sequentially the cause with the highest importance hasvalue 1 (with the highest number of injured people) and thecause with minor importance has value 20

From Table 1 the first three causes of injuries in trafficaccidents are as follows unwise distance inattention to trafficconditions and disrespect to red light The cause with thelowest importance for injuries in traffic accidents is drunkpedestrian The categories with the highest number of causesare as follows imprudent driving and disobedience to signal

Two groups of analysis were formed from the informationpresented in Table 1 the first group labeled with Injured-G1 corresponds to the first ten primary causes and thesecond group labeled with Injured-G2 corresponds to the tensecondary causes The first group overspreads around 60

1 54 107 160 213 266 319 372 425 478 531 584 637 690 743 783

200

250

300

350

400

450

500

550

Time (weeks)

Num

ber o

f inj

ured

peo

ple

Figure 2 Injured people in traffic accidents

of injured people in traffic accidents whereas the secondgroup overspreads around 15The third group labeled withInjured-G3 corresponds to the categories road deficienciesmechanical failures and undeterminednoncategorized causesthis group overspreads around 25 of injured people

Complementary information was observed about trafficaccidents conditions with high rate of injured people Withregard to vehicles automobiles are involved in 54 of eventsfollowed by vans and trucks with 19 bus and trolley with16 motorcycles and bicycles with 8 and others with 3With regard to environmental conditions 85 of events wasobserved with cloudless conditions Additionally 97 of thetraffic accidentswith injured people have taken place in urbanareas whereas 3 correspond to rural area With regard torelative position 46 of the events have been produced inintersections controlled by traffic signals or police officerswith 46 followed by accidents that happened in straightsections with 37 and other relative positions with 17

5 Empirical Research Result

The results of the methodology implementation with linearand nonlinear models are described by stages componentsextraction and prediction

51 Components Extraction The methodology presented inSection 2 describes the preprocessing stage and the predictionstage The preprocessing stage is based on two types ofmethods Singular Spectrum Analysis and Singular ValueDecomposition of Hankel

Both preprocessing techniques SSA and HSVD embedthe time series in a structure of two dimensions the initialwindow length used is 119903 = 1199012 Once matrix 119884 is obtainedthe decomposition is computedThe differential energy of theeigenvalues is obtained with (5a) A high energy content wasobserved in the first 119905 = 20 eigenvalues the lowest differentialenergywas observed in decompositions based on values of 1517 and 16 of window length for Injured-G1 Injured-G2 andInjured-G3 respectively and these values were set as windowlength

The embedding process is implemented again with theoptimal window 119903 and the decomposition is recomputedThe first elementary matrix 119860 is used by SSA and HSVD to

6 Mathematical Problems in Engineering

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Time (weeks)

Inju

red-

G1

02

04

06

08

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

Time (weeks)

c L

SSAHSVD

(b)

Time (weeks)1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

minus05

0

05

c H

SSAHSVD

(c)

Figure 3 (a) Injured-G1 (b) Low frequency component (c) High frequency component

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 7830

1

Inju

red-

G2

Time (weeks)

02040608

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783Time (weeks)

02

04

06

08

c L

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783minus04minus02

0

Time (weeks)

0204

c H

SSAHSVD

(c)

Figure 4 (a) Injured-G2 (b) Low frequency component (c) High frequency component

obtain the low frequency 119888119871component In SSA to extract

the elements of119860 diagonal averaging is applied while HSVDuses direct extraction from the first row and last column of119860 Finally with (1) the component of high frequency 119888

119867was

computedEach data set of low and high frequency has been divided

into two subsets training and testing the training subset

involves 70 of the samples and consequently the testingsubset involves the remaining 30

The decomposition results by means of SSA and HSVDFigures 3(b) 4(b) and 5(b) show the low frequency com-ponents whereas Figures 3(c) 4(c) and 5(c) show thehigh frequency components of Injured-G1 Injured-G2 andInjured-G3 respectively

Mathematical Problems in Engineering 7

Table 1 Causes of injuries in traffic accidents (group 1 and group 2)

Category Number Cause Importance

Imprudent driving

1 Unwise distance 12 Inattention to traffic conditions 23 Disrespect to pedestrian passing 84 Disrespect for giving the right of way 95 Unexpected change of track 106 Improper turns 117 Overtaking without enough time or space 148 Opposite direction 189 Backward driving 19

Disobedience to signal

10 Disrespect to red light 311 Disrespect to stop sign 412 Disrespect to give way sign 613 Improper speed 13

Alcohol in driver 14 Drunk driver 715 Driving under the influence of alcohol 15

Recklessness in pedestrian16 Pedestrian crossing the road suddenly 517 Reckless pedestrian 1218 Pedestrian outside the allowed crossing 17

Recklessness in passenger 19 Get in or get out of a moving vehicle 16Alcohol in pedestrian 20 Drunk pedestrian 20

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Inju

red-

G3

02040608

Time (weeks)

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

c L

Time (weeks)

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

0

05

Time (weeks)

minus05

c H

SSAHSVD

(c)

Figure 5 (a) Injured-G3 (b) Low frequency component (c) High frequency component

The nonstationary trend in the signals was verified forInjured-G1 Injured-G2 and Injured-G3 through Kwiatkow-ski Phillips Schmidt and Shin test (KPSS) [33] The testassesses the null hypothesis that the signals are trend sta-tionary it was rejected at 5 significance level consequentlynonstationary unit-root processes are present in the signals

The time series Injured-G1 is observed in Figure 3(a)similar dynamic is observed with respect to full series takinginto consideration that this group contains the predominantcauses The 119888

119871components extracted by SSA and HSVD

are shown in Figure 3(b) long-memory periodicity featureswere observedThe 119888

119867resultant components are presented in

8 Mathematical Problems in Engineering

Figure 3(c) short-term periodic fluctuations were identifiedThe components obtained with SSA and HSVD are similarhowever a slight difference is observed in the componentsof low frequency SSA extracts smoother components thanHSVD

Both techniques SSA and HSVD show that the principalten causes (in Table 1 with importance 1 to 10) present thehighest incidence between years 2002 and 2005 (weeks 106 to312) it descends from 2006 until half 2012 (around week 710)an increment is observed between weeks 711 and 732 (secondsemester of 2013 and first semester of 2014)

Figure 4 shows the time series Injured-G2 the lowfrequency component and the high frequency componentAs previous series the components of low frequency showsimilar dynamic of slow fluctuations with decreasing trend(119888119871) while the high frequency shows fast fluctuations In this

group is also observed 119888119871via SSA smoother than 119888

119871viaHSVD

Both techniques SSA and HSVD show that the secondaryten causes (described in Table 1 with importance 11 to 20)present the highest incidence of injured people in years 20002003 and 2004 it descends from 2005 therefore forwarddowntrend is observed in the number of injured people intraffic accidents due to the 10 secondary causes

Figure 5 shows Injured-G3 and its components 119888119871and 119888119867

as in previous analysis the components of low frequency showlong-memory periodicity features whereas the componentsof high frequency show short-term periodic fluctuations and119888119871via SSA is smoother than 119888

119871via HSVD

Both techniques SSA and HSVD show that the causesare related to road deficiencies mechanical failures andundeterminednoncategorized causes (with 25 of incidence)The highest incidence is observed in year 2001 from year2002 it presents strong decay until 2006 and forward uptrendis observed with a temporal decrease in 2009

Prevention plans and punitive laws have been imple-mented in Chile during the analyzed period via educationdrivers licensing reforms zero tolerance law Emiliarsquos law andtransit law reforms among others The effect of a particularpreventive or punitive action is not analyzed in this workhowever the proposed short-term prediction methodologybased on observed causes and intrinsic components is a con-tribution to government and society in preventive plans for-mulation its implementation and the consequent evaluation

52 Prediction The prediction is implemented by means ofthe autoregressive models linear (AR) or nonlinear (ANN)The models use the lagged terms of the components 119888

119871and

119888119867 the optimal number of the lagged terms was fixed in

119898 = 32 weeks which was found through the computation ofthe Autocorrelation Function over the observed time seriesof injured people due to all causes

The models based on Singular Spectrum Analysis (SSA-AR and SSA-ANN) receive the components of SSA prepro-cessing stage whereas the models based on Singular ValueDecomposition of Hankel (HSVD-AR and HSVD-ANN)receive the components of HSVD preprocessing stage

TheANNhas single hidden layer structure (32 1 1) with32 inputs 1 hidden node and 1 outputThe LM algorithmwasused iteratively to adjust the linear and nonlinear weights

The direct method was used to develop multistep aheadforecasting in Tables 2 and 3 are presented the averageprediction results with hybrid models SSA-AR SSA-ANNHSVD-AR and HSVD-ANN with the three time series Thearithmetic mean of the resultant metrics is presented inTables 2 and 3 the results shows that the accuracy decreasesas the time horizon increases therefore the best accuracywasobtained for the nearest weeks and the lowest accuracy wasobtained for the farthest weeks The best mean accuracy wasreached by using SSA-AR model with MNSE of 926 1198772 of993 MAPE of 15 and RMSE of 07 The lowest meanaccuracy was obtained with HSVD-ANNmodel with MNSEof 8561198772 of 982MAPE of 29 andRMSE of 14Thesecond best average accuracy was reached by SSA-ANN andthe third best accuracy was reached by HSVD-AR

The highest gain in average MNSE from 1- to 14-stepahead prediction is 76 while average MAPE is 933However it was observed that HSVD-AR shows a higheraccuracy in farthest horizons from 12- to 14-step aheadprediction which reaches these average results MNSE of894 1198772 of 989 MAPE of 22 and of RMSE 10 thegain in average MNSE from 12- to 14-step ahead prediction is121 and on average MAPE is 833

Prediction horizons higher than 14 weeks provide inaccu-rate results

From previous tables similar accuracy was identified inthe prediction through SSA-AR HSVD-AR and SSA-ANNfor 11-step ahead prediction The predicted signals are shownin Figures 6 7 and 8 whereas the metrics residuals arepresented in Tables 4 5 and 6

The results for 11-step ahead prediction of Injured-G1are shown in Figures 6(a) 6(c) 6(e) and Table 4 Fromthese figures and metrics a good fit is observed the highestaccuracy was reached via SSA-AR withMNSE of 876 1198772 of986 MAPE of 20 and RMSE of 11

In Figures 6(b) 6(d) and 6(f) the Relative Error ofInjured-G2 prediction is shown The model SSA-AR shows947 of the predicted points with Relative Error lowerthan plusmn5 HSVD-AR 867 and SSA-ANN 831 The gainwas computed by means of residual metrics and the twobest models (SSA-AR and HSVD-AR) the highest gain wasobserved in MAPE with 25

The results for 11-step ahead prediction of Injured-G2are shown in Figures 7(a) 7(c) and 7(e) and Table 5 allmodels achieve a good fit SSA-AR and HSVD-AR reachthe highest and similar accuracy In Figures 7(b) 7(d) and7(f) the Relative Error is shown SSA-AR presents 858of the predicted points with a Relative Error lower thanplusmn5 HSVD-AR 853 and SSA-ANN 80 The gain wascomputed based on residual metrics and the two best modelsthe highest gain was observed in MAPE with 107

The results for 11-step ahead prediction of Injured-G3are presented in Figures 8(a) 8(c) and 8(e) and Table 6 allmodels achieve also a goodfit and similar accuracy In Figures8(b) 8(d) and 8(f) the Relative Error is illustrated SSA-ARpresents 924 of predicted points with a Relative Error lowerthan plusmn5 HSVD-AR 942 and SSA-ANN 916 The gain

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 2: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

2 Mathematical Problems in Engineering

fatalities and injuries in traffic accidents from 2000 to 2014Santiago is the most populated Chilean region whose timeseries of injured people is used in this analysisThe abundanceof information and research about traffic accidents severitiesrisks and occurrence factors among others might appearsomewhat unapproachable in terms of the prevention plans

In this case study with categorical causes defined byCONASET and the ranking method the potential causesof injuries in traffic accidents which can be counteractedthrough campaigns directed to the change of attitude indrivers passengers and pedestrians toward road safety havebeen identified Two groups have been created based on theprimary and secondary causes which are present in around75 of injured people and a third group was created withthe remaining 25 At first nonstationary and nonlinearcharacteristics were found in the time series turning theforecasting into a difficult task Some researches exploit thepotentialities of the SSA technique to extract components ina time series SSA commonly has been used to extract trendseasonality andor noise [12] the extracted components areused to explain the complex behavior of some time series indiverse ambit from nature [13 14] to industrial process [15]or economic indicators [16]

On the other hand the combination of SSA and autore-gressive linear and nonlinear methods is a recent alternativewhich has demonstrated robustness and universal capacitiesin terms of short-term forecasting [17ndash19] SSA combinedwith artificial intelligence techniques was used by Xiao et al(2014) for monthly air transport demand forecasting [20] asimilar combination was made by Abdollahzade et al (2015)with a nonlinear and chaotic time series [21]

To our knowledge [22] one-step ahead forecasting basedon HSVD combined with ARIMA and neural networksreaches high accuracy in traffic accidents prediction HSVDhas similarities with respect to SSA in the steps of embeddingdecomposition and grouping The main difference is inthe last step previous to the component extraction HSVDavoids diagonal averaging instead a particular extractionprocess is proposed Although this difference exists in termsof computational complexity both SSA and HSVD haveequal floating point operations which is due to the R-bidiagonalization (RSVD) [23] implemented in both casesFor that reason comparisons in terms of implementation andforecasting results are presented

This forecasting approach is described in two stagespreprocessing and prediction In the first stage SingularSpectrumAnalysis and SingularValueDecomposition (SVD)of Hankel are implemented to obtain the components oflow and high frequency from the observed time series anadditive component of low frequency is extracted whereasthe component of high frequency is computed by subtractionThe result is a pair of smoothed time series which can bepredicted robustly by computing low-order autoregressivemethods in the second stage The direct method is appliedin the second stage to develop multistep ahead predictionwith multi-input and single-output The inputs are the com-ponents lagged values whose optimal number is identifiedthrough the Autocorrelation Function

Conventional SSA is put into practice in four stepsembedding decomposition grouping and diagonal averag-ing over all the elementary matrices [24] In this work theSSA implementation is simplified in three steps embeddingdecomposition and diagonal averaging only one elementarymatrix is needed and it is computed with the first SVDeigentriple The time series of length 119901 is embedded in atrajectory matrix of dimension 119903 times 119902 where 119903 is the windowlengthThe general rule used to delimit the windows length is2 le 119903 le 1199012 a large decomposition is given with a high valueof 119903 while a short decomposition is the opposite During theliterature review some strategies have been used to select thewindow length some instances are 119903 = 1199014 [25] weightedcorrelation [26] and extreme of autocorrelation [27] Themethod used in this work to select the optimal windowlength is based on the Shannon entropy of the singular valuesThe second step of SSA is Singular Value Decompositionof the matrix obtained in the embedding with the firsteigentriple an elementary matrix is computed Finally bydiagonal averaging over the elementary matrix the elementsof low frequency component are extracted

The contribution is an accurate multistep ahead forecast-ing methodology based on singular values decompositionand autoregressive models through the comparison of fourhybrid models The prediction is focused on causes of thetraffic accidents with injured people which is oriented tosupport prevention plans of the government and policeThe paper is organized as follows Section 2 describes theMethodology Section 3 shows efficiency criteria to evaluatethe prediction accuracy Section 4 characterizes the CaseStudy Section 5 presents the Empirical Research ResultFinally Section 6 concludes the paper

2 Methodology

Initially the ranking technique is applied to find the potentialcauses of at least 75 of the events registered in the historicaltime series of injured people in traffic accidents in Santiagode Chile The causes related to drivers pedestrians andpassengers behavior were prioritized

The forecasting methodology applied in the analyzedhybrid models is described in two stages preprocessing andprediction as Figure 1 illustrates In the preprocessing stageSingular Spectrum Analysis and Singular Value Decompo-sition of Hankel are used to extract an additive componentof low frequency from the observed time series and bysimple subtraction between the observed time series andthe component of low frequency the component of highfrequency is obtained In the prediction stage linear andnonlinear models are implemented

21 Singular Spectrum Analysis Singular Spectrum Analy-sis extracts the component of low frequency 119888

119871from the

observed time series The component of high frequency 119888119867

is obtained by simple subtraction between the observed timeseries 119909 and the component 119888

119871 Consider

119888119867= 119909 minus 119888

119871 (1)

Mathematical Problems in Engineering 3

x r

Y

A

A

+

120582UV

120582UVcL

cLcL

cL cH

cH

x

Prep

roce

ssin

g SS

A

Embedding

Singular valuedecomposition

Diagonalaveraging

Preprocessing HSVD

Extraction HSVD

ARor

ANNPrediction

cL

cH

Figure 1 Traffic accidents forecasting methodology

Conventional SSA is implemented in four steps embeddingdecomposition grouping and diagonal averaging [24] Inthis work SSA is simplified in three steps embeddingdecomposition and diagonal averaging

The embedding step maps the time series 119909 of length 119901 toa sequence of multidimensional lagged vectors the Hankelmatrix structure is used in the embedding

119884 =(

(

11990911199092sdot sdot sdot 119909

119902

11990921199093sdot sdot sdot 119909119902+1

119909119903119909119903+1

sdot sdot sdot 119909119901

)

)

(2)

where the elements 119884(119894 119895) = 119909119894+119895minus1

The window length 119903 hasan important role in the forecasting model it has an initialvalue of 119903 = 1199012 and 119902 is computed as follows

119902 = 119901 minus 119903 + 1 (3)

The Singular Value Decomposition of the real matrix 119884 hasthe form

119884 =

119903

sum

119894=1

radic120582119894119880119894119881⊤

119894 (4)

where each 120582119894is the 119894th eigenvalue of the matrix 119878 = 119884119884⊤

arranged in decreasing order of magnitudes 1198801 119880

119903is

the corresponding orthonormal eigenvectors system of thematrix 119878

Standard SVD terminology calls radic120582119894the 119894th singular

value of the matrix 119884 119880119894is the 119894th left singular vector and

119881119894is the 119894th right singular vector of 119884 The collectionradic120582

119894119880119894119881119894

is called 119894th eigentriple of the SVDThe computation of the optimal window length 119903 is based

on the eigenvalues differential entropyThe first 119905 eigenvaluesobtained with 119903 = 1199012 contain a high spread of energytherefore the window length is evaluated in the range [119903 =2 119905] by means of the differential entropy as follows

Δ119867119894= 119867119894+1minus 119867119894 119894 = 1 119905 minus 1 (5a)

119867119894= minus

119903

sum

119895=1

120582119895log2120582119895 (5b)

120582119895=

120582119895

sum119903

119896=1120582119895

(5c)

where Δ119867119894is the 119894th differential entropy 119867

119894is the 119894th

Shannon entropy and120582119895is the 119895th normalized eigenvalue also

known as eigenvalue energyThe embedding step is executedagain with this decomposition that reaches a high energyspread and lower differential entropy With the optimal 119903 theembedding and decomposition are computed again

The first eigentriple is used to obtain the elementarymatrix 119860 which will be used in the extraction of the lowfrequency component

119860 = radic12058211198801119881⊤

1 (6)

The step of diagonal averaging is applied over119860 to extract theelements of the component 119888

119871 the process is shown below

119888119871(119894 119895)

=

1

119896 minus 1

119896

sum

119898=1

119860 (119898 119896 minus 119898) 2 le 119896 le 119903

1

119903

119903

sum

119898=1

119860 (119898 119896 minus 119898) 119903 lt 119896 le 119902 + 1

1

119902 + 119903 minus 119896 + 1

119903

sum

119898=119896minus119902

119860 (119898 119896 minus 119898) 119902 + 2 le 119896 le 119902 + 119903

(7)

Once 119888119871is obtained the component 119888

119867is computed with (1)

22 Hankel Singular Value Decomposition The preprocess-ing based on Singular Value Decomposition of Hankel isimplemented in three steps embedding decomposition andextraction HSVD implements the steps of embedding anddecomposition as SSA (presented in Section 21) The ele-mentary matrix 119860 is also computed with the first eigentripleobtained in the decomposition step (as (6))

In the extraction step the elements of the low frequencycomponent 119888

119871are obtained from the first row and the last

column of the matrix 119860 which has the same structure asmatrix 119884 (trajectory matrix) therefore the elements of 119888

119871are

119888119871= [119860 (1 1) 119860 (1 2) sdot sdot sdot 119860 (1 119902) 119860 (2 119902) 119860 (3 119902) sdot sdot sdot 119860 (119903 119902)] (8)

where 119860 is a 119903 times 119902matrix

4 Mathematical Problems in Engineering

23 Prediction with the Autoregressive Method The predic-tion is the second stage of the traffic accidents forecastingmethodology (illustrated in Figure 1) In order to obtain thetraffic accidents prediction during the preprocessing stagethe low frequency component 119888

119871and the high frequency

component 119888119867were obtainedThe components are estimated

through the autoregressive method and the addition of thecomponents is computed to obtain the prediction as follows

119899+ℎ= 119888119871

119899+ℎ+ 119888119867

119899+ℎ (9)

where 119899 represents the time instant and ℎ represents thehorizon with values ℎ = 1 120591 The component 119888

119871is used

as exogenous variable in the computation of the 119888119867 due to a

high influence of 119888119871over 119888

119867

The predicted components via ARmodel are definedwith

119888119871

119899+ℎ=

119898minus1

sum

119894=0

120572119894119888119899minus119894

119871 (10a)

119888119867

119899+ℎ=

119898minus1

sum

119894=0

120573119894119888119899minus119894

119867+

119898minus1

sum

119894=0

120573119898+119894119888119899minus119894

119871 (10b)

where119898 is the number of lagged values and 120572119894and 120573

119894are the

coefficients of 119888119871and 119888119867 respectively

The coefficients estimation is based on linear Least SquareMethod (LSM) the components 119888

119871and 119888119867are defined with

the linear relationship expressed in matrix form

119888119871= 120572119877 (11a)

119888119867= 120579119885 (11b)

where 119877 and 119885 are the regressor matrices of 119888119871and 119888

119867

respectively 120572 and 120579 are the coefficients vectors of 119877 and 119885respectively The coefficients are computed with the Moore-Penrose pseudoinverse matrices 119877dagger and 119885dagger as follows

120572 = 119877dagger119888119871 (12a)

120579 = 119885dagger119888119867 (12b)

24 Prediction with the Autoregressive Neural Network Inthis case study a single hidden layer Autoregressive NeuralNetwork is used to approach each component the ANN hasa standard multilayer perceptron (MLP) structure of threelayers [28]The training subset is iteratively used to adjust theconnections weights via learning algorithm the ANN withthe lowest error is selected to implement the solution withthe testing subset

The nonlinear inputs are the lagged terms which arecontained in the regressor matrix119877 at hidden layer is appliedthe sigmoid transfer function and at output layer is obtainedthe prediction The ANN output is

119899+ℎ= V119895ℎ119899+ℎ (13a)

ℎ119899+ℎ= 119891[

119898minus1

sum

119894=0

119908119895119894119877119899+ℎminus119894

119894] (13b)

where is the predicted value 119899 is the time instant V119895and119908

119895119894

are the linear and nonlinear weights of the ANN connectionsrespectively the sigmoid transfer function is computed with

119891 (119909) =1

1 + 119890minus119909 (14)

The ANN structure for 119888119871

prediction is denoted withANN(119898 119895 1) with 119898 inputs 119895 hidden nodes and 1 out-put 119888119871while the ANN structure for 119888

119867is denoted with

ANN(2119898 119895 1) with 2119898 inputs 119895 hidden nodes and 1 output119888119867 Levenberg-Marquardt is the learning algorithm applied

for weight updating in both neural networks [29]

3 Efficiency Criteria

The forecasting accuracy is evaluated with conventionalmetrics and an improved evaluationmetricThe conventionalmetrics are Mean Absolute Percentage Error (MAPE) RootMean Square Error (RMSE) Determination Coefficient 1198772and Relative Error (RE) The Modified Nash-Sutcliffe Effi-ciency (MNSE) metric is computed in order to improve theevaluation criteria which is sensitive to significant overfittingor underfitting [30] Consider

MAPE = 1119901119905

119901119905

sum

119894=1

10038161003816100381610038161003816100381610038161003816

119909119894minus 119894

119909119894

10038161003816100381610038161003816100381610038161003816

100

RMSE = radic 1119901119905

119901119905

sum

119894=1

(119909119894minus 119894)2

1198772= [1 minus

var (119909 minus )var (119909)

] 100

RE = [(119909 minus )119909

] 100

MNSE = [1 minussum119901119905

119894=1

1003816100381610038161003816119909119894 minus 1198941003816100381610038161003816

sum119901119905

119894=1

1003816100381610038161003816119909119894 minus 1199091003816100381610038161003816

] 100

(15)

where 119909 is the observed signal is the predicted signal 119901119905is

the testing sample size and var is the varianceFurthermore two statistical tests are computed to eval-

uate the differences and superiorities of either model theWilcoxon test and Pitmanrsquos correlation test respectively

TheWilcoxon (119882) signed rank test evaluates the pairwisedifferences in the squares of each multistep ahead residualsthe differences are ranked in ascending order with no regardto the sign and the ranks are assigned fromone to the numberof the forecast errors available for comparison The sum ofthe ranks of positive differences is then computed to obtain119882 [31] The probability 119901 of finding a test statistic as or moreextreme than the observed value under the null hypothesis isfound using the 119885-statistics given by

119885 =119882 minus 119901

119905(119901119905+ 1) 4

radic119901119905(119901119905+ 1) (2119901

119905+ 1) 24

(16)

Mathematical Problems in Engineering 5

Pitmanrsquos correlations test is applied to identify the superiorityof a model in pairwise comparisons [32] the test is basedon the computation of the correlation 119877 between Υ and Ψ asfollows

119877 =cov (Υ Ψ)

var (Υ) var (Ψ) (17a)

Υ = 1198641 (119894) + 1198642 (119894) (17b)

Ψ = 1198641 (119894) minus 1198642 (119894) (17c)

where cov is the covariance and 119894 = 1 1199011199051198641is the residual

vector obtained with model 1 and 1198642is the residual vector

obtained with model 2 Model 1 would show superiority infront of model 2 at 5 significance level if |119877| gt 196radic119901119905

4 Case Study

The Chilean police (Carabineros de Chile) collects the fea-tures of the traffic accidents and CONASET records the dataThe regional population is estimated in 6061185 inhabitantsequivalent to 401 of the national population Santiagoshows a high rate of the events with severe injuries from 2000to 2014 with 260074 injured people

The entire series of 783 registers is shown in Figure 2which have been collected in the mentioned period fromJanuary to December with weekly sampling The highestnumber of injured people was observed in weeks 84 184254 and 280 while the lowest number of injured people wasobserved in weeks 344 365 426 500 and 756

One hundred causes of traffic accidents have been definedby CONASET and grouped into categories In this casestudy three analysis groups have been created In groups 1and 2 those causes directly related to behavior of driverspassengers or pedestrians have been prioritized Group 3involves the rest of the causes Figures 3(a) 4(a) and 5(a)show the observed time series of the groups Injured-G1Injured-G2 and Injured-G3 respectively the values havebeen normalized via division by the maximum value in eachtime series

The categories of groups 1 and 2 are as follows recklessdriving recklessness in passenger recklessness in pedestrianalcohol in driver alcohol in pedestrian and disobedience tosignal In Table 1 are shown 20 causes of groups 1 and 2 whichcover 75 of the events with injured people The causes arelisted sequentially the cause with the highest importance hasvalue 1 (with the highest number of injured people) and thecause with minor importance has value 20

From Table 1 the first three causes of injuries in trafficaccidents are as follows unwise distance inattention to trafficconditions and disrespect to red light The cause with thelowest importance for injuries in traffic accidents is drunkpedestrian The categories with the highest number of causesare as follows imprudent driving and disobedience to signal

Two groups of analysis were formed from the informationpresented in Table 1 the first group labeled with Injured-G1 corresponds to the first ten primary causes and thesecond group labeled with Injured-G2 corresponds to the tensecondary causes The first group overspreads around 60

1 54 107 160 213 266 319 372 425 478 531 584 637 690 743 783

200

250

300

350

400

450

500

550

Time (weeks)

Num

ber o

f inj

ured

peo

ple

Figure 2 Injured people in traffic accidents

of injured people in traffic accidents whereas the secondgroup overspreads around 15The third group labeled withInjured-G3 corresponds to the categories road deficienciesmechanical failures and undeterminednoncategorized causesthis group overspreads around 25 of injured people

Complementary information was observed about trafficaccidents conditions with high rate of injured people Withregard to vehicles automobiles are involved in 54 of eventsfollowed by vans and trucks with 19 bus and trolley with16 motorcycles and bicycles with 8 and others with 3With regard to environmental conditions 85 of events wasobserved with cloudless conditions Additionally 97 of thetraffic accidentswith injured people have taken place in urbanareas whereas 3 correspond to rural area With regard torelative position 46 of the events have been produced inintersections controlled by traffic signals or police officerswith 46 followed by accidents that happened in straightsections with 37 and other relative positions with 17

5 Empirical Research Result

The results of the methodology implementation with linearand nonlinear models are described by stages componentsextraction and prediction

51 Components Extraction The methodology presented inSection 2 describes the preprocessing stage and the predictionstage The preprocessing stage is based on two types ofmethods Singular Spectrum Analysis and Singular ValueDecomposition of Hankel

Both preprocessing techniques SSA and HSVD embedthe time series in a structure of two dimensions the initialwindow length used is 119903 = 1199012 Once matrix 119884 is obtainedthe decomposition is computedThe differential energy of theeigenvalues is obtained with (5a) A high energy content wasobserved in the first 119905 = 20 eigenvalues the lowest differentialenergywas observed in decompositions based on values of 1517 and 16 of window length for Injured-G1 Injured-G2 andInjured-G3 respectively and these values were set as windowlength

The embedding process is implemented again with theoptimal window 119903 and the decomposition is recomputedThe first elementary matrix 119860 is used by SSA and HSVD to

6 Mathematical Problems in Engineering

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Time (weeks)

Inju

red-

G1

02

04

06

08

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

Time (weeks)

c L

SSAHSVD

(b)

Time (weeks)1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

minus05

0

05

c H

SSAHSVD

(c)

Figure 3 (a) Injured-G1 (b) Low frequency component (c) High frequency component

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 7830

1

Inju

red-

G2

Time (weeks)

02040608

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783Time (weeks)

02

04

06

08

c L

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783minus04minus02

0

Time (weeks)

0204

c H

SSAHSVD

(c)

Figure 4 (a) Injured-G2 (b) Low frequency component (c) High frequency component

obtain the low frequency 119888119871component In SSA to extract

the elements of119860 diagonal averaging is applied while HSVDuses direct extraction from the first row and last column of119860 Finally with (1) the component of high frequency 119888

119867was

computedEach data set of low and high frequency has been divided

into two subsets training and testing the training subset

involves 70 of the samples and consequently the testingsubset involves the remaining 30

The decomposition results by means of SSA and HSVDFigures 3(b) 4(b) and 5(b) show the low frequency com-ponents whereas Figures 3(c) 4(c) and 5(c) show thehigh frequency components of Injured-G1 Injured-G2 andInjured-G3 respectively

Mathematical Problems in Engineering 7

Table 1 Causes of injuries in traffic accidents (group 1 and group 2)

Category Number Cause Importance

Imprudent driving

1 Unwise distance 12 Inattention to traffic conditions 23 Disrespect to pedestrian passing 84 Disrespect for giving the right of way 95 Unexpected change of track 106 Improper turns 117 Overtaking without enough time or space 148 Opposite direction 189 Backward driving 19

Disobedience to signal

10 Disrespect to red light 311 Disrespect to stop sign 412 Disrespect to give way sign 613 Improper speed 13

Alcohol in driver 14 Drunk driver 715 Driving under the influence of alcohol 15

Recklessness in pedestrian16 Pedestrian crossing the road suddenly 517 Reckless pedestrian 1218 Pedestrian outside the allowed crossing 17

Recklessness in passenger 19 Get in or get out of a moving vehicle 16Alcohol in pedestrian 20 Drunk pedestrian 20

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Inju

red-

G3

02040608

Time (weeks)

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

c L

Time (weeks)

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

0

05

Time (weeks)

minus05

c H

SSAHSVD

(c)

Figure 5 (a) Injured-G3 (b) Low frequency component (c) High frequency component

The nonstationary trend in the signals was verified forInjured-G1 Injured-G2 and Injured-G3 through Kwiatkow-ski Phillips Schmidt and Shin test (KPSS) [33] The testassesses the null hypothesis that the signals are trend sta-tionary it was rejected at 5 significance level consequentlynonstationary unit-root processes are present in the signals

The time series Injured-G1 is observed in Figure 3(a)similar dynamic is observed with respect to full series takinginto consideration that this group contains the predominantcauses The 119888

119871components extracted by SSA and HSVD

are shown in Figure 3(b) long-memory periodicity featureswere observedThe 119888

119867resultant components are presented in

8 Mathematical Problems in Engineering

Figure 3(c) short-term periodic fluctuations were identifiedThe components obtained with SSA and HSVD are similarhowever a slight difference is observed in the componentsof low frequency SSA extracts smoother components thanHSVD

Both techniques SSA and HSVD show that the principalten causes (in Table 1 with importance 1 to 10) present thehighest incidence between years 2002 and 2005 (weeks 106 to312) it descends from 2006 until half 2012 (around week 710)an increment is observed between weeks 711 and 732 (secondsemester of 2013 and first semester of 2014)

Figure 4 shows the time series Injured-G2 the lowfrequency component and the high frequency componentAs previous series the components of low frequency showsimilar dynamic of slow fluctuations with decreasing trend(119888119871) while the high frequency shows fast fluctuations In this

group is also observed 119888119871via SSA smoother than 119888

119871viaHSVD

Both techniques SSA and HSVD show that the secondaryten causes (described in Table 1 with importance 11 to 20)present the highest incidence of injured people in years 20002003 and 2004 it descends from 2005 therefore forwarddowntrend is observed in the number of injured people intraffic accidents due to the 10 secondary causes

Figure 5 shows Injured-G3 and its components 119888119871and 119888119867

as in previous analysis the components of low frequency showlong-memory periodicity features whereas the componentsof high frequency show short-term periodic fluctuations and119888119871via SSA is smoother than 119888

119871via HSVD

Both techniques SSA and HSVD show that the causesare related to road deficiencies mechanical failures andundeterminednoncategorized causes (with 25 of incidence)The highest incidence is observed in year 2001 from year2002 it presents strong decay until 2006 and forward uptrendis observed with a temporal decrease in 2009

Prevention plans and punitive laws have been imple-mented in Chile during the analyzed period via educationdrivers licensing reforms zero tolerance law Emiliarsquos law andtransit law reforms among others The effect of a particularpreventive or punitive action is not analyzed in this workhowever the proposed short-term prediction methodologybased on observed causes and intrinsic components is a con-tribution to government and society in preventive plans for-mulation its implementation and the consequent evaluation

52 Prediction The prediction is implemented by means ofthe autoregressive models linear (AR) or nonlinear (ANN)The models use the lagged terms of the components 119888

119871and

119888119867 the optimal number of the lagged terms was fixed in

119898 = 32 weeks which was found through the computation ofthe Autocorrelation Function over the observed time seriesof injured people due to all causes

The models based on Singular Spectrum Analysis (SSA-AR and SSA-ANN) receive the components of SSA prepro-cessing stage whereas the models based on Singular ValueDecomposition of Hankel (HSVD-AR and HSVD-ANN)receive the components of HSVD preprocessing stage

TheANNhas single hidden layer structure (32 1 1) with32 inputs 1 hidden node and 1 outputThe LM algorithmwasused iteratively to adjust the linear and nonlinear weights

The direct method was used to develop multistep aheadforecasting in Tables 2 and 3 are presented the averageprediction results with hybrid models SSA-AR SSA-ANNHSVD-AR and HSVD-ANN with the three time series Thearithmetic mean of the resultant metrics is presented inTables 2 and 3 the results shows that the accuracy decreasesas the time horizon increases therefore the best accuracywasobtained for the nearest weeks and the lowest accuracy wasobtained for the farthest weeks The best mean accuracy wasreached by using SSA-AR model with MNSE of 926 1198772 of993 MAPE of 15 and RMSE of 07 The lowest meanaccuracy was obtained with HSVD-ANNmodel with MNSEof 8561198772 of 982MAPE of 29 andRMSE of 14Thesecond best average accuracy was reached by SSA-ANN andthe third best accuracy was reached by HSVD-AR

The highest gain in average MNSE from 1- to 14-stepahead prediction is 76 while average MAPE is 933However it was observed that HSVD-AR shows a higheraccuracy in farthest horizons from 12- to 14-step aheadprediction which reaches these average results MNSE of894 1198772 of 989 MAPE of 22 and of RMSE 10 thegain in average MNSE from 12- to 14-step ahead prediction is121 and on average MAPE is 833

Prediction horizons higher than 14 weeks provide inaccu-rate results

From previous tables similar accuracy was identified inthe prediction through SSA-AR HSVD-AR and SSA-ANNfor 11-step ahead prediction The predicted signals are shownin Figures 6 7 and 8 whereas the metrics residuals arepresented in Tables 4 5 and 6

The results for 11-step ahead prediction of Injured-G1are shown in Figures 6(a) 6(c) 6(e) and Table 4 Fromthese figures and metrics a good fit is observed the highestaccuracy was reached via SSA-AR withMNSE of 876 1198772 of986 MAPE of 20 and RMSE of 11

In Figures 6(b) 6(d) and 6(f) the Relative Error ofInjured-G2 prediction is shown The model SSA-AR shows947 of the predicted points with Relative Error lowerthan plusmn5 HSVD-AR 867 and SSA-ANN 831 The gainwas computed by means of residual metrics and the twobest models (SSA-AR and HSVD-AR) the highest gain wasobserved in MAPE with 25

The results for 11-step ahead prediction of Injured-G2are shown in Figures 7(a) 7(c) and 7(e) and Table 5 allmodels achieve a good fit SSA-AR and HSVD-AR reachthe highest and similar accuracy In Figures 7(b) 7(d) and7(f) the Relative Error is shown SSA-AR presents 858of the predicted points with a Relative Error lower thanplusmn5 HSVD-AR 853 and SSA-ANN 80 The gain wascomputed based on residual metrics and the two best modelsthe highest gain was observed in MAPE with 107

The results for 11-step ahead prediction of Injured-G3are presented in Figures 8(a) 8(c) and 8(e) and Table 6 allmodels achieve also a goodfit and similar accuracy In Figures8(b) 8(d) and 8(f) the Relative Error is illustrated SSA-ARpresents 924 of predicted points with a Relative Error lowerthan plusmn5 HSVD-AR 942 and SSA-ANN 916 The gain

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 3: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

Mathematical Problems in Engineering 3

x r

Y

A

A

+

120582UV

120582UVcL

cLcL

cL cH

cH

x

Prep

roce

ssin

g SS

A

Embedding

Singular valuedecomposition

Diagonalaveraging

Preprocessing HSVD

Extraction HSVD

ARor

ANNPrediction

cL

cH

Figure 1 Traffic accidents forecasting methodology

Conventional SSA is implemented in four steps embeddingdecomposition grouping and diagonal averaging [24] Inthis work SSA is simplified in three steps embeddingdecomposition and diagonal averaging

The embedding step maps the time series 119909 of length 119901 toa sequence of multidimensional lagged vectors the Hankelmatrix structure is used in the embedding

119884 =(

(

11990911199092sdot sdot sdot 119909

119902

11990921199093sdot sdot sdot 119909119902+1

119909119903119909119903+1

sdot sdot sdot 119909119901

)

)

(2)

where the elements 119884(119894 119895) = 119909119894+119895minus1

The window length 119903 hasan important role in the forecasting model it has an initialvalue of 119903 = 1199012 and 119902 is computed as follows

119902 = 119901 minus 119903 + 1 (3)

The Singular Value Decomposition of the real matrix 119884 hasthe form

119884 =

119903

sum

119894=1

radic120582119894119880119894119881⊤

119894 (4)

where each 120582119894is the 119894th eigenvalue of the matrix 119878 = 119884119884⊤

arranged in decreasing order of magnitudes 1198801 119880

119903is

the corresponding orthonormal eigenvectors system of thematrix 119878

Standard SVD terminology calls radic120582119894the 119894th singular

value of the matrix 119884 119880119894is the 119894th left singular vector and

119881119894is the 119894th right singular vector of 119884 The collectionradic120582

119894119880119894119881119894

is called 119894th eigentriple of the SVDThe computation of the optimal window length 119903 is based

on the eigenvalues differential entropyThe first 119905 eigenvaluesobtained with 119903 = 1199012 contain a high spread of energytherefore the window length is evaluated in the range [119903 =2 119905] by means of the differential entropy as follows

Δ119867119894= 119867119894+1minus 119867119894 119894 = 1 119905 minus 1 (5a)

119867119894= minus

119903

sum

119895=1

120582119895log2120582119895 (5b)

120582119895=

120582119895

sum119903

119896=1120582119895

(5c)

where Δ119867119894is the 119894th differential entropy 119867

119894is the 119894th

Shannon entropy and120582119895is the 119895th normalized eigenvalue also

known as eigenvalue energyThe embedding step is executedagain with this decomposition that reaches a high energyspread and lower differential entropy With the optimal 119903 theembedding and decomposition are computed again

The first eigentriple is used to obtain the elementarymatrix 119860 which will be used in the extraction of the lowfrequency component

119860 = radic12058211198801119881⊤

1 (6)

The step of diagonal averaging is applied over119860 to extract theelements of the component 119888

119871 the process is shown below

119888119871(119894 119895)

=

1

119896 minus 1

119896

sum

119898=1

119860 (119898 119896 minus 119898) 2 le 119896 le 119903

1

119903

119903

sum

119898=1

119860 (119898 119896 minus 119898) 119903 lt 119896 le 119902 + 1

1

119902 + 119903 minus 119896 + 1

119903

sum

119898=119896minus119902

119860 (119898 119896 minus 119898) 119902 + 2 le 119896 le 119902 + 119903

(7)

Once 119888119871is obtained the component 119888

119867is computed with (1)

22 Hankel Singular Value Decomposition The preprocess-ing based on Singular Value Decomposition of Hankel isimplemented in three steps embedding decomposition andextraction HSVD implements the steps of embedding anddecomposition as SSA (presented in Section 21) The ele-mentary matrix 119860 is also computed with the first eigentripleobtained in the decomposition step (as (6))

In the extraction step the elements of the low frequencycomponent 119888

119871are obtained from the first row and the last

column of the matrix 119860 which has the same structure asmatrix 119884 (trajectory matrix) therefore the elements of 119888

119871are

119888119871= [119860 (1 1) 119860 (1 2) sdot sdot sdot 119860 (1 119902) 119860 (2 119902) 119860 (3 119902) sdot sdot sdot 119860 (119903 119902)] (8)

where 119860 is a 119903 times 119902matrix

4 Mathematical Problems in Engineering

23 Prediction with the Autoregressive Method The predic-tion is the second stage of the traffic accidents forecastingmethodology (illustrated in Figure 1) In order to obtain thetraffic accidents prediction during the preprocessing stagethe low frequency component 119888

119871and the high frequency

component 119888119867were obtainedThe components are estimated

through the autoregressive method and the addition of thecomponents is computed to obtain the prediction as follows

119899+ℎ= 119888119871

119899+ℎ+ 119888119867

119899+ℎ (9)

where 119899 represents the time instant and ℎ represents thehorizon with values ℎ = 1 120591 The component 119888

119871is used

as exogenous variable in the computation of the 119888119867 due to a

high influence of 119888119871over 119888

119867

The predicted components via ARmodel are definedwith

119888119871

119899+ℎ=

119898minus1

sum

119894=0

120572119894119888119899minus119894

119871 (10a)

119888119867

119899+ℎ=

119898minus1

sum

119894=0

120573119894119888119899minus119894

119867+

119898minus1

sum

119894=0

120573119898+119894119888119899minus119894

119871 (10b)

where119898 is the number of lagged values and 120572119894and 120573

119894are the

coefficients of 119888119871and 119888119867 respectively

The coefficients estimation is based on linear Least SquareMethod (LSM) the components 119888

119871and 119888119867are defined with

the linear relationship expressed in matrix form

119888119871= 120572119877 (11a)

119888119867= 120579119885 (11b)

where 119877 and 119885 are the regressor matrices of 119888119871and 119888

119867

respectively 120572 and 120579 are the coefficients vectors of 119877 and 119885respectively The coefficients are computed with the Moore-Penrose pseudoinverse matrices 119877dagger and 119885dagger as follows

120572 = 119877dagger119888119871 (12a)

120579 = 119885dagger119888119867 (12b)

24 Prediction with the Autoregressive Neural Network Inthis case study a single hidden layer Autoregressive NeuralNetwork is used to approach each component the ANN hasa standard multilayer perceptron (MLP) structure of threelayers [28]The training subset is iteratively used to adjust theconnections weights via learning algorithm the ANN withthe lowest error is selected to implement the solution withthe testing subset

The nonlinear inputs are the lagged terms which arecontained in the regressor matrix119877 at hidden layer is appliedthe sigmoid transfer function and at output layer is obtainedthe prediction The ANN output is

119899+ℎ= V119895ℎ119899+ℎ (13a)

ℎ119899+ℎ= 119891[

119898minus1

sum

119894=0

119908119895119894119877119899+ℎminus119894

119894] (13b)

where is the predicted value 119899 is the time instant V119895and119908

119895119894

are the linear and nonlinear weights of the ANN connectionsrespectively the sigmoid transfer function is computed with

119891 (119909) =1

1 + 119890minus119909 (14)

The ANN structure for 119888119871

prediction is denoted withANN(119898 119895 1) with 119898 inputs 119895 hidden nodes and 1 out-put 119888119871while the ANN structure for 119888

119867is denoted with

ANN(2119898 119895 1) with 2119898 inputs 119895 hidden nodes and 1 output119888119867 Levenberg-Marquardt is the learning algorithm applied

for weight updating in both neural networks [29]

3 Efficiency Criteria

The forecasting accuracy is evaluated with conventionalmetrics and an improved evaluationmetricThe conventionalmetrics are Mean Absolute Percentage Error (MAPE) RootMean Square Error (RMSE) Determination Coefficient 1198772and Relative Error (RE) The Modified Nash-Sutcliffe Effi-ciency (MNSE) metric is computed in order to improve theevaluation criteria which is sensitive to significant overfittingor underfitting [30] Consider

MAPE = 1119901119905

119901119905

sum

119894=1

10038161003816100381610038161003816100381610038161003816

119909119894minus 119894

119909119894

10038161003816100381610038161003816100381610038161003816

100

RMSE = radic 1119901119905

119901119905

sum

119894=1

(119909119894minus 119894)2

1198772= [1 minus

var (119909 minus )var (119909)

] 100

RE = [(119909 minus )119909

] 100

MNSE = [1 minussum119901119905

119894=1

1003816100381610038161003816119909119894 minus 1198941003816100381610038161003816

sum119901119905

119894=1

1003816100381610038161003816119909119894 minus 1199091003816100381610038161003816

] 100

(15)

where 119909 is the observed signal is the predicted signal 119901119905is

the testing sample size and var is the varianceFurthermore two statistical tests are computed to eval-

uate the differences and superiorities of either model theWilcoxon test and Pitmanrsquos correlation test respectively

TheWilcoxon (119882) signed rank test evaluates the pairwisedifferences in the squares of each multistep ahead residualsthe differences are ranked in ascending order with no regardto the sign and the ranks are assigned fromone to the numberof the forecast errors available for comparison The sum ofthe ranks of positive differences is then computed to obtain119882 [31] The probability 119901 of finding a test statistic as or moreextreme than the observed value under the null hypothesis isfound using the 119885-statistics given by

119885 =119882 minus 119901

119905(119901119905+ 1) 4

radic119901119905(119901119905+ 1) (2119901

119905+ 1) 24

(16)

Mathematical Problems in Engineering 5

Pitmanrsquos correlations test is applied to identify the superiorityof a model in pairwise comparisons [32] the test is basedon the computation of the correlation 119877 between Υ and Ψ asfollows

119877 =cov (Υ Ψ)

var (Υ) var (Ψ) (17a)

Υ = 1198641 (119894) + 1198642 (119894) (17b)

Ψ = 1198641 (119894) minus 1198642 (119894) (17c)

where cov is the covariance and 119894 = 1 1199011199051198641is the residual

vector obtained with model 1 and 1198642is the residual vector

obtained with model 2 Model 1 would show superiority infront of model 2 at 5 significance level if |119877| gt 196radic119901119905

4 Case Study

The Chilean police (Carabineros de Chile) collects the fea-tures of the traffic accidents and CONASET records the dataThe regional population is estimated in 6061185 inhabitantsequivalent to 401 of the national population Santiagoshows a high rate of the events with severe injuries from 2000to 2014 with 260074 injured people

The entire series of 783 registers is shown in Figure 2which have been collected in the mentioned period fromJanuary to December with weekly sampling The highestnumber of injured people was observed in weeks 84 184254 and 280 while the lowest number of injured people wasobserved in weeks 344 365 426 500 and 756

One hundred causes of traffic accidents have been definedby CONASET and grouped into categories In this casestudy three analysis groups have been created In groups 1and 2 those causes directly related to behavior of driverspassengers or pedestrians have been prioritized Group 3involves the rest of the causes Figures 3(a) 4(a) and 5(a)show the observed time series of the groups Injured-G1Injured-G2 and Injured-G3 respectively the values havebeen normalized via division by the maximum value in eachtime series

The categories of groups 1 and 2 are as follows recklessdriving recklessness in passenger recklessness in pedestrianalcohol in driver alcohol in pedestrian and disobedience tosignal In Table 1 are shown 20 causes of groups 1 and 2 whichcover 75 of the events with injured people The causes arelisted sequentially the cause with the highest importance hasvalue 1 (with the highest number of injured people) and thecause with minor importance has value 20

From Table 1 the first three causes of injuries in trafficaccidents are as follows unwise distance inattention to trafficconditions and disrespect to red light The cause with thelowest importance for injuries in traffic accidents is drunkpedestrian The categories with the highest number of causesare as follows imprudent driving and disobedience to signal

Two groups of analysis were formed from the informationpresented in Table 1 the first group labeled with Injured-G1 corresponds to the first ten primary causes and thesecond group labeled with Injured-G2 corresponds to the tensecondary causes The first group overspreads around 60

1 54 107 160 213 266 319 372 425 478 531 584 637 690 743 783

200

250

300

350

400

450

500

550

Time (weeks)

Num

ber o

f inj

ured

peo

ple

Figure 2 Injured people in traffic accidents

of injured people in traffic accidents whereas the secondgroup overspreads around 15The third group labeled withInjured-G3 corresponds to the categories road deficienciesmechanical failures and undeterminednoncategorized causesthis group overspreads around 25 of injured people

Complementary information was observed about trafficaccidents conditions with high rate of injured people Withregard to vehicles automobiles are involved in 54 of eventsfollowed by vans and trucks with 19 bus and trolley with16 motorcycles and bicycles with 8 and others with 3With regard to environmental conditions 85 of events wasobserved with cloudless conditions Additionally 97 of thetraffic accidentswith injured people have taken place in urbanareas whereas 3 correspond to rural area With regard torelative position 46 of the events have been produced inintersections controlled by traffic signals or police officerswith 46 followed by accidents that happened in straightsections with 37 and other relative positions with 17

5 Empirical Research Result

The results of the methodology implementation with linearand nonlinear models are described by stages componentsextraction and prediction

51 Components Extraction The methodology presented inSection 2 describes the preprocessing stage and the predictionstage The preprocessing stage is based on two types ofmethods Singular Spectrum Analysis and Singular ValueDecomposition of Hankel

Both preprocessing techniques SSA and HSVD embedthe time series in a structure of two dimensions the initialwindow length used is 119903 = 1199012 Once matrix 119884 is obtainedthe decomposition is computedThe differential energy of theeigenvalues is obtained with (5a) A high energy content wasobserved in the first 119905 = 20 eigenvalues the lowest differentialenergywas observed in decompositions based on values of 1517 and 16 of window length for Injured-G1 Injured-G2 andInjured-G3 respectively and these values were set as windowlength

The embedding process is implemented again with theoptimal window 119903 and the decomposition is recomputedThe first elementary matrix 119860 is used by SSA and HSVD to

6 Mathematical Problems in Engineering

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Time (weeks)

Inju

red-

G1

02

04

06

08

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

Time (weeks)

c L

SSAHSVD

(b)

Time (weeks)1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

minus05

0

05

c H

SSAHSVD

(c)

Figure 3 (a) Injured-G1 (b) Low frequency component (c) High frequency component

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 7830

1

Inju

red-

G2

Time (weeks)

02040608

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783Time (weeks)

02

04

06

08

c L

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783minus04minus02

0

Time (weeks)

0204

c H

SSAHSVD

(c)

Figure 4 (a) Injured-G2 (b) Low frequency component (c) High frequency component

obtain the low frequency 119888119871component In SSA to extract

the elements of119860 diagonal averaging is applied while HSVDuses direct extraction from the first row and last column of119860 Finally with (1) the component of high frequency 119888

119867was

computedEach data set of low and high frequency has been divided

into two subsets training and testing the training subset

involves 70 of the samples and consequently the testingsubset involves the remaining 30

The decomposition results by means of SSA and HSVDFigures 3(b) 4(b) and 5(b) show the low frequency com-ponents whereas Figures 3(c) 4(c) and 5(c) show thehigh frequency components of Injured-G1 Injured-G2 andInjured-G3 respectively

Mathematical Problems in Engineering 7

Table 1 Causes of injuries in traffic accidents (group 1 and group 2)

Category Number Cause Importance

Imprudent driving

1 Unwise distance 12 Inattention to traffic conditions 23 Disrespect to pedestrian passing 84 Disrespect for giving the right of way 95 Unexpected change of track 106 Improper turns 117 Overtaking without enough time or space 148 Opposite direction 189 Backward driving 19

Disobedience to signal

10 Disrespect to red light 311 Disrespect to stop sign 412 Disrespect to give way sign 613 Improper speed 13

Alcohol in driver 14 Drunk driver 715 Driving under the influence of alcohol 15

Recklessness in pedestrian16 Pedestrian crossing the road suddenly 517 Reckless pedestrian 1218 Pedestrian outside the allowed crossing 17

Recklessness in passenger 19 Get in or get out of a moving vehicle 16Alcohol in pedestrian 20 Drunk pedestrian 20

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Inju

red-

G3

02040608

Time (weeks)

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

c L

Time (weeks)

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

0

05

Time (weeks)

minus05

c H

SSAHSVD

(c)

Figure 5 (a) Injured-G3 (b) Low frequency component (c) High frequency component

The nonstationary trend in the signals was verified forInjured-G1 Injured-G2 and Injured-G3 through Kwiatkow-ski Phillips Schmidt and Shin test (KPSS) [33] The testassesses the null hypothesis that the signals are trend sta-tionary it was rejected at 5 significance level consequentlynonstationary unit-root processes are present in the signals

The time series Injured-G1 is observed in Figure 3(a)similar dynamic is observed with respect to full series takinginto consideration that this group contains the predominantcauses The 119888

119871components extracted by SSA and HSVD

are shown in Figure 3(b) long-memory periodicity featureswere observedThe 119888

119867resultant components are presented in

8 Mathematical Problems in Engineering

Figure 3(c) short-term periodic fluctuations were identifiedThe components obtained with SSA and HSVD are similarhowever a slight difference is observed in the componentsof low frequency SSA extracts smoother components thanHSVD

Both techniques SSA and HSVD show that the principalten causes (in Table 1 with importance 1 to 10) present thehighest incidence between years 2002 and 2005 (weeks 106 to312) it descends from 2006 until half 2012 (around week 710)an increment is observed between weeks 711 and 732 (secondsemester of 2013 and first semester of 2014)

Figure 4 shows the time series Injured-G2 the lowfrequency component and the high frequency componentAs previous series the components of low frequency showsimilar dynamic of slow fluctuations with decreasing trend(119888119871) while the high frequency shows fast fluctuations In this

group is also observed 119888119871via SSA smoother than 119888

119871viaHSVD

Both techniques SSA and HSVD show that the secondaryten causes (described in Table 1 with importance 11 to 20)present the highest incidence of injured people in years 20002003 and 2004 it descends from 2005 therefore forwarddowntrend is observed in the number of injured people intraffic accidents due to the 10 secondary causes

Figure 5 shows Injured-G3 and its components 119888119871and 119888119867

as in previous analysis the components of low frequency showlong-memory periodicity features whereas the componentsof high frequency show short-term periodic fluctuations and119888119871via SSA is smoother than 119888

119871via HSVD

Both techniques SSA and HSVD show that the causesare related to road deficiencies mechanical failures andundeterminednoncategorized causes (with 25 of incidence)The highest incidence is observed in year 2001 from year2002 it presents strong decay until 2006 and forward uptrendis observed with a temporal decrease in 2009

Prevention plans and punitive laws have been imple-mented in Chile during the analyzed period via educationdrivers licensing reforms zero tolerance law Emiliarsquos law andtransit law reforms among others The effect of a particularpreventive or punitive action is not analyzed in this workhowever the proposed short-term prediction methodologybased on observed causes and intrinsic components is a con-tribution to government and society in preventive plans for-mulation its implementation and the consequent evaluation

52 Prediction The prediction is implemented by means ofthe autoregressive models linear (AR) or nonlinear (ANN)The models use the lagged terms of the components 119888

119871and

119888119867 the optimal number of the lagged terms was fixed in

119898 = 32 weeks which was found through the computation ofthe Autocorrelation Function over the observed time seriesof injured people due to all causes

The models based on Singular Spectrum Analysis (SSA-AR and SSA-ANN) receive the components of SSA prepro-cessing stage whereas the models based on Singular ValueDecomposition of Hankel (HSVD-AR and HSVD-ANN)receive the components of HSVD preprocessing stage

TheANNhas single hidden layer structure (32 1 1) with32 inputs 1 hidden node and 1 outputThe LM algorithmwasused iteratively to adjust the linear and nonlinear weights

The direct method was used to develop multistep aheadforecasting in Tables 2 and 3 are presented the averageprediction results with hybrid models SSA-AR SSA-ANNHSVD-AR and HSVD-ANN with the three time series Thearithmetic mean of the resultant metrics is presented inTables 2 and 3 the results shows that the accuracy decreasesas the time horizon increases therefore the best accuracywasobtained for the nearest weeks and the lowest accuracy wasobtained for the farthest weeks The best mean accuracy wasreached by using SSA-AR model with MNSE of 926 1198772 of993 MAPE of 15 and RMSE of 07 The lowest meanaccuracy was obtained with HSVD-ANNmodel with MNSEof 8561198772 of 982MAPE of 29 andRMSE of 14Thesecond best average accuracy was reached by SSA-ANN andthe third best accuracy was reached by HSVD-AR

The highest gain in average MNSE from 1- to 14-stepahead prediction is 76 while average MAPE is 933However it was observed that HSVD-AR shows a higheraccuracy in farthest horizons from 12- to 14-step aheadprediction which reaches these average results MNSE of894 1198772 of 989 MAPE of 22 and of RMSE 10 thegain in average MNSE from 12- to 14-step ahead prediction is121 and on average MAPE is 833

Prediction horizons higher than 14 weeks provide inaccu-rate results

From previous tables similar accuracy was identified inthe prediction through SSA-AR HSVD-AR and SSA-ANNfor 11-step ahead prediction The predicted signals are shownin Figures 6 7 and 8 whereas the metrics residuals arepresented in Tables 4 5 and 6

The results for 11-step ahead prediction of Injured-G1are shown in Figures 6(a) 6(c) 6(e) and Table 4 Fromthese figures and metrics a good fit is observed the highestaccuracy was reached via SSA-AR withMNSE of 876 1198772 of986 MAPE of 20 and RMSE of 11

In Figures 6(b) 6(d) and 6(f) the Relative Error ofInjured-G2 prediction is shown The model SSA-AR shows947 of the predicted points with Relative Error lowerthan plusmn5 HSVD-AR 867 and SSA-ANN 831 The gainwas computed by means of residual metrics and the twobest models (SSA-AR and HSVD-AR) the highest gain wasobserved in MAPE with 25

The results for 11-step ahead prediction of Injured-G2are shown in Figures 7(a) 7(c) and 7(e) and Table 5 allmodels achieve a good fit SSA-AR and HSVD-AR reachthe highest and similar accuracy In Figures 7(b) 7(d) and7(f) the Relative Error is shown SSA-AR presents 858of the predicted points with a Relative Error lower thanplusmn5 HSVD-AR 853 and SSA-ANN 80 The gain wascomputed based on residual metrics and the two best modelsthe highest gain was observed in MAPE with 107

The results for 11-step ahead prediction of Injured-G3are presented in Figures 8(a) 8(c) and 8(e) and Table 6 allmodels achieve also a goodfit and similar accuracy In Figures8(b) 8(d) and 8(f) the Relative Error is illustrated SSA-ARpresents 924 of predicted points with a Relative Error lowerthan plusmn5 HSVD-AR 942 and SSA-ANN 916 The gain

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 4: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

4 Mathematical Problems in Engineering

23 Prediction with the Autoregressive Method The predic-tion is the second stage of the traffic accidents forecastingmethodology (illustrated in Figure 1) In order to obtain thetraffic accidents prediction during the preprocessing stagethe low frequency component 119888

119871and the high frequency

component 119888119867were obtainedThe components are estimated

through the autoregressive method and the addition of thecomponents is computed to obtain the prediction as follows

119899+ℎ= 119888119871

119899+ℎ+ 119888119867

119899+ℎ (9)

where 119899 represents the time instant and ℎ represents thehorizon with values ℎ = 1 120591 The component 119888

119871is used

as exogenous variable in the computation of the 119888119867 due to a

high influence of 119888119871over 119888

119867

The predicted components via ARmodel are definedwith

119888119871

119899+ℎ=

119898minus1

sum

119894=0

120572119894119888119899minus119894

119871 (10a)

119888119867

119899+ℎ=

119898minus1

sum

119894=0

120573119894119888119899minus119894

119867+

119898minus1

sum

119894=0

120573119898+119894119888119899minus119894

119871 (10b)

where119898 is the number of lagged values and 120572119894and 120573

119894are the

coefficients of 119888119871and 119888119867 respectively

The coefficients estimation is based on linear Least SquareMethod (LSM) the components 119888

119871and 119888119867are defined with

the linear relationship expressed in matrix form

119888119871= 120572119877 (11a)

119888119867= 120579119885 (11b)

where 119877 and 119885 are the regressor matrices of 119888119871and 119888

119867

respectively 120572 and 120579 are the coefficients vectors of 119877 and 119885respectively The coefficients are computed with the Moore-Penrose pseudoinverse matrices 119877dagger and 119885dagger as follows

120572 = 119877dagger119888119871 (12a)

120579 = 119885dagger119888119867 (12b)

24 Prediction with the Autoregressive Neural Network Inthis case study a single hidden layer Autoregressive NeuralNetwork is used to approach each component the ANN hasa standard multilayer perceptron (MLP) structure of threelayers [28]The training subset is iteratively used to adjust theconnections weights via learning algorithm the ANN withthe lowest error is selected to implement the solution withthe testing subset

The nonlinear inputs are the lagged terms which arecontained in the regressor matrix119877 at hidden layer is appliedthe sigmoid transfer function and at output layer is obtainedthe prediction The ANN output is

119899+ℎ= V119895ℎ119899+ℎ (13a)

ℎ119899+ℎ= 119891[

119898minus1

sum

119894=0

119908119895119894119877119899+ℎminus119894

119894] (13b)

where is the predicted value 119899 is the time instant V119895and119908

119895119894

are the linear and nonlinear weights of the ANN connectionsrespectively the sigmoid transfer function is computed with

119891 (119909) =1

1 + 119890minus119909 (14)

The ANN structure for 119888119871

prediction is denoted withANN(119898 119895 1) with 119898 inputs 119895 hidden nodes and 1 out-put 119888119871while the ANN structure for 119888

119867is denoted with

ANN(2119898 119895 1) with 2119898 inputs 119895 hidden nodes and 1 output119888119867 Levenberg-Marquardt is the learning algorithm applied

for weight updating in both neural networks [29]

3 Efficiency Criteria

The forecasting accuracy is evaluated with conventionalmetrics and an improved evaluationmetricThe conventionalmetrics are Mean Absolute Percentage Error (MAPE) RootMean Square Error (RMSE) Determination Coefficient 1198772and Relative Error (RE) The Modified Nash-Sutcliffe Effi-ciency (MNSE) metric is computed in order to improve theevaluation criteria which is sensitive to significant overfittingor underfitting [30] Consider

MAPE = 1119901119905

119901119905

sum

119894=1

10038161003816100381610038161003816100381610038161003816

119909119894minus 119894

119909119894

10038161003816100381610038161003816100381610038161003816

100

RMSE = radic 1119901119905

119901119905

sum

119894=1

(119909119894minus 119894)2

1198772= [1 minus

var (119909 minus )var (119909)

] 100

RE = [(119909 minus )119909

] 100

MNSE = [1 minussum119901119905

119894=1

1003816100381610038161003816119909119894 minus 1198941003816100381610038161003816

sum119901119905

119894=1

1003816100381610038161003816119909119894 minus 1199091003816100381610038161003816

] 100

(15)

where 119909 is the observed signal is the predicted signal 119901119905is

the testing sample size and var is the varianceFurthermore two statistical tests are computed to eval-

uate the differences and superiorities of either model theWilcoxon test and Pitmanrsquos correlation test respectively

TheWilcoxon (119882) signed rank test evaluates the pairwisedifferences in the squares of each multistep ahead residualsthe differences are ranked in ascending order with no regardto the sign and the ranks are assigned fromone to the numberof the forecast errors available for comparison The sum ofthe ranks of positive differences is then computed to obtain119882 [31] The probability 119901 of finding a test statistic as or moreextreme than the observed value under the null hypothesis isfound using the 119885-statistics given by

119885 =119882 minus 119901

119905(119901119905+ 1) 4

radic119901119905(119901119905+ 1) (2119901

119905+ 1) 24

(16)

Mathematical Problems in Engineering 5

Pitmanrsquos correlations test is applied to identify the superiorityof a model in pairwise comparisons [32] the test is basedon the computation of the correlation 119877 between Υ and Ψ asfollows

119877 =cov (Υ Ψ)

var (Υ) var (Ψ) (17a)

Υ = 1198641 (119894) + 1198642 (119894) (17b)

Ψ = 1198641 (119894) minus 1198642 (119894) (17c)

where cov is the covariance and 119894 = 1 1199011199051198641is the residual

vector obtained with model 1 and 1198642is the residual vector

obtained with model 2 Model 1 would show superiority infront of model 2 at 5 significance level if |119877| gt 196radic119901119905

4 Case Study

The Chilean police (Carabineros de Chile) collects the fea-tures of the traffic accidents and CONASET records the dataThe regional population is estimated in 6061185 inhabitantsequivalent to 401 of the national population Santiagoshows a high rate of the events with severe injuries from 2000to 2014 with 260074 injured people

The entire series of 783 registers is shown in Figure 2which have been collected in the mentioned period fromJanuary to December with weekly sampling The highestnumber of injured people was observed in weeks 84 184254 and 280 while the lowest number of injured people wasobserved in weeks 344 365 426 500 and 756

One hundred causes of traffic accidents have been definedby CONASET and grouped into categories In this casestudy three analysis groups have been created In groups 1and 2 those causes directly related to behavior of driverspassengers or pedestrians have been prioritized Group 3involves the rest of the causes Figures 3(a) 4(a) and 5(a)show the observed time series of the groups Injured-G1Injured-G2 and Injured-G3 respectively the values havebeen normalized via division by the maximum value in eachtime series

The categories of groups 1 and 2 are as follows recklessdriving recklessness in passenger recklessness in pedestrianalcohol in driver alcohol in pedestrian and disobedience tosignal In Table 1 are shown 20 causes of groups 1 and 2 whichcover 75 of the events with injured people The causes arelisted sequentially the cause with the highest importance hasvalue 1 (with the highest number of injured people) and thecause with minor importance has value 20

From Table 1 the first three causes of injuries in trafficaccidents are as follows unwise distance inattention to trafficconditions and disrespect to red light The cause with thelowest importance for injuries in traffic accidents is drunkpedestrian The categories with the highest number of causesare as follows imprudent driving and disobedience to signal

Two groups of analysis were formed from the informationpresented in Table 1 the first group labeled with Injured-G1 corresponds to the first ten primary causes and thesecond group labeled with Injured-G2 corresponds to the tensecondary causes The first group overspreads around 60

1 54 107 160 213 266 319 372 425 478 531 584 637 690 743 783

200

250

300

350

400

450

500

550

Time (weeks)

Num

ber o

f inj

ured

peo

ple

Figure 2 Injured people in traffic accidents

of injured people in traffic accidents whereas the secondgroup overspreads around 15The third group labeled withInjured-G3 corresponds to the categories road deficienciesmechanical failures and undeterminednoncategorized causesthis group overspreads around 25 of injured people

Complementary information was observed about trafficaccidents conditions with high rate of injured people Withregard to vehicles automobiles are involved in 54 of eventsfollowed by vans and trucks with 19 bus and trolley with16 motorcycles and bicycles with 8 and others with 3With regard to environmental conditions 85 of events wasobserved with cloudless conditions Additionally 97 of thetraffic accidentswith injured people have taken place in urbanareas whereas 3 correspond to rural area With regard torelative position 46 of the events have been produced inintersections controlled by traffic signals or police officerswith 46 followed by accidents that happened in straightsections with 37 and other relative positions with 17

5 Empirical Research Result

The results of the methodology implementation with linearand nonlinear models are described by stages componentsextraction and prediction

51 Components Extraction The methodology presented inSection 2 describes the preprocessing stage and the predictionstage The preprocessing stage is based on two types ofmethods Singular Spectrum Analysis and Singular ValueDecomposition of Hankel

Both preprocessing techniques SSA and HSVD embedthe time series in a structure of two dimensions the initialwindow length used is 119903 = 1199012 Once matrix 119884 is obtainedthe decomposition is computedThe differential energy of theeigenvalues is obtained with (5a) A high energy content wasobserved in the first 119905 = 20 eigenvalues the lowest differentialenergywas observed in decompositions based on values of 1517 and 16 of window length for Injured-G1 Injured-G2 andInjured-G3 respectively and these values were set as windowlength

The embedding process is implemented again with theoptimal window 119903 and the decomposition is recomputedThe first elementary matrix 119860 is used by SSA and HSVD to

6 Mathematical Problems in Engineering

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Time (weeks)

Inju

red-

G1

02

04

06

08

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

Time (weeks)

c L

SSAHSVD

(b)

Time (weeks)1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

minus05

0

05

c H

SSAHSVD

(c)

Figure 3 (a) Injured-G1 (b) Low frequency component (c) High frequency component

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 7830

1

Inju

red-

G2

Time (weeks)

02040608

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783Time (weeks)

02

04

06

08

c L

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783minus04minus02

0

Time (weeks)

0204

c H

SSAHSVD

(c)

Figure 4 (a) Injured-G2 (b) Low frequency component (c) High frequency component

obtain the low frequency 119888119871component In SSA to extract

the elements of119860 diagonal averaging is applied while HSVDuses direct extraction from the first row and last column of119860 Finally with (1) the component of high frequency 119888

119867was

computedEach data set of low and high frequency has been divided

into two subsets training and testing the training subset

involves 70 of the samples and consequently the testingsubset involves the remaining 30

The decomposition results by means of SSA and HSVDFigures 3(b) 4(b) and 5(b) show the low frequency com-ponents whereas Figures 3(c) 4(c) and 5(c) show thehigh frequency components of Injured-G1 Injured-G2 andInjured-G3 respectively

Mathematical Problems in Engineering 7

Table 1 Causes of injuries in traffic accidents (group 1 and group 2)

Category Number Cause Importance

Imprudent driving

1 Unwise distance 12 Inattention to traffic conditions 23 Disrespect to pedestrian passing 84 Disrespect for giving the right of way 95 Unexpected change of track 106 Improper turns 117 Overtaking without enough time or space 148 Opposite direction 189 Backward driving 19

Disobedience to signal

10 Disrespect to red light 311 Disrespect to stop sign 412 Disrespect to give way sign 613 Improper speed 13

Alcohol in driver 14 Drunk driver 715 Driving under the influence of alcohol 15

Recklessness in pedestrian16 Pedestrian crossing the road suddenly 517 Reckless pedestrian 1218 Pedestrian outside the allowed crossing 17

Recklessness in passenger 19 Get in or get out of a moving vehicle 16Alcohol in pedestrian 20 Drunk pedestrian 20

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Inju

red-

G3

02040608

Time (weeks)

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

c L

Time (weeks)

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

0

05

Time (weeks)

minus05

c H

SSAHSVD

(c)

Figure 5 (a) Injured-G3 (b) Low frequency component (c) High frequency component

The nonstationary trend in the signals was verified forInjured-G1 Injured-G2 and Injured-G3 through Kwiatkow-ski Phillips Schmidt and Shin test (KPSS) [33] The testassesses the null hypothesis that the signals are trend sta-tionary it was rejected at 5 significance level consequentlynonstationary unit-root processes are present in the signals

The time series Injured-G1 is observed in Figure 3(a)similar dynamic is observed with respect to full series takinginto consideration that this group contains the predominantcauses The 119888

119871components extracted by SSA and HSVD

are shown in Figure 3(b) long-memory periodicity featureswere observedThe 119888

119867resultant components are presented in

8 Mathematical Problems in Engineering

Figure 3(c) short-term periodic fluctuations were identifiedThe components obtained with SSA and HSVD are similarhowever a slight difference is observed in the componentsof low frequency SSA extracts smoother components thanHSVD

Both techniques SSA and HSVD show that the principalten causes (in Table 1 with importance 1 to 10) present thehighest incidence between years 2002 and 2005 (weeks 106 to312) it descends from 2006 until half 2012 (around week 710)an increment is observed between weeks 711 and 732 (secondsemester of 2013 and first semester of 2014)

Figure 4 shows the time series Injured-G2 the lowfrequency component and the high frequency componentAs previous series the components of low frequency showsimilar dynamic of slow fluctuations with decreasing trend(119888119871) while the high frequency shows fast fluctuations In this

group is also observed 119888119871via SSA smoother than 119888

119871viaHSVD

Both techniques SSA and HSVD show that the secondaryten causes (described in Table 1 with importance 11 to 20)present the highest incidence of injured people in years 20002003 and 2004 it descends from 2005 therefore forwarddowntrend is observed in the number of injured people intraffic accidents due to the 10 secondary causes

Figure 5 shows Injured-G3 and its components 119888119871and 119888119867

as in previous analysis the components of low frequency showlong-memory periodicity features whereas the componentsof high frequency show short-term periodic fluctuations and119888119871via SSA is smoother than 119888

119871via HSVD

Both techniques SSA and HSVD show that the causesare related to road deficiencies mechanical failures andundeterminednoncategorized causes (with 25 of incidence)The highest incidence is observed in year 2001 from year2002 it presents strong decay until 2006 and forward uptrendis observed with a temporal decrease in 2009

Prevention plans and punitive laws have been imple-mented in Chile during the analyzed period via educationdrivers licensing reforms zero tolerance law Emiliarsquos law andtransit law reforms among others The effect of a particularpreventive or punitive action is not analyzed in this workhowever the proposed short-term prediction methodologybased on observed causes and intrinsic components is a con-tribution to government and society in preventive plans for-mulation its implementation and the consequent evaluation

52 Prediction The prediction is implemented by means ofthe autoregressive models linear (AR) or nonlinear (ANN)The models use the lagged terms of the components 119888

119871and

119888119867 the optimal number of the lagged terms was fixed in

119898 = 32 weeks which was found through the computation ofthe Autocorrelation Function over the observed time seriesof injured people due to all causes

The models based on Singular Spectrum Analysis (SSA-AR and SSA-ANN) receive the components of SSA prepro-cessing stage whereas the models based on Singular ValueDecomposition of Hankel (HSVD-AR and HSVD-ANN)receive the components of HSVD preprocessing stage

TheANNhas single hidden layer structure (32 1 1) with32 inputs 1 hidden node and 1 outputThe LM algorithmwasused iteratively to adjust the linear and nonlinear weights

The direct method was used to develop multistep aheadforecasting in Tables 2 and 3 are presented the averageprediction results with hybrid models SSA-AR SSA-ANNHSVD-AR and HSVD-ANN with the three time series Thearithmetic mean of the resultant metrics is presented inTables 2 and 3 the results shows that the accuracy decreasesas the time horizon increases therefore the best accuracywasobtained for the nearest weeks and the lowest accuracy wasobtained for the farthest weeks The best mean accuracy wasreached by using SSA-AR model with MNSE of 926 1198772 of993 MAPE of 15 and RMSE of 07 The lowest meanaccuracy was obtained with HSVD-ANNmodel with MNSEof 8561198772 of 982MAPE of 29 andRMSE of 14Thesecond best average accuracy was reached by SSA-ANN andthe third best accuracy was reached by HSVD-AR

The highest gain in average MNSE from 1- to 14-stepahead prediction is 76 while average MAPE is 933However it was observed that HSVD-AR shows a higheraccuracy in farthest horizons from 12- to 14-step aheadprediction which reaches these average results MNSE of894 1198772 of 989 MAPE of 22 and of RMSE 10 thegain in average MNSE from 12- to 14-step ahead prediction is121 and on average MAPE is 833

Prediction horizons higher than 14 weeks provide inaccu-rate results

From previous tables similar accuracy was identified inthe prediction through SSA-AR HSVD-AR and SSA-ANNfor 11-step ahead prediction The predicted signals are shownin Figures 6 7 and 8 whereas the metrics residuals arepresented in Tables 4 5 and 6

The results for 11-step ahead prediction of Injured-G1are shown in Figures 6(a) 6(c) 6(e) and Table 4 Fromthese figures and metrics a good fit is observed the highestaccuracy was reached via SSA-AR withMNSE of 876 1198772 of986 MAPE of 20 and RMSE of 11

In Figures 6(b) 6(d) and 6(f) the Relative Error ofInjured-G2 prediction is shown The model SSA-AR shows947 of the predicted points with Relative Error lowerthan plusmn5 HSVD-AR 867 and SSA-ANN 831 The gainwas computed by means of residual metrics and the twobest models (SSA-AR and HSVD-AR) the highest gain wasobserved in MAPE with 25

The results for 11-step ahead prediction of Injured-G2are shown in Figures 7(a) 7(c) and 7(e) and Table 5 allmodels achieve a good fit SSA-AR and HSVD-AR reachthe highest and similar accuracy In Figures 7(b) 7(d) and7(f) the Relative Error is shown SSA-AR presents 858of the predicted points with a Relative Error lower thanplusmn5 HSVD-AR 853 and SSA-ANN 80 The gain wascomputed based on residual metrics and the two best modelsthe highest gain was observed in MAPE with 107

The results for 11-step ahead prediction of Injured-G3are presented in Figures 8(a) 8(c) and 8(e) and Table 6 allmodels achieve also a goodfit and similar accuracy In Figures8(b) 8(d) and 8(f) the Relative Error is illustrated SSA-ARpresents 924 of predicted points with a Relative Error lowerthan plusmn5 HSVD-AR 942 and SSA-ANN 916 The gain

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 5: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

Mathematical Problems in Engineering 5

Pitmanrsquos correlations test is applied to identify the superiorityof a model in pairwise comparisons [32] the test is basedon the computation of the correlation 119877 between Υ and Ψ asfollows

119877 =cov (Υ Ψ)

var (Υ) var (Ψ) (17a)

Υ = 1198641 (119894) + 1198642 (119894) (17b)

Ψ = 1198641 (119894) minus 1198642 (119894) (17c)

where cov is the covariance and 119894 = 1 1199011199051198641is the residual

vector obtained with model 1 and 1198642is the residual vector

obtained with model 2 Model 1 would show superiority infront of model 2 at 5 significance level if |119877| gt 196radic119901119905

4 Case Study

The Chilean police (Carabineros de Chile) collects the fea-tures of the traffic accidents and CONASET records the dataThe regional population is estimated in 6061185 inhabitantsequivalent to 401 of the national population Santiagoshows a high rate of the events with severe injuries from 2000to 2014 with 260074 injured people

The entire series of 783 registers is shown in Figure 2which have been collected in the mentioned period fromJanuary to December with weekly sampling The highestnumber of injured people was observed in weeks 84 184254 and 280 while the lowest number of injured people wasobserved in weeks 344 365 426 500 and 756

One hundred causes of traffic accidents have been definedby CONASET and grouped into categories In this casestudy three analysis groups have been created In groups 1and 2 those causes directly related to behavior of driverspassengers or pedestrians have been prioritized Group 3involves the rest of the causes Figures 3(a) 4(a) and 5(a)show the observed time series of the groups Injured-G1Injured-G2 and Injured-G3 respectively the values havebeen normalized via division by the maximum value in eachtime series

The categories of groups 1 and 2 are as follows recklessdriving recklessness in passenger recklessness in pedestrianalcohol in driver alcohol in pedestrian and disobedience tosignal In Table 1 are shown 20 causes of groups 1 and 2 whichcover 75 of the events with injured people The causes arelisted sequentially the cause with the highest importance hasvalue 1 (with the highest number of injured people) and thecause with minor importance has value 20

From Table 1 the first three causes of injuries in trafficaccidents are as follows unwise distance inattention to trafficconditions and disrespect to red light The cause with thelowest importance for injuries in traffic accidents is drunkpedestrian The categories with the highest number of causesare as follows imprudent driving and disobedience to signal

Two groups of analysis were formed from the informationpresented in Table 1 the first group labeled with Injured-G1 corresponds to the first ten primary causes and thesecond group labeled with Injured-G2 corresponds to the tensecondary causes The first group overspreads around 60

1 54 107 160 213 266 319 372 425 478 531 584 637 690 743 783

200

250

300

350

400

450

500

550

Time (weeks)

Num

ber o

f inj

ured

peo

ple

Figure 2 Injured people in traffic accidents

of injured people in traffic accidents whereas the secondgroup overspreads around 15The third group labeled withInjured-G3 corresponds to the categories road deficienciesmechanical failures and undeterminednoncategorized causesthis group overspreads around 25 of injured people

Complementary information was observed about trafficaccidents conditions with high rate of injured people Withregard to vehicles automobiles are involved in 54 of eventsfollowed by vans and trucks with 19 bus and trolley with16 motorcycles and bicycles with 8 and others with 3With regard to environmental conditions 85 of events wasobserved with cloudless conditions Additionally 97 of thetraffic accidentswith injured people have taken place in urbanareas whereas 3 correspond to rural area With regard torelative position 46 of the events have been produced inintersections controlled by traffic signals or police officerswith 46 followed by accidents that happened in straightsections with 37 and other relative positions with 17

5 Empirical Research Result

The results of the methodology implementation with linearand nonlinear models are described by stages componentsextraction and prediction

51 Components Extraction The methodology presented inSection 2 describes the preprocessing stage and the predictionstage The preprocessing stage is based on two types ofmethods Singular Spectrum Analysis and Singular ValueDecomposition of Hankel

Both preprocessing techniques SSA and HSVD embedthe time series in a structure of two dimensions the initialwindow length used is 119903 = 1199012 Once matrix 119884 is obtainedthe decomposition is computedThe differential energy of theeigenvalues is obtained with (5a) A high energy content wasobserved in the first 119905 = 20 eigenvalues the lowest differentialenergywas observed in decompositions based on values of 1517 and 16 of window length for Injured-G1 Injured-G2 andInjured-G3 respectively and these values were set as windowlength

The embedding process is implemented again with theoptimal window 119903 and the decomposition is recomputedThe first elementary matrix 119860 is used by SSA and HSVD to

6 Mathematical Problems in Engineering

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Time (weeks)

Inju

red-

G1

02

04

06

08

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

Time (weeks)

c L

SSAHSVD

(b)

Time (weeks)1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

minus05

0

05

c H

SSAHSVD

(c)

Figure 3 (a) Injured-G1 (b) Low frequency component (c) High frequency component

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 7830

1

Inju

red-

G2

Time (weeks)

02040608

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783Time (weeks)

02

04

06

08

c L

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783minus04minus02

0

Time (weeks)

0204

c H

SSAHSVD

(c)

Figure 4 (a) Injured-G2 (b) Low frequency component (c) High frequency component

obtain the low frequency 119888119871component In SSA to extract

the elements of119860 diagonal averaging is applied while HSVDuses direct extraction from the first row and last column of119860 Finally with (1) the component of high frequency 119888

119867was

computedEach data set of low and high frequency has been divided

into two subsets training and testing the training subset

involves 70 of the samples and consequently the testingsubset involves the remaining 30

The decomposition results by means of SSA and HSVDFigures 3(b) 4(b) and 5(b) show the low frequency com-ponents whereas Figures 3(c) 4(c) and 5(c) show thehigh frequency components of Injured-G1 Injured-G2 andInjured-G3 respectively

Mathematical Problems in Engineering 7

Table 1 Causes of injuries in traffic accidents (group 1 and group 2)

Category Number Cause Importance

Imprudent driving

1 Unwise distance 12 Inattention to traffic conditions 23 Disrespect to pedestrian passing 84 Disrespect for giving the right of way 95 Unexpected change of track 106 Improper turns 117 Overtaking without enough time or space 148 Opposite direction 189 Backward driving 19

Disobedience to signal

10 Disrespect to red light 311 Disrespect to stop sign 412 Disrespect to give way sign 613 Improper speed 13

Alcohol in driver 14 Drunk driver 715 Driving under the influence of alcohol 15

Recklessness in pedestrian16 Pedestrian crossing the road suddenly 517 Reckless pedestrian 1218 Pedestrian outside the allowed crossing 17

Recklessness in passenger 19 Get in or get out of a moving vehicle 16Alcohol in pedestrian 20 Drunk pedestrian 20

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Inju

red-

G3

02040608

Time (weeks)

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

c L

Time (weeks)

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

0

05

Time (weeks)

minus05

c H

SSAHSVD

(c)

Figure 5 (a) Injured-G3 (b) Low frequency component (c) High frequency component

The nonstationary trend in the signals was verified forInjured-G1 Injured-G2 and Injured-G3 through Kwiatkow-ski Phillips Schmidt and Shin test (KPSS) [33] The testassesses the null hypothesis that the signals are trend sta-tionary it was rejected at 5 significance level consequentlynonstationary unit-root processes are present in the signals

The time series Injured-G1 is observed in Figure 3(a)similar dynamic is observed with respect to full series takinginto consideration that this group contains the predominantcauses The 119888

119871components extracted by SSA and HSVD

are shown in Figure 3(b) long-memory periodicity featureswere observedThe 119888

119867resultant components are presented in

8 Mathematical Problems in Engineering

Figure 3(c) short-term periodic fluctuations were identifiedThe components obtained with SSA and HSVD are similarhowever a slight difference is observed in the componentsof low frequency SSA extracts smoother components thanHSVD

Both techniques SSA and HSVD show that the principalten causes (in Table 1 with importance 1 to 10) present thehighest incidence between years 2002 and 2005 (weeks 106 to312) it descends from 2006 until half 2012 (around week 710)an increment is observed between weeks 711 and 732 (secondsemester of 2013 and first semester of 2014)

Figure 4 shows the time series Injured-G2 the lowfrequency component and the high frequency componentAs previous series the components of low frequency showsimilar dynamic of slow fluctuations with decreasing trend(119888119871) while the high frequency shows fast fluctuations In this

group is also observed 119888119871via SSA smoother than 119888

119871viaHSVD

Both techniques SSA and HSVD show that the secondaryten causes (described in Table 1 with importance 11 to 20)present the highest incidence of injured people in years 20002003 and 2004 it descends from 2005 therefore forwarddowntrend is observed in the number of injured people intraffic accidents due to the 10 secondary causes

Figure 5 shows Injured-G3 and its components 119888119871and 119888119867

as in previous analysis the components of low frequency showlong-memory periodicity features whereas the componentsof high frequency show short-term periodic fluctuations and119888119871via SSA is smoother than 119888

119871via HSVD

Both techniques SSA and HSVD show that the causesare related to road deficiencies mechanical failures andundeterminednoncategorized causes (with 25 of incidence)The highest incidence is observed in year 2001 from year2002 it presents strong decay until 2006 and forward uptrendis observed with a temporal decrease in 2009

Prevention plans and punitive laws have been imple-mented in Chile during the analyzed period via educationdrivers licensing reforms zero tolerance law Emiliarsquos law andtransit law reforms among others The effect of a particularpreventive or punitive action is not analyzed in this workhowever the proposed short-term prediction methodologybased on observed causes and intrinsic components is a con-tribution to government and society in preventive plans for-mulation its implementation and the consequent evaluation

52 Prediction The prediction is implemented by means ofthe autoregressive models linear (AR) or nonlinear (ANN)The models use the lagged terms of the components 119888

119871and

119888119867 the optimal number of the lagged terms was fixed in

119898 = 32 weeks which was found through the computation ofthe Autocorrelation Function over the observed time seriesof injured people due to all causes

The models based on Singular Spectrum Analysis (SSA-AR and SSA-ANN) receive the components of SSA prepro-cessing stage whereas the models based on Singular ValueDecomposition of Hankel (HSVD-AR and HSVD-ANN)receive the components of HSVD preprocessing stage

TheANNhas single hidden layer structure (32 1 1) with32 inputs 1 hidden node and 1 outputThe LM algorithmwasused iteratively to adjust the linear and nonlinear weights

The direct method was used to develop multistep aheadforecasting in Tables 2 and 3 are presented the averageprediction results with hybrid models SSA-AR SSA-ANNHSVD-AR and HSVD-ANN with the three time series Thearithmetic mean of the resultant metrics is presented inTables 2 and 3 the results shows that the accuracy decreasesas the time horizon increases therefore the best accuracywasobtained for the nearest weeks and the lowest accuracy wasobtained for the farthest weeks The best mean accuracy wasreached by using SSA-AR model with MNSE of 926 1198772 of993 MAPE of 15 and RMSE of 07 The lowest meanaccuracy was obtained with HSVD-ANNmodel with MNSEof 8561198772 of 982MAPE of 29 andRMSE of 14Thesecond best average accuracy was reached by SSA-ANN andthe third best accuracy was reached by HSVD-AR

The highest gain in average MNSE from 1- to 14-stepahead prediction is 76 while average MAPE is 933However it was observed that HSVD-AR shows a higheraccuracy in farthest horizons from 12- to 14-step aheadprediction which reaches these average results MNSE of894 1198772 of 989 MAPE of 22 and of RMSE 10 thegain in average MNSE from 12- to 14-step ahead prediction is121 and on average MAPE is 833

Prediction horizons higher than 14 weeks provide inaccu-rate results

From previous tables similar accuracy was identified inthe prediction through SSA-AR HSVD-AR and SSA-ANNfor 11-step ahead prediction The predicted signals are shownin Figures 6 7 and 8 whereas the metrics residuals arepresented in Tables 4 5 and 6

The results for 11-step ahead prediction of Injured-G1are shown in Figures 6(a) 6(c) 6(e) and Table 4 Fromthese figures and metrics a good fit is observed the highestaccuracy was reached via SSA-AR withMNSE of 876 1198772 of986 MAPE of 20 and RMSE of 11

In Figures 6(b) 6(d) and 6(f) the Relative Error ofInjured-G2 prediction is shown The model SSA-AR shows947 of the predicted points with Relative Error lowerthan plusmn5 HSVD-AR 867 and SSA-ANN 831 The gainwas computed by means of residual metrics and the twobest models (SSA-AR and HSVD-AR) the highest gain wasobserved in MAPE with 25

The results for 11-step ahead prediction of Injured-G2are shown in Figures 7(a) 7(c) and 7(e) and Table 5 allmodels achieve a good fit SSA-AR and HSVD-AR reachthe highest and similar accuracy In Figures 7(b) 7(d) and7(f) the Relative Error is shown SSA-AR presents 858of the predicted points with a Relative Error lower thanplusmn5 HSVD-AR 853 and SSA-ANN 80 The gain wascomputed based on residual metrics and the two best modelsthe highest gain was observed in MAPE with 107

The results for 11-step ahead prediction of Injured-G3are presented in Figures 8(a) 8(c) and 8(e) and Table 6 allmodels achieve also a goodfit and similar accuracy In Figures8(b) 8(d) and 8(f) the Relative Error is illustrated SSA-ARpresents 924 of predicted points with a Relative Error lowerthan plusmn5 HSVD-AR 942 and SSA-ANN 916 The gain

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

6 Mathematical Problems in Engineering

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Time (weeks)

Inju

red-

G1

02

04

06

08

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

Time (weeks)

c L

SSAHSVD

(b)

Time (weeks)1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

minus05

0

05

c H

SSAHSVD

(c)

Figure 3 (a) Injured-G1 (b) Low frequency component (c) High frequency component

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 7830

1

Inju

red-

G2

Time (weeks)

02040608

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783Time (weeks)

02

04

06

08

c L

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783minus04minus02

0

Time (weeks)

0204

c H

SSAHSVD

(c)

Figure 4 (a) Injured-G2 (b) Low frequency component (c) High frequency component

obtain the low frequency 119888119871component In SSA to extract

the elements of119860 diagonal averaging is applied while HSVDuses direct extraction from the first row and last column of119860 Finally with (1) the component of high frequency 119888

119867was

computedEach data set of low and high frequency has been divided

into two subsets training and testing the training subset

involves 70 of the samples and consequently the testingsubset involves the remaining 30

The decomposition results by means of SSA and HSVDFigures 3(b) 4(b) and 5(b) show the low frequency com-ponents whereas Figures 3(c) 4(c) and 5(c) show thehigh frequency components of Injured-G1 Injured-G2 andInjured-G3 respectively

Mathematical Problems in Engineering 7

Table 1 Causes of injuries in traffic accidents (group 1 and group 2)

Category Number Cause Importance

Imprudent driving

1 Unwise distance 12 Inattention to traffic conditions 23 Disrespect to pedestrian passing 84 Disrespect for giving the right of way 95 Unexpected change of track 106 Improper turns 117 Overtaking without enough time or space 148 Opposite direction 189 Backward driving 19

Disobedience to signal

10 Disrespect to red light 311 Disrespect to stop sign 412 Disrespect to give way sign 613 Improper speed 13

Alcohol in driver 14 Drunk driver 715 Driving under the influence of alcohol 15

Recklessness in pedestrian16 Pedestrian crossing the road suddenly 517 Reckless pedestrian 1218 Pedestrian outside the allowed crossing 17

Recklessness in passenger 19 Get in or get out of a moving vehicle 16Alcohol in pedestrian 20 Drunk pedestrian 20

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Inju

red-

G3

02040608

Time (weeks)

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

c L

Time (weeks)

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

0

05

Time (weeks)

minus05

c H

SSAHSVD

(c)

Figure 5 (a) Injured-G3 (b) Low frequency component (c) High frequency component

The nonstationary trend in the signals was verified forInjured-G1 Injured-G2 and Injured-G3 through Kwiatkow-ski Phillips Schmidt and Shin test (KPSS) [33] The testassesses the null hypothesis that the signals are trend sta-tionary it was rejected at 5 significance level consequentlynonstationary unit-root processes are present in the signals

The time series Injured-G1 is observed in Figure 3(a)similar dynamic is observed with respect to full series takinginto consideration that this group contains the predominantcauses The 119888

119871components extracted by SSA and HSVD

are shown in Figure 3(b) long-memory periodicity featureswere observedThe 119888

119867resultant components are presented in

8 Mathematical Problems in Engineering

Figure 3(c) short-term periodic fluctuations were identifiedThe components obtained with SSA and HSVD are similarhowever a slight difference is observed in the componentsof low frequency SSA extracts smoother components thanHSVD

Both techniques SSA and HSVD show that the principalten causes (in Table 1 with importance 1 to 10) present thehighest incidence between years 2002 and 2005 (weeks 106 to312) it descends from 2006 until half 2012 (around week 710)an increment is observed between weeks 711 and 732 (secondsemester of 2013 and first semester of 2014)

Figure 4 shows the time series Injured-G2 the lowfrequency component and the high frequency componentAs previous series the components of low frequency showsimilar dynamic of slow fluctuations with decreasing trend(119888119871) while the high frequency shows fast fluctuations In this

group is also observed 119888119871via SSA smoother than 119888

119871viaHSVD

Both techniques SSA and HSVD show that the secondaryten causes (described in Table 1 with importance 11 to 20)present the highest incidence of injured people in years 20002003 and 2004 it descends from 2005 therefore forwarddowntrend is observed in the number of injured people intraffic accidents due to the 10 secondary causes

Figure 5 shows Injured-G3 and its components 119888119871and 119888119867

as in previous analysis the components of low frequency showlong-memory periodicity features whereas the componentsof high frequency show short-term periodic fluctuations and119888119871via SSA is smoother than 119888

119871via HSVD

Both techniques SSA and HSVD show that the causesare related to road deficiencies mechanical failures andundeterminednoncategorized causes (with 25 of incidence)The highest incidence is observed in year 2001 from year2002 it presents strong decay until 2006 and forward uptrendis observed with a temporal decrease in 2009

Prevention plans and punitive laws have been imple-mented in Chile during the analyzed period via educationdrivers licensing reforms zero tolerance law Emiliarsquos law andtransit law reforms among others The effect of a particularpreventive or punitive action is not analyzed in this workhowever the proposed short-term prediction methodologybased on observed causes and intrinsic components is a con-tribution to government and society in preventive plans for-mulation its implementation and the consequent evaluation

52 Prediction The prediction is implemented by means ofthe autoregressive models linear (AR) or nonlinear (ANN)The models use the lagged terms of the components 119888

119871and

119888119867 the optimal number of the lagged terms was fixed in

119898 = 32 weeks which was found through the computation ofthe Autocorrelation Function over the observed time seriesof injured people due to all causes

The models based on Singular Spectrum Analysis (SSA-AR and SSA-ANN) receive the components of SSA prepro-cessing stage whereas the models based on Singular ValueDecomposition of Hankel (HSVD-AR and HSVD-ANN)receive the components of HSVD preprocessing stage

TheANNhas single hidden layer structure (32 1 1) with32 inputs 1 hidden node and 1 outputThe LM algorithmwasused iteratively to adjust the linear and nonlinear weights

The direct method was used to develop multistep aheadforecasting in Tables 2 and 3 are presented the averageprediction results with hybrid models SSA-AR SSA-ANNHSVD-AR and HSVD-ANN with the three time series Thearithmetic mean of the resultant metrics is presented inTables 2 and 3 the results shows that the accuracy decreasesas the time horizon increases therefore the best accuracywasobtained for the nearest weeks and the lowest accuracy wasobtained for the farthest weeks The best mean accuracy wasreached by using SSA-AR model with MNSE of 926 1198772 of993 MAPE of 15 and RMSE of 07 The lowest meanaccuracy was obtained with HSVD-ANNmodel with MNSEof 8561198772 of 982MAPE of 29 andRMSE of 14Thesecond best average accuracy was reached by SSA-ANN andthe third best accuracy was reached by HSVD-AR

The highest gain in average MNSE from 1- to 14-stepahead prediction is 76 while average MAPE is 933However it was observed that HSVD-AR shows a higheraccuracy in farthest horizons from 12- to 14-step aheadprediction which reaches these average results MNSE of894 1198772 of 989 MAPE of 22 and of RMSE 10 thegain in average MNSE from 12- to 14-step ahead prediction is121 and on average MAPE is 833

Prediction horizons higher than 14 weeks provide inaccu-rate results

From previous tables similar accuracy was identified inthe prediction through SSA-AR HSVD-AR and SSA-ANNfor 11-step ahead prediction The predicted signals are shownin Figures 6 7 and 8 whereas the metrics residuals arepresented in Tables 4 5 and 6

The results for 11-step ahead prediction of Injured-G1are shown in Figures 6(a) 6(c) 6(e) and Table 4 Fromthese figures and metrics a good fit is observed the highestaccuracy was reached via SSA-AR withMNSE of 876 1198772 of986 MAPE of 20 and RMSE of 11

In Figures 6(b) 6(d) and 6(f) the Relative Error ofInjured-G2 prediction is shown The model SSA-AR shows947 of the predicted points with Relative Error lowerthan plusmn5 HSVD-AR 867 and SSA-ANN 831 The gainwas computed by means of residual metrics and the twobest models (SSA-AR and HSVD-AR) the highest gain wasobserved in MAPE with 25

The results for 11-step ahead prediction of Injured-G2are shown in Figures 7(a) 7(c) and 7(e) and Table 5 allmodels achieve a good fit SSA-AR and HSVD-AR reachthe highest and similar accuracy In Figures 7(b) 7(d) and7(f) the Relative Error is shown SSA-AR presents 858of the predicted points with a Relative Error lower thanplusmn5 HSVD-AR 853 and SSA-ANN 80 The gain wascomputed based on residual metrics and the two best modelsthe highest gain was observed in MAPE with 107

The results for 11-step ahead prediction of Injured-G3are presented in Figures 8(a) 8(c) and 8(e) and Table 6 allmodels achieve also a goodfit and similar accuracy In Figures8(b) 8(d) and 8(f) the Relative Error is illustrated SSA-ARpresents 924 of predicted points with a Relative Error lowerthan plusmn5 HSVD-AR 942 and SSA-ANN 916 The gain

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 7: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

Mathematical Problems in Engineering 7

Table 1 Causes of injuries in traffic accidents (group 1 and group 2)

Category Number Cause Importance

Imprudent driving

1 Unwise distance 12 Inattention to traffic conditions 23 Disrespect to pedestrian passing 84 Disrespect for giving the right of way 95 Unexpected change of track 106 Improper turns 117 Overtaking without enough time or space 148 Opposite direction 189 Backward driving 19

Disobedience to signal

10 Disrespect to red light 311 Disrespect to stop sign 412 Disrespect to give way sign 613 Improper speed 13

Alcohol in driver 14 Drunk driver 715 Driving under the influence of alcohol 15

Recklessness in pedestrian16 Pedestrian crossing the road suddenly 517 Reckless pedestrian 1218 Pedestrian outside the allowed crossing 17

Recklessness in passenger 19 Get in or get out of a moving vehicle 16Alcohol in pedestrian 20 Drunk pedestrian 20

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

1

Inju

red-

G3

02040608

Time (weeks)

(a)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 78302

04

06

08

c L

Time (weeks)

SSAHSVD

(b)

1 53 105 157 209 261 313 365 417 469 521 573 625 677 729 783

0

05

Time (weeks)

minus05

c H

SSAHSVD

(c)

Figure 5 (a) Injured-G3 (b) Low frequency component (c) High frequency component

The nonstationary trend in the signals was verified forInjured-G1 Injured-G2 and Injured-G3 through Kwiatkow-ski Phillips Schmidt and Shin test (KPSS) [33] The testassesses the null hypothesis that the signals are trend sta-tionary it was rejected at 5 significance level consequentlynonstationary unit-root processes are present in the signals

The time series Injured-G1 is observed in Figure 3(a)similar dynamic is observed with respect to full series takinginto consideration that this group contains the predominantcauses The 119888

119871components extracted by SSA and HSVD

are shown in Figure 3(b) long-memory periodicity featureswere observedThe 119888

119867resultant components are presented in

8 Mathematical Problems in Engineering

Figure 3(c) short-term periodic fluctuations were identifiedThe components obtained with SSA and HSVD are similarhowever a slight difference is observed in the componentsof low frequency SSA extracts smoother components thanHSVD

Both techniques SSA and HSVD show that the principalten causes (in Table 1 with importance 1 to 10) present thehighest incidence between years 2002 and 2005 (weeks 106 to312) it descends from 2006 until half 2012 (around week 710)an increment is observed between weeks 711 and 732 (secondsemester of 2013 and first semester of 2014)

Figure 4 shows the time series Injured-G2 the lowfrequency component and the high frequency componentAs previous series the components of low frequency showsimilar dynamic of slow fluctuations with decreasing trend(119888119871) while the high frequency shows fast fluctuations In this

group is also observed 119888119871via SSA smoother than 119888

119871viaHSVD

Both techniques SSA and HSVD show that the secondaryten causes (described in Table 1 with importance 11 to 20)present the highest incidence of injured people in years 20002003 and 2004 it descends from 2005 therefore forwarddowntrend is observed in the number of injured people intraffic accidents due to the 10 secondary causes

Figure 5 shows Injured-G3 and its components 119888119871and 119888119867

as in previous analysis the components of low frequency showlong-memory periodicity features whereas the componentsof high frequency show short-term periodic fluctuations and119888119871via SSA is smoother than 119888

119871via HSVD

Both techniques SSA and HSVD show that the causesare related to road deficiencies mechanical failures andundeterminednoncategorized causes (with 25 of incidence)The highest incidence is observed in year 2001 from year2002 it presents strong decay until 2006 and forward uptrendis observed with a temporal decrease in 2009

Prevention plans and punitive laws have been imple-mented in Chile during the analyzed period via educationdrivers licensing reforms zero tolerance law Emiliarsquos law andtransit law reforms among others The effect of a particularpreventive or punitive action is not analyzed in this workhowever the proposed short-term prediction methodologybased on observed causes and intrinsic components is a con-tribution to government and society in preventive plans for-mulation its implementation and the consequent evaluation

52 Prediction The prediction is implemented by means ofthe autoregressive models linear (AR) or nonlinear (ANN)The models use the lagged terms of the components 119888

119871and

119888119867 the optimal number of the lagged terms was fixed in

119898 = 32 weeks which was found through the computation ofthe Autocorrelation Function over the observed time seriesof injured people due to all causes

The models based on Singular Spectrum Analysis (SSA-AR and SSA-ANN) receive the components of SSA prepro-cessing stage whereas the models based on Singular ValueDecomposition of Hankel (HSVD-AR and HSVD-ANN)receive the components of HSVD preprocessing stage

TheANNhas single hidden layer structure (32 1 1) with32 inputs 1 hidden node and 1 outputThe LM algorithmwasused iteratively to adjust the linear and nonlinear weights

The direct method was used to develop multistep aheadforecasting in Tables 2 and 3 are presented the averageprediction results with hybrid models SSA-AR SSA-ANNHSVD-AR and HSVD-ANN with the three time series Thearithmetic mean of the resultant metrics is presented inTables 2 and 3 the results shows that the accuracy decreasesas the time horizon increases therefore the best accuracywasobtained for the nearest weeks and the lowest accuracy wasobtained for the farthest weeks The best mean accuracy wasreached by using SSA-AR model with MNSE of 926 1198772 of993 MAPE of 15 and RMSE of 07 The lowest meanaccuracy was obtained with HSVD-ANNmodel with MNSEof 8561198772 of 982MAPE of 29 andRMSE of 14Thesecond best average accuracy was reached by SSA-ANN andthe third best accuracy was reached by HSVD-AR

The highest gain in average MNSE from 1- to 14-stepahead prediction is 76 while average MAPE is 933However it was observed that HSVD-AR shows a higheraccuracy in farthest horizons from 12- to 14-step aheadprediction which reaches these average results MNSE of894 1198772 of 989 MAPE of 22 and of RMSE 10 thegain in average MNSE from 12- to 14-step ahead prediction is121 and on average MAPE is 833

Prediction horizons higher than 14 weeks provide inaccu-rate results

From previous tables similar accuracy was identified inthe prediction through SSA-AR HSVD-AR and SSA-ANNfor 11-step ahead prediction The predicted signals are shownin Figures 6 7 and 8 whereas the metrics residuals arepresented in Tables 4 5 and 6

The results for 11-step ahead prediction of Injured-G1are shown in Figures 6(a) 6(c) 6(e) and Table 4 Fromthese figures and metrics a good fit is observed the highestaccuracy was reached via SSA-AR withMNSE of 876 1198772 of986 MAPE of 20 and RMSE of 11

In Figures 6(b) 6(d) and 6(f) the Relative Error ofInjured-G2 prediction is shown The model SSA-AR shows947 of the predicted points with Relative Error lowerthan plusmn5 HSVD-AR 867 and SSA-ANN 831 The gainwas computed by means of residual metrics and the twobest models (SSA-AR and HSVD-AR) the highest gain wasobserved in MAPE with 25

The results for 11-step ahead prediction of Injured-G2are shown in Figures 7(a) 7(c) and 7(e) and Table 5 allmodels achieve a good fit SSA-AR and HSVD-AR reachthe highest and similar accuracy In Figures 7(b) 7(d) and7(f) the Relative Error is shown SSA-AR presents 858of the predicted points with a Relative Error lower thanplusmn5 HSVD-AR 853 and SSA-ANN 80 The gain wascomputed based on residual metrics and the two best modelsthe highest gain was observed in MAPE with 107

The results for 11-step ahead prediction of Injured-G3are presented in Figures 8(a) 8(c) and 8(e) and Table 6 allmodels achieve also a goodfit and similar accuracy In Figures8(b) 8(d) and 8(f) the Relative Error is illustrated SSA-ARpresents 924 of predicted points with a Relative Error lowerthan plusmn5 HSVD-AR 942 and SSA-ANN 916 The gain

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

8 Mathematical Problems in Engineering

Figure 3(c) short-term periodic fluctuations were identifiedThe components obtained with SSA and HSVD are similarhowever a slight difference is observed in the componentsof low frequency SSA extracts smoother components thanHSVD

Both techniques SSA and HSVD show that the principalten causes (in Table 1 with importance 1 to 10) present thehighest incidence between years 2002 and 2005 (weeks 106 to312) it descends from 2006 until half 2012 (around week 710)an increment is observed between weeks 711 and 732 (secondsemester of 2013 and first semester of 2014)

Figure 4 shows the time series Injured-G2 the lowfrequency component and the high frequency componentAs previous series the components of low frequency showsimilar dynamic of slow fluctuations with decreasing trend(119888119871) while the high frequency shows fast fluctuations In this

group is also observed 119888119871via SSA smoother than 119888

119871viaHSVD

Both techniques SSA and HSVD show that the secondaryten causes (described in Table 1 with importance 11 to 20)present the highest incidence of injured people in years 20002003 and 2004 it descends from 2005 therefore forwarddowntrend is observed in the number of injured people intraffic accidents due to the 10 secondary causes

Figure 5 shows Injured-G3 and its components 119888119871and 119888119867

as in previous analysis the components of low frequency showlong-memory periodicity features whereas the componentsof high frequency show short-term periodic fluctuations and119888119871via SSA is smoother than 119888

119871via HSVD

Both techniques SSA and HSVD show that the causesare related to road deficiencies mechanical failures andundeterminednoncategorized causes (with 25 of incidence)The highest incidence is observed in year 2001 from year2002 it presents strong decay until 2006 and forward uptrendis observed with a temporal decrease in 2009

Prevention plans and punitive laws have been imple-mented in Chile during the analyzed period via educationdrivers licensing reforms zero tolerance law Emiliarsquos law andtransit law reforms among others The effect of a particularpreventive or punitive action is not analyzed in this workhowever the proposed short-term prediction methodologybased on observed causes and intrinsic components is a con-tribution to government and society in preventive plans for-mulation its implementation and the consequent evaluation

52 Prediction The prediction is implemented by means ofthe autoregressive models linear (AR) or nonlinear (ANN)The models use the lagged terms of the components 119888

119871and

119888119867 the optimal number of the lagged terms was fixed in

119898 = 32 weeks which was found through the computation ofthe Autocorrelation Function over the observed time seriesof injured people due to all causes

The models based on Singular Spectrum Analysis (SSA-AR and SSA-ANN) receive the components of SSA prepro-cessing stage whereas the models based on Singular ValueDecomposition of Hankel (HSVD-AR and HSVD-ANN)receive the components of HSVD preprocessing stage

TheANNhas single hidden layer structure (32 1 1) with32 inputs 1 hidden node and 1 outputThe LM algorithmwasused iteratively to adjust the linear and nonlinear weights

The direct method was used to develop multistep aheadforecasting in Tables 2 and 3 are presented the averageprediction results with hybrid models SSA-AR SSA-ANNHSVD-AR and HSVD-ANN with the three time series Thearithmetic mean of the resultant metrics is presented inTables 2 and 3 the results shows that the accuracy decreasesas the time horizon increases therefore the best accuracywasobtained for the nearest weeks and the lowest accuracy wasobtained for the farthest weeks The best mean accuracy wasreached by using SSA-AR model with MNSE of 926 1198772 of993 MAPE of 15 and RMSE of 07 The lowest meanaccuracy was obtained with HSVD-ANNmodel with MNSEof 8561198772 of 982MAPE of 29 andRMSE of 14Thesecond best average accuracy was reached by SSA-ANN andthe third best accuracy was reached by HSVD-AR

The highest gain in average MNSE from 1- to 14-stepahead prediction is 76 while average MAPE is 933However it was observed that HSVD-AR shows a higheraccuracy in farthest horizons from 12- to 14-step aheadprediction which reaches these average results MNSE of894 1198772 of 989 MAPE of 22 and of RMSE 10 thegain in average MNSE from 12- to 14-step ahead prediction is121 and on average MAPE is 833

Prediction horizons higher than 14 weeks provide inaccu-rate results

From previous tables similar accuracy was identified inthe prediction through SSA-AR HSVD-AR and SSA-ANNfor 11-step ahead prediction The predicted signals are shownin Figures 6 7 and 8 whereas the metrics residuals arepresented in Tables 4 5 and 6

The results for 11-step ahead prediction of Injured-G1are shown in Figures 6(a) 6(c) 6(e) and Table 4 Fromthese figures and metrics a good fit is observed the highestaccuracy was reached via SSA-AR withMNSE of 876 1198772 of986 MAPE of 20 and RMSE of 11

In Figures 6(b) 6(d) and 6(f) the Relative Error ofInjured-G2 prediction is shown The model SSA-AR shows947 of the predicted points with Relative Error lowerthan plusmn5 HSVD-AR 867 and SSA-ANN 831 The gainwas computed by means of residual metrics and the twobest models (SSA-AR and HSVD-AR) the highest gain wasobserved in MAPE with 25

The results for 11-step ahead prediction of Injured-G2are shown in Figures 7(a) 7(c) and 7(e) and Table 5 allmodels achieve a good fit SSA-AR and HSVD-AR reachthe highest and similar accuracy In Figures 7(b) 7(d) and7(f) the Relative Error is shown SSA-AR presents 858of the predicted points with a Relative Error lower thanplusmn5 HSVD-AR 853 and SSA-ANN 80 The gain wascomputed based on residual metrics and the two best modelsthe highest gain was observed in MAPE with 107

The results for 11-step ahead prediction of Injured-G3are presented in Figures 8(a) 8(c) and 8(e) and Table 6 allmodels achieve also a goodfit and similar accuracy In Figures8(b) 8(d) and 8(f) the Relative Error is illustrated SSA-ARpresents 924 of predicted points with a Relative Error lowerthan plusmn5 HSVD-AR 942 and SSA-ANN 916 The gain

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

Mathematical Problems in Engineering 9

Table 2 Multistep ahead predictionmdashaverage results MNSE and 1198772

ℎ (week) MNSE () 1198772 ()

SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN1 995 994 964 957 999 999 999 9982 988 987 946 931 999 999 997 9963 979 976 932 909 999 999 996 9934 969 967 917 892 999 999 994 9895 958 953 906 869 998 998 992 9866 946 936 894 852 997 997 990 9837 934 926 882 835 996 995 988 9778 922 913 875 819 994 993 986 9749 909 894 874 802 992 991 985 97510 896 879 874 817 990 988 985 97611 884 863 876 822 988 985 986 97212 872 845 886 866 985 982 987 98513 862 773 890 834 983 967 989 97914 853 739 905 773 981 959 991 966Min 853 739 874 774 981 959 985 966Max 995 994 964 957 999 999 999 998Mean 1ndash14 steps 926 903 901 856 993 989 990 982Mean 12ndash14 steps 862 786 894 824 983 969 989 977

Table 3 Multistep ahead predictionmdashaverage results MAPE and RMSE

ℎ (week) MAPE () RMSE ()SSA-AR SSA-ANN HSVD-AR HSVD-ANN SSA-AR SSA-ANN HSVD-AR HSVD-ANN

1 01 01 08 09 005 005 036 0432 03 03 11 14 013 013 054 0673 04 05 14 18 021 022 067 0874 06 07 17 22 030 032 079 1035 09 10 19 28 040 045 089 1236 11 13 22 31 052 06 099 1377 13 15 24 35 064 07 111 1598 16 18 26 37 076 08 119 1709 18 23 26 38 088 09 122 17210 21 25 26 38 10 11 122 21911 24 29 25 38 11 13 118 21812 26 33 23 29 12 14 113 13413 29 41 23 35 13 20 106 19814 31 47 20 42 14 23 091 207Min 01 01 08 09 005 006 036 04Max 31 47 26 42 14 23 122 22Mean 1ndash14 steps 15 19 20 29 07 09 10 14Mean 12ndash14 steps 29 40 22 35 13 19 10 18

was computed based on residual metrics and the two bestmodels the highest gain was observed in RMSE with 77

In the next section the differences andor superioritiesof either linear model SSA-AR or HSVD-AR are identifiedthrough the application of the statistical tests

53 Models Statistical Tests The performance of the linearhybrid models SSA-AR and HSVD-AR is evaluated with

the Wilcoxon hypothesis test and with Pitmanrsquos correlationstest ((16)ndash(17a) (17b) and (17c)) The Wilcoxon hypothesistest results are shown in Table 7 and Pitmanrsquos correlation testresults are shown in Table 8

From Table 7 in 37 comparisons between residuals ofSSA-AR and HSVD-AR the test rejects the null hypothesisthat there is no difference in the prediction at 5 significancelevel In the remaining 5 comparisons the null hypothesis that

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 10: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

10 Mathematical Problems in Engineering

1 33 65 97 129 161 193 22002

04

06

08

1In

jure

d-G

1

Testing sample (weeks)

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220minus10

minus6

minus2

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedHSVD-AR

02

04

06

08

Inju

red-

G1

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

ObservedSSA-ANN

02

04

06

08

Inju

red-

G1

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 6 Injured-G1 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

there is no difference in the prediction is accepted In thiscase there is no difference in 12-step ahead prediction fortime series Injured-G1 the same situation was found in 10-and 11-step ahead prediction for time series Injured-G2 andInjured-G3

Pitmanrsquos correlations test is applied with the residualvalues to identify the superiority of SSA-AR over HSVD-AR

or the oppositeThe correlations between Υ andΨ are shownin Table 8

The null hypothesis of Pitmanrsquos correlation is true at 5significance level if |119877| gt 196radic119901119905 where 119901119905 = 225 testingsamples The results of the correlations are shown in Table 8From Table 8 the results present similarities with respect toWilcoxon test In 5 predictions there is no superiority of either

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 11: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

Mathematical Problems in Engineering 11

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 2200

1

Testing sample (weeks)

02

04

06

08

Inju

red-

G2

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 7 Injured-G2 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 4 11-step ahead prediction results Injured-G1

SSA-AR HSVD-AR SSA-ANN GainMNSE 876 847 829 331198772 986 980 980 06

MAPE 20 25 31 25RMSE 11 13 15 182RE plusmn 5 947 867 831 84

model (SSA-AR and HSVD-AR) SSA-AR shows superioritywith respect to HSVD-AR in 30 predictions (when |119877| gt0123 for nearest horizons) whereas the opposite is observed

Table 5 11-step ahead prediction results Injured-G2

SSA-AR HSVD-AR SSA-ANN GainMNSE 903 906 884 031198772 991 991 987 0

MAPE 28 25 36 107RMSE 09 09 10 0RE plusmn 5 858 853 800 06

in 7 predictions HSVD-AR shows superiority with respect toSSA-AR (when |119877| le 0123 for farthest horizons)

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 12: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

12 Mathematical Problems in Engineering

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-AR

(a)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(b)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedHSVD-AR

(c)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(d)

1 33 65 97 129 161 193 220

1

Testing sample (weeks)

03

05

07

09

Inju

red-

G3

ObservedSSA-ANN

(e)

1 33 65 97 129 161 193 220

2

6

10

Testing sample (weeks)

Rel

erro

r (plusmn

5)

minus10

minus6

minus2

(f)

Figure 8 Injured-G3 11-step ahead prediction (a) SSA-AR Prediction (b) SSA-AR Relative Error (c) HSVD-AR Prediction (d) HSVD-ARRelative Error (e) SSA-ANN Prediction (f) SSA-ANN Relative Error

Table 6 11-step ahead prediction results Injured-G3

SSA-AR HSVD-AR SSA-ANN GainMNSE 872 876 876 051198772 986 987 987 01

MAPE 23 23 23 0RMSE 14 13 14 77RE plusmn 5 924 942 916 19

6 Conclusions

In this paper has been developed multistep ahead trafficaccidents forecasting approach based on singular values andautoregressivemodelsThe nonstationary and nonlinear timeseries of injured people in traffic accidents of Santiago deChile was used

Before the models methodology stages ranking wasapplied to detect the relevant causes of injuries in traffic

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 13: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

Mathematical Problems in Engineering 13

Table 7 Wilcoxon hypothesis testmdashpairwise differences between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 1 1 1 1 1 1 1 1 1 1 1 0 1 12 1 1 1 1 1 1 1 1 1 0 0 1 1 13 1 1 1 1 1 1 1 1 1 0 0 1 1 1

Table 8 Pitmanrsquos correlation test 119877mdashpairwise comparisons between SSA-AR and HSVD-AR

Series Horizon1 2 3 4 5 6 7 8 9 10 11 12 13 14

1 minus09 minus09 minus09 minus08 minus08 minus07 minus06 minus06 minus05 minus04 minus02 minus01 01 032 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus05 minus03 minus02 00 02 03 053 minus09 minus09 minus09 minus08 minus07 minus06 minus05 minus04 minus03 minus01 00 02 03 05

accidents causes related to behavior of drivers pedestriansor passengers are predominant Unwise distance inattentionto traffic conditions and disrespect to red light are the firstimportant causes of injuries in traffic accidents in concor-dance with previous studies that determine disrespect towardsthe road signs as a principal cause of traffic accidents Com-plementary information was observed about traffic accidentsconditions with high rate of injured people automobilestype environmental conditions and relative position amongothers

This approach was described in two stages preprocessingand prediction in the first stage two methods for compo-nents extraction were developed Singular SpectrumAnalysisand Singular Value Decomposition of Hankel whereas inthe second stage the linear autoregressive model and anAutoregressive Neural Network with Levenberg-Marquardtalgorithm were used

Four hybrid models were implemented SSA-AR HSVD-AR SSA-ANN and HSVD-ANNThemodels were evaluatedfor 14-week ahead forecasting comparative analysis showsthat the proposed models SSA-AR and SSA-ANN achievedthe highest accuracy with an average MNSE of 926 and903 respectively the highest gain in average MNSEachieved by SSA-AR is 76 However it was observed thatHSVD-AR shows a higher accuracy in farthest horizons from12 to 14 steps which reaches an average MNSE of 894 inthis case the highest gain achieved by HSVD-AR in MNSE is12

The statistical tests application through Wilcoxon andPitman has shown that SSA-AR is superior to HSVD-ARin 30 of 42 comparisons of resultant efficiency criteria (atnearest horizon) at 5 significance level 5 comparisonsshow equivalence and 7 comparisons show the superiorityof HSVD-AR over SSA (at farthest horizon)

In further works more strategies of components extrac-tion will be explored spectral analysis could help to explainthe nature of traffic accidents in other geographic zonesDetailed work focused on the causes of traffic accidents willbe done to support prevention plans aimed at promotinggood habits on roads and highways

Competing Interests

The authors declare that they have no competing interestsregarding the publication of this paper

References

[1] World Health Organization 2015 httpwwwwhoint[2] J Abellan G Lopez and J de Ona ldquoAnalysis of traffic accident

severity using decision rules via decision treesrdquo Expert Systemswith Applications vol 40 no 15 pp 6047ndash6054 2013

[3] L-Y Chang and J-T Chien ldquoAnalysis of driver injury severity intruck-involved accidents using a non-parametric classificationtree modelrdquo Safety Science vol 51 no 1 pp 17ndash22 2013

[4] J De Ona G Lopez R Mujalli and F J Calvo ldquoAnalysis oftraffic accidents on rural highways using LatentClass Clusteringand Bayesian Networksrdquo Accident Analysis and Prevention vol51 pp 1ndash10 2013

[5] Y-R Shiau C-H Tsai Y-H Hung and Y-T Kuo ldquoTheapplication of data mining technology to build a forecastingmodel for classification of road traffic accidentsrdquoMathematicalProblems in Engineering vol 2015 Article ID 170635 8 pages2015

[6] L Lin Q Wang and A W Sadek ldquoA novel variable selectionmethod based on frequent pattern tree for real-time traffic acci-dent risk predictionrdquo Transportation Research Part C EmergingTechnologies vol 55 pp 444ndash459 2015

[7] M Deublein M Schubert B T Adey J Kohler and M HFaber ldquoPrediction of road accidents a Bayesian hierarchicalapproachrdquo Accident Analysis and Prevention vol 51 pp 274ndash291 2013

[8] S Ognjenovic R Donceva and N Vatin ldquoDynamic homo-geneity and functional dependence on the number of trafficaccidents the role in urban planningrdquo in Proceedings of theInternational Scientific Conference Urban Civil Engineering andMunicipal Facilities (SPbUCEMF rsquo15) vol 117 p 551 2015

[9] K Keay and I Simmonds ldquoRoad accidents and rainfall in a largeAustralian cityrdquo Accident Analysis and Prevention vol 38 no 3pp 445ndash454 2006

[10] M Aron R Billot N E Faouzi and R Seidowsky ldquoTrafficindicators accidents and rain some relationships calibrated on

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 14: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

14 Mathematical Problems in Engineering

a French urban motorway networkrdquo Transportation ResearchProcedia vol 10 pp 31ndash40 2015

[11] Comision Nacional de Seguridad de Transito 2015 httpwwwconasetcl

[12] H Viljoen and D G Nel ldquoCommon singular spectrum analysisof several time seriesrdquo Journal of Statistical Planning andInference vol 140 no 1 pp 260ndash267 2010

[13] C A F Marques J A Ferreira A Rocha et al ldquoSingularspectrum analysis and forecasting of hydrological time seriesrdquoPhysics and Chemistry of the Earth Parts ABC vol 31 no 18pp 1172ndash1179 2006

[14] L Telesca M Lovallo A Shaban T Darwich and N AmachaldquoSingular spectrum analysis and Fisher-Shannon analysis ofspring flow time series An application to Anjar SpringLebanonrdquo Physica A vol 392 no 17 pp 3789ndash3797 2013

[15] H Hassani S Heravi and A Zhigljavsky ldquoForecasting Euro-pean industrial production with singular spectrum analysisrdquoInternational Journal of Forecasting vol 25 no 1 pp 103ndash1182009

[16] H Hassani A Webster E S Silva and S Heravi ldquoForecastingUS tourist arrivals using optimal singular spectrum analysisrdquoTourism Management vol 46 pp 322ndash335 2015

[17] J Wang S Jin S Qin and H Jiang ldquoSwarm intelligence-based hybrid models for short-term power load predictionrdquoMathematical Problems in Engineering vol 2014 Article ID712417 17 pages 2014

[18] H Li L Cui and S Guo ldquoA hybrid short-term power loadforecasting model based on the singular spectrum analysis andautoregressive modelrdquo Advances in Electrical Engineering vol2014 Article ID 424781 7 pages 2014

[19] W Zhang Z Su H Zhang Y Zhao and Z Zhao ldquoHybrid windspeed forecasting model study based on SSA and intelligentoptimized algorithmrdquo Abstract and Applied Analysis vol 2014Article ID 693205 14 pages 2014

[20] Y Xiao J J Liu Y Hu Y Wang K K Lai and S Wang ldquoAneuro-fuzzy combination model based on singular spectrumanalysis for air transport demand forecastingrdquo Journal of AirTransport Management vol 39 pp 1ndash11 2014

[21] M Abdollahzade AMiranian HHassani andH IranmaneshldquoA new hybrid enhanced local linear neuro-fuzzy model basedon the optimized singular spectrum analysis and its applicationfor nonlinear and chaotic time series forecastingrdquo InformationSciences vol 295 pp 107ndash125 2015

[22] L Barba N Rodrıguez and C Montt ldquoSmoothing strategiescombined with ARIMA and neural networks to improve theforecasting of traffic accidentsrdquoTheScientificWorld Journal vol2014 Article ID 152375 12 pages 2014

[23] G H Golub and C F V LoanMatrix Computations The JohnsHopkins University Press 1996

[24] N Golyandina V Nekrutkin and A A Zhigljavsky Analysis ofTime Series Structure Chapman amp HallCRC 2001

[25] J Elsner and A Tsonis Singular Spectrum Analysis A New Toolin Time Series Analysis Plenum 1996

[26] H Hassani R Mahmoudvand and M Zokaei ldquoSeparabilityand window length in singular spectrum analysisrdquo ComptesRendus Mathematique vol 349 no 17-18 pp 987ndash990 2011

[27] R Wang H-G Ma G-Q Liu and D-G Zuo ldquoSelection ofwindow length for singular spectrum analysisrdquo Journal of theFranklin Institute vol 352 no 4 pp 1541ndash1560 2015

[28] J A Freeman and D M Skapura Neural Networks AlgorithmsApplications and Programming Techniques Addison-Wesley1991

[29] M Hagan H Demuth and M Bealetitle Neural NetworkDesign Hagan Publishing 2002

[30] P Krause D P Boyle and F Base ldquoComparison of differentefficiency criteria for hydrological model assessmentrdquoAdvancesin Geosciences vol 5 pp 89ndash97 2005

[31] R L Ott and M Longnecker An Introduction to StatisticalMethods and Data Analysis Cengage Learning Boston MassUSA 6th edition 2001

[32] K Hipel and A McLeod Time Series Modelling of WaterResources and Environmental Systems Elsevier 1994

[33] D Kwiatkowski P C B Phillips P Schmidt and Y Shin ldquoTest-ing the null hypothesis of stationarity against the alternative ofa unit root how sure are we that economic time series have aunit rootrdquo Journal of Econometrics vol 54 no 1ndash3 pp 159ndash1781992

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 15: Research Article Hybrid Models Based on Singular Values and …downloads.hindawi.com/journals/mpe/2016/2030647.pdf · 2019. 7. 30. · Research Article Hybrid Models Based on Singular

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of