
ORIGINAL ARTICLE

Comparison of machine learning algorithms for optimization and improvement of process quality in conventional metallic materials

Susana Ferreiro & Basilio Sierra

Received: 12 January 2011 / Accepted: 7 August 2011 / Published online: 6 September 2011
© Springer-Verlag London Limited 2011

Abstract This paper addresses the appearance of burr during the drilling process in the aeronautic industry. The burr must not exceed the height limit of 127 μm set out in the aeronautical guidelines and must be eliminated before riveting. If it is not, it can cause structural damage and therefore compromise safety. Moreover, the industry needs an automated and optimised process in which drilling and deburring are carried out in real time, eliminating unnecessary tasks, in order to obtain high-quality pieces. This work presents the applicability of data mining and machine learning techniques to obtain a real-time burr detection model. This model could be implemented in the computer numerical control of the machine, allowing the whole process to be automated and optimised. These techniques can also be applied to other types of processes.

Keywords Machine learning algorithms · Data mining · Drilling process · Burr detection · Quality control · Process optimization

1 Introduction

The drilling process in the aeronautical industry is carried out on big pieces such as the fuselage and the main beam (or wings), where most of the pieces are joined using rivets. Although this seems to be an easy procedure, it is technically complicated due to the occurrence of burr during the process. If drilling is not carried out with enough precision and skill, unwanted burr is generated at the entrance and the exit of the hole, which requires additional tasks and tools to eliminate the burr from the pieces before continuing with the riveting.

The burr generated must be eliminated because it causes stress lines in critical areas and support surfaces, which can become deformed under load and so degrade the fatigue properties of the holes. It also generates stresses and may even cause structural damage. If a lot of burr remains, the rivet may not be firmly attached and may come loose when the aircraft is in use, which is a great danger. The aeronautical industry insists on high-quality drilling processes, and the elimination of the burr generated is an important aspect in guaranteeing the reliability and safety of the piece.

In the aviation industry, certain pieces such as the main beam and the fuselage, amongst others, are drilled in a semi-manual way with high-power drills. The operating procedure for these pieces is the following: a structure is positioned to secure the piece and work on it, the piece is assembled, drilled manually or semi-manually, and disassembled, the burr is eliminated, and the piece is riveted and finally assembled (integrated) on the aircraft.

The size of the burr, its shape, its location and the finish of the surface are the main factors to take into account when choosing the deburring procedure. When the deburring operation is not considered until the end of the building process, there can be a potential loss due to any failure in the selection, planning or execution of the final surface-finishing process. After the drilling process and prior to the riveting, a visual inspection and manual deburring are carried out, which increases the cost of the process. The appearance of burr during the drilling process prevents the optimization of the assembly, and the cost of deburring may be over 30% of the cost of the finished pieces.

S. Ferreiro (*)
Diagnosis and Prediction Technologies Unit, Tekniker, Eibar, Gipuzkoa, Spain
e-mail: [email protected]

B. Sierra
University of the Basque Country UPV-EHU, San Sebastian, Gipuzkoa, Spain

Int J Adv Manuf Technol (2012) 60:237–249
DOI 10.1007/s00170-011-3578-x

This problem is being approached in different ways. One approach attempts to control the procedure so that an adequate choice of process conditions ensures that the tools do not wear out too much and, thus, that the size of the burr generated stays within the permitted limits. Dynamic studies of the procedure are also carried out. Moreover, there are studies which, through simulation and modelling, focus on the categorization of the burr and its behaviour under different working conditions.

The industry needs to improve productivity and automate the whole process, optimising the drilling process and getting rid of unnecessary tasks such as the manual elimination of the burr. This calls for a monitoring system capable of automatically detecting, in real time, whether or not the burr generated in the hole is within the limits established in the aeronautic sector's guidelines for riveting.

Aerospace manufacturers have been drawn towards robotised solutions and gantries due to their cost-effectiveness and to the flexibility which they provide when automating certain operations such as drilling and/or assembly. For this reason, more and more solutions are available. The challenge is to find an automatic deburring system integrated either in the process or in the machines themselves. The monitoring system should be integrated in the computer numerical control (CNC) system and be responsible for positioning the moving mechanical element through instructions created and issued in a fully automated manner from real-time numerical information.

As previously mentioned, there is a great demand for an automated and optimised system which can guarantee the quality of the final pieces in the drilling process.

2 Motivation

The aeronautic and aerospace industries insist on high-quality drilling processes; with this in mind, the automatic elimination of the burr generated during the process is important.

Within this area, two main types of solution are being developed to solve this problem: the dynamic study of the process, and the integration of simulation or other models.

The dynamic study tries to improve conditions, both of the operation and of the tool itself, in order to stabilise the process. This is done through predicting the strength and precision, the choice and design of the tools, experimental analysis and the simulation of different cutting speeds.

The integration of simulation models tries to optimise the trajectories and improve the machining processes by calculating the cutting forces and deformations, identifying vibration-related problems, selecting the adequate tool, defining its usage well and carrying out an analysis before the process. These models require wide knowledge of the behaviour of the process (the underlying aspects) in order to establish the constitutive equations that the phenomenon demands. The software on the market comes with these equations already integrated; such packages are like black boxes, used without really knowing their associated implicit process, where the equations are already established and remain hidden from the end-user. If the available software does not offer the equations that need to be modelled, the problem becomes too complicated to tackle from a functional viewpoint. Even if the software does have these equations, this type of analysis allows the behaviour of the burr to be determined as certain operational parameters of the process are modified. These techniques are used to evaluate (as guidance) how changes in certain operational parameters of the process affect the result (e.g. to see how a change in the angle of entry of the tool would affect the burr on the piece being worked on). They are not predictive tools, although at times they are treated as such.

Cooling, for example, is a widely used method to limit tool wear. It is usually carried out using coolant, also known as cutting oil (mixed with a percentage of water), which is pumped onto the tool during the drilling process in order to lubricate and cool the work area. The aim is to extend the life of the bit and keep its wear and the residue generated to a minimum.

The aforementioned techniques are costly, time consuming and use up resources; they reduce performance and do not allow the optimization of the process. Consequently, a lot of research is being carried out on new techniques to achieve a model which detects burr above the permitted limit and which can be integrated in the CNC of the machine and operate in real time. This model would allow the planning process of the finished product to be fully set from the beginning. The burr generated and its behaviour would not be reasons for concern because they would have been previously detected and automatically eliminated, without the need for either a visual inspection or for the operator to carry out deburring. The drilling process would be optimised, increasing efficiency and also the quality of the end product.


3 Related works

The aeronautic industry, along with other industrial sectors, has to modify some of its manufacturing processes and its current maintenance strategy. Regarding the maintenance strategy, replacing traditional corrective maintenance with a preventative and predictive one requires reducing the cost as much as possible and increasing the operative reliability, as explained in Ferreiro and Arnaiz [9]. As for manufacturing, the main need is to increase productivity and optimise certain processes, such as drilling, while guaranteeing the quality of the product. It is necessary to explore new technologies for both manufacturing and maintenance, and a lot of work has already been carried out in this field, such as the following:

Bukkapatnam et al. [3] present a method based on chaos theory, wavelets and neural networks for the analysis of acoustic emission (AE) signals. These signals are used to detect surface faults in structures. The study presents the characterisation of the signal, followed by its representation through wavelet packets and the estimation of the state of the structure through the subsequent use of neural networks. Bukkapatnam et al. [4] develop a methodology which uses the fractal properties of the machining dynamics to estimate the flank wear of the tool through wavelets and a neural network. Kamarthi et al. [15] research a way to estimate flank wear (in turning) through AE signals. A group of experiments is performed on AISI 6150 steel and K68 (C2) grade uncoated carbide inserts, applying wavelets to the signals, from which an architecturally simple neural network is built and used to relate the characteristics extracted from the AE signals to the flank wear. Pittner and Kamarthi [25] present a new method for the extraction of characteristics using a fast wavelet transform and verify its efficiency by applying it to the problem of estimating flank wear (in a turning process). Sick [28] reviews the state of the art across 138 publications which use different tool-monitoring methods and neural networks to estimate tool wear. The study compares the different methodologies and the criteria used to select certain methods, to perform simulation experiments, and to evaluate and present the results. Rangwala and Dornfeld [26] use a neural network (in a turning operation) to learn the effects of certain input parameters of the process (such as feed rate, depth of cut and cutting speed) on the output variables (cutting force, power, temperature or the surface finish of the piece).

These techniques have evolved and changed throughout the years with the aim of improving the results achieved. In How et al. [14], a neural network is implemented through an Ethernet network, connected to various sensors of the process and to a programmable logic controller connected to the control computers of the Ethernet. The neural network determines the quality of certain parameters, such as the roughness of the finished surface, from certain process parameters. Umbrello et al. [29] use clustering to identify the relationship between the residual stress and the parameters of the process in order to optimise it, given that these residual stresses can be damaging and represent a negative characteristic for fatigue, rust, resistance and other functional aspects. Malakooti and Raman [18] develop a neural network to establish the operating parameters of the machine. The network is used to determine the relationship between the input parameters (speed of the machine, feed and depth of the cut) and the output parameters (temperature of the cut, cutting force, history of the tool and roughness of the finish), with the ultimate aim of reducing the cost and maximising productivity and the finish of the surface. Correa et al. [7] present the use of a Bayesian network to predict the roughness of the surface, and in a later study, Correa et al. [8] compare the results achieved with that network with those obtained with a neural network.

However, drilling is still one of the most important manufacturing processes and requires great attention. The main problem in the study of this process is the appearance of burr, and in this regard a lot of research has been carried out, such as that put forward by Kim et al. [16] and Min et al. [21], where control charts were used to examine the drilling process on AISI 304L and AISI 4118 stainless steel pieces with the aim of examining the behaviour of the process under certain conditions. Hambli [12] describes a finite element approach combined with a neural network to predict the height of the burr. The numerical results obtained through the finite element calculation, which includes a model of the damage, the fractures and the effect of tool wear, are used to train the simulation environment developed through the neural network. Heisel et al. [13] propose a method based on empirical cutting tests and the correlation between the burr generated and the operating conditions of the machine, such as the cutting speed or the geometry of the tool. Lauderbaugh [17] presents a method to combine different experimental tools, simulation and statistics with the aim of predicting the height of the burr, the force, the heat flow and the temperature. The materials used were aluminium 2024-T351 and 7075-T6. Gaitonde et al. [10] research the application of Genetic Algorithms along with the RMS value to obtain the optimum values of the process (such as cutting speed and bit diameter) which minimise the size of the burr in the drilling of AISI 316L stainless steel pieces with HSS bits. Gaitonde et al. [11] put forward the application of the Taguchi method to minimise the height and thickness of the burr through the cutting conditions and the geometry of the bit. Chang et al. [5] present an analytical model to predict the height of the burr in vibration-assisted drilling of 6061-T6 aluminium pieces.

In the work of Peña et al. [24], a study was carried out on how to achieve clean holes, free of burr, in an effort to reduce or eliminate the lubricant used and so ease the task of cleaning the structure. In 2007, the Cooperative Investigation Center for High Performance Manufacturing ‘C.I.C. marGUNE’ patented an experimentally adjusted method capable of detecting in real time whether or not the size of the burr generated during the drilling process is within the permitted limits. The patent presented in Peña et al. [24], named ‘Optimization of the Drilling Process of Aeronautic Material using On-line Monitoring of the Burr Generated’, proposes and evaluates a monitoring method to detect the generation of burr based on parameters directly extracted from the internal signals of the machine, specifically the torque signal of the electro-spindle with respect to time. The model achieved is based on rules defined from thresholds established on the extracted parameters, and its accuracy is over 92%.

4 Data mining and machine learning

Data mining prepares, tests and explores data to extract information which is hidden in it. It consists of the extraction of meaningful knowledge which implicitly resides in the data. This knowledge is previously unknown and can be useful, as explained in Witten and Frank [31]. Often, terms such as ‘Intelligent Data Analysis’ (IDA), Berthold and Hand [2], and ‘Knowledge Discovery in Databases’ (KDD) are used as synonyms.

Within data mining, there are sets of techniques which are based on artificial intelligence and the statistical analysis of data. The models developed with the techniques provided by data mining may address problems of detection, classification and prediction. The most typical process consists of the general stages shown in Fig. 1:

• The selection and preparation of a data set: selection and pre-processing of the data.

• The selection of the variables: extraction of the most significant characteristics which identify the event to be detected.

• Analysis: knowledge extraction. The algorithm to be used is selected and the knowledge model which represents the observed behaviour pattern is obtained.

• Interpretation and evaluation of the model: if, through the different techniques, different models have been obtained, they must be compared in order to see the one which best fits. If none of the models reaches the expected results, some of the previous stages have to be modified in order to obtain new models.

It is both an iterative and interactive process, as explained in Witten and Frank [31]. It is iterative because the output of some of the stages might mean returning to one of the previous stages and because several iterations are necessary in order to extract high-quality knowledge. It is interactive because the expert who controls the problem will have to help in the preparation of the data and the validation of the knowledge extracted.

Sometimes it is necessary to examine several models in order to find the one best suited to solve the problem. Once this model has been found, and from the results achieved, a new model can be constructed by varying the techniques used in the data mining process. In the search for the final model, it may be necessary to return to previous stages of the process and to make appropriate changes to the data, and the problem might even have to be redefined so as to approach it in a different way.
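The comparison of several candidate models can be sketched as follows. This is a minimal illustration that assumes k-fold cross-validation as the evaluation scheme and uses two hypothetical stand-in learners (`train_majority`, `train_threshold`) on made-up one-variable data; none of these names or values come from the study.

```python
from collections import Counter
from statistics import mean
from typing import Callable, Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]          # (input variables, class)
Model = Callable[[Sequence[float]], str]
Trainer = Callable[[List[Sample]], Model]


def train_majority(train: List[Sample]) -> Model:
    """Baseline: always predict the most frequent class of the training set."""
    label = Counter(y for _, y in train).most_common(1)[0][0]
    return lambda x: label


def train_threshold(train: List[Sample]) -> Model:
    """Cut the first variable halfway between the two class means.

    Toy assumption: class "yes" has the larger values of that variable.
    """
    yes = [x[0] for x, y in train if y == "yes"]
    no = [x[0] for x, y in train if y == "no"]
    if not yes or not no:
        return train_majority(train)
    cut = (mean(yes) + mean(no)) / 2
    return lambda x: "yes" if x[0] > cut else "no"


def cross_val_accuracy(trainer: Trainer, data: List[Sample], folds: int = 5) -> float:
    """Fraction of samples classified correctly when each fold is held out once."""
    if not 2 <= folds <= len(data):
        raise ValueError("folds must be between 2 and len(data)")
    correct = 0
    for i in range(folds):
        held_out = data[i::folds]
        train = [s for j, s in enumerate(data) if j % folds != i]
        model = trainer(train)
        correct += sum(model(x) == y for x, y in held_out)
    return correct / len(data)


def best_model(candidates: Dict[str, Trainer], data: List[Sample], folds: int = 5) -> str:
    """Name of the candidate with the highest cross-validated accuracy."""
    scores = {name: cross_val_accuracy(t, data, folds) for name, t in candidates.items()}
    return max(scores, key=scores.get)


# Hypothetical data: one variable, class "yes" when it exceeds 5.
data = [((float(v),), "yes" if v > 5 else "no") for v in range(10)]
candidates = {"majority": train_majority, "threshold": train_threshold}
```

If none of the candidates reaches the expected accuracy, the earlier stages (data preparation, variable selection) would be revisited and the comparison repeated.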

5 Machine learning

Machine learning, as defined in Mitchell [22], is a sub-field of artificial intelligence [27] which is closely related to data mining [20]. Its aim is to develop algorithms which allow machines to learn from data; that is, programs are developed that are capable of inducing models from this data. It involves a knowledge induction process, a generalisation or specialisation process on a set of input examples, which moreover allows new examples to be classified a posteriori from the previously obtained concepts.

Fig. 1 Data mining stages


Inductive learning can be seen as a method of learning a function. An example is a pair (x, f(x)), where x is the input and f(x) the output. The pure inductive inference procedure (or induction) is: given a set of examples of f, return a function h that approximates f. The function h is called the hypothesis or model.
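As a minimal sketch of this inductive step, the function `induce` below returns a hypothesis h built from example pairs (x, f(x)). The one-dimensional input and the 1-nearest-neighbour rule are illustrative assumptions, and the training pairs are hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[float, str]  # a pair (x, f(x))


def induce(examples: List[Example]) -> Callable[[float], str]:
    """Return a hypothesis h that approximates f from the given examples.

    Illustrative assumption: h answers with the output of the closest
    training input (1-nearest neighbour).
    """
    if not examples:
        raise ValueError("at least one example is required")

    def h(x: float) -> str:
        _, nearest_fx = min(examples, key=lambda ex: abs(ex[0] - x))
        return nearest_fx

    return h


# Hypothetical examples of an unknown function f.
training = [(1.0, "no"), (2.0, "no"), (8.0, "yes"), (9.0, "yes")]
h = induce(training)
```

Calling `h` on an unseen input then plays the role of classifying a new example from the induced model.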

In machine learning, several different types of algorithms can be distinguished according to their functionality and application:

• Supervised learning: offers the possibility of learning from a set of categorised or labelled data, providing a predictive model capable of representing and generalising the behaviour pattern in the data. Once the model is created, it is capable of classifying and categorising new cases of the problem that it is trying to solve.

• Unsupervised learning: starts with a set of data whose category or label is unknown, which is analysed by the algorithms in order to recognize the different groups of cases with common characteristics. The creation of these groups allows the extraction of information from the data sets available and brings to light characteristics that would otherwise remain hidden.

While unsupervised learning algorithms consist of finding the most appropriate distribution of the input set from similarities in the data, supervised learning algorithms attempt to extract those properties which allow the category of each example to be determined and, consequently, need a previous classification (supervision) of the training set. In the case of supervised algorithms, the examples which make up the training set normally consist of pairs (input object, object class), where the input object is usually represented by a vector of variables (or properties of the object). The aim of the automatic supervised learning algorithms is to find the set of variables which allows the class of each object of the training set to be predicted with the greatest accuracy.

As a result of its nature and qualities, there is a wide application scope for machine learning, giving rise to its use in such diverse areas as medicine [32, 19] or robotics [6, 1]. Moreover, little by little these algorithms are being incorporated in new fields of work such as aeronautics and aerospace, or machine tools, where the algorithms allow the optimization of certain processes and the improvement of several applications in current maintenance [8, 23].

6 Experimental setup

The development of this work began with the ARKUNE Project (intelligent manufacturing workspaces, monitoring of startup processes), whose aim was to develop new techniques and monitoring systems for machining processes. The applications selected were:

• Monitoring of the wear of the turning tool through the processing of acoustic emission signals and acceleration.

• Monitoring of the appearance of burr in the drilling process (carried out in aeronautics) through the machine signal.

The drilling process is particularly important in the aeronautic industry due to the need to guarantee the safety of the product and fulfil the legal requirements regarding the size of the burr. Holes with burr in excess of the official limit of 127 μm are not permitted, even when the percentage of such holes is low. One of the results of the project was the development of a monitoring method to detect the formation of burr in the drilling process, based on the characteristics of the torque signal of the main spindle, which was patented in 2007 by ‘C.I.C. marGUNE’ [24]. The algorithm developed is based on a series of thresholds established for these characteristics and has a reliability rate of approximately 92%.

This paper presents the use of supervised classification and data mining for the detection of burr above the set limit during the drilling process. As mentioned in [30]: ‘…We will never find the rules which establish certain aspects of the process through the manual study of data sets…’ and ‘…we will not be able to discover the hidden knowledge in the data sets without the help of computerised data analysis and the different approaches that Data Mining provides…’.

Within the ARKUNE project, one of the preliminary tasks was to study the sensitivity of the different machine signals to the detection of burr, and the use and treatment of these signals for the development of an on-line monitoring system. Several internal signals of the machine were analysed, such as the torque of the axis, the power and the thrust force.

Figure 2 shows a signal gathered during one of the tests carried out in the drilling process. It corresponds to the torque of the electro-spindle during the drilling of one of the holes, from when the poppet starts (beginning of the acceleration zone) until it completely stops (end of the deceleration zone). The use of this type of signal presents a series of advantages with respect to other signals: the acquisition system is simple and does not require additional elements to be placed in the machine, the method is non-intrusive because no external elements need to be added to the work piece, and the signals are easy to integrate in the control of the machine.

Through the pre-analysis of this type of signal (carried out by Fatronik) it was detected that the waveform of the electro-spindle torque signal with respect to time is related to the size of the burr created in the process. The signal is divided into four zones: ‘acceleration zone’, ‘approach zone’, ‘cutting zone’ and ‘deceleration zone’. The most representative zone corresponds to the ‘cutting zone’ shown in Fig. 3.

6.1 Characteristics of the process

The structure of an aircraft must be able to withstand certain speeds, forces, pressures and impacts. Light materials are required, with good mechanical characteristics, easy to machine, resistant to corrosion, able to undergo surface treatment, etc. Aluminium and its alloys possess the best characteristics to fulfil these requirements and are thus widely used in the aeronautical industry. A series of tests was carried out on pieces of Al7075-T6 aluminium.

The tests were performed in a three-axis CNC machining centre, in high-speed conditions which were also dry (no lubrication) in order to avoid cleaning the structure afterwards. The machine operates at a maximum speed of 24,000 rpm and has a maximum feed speed limit of 120 m/min, an acceleration of 2 g (g = 9.8 m/s²), a nominal power of 27 kW and a nominal torque of 16.9 Nm.

The tool used was a three-edged solid carbide drill, with two types of angle: tip angle (130°) and helix angle (30°).

6.2 Preparing the data sets

In the preliminary study, five parameters were detected which were associated with the height of the burr and which were independent of the parameters of the process: minimum (Min), maximum (Max), angle (S), height (H) and width (W), shown in Fig. 3.

However, it was decided to continue working to optimise this monitoring strategy by using supervised classification and data mining techniques. To this end, a set of 105 tests under different operating conditions was created. For each test, the set of variables given in Table 1 was gathered, which represents the structure of the final data set.

The variables represent parameters which correspond to operating conditions, ‘Configuration’, or parameters extracted directly from the internal machine signal, ‘Sensor’. The ‘Configuration’ parameters are the type of drill used (Drill), the velocity (VC), the approach speeds of the tool (AV1 and AV2), the length of the entrance (LE) and the exit (LX) of the tool with respect to the work piece, and the thickness of the piece (T). The ‘Sensor’ parameters are the minimum (Min), the maximum (Max), the angle (S), the height (H) and the width (W), all gathered from the signal (see Fig. 3).

Moreover, for every test carried out, the profile of the burr was measured at different angles (0°, 90°, 120°, 180°, 240°, 270°) with the Mitutoyo (SV-2000 N2) measuring system. The average of the heights at the different angles was calculated and the class of the test was established according to the following criteria:

Fig. 2 Signal of the electro-spindle

Fig. 3 Cutting zone

• Admissible burr: burr=‘no’, the height of the burr does not exceed 127 μm.

• Non-admissible burr: burr=‘yes’, the height of the burr exceeds 127 μm.
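The labelling rule above can be written as a short function. This is a sketch only: the function name `burr_class` and the example heights in the usage note are hypothetical, while the 127 μm limit and the averaging over the measured angles follow the text.

```python
from statistics import mean
from typing import Sequence

BURR_LIMIT_UM = 127.0  # height limit stated in the text, in micrometres


def burr_class(heights_um: Sequence[float]) -> str:
    """Label a test from the burr heights measured at the different angles.

    Returns 'yes' (non-admissible) when the average height exceeds the
    limit, and 'no' (admissible) otherwise.
    """
    if not heights_um:
        raise ValueError("at least one height measurement is required")
    return "yes" if mean(heights_um) > BURR_LIMIT_UM else "no"
```

For example, six hypothetical measurements of 150, 140, 130, 120, 135 and 128 μm average about 133.8 μm and would be labelled ‘yes’.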

7 Experimental results

Two approaches were carried out following the stages in Fig. 1. First, different machine learning algorithms were tested, with and without the selection of variables. Next, the set of algorithms was extended and a Bayesian network was created with the intention of improving the initial result.

The tables that follow share a general structure. They list the type of algorithm used, the name of the algorithm, its success rate, the standard deviation and the rate of false negatives (FN rate) and, in some cases, the subset of variables used. FN are cases which the algorithm classifies in the ‘no’ category but which actually belong to the ‘yes’ category. These cases are of special interest because, as previously mentioned, it is necessary to detect all holes whose burr exceeds the permitted limit. The aeronautic industry does not permit holes with burr exceeding 127 μm, which, apart from reducing the quality of the final piece, represents an unacceptable loss of safety in this field.
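The two figures of merit can be computed from a list of true and predicted classes as sketched below. One assumption is made here and should be checked against the original tables: the FN rate is taken as the number of false negatives divided by the number of cases that truly belong to the ‘yes’ category. The function name is hypothetical.

```python
from typing import Sequence, Tuple


def success_and_fn_rate(actual: Sequence[str], predicted: Sequence[str]) -> Tuple[float, float]:
    """Return (success rate, FN rate) for 'yes'/'no' burr classes.

    Assumption: FN rate = false negatives / actual 'yes' cases.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(actual, predicted))
    correct = sum(a == p for a, p in pairs)
    positives = sum(a == "yes" for a in actual)
    # A false negative is predicted 'no' although the burr exceeds the limit.
    false_negatives = sum(a == "yes" and p == "no" for a, p in pairs)
    fn_rate = false_negatives / positives if positives else 0.0
    return correct / len(actual), fn_rate
```

Because a missed ‘yes’ hole is the costly error in this application, a classifier with a slightly lower success rate but a lower FN rate may still be preferable.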

7.1 Machine learning: results

Initially, the following machine learning algorithms were tested with the open software WEKA (http://www.cs.waikato.ac.nz/ml/weka/index.html): classification trees (J48, ID3), induction rules (JRip, Prism), distance-based techniques (k-nearest neighbour with k=1 and k=3), the probabilistic model based on Bayes' theorem (Naïve Bayes) and logistic regression (Logistic). The criteria used to select these algorithms were their simplicity, their ease of interpretation and the a priori ease of implementing them in the CNC.
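To illustrate why a distance-based technique is simple enough to embed in a controller, here is a minimal k-nearest-neighbour sketch. It is not WEKA's implementation: it assumes plain Euclidean distance on unscaled variables, and it uses only the Min, Max, H and W values of the five sample rows of Table 1 as a toy training set.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (variables, class)


def knn_predict(train: List[Sample], query: Sequence[float], k: int = 1) -> str:
    """Classify `query` by majority vote among its k nearest training samples."""
    if not 1 <= k <= len(train):
        raise ValueError("k must be between 1 and the number of training samples")
    ranked = sorted(train, key=lambda s: math.dist(s[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# (Min, Max, H, W) and Burr class from the five sample rows of Table 1.
table1_rows: List[Sample] = [
    ((0.11, 0.31, 2.09, 10.44), "Yes"),
    ((0.21, 0.24, 2.23, 14.04), "Yes"),
    ((0.07, 2.08, 4.93, 8.82), "No"),
    ((0.69, 0.30, 3.78, 14.32), "Yes"),
    ((0.23, -0.84, 3.69, 10.77), "No"),
]
```

In practice the variables would normally be rescaled before computing distances, since W spans a much larger range than Min.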

7.1.1 Data without transformation

The results obtained with the previous algorithms are presented in Table 2. Some of them could not be applied because the data needed transformation.

7.1.2 Discretization of variables

The results of the aforementioned algorithms, applied after a previous discretization, are presented in Table 3.
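The text does not state which discretization method was used. As one possible illustration, the sketch below assumes equal-width binning with a hypothetical choice of three bins and applies it to the W column of the Table 1 sample rows.

```python
from typing import List, Sequence


def equal_width_bins(values: Sequence[float], n_bins: int = 3) -> List[int]:
    """Map each value to a bin index in 0..n_bins-1 using equal-width intervals."""
    if n_bins < 1 or not values:
        raise ValueError("need at least one bin and one value")
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# W column of the five sample rows of Table 1.
w_values = [10.44, 14.04, 8.82, 14.32, 10.77]
w_bins = equal_width_bins(w_values, n_bins=3)
```

Algorithms such as ID3 and Prism operate on nominal variables, which is presumably why they could only be applied after a transformation of this kind.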

Table 1 Data sets

Drill  VC   LE  AV1  AV2  LX  T   Min   Max    S      H     W      Burr
SR     150  35  0.3  0.3  35  25  0.11  0.31   −42.3  2.09  10.44  Yes
SR     200  35  0.4  0.4  35  25  0.21  0.24   −55.7  2.23  14.04  Yes
Hard   150  8   0.3  0.5  20  25  0.07  2.08   −43.3  4.93  8.82   No
Hard   250  35  0.2  0.2  35  12  0.69  0.3    31.1   3.78  14.32  Yes
SR     200  20  0.3  0.5  20  12  0.23  −0.84  −22.5  3.69  10.77  No

Table 2 Results without transformation of variables

Type of classification             Algorithm    Mean value (%)  Standard deviation  FP rate
Classification trees               J48          93.14           0.98                0.028
Classification trees               ID3          –               –                   –
Induction rules                    JRip         89.24           1.19                0.0853
Induction rules                    Prism        –               –                   –
Distance based techniques          k-NN (k=1)   92.10           1.10                0.0504
Distance based techniques          k-NN (k=3)   93.62           1.10                0.0347
Techniques based on probabilities  Naïve Bayes  86.29           1.12                0.1215
Statistical techniques             Logistic     88.95           1.97                0.084

The results of the algorithms appear to be very similar whether or not the dataset has been discretized. A paired t test was used to determine whether the mean difference between the results of the two sets of observations (with and without discretization) was significant. This test determines whether the transformation of the data (discretization) makes a difference to the result of the algorithms, namely, whether applying the algorithms to the discretized dataset increases their accuracy. The test gave a p value of 0.138 > 0.05, where α = 0.05 is the level of significance. There is therefore not enough evidence to reject the null hypothesis H0, and the differences found in the results of the algorithms may simply be put down to noise rather than to the transformation carried out on the dataset. As no significant differences were found between the two datasets, it was decided to continue working with the discretized dataset.
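The paired t test described above can be sketched with the standard library alone. The statistic is the mean of the paired differences divided by its standard error, and significance at α = 0.05 is checked against tabulated two-sided critical values. The `without` list holds the six mean values reported in Table 2; the `with_disc` list contains hypothetical placeholder values, not the results of Table 3.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple

# Two-sided critical values of Student's t at alpha = 0.05, by degrees of freedom.
T_CRIT_05 = {1: 12.706, 2: 4.303, 3: 3.182, 4: 2.776, 5: 2.571,
             6: 2.447, 7: 2.365, 8: 2.306, 9: 2.262, 10: 2.228}


def paired_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t statistic, degrees of freedom) for paired samples a and b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two samples of equal length >= 2")
    diffs = [x - y for x, y in zip(a, b)]
    sd = stdev(diffs)
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def significant(a: Sequence[float], b: Sequence[float]) -> bool:
    """True when H0 (mean difference = 0) is rejected at alpha = 0.05."""
    t, df = paired_t(a, b)
    return abs(t) > T_CRIT_05[df]  # the table covers df 1..10 only


# Success rates (%) from Table 2 (J48, JRip, k-NN k=1, k-NN k=3, Naive Bayes, Logistic).
without = [93.14, 89.24, 92.10, 93.62, 86.29, 88.95]
# HYPOTHETICAL placeholder values for the discretized dataset (not from the paper).
with_disc = [93.50, 90.10, 91.80, 93.00, 87.90, 89.00]
```

With these placeholder values the statistic stays below the critical value for five degrees of freedom, which mirrors the situation reported in the text, where the null hypothesis could not be rejected.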

7.1.3 Selection of variables

The following evaluation criteria and search methods were used in the selection of variables.

Evaluation criteria to determine the quality of the variable set in order to distinguish the class:

• C1: assesses the value of a subset of variables by taking into account the predictive ability of each variable together with the degree of redundancy between them. Subsets of variables which are highly correlated with the class but weakly correlated with each other are preferred.

• C2: assesses the value of a variable by calculating the value of the Chi-squared statistic with respect to the class, obtaining the level of correlation between the class and each attribute.

• C3: assesses the value of a subset of variables through the level of consistency of the values of the class when the training instances are projected onto the subset of variables.

• C4: assesses the value of a variable by measuring the gain ratio with respect to the class.

• C5: assesses the value of a variable by measuring the information gain with respect to the class.

Search methods to carry out the search of the set of variables:

• S1: performs an exhaustive search beginning with the empty subset of variables and returning the best subset found. It lists all the possibilities and evaluates them in order to select the best.

• S2: establishes a ranking by evaluating each one of the variables using different criteria. The first five variables in the established ranking are those that will be used.
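As an illustration of how a single-variable criterion combines with a ranking search, the following minimal sketch scores each attribute by information gain (criterion C5) and keeps the best-ranked ones (method S2). The function names, the toy records and the choice of `top_n=2` are assumptions made for this example; the paper keeps the first five variables and does not publish its implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Entropy of the class minus its expected entropy after splitting on `attribute`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def rank_attributes(rows, labels, top_n):
    """Score every attribute on its own and keep the `top_n` best (ranking search)."""
    scores = {a: information_gain(rows, labels, a) for a in rows[0]}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical discretized toy data, not taken from the paper.
rows = [
    {"VC": "high", "W": "wide", "LE": "short"},
    {"VC": "high", "W": "wide", "LE": "long"},
    {"VC": "high", "W": "narrow", "LE": "short"},
    {"VC": "low", "W": "narrow", "LE": "long"},
    {"VC": "low", "W": "narrow", "LE": "short"},
    {"VC": "low", "W": "narrow", "LE": "long"},
]
labels = ["yes", "yes", "yes", "no", "no", "no"]

print(rank_attributes(rows, labels, top_n=2))
```

In this toy set VC separates the classes perfectly, so it is ranked first, followed by W.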

The variables selected through the evaluation criteria and search methods previously described are presented in Table 4. Some of these combinations, such as the pairs formed by (C2, S2), (C4, S2) and (C5, S2), provide the same final subset of variables. We obtained three different subsets of variables after performing the selection.

The results obtained by applying the machine learning algorithms, after having previously selected the variables, are shown in Table 5.

Once again a paired t test was performed to determine whether the mean difference between the results obtained with the three subsets of variables was significant. The test was applied to the different pairs of subsets of variables, and the result showed that the differences were not statistically significant. There was not enough evidence to suppose that a specific subset of variables provides better results in the algorithms used.

However, at a glance and using the mean of the algorithms, it could be thought that ID3, k-nearest neighbour (k-NN; k=1) and Prism are the three algorithms with the best success rate, along with Naïve Bayes. Bearing in mind that three of the four algorithms mentioned obtained these results with the subset of variables (Drill, VC, LE, W, H and Min), it was decided to apply a combination of these three classifiers with this subset in order to improve the results.

Table 3 Results of the discretization of variables

| Type of classification | Algorithm | Mean value (%) | Standard deviation | FP rate |
|---|---|---|---|---|
| Classification trees | J48 | 92.19 | 1.17 | 0.0592 |
| | ID3 | 94.95 | 1.62 | 0.0374 |
| Induction rules | JRip | 89.24 | 1.56 | 0.0853 |
| | Prism | 95.81* | 1.12 | 0.0072 |
| Distance based techniques | k-NN (k=1) | 95.43 | 1.67 | 0.036 |
| | k-NN (k=3) | 93.43 | 0.30 | 0.058 |
| Techniques based on probabilities | Naïve Bayes | 94 | 0.78 | 0.055 |
| Statistical techniques | Logistic | 94 | 1.68 | 0.039 |

* The ten algorithms with a success rate over 96% or very close to it.

Table 4 Variables selected with the different criteria and search methods

| Criterion | Search method | Variables |
|---|---|---|
| C1 | S1 | VC, LE, W, Min |
| C2 | S2 | VC, W, H, Min, AV1 |
| C3 | S1 | Drill, VC, LE, W, H, Min |




7.1.4 Combination of classifiers

In the combination of algorithms, two approaches are used:

• Combination of ID3, k-NN (k=1) and Prism together, through the stacking and vote schemes.

• Combination of ID3, k-NN (k=1) and Prism individually, through the boosting and bagging algorithms.
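To make the vote scheme concrete, here is a minimal sketch of majority voting over the outputs of three base classifiers. Majority voting is only one common combination rule and the paper does not state which rule was configured; the function name, the tie-breaking choice and the example predictions are assumptions of this sketch.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by most base classifiers.

    Ties are broken in favour of 'yes' (burr present), a conservative choice
    made for this sketch only; it is not taken from the paper.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    winners = [label for label, n in counts.items() if n == best]
    return "yes" if "yes" in winners else winners[0]


# Hypothetical outputs of three already-trained base classifiers for four holes.
id3_out = ["yes", "no", "no", "yes"]
knn_out = ["yes", "no", "yes", "no"]
prism_out = ["no", "no", "yes", "yes"]

combined = [majority_vote(votes) for votes in zip(id3_out, knn_out, prism_out)]
print(combined)
```

Stacking differs in that a further classifier is trained on the base outputs, while boosting and bagging build several variants of a single base algorithm.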

The results achieved are shown in Table 6. Based on the mean of the evaluation criteria of the algorithms, it could be said that vote and boosting (applied to k-NN (k=1) and to Prism) give the highest success rates. However, if one of them had to be chosen, it would be vote because, apart from providing the highest success rate (96.76%), it also has a lower FN rate than the others.

The FN rate=0.0072 means that, on average, only 0.72% of the cases which belong to the ‘yes’ category are wrongly classified. The algorithm wrongly classifies only between 0 and 1 of the cases which should belong to the ‘yes’ category. The rest of the wrongly

Table 5 Results with the selection of variables

| Selected variables | Algorithm | Mean value (%) | Standard deviation | FP rate |
|---|---|---|---|---|
| VC, LE, W, Min | J48 | 94.95 | 2.20 | 0.0405 |
| | ID3 | 95.52 | 1.56 | 0.0333 |
| | JRip | 90 | 1.12 | 0.0983 |
| | Prism | 91.14 | 0.46 | 0 |
| | k-NN (k=1) | 92.19 | 1.54 | 0.0577 |
| | k-NN (k=3) | 94.95 | 0.90 | 0.0333 |
| | Naïve Bayes | 96.19* | 0 | 0.029 |
| | Logistic | 95.52 | 1.10 | 0.0391 |
| VC, W, H, Min, AV1 | J48 | 94 | 1.96 | 0.0518 |
| | ID3 | 94.29 | 0.90 | 0.0416 |
| | JRip | 88.76 | 1.08 | 0.11 |
| | Prism | 92.57 | 1.33 | 0 |
| | k-NN (k=1) | 93.05 | 1.68 | 0.0564 |
| | k-NN (k=3) | 94.48 | 1.89 | 0.0434 |
| | Naïve Bayes | 95.24 | 0 | 0.043 |
| | Logistic | 93.24 | 1.14 | 0.052 |
| Drill, VC, LE, W, H, Min | J48 | 92.48 | 1.14 | 0.0549 |
| | ID3 | 96.1* | 1.65 | 0.0198 |
| | JRip | 90.1 | 1.20 | 0.0926 |
| | Prism | 95.90* | 1.19 | 0.0072 |
| | k-NN (k=1) | 96.29* | 1.82 | 0.02 |
| | k-NN (k=3) | 93.71 | 0.49 | 0.058 |
| | Naïve Bayes | 94.29 | 0 | 0.058 |
| | Logistic | 95.14 | 1.14 | 0.0302 |

Table 6 Results with classifier combination

| Type of combination | Algorithms | Mean value (%) | Standard deviation | FP rate |
|---|---|---|---|---|
| Classifiers of different types | | | | |
| Stacking | KNN(k=1), ID3, Prism | 95.24 | 1.42 | 0.0287 |
| Vote | KNN(k=1), ID3, Prism | 96.76* | 1.57 | 0.0072 |
| Classifiers of the same type | | | | |
| Boosting | KNN(k=1) | 96.48* | 1.56 | 0.02 |
| | ID3 | 95.33 | 1.31 | 0.0316 |
| | Prism | 96.38* | 1.17 | 0.0043 |
| Bagging | KNN(k=1) | 96.1* | 1.82 | 0.0215 |
| | ID3 | 95.62 | 1.81 | 0.0302 |
| | Prism | 95.9* | 1.19 | 0.0072 |


classified cases, namely the remaining 2.52%, are holes which do not have burr in excess of the permitted limit but are classified as though they do. These cases are less important: deburring is carried out on them even though it is not necessary, but they do not carry any associated risk.
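The 2.52% figure appears to be the total error of the vote scheme (100 − 96.76 = 3.24%) minus the 0.72% attributed to missed burr. The sketch below shows how the success rate and the two error rates are usually derived from a binary confusion matrix; the function name and the counts are hypothetical and are not the paper's data.

```python
def classification_rates(tp, fn, fp, tn):
    """Success, FN and FP rates from the four cells of a binary confusion matrix.

    'Positive' means the hole has burr above the permitted limit ('yes').
    """
    total = tp + fn + fp + tn
    return {
        "success_rate": (tp + tn) / total,
        "fn_rate": fn / (tp + fn),  # burr holes missed by the model
        "fp_rate": fp / (fp + tn),  # good holes sent to deburring needlessly
    }


# Hypothetical counts chosen only for illustration (not the paper's data).
rates = classification_rates(tp=69, fn=1, fp=3, tn=27)
print({name: round(value, 4) for name, value in rates.items()})
```

A false negative leaves excessive burr undetected, which is why the FN rate weighs more than the overall success rate when choosing a model.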

7.1.5 Results

Given the number of experiments carried out, it is difficult to determine which of the algorithms presented is the best. Ten algorithms were selected, marked with an asterisk (*), with a success rate over 96% or very close to it.

In order to compare the results of the ten algorithms, a single-factor ANOVA was applied. The p value indicates whether the experimental statistic falls in the acceptance region or not: the higher the p value, the more confident we can be in accepting H0, which states that the population means of the ten repetitions of the tenfold cross-validation are the same. In this case, p value=0.91, which means that significant differences between the averages would be accepted only for α>0.91. Therefore, with a confidence level of 95%, there is not enough evidence in the data to reject the equality of the success rates of the ten algorithms evaluated.
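The single-factor ANOVA compares the variation between the group means with the variation inside each group. A minimal sketch of the F statistic follows; the function name and the success rates are hypothetical (three models with four repetitions each, rather than the ten models with ten repetitions used in the paper).

```python
import statistics


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical success rates (%) of three models over four repetitions each.
runs = [
    [95.0, 96.0, 97.0, 96.0],
    [96.0, 97.0, 96.0, 97.0],
    [95.0, 96.0, 96.0, 97.0],
]
f_stat, df_b, df_w = one_way_anova_f(runs)
# The 5% critical value of F with (2, 9) degrees of freedom is about 4.26.
print(f"F = {f_stat:.3f} with ({df_b}, {df_w}) degrees of freedom")
```

An F value well below the critical value, as in this toy case, gives no grounds to reject the equality of the means.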

There are no significant differences in the results of the models. Accordingly, another type of criterion was established for selecting the final model. The combinations of algorithms were discarded because they could be more difficult to integrate in the CNC of the machine; ID3 and k-NN (k=1) with the subset of variables (Drill, VC, LE, W, H, Min) were discarded because of their high standard deviation; and Prism applied to the whole initial set of variables was discarded in order to reduce processing time.

Finally, it was decided that the most efficient algorithms to solve the problem were Naïve Bayes with the (VC, LE, W and Min) subset, and Prism with (Drill, VC, LE, W, H and Min). Naïve Bayes gives a success rate of 96.19% and a practically null standard deviation, and its FN rate is low. However, Prism has a much lower FN rate; this is a factor that we must take into consideration, bearing in mind the consequences of these cases. Its success rate is 95.9% with a standard deviation of 1.19 and, a priori, it would be simple to implement.

The success rate provided by several of the aforementioned algorithms exceeds the success rate of the models presented in Peña et al. [24]. However, in order to automate the process, the algorithm must allow the detection of those holes which have burr in excess of the permitted limit. So far the only algorithm which fulfils this requisite is Prism with the subset of variables made up of (VC, W, H, Min and AV1). This algorithm gives a success rate of 92.57%, the same as that of Peña et al. [24], and a standard deviation of 1.33. Its implementation in the CNC of the machine would be easy and the model would be made up of the set of rules shown in Table 7.

However, given that Naïve Bayes has a high success rate and a null standard deviation, it was decided to evaluate a Bayesian network as a second approach to the problem.

7.2 Bayesian network

This algorithm is based on Bayes' theorem and has a series of characteristics which are interesting to explore, among them the ability to adapt when confronted with new input data, which, in this particular application, would provide added value to the final system. The algorithm learns the structure of the network using an evaluation function or metric together with a search method (score and search). The metric used to determine the quality of the structure is based on a Bayesian statistic. The search method used was K2, which is based on the hill climbing algorithm and restricts the search by means of an ordering of the variables. As regards the learning of the parameters, the conditional probabilities of the network were estimated directly from the data using a simple estimator.
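Once the structure is fixed, the parameter learning step reduces to counting. The following minimal sketch estimates the conditional probability table of one node given its parents from discrete records; the function name, the optional smoothing constant `alpha`, the parent set and the sample records are all assumptions of this example and are not reported in the paper.

```python
from collections import Counter, defaultdict


def estimate_cpt(rows, child, parents, alpha=0.0):
    """Estimate P(child | parents) from discrete data by counting.

    `alpha` is an optional smoothing constant added to every count; alpha=0
    gives plain relative frequencies. Its use here is an assumption of this
    sketch, not a setting reported in the paper.
    """
    child_values = sorted({row[child] for row in rows})
    counts = defaultdict(Counter)
    for row in rows:
        parent_config = tuple(row[p] for p in parents)
        counts[parent_config][row[child]] += 1
    cpt = {}
    for parent_config, counter in counts.items():
        total = sum(counter.values()) + alpha * len(child_values)
        cpt[parent_config] = {
            value: (counter[value] + alpha) / total for value in child_values
        }
    return cpt


# Hypothetical discretized records; variable names follow the paper,
# but the values and the parent set are made up for illustration.
data = [
    {"Drill": "SR", "VC": "high", "Burr": "yes"},
    {"Drill": "SR", "VC": "high", "Burr": "yes"},
    {"Drill": "SR", "VC": "high", "Burr": "no"},
    {"Drill": "Hard", "VC": "low", "Burr": "no"},
]
cpt = estimate_cpt(data, child="Burr", parents=["Drill", "VC"])
print(cpt[("SR", "high")])
```

With `alpha=0` a parent configuration seen only once yields probabilities of exactly 0 and 1, which is why a small positive smoothing constant is often preferred on small datasets.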

The following subsections follow the same order as the previous section. Results are reported for the algorithm with and without the transformation of the dataset (discretization), followed by a selection of variables.

7.2.1 Data without transformation

The success rate achieved by the algorithm was 92.48%, with a standard deviation of 0.94 and an FN rate of 0.0417.

Table 7 Set of rules

1. If W=‘(7.254−inf)’ then yes

2. If Min=‘(0.0825−inf)’ and H=‘(−inf−1.1065)’ and VC=‘(112.5−inf)’ then yes

3. If Min=‘(0.0825−inf)’ and VC=‘(112.5−inf)’ and AV1=‘(0.175−inf)’ and H=‘(1.1065−inf)’ and W=‘(−inf−7.254)’ then yes

4. If AV1=‘(0.175−inf)’ and Min=‘(0.0325−0.0825)’ and VC=‘(112.5−inf)’ and H=‘(−inf−1.1065)’ and W=‘(−inf−7.254)’ then yes

5. If VC=‘(−inf−112.5)’ then no

6. If Min=‘(−inf−0.0325)’ then no

7. If H=‘(−inf−1.1065)’ and Min=‘(0.0325−0.0825)’ and AV1=‘(−inf−0.175)’ then no

8. If W=‘(−inf−7.254)’ and Min=‘(0.0325−0.0825)’ and VC=‘(112.5−inf)’ and AV1=‘(0.175−inf)’ and H=‘(−inf−1.1065)’ then no

9. If W=‘(−inf−7.254)’ and H=‘(1.1065−inf)’ and VC=‘(112.5−inf)’ and AV1=‘(0.175−inf)’ and Min=‘(0.0825−inf)’ then no
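The rule set in Table 7 translates almost directly into code. The sketch below is one possible reading: it assumes that each interval excludes its lower bound and includes its upper bound, and that the rules are evaluated in order with the first match deciding the class. Neither convention is stated in the paper, and the function name `classify_hole` is illustrative.

```python
def classify_hole(vc, av1, minimum, h, w):
    """Apply the Table 7 rules in order; return 'yes', 'no' or None.

    Assumption of this sketch: each interval '(a-b)' excludes a and includes b,
    and the first matching rule decides the class.
    """
    vc_high = vc > 112.5
    av1_high = av1 > 0.175
    h_high = h > 1.1065
    w_high = w > 7.254
    min_low = minimum <= 0.0325
    min_mid = 0.0325 < minimum <= 0.0825
    min_high = minimum > 0.0825

    if w_high:  # rule 1
        return "yes"
    if min_high and not h_high and vc_high:  # rule 2
        return "yes"
    if min_high and vc_high and av1_high and h_high and not w_high:  # rule 3
        return "yes"
    if av1_high and min_mid and vc_high and not h_high and not w_high:  # rule 4
        return "yes"
    if not vc_high:  # rule 5
        return "no"
    if min_low:  # rule 6
        return "no"
    if not h_high and min_mid and not av1_high:  # rule 7
        return "no"
    # Rules 8 and 9 repeat the conditions of rules 4 and 3 with class 'no',
    # so under first-match evaluation they can never fire.
    return None  # no rule covers this combination


# First row of Table 1: VC=150, AV1=0.3, Min=0.11, H=2.09, W=10.44 (Burr: Yes).
print(classify_hole(vc=150, av1=0.3, minimum=0.11, h=2.09, w=10.44))
```

Under this ordering any conflict between a ‘yes’ rule and a ‘no’ rule is resolved in favour of ‘yes’, which errs on the side of deburring.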


7.2.2 Discretization of the variables

The result of applying discretization to the data was a success rate of 95.24%, with a standard deviation of 0.77 and an FN rate of 0.0184. The algorithm improves with a transformation of the dataset.

7.2.3 Selection of variables

The same selection of variables was performed as in the first approach and the results achieved are those shown in Table 8.

The FN rate is still positive and, although it is very low in the case of the subset formed by (Drill, VC, LE, W, H and Min), it is still unacceptable. At this point of the research, it was decided to extend the variable selection methods with two more evaluation criteria:

• C6: evaluates the subsets of attributes on the training data using the classifier.

• C7: evaluates the subsets of attributes using the classifier and employing cross-validation to estimate the fitness of the learning scheme for each subset.
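A wrapper criterion such as C7, combined with the exhaustive search S1, can be sketched as follows. To keep the example short, the Bayesian network is replaced here by a toy 1-nearest-neighbour classifier on discrete values, and the folds are assigned by index without shuffling; both choices, as well as the function names and the toy records, are assumptions of this sketch and not the paper's setup.

```python
from itertools import combinations


def one_nn_predict(train_rows, train_labels, row, features):
    """1-nearest-neighbour on discrete features using Hamming distance."""
    def distance(other):
        return sum(other[f] != row[f] for f in features)
    best = min(range(len(train_rows)), key=lambda i: distance(train_rows[i]))
    return train_labels[best]


def cv_accuracy(rows, labels, features, folds):
    """Cross-validated accuracy of the classifier restricted to `features`."""
    hits = 0
    for fold in range(folds):
        test_idx = [i for i in range(len(rows)) if i % folds == fold]
        train_idx = [i for i in range(len(rows)) if i % folds != fold]
        train_rows = [rows[i] for i in train_idx]
        train_labels = [labels[i] for i in train_idx]
        hits += sum(
            one_nn_predict(train_rows, train_labels, rows[i], features) == labels[i]
            for i in test_idx
        )
    return hits / len(rows)


def exhaustive_wrapper(rows, labels, candidates, folds=3):
    """Score every non-empty subset and return the best (smallest wins ties)."""
    best_subset, best_score = None, -1.0
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            score = cv_accuracy(rows, labels, subset, folds)
            if score > best_score:
                best_subset, best_score = subset, score
    return best_subset, best_score


# Hypothetical toy data in which the class depends on VC only.
toy_rows = [
    {"VC": "high", "LE": "short"}, {"VC": "low", "LE": "short"},
    {"VC": "high", "LE": "long"}, {"VC": "low", "LE": "long"},
    {"VC": "high", "LE": "short"}, {"VC": "low", "LE": "long"},
]
toy_labels = ["yes", "no", "yes", "no", "yes", "no"]
best = exhaustive_wrapper(toy_rows, toy_labels, ["VC", "LE"])
print(best)
```

The exhaustive search evaluates every non-empty subset, so its cost doubles with each additional candidate variable; it is affordable here only because the number of variables is small.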

The results of the Bayesian network applying the aforementioned criteria together with the S1 search method are shown in Table 9.

The pair (C6, S1) provides a subset of variables which had previously been evaluated. However, the pair (C7, S1) provides a different subset which has a higher success rate, a lower standard deviation and an FN rate of zero. The structure of this Bayesian network is represented in Fig. 4. Two of the parameters extracted from the internal signal of the machine (H and Min), along with the bit type (Drill) and the cutting speed (VC), are those which determine whether the burr which is going to be generated will be above the permitted limit. The network shows the relations which exist between the parameters and the fact that the burr exceeds 127 μm.

7.3 Evaluation

The evaluation has two main objectives: to estimate the real error rate of the prediction and to select a model from among two or more candidates.

In order to estimate the error rate of each model, the average of the ten rates obtained from ten executions of the tenfold cross-validation was calculated (each execution with a different seed to add noise).
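The repeated cross-validation procedure can be sketched as below. The function names are illustrative, and `evaluate_fold` is a hypothetical callback standing in for training and testing a real model; the dummy used in the example simply pretends that every tenth sample is misclassified.

```python
import random
import statistics


def kfold_indices(n_samples, n_folds, seed):
    """Shuffle the sample indices with `seed` and deal them into `n_folds` folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[fold::n_folds] for fold in range(n_folds)]


def repeated_cv_error(n_samples, evaluate_fold, n_folds=10, repeats=10):
    """Mean and standard deviation of the error rate over `repeats` runs of k-fold CV.

    `evaluate_fold(train_idx, test_idx)` is a caller-supplied (hypothetical)
    function that trains a model and returns the number of test errors.
    """
    run_errors = []
    for seed in range(repeats):  # a different seed for every repetition
        errors = 0
        folds = kfold_indices(n_samples, n_folds, seed)
        for fold in folds:
            held_out = set(fold)
            train_idx = [i for i in range(n_samples) if i not in held_out]
            errors += evaluate_fold(train_idx, fold)
        run_errors.append(errors / n_samples)
    return statistics.mean(run_errors), statistics.stdev(run_errors)


def dummy_evaluate(train_idx, test_idx):
    # Stand-in for a real model: pretends every tenth sample is misclassified.
    return sum(1 for i in test_idx if i % 10 == 0)


mean_err, std_err = repeated_cv_error(100, dummy_evaluate)
print(f"error rate = {mean_err:.3f} +/- {std_err:.3f}")
```

The standard deviation across the repetitions is what the result tables report next to each mean value.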

The comparison among models was made by means of the error rates and statistical tests, such as the paired t test or ANOVA with the Scheffé method, to compare two models or more than two models, respectively, on the same data set.

8 Conclusions and future works

The drilling process is one of the conventional mechanised procedures when working with metallic materials in aeronautics. Moreover, it is the most frequent and complex process, and one of the fundamental operations in the riveting assembly process, since it precedes riveting.

The main requirements of the aeronautic industry are to reduce costs and increase productivity, while guaranteeing quality and safety. In order to achieve these aims in the drilling process, production time is being reduced by using high-speed mechanised techniques and by eliminating lubrication to avoid having to clean the structures. Also, good-quality holes with no burr are sought so as not to have to disassemble and deburr the parts before riveting. The challenge is to achieve quality holes in materials such as aluminium, in high productivity systems and in the absence of lubricants.

Table 8 Results of the variables selection - Bayesian network

| Selected variables | Mean value (%) | Standard deviation | FP rate |
|---|---|---|---|
| VC, LE, W, Min | 94.67 | 1.02 | 0.0376 |
| VC, W, H, Min, AV1 | 95.43 | 0.4 | 0.0402 |
| Drill, VC, LE, W, H, Min | 96 | 0.88 | 0.0099 |

Table 9 Results of the variable selection II - Bayesian network

| Criteria | Search method | Selected variables | Mean value (%) | Standard deviation | FP rate |
|---|---|---|---|---|---|
| C6 | S1 | Drill, VC, LE, H, W, Min | 96 | 0.88 | 0.0099 |
| C7 | S1 | Drill, VC, H, Min | 96.95 | 0.4 | 0 |

Fig. 4 Bayesian network structure


The drilling process needs to be optimised and automated so as to eliminate a series of unnecessary tasks, which would reduce economic costs. Nowadays, drilling is carried out manually and unwanted burr is generated in the holes due to the fact that both the cut and the extrusion are combined in the centre of the tool. The subsequent deburring process is also carried out manually, through visual inspection of each hole and elimination of the burr where necessary.

Most of the research carried out to reduce burr focuses on dynamic studies of the process and simulation models which try to categorise the type of burr and its behaviour. However, the use of other techniques such as those provided by data mining and artificial intelligence is still not very common, even though these techniques are opening new paths in sectors which, until now, have been stagnant and closed to new working methods. In this sense, the study verifies that these supervised classification, machine learning and statistical techniques are efficient and effective for solving certain types of problems, whatever the sector and/or source application. These techniques are simple and easy to understand. It is necessary to extend their use or, at least, to explore and evaluate the results which can be achieved in other areas where these techniques are currently unknown.

The cost efficiency and flexibility which robotic solutions provide are sought after by the aeronautic industry. Automating the drilling process would involve implementing the model in the CNC system. However, this implementation would not be valid or viable if the model did not guarantee a null FN rate, since it would then guarantee neither the quality of the final piece nor the safety associated with it.

Two of the models presented in this paper guarantee an FN rate equal to zero. Thus, once implemented in the CNC of the machine, they would allow the elimination of the disassembly of the piece and of manual deburring, since the software would determine, in real time, whether there is burr in excess of the permitted limit, in which case it would redirect the mechanical arm of the mechanised centre to carry out automatic deburring. Moreover, the Bayesian network has a high reliability rate of 96.95%.

The CNC of a machine usually consists of two parts. One of these is fast enough to carry out the interpolation calculations for the movement, and the other is equivalent to a PC. Most existing numeric controls are PC-based, so there should be no problem in inserting the type of model obtained in this work (although this depends to a large extent on the manufacturer and on how flexible the control is if it needs to communicate with the CNC). In general, however, they allow the installation of additional software, with C being one of the most used languages. Consequently, either of the two models obtained could be implemented without too much difficulty.

The models obtained are simple, intuitive and easy to interpret by anyone, without needing to be an expert in the field. Moreover, they are self-explanatory, making it possible to establish, without any deep knowledge of the process and its physics, the relation between certain parameters or process conditions and the burr generated. They allow the extraction of knowledge, often new and previously unavailable, about the relations between the parameters of the process. This is the case of the Bayesian network, which suggests that there could be a relation between the minimum of the signal and the type of bit, the cutting speed and the burr, and that this parameter is also related to the height. They constitute a good tool compared with others in which previous expertise is fundamental in order to handle and use them.

The models obtained through machine learning could take into account the different conditions under which the machine can be operated, and in some cases the ideal conditions can be derived from the model. The models make it possible to evaluate, for example, under which conditions the permitted limit of burr is never exceeded.

Finally, to sum up, the models obtained and their implementation in the CNC of the machine provide the aeronautic industry with:

• Automation of the drilling process, which implies a reduction in costs and an increase in productivity.

• Assured quality of the final pieces and the associated safety.

• Extraction of new knowledge about the process: possible connections between the parameters of the process and even the exploration of the optimum working conditions.

• New analysis techniques in these work areas.

References

1. Apolloni B, Ghosh A, Jain FALC, Patnaik S (2005) Machine learning and robot perception. Springer, Heidelberg

2. Berthold M, Hand DJ (2003) Intelligent data analysis. Springer, Berlin Heidelberg

3. Bukkapatnam STS, Kumara SRR, Lakhtakia A (1999) Analysis of acoustic emission signals in machining. J Manuf Sci Eng 121(4):568–571

4. Bukkapatnam STS, Kumara SRR, Lakhtakia A (2000) Fractal estimation of flank wear in turning. J Dyn Syst Meas Control 122:89–94

5. Chang SF, Simon SF, Bone GM (2010) Burr height model for vibration assisted drilling of aluminum 6061-T6. Precis Eng 34:369–375

6. Connell J, Mahadevan S (1993) Robot Learning. Kluwer Academic, Boston, MA

7. Correa M, Bielza C, de Ramírez MJ, Alique JR (2008) A Bayesian network model for surface roughness prediction in the machining process. Int J Syst Sci 39(12):1181–1192


8. Correa M, Bielza C, Pamies-Teixeira J (2009) Comparison of Bayesian networks and artificial neural networks for quality detection in a machining process. Expert Syst Appl 36:7270–7279

9. Ferreiro S, Arnaiz A (2010) Improving aircraft maintenance with innovative prognostics and health management techniques. Case of study: brake wear degradation. 2nd International Conference and Artificial intelligence, pp 568–575

10. Gaitonde VN, Karnik SR, Achyutha BT, Siddeswarappa B (2008) Genetic algorithm-based burr size minimization in drilling of AISI 316 L stainless steel. J Mater Process Technol 197(1–3):225–236

11. Gaitonde VN, Karnik SR, Achyutha BT, Siddeswarappa B (2008) Taguchi optimization in drilling of AISI 316 L stainless steel to minimize burr size using multiperformance objective based on membership function. J Mater Process Technol 202(1–3):374–379

12. Hambli R (2002) Prediction of burr height formation in blanking processes using neural network. Int J Mech Sci 44:2089–2102

13. Heisel U, Luik M, Eisseler R, Schaal M (2005) Prediction of parameters for the burr dimensions in short-hole drilling. Annals of the CIRP 54(1):79–82

14. How T, Liu W, Lin L (2003) Intelligent remote monitoring and diagnosis of manufacturing processes using an integrated approach of neural networks and rough sets. J Intell Manuf 14:239–253

15. Kamarthi SV, Kumara SRT, Cohen PH (2000) Flank wear estimation in turning through wavelet representation of acoustic emission signals. J Manuf Sci Eng 122:12–19

16. Kim J, Min S, Dornfeld DA (2000) Optimization and control drilling burr formation of AISI 304 L and AISI 4118 based on drilling burr control charts. Int J Mach Tool Manufact 41:923–936

17. Lauderbaugh LK (2009) Analysis of the effects of process parameters on exit burrs in drilling using a combined simulation and experimental approach. J Mater Process Technol 209(4):1909–1919

18. Malakooti B, Raman V (2000) An interactive multi-objective artificial neural network approach for machine setup optimization. J Intell Manuf 11:41–50

19. Melvin DG, Niranjan M, Prager RW, Trull AK, Hughes VF (2000) Neurocomputing versus linear statistical techniques applied to liver transplant monitoring: a comparative study. IEEE Trans Biomed Eng 47:1036–1043

20. Michalski RS, Bratko I, Kubat M (1998) Machine learning and data mining: methods and applications. Wiley, New York

21. Min S, Kim J, Dornfeld DA (2001) Development of a drilling burr control chart for low alloy steel AISI 4118. J Mater Process Technol 113:4–9

22. Mitchell TM (1997) Machine learning. McGraw Hill, New York

23. Nieves J, Santos I, Penya YK, Rojas S, Salazar M, Bringas PG (2009) Mechanical properties prediction in high-precision foundry production. In: Proceedings of the 7th IEEE International Conference on Industrial Informatics, pp 31–36

24. Peña B, Aramendi G, Rivero MA (2007) Method for monitoring burr formation in processes involving the drilling of parts. Spain, patent application, WO2007/065959A1

25. Pittner S, Kamarthi SV (2002) Feature extraction from wavelet coefficients for pattern recognition tasks. Neural Network 1997(3):1484–1489

26. Rangwala SS, Dornfeld DA (2002) Learning and optimization of machining operations using computing abilities of neural networks. IEEE Transactions on Systems, Man and Cybernetics 19(2):299–314

27. Russell S, Norvig P (1995) Artificial intelligence: a modern approach. Prentice Hall, Englewood Cliffs, NJ

28. Sick B (2002) On-line and indirect tool wear monitoring in turning with artificial neural networks: a review of more than a decade of research. Mech Syst Signal Process 16(4):487–546

29. Umbrello D, Ambrogio G, Filice L, Guerriero F, Guido R (2009) A clustering approach for determining the optimal process parameters in cutting. J Intell Manuf 21(6):787–795

30. Wang K (2007) Applying data mining to manufacturing: the nature and implications. J Intell Manuf 18(4):487–495

31. Witten IH, Frank E (2000) Data mining: practical machine learning tools and techniques with Java implementations. Morgan Kaufmann, San Francisco

32. Wolberg WH, Street WN, Mangasarian OL (1994) Machine learning techniques to diagnose breast cancer from fine-needle aspirates. Cancer Letters 77:163–171

