feedforward neural networks for fault diagnosis and severity assessment of a screw compressor

Mechanical Systems and Signal Processing (1995) 9(5), 485–496

FEEDFORWARD NEURAL NETWORKS FOR FAULT DIAGNOSIS AND SEVERITY ASSESSMENT OF A SCREW COMPRESSOR

T K

Department of Mechanical Engineering, Columbia University, New York, NY 10027, U.S.A.

C. J L

Department of Mechanical Engineering, Aeronautical Engineering and Mechanics, Rensselaer Polytechnic Institute, Troy, NY, U.S.A.

(Received 14 September 1993, accepted 2 May 1994)

Using feedforward neural networks (FNNs), a fault diagnosis and severity assessment (FDIA) scheme for a screw compressor has been established. This FDIA method consists of non-linear model identification of the compressor and pattern classification of the parameters of identified models corresponding to the various faulty conditions. First, a non-linear input/output model is identified in the form of an FNN. Then an FNN classifies the FNN model into one of the possible faults or the baseline condition. If the model is classified as a faulty one, another FNN is used to assess the severity of the fault. A fully automatic structure and weight learning algorithm for FNNs is utilised to identify an FNN non-linear model from operating data of the compressor. To establish training data for the classifiers, measurements of motor current and shaft speed were made under baseline conditions and faulty conditions such as various extents of gaterotor wear and increased rolling friction. Experimental results show that the scheme is capable of not only diagnosing the faults but also assessing the magnitude of faults.

© 1995 Academic Press Limited

1. INTRODUCTION

Mechanical equipment is becoming increasingly complicated in order to provide more functionality and flexibility. Concurrently, the pressure to reduce the cost of maintaining the equipment will necessitate reduced manning and staff levels for maintenance. These two driving forces work in opposition to each other, and a new method of maintenance, other than a planned maintenance system (PMS) for the upkeep of machinery, must be developed.

Condition based maintenance (CBM) has been deemed a viable alternative to PMS. In a fielded system, an effective CBM system detects faults in a machine well before the failure stage. Thus, monitoring systems must be in place to measure variables of interest, and diagnostic algorithms must be available to convert the data obtained from the monitoring system into variables which signify the state of the machine.

In the area of mechanical diagnostics, most work has concentrated on developing precursors for ‘stationary’ faults that generate distinct frequency tones in the spectrum, such as imbalances and bent shafts. The current state of the art does not provide us with the capability to detect and classify faults that alter the dynamics of the system, such as stiffness changes of a spring in a suspension system. We must have the capability to provide a sound assessment of faults of this kind to establish the present condition of the machine.

Basic research of the past decade has derived potentially promising new methods for detecting faults that alter the dynamics of the system. Analytic modeling identifies the change in dynamics due to a fault from the observations of operational data of a mechanical system.

A modeling approach to mechanical diagnostics has much to offer. In its most robust form, the model would allow the user to input symptoms and use the model to determine the fault or faults associated with those symptoms, and vice versa. The ability to infer symptoms associated with a given fault would allow selection of necessary sensors at the earliest stage of design without the need for extensive testing programs to produce a fault data base. By comparing sensor outputs on a continuous basis with the output of a template model that has been identified through on-line self-learning, changes in system state associated with a fault can be detected without prior knowledge about the fault. This capacity would allow the detection of new faults of which we have no prior experience.

These models initially take the form of linear ordinary differential and difference equations. Thus far, the utility of the linear modeling technique has been established for fault detection and isolation. For a single screw rotary compressor, linear models were successful in enabling limited fault detection without prior knowledge of the failure modes, and fault isolation [1]. However, it was found that the operating range exceeded the linear range of the model, and therefore practical non-linear modeling techniques are needed to detect, isolate and assess faults more accurately. In comparison with linear models, non-linear models provide the complexity and computational power necessary to obtain satisfactory solutions to a wider class of dynamic systems and have the potential to yield better diagnostic results because they usually provide a better approximation to the dynamics of a system than a linear model. In addition, it is essential to make the basic model of the machine non-linear, because in that way linear systems are a special case. Otherwise, there is no satisfactory way to generalise linear systems to any broad range of non-linear cases.

This study investigates the utility of non-linear system modeling by feedforward neural networks (FNNs) for more accurate mechanical system fault detection and magnitude assessment. This work also applies robust pattern recognition techniques based on FNNs to translate models identified for a system into state variables that signify the magnitude of faults.

2. STATEMENT OF PROBLEM

A 20 h.p., rotary, low-pressure, single screw air compressor was chosen as the test platform for the development of new machinery diagnostic algorithms. The compressor is a rotating machine which has an air-end consisting of an alloy steel rotor driven by a three-phase induction motor (Fig. 1). The rotor, which has six helical grooves, meshes with two 18-fingered nylon gaterotors. The air-end compresses air by the following means. (1) The induction motor turns the rotor and the rotor drives the gaterotors. (2) Air is trapped in a groove of the rotor by one finger of the gaterotor and the air-end casing and is compressed as the finger moves from one end of the groove to the other. (3) The compressed air is discharged through an opening on the casing of the air-end to the air–oil separator. (4) The oil is separated from the air by a filter in the separator. (5) The hot oil is cooled by a radiator.

The failure statistics study showed that typical failure modes of the air-end are, in descending frequency of occurrence, pressure sensor valve failure, air-end failure such as worn gaterotors and rotor, rotor small-end bearing failure and separator fire. The objective of this study is the diagnosis and damage severity assessment of gaterotor wear and increased rolling friction due to degradation of the small-end bearing.

Figure 1. Sectional view of compressor.

3. PROPOSED SOLUTION

The proposed fault diagnosis and severity assessment (FDIA) system is non-linear model-based. It identifies a non-linear model in the form of FNNs from the observations of input, u(t), and output, y(t), of the compressor. Weights of this identified neural network model are used by other FNNs to diagnose and assess the magnitude of faults. The block diagram of the FDIA system is shown in Fig. 2.

3.1.

Neural networks (NNs) have been proposed as solutions for a wide variety of tasks. Researchers have applied NNs to identify unknown non-linear systems [2–5]. Using time domain voice signals as input, a single-layer connectionist NN was developed for voice recognition [6]. Hataoka and Waibel [7] developed integrated NNs in which a number of NNs in the first NN block are trained to cope with voice inputs of different time durations and a single NN in the second block is trained to produce final results. Hoskins et al. [8] and Yamamoto and Venkatasubramanian [9] developed chemical process diagnosis schemes using NNs. NNs for detecting leaky valves of a high pressure air compressor were developed by Li and Yu [10]. Radial basis function networks were applied for classifying process faults and for pattern classification of impulse radar waveforms [11, 12].

Figure 2. Proposed FDIA scheme.

Figure 3. Three-layer neural network.

Since a three-layer NN is universal in the sense that essentially any function can be implemented to any desired degree of accuracy with sufficient hidden neurons [13–15], NNs in this study will be limited to this class of FNNs. Structural learning of this class involves the determination of the number of hidden neurons. A general three-layer FNN is shown in Fig. 3.

The network’s input and output layers consist of linear neurons while its hidden layer consists of non-linear ones. Each of its hidden neurons computes a non-linear function, S, of a weighted sum of network inputs, where W1 and W2 are the input layer and output layer weights, and f is a threshold. When the network is presented with an input vector, I(i): (I1, . . . , IM), it computes an output vector, y(i): (y1, y2, . . . , yk). The functional relationship of this three-layer FNN can be expressed as:

$$y_i(t) = \sum_j W^2_{ji}\, S_j\Big(\sum_k W^1_{kj}\, I_k(t) + f_j\Big). \qquad (1)$$

In short, the above equation can be written as

y(t) = NNM(I(t), u)    (2)

where the parameter vector u (distinct from the compressor input u(t)) consists of W1, W2, and f, and the subscript M stands for model.

To model dynamic systems with FNNs, which are essentially static functions, an autoregressive type NN whose inputs consist of the past history of inputs and outputs is implemented:

I(t) = [y(t−1), y(t−2), . . . , y(t−n), u(t−d), u(t−1−d), . . . , u(t−m−d)]^T

where d is the delay.

NN training involves finding the parameters, u, which include weights and thresholds of the hidden and output layers, to minimize the mismatch between the outputs of NNM and the measured output y(t). The NN training algorithm is elaborated in Section 3.3.
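The autoregressive input vector and the forward pass of equation (1) can be sketched as follows for a single-output network. This is only an illustration: the function names are hypothetical, and the hidden non-linearity S is assumed to be tanh because the text does not name it.

```python
import math


def build_regressor(y, u, t, n, m, d):
    """Assemble I(t) = [y(t-1..t-n), u(t-d), u(t-1-d), ..., u(t-m-d)]."""
    past_outputs = [y[t - k] for k in range(1, n + 1)]
    past_inputs = [u[t - k - d] for k in range(0, m + 1)]
    return past_outputs + past_inputs


def fnn_forward(I, W1, f, W2):
    """Single-output three-layer FNN: sum_j W2[j] * S(sum_k W1[j][k]*I[k] + f[j]).

    S is assumed to be tanh (an assumption, not stated in the paper).
    """
    hidden = [
        math.tanh(sum(w * x for w, x in zip(W1[j], I)) + f[j])
        for j in range(len(W2))
    ]
    return sum(W2[j] * hidden[j] for j in range(len(W2)))


# Toy usage with made-up series: two past outputs, two past inputs, delay 1.
y_series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
u_series = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
I_t = build_regressor(y_series, u_series, t=5, n=2, m=1, d=1)
y_hat = fnn_forward(I_t, W1=[[0.0, 0.0, 0.0, 0.0]], f=[0.5], W2=[2.0])
```

Note that the regressor contains n past outputs and m+1 past inputs, so 18 delayed outputs and 18 delayed inputs correspond to n = 18 and m = 17 in this indexing.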

3.2.

After an FNN model of the compressor is identified, means for interpreting the model have to be available. This FDIA scheme has to classify the condition of the compressor into one of the possible conditions such as baseline, gaterotor wear, etc., and determine the magnitude of the fault. Because FNN models are black box models whose coefficients do not bear an explicit physical relationship with parameters characterising the behaviour of the compressor’s components and subsystems, this relationship has to be learned from examples.


Specifically, these examples contain models labeled with the type and magnitude of the faults under which the models were identified. Let us denote the parameters of the model as u and the condition indicator vector as h, consisting of h1, which labels the class of fault, and h2, which gives the magnitude of fault.

Our objective is to establish an FNN based FDIA scheme such that when a u vector is presented, it will give the corresponding h. Our FDIA adopts a hierarchical structure that uses separate FNNs to perform fault classification and severity assessment. At the higher level, a single FNN classifies the compressor into one of the following conditions: baseline, fault 1, fault 2, etc. Once the higher level has classified the fault, one of the NNs at a lower level estimates the severity of the fault. Let us denote the fault detection and isolation (FDI) NN in the higher level as NNFDI and the fault severity assessment (FSA) NN in the lower level as NNFSA. Mathematically, the desired outputs for the former and latter are h1 and h2, respectively:

h1 = NNFDI(u).    (3)

In the learning process the physical wear value, h2, is provided (for example, the amount of gaterotor wear). For each u of known faulty conditions, the NNs (NNFSA) are trained with the given fault severity h2:

h2 = NNFSA(u).    (4)

One FNN is used for fault detection and isolation and c NNs, one for each fault class, are required for fault severity assessment. Therefore, the total number of FNNs is c + 1 (Fig. 4).

In summary, the FDIA scheme works as follows: (1) The parameter vector, u, of the NNM is estimated from the operating data of the compressor. (2) The vector u is fed to the NNFDI to find h1, i.e. the class of fault. (3) If h1 indicates the existence of a fault, the corresponding NNFSA,i will estimate the magnitude of fault, h2,i.
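The three steps above amount to a two-level dispatch, sketched below. The classifier and severity networks are stand-in callables, all names are hypothetical, and the coding of h1 (0 for baseline, 1 for gaterotor wear, −1 for friction) and the decision boundaries of −0.5 and 0.5 are taken from Section 5.1. How a value lying exactly on a boundary is treated is an assumption.

```python
def diagnose(u_params, nn_fdi, nn_fsa, lower=-0.5, upper=0.5):
    """Hierarchical FDIA: classify with NN_FDI, then assess with one NN_FSA.

    u_params : parameter vector of the identified compressor model (NN_M)
    nn_fdi   : callable returning h1 (about 0 baseline, 1 wear, -1 friction)
    nn_fsa   : dict mapping a fault label to its severity estimator
    """
    h1 = nn_fdi(u_params)
    if lower <= h1 <= upper:
        # Boundary values are treated as baseline here (an assumption).
        return "baseline", None
    fault = "gaterotor wear" if h1 > upper else "friction"
    h2 = nn_fsa[fault](u_params)
    return fault, h2


# Stand-in networks for illustration only; real ones would be trained FNNs.
fake_fdi = lambda p: sum(p)
fake_fsa = {
    "gaterotor wear": lambda p: 0.012,  # hypothetical wear estimate in inches
    "friction": lambda p: 2.0,          # hypothetical friction level
}
result_baseline = diagnose([0.1, 0.1], fake_fdi, fake_fsa)
result_wear = diagnose([0.6, 0.3], fake_fdi, fake_fsa)
result_friction = diagnose([-0.6, -0.3], fake_fdi, fake_fsa)
```

Keeping one severity network per fault class means a new fault type can be added by training one more estimator and extending the classifier's output coding.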

3.3.

To establish FNNs which behave in the way specified by the training examples, one must have efficient NN learning algorithms. NN training has two components. First, an NN that has enough, but not too many, neurons should be adopted. Second, weight values have to be determined so that the output of the network will match the desired output. A recently developed, fully automatic FNN structural and weight learning algorithm is used for finding such an NN.

Figure 4. Second model for FDIA.


The augmentation by training with residuals (ATR) algorithm requires neither initial weight values nor the number of neurons in the hidden layer [2]. The algorithm takes an incremental approach in which a hidden neuron is trained to model the mapping between the input and output of the current exemplars, and is then added to the existing network. The exemplars are then made orthogonal to the newly identified hidden neuron and used for training the next hidden neuron. This refinement continues until a desired accuracy is reached.

To summarise, the algorithm for training a single output NN is described as follows:

Define the exemplar set, $D = [(I(1), y^d_i(1)), \ldots, (I(N), y^d_i(N))]$.

Define the basis as an empty set, $B = \emptyset$.

Define the existing network as a null network.

1. Let d = D.

2. The input layer weights, W1, and the threshold of a single hidden neuron are trained with an NN learning algorithm and the set d to obtain a neuron function.

3. Augment this network into the existing network and add this neuron function into the basis, B.

4. The exemplar set, D, is projected to the new basis by minimising the norm of the error vector, $E = [E(1), \ldots, E(N)] = [(y^d_i(1) - y_i(1)), (y^d_i(2) - y_i(2)), \ldots, (y^d_i(N) - y_i(N))]$, where $y_i(j)$, $1 \le j \le N$, is the output of the expanded NN.

5. Check the stopping criterion such that the norm of the E vector is smaller than a prescribed value. If the criterion is met, stop. Otherwise, continue.

6. Define d = [(I(1), E(1)), . . . , (I(N), E(N))].

7. Go to 2.
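The loop in steps 1 to 7 can be sketched in a few dozen lines. This is a rough illustration rather than the algorithm of [2]: the single-neuron trainer is plain gradient descent on a tanh unit from a seeded random start, the projection in step 4 is a least-squares solve of the normal equations with a tiny ridge term, and all function names and hyperparameters are hypothetical.

```python
import math
import random


def neuron_out(w, f, x):
    return math.tanh(sum(wi * xi for wi, xi in zip(w, x)) + f)


def train_neuron(inputs, targets, epochs=1000, lr=0.05, seed=0):
    """Step 2 (assumed trainer): fit a*tanh(w.x + f) to the targets, keep (w, f)."""
    rng = random.Random(seed)
    dim, n = len(inputs[0]), len(inputs)
    w = [rng.uniform(-0.5, 0.5) for _ in range(dim)]
    f, a = 0.0, 1.0
    for _ in range(epochs):
        gw, gf, ga = [0.0] * dim, 0.0, 0.0
        for x, t in zip(inputs, targets):
            h = neuron_out(w, f, x)
            err = a * h - t
            ga += err * h
            common = err * a * (1.0 - h * h)
            gf += common
            for k in range(dim):
                gw[k] += common * x[k]
        a -= lr * ga / n
        f -= lr * gf / n
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
    return w, f


def least_squares(columns, targets, ridge=1e-9):
    """Step 4: output weights minimising the error norm over the current basis."""
    k = len(columns)
    A = [[sum(p * q for p, q in zip(columns[i], columns[j])) + (ridge if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(p * t for p, t in zip(columns[i], targets)) for i in range(k)]
    for col in range(k):  # Gaussian elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv], b[col], b[piv] = A[piv], A[col], b[piv], b[col]
        for r in range(col + 1, k):
            factor = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    x = [0.0] * k
    for r in range(k - 1, -1, -1):
        x[r] = (b[r] - sum(A[r][c] * x[c] for c in range(r + 1, k))) / A[r][r]
    return x


def atr_train(inputs, targets, tol=0.05, max_neurons=4):
    neurons, W2, norms = [], [], []
    residual = list(targets)                                 # step 1: d = D
    for j in range(max_neurons):
        neurons.append(train_neuron(inputs, residual, seed=j))   # steps 2 and 3
        basis = [[neuron_out(w, f, x) for x in inputs] for w, f in neurons]
        W2 = least_squares(basis, targets)                   # step 4
        outputs = [sum(c * col[i] for c, col in zip(W2, basis))
                   for i in range(len(inputs))]
        residual = [t - y for t, y in zip(targets, outputs)]  # step 6
        norms.append(math.sqrt(sum(e * e for e in residual)))
        if norms[-1] < tol:                                  # step 5
            break
    return neurons, W2, norms


# Toy usage: fit sin(x) on [-2, 2]; the hidden layer grows one neuron per pass.
xs = [[-2.0 + 0.2 * i] for i in range(21)]
ts = [math.sin(x[0]) for x in xs]
neurons, W2, norms = atr_train(xs, ts)
```

The number of hidden neurons is never chosen in advance; it is whatever the loop has accumulated when the residual norm first falls below the tolerance or the cap is reached.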

4. EXPERIMENTAL SET-UP AND EXPERIMENTS

In this study, we treat the combination of the compressor’s induction motor and its air-end as a system. Naturally, the motor current is the input to this system. The actual output of the air-end is an air flow which can be characterised by its flow rate, temperature, and pressure. However, due to the inertia and compressibility of the air, the measurements of the flow rate and pressure provided very little information at frequencies higher than a few Hz. Consequently, we selected the instantaneous rotational speed as the output.

The current sensor is based on a Hall-effect transducer. It passes outputs of the Hall-effect sensor through the built-in rms converter to provide readings of the current’s rms value. However, to obtain dynamic measurements at the rotor meshing frequency (6×rpm) and its harmonics, we had to bypass this rms converter, which has a very low bandwidth, and employ the Hilbert transform for amplitude demodulation.
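Amplitude demodulation by the Hilbert transform can be illustrated with a small synthetic example. The sketch below builds the analytic signal with a direct discrete Fourier transform (kept dependency-free and only suitable for short records) and takes its magnitude as the envelope. The signal, its frequencies and the function names are made up for illustration and are not the compressor measurements.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct O(N^2) discrete Fourier transform; fine for short records."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def envelope(x):
    """Magnitude of the analytic signal, i.e. the amplitude envelope of x."""
    n = len(x)
    spectrum = dft(x)
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            continue  # leave the DC and Nyquist bins unchanged
        # Double positive-frequency bins, zero negative-frequency bins.
        spectrum[k] = 2 * spectrum[k] if k < (n + 1) // 2 else 0
    return [abs(z) for z in dft(spectrum, inverse=True)]


# Synthetic carrier (20 cycles per record) with a slow amplitude modulation
# (2 cycles per record), standing in for a modulated AC current.
N = 128
modulation = [1 + 0.5 * math.cos(2 * math.pi * 2 * i / N) for i in range(N)]
signal = [modulation[i] * math.cos(2 * math.pi * 20 * i / N) for i in range(N)]
recovered = envelope(signal)
```

For long records a fast Fourier transform would replace the direct sums, but the demodulation step itself is unchanged.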

To measure the angular speed of the shaft connecting the motor and the air-end, the DC-generator type tachometer and the optical encoder are not appropriate since the compressor’s shaft does not have a free end. We developed a new sensor (Fig. 5) based on a photo reflective sensor and a reflective lined tape attached to the flywheel of the compressor. The photo reflective sensor, which consists of an LED, a photo diode and transistors, produces a voltage modulated by the passing of reflective lines. The phase-locked loop (PLL) demodulates the modulated voltage of the reflective sensor and yields the instantaneous passing frequency of the lines, which is proportional to the rotational speed. This type of sensor has plenty of practical applications since it requires no free end of the shaft.

Figure 5. Rotational speed sensor.

We performed the measurements under the baseline conditions and two failure modes: the wear of the gaterotor and the increase of friction from the degradation of the rotor bearing. The friction was conveniently simulated by applying contact friction to the motor shaft, because seeding a faulty bearing would be a major task and the simulation should have an effect similar to that of a real faulty bearing. The gaterotor wear was implemented by adjusting the gaterotor wear compensating screw and therefore the rotor meshing clearance.

Measurements were taken from the baseline condition under three different operating pressures. In addition, data were taken under each of the following conditions: gaterotor wear from 0.003 in. to 0.021 in. in steps of 0.003 in. (seven levels), and small, medium and large rolling friction. Therefore, there are 13 conditions (3 + 7 + 3) under which data were taken. The sampling frequency was 1 kHz.

5. EXPERIMENTAL RESULTS

To identify an NN model (NNM), which describes the dynamics of the compressor, input/output time series 400 points long were used. For each of the 13 conditions, 100 NNs were identified by using the aforementioned structural/weight learning algorithm. Three-layer NNs with three hidden neurons, 18 delayed outputs and 18 delayed inputs produce modeling errors that are ‘white’. The number of parameters for each NN is (36+1+1)×3=114 since each hidden neuron has 36 W1 weights, one W2 weight, and one threshold. Of the 100 sets of NNM parameters obtained for each condition, 80 were used to train NNFDI and NNFSA, and the remaining 20 were used to test the performance of the diagnosis system.
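The bookkeeping in this paragraph can be checked with a few lines of arithmetic. The variable and function names below are illustrative only; the counts are the ones quoted in the text.

```python
def count_parameters(n_delayed_outputs, n_delayed_inputs, n_hidden):
    """Parameters of a single-output three-layer FNN model.

    Each hidden neuron carries one W1 weight per network input,
    one W2 weight to the output, and one threshold.
    """
    per_neuron = (n_delayed_outputs + n_delayed_inputs) + 1 + 1
    return per_neuron * n_hidden


n_params = count_parameters(18, 18, 3)          # (36 + 1 + 1) x 3

n_conditions = 3 + 7 + 3                        # baseline, wear, friction
models_per_condition = 100
train_per_condition, test_per_condition = 80, 20

n_train = n_conditions * train_per_condition    # parameter vectors for training
n_test = n_conditions * test_per_condition      # parameter vectors for testing

# Per-class training counts, as used in the performance table of Section 5.1.
train_counts = {
    "baseline": 3 * train_per_condition,
    "gaterotor wear": 7 * train_per_condition,
    "friction": 3 * train_per_condition,
}
```

Each classifier input is therefore a 114-dimensional parameter vector, and the per-class totals of 240, 560 and 240 training vectors match the denominators reported in Table 1.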

5.1.

To classify the faults, the NNFDI output, h1, is assigned the values 0, 1, and −1 for the baseline condition, the gaterotor wear, and the friction, respectively. Values of h1 after training are shown in Fig. 6, and predictions for test data which have not been shown to the NN are shown in Fig. 7.

Figure 6. Estimated h1 for the training data. Baseline condition: (b1) 87.50, (b2) 105.00 and (b3) 106.18 psi. Gaterotor wear: (g1) 0.003, (g2) 0.006, (g3) 0.009, (g4) 0.012, (g5) 0.015, (g6) 0.018 and (g7) 0.021 in. Friction: (f1) small; (f2) medium; (f3) large.


Figure 7. Predicted h1 for the test data.

Table 1

Performance of NNFDI with decision boundaries −0.5 and 0.5

                  Training                                          Test
Failure mode      False alarm   Missed detection   Detection rate (%)   False alarm   Missed detection   Detection rate (%)
Gaterotor wear    19/240        78/560             86.07                6/60          16/140             88.57
Friction          0/240         0/240              100                  0/60          2/60               96.67

The lower and upper thresholds for the baseline condition were set at −0.5 and 0.5, respectively. For the 240 training and 60 testing baseline data in the case of the gaterotor wear fault detection and isolation, there were 19 and six false alarms, respectively (Table 1). In the case of friction fault detection and isolation, no false alarm occurred. Examining Figs 6 and 7, all the false alarms in the case of the gaterotor wear are from the baseline condition with an operating pressure of 87.50 psi.

For the 560 training and 140 testing data of gaterotor wear, 78 and 16 were missed by the NNFDI, respectively. However, 74 and four of the missed detections occurred with the two smallest gaterotor wears, 0.003 in. and 0.006 in., respectively. For the 240 training and 60 testing data of the rolling friction, 0 and 2 were missed by the NNFDI, respectively. Figures 6 and 7 indicate that all the missed detections of friction are from large friction. One misclassification between gaterotor wear and friction occurred.

Figure 8. Estimate of gaterotor wear for the training data.

Figure 9. Estimate of gaterotor wear for the test data.
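The detection rates in Table 1 follow directly from the missed-detection counts quoted above, as the short check below shows. The helper names are illustrative, and the false-alarm percentages are derived here from the reported counts rather than quoted from the paper.

```python
def detection_rate(missed, total):
    """Percentage of faulty cases that were detected."""
    return 100.0 * (total - missed) / total


def false_alarm_rate(alarms, total_baseline):
    """Percentage of baseline cases flagged as faulty."""
    return 100.0 * alarms / total_baseline


# (missed, total) pairs taken from Table 1.
missed_counts = {
    ("gaterotor wear", "training"): (78, 560),
    ("gaterotor wear", "test"): (16, 140),
    ("friction", "training"): (0, 240),
    ("friction", "test"): (2, 60),
}
rates = {key: round(detection_rate(*value), 2) for key, value in missed_counts.items()}

# False alarms for gaterotor wear detection: 19 of 240 (training), 6 of 60 (test).
fa_training = round(false_alarm_rate(19, 240), 2)
fa_test = round(false_alarm_rate(6, 60), 2)
```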

5.2.

The magnitudes of physical wear, h2, for the gaterotor wear range from 0.003 to 0.021 in. in steps of 0.003 in. The fault magnitudes h2 for the friction are assigned the values 1, 2, and 3 for small, medium, and large, respectively. After training an NNFSA for gaterotor wear severity estimation, its estimates of the magnitude of wear h2 are shown in Figs 8 and 9 for the training and testing data, respectively. Similarly, Figs 12 and 13 give the estimates for friction.

Figures 10 and 11 show 95% confidence intervals of the actual gaterotor wear vs. the estimated gaterotor wear for the training and the testing data, respectively. This shows that the estimates are very close to the actual magnitudes. The variations are noticeable but acceptable. The overlap between different magnitudes is small.

Figures 14 and 15 show 95% confidence intervals of the actual friction vs. the estimated friction for the training and the testing data, respectively. This shows that the estimates are very close to the actual magnitudes. The variations are larger than those for gaterotor wear and are largest in the case of small friction.

Figure 10. Mean and 95% confidence intervals of actual gaterotor wear vs. estimated wear.


Figure 11. Mean and 95% confidence intervals of actual gaterotor wear vs. predicted wear.

Figure 12. Estimate of friction for the training data.

Figure 13. Estimate of friction for the test data.


Figure 14. Mean and 95% confidence interval of actual friction vs. estimated friction.

Figure 15. Mean and 95% confidence interval of actual friction vs. predicted friction.

6. CONCLUSION

In this paper, a non-linear model based FDIA scheme which utilises FNNs for a screw compressor is described. Its usefulness in detecting and assessing some common compressor faults such as gaterotor wear and friction faults was demonstrated with experimental data. In spite of the wide variation of operating pressures, the scheme has no difficulty in distinguishing the baseline condition from faulty ones. The FNN structural/weight learning algorithm is effective and efficient in constructing NNs for modeling compressors as well as for classifying a model of the compressor. It would not be difficult to extend this approach to FDIA of other mechanical systems. Nevertheless, some noticeable variations in the estimation of fault magnitude, which may be due to measurement uncertainty and the stochastic nature of the compressing process, were observed. One simple way of reducing the variance is to take an arithmetic average of the results from multiple assessments.
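The effect of averaging several assessments can be illustrated with a small simulation. It assumes, purely for illustration, that the individual severity estimates scatter independently and normally around the true value; the wear value, noise level and function names are hypothetical and are not taken from the experiments.

```python
import random
import statistics


def averaged_estimate(single_estimates):
    """Arithmetic average of several severity assessments."""
    return sum(single_estimates) / len(single_estimates)


def simulate(true_wear=0.012, noise_sd=0.002, k=10, trials=2000, seed=0):
    """Compare the variance of one assessment with that of a k-assessment average.

    Assumes independent Gaussian estimation errors (an assumption).
    """
    rng = random.Random(seed)
    singles, averages = [], []
    for _ in range(trials):
        draws = [rng.gauss(true_wear, noise_sd) for _ in range(k)]
        singles.append(draws[0])
        averages.append(averaged_estimate(draws))
    return statistics.pvariance(singles), statistics.pvariance(averages)


var_single, var_averaged = simulate()
```

Under the independence assumption the variance of the average shrinks roughly in proportion to 1/k; correlated errors, for example from a slowly drifting operating pressure, would reduce the benefit.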

ACKNOWLEDGMENT

This paper is based on work supported by the Department of Navy under award N61533-91-K-0026/P00001.

REFERENCES

1. C. J. L, T. K and G. W. N 1991 Proceedings of Symposium on Sensors, Controls and Quality Issues in Manufacturing, PED Vol. 55, Atlanta, GA, pp. 95–106. New York: ASME. Linear model based fault detection and isolation for a screw compressor.

2. C. J. L and T. K 1992 Proceedings of Symposium on Neural Networks in Manufacturing and Robotics, PED Vol. 57, Anaheim, CA, pp. 65–74. New York: ASME. A new feedforward neural networks structural and weight learning algorithm-augmentation by training with residuals.

3. D. P, A. S and A. Y 1988 IEEE Control System Magazine, April, 17–21. A multilayered neural network controller.

4. K. N and K. P 1990 Neural Networks 1, 4–26. Identification and control of dynamical systems using neural networks.

5. S. C, S. A. B and P. M. G 1990 International Journal of Control 51, 1191–1214. Nonlinear system identification using neural networks.

6. F. F 1989 Neurocomputing: Algorithm, Architectures and Applications, pp. 265–284. (F. F. Soulie and J. Herault, editors). Analysis of linear predictive data as speech and of ARMA process by a class of single-layer connectionist models. Berlin: Springer.

7. N. H and A. H. W 1990 Proceedings of the International Joint Conference on Neural Networks, Vol. 1, San Diego, CA, pp. 57–62. New York: IEEE. Speaker-independent phoneme recognition on TIMIT database using integrated time-delay neural networks (TDNNs).

8. J. C. H, K. M. K and D. M. H 1990 Proceedings of the International Joint Conference on Neural Networks, Vol. 1, San Diego, CA, pp. 81–86. New York: IEEE. Incipient fault detection and diagnosis using artificial neural networks.

9. Y. Y and V. V 1990 Proceedings of the International Joint Conference on Neural Networks, Vol. 1, San Diego, CA, pp. 317–326. New York: IEEE. Integrated approach using neural networks for fault detection and diagnosis.

10. C. J. L and X. Y 1992 Proceedings of Symposium on Intelligent Design and Manufacturing, PED Vol. 64, New Orleans, LA, pp. 375–380. New York: ASME. High pressure air compressor valve fault diagnosis using feed forward neural networks.

11. J. A. L and M. A. K 1991 Proceedings of IEEE Control Systems 11, 31–38. Radial basis function networks for classifying process faults.

12. G. V, C. R. C and S. H 1990 Proceedings of the International Joint Conference on Neural Networks, Vol. 1, San Diego, CA, pp. 45–50. New York: IEEE. Radial basis function classification of impulse radar waveforms.

13. G. C 1989 Mathematics of Control, Signal and Systems 2, 303–314. Approximation by superpositions of a sigmoid function.

14. R. H-N 1989 Proceedings of the International Joint Conference on Neural Networks, Vol. 1, Washington D.C., pp. 593–606. New York: IEEE. Theory of the back-propagation neural networks.

15. K. F 1989 Neural Networks 2, 183–192. On the approximate realization of continuous mappings by neural networks.
