

International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 11, Issue 10, October 2020, pp.80-94, Article ID: IJARET_11_10_008 Available online at http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=10

ISSN Print: 0976-6480 and ISSN Online: 0976-6499

DOI: 10.34218/IJARET.11.10.2020.008

© IAEME Publication Scopus Indexed

ANALYSIS OF MACHINE LEARNING BASED FAULT DIAGNOSIS APPROACHES IN MECHANICAL AND ELECTRICAL COMPONENTS

S. Sharanya

Department of Computer Science and Engineering, SRM Institute of Science and

Technology, Kattankulathur, Tamil Nadu, India

Revathi Venkataraman

Department of Computer Science and Engineering, SRM Institute of Science and


G. Murali

Department of Mechatronics Engineering, SRM Institute of Science and


ABSTRACT

Condition monitoring and fault diagnosis play a vital role in extending the lifespan of any
equipment. Diagnosing faults at the right time is crucial in life-saving appliances and
applications. Fault diagnosis for any equipment or system involves handling large volumes of
data, which is far beyond human computing capability. Deploying automatic fault diagnosis
approaches is therefore an intelligent solution, and it has opened the gates for Artificial
Intelligence (AI), Data Mining and Machine Learning algorithms. This work reviews Machine
Learning based fault diagnosis algorithms and models for detecting bearing, pump and power
transformer faults. A performance comparison of the models is presented based on their fault
diagnosis accuracy. The analysis also critiques the models and identifies possible scope for
improvement. The inferences from the analysis highlight the need to develop Extreme Learning
(EL) models that are less dependent on explicit feature selection.

Keywords: Bearings, Condition Monitoring (CM), Fault Diagnosis, Machine Learning,

Power Transformers, Pumps.

Cite this Article: S.Sharanya, Revathi Venkataraman and G. Murali, Analysis of

Machine Learning Based Fault Diagnosis Approaches in Mechanical and Electrical

Components, International Journal of Advanced Research in Engineering and

Technology (IJARET), 11(10), 2020, pp.80-94

http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=10



1. INTRODUCTION

Fault diagnosis is a generic term for the process of finding the deviation of a system, or of a
component of a system, from its normal operational profile. Comprehensive Condition Monitoring
(CM) is the continuous surveillance of a system and involves a sequence of activities: system
monitoring, fault detection, fault diagnostics and fault prognostics. The condition of the
equipment is assessed quantitatively through a set of critical variables. When the observed
values of the monitored variables are abnormal, this indicates the occurrence of a fault in the
machinery, and recognizing it is termed fault detection. When a fault is detected, a diagnostic
module is immediately activated to identify and characterize the fault. It is vital to
characterize the fault because each fault contributes to the failure of the system in its own
way. After the potential fault is identified, the prognostic model predicts the probability
distribution of failure from current and past environmental and usage conditions, past and
current operational sensor data, and historical failure data [1]. Condition monitoring involves
assessing the characteristic properties of the system either continuously or at regular time
intervals, and the result must be expressed as a quantitative value. Fault detection is finding
the variables that cross their bound values [2][32]. Fault identification is checking the values
of related variables. Sometimes the variables are fused or combined to assess the system.
Prognostics involves the sequence of recovery actions that make the plant regain its safe
state [3].
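The detection step described above, finding the variables that cross their bound values, can be sketched in a few lines of Python. This is only an illustrative sketch: the variable names, limits and readings are hypothetical and are not taken from any of the reviewed works.

```python
def detect_faults(readings, bounds):
    """Return the monitored variables whose observed value lies outside its bounds.

    readings: dict mapping variable name -> observed value
    bounds:   dict mapping variable name -> (lower, upper) limits
    """
    violations = {}
    for name, value in readings.items():
        lower, upper = bounds[name]
        if value < lower or value > upper:
            violations[name] = value
    return violations


# Hypothetical limits and one snapshot of sensor readings.
bounds = {"vibration_mm_s": (0.0, 4.5), "temperature_c": (10.0, 80.0)}
readings = {"vibration_mm_s": 6.2, "temperature_c": 65.0}

faulty = detect_faults(readings, bounds)
print(faulty)
```

A non-empty result would be the trigger for the diagnostic module, which then has to characterize the fault rather than merely flag it.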

Manual condition monitoring of equipment is a tiresome process that demands considerable
manpower. Apart from introducing inaccuracies, human intervention in the condition monitoring
of complex systems that operate in hazardous environments is a serious threat to life and
property. Two types of condition monitoring strategies are followed to avoid or reduce these
overheads: physical models and data-driven approaches [4].

Physical model: The physical models of complex systems can be constructed by exploiting the
physical relations between the system and its associated variables. This approach makes many
assumptions in building the physical model, which limits its applicability to real-time
systems [5].

Data-driven approaches: This approach, also known as the data mining or machine learning
approach, uses historical data obtained from the system to learn about the system and model its
behaviour [6]. Models developed this way take advantage of the availability of large volumes of
data to tune the condition monitoring models for enhanced accuracy [4]. Data-driven approaches
are broadly classified into two subdivisions, namely statistical methods and artificial
intelligence methods [7].

The statistical methods involve the use of many mathematical and statistical approaches.
Artificial intelligence methods such as Artificial Neural Networks (ANN), Support Vector
Machines (SVM) [33], Genetic Algorithms (GA), Principal Component Analysis (PCA) and Expert
System models are gaining significance in fault diagnosis and health monitoring because of the
rapid development of machine learning. Researchers have stepped up their work on extending
machine learning algorithms to almost all domains. Notable work has been done on deploying
machine learning algorithms for better fault diagnosis in mechanical and electrical machinery.
There is, however, a gap in knowledge transfer between the diverse communities of the
engineering disciplines.

This article reviews the application of machine learning models and algorithms in diagnosing
bearing, pump and electrical faults, with detailed emphasis on feature selection, parameter
tuning and their limitations. It should be of interest to experts in machine learning and
machine health monitoring whose primary research goal is to invent cutting-edge techniques in
fault diagnosis. The crucial aims of this article can be summarized as follows:

Review the state-of-the-art techniques for applying machine learning in fault diagnosis.

Pinpoint the advantages and limitations of each work in the context of machine learning and health monitoring.

Give deeper insight into potential feature selection procedures along with parameter tuning mechanisms.

Though extensive research has been done on using machine learning algorithms for fault
diagnosis of machinery parts, this work focuses on bearing, pump and transformer faults
because of their impact on the mechanical and production industries.

The paper is organized as follows: Section 1 gives a brief introduction to the phases of fault
diagnosis and the classification of fault diagnosis approaches. Section 2 presents the models
and algorithms used in fault diagnosis of bearings, pumps and transformers. A detailed
comparative analysis is given in Section 3 and the scope for future enhancements is discussed
in Section 4. Section 5 concludes the work.

2. MACHINE LEARNING ALGORITHMS IN FAULT DIAGNOSIS

Machine learning algorithms use the historical data presented to them to predict future
results. In fault diagnosis, they extract interesting patterns and features from past data to
diagnose faults.

2.1. Fault Diagnosis in Bearings

Bearings in mechanical machinery are responsible for reducing the wear and tear caused by shaft
rotation. Bearings are widely used in the mechanical, automobile and textile industries, which
leads to their enormous production. There is therefore a pressing need for fault diagnosis of
these bearings in the production line. The literature shows improvements in every aspect of
fault diagnosis, so the incremental progress is difficult to trace.

Relevance Vector Machines (RVM) and Support Vector Machines (SVM) are deployed by Achmad Widodo
et al. after performing component analysis of the signals from an acoustic emission sensor [8].
The features are extracted directly from the signals and then reduced using Principal Component
Analysis (PCA). The localized errors are diagnosed by an SVM with an RBF kernel, with
Sequential Minimal Optimization (SMO) used to fix the classification hyperplane. The
performance of both the RVM and SVM models with PCA and Independent Component Analysis (ICA) is
measured. The testing error for RVM and SVM with ICA is as low as 2.04%. Since the features are
extracted directly from the acoustic signals, there is a possibility of missing more prominent
features.

Feature engineering is also a notable field in AI. It has become a chief research interest
because of its strong influence on algorithm performance. Feature selection and parameter tuning
by nature-inspired algorithms have tremendously improved the performance of many algorithms in
different disciplines. Building on this fact, Xiaoyuan Zhang et al. designed a novel hybrid
model that optimizes the parameters of SVM using Barebones Particle Swarm Optimization and
Differential Evolution (BBDE) [9]. The search space for the BBDE is formed

by the Intercluster Distance in Feature Space (ICDF). The parameters that are optimized in

SVM includes C, the penalty factor and , the learning rate. The results of the model are

obtained by testing it on the Case Western Reserve University bearing dataset. The model gives
an accuracy of 99.79% with a running time of 0.42 seconds, which is much lower than that of
Differential Evolution SVM. This method thus offers good accuracy with low running time.



Another novelty in feature selection is given by Ben Ali et al. [10], who employ Intrinsic Mode
Functions (IMFs). The IMFs are extracted from vibration signal data collected from the NSF I/UCR
Center for Intelligent Maintenance Systems (IMS), using a mathematical analysis method that
combines empirical mode decomposition with energy entropy. This data is then processed by a
neural network. The work also proposes a Health Index (HI) metric, which indicates the degree of
wear on the bearing caused by degradation. The method gives an accuracy of 93% and can act as a
prognostic method to predict the lifetime of the equipment. The mathematical feature selection
may, however, be computationally expensive.
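To make the energy entropy feature concrete, the following sketch computes it for a set of already extracted components. It is a minimal illustration that assumes the common Shannon form over normalised component energies; the exact definition used in [10] may differ, the empirical mode decomposition itself is not implemented here, and the sample values are hypothetical.

```python
import math


def energy_entropy(components):
    """Energy entropy of a set of signal components (for example IMFs).

    The energy of a component is the sum of its squared samples. The entropy
    is computed over the energies normalised to sum to one.
    """
    energies = [sum(x * x for x in comp) for comp in components]
    total = sum(energies)
    if total == 0:
        return 0.0
    entropy = 0.0
    for e in energies:
        p = e / total
        if p > 0:
            entropy -= p * math.log(p)
    return entropy


# Two hypothetical components with equal energy (4.0 each).
imfs = [[1.0, -1.0, 1.0, -1.0], [2.0, 0.0, 0.0, 0.0]]
print(energy_entropy(imfs))
```

When the energy is spread evenly over the components the entropy is at its maximum, and it drops towards zero as the energy concentrates in a single component.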

Machine learning has taken a big leap after the introduction of Extreme Learning Machines
(ELM), which ease feature selection and parameter optimization in machine learning algorithms.
An intelligent method that employs self-adaptive Local Mean Decomposition (LMD) and Singular
Value Decomposition (SVD) with an Extreme Learning Machine (ELM) is framed by Ye Tian et al.
[11] for diagnosing bearing faults from the Case Western Reserve University 6205-2RS JEM SKF
dataset. The ELM is known for its non-dependency on activation functions. It is a generic
extension of the Single Layered Feed Forward Neural Network (SLFN). The signals are decomposed
into Product Functions (PFs) of envelope signals and frequency modulated signals. The PFs form
the features for the ELM. Similarly, the singular values obtained from SVD are used as features
to train the ELM. A comparative study is made between these methods in terms of feature
coincidence. The trained model is then validated against SVM and BP networks in terms of
manpower, accuracy and running time. The results indicate that the ELM outperforms the other
two, with an average accuracy of 99% and an average running time in the order of 0.1 seconds.
The SVM offers tough competition to the ELM, but the BP network suffers from longer computation
time. This method could be generalised to rotating machinery components like shafts, and there
is great scope for investigating new modelling techniques.
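The core idea of an ELM, a single hidden layer whose weights are drawn at random and left fixed while the output weights are solved in one least-squares step, can be sketched as follows. This is a toy illustration under stated assumptions, not the LMD and SVD pipeline of [11]: the class name `TinyELM`, the sigmoid activation, the ridge term and the one-dimensional data are all choices made here for the example.

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


class TinyELM:
    """Binary classifier with a random, untrained hidden layer."""

    def __init__(self, n_inputs, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Hidden weights and biases are random and never updated.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.ridge = ridge
        self.beta = None

    def _hidden(self, x):
        return [1.0 / (1.0 + math.exp(-(sum(wi * xi for wi, xi in zip(w, x)) + b)))
                for w, b in zip(self.w, self.b)]

    def fit(self, xs, ys):
        h = [self._hidden(x) for x in xs]
        k = len(self.b)
        # Regularised normal equations: (H^T H + ridge * I) beta = H^T y
        hth = [[sum(row[i] * row[j] for row in h) + (self.ridge if i == j else 0.0)
                for j in range(k)] for i in range(k)]
        hty = [sum(row[i] * y for row, y in zip(h, ys)) for i in range(k)]
        self.beta = solve_linear(hth, hty)
        return self

    def predict(self, x):
        score = sum(hi * bi for hi, bi in zip(self._hidden(x), self.beta))
        return 1 if score >= 0.5 else 0


# Hypothetical one-feature data: label 1 for positive values, 0 otherwise.
xs = [[-0.9], [-0.6], [-0.3], [0.3], [0.6], [0.9]]
ys = [0, 0, 0, 1, 1, 1]
model = TinyELM(n_inputs=1, n_hidden=8).fit(xs, ys)
print([model.predict(x) for x in xs])
```

Because training reduces to one linear solve, there is no iterative weight update, which is consistent with the short running times reported for ELM in the works reviewed here.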

Model-based improvements open a new avenue for research. Many machine learning algorithms are
supported by hierarchical models, which also prove to be potential solutions in fault
diagnosis. The Hierarchical Deep Learning (HDN) network constructed by Meng Gan et al. [12]
identifies the potential weak links using severity ranking to make the model more reliable. The
classification of fault types and the severity ranking process are implemented on a two-layered
hierarchical network. The layers of the network are interconnected by extractors that transmit
fault layers from the top deep belief network to the bottom layer. The model was evaluated on
the dataset collected from the Case Western Reserve University Bearing Data Center and reached
an accuracy of 99.03%, which is better than the Back Propagation Neural Network (BPNN) and SVM,
which showed 95.14% and 96.48% respectively. The limitation of this work is that the network
parameters must be well tuned, which is a tedious task. The model can be enhanced by fitting a
self-tuning HDN that mitigates the overhead of parameter tuning.

Parameter tuning and feature selection are further simplified by the Convolution Neural Network
(CNN), which is not limited to image processing. A novel Adaptive Deep Convolution Neural
Network (ADCNN) is built by Xiaojie Guo et al. [13] to diagnose faults and to measure the
severity of the fault through pattern recognition. The ADCNN is built with four layers, namely
the transformation, ConvNet, connection and classification layers. The first layer converts the
vectored data to a matrix input that is readily recognized by the DNN. The ConvNet layer is a
fusion of a convolution layer and a max pooling layer that extracts useful features from the
input using filters. The classification layer applies logistic regression with a softmax
function that estimates the probability of each fault class. The critical parameters to be tuned
in this model are the learning rate, the batch size and the number of kernels in each layer. Any
erroneous choice of these parameters would degrade the performance of the model. The model is
tested on the Case Western Reserve University dataset of bearing health, which resulted in 97.9%
accuracy in cross and mean validation. The accuracy of ADCNN opens up scope for shallow
learning models with minor tricks and tuning mechanisms.
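The softmax step used by the classification layer turns raw class scores into probabilities. A minimal sketch is given below; the fault class names and the scores are hypothetical and only illustrate the computation.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to one."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores produced by the last layer for three fault classes.
fault_classes = ["normal", "inner race", "outer race"]
probs = softmax([2.0, 1.0, 0.1])
predicted = fault_classes[probs.index(max(probs))]
print(predicted, probs)
```

The predicted class is simply the one with the highest probability, and the probability itself can be read as a rough confidence in that diagnosis.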

Another neural network model, proposed by Yaguo Lei et al., uses unsupervised feature learning
to build an intelligent two-stage method for diagnosing bearing faults from mechanical big data
[14]. The sparse filtering layer extracts the local representative features from Case Western
Reserve University's vibration data. As the neural network is trained on big data, the features
are made to converge by a whitening process to increase the learning speed. The classification
layer uses a softmax regression function to classify the health of the bearing. This method
reduces human involvement in feature selection, since the sparse filtering technique is able to
extract features from a large volume of labelled and unlabelled mechanical big data.
Experimental analysis of the proposed method on the locomotive and motor bearing data set
showed an average accuracy of 92.5%. The proposed method adapts well to big data, where feature
selection is very difficult.

To mitigate the impact of feature selection, Ran Zhang et al. proposed a fault diagnosis model
for bearing faults using a Deep Neural Network that directly reads the time-varying vibration
signals without any feature selection or signal processing [15]. This model predicts its output
by considering the temporally coherent time series data in its near past. The model is trained
only with local time-invariant data so as to learn the local characteristics of the data, but
the fault diagnosis process considers the temporal data with a specified memory to classify the
state of the bearing. The model is tested on two benchmark datasets obtained from the
University of Cincinnati Center for Intelligent Maintenance Systems and the Bearing Data Center
of Case Western Reserve University. The fault recognition error of the model is computed using
cross entropy. An accuracy of 99% is achieved in training a neural network with temporal
coherence, which opens the door for machine fault prognostics. This method is fruitful for
people with limited expertise in signal processing and feature extraction. The convergence of
the model may slow down when the memory limit is large.

Integrating the benefits of various machine learning algorithms could result in a better model.
The progress of deep learning networks with ensembling motivated researchers to develop new
techniques that could avoid overfitting. Wei Zhang et al. framed a deep convolutional neural
network for diagnosing faults in bearings [16] that avoids overfitting and requires less domain
expert knowledge. The proposed Training Interference Convolution Neural Network (TICNN)
extracts visual feature maps from the data set created at the Case Western Reserve University
(CWRU) Bearing Data Center. This method takes the temporal vibration signals and applies a
Convolutional Neural Network (CNN) with Batch Normalization (BN) to accelerate the training
process by reducing the internal covariate shift of the data [17]. This novel step eliminates
the need for de-noising the temporal signals. The deep neural network uses ReLU activation
units with max-pooling, thus extracting the local invariant features. This method also deploys
the dropout trick to reduce overfitting by deactivating neurons with probability p, which is a
hyperparameter [18]. The stability of the model is established by majority voting, an ensemble
method that significantly reduces the impact of the choice of initial parameters used to train
the model. The model is validated by testing its performance in a noisy environment and across
varying loads, obtaining an accuracy of 95.5% in comparison with rival methods like Support
Vector Machines (SVM), Multi Layer Perceptron (MLP) and Deep Neural Networks (DNN). This method
is suitable for applications that handle time-varying data.
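Two of the ingredients mentioned above, dropout with probability p and majority voting over an ensemble, are simple enough to sketch directly. The code below is an illustration only: it uses the common "inverted dropout" variant, which may not match the exact scheme in [16] or [18], and the activations and labels are hypothetical.

```python
import random
from collections import Counter


def dropout(activations, p, rng):
    """Inverted dropout: zero each activation with probability p (p < 1)
    and rescale the survivors so the expected value is unchanged."""
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]


def majority_vote(predictions):
    """Return the label predicted by most ensemble members.
    Ties go to the label that appears first."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
print(dropout([1.0, 2.0, 3.0, 4.0], p=0.5, rng=rng))

# Hypothetical predictions from three independently initialised networks.
print(majority_vote(["inner race", "outer race", "inner race"]))
```

Dropout is applied only during training, while the vote is taken at prediction time over models that differ in their initial parameters.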

The machine learning algorithms discussed above show good versatility in bearing fault
diagnosis. This survey leads to the following implication: the performance of learning
algorithms depends greatly on feature selection and parameter tuning. Though the algorithms
show good performance, achieving acceptable accuracy in real on-line systems is a challenging
task.

2.2. Leakage Detection in Pumps

Leakage is a common phenomenon when a pump wears out, so detecting faulty pumps saves energy
and resources. This section describes the major works on fault diagnosis of pumps.

Peng Xu et al. studied the performance of an Artificial Neural Network (ANN) in the leakage
detection of sucker rod pumps in the Jiangsu oilfield, China [19]. The study uses five classes
of dynamometer cards, each representing one competitive layer of neurons. The weight of the
self-organized winning neuron is modified by the Kohonen rule [20]. The classification accuracy
of the proposed method is 99.9%, which is higher than that of the back propagation network.
This method selects the right input neuron using the Kohonen rule, which improves the network's
accuracy. In real-time systems, the number of competitive layers may increase drastically,
affecting the computing time.

Decision trees naturally select the most prominent feature, which is analogous to the
competitive layers in neural networks. The extensive literature on fault diagnosis shows that
decision trees with proper pruning are among the best solutions. This is supported by Sakthivel
et al. through a fault diagnosis model for a monoblock centrifugal pump that used a Top Down
Inductive Decision Tree (TDIDT) with pessimistic pruning [21]. The statistical features, namely
mean, maxima, minima, kurtosis, variance and skewness, are measured from the vibration data.
Feature selection is done by entropy and information gain at each branching point. Decision
trees are well suited to developing models with continuous data like vibrations. The test data
indicates a classification accuracy of 100%, but when presented with real-world data the
algorithm produces 99.7%. This method is simple, but when the trees are not pruned it may lead
to an overfitted model. Advanced tree pruning methods may reduce the tree size drastically. This
method may suffer from bad convergence when the dataset is large with unprocessed features.
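The statistical features and the entropy-based split criterion described above can be sketched as follows. This is a minimal illustration, assuming population (not sample) moments and a single binary split on a threshold; the signal values and labels are hypothetical and the code is not the TDIDT implementation of [21].

```python
import math
from collections import Counter


def statistical_features(signal):
    """Mean, extrema, variance, skewness and (non-excess) kurtosis of a signal."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal) / n  # population variance
    std = math.sqrt(var)
    skew = sum((x - mean) ** 3 for x in signal) / (n * std ** 3) if std else 0.0
    kurt = sum((x - mean) ** 4 for x in signal) / (n * var ** 2) if var else 0.0
    return {"mean": mean, "max": max(signal), "min": min(signal),
            "variance": var, "skewness": skew, "kurtosis": kurt}


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy reduction obtained by splitting on value <= threshold."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - remainder


print(statistical_features([1.0, 2.0, 3.0, 4.0, 5.0]))

# Hypothetical kurtosis values of four recordings and their known condition.
values = [0.1, 0.2, 0.8, 0.9]
labels = ["good", "good", "faulty", "faulty"]
print(information_gain(values, labels, threshold=0.5))
print(information_gain(values, labels, threshold=0.15))
```

At each branching point the tree would pick the feature and threshold with the largest gain, which is why a threshold that separates the two conditions cleanly scores higher than one that does not.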

A better approach for feature extraction, proposed by Muralidharan et al. [22], was employed in
leakage detection for a centrifugal pump using stationary wavelet transforms. The representative
signals of the healthy pump are recorded, and the defective pumps are isolated by comparing
their signals with the healthy signals. The J48 algorithm is used for feature selection as well
as for classification. The WEKA implementation of J48 along with the stationary wavelet
transform shows a classification accuracy of 93.84%. A tree pruning mechanism could be
integrated into the model to avoid overfitting. Most of the works in fault diagnosis are based
on data emanating from a single source. A more generic model should be supported by multisource
information, which would naturally avoid bias.

Baoping Cai et al. fused multi-sensor data using Bayesian networks for detecting pump leakage
[23]. This layered model is supported by two Bayesian networks: one that detects single faults
and another that detects simultaneously occurring faults. The exponential growth of the
conditional probabilities of the Bayesian network is controlled by using the Noisy-MAX function,
which monitors only the cause and effect nodes that have shown anomalous behaviour, thus
reducing the number of conditional probability parameters to be monitored. The work concluded
with an accuracy of 99.69% for single faults. The posterior probabilities for diagnosing
multiple faults were not promising with data from a single sensor, so fusion of multiple input
sources could be an effective way to achieve better fault diagnosis. There is room for
developing completely automatic fault diagnosis software using Bayesian networks as the
backbone.




The application of Bayesian networks motivated researchers to further investigate the
performance of the Gaussian Mixture Model (GMM) and the Hidden Markov Model (HMM). A novelty
detector framework based on One Class SVM (OC-SVM) is developed for detecting leakage that is
likely to occur in infrastructures like gas and water pipes [23]. The statistical features of
the data are captured from two datasets, namely the Almanac of Minutely Power dataset (AMPds)
and the Department of International Development (DFID) dataset. Feature selection is done by
selecting the winning feature from the pool of features using the Selection Feature Algorithm
(SFA), which selects one feature more than in the previous iteration. The leakage detection is
based on a normality model that uses log likelihood computation. The comparative study of
applying SFA over OC-SVM, GMM and HMM indicates that the GMM and HMM models showed a good
accuracy of 90%. This work formulated a framework for selecting the best features, which could
be extended to other machine learning models to achieve better results.
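The idea of adding one feature per iteration can be illustrated with a generic greedy forward selection loop. This sketch is not the SFA of the reviewed work, whose details are not given here; the function `forward_selection`, the toy scoring function and the feature names are hypothetical. In practice the score would be something like the validation accuracy of a model trained on the candidate subset.

```python
def forward_selection(features, score, max_features):
    """Greedy forward selection that adds one feature per iteration.

    features:     list of candidate feature names
    score:        callable taking a list of feature names and returning a
                  number, where higher is better
    max_features: upper limit on the number of selected features
    """
    selected = []
    best_score = float("-inf")
    remaining = list(features)
    while remaining and len(selected) < max_features:
        candidate = max(remaining, key=lambda f: score(selected + [f]))
        candidate_score = score(selected + [candidate])
        if candidate_score <= best_score:
            break  # no remaining feature improves the score
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return selected, best_score


# Hypothetical per-feature usefulness standing in for a model-based score.
usefulness = {"mean": 3, "variance": 5, "kurtosis": 1, "noise": -2}


def toy_score(subset):
    return sum(usefulness[f] for f in subset)


selected, best = forward_selection(list(usefulness), toy_score, max_features=4)
print(selected, best)
```

The loop stops early once no remaining feature improves the score, so an uninformative feature is left out even when the feature budget would allow it.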

Another notable work in feature selection is given by Muralidharan and Sugumaran [25]. The
accuracy and learning time of SVM and ELM in leakage detection of a monoblock centrifugal pump
are compared by extracting the wavelet features of the vibration signals, with the Discrete
Wavelet Transform (DWT) used to get time-related information from the signals. The study shows
that the SVM achieves an accuracy of 98.84% with a running time of 0.25 seconds, whereas the ELM
reported an accuracy of 99.92% in 0.13 seconds. The accuracy may deteriorate when the dataset is
large. The figures suggest that ELM can be further explored to obtain better results.

The study of leakage detection in pumps highlights the need for constructing a model or
algorithm that avoids overfitting. The models must always be augmented with proper convergence
criteria. Decision trees and Bayesian networks will always need pruning and parameter
optimization, which is a time-consuming task.

2.3. Fault Diagnosis in Power Transformers

Power is transmitted from the generating unit to the consumption site through power
transformers. Since transformers are constantly subjected to high-voltage power supply, they
are more susceptible to failures, so fault diagnosis in transformers is vital for optimised
usage of power. This section reviews the fault diagnosis methods for power transformers.
Classical fault diagnosis in power transformers is done by measuring the concentration of gases
dissolved in the insulating oil using the Dissolved Gas Analysis (DGA) method. AI has now led to
the development of many intelligent methods.

A notable work that could detect multiple faults used a multi-layer SVM model that classifies
the transformer into four states, namely normal state, thermal heating, low-energy discharge and
high-energy discharge [26]. The first SVM (SVM1) isolates the normal transformer from the other
three states. SVM2 separates the faulty transformers into thermal heating and low-energy
discharge states. This model does not get stuck in local optima, since it uses a complex
quadratic hyperplane. The multi-class SVM model is tested on the history data of a 500 kV main
transformer located at the Pingguo Substation of the South China Electric Power Company, and
showed an accuracy of 99% with negligible training time (<1 second), which is far better than
competitors like neural networks and fuzzy logic. The model has yet to be validated against
real-time on-line data. Kernel and parameter optimizations would make the model more suitable
for multi-class fault classification.

The need for parametric optimization led to the development of parameter tuning techniques,
each of which tunes the parameters in its own way. One attempt used Least Square SVM (LS-SVM)
[27], which optimizes the penalty factor and kernel function using genetic algorithms for
diagnosing transformer faults. This is a non-linear method well suited to DGA transformers.
Earlier works optimize the penalty or cost factor (C) to improve accuracy. But over-tuning the
penalty factor increases the misclassification cost, so this work focuses on optimizing both
the penalty factor and the penalty degree, which maintains a trade-off between over-tuning and
under-tuning. The features are selected based on the accuracy of the fitness function of the GA.
The choice of learning rate and kernel parameter has a great impact on the overall performance
of SVM, so tuning them with GA improves the results. This work compared the accuracy of LS-SVM
and DA-LS-SVM. The former showed an accuracy of 55.32% while the latter has an accuracy of
68.09%. The work concentrated on tuning the parameters to their optimal values, which
contributes positively to the final performance of the model in diagnosing a single fault.

Extracting features from the data confines the diagnosis to the selected features. A more
robust model that uses Logical Data Analysis and pattern recognition is developed to classify
the states of power transformers [28]. It dynamically increases the number of fault classes
based on the signal patterns by transforming the signals into equivalent binary attributes
through binarization. The method classifies the health of the machine into various subclasses,
each labelled with a different fault type. Class purity is controlled by a discriminating
factor, which is a measure of the minimum number of patterns required to assign an observation
to a particular class. This method is not restricted to binary classification of faults. The
model showed an average accuracy of 93.4% when tested on the UCI repository data [29]. This is a
robust model with reasonable accuracy and without overfitting, since the model is trained
dynamically on new vibration signal patterns. Binarization may, however, mask the severity of
the fault, and the number of fault classes may not be deterministic.

A wrong initial prediction in a multi-class classifier may deteriorate the final performance of
the model. This motivated Hao Xu [30] to develop a plurality voting based SVM for diagnosing
multiple faults in power transformers by monitoring the insulating oil. The model uses the
synthetic minority over-sampling technique (SMOTE) to sample the data obtained from the
benchmark DGA dataset, the IEC TC 10 database. SMOTE over-samples the minority class to form
synthetic samples. The plurality voting SVM model is a combination of many binary SVM
classifiers in which equal weightage is given to every classifier. This contrasts with the
multi-class SVM classifier, where a wrong prediction at the initial classifier degrades the
performance of the entire model. The penalty factor and learning rate are tuned using Genetic
Algorithms (GA). These parameters gain weightage depending on the severity of the faults. The
proposed model shows a classification accuracy of 76.2%, which may seem low, since other hybrid
methods with SVM give better results.
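The interpolation at the heart of SMOTE can be sketched in a few lines. This is a simplified illustration rather than the procedure of [30]: the base sample is drawn at random instead of looping over every minority sample, the function name `smote` and its parameters are choices made here, and the two-dimensional points are hypothetical.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Create synthetic minority samples by interpolating towards near neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base sample
        # (squared Euclidean distance).
        others = [s for s in minority if s is not base]
        others.sort(key=lambda s: sum((a - b) ** 2 for a, b in zip(base, s)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples with two features each.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
synthetic = smote(minority, n_synthetic=4)
print(synthetic)
```

Every synthetic point lies on a line segment between two real minority samples, which is how the minority class is enlarged without simply duplicating records.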

An improvement in accuracy is brought by integrating SVM with the popular Particle Swarm
Optimization (PSO) for detecting faults in transformers. The model used a Radial Basis Function
(RBF) kernel and stepwise regression [31], which gives good classification accuracy with
shorter running time and also avoids overfitting the model to the test data. The penalty factor
(C) and the slack margin of this SVM model are tuned using Modified Evolutionary Particle Swarm
Optimization with Time Variant Acceleration Coefficient (MEPSO-TVAC), which expands the search
space to obtain the global maximum. The model is tested on the benchmark S1 Table that has
dissolved gas analysis (DGA) data. The accuracy of the model is found to be 99.50% at a running
time of 90.8578 seconds, which is very encouraging. The model is a major leap in using SVM for
fault diagnosis.
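How a particle swarm searches a hyperparameter space can be shown with a plain PSO loop. The sketch below is a standard PSO with fixed inertia and acceleration coefficients, not the MEPSO-TVAC variant of [31]; it minimizes an error rather than maximizing accuracy, and the objective `toy_error` is a hypothetical stand-in for the validation error of an SVM as a function of two hyperparameters.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize objective over a box using a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]              # personal best positions
    best_val = [objective(p) for p in pos]      # personal best values
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (g_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val


def toy_error(params):
    """Hypothetical error surface with its minimum at (1.0, -2.0)."""
    log_c, log_kernel_width = params
    return (log_c - 1.0) ** 2 + (log_kernel_width + 2.0) ** 2


best_params, best_error = pso_minimize(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best_params, best_error)
```

In a real tuning run each evaluation of the objective would train and validate an SVM, which is where most of the running time reported for such methods is spent.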

The extensive research on fault diagnosis of DGA transformers shows that there is still a gap
for an unambiguous multi-class model. Feature-independent models like ANN and extreme learning
algorithms can be explored for fault diagnosis of DGA power transformers. Only very few models
focus on diagnosing multiple faults.




3. COMPARATIVE ANALYSIS OF FAULT DIAGNOSIS METHODS

Machine Learning algorithms are extensively used in machine fault diagnosis. Each model and
method has its own strengths and limitations. Though processing time is also a critical factor
in evaluating the models, the advent of the Graphics Processing Unit (GPU) has reduced its
impact. This section compares the classification accuracy of the machine learning methods
reviewed in this paper.

3.1. Analysis of Machine Learning Algorithms for Fault Diagnosis

The literature on bearing fault diagnosis shows developments in all phases, namely feature selection, parameter tuning and modelling. A brief comparative study indicates the significant progress of fault diagnosis. Table 1 summarizes the performance of Machine Learning methods in the fault diagnosis of bearings.

Fig. 1 shows the comparative study of the models used in diagnosing bearing faults. Deep learning networks prove to be a promising solution for fault diagnosis of bearings, as the models deploying neural networks and extreme learning machines offer better results. The classification accuracy of all the methods is almost the same, but there is a substantial difference between the models in terms of other metrics like computation cost, Remaining Useful Life (RUL) etc. Though the classification accuracy of the unsupervised Neural Network is relatively low, it is a good approach for fault diagnosis of unlabelled big data. The discussed algorithms need not be confined only to bearings. There is wide scope in health monitoring of various equipment such as hydraulic brakes, avionic instruments, gear boxes and rotating machinery.

Figure 1 Comparison of classification accuracy in bearings
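The comparison in Figure 1 can be reproduced from the accuracies reported in Table 1. The short script below is only a convenience sketch: the dictionary keys are abbreviations of the method names in Table 1, and the variable names are illustrative.

```python
# Classification accuracies (%) as reported in Table 1 of this review.
bearing_accuracy = {
    "M-RVM": 97.98,
    "SVM-BBDE": 99.0,
    "NN-EMD": 93.0,
    "LMD-SVD ELM": 99.0,
    "H-DLN": 99.03,
    "ADCNN": 97.9,
    "U-NN": 92.2,
    "DNN-TC": 94.4,
    "TICNN": 95.5,
}

# Rank the methods from highest to lowest reported accuracy.
ranked = sorted(bearing_accuracy.items(), key=lambda kv: kv[1], reverse=True)
spread = ranked[0][1] - ranked[-1][1]          # best minus worst, in points
mean_acc = sum(bearing_accuracy.values()) / len(bearing_accuracy)

for name, acc in ranked:
    print(f"{name:<12} {acc:6.2f}")
print(f"spread = {spread:.2f} points, mean = {mean_acc:.2f}")
```

The ranking places the hierarchical deep learning network at the top and the unsupervised neural network at the bottom, with all methods within about seven percentage points of each other.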

The comparative study of leakage detection in pumps shows many novel developments in modelling. Feature selection and parametric optimization are done by new feature selection algorithms. Table 2 gives an overview of the fault diagnosis methods used in leakage detection in pumps.



Figure 2 Comparison of classification accuracy in pumps

The comparative study of pump leakage diagnosis models is shown in Fig. 2. It is evident that decision trees and neural networks dominate the domain of leakage detection in pumps. Though all of the models discussed here demonstrate good accuracy, they differ in the way their parameters are tuned. The Bayesian network with multi-source fusion shows good performance in detecting multiple faults at a lower computational cost. It is also evident that the models have to be validated on other metrics like F-score, computation time etc. The models can also be deployed in fault diagnosis of aircraft fuel systems, pressure valves, natural gas pipelines, turbines etc.
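Why accuracy alone is not sufficient can be seen from a small sketch that computes per-class F-scores from a confusion matrix. The two-class counts below (95 healthy and 5 leaking samples, all predicted healthy) are hypothetical and chosen only to show the effect of class imbalance; the function names are illustrative.

```python
def accuracy(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total


def per_class_f1(confusion):
    """F1 = 2*TP / (2*TP + FP + FN) for each class; 0.0 when undefined."""
    n = len(confusion)
    scores = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                          # missed samples of class k
        fp = sum(confusion[i][k] for i in range(n)) - tp     # wrongly assigned to class k
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Hypothetical classifier that labels every sample as "healthy" (class 0).
skewed = [[95, 0],
          [5, 0]]

acc = accuracy(skewed)
f1 = per_class_f1(skewed)
macro_f1 = sum(f1) / len(f1)
print(acc, f1, macro_f1)
```

Here the accuracy is 95% although no leak is ever detected, and the F-score of the leak class is zero, which is the kind of failure that an accuracy-only comparison hides.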

Comparison of the literature on transformer fault diagnosis demonstrates tremendous advancements in parameter optimization. This is because minor tuning of a parameter resulted in a good improvement in performance. Table 3 summarizes the fault diagnosis methods in power transformers.

Figure 3 Comparison of classification accuracy in power transformers

The comparative study of the fault diagnosis models in DGA power transformers, shown in Fig. 3, indicates that SVM proves to be an effective method with better accuracy. Logical Data Analysis (LDA) with pattern recognition is a novel approach that allows the number of fault classes to be expanded, thus facilitating detection of new faults that the model was not trained on earlier. Kernel tricks and the fixing of hyperplanes also play a vital role in improving the performance of the SVM models. The models are extensible to various other components, such as wiring faults and high voltage circuit breakers.

4. OPEN RESEARCH ISSUES

The detailed study and analysis of the fault diagnosis of electrical and mechanical faults highlights wide research gaps in building fault diagnosis models. The major areas that have good scope for research include:

i) Automatic Feature extraction

It is a well-known fact that the proper selection of features holds the lion's share in the performance of any algorithm or model. Almost all the models spend a considerable amount of time and computation on feature selection. Extreme Learning Machines can be exploited for automatic feature selection.
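The idea behind Extreme Learning Machines can be illustrated with a minimal sketch, assuming the standard formulation: the hidden-layer weights are drawn at random and left fixed, so the hidden layer acts as an untrained feature extractor, and only the output weights are solved in closed form by regularised least squares. The tanh activation, the ridge term, the hidden-layer size and the toy two-feature samples are assumptions for illustration; this is not the LMD-SVD model of [13].

```python
import math
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def train_elm(xs, ys, n_hidden=8, ridge=1e-3, seed=0):
    """Return a predict(x) function for a single-hidden-layer ELM."""
    rng = random.Random(seed)
    dim = len(xs[0])
    # Random, fixed hidden layer: these weights are never trained.
    w = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_hidden)]
    bias = [rng.uniform(-1, 1) for _ in range(n_hidden)]

    def hidden(x):
        return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + b)
                for row, b in zip(w, bias)]

    h = [hidden(x) for x in xs]
    # Normal equations of ridge regression: (H^T H + ridge*I) beta = H^T y
    a = [[sum(r[i] * r[j] for r in h) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    b = [sum(r[i] * y for r, y in zip(h, ys)) for i in range(n_hidden)]
    beta = solve_linear(a, b)

    def predict(x):
        return sum(bi * hi for bi, hi in zip(beta, hidden(x)))

    return predict


# Hypothetical samples: two vibration features, label -1 (healthy) or +1 (faulty).
xs = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.3], [0.9, 0.8], [0.8, 0.95], [0.85, 0.7]]
ys = [-1, -1, -1, 1, 1, 1]
predict = train_elm(xs, ys)
rss = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys))
print(rss)
```

Because training reduces to one linear solve, no iterative back-propagation is needed, which is consistent with the short running times reported for ELM-based models in this review.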

ii) Parameter tuning mechanisms

Parameters are the backbone of any machine learning algorithm. The right choice of parameter values will boost the performance of the model. The learning rate and penalty factor are the two parameters which are tuned to obtain good results. But there is room for considering other important parameters which dominate the functioning of the model.

iii) Deploying Nature Inspired Algorithms in building models

From this review article it is apparent that the majority of the fault diagnosis models use existing algorithms or are framed as hybrids of existing algorithms. There is tremendous scope for development of new nature inspired algorithms. There are many aspects of nature yet to be unearthed which may yield new solutions or algorithms.

iv) Dynamic models for unsupervised data

Almost all the models reviewed in this article are supervised algorithms. Unsupervised algorithms can be employed to handle mechanical big data streaming from sensors. Also, training a model for real time on-line applications is very tedious.

v) Machine Health Prognostic

Though diagnosis of equipment is useful for on-time isolation of defective machinery parts, it does not estimate the lifespan of the machinery part. Fault prognostics would be a better approach, as it assesses the lifetime of the equipment and predicts the fault occurrence.
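The difference between diagnosis and prognosis can be made concrete with a minimal sketch of a Remaining Useful Life estimate, assuming a health index of the kind listed for the NN-EMD model in Table 1. The linear degradation trend, the failure threshold of 1.0 and the sample values are hypothetical; practical prognostic models are considerably more elaborate.

```python
def fit_line(ts, hs):
    """Ordinary least-squares fit h = slope * t + intercept."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_h = sum(hs) / n
    sxx = sum((t - mean_t) ** 2 for t in ts)
    sxy = sum((t - mean_t) * (h - mean_h) for t, h in zip(ts, hs))
    slope = sxy / sxx
    return slope, mean_h - slope * mean_t


def remaining_useful_life(ts, hs, threshold):
    """Extrapolate the fitted trend to `threshold`; None if no upward trend."""
    slope, intercept = fit_line(ts, hs)
    if slope <= 0:
        return None                      # health index is not degrading
    t_fail = (threshold - intercept) / slope
    return max(0.0, t_fail - ts[-1])     # time left after the last measurement


# Hypothetical health index (0 = new, 1 = failed) sampled every 10 hours.
hours = [0, 10, 20, 30, 40, 50]
health_index = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
rul = remaining_useful_life(hours, health_index, threshold=1.0)
print(rul)
```

A diagnostic model would only report the current state at hour 50, whereas the extrapolation also gives an estimate of how long the part can remain in service.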

The above mentioned issues are not exhaustive. Data uncertainty is a notable issue while handling real time data. New generic models can be developed that are less feature dependent and capable of handling heterogeneous data.

Table 1 Bearing Fault Diagnosis Methods

| S.No | Methods | Salient Features | Optimizations | Metric |
|---|---|---|---|---|
| 1. | Multi-class Relevance Vector Machine (M-RVM) | Extraction of features from mean peak ratio; SMO is used for setting the hyperplane; Independent Component Analysis; Possibility of missing prominent features | Component analysis, Tuning penalty factor (C) and by cross validation | Accuracy: 97.98%; Testing error: 2.02% |
| 2. | SVM with BBDE (SVM-BBDE) | Nature inspired algorithm; Processes raw time series data; No feature selection and pre-processing | Intercluster distance in the feature space (ICDF), Usage of BBDE to reduce penalty factor (C) and kernel parameter | Accuracy: 99%; Running time: 19.42 s |
| 3. | Neural Network with EMD (NN-EMD) | Time domain and time-frequency domain feature selection; Formulation of health index; Estimates degree of wear; Prognostic model | Feature extraction by energy entropy | Accuracy: 93%; RUL |
| 4. | LMD-SVD Extreme Learning Machine (LMD-SVD) | Self adaptive model; No dependency on activation function; Product function as features; Check on feature coincidence | Automatic feature selection by ELM | Accuracy: 99%; Running time: 0.77 s |
| 5. | Hierarchical deep learning network (H-DLN) | Identification of weak links; Overhead of parameter tuning; Extractors act as layer interfaces; Scalable model | Severity ranking to identify weak links | Accuracy: 99.03% |
| 6. | Adaptive Deep Convolution Neural Network (ADCNN) | Recognition of fault pattern and sizes; Tuning of learning rate, batch size and number of kernels; ConvNet: fusion of convolution layer and max pooling layer | Tuning of learning rate, batch size and number of kernels in each layer | Accuracy: 97.9% |
| 7. | Unsupervised Neural Network (U-NN) | Suited for mechanical big data; Simplifies the feature selection; Sparse filtering to extract local signals | Whitening to improve convergence speed | Accuracy: 92.2%; Standard deviation: 0.19% |
| 8. | Deep Neural Networks with Temporal Coherence (DNN-TC) | Uses local time invariant data to learn; Temporal data is stored in memory; Fault recognition error is calculated using entropy; No expertise is required for signal processing and feature extraction | Low signal processing and feature extraction time, Temporal coherence is considered during fault recognition | Accuracy: 94.4% |
| 9. | Training Interference Convolution Neural Networks (TICNN) | Reduces internal covariance in data; Minimizes the impact of choice of parameters; Suitable for noisy data; Reduces overfitting by dropouts | Batch normalisation, Dropouts, Ensemble learning | Accuracy: 95.5% |

Table 2 Leakage Detection Methods

| S.No | Methods | Salient Features | Optimization | Metrics |
|---|---|---|---|---|
| 1. | Self organized Competitive NN (SOC NN) | Competitive neural layer; Faster neural network | Self organised feature selection | 93.84% |
| 2. | Top Down Inductive Decision Tree (TDI-DT) | Feature selection and classification are done by the same algorithm; Feature selection is based on entropy and information gain | Error based post pruning | Accuracy: 99.7%; Confusion matrix |
| 3. | J48-SWT | Feature extraction by SWT and J48; Classification by induction tree; Chances of overfitting | NIL | 93.84% |
| 4. | Multisource fusion using Bayesian network (Bayes) | Multiple fault detection points; Fusion of data from multiple sources; Can detect single and simultaneously occurring faults | Noisy-MAX function to limit the count of conditional probability | 99.69% |
| 5. | OC-SVM with Normality model (OC-SVM) | Normality model; Threshold from non-replaceable pool of features; Novelty detection by log likelihood computation | SFA to select best features | Accuracy: 90% |
| 6. | SVM-DWT | Discrete Wavelet Transform to extract features; Application of ELM; Reduced computation time | NIL | Accuracy: 99.92%; Time complexity: 0.25 s |
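The entropy and information gain criterion listed for the decision tree in Table 2 can be sketched as follows. The formulas are the standard Shannon entropy and information gain; the two candidate features and the four labelled samples are hypothetical and serve only to show how an informative feature is separated from an uninformative one.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting
    the samples by the value of one discrete feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical pump samples.
labels = ["leak", "leak", "ok", "ok"]
pressure_drop = ["high", "high", "low", "low"]   # separates the classes perfectly
shift = ["day", "night", "day", "night"]         # carries no class information

gain_pressure = information_gain(pressure_drop, labels)
gain_shift = information_gain(shift, labels)
print(gain_pressure, gain_shift)
```

A top-down tree inducer would split on `pressure_drop` first, since it has the larger gain, which is how feature selection and classification end up being performed by the same algorithm.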

Table 3 Transformer Fault Diagnosis Methods

| S.No | Methods | Salient Features | Optimization | Metrics |
|---|---|---|---|---|
| 1. | Multi layer SVM | Quadratic hyperplane; Does not get stuck in local optima; Multi fault detection; Can detect only trained faults | NIL | Accuracy: 99%; Average running time: <1 s |
| 2. | LS-SVM | Non-linear method; Improved global searching; Fitness based feature selection | Optimised tuning of penalty factor and kernel parameter by GA | Accuracy: 68.09% |
| 3. | Logical Data Analysis and pattern recognition | Binarization of attributes; Multi fault detection; Dynamic growth of fault classes; Ranking of features; More robust | Cut points distinguish features in the patterns, Discriminating factor to create distinct classes | Accuracy: 93.4%; Average running time: 10 s |
| 4. | SVM-SMOTE | Synthetic data; Plurality voting SVM with equal weightage for every classifier; Detection of fault severity | Tuning of penalty factor and learning rate | Accuracy: 76.2% |
| 5. | SVM-MEPSO-TVAC | RBF kernel; Stepwise regression; Good accuracy; Shorter running time; Reduces overfitting | Tuning of penalty factor and kernel parameter | Accuracy: 99.5%; Average running time: 74.36 s |

5. CONCLUSION

The field of condition monitoring and fault diagnosis is a wide area of research which is now progressing rapidly by deploying intelligent methods. New inventions and advanced machine designs demand zero hardware defects. Many models and frameworks have been built to detect faults and have been validated against benchmark datasets. AI is emerging as a paramount way to diagnose faults. This work presents a comprehensive review of fault diagnosis in bearings, pumps and power transformers. The fault diagnosis accuracy of the models is compared and the scope for improvements is suggested.

Deep Learning and Extreme Learning Machines are finding applications in various domains and are the latest buzzwords among researchers. Their automatic feature selection and more accurate results make them a natural fit for machine health monitoring and fault diagnosis.

REFERENCES

[1] Report on Advanced Surveillance, Diagnostic and Prognostic Techniques in Monitoring

Structures, Systems and Components in Nuclear Power Plants, No: NP-T-3.14.

[2] E. Zio, F. Di Maio, M. Stasi, A data-driven approach for predicting failure scenarios in nuclear

systems, Annals of Nuclear Energy, Vol. 37, 2015, pp. 482-491.

[3] E. Zio, Diagnostics and Prognostics of Engineering Systems: Methods and Techniques, Chapter

17, Engineering Science Reference, USA.

[4] J.P. Ma and J. Jiang, Applications of fault detection and diagnosis methods in nuclear power

plants: A review, Progress in Nuclear Energy, Vol. 53, 2011, pp.255-266.

[5] Fan Li, Dynamic Modeling, Sensor Placement Design, and Fault Diagnosis of Nuclear Desalination Systems, The University of Tennessee, Knoxville, 2001.

[6] Mark Schwabacher, A Survey of Data-Driven Prognostics, Infotech Aerospace Conferences,

2015.

[7] Enrico Zio, Francesco Di Maio, Marco Stasi, A data-driven approach for predicting failure

scenarios in nuclear systems, Annals of Nuclear Energy, Elsevier Masson, Vol.37, 2011,pp.482-

491.

[8] Xiaojie Guo, Liang Chen, Changqing Shen, Hierarchical adaptive deep convolution neural network and its application to bearing fault diagnosis, Journal of Measurements, Vol. 93, 2016, pp. 490-502.

[9] Zhen Peng, Lifeng Wu, Beibei Yao and Yong Guan, Fault Diagnosis from Raw Sensor Data Using Deep Neural Networks Considering Temporal Coherence, Sensors, 2017.

[10] Ben Ali, Nader Fnaiech, Lotfi Saidi, Brigitte Chebel-Morello, Farhat Fnaiech, Application of empirical mode decomposition and artificial neural network for automatic bearing fault diagnosis based on vibration signals, Journal of Applied Acoustics, Vol. 8, 2015, pp. 15-27.

[11] Xiaoyuan Zhang, Daoyin Qiu, Fuan Chen, Support vector machine with parameter optimization by a novel hybrid method and its application to fault diagnosis, Journal of Neurocomputing, Vol. 149, 2014, pp. 641-651.

[12] Meng Gan, Cong Wang, Chang'an Zhu, Construction of hierarchical diagnosis network based on deep learning and its application in the fault pattern recognition on rolling element bearings, Journal on Mechanical Systems and signals, Vol. 72, 2016, pp. 94-102.

[13] Ye Tian, Jian Ma, Chen Lu, Zili Wang, Rolling bearing fault diagnosis under variable conditions

using LMD-SVD and extreme learning machine, Journal of Mechanics and Machine Theory,

Vol. 90, 2015, pp.175-186.

[14] Yaguo Lei, Jing Lin, Saibo Xing, Steven X. Ding, An Intelligent Fault Diagnosis Method Using Unsupervised Feature Learning Towards Mechanical Big Data, IEEE Transactions on Industrial Electronics, Vol. 63, 2016, pp. 3137-3147.

[15] Marco Fagiani, Stefano Squartini, Leonardo Gabrielli, Marco Severini and Francesco Piazza, A Statistical Framework for Automatic Leakage Detection in Smart Water and Gas Grids, Journal of Energies, Vol. 9, 2016.

[16] Wei Zhang, Chuanhao Li, Gaoliang Peng, Yuanhang Chen, Zhujun Zhang, A deep convolutional neural network with new training methods for bearing fault diagnosis under noisy environment and different working load, Journal of Mechanical Systems and Signal Processing, Vol. 100, 2017, pp. 439-453.




[17] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal covariate shift, 2015.

[18] N. Srivastava, G.E. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple way to prevent neural networks from overfitting, Journal of Machine Learning Research, Vol. 15, 2014, pp. 1929-1958.

[19] V. Muralidharan, V. Sugumaran, Gaurav Pandey, Fault Diagnosis of Monoblock Centrifugal Pump using Stationary Wavelet Features and J48 Algorithm, International Journal of Production Technology and Management, Vol. 1, 2011, pp. 0976 – 6383.

[20] Kohonen T, Self-organized formation of topologically correct feature maps, Journal of Biol.

Cyber, Vol. 43, 1982, pp. 59–69.

[21] Baoping Cai, Yonghong Liu, Qian Fan, Yunwei Zhang, Zengkai Liu, Shilin Yu, Renjie Ji, Multi-source information fusion based fault diagnosis of ground-source heat pump using Bayesian network, Journal of Applied Energy, Vol. 114, 2014, pp. 1-9.

[22] V. Muralidharan, V. Sugumaran, Gaurav Pandey, Fault Diagnosis of Monoblock Centrifugal Pump using Stationary Wavelet Features and J48 Algorithm, International Journal of Production Technology and Management, Vol. 1, 2011, pp. 0976 – 6383.

[23] Baoping Cai, Yonghong Liu, Qian Fan, Yunwei Zhang, Zengkai Liu, Shilin Yu, Renjie Ji, Multi-source information fusion based fault diagnosis of ground-source heat pump using Bayesian network, Journal of Applied Energy, Vol. 114, 2014, pp. 1-9.

[24] Feng Jia, Yaguo Lei, Jing Lin, Na Lu, Deep neural networks: A promising tool for fault

characteristic mining and intelligent diagnosis of rotating machinery with massive data, Journal

of Mechanical Systems and Signal Processing, 2015.

[25] V. Muralidharan and V. Sugumaran, A Comparative Study between Support Vector Machine (SVM) and Extreme Learning Machine (ELM) for Fault Detection in Pumps, Indian Journal of Science and Technology, Vol. 9, 2016, pp. 0974-6846.

[26] Mohamad-Ali Mortada, Soumaya Yacout, Aouni Lakis, Fault diagnosis in power transformers using multi-class logical analysis of data, Journal of Intelligent Manufacturing, Vol. 25, 2013, pp. 1429-1439.

[27] Aparna R. Gupta, V. R. Ingle, M. A. Gaikwad, LS-SVM Parameter Optimization Using Genetic Algorithm To Improve Fault Classification Of Power Transformer, International Journal Of Engineering Research and Application, Vol. 2, 2012, pp. 1806-1809.

[28] Mohamad-Ali Mortada, Soumaya Yacout, Aouni Lakis, Fault diagnosis in power transformers using multi-class logical analysis of data, Journal of Intelligent Manufacturing, Vol. 25, 2013, pp. 1429-1439.

[29] Frank, A., & Asuncion, A., UCI machine learning repository, 2010 http://archive.ics.uci.edu/ml.

[30] Hazlee Azil Illias, Wee Zhao Liang, Identification of transformer fault based on dissolved gas analysis using hybrid support vector machine-modified evolutionary particle swarm optimisation, PLOS ONE, Vol. 13, 2018, pp. 1-15.

[31] Ghunem RA, Assaleh K, El-hag AH, Artificial neural networks with stepwise regression for

predicting transformer oil furan content, IEEE Transactions on Dielectrics and Electrical

Insulation, Vol.19, 2012, pp. 414-420.

[32] S.Sharanya, Revathi Venkataraman, An intelligent Context Based Multi‑layered Bayesian

Inferential predictive analytic framework for classifying machine states, Journal of Ambient

Intelligence and Humanized Computing, https://doi.org/10.1007/s12652-020-02411-2, 2020.

[33] S. Sharanya, S. Karthikeyan, Classifying malicious nodes in vanets using Support Vector

Machines with modified fading memory, ARPN Journal of Engineering and Applied Sciences,

vol. 12, No. 1, 2017, pp. 171-176
