http://www.iaeme.com/IJARET/index.asp 80 [email protected]
International Journal of Advanced Research in Engineering and Technology (IJARET) Volume 11, Issue 10, October 2020, pp.80-94, Article ID: IJARET_11_10_008 Available online at http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=10
ISSN Print: 0976-6480 and ISSN Online: 0976-6499
DOI: 10.34218/IJARET.11.10.2020.008
© IAEME Publication Scopus Indexed
ANALYSIS OF MACHINE LEARNING BASED
FAULT DIAGNOSIS APPROACHES IN
MECHANICAL AND ELECTRICAL
COMPONENTS
S. Sharanya
Department of Computer Science and Engineering, SRM Institute of Science and
Technology, Kattankulathur, Tamil Nadu, India
Revathi Venkataraman
Department of Computer Science and Engineering, SRM Institute of Science and
Technology, Kattankulathur, Tamil Nadu, India
G. Murali
Department of Mechatronics Engineering, SRM Institute of Science and
Technology, Kattankulathur, Tamil Nadu, India
ABSTRACT
Condition monitoring and fault diagnosis play a vital role in extending the lifespan
of any equipment. Diagnosing faults at the right time is crucial in life-saving appliances
and applications. Fault diagnosis for any equipment or system involves handling
large volumes of data, which is far beyond human computing capability. Deploying
automatic fault diagnosis approaches is therefore an intelligent solution, one that has opened
the gates for Artificial Intelligence (AI), Data Mining and Machine Learning
algorithms. This work reviews Machine Learning based fault diagnosis algorithms
and models for detecting faults in bearings, pumps and power transformers. A performance
comparison of the models is presented based on their accuracy of fault diagnosis. The
analysis also critiques the models and identifies possible scope for improvement. The inferences
from the analysis highlight the need to develop Extreme Learning (EL) models
that are less dependent on explicit feature selection.
Keywords: Bearings, Condition Monitoring (CM), Fault Diagnosis, Machine Learning,
Power Transformers, Pumps.
Cite this Article: S.Sharanya, Revathi Venkataraman and G. Murali, Analysis of
Machine Learning Based Fault Diagnosis Approaches in Mechanical and Electrical
Components, International Journal of Advanced Research in Engineering and
Technology (IJARET), 11(10), 2020, pp.80-94
http://www.iaeme.com/IJARET/issues.asp?JType=IJARET&VType=11&IType=10
Mahilet Reta, Dr. Subash Thanappan, Sathya Prabha and Eyob Mekonnen
1. INTRODUCTION
Fault diagnosis is a generic term for the process of finding the deviation of a system
or a component of a system from its normal operational profile. Comprehensive Condition
Monitoring (CM) is continuous surveillance of the system, involving a sequence of activities
such as system monitoring, fault detection, fault diagnostics and fault prognostics. The
condition of the equipment is monitored quantitatively by assessing some of its
critical variables. When the observed values of the monitored variables are abnormal,
this indicates the occurrence of a fault in the machinery, which is termed fault detection.
When a fault is detected, a diagnostic module is immediately activated to identify and
characterize the fault. It is vital to characterize the fault because each fault contributes to the
failure of the system in its own unique way. After the potential fault is identified, the prognostic
model predicts the failure probability distribution from current and past environmental
and usage conditions, past and current operational sensor data, and historical failure data [1].
Condition monitoring involves assessing the characteristic properties of the system either
continuously or at regular time intervals, and the result must be expressed as a quantitative value.
Fault detection is finding the variables that cross their bound values [2][32]. Fault
identification is checking the values of related variables. Sometimes the variables are fused or
combined to assess the system. Prognostics involves the sequence of recovery actions that make
the plant regain its safe state [3].
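The notion of fault detection as "variables crossing their bound values" can be illustrated with a minimal sketch. The variable names and limits below are hypothetical and chosen only for illustration; real bounds would come from the equipment specification.

```python
from typing import Dict, List, Tuple

# Hypothetical bounds for illustration only (not taken from the reviewed works).
BOUNDS: Dict[str, Tuple[float, float]] = {
    "vibration_rms_mm_s": (0.0, 4.5),
    "bearing_temp_c": (10.0, 85.0),
}


def detect_faults(sample: Dict[str, float],
                  bounds: Dict[str, Tuple[float, float]] = BOUNDS) -> List[str]:
    """Return the names of monitored variables that lie outside their bounds."""
    out_of_bounds = []
    for name, (low, high) in bounds.items():
        value = sample[name]
        if value < low or value > high:
            out_of_bounds.append(name)
    return out_of_bounds
```

An empty list means that every monitored variable is within its limits; a non-empty list is the trigger for the diagnostic module described above.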
Manual condition monitoring of equipment is a tiresome process that demands considerable
manpower. Apart from introducing inaccuracies, human intervention in the condition monitoring of
complex systems that operate in hazardous environments is a serious threat to life and
property. Two types of condition monitoring strategies are followed to avoid or reduce
these overheads: physical models and data-driven approaches [4].
Physical model: The physical model of a complex system can be constructed by exploiting
the physical relations between the system and its associated variables. This approach makes many
assumptions in building the physical model, which limits its applicability to
real-time systems [5].
Data-driven approaches: This approach, also known as the data mining or machine learning
approach, uses historic data obtained from the system to learn about the system and model
its behaviour [6]. Models developed through this approach take advantage of
the availability of large volumes of data to tune the condition monitoring models for
enhanced accuracy [4]. Data-driven approaches are broadly classified into two subdivisions,
namely statistical methods and artificial intelligence methods [7].
The statistical methods involve the use of many mathematical and statistical approaches.
Artificial intelligence methods such as Artificial Neural Networks (ANN), Support Vector
Machines (SVM) [33], Genetic Algorithms (GA), Principal Component Analysis (PCA) and
Expert System models are gaining significance in fault diagnosis and health monitoring
because of the rapid development of machine learning. Researchers have stepped
up their work on extending machine learning algorithms to almost
all domains. Notable work has been done on deploying machine learning algorithms for better
fault diagnosis in mechanical and electrical machinery. There is, however, a gap in knowledge
transfer between the diverse communities of the engineering discipline.
This article reviews the application of machine learning models and algorithms in
diagnosing bearing, pump and electrical faults, with detailed emphasis on feature selection,
parameter tuning and their limitations. This article should be of interest to experts
in the field of machine learning and machine health monitoring whose primary research goal is
to invent cutting-edge techniques in fault diagnosis. The main aims of this article can be
summarized as:
Review the state-of-the-art techniques for applying machine learning to fault diagnosis.
Pinpoint the advantages and limitations of each work in the context of machine
learning and health monitoring.
Give deeper insight into potential feature selection procedures along with
parameter tuning mechanisms.
Though extensive research has been done on using machine learning algorithms for fault
diagnosis of machinery parts, this work focuses on bearing, pump and transformer faults
because of their impact on the mechanical and production industries.
The paper is organized as follows: Section 1 gives a brief introduction to the various
phases of fault diagnosis and the classification of fault diagnosis approaches. Section
2 presents the models and algorithms used in fault diagnosis of bearings, pumps and
transformers. A detailed comparative analysis is given in Section 3, and the scope for future
enhancements is discussed in Section 4. Section 5 concludes the work.
2. MACHINE LEARNING ALGORITHMS IN FAULT DIAGNOSIS
Machine learning algorithms use the historic data presented to them to
predict future results. In terms of fault diagnosis, machine learning algorithms extract
interesting patterns and features from past data to diagnose faults.
2.1. Fault Diagnosis in Bearings
Bearings in mechanical machinery reduce the wear and tear caused by
shaft rotation. Bearings are widely used in the mechanical, automobile and textile industries,
which leads to their production in enormous volumes. There is therefore a pressing need for fault
diagnosis of these bearings in the production line. The literature reports improvements in every
aspect of fault diagnosis, so the incremental progress cannot easily be traced.
Relevance Vector Machines (RVM) and Support Vector Machines (SVM) are deployed by
Achmad Widodo et al. after performing component analysis of the signals from an acoustic
emission sensor [8]. The features are extracted directly from the signals and then reduced in
dimension using Principal Component Analysis (PCA). The localized errors are diagnosed by an SVM
with an RBF kernel, integrating Sequential Minimal Optimization (SMO) to fix the
classification hyperplane. The performance of both RVM and SVM models with PCA and
Independent Component Analysis (ICA) is measured. The testing error for RVM and SVM with
ICA is as low as 2.04%. Since the features are extracted directly from the acoustic signals, there
is a possibility of missing more prominent features.
Feature engineering is also a notable field in AI. It has become a chief research interest
because of its strong influence on the performance of an algorithm. Feature selection and parameter
tuning by nature-inspired algorithms have tremendously improved the performance of many
algorithms in different disciplines. Exploiting this fact, Xiaoyuan Zhang et al. designed a novel
hybrid model that optimizes the parameters of SVM using Barebones Particle Swarm
Optimization and Differential Evolution (BBDE) [9]. The search space for the BBDE is formed
by the Intercluster Distance in Feature Space (ICDF). The parameters optimized in the
SVM are the penalty factor C and the learning rate. The results of the model are
obtained by testing it on the Case Western Reserve University bearing dataset. The model gives
an accuracy of 99.79% with a running time of 0.42 seconds, which is much lower than that
of Differential Evolution SVM. This method offers good accuracy with low running time.
Another novelty in feature selection is given by Ben Ali et al. [10], who employ Intrinsic
Mode Functions (IMFs). The IMFs are extracted using a mathematical analysis method that
combines empirical mode decomposition with energy entropy, applied to the vibration signal data
collected from the NSF I/UCR Center for Intelligent Maintenance Systems (IMS). This data is
then processed by the neural network. This work also proposes a Health Index (HI) metric, which
indicates the degree of wear on the bearing caused by degradation. The method gives an
accuracy of 93% and can act as a prognostic method to predict the lifetime of the equipment.
The mathematical feature selection may not be economical in terms of computational cost.
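As a rough illustration of an energy entropy feature, the sketch below assumes that the IMFs have already been obtained by some empirical mode decomposition routine (not shown) and computes the Shannon entropy of their normalized energies. The function name and the use of the natural logarithm are assumptions, not details taken from [10].

```python
import math
from typing import Sequence


def energy_entropy(imfs: Sequence[Sequence[float]]) -> float:
    """Shannon entropy of the share of signal energy carried by each IMF."""
    energies = [sum(x * x for x in imf) for imf in imfs]
    total = sum(energies)
    if total == 0:
        raise ValueError("all components have zero energy")
    shares = [e / total for e in energies]
    # Components with zero energy contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in shares if p > 0)
```

When the energy is spread evenly over the components the entropy is at its maximum; when one component dominates it approaches zero, which is what makes the quantity usable as a degradation feature.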
Machine learning has taken a big leap after the introduction of Extreme Learning Machines
(ELM), which ease feature selection and parameter optimization in machine learning
algorithms. An intelligent method that employs a self-adaptive Local Mean Decomposition
(LMD) and Singular Value Decomposition (SVD) enabled Extreme Learning Machine (ELM)
is framed by Ye Tian et al. [11] for diagnosing bearing faults from the Case Western Reserve
University 6205-2RS JEM SKF dataset. The ELM is known for its non-dependency on
activation functions. It is a generic extension of the Single Layered Feed Forward Neural
Network (SLFN). The signals are decomposed into Product Functions (PFs) of envelope signals
and frequency modulated signals. The PFs form the features for the ELM. Similarly, the features of
SVD, namely the singular values, are used to train the ELM. A comparative study is made
between these methods in terms of feature coincidence. The trained model is then validated
against SVM and a BP network in terms of manpower, accuracy and running time. The results of
the study indicated that the ELM outperforms the other two, with an average accuracy of 99% and
an average running time in the order of 0.1 seconds. The SVM also offers tough competition to
the ELM, but the BP network suffers from a longer computation time. This method could be
generalised to other rotating machinery components such as shafts, and there is great scope for
investigating new modelling techniques.
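The short running time reported for the ELM follows from how such networks are usually trained. In the commonly described formulation (a general sketch, not necessarily the exact variant used in [11]), the hidden-layer weights are drawn at random and only the output weights are solved for in closed form. The symbols below are assumptions introduced for illustration: X is the N x d feature matrix, W and b are the random input weights and biases, g is the activation function, T is the target matrix and H^† is the Moore-Penrose pseudo-inverse of H.

```latex
H = g\left(XW + \mathbf{1}b^{\top}\right), \qquad
\hat{\beta} = H^{\dagger}\,T, \qquad
\hat{Y} = H\hat{\beta}
```

Since no iterative back-propagation is involved, training reduces to one matrix pseudo-inversion, which is consistent with the BP network needing more computation time in the comparison above.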
Model-based improvements open a new avenue for research. Many machine learning
algorithms are supported by hierarchical models, which also prove to be potential solutions in
fault diagnosis. The Hierarchical Deep Learning (HDN) network constructed by Meng Gan et al.
[12] identifies the potential weak links using severity ranking to make the model more reliable.
The classification of fault types and the severity ranking process are implemented on a
two-layered hierarchical network. The layers of the network are interconnected by extractors that
transmit fault layers from the top deep belief network to the bottom layer. The model was
evaluated on the dataset collected from the Case Western Reserve University Bearing Data
Center, with an accuracy of 99.03%, which is better than the Back Propagation Neural
Network (BPNN) and SVM, which showed 95.14% and 96.48% respectively. The limitation of
this work is that the network parameters must be well tuned, which is a tedious task. This model
can be enhanced by fitting a self-tuning HDN that mitigates the overhead of parameter tuning.
Parameter tuning and feature selection are further simplified by the Convolution Neural
Network (CNN), which is not limited to image processing. A novel Adaptive Deep
Convolution Neural Network (ADCNN) is built by Xiaojie Guo et al. [13] to diagnose faults
and also to measure the severity of the fault through pattern recognition. The ADCNN is built
with four layers, namely the transformation, ConvNet, connection and classification layers. The
first layer converts the vector data to a matrix input that is readily recognized by the DNN. The
ConvNet layer is a fusion of a convolution layer and a max pooling layer that extracts useful
features from the input by using filters. The classification layer applies logistic regression with a
softmax function that estimates the probability of the fault class. The critical parameters to be
tuned in this model are the learning rate, the batch size and the number of kernels in each layer.
Any erroneous choice of parameters would degrade the performance of the model. The model is
tested on the Case Western Reserve University dataset of bearing health, which resulted in 97.9% accuracy in cross
and mean validation. The accuracy of ADCNN opens up the scope for the shallow learning
models with minor tricks and tuning mechanisms.
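The transformation and pooling steps of such a network can be sketched in a few lines. This is a minimal illustration under assumptions (a fixed row width and non-overlapping 2x2 pooling windows); it does not reproduce the ADCNN architecture of [13], and the function names are hypothetical.

```python
from typing import List


def to_matrix(signal: List[float], width: int) -> List[List[float]]:
    """Reshape a 1-D signal into rows of `width` samples (transformation step)."""
    if width <= 0 or len(signal) % width != 0:
        raise ValueError("signal length must be a positive multiple of width")
    return [signal[i:i + width] for i in range(0, len(signal), width)]


def max_pool_2x2(matrix: List[List[float]]) -> List[List[float]]:
    """Non-overlapping 2x2 max pooling; assumes even row and column counts."""
    pooled = []
    for r in range(0, len(matrix), 2):
        row = []
        for c in range(0, len(matrix[0]), 2):
            window = (matrix[r][c], matrix[r][c + 1],
                      matrix[r + 1][c], matrix[r + 1][c + 1])
            row.append(max(window))
        pooled.append(row)
    return pooled
```

Pooling keeps only the strongest response in each window, which halves each dimension of the feature map and makes the extracted features less sensitive to small shifts in the signal.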
Another neural network model, proposed by Yaguo Lei et al., used unsupervised learning in
building an intelligent two-stage method using mechanical big data for diagnosing bearing faults
[14]. The sparse filtering layer extracts the local representative features from Case Western
Reserve University’s vibration data. As the neural network is trained on big data, the features
are made to converge by a whitening process to increase the learning speed. The classification
layer uses a softmax regression function to classify the health of the bearing. This method reduces
human involvement in feature selection, since the sparse filtering technique is able to extract
features from the large volume of labelled and unlabelled mechanical big data. Experimental
analysis of the proposed method on the locomotive and motor bearing data set showed an
average accuracy of 92.5%. The proposed method is well suited to handling big data
where feature selection is very difficult.
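The softmax function used in the classification layer converts raw class scores into probabilities that sum to one. A minimal sketch is given below; the scores in the usage note are made-up numbers, not outputs of the reviewed model.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw class scores into probabilities that sum to one."""
    peak = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, `softmax([2.0, 1.0, 0.1])` assigns the largest probability to the first class, which would then be reported as the predicted health state.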
To mitigate the impact of feature selection, Ran Zhang et al. proposed a fault diagnosis model
for bearing faults using a Deep Neural Network that directly reads the time-varying vibration
signals without any feature selection or signal processing [15]. This model predicts its output by
considering the temporally coherent time series data in its near past. The model is trained
only with local time-invariant data so as to learn the local characteristics of the data, but
the fault diagnosis process considers the temporal data within a specified memory to classify the
state of the bearing. The model is tested on two benchmark datasets obtained from the University
of Cincinnati Center for Intelligent Maintenance Systems and the Bearing Data Center of Case
Western Reserve University. The fault recognition error of the model is computed using cross
entropy. An accuracy of 99% is achieved in training a neural network with temporal coherence,
which opens the door for machine fault prognostics. This method is useful for people with
limited expertise in signal processing and feature extraction. The convergence of the model may
slow down when the memory limit is large.
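The cross-entropy error mentioned above can be sketched as follows. The function names and the clipping constant `eps` are assumptions made for illustration; the exact loss formulation of [15] is not given in this review.

```python
import math
from typing import List, Sequence


def cross_entropy(true_class: int, predicted_probs: Sequence[float],
                  eps: float = 1e-12) -> float:
    """Negative log-probability assigned to the true class of one sample."""
    # eps guards against log(0) when the model assigns zero probability.
    return -math.log(max(predicted_probs[true_class], eps))


def mean_cross_entropy(true_classes: List[int],
                       batch_probs: List[Sequence[float]]) -> float:
    """Average cross-entropy over a batch of samples."""
    losses = [cross_entropy(t, p) for t, p in zip(true_classes, batch_probs)]
    return sum(losses) / len(losses)
```

The loss is zero when the true class receives probability one and grows as the predicted probability of the true class falls.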
The integration of the benefits of various machine learning algorithms could
result in a better model. The progress of deep learning networks with ensembling motivated
researchers to develop new techniques that could avoid overfitting. Wei Zhang et al. framed
a deep convolutional neural network for diagnosing faults in bearings [16] that
avoids overfitting while requiring less domain expert knowledge. The proposed Training
Interference Convolution Neural Network (TICNN) extracts visual feature maps from the dataset
created at the Case Western Reserve University (CWRU) Bearing Data Center. This method takes the
temporal vibration signals and applies a Convolutional Neural Network (CNN) with Batch
Normalization (BN) to accelerate the training process by reducing the internal covariate shift of
the data [17]. This novel step eliminated the need for de-noising the temporal signals. The ReLU
activation units, together with max-pooling, help the deep neural network converge and extract
the local invariant features. This method also deploys the dropout trick to reduce overfitting of
the model by deactivating neurons with probability p, which is a hyperparameter [18]. The
stability of the model is established by the majority voting technique, an ensemble method that
significantly reduces the impact of the choice of initial parameters used to train the model. The
model is validated by testing its performance in noisy environments and across varying loads,
obtaining an accuracy of 95.5% when compared with rival methods such as Support
Vector Machines (SVM), Multi Layer Perceptron (MLP) and Deep Neural Networks (DNN).
This method is suitable for applications that handle time-varying data.
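Two of the ingredients named above, dropout and majority voting, can be sketched independently of any particular network. The code below is an illustration under assumptions: it uses the common "inverted dropout" rescaling and simple label voting, neither of which is confirmed as the exact form used in [16].

```python
import random
from collections import Counter
from typing import Hashable, List, Sequence


def dropout(activations: Sequence[float], p: float,
            rng: random.Random) -> List[float]:
    """Inverted dropout: zero each unit with probability p, rescale survivors."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    keep = 1.0 - p
    return [a / keep if rng.random() >= p else 0.0 for a in activations]


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label predicted by most ensemble members.

    Ties are resolved in favour of the label that was seen first.
    """
    return Counter(predictions).most_common(1)[0][0]
```

With an ensemble of models trained from different initial parameters, `majority_vote(["inner race", "outer race", "inner race"])` returns the label supported by most members, which is how voting dampens the effect of any single unlucky initialization.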
The machine learning algorithms discussed above show good versatility in bearing fault
diagnosis. This survey leads to the following implication: the performance of learning
algorithms greatly depends on feature selection and parameter tuning. Though the algorithms
show good performance, achieving acceptable accuracy in real on-line systems is a challenging
task.
2.2. Leakage Detection in Pumps
The leakage in pumps is a common phenomenon when the pump wears out. So detection of
faulty pumps will save energy and resources. This section describes the major works done in
fault diagnosis of pumps.
Peng Xu et al. studied the performance of applying an Artificial Neural Network (ANN) to the
leakage detection of sucker rod pumps in the Jiangsu oilfield, China [19]. The study is done with
five classes of dynamometer cards, each representing one competitive layer of neurons. The
self-organized winning neuron’s weight is modified by the Kohonen rule [20]. The classification
accuracy of the proposed method is 99.9%, which is higher than that of the back propagation
network. This method selects the right input neuron using the Kohonen rule, which attempts to
improve the network’s accuracy. In real-time systems, the number of competitive layers may
increase drastically, affecting the computing time.
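The Kohonen rule moves only the weight vector of the winning neuron towards the presented input. A minimal sketch of one competitive update step is shown below; the squared Euclidean distance used to pick the winner and the function name are assumptions made for illustration.

```python
from typing import List, Sequence


def kohonen_update(weights: List[List[float]], x: Sequence[float],
                   learning_rate: float) -> int:
    """Apply one competitive-learning step in place; return the winner's index."""
    def sq_dist(w: Sequence[float]) -> float:
        return sum((wi - xi) ** 2 for wi, xi in zip(w, x))

    # Competition: the neuron whose weights are closest to the input wins.
    winner = min(range(len(weights)), key=lambda i: sq_dist(weights[i]))
    # Kohonen rule: w_new = w_old + learning_rate * (x - w_old), winner only.
    weights[winner] = [wi + learning_rate * (xi - wi)
                       for wi, xi in zip(weights[winner], x)]
    return winner
```

Repeating this step over many inputs pulls each neuron towards the centre of one cluster of inputs, so that each neuron comes to represent one class of dynamometer card.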
Decision trees naturally select the most prominent feature, which is analogous to the competitive
layers in neural networks. The extensive literature on fault diagnosis using decision trees with
proper pruning shows them to be one of the effective solutions. This is supported by Sakthivel
et al. through a fault diagnosis model for a monoblock centrifugal pump that used a Top Down
Inductive Decision Tree (TDIDT) with pessimistic pruning [21]. The statistical features, namely
mean, maxima, minima, kurtosis, variance and skewness, are measured from the vibration data.
Feature selection is done by entropy and information gain at each branching point. Decision
trees are well suited to developing models with continuous data such as vibrations.
The test data indicates a classification accuracy of 100%, but when presented with real-world data,
the algorithm produces 99.7%. This method is simple, but when the trees are not pruned, it may
lead to an overfitted model. Advanced tree pruning methods may reduce the tree size
drastically. This method may suffer from poor convergence when the dataset is large with
unprocessed features.
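The entropy and information gain criterion used at each branching point can be sketched for a single continuous feature and a candidate threshold. The feature values and labels in the usage note are invented for illustration and are not data from [21].

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a collection of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values: Sequence[float], labels: Sequence[Hashable],
                     threshold: float) -> float:
    """Entropy reduction obtained by splitting on `value <= threshold`."""
    left = [y for v, y in zip(values, labels) if v <= threshold]
    right = [y for v, y in zip(values, labels) if v > threshold]
    n = len(labels)
    remainder = sum(len(part) / n * entropy(part)
                    for part in (left, right) if part)
    return entropy(labels) - remainder
```

For kurtosis values `[0.1, 0.2, 0.8, 0.9]` labelled `["good", "good", "faulty", "faulty"]`, a threshold of 0.5 separates the classes perfectly and yields the maximum gain of one bit, so the tree would branch there rather than at 0.15.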
A better approach for feature extraction, proposed by Muralidharan et al. [22], was
employed in leakage detection for a centrifugal pump using stationary wavelet transforms. The
representative signals of the healthy pump are recorded, and the defective pumps are
isolated by comparing them with the healthy signals. The J48 algorithm is used for feature
selection as well as for classification. The WEKA implementation of J48 along with the stationary
wavelet transform shows a classification accuracy of 93.84%. A tree pruning mechanism could be
integrated into the model to avoid overfitting. Most of the works in fault diagnosis are based
on data emanating from a single source. A more generic model should be supported by
multisource information, which would naturally reduce bias.
A work that fuses multi-sensor data using Bayesian networks for pump
leakage detection [23] is given by Baoping Cai et al. This layered model is supported by two
Bayesian networks to detect the faults: one that detects single faults and another that detects
simultaneously occurring faults. The exponential growth of conditional probabilities in the
Bayesian network is controlled by using the Noisy-MAX function, which monitors only the cause
and effect nodes that have shown anomalous behaviour, thus reducing the number of conditional
probability parameters to be monitored. The work reported an accuracy of 99.69% for
single faults. The posterior probabilities for diagnosing multiple faults were not promising with
data from a single sensor, so fusion of multiple input sources could be an effective way to achieve
better fault diagnosis. There is room for developing completely automatic fault
diagnosis software using Bayesian networks as the backbone.
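Why additional sensors sharpen the posterior can be seen from a much simpler calculation than the Noisy-MAX network of [23]. The sketch below applies Bayes' rule once per sensor and assumes, purely for illustration, that the sensor readings are conditionally independent given the pump state; all probabilities in the usage note are made-up values.

```python
from typing import Iterable, Tuple


def posterior_fault(prior: float, p_symptom_given_fault: float,
                    p_symptom_given_healthy: float) -> float:
    """P(fault | symptom observed) by Bayes' rule for a binary fault state."""
    evidence = (p_symptom_given_fault * prior
                + p_symptom_given_healthy * (1.0 - prior))
    return p_symptom_given_fault * prior / evidence


def fuse_sensors(prior: float,
                 likelihoods: Iterable[Tuple[float, float]]) -> float:
    """Update the fault probability sensor by sensor.

    Each pair is (P(symptom | fault), P(symptom | healthy)). Sequential
    updating is valid only under the conditional-independence assumption.
    """
    p = prior
    for p_fault, p_healthy in likelihoods:
        p = posterior_fault(p, p_fault, p_healthy)
    return p
```

With a prior of 0.01 and a sensor that flags 90% of leaks but also 5% of healthy pumps, one alarming sensor raises the fault probability to about 0.15, while two such sensors raise it to about 0.77.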
The application of Bayesian networks motivated researchers to further investigate the
performance of the Gaussian Mixture Model (GMM) and Hidden Markov Model (HMM). A
novelty detector framework is developed based on One Class SVM (OC-SVM) for the leakage
detection that is likely to occur in infrastructure such as gas and water pipes [23]. The statistical
features of the data are captured from two datasets, namely the Almanac of Minutely Power
dataset (AMPds) and the Department of International Development (DFID) dataset. Feature
selection is done by selecting the winning feature from the pool of features using the Selection
Feature Algorithm (SFA), which selects one feature more than in the previous iteration. The
leakage detection is based on a normality model that uses log likelihood computation.
The comparative study of applying SFA with OC-SVM, GMM and HMM indicates that the
GMM and HMM models showed a good accuracy of 90%. This work formulated a framework
for selecting the best features, which could be extended to other machine learning models to
achieve better results.
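A selection scheme that adds one feature per iteration can be sketched as a greedy forward search. This is a generic sequential forward selection under assumptions, not the SFA of the reviewed work: the `score` callback (for example, validation accuracy of a detector trained on the given features) and the stopping rule are hypothetical.

```python
from typing import Callable, List, Sequence


def forward_select(features: Sequence[str],
                   score: Callable[[List[str]], float],
                   max_features: int) -> List[str]:
    """Greedily grow the feature set by one feature per iteration."""
    selected: List[str] = []
    best_score = float("-inf")
    while len(selected) < max_features:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        # Winning feature: the one whose addition gives the highest score.
        winner = max(candidates, key=lambda f: score(selected + [f]))
        winner_score = score(selected + [winner])
        if winner_score <= best_score:
            break  # assumed stopping rule: stop when no candidate improves
        selected.append(winner)
        best_score = winner_score
    return selected
```

The search evaluates only a small fraction of all possible feature subsets, which keeps it affordable, at the price of possibly missing feature combinations that are only useful together.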
Another notable work in feature selection is given by Muralidharan and Sugumaran [25]. The
accuracy and learning time of SVM and ELM in leakage detection of a
monoblock centrifugal pump are compared by extracting the wavelet features of the vibration
signals; the Discrete Wavelet Transform (DWT) is used to get time-related information from the
vibration signals. The study shows that the SVM achieves an accuracy of 98.84% with a running
time of 0.25 seconds, whereas the ELM reported an accuracy of 99.92% in 0.13 seconds. The
accuracy may deteriorate when the dataset is large. The numerical figures suggest that ELM
can be explored further to obtain better results.
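One level of a discrete wavelet transform can be sketched with the Haar wavelet, chosen here only because it is the simplest case; the wavelet family used in [25] is not stated in this review, so this choice is an assumption.

```python
import math
from typing import List, Sequence, Tuple


def haar_dwt(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One level of the orthonormal Haar DWT.

    Returns (approximation, detail) coefficients; the signal length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale
              for i in range(0, len(signal), 2)]
    return approx, detail
```

The detail coefficients respond to abrupt local changes such as impacts, while the approximation coefficients carry the slowly varying trend; statistics of both can then serve as inputs to the SVM or ELM classifier.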
The study of leakage detection in pumps highlights the need for constructing a model or
algorithm that avoids overfitting the data. The models must always be augmented with proper
convergence criteria. Using decision trees and Bayesian networks will always need parameter
pruning and optimization, which is a time-consuming task.
2.3. Fault Diagnosis in Power Transformers
Power is transmitted from the generating unit to the consumption site through
power transformers. Since transformers are constantly subjected to high-voltage power
supply, they are more susceptible to failures. Fault diagnosis in transformers is therefore vital
for optimised usage of power. This section of the paper reviews the fault diagnosis methods for
power transformers. Classical fault diagnosis in power transformers is done by measuring
the concentration of gases dissolved in the insulating oil using the Dissolved Gas Analysis (DGA)
method. AI has since led to the development of many intelligent methods.
A notable work that could detect multiple faults used a multi-layer SVM model that
classifies the transformer into four states, namely normal state, thermal heating, low-energy
discharge and high-energy discharge [26]. The first SVM (SVM1) isolates the normal
transformer from the other three states. SVM2 separates the faulty transformers into thermal
heating and low-energy discharge states. This model does not get stuck in local optima, since
it uses a complex quadratic hyperplane. This multi-class SVM model is tested on historical data of
a 500 kV main transformer located at the Pingguo Substation of South China Electric Power
Company, and showed an accuracy of 99% with negligible training time (<1 second), which
is far better than competitors such as neural networks and fuzzy logic. The model has yet to be
validated against real-time on-line data. Kernel and parameter optimizations would make the
model more suitable for multi-class fault classification.
The need for parametric optimization led to the development of parametric tuning techniques.
Each technique tunes the parameters in its own way. One attempt was to use Least Squares
SVM (LS-SVM) [27], which optimizes the penalty factor and kernel
function using genetic algorithms for diagnosing transformer faults. This is a non-linear
method well suited to DGA transformers. Earlier works optimize the penalty or cost factor
(C) to improve accuracy. However, over-tuning the penalty factor increases the misclassification
cost, so this work focuses on optimizing both the penalty factor and the penalty degree, which
maintains a trade-off between over-tuning and under-tuning. The features are selected based on
the accuracy of the fitness function of the GA. The choice of learning rate and kernel parameter
has a great impact on the overall performance of the SVM, so tuning them with GA improves the
results. This work compared the accuracy of LS-SVM and DA-LS-SVM. The former showed
an accuracy of 55.32% while the latter had an accuracy of 68.09%. The work concentrated
on tuning the parameters to their optimal values, which contributes positively to the
final performance of the model in diagnosing single faults.
Extracting features from the data confines the diagnosis to rely only on the selected features. A
more robust model that uses Logical Data Analysis and pattern recognition is developed to
classify the states of power transformers [28]. It dynamically increases the fault classes based
on the signal patterns by transforming the signals into equivalent binary attributes through
binarization. The method classifies the health of the machine into various subclasses, each
labelled with a different fault type. Class purity is controlled by a discriminating factor, which
is a measure of the minimum number of patterns required to assign an observation to a particular
class. This method is not restricted to binary classification of faults. The model showed an
average accuracy of 93.4% when tested on the UCI repository data [29]. It is a robust model
with reasonable accuracy and without overfitting, since the model is trained dynamically with new
vibration signal patterns. The binarization may mask the severity of the fault, and the number
of fault classes may not be deterministic.
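The binarization step can be pictured as comparing each numeric measurement against a set of cut points. The sketch below is a simplified illustration; the cut points and the measurement in the usage note are hypothetical, and the way [28] chooses its cut points is not described in this review.

```python
from typing import List, Sequence


def binarize(value: float, cut_points: Sequence[float]) -> List[int]:
    """Encode a numeric value as one binary attribute per cut point.

    Attribute i is 1 when the value is at or above cut_points[i], else 0.
    """
    return [1 if value >= c else 0 for c in cut_points]
```

A measurement of 120.0 with cut points `[50.0, 100.0, 150.0]` becomes `[1, 1, 0]`. The encoding records only which intervals the value falls into and not by how much a limit is exceeded, which is consistent with the remark above about the severity of the fault.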
A wrong initial prediction in a multi-class classifier may degrade the final performance
of the model. This motivated Hao Xu [30] to develop a plurality voting based SVM for
diagnosing multiple faults in power transformers by monitoring the insulating oil. The model
uses the synthetic minority over-sampling technique (SMOTE) to sample the data obtained from
the benchmark DGA dataset, the IEC TC 10 database. SMOTE over-samples the minority class by
generating synthetic samples. The plurality voting SVM model is a combination of many binary
SVM classifiers in which equal weight is given to every classifier. This model contrasts with the
multi-class SVM classifier, where a wrong prediction at the initial classifier degrades the
performance of the entire model. The penalty factor and learning rate are tuned using Genetic
Algorithms (GA). These parameters gain weight depending on the severity of the faults. The
proposed model shows a classification accuracy of 76.2%, which seems low, since other
hybrid methods with SVM give better results.
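The core of SMOTE is an interpolation between a minority-class sample and one of its nearest minority-class neighbours. The sketch below generates a single synthetic sample; the function name, the brute-force neighbour search and the parameter `k` are illustrative assumptions rather than details of [30].

```python
import random
from typing import List, Sequence


def smote_sample(minority: Sequence[Sequence[float]], k: int,
                 rng: random.Random) -> List[float]:
    """Create one synthetic minority-class sample by interpolation."""
    if len(minority) < 2 or k < 1:
        raise ValueError("need at least two minority samples and k >= 1")
    base_idx = rng.randrange(len(minority))
    base = minority[base_idx]
    others = [p for i, p in enumerate(minority) if i != base_idx]
    # Brute-force search for the k nearest minority neighbours of the base.
    others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    neighbour = rng.choice(others[:k])
    gap = rng.random()  # position along the segment from base to neighbour
    return [b + gap * (n - b) for b, n in zip(base, neighbour)]
```

Every synthetic sample lies on the line segment between two real minority samples, so the minority class is enlarged without simply duplicating existing records.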
An improvement in accuracy is brought by integrating SVM with the popular Particle
Swarm Optimization (PSO) for detecting faults in transformers. The model used a Radial Basis
Function (RBF) kernel and stepwise regression [31], which gives good classification
accuracy with shorter running time and also avoids overfitting the model to the test data.
The penalty factor (C) and the slack margin of this SVM model are tuned using Modified
Evolutionary Particle Swarm Optimization-Time Variant Acceleration Coefficient (MEPSO-TVAC),
which expands the search space to obtain the global maximum. The model is tested on the
benchmark S1 Table that has dissolved gas analysis (DGA) data. The accuracy of the model is
found to be 99.50% at a running time of 90.8578 seconds, which is encouraging. The
model is a major leap in using SVM for fault diagnosis.
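The idea of time-varying acceleration coefficients can be sketched with a plain particle swarm optimizer. This is not the MEPSO-TVAC algorithm of [31]: it is a basic PSO in which the cognitive coefficient decreases and the social coefficient increases over the run, and the coefficient ranges (2.5 to 0.5 and 0.5 to 2.5) and inertia schedule are commonly used values assumed here. The sketch minimizes an objective, which would correspond to a validation error rather than an accuracy.

```python
import random
from typing import Callable, List, Sequence, Tuple


def pso_tvac(objective: Callable[[List[float]], float],
             bounds: Sequence[Tuple[float, float]],
             n_particles: int = 20, n_iter: int = 200,
             seed: int = 0) -> Tuple[List[float], float]:
    """Minimize `objective` inside `bounds`; return (best position, best value)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for t in range(n_iter):
        frac = t / max(n_iter - 1, 1)
        w = 0.9 - 0.5 * frac    # inertia weight: 0.9 -> 0.4
        c1 = 2.5 - 2.0 * frac   # cognitive coefficient: 2.5 -> 0.5
        c2 = 0.5 + 2.0 * frac   # social coefficient: 0.5 -> 2.5
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search bounds.
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

In an SVM setting the position vector would hold the parameters being tuned and the objective would be the cross-validated error for those parameters. Early iterations favour each particle's own best position (exploration), and later iterations favour the swarm's best position (convergence).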
The extensive research on DGA-based fault diagnosis of power transformers shows that there
is still a gap for an unambiguous multi-class model. Feature-independent models such as ANN and
extreme learning algorithms can be explored for fault diagnosis of power transformers.
Only a few models focus on diagnosing multiple faults.
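To make the notion of an extreme learning algorithm concrete, the following is a minimal sketch of an Extreme Learning Machine: the hidden layer is drawn at random and left untrained, and only the output weights are obtained by regularised least squares. The helper names `solve` and `train_elm`, the hidden layer size, the weight range and the XOR toy data are assumptions chosen for illustration; none of the reviewed models is reproduced here.

```python
import math
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def train_elm(samples, targets, n_hidden=12, ridge=1e-6, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Hidden weights and biases are drawn once at random and never trained
    weights = [[rng.uniform(-2, 2) for _ in range(n_in)] for _ in range(n_hidden)]
    biases = [rng.uniform(-2, 2) for _ in range(n_hidden)]

    def hidden(x):
        return [math.tanh(sum(w * v for w, v in zip(ws, x)) + b)
                for ws, b in zip(weights, biases)]

    h = [hidden(x) for x in samples]
    # Output weights: (H^T H + ridge * I) beta = H^T t
    gram = [[sum(row[i] * row[j] for row in h) + (ridge if i == j else 0.0)
             for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(row[i] * t for row, t in zip(h, targets)) for i in range(n_hidden)]
    beta = solve(gram, rhs)
    return lambda x: sum(b * v for b, v in zip(beta, hidden(x)))


# Toy problem (XOR) that a linear model cannot separate
xor_inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_targets = [0, 1, 1, 0]
model = train_elm(xor_inputs, xor_targets)
predictions = [round(model(x)) for x in xor_inputs]
```

Training reduces to a single linear solve, which is why such models are usually reported with short running times.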
3. COMPARATIVE ANALYSIS OF FAULT DIAGNOSIS METHODS
Machine Learning algorithms are extensively used in machine fault diagnosis. Each model
and method has its own strengths and limitations. Though processing time is also a
critical factor in evaluating the models, the advent of the Graphics Processing Unit (GPU) has
reduced its impact. This section compares the classification accuracy of the machine learning
methods that have been reviewed in this paper.
3.1. Analysis of Machine Learning Algorithms for Fault Diagnosis
The literature on bearing fault diagnosis shows developments in all phases, namely
feature selection, parameter tuning and modelling. A brief comparative study indicates
the significant progress in fault diagnosis. Table 1 summarizes the
performance of Machine Learning methods in the fault diagnosis of bearings.
Fig. 1 shows the comparative study of the models used in diagnosing bearing faults. Deep
learning networks prove to be a promising solution for fault diagnosis of bearings, as the
models deploying neural networks and extreme learning machines offer better results. The
classification accuracy of all the methods is almost the same, but there is a substantial difference
between the models in terms of other metrics such as computation cost and Remaining Useful Life
(RUL). Though the classification accuracy of the unsupervised Neural Network is relatively
low, it is a good approach for fault diagnosis of unlabelled big data. The
discussed algorithms need not be confined to bearings. There is wide scope in health
monitoring of various equipment such as hydraulic brakes, avionic instruments, gear boxes and
rotating machinery.
Figure 1 Comparison of classification accuracy in bearings
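The comparison can be reproduced numerically from the accuracies listed in Table 1. The short sketch below only re-uses those reported figures; the variable names are illustrative.

```python
# Accuracies (%) as reported in Table 1 of this review
bearing_accuracy = {
    "M-RVM": 97.98, "SVM-BBDE": 99.0, "NN-EMD": 93.0, "LMD-SVD": 99.0,
    "H-DLN": 99.03, "ADCNN": 97.9, "U-NN": 92.2, "DNN-TC": 94.4, "TICNN": 95.5,
}

# Rank the methods from highest to lowest reported accuracy
ranked = sorted(bearing_accuracy.items(), key=lambda kv: kv[1], reverse=True)
best, worst = ranked[0], ranked[-1]
spread = round(best[1] - worst[1], 2)  # gap between best and worst, in points
mean_accuracy = round(sum(bearing_accuracy.values()) / len(bearing_accuracy), 2)

print(best, worst, spread, mean_accuracy)
```

The reported values span roughly seven percentage points, and since the studies use different datasets and protocols the ranking is indicative only.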
The comparative study of leakage detection in pumps shows many novel developments in
modelling. Feature selection and parameter optimization are done by new
algorithms. Table 2 gives an overview of the fault diagnosis methods used in
leakage detection in pumps.
Figure 2 Comparison of classification accuracy in pumps
The comparative study of pump leakage diagnosis models is shown in Fig. 2. It is evident
that decision trees and neural networks dominate the domain of leakage detection in pumps.
Though all of the models discussed here demonstrate good accuracy, they differ in the
way their parameters are tuned. The Bayesian network with multi-source fusion shows good
performance in detecting multiple faults at low computational cost. It is also evident that
the models have to be validated on other metrics such as F-score and computation time. The
models can also be deployed in fault diagnosis of aircraft fuel systems, pressure valves, natural
gas pipelines, turbines etc.
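The need for metrics beyond accuracy can be shown with a small worked example. The sketch below computes precision, recall and F-score from confusion counts; the counts describe a hypothetical imbalanced leak dataset and are not taken from any of the reviewed studies.

```python
def precision_recall_f1(tp, fp, fn):
    """Return precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical case: 1000 pump records, 10 real leaks, the classifier finds only 1
tp, fp, fn, tn = 1, 0, 9, 990
accuracy = (tp + tn) / (tp + fp + fn + tn)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)

print(f"accuracy={accuracy:.3f} precision={precision:.2f} recall={recall:.2f} f1={f1:.3f}")
```

Here the accuracy is above 99% although nine of the ten leaks are missed, which the low recall and F-score expose.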
Comparison of the literature on transformer fault diagnosis shows substantial
advances in parameter optimization, because minor tuning of a parameter
resulted in a good improvement in performance. Table 3 summarizes the fault diagnosis
methods in power transformers.
Figure 3 Comparison of classification accuracy in power transformers
The comparative study of the fault diagnosis models for DGA-based power transformer diagnosis,
shown in Fig. 3, indicates that SVM is an effective method with better accuracy. LDA with
pattern recognition is a novel approach that allows the number of fault classes to be expanded,
thus facilitating detection of new faults that were not seen in training. Kernel tricks and
the choice of hyperplanes also play a vital role in improving the performance of the SVM models. The
models can be extended to other applications such as wiring faults and high
voltage circuit breakers.
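The kernel trick mentioned above replaces inner products with a kernel function, so that a linear separator in an implicit feature space becomes a non-linear one in the input space. A minimal sketch of the RBF kernel, K(x, y) = exp(-gamma * ||x - y||^2), is given below; the value of `gamma` and the sample points are assumptions.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF (Gaussian) kernel: similarity decays with squared Euclidean distance."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def gram_matrix(samples, gamma=0.5):
    """Pairwise kernel values, the only quantity an SVM solver needs from the data."""
    return [[rbf_kernel(a, b, gamma) for b in samples] for a in samples]


# Illustrative points: two close together, one far away
samples = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
K = gram_matrix(samples)
```

A larger `gamma` makes the kernel more local, which is why the kernel parameter is tuned together with the penalty factor in the reviewed SVM models.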
4. OPEN RESEARCH ISSUES
The detailed study and analysis of fault diagnosis in electrical and mechanical components
highlights wide research gaps in building fault diagnosis models. The major areas that have good
scope for research include:
i) Automatic Feature extraction
It is well known that proper selection of features accounts for the lion's share of the performance of
any algorithm or model. Almost all the models spend a considerable amount of time and
computation on feature selection. Extreme Learning Machines can be exploited for
automatic feature selection.
ii) Parameter tuning mechanisms
Parameters are the backbone of any machine learning algorithm. The right choice of parameter
values boosts the performance of the model. The learning rate and penalty factor are the two
parameters that are usually tuned to obtain good results, but there is room for considering other
important parameters that govern the functioning of the model.
iii) Deploying Nature Inspired Algorithms in building models
From this review it is apparent that the majority of the fault diagnosis models use existing
algorithms or are framed as hybrids of existing algorithms. There is tremendous scope for the
development of new nature-inspired algorithms. Many aspects of nature remain to
be explored and may yield new solutions or algorithms.
iv) Dynamic models for unsupervised data
Almost all the models reviewed in this article are supervised algorithms. Unsupervised
algorithms can be used to handle the mechanical big data streaming from sensors.
Also, training a model for real-time online applications is very tedious.
v) Machine Health Prognostic
Though diagnosis is useful for timely isolation of defective machinery parts,
it does not estimate the remaining lifespan of the part. Fault prognostics
would be a better approach, as it assesses the lifetime of the equipment and predicts fault
occurrence.
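As an illustration of item v), a very simple prognostic estimate fits a trend to a health index and extrapolates it to a failure threshold to obtain the Remaining Useful Life (RUL). The linear degradation model, the function `estimate_rul`, the threshold and the wear readings below are all assumptions made for this sketch; real degradation is often non-linear.

```python
def estimate_rul(times, health_index, failure_threshold):
    """Fit a straight line to the health index and extrapolate to the threshold."""
    n = len(times)
    mean_t = sum(times) / n
    mean_h = sum(health_index) / n
    # Ordinary least-squares slope and intercept
    slope = (
        sum((t - mean_t) * (h - mean_h) for t, h in zip(times, health_index))
        / sum((t - mean_t) ** 2 for t in times)
    )
    intercept = mean_h - slope * mean_t
    if slope <= 0:
        return None  # no degradation trend, RUL cannot be extrapolated
    time_of_failure = (failure_threshold - intercept) / slope
    return max(0.0, time_of_failure - times[-1])


# Hypothetical wear index sampled every 100 operating hours
hours = [0, 100, 200, 300, 400]
wear = [0.10, 0.20, 0.30, 0.40, 0.50]
rul_hours = estimate_rul(hours, wear, failure_threshold=1.0)
```

With these readings the fitted line reaches the threshold at 900 hours, so about 500 hours remain after the last measurement.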
The above-mentioned issues are not exhaustive. Data uncertainty is a notable issue when
handling real-time data. New generic models can be developed that are less feature dependent and
capable of handling heterogeneous
Table 1 Bearing Fault Diagnosis Methods
S.No | Methods | Salient Features | Optimizations | Metric
1 | Multi-class Relevance Vector Machine (M-RVM) | Extraction of features from mean peak ratio; SMO is used for setting the hyperplane; Independent Component Analysis; Possibility of missing prominent features | Component analysis; tuning of penalty factor (C) by cross validation | Accuracy: 97.98%; Testing error: 2.02%
2 | SVM with BBDE (SVM-BBDE) | Nature inspired algorithm; Processes raw time series data; No feature selection and pre-processing | Intercluster distance in the feature space (ICDF); usage of BBDE to reduce penalty factor (C) and kernel parameter () | Accuracy: 99%; Running time: 19.42 s
3 | Neural Network with EMD (NN-EMD) | Time domain and time-frequency domain feature selection; Formulation of health index; Estimates degree of wear; Prognostic model | Feature extraction by energy entropy | Accuracy: 93%; RUL
4 | LMD-SVD Extreme Learning Machine (LMD-SVD) | Self adaptive model; No dependency on activation function; Product function as features; Check on feature coincidence | Automatic feature selection by ELM | Accuracy: 99%; Running time: 0.77 s
5 | Hierarchical deep learning network (H-DLN) | Identification of weak links; Overhead of parameter tuning; Extractors act as layer interfaces; Scalable model | Severity ranking to identify weak links | Accuracy: 99.03%
6 | Adaptive Deep Convolution Neural Network (ADCNN) | Recognition of fault pattern and sizes; Tuning of learning rate, batch size and number of kernels; ConvNet: fusion of convolution layer and max pooling layer | Tuning of learning rate, batch size and number of kernels in each layer | Accuracy: 97.9%
7 | Unsupervised Neural Network (U-NN) | Suited for mechanical big data; Simplifies the feature selection; Sparse filtering to extract local signals | Whitening to improve convergence speed | Accuracy: 92.2%; Standard deviation: 0.19%
8 | Deep Neural Networks with Temporal Coherence (DNN-TC) | Uses local time invariant data to learn; Temporal data is stored in memory; Fault recognition error is calculated using entropy; No expertise is required for signal processing and feature extraction | Low signal processing and feature extraction time; temporal coherence is considered during fault recognition | Accuracy: 94.4%
9 | Training Interference Convolution Neural Networks (TICNN) | Reduces internal covariance in data; Minimizes the impact of choice of parameters; Suitable for noisy data; Reduces overfitting by dropouts | Batch normalisation, dropouts, ensemble learning | Accuracy: 95.5%
Table 2 Leakage Detection Methods
S.No | Methods | Salient Features | Optimization | Metrics
1 | Self organized Competitive NN (SOC NN) | Competitive neural layer; Faster neural network | Self organised feature selection | Accuracy: 93.84%
2 | Top Down Inductive Decision Tree (TDI-DT) | Feature selection and classification are done by the same algorithm; Feature selection is based on entropy and information gain | Error based post pruning | Accuracy: 99.7%; Confusion matrix
3 | J48-SWT | Feature extraction by SWT and J48; Classification by induction tree; Chances of overfitting | NIL | Accuracy: 93.84%
4 | Multisource fusion using Bayesian network (Bayes) | Multiple fault detection points; Fusion of data from multiple sources; Can detect single and simultaneously occurring faults | Noisy-MAX function to limit the count of conditional probabilities | Accuracy: 99.69%
5 | OC-SVM with Normality model (OC-SVM) | Normality model; Threshold from non-replaceable pool of features; Novelty detection by log likelihood computation | SFA to select best features | Accuracy: 90%
6 | SVM-DWT | Discrete Wavelet Transform to extract features; Application of ELM; Reduced computation time | NIL | Accuracy: 99.92%; Time complexity: 0.25 s
Table 3 Transformer Fault Diagnosis Methods
S.No | Methods | Salient Features | Optimization | Metrics
1 | Multi layer SVM | Quadratic hyperplane; Does not get stuck in local optima; Multi fault detection; Can detect only trained faults | NIL | Accuracy: 99%; Average running time: <1 s
2 | LS-SVM | Non-linear method; Improved global searching; Fitness based feature selection | Optimised tuning of penalty factor and kernel parameter by GA | Accuracy: 68.09%
3 | Logical Data Analysis and pattern recognition | Binarization of attributes; Multi fault detection; Dynamic growth of fault classes; Ranking of features; More robust | Cut points distinguish features in the patterns; discriminating factor to create distinct classes | Accuracy: 93.4%; Average running time: 10 s
4 | SVM-SMOTE | Synthetic data; Plurality voting SVM with equal weightage for every classifier; Detection of fault severity | Tuning of penalty factor and learning rate | Accuracy: 76.2%
5 | SVM-MEPSO-TVAC | RBF kernel; Stepwise regression; Good accuracy; Shorter running time; Reduces overfitting | Tuning of penalty factor and kernel parameter | Accuracy: 99.5%; Average running time: 74.36 s
5. CONCLUSION
The field of condition monitoring and fault diagnosis is a wide area of research that is now
progressing rapidly through the deployment of intelligent methods. New inventions and advanced machine
designs demand zero hardware defects. Many models and frameworks have been built to detect
faults and have been validated against benchmark datasets. AI has emerged as a key
means of diagnosing faults. This work presents a comprehensive review of fault diagnosis in
bearings, pumps and power transformers. The fault diagnosis accuracy of the models is
compared and the scope for improvement is suggested.
Deep Learning and Extreme Learning Machines are finding applications in various
domains and are attracting growing attention from researchers. Their automatic feature selection
and more accurate results make them a natural fit for machine health monitoring and fault
diagnosis.
REFERENCES
[1] Report on Advanced Surveillance, Diagnostic and Prognostic Techniques in Monitoring
Structures, Systems and Components in Nuclear Power Plants, No: NP-T-3.14.
[2] E. Zio, F. Di Maio, M. Stasi, A data-driven approach for predicting failure scenarios in nuclear
systems, Annals of Nuclear Energy, Vol. 37, 2015, pp. 482-491.
[3] E. Zio, Diagnostics and Prognostics of Engineering Systems: Methods and Techniques, Chapter
17, Engineering Science Reference, USA.
[4] J.P. Ma and J. Jiang, Applications of fault detection and diagnosis methods in nuclear power
plants: A review, Progress in Nuclear Energy, Vol. 53, 2011, pp.255-266.
[5] Fan Li, May, Dynamic Modeling, Sensor Placement Design, and Fault Diagnosis of Nuclear
Desalination Systems, The University of Tennessee, Knoxville, 2001.
[6] Mark Schwabacher, A Survey of Data-Driven Prognostics, Infotech Aerospace Conferences,
2015.
[7] Enrico Zio, Francesco Di Maio, Marco Stasi, A data-driven approach for predicting failure
scenarios in nuclear systems, Annals of Nuclear Energy, Elsevier Masson, Vol.37, 2011,pp.482-
491.
[8] Xiaojie Guo, Liang Chen, Changqing Shen, Hierarchical adaptive deep convolution neural
network and its application to bearing fault diagnosis, Journal of Measurements, Vol. 93, 2016,
pp. 490-502.
[9] Zhen Peng, Lifeng Wu, Beibei Yao and Yong Guan, Fault Diagnosis from Raw Sensor Data
Using Deep Neural Networks Considering Temporal Coherence, Sensors, 2017.
[10] Ben Ali, Nader Fnaiech, Lotfi Saidi, Brigitte Chebel-Morello, Farhat Fnaiech, Application of
empirical mode decomposition and artificial neural network for automatic bearing fault
diagnosis based on vibration signals, Journal of Applied Acoustics, Vol. 8, 2015, pp. 15-27.
[11] Xiaoyuan Zhang, Daoyin Qiu, Fuan Chen, Support vector machine with parameter
optimization by a novel hybrid method and its application to fault diagnosis, Journal of
Neurocomputing, Vol. 149, 2014, pp. 641-651.
[12] Meng Gan, Cong Wang, Chang'an Zhu, Construction of hierarchical diagnosis network based
on deep learning and its application in the fault pattern recognition of rolling element bearings,
Journal of Mechanical Systems and Signal Processing, Vol. 72, 2016, pp. 94-102.
[13] Ye Tian, Jian Ma, Chen Lu, Zili Wang, Rolling bearing fault diagnosis under variable conditions
using LMD-SVD and extreme learning machine, Journal of Mechanics and Machine Theory,
Vol. 90, 2015, pp. 175-186.
[14] Yaguo Lei, Jing Lin, Saibo Xing, Steven X. Ding, An Intelligent Fault Diagnosis
Method Using Unsupervised Feature Learning Towards Mechanical Big Data, IEEE
Transactions on Industrial Electronics, Vol. 63, 2016, pp. 3137–3147.
[15] Marco Fagiani, Stefano Squartini, Leonardo Gabrielli, Marco Severini and Francesco Piazza,
A Statistical Framework for Automatic Leakage Detection in Smart Water and Gas Grids,
Journal of Energies, Vol. 9, 2016.
[16] Wei Zhang, Chuanhao Li, Gaoliang Peng, Yuanhang Chen, Zhujun Zhang, A deep
convolutional neural network with new training methods for bearing fault diagnosis under noisy
environment and different working load, Journal of Mechanical Systems and Signal Processing,
Vol. 100, 2017, pp. 439-453.
[17] S. Ioffe, C. Szegedy, Batch normalization: Accelerating deep network training by reducing internal
covariate shift, 2015.
[18] N. Srivastava, G.E. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: a simple
way to prevent neural networks from overfitting, Journal of Machine Learning Research, Vol. 15, 2014, pp.
1929–1958.
[19] V. Muralidharan, V. Sugumaran, Gaurav Pandey, Fault Diagnosis of Monoblock Centrifugal
Pump using Stationary Wavelet Features and J48 Algorithm, International Journal of Production
Technology and Management, Vol. 1, 2011, pp. 0976–6383.
[20] Kohonen T, Self-organized formation of topologically correct feature maps, Biological
Cybernetics, Vol. 43, 1982, pp. 59–69.
[21] Baoping Cai, Yonghong Liu, Qian Fan, Yunwei Zhang, Zengkai Liu, Shilin Yu, Renjie Ji,
Multi-source information fusion based fault diagnosis of ground-source heat pump using Bayesian
network, Journal of Applied Energy, Vol. 114, 2014, pp. 1-9.
[22] V. Muralidharan, V. Sugumaran, Gaurav Pandey, Fault Diagnosis of Monoblock Centrifugal
Pump using Stationary Wavelet Features and J48 Algorithm, International Journal of Production
Technology and Management, Vol. 1, 2011, pp. 0976–6383.
[23] Baoping Cai, Yonghong Liu, Qian Fan, Yunwei Zhang, Zengkai Liu, Shilin Yu, Renjie Ji,
Multi-source information fusion based fault diagnosis of ground-source heat pump using Bayesian
network, Journal of Applied Energy, Vol. 114, 2014, pp. 1-9.
[24] Feng Jia, Yaguo Lei, Jing Lin, Na Lu, Deep neural networks: A promising tool for fault
characteristic mining and intelligent diagnosis of rotating machinery with massive data, Journal
of Mechanical Systems and Signal Processing, 2015.
[25] V. Muralidharan and V. Sugumaran, A Comparative Study between Support Vector Machine
(SVM) and Extreme Learning Machine (ELM) for Fault Detection in Pumps, Indian Journal of
Science and Technology, Vol.9, 2016, pp. 0974-6846
[26] Mohamad-Ali Mortada, Soumaya Yacout, Aouni Lakis, Fault diagnosis in power transformers
using multi-class logical analysis of data, Journal of Intelligent Manufacturing, Vol. 25, 2013, pp.
1429–1439.
[27] Aparna R. Gupta, V. R. Ingle, M. A. Gaikwad, LS-SVM Parameter Optimization Using
Genetic Algorithm To Improve Fault Classification Of Power Transformer, International
Journal Of Engineering Research and Application, Vol. 2, 2012, pp. 1806-1809.
[28] Mohamad-Ali Mortada, Soumaya Yacout, Aouni Lakis, Fault diagnosis in power transformers
using multi-class logical analysis of data, Journal of Intelligent Manufacturing, Vol. 25, 2013,
pp. 1429–1439.
[29] Frank, A., & Asuncion, A., UCI machine learning repository, 2010, http://archive.ics.uci.edu/ml.
[30] Hazlee Azil Illias, Wee Zhao Liang, Identification of transformer fault based on dissolved gas
analysis using hybrid support vector machine-modified evolutionary particle swarm
optimisation, PLOS ONE, Vol. 13, 2018, pp. 1-15.
[31] Ghunem RA, Assaleh K, El-hag AH, Artificial neural networks with stepwise regression for
predicting transformer oil furan content, IEEE Transactions on Dielectrics and Electrical
Insulation, Vol.19, 2012, pp. 414-420.
[32] S.Sharanya, Revathi Venkataraman, An intelligent Context Based Multi‑layered Bayesian
Inferential predictive analytic framework for classifying machine states, Journal of Ambient
Intelligence and Humanized Computing, https://doi.org/10.1007/s12652-020-02411-2, 2020.
[33] S. Sharanya, S. Karthikeyan, Classifying malicious nodes in vanets using Support Vector
Machines with modified fading memory, ARPN Journal of Engineering and Applied Sciences,
vol. 12, No. 1, 2017, pp. 171-176