
An intelligent online machine fault diagnosis system

by A. C. M. Fong and S. C. Hui

COMPUTING & CONTROL ENGINEERING JOURNAL OCTOBER 2001

Traditional help desk service relies heavily on the expertise of service personnel. This article describes an intelligent data mining technique that combines neural network and rule-based reasoning with case-based reasoning to mine information from the customer service database for online machine fault diagnosis. This technique has been implemented into a help-desk system that supports online machine fault diagnosis over the Internet.

Data mining has been developed for people to analyse, understand or even visualise huge amounts of stored data for business and scientific applications. Data mining is the process of discovering interesting patterns, associations, changes, anomalies and significant structures from large amounts of data stored in databases, data warehouses or other information repositories. It can be used to help companies make better decisions to stay competitive in the marketplace. Over the past few years, a number of data mining applications and prototypes have been developed for a variety of domains including marketing, banking, finance, manufacturing and healthcare. In general, the data mining technique and function to be applied depend very much on the application domain and the nature of the data available. In this article, we describe a novel data mining technique implemented to mine the information stored in the customer service database of a multinational company for machine fault diagnosis in a help-desk application.

Customer service support is becoming an integral part of many multinational manufacturing companies. These companies generally have a customer service department that provides installation, inspection and maintenance support for their customers. A help-desk service centre is usually established to answer frequently encountered problems from the customers. Service engineers from the help-desk centre respond to customers' enquiries via telephone calls and carry out on-site repair if necessary. At the end of each service, a customer service report is generated to record the problem and the remedies taken to rectify the problem. These service reports are then stored as service records in a customer service database.

In a traditional help-desk service centre, the identification of machine faults relies heavily on the expertise of the service support engineers. It is often a burden on the company to train and retain a pool of expert service engineers. Since the customer service database serves as a repository of invaluable information that can be used for machine fault diagnosis, the customer service database can be mined to support customer service activities.

Customer service database

It is necessary to reveal the structure of a customer service database before mining it for online help-desk support. Service records are defined and stored in the customer service database to keep track of all reported problems and remedial actions. Each service record consists of customer account information and service details: fault condition and checkpoint information. Fault condition contains the service engineer's description of a machine fault. Checkpoint information indicates actions taken to rectify the fault. It contains checkpoint group name, and checkpoint description with priority. Fig. 1 shows an example of a fault condition and its checkpoint information for a service record.

checkpoint group: AVF-CHK007

| priority | checkpoint description                                          | help file        |
| 1        | confirm whether the carry guide pins are in line with PCB      | AVF-CHK007-1.GIF |
| 2        | confirm whether the PCB is in correct direction                | AVF-CHK007-2.GIF |
| 3        | confirm the position of the guide lower limit sensor (1100165) | AVF-CHK007-3.GIF |
| 4        | confirm the timing for PCB 2 detect sensor                     | AVF-CHK007-4.GIF |

Fig. 1 Fault-condition and checkpoint information of a service record

Data are stored as unstructured text in the machine-fault and checkpoint tables. There are over 70 000 service records in the customer service database with over 50 000 checkpoints. In addition, structured data on over 4000 employees, 500 customers, 300 different machine models and 10 000 sales transactions are also stored. The new technique has been developed specifically for mining the unstructured fault-conditions and checkpoints data for machine fault diagnosis.

Survey of fault diagnosis techniques

Case-based reasoning (CBR) has been successfully applied to fault diagnosis for customer service support. CBR systems rely on building a large repository of past service records in order to circumvent the difficult task of extracting and encoding expert domain knowledge. It is one of the most appropriate techniques for machine fault diagnosis as it learns with experience in solving problems and hence emulates human-like intelligence. However, the performance of CBR systems depends on the adequacy as well as the organisation of cases and the algorithms used for retrieval from a large case database. Most CBR systems use the nearest neighbour algorithm for retrieval from the flat-indexed case database, which is inefficient especially for a large case database. Other CBR systems use hierarchical indexing such as decision trees. However, building a hierarchical index needs the knowledge of an expert during the case-authoring phase.

The neural network (NN) approach [4] provides an efficient learning capability from detailed examples. The search space in a neural network is greatly reduced because of the generalisation of knowledge through training. In contrast, CBR systems need to store all the cases in the case database in order to perform accurate retrieval. This greatly increases the search space. The CBR systems that store only relevant cases for an efficient retrieval lack the accuracy as well as the learning feature of the NNs. Hence, neural networks are well suited to case indexing and retrieval.

Other data mining techniques, like the rule-based reasoning (RBR) approach, fuzzy logic, genetic algorithms, decision trees, inductive learning systems and statistical pattern classification systems [5], have been investigated. In addition, hybrid approaches such as hybrid case-based reasoning and neural networks [6] have also been proposed. In this article, a data mining technique that integrates case-based reasoning, neural network and rule-based reasoning is proposed and developed to mine the unstructured textual data of fault-conditions and checkpoints of service records from the customer service database for machine fault diagnosis.

Intelligent data mining for machine fault diagnosis

Fig. 2 shows a framework for the intelligent data mining process comprising the offline knowledge extraction process and the online fault diagnosis process. The knowledge extraction process extracts knowledge from the customer service database to form a knowledge base that contains the neural network models and a rule-base. The neural network models and the rule-base work within the CBR cycle to support the online fault diagnosis process. The fault diagnosis process uses the four stages of the CBR cycle: retrieve, reuse, revise and retain, to diagnose customer-reported problems. It accepts the user's problem description as input, maps the description into the closest fault conditions of the faults reported before from the knowledge base, and retrieves the corresponding checkpoint solutions for the user to try to resolve the problem. The user's feedback on the fault diagnosis process is used to revise the problem and its solution. The revised information is retained as knowledge for enhancing its performance in future diagnosis.

Knowledge extraction process

Fig. 3 shows the knowledge extraction process for extracting expert knowledge from the unstructured textual data of the fault conditions and checkpoints in the customer service database. There are two major steps in the knowledge extraction process: neural network model generation and rule-base generation.

Fig. 3 Knowledge extraction process (stages shown: neural network model generation, rule-base generation)

In the first step, it extracts knowledge from the fault conditions to train the neural network to build neural network models for classification and clustering. Fault conditions in the customer service database are first pre-processed to extract keywords. The pre-processing is implemented using a word list, a stop list and algorithms from Wordnet [7]. The extracted keywords are used to form weight vectors to initialise the neural network. Then, the neural network is trained to generate the neural network model. Two types of neural networks are investigated. They are the supervised learning vector quantisation (LVQ3) neural network and the unsupervised Kohonen self-organising map (KSOM) neural network. LVQ3 and KSOM are used as classification and clustering techniques, respectively, for intelligent fault diagnosis. The classification technique is used to classify an instance of a new fault description into one of the known classes of faults and then uses the suggested solution of the known fault for the current problem. The clustering technique is used to extract information from the customer service database to form clusters of similar faults and then classify a new problem instance into one of the clusters. The classification into a specific fault condition can be determined based on the closest match of the fault condition with the input pattern within the cluster. The clustering technique generally has better efficiency but lower accuracy, as compared to the classification approach.

The second step involves the extraction of knowledge from the checkpoint solutions of the fault conditions to generate a rule-base to guide the reuse of checkpoint solutions effectively. The rule-base consists of control rules and checkpoint rules. Control rules are coded manually to specify the diagnostic procedure for the firing of checkpoint rules so that the checkpoints can be exercised in sequence according to their priorities. Using these two types of rules, the rule-based inference engine under the C Language Integrated Production System (CLIPS) environment [8] can provide step-by-step guidance to the user in diagnosing a fault condition.

Fig. 4 Reuse of checkpoint solutions

Fault diagnosis process

The fault diagnosis process consists of the following phases: pre-processing of user input, neural network retrieval, reuse of service records, and revise and retain with user feedback.
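As a rough illustration of the pre-processing and closest-match ideas described above, the following Python sketch extracts keywords from a fault description with a stop list and a keyword list, encodes them as a binary vector, and picks the stored fault condition whose vector is nearest in Euclidean distance. It is a minimal sketch under stated assumptions: the word lists, the fault labels F1 and F2, the binary encoding and the crude plural stripping (standing in for the Wordnet algorithms) are all hypothetical, and no LVQ3 or KSOM training is performed.

```python
import math
import re

# Hypothetical stop list and keyword list (assumptions for illustration);
# the article's real keyword list has 2173 entries and is not reproduced here.
STOP_WORDS = {"the", "is", "a", "an", "of", "in", "to", "and", "not", "with"}
KEYWORDS = ["pcb", "guide", "pin", "sensor", "jam", "insertion", "timing"]


def extract_keywords(text):
    """Lower-case and tokenise the text, drop stop words, keep listed keywords."""
    found = []
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # Crude plural stripping, a stand-in for Wordnet morphology (assumption).
        if token.endswith("s") and token[:-1] in KEYWORDS:
            token = token[:-1]
        if token in STOP_WORDS:
            continue
        if token in KEYWORDS and token not in found:
            found.append(token)
    return found


def to_vector(keywords):
    """Binary vector with one component per entry of the keyword list."""
    return [1.0 if k in keywords else 0.0 for k in KEYWORDS]


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def closest_fault(description, prototypes):
    """Return (label, distance) of the stored vector nearest to the input."""
    x = to_vector(extract_keywords(description))
    return min(
        ((label, euclidean(x, w)) for label, w in prototypes.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical stored fault conditions, keyed by made-up labels.
PROTOTYPES = {
    "F1": to_vector(["pcb", "jam", "insertion"]),
    "F2": to_vector(["guide", "pin", "sensor", "timing"]),
}

if __name__ == "__main__":
    print(closest_fault("The PCBs jam during insertion", PROTOTYPES))
```

In this sketch the stored vectors play the role that trained weight vectors would play in the neural network; a trained model would hold far fewer vectors than there are service records, which is where the reduced search space comes from.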

In the first phase, the user's textual inputs are pre-processed, again using Wordnet. In fact, the implementation is very similar to the pre-processing of fault conditions during the knowledge extraction process.

In the neural network retrieval phase, similar fault conditions experienced in the past are recalled and ranked according to the closeness of the retrieved fault condition to the user input fault description. The neural network performs retrieval by computing the winner through a competitive learning process. The winner is the one that corresponds to the weight vector with minimum distance from the input vector. For the supervised LVQ3 neural network, the winner node corresponds to a fault condition with known checkpoint solution. In the case of an unsupervised KSOM neural network, the winner node represents a cluster. The retrieval of a specific fault condition is based on the nearest Euclidean distance of all the fault conditions in the retrieved cluster.

In the third phase, checkpoint solutions of the fault conditions retrieved during the retrieval process are reused. The checkpoints are presented in the order according to the checkpoint rules fired as shown in Fig. 4. The rules operate in a competitive manner to display the checkpoints in the order of their priority in solving the fault condition.

In the last phase, the neural network indexing database, the checkpoint rules and the service records in the customer service database are updated based on user feedback on the effectiveness of the fault diagnosis process. The input problem description and its past checkpoint solutions are revised through user feedback and updated into the relevant databases. As shown in Fig. 5, the user provides feedback on whether the problem is resolved or not. If the problem is resolved, then the neural network indexing database and the checkpoint rule-base are updated. If the problem persists after trying all the checkpoints for all the retrieved fault conditions, the user can seek help from a service engineer by filing a service request form through the Web. The engineer will subsequently update the customer service database through a maintenance programme.

Fig. 5 Revise and retain with user feedback

Online help-desk application

Fig. 6 shows the Web-based help-desk system that uses the intelligent data mining technique for online fault diagnosis. The customers can access the application using any of the commonly used Web browsers, such as Netscape Navigator or Microsoft Internet Explorer. Fig. 7 shows the Web-based interface for accepting user input of a fault description. First, the user can enter an error code or the name of an error if available. A list of error codes and their corresponding fault conditions is maintained for efficient retrieval. If the error code is known, then no other information is required from the user for further processing. The corresponding fault condition can be identified and its checkpoints can be retrieved. Otherwise, the fault description can be entered in natural language or as a set of keywords. The user can also provide the names of machine components and their states as input as shown in Fig. 7. If the user input contains keywords that are not in the keyword list, synonyms of these keywords will be retrieved for user confirmation as input keywords. This information is then combined to form an input vector during the retrieval process.

Fig. 6 Web-based help-desk system for machine fault diagnosis (labelled components: web browsers, online help-desk application, databases, service engineer)

Fig. 8 shows the fault-conditions retrieved by the LVQ3 neural network when the user enters a fault description. The fault conditions displayed at the top of the screen correspond to the ones closest to the fault description provided by the user. They are ranked according to their matching scores.

If the problems of the customer cannot be resolved through the Web-based help-desk system, a help request form can be filled in by the customers to document their problems. The completed form will be processed and a service engineer will be assigned to handle the problem accordingly. Service engineers can in turn interact with the maintenance programme to update service records. As shown in Fig. 9, the maintenance programme allows service engineers to modify service records in the customer service database. Once all modifications are completed, the neural network indexing database and checkpoint rules are modified using the 'update neural network' and 'update rule-base' buttons.

Performance analysis

Retrieval performance in terms of efficiency and accuracy of the LVQ3 neural network for classification and the KSOM neural network for clustering is compared. These two neural networks are also compared with the k Nearest Neighbour (kNN) technique used in the traditional CBR systems for retrieval. A traditional CBR system using the kNN technique needs to store all the cases in the case base in order to perform accurate retrieval. The use of neural networks with CBR greatly reduces the search space due to generalisation of knowledge through training.

All experiments were carried out on a 333 MHz Pentium II system with 128 MB RAM running under the Windows NT operating system. The number of fault conditions (the unique set) in the customer service database was 9392 and the total number of fault conditions (the training set) was 70 137. There were 2173 entries in the keyword list. The number of words to be searched in the Wordnet dictionary was 121 962 and a maximum of 20 keywords was allowed in a fault condition or user input. In each experiment, the time required for pre-processing of fault conditions was found to be 10 minutes 38 seconds. Table 1 summarises the performance comparison between the neural networks and kNNs.

Table 1 Performance comparison between neural networks and kNNs

| retrieval technique | training time | average online retrieval time | retrieval accuracy |
| [...]               | [...]         | 16.7                          | 77.6               |
| LVQ3                | 96m 44s       | 1.9                           | 93.2               |
| KSOM                | 264m 35s      | 0.8                           | 90.3               |

Although the training time for each neural network is quite high, it is still acceptable as the training is carried out only once offline. In addition, the average online retrieval time for each neural network is quite efficient. The KSOM neural network requires a longer training time but it performs more efficiently online compared to the LVQ3 neural network.

Retrieval accuracy generally depends on the accuracy of pre-processing, the frequency of new keywords being added, the number of incorrect winners computed by the neural network and the degree of accuracy of the user input. In supervised neural networks such as LVQ3, the retrieval accuracy can be determined by measuring the number of times the correct fault conditions are generated by the neural network and the number of iterations required. In the unsupervised KSOM neural network, the retrieval accuracy can be determined based on the closest matched fault condition of the retrieved cluster. If the user input consists of many new keywords that are not part of the keyword list, the accuracy will be affected. However, a neural network learns to improve its accuracy in time.

The learning rate is another important factor in determining the number of iterations required for convergence. In particular, the convergence was found to be fastest with a learning rate of 0.4 in the LVQ3 neural network and 0.5 in the KSOM neural network. At higher learning rates, the system became unstable (i.e. no convergence is achieved), whereas for lower learning rates, the number of iterations needed to converge was quite high.

The retrieval performance of the LVQ3 neural network for classification and KSOM neural network for clustering was also compared with the kNN technique used in the traditional CBR systems. Two popular variations of kNN techniques were chosen for comparison. The first variation, denoted as kNN1, stores cases in a flat memory structure, extracts keywords from the textual descriptions and uses normalised Euclidean distance for matching.
It always assigns equal weights to the individual attributes (i.e. keywords). Therefore, retrieval is less accurate. The second variation, known as kNN2, uses the fuzzy-trigram technique for matching. It assigns a positive score for every sequence of three letters matched. Although this technique may be useful to check spelling errors and grammatical variations, retrieval is quite inaccurate when compared with the neural network techniques. Moreover, the major drawback in both of these kNN techniques is that new cases retained are indexed separately into the flat memory structure and thus the search space keeps on increasing, further reducing the efficiency. In summary, both LVQ3 and KSOM perform better than either of the kNN methods for retrieval in both speed and accuracy because of their ability to generalise information through training.
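The trigram matching attributed to kNN2 above can be pictured with a short Python sketch. This is one plausible reading of "a positive score for every sequence of three letters matched", not the authors' implementation: the scoring rule (one point per distinct shared trigram), the case identifiers C1 and C2 and the case texts are all assumptions made for illustration.

```python
def trigrams(text):
    """Set of three-letter sequences taken from each word, ignoring case."""
    grams = set()
    for word in text.lower().split():
        for i in range(len(word) - 2):
            grams.add(word[i:i + 3])
    return grams


def trigram_score(query, case_text):
    """One point for every distinct trigram of the query found in the case."""
    return len(trigrams(query) & trigrams(case_text))


def rank_cases(query, cases):
    """Scan every stored case (flat memory) and sort by descending score."""
    scored = [(trigram_score(query, text), case_id) for case_id, text in cases.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(case_id, score) for score, case_id in scored]


# Hypothetical stored cases, keyed by made-up identifiers.
CASES = {
    "C1": "pcb jam at insertion",
    "C2": "guide sensor timing error",
}

if __name__ == "__main__":
    # A misspelled query still overlaps with the intended case.
    print(rank_cases("sensr timng eror", CASES))
```

The sketch shows both properties noted in the text: misspelled words still share trigrams with the correct case, and every query scans all stored cases, so the cost grows as new cases are retained.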

Conclusion

This article has described a data mining technique that has been implemented to mine data stored in a customer service database to perform online machine fault diagnosis in a help-desk application used by a multinational company. The approach incorporates neural network and rule-based reasoning within the framework of a case-based reasoning cycle. NN extracts knowledge from service records of the customer service database and subsequently recalls the most appropriate service records based on the user's fault description during the retrieval phase. RBR is then used to reuse the checkpoint solutions from the retrieved service records and guide the customer through a step-by-step approach to help diagnose the machine fault in the most effective manner. The machine problem and its checkpoint solution are revised with user feedback. The revised information is then retained by updating the relevant databases.

Performance evaluation has been carried out which has shown that the proposed data mining approach outperforms the traditional kNN technique of CBR systems in both accuracy and efficiency of retrieval. Ease of use and fast and accurate retrieval are among the positive comments often cited by end users of the Web-based fault diagnosis system.

References

1 BRACHMAN, R. J., et al.: 'Mining business databases', Communications of the ACM, 1996, 39, (11), pp. 42-48
[...] 1, pp. 81-106
[...] based reasoning in a hybrid [...]
[...] MIT Press, 1998)
8 Johnson Space Center's homepage: CLIPS (C Language Integrated Production System), 2001. Online document available at URL [...]c.nasa.gov/~clips/CLIPS.html

© IEE: 2001

A. C. M. Fong is with the Institute of Information and Mathematical Sciences, Massey University, Albany, New Zealand, E-mail: [...]c.fong@massey.ac.nz. He is an IEE Member. S. C. Hui is with Nanyang Technological University, School of Computer Engineering, Nanyang Avenue, Singapore 639798, E-mail: [email protected].
