
ARCHIVES of FOUNDRY ENGINEERING, Volume 8, Special Issue 1/2008, 261-268

Possibilities of decision trees applications for improvement of quality and economics of foundry production

M. Perzyk a,*, A. Soroczyński a, R. Biernacki a

a Metal Casting Department, Warsaw University of Technology, ul. Narbutta 85, 02-524 Warszawa, Poland
* Corresponding author. E-mail: M.Perzyk@acn.waw.pl

Received 16.01.2008; accepted in revised form 22.03.2008

Abstract

The key activity areas related to quality and economics of foundry production are presented: designing of manufacturing processes, control of production processes as well as analysis of root causes of process faults and irregularities. Possibilities of utilization of data mining methods, including decision (classification) trees type learning systems, are indicated. In particular, the role of that kind of tools in decision making concerning selection of process type and optimum materials and parameters, as well as in identification of process excessive variations, similarly as with the control charts, is discussed. Evaluation results of classification systems from the viewpoint of their applicability, accuracy and software availability are presented, including decision trees, naive Bayesian classifier, rough sets theory, direct rule induction methods as well as artificial neural networks. In the final part of the paper the knowledge in the form of rules obtained from classification trees is demonstrated using the example of decision making related to application of risers for grey cast iron castings.

Keywords: application of information technology to the foundry industry, quality management, data mining, decision trees, knowledge rules.

1. Introduction

In the majority of manufacturing companies large amounts of data are collected and stored, related to designs, products, equipment, materials, manufacturing processes etc. Utilization of that data for improvement of product quality and lowering manufacturing costs requires extraction of knowledge from the data, in the form of appropriate conclusions, rules and procedures. This can be facilitated by methods offered by the new, interdisciplinary field called data mining (DM). DM has been growing rapidly in recent years; however, until now it has been used mainly in the business sphere, medicine and social sciences. Applications to manufacturing and design on a large scale are relatively seldom [1-5].

In the foundry production area there are several types of important practical problems that can be solved through extracting the knowledge from recorded past data, such as:

- Detection of causes of irregularities of the process, leading to deteriorating product quality. This can apply to the final products, e.g. an increasing percentage of defective castings, or to intermediate products, e.g. lowered strength of alloys or molding sands.
- Detection of causes and prediction of breakdowns of machines, furnaces etc. Often the cause of failure is a combination of operation parameters which cannot be identified 'manually'.
- Indication of the most suitable process parameters for effective control of production processes.
- Prediction of process parameters or product characteristics (e.g. dimensional deviations) on the basis of current and past values recorded as sequence type data (time series).
- Establishing rules for design of casting processes, e.g. rigging systems, or for process operations, e.g. molding sand preparation, melting procedures etc.

Data mining utilizes methodologies and tools from several disciplines such as database systems, visualization, statistics and learning (AI) systems. There are many types of methods available; however, their practical applications are often difficult because of the lack of full knowledge concerning their behavior and applicability to specific types of practical tasks.

The aim of the present work was a comparative analysis of various methods of knowledge extraction, in particular in the form of logic rules, on the basis of recorded production cases, that can be used in designing and control of foundry processes. The results will be a foundation for selection or construction of the software for semi-automated engineering and operational knowledge extraction in foundries, leading to reduction of the time necessary for introduction of new products and improvement of the product quality and manufacturing economics. The scope of the present work includes:

- Analysis of the most important needs of foundries and characterization of problems for which the knowledge would be generated.
- Comparative analysis of suitability of the most commonly used tools for knowledge extraction from recorded data: decision trees, naive Bayesian classifier, rough sets theory, direct rule induction methods as well as artificial neural networks.
- Testing and evaluation of logic rules obtained from decision trees generated on the basis of information included in the empirical nomographs available in foundry engineering literature.

2. Characterization of foundry-related problems from the standpoint of knowledge extraction methods

In this section some characteristic groups and types of manufacturing problems appearing in foundries are presented, for which the knowledge extracted from the past production cases would be useful. These problems include designing of manufacturing processes, control and detection of production process irregularities as well as discovery of root causes of the process faults and irregularities being a source of deteriorating product quality.

Designing of the manufacturing processes and tooling in contemporary industry is assisted by advanced computer tools, covering simulation software, expert systems based on knowledge acquired from human experts as well as the knowledge extracted by semi-automated DM methods. A flowchart of the designing process is shown in Fig. 1. The designing aids, both conventional ones (formulas, procedures, databases etc.) and the modern knowledge systems, play a particularly important role at the initial stage of the designing process. The proper choice of the manufacturing process alternative in that phase allows reduction of the number of design versions and, consequently, the number of necessary corrections resulting from simulation and/or floor tests.

In Table 1 some exemplary characteristic problems related to selection of the manufacturing process alternatives are presented; the knowledge obtained by DM methods can significantly contribute to the right decision making.

Detection of irregularities (excessive variations) of manufacturing process is usually performed with application of Statistical Process Control (SPC) methods. The term 'control' is used to indicate that SPC includes not only the detection of appearance of the irregularity but also analysis of the cause and usage of its results for determination and elimination of the cause. The fundamental tools of SPC are control charts, which are used to detect the statistical process instabilities. An instability can be attributed to particular factors such as operator or team, machine or device, batch of material etc.


Table 1.
Selection problem | Relationships and conditions
Type of pouring system (closed - open) | Usually the choice is related to a compromise between economics of production (casting yield and utilization of the molding plate area) and quality (slag and other impurities can be more easily stopped by the open pouring).
Number of parts | The optimum solution depends on the number of cavities in the mold.
Gating points | Desired type of flow (laminar), assurance of castability, utilization of the gating system for feeding of the casting (possible in some cases), mold cavity filling sequence (cold shuts), misruns, mold erosion, temperature distribution at the onset of solidification.
Conventional, insulated or exothermic riser | Application of insulating or exothermic sleeves increases the yield, leading to savings on cleaning of castings and energy required for melting, but increases the direct costs.
Additional feeder or wall padding to extend the feeding distance | Casting yield, quality of casting.
Application of chills for extending of feeding distances | Casting yield vs complicated molding process and separation of chills.


[Figure 1: flow chart. Recoverable block labels: "Conventional", "Expert systems from human knowledge", "Knowledge obtained", "Preliminary design", "Design version i = 1, ...", "Design final version".]

Fig. 1. Flow chart illustrating designing of manufacturing processes in contemporary industry. Broken lines indicate the back-flow of the information used as training data for DM knowledge extraction systems

The most often used type of control chart is Shewhart's sample mean chart, for which several rules of interpretation have been formulated, permitting identification of the excessive variation or its general cause. These rules are put together in Table 2, based on the description given in [6].

The present authors propose utilization of classification tools, e.g. decision trees, for identification of the type of irregularity (as listed in the right column of Table 2). The idea is to replace the information such as that listed in the left column of the table by a set of actual values of recent sample means, which would be input (independent) variables for the classification tree. The primary training could be performed on artificially generated data sets, including various realizations of cases listed in Table 2. The final induction of the tree would be based on data including real cases, recorded in a given enterprise.
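The artificially generated training sets mentioned above could be sketched as follows. This is only an illustration under stated assumptions: the four pattern names, the window length of 15 sample means, the shift size and the drift slope are hypothetical choices made here, not values given by the authors. Each generated record pairs a window of recent sample means (the tree inputs) with the label of the irregularity that produced it (the tree output).

```python
import random

# Hypothetical pattern labels, loosely following the interpretations in Table 2.
KINDS = ["stable", "shift", "drift", "alternating"]

def make_window(kind, n=15, sigma=1.0, shift=1.5, slope=0.3, rng=None):
    """Return n simulated sample means around a centre line at 0."""
    rng = rng or random.Random()
    noise = [rng.gauss(0.0, sigma) for _ in range(n)]
    if kind == "stable":        # process in statistical control
        return noise
    if kind == "shift":         # an important factor displaced the mean
        return [x + shift for x in noise]
    if kind == "drift":         # steadily increasing process average
        return [x + slope * i for i, x in enumerate(noise)]
    if kind == "alternating":   # two systematically alternating causes
        return [x + (shift if i % 2 == 0 else -shift)
                for i, x in enumerate(noise)]
    raise ValueError(f"unknown pattern kind: {kind}")

def make_dataset(per_class=100, seed=0):
    """Build (window, label) records for all pattern kinds."""
    rng = random.Random(seed)
    return [(make_window(kind, rng=rng), kind)
            for kind in KINDS for _ in range(per_class)]

if __name__ == "__main__":
    data = make_dataset(per_class=2, seed=1)
    for window, label in data:
        print(label, [round(v, 2) for v in window[:5]], "...")
```

Records of this shape could then be passed to any classification tree tool as a primary training set and later replaced or extended with real cases recorded in the plant.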

Discovery of root causes of manufacturing process irregularities, leading to deteriorating product quality, is undoubtedly one of the most important tasks which could be performed with the use of DM techniques, particularly learning systems. One of the possibilities is to build a regression model of the process, in which the input (independent) variables would be widely understood process parameters and the output (dependent) variable should characterize the process quality. An analysis of the model should indicate the most significant input variables, i.e. those which affect the output to the largest extent - these are the most probable causes of the quality drop.

The input variables can be factors related to material, machine, man, organization, environment etc., while the quality can be defined by the product's property level (e.g. strength) or fraction of defective parts. Examples of Ishikawa diagrams, illustrating relationships appearing in some foundry processes, can be found in [5, 7-9]. That type of graph could be a good foundation for construction of the process models, particularly selection of the input variables.

The significance analysis of independent variables can also be carried out with the use of non-parametric statistical methods such as ANOVA and contingency tables. However, the present authors have demonstrated important advantages of the methods based on learning systems over the statistical ones [10].
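As a minimal sketch of the contingency-table approach mentioned above, the Pearson chi-square statistic and Cramér's V can be computed for a table that cross-tabulates a binned input variable against a quality class. The function names and the example counts are hypothetical; the detailed methodology used by the authors is described in [10] and may differ.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for a contingency table of counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(table):
    """Cramér's V: chi-square rescaled to the range 0..1."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0])) - 1
    return math.sqrt(chi_square(table) / (n * k))

if __name__ == "__main__":
    # Hypothetical counts: rows = low / high pouring temperature,
    # columns = sound / defective castings.
    table = [[45, 5], [20, 30]]
    print(round(chi_square(table), 2), round(cramers_v(table), 2))
```

Computing such a measure for every candidate input variable gives a simple ranking of relative significances that can be compared with the ranking obtained from a learning system.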

Table 2. Interpretation of point distributions on SPC sample mean charts
Pattern of points distribution | Interpretation of pattern (process fault signal)
9 points in a row on one side of the central line | An important factor influenced the process
6 points in a row steadily increasing or decreasing | Drift in the process average
14 points in a row alternating up and down | Two systematically alternating causes are influencing the process
2 out of 3 points in a row in Zone A or beyond | "Early warning" of a potential process shift
4 out of 5 points in a row in Zone B or beyond | "Early warning" of a potential process shift
15 points in a row in Zone C | Variability was reduced
8 points in a row outside of Zone C | Process is affected by different factors, resulting in a bimodal distribution of means

Annotation: Zone A is defined as the area between 2 and 3 times sigma above and below the center line; Zone B is defined as the area between 1 and 2 times sigma, and Zone C is defined as the area between the center line and 1 times sigma.
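The patterns of Table 2 can be checked mechanically. The sketch below is one possible reading of those rules, not the authors' implementation: it assumes the sample means have already been expressed in sigma units relative to the center line, evaluates each rule only on the most recent points, and treats the "2 out of 3" and "4 out of 5" rules as requiring the points to lie on the same side of the center line. The function name and the rule labels are hypothetical.

```python
def triggered_rules(z):
    """Return labels of the Table 2 patterns matched by the latest points.

    z: sample means in sigma units (0 = center line), oldest first.
    """
    def last(n):
        return z[-n:] if len(z) >= n else None

    hits = []
    w = last(9)
    if w and (all(v > 0 for v in w) or all(v < 0 for v in w)):
        hits.append("9 on one side")
    w = last(6)
    if w:
        d = [b - a for a, b in zip(w, w[1:])]
        if all(x > 0 for x in d) or all(x < 0 for x in d):
            hits.append("6 trending")
    w = last(14)
    if w:
        d = [b - a for a, b in zip(w, w[1:])]
        if all(x * y < 0 for x, y in zip(d, d[1:])):
            hits.append("14 alternating")
    w = last(3)
    if w and (sum(v > 2 for v in w) >= 2 or sum(v < -2 for v in w) >= 2):
        hits.append("2 of 3 in zone A or beyond")
    w = last(5)
    if w and (sum(v > 1 for v in w) >= 4 or sum(v < -1 for v in w) >= 4):
        hits.append("4 of 5 in zone B or beyond")
    w = last(15)
    if w and all(abs(v) < 1 for v in w):
        hits.append("15 in zone C")
    w = last(8)
    if w and all(abs(v) > 1 for v in w):
        hits.append("8 outside zone C")
    return hits

if __name__ == "__main__":
    print(triggered_rules([0.5] * 9))
    print(triggered_rules([-0.2, 0.1, 0.3, 0.5, 0.7, 0.9]))
```

A classification tree trained on windows of sample means is meant to learn equivalent distinctions directly from the values, without the rules being coded by hand.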


3. Tools for knowledge extraction from recorded production cases

The analysis being the subject of the present section is a result of extensive literature studies, both the recently published handbooks and research reports as well as the authors' own research, mainly related to the detection of root causes of deteriorating quality. The summary of those studies is presented, in a symbolic form, in Tables 3 and 4. Terms and denotations used in the tables are explained below.

- "Classification model" means a system suitable for classification of new objects without explicit rules presentation;
- "Precision" denotes the ability of flexible and precise fitting of the model to data;
- "Availability" relates to software and includes its user-friendliness and pricing;
- ? means a lack of reliable information for a given application;
- * denotes the present authors' own software.

4. Testing of logic rules generation from decision trees

The software market recognition has been done, which resulted in selection of the software package MineSet version 100M, provided by a US company Purple Insight. It offers a large variety of possibilities such as:

- preliminary statistical and visualization analysis of data;
- building and interrogation of classification and regression trees with various splitting criteria and pruning options;
- building and interrogation of boosted trees;
- building and interrogation of naive Bayesian classifier;
- logic rules presentation;
- input variable significance analysis, based on classification and regression trees as well as naive Bayesian classifier;
- cluster analysis;
- association rules generation;
- wide possibilities of 3D results visualization.

A comprehensive reference, including the software essentials and functioning as well as user's guide and tutorial in electronic version, is available.

In this chapter some activities including research as well as preparation of teaching courses related to knowledge extraction by means of decision trees, dedicated for applications in foundry industry, are presented.

Table 3. Applicability of knowledge obtained with DM methods for solving production problems
[Table 3: rows (type of task): process design; process control (detection of irregularities (excessive variations) in process); determination of root causes of process irregularities (quality deterioration). Columns (type of knowledge): classification of cases; verbal logic rules; prediction with regression; significance of process parameters. Cells hold ratings from + to +++.]

[Table 4: rows (DM tools) include classification and regression trees, direct induction of rules and artificial neural networks. Columns (type of knowledge): classification models; verbal logic rules; prediction with regression models; significance of process parameters. Each cell rates Applicability, Precision and Availability with the symbols -, +, ++, +++ and ?; * marks the present authors' own software.]


4.2. Testing data set

For the preliminary evaluation of decision trees as the engineering knowledge extraction tools the data records were obtained as readouts from a nomograph published in the professional literature related to foundry technology [11]. This nomograph, comprising semi-empirical knowledge, is widely used for calculation of the feeding shrinkage of grey cast iron castings and determination of appropriate dimensions of risers. The fundamental decision which should be made in designing of rigging systems for that kind of castings is whether the application of a riser is necessary. The so called riserless design can be appropriate when the iron expansion, which occurs during the solidification period, is capable of compensating its shrinkage, which takes place during cooling of the liquid phase, i.e. when the overall volume change (called inaccurately shrinkage) will be positive. The volume changes appearing during cooling and solidification of grey cast iron castings depend on:

- pouring temperature (superheating of the alloy), affecting mainly the liquid contraction,
- cooling rate of the casting, dependent mainly on its massiveness and defined by the solidification modulus,
- chemical composition of cast iron (defined by the fractions of two groups of elements: carbon and the summary fraction of silicon and phosphorus).

It is worth noticing that the relationships between shrinkage and the above quantities are not independent of each other, e.g. only massive castings can be poured from lower temperatures. In general, the complexity of the problem means that analytical methods of calculation of shrinkage and risers are not available.

The number of readouts of the nomograph made for various combinations of all input variables was 191. The continuous output variable values (shrinkage S) were converted to nominal (discrete) ones, expressed by classes. Two versions of the output variable classifications were assumed:

Version 1: two values: "riser not required" (if S ≥ 0) and "riser required" (if S < 0).

Version 2: three values defining the necessity of use and type of the riser: "not required" (if S ≥ 0), "small" (if -1% ≤ S < 0) and "large" (if S < -1%). When the riser volume is relatively small, it is usually cost ineffective to apply the exothermic sleeves, while for large riser volumes the sleeves are commonly used. That type of classification would therefore be helpful in making decisions concerning both the necessity of a riser and its type.

Finally, two test data sets were obtained, each of four real value inputs (carbon content C, %; summary content of silicon and phosphorus Si+P, %; solidification modulus mo, cm; pouring temperature tpor, °C) and with one output, in the form of the above defined two types of nominal values. That type of data sets can be considered, to a certain extent, as examples of real, noisy data sets obtained in industrial conditions. On the other hand, they express the hidden relationships about which much is known, thus permitting better interpretation of the results of testing the trees and rules induction.

A preliminary statistical analysis of significance of input variables (in respect of the shrinkage S) was carried out (detailed methodology can be found in [10]). The results, presented in Fig. 2, show that the pouring temperature has an unquestionably predominant influence on the shrinkage, while the other variables are much less significant.

[Figure 2: bar chart. Recoverable label: "Contingency tables test".]

Fig. 2. Relative significances of input variables for output variable cast iron shrinkage obtained by statistical non-parametric methods
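The conversion of the continuous shrinkage S into the nominal classes of Version 1 and Version 2 described in this section can be sketched as below. The function names are hypothetical, S is assumed to be given in percent, and the handling of the boundary values (S = 0 counted as "not required", S = -1% counted as "small") is an assumption made for this illustration.

```python
def riser_class_v1(s):
    """Version 1: two classes from the overall volume change S (in %)."""
    return "riser not required" if s >= 0 else "riser required"

def riser_class_v2(s, small_limit=-1.0):
    """Version 2: three classes; small_limit is the -1 % threshold."""
    if s >= 0:
        return "not required"
    if s >= small_limit:
        return "small"
    return "large"

if __name__ == "__main__":
    for s in (0.4, -0.3, -1.8):
        print(s, riser_class_v1(s), "/", riser_class_v2(s))
```

Applying both functions to every nomograph readout yields the two test data sets, which share the same four inputs and differ only in the output column.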

4.3. Results of trees and rules generation

In Fig. 3 a graphic representation of the decision tree for Version 1 data set is presented, obtained for the MineSet software default settings (pessimistic pruning at confidence level 0.7). The sizes of the three bars appearing in each node reflect numbers of records (cases) of the data set: directed to left branch, right branch and split in the node (base bar). In Table 5 the information available for the tree shown in Fig. 3 is presented.

The simplest form of the verbal rule equivalent to the generated tree will be:

"If a casting modulus is larger than 1.125 cm and a pouring temperature is lower than 1250 °C, then a riser is not necessary, else a riser is necessary".

This result is in agreement with expectations based on foundry experience.

It is worth noticing that the procedure used for tree induction has completely ignored the two less significant variables (defining the cast iron chemical composition), which could essentially be a result of a low precision of the tree model. However, a closer examination of the training data revealed that the sign of shrinkage, deciding about the need of riser application, is a result of the pouring temperature and casting modulus only. In other words, there was no pair of records in which the pouring temperature and casting modulus would be the same and only one or both of the two ignored variables would be different, which would have different classes of the output variable. For that kind of data, the tree structure could not be different from the obtained one.
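The examination of the training data described above amounts to a simple consistency check: group the records by pouring temperature and casting modulus and see whether any group contains more than one output class. A minimal sketch follows; the field names and the few example records are hypothetical and are not readouts from the nomograph.

```python
from collections import defaultdict

def conflicting_groups(records, keys=("tpor", "mo"), target="riser"):
    """Return key combinations that occur with more than one class."""
    classes_by_key = defaultdict(set)
    for rec in records:
        classes_by_key[tuple(rec[k] for k in keys)].add(rec[target])
    return {k: v for k, v in classes_by_key.items() if len(v) > 1}

# Hypothetical records: same tpor and mo, different chemistry, same class.
records = [
    {"C": 3.2, "SiP": 2.0, "mo": 1.5, "tpor": 1250, "riser": "riser not required"},
    {"C": 3.6, "SiP": 2.5, "mo": 1.5, "tpor": 1250, "riser": "riser not required"},
    {"C": 3.2, "SiP": 2.0, "mo": 1.0, "tpor": 1300, "riser": "riser required"},
]

if __name__ == "__main__":
    # An empty result means the class is fully determined by tpor and mo,
    # so a tree has no reason to split on the chemical composition.
    print(conflicting_groups(records))
```

An empty result for the whole data set corresponds to the situation reported for Version 1, where the two composition variables could not improve the classification.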

In Table 6 the information available for the tree obtained for Version 2 data set with the MineSet software default settings (pessimistic pruning at confidence level 0.7) is presented.


Fig. 3. Illustration of decision tree generated by MineSet software for Version 1 data set

Table 5. Logic relations and numerical values corresponding to the decision (classification) tree shown in Fig. 3 (Version 1 data set), obtained for pessimistic pruning at confidence level 0.7 (column headings added by the present authors)
Path to node - logic rule left part | Splitting variable in node or leaf (result of classification) | Notation of branch leading to node | Sizes of output variable values in node (riser not necessary / riser necessary) | % fractions of output variable values in node (riser not necessary / riser necessary) | Node purity
'tpor, °C' <= 1250 : 'mo, cm' <= 1.125 | riser required | <= 1.125 | 0, 6 | 0, 100 | 100
'tpor, °C' <= 1250 : 'mo, cm' > 1.125 | riser not required | > 1.125 | 41, 0 | 100, 0 | 100
'tpor, °C' > 1250 | riser required | > 1250 | 0, 144 | 0, 100 | 100

The results obtained for the Version 2 data set (with three nominal values of the output variable, i.e. riser: "not required", "small" and "large") for default settings of the MineSet program are so complex that their presentation in a form of one or a few simple verbal rules is difficult. Nevertheless, using those results for decision making regarding the feeding method in any particular case is simple. It is worth noticing that here the tree induction algorithm has also utilized the previously ignored variables of low significance, defining the cast iron chemical composition, which also affect the classification results.

Because of the complexity of the tree obtained for the Version 2 data set, another method of tree pruning was tried. Instead of the MineSet default pessimistic pruning, the cost-complexity criterion at the default level = 0 (tree of minimum cost) was applied. The results are shown in Table 7.

For the so simplified tree the verbal decision rule can also be relatively simple, e.g.:

"If a pouring temperature is lower than 1250 °C, then for a casting modulus larger than 1.125 cm a riser is not required while for a casting modulus smaller than 1.125 cm a small riser is necessary; in all other cases a large riser is needed".

It is worth noticing that for the pruning method based on the cost-complexity criterion a significantly simpler tree was obtained, compared to the pessimistic pruning, which is in agreement with a general tendency for these pruning methods [12]. In particular, the relatively less significant variables, defining the cast iron chemical composition, were ignored.
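The two verbal rules quoted in this section translate directly into code. The sketch below uses hypothetical function names and takes the split conditions in the form listed in the tables (tpor <= 1250 °C, mo > 1.125 cm); it also shows that the three-class rule collapses to the two-class one when "small" and "large" are both read as "riser required".

```python
def riser_rule_v1(tpor, mo):
    """Rule for Version 1: tpor in °C, mo (casting modulus) in cm."""
    if tpor <= 1250 and mo > 1.125:
        return "riser not required"
    return "riser required"

def riser_rule_v2(tpor, mo):
    """Rule for Version 2 obtained with cost-complexity pruning."""
    if tpor <= 1250:
        return "not required" if mo > 1.125 else "small"
    return "large"

if __name__ == "__main__":
    for tpor, mo in [(1200, 2.0), (1200, 1.0), (1350, 2.0)]:
        print(tpor, mo, riser_rule_v1(tpor, mo), "/", riser_rule_v2(tpor, mo))
```

Note that the simplified Version 2 rule assigns every casting poured above 1250 °C to the "large" class, which is the price paid for ignoring the chemical composition.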


Table 6. Logic relations and selected numerical values corresponding to the decision (classification) tree obtained for Version 2 data set for pessimistic pruning at confidence level 0.7 (column headings added by the present authors)
Path to node - logic rule left part | Splitting variable in node or leaf (result of classification) | Sizes of output variable values in node (riser not necessary / riser necessary) | Node purity
'tpor, °C' <= 1250 : 'mo, cm' <= 1.125 | small | 0, 6, 0 | 100
'tpor, °C' <= 1250 : 'mo, cm' > 1.125 | not required | 0, 0, 41 | 100
'tpor, °C' > 1250 | tpor, °C | 126, 18, 0 | 65.71
'tpor, °C' > 1250 : 'tpor, °C' <= 1350 | mo, cm | 30, 18, 0 | 30.7X
'tpor, °C' > 1250 : 'tpor, °C' <= 1350 : 'mo, cm' > 2.25 : 'C, %' > 3.1 : 'C, %' <= 3.5 : 'Si+P, %' <= 2.5 : 'C, %' > 3.3 : 'Si+P, %' <= 1.5 | C, % | 16, 18, 0 | 4X.X
'tpor, °C' > 1250 : 'tpor, °C' <= 1350 : 'mo, cm' > 2.25 : 'C, %' > 3.1 : 'C, %' <= 3.5 : 'Si+P, %' <= 2.5 : 'C, %' > 3.3 : 'Si+P, %' > 1.5 | small | |
'tpor, °C' > 1250 : 'tpor, °C' <= 1350 : 'mo, cm' > 2.25 : 'C, %' > 3.1 : 'C, %' <= 3.5 : 'Si+P, %' > 2.5 | small | |
'tpor, °C' > 1250 : 'tpor, °C' <= 1350 : 'mo, cm' > 2.25 : 'C, %' > 3.1 : 'C, %' > 3.5 | small | |
'tpor, °C' > 1250 : 'tpor, °C' > 1350 | large | |
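The paper does not state how the "node purity" column is computed. As a hedged illustration only, one assumed definition is 100·(1 - H / log2 k), where H is the Shannon entropy of the class fractions in the node and k is the number of classes. Under that assumption the class counts (126, 24, 41) and (0, 6, 41) give 21.23 and 65.24, the purity values reported for the Version 2 tree pruned with the cost-complexity criterion, and any single-class node gives 100. The function name is hypothetical.

```python
import math

def node_purity(counts):
    """Assumed purity measure: 100 * (1 - entropy / log2(number of classes)).

    counts: records per class in the node; at least two classes are expected.
    """
    k = len(counts)
    n = sum(counts)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return 100.0 * (1.0 - entropy / math.log2(k))

if __name__ == "__main__":
    for counts in ([126, 24, 41], [0, 6, 41], [0, 6, 0], [41, 0]):
        print(counts, round(node_purity(counts), 2))
```

Whether MineSet uses exactly this formula is not confirmed by the source; the agreement with the tabulated values is the only support for it.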

Table 7. Logic relations and selected numerical values corresponding to the decision (classification) tree obtained for Version 2 data set for cost-complexity pruning criterion = 0 (column headings added by the present authors)
Path to node - logic rule left part | Splitting variable in node or leaf (result of classification) | Sizes of output variable values in node (riser not necessary / riser necessary) | Node purity
(root) | tpor, °C | 126, 24, 41 | 21.23
'tpor, °C' <= 1250 | mo, cm | 0, 6, 41 | 65.24
'tpor, °C' <= 1250 : 'mo, cm' <= 1.125 | small | 0, 6, 0 | 100

5. Summary and conclusions

The studies and tests presented in the paper allow better understanding of the role that application of DM methods can play in designing, control and fault diagnostics of manufacturing processes. The characteristics of these problems and the DM tools as well as the proposed recommendations are not restricted to foundry processes; they can be fully and directly utilized in other manufacturing processes.

Decision trees appeared to be relatively simple and convenient tools for knowledge rules generation, enabling a flexibility of the choice between a large number of precise rules and a small amount of rough rules, giving simple hints for decision making in various situations in designing and running manufacturing processes.

Further research should be aimed at systematic comparative testing and analyses of features and abilities of various knowledge extraction systems, in particular such modern tools as those based on the rough sets theory.

References

[1] A. Kusiak, Data mining: manufacturing and service applications, International Journal of Production Research, vol. 44, No. 18-19 (2006) 4175-4191.

[2] J.A. Harding, M. Shahbaz, Srinivas and A. Kusiak, Data mining in manufacturing: A review, J Manuf Sci Eng Trans ASME, vol. 128, No. 4 (2006) 969-976.

[3] K. Wang, Applying data mining to manufacturing: The nature and implications, J Intell Manuf, vol. 18, No. 4 (2007) 487-495.

[4] M. Perzyk, Data mining in foundry production, Research in Polish Metallurgy at the Beginning of XXI Century, Committee of Metallurgy of the Polish Academy of Sciences, ed. K. Świątkowski, Kraków, 2006.

[5] M. Perzyk, R. Biernacki and J. Kozlowski, Data mining in manufacturing: methods, potentials, limitations, Advances in Production Engineering conference, Warsaw University of Technology, Poland, 13-16 June 2007, 147-156 (Publishing and Printing House of the Institute for Sustainable Technologies - NRI, Radom, Poland).

[6] StatSoft, Inc. (2007). Electronic Statistics Textbook. Tulsa, OK: StatSoft. WEB: http://www.statsoft.com/textbook/stathome.html

[7] X. Guo, Implementing Six Sigma in Foundry Industry, AFS Transactions, vol. 110 (2002), 199-210.

[8] S. Kannan, J. E. Thixton, System Approach to Casting Defect Analyses and Reduction: Hydrogen Gas Defect in Iron Castings, AFS Transactions, vol. 112 (2004), 115-119.

[9] P.L. Barker, B. Bidassie, Using Statistical Tools to Detect and Improve Core Shift: A Case Study, AFS Transactions, vol. 112 (2004), 121-130.

[10] M. Perzyk, J. Kozlowski, Comparison of statistical and neural networks-based methods in analysis of significance and interaction of manufacturing processes parameters, Computer Methods in Materials Science, vol. 6, No. 2 (2006), 81-93.

[11] A. Holzmüller, R. Wlodawer, Zehn Jahre Speiser-Einguss-Verfahren für Gusseisen, Giesserei, vol. 50, No. 25 (1963) 781-791.

[12] J. R. Quinlan, Simplifying decision trees, International Journal of Man-Machine Studies, vol. 27, No. 3 (1987) 221-234.
