    Application of Bayesian network to the probabilistic risk assessment of nuclear waste disposal

    Chang-Ju Lee a,*, Kun Jai Lee b,1

    a Korea Institute of Nuclear Safety, P.O. Box 114, Yuseong, Daejeon 305-600, South Korea
    b Korea Advanced Institute of Science and Technology, 373-1 Kuseong, Yuseong, Daejeon 305-701, South Korea

    Received 24 April 2004; accepted 16 March 2005
    Available online 25 May 2005

    Abstract

    The scenario in a risk analysis can be defined as the propagating feature of a specific initiating event that can lead to a wide range of undesirable consequences. Taking various scenarios into consideration makes the risk analysis more complex than it would be without them. Many risk analyses have been performed to estimate a risk profile under both uncertain future states of hazard sources and undesirable scenarios. Unfortunately, for specific systems such as a radioactive waste disposal facility, the behaviour of future scenarios can hardly be predicted without a special reasoning process, so their risk cannot be estimated with a traditional risk analysis methodology alone. Moreover, we believe that the sources of uncertainty at future states can be reduced by pertinently setting up dependency relationships interrelating geological, hydrological, and ecological aspects of the site with all the scenarios. Under this belief, the current methodology of uncertainty analysis for the waste disposal facility needs to be revisited.

    In order to consider the effects predicted from an evolution of environmental conditions of waste disposal facilities, this paper proposes a quantitative assessment framework integrating the inference process of a Bayesian network into the traditional probabilistic risk analysis. We developed and verified an approximate probabilistic inference program for the specific Bayesian network using a bounded-variance likelihood weighting algorithm. Ultimately, specific models, including a model for uncertainty propagation of relevant parameters, were developed with a comparison of variable-specific effects due to the occurrence of diverse altered evolution scenarios (AESs). After providing supporting information to obtain a variety of quantitative expectations about the dependency relationship between domain variables and AESs, we could connect the results of probabilistic inference from the Bayesian network with the consequence evaluation model addressed. We obtained a number of practical results to improve the current knowledge base for the prioritization of future risk-dominant variables in an actual site.

    © 2005 Published by Elsevier Ltd.

    Keywords: Scenario; Probabilistic risk assessment; Bayesian network; Probabilistic inference; Causal dependency; Likelihood weighting algorithm; Uncertainty propagation; Waste disposal

    1. Introduction

    The notion of risk represents a possibility of potential harm to human beings. Over the past two decades, this notion has been formulated in many different ways to assess hazards from daily activities. Kaplan and Garrick provided conceptual guidance on risk analyses by introducing general notions of risk [1,2]. According to their definition, the concept of sets of triplets, i.e. the scenario, the likelihood, and the consequence, emerges. It is well known that a formal decomposition of risk can be given by three queries: what can happen; how likely things are to happen; and what are the end points measures from sets of occurrences [2]. This has become a standard approach for predicting the risk.

    A nuclear risk generally relates to regular emissions of radioactive nuclides or possible accidental releases of radioactivity [3]. In a nuclear context, the risk has been defined as an exceeding expectation of the magnitude of undesirable radioactive releases, e.g. the product of probability and consequence of an accident [4]. The end points measures of the nuclear risk prediction may be expressed with measures such as core damage frequency, exposure dose of humans, or the number of casualties. They can be used to interpret the various contributors to the risk, which are compared and ranked [5]. We can identify these measures with limiting (or regulatory) indicators which affect humans or the environment.

    Reliability Engineering and System Safety 91 (2006) 515–532, www.elsevier.com/locate/ress
    0951-8320/$ - see front matter © 2005 Published by Elsevier Ltd. doi:10.1016/j.ress.2005.03.011
    * Corresponding author. Tel.: +82 42 868 0149; fax: +82 42 861 1700. E-mail addresses: [email protected] (C.-J. Lee), [email protected] (K.J. Lee).
    1 Tel.: +82 42 869 3818; fax: +82 42 869 3810.

    The results of a deterministic approach are usually judged in view of specific measures known as performance criteria. The performance criteria are used to facilitate comparison with specific conservative regulatory criteria. Unfortunately, we cannot be sure that the performance criteria produce best-estimate values when the analysts' knowledge bases are incomplete. On the other hand, a probabilistic approach to quantifying the risk, generally known as a probabilistic risk assessment (PRA), has been used to provide a probability distribution of the end points measures in various fields such as nuclear reactor operation, environmental forecasting, space accident prediction, etc. [6–8]. The subjective probability distribution represents the uncertainty arising from such a lack of knowledge or from stochastic processes. The uncertainty must be practically treated in the estimation of the end points measures. The depth of uncertainty depends upon our total state of knowledge; upon all the evidence, data, and experience with similar courses of action in the past [1]. Therefore, a number of factors affect the variation of the risk results. In particular, long periods of evolution in a natural environment facing diverse stochastic processes influence the uncertainty [9]. Examples of evolution arising from stochastic processes in the environment include potential changes of weather, hydraulic profile, geological form, and ecological condition. Since the prediction of such an evolution is extremely difficult, a knowledge base might not be easily provided. Owing to both the lack of knowledge and the stochastic features, the potential risks to future generations may be subjected to complex uncertainty propagation assessment via PRA techniques.

    Diverse approaches for categorizing uncertainty have been set forth over the last two decades. Most risk analysts [10–12] agree that all the variables used in risk analysis models contain both epistemic and aleatory uncertainty because of their unique nature. In estimating the variables in a model domain (called here 'domain variables'), therefore, we had to consider a system-specific feature originating from a very broad range of aleatory and epistemic uncertainties, besides the issue of numerical solution error due to mathematical modelling approximations and temporal discretization in simulations, as noted by Oberkampf [13]. The issue of numerical solution error has generally been ignored in risk analysis, and is also beyond our concerns. Recently, some remarkable studies have been conducted which focussed on uncertainty propagation with emphasis on the separation principle for both uncertainty types in every variable. This approach has been widely introduced by Hoffman [14], Hofer [15], Siu [16], and Helton [17], among others, and is known as the so-called 'divide and conquer' concept.

    Many risk analyses have been performed to estimate a risk profile from both uncertain future states of hazard sources and undesirable scenarios. Such risk prediction requires a systematic process for the scenarios or accident sequences. In addition, a comprehensive risk prediction needs a complete listing of scenarios and estimation of their frequencies, as well as rankings of the occurrence probability of the scenarios. The scenarios are formed by combining both initiating events and mitigating capabilities. Therefore, the scenario in the risk analysis can be defined as a propagating feature of a specific initiating event which can lead to a wide range of undesirable consequences [18], which answers the question, 'What can go wrong?'

    If we imagine an initial stage of a scenario, a hazard source itself can be a trigger for specific end points measures. Various hazard sources exist in diverse natural or man-made systems. In terms of treating scenarios, we can classify the states of systems into two: active and passive. First, the state of a system is called 'active' if an event can be directly controlled as it happens; its risk has been readily assessed using a traditional PRA technique. The operating risk from nuclear power plants falls into this category. In this case, through a great deal of operating experience, as well as with the support of standardized and enhanced evaluation techniques, the risk from scenarios can be systematically predicted. On the other hand, the state of a system is called 'passive' if an event cannot be directly controlled as it happens; its risk has scarcely been assessed via the traditional PRA technique. The future risk from radioactive waste disposal facilities falls into this category. In this case, with natural hazards being considered, the risk under any of the scenarios should be estimated.

    With a view to preventing exposure or unacceptable risk to human health, which includes an adverse impact on the natural environment, a waste disposal facility has to be isolated from natural circumstances. Furthermore, it is necessary to demonstrate that waste disposal treatment is reliable and that both man and environment can be adequately protected. The risk assessment of a disposal site includes specific models for addressing the major pathways by which radionuclides adversely affect human beings. Since the system performance in the passive state is very sensitive to a variety of scenarios, a temporal evolution caused by changes in both external and internal environmental conditions becomes a fundamental factor in estimating the risk. However, since the release of risk-dominant radionuclides from a disposal site is expected to occur after hundreds or even thousands of years, there are usually no direct methods of assuring the credibility of the evolution.

    The risk profile coming from the combination of likelihood and consequences of future transport and exposure depends on aleatory uncertainty, as well as epistemic uncertainty. In addition, sources of the uncertainty at future states can be reduced by pertinently setting up dependency relationships interrelating geological, hydrological, and ecological aspects of the site with all the scenarios. It is then obvious that the current methodology of uncertainty analysis of a waste disposal site should be revisited with these diverse considerations. The uncertainties arising from different evolutions should be adequately propagated throughout the analysis. Interpreting the significance of the results of a PRA in view of the uncertainties is also important if the PRA results are applied to make meaningful risk-informed decisions. Subjective probability distributions on the numerical results are given by uncertainty analysis and are used to represent the confidence level at which a risk criterion is being met. While it is important to characterize the overall uncertainty, it is equally worth understanding which factors or variables mainly propagate the uncertainty [19]. It is required to explicitly consider distribution functions of random variables that define both the aleatory and epistemic uncertainty associated with each variable in a scenario evaluation. For some variables, only epistemic uncertainty may be treated [20–24].

    After a survey of relevant studies [9,18,20,25,26], we found that most scenarios developed for a disposal site, lacking the above-mentioned dependency treatment, were not practically linked to the PRA. Furthermore, as to the evolution modelling for stochastic domain variables under adverse environmental conditions in waste repositories, scenario analyses up to the present have not given a specific and realistic approach. These findings remind us not to lose sight of how dominant domain variables contribute to the uncertainty of the risk in relation to diverse complex phenomena and/or scenarios. Therefore, the first and the second terms of the sets of triplets in the risk definition are delineated in detail in this paper. The last term of the sets of triplets, i.e. the consequence, may be easily identified using deterministic evaluation models. In order to consider the effects owing to the evolution of environmental conditions of waste disposal facilities, this paper proposes a quantitative assessment framework combining an inference process of a Bayesian network with a traditional risk analysis.

    The remainder of this paper consists of four sections. Section 2 provides the methodology framework for using Bayesian network in the study. The probabilistic inference algorithm employed is explained in Section 3 in detail. A deterministic model of radioactive waste disposal, including an explanation of the dependency between scenarios and domain variables, is presented in Section 4. This section provides an overview of scenario categorization, as well as characteristics of the domain variables. A major portion of Section 4 is devoted to the development of the evaluation models in relation to the information given in Section 3. This section also provides some illustrative results and discussion. The last section summarizes and concludes the study.

    2. Framework development for the risk assessment using Bayesian networks

    2.1. Overview

    A Bayesian network, also known as a belief network, probabilistic network, Markov random field, or causal network, is a new concept for reasoning about complex uncertain problems, where 'network' means a graphical model. In utilizing Bayesian networks, it is usual to communicate qualitative assumptions about cause-effect relationships and to derive causal inferences from a combination of diverse assumptions. Bayesian networks have been applied to diverse technical or societal problems, including medical diagnosis (e.g. PathFinder, QMR [Quick Medical Reference]), map learning, real-time monitoring, language understanding, pattern classification, forecasting, heuristic search (e.g. Microsoft troubleshooter), etc. [27]. In addition, they have been widely used to provide decision analysis and expert systems using artificial intelligence (AI) [28].

    Since a Bayesian network constitutes a model of the reasoning process in a system, answers to a variety of queries are required. For example, a system analyst may want to know whether or not it is worthwhile to reinforce engineering protection features in order to defend against a random hazard (e.g. an initiating event). The use of Bayesian networks helps to answer such queries even when no experiment about the effects of reinforced protection is available.

    The reasoning processes can be operated by propagating probabilistic evidence in any direction, where various forms of reasoning [29] such as prediction, abduction, and explaining away are made. For example, reasoning in a forward direction from a cause (i.e. a predecessor) to an effect (i.e. a descendant) is called 'prediction'; reasoning in a backward direction from an effect (i.e. a descendant) to a cause (i.e. a predecessor) is called 'abduction'; and if an effect has two or more possible causes and one of those causes is found, the likelihood of the other causes is reduced, which is called 'explaining away'. As these reasoning processes with useful evidence are made, a Bayesian network can provide a basis to perform uncertainty propagation of query variables. In order to build overall reasoning processes for Bayesian networks, the following three steps are generally engaged:

    Structuring the graphical models for causality representation.

    Learning the structure or defining parameters of the query variables.

    Estimating the probability of the query variables by any inference algorithm.
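The three forms of reasoning described above (prediction, abduction, explaining away) can be illustrated with a minimal sketch. It assumes a hypothetical three-node network in which two independent causes A and B share one effect C; the node names, the probabilities, and the brute-force enumeration are illustrative choices and are not taken from the paper.

```python
from itertools import product

# Hypothetical priors for two independent causes (illustrative numbers only).
P_A = 0.1
P_B = 0.2

def p_effect(a: bool, b: bool) -> float:
    """Hypothetical P(C = True | A = a, B = b)."""
    if a and b:
        return 0.95
    if a or b:
        return 0.8
    return 0.01

def joint(a: bool, b: bool, c: bool) -> float:
    """Joint probability P(A=a, B=b, C=c) from the network factorization."""
    pa = P_A if a else 1 - P_A
    pb = P_B if b else 1 - P_B
    pc = p_effect(a, b) if c else 1 - p_effect(a, b)
    return pa * pb * pc

def query(target: str, evidence: dict) -> float:
    """P(target = True | evidence), computed by enumerating all 8 worlds."""
    num = den = 0.0
    for a, b, c in product([True, False], repeat=3):
        world = {"A": a, "B": b, "C": c}
        if any(world[k] != v for k, v in evidence.items()):
            continue  # world contradicts the evidence
        p = joint(a, b, c)
        den += p
        if world[target]:
            num += p
    return num / den

prediction = query("C", {"A": True})                  # cause -> effect
abduction = query("A", {"C": True})                   # effect -> cause
explained_away = query("A", {"C": True, "B": True})   # other cause observed

print(f"prediction     P(C | A)    = {prediction:.4f}")
print(f"abduction      P(A | C)    = {abduction:.4f}")
print(f"explaining away P(A | C, B) = {explained_away:.4f}")
```

With these numbers, observing the second cause B lowers the belief in A relative to knowing the effect alone, which is the explaining-away pattern. Enumeration is only practical for very small networks; it is used here purely for transparency.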


    As explained by Heckerman [30], Bayesian networks have been explicitly applied as a popular graphical modelling language to represent causality under uncertain relational situations in the real world. However, as addressed above, a whole reasoning process is deemed to be a complex task in itself. Unfortunately, since a practical implementation of Bayesian networks requires complex modelling of the inter-relationships between various initiating events and engineered mitigating features which diversify the risk profile, they are far from popular in nuclear PRA. Accordingly, after considering the causality from the initiating events to the engineered features, which will be explained in the following sections, we propose a conceptual framework to apply Bayesian networks to nuclear PRA. In a PRA adopting Bayesian networks, it should be kept in mind that an initiating event simply means a conditional initiation for any occurrence of the scenarios.

    2.2. Representation of the cause-effect dependency

    In this section, we suggest a special technique for applying the general reasoning concept of Bayesian networks to the PRA of a nuclear facility. Since our approach mainly focuses on connecting scenarios (i.e. causes) to consequences (i.e. their effects), we are mainly interested in the direct representation of the real circumstance, not in a full reasoning process of Bayesian networks.

    The structure of a Bayesian network is generally represented by an annotated directed acyclic graph (DAG) G, where nodes denote variables of interest, and arcs, i.e. linking lines, realize a probabilistic and causal dependence between the random variables that they connect. Usually, arcs are drawn from a cause to an effect. Incomplete DAGs encode expert knowledge in the form of missing arrows. Therefore, DAGs are regarded as a substantial language for communicating conditional independence assumptions, and for representing qualitative causal inferences [25]. Since DAGs represent high-dimensional uncertainty distributions, construction of Bayesian networks helps to capture engineers' intuitive understanding of complex systems or problems. Once the network is configured, subsequent computations are pursued by symbolic manipulation of probabilistic expressions. In general, after assigning a probability (or its distribution) to each node conditional on that node's predecessors, the joint probability (or its distribution) over the whole network is determined [26].

    Presenting a step-by-step decomposition of the network structure for solving a problem (e.g. for describing the performance of a system), Fig. 1 shows an overall process for making a DAG G which finally gives the causality between random variables E, denoting a cause (an input such as the occurrence of any initiating event), and R, denoting an effect (an output such as the response of the given problem). As a first step for representing G, we can reduce a problem to a simple structure as outlined in Fig. 1(a). However, it seems that we cannot provide any practical reasoning for the problem with this structure alone. If we want to resolve the causality (i.e. the relationship of cause and effect) in detail, then we have to expand G further. To this end, the structure in G is divided into two modules with decomposing paths, from E to V and from V to R, shown with dotted boxes in Fig. 1(b). A random variable V denotes a trigger (a kind of filtering interaction) reflecting the fact that the input E cannot always produce the output R. In other words, by introducing the upper module (i.e. a trigger) in the decomposition of Fig. 1(b), we can consider a possible interaction that hinders a direct impact from insignificant causes. The trigger V remains in a hidden state. Moreover, we can represent this idea by expanding the upper module in Fig. 1(b). Fig. 1(c) shows the result using an AND gate and a NOT operator, where M denotes an inhibitory indicator (or mechanism) that prevents any cause E from giving an effect R and thus has two states: true (on) and false (off). As a last step for configuring G, we have to fully represent the lower module of Fig. 1(b) as shown in Fig. 1(d). In the figure, we also introduce some random variables which can give a system response R following the instantiation of triggering (instantiation commonly means that a random variable takes a true state); therefore, let a random variable $X_k$ be an actual system performance indicator corresponding to a domain variable k under the dependency of initiating events.

    Fig. 1. The causality represented in four decomposition steps: (a) a basic DAG simply representing a cause and an effect; (b) a more extended DAG consisting of 2 modules; (c) module 1: a detailed DAG between E and V using the concept of M; (d) module 2: a detailed DAG between V and R with n elements of X. (Note: All the symbols in a node represent random variables. E denotes an occurrence of any initiating event while R denotes a response of the given problem. V denotes a trigger or a filtering interaction which reflects the fact that the cause E cannot always produce the effect R. M denotes an inhibitory indicator or mechanism that prevents any cause E from giving an effect R. $X_k$ is an actual system performance indicator corresponding to a domain variable k under the dependency of initiating events.)

    With the above-mentioned decomposition steps, we can fully represent the causal relationships from the occurrence of initiating events to the system response. In the general graphical representation, circular nodes denote continuous random variables and square nodes denote discrete ones; furthermore, a clear node indicates a hidden state and a shaded node represents an observed state. The states of unshaded nodes in the figures, therefore, cannot be observed in an actual circumstance.
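The AND/NOT construction of Fig. 1(c) can be sketched in a few lines. The sketch assumes the trigger is the logical combination V = E AND (NOT M) and, as an additional assumption not stated in the text, that E and M are statistically independent; the function names and the numbers are hypothetical.

```python
from itertools import product

def trigger_state(e: bool, m: bool) -> bool:
    """Trigger V fires only if the initiating event E occurs and the
    inhibitory mechanism M is off (AND gate with a NOT on M)."""
    return e and not m

def trigger_probability(p_event: float, p_inhibit: float) -> float:
    """P(V = True), ASSUMING E and M are independent (illustrative assumption)."""
    return p_event * (1.0 - p_inhibit)

# Cross-check the closed form against enumeration of the four (E, M) states.
p_e, p_m = 0.2, 0.25  # hypothetical probabilities
enumerated = sum(
    (p_e if e else 1 - p_e) * (p_m if m else 1 - p_m)
    for e, m in product([True, False], repeat=2)
    if trigger_state(e, m)
)
closed_form = trigger_probability(p_e, p_m)
print(enumerated, closed_form)
```

If E and M were dependent, the closed form would not hold and the joint distribution of (E, M) would have to be specified instead.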

    2.3. Modelling considerations for building a network structure

    Causality plays a great role in the graphical structure adopted in Bayesian networks. Practically, when identifying direct physical connections we should consult a graphical structure. Accordingly, Bayesian networks have the advantage of expressing a graphical description of the dependencies between random variables. It is also noted that when we make a network structure, the nodes must always be numbered in topological order, i.e. ancestors before descendants, to express the causality. Using this variable ordering and the concept of conditional independencies, we can usually configure the network structure.

    We always want a graphical structure to predict responses and infer causes from evidence. To do this, a Bayesian network needs a corresponding database and a set of explicit assumptions about our prior probabilistic knowledge of a domain. It is noted that almost all observed things can be represented as evidence, i.e. available information about the states of nodes. By interfacing the structure model to data, all kinds of evidence can be propagated within the structure.
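The topological numbering mentioned above (ancestors before descendants) can be obtained mechanically. The following sketch uses Python's standard `graphlib` module on a hypothetical E → V → {X1, X2} → R structure; the node names and the helper `is_topological` are illustrative and not part of the paper.

```python
from graphlib import TopologicalSorter, CycleError

# Each node maps to the list of its parents (hypothetical structure).
parents = {
    "E": [],
    "V": ["E"],
    "X1": ["V"],
    "X2": ["V"],
    "R": ["X1", "X2"],
}

def is_topological(order, parents):
    """True if every parent appears before each of its children."""
    pos = {node: i for i, node in enumerate(order)}
    return all(pos[p] < pos[c] for c, ps in parents.items() for p in ps)

# TopologicalSorter expects node -> predecessors, which matches `parents`.
order = list(TopologicalSorter(parents).static_order())
print(order, is_topological(order, parents))

# A cyclic "network" is not a DAG, and the sorter reports it.
cyclic = {"A": ["B"], "B": ["A"]}
try:
    list(TopologicalSorter(cyclic).static_order())
    has_cycle = False
except CycleError:
    has_cycle = True
print("cycle detected:", has_cycle)
```

The same check is a cheap way to validate a hand-built structure before any probabilities are attached to it.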

    With this modelling concept, we attempted to represent a more general structure model for application to a nuclear PRA. Fig. 2 shows a fully-specified Bayesian network corresponding to an actual inference problem. For the purpose of illustration, a multiply-connected network with four query nodes is outlined in Fig. 2. The network structure encodes conditional independencies among random variables. Therefore, in the selected graphical models, a missing edge explicitly means that there is an independent relationship between random variables. The absence of a direct link between any E and R, for example, reveals that there is no direct influence of initiating events on the final system status. In describing Bayesian networks, we use a lower-case letter to represent any single state of a random variable and an upper-case letter to represent a set of states of a random variable. We also write x = k, for example, to denote that variable x is in state k. When we observe every state of a variable in any set X, we call it a state of X.

    The logic of the structure provided in Fig. 2 is explained in detail as follows. $E_i$ in a root node represents the happening of an initiating event i which affects physical domain variables of the given system. As a descendant node is affected by different trigger nodes, diverse connections from any jth node of V ($V_j$) to any kth node of X ($X_k$) exist depending on their causal characteristics. For instance, for some variables there is only one connection, but for others, 3 or 4 connections are possible. As far as the response R from an instantiation of $X_k$ is concerned, we may define it as an adverse impact on the system or any other consequential measure of the system.

    Direct inference on the network structure shown in Fig. 2 is not easy and requires numerous probability distributions because a causal complexity exists between ancestors and descendants. To simplify the inference after structuring the model, we may take up the concept of a noisy-OR gate [28,31], specified by a function incorporating independencies between nodes relevant to the converging path. Let us take an example to express this case. As denoted in Fig. 1(c), if an individual trigger ($V_j$ following $E_i$) occurs together with its corresponding inhibitory indicator M, the instantiation of their descendant ($X_k$) in Fig. 1(d) is given with only that input present, i.e. $P_j = P(X_k \mid V_j \text{ only})$, where the expression $P(X_k \mid V_j \text{ only})$ means the conditional probability of the kth X node given only the $V_j$ node. In addition, because the process which prevents the input signal from being sufficient is independent of the others, we can easily assume that the causal mechanisms in the connection paths from $V_j$ to $X_k$ are mutually independent. Therefore, if this assumption is accepted, the noisy-OR gate model can be selected.

    Fig. 2. A Bayesian network fully treating causal relationships in PRA problems. (For illustrative purposes, four Es, Vs, and Xs are configured as nodes, respectively. Three R nodes are also provided for the states of the end points measure.)


    In the noisy-OR gate model, each input variable has a probability of being sufficient to cause an output variable. Given any combination of the inputs, the probability of a descendant is expressed in the form of a minimal cut upper bound (using the principle of inclusion–exclusion) as

    $$\Pr(X_k \mid V_1, V_2, \ldots, V_n) = 1 - (1 - p_{V_1})(1 - p_{V_2}) \cdots (1 - p_{V_n}) \tag{1}$$

    where the left-hand expression specifies the probability distribution of $X_k$ conditional on its parents $V_1, V_2, \ldots, V_n$; n denotes the maximum number of V nodes; and $p_{V_j}$ means the input probability of the parent $V_j$ to $X_k$ for an arbitrary number j, j = 1, 2, …, n. It is well known that the principle of inclusion–exclusion can compensate the combined belief of an outcome by the degree of belief that is common to a different set of causes. The main advantage of the noisy-OR gate model is that the number of parameters in the variables of interest grows only in proportion to the number of causes (n), while it grows exponentially ($2^n$ with binary states) in the application of other general gate types. As a consequence, the noisy-OR gate simplifies knowledge acquisition, saves memory space, and allows evidence propagation in time proportional to the number of parents of each variable [32].
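A minimal sketch of Eq. (1) follows, together with the parameter-count comparison made above. The function name `noisy_or`, the convention that only parents in the true state contribute to the product, and the numerical values are assumptions made for illustration.

```python
from math import prod

def noisy_or(p, active):
    """P(X_k = True | parent states) in the form of Eq. (1).

    p[j]      : probability that parent V_j alone is sufficient to cause X_k
    active[j] : True if parent V_j is in its true state
    Only active parents enter the product; with no active parent the
    result is 0 (this sketch has no leak term).
    """
    return 1.0 - prod(1.0 - pj for pj, on in zip(p, active) if on)

p = [0.3, 0.5, 0.2]  # hypothetical sufficiency probabilities

both = noisy_or(p[:2], [True, True])      # 1 - 0.7 * 0.5
single = noisy_or(p[:2], [True, False])   # reduces to p[0]
none = noisy_or(p[:2], [False, False])    # no cause present

# Parameter counts for n binary parents: one number per parent for the
# noisy-OR gate versus one number per parent configuration for a general gate.
n = len(p)
params_noisy_or = n
params_general = 2 ** n
print(both, single, none, params_noisy_or, params_general)
```

For three binary parents the general gate already needs 8 conditional probabilities against 3 for the noisy-OR gate, and the gap widens exponentially with n.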

    When modelling the real world, we often encounter situations where an effect has many potential causes. However, if we consider the actual occurrence of initiating events, it is reasonable to regard multiply-occurring initiating events as relatively rare over the time span considered, so we can reduce Eq. (1) to an equation of first-order probability. Furthermore, by adopting the assumption of mutually exclusive situations, which precludes dual initiating events, the reduced equation can be further reduced to a single term for each initiating event, the events being disjunctive. As a result, we can simplify the structure of Bayesian networks to one with a single root node, as shown in Fig. 3. For the response R in Fig. 3, only one end effect is proposed for the purpose of illustrative application.
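The reduction of Eq. (1) described above can be written out explicitly. The expansion is a standard algebraic identity; reading the "first-order" form as dropping all product terms under a rare-event assumption ($p_{V_j} \ll 1$), and the single-term form as the case where exactly one trigger is present, is our interpretation rather than a formula given in the source.

```latex
\begin{align*}
\Pr(X_k \mid V_1,\ldots,V_n)
  &= 1 - \prod_{j=1}^{n} \bigl(1 - p_{V_j}\bigr) \\
  &= \sum_{j} p_{V_j} \;-\; \sum_{i<j} p_{V_i}\,p_{V_j} \;+\; \cdots
     \;+\; (-1)^{n+1} \prod_{j=1}^{n} p_{V_j} \\
  &\approx \sum_{j=1}^{n} p_{V_j}
     && \text{(first order, } p_{V_j} \ll 1\text{)} \\[4pt]
\Pr(X_k \mid V_j \text{ only}) &= p_{V_j}
     && \text{(mutually exclusive initiating events)}
\end{align*}
```

As a quick numerical check with two hypothetical values $p_{V_1} = 0.01$ and $p_{V_2} = 0.02$, the exact form gives $1 - 0.99 \times 0.98 = 0.0298$ and the first-order form gives $0.03$.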

    3. Approaches for a probabilistic inference in Bayesian network

    As noted by Henrion [28], probability is known as the best way to express uncertain beliefs, and its role in a Bayesian network is to provide a belief basis for the qualitative encoding of a network structure. Expressing our beliefs about the states of the random variables in terms of probability, based on our available knowledge base, is quite a demanding task. In solving various real problems, most probability values would be obtained by engineering judgment rather than from the insufficient empirical data.

    3.1. Representation of variable states

    Generally, random variables have been regarded as a matter of state. In an ideal case, they can have two simple states: true and false. However, this does not fit the general case. In practice, a node can take diverse states representing continuous values.

    As discussed by Heckerman [31], it is well acknowledged that, in complex networks, the parents' inputs should be carefully treated so as not to induce an impractical problem. Suppose n parents $X_1, X_2, \ldots, X_n$ bearing on a descendant R can each have m states. According to the characteristics of Bayesian networks, we must specify the probability distribution of R conditional on every state of its parents. Thus, we have to specify $m^n$ probability distributions to represent this causality. Let us take another example, for a continuous random variable: the value of the root nodes given in Fig. 3 might follow a Poisson distribution with parameters depending on their occurrence frequencies.

    Any random variable is represented by either a continuous scalar or a set of discrete values. In Bayesian networks, however, it is usual that many variables, such as a decision variable, are inherently discrete. Even though a variable representing time-dependent situations may be inherently continuous, it can also be treated as discrete in the simulation for the convenience of computation.

    If a set $Y = \{Y_1, Y_2, \ldots, Y_n\}$ is composed of n nodes of discrete random variables $Y_i$, i = 1, 2, …, n, and a specified DAG G is given in a Bayesian network, a joint probability distribution over the set can be represented by the product of the individual probability distributions over each $Y_i$ conditional on its parents. In other words, the causal independence relationship in a graphical model tells us how a global high-dimensional probability distribution over all the variables can be factored into a product of local probability distributions over lower-dimensional spaces.

    Fig. 3. A simplified DAG of a Bayesian network for considering non-multiple initiating events in PRA. (For the purpose of illustration, n nodes for the random variables Xs and one node for a state of the end points measure are provided.)

    C.-J. Lee, K.J. Lee / Reliability Engineering and System Safety 91 (2006) 515–532

    If we denote the parent set of the $i$th variable $Y_i$ in $G$ by $Pa^i = \{Pa^i_1, \ldots, Pa^i_{m_i}\}$, where $m_i$ is the number of parents of $Y_i$, and assign their values with a corresponding probability set $f^i = \{f^i_1, \ldots, f^i_{m_i}\}$, the Bayesian network defines a probability distribution over the possible assignments of values to all the variables as

    $$\Pr(Y \mid G) = \prod_{i=1}^{n} \Pr\left(Y_i = y_i \mid Pa^i_1 = f^i_1, \ldots, Pa^i_{m_i} = f^i_{m_i}, G\right). \qquad (2)$$

    Eq. (2) implies that, given its parent set and $G$, each variable $Y_i$ is conditionally independent of the other variables that are not its descendants. This implicitly provides the well-known definition of d-separation (directional separation) [33] in the Bayesian network model.
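The factorization in Eq. (2) can be sketched for a small discrete network. The three-node DAG, the node names, and all probability values below are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
from itertools import product

# Hypothetical three-node DAG: A -> B and A -> C, all variables binary.
PARENTS = {"A": (), "B": ("A",), "C": ("A",)}

# Each table gives Pr(node = 1 | parent values); the numbers are illustrative only.
CPT = {
    "A": {(): 0.2},
    "B": {(0,): 0.1, (1,): 0.7},
    "C": {(0,): 0.3, (1,): 0.9},
}


def local_prob(node, value, assignment):
    """Pr(node = value | values of its parents in the assignment)."""
    parent_values = tuple(assignment[p] for p in PARENTS[node])
    p_one = CPT[node][parent_values]
    return p_one if value == 1 else 1.0 - p_one


def joint_prob(assignment):
    """Product of the local conditional probabilities, as in Eq. (2)."""
    prob = 1.0
    for node, value in assignment.items():
        prob *= local_prob(node, value, assignment)
    return prob


nodes = list(PARENTS)
total = sum(
    joint_prob(dict(zip(nodes, values)))
    for values in product((0, 1), repeat=len(nodes))
)
print(total)  # the factored joint distribution should sum to one
```

Only five numbers are stored here, whereas a full joint table over three binary variables would need seven independent entries; the saving grows quickly with network size.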

    3.2. Conditional probability estimation

    In Bayesian networks, the realization of any state of a random variable is termed an instantiation. The probability of each instantiation can be calculated simply as the product of the conditional probabilities of its predecessors. Therefore, we need to specify a complete joint probability distribution (JPD) over all the random variables. The full JPD over a given domain can be obtained by combining the conditional independencies in the network structure with local probability models. From the rules of probability, taking Fig. 3 as an example, we have the conditional probability

    $$\Pr(\mathbf{x} \mid r) = \frac{\Pr(\mathbf{x}, r)}{\Pr(r)} = \frac{\sum_{e,v} \Pr(e, v, \mathbf{x}, r)}{\sum_{e,v,\mathbf{x}} \Pr(e, v, \mathbf{x}, r)}. \qquad (3)$$

    In the above equation, $\Pr(e, v, \mathbf{x}, r)$ is the joint probability distribution of the random variables $e$, $v$, the set $\mathbf{x}$, and $r$, determined from the overall network structure. In reality, since this distribution may not be feasible to estimate directly, it is desirable to use an alternative approach that exploits the conditional independence assumptions. The expression in Eq. (3) can be expanded in a way that reflects the structure of the Bayesian network itself [29]. It is also noted that all the classical and logical operators can be used when defining unknown parameters in Bayesian networks. In the case of Fig. 3, if $n$ elements of $X$ are identified in the network, the JPD can be expressed, based on the chain rule of probability:

    $$\Pr(e_i, v_j, \mathbf{x}, r) = \Pr(e_i \mid \theta_e)\,\Pr(v_j \mid e_i, \theta_v)\,\Pr(r \mid \mathbf{x}, \theta_r) \prod_k \Pr(x_k \mid v_j, \theta_{x_k}). \qquad (4)$$

    As explained previously, lowercase letters are used here to represent particular instantiations of variables. The bold lowercase letter $\mathbf{x}$ represents a set of instantiations of the $X$ elements. Each $\theta$ on the right-hand side of Eq. (4) denotes the parameter of the corresponding query node. We can state that the variables $X_k$ are conditionally independent given $V$, because the dependence between $X_i$ and $X_j$, where $i \neq j$, is conditional on the certainty of $V$. If we expand Eq. (3) by applying Eq. (4), we obtain an explicit form of the conditional probability:

    $$\Pr(\mathbf{x} \mid r) = \frac{\Pr(r \mid \mathbf{x}, \theta_r) \sum_{e} \Pr(e_i \mid \theta_e) \sum_{v} \Pr(v_j \mid e_i, \theta_v) \prod_k \Pr(x_k \mid v_j, \theta_{x_k})}{\sum_{e} \Pr(e_i \mid \theta_e) \sum_{v} \Pr(v_j \mid e_i, \theta_v) \sum_{\mathbf{x}} \Pr(r \mid \mathbf{x}, \theta_r) \prod_k \Pr(x_k \mid v_j, \theta_{x_k})}. \qquad (5)$$

    In functional and logical determinations, the state values of the descendant node are wholly determined by the values of the parent nodes. Under the diverging connections in Fig. 3, for example, where the paths from the parent node $V$ to the descendant nodes $X$ remain unchanged, information can be transmitted from any node $X_k$ to another node $X_{k'}$ through the parent $V$ unless $V$ is instantiated. On the other hand, in the case of converging connections, where the paths from the parent nodes $X$ to the descendant node $R$ also remain unchanged, if nothing is known about $R$, then the parents of $R$ (the $X$ nodes) are independent of one another, so that no information can be transmitted between them.
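The conditional probability of Eqs. (3) to (5) can be evaluated by brute-force enumeration when the network is small. The sketch below assumes a binary version of the Fig. 3 topology with only two $X$ nodes; all conditional probability values are hypothetical.

```python
from itertools import product

# Hypothetical binary CPTs for the topology E -> V -> {X1, X2} -> R.
P_E = 0.3                                   # Pr(E = 1)
P_V = {0: 0.2, 1: 0.7}                      # Pr(V = 1 | E = e)
P_X = ({0: 0.1, 1: 0.6}, {0: 0.3, 1: 0.5})  # Pr(X_k = 1 | V = v), k = 1, 2
P_R = {(0, 0): 0.01, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.9}  # Pr(R = 1 | x)


def bern(p_one, value):
    """Probability of a binary value given Pr(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one


def joint(e, v, x, r):
    """Eq. (4): product of the local conditional probabilities."""
    prob = bern(P_E, e) * bern(P_V[e], v) * bern(P_R[x], r)
    for k, x_k in enumerate(x):
        prob *= bern(P_X[k][v], x_k)
    return prob


def posterior_x_given_r(x, r):
    """Eqs. (3) and (5): sum out e and v, then normalize over all x."""
    ev_states = list(product((0, 1), repeat=2))
    x_states = list(product((0, 1), repeat=2))
    numerator = sum(joint(e, v, x, r) for e, v in ev_states)
    denominator = sum(joint(e, v, xs, r) for e, v in ev_states for xs in x_states)
    return numerator / denominator


posterior = {x: posterior_x_given_r(x, r=1) for x in product((0, 1), repeat=2)}
for x, p in posterior.items():
    print(x, round(p, 4))
```

Enumeration visits every joint state, so its cost grows exponentially with the number of nodes; this is the motivation for the approximate inference discussed in the next subsection.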

    3.3. Simulation algorithm for a probabilistic inference

    In a traditional PRA, a variety of sensitivity analyses are generally used to examine the importance of various parameters and assumptions for risk modelling, including specific risk attributes. In Bayesian networks, on the other hand, it is possible to use a learning process supported by data on the structure and parameters of the variables. In a learning process with a fixed graphical structure, a parameter-fitting problem arises, which can be resolved with an available reasoning process [34]. Many structure identification methods or learning algorithms are also used to optimize a structure, a process known as structure learning. However, since we use a fixed structure in this study, the adoption of a methodology for structure learning is out of scope.

    A key point of practical reasoning is evidence propagation, that is, computing probabilistic answers to particular queries about the domain, which is generally referred to as probabilistic inference. Therefore, to obtain an adequate solution to the reasoning problem with parameter fitting, it is necessary to consider an appropriate probabilistic inference algorithm [35]. Probabilistic inference focuses on estimating the posterior distributions of random variables when relevant evidence is observed. There are two kinds of probabilistic inference: observing the effect(s) of a given model and trying to estimate the causes is termed diagnostic reasoning; observing the root(s) of a given model and trying to forecast the effect is termed predictive reasoning. Both forms of reasoning can be used in Bayesian networks. Probabilistic inference should give an adequate answer to a query in a reasonable amount of time, and answers to query problems are frequently provided within specified bounds.

    Many efficient methodologies and algorithms for evidence propagation in Bayesian networks have been proposed [36]. Various exact inference algorithms have also been developed for probabilistic inference [37]. However, since an exact inference solution in real-world networks seems to be impractical, approximate probabilistic inference has been widely surveyed by many researchers. The wide variety of approximate inference algorithms for Bayesian networks share many common traits. They have been realized using stochastic simulation-based, search-based, and variational-based techniques [38], among others. As a special case, search-based approximation algorithms, which search for high-probability configurations through a space of possible values, have appeared as an alternative to the other algorithms in Bayesian networks with extreme probabilities [39]. Variational-based inference methods, meanwhile, use a special technique that exploits the law of large numbers to approximate large sums of random variables by their means [40].

    Among the categories of approximate inference algorithms, stochastic simulation-based algorithms such as direct sampling, rejection sampling, likelihood weighting, and Markov chain Monte Carlo (sometimes Gibbs sampling) are popularly used, although their suitability depends on the characteristics of the given problem. Advantages of the Monte Carlo algorithms include their simplicity of implementation and theoretical guarantees of convergence. Disadvantages are that they can be slow to converge and that their convergence can be hard to diagnose. The accuracy of stochastic simulation-based algorithms generally depends on the sample size, irrespective of the structure of the Bayesian network. However, these algorithms have a drawback: they converge very slowly if there are extreme probabilities in the conditional probability tables (CPTs) [37].

    It is desirable to find a sufficient and effective inference algorithm for static Bayesian networks, taking the computing efficiency of the simulation into account. Given the characteristics of this study, it may be reasonable to apply a simple simulation-based algorithm. However, since general marginalization using joint probabilities requires exponential time in probabilistic inference, we had to consider more efficient methods. To meet these demands, we finally selected, from among the many inference algorithms, the likelihood weighting algorithm, which emerged as a variation of stochastic logic simulation intended to speed up the algorithm. Instead of randomly choosing a value for an evidence variable, the algorithm fixes the observed value and uses its conditional probability as a likelihood weight. Each sample thus carries a weight, and a final ratio of weighted counts is used to compute the probability. The algorithm has the following explicit advantages:

    It is applicable to any type of node, discrete or continuous.

    It avoids the inefficiency of rejection sampling by generating only events that are consistent with the evidence. That is, it fixes the values of the evidence variables and samples only the remaining query variables.

    Each event generated is consistent with the evidence. In addition, each sample is weighted by the product of the conditional probabilities of the evidence variables, given their parents.

    Furthermore, it is noted that the algorithm converges much faster than a simple stochastic simulation, but is still slow under extremely small conditional probability values. For application to Bayesian networks with extreme conditional probabilities, Dagum [38] proposed a bounded-variance likelihood weighting algorithm that provides efficient probabilistic inference in polynomial time. This updated algorithm is adopted in the study.

    The weights in a network make local assertions about the relationships between neighbouring nodes. Inference algorithms turn these local assertions into global assertions about the relationships between nodes. In the following discussion, let $E$ denote the set of observed nodes and $Y$ the nodes not contained in $E$. It is worth noting that a probabilistic inference problem can be solved easily if we have the JPD of the random variables involved. To provide an inference probability $\Pr(Y = y \mid E = e)$, which is an extended expression of the similar term in Eq. (3), we must generate relative approximations of the two probabilities $\Pr(Y = y, E = e)$ and $\Pr(E = e)$, each with a relative error $\varepsilon$. To make these approximations, the likelihood weighting algorithm decomposes the full joint probability $\Pr(Y = y, E = e)$ into a path probability distribution $\pi(y, e)$ and a weight distribution $\omega(y, e)$, given by

    $$\pi(y, e) = \prod_{Y_i \in Y \setminus E} \Pr\left(Y_i \mid Pa(Y_i)\right)\Big|_{Y = y,\, E = e}, \qquad (6)$$

    and

    $$\omega(y, e) = \prod_{E_i \in E} \Pr\left(E_i \mid Pa(E_i)\right)\Big|_{Y = y,\, E = e}, \qquad (7)$$

    where the expressions are conditioned, for example $\prod \Pr(E_i \mid Pa(E_i))\big|_{Y = y,\, E = e}$, to denote an instantiation of their arguments, in this case $Y$ to $y$ and $E$ to $e$; also, $Y_i \in Y \setminus E$ denotes the $i$th arbitrary node that is contained in the set $Y$ but not included in the evidence subset $E$. Next, in order to use the algorithm, let us define a new binary random variable $\chi(y, e)$ that is equal to true if $Y = y$ instantiates the node $X$ to $x$, and equal to false otherwise:

    $$\chi(y, e) = \begin{cases} \text{True}, & \text{if } X = x \text{ in } y, \\ \text{False}, & \text{if } X \neq x \text{ in } y. \end{cases}$$

    Then the expectation of the random variable $\chi(y, e)\,\omega(y, e)$ becomes

    $$E[\chi \cdot \omega] = \sum_{y \in \Omega} \chi(y, e)\,\pi(y, e)\,\omega(y, e) = \Pr(Y = y, E = e),$$

    where the sample space $\Omega$ for $Y$ contains all the instantiations of $Y = y$. The expectation of the weight distribution is likewise represented by

    $$E[\omega] = \sum_{y \in \Omega} \pi(y, e)\,\omega(y, e) = \sum_{y \in \Omega:\ \chi(y, e) = \text{True}} \pi(y, e)\,\omega(y, e) + \sum_{y \in \Omega:\ \chi(y, e) = \text{False}} \pi(y, e)\,\omega(y, e) = \Pr(E = e).$$

    Therefore, if we sample the distribution $\pi(y, e)$ a sufficient number of times, the limiting ratio of $\chi(y, e)\,\omega(y, e)$ to $\omega(y, e)$ converges to the desired inference probability. The simulation uses a random number generator to select values uniformly for the given subset of evidence nodes and subsequently propagates them to pick values for the other query nodes. Statistics are then maintained on the values that the nodes take, and this process gives the desired answer. In obtaining the simulation output, a key feature is the practical estimation of a query variable whose parameters are an unknown function even given the simulation inputs.
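The ratio estimator described above can be sketched with plain likelihood weighting on a small network. This is not the bounded-variance variant adopted in the study, only the basic scheme it builds on. The binary chain E -> V -> X -> R, the conditional probabilities, and the function names are all hypothetical.

```python
import random

# Hypothetical CPTs for a binary chain E -> V -> X -> R (illustrative values only).
P_E = 0.3                 # Pr(E = 1)
P_V = {0: 0.2, 1: 0.7}    # Pr(V = 1 | E = e)
P_X = {0: 0.1, 1: 0.6}    # Pr(X = 1 | V = v)
P_R = {0: 0.05, 1: 0.8}   # Pr(R = 1 | X = x)


def draw(p_one, rng):
    """Sample a binary value with Pr(value = 1) = p_one."""
    return 1 if rng.random() < p_one else 0


def likelihood_weighting(r_obs, n_samples, seed=0):
    """Estimate Pr(X = 1 | R = r_obs) as the ratio of weighted counts."""
    rng = random.Random(seed)
    weighted_hits = 0.0   # accumulates chi * omega
    total_weight = 0.0    # accumulates omega
    for _ in range(n_samples):
        # Non-evidence nodes are sampled from their conditionals (path probability).
        e = draw(P_E, rng)
        v = draw(P_V[e], rng)
        x = draw(P_X[v], rng)
        # The evidence node R is fixed, and its conditional probability is the weight.
        weight = P_R[x] if r_obs == 1 else 1.0 - P_R[x]
        total_weight += weight
        if x == 1:
            weighted_hits += weight
    return weighted_hits / total_weight


def exact_posterior(r_obs):
    """Reference value of Pr(X = 1 | R = r_obs) by full enumeration."""
    num = den = 0.0
    for e in (0, 1):
        for v in (0, 1):
            for x in (0, 1):
                p = (P_E if e else 1 - P_E) * (P_V[e] if v else 1 - P_V[e])
                p *= (P_X[v] if x else 1 - P_X[v])
                p *= (P_R[x] if r_obs else 1 - P_R[x])
                den += p
                if x == 1:
                    num += p
    return num / den


estimate = likelihood_weighting(r_obs=1, n_samples=20000, seed=1)
print(estimate, exact_posterior(1))
```

Comparing the estimate with the enumerated value is a simple way to check a sampler before applying it to a network that is too large to enumerate.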

    4. An example application to an environmental transfer model of the nuclear facility

    The application is an example of estimating the risk of a nuclear facility, specifically a radioactive waste disposal facility. Near-surface disposal is one of the preferred options for comparatively large volumes of low- and intermediate-level wastes, which arise during nuclear power plant operation, and also for wastes arising from radionuclide applications in hospitals and research establishments. The disposal system is a good domain example for this study because its performance potentially depends on a variety of scenarios. The endpoints measure of interest can be simply defined as the peak dose to a person drinking from a biosphere well. It is noted that the likelihood of a peak dose can also be represented as an endpoints measure resulting from the scenario evaluation.

    4.1. Model inputs

    Release pathways strongly depend on both the location and the design of a repository. Once the repository has been determined, release pathway models should be assembled so that they represent radionuclide transport from the repository to the environment and ultimately to human beings. A normal exposure pathway in the post-closure repository is generally defined by the non-accidental leaching of radionuclides from the repository to human beings. Key exposure pathways can be defined by considering both likelihood and significance. However, with respect to the risk concept, it may not be possible to identify a single key exposure pathway, because high likelihood and great consequence do not generally occur together. Several key factors, such as delay times from different engineered barriers, the transport of radionuclides through groundwater in the geological formation, and inventory uses by a critical human group, are considered in the model of exposure pathways.

    4.1.1. Radionuclide release characteristics

    A simple, hypothetical site with an engineered vault design is adopted for the environmental transfer model. In a near-surface disposal repository, wastes are normally disposed of at or near the ground surface, generally not below 50 m in depth. In general, both natural and engineered systems form a series of multiple and potentially independent barriers against the release of waste [41]. The engineered barriers, as well as waste containers, surrounding materials, and backfills within the vault, can also contribute to the repository performance. Various processes in the underground environment (i.e. the geosphere), such as dispersion, retention by geologic media, and holdup in the sub-layer aquifer, usually affect the amount of radioactive material released to the accessible environment (i.e. the biosphere). Also, the release timing is greatly affected by the assumptions concerning the time at which the engineered barriers are breached. Therefore, unavoidable uncertainties remain in when the waste packages are breached and in how the radionuclides are released.

    Source terms describe the release of radionuclides from the repository to the pathways through the geosphere and far field, which can ultimately reach the biosphere. From a geochemical standpoint, there are several considerations in defining the source terms: the rate and extent of container degradation, which determine when and how much waste is exposed to leaching and the other release mechanisms; the rate at which radionuclides are released into the media in which they are transported; and local retardation of nuclide movement on container debris, backfill materials, or other engineered barriers. Altogether, these considerations are important in determining the delay time of release. In other words, in spite of the engineered features, a normal delay of the source terms will occur, because a delay from the breach of the waste container and relatively slow leaching are usually expected.

    A basic mechanism governing the release in underground media is mass transfer in a porous medium, where pore water is the main agent for radionuclide transfer. After a delay period has elapsed, the groundwater system becomes an important part of the pathway for radionuclide transfer. The degree of saturation of the medium, the flow of the groundwater system, and the chemical properties of the radionuclides may be key uncertain elements in determining the release rate.

    4.1.2. Deterministic pathway model

    We adopted a deterministic model for the pathway analysis, which was prepared by the National Council on Radiation Protection and Measurements (NCRP) [42]. Since the main objective of the NCRP model is to give a simple means of predicting results for screening analyses, conservative approaches and parameters, with effective dose factors for 826 radionuclides, were proposed for many environmental transport pathways. The model gives a screening technique that can be employed to demonstrate compliance with environmental standards or other administratively set reference levels against releases of radionuclides to the biosphere. The screening dose assessment gives a defensible basis for making regulatory decisions because the resulting dose is likely to be overestimated rather than underestimated.

    Only the groundwater ingestion pathway is considered in this paper. Others, such as direct irradiation, inhalation, and soil and vegetable ingestion pathways, are not addressed, because any single pathway model suffices to fulfil the objective of the study. According to the NCRP model, with only one-time burial assumed, the average yearly fraction $X_0$ of the initial inventory of the parent radionuclide remaining from time $t_{delay}$ to $t_{delay} + T_{av}$ is derived as

    $$X_0 = \frac{\left(1 - e^{-(\lambda_L + \lambda_{r0})\,T_{av}}\right) e^{-t_{delay}\,\lambda_{r0}}}{T_{av}\,(\lambda_L + \lambda_{r0})} \qquad (8)$$

    where

    $\lambda_L$: leach rate (y$^{-1}$)
    $\lambda_{r0}$: radiological decay constant of the parent (y$^{-1}$)
    $T_{av}$: averaging time
    $t_{delay}$: delay time for release (y)
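Eq. (8) can be evaluated directly, as in the following sketch. The function name and all input values are hypothetical; the half-life used for the parent is an approximate figure for Th-230 and is an assumption of this example, not a value given in the text.

```python
import math


def average_parent_fraction(leach_rate, decay_const, t_avg, t_delay):
    """Eq. (8): average yearly fraction of the initial parent inventory.

    All rates are in 1/y and all times in y. expm1 is used so that the
    expression stays accurate when (leach_rate + decay_const) * t_avg is small.
    """
    k = leach_rate + decay_const
    depletion = -math.expm1(-k * t_avg)          # 1 - exp(-k * T_av)
    decay_during_delay = math.exp(-decay_const * t_delay)
    return depletion * decay_during_delay / (t_avg * k)


# Hypothetical inputs for illustration only.
half_life_y = 7.54e4                 # approximate Th-230 half-life (assumed here)
decay_const = math.log(2.0) / half_life_y
leach_rate = 2.0e-3                  # 1/y, hypothetical
for t_delay in (10.0, 100.0, 1000.0):
    x0 = average_parent_fraction(leach_rate, decay_const, t_avg=1.0, t_delay=t_delay)
    print(t_delay, x0)
```

For a very short averaging time the expression reduces to the pure decay factor over the delay period, which gives a quick sanity check on an implementation.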

    The leach rate defined in the NCRP model is

    $$\lambda_L = \frac{I}{R\,H\,n} \qquad (9)$$

    where

    $R$: retardation coefficient (dimensionless)
    $I$: groundwater infiltration rate (m y$^{-1}$)
    $H$: thickness of the layer of buried soil (m)
    $n$: soil porosity (dimensionless).
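A short sketch of Eq. (9) shows how the leach rate scales with the infiltration rate. The function name and the numerical inputs (retardation coefficient, layer thickness, porosity) are hypothetical and are not site data from the paper.

```python
def leach_rate(infiltration_m_per_y, retardation, thickness_m, porosity):
    """Eq. (9): lambda_L = I / (R * H * n), returned in 1/y."""
    if min(retardation, thickness_m, porosity) <= 0.0:
        raise ValueError("R, H and n must be positive")
    return infiltration_m_per_y / (retardation * thickness_m * porosity)


# Hypothetical nominal inputs for illustration only.
nominal = dict(retardation=100.0, thickness_m=3.0, porosity=0.3)
base = leach_rate(0.2, **nominal)

# Scale the infiltration rate to mimic a scenario-induced change.
for factor in (0.5, 1.0, 1.5):
    print(factor, leach_rate(0.2 * factor, **nominal))
```

Because the relation is linear in $I$ and inversely proportional to $R$, a scenario that changes either quantity changes the leach rate by the same relative amount.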

    The domain variables of interest in Eq. (9) are the infiltration rate and the retardation coefficients for each radionuclide of interest. For nuclides with significant progeny, the average fractions of each progeny arising from the parent nuclide over its whole lifetime are also provided in Ref. [42]. Taking into account the screening factor for the groundwater ingestion pathway, we can calculate the screening dose rate $D_i$, i.e. the annual committed effective dose of radionuclide $i$, as

    $$D_i = \frac{M_{0,i} \cdot \lambda_L\,T_0\,U_{DW}}{V} \sum_{i=0}^{N} X_i \cdot DF_{ing,i} \qquad (10)$$

    where

    $M_{0,i}$: initial source term inventory (Bq)
    $T_0$: maximum leaching period (y)
    $\lambda_L$: leach rate of the parent nuclide (y$^{-1}$)
    $U_{DW}$: consumption of drinking water, taken to be 800 L y$^{-1}$
    $V$: dilution volume, i.e. total volume of water in the aquifer (L y$^{-1}$)
    $DF_{ing,i}$: ingestion dose factor for radionuclide $i$ (Sv Bq$^{-1}$)
    $X_i$: decay correction factor (dimensionless)

    Basic site data are adopted from the study conducted within the IAEA research program on the safety assessment of near-surface radioactive waste disposal facilities [43]. From it we obtain the following system information: the repository is located in a stable geological formation at an approximate depth of 3 m; the length of the facility is perpendicular to the direction of groundwater flow; all the soil layers are composed of saturated zone, which implicitly means that the properties of the unsaturated soil layer, such as partition coefficient or bulk density, are the same as those of the saturated soil layer; and the geosphere is characterized as an unconfined sand aquifer with defined properties. Although this adoption intentionally limits the scope of the work, there is sufficient realism to demonstrate the feasibility of the modelling approach suggested in this paper.

    4.1.3. Scenario classification

    A combination of events, features, and processes causing diverse natural phenomena can be taken as a scenario. Sometimes an aggregation of events and the corresponding special situations can also form a scenario. We introduce here the concept of altered evolution scenarios (AESs) as an extreme and special type of scenario. For convenience, AESs are defined as unusual happenings that can cause significant alteration or momentary changes of the underground geochemical, hydrological, and/or mechanical properties that are directly linked with the system performance. Therefore, significant initiating events and concurrent responses resulting in adverse disturbance of the systems concerned can lead to the major AESs. In this paper, disturbance means an abnormal state that departs from the normal state. Scenarios leading to similar consequences can be grouped into the same category of AES.

    Referring to previous scenario studies [44,45], it may be sufficient to classify the scenarios into four major AES categories, as outlined in Table 1, for application to the waste disposal repository. In general, the AESs exhibit stochastic behaviour and may be time-dependent [46].

    (1) Geological AES. Earthquakes or tectonic displacements (i.e. faults) have been regarded as critical in the geological stochastic analysis. It is noted that large faults causing a direct release of radionuclides are expected to be improbable because of the plastic behaviour and self-sealing capabilities of clay. Nevertheless, faults of low or intermediate level may be responsible for local disturbances in the properties of the geological barrier, and so alter the normal evolution of the system. The primary hazard considered is the rupture of waste canisters induced by the faults. From observations of many historical surface fault ruptures, McGuire [47] indicates that the map-view width of the fault zone during an earthquake is often not restricted to a narrow zone along the primary fault, but may be up to kilometers wide. He also notes that changes in ground stress following an earthquake can induce pore pressure changes and groundwater flow, as well as water table changes. It is therefore assumed that the geological AES can strongly influence two variables of the system, i.e. the delay time for release and the dilution volume. A human intrusion scenario can also be included in this AES.

    (2) Climatological AES. The AES representing climatic variation can cause medium-level perturbations in the geosphere. The major consequences in this case will probably be a reduction of the infiltration rate or aquifer flow. It is well acknowledged that the infiltration rate is highly dependent upon the weather; spatial and temporal variations (soil erosion) in precipitation have the greatest effect on the infiltration rate. Another possible consequence is a variation of the aquifer flow, that is, a considerable increase or decrease in the aquifer characteristic, which ultimately changes the release quantity of radionuclides. Therefore, we need to determine how much the infiltration rate or the aquifer flow can change from its nominal value due to the AES.

    (3) Surface hydrological AES. Many phenomena, such as freshwater sediment transport and deposition, natural thermal effects, and site flooding, have the potential to change regional hydrological properties over time scales of hundreds of years. These phenomena are combined with the normal evolution pathway to create a lumped scenario, the surface hydrological AES. This AES may greatly influence groundwater recharge and discharge. Changes in the quantity of a water table and/or an aquifer can also result from disturbances of the local surface hydrological behaviour due to this AES. The adverse effects in this case are a considerable decrease in the dilution volume, causing an increase in the release quantity of radionuclides.

    (4) Ecological AES. Vegetative progression is expected to occur at a near-surface disposal site after a loss of waste control. This causes two effects: biotic intrusion increases considerably, even though its likelihood depends on root depths; and the water budget can change. Progression toward trees, for instance, may lead to larger canopy hold-up, which can alter the amount of water available for infiltration. Progression may tend to either increase or decrease the infiltration [48]. Extreme water consumption by flora can also reduce the width of a water table or an aquifer. The soil/liquid partition coefficients of radionuclides can be derived from soil-to-plant concentration ratios for specific plants and types of soil. Sheppard and Thibault [49] defined a partition coefficient correlation for leafy vegetables based on plant-to-soil concentration ratios. Consulting this correlation, we assume that the AES indirectly influences the plant-to-soil concentration ratios and, ultimately, the soil partition coefficients.

    4.1.4. Consideration of domain variables

    Nominal parameters of the domain variables are mainly adopted from available references [43,50], as given in Table 2. The values adopted for the parameters depend on the assigned subjective distributions. Fixed but unknown distributions represent the epistemic uncertainty about characteristics at a particular site unless site-specific information is available for the corresponding parameters. To develop a subjective distribution of the parameters in the model, the available reported data must be implicitly interpreted. Uncertainty in a variable is treated as aleatory when the distribution is determined from a random process.

    Table 1. Example compilation of events, features, and processes for the classification of AESs

    | Geological AES (AES-1) | Climatological AES (AES-2) | Surface hydrological AES (AES-3) | Ecological AES (AES-4) |
    |---|---|---|---|
    | Seismicity | Precipitation | River flow and lake level changes | Plant uptake/evolution |
    | Fault activation | Glaciation | Site flooding | Animal uptake/evolution |
    | Fracturing | Deep weathering | Recharge to groundwater | Uptake by deep rooting species |
    | Regional tectonics | Transient greenhouse-gas induced warming | Freshwater sediment transport and deposition | Microbial interactions |
    | (Human intrusion) | | River, stream, channel erosion (downcutting) | Effects of terrestrial ecological development on hydrology |
    | | | Natural thermal effects | |


    Furthermore, stochastic effects and data uncertainty can be considered together in a set of distributions. Taking this into consideration along with the epistemic uncertainty of the parameters, we can describe a domain variable with true but unknown distributions.

    All upper and lower bounds for the parameters of the domain variables given in Table 2 are assumed for this illustrative application, partly based on the available information addressed below. In particular, when handling domain variables with both epistemic and aleatory uncertainties, we assume that the variance of the epistemic distribution is close to the mode of a specific representative value. This is intended to address the question of how the parameters of the domain variables can be provided when a critical data set for them is lacking. By placing high confidence in the variance of the epistemic distribution in this way, we can focus on the problems to be solved.

    Delay time. In this paper, the time between initial burial and the onset of concrete structure failure is defined as the delay time ($T_{delay}$) for release. Waste form hold-up and container lifetime cause a natural delay and hence a reduction in the vault release terms. The delay time of the release varies depending on the adsorption of waste and concrete. It is noted that, while the probable failure mode of concrete may be gradual failure, there is some possibility of rapid failure. It is assumed that a geological AES can directly affect the integrity of the concrete structure. The minimum $T_{delay}$ is set at around 10 years. The maximum $T_{delay}$ is determined by the maximum time of the simulation.

    Retardation coefficient. The delay in release can be described by a variety of analytic models, for example leaching by dissolution or adsorption by buffer material. The delay mechanism and its relevant time scales are the most critical data. It is usual to assume that groundwater transport processes are dominated by phenomena such as advection, diffusion, and sorption. All these heterogeneous phenomena can be lumped together into a model for the retardation coefficient. The partition coefficients of the radionuclides are needed to calculate the retardation coefficient of the soil layer. As a domain variable, the soil partition coefficient (SPC) can influence mass transfer between soil and an aquifer, and consequently affects the radionuclide concentrations in drinking water consumed by human beings. A uniform distribution is assumed for this epistemic parameter uncertainty, with lower and upper bounds for each specific radionuclide.

    Infiltration rate. The source terms can be provided by estimating the flux of groundwater near the repository. Estimating the flux directly consists in finding the amount of water that infiltrates into the soil, redistributes, and enters the disposal unit as a function of time. It is noted that the infiltration rate (IR) is very site specific [51] and depends on a natural stochastic process. It is also well known that soil hydraulic conductivity is a critical factor in deriving the infiltration rate, and that the Darcy infiltration velocity can be limited by the specified hydraulic conductivity of the vault roof [43]. Beyeler et al. [50] use a specific model in which the infiltration rate is the product of the application rate (AR) and the fraction of the applied water that percolates deep beneath the root zone and becomes infiltration. They estimated a minimum irrigation water requirement using an empirical relationship, developed by the US Bureau of Reclamation, between soil permeability and the proportion of water. Based on this consideration, we expect that scenarios such as the climatological AES or the ecological AES may greatly affect the infiltration rate.

    Dilution volume. The contribution to the ingestion dose from the use of groundwater depends on many domain variables. One of the major domain variables, the dilution volume (DV), relates directly to the transport modelling of contaminants through an underground hydrological system. All the radionuclides released from the buried waste are assumed to be diluted in the volume of the aquifer, defined as the equivalent stream tube area times the transverse length of the aquifer to a well. If the time period is long enough to change the groundwater resources, it is accepted practice to consider the many natural stochastic processes that directly or indirectly affect the volume of the aquifer. Reference data from Beyeler et al. [50] are again adopted for choosing a nominal value in this paper.
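The epistemic distributions assigned to the domain variables can be sampled with a few lines of standard-library code. The sketch below uses the bounds and distribution types listed in Table 2 for Th-230 as read from the table; the dictionary keys, the helper names, and the random seed are hypothetical choices of this example.

```python
import math
import random


def loguniform(rng, low, high):
    """Draw a value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_epistemic_parameters(rng):
    """One draw of the Table 2 parameters (LU = loguniform, U = uniform, T = triangular)."""
    return {
        "t_delay_yr": loguniform(rng, 10.0, 1000.0),          # LU
        "spc_m3_per_kg": rng.uniform(3.0e3, 3.4e3),           # U
        "ir_mean_m_per_yr": rng.uniform(0.1, 0.3),            # U
        "ir_sd_m_per_yr": loguniform(rng, 1.0e-3, 2.4e-2),    # LU
        "dv_mean_m3": rng.triangular(6.0e4, 1.22e5, 9.1e4),   # T: (low, high, mode)
        "dv_sd_m3": rng.uniform(1.5e4, 2.5e4),                # U
    }


rng = random.Random(2005)  # arbitrary seed, for reproducibility only
samples = [sample_epistemic_parameters(rng) for _ in range(1000)]
mean_delay = sum(s["t_delay_yr"] for s in samples) / len(samples)
print(mean_delay)
```

Each draw fixes the parameters of the aleatory distributions (for example the mean and standard deviation of the infiltration rate); the form of those aleatory distributions is not assumed here.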

    4.2. Evaluations associated with information gained fromthe probabilistic inference of Bayesian network

    This section devotes our concerns to combine theprobabilistic inference of Bayesian network withthe estimation of parameters. Likelihood estimation of

    Table 2. Subjective probability distributions representing the state of knowledge about fixed but uncertain parameters, using default values for Th-230 in the model

    Fixed quantity (units)   Epistemic distribution   Min          Mode         Max
    T_delay (yr)             LU                       10           -            1000
    SPC (m^3/kg)             U                        3.0 × 10^3   -            3.4 × 10^3
    IR (mean) (m/yr)         U                        0.1          -            0.3
    IR (s.d.) (m/yr)         LU                       1 × 10^-3    -            2.4 × 10^-2
    DV (mean) (m^3)          T                        6.0 × 10^4   9.1 × 10^4   1.22 × 10^5
    DV (s.d.) (m^3)          U                        1.5 × 10^4   -            2.5 × 10^4

    T_delay, delay time; SPC, soil partition coefficient; IR, infiltration rate; DV, dilution volume; s.d., standard deviation; LU, loguniform; U, uniform; T, triangular.

    C.-J. Lee, K.J. Lee / Reliability Engineering and System Safety 91 (2006) 515–532


    the AESs is the starting point of the analysis. As a practical assumption, an AES occurs stochastically in space and in time, so that the underlying measure of interest (e.g. the number of occurrences of an AES in a specified time interval) is governed by a stochastic process. A frequency estimate of AES occurrence is therefore related to the realization of the aleatory characteristics of the parameters. The occurrences of AESs in non-overlapping intervals are assumed to be independent of each other. As summarized by Siu [52], the Poisson distribution is commonly used in parameter estimation problems to model processes in which events are generated continuously over time. From this perspective, the probability of observing N events in a time period T is given by

    $$P\{N \text{ events in time } T \mid \lambda(t)\} = \frac{\{\lambda(t)\,T\}^{N}}{N!}\, e^{-\lambda(t)\,T}, \qquad (11)$$

    where $\lambda(t)$ is the expected occurrence rate of an AES. Given the jth AES occurrence rate, $\lambda_j(t)$, we can estimate the probability that a first event occurs within a specified time interval $t$:

    $$p_{n \ge 1} = 1 - \exp\{-\lambda_j(t)\,t\}. \qquad (12)$$

    Thus, if the expected occurrence rate is provided, Eq. (12) readily yields a scenario probability. A best-estimate value may be assigned to the occurrence rate of each AES. However, the occurrence rate of an AES should be estimated with the support of an empirical database and/or expert judgment. Historical data collection and the development of relevant methodology are therefore necessary to establish a reliable database of stochastic information on AESs, which is beyond the scope of this study. Rather than relying on insufficient empirical data, the frequency of each AES is estimated from subjective engineering judgment using categorized selection criteria, as listed in Table 3. For each likelihood category of Table 3, we set an occurrence interval and a median occurrence frequency. As explained in the previous section, we chose the likelihood category for each AES by referring to the survey of relevant information, which resulted in category 3 (intermediate) for AES-1 and AES-3, and category 2 (likely) for AES-2 and AES-4.
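The scenario probabilities of Eqs. (11) and (12) can be evaluated directly from the median frequencies of Table 3. The sketch below is illustrative only: it assumes a constant occurrence rate over the period, and the 1000-year horizon and the function names are hypothetical choices, not values from the study.

```python
import math

# Median occurrence frequencies (per year) for likelihood categories 1-5,
# as listed in Table 3.
TABLE3_FREQ = {1: 2.0e-2, 2: 2.0e-3, 3: 2.0e-4, 4: 2.0e-5, 5: 2.0e-6}

# Likelihood category assigned to each AES in the text.
AES_CATEGORY = {"AES-1": 3, "AES-2": 2, "AES-3": 3, "AES-4": 2}


def poisson_pmf(n, rate, period):
    """Eq. (11): probability of exactly n events in `period` years."""
    mu = rate * period
    return mu ** n / math.factorial(n) * math.exp(-mu)


def prob_at_least_one(rate, period):
    """Eq. (12): probability that a first event occurs within `period` years."""
    return 1.0 - math.exp(-rate * period)


if __name__ == "__main__":
    horizon = 1000.0  # hypothetical assessment period in years
    for aes, category in AES_CATEGORY.items():
        p = prob_at_least_one(TABLE3_FREQ[category], horizon)
        print(f"{aes}: category {category}, P(occurrence within {horizon:.0f} yr) = {p:.3f}")
```

Note that Eq. (12) is simply one minus the N = 0 term of Eq. (11), which is a quick consistency check on any implementation.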

    As recognized by Frank [8], because many phenomena in the groundwater shape the environmental circumstances during the release of radioactive materials, consequences such as an exposure dose involve both epistemic and aleatory uncertainties. It has also been recognized that the subjective probability distributions representing parameter uncertainties may change over time owing to scenarios that cause perturbations [53]. In environmental transfer models, we expect strong dependencies between environmental scenarios and the characteristics of the domain variables of the geosphere and biosphere. We therefore assume that the subjective parameters of the domain variables, which are critical for estimating the end-point measure of ground release of radionuclides, can vary with the occurrence of various AESs.

    As explained earlier, it is reasonable to assume that the occurrence of AESs directly influences the estimation of the uncertain parameters of the domain variables. If so, the dependency between scenarios and variables can affect all the input types of a model. However, no precisely defined methodology exists for identifying adequate input types and standardizing the inputs. Very often, when the database is incomplete, the analyst resorts to subjective inputs from experts who draw on their experience or the published literature.

    The informal approach to uncertainty analysis varies one parameter, or one set of parameters, at a time and then observes the deviation from a base-case prediction [22]. This kind of approach was adopted in the study to obtain reasonable uncertainty estimates based on the dependency between scenarios and domain variables.

    To realize this, we adopted the so-called mixture prior concept, an approach that uses correction factors to represent the parameter uncertainty. By introducing the mixture prior, we can articulate the effects of uncertain parameters under the occurrence of AESs. A mixture prior involves the notion of contamination, which here means the generation of a contaminated prior. The contaminated prior represents the set of input parameters that reflect the dependency mentioned above. The mixture prior is thus a mixture of a normal prior (the term used in this paper for the ordinary uncertainty distribution) and the contaminated prior.

    Under those definitions, we first define an arbitrary prior set $\Gamma$ to elicit a linear estimator of an uncertain parameter $\theta$. The priors $\pi(\theta)$ close to a single normal prior $\pi_0(\theta)$ can then be expressed with a class of possible contaminations [54]:

    $$\Gamma = \{\pi : \pi(\theta) = (1 - \varphi)\,\pi_0(\theta) + \varphi\, q(\theta),\ q \in \Psi\}, \qquad (13)$$

    where $\Psi$ is a class of possible contaminations, and $\varphi$ is an adjustable correction factor with $0 \le \varphi < 1$ which reflects how close $\pi$ is to $\pi_0$. Next, if we assign to the prior class $q(\theta)$ a maximum likelihood estimator (MLE) reflecting a possible contamination, then Eq. (13) becomes

    Table 3. Categorization of AES likelihood in the frequency estimation

    Degree of likelihood    Estimated occurrence interval    Occurrence frequency (median)
    (1) Highly likely       0–10^2 yr post-closure           2.0 × 10^-2 /yr
    (2) Likely              10^2–10^3 yr post-closure        2.0 × 10^-3 /yr
    (3) Intermediate        10^3–10^4 yr post-closure        2.0 × 10^-4 /yr
    (4) Unlikely            10^4–10^5 yr post-closure        2.0 × 10^-5 /yr
    (5) Highly unlikely     10^5–10^6 yr post-closure        2.0 × 10^-6 /yr


    a form similar to a convex combination of the MLE of the contaminated prior and the best estimate of the normal prior [55]. Furthermore, the correction factor $\varphi$ gives a quantitative measure of the possibility of the MLE of the contaminated prior.

    As modelled in Section 2, the information gained from the probabilistic inference of the Bayesian network, i.e. the probabilities of the query random variables obtained from evidential reasoning, can be used to predict the correction factor $\varphi$ approximately. In practice, however, predicting the MLE of the contaminated prior requires engineering and scientific knowledge specific to the problem, as well as expertise in probabilistic modelling.
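The mixture form of Eq. (13) can be sampled by choosing the contaminated prior with probability equal to the correction factor and the normal prior otherwise. The sketch below is a hedged illustration for the delay time: the normal prior is the loguniform range of Table 2, while the contaminated prior (upper bound cut to 100 yr) and the value 0.4 for the correction factor are hypothetical assumptions chosen only to show the mechanics.

```python
import math
import random


def loguniform(rng, low, high):
    """Draw from a loguniform distribution on [low, high]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


def sample_mixture_prior(rng, phi, normal_prior, contaminated_prior):
    """Draw from (1 - phi) * pi_0 + phi * q, the mixture form of Eq. (13)."""
    if not 0.0 <= phi < 1.0:
        raise ValueError("phi must satisfy 0 <= phi < 1")
    if rng.random() < phi:
        return contaminated_prior(rng)
    return normal_prior(rng)


def delay_normal(rng):
    """Normal prior for the delay time: loguniform on [10, 1000] yr (Table 2)."""
    return loguniform(rng, 10.0, 1000.0)


def delay_contaminated(rng):
    """Hypothetical contaminated prior: upper bound reduced to 100 yr."""
    return loguniform(rng, 10.0, 100.0)


if __name__ == "__main__":
    rng = random.Random(1)
    phi = 0.4  # hypothetical; e.g. a conditional probability from the network
    draws = [sample_mixture_prior(rng, phi, delay_normal, delay_contaminated)
             for _ in range(10000)]
    short = sum(d <= 100.0 for d in draws) / len(draws)
    print(f"fraction of delay times at or below 100 yr: {short:.3f}")
```

Under these assumptions roughly 70% of the draws should fall at or below 100 yr (0.6 × 0.5 from the normal prior plus 0.4 from the contaminated prior), compared with 50% without contamination.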

    4.3. Results of example application and discussion

    The previous section indicates that we can connect the results of probabilistic inference from the Bayesian network with the consequence evaluation model. In practice, we had to provide supporting information to obtain quantitative expectations about the relationship between domain variables and AESs. As a preliminary step in estimating the risk from future evolution, this supporting information is used directly as input to the Bayesian inference program.

    Preparing specific information on the CPTs used in the inference program is not an easy task. Nonetheless, it can be done on the basis of current knowledge of the relationship between domain variables and scenarios, which is usually gathered from relevant experts. A parallel case illustrates this solution. In a traditional level 2 PRA for nuclear power plants, particularly in determining containment progression event branches, it is unavoidable to assign a single representative estimate of the qualitative probability based on the user's degree of belief about the corresponding severe accident phenomena [56]. Even though the PRA considers interval probabilities covering the uncertainty in the expression of branch possibility, only nominal point estimates based primarily on engineering judgment are used in the quantification. This is similar to the approach adopted for obtaining the CPTs in this paper.

    As a general assumption, each individual AES must last long enough to cause significant changes in the characteristics of the domain variables. Multiple dependencies may arise from simultaneous AESs. However, since treating them requires complex graphical representations in the Bayesian network, they are not considered here, in order to simplify the problem.

    4.3.1. Conditional probabilities

    We primarily considered the dependencies between four domain variables and four AESs, using the information explained in Section 4.1. With this in mind, the conditional probabilities representing these relationships were determined using single representative estimates. As noted, we had already developed an approximate inference program with a bounded-variance likelihood weighting algorithm to calculate the conditional probabilities of the query nodes of the Bayesian network. Before using the program in practice, we applied it to a popular network structure with four simple nodes, the so-called WetGrass problem [37], to verify and demonstrate the inference algorithm. We found very fast convergence during the verification process, where convergence was checked with the following stopping rule provided by the developers of the algorithm [38,57]:

    $$S_T \ge \frac{4\lambda(1 + \varepsilon)}{\varepsilon^{2}} \ln\frac{2}{\delta}, \qquad (14)$$

    where $T$ is the number of samples; $\varepsilon$ is the relative error; $\delta$ is the failure probability; and $\lambda$ is a defined constant. We obtained values for the query nodes that were close to the exact probabilities, within an error of ±0.005, with $\varepsilon = 0.1$ and $\delta = 0.05$.
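The verification step can be illustrated with a plain likelihood weighting sampler on the four-node WetGrass network, stopped by the threshold of Eq. (14). This is a simplified sketch, not the authors' bounded-variance program: the CPT values are the commonly used textbook ones (the paper does not list its own), the constant is assumed to be e − 2, and S_T is interpreted here as the running sum of the evidence weights.

```python
import math
import random

# Textbook WetGrass CPTs (assumed; not listed in the paper).
P_CLOUDY = 0.5
P_SPRINKLER = {True: 0.1, False: 0.5}              # P(S = T | C)
P_RAIN = {True: 0.8, False: 0.2}                   # P(R = T | C)
P_WET = {(True, True): 0.99, (True, False): 0.9,
         (False, True): 0.9, (False, False): 0.0}  # P(W = T | S, R)


def stopping_threshold(eps, delta, lam=math.e - 2.0):
    """Right-hand side of Eq. (14); lam = e - 2 is an assumed constant."""
    return 4.0 * lam * (1.0 + eps) / eps ** 2 * math.log(2.0 / delta)


def rain_given_wet(rng, eps=0.1, delta=0.05):
    """Estimate P(R = T | W = T) by likelihood weighting.

    Non-evidence nodes are sampled from their CPTs in topological order and
    each sample is weighted by the likelihood of the evidence W = T.
    Sampling stops once the summed weights reach the threshold of Eq. (14).
    """
    threshold = stopping_threshold(eps, delta)
    weight_sum = 0.0
    weighted_rain = 0.0
    n_samples = 0
    while weight_sum < threshold:
        cloudy = rng.random() < P_CLOUDY
        sprinkler = rng.random() < P_SPRINKLER[cloudy]
        rain = rng.random() < P_RAIN[cloudy]
        weight = P_WET[(sprinkler, rain)]  # evidence likelihood, in [0, 1]
        weight_sum += weight
        if rain:
            weighted_rain += weight
        n_samples += 1
    return weighted_rain / weight_sum, n_samples


if __name__ == "__main__":
    estimate, n_used = rain_given_wet(random.Random(0))
    print(f"P(Rain | WetGrass) ~ {estimate:.3f} after {n_used} samples")
```

For these CPTs the exact posterior is 0.4581/0.6471 ≈ 0.708, so the estimate can be compared against it in the same way the text compares the program output with exact probabilities.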

    Next, we applied the program to actual simulations with the given network structure. In preparing the simulation input for the CPTs, we defined a dependency matrix on an ad hoc basis, in which the dependency relationships were assigned one of four grades: high, medium, low, and zero. All the random variables were simulated concurrently for each AES. Fig. 4 shows two run histories from the simulation of the AESs, illustrating the convergence of the Monte Carlo sampling iterations. The simulation readily reached a stable state after the initial transients. In most cases, the simulation progressed quickly and ended within 30,000 iterations. Accepting this approximate probabilistic inference, we can therefore provide the solutions for the query random variables for each AES, as shown in Fig. 5.

    Fig. 5 provides some notable insights from the simulation results into the relationship between the random variables of the Bayesian network and the various AESs. First, the random variable corresponding to a disturbance of T_delay appears to have a strong causal link with the occurrence of AES-1 (geological AES); similarly, IR is strongly linked with AES-2 (climatological AES), and DV with AES-3 (surface hydrological AES). Second, although we assigned a strong dependency of some random variables on AES-4 (ecological AES), as would be usual for the SPC variable, the results hardly show any significant relative importance of the random variables for AES-4. Last, compared with the others, the random variable for SPC shows relatively small conditional probabilities; it seems that the low confidence in the degree of belief about the disturbance of this variable is reflected in its likelihood for all the AESs.


    4.3.2. Prediction examples of the end points measures

    By constructing a mixture prior directly from the information in the form of Eq. (13), we can take a first step towards estimating the end-point measures of the environmental transfer model. Since the contaminated prior is applied in the evaluation that follows the occurrence of AESs, the effects of the AESs must be reflected in the parameters of the random variables. These parameters were prepared by approximating values relative to the maximum or minimum bound of the normal parameters, depending on their probability distributions. For instance, the upper bound of the delay time would be greatly reduced under AES-1, provided the dependency between geological phenomena and degradation of the engineered barrier of the repository is considered. On the other hand, the upper bound of the location parameter of a domain variable may be close to the best estimate of the normal prior, depending on the specific situation. Similar considerations, with variable-specific treatment, were applied to the other domain variables.

    Historically, complementary cumulative distribution functions (CCDFs) have been a favoured form for portraying the uncertainty in risk results. A CCDF is an easily plotted risk curve that expresses the probability of exceeding a given value of the end-point measure. In this study, the CCDF construction process, based on a Monte Carlo sampling technique, was adopted from Ref. [25].
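A CCDF of this kind can be built empirically from Monte Carlo output. The following minimal sketch shows only the generic construction (the fraction of samples exceeding a value); it is not the specific procedure of Ref. [25], and the function name `empirical_ccdf` is a hypothetical choice.

```python
import bisect


def empirical_ccdf(samples):
    """Return a function x -> fraction of samples strictly greater than x."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("at least one sample is required")

    def ccdf(x):
        # bisect_right counts the samples <= x; the remainder exceed x.
        return (n - bisect.bisect_right(ordered, x)) / n

    return ccdf


if __name__ == "__main__":
    # Hypothetical end-point values from a handful of Monte Carlo trials.
    ccdf = empirical_ccdf([1.0, 2.0, 3.0, 4.0])
    for x in (0.0, 2.0, 2.5, 4.0):
        print(f"P(X > {x}) = {ccdf(x):.2f}")
```

Repeating the construction for each epistemic sample gives a family of CCDFs, from which mean and fractile curves can be taken pointwise.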

    Before running the simulations, we must define the end-point measure of this paper as the screening factor (SF) suggested by the NCRP [42] for representing the consequence (e.g. exposure dose). The screening factor has the dimension of committed effective dose (Sv) per unit concentration of radioactivity (Bq) in the medium, and thus has the unit Sv/Bq as in the NCRP model. Only Th-230 was selected for the illustrative evaluation. We typically ran 10,000 simulations for each combination of AESs. The results for the normal (i.e. non-AES) case and the abnormal (i.e. AES occurrence) cases were then compared for illustration. Although the AES occurrence frequencies in Table 3 were reasonably assumed, it is necessary to confirm that the adopted values do not cause problems in predicting the end-point measures. We therefore performed a sensitivity study to survey the effects of frequency variation, which can also be used to check the importance of the AES frequency. Fig. 6 indicates that varying the occurrence frequency of AES-2 has no strong influence on results such as the mean values of the end-point measure. In other words, even when an extremely frequent occurrence of AES-2 is assumed, the deviation from the non-AES case is only about 29%. To gain confidence in this insight, however, it may be necessary to determine the most limiting case in which the scenario frequency influences the uncertainty propagation.

    Various CCDF fractiles (i.e. fractions of percentile) for the exceedance probabilities of the end-point measure are shown in Fig. 7, which gives good insight into the uncertainty propagation for the non-AES case. It is noted that,

    Fig. 4. Convergence of the Bayesian inference program for the cases of AES-1 and AES-4.

    Fig. 5. Comparison of the conditional probabilities of the random variables estimated by the Bayesian inference program, classified by AES. (Note: variable 1 = time delay, variable 2 = SPC, variable 3 = IR, variable 4 = DV.)


    in terms of estimating an end-point measure, there is no big difference between the uncertainty output (e.g. the mean of the mean CCDF) and the value calculated deterministically by the NCRP. For a subjective confidence interval, however, Fig. 7 reveals that a wide band is quite possible. For instance, the end-point measure differs by a factor of about 2.5 between the 95% and 5% fractiles (i.e. the 90% confidence interval) at a credible exceedance probability limit of 0.05.

    Since we are interested in ranking the importance of the AESs for the end-point measure, a focused analysis of each AES was conducted. The mean CCDF curves of the AESs resulting from the uncertainty propagation are depicted in Fig. 8. The results were obtained with the nominal frequencies given in Table 3. Compared with the non-AES case, the relative importance of the AESs can be readily identified at any subjective confidence level. In particular, the occurrence of AES-2 has a strong impact on the end-point measure, because a high dependency between AES-2 and domain variables such as IR and DV was already anticipated, as shown in Fig. 5. On the other hand, the curves for AES-1 and AES-3 are almost the same; this is partly because the conditional probabilities for AES-1 and AES-3 did not differ much except for the delay time variable, as shown in Fig. 5, and partly because we assigned the same likelihood category to both AES-1 and AES-3, as explained previously.

    To investigate the importance of the delay time parameters, we performed another sensitivity study. We defined two cases with fixed inputs for the delay time, one of 10 years and the other of 300 years, ran separate simulations, and compared them with a standard case using the distribution given previously. In these cases, random occurrence of AESs was assumed, with the other domain variables unchanged. Fig. 9 shows a striking profile for the case of Cs-137.

    Fig. 6. Dose in units of screening factor and occurrence rate of AES as a function of the AES-2 frequency. Arithmetic means of the 5%, mean, and 95% fractiles are shown. (The occurrence rate of AES is the number of AES occurrences per total number of simulations. SF denotes the screening factor.)

    Fig. 7. Comparison of uncertainty propagation results for the case of no AES occurrence.

    Fig. 8. Mean values according to the AESs, including non-occurrence of AES.

    Fig. 9. Mean values according to random AES occurrences, showing sensitivity results for the delay time (Radionuclide: Cs-137).


    A wide band of the end-point measure results from the variation of the delay time input. From this result, we conclude that, for the uncertainty analysis of relatively short-lived radionuclides, it is more desirable to prepare a best-estimate value of the dominant domain variable (the delay time in this case).

    5. Summary and conclusions

    Since Bayesian networks can help identify the causality of an uncertain system's behaviour, we introduced their inference process to estimate quantitatively the dependency between stochastic scenarios and the affected domain variables of the system. A general approach integrating the Bayesian network concept into nuclear risk assessment was proposed and demonstrated by example. After simplifying the network to a problem-specific structure, we developed and verified an approximate probabilistic inference program using a bounded-variance likelihood weighting algorithm, which the verification test showed to be adequate for clarifying the dependency relationships under different problem queries. In addition, for estimating the consequence of the example pathway problem, i.e. the environmental transfer of radionuclides from radioactive waste disposal, a Monte Carlo simulation process accounting for epistemic and aleatory uncertainties was provided. Finally, specific models, including a model for propagating the uncertainty of relevant parameters, were developed, with a comparison of the variable-specific effects of diverse altered evolution scenarios (AESs). After defining and identifying the terms for specific AESs, we classified them into four major categories.

    In analysing the outcomes of the illustrative evaluation of the environmental transfer pathway, we identified exceedance probabilities of the end-point measure and an importance ranking of the AESs. The domain variables that are important with respect to causality were found, and the dominant contributors to the uncertainty propagation could also be identified.

    Several notable features emerged in applying the proposed approach: the knowledge base for establishing dependency relationships was prepared and enhanced during the simulation, and it proved worthwhile to establish a definite rationale for the dependency information in order to gain general insights through probabilistic inference modelling when predicting the uncertain future state of a system. Further research is therefore needed to enhance the knowledge base in various modelling areas, especially in predicting the occurrence frequency of AESs and in assessing the causality between domain variables and stochastic scenarios. We also recognize that more appropriate data for the domain variables should be provided to refine the results. A systematic technique for surveying and using expert knowledge, as well as the use of experimental results, can help to enhance predictability. However, if the epistemic uncertainty in the distribution of event occurrence frequencies is to be addressed, a more complicated simulation process may be required.

    With the modelling approach proposed in this paper, we constructed a causal inference process that includes uncertainty estimates, and obtained a number of results that improve the current knowledge base for prioritizing future risk-significant variables at an actual site. The specific results presented will probably be useful in predicting future system states. Furthermore, the proposed approach will probably provide a general and flexible structure for further extension studies and applications. For instance, this study can be applied more widely to the evaluation of all exposure pathway models, provided the relationships between AESs and domain variables are investigated in detail. The evaluation can also be carried out in greater depth for any specific scenario, such as a specific phenomenon categorized in this paper.

    References

    [1] Kaplan S, Garrick BJ. On the quantitative definition of risk. Risk Anal 1981;1(1).

    [2] Helton JC. Risk, uncertainty in risk, and the EPA release limits for radioactive waste disposal. Nucl Technol 1993;101.

    [3] Slaper H, Blaauboer R. A probabilistic risk assessment for accidental releases from nuclear power plants in Europe. J Hazard Mater 1998;61:209–15.

    [4] Protection from Potential Exposure: A Conceptual Framework. ICRP 64; 1993.

    [5] Modarres M. What every engineer should know about reliability and risk analysis. New York: Marcel Dekker Inc; 1993.

    [6] Garrick BJ, Chistie RF. Probabilistic risk assessment practices in the USA for nuclear power plants. Safety Sci 2002;40:177–201.

    [7] Solomon K, Giesy J, Jones P. Probabilistic risk assessment of agrochemicals in the environment. Crop Prot 2000;19:649–55.

    [8] Frank MV. Treatment of uncertainty in space nuclear risk assessment with examples from Cassini mission applications. Reliab Eng Syst Saf 2000;66:203–21.

    [9] Wilmot RD. The treatment of climate-driven environmental change and associated uncertainty in post-closure assessments. Reliab Eng Syst Saf 1993;42:181–200.

    [10] Rowe WD. Understanding uncertainty. Risk Anal 1994;14:743–50.

    [11] Pate-Cornell ME. Uncertainties in risk analysis: six levels of treatment. Reliab Eng Syst Saf 1996;54:95–111.

    [12] Caruso MA, Cheok MC, Cunningham MA, Holahan GM, King TL, Parry GW, Ramey-Smith AM, et al. An approach for using risk assessment in risk-informed decisions on plant-specific changes to the licensing basis. Reliab Eng Syst Saf 1999;63:231–42.

    [13] Oberkampf WL, DeLand SM, Rutherford BM, Diegert KV, Alvin KF, et al. Error and uncertainty in modeling and simulation. Reliab Eng Syst Saf 2002;75:333–57.

    [14] Hoffman FO, Hammonds JS. Propagation of uncertainty in risk assessments: the need to distinguish between uncertainty due to lack of knowledge and uncertainty due to variability. Risk Anal 1994;14:707–12.

    [15] Hofer E. When to separate uncertainties and when not to separate. Reliab Eng Syst Saf 1996;54:113–8.


    [16] Siu N, Malik S, Bessette D, Woods H. Treating aleatory and epistemic uncertainties in analyses of pressurized thermal shock. PSAM 5 2000.

    [17] Helton JC, Martell M, Tierney MS. Characterization of subjective uncertainty in the 1996 performance assessment for the wast