a study of estimation methods for defect estimation by syed waseem haider and dr. joão w. cangussu...

A Study of Estimation Methods for Defect Estimation

by Syed Waseem Haider and Dr. João W. Cangussu

The University of Texas at Dallas

Outline

Introduction

Classical Approach

Defect Estimators

Conclusion & Future Work

Introduction

The major goal of software testing is to find and fix as many defects as possible under given constraints, so as to release a product with reasonable reliability.

The time needed to achieve the established goal and the percentage of the goal achieved so far are important factors in determining the status of the testing process.

A clear view of the status of the testing process is crucial for finding a trade-off between releasing a product earlier and investing more time in testing.

Many defect prediction techniques have addressed this problem by estimating the total number of defects.

An accurate estimate of the number of defects at early stages of the testing process allows for proper planning of resource usage, estimation of completion time, and assessment of the current status of the process.

An estimate of the number of defects in the product at the time of release also allows the required customer support to be inferred.

Introduction (contd.)

We will discuss various estimation methods which are used to develop defect estimation techniques.

We will discuss the assumptions that each method makes about the data model, and about the probability distribution, mean and variance of the data and the estimator.

We will also discuss the statistical efficiency of the estimators developed from these estimation methods.

Estimation Approaches

Some of the Bayesian defect estimators in the field are:

Bayesian Estimation of Defects based on Defect Decay Model (BayesED3M)

A Bayesian Reliability Growth Model for Computer Software, by B. Littlewood and J. Verrall

Bayesian Extensions to Jelinski-Moranda Model, by W. S. Jewell

Classical Approach

For each method we will discuss:

Requirements

How to develop the estimator

Statistical performance

Merits and demerits

Basic ingredients:

Collection of data

Data modeling

Likelihood function

Collection of data samples

Any estimation technique needs samples or data from the ongoing system testing process.

Samples can be the number of defects found each day, each week, or in any other time unit.

Samples can also be the total number of defects found up to any instant of time.

Examples of sampling data:

In reliability models, the number of defects discovered per unit of execution time.

Calendar time versions of reliability models also exist.

Estimation of Defects based on Defect Decay Model (ED3M) works for both calendar time and execution time.

Data modeling

Let $\theta$ be the parameter to be estimated. It is the total number of defects.

A data model is used to relate $\theta$ to the data samples drawn from system testing.

The data model must also account for random behavior caused by work force relocation, noise in the testing process, and testing of products of varying complexity, among others.

Let us assume that the nth sample $x[n]$ contains $\theta$ corrupted by random noise $w[n]$, as given by

$$x[n] = \theta + w[n] \tag{1}$$

Observations of $\theta$ made in N intervals are given by

$$\mathbf{x} = \mathbf{h}\theta + \mathbf{w} \tag{2}$$

Note that in Eqs.1 and 2 $\theta$ is linearly related to the data. In Eq.2 $\mathbf{h}$ is the observation vector. It can contain information such as the number of testers, the failure intensity rate, the number of rediscovered faults for each sample, etc.
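The linear data model can be sketched in a few lines. This is only an illustration: the Gaussian noise, the function name `simulate_linear_model`, and the values of `theta`, `h` and `sigma` are assumptions made for the example, not values from the slides.

```python
import random

def simulate_linear_model(theta, h, sigma, seed=0):
    """Draw x[n] = h[n] * theta + w[n], with w[n] zero-mean Gaussian noise.

    Gaussian noise is an assumption of this sketch; the slides only say
    that the data is corrupted by random noise.
    """
    rng = random.Random(seed)
    return [h_n * theta + rng.gauss(0.0, sigma) for h_n in h]

# Hypothetical values: 100 total defects, observation vector rising toward 1.
theta = 100.0
h = [0.2, 0.4, 0.6, 0.8, 1.0]
x = simulate_linear_model(theta, h, sigma=2.0)
print(x)
```

Setting every entry of `h` to 1 gives the simpler model of Eq.1.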

Likelihood function

The joint probability distribution of the data is given by $p(x[0], x[1], \ldots, x[N-1]; \theta)$ or, in vector form, $p(\mathbf{x};\theta)$ (the PDF of the data).

$p(\mathbf{x};\theta)$ is a function of both the data $\mathbf{x}$ and the unknown parameter $\theta$. For example, if for a given data set $\theta$ is changed to $\theta_1$, the value of $p(\mathbf{x};\theta)$ will change.

When $p(\mathbf{x};\theta)$ is seen as a function of $\theta$ it is called the likelihood function.

Intuitively, $p(\mathbf{x};\theta)$ indicates how accurately we can estimate $\theta$.
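To make the idea of a likelihood function concrete, the sketch below evaluates ln p(x; θ) for the model of Eq.1 at several candidate values of θ while the data stays fixed. It assumes independent Gaussian noise with known standard deviation `sigma`, which the slides do not state; the sample values are hypothetical.

```python
import math

def log_likelihood(x, theta, sigma):
    """ln p(x; theta) for x[n] = theta + w[n], w[n] ~ N(0, sigma^2), independent."""
    n = len(x)
    squared_error = sum((x_n - theta) ** 2 for x_n in x)
    return (-0.5 * n * math.log(2.0 * math.pi * sigma ** 2)
            - squared_error / (2.0 * sigma ** 2))

x = [98.0, 103.0, 101.0, 97.0, 101.0]   # hypothetical samples, mean 100
for theta in (90.0, 100.0, 110.0):
    # Same data, different theta: the value of the likelihood changes.
    print(theta, log_likelihood(x, theta, sigma=2.0))
```

Under these assumptions the log-likelihood is largest at the sample mean and falls off on both sides; the sharper the fall-off, the more accurately θ can be estimated.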

Minimum Variance Unbiased (MVU) estimator

Minimum Variance Unbiased (MVU) estimator (contd.)

The probability distribution of the data must be known.

The problem of finding the estimator $\hat{\theta}$ is simply to find a function of the data.

$\hat{\theta}$ must be unbiased: $E[\hat{\theta}] = \theta$.

The variability of the estimates determines the efficiency of the estimator.

Among several unbiased estimators, the one with the lowest variance is the most efficient.

Among the various methods for determining a lower bound on the variance, the Cramer-Rao Lower Bound (CRLB) is easier to determine.

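Minimum Variance Unbiased (MVU) estimator (contd.)

The CRLB states: it is assumed that the PDF $p(\mathbf{x};\theta)$ satisfies the regularity condition

$$E\left[\frac{\partial \ln p(\mathbf{x};\theta)}{\partial \theta}\right] = 0 \quad \text{for all } \theta \tag{3}$$

where the expectation is taken with respect to $p(\mathbf{x};\theta)$. Then the variance of any unbiased estimator $\hat{\theta}$ must satisfy

$$\mathrm{VAR}(\hat{\theta}) \ge \frac{1}{-E\left[\dfrac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial \theta^2}\right]} \tag{4}$$

An estimator which is unbiased, attains the CRLB, and is based on a linear data model is called an efficient MVU estimator. It is found using Eq.5

$$\frac{\partial \ln p(\mathbf{x};\theta)}{\partial \theta} = I(\theta)\,\bigl(g(\mathbf{x}) - \theta\bigr) \tag{5}$$

The efficient MVU estimator and its variance are given by Eqs. 6 and 7 respectively

$$\hat{\theta} = g(\mathbf{x}) \tag{6}$$

$$\mathrm{VAR}(\hat{\theta}) = \frac{1}{I(\theta)} \tag{7}$$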
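As a worked illustration of Eqs. 3 to 7, consider the model of Eq.1 under the additional assumption (not made on the slides) that the noise $w[n]$ is independent, zero-mean Gaussian with known variance $\sigma^2$. The derivation below is a sketch under that assumption.

```latex
\begin{align*}
\ln p(\mathbf{x};\theta)
  &= -\frac{N}{2}\ln(2\pi\sigma^2)
     - \frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\bigl(x[n]-\theta\bigr)^2 \\
\frac{\partial \ln p(\mathbf{x};\theta)}{\partial\theta}
  &= \frac{1}{\sigma^2}\sum_{n=0}^{N-1}\bigl(x[n]-\theta\bigr)
   = \frac{N}{\sigma^2}\Bigl(\bar{x}-\theta\Bigr),
   \qquad \bar{x} = \frac{1}{N}\sum_{n=0}^{N-1}x[n] \\
\frac{\partial^2 \ln p(\mathbf{x};\theta)}{\partial\theta^2}
  &= -\frac{N}{\sigma^2}
\end{align*}
```

Comparing the second line with Eq.5 gives $I(\theta) = N/\sigma^2$ and $g(\mathbf{x}) = \bar{x}$, so the efficient MVU estimator is the sample mean, $\hat{\theta} = \bar{x}$, with $\mathrm{VAR}(\hat{\theta}) = \sigma^2/N$. The third line shows that this variance equals the bound in Eq.4.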

Minimum Variance Unbiased (MVU) estimator (contd.)

It may happen that we are able to find an estimator whose variance is lower than that of other estimators but which does not attain the CRLB. We will simply call such an estimator an MVU estimator.

In other fields such as signal processing and communication systems, where the system or model under investigation is well defined in terms of physical constraints, it is possible to find an efficient MVU estimator.

In software engineering no model completely captures all the aspects of a software testing process. Different models are based on different assumptions, and this lack of consistency hints at the absence of a mature testing model. Therefore it is unlikely that an efficient MVU estimator can be found.

MVU estimator based on Sufficient Statistic

MVU estimator based on Sufficient Statistic (contd.)

The minimal data required to make the PDF of the data $p(\mathbf{x};\theta)$ independent of the unknown parameter $\theta$ is called a sufficient statistic.

For example, a simple estimator $\hat{\theta} = x[n]$ will have high variance in estimating $\theta$.

But if sufficient data $x[0], x[1], \ldots, x[N-1]$ is available, then a new sample $x[N]$ will not provide additional information about $\theta$.

$$p\bigl(x[n] \mid x[0], \ldots, x[N-1]; \theta\bigr) = p\bigl(x[n] \mid x[0], \ldots, x[N-1]\bigr) \tag{8}$$

If a sufficient statistic exists then $p(\mathbf{x};\theta)$ can be factorized as given by Eq.9, according to the Neyman-Fisher Factorization theorem.

$$p(\mathbf{x};\theta) = g\bigl(T(\mathbf{x}), \theta\bigr)\, h(\mathbf{x}) \tag{9}$$

In Eq. 9 $T(\mathbf{x})$ is the sufficient statistic. A function of $T(\mathbf{x})$ is an MVU estimator only if it is unbiased, $E[\hat{\theta}] = \theta$.

$$\hat{\theta} = f\bigl(T(\mathbf{x})\bigr) \tag{10}$$
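A short worked example of the factorization in Eq.9, again for the model of Eq.1 and under the assumption (not made on the slides) of independent, zero-mean Gaussian noise with known variance $\sigma^2$:

```latex
\begin{align*}
p(\mathbf{x};\theta)
  &= \frac{1}{(2\pi\sigma^2)^{N/2}}
     \exp\!\Bigl[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}\bigl(x[n]-\theta\bigr)^2\Bigr] \\
  &= \underbrace{\exp\!\Bigl[-\frac{1}{2\sigma^2}
       \bigl(N\theta^2 - 2\theta\,T(\mathbf{x})\bigr)\Bigr]}_{g(T(\mathbf{x}),\,\theta)}
     \;\underbrace{\frac{1}{(2\pi\sigma^2)^{N/2}}
       \exp\!\Bigl[-\frac{1}{2\sigma^2}\sum_{n=0}^{N-1}x^2[n]\Bigr]}_{h(\mathbf{x})},
  \qquad T(\mathbf{x}) = \sum_{n=0}^{N-1}x[n]
\end{align*}
```

Here the data enters the $\theta$-dependent factor only through the sum $T(\mathbf{x})$, so $T(\mathbf{x})$ is a sufficient statistic. The function $f(T(\mathbf{x})) = T(\mathbf{x})/N$ is unbiased, since $E[T(\mathbf{x})] = N\theta$, and is therefore a candidate for Eq.10.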

MVU estimator based on Sufficient Statistic (contd.)

The PDF of ED3M, the technique developed by the authors, can be factorized as given by Eq.9. The estimator of ED3M is an unbiased function of $T(\mathbf{x})$, but we do not claim that the estimator of ED3M is based on a sufficient statistic, for the following reason.

As an example of a sufficient statistic, suppose we want to estimate the accuracy of a surgical precision laser.

We take sufficient samples to estimate the average precision achieved, as shown in the figure.

In software testing, as Dijkstra noted, testing shows the presence of defects but not their absence.

Even though the rate of finding new defects subsides significantly as time elapses, there will still be new defects now and then.

The overall testing process can be considered an increasing function of defects (in one direction, not ‘around’). We can only forecast saturation in finding the defects. A change in strategy can result in a sudden burst of more defects.

Because of this behavior of the testing process, the notion of a sufficient statistic in software testing is arguable.

Therefore, even though ED3M fulfills the mathematical requirements of a sufficient statistic estimator, we do not claim that it is based on this method.

Maximum Likelihood Estimator (MLE)

Maximum Likelihood Estimator (MLE) (contd.)

Many practical estimators are based on MLE.

Important properties of MLE:

MLE is asymptotically (as $N \to \infty$) an efficient estimator.

For the linear data model given by Eqs.1 and 2, MLE achieves the CRLB for a finite data set.

If an efficient estimator exists, MLE will produce it.

The basic idea is to find the value of $\theta$ that maximizes $\ln p(\mathbf{x};\theta)$, the log-likelihood function, for a given $\mathbf{x}$.

If a closed form solution does not exist, a numerical method such as Newton-Raphson can be used to approximate the solution.

A numerical approximation is not guaranteed to converge to the maximum of $\ln p(\mathbf{x};\theta)$ and so may fail to produce the MLE.

An example of a numerical approximation of MLE is the Musa-Okumoto model.

The authors were able to find a closed form solution of MLE for ED3M.
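A minimal sketch of Newton-Raphson applied to a log-likelihood. The exponential PDF used here is chosen only because its MLE has a known closed form (N divided by the sum of the samples) against which the iteration can be checked; it is not one of the models discussed in the slides, and the function name and data are hypothetical.

```python
def newton_mle_exponential(x, theta0, iterations=50, tol=1e-12):
    """Newton-Raphson on ln p(x; theta) for p(x[n]; theta) = theta * exp(-theta * x[n])."""
    n, s = len(x), sum(x)
    theta = theta0
    for _ in range(iterations):
        score = n / theta - s          # first derivative of the log-likelihood
        curvature = -n / theta ** 2    # second derivative of the log-likelihood
        step = score / curvature
        theta -= step
        if theta <= 0.0:
            # The iteration has left the region where the PDF is defined.
            raise ValueError("iteration left the valid range theta > 0")
        if abs(step) < tol:
            break
    return theta

x = [0.5, 1.5, 1.0, 2.0, 1.0]                   # hypothetical samples
print(newton_mle_exponential(x, theta0=0.5))    # compare with len(x) / sum(x)
```

With the same data, a starting point of `theta0=2.0` makes the first update negative and the function raises `ValueError`, which illustrates that convergence depends on the initial guess.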

Method of Moments

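Method of Moments (contd.)

The method of moments estimator is generally consistent.

Given $p(\mathbf{x};\theta)$, suppose we know that the kth moment of $x[n]$ is a function of $\theta$, as given by Eq.11.

$$\mu_k = E\bigl[x^k[n]\bigr] = f(\theta) \tag{11}$$

$$\theta = f^{-1}(\mu_k) \tag{12}$$

We approximate the kth moment of the data $\mathbf{x}$ by taking the average of $x^k[n]$, as given by Eq. 13.

$$\hat{\mu}_k = \frac{1}{N}\sum_{n=0}^{N-1} x^k[n] \tag{13}$$

If $f$ is an invertible function as given by Eq.12, then substitution of $\hat{\mu}_k$ into Eq.12 results in the estimator $\hat{\theta}$ as given by Eq.14.

$$\hat{\theta} = f^{-1}\!\left(\frac{1}{N}\sum_{n=0}^{N-1} x^k[n]\right) \tag{14}$$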
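The four steps of Eqs. 11 to 14 can be sketched for a simple case. The uniform distribution on (0, θ) is an assumption chosen for illustration because its first moment, θ/2, is easy to invert; it is not a model from the slides, and the function name is hypothetical.

```python
def method_of_moments_uniform(x):
    """Estimate theta for x[n] ~ Uniform(0, theta) from the first moment (k = 1).

    Eq. 11: mu_1 = E[x[n]] = f(theta) = theta / 2
    Eq. 12: theta = f^{-1}(mu_1) = 2 * mu_1
    """
    mu1_hat = sum(x) / len(x)    # Eq. 13: sample average replaces the moment
    return 2.0 * mu1_hat         # Eq. 14: apply the inverse of f

x = [1.0, 4.0, 2.5, 3.5, 0.5, 3.0]   # hypothetical samples
print(method_of_moments_uniform(x))
```

Nothing in the construction controls the variance of the result, which is why the method is usually valued for simplicity and consistency rather than for efficiency.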

Best Linear Unbiased Estimator (BLUE)

Best Linear Unbiased Estimator (BLUE) (contd.)

BLUE is based on two essential requirements called linearity conditions:

The data model is linear.

The estimator itself is a linear function of the data.

The two linearity conditions are given by Eqs.2 and 15.

$$\hat{\theta} = \sum_{n=0}^{N-1} a_n\, x[n] \tag{15}$$

Note that the second linearity condition is necessary to make $\hat{\theta}$ unbiased, as given by Eq.16.

$$E[\hat{\theta}] = \sum_{n=0}^{N-1} a_n\, E\bigl[x[n]\bigr] = \theta \tag{16}$$

BLUE is a suboptimal estimator because the lower bound of its variance is unknown.

It can be used successfully if its variance is in an acceptable range and it produces results with reasonable accuracy.
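A minimal sketch of how the weights $a_n$ of Eq.15 can be chosen. It assumes a scalar θ, the linear model of Eq.2, and uncorrelated zero-mean noise whose per-sample variances are known; under those assumptions the standard BLUE result is θ̂ = (Σ h[n] x[n] / σ_n²) / (Σ h[n]² / σ_n²). The function names and numbers are hypothetical.

```python
def blue_weights(h, noise_var):
    """Weights a_n of the BLUE for x[n] = h[n] * theta + w[n].

    Assumes uncorrelated, zero-mean noise with known variance noise_var[n].
    """
    denom = sum(h_n ** 2 / v_n for h_n, v_n in zip(h, noise_var))
    return [(h_n / v_n) / denom for h_n, v_n in zip(h, noise_var)]

def blue_estimate(x, h, noise_var):
    """theta_hat = sum over n of a_n * x[n], as in Eq. 15."""
    a = blue_weights(h, noise_var)
    return sum(a_n * x_n for a_n, x_n in zip(a, x))

h = [1.0, 1.0, 1.0]
noise_var = [1.0, 4.0, 4.0]    # hypothetical: the first sample is the most reliable
x = [10.0, 12.0, 8.0]
print(blue_weights(h, noise_var))
print(blue_estimate(x, h, noise_var))
```

The weights satisfy Σ a_n h[n] = 1, which is the unbiasedness constraint behind Eq.16, and noisier samples receive smaller weights.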

Best Linear Unbiased Estimator (BLUE) (contd.)

A limitation of this method, from a practical point of view in software testing, is that we have to know the variance of the noise.

To date no detailed study has investigated the statistical characteristics of noise in the testing process.

A simple way to approximate the variance of the noise is to use the variance of the data, as given by Eqs.17 and 18.

$$\mathrm{VAR}[\mathbf{x}] = \mathrm{VAR}[\mathbf{h}\theta + \mathbf{w}] \tag{17}$$

$$\mathrm{VAR}[\mathbf{x}] = \mathrm{VAR}[\mathbf{w}] \tag{18}$$

However, the effects of this approximation on the performance of the BLUE estimator are unknown with respect to software testing.

Least Square Error (LSE)

Least Square Error (LSE) (contd.)

It is the most commonly used approximation or estimation method.

The geometrical interpretation of LSE is intuitive: if we have data points in space, LSE finds the curve which minimizes the sum of squared distances to all these points.

A weakness of LSE is that it is sensitive to outliers (points which are far away from the group of points). Because of these outliers the fitted curve may lie away from the vicinity of most points.

A simple way to remedy this situation is to remove the outliers from the data set.

The main advantages of LSE are that it is simple to develop and that no information about the probability distribution of the data set or the noise is needed.

On the other hand, the statistical performance of LSE is questionable.

The authors have used LSE to approximate the values of $\lambda_1$ and $\lambda_2$, the defect decay parameters in ED3M.

From the application of ED3M to several industrial and simulated data sets, the performance of the LSE estimator for $\lambda_1$ and $\lambda_2$ was judged acceptable.
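The sensitivity of LSE to outliers can be seen in a small sketch. A straight line is fitted here only because its least-squares solution has a simple closed form; this is an illustration, not the nonlinear fit the authors use for the ED3M decay parameters, and the data is made up.

```python
def fit_line(t, y):
    """Least-squares fit of y ~ a + b * t; returns (a, b)."""
    n = len(t)
    t_mean = sum(t) / n
    y_mean = sum(y) / n
    sxx = sum((t_i - t_mean) ** 2 for t_i in t)
    sxy = sum((t_i - t_mean) * (y_i - y_mean) for t_i, y_i in zip(t, y))
    b = sxy / sxx
    return y_mean - b * t_mean, b

t = [0.0, 1.0, 2.0, 3.0, 4.0]
y_clean = [1.0, 3.0, 5.0, 7.0, 9.0]       # exactly y = 1 + 2t
y_outlier = [1.0, 3.0, 5.0, 7.0, 29.0]    # same data with one outlier at the end
print(fit_line(t, y_clean))
print(fit_line(t, y_outlier))
```

A single bad point out of five changes the fitted slope from 2 to 6, so the fitted line no longer passes near the four well-behaved points.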

Defect Estimators

Padberg’s maximum likelihood estimates for the hypergeometric software reliability model

Musa-Okumoto logarithmic Poisson execution time model for software reliability measurement

Estimation of Defects based on Defect Decay Model (ED3M)

All of these approaches are based on MLE, and MLE requires an assumption about the probability distribution.

Therefore each model defines a distribution by making some assumptions while ignoring other facts.

Hence it can be safely deduced that no model will work in all situations.

Padberg’s Approach

Padberg showed that the growth quotient $Q(m)$ of the likelihood function $L(m)$, when greater than 1, indicates that the likelihood function is still increasing, and that it can be used to find maximum likelihood estimates.

$$Q(m) = \frac{L(m)}{L(m-1)} = \frac{(m - w_1)\cdots(m - w_n)}{m^{\,n-1}\,(m - c_n)} \tag{19}$$

In Eq.19 $m$ is the initial number of faults, $w_n$ is the number of newly discovered and rediscovered faults for the nth test, and $c_n$ is the cumulative number of faults in n tests.

For a given data set $c_n$, first set $x = c_n + 1$ and find $Q(x)$.

If $Q(x) > 1$ then set $x = x + 1$ and find $Q(x)$ again.

Keep repeating these steps until $Q(x) \le 1$.

The statistical performance of the estimator obtained from $Q(m)$ is not discussed.

We do not know whether its variance is asymptotically bounded by the CRLB, in other words whether it is asymptotically an efficient MVU estimator.

Even though the underlying data model is not known, it can be observed from Eq.19 that the model is nonlinear.

If the data model is nonlinear then the estimator cannot achieve the CRLB for finite data records.
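The search procedure above can be sketched as follows. The form of Q(m) used in the code, the product of (m − w_i) divided by m^(n−1)·(m − c_n), is my reading of Eq.19 and should be checked against Padberg's paper. Returning x − 1 when the loop stops (the last m at which the likelihood was still increasing) is also an assumption, since the slide gives only the stopping rule. The function names, the cap `m_max`, and the data are hypothetical.

```python
from fractions import Fraction
from math import prod

def growth_quotient(m, w, c_n):
    """Q(m) = L(m) / L(m - 1), as read from Eq. 19; requires m > c_n.

    w[i] is the number of faults (new and rediscovered) found in test i,
    c_n is the cumulative number of distinct faults after all tests.
    """
    numerator = prod(m - w_i for w_i in w)
    denominator = m ** (len(w) - 1) * (m - c_n)
    return Fraction(numerator, denominator)   # exact arithmetic, no rounding

def padberg_estimate(w, c_n, m_max=10**6):
    """Step x upward from c_n + 1 while Q(x) > 1, then return x - 1."""
    x = c_n + 1
    while growth_quotient(x, w, c_n) > 1:
        x += 1
        if x > m_max:
            raise ValueError("Q(m) > 1 up to m_max; no finite estimate found")
    return x - 1

# Hypothetical data: two tests finding 7 and 5 faults, 9 distinct faults overall.
print(padberg_estimate([7, 5], c_n=9))
```

The cap `m_max` guards against data sets for which Q(m) stays above 1 and the search would otherwise never stop.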

Musa-Okumoto logarithmic Poisson execution time model

Musa and Okumoto proposed a reliability model based on the assumption that the number of failures $M(t)$ observed by time $t$ is Poisson distributed with mean $\mu(t)$, as given in Eq.20.

$$\Pr[M(t) = m] = \frac{[\mu(t)]^m}{m!}\, e^{-\mu(t)} \tag{20}$$

$$\mu(t) = \frac{1}{\theta}\ln(\lambda_0 \theta t + 1) \tag{21}$$

The parameters to be estimated are the initial failure intensity $\lambda_0$ and the rate of reduction in the normalized failure intensity per failure, $\theta$.

The data model in Eq.21 is a nonlinear function of $\lambda_0$ and $\theta$, hence MLE will not achieve the CRLB for a finite data set.

A closed form solution of MLE could not be found for Eqs.20 and 21.

Therefore a numerical approximation of MLE is needed.

It is not guaranteed that the approximation of MLE will be asymptotically an efficient MVU estimator.
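Eqs. 20 and 21 can be evaluated directly, which makes the nonlinearity in the parameters easy to see. The sketch below only evaluates the model for given parameters; it does not perform the numerical MLE. The function names and the values of `lambda0` and `theta` are hypothetical.

```python
import math

def mean_failures(t, lambda0, theta):
    """mu(t) = (1 / theta) * ln(lambda0 * theta * t + 1), as in Eq. 21."""
    return math.log(lambda0 * theta * t + 1.0) / theta

def prob_failures(m, t, lambda0, theta):
    """Pr[M(t) = m] = mu(t)^m * exp(-mu(t)) / m!, as in Eq. 20."""
    mu = mean_failures(t, lambda0, theta)
    return mu ** m * math.exp(-mu) / math.factorial(m)

lambda0, theta = 10.0, 0.05    # hypothetical parameter values
for t in (1.0, 10.0, 100.0):
    # The expected number of failures grows logarithmically with time.
    print(t, mean_failures(t, lambda0, theta))
```

Because λ0 and θ sit inside the logarithm, the likelihood built from these expressions has no closed-form maximizer, which is why an iterative method is needed.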

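Estimation of Defects based on Defect Decay Model (ED3M)

The data model of ED3M is given by Eq.22, where $\mathbf{D}$ is the defect data vector, $\mathbf{h}$ is the observation vector and $\mathbf{w}$ is the noise vector. The vectors are of dimension N×1.

$$\mathbf{D} = \mathbf{h}R_0 + \mathbf{w} \tag{22}$$

$$h(n) = 1 - \frac{\lambda_1}{\lambda_1 - \lambda_2}\, e^{-\lambda_2 n} + \frac{\lambda_2}{\lambda_1 - \lambda_2}\, e^{-\lambda_1 n} \tag{23}$$

We have assumed that $\mathbf{D}$ is normally distributed, and the PDF of $\mathbf{D}$ is given by Eq.24.

$$p(\mathbf{D}; R_0) = \frac{1}{(2\pi\sigma^2)^{N/2}} \exp\!\left[-\frac{1}{2\sigma^2}(\mathbf{D} - R_0\mathbf{h})^T(\mathbf{D} - R_0\mathbf{h})\right] \tag{24}$$

The initial number of defects in the software is given by $R_0$.

The MLE estimator for $R_0$ is given by Eq.25.

$$\hat{R}_0 = (\mathbf{h}^T\mathbf{h})^{-1}\mathbf{h}^T\mathbf{D} \tag{25}$$

As seen in Eq.22 the data model is linear, therefore the MLE estimator in Eq.25 can achieve the CRLB for a finite data set and will be an efficient MVU estimator.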
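A minimal sketch of the closed-form estimator of Eq.25 for scalar R0, where (hᵀh)⁻¹hᵀD reduces to Σ h(n)D(n) / Σ h(n)². The expression used for h(n) is my reconstruction of Eq.23 and should be checked against the ED3M paper. Taking samples at n = 1, …, N, the function names, and all numeric values are assumptions for illustration.

```python
import math

def h_ed3m(n, lam1, lam2):
    """Observation term h(n), reconstructed from Eq. 23 (requires lam1 != lam2)."""
    d = lam1 - lam2
    return (1.0
            - (lam1 / d) * math.exp(-lam2 * n)
            + (lam2 / d) * math.exp(-lam1 * n))

def estimate_r0(defects, lam1, lam2):
    """R0_hat = (h^T h)^-1 h^T D (Eq. 25), written out for scalar R0.

    defects[i] is assumed to be the cumulative defect count at time n = i + 1.
    """
    h = [h_ed3m(n, lam1, lam2) for n in range(1, len(defects) + 1)]
    return (sum(h_n * d_n for h_n, d_n in zip(h, defects))
            / sum(h_n ** 2 for h_n in h))

lam1, lam2 = 0.05, 0.2    # hypothetical decay parameters
true_r0 = 200.0
# Noiseless data generated from the model itself, D(n) = R0 * h(n).
defects = [true_r0 * h_ed3m(n, lam1, lam2) for n in range(1, 31)]
print(estimate_r0(defects, lam1, lam2))
```

With noiseless data the estimator recovers the R0 used to generate it; with real data λ1 and λ2 would first have to be approximated, which the slides say is done with LSE.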

Comparison between ED3M and Padberg's MLE, where the total number of defects is 481 and the total time length is 111 units.

Comparison between ED3M and Musa-Okumoto model, where the total number of defects is 136 and the total time length is 88.682 × 10^3 CPU seconds.

Conclusion and future work

An accurate prediction of the total number of software defects helps in evaluating the status of the testing process.

But the accuracy of the estimator depends on the estimation method used to develop it.

We have tried to provide a general framework of available estimation methods for researchers who are interested in defect estimation.

Although the discussion has centered on software testing and defect estimation, it is general enough to be used for other estimation problems.

We have elicited the requirements of each method.

We have also discussed the statistical efficiency that each method offers.

Even though the discussion is limited to single parameter estimation, it can be easily extended to a vector of parameters.

In the future we will extend our discussion to Bayesian approaches and expand the analysis of existing estimators to be more comprehensive.
