Estimation and Inferential Statistics
Pradip Kumar Sahu • Santi Ranjan Pal • Ajit Kumar Das
Estimation and Inferential Statistics
Pradip Kumar Sahu
Department of Agricultural Statistics
Bidhan Chandra Krishi Viswavidyalaya
Mohanpur, Nadia, West Bengal, India

Santi Ranjan Pal
Department of Agricultural Statistics
Bidhan Chandra Krishi Viswavidyalaya
Mohanpur, Nadia, West Bengal, India

Ajit Kumar Das
Department of Agricultural Statistics
Bidhan Chandra Krishi Viswavidyalaya
Mohanpur, Nadia, West Bengal, India
ISBN 978-81-322-2513-3
ISBN 978-81-322-2514-0 (eBook)
DOI 10.1007/978-81-322-2514-0
Library of Congress Control Number: 2015942750
Springer New Delhi Heidelberg New York Dordrecht London
© Springer India 2015
This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.
Printed on acid-free paper
Springer (India) Pvt. Ltd. is part of Springer Science+Business Media (www.springer.com)
Preface
Nowadays one can hardly find any field where statistics is not used. With a given sample, one can draw inferences about the population. The role of estimation and inferential statistics remains pivotal in the study of statistics. Statistical inference is concerned with problems of estimation of population parameters and tests of hypotheses. In statistical inference, a conclusion about the population is drawn on the basis of a portion of the population. This book has been written keeping in mind the needs of users, the present availability of literature catering to these needs, and its merits and demerits under a constantly changing scenario. Theories are followed by relevant worked-out examples that help the user grasp not only the theory but also its practice.
This work is the result of the authors' experience in teaching and research over more than 20 years. The wide scope and coverage of the book will help not only students, researchers and professionals in the field of statistics but also many others in various allied disciplines. Every effort has been made to present estimation and statistical inference in terms of its meaning, intention and usefulness. The book reflects current methodological techniques used in interdisciplinary research, illustrated with many relevant research examples. Statistical tools have been presented with the help of real-life examples in such a manner that the fear factor surrounding the otherwise complicated subject of statistics will vanish. In its seven chapters, theories followed by examples will help readers find the most suitable applications.
Starting from the meaning of statistical inference, its development, different parts and types have been discussed eloquently. How statistical inference can be used in everyday life has remained the main point of discussion in the examples. How one can draw conclusions about the population under varied situations, even without studying each and every unit of the population, has been discussed with numerous examples. All sorts of inferential problems have been discussed in one place, supported by examples, to help students not only in meeting their examination needs and research requirements but also in daily life. One can hardly find such a compilation on statistical inference in one place. The step-by-step procedure will immensely help not only graduate and Ph.D. students but also other researchers and professionals. Graduate and postgraduate students, researchers and professionals in various fields will be the users of this book. Researchers in medical, social and other disciplines will benefit greatly from it. The book will also help students in various competitive examinations.
Written in a lucid language, the book will be useful to graduate, postgraduate and research students and to practitioners in diverse fields, including the medical, social and other sciences. It will also cater to the needs of those preparing for different competitive examinations. One can hardly find a single book in which all topics related to estimation and inference are included. Numerous relevant examples for the related theories are an added feature of this book. An introductory chapter and an annexure are special features that will help readers gain basic ideas and fill gaps in their understanding. A chapter-wise summary of the content of the book is presented below.
Estimation and Inferential Statistics
• Chapter 1: This chapter introduces the theory of point estimation and inferential statistics. Different criteria for a good estimator are discussed. The chapter also presents real-life worked-out problems that help the reader understand the subject. Compared to the partial coverage of this topic in most books on statistical inference, this book aims at elaborate coverage of point estimation.
• Chapter 2: This chapter deals with different methods of estimation, such as the least square method, the method of moments, the method of minimum χ² and the method of maximum likelihood. Not all these methods are equally good and applicable in all situations. The merits, demerits and applicability of these methods have been discussed in one place, whereas they have otherwise remained mostly dispersed or scattered in the competing literature.
• Chapter 3: Testing of hypotheses is discussed in this chapter. The chapter is characterized by typical examples in different forms and spheres, including Type A1 testing, which is mostly overlooked in much of the available literature but has been covered in this book.
• Chapter 4: The essence and technique of the likelihood ratio test are discussed in this chapter. Irrespective of the nature of the hypotheses tested (simple or composite), this chapter emphasizes how easily the test can be performed, supported by a good number of examples. Merits and drawbacks are also discussed. Some typical examples discussed in this chapter can hardly be found in any other competing literature.
• Chapter 5: This chapter deals with interval estimation. Techniques of interval estimation under different situations, and the problems and prospects of the different approaches to interval estimation, have been discussed with numerous examples in one place.
• Chapter 6: This chapter deals with non-parametric methods of testing hypotheses. All types of non-parametric tests have been put together and discussed in detail. The suitable examples provided in each case are a special feature of this chapter.
• Chapter 7: This chapter is devoted to a discussion of decision theory, which is particularly useful to students and researchers interested in inferential statistics. An attempt has been made to present decision theory in an exhaustive manner, keeping in mind the requirements and purposes of the readers at whom the book is aimed. The Bayes and minimax methods of estimation are discussed in the Annexure. Most of the available literature on inferential statistics lacks due attention to these important aspects of inference. In this chapter, the importance and utility of the above methods have been discussed in detail, supported by relevant examples.
• Annexure: The authors feel that the Annexure will be an asset to the varied readers of this book. Related topics, proofs, examples, etc., which could not be provided in the text itself during the discussion of the various chapters, for the sake of continuity and flow, are provided in this section. Besides many useful proofs and derivations, this section includes transformations of statistics, large sample theories, exact tests related to the binomial and Poisson populations, etc. This added section will be of much help to readers.
In each chapter, theories are followed by examples from applied fields, which will help readers understand the theories and the applications of specific tools. Attempts have been made to present the problems with examples on each topic in a lucid manner. During the preparation of this book, a good number of books and articles from different national and international journals were consulted. Efforts have been made to acknowledge these in the bibliography section. An inquisitive reader may find more material in the literature cited.
The primary purpose of the book is to help students of statistics and allied fields. Sincere efforts have been made to present the material in the simplest and most easy-to-understand form. Encouragement, suggestions and help received from our colleagues at the Department of Agricultural Statistics, Bidhan Chandra Krishi Viswavidyalaya, are sincerely acknowledged; their valuable suggestions towards the improvement of the content helped a lot. The authors thankfully acknowledge the constructive suggestions received from the reviewers towards the improvement of the book. Thanks are also due to Springer for publishing this book and for continuous monitoring, help and suggestions during the project. The authors also acknowledge the help, cooperation and encouragement received from various corners not mentioned here. The effort will be successful if this book is well accepted by the students, teachers, researchers and other users at whom it is aimed. Every effort has been made to avoid errors. Constructive suggestions from readers for improving the quality of this book will be highly appreciated.
Mohanpur, Nadia, India
Pradip Kumar Sahu
Santi Ranjan Pal
Ajit Kumar Das
Contents
1 Theory of Point Estimation — 1
  1.1 Introduction — 1
  1.2 Sufficient Statistic — 3
  1.3 Unbiased Estimator and Minimum-Variance Unbiased Estimator — 21
  1.4 Consistent Estimator — 39
  1.5 Efficient Estimator — 44

2 Methods of Estimation — 47
  2.1 Introduction — 47
  2.2 Method of Moments — 47
  2.3 Method of Maximum Likelihood — 48
  2.4 Method of Minimum χ² — 55
  2.5 Method of Least Square — 56

3 Theory of Testing of Hypothesis — 63
  3.1 Introduction — 63
  3.2 Definitions and Some Examples — 63
  3.3 Method of Obtaining BCR — 73
  3.4 Locally MPU Test — 90
  3.5 Type A1 (≡ Uniformly Most Powerful Unbiased) Test — 97

4 Likelihood Ratio Test — 103
  4.1 Introduction — 103
    4.1.1 Some Selected Examples — 104

5 Interval Estimation — 131
  5.1 Introduction — 131
  5.2 Confidence Interval — 131
  5.3 Construction of Confidence Interval — 132
  5.4 Shortest Length Confidence Interval and Neyman's Criterion — 138

6 Non-parametric Test — 145
  6.1 Introduction — 145
  6.2 One-Sample Non-parametric Tests — 146
    6.2.1 Chi-Square Test (i.e. Test for Goodness of Fit) — 146
    6.2.2 Kolmogorov–Smirnov Test — 147
    6.2.3 Sign Test — 148
    6.2.4 Wilcoxon Signed-Rank Test — 151
    6.2.5 Run Test — 152
  6.3 Paired Sample Non-parametric Test — 156
    6.3.1 Sign Test (Bivariate Single Sample Problem) or Paired Sample Sign Test — 156
    6.3.2 Wilcoxon Signed-Rank Test — 157
  6.4 Two-Sample Problem — 159
  6.5 Non-parametric Tolerance Limits — 168
  6.6 Non-parametric Confidence Interval for ξp — 170
  6.7 Combination of Tests — 173
  6.8 Measures of Association for Bivariate Samples — 174

7 Statistical Decision Theory — 181
  7.1 Introduction — 181
  7.2 Complete and Minimal Complete Class of Decision Rules — 189
  7.3 Optimal Decision Rule — 197
  7.4 Method of Finding a Bayes Rule — 199
  7.5 Methods for Finding Minimax Rule — 208
  7.6 Minimax Rule: Some Theoretical Aspects — 226
  7.7 Invariance — 228

Appendix — 237

References — 303

Index — 315
About the Authors
P.K. Sahu is associate professor and head of the Department of Agricultural Statistics, Bidhan Chandra Krishi Viswavidyalaya (a state agriculture university), West Bengal. With over 20 years of teaching experience, Dr. Sahu has published over 70 research papers in international journals of repute and has guided several postgraduate students and research scholars. He has authored four books: Agriculture and Applied Statistics, Vol. 1, and Agriculture and Applied Statistics, Vol. 2 (both published with Kalyani Publishers); Gender, Education, Empowerment: Stands of Women (published with Agrotech Publishing House); and Research Methodology: A Guide for Researchers in Agricultural Science, Social Science and Other Related Fields (published by Springer). He has also contributed a chapter to the book Modelling, Forecasting, Artificial Neural Network and Expert System in Fisheries and Aquaculture, edited by Ajit Kumar Roy and Niranjan Sarangi (Daya Publishing House). Dr. Sahu has presented his research papers at several international conferences, and has visited the USA, Bangladesh, Sri Lanka and Vietnam to attend them.
S.R. Pal is a former eminent professor at the Department of Agricultural Statistics at R.K. Mission Residential College and Bidhan Chandra Krishi Viswavidyalaya (a state agriculture university). An expert in agricultural statistics, Prof. Pal has over 35 years of teaching experience and has guided several postgraduate students and research scholars. He has published several research papers in statistics and related fields in international journals of repute. With his vast experience in teaching, research and industrial advisory roles, Prof. Pal has tried to address the problems faced by users, students and researchers in this field.
A.K. Das is professor at the Department of Agricultural Statistics, Bidhan Chandra Krishi Viswavidyalaya (a state agriculture university). With over 30 years of teaching experience, Prof. Das has a number of research articles to his credit, published in international journals of repute, and has guided several postgraduate students and research scholars. He has coauthored the book Agriculture and Applied Statistics, Vol. 2 (published with Kalyani Publishers), and contributed a chapter to the book Modelling, Forecasting, Artificial Neural Network and Expert System in Fisheries and Aquaculture, edited by Ajit Kumar Roy and Niranjan Sarangi (Daya Publishing House).
Introduction
In a statistical investigation, for reasons of time or cost, one may not be able to study each individual element of the population. In such a situation, a random sample should be taken from the population, and inferences can be drawn about the population on the basis of the sample. Statistics thus deals with the collection of data and their analysis and interpretation. In this book, the problem of data collection is not considered; we take the data as given and study what they have to tell us. The main objective is to draw conclusions about the unknown population characteristics on the basis of information on the same characteristics from a suitably selected sample. The observations are postulated to be the values taken by random variables. Let X be a random variable that describes the population under investigation and F be the distribution function of X. There are two possibilities: either X has a distribution function $F_\theta$ with a known functional form (except perhaps for the parameter $\theta$, which may be a vector), or X has a distribution function F about which we know nothing (except perhaps that F is, say, absolutely continuous). In the former case, let $\Theta$ be the set of possible values of the unknown parameter $\theta$; then the job of the statistician is to decide, on the basis of suitably selected samples, which member or members of the family $\{F_\theta : \theta \in \Theta\}$ can represent the distribution function of X. Problems of this type are called problems of parametric statistical inference. The two principal areas of statistical inference are the estimation of parameters and the testing of statistical hypotheses. The problem of estimation of parameters involves both point and interval estimation. The components and constituents of statistical inference may be shown diagrammatically in a chart.
Problem of Point Estimation
The problem of point estimation relates to finding an estimating formula for a parameter based on a random sample of size n from the population. The method basically consists of finding an estimating formula for a parameter, which is called the estimator of the parameter. The numerical value obtained from a given sample using the estimating formula is called an estimate. Suppose, for example, that a random variable X is known to have a normal distribution $N(\mu, \sigma^2)$, but we do not know one of the parameters, say $\mu$. Suppose further that a sample $X_1, X_2, \ldots, X_n$ is taken on X. The problem of point estimation is to pick a statistic $T(X_1, X_2, \ldots, X_n)$ that best estimates the parameter $\mu$. The numerical value of T for the realization $x_1, x_2, \ldots, x_n$ is called an estimate of $\mu$, while the statistic T is called an estimator of $\mu$. If both $\mu$ and $\sigma^2$ are unknown, we seek a joint statistic $T = (U, V)$ as an estimate of $(\mu, \sigma^2)$.

Example Let $X_1, X_2, \ldots, X_n$ be a random sample from any distribution $F_\theta$ for which the mean exists and is equal to $\theta$. We may want to estimate the mean $\theta$ of the distribution. For this purpose, we may compute the mean of the observations $x_1, x_2, \ldots, x_n$, i.e.,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i.$$

This $\bar{x}$ can be taken as the point estimate of $\theta$.
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from a Poisson distribution with parameter $\lambda$, i.e., $P(\lambda)$, where $\lambda$ is not known. Then the mean of the observations $x_1, x_2, \ldots, x_n$, i.e.,

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i,$$

is a point estimate of $\lambda$.
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal distribution with parameters $\mu$ and $\sigma^2$, i.e., $N(\mu, \sigma^2)$, where both $\mu$ and $\sigma^2$ are unknown; $\mu$ and $\sigma^2$ are respectively the mean and variance of the normal distribution. In this case, we may take the joint statistic $(\bar{x}, s^2)$ as a point estimate of $(\mu, \sigma^2)$, where

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i = \text{sample mean}$$

and

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 = \text{sample mean square.}$$
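As a quick numerical sketch of these two estimators in plain Python (the data values below are invented purely for illustration):

```python
# Point estimates of the mean and variance of a normal population,
# computed from a sample (illustrative data, not from the book).
def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_mean_square(xs):
    # divisor n - 1, matching the text's definition of s^2
    xbar = sample_mean(xs)
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - 1)

data = [4.2, 5.1, 3.8, 4.9, 5.4, 4.6]
print(sample_mean(data))         # estimate of mu
print(sample_mean_square(data))  # estimate of sigma^2
```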
Problem of Interval Estimation
In many cases, instead of a point estimate, we are interested in constructing a family of sets that contain the true (unknown) parameter value with a specified (high) probability, say $100(1-\alpha)\%$. This set is taken to be an interval, which is known as a confidence interval with confidence coefficient $(1-\alpha)$, and the technique of constructing such intervals is known as interval estimation.
Let $X_1, X_2, \ldots, X_n$ be a random sample from any distribution $F_\theta$. Let $\underline{\theta}(x)$ and $\bar{\theta}(x)$ be functions of $x_1, x_2, \ldots, x_n$. If $P[\underline{\theta}(x) < \theta < \bar{\theta}(x)] = 1 - \alpha$, then $(\underline{\theta}(x), \bar{\theta}(x))$ is called a $100(1-\alpha)\%$ confidence interval for $\theta$, where $\underline{\theta}(x)$ and $\bar{\theta}(x)$ are respectively called the lower and upper limits for $\theta$.
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from $N(\mu, \sigma^2)$, where both $\mu$ and $\sigma^2$ are unknown. We can find a $100(1-\alpha)\%$ confidence interval for $\mu$. To estimate the population mean $\mu$ and population variance $\sigma^2$, we may take the observed sample mean

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

and the observed sample mean square

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2,$$

respectively. A $100(1-\alpha)\%$ confidence interval for $\mu$ is then given by

$$\bar{x} \pm t_{\alpha/2,\, n-1}\, \frac{s}{\sqrt{n}},$$

where $t_{\alpha/2,\, n-1}$ is the upper $\alpha/2$ point of the t-distribution with $(n-1)$ d.f.
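A minimal sketch of this interval in Python; the t quantile is taken from standard tables (assumption: $t_{0.025,\,9} \approx 2.262$ for $n = 10$), and the data values are illustrative:

```python
import math

# 95% confidence interval for mu with sigma^2 unknown:
#   xbar +/- t_{alpha/2, n-1} * s / sqrt(n)
# The t quantile below comes from standard tables (t_{0.025, 9} ≈ 2.262);
# the data are illustrative.
data = [10.2, 9.8, 10.5, 10.1, 9.9, 10.4, 10.0, 9.7, 10.3, 10.1]
n = len(data)
xbar = sum(data) / n
s = math.sqrt(sum((x - xbar) ** 2 for x in data) / (n - 1))
t = 2.262                          # t_{0.025, 9}, two-sided 95%
half = t * s / math.sqrt(n)
lower, upper = xbar - half, xbar + half
print((round(lower, 3), round(upper, 3)))
```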
Problem of Testing of Hypothesis
Besides point estimation and interval estimation, we are often required to decide which value among a set of values of a parameter is true for a given population distribution, or we may be interested in finding the relevant distribution to describe a population. The procedure by which a decision is taken regarding the plausible value of a parameter or the nature of a distribution is known as the testing of hypotheses. Some examples of hypotheses that can be subjected to statistical tests are as follows:
1. The average length of life μ of electric light bulbs of a certain brand is equal to some specified value μ₀.
2. The average number of bacteria killed by test drops of germicide is equal to some number.
3. Steel made by method A has a mean hardness greater than steel made by method B.
4. Penicillin is more effective than streptomycin in the treatment of disease X.
5. The growing period of one hybrid of corn is more variable than the growing period of other hybrids.
6. The manufacturer claims that tires made by a new process have a mean life greater than that of tires manufactured by an earlier process.
7. Several varieties of wheat are equally important in terms of yield.
8. Several brands of batteries have different lifetimes.
9. The characters in the population are uncorrelated.
10. The proportion of non-defective items produced by machine A is greater than that of machine B.

The examples given are simple in nature, well established, and have well-accepted decision rules.
Problems of Non-parametric Estimation
So far, in (parametric) statistical inference, we have assumed that the distribution of the random variable being sampled is known except for some parameters. In practice, the functional form of the distribution may be unknown. Here, we are concerned not with the techniques of estimating the parameters directly, but with certain pertinent hypotheses relating to the properties of the population, such as equality of distributions or randomness of the sample, without making any assumption about the nature of the distribution function. Statistical inference under such a setup is called non-parametric.
Bayes Estimator
In parametric inference, we consider the density function $f(x \mid \theta)$, where $\theta$ is a fixed unknown quantity that can take any value in the parameter space $\Theta$. In the Bayesian approach, it is assumed that $\theta$ is itself a random variable and that $f(x \mid \theta)$ is the density of x for a given $\theta$. For example, suppose we are interested in estimating P, the fraction of defective items in a consignment. Consider a collection of lots, called superlots. The parameter P may differ from lot to lot. In the classical approach, we consider P a fixed unknown parameter, whereas in the Bayesian approach we say that P varies from lot to lot; it is a random variable having a density f(P), say. The Bayes method tries to use this additional information about P.
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from the PDF

$$f(x; \alpha, \beta) = \frac{1}{B(\alpha, \beta)}\, x^{\alpha-1}(1-x)^{\beta-1}, \quad 0 < x < 1;\ \alpha, \beta > 0.$$

Find the estimators of α and β by the method of moments.

Answer
We know

$$E(x) = \mu_1' = \frac{\alpha}{\alpha+\beta} \quad \text{and} \quad E(x^2) = \mu_2' = \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}.$$

Hence

$$\frac{\alpha}{\alpha+\beta} = \bar{x}, \qquad \frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)} = \frac{1}{n}\sum_{i=1}^{n} x_i^2.$$

Solving, we get

$$\hat{\beta} = \frac{(\bar{x}-1)\left(\sum x_i^2 - n\bar{x}\right)}{\sum (x_i - \bar{x})^2} \quad \text{and} \quad \hat{\alpha} = \frac{\bar{x}\,\hat{\beta}}{1-\bar{x}}.$$
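The closed-form solution above can be checked numerically: the estimates must reproduce the two sample moment equations exactly. A minimal sketch with illustrative data in (0, 1):

```python
# Method-of-moments estimates for the beta density, using the
# closed-form solution derived above (data values illustrative).
data = [0.21, 0.45, 0.37, 0.62, 0.33, 0.51, 0.44, 0.29]
n = len(data)
xbar = sum(data) / n
m2 = sum(x * x for x in data) / n            # second raw moment

# beta_hat = (xbar - 1)(sum x_i^2 - n xbar) / sum (x_i - xbar)^2
ss = sum((x - xbar) ** 2 for x in data)
beta_hat = (xbar - 1) * (sum(x * x for x in data) - n * xbar) / ss
alpha_hat = xbar * beta_hat / (1 - xbar)

# The estimates reproduce both moment equations:
total = alpha_hat + beta_hat
print(abs(alpha_hat / total - xbar) < 1e-9)
print(abs(alpha_hat * (alpha_hat + 1) / (total * (total + 1)) - m2) < 1e-9)
```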
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from the PDF

$$f(x; \theta, r) = \frac{1}{\theta^r\, \Gamma(r)}\, e^{-x/\theta}\, x^{r-1}, \quad x > 0;\ \theta > 0,\ r > 0.$$

Find estimators of θ and r by

(i) the method of moments;
(ii) the method of maximum likelihood.

Answer
(i) Here

$$E(x) = \mu_1' = r\theta, \qquad E(x^2) = \mu_2' = r(r+1)\theta^2, \qquad m_1' = \bar{x}, \qquad m_2' = \frac{1}{n}\sum_{i=1}^{n} x_i^2.$$

Hence

$$r\theta = \bar{x}, \qquad r(r+1)\theta^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2.$$

Solving, we get

$$\hat{r} = \frac{n\bar{x}^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2} \quad \text{and} \quad \hat{\theta} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n\bar{x}}.$$

(ii) The likelihood is

$$L = \frac{1}{\theta^{nr}\,(\Gamma(r))^n}\, e^{-\frac{1}{\theta}\sum_{i=1}^{n} x_i} \prod_{i=1}^{n} x_i^{r-1},$$

so that

$$\log L = -nr\log\theta - n\log\Gamma(r) - \frac{1}{\theta}\sum_{i=1}^{n} x_i + (r-1)\sum_{i=1}^{n}\log x_i.$$

Now,

$$\frac{\partial \log L}{\partial \theta} = -\frac{nr}{\theta} + \frac{n\bar{x}}{\theta^2} = 0 \;\Rightarrow\; \hat{\theta} = \frac{\bar{x}}{r},$$

and, substituting $\hat{\theta} = \bar{x}/r$,

$$\frac{\partial \log L}{\partial r} = -n\log\theta - n\frac{\Gamma'(r)}{\Gamma(r)} + \sum_{i=1}^{n}\log x_i = n\log r - n\frac{\Gamma'(r)}{\Gamma(r)} - n\log\bar{x} + \sum_{i=1}^{n}\log x_i.$$

It is, however, difficult to solve the equation

$$\frac{\partial \log L}{\partial r} = 0$$

and to get the estimate of r. Thus, for this example, estimators of θ and r are more easily obtained by the method of moments than by the method of maximum likelihood.
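A quick numerical sketch of the method-of-moments estimates for this gamma density (illustrative data); the checks confirm that the estimates reproduce both moment equations:

```python
# Method-of-moments estimates for the gamma density above
# (shape r, scale theta); data values are illustrative.
data = [1.2, 3.4, 2.1, 0.8, 4.5, 2.9, 1.7, 3.1]
n = len(data)
xbar = sum(data) / n
ss = sum((x - xbar) ** 2 for x in data)     # sum of squared deviations

r_hat = n * xbar ** 2 / ss                  # n xbar^2 / sum (x_i - xbar)^2
theta_hat = ss / (n * xbar)                 # sum (x_i - xbar)^2 / (n xbar)

# The estimates reproduce the moment equations r*theta = xbar and
# r*theta^2 = (1/n) sum (x_i - xbar)^2 (the sample variance):
print(abs(r_hat * theta_hat - xbar) < 1e-12)
print(abs(r_hat * theta_hat ** 2 - ss / n) < 1e-12)
```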
Example Let $X_1, X_2, \ldots, X_n$ be a random sample from the uniform (rectangular) distribution on $(\alpha, \beta)$. Find the estimators of α and β by the method of moments.

Answer
We know

$$E(x) = \mu_1' = \frac{\alpha+\beta}{2} \quad \text{and} \quad V(x) = \mu_2 = \frac{(\beta-\alpha)^2}{12}.$$

Hence

$$\frac{\alpha+\beta}{2} = \bar{x} \quad \text{and} \quad \frac{(\beta-\alpha)^2}{12} = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$

Solving, we get

$$\hat{\alpha} = \bar{x} - \sqrt{\frac{3\sum (x_i-\bar{x})^2}{n}} \quad \text{and} \quad \hat{\beta} = \bar{x} + \sqrt{\frac{3\sum (x_i-\bar{x})^2}{n}}.$$
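A similar numerical check of these closed forms (illustrative data); the estimates reproduce both moment equations:

```python
import math

# Method-of-moments estimates for a uniform distribution on
# (alpha, beta), using the closed forms above (illustrative data).
data = [2.3, 4.1, 3.7, 5.2, 2.9, 4.8, 3.3, 4.4]
n = len(data)
xbar = sum(data) / n
ss = sum((x - xbar) ** 2 for x in data)

half_range = math.sqrt(3 * ss / n)
alpha_hat = xbar - half_range
beta_hat = xbar + half_range

# Check: the estimates reproduce both moment equations.
print(abs((alpha_hat + beta_hat) / 2 - xbar) < 1e-12)
print(abs((beta_hat - alpha_hat) ** 2 / 12 - ss / n) < 1e-9)
```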
Example If a sample of size one is drawn from the PDF $f(x; \beta) = \frac{2}{\beta^2}(\beta - x),\ 0 < x < \beta$, find $\hat{\beta}$, the MLE of β, and $\beta^*$, the estimator of β based on the method of moments. Show that $\hat{\beta}$ is biased but $\beta^*$ is unbiased, and that the efficiency of $\hat{\beta}$ with respect to $\beta^*$ is 2/3.

Solution
Here

$$L = \frac{2}{\beta^2}(\beta - x), \qquad \log L = \log 2 - 2\log\beta + \log(\beta - x).$$

Then

$$\frac{\partial \log L}{\partial \beta} = -\frac{2}{\beta} + \frac{1}{\beta - x} = 0 \;\Rightarrow\; \beta = 2x,$$

so $\hat{\beta} = 2x$. Now,

$$E(x) = \frac{2}{\beta^2}\int_0^\beta (\beta x - x^2)\,dx = \frac{\beta}{3}.$$

Hence, equating $\beta/3 = x$ gives $\beta = 3x$; thus the estimator of β based on the method of moments is $\beta^* = 3x$. Now,

$$E(\hat{\beta}) = 2 \times \frac{\beta}{3} = \frac{2\beta}{3} \neq \beta, \qquad E(\beta^*) = 3 \times \frac{\beta}{3} = \beta.$$

Hence $\hat{\beta}$ is biased but $\beta^*$ is unbiased. Again,

$$E(x^2) = \frac{2}{\beta^2}\int_0^\beta (\beta x^2 - x^3)\,dx = \frac{\beta^2}{6}.$$

Therefore,

$$V(x) = \frac{\beta^2}{6} - \frac{\beta^2}{9} = \frac{\beta^2}{18}.$$

It follows that

$$V(\beta^*) = 9V(x) = \frac{\beta^2}{2}, \qquad V(\hat{\beta}) = 4V(x) = \frac{2\beta^2}{9}.$$

Hence the mean square error of $\hat{\beta}$ is

$$M(\hat{\beta}) = V(\hat{\beta}) + \left[E(\hat{\beta}) - \beta\right]^2 = \frac{2\beta^2}{9} + \left(\frac{2\beta}{3} - \beta\right)^2 = \frac{\beta^2}{3}.$$

Thus the efficiency of $\hat{\beta}$ with respect to $\beta^*$ is $M(\hat{\beta})/V(\beta^*) = \dfrac{\beta^2/3}{\beta^2/2} = 2/3$.
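The bias computations above can be checked by simulation. A Monte Carlo sketch, sampling from $f(x;\beta)$ by inverting the CDF $F(x) = (2\beta x - x^2)/\beta^2$, which gives $x = \beta(1 - \sqrt{1-u})$ for uniform u:

```python
import random

# Monte Carlo check of the bias results above for f(x; b) = 2(b - x)/b^2
# on (0, b).  A single observation x gives the MLE  b_hat = 2x  and the
# moment estimator  b_star = 3x.
# Inverse-CDF sampling: F(x) = (2bx - x^2)/b^2  =>  x = b(1 - sqrt(1 - u)).
random.seed(0)
b = 5.0
N = 200_000
draws = [b * (1 - (1 - random.random()) ** 0.5) for _ in range(N)]

mean_mle = sum(2 * x for x in draws) / N    # should be near 2b/3, not b
mean_mom = sum(3 * x for x in draws) / N    # should be near b

print(abs(mean_mle - 2 * b / 3) < 0.05)     # biased: E(b_hat) = 2b/3
print(abs(mean_mom - b) < 0.05)             # unbiased: E(b_star) = b
```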
Example Let $(x_1, x_2, \ldots, x_n)$ be a given sample of size n. It is to be tested whether the sample comes from some Poisson distribution with unknown mean μ. How do you estimate μ by the method of modified minimum chi-square?

Solution
Let $x_1, x_2, \ldots, x_n$ be arranged in k groups such that there are $n_i$ observations with $x = i$, $i = r+1, \ldots, r+k-2$; $n_L$ observations with $x \le r$; and $n_U$ observations with $x \ge r+k-1$, so that the smallest and the largest values of x, which occur less frequently, are pooled together, and

$$n_L + \sum_{i=r+1}^{r+k-2} n_i + n_U = n.$$

Let

$$p_i(\mu) = P(x = i) = \frac{e^{-\mu}\mu^i}{i!},$$
$$p_L(\mu) = P(x \le r) = \sum_{i=0}^{r} p_i(\mu),$$
$$p_U(\mu) = P(x \ge r+k-1) = \sum_{i=r+k-1}^{\infty} p_i(\mu).$$

Now, by using the modified minimum chi-square equations

$$\sum_{i=1}^{k} \frac{n_i}{p_i(\theta)}\, \frac{\partial p_i(\theta)}{\partial \theta_j} = 0, \qquad j = 1, 2, \ldots, p,$$

and noting that $\frac{\partial p_i(\mu)}{\partial \mu} = \left(\frac{i}{\mu} - 1\right) p_i(\mu)$ for the Poisson probabilities, we have

$$n_L\, \frac{\sum_{i=0}^{r}\left(\frac{i}{\mu}-1\right) p_i(\mu)}{\sum_{i=0}^{r} p_i(\mu)} + \sum_{i=r+1}^{r+k-2} n_i \left(\frac{i}{\mu}-1\right) + n_U\, \frac{\sum_{i=r+k-1}^{\infty}\left(\frac{i}{\mu}-1\right) p_i(\mu)}{\sum_{i=r+k-1}^{\infty} p_i(\mu)} = 0.$$

Since there is only one parameter (i.e., p = 1), we get only the above equation. Solving, we get

$$n\hat{\mu} = n_L\, \frac{\sum_{i=0}^{r} i\, p_i(\mu)}{\sum_{i=0}^{r} p_i(\mu)} + \sum_{i=r+1}^{r+k-2} i\, n_i + n_U\, \frac{\sum_{i=r+k-1}^{\infty} i\, p_i(\mu)}{\sum_{i=r+k-1}^{\infty} p_i(\mu)} \approx \text{sum of all } x\text{'s}.$$

Hence $\hat{\mu}$ is approximately the sample mean $\bar{x}$.
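A numerical sketch of this procedure (the grouped counts, the choice of r and k, the truncation of the infinite tail sums, and the bisection solver below are all illustrative choices, not prescribed by the text):

```python
import math

# Modified minimum chi-square estimate for grouped Poisson data.
# Cells: {x <= r}, x = r+1, ..., r+k-2, and {x >= r+k-1}.
# The estimating equation derived above reduces to
#   n*mu = n_L * E[x | x <= r] + sum_i i*n_i + n_U * E[x | x >= r+k-1],
# with the conditional means evaluated at mu; we solve it by bisection.
def pois_pmf(i, mu):
    return math.exp(-mu) * mu ** i / math.factorial(i)

def cond_mean(lo, hi, mu):
    # E[x | lo <= x <= hi] under Poisson(mu); hi truncates the infinite tail
    den = sum(pois_pmf(i, mu) for i in range(lo, hi + 1))
    num = sum(i * pois_pmf(i, mu) for i in range(lo, hi + 1))
    return num / den

counts = {0: 3, 1: 9, 2: 12, 3: 11, 4: 8, 5: 4, 6: 2, 7: 1}  # n_i for x = i
n = sum(counts.values())
r, k = 1, 5                        # pool x <= 1 and x >= 5
n_L = counts[0] + counts[1]
middle = {i: counts[i] for i in (2, 3, 4)}
n_U = sum(c for i, c in counts.items() if i >= r + k - 1)

def g(mu):
    rhs = (n_L * cond_mean(0, r, mu)
           + sum(i * c for i, c in middle.items())
           + n_U * cond_mean(r + k - 1, 60, mu))
    return rhs - n * mu

lo, hi = 0.1, 20.0                 # g(lo) > 0 > g(hi): root bracketed
for _ in range(100):               # bisection
    mid = (lo + hi) / 2
    if g(lo) * g(mid) <= 0:
        hi = mid
    else:
        lo = mid
mu_hat = (lo + hi) / 2

xbar = sum(i * c for i, c in counts.items()) / n
print(abs(mu_hat - xbar) < 0.2)    # mu_hat is approximately the sample mean
```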
Example In general, we consider n uncorrelated observations $y_1, y_2, \ldots, y_n$ such that $E(y_i) = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki}$ and $V(y_i) = \sigma^2$, $i = 1, 2, \ldots, n$, with $x_{1i} = 1$ for all i, where $\beta_1, \beta_2, \ldots, \beta_k$ and $\sigma^2$ are unknown parameters. If Y and β stand for the column vectors of the variables $y_i$ and the parameters $\beta_j$, and if $X = (x_{ji})$ is the $(n \times k)$ matrix of known coefficients $x_{ji}$, the above equations can be written as

$$E(Y) = X\beta \quad \text{and} \quad V(\varepsilon) = E(\varepsilon\varepsilon') = \sigma^2 I,$$

where $\varepsilon = Y - X\beta$ is the $(n \times 1)$ vector of error random variables with $E(\varepsilon) = 0$, and I is the $(n \times n)$ identity matrix. The least square method requires that the β's be calculated such that

$$\phi = \varepsilon'\varepsilon = (Y - X\beta)'(Y - X\beta)$$

is a minimum. This is satisfied when

$$\frac{\partial \phi}{\partial \beta} = 0, \quad \text{i.e.,} \quad -2X'(Y - X\beta) = 0.$$

The least square estimator of β is thus given by the vector $\hat{\beta} = (X'X)^{-1}X'Y$.
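A minimal numerical check of the normal equations, worked for the two-parameter case ($x_{1i} = 1$) without any linear-algebra library; the data values are illustrative:

```python
# Least squares via the normal equations derived above,
# beta_hat = (X'X)^{-1} X'Y, written out for a two-parameter model.
x2 = [0.0, 1.0, 2.0, 3.0]
y  = [1.1, 2.9, 5.2, 6.8]
n = len(y)

# X'X = [[n, Sx], [Sx, Sxx]],  X'Y = [Sy, Sxy]
Sx  = sum(x2)
Sxx = sum(v * v for v in x2)
Sy  = sum(y)
Sxy = sum(v * w for v, w in zip(x2, y))

det = n * Sxx - Sx * Sx                 # determinant of X'X
b1 = (Sxx * Sy - Sx * Sxy) / det        # (X'X)^{-1} X'Y, written out
b2 = (n * Sxy - Sx * Sy) / det

# Check the normal equations X'(Y - X beta_hat) = 0:
res = [w - b1 - b2 * v for v, w in zip(x2, y)]
print(abs(sum(res)) < 1e-9)
print(abs(sum(v * r for v, r in zip(x2, res))) < 1e-9)
```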
Example Let $y_i = \beta_1 x_{1i} + \beta_2 x_{2i} + \cdots + \beta_k x_{ki}$, $i = 1, 2, \ldots, n$, and in particular take $E(y_i) = \beta_1 x_{1i} + \beta_2 x_{2i}$ with $x_{1i} = 1$ for all i. Find the least square estimates of β₁ and β₂. Prove that the method of maximum likelihood and the method of least square are identical for the case of the normal distribution.

Solution
In matrix notation, we have $E(Y) = X\beta$, where

$$X = \begin{pmatrix} 1 & x_{21} \\ 1 & x_{22} \\ \vdots & \vdots \\ 1 & x_{2n} \end{pmatrix}, \qquad \beta = \begin{pmatrix} \beta_1 \\ \beta_2 \end{pmatrix}, \qquad Y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}.$$

Now $\hat{\beta} = (X'X)^{-1}X'Y$. Here

$$X'X = \begin{pmatrix} n & \sum x_{2i} \\ \sum x_{2i} & \sum x_{2i}^2 \end{pmatrix}, \qquad X'Y = \begin{pmatrix} \sum y_i \\ \sum x_{2i}\, y_i \end{pmatrix}.$$
Then

$$\hat{\beta} = \frac{1}{n\sum x_{2i}^2 - \left(\sum x_{2i}\right)^2} \begin{pmatrix} \sum x_{2i}^2 & -\sum x_{2i} \\ -\sum x_{2i} & n \end{pmatrix} \begin{pmatrix} \sum y_i \\ \sum x_{2i}\, y_i \end{pmatrix} = \frac{1}{n\sum x_{2i}^2 - \left(\sum x_{2i}\right)^2} \begin{pmatrix} \sum x_{2i}^2 \sum y_i - \sum x_{2i} \sum x_{2i}\, y_i \\ n\sum x_{2i}\, y_i - \sum x_{2i} \sum y_i \end{pmatrix}.$$

Hence

$$\hat{\beta}_2 = \frac{n\sum x_{2i}\, y_i - \sum x_{2i} \sum y_i}{n\sum x_{2i}^2 - \left(\sum x_{2i}\right)^2} = \frac{\sum x_{2i}\, y_i - n\bar{x}_2\bar{y}}{\sum x_{2i}^2 - n\bar{x}_2^2} = \frac{\sum (x_{2i} - \bar{x}_2)(y_i - \bar{y})}{\sum (x_{2i} - \bar{x}_2)^2}$$

and

$$\hat{\beta}_1 = \frac{\sum x_{2i}^2 \sum y_i - \sum x_{2i} \sum x_{2i}\, y_i}{n\sum x_{2i}^2 - \left(\sum x_{2i}\right)^2} = \frac{\bar{y}\sum x_{2i}^2 - \bar{x}_2 \sum x_{2i}\, y_i}{\sum x_{2i}^2 - n\bar{x}_2^2} = \bar{y} - \bar{x}_2\hat{\beta}_2.$$

Now let $y_i$ be independent $N(\beta_1 + \beta_2 x_i, \sigma^2)$ variates, $i = 1, 2, \ldots, n$, so that $E(y_i) = \beta_1 + \beta_2 x_i$. The estimators of β₁ and β₂ are obtained by the method of least square by minimizing

$$\phi = \sum_{i=1}^{n} (y_i - \beta_1 - \beta_2 x_i)^2.$$

The likelihood is

$$L = \left(\frac{1}{\sqrt{2\pi}\,\sigma}\right)^n e^{-\frac{1}{2\sigma^2}\sum (y_i - \beta_1 - \beta_2 x_i)^2}.$$
L is maximum when $\sum_{i=1}^{n}(y_i - \beta_1 - \beta_2 x_i)^2$ is minimum. By the method of maximum likelihood, we choose β₁ and β₂ such that $\sum_{i=1}^{n}(y_i - \beta_1 - \beta_2 x_i)^2 = \phi$ is minimum. Hence the methods of least square and maximum likelihood give identical estimators.
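A small numerical sketch of this identity: the normal log-likelihood is a decreasing function of φ, so any perturbation away from the least-squares solution both raises φ and lowers the log-likelihood (data and σ illustrative):

```python
import math

# The least-squares / maximum-likelihood identity shown above:
# for normal errors, log L = -n*log(sqrt(2*pi)*sigma) - phi/(2*sigma^2),
# a decreasing function of phi, so both criteria share the same optimum.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.9, 3.1, 5.0, 7.2, 8.8]
n = len(y)

def phi(b1, b2):
    return sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))

def loglik(b1, b2, sigma=1.0):
    return -n * math.log(math.sqrt(2 * math.pi) * sigma) - phi(b1, b2) / (2 * sigma ** 2)

# Closed-form least-squares solution (from the preceding derivation):
xbar, ybar = sum(x) / n, sum(y) / n
b2_hat = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
         sum((xi - xbar) ** 2 for xi in x)
b1_hat = ybar - xbar * b2_hat

# Any perturbation increases phi and decreases the log-likelihood:
worse = all(
    phi(b1_hat + d1, b2_hat + d2) > phi(b1_hat, b2_hat)
    and loglik(b1_hat + d1, b2_hat + d2) < loglik(b1_hat, b2_hat)
    for d1, d2 in [(0.1, 0.0), (-0.1, 0.0), (0.0, 0.1), (0.0, -0.1), (0.05, -0.05)]
)
print(worse)
```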