Binomial and Normal Distributions Used in Business Forecasting

Business Statistics and Applications Term Paper

Made by: Abhay Singh
Roll No. 50202
BBS I A


What are theoretical distributions?

Theoretical distributions offer a more scientific way of drawing inferences about population characteristics. In a population, the values of a variable may be distributed according to some definite probability law that can be expressed mathematically; the corresponding probability distribution is known as a theoretical probability distribution.

Such probability laws may be based on a priori considerations or on a posteriori inferences, i.e., on expectations formed on the basis of previous experience. Theoretical distributions also enable us to fit a mathematical model, a function of the form y = p(x), to the given data.

[Figure: classification of theoretical distributions into discrete and continuous probability distributions]

Binomial Distribution

In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial; when n = 1, the binomial distribution is a Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.

The binomial distribution is frequently used to model the number of successes in a sample of size n drawn with replacement from a population of size N. If the sampling is carried out without replacement, the draws are not independent, and the resulting distribution is a hypergeometric distribution, not a binomial one. However, for N much larger than n, the binomial distribution is a good approximation and is widely used.


[Figures: binomial probability mass function and cumulative distribution function]

Why is it important?

The binomial distribution is widely used to test statistical probabilities and significance, and is a good way of visually detecting unexpected values. It is a useful tool for determining permutations, combinations, and probabilities where the outcomes can be broken down into two probabilities (p and q) that are complementary (i.e., p + q = 1).

For example, tossing a coin has only two possible outcomes, heads or tails, and each outcome has a theoretical probability of 0.5. Using the binomial expansion, showing all possible outcomes and combinations, the probability is represented as follows:

(p + q)^2 = p^2 + 2pq + q^2, or more simply, pp + 2pq + qq

If p is heads and q is tails, the theory shows there is only one way to get two heads (pp), two ways to get a head and a tail (2pq), and one way to get two tails (qq).
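As a quick check, here is a minimal sketch of the coin example; Python with scipy is assumed here (the original paper contains no code):

```python
# Compare the terms of (p + q)^2 = p^2 + 2pq + q^2 for two fair coin flips
# with the binomial probability mass function from scipy.
from scipy.stats import binom

p = 0.5        # probability of heads
q = 1 - p      # probability of tails

print("two heads (pp):          ", p**2)       # 0.25
print("one head, one tail (2pq):", 2 * p * q)  # 0.50
print("two tails (qq):          ", q**2)       # 0.25

# The same probabilities from B(n=2, p=0.5), indexed by the number of heads k.
for k in (2, 1, 0):
    print(f"P(k={k} heads) =", binom.pmf(k, 2, p))
```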

Common uses of binomial distributions in business include quality control, public opinion surveys, medical research, and insurance problems. The distribution can be applied to complex processes such as sampling items on factory production lines, or to estimating the percentage failure rates of products and components.


Quick facts

To satisfy the requirements of a binomial distribution, the event being studied must display certain characteristics:

- The number of trials or occurrences is fixed.
- There are only two possible outcomes (heads/tails or win/lose, for example).
- All occurrences are independent of each other (tossing a head does not make it more or less likely you will get the same result next time).
- All trials have the same probability of success.

The binomial distribution applies exactly to samples drawn with replacement; for samples drawn without replacement, it remains a good approximation when the population size is at least 10 times the sample size.

To find probabilities from a binomial distribution, you can perform a manual calculation, use a binomial table or a computer spreadsheet, or use one of the online calculators available.

A single binomial trial is sometimes called a Bernoulli experiment or Bernoulli trial.

The binomial probability refers to the probability that a binomial experiment results in exactly x successes. In the example above, the binomial probability of getting exactly one head in two coin flips is 0.5.

A cumulative binomial probability refers to the probability that the binomial random variable falls within a specified range (for example, greater than or equal to a stated lower limit and less than or equal to a stated upper limit), as in the sketch below.
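A minimal sketch of a cumulative binomial probability, again assuming scipy; the range [4, 6] and the ten-flip setup are invented for illustration:

```python
# Cumulative binomial probability: P(lower <= X <= upper) for X ~ Binomial(n, p).
from scipy.stats import binom

n, p = 10, 0.5          # e.g. ten fair coin flips
lower, upper = 4, 6     # stated lower and upper limits (an assumed example)

# The CDF gives P(X <= x); subtracting isolates the range [lower, upper].
prob = binom.cdf(upper, n, p) - binom.cdf(lower - 1, n, p)
print(f"P({lower} <= X <= {upper}) = {prob:.4f}")   # about 0.6563
```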

Uses in Business

1. Quality Control

In statistical quality control, the p-chart is a type of control chart used to monitor the proportion of nonconforming units in a sample, where the sample proportion nonconforming is defined as the ratio of the number of nonconforming units to the sample size, n.

The p-chart accommodates only "pass"/"fail"-type inspection, as determined by one or more go/no-go gauges or tests, effectively applying the specifications to the data before they are plotted on the chart. Other types of control charts display the magnitude of the quality characteristic under study, making troubleshooting possible directly from those charts.

The binomial distribution is the basis for the p-chart and requires the following assumptions:

- The probability of nonconformity p is the same for each unit;
- Each unit is independent of its predecessors or successors;
- The inspection procedure is the same for each sample and is carried out consistently from sample to sample.

A sketch of the resulting control limits follows.
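This sketch uses the textbook 3-sigma p-chart limits, p̄ ± 3 * sqrt(p̄(1 − p̄)/n); the sample counts are invented for illustration:

```python
# p-chart control limits under the binomial assumptions listed above.
import math

nonconforming = [12, 15, 8, 10, 4, 7, 16, 9, 14, 10]  # defects per sample (hypothetical)
n = 500                                                # units inspected per sample

p_bar = sum(nonconforming) / (len(nonconforming) * n)  # average proportion nonconforming
sigma = math.sqrt(p_bar * (1 - p_bar) / n)

ucl = p_bar + 3 * sigma
lcl = max(0.0, p_bar - 3 * sigma)   # a proportion cannot be negative

print(f"center line = {p_bar:.4f}, UCL = {ucl:.4f}, LCL = {lcl:.4f}")
for i, d in enumerate(nonconforming, start=1):
    status = "in control" if lcl <= d / n <= ucl else "out of control"
    print(f"sample {i}: p = {d / n:.4f} ({status})")
```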


[Figure: a p-chart]

2. Public Opinion Surveys

When conducting a public opinion poll before an election, pollsters usually approximate the hypergeometric distribution with a binomial distribution and then use the normal approximation to calculate a 95% confidence interval:

p ± 1.96 * sqrt(p * (1 − p) / n)

where p is the sample proportion and n is the number of people in the poll. For example, if party A gets 30% of the votes and 2,000 people voted, the 95% confidence interval is:

0.30 ± 1.96 * sqrt(0.30 * 0.70 / 2000)

However, suppose the poll is taken within a defined population of, say, 5,000 people, of whom 3,000 are polled, with party A again getting 30% of the votes in the poll. To draw conclusions about the total population of 5,000, we should use the hypergeometric distribution: the sample is such a large fraction of the population that the draws are far from independent, and the binomial approximation cannot be used.
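A small sketch of the normal-approximation interval for the example above (30% support in a poll of 2,000 voters):

```python
# 95% confidence interval for a poll proportion via the normal approximation.
import math

p_hat = 0.30   # sample proportion for party A
n = 2000       # poll size

margin = 1.96 * math.sqrt(p_hat * (1 - p_hat) / n)
print(f"95% CI: {p_hat:.3f} +/- {margin:.3f}")
print(f"      = ({p_hat - margin:.3f}, {p_hat + margin:.3f})")
# margin is about 0.020, i.e. 30% plus or minus 2 percentage points
```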

3. Medical Research

A binomial distribution can be used to describe the number of times an event will occur in a group of patients, a series of clinical trials, or any other sequence of observations. The event is a binary variable: it either occurs or it doesn't. For example, when patients are treated with a new drug they are either cured or not; when a coin is flipped, the result is either a head or a tail. The binary outcome associated with each event is typically referred to as either a "success" or a "failure." In general, a binomial distribution is used to characterize the number of successes over a series of observations (or trials), where each observation is referred to as a "Bernoulli trial."

In a series of n Bernoulli trials, the binomial distribution can be used to calculate the probability of obtaining k successful outcomes.


If the variable X represents the total number of successes in n trials, it can only take on a value from 0 to n. The probability of obtaining k successes in n trials is calculated as follows:

P(X = k) = n! / [ k! * (n − k)! ] * p^k * (1 − p)^(n−k)

where 0 ≤ p ≤ 1 is the probability of success and n! = 1 * 2 * 3 * ... * (n − 2) * (n − 1) * n.

The formula assumes that the experiment consists of n identical trials that are independent of one another, and that there are only two possible outcomes for each trial (success or failure). The probability of success (p) is also assumed to be the same in each of the trials.

To further illustrate the application of the formula: if a drug was developed that cured 30 percent of all patients, and it was administered to ten patients, the probability that exactly four patients would be cured is

P(X = 4) = 10! / [ 4! * 6! ] * (0.3)^4 * (0.7)^6 = 210 * 0.0081 * 0.1176 ≈ 0.20

Like other distributions, the binomial distribution can be described in terms of a mean and a spread, or variance, of values. The mean of a binomial random variable X (i.e., the average number of successes in n trials) is obtained by multiplying the number of trials by p (np). In the above example, the average number of persons cured in any group of 10 patients would thus be 3.

The variance of a binomial distribution is np(1 − p). The variance is largest for p = 0.5 and decreases as p approaches 0 or 1. Intuitively this makes sense, since when p is very large or very small nearly all the outcomes take on the same value: for a drug that cured every patient, p would equal one, while for a drug that cured no one, p would equal zero. In contrast, if the drug were effective in curing only half of the population (p = 0.5), it would be more difficult to predict the outcome in any particular patient, and in this case the variability is relatively large. The sketch below checks these numbers.
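```python
# Check the drug example: n = 10 patients, cure probability p = 0.3 (scipy assumed).
from scipy.stats import binom

n, p = 10, 0.3

print("P(exactly 4 cured) =", binom.pmf(4, n, p))   # about 0.2001
print("mean np            =", binom.mean(n, p))     # 3.0
print("variance np(1-p)   =", binom.var(n, p))      # 2.1
```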

    In studies of public health, the binomial distribution is used when a researcher is

    interested in the occurrence of an event rather than in its magnitude. For

    instance, smoking cessation interventions may choose to focus on whether a

    smoker quit smoking altogether, rather than evaluate daily reductions in the

    number of cigarettes smoked. The binomial distribution plays an important role

    in statistics, as it is likely the most frequently used distribution to describe

    discrete data.

4. Insurance Sector

There is a similarity between insurance and games of chance, and understanding the concept of probability and its application to general insurance is therefore important. When we toss a coin, we say that the probability of getting a head is 1/2: there are two possibilities, head or tail, of which one (head) is favourable. This is the theoretical way of calculating probability, called a priori, i.e., prior to experience. In contrast, we cannot deduce theoretically the probability of a car being stolen within the year; for problems of this nature we need data about the total number of cars and the proportion that is stolen.

There is another way of looking at the probability of getting a head, which is more relevant for our purpose. If we go on tossing the coin and note the number of heads, then ideally, as the tossing continues indefinitely, the proportion of heads approaches 1/2. Every toss in effect generates experience. The fact that probability involves a long-run concept is important in the general insurance contract. Further, head and tail are mutually exclusive events; the idea of mutual exclusivity applies, for example, to the probability of an injured employee being male or female, injured or killed, or of damages being above or below a certain level.

If an event is certain to happen, its probability is one; if it is an absolute impossibility, its probability is zero. If the probability of a claim is one, i.e., a certainty, no insurance company will assume the risk, except perhaps by charging a premium greater than the sum insured. If the claim is an impossibility, i.e., the probability is zero, nobody would want to insure it. Between these two extremes lie the various risks that come up for insurance: the higher the probability of claims happening, the higher the premium should be. Probability thus attaches a numerical value to our measurement of the likelihood of an event occurring.

We shall now examine the law of large numbers and the concept of a probability distribution, and see how probability distributions help us estimate the number of claims that will be reported during a given future period and the size of those claims. The law of large numbers, in simple terms, means that the larger the data set, the more accurate the estimate: the larger the sample, the more accurate the estimates of the population parameters. In general insurance, this means that the more past data about claims we have, the better the predictions of claim frequency and claim size will be, on the assumption that claims will occur in the future as they have occurred in the past.

What is a probability distribution? It is the listing of all possible values of a random variable along with their associated probabilities. For our purpose, a theoretical probability distribution can be considered a mathematical model that describes the actual distribution of claims. Of course, the actual probabilities estimated from the available data will rarely coincide with those generated by the theoretical distribution, but the law of large numbers says they will tend closer and closer if we have a sufficiently large database. Even if the available data are not extensive, we can use various theoretical distributions to make meaningful inferences about the behaviour of data relating to a particular insurance portfolio. The fact that a theoretical distribution can be completely summarized by a small number of parameters is of great help.

The shape of a distribution is determined by its parameters, which are numerical characteristics of the population. If we have a set of data relating to, say, claim sizes, we cannot make the best use of them in raw form. We may be interested in the average size of a claim, for which we have a whole set of measures of central tendency. Similarly, to understand the significance of the data properly, it is essential to know the variability of the data around the central tendency; if the variance is too high, one may have to decide about the required reinsurance support. Yet another aspect of understanding a given set of data is skewness: a distribution may be symmetric, or it may be skewed, with a long tail to the right (positively skewed) or to the left (negatively skewed). Many of the distributions we encounter in general insurance are skewed with a long tail to the right. The measure of skewness is zero for a symmetric distribution, positive for a positively skewed distribution, and negative for a negatively skewed one. Again, a knowledge and measure of the flatness or peakedness of a distribution, called kurtosis, is important for us. Known properly, these aspects can help in better claims management.

Fortunately, there are theoretical distribution models that approximate the existing claims data for various risk categories, and the actuaries make use of these models; they provide methods of summarizing aspects of complexity. Some distributions are continuous in nature and may relate to claim-size distributions and the analysis of heterogeneity; the others relate to discrete variables and are hence helpful in studying the distribution of claim numbers.
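The long-run view of probability described above can be illustrated with a small simulation; this sketch is not from the paper, and the seed and toss counts are arbitrary:

```python
# Law of large numbers: the proportion of heads approaches 1/2 as the number
# of tosses grows. Results vary slightly from run to run.
import random

random.seed(42)
heads = 0
for tosses in range(1, 100_001):
    heads += random.random() < 0.5   # one simulated coin toss
    if tosses in (10, 100, 1_000, 10_000, 100_000):
        print(f"{tosses:>7} tosses: proportion of heads = {heads / tosses:.4f}")
```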

Normal Distribution

In probability theory, the normal (or Gaussian) distribution is a continuous probability distribution that is often used as a first approximation to describe real-valued random variables that tend to cluster around a single mean value. The graph of the associated probability density function is bell-shaped and is known as the Gaussian function or bell curve:

f(x) = 1 / (σ * sqrt(2π)) * e^( −(x − μ)^2 / (2σ^2) )

where the parameter μ is the mean or expectation (the location of the peak) and σ^2 is the variance, the mean of the squared deviation (a "measure" of the width of the distribution); σ is the standard deviation. The distribution with μ = 0 and σ^2 = 1 is called the standard normal.
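A small sketch evaluating the density and distribution functions with scipy (an assumed dependency):

```python
# Evaluate the normal density and CDF for a standard and a non-standard normal.
from scipy.stats import norm

# Standard normal: mu = 0, sigma = 1
print("pdf at x=0  :", norm.pdf(0))        # 1/sqrt(2*pi), about 0.3989
print("P(X <= 1.96):", norm.cdf(1.96))     # about 0.975

# A normal with mu = 20, sigma = 4 (the delivery-time example later on)
print("pdf at x=20 :", norm.pdf(20, loc=20, scale=4))
```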

    The normal distribution is considered the most prominent probability

    distribution in statistics. There are several reasons for this: First, the normal

    distribution is very tractable analytically, that is, a large number of results

    involving this distribution can be derived in explicit form. Second, the normal

    distribution arises as the outcome of the central limit theorem, which states

that under mild conditions the sum of a large number of random variables is

    distributed approximately normally. Finally, the "bell" shape of the normal

    distribution makes it a convenient choice for modelling a large variety of

    random variables encountered in practice.
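A quick illustration of the central limit theorem mentioned above, using sums of uniform random variables (the sample sizes are arbitrary):

```python
# Sums of many independent Uniform(0,1) variables are approximately normal.
import random
import statistics

random.seed(0)
sums = [sum(random.random() for _ in range(48)) for _ in range(10_000)]

# Each Uniform(0,1) has mean 0.5 and variance 1/12, so the sum of 48 of them
# should have mean 24 and standard deviation sqrt(48/12) = 2.
print("sample mean :", statistics.mean(sums))    # close to 24
print("sample stdev:", statistics.stdev(sums))   # close to 2
```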

    For this reason, the normal distribution is commonly encountered in practice,

    and is used throughout statistics, natural sciences, and social sciences as a

    simple model for complex phenomena. For example, the observational

    error in an experiment is usually assumed to follow a normal distribution, and

the propagation of uncertainty is computed using this assumption. Note that a normally distributed variable has a symmetric distribution about its mean.


    Quantities that grow exponentially, such as prices, incomes or populations,

    are often skewed to the right, and hence may be better described by other

    distributions, such as the log-normal distribution or Pareto distribution. In

    addition, the probability of seeing a normally-distributed value that is far (i.e.

    more than a few standard deviations) from the mean drops off extremely

rapidly. As a result, statistical inference using a normal distribution is not robust to the presence of outliers (data that are unexpectedly far from the mean, due to exceptional circumstances, observational error, etc.). When outliers are expected, data may be better described using a heavy-tailed distribution such as Student's t-distribution.

[Figures: normal probability density function and cumulative distribution function]

    Uses

1. Modern Portfolio Theory

Modern portfolio theory (MPT), or portfolio theory, was introduced by Harry Markowitz with his paper "Portfolio Selection," which appeared in the 1952 Journal of Finance. Thirty-eight years later, he shared a Nobel Prize with Merton Miller and William Sharpe for what has become a broad theory for portfolio selection.


Prior to Markowitz's work, investors focused on assessing the risks and rewards of individual securities in constructing their portfolios. Standard investment advice was to identify those securities that offered the best opportunities for gain with the least risk and then construct a portfolio from these. Following this advice, an investor might conclude that railroad stocks all offered good risk-reward characteristics and compile a portfolio entirely from these. Intuitively, this would be foolish. Markowitz formalized this intuition. Detailing a mathematics of diversification, he proposed that investors focus on selecting portfolios based on their overall risk-reward characteristics instead of merely compiling portfolios from securities that each individually have attractive risk-reward characteristics. In a nutshell, investors should select portfolios, not individual securities.

If we treat single-period returns for various securities as random variables, we can assign them expected values, standard deviations, and correlations. Based on these, we can calculate the expected return and volatility of any portfolio constructed from those securities, treating volatility and expected return as proxies for risk and reward. Out of the entire universe of possible portfolios, certain ones optimally balance risk and reward; these comprise what Markowitz called the efficient frontier of portfolios. An investor should select a portfolio that lies on the efficient frontier. The sketch below illustrates the calculation.
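```python
# Markowitz-style portfolio arithmetic: given expected returns, volatilities
# and correlations for individual securities, compute a portfolio's expected
# return and volatility. All numbers below are invented for illustration.
import numpy as np

mu = np.array([0.08, 0.12, 0.10])      # expected single-period returns
sigma = np.array([0.15, 0.25, 0.20])   # standard deviations
corr = np.array([[1.0, 0.3, 0.2],
                 [0.3, 1.0, 0.4],
                 [0.2, 0.4, 1.0]])     # correlation matrix

cov = np.outer(sigma, sigma) * corr    # covariance matrix
w = np.array([0.5, 0.2, 0.3])          # portfolio weights (sum to 1)

port_return = w @ mu                   # expected portfolio return
port_vol = np.sqrt(w @ cov @ w)        # portfolio volatility

print(f"expected return = {port_return:.4f}, volatility = {port_vol:.4f}")
```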

James Tobin (1958) expanded on Markowitz's work by adding a risk-free asset to the analysis. This made it possible to leverage or deleverage portfolios on the efficient frontier, which led to the notions of a super-efficient portfolio and the capital market line. Through leverage, portfolios on the capital market line are able to outperform portfolios on the efficient frontier.

Sharpe (1964) formalized the capital asset pricing model (CAPM). This

    makes strong assumptions that lead to interesting conclusions. Not only

    does the market portfolio sit on the efficient frontier, but it is actually

    Tobin's super-efficient portfolio. According to CAPM, all investors should

    hold the market portfolio, leveraged or de-leveraged with positions in the

    risk-free asset. CAPM also introduced beta and relates an asset's expected

    return to its beta.

    Portfolio theory provides a broad context for understanding the

interactions of systematic risk and reward. It has profoundly shaped how institutional portfolios are managed, and motivated the use of passive

    investment management techniques. The mathematics of portfolio theory

    is used extensively in financial risk management and was a theoretical

    precursor for today's value-at-risk measures.

2. Human Resource Management

A disadvantage shared by all employee-comparison systems is that of employee

    comparability. This has two aspects. The first has been mentioned: are the jobs

    sufficiently similar? The second is whether employees are rated on the same

    criteria. It is likely that one employee rates high for one reason and another rates

    low for an entirely different reason. Another disadvantage is that raters do not

    always have sufficient knowledge of the people being rated. Normally the

    immediate supervisor has this knowledge, but in large ranking systems,

    supervisors two and three levels removed often have to do the rating. The very

    size of units also poses a problem. The larger the number of employees to be

    ranked, the harder it is to do so; on the other hand, the larger the number in the

    group, the more logical it is that there is a normal distribution. This brings up one

    last problem. If the manager knows that some employees must be rated below

    average, he or she will start thinking of those employees that way. This leads to a

    self-fulfilling prophecy: the manager now treats them as if they cannot do well,

    and they respond by not doing well.

Forecasting

The most comprehensive methodology for comparing forecast data and demand data is to graph the cumulative results for a given period together with upper and lower control limits (calculated from historic demand, based on a normal distribution curve). The following example shows how easy it is to identify quickly when demand is in control and when it is out of control.

[Figure: cumulative revenue versus forecast with upper and lower control limits]


This graph shows a period of meeting the plan (Note A) and then a period when the actual revenues start to deviate from the forecast (Note B). At this point, the demand is still within the control parameters of "normal" demand, and a cautionary watch may be put in place, although no action is required. However, it is clear that at Note C the lower control limit has been exceeded and the "normal" expected demand is not being met. This is the time for action, and the process to gain a complete understanding of the error should be invoked. Supplemental charts will be necessary to analyze what is causing the deviation. These charts are similar to the one shown above, but with separate breakdowns by units, average selling price, product type, sales channel, customer, and sales agent.

A possible result is finding that the graph represents a normal trend in the business, with no corrective actions necessary. It is also possible that total revenue versus forecast may be in sync (Note A) while mismatches exist in the unit, average-selling-price, customer, or product-type mixtures. In each case, this information would not be known unless this type of analysis were available and had been in place long enough to reveal long-term trends. A minimal sketch of the control-limit comparison follows.
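This sketch is one possible reading of the method above: control limits around the cumulative forecast derived from the historic standard deviation of monthly demand, assuming normality and independent months. All figures are invented:

```python
# Cumulative actual demand versus cumulative forecast with ~95% control limits.
import math

forecast = [100, 100, 110, 110, 120, 120, 130, 130]   # monthly forecast
actual   = [102,  98, 108, 105, 112, 104, 110, 103]   # monthly actual demand
sigma    = 8.0   # historic standard deviation of monthly demand (assumed)

cum_f = cum_a = 0.0
for month, (f, a) in enumerate(zip(forecast, actual), start=1):
    cum_f += f
    cum_a += a
    # The std dev of a sum of independent months grows with sqrt(months).
    band = 1.96 * sigma * math.sqrt(month)
    status = "in control" if abs(cum_a - cum_f) <= band else "OUT OF CONTROL"
    print(f"month {month}: cum actual {cum_a:6.0f} vs forecast {cum_f:6.0f} "
          f"+/- {band:5.1f} -> {status}")
```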


Applications:

1. Using the Normal Distribution to Determine the Lower 10% Limit of Delivery Times

A pizza deliveryman's delivery time is normally distributed with a mean of 20 minutes and a standard deviation of 4 minutes. What delivery time will be beaten by only 10% of all deliveries?

Problem Parameter Outline

Population mean = μ ("mu") = 20 minutes
Population standard deviation = σ ("sigma") = 4 minutes
x = ?


Probability that (delivery time ≤ x) = 10% = 0.10
Delivery time is normally distributed.
The normal curve is not standardized (μ ≠ 0, σ ≠ 1).

Problem Solving Steps

We know that the delivery-time data are normally distributed and can therefore be mapped onto the normal curve. We are trying to determine the delivery time that is lower than 90% of all delivery times.

This corresponds to the x value at which 90% of the area under the normal curve lies to the right of x; this x value must therefore be in the left tail of the normal curve.

If 90% of the area under the normal curve is to the right of this x value, then 40% of the total area lies between this x value and the mean. The remaining 50% of the area makes up the half of the normal curve on the opposite (right) side of the mean.

Knowing that 40% of the area lies between this x value and the mean, we can use a z-score table to determine how many standard deviations this x value is from the mean: about 1.28 (more precisely, 1.2816).

Knowing how many standard deviations this x value is from the mean, we can calculate x as follows:

z = (x − μ) / σ
x = z * σ + μ
x = (−1.2816) * 4 + 20 ≈ 14.87

Answer: The delivery time of 14.87 minutes is faster (smaller) than 90% of all delivery times. The sketch below reproduces the calculation.
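```python
# Reproduce the delivery-time calculation with scipy (an assumed dependency):
# find the 10th percentile of Normal(mu=20, sigma=4).
from scipy.stats import norm

mu, sigma = 20, 4

z = norm.ppf(0.10)                  # about -1.2816 standard deviations
x = mu + z * sigma
print(f"z = {z:.4f}, x = {x:.2f} minutes")   # about 14.87 minutes

# Equivalently, in one step:
print(norm.ppf(0.10, loc=mu, scale=sigma))   # about 14.87
```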


2. Finding the Probability of a Certain Type of Package Passing Down a Conveyor Belt When the Probability of That Type of Package Is Known

Problem: A conveyor belt brings packages to a truck to be loaded. The packages are either black or white. The probability that a package is black is 40%. What is the probability that, out of the next 10 packages, at least 2 are black and 2 are white?

The event "at least two black and two white packages" fails to occur only if the number of black packages equals 0, 1, 9, or 10. The probability of at least 2 packages being black and 2 white therefore equals 1 minus the probability that the number of black packages equals 0, 1, 9, or 10.

Probability of success (black package) = p = 0.40
Probability of no success (white package) = q = 1 − p = 0.60
Number of trials (packages) = n = 10
Numbers of successes to exclude = k = 0, 1, 9, 10

This problem uses the binomial formulas because what is being measured is the number of successes in a fixed number of independent trials, each of which has only two possible outcomes.

Probability of at least 2 black and 2 white packages in 10 =
1 − [ P(X=0) + P(X=1) + P(X=9) + P(X=10) ]

P(X = k) = f(k; n, p) = n! / [ k! * (n − k)! ] * p^k * q^(n−k)

P(X = 0) = f(0; 10, 0.40) = 10! / [ 0! * (10 − 0)! ] * (0.40)^0 * (0.60)^10 ≈ 0.0060 = 0.60%

P(X = 1) = f(1; 10, 0.40) = 10! / [ 1! * (10 − 1)! ] * (0.40)^1 * (0.60)^9 ≈ 0.0403 = 4.03%

P(X = 9) = f(9; 10, 0.40) = 10! / [ 9! * (10 − 9)! ] * (0.40)^9 * (0.60)^1 ≈ 0.0016 = 0.16%

P(X = 10) = f(10; 10, 0.40) = 10! / [ 10! * (10 − 10)! ] * (0.40)^10 * (0.60)^0 ≈ 0.0001 = 0.01%

1 − [ 0.0060 + 0.0403 + 0.0016 + 0.0001 ] = 1 − 0.0480 = 0.952 = 95.2%

There is a 95.2% probability that at least 2 packages will be black and 2 packages will be white out of the next 10 packages, if the probability of a package being black is 40%. The sketch below confirms the result.
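```python
# Check the package calculation with scipy: X ~ Binomial(10, 0.4) is the
# number of black packages; the event fails when X is 0, 1, 9 or 10.
from scipy.stats import binom

n, p = 10, 0.40
fail = sum(binom.pmf(k, n, p) for k in (0, 1, 9, 10))
print(f"P(at least 2 black and 2 white) = {1 - fail:.4f}")   # about 0.9520
```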
