updated practice final exams solutions

Upload: reza-mohagi

Post on 17-Oct-2015



  • Practice Exam 1 Solutions

    Final Examination

    Directions: The exam will end 3 hours after it begins. The exam is divided into two parts. The first part is multiple choice. Please answer the multiple choice questions on the exam by circling the best answer (some rounding occurs in several places). The second part of the exam consists of several problems. Please answer these problems in the space provided on the exam (you may use the backs of the sheets if necessary). You will get partial credit for these problems provided that your answers are organized and legible so that your train of thought can be easily followed. All answers must also be transferred to the answer sheet to be fully counted.

    Good Luck

    DON'T EVEN THINK ABOUT PANICKING

    By printing my name below I acknowledge that Harvard has an honor code and that I will adhere to it. Failure to abide by the honor code could result in failing this course and having to wash Professor Parzen's car with my toothbrush. NAME: ______________________________________________ (-50 if not printed)

  • Multiple Choice (3 points each)

    1) A hypothesis test is used to prevent a machine from under-filling or overfilling quart bottles of beer. On the basis of a sample, the null hypothesis is rejected and the machine is shut down for inspection. A thorough examination reveals there is nothing wrong with the filling machine. From a statistical point of view:

    a. A correct decision was made. b. A Type I and Type II error were made. c. A Type I error was made. d. A Type II error was made.

    2) The median waiting time for patients to see a doctor at a local emergency room is much smaller than the mean waiting time. Which of the following is most consistent with this information (circle one):

    a. A histogram of the waiting times would be symmetric. b. A histogram of the waiting times would be left-skewed. c. A histogram of the waiting times would be right-skewed.

    3) A student is studying very hard in a fluid dynamics course, but he knows he will either pass or not pass. Suppose this student, Jack Daniels, has a probability of 0.90 for studying the night before the exam. Also, he has a probability of 0.75 for passing the exam. If the probability of passing the exam, given that he studied the night before, is 0.82, what may you conclude?

    a. The probability of Jack not passing the exam is 0.10. b. P(Jack studies OR Jack passes) is greater than 0.75. c. P(Jack does not pass AND Jack studies) is greater than 0.5. d. P(Jack passes AND Jack studies) is greater than 0.75. e. None of the above

    4) Suppose a computer processor yielded the following random sample of binary digits:

    0101001101010100010000010101010001011010 01010010010101010100110001011010

    Is the computer processor yielding an even distribution of ones and zeros? If the above sample contains 72 digits, of which thirty are ones, what is the value of the test statistic to answer this question?

    a. t = -1.41 b. t = -1.65 c. t = 1.65 d. t = 1.41 e. None of the above
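    As a check on question 4, the statistic for testing an even split (H0: p = 0.5) can be computed directly. This is a minimal sketch using the usual large-sample one-proportion formula; the variable names are illustrative, and although the answer choices label the statistic t, the calculation shown is the standard z-type statistic.

```python
import math

n = 72       # digits in the sample (from the question)
ones = 30    # number of ones (from the question)
p0 = 0.5     # hypothesized proportion under an even split

p_hat = ones / n
std_err = math.sqrt(p0 * (1 - p0) / n)   # standard error under H0
z = (p_hat - p0) / std_err

print(round(z, 2))
```

    The value rounds to -1.41, which matches choice (a).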

  • 5) If Steve and Doug Butabi want to find the proportion of people who believe they can move their heads graciously, how large of a sample size is required so that the margin of error is at most 2 percentage points with a 95% confidence level?

    a. 3382 b. 4148 c. 34 d. 266 e. None of the above
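    The sample size in question 5 follows from the margin-of-error formula for a proportion. The sketch below assumes the conservative planning value p = 0.5, since the question gives no prior estimate, and uses the critical value 1.96 for 95% confidence; both variable names and the rounding guard are illustrative.

```python
import math

z_star = 1.96     # critical value for 95% confidence
margin = 0.02     # desired margin of error (2 percentage points)
p_guess = 0.5     # assumption: worst-case proportion, no prior estimate given

n_exact = z_star**2 * p_guess * (1 - p_guess) / margin**2
# Round before taking the ceiling so floating-point noise cannot add a spurious unit.
n_required = math.ceil(round(n_exact, 6))

print(n_required)
```

    Under these assumptions the required sample size is 2401, which is not among choices (a) through (d).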

    6) In the Land of Chocolate, three friends (Hershey, Nougat, and B.C.) all have to make tough decisions about the next phase of their life where they have ONLY two options: going to college or working in the nearby sugar mines of Dos Catorce. Suppose the following probabilities are true: P(Hershey goes to college) = 0.2 = P(Nougat works at the sugar mines), P(B.C. goes to college) = 0.7. If each of them makes their choice independent of the others, what is the probability of Hershey and Nougat going to college while B.C. works at the sugar mines?

    a. 0.112 b. 0.048 c. 0.028 d. None of the above

    7) The following diagram comes from a famous piece of music.

    A random sample of 81 people indicated that 19 people knew the piece of music from the diagram alone. A random sample of 200 people (independent from the first) indicated that 175 people knew the piece of music when it was played on a piano. Construct a 95% confidence interval for the population proportion of people who know the piece of music from the diagram alone.

    a) (.14,.33) b) (.23,.56) c) (.12,.23) d) (.32,.44) e) None of the above
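    The interval in question 7 uses only the first sample (19 of 81). A minimal sketch of the standard large-sample interval for a proportion, with illustrative variable names:

```python
import math

successes, n = 19, 81    # knew the piece from the diagram alone
z_star = 1.96            # critical value for 95% confidence

p_hat = successes / n
half_width = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - half_width, p_hat + half_width

print(round(lower, 2), round(upper, 2))
```

    The endpoints round to (.14, .33), choice (a). The second sample of 200 people is not needed for this interval.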

  • Below you are given the graphs of two normal density curves, both with the same mean. Use these density curves to answer the following questions:

    8) The area under each of these curves is equal to 1.

    a) True b) False

    9) Which curve has the larger standard deviation?

    a) Graph A b) Graph B

    10) Which distribution has a smaller percent of its data between 35 and 40 units?

    a) Graph A b) Graph B

    11) Give a rough estimate of the standard deviation of the density curve in Graph B

    a) 5 b) 10 c) 15 d) 20 e) None of the above

  • 12) An instructor gives the same y versus x data as given below to four students.

    They each come up with four different answers for the straight line regression model. Only one is correct. The correct model is

    a. y = 60x − 1200 b. y = 30x − 200 c. y = 139.43 + 29.684x d. y = 1 + 22.782x

    13) A scientist finds that regressing the y versus x data given below results in the coefficient of determination for the straight-line regression model to be one.

    The missing value for y at x = 17 most nearly is

    a. -2.444 b. 2.000 c. 6.889 d. 34.00

    14) Suppose Z is a standard normal random variable. Then what is the probability that X=1+2Z will be less than 3?

    a) .1587 b) .3413 c) .8413 d) .0013 e) None of the Above.

  • 15) Suppose that X is a binomial random variable with n=3, p=.22. What is the probability that X will take the value 2?

    a) .886744 b) .113256 c) .037752 d) .962248 e) None of the Above.
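    Question 15 is a direct application of the binomial probability formula. A short sketch follows; the helper name `binomial_pmf` is illustrative and not from the exam.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

prob_two = binomial_pmf(2, 3, 0.22)
print(round(prob_two, 6))
```

    This gives 3 × 0.22² × 0.78 = 0.113256, choice (b).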

    16) Consider a game where you win $9 with probability .1 and lose $1 with probability .9. What is your expected profit for this game?

    a) $1 b) $1 c) 0 d) $2 e) None of the Above.

    17) Suppose that X is a random variable taking the value 5 with probability .4, and taking the value 5 with probability .6. What is the standard deviation of X?

    a) 10 b) 4 c) 24 d) 24 e) None of the Above.

    18) Suppose that X is a random variable with a binomial distribution. If the expected value of X is 50 and the variance of X is 25 then the distribution of X must be:

    a) Symmetric b) Skewed Right c) Skewed Left d) It cannot be determined from the given information.

  • 19) If two random variables X and Y have a negative covariance, then:

    a) high values of x tend to be associated with high values of y and low values of x tend to be associated with low values of y.

    b) high values of x tend to be associated with low values of y and low values of x tend to go with high values of y.

    c) negative values of x tend to go with negative y values, and vice versa. d) the expected value of x times y is less than zero.

    20) A management-consultant firm uses a regression model where X1 stands for previous experience, X2 for number of years at current job, and X3 for score on a job-aptitude test. These variables are used in a regression model to predict job satisfaction. Job satisfaction ranges from 1 to 20, with 20 indicating that an employee is satisfied with every aspect of his or her job. The prediction equation is Yhat = 1.7 − 0.15 X1 + 0.25 X2 + 0.14 X3. What would the consulting firm predict for the job satisfaction of an employee who has 15 years of prior experience, 10 years of employment at the present job, and an aptitude test score of 85?

    a. 14.83 b. 13.85 c. 17.79 d. 15.12 e. None of the above
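    The prediction in question 20 is a plug-in calculation. The sketch below assumes the coefficient on X1 is −0.15, the reading under which the result lands on one of the listed choices; the dictionary names are illustrative.

```python
# Assumption: the X1 coefficient is negative (-0.15).
coefficients = {"intercept": 1.7, "x1": -0.15, "x2": 0.25, "x3": 0.14}
employee = {"x1": 15, "x2": 10, "x3": 85}   # experience, years at job, aptitude score

y_hat = coefficients["intercept"] + sum(
    coefficients[name] * value for name, value in employee.items()
)
print(round(y_hat, 2))
```

    Under that assumption the predicted satisfaction is 13.85, choice (b).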

  • 21) The average cost of tuition, room and board at small private liberal arts colleges is reported to be $8,500 per term, but a financial administrator believes that the average cost is higher. A study was conducted using a sample of 150 small liberal arts colleges. The computer output below was obtained. Let α = 0.05.

    Hypothesis test results: μ : population mean H0 : μ = 8500 HA : μ > 8500

    Based on the output, the conclusion should be

    a. the true average cost is higher than $8,500. b. the true average cost is lower than $8,500. c. the true average cost is equal to $8,500. d. the true average cost is equal to $8,708.90.

    22) In developing a confidence interval for a population mean, a sample size of 40 observations was used. The CI was 17.25 ± 2.42. Had the sample size been 160 instead of 40, the CI would have been

    a) 17.25 ± 1.68 b) 17.25 ± 1.21 c) 69.00 ± 9.68 d) 17.25 ± 9.68
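    Question 22 rests on the fact that the margin of error shrinks with the square root of the sample size. A minimal sketch, assuming the sample mean, standard deviation and critical value stay the same when n changes:

```python
import math

old_margin = 2.42
old_n, new_n = 40, 160

# Margin of error is proportional to 1 / sqrt(n), all else unchanged.
new_margin = old_margin * math.sqrt(old_n / new_n)
print(round(new_margin, 2))
```

    Quadrupling the sample size halves the margin to 1.21, so the interval becomes 17.25 ± 1.21, choice (b).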

    23) After fitting a regression model, if the sum of the residuals equals 0 (Σ e_i = 0, summing over i = 1, …, n):

    (a) The model is fitting well. (b) We have reason to doubt the normality assumption. (c) It tells us nothing. (d) The slope parameter must be zero.

    Computer output for question 21:

    Mean   Sample Mean   Std. Err.   DF    T-Stat      P-value
    μ      8708.9        96.36292    149   2.1678462   0.0159

  • 24) If the coefficient of determination R² = 100%, then

    (a) none of the variability in the observations is explained by the model fit. (b) all observations fall on the fitted line exactly. (c) the model is not true. (d) a quadratic model would fit the data better.

    25) In testing the hypothesis Ho : μ = 75 vs Ha : μ ≠ 75, the following information is known: n = 64, x̄ = 72, and s = 10. The computed test statistic is equal to

    a) 1.96 b) 2.4 c) -2.4 d) -1.96

    26) Suppose the 95% confidence interval for the true population proportion p is (0.36, 0.54). Based on this confidence interval alone, in which of the following set(s) of hypotheses would the null hypothesis be rejected (at the 0.05 significance level)?

    a) Ho : p = 0.3 versus Ha : p ≠ 0.3 b) Ho : p = 0.4 versus Ha : p ≠ 0.4 c) Ho : p = 0.5 versus Ha : p ≠ 0.5 d) All of the above.

    27) Your boss asks you to calculate a 99% confidence interval instead of a 90% confidence interval. What is an advantage and a disadvantage of this action?

    a) The advantage is higher confidence. The disadvantage is a wider interval. b) The advantage is higher confidence. The disadvantage is a narrower interval. c) The advantage is lower confidence. The disadvantage is a wider interval. d) The advantage is lower confidence. The disadvantage is a narrower interval.

  • Based on a random sample of 1000 high school students, 280 of them said they are current smokers. The 90% confidence interval for the true proportion of all high school students that are current smokers is (0.26, 0.30).

    28) Does the sample proportion lie in the interval (0.26, 0.30)?

    a) Yes b) No c) Can't tell

    29) Does the population proportion lie in the interval (0.26, 0.30)?

    a) Yes b) No c) Can't tell

    30) If we use a 95% confidence level instead of a 90% confidence level, will the confidence interval calculation from the same data produce an interval narrower than (0.26, 0.30)?

    a) Yes b) No c) Can't tell

    31) Will the sample proportion for a future sample of 1000 high school students lie in the interval (0.26, 0.30)?

    a) Yes b) No c) Can't tell

  • The scatterplot below displays information for 50 states for the year 2000 with regard to the variables: M.D.s per 100,000 which represents the number of doctors per 100,000 residents and Percent Poverty, the percentage of the population considered to be living in poverty. The R-sq value is 5.6% and the least squares regression line is

    M.D.s per 100,000 = 279.3 − 4.175 (Percent Poverty)

    32) Which of these options better interprets the value of the slope?

    a) For each additional percent in poverty the estimated number of M.D.s per 100,000 goes down by 4.175 on the average.

    b) For each additional percent in poverty the estimated number of M.D.s per 100,000 goes up by 4.175 on the average.

    c) For each additional M.D. the estimated percent in poverty goes down by 4.175% on the average.

    d) For each additional M.D. the estimated percent in poverty goes up by 4.175% on the average.

    e) For every 1 M.D. the percent in poverty goes down by 4.175%.

  • 33) In the year 2000 the percent in poverty for Tennessee was 13.4. According to the model (or regression equation), how many doctors would we have expected per 100,000 people?

    a) About 279 b) About 275 c) About 223 d) About 335 e) About 250
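    The prediction for Tennessee in question 33 is a plug-in into the fitted line. The sketch below takes the slope as −4.175, an assumption consistent with the reported intercept of 279.3 and the listed choices; the function name is illustrative.

```python
def predicted_mds_per_100k(percent_poverty):
    """Fitted line from the question, assuming a negative slope of -4.175."""
    return 279.3 - 4.175 * percent_poverty

tennessee = predicted_mds_per_100k(13.4)
print(round(tennessee, 1))
```

    Under that assumption the model predicts about 223 doctors per 100,000 residents, choice (c).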

    34) Which of these statements is the best interpretation of R-sq in this example?

    a) 5.6% of the people living in poverty have enough M.D.s b) Only 5.6% of the variability in the number of M.D.s per 100,000 is explained by the percent of the population living in poverty. c) Only 5.6% of the M.D.s live in poverty. d) Only 5.6% of the people living in poverty have no M.D.s

    35) In the year 2000 the District of Columbia had 23.5% in poverty and 702 M.D.s per 100,000. If this data point was added to the scatterplot, it would be

    a) a residual. b) negatively correlated with the data. c) an outlier and influential observation. d) a weak influence on the least-squares regression line. e) a lurking variable.

  • A person's muscle mass is expected to decrease with age. To explore this relationship in women, a nutritionist randomly selected 15 women from each 10-year age group, beginning with age 40 and ending with age 79. The observations and least-squares regression line appear in the scatterplot and the R-sq value is 75%.

    36) Which of the following statements is the most accurate ?

    (A) For each additional year of age the estimated mean muscle mass increases and decreases. (B) The relationship between age and muscle mass is weak because the correlation is negative. Higher muscle mass goes with both lower and higher age. (C) The scatterplot shows a negative direction, with higher muscle mass going with lower age. The plot is generally straight with a moderate amount of scatter. (D) The relationship between age and muscle mass is weak because R-sq=75% is a small number compared to the intercept of 156.35. (E) The correlation between age and muscle mass turns out to be -0.866. This is an indication that age is causing muscle mass to decrease with time.

  • 37) Which is the most appropriate statement regarding the interpretation of the intercept?

    (A) For each additional year of age the estimated mean muscle mass decreases by approximately 1.19 MMIs. (B) The average muscle mass is 156.35 MMI for women at age 0. (C) The minimum muscle mass is 156.35 MMI. (D) For each additional year of age muscle mass decreases by approximately 156.35 MMIs. (E) We cannot interpret the intercept here since it does not make sense that a newborn female child would have a muscle mass index of 156.35.

    38) The following probability density curve represents waiting times at a customer service counter at a national department store. The mean waiting time is 5 minutes with standard deviation 5 minutes. If we took all possible samples of size n=100, how would you describe the sampling distribution of the resulting sample means?

    (A) Shape = right skewed, mean = 5, standard deviation = 5 (B) Shape = same as above graph, mean = 5, standard deviation = 0.5 (C) Shape = approximately normal, mean = 5, standard deviation = 0.5 (D) Shape = approximately normal, mean = 5, standard deviation = 5 (E) Shape = binomial, n =100; p = .05

  • Hy-Vee Inc. collected data to measure the impact of television advertising on the price which customers expect to pay for a deluxe pre-packaged dinner sold in Hy-Vee grocery stores. For each local TV market, Hy-Vee determined two marketing inputs:

    x1 = Number of one-week TV promotions x2 = Advertised discount (in percent) for price of the dinner

    In particular, Hy-Vee used x1 = 1, 3, 5, and 7 promotions in combination with x2 = 10%, 20%, 30%, and 40% discounts. Hy-Vee advertised in 10 local TV markets for each of the (4x4) = 16 combinations of x1 and x2, for a total of 160 markets. Hy-Vee also conducted a post-advertising customer survey in each market to measure

    y = Expected price for the dinner, in dollars Here is the resulting computer output:

    39) Which of the following conclusions is supported by the output?

    (a) Promotions is linearly related to Price. (b) Discount is linearly related to Price, after accounting for Promotions. (c) The regression assumptions are satisfied. (d) Neither Promotions nor Discount is linearly related to Price. (e) The price of beer is likely to fall now that the national elections are over.

  • 40) Interpret the slope for Promotions.

    (a) Promotions decrease on average by 0.102 for each one-dollar increase in expected price. (b) Expected price decreases on average by $0.102 for each additional promotion. (c) Promotions decrease on average by 0.102 for each one-dollar increase in expected price, when discount is held constant. (d) Expected price decreases on average by $0.102 for each additional promotion, when discount is held constant. (e) Expected price decreases on average by $0.0174 for each additional promotion, when discount is held constant.

    41) Suppose that Hy-Vee plans an ad campaign which features two promotions of a 35% price discount in each local market. Estimate the mean expected price with 95% certainty.

    (a) $4.31 (b) ($4.24, $4.37) (c) ($3.78, $4.84) (d) Stop! I'm too tired to calculate this.

    42) Suppose that the goal of the ad campaign described in the previous question is for customers to expect the price to be at most $4.35, on average. Which modification to the ad campaign should be recommended to help Hy-Vee achieve its goal?

    (a) Feature a 10% price discount instead of a 35% discount. (b) Feature a 30% price discount instead of a 35% discount. (c) Run seven promotions in each market. (d) Run a single promotion in each market. (e) None of the modifications is recommended.

  • Short Answers

    1) (9 points) In a recent study, 928 women were asked about their smoking habits during pregnancy and then again five years later. The data are summarized in the table below.

    a) What is the approximate probability that a randomly chosen woman smoked 5 years after pregnancy?

    (230+95)/928

    b) If a randomly selected woman smoked during pregnancy, what is the probability that she smoked 5 years after pregnancy?

    P(5 yrs later|smoked during) = 230/271

    c) Are the events Smoking during Pregnancy and Smoking Five Years Later independent or dependent? Explain.

    Dependent. The conditional probability in (b), 230/271, does not equal the unconditional probability in (a), (230+95)/928.
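    The independence check in part (c) can be made concrete with the counts quoted in the worked answers. This is a sketch only: the full table is not reproduced here, so the counts are taken from the fractions in parts (a) and (b), and the variable names are illustrative.

```python
total_women = 928
smoked_later = 230 + 95    # numerator from part (a)
smoked_during = 271        # denominator from part (b)
smoked_both = 230          # numerator from part (b)

p_later = smoked_later / total_women
p_later_given_during = smoked_both / smoked_during

# Independence would require these two probabilities to be equal.
independent = abs(p_later - p_later_given_during) < 1e-9
print(round(p_later, 3), round(p_later_given_during, 3), independent)
```

    The unconditional probability is about 0.350 and the conditional probability about 0.849, so the two events are dependent.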

  • 2) (21 points) Consider the following multiple regression computer output and then answer the questions on the following pages.

    Multiple linear regression results Dependent Variable: var13 Independent Variable(s): var1, var2, var3, var4, var5, var6, var7, var8, var9, var10, var11, var12 Parameter estimates:

    Analysis of variance table for multiple regression model:

    Root MSE (also called se ) 0.32969475 R-squared (adjusted): 0.2242

    Variable    Estimate        Std. Err.       Tstat          P-value
    Intercept   -0.057636276    0.048596717     -1.1860118     0.236
    var1        -0.001675507    0.028293129     -0.059219576   0.9528
    var2        4.140445E-5     1.442341E-4     0.28706425     0.7741
    var3        0.0025728503    0.0026423088    0.973713       0.3305
    var4        0.0386679       0.016722031     2.3123925      0.021
    var5        -0.002243308    0.0021889468    -1.0248344     0.3058
    var6        0.029505625     0.02173865      1.3572887      0.1751
    var7        0.06405778      0.027510468     2.3284876      0.0202
    var8        0.088917315     0.021120988     4.2099032

  • a) Which variable is the most important variable in the model ?

    Var8 since it has the lowest p-value

    b) Which variable would be removed first when performing a backwards stepwise regression ?

    Var1 since it has the highest p-value.

    c) What would happen to the value of R-sq when you remove the variable in part (b) (circle one answer)

    Go Up Go Down

    d) What is the coefficient of determination (R-sq) for the full model ?

    R-sq = SSR/SST = 1 − (SSE/SST) = 1 − (77.067/101.02) ≈ 0.237

    e) Compute a 95% confidence interval for var10.

    -0.07517062 +/- 1.96*(0.02721936)
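    The interval in part (e) can be evaluated numerically. The sketch uses the estimate and standard error quoted in the worked answer and the critical value 1.96 that the answer itself uses; variable names are illustrative.

```python
estimate = -0.07517062    # var10 coefficient, from the worked answer
std_err = 0.02721936      # its standard error, from the worked answer
z_star = 1.96             # critical value used in the worked answer

half_width = z_star * std_err
lower, upper = estimate - half_width, estimate + half_width
print(round(lower, 4), round(upper, 4))
```

    The interval is roughly (-0.1285, -0.0218), which excludes zero.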

    f) Do we need var2 in the model ? Explain.

    No, the p-value is above .05

    g) Test the null hypothesis that var8 equals 0.1

    Ho: var8 = 0.1
    Ha: var8 ≠ 0.1
    t = (0.088917315 − 0.1)/0.021120988 = −0.5247
    Since |t|

  • Practice Exam 2 Solutions Final Examination

    Directions The exam will end 3 hours after it begins. The exam is divided into three parts. The first and second parts are true-false and multiple choice, respectively. Please answer the true-false and multiple choice questions on the exam by circling the best answer. There will be some partial credit for the multiple-choice questions as long as some credible work is shown. The third part of the exam consists of several problems. Please answer these problems in the space provided on the exam (you may use the backs of the sheets if necessary). You will get partial credit for these problems provided that your answers are organized and legible so that your train of thought can be easily followed.

    Unless stated, all confidence intervals and hypothesis test should be calculated at the 95% confidence level (use 1.96). A note on re-grade requests: Only written requests will be considered. Clerical errors will be changed without question, but other inquiries will result in a re-grade of the entire exam.

    GOOD LUCK By signing my name here I acknowledge that the GBS has an honor code and I will abide by it. ___________________________________________ NAME (PLEASE PRINT) : _______________________________________________________________ (-100 if not printed)

  • Multiple Choice (5 points each)

    1) Consider the following sample data: 25 11 6 4 2 17 9 6

    For these data the median is:

    a. 7.5 b. 3.5 c. 10. d. None of the above.

    2) The owner of a fish market has an assistant who has determined that the weights of catfish are normally distributed, with mean of 3.2 pounds and standard deviation of 0.8 pound. If a sample of 64 fish yields a mean of 3.4 pounds, what is probability of obtaining a sample mean this large or larger?

    a) 0.0001 b) 0.0013 c) 0.0228 d) 0.4987
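    Question 2 asks for an upper-tail probability of the sample mean. A minimal sketch using the standard library's normal distribution, with illustrative variable names:

```python
import math
from statistics import NormalDist

mu, sigma, n = 3.2, 0.8, 64    # population mean, standard deviation, sample size
sample_mean = 3.4

std_err = sigma / math.sqrt(n)             # standard error of the sample mean
z = (sample_mean - mu) / std_err
upper_tail = 1 - NormalDist().cdf(z)       # P(sample mean >= 3.4)

print(round(z, 2), round(upper_tail, 4))
```

    The z-score is 2 and the tail probability is about 0.0228, choice (c).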

    3) In the construction of confidence intervals, if all other quantities are unchanged, an increase in the sample size will lead to a ______ interval.

    a) narrower b) wider c) less significant d) biased

    4) A major department store chain is interested in estimating the average amount its credit card customers spent on their first visit to the chain's new store in the mall. Fifteen credit card accounts were randomly sampled and analyzed with the following results: X̄ = $50.50 and s² = 400. Construct a 95% confidence interval for the average amount its credit card customers spent on their first visit to the chain's new store in the mall assuming that the amount spent follows a normal distribution.

    a) $50.50 ± $9.09 b) $50.50 ± $10.12 c) $50.50 ± $11.00 d) $50.50 ± $11.08

  • 5) In the annual report, a major food chain stated that the distribution of daily sales at their Detroit stores is known to be bell-shaped, and that 95 percent of all daily sales fell between $19,200 and $36,400. Based on this information, what were the mean sales?

    a. Around $20,000 b. Close to $30,000 c. Approximately $27,800 d. Can't be determined without more information.

    6) For some positive value of X, the probability that a standard normal variable is between 0 and +2X is 0.1255. The value of X is

    a) 0.99 b) 0.40 c) 0.32 d) 0.16 e) None of the above

    7) If we know that the length of time it takes a college student to find a parking spot in the library parking lot follows a normal distribution with a mean of 3.5 minutes and a standard deviation of 1 minute, find the probability that a randomly selected college student will find a parking spot in the library parking lot in less than 3 minutes.

    a) 0.3551 b) 0.3085 c) 0.2674 d) 0.1915 e) None of the above

    8) The Central Limit Theorem is important in statistics because

    a) for a large n, it says the population is approximately normal. b) for any population, it says the sampling distribution of the sample mean is approximately normal, regardless of the sample size. c) for a large n, it says the sampling distribution of the sample mean is approximately normal, regardless of the shape of the population. d) for any sized sample, it says the sampling distribution of the sample mean is approximately normal.

  • 9) It is believed that the number of people who attend a Mardi Gras parade each year depends on the temperature that day. A regression has been conducted on a sample of years where the temperature ranged from 28 to 64 degrees and the number of people attending ranged from 8400 to 14,600. The regression equation was found to be ŷ = 2378 + 191x. Which of the following is true?

    a. The average change in parade attendance is an additional 2378 people per one degree increase in temperature.

    b. The average change in parade attendance is an additional 191 people per one degree increase in temperature.

    c. If the temperature is 75 degrees, we can expect that 16,703 people will attend. d. If the temperature is 0 degrees this year, then we should expect 2378 people to attend

    10) In analyzing the residuals to determine whether the simple regression analysis satisfies the regression assumptions, which of the following is the best answer?

    a. The histogram of the residuals should be approximately bell shaped

    b. The scatter plot of the residuals against the dependent variable should illustrate that the variation in residuals is the same over all levels of y (should have no patterns).

    c. Neither a nor b are true

    d. Both a and b are true

    11) Assume that after running a regression you have calculated a prediction of ŷ = 110. Also assume that n = 201 and that s = 4.5. Find the approximate 95% prediction interval.

    a. About 101 to 119

    b. About 109.4 to 110.6

    c. About 105.5 to 104.5

    d. About 98.4 to 121.6

  • 12) Residual analysis is conducted to check whether regression assumptions are met. Which of the following is not an assumption made in simple linear regression?

    a. Errors are independent of each other

    b. Errors are normally distributed

    c. Errors are linearly related to x

    d. Errors have constant variance

    13) The following regression output was generated based on a sample of utility customers. The dependent variable was the dollar amount of the monthly bill and the independent variable was the size of the house in square feet.

    Based on this regression output, which of the following statements is not true?

    a. The number of square feet in the house explains only about 2 percent of the variation in the monthly power bill

    b. At the usual alpha level equal to 0.05, there is no basis for rejecting the hypothesis that the slope coefficient is equal to zero

    c. The average increase in the monthly power bill is about 66.4 for each additional square foot of space in the house

    d. The total number of observations is 30.

  • 14) In an effort to estimate the mean dollars spent per visit by customers of a food store, the manager has selected a random sample of 100 cash register receipts. The mean of these was $45.67 with a sample standard deviation equal to $12.30. Assuming that he wants to develop a 95 percent confidence interval estimate, the upper limit of the confidence interval estimate is:

    a. about $2.02. b. approximately $65.90. c. about $48.08 d. None of the above.

    15) A random sample of 340 people in Chicago showed that 66 listened to WJKT 1450, a radio station in South Chicago Heights. Based on this sample information, what is the point estimate for the proportion of people in Chicago that listen to WJKT 1450?

    a. 0.231 b. 0.194 c. 0.51 d. 66 e. None of the above

    16) The finishing process on new furniture leaves slight blemishes. The table below displays a manager's probability assessment of the number of blemishes in the finish of new furniture.

    Number of Blemishes   0      1      2      3      4      5
    Probability           0.34   0.25   0.19   0.11   0.07   0.04

    On average, how many defects would we expect on a piece of furniture?

    A) 0.28 B) 0.85 C) 1.44 D) 0.77 E) None of the above
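    The expected number of blemishes in question 16 is the probability-weighted average of the table values. A short sketch, with an illustrative dictionary name:

```python
# Probability assessment from the table: blemish count -> probability.
blemish_probs = {0: 0.34, 1: 0.25, 2: 0.19, 3: 0.11, 4: 0.07, 5: 0.04}

expected_blemishes = sum(count * prob for count, prob in blemish_probs.items())
print(round(expected_blemishes, 2))
```

    The probabilities sum to 1 and the expected value is 1.44, choice (C).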

    17) In a recent survey, 70% of human resource directors thought that it was very important for business students to take a course in business ethics. For a sample of 12 human resource directors, what is the probability that at least one of them does not think it very important for business students to take a business ethics course?

    A) 0.9833 B) 0.9521 C) 0.9862 D) 0.9714 E) None of the above
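    Question 17 is easiest through the complement: "at least one does not" is the opposite of "all twelve do". The sketch assumes the twelve directors respond independently, which the binomial setup implies; variable names are illustrative.

```python
p_important = 0.70    # a director thinks the course is very important
n_directors = 12

# Complement rule: 1 - P(all twelve think it is very important).
p_at_least_one_not = 1 - p_important**n_directors
print(round(p_at_least_one_not, 4))
```

    The result is about 0.9862, choice (C).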

  • 18) Which of the following statements regarding a binomial experiment is false, where n is the number of trials, and p is the probability of success in each trial?

    A) The n trials are independent. B) The standard deviation is np(1 - p). C) The mean is np. D) There are only two possible outcomes.

    19) Woof Chow Dog Food Company believes that it has a market share of 25%. They survey n = 100 dog owners and ask whether or not Woof Chow is their regular brand of dog food, and 23 people say yes. Based upon this information, what is the value of the test statistic?

    a. -0.462 b. -0.475 c. 0.462 d. 0.475 e. None of the above

    20) A company that makes shampoo wants to test whether the average amount of shampoo per bottle is 16 ounces. The standard deviation is known to be 0.20 ounces. Assuming that the hypothesis test is to be performed using 0.05 level of significance and a random sample of n = 64 bottles, how large could the sample mean be before they would reject the null hypothesis [i.e. testing H0: μ = 16 versus Ha: μ ≠ 16]?

    a. 16.2 ounces b. 16.049 ounces c. 15.8 ounces d. 16.041 ounces

    21) The managers of a local golf course have recently conducted a study of the types of golf balls used by golfers based on handicap. A joint frequency table for the 100 golfers covered in the survey is shown below:

                 Type of Golf Ball
    Handicap     Strata   Titleist   Nike   Other
    < 2             5        8         3      2
    2 to < 10       8        7         9     10
    > 10            7        8        10     23

    If a player comes to the course using a Nike golf ball, the probability that he or she has a handicap of at least 10 is:

    a. 0.223. b. 0.48. c. 0.455. d. 0.108. e. None of the above
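    The conditional probability in question 21 restricts attention to the Nike column. The sketch below assumes the table is read as four brand columns (Strata, Titleist, Nike, Other) by three handicap rows, a reading under which the twelve counts total the stated 100 golfers; the dictionary layout is illustrative.

```python
# Assumed layout: handicap row -> counts for (Strata, Titleist, Nike, Other).
table = {
    "< 2":       (5, 8, 3, 2),
    "2 to < 10": (8, 7, 9, 10),
    "> 10":      (7, 8, 10, 23),
}
NIKE = 2    # column index of Nike in the assumed layout

total_golfers = sum(sum(row) for row in table.values())
nike_total = sum(row[NIKE] for row in table.values())
p_high_handicap_given_nike = table["> 10"][NIKE] / nike_total

print(total_golfers, round(p_high_handicap_given_nike, 3))
```

    Under that reading, 10 of the 22 Nike users have a handicap of at least 10, giving about 0.455, choice (c).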

  • 22) Suppose we want to test H0: μ ≥ 30 versus Ha: μ < 30. Which of the following possible sample results based on a sample of size 36 gives the strongest evidence to reject H0 in favor of Ha?

    a) X = 28, s = 6 b) X = 27, s = 4 c) X = 32, s = 2 d) X = 26, s = 9

    23) How many Kleenex should the Kimberly Clark Corporation package of tissues contain? Researchers determined that 60 tissues is the average number of tissues used during a cold. Suppose a random sample of 100 Kleenex users yielded the following data on the number of tissues used during a cold: X̄ = 52, s = 22. Using the sample information provided, calculate the value of the test statistic for testing H0 against Ha, where the hypothesized mean is 60.

    a) t = (52 − 60)/22 b) t = (52 − 60)/(22/100) c) t = (52 − 60)/(22/100²) d) t = (52 − 60)/(22/10)

    24) The owner of a local nightclub has recently surveyed a random sample of n = 250 customers of the club. She would now like to determine whether or not the mean age of her customers is over 30. If so, she plans to alter the entertainment to appeal to an older crowd. If not, no entertainment changes will be made. Suppose she found that the sample mean was 30.45 years and the sample standard deviation was 5 years. If she wants to be 95% confident in her decision, what conclusion can she make?

    a) There is not sufficient evidence that the mean age of her customers is over 30. b) There is sufficient evidence that the mean age of her customers is over 30. c) There is not sufficient evidence that the mean age of her customers is not over 30. d) There is sufficient evidence that the mean age of her customers is not over 30.
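    Question 24 can be worked as an upper-tail test of the mean. The sketch below assumes Ha: mean age > 30 and, because n = 250 is large, uses the normal critical value as an approximation to the t critical value; both choices are assumptions of this illustration.

```python
from math import sqrt
from statistics import NormalDist

n, xbar, s, mu0, alpha = 250, 30.45, 5.0, 30.0, 0.05

t_stat = (xbar - mu0) / (s / sqrt(n))

# Upper-tail critical value at the 5% level (normal approximation).
crit = NormalDist().inv_cdf(1 - alpha)
reject = t_stat > crit
print(round(t_stat, 3), round(crit, 3), reject)
```

    Since the statistic falls below the critical value, the sample does not give sufficient evidence that the mean age is over 30.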

    25) A survey claims that 9 out of 10 doctors recommend aspirin for their patients with headaches. To test this claim against the alternative that the actual proportion of doctors who recommend aspirin is less than 0.90, a random sample of 100 doctors results in 83 who indicate that they recommend aspirin. The value of the test statistic in this problem is approximately equal to:

    a) 4.12 b) 2.33 c) 1.86 d) 0.07 e) None of the above
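    For a test of a proportion, the standard error is computed under the null value p0 = 0.90. A minimal sketch for question 25 (names are illustrative); the statistic comes out negative because the sample proportion is below 0.90, and its magnitude is what appears in the answer choices.

```python
from math import sqrt

p0, n, successes = 0.90, 100, 83

p_hat = successes / n
std_err = sqrt(p0 * (1 - p0) / n)   # standard error under Ho
z_stat = (p_hat - p0) / std_err
print(round(z_stat, 2))
```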

  • A student claims that he can correctly identify whether a person is a business major or an agriculture major by the way the person dresses. Suppose in actuality that if someone is a business major, he can correctly identify that person as a business major 87% of the time. When a person is an agriculture major, the student will incorrectly identify that person as a business major 16% of the time. Presented with one person and asked to identify the major of this person (who is either a business or agriculture major), he considers this to be a hypothesis test with the null hypothesis being that the person is a business major and the alternative that the person is an agriculture major.

    26) Referring to the above, what would be a Type I error?

    a) Saying that the person is a business major when in fact the person is a business major. b) Saying that the person is a business major when in fact the person is an agriculture major. c) Saying that the person is an agriculture major when in fact the person is a business major. d) Saying that the person is an agriculture major when in fact the person is an agriculture major.

    27) Referring to the above, what would be a Type II error?

    a) Saying that the person is a business major when in fact the person is a business major. b) Saying that the person is a business major when in fact the person is an agriculture major. c) Saying that the person is an agriculture major when in fact the person is a business major. d) Saying that the person is an agriculture major when in fact the person is an agriculture major.
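    With Ho: business major and Ha: agriculture major, the two error probabilities follow directly from the identification rates given above. A short sketch (the variable names are hypothetical):

```python
p_say_business_given_business = 0.87   # correct identification under Ho
p_say_business_given_agri = 0.16       # wrong identification under Ha

# Type I error: reject Ho (say "agriculture") when Ho is true (business major).
alpha = 1 - p_say_business_given_business
# Type II error: fail to reject Ho (say "business") when Ha is true (agriculture major).
beta = p_say_business_given_agri
power = 1 - beta
print(round(alpha, 2), round(beta, 2), round(power, 2))
```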

    Health care issues are receiving much attention in both academic and political arenas. A sociologist recently conducted a survey of citizens over 60 years of age whose net worth is too high to qualify for Medicaid and who have no private health insurance. The descriptive statistics for the ages of 25 uninsured senior citizens were as follows:

    28) Which of the following is the best correct statement.

    a) One fourth of the senior citizens sampled are below 66 years of age. b) The middle 50% of the senior citizens sampled are between 66 and 73.0 years of age. c) The average age of senior citizens sampled is 73.5 years of age. d) All of the above are correct.

    29) Which of the following is the best correct statement.

    a) One fourth of the senior citizens sampled are below 64 years of age. b) The middle 50% of the senior citizens sampled are between 66 and 73.0 years of age. c) 25% of the senior citizens sampled are older than 81 years of age. d) All of the above are correct.

  • 30) To explain personal consumption (CONS) measured in dollars, data is collected for

    INC: personal income in dollars
    CRDTLIM: $1 plus the credit limit in dollars available to the individual
    APR: average annualized percentage interest rate for borrowing for the individual
    ADVT: per person advertising expenditure in dollars by manufacturers in the city where the individual lives
    SEX: gender of the individual; 1 if female, 0 if male

    A regression analysis was performed with CONS as the dependent variable and ln(CRDTLIM), ln(APR), ln(ADVT), and SEX as the independent variables. The estimated model was

    What is the correct interpretation for the estimated coefficient for SEX?

    a. Holding everything else fixed, personal consumption for females is estimated to be $0.39 higher than males on the average.

    b. Holding everything else fixed, personal consumption for males is estimated to be $0.39 higher than females on the average.

    c. Holding everything else fixed, personal consumption for females is estimated to be 0.39% higher than males on the average.

    d. Holding everything else fixed, personal consumption for males is estimated to be 0.39% higher than females on the average.

  • A school superintendent is interested in what factors affect the sixth grade proficiency test in her state. She obtained the data on percentage of students passing the proficiency test (% Passing), daily average of the percentage of students attending class (% Attendance), average teacher salary in dollars (Salaries), and instructional spending per pupil in dollars (Spending) of 47 schools in the state. The following is the multiple regression output with Y = % Passing as the dependent variable, X1 = % Attendance, X2 = Salaries and X3 = Spending.

    31) Which of the following is a correct statement?

    a. The average percentage of students passing the proficiency test is estimated to go up by 8.50% when daily average of percentage of students attending class increases by 1%.

    b. The daily average of the percentage of students attending class is expected to go up by an estimated 8.50% when the percentage of students passing the proficiency test increases by 1%.

    c. The average percentage of students passing the proficiency test is estimated to go up by 8.50% when daily average of the percentage of students attending class increases by 1% holding constant the effects of all the remaining independent variables.

    d. The daily average of the percentage of students attending class is expected to go up by an estimated 8.50% when the percentage of students passing the proficiency test increases by 1% holding constant the effects of all the remaining independent variables.

  • 32) Which of the following is a correct statement based on the previous output?

    a. 62.88% of the total variation in the percentage of students passing the proficiency test can be explained by daily average of the percentage of students attending class, average teacher salary, and instructional spending per pupil.

    b. 62.88% of the total variation in the percentage of students passing the proficiency test can be explained by daily average of the percentage of students attending class, average teacher salary, and instructional spending per pupil after adjusting for the number of predictors and sample size.

    c. 62.88% of the total variation in the percentage of students passing the proficiency test can be explained by daily average of the percentage of students attending class holding constant the effect of average teacher salary, and instructional spending per pupil.

    d. 62.88% of the total variation in the percentage of students passing the proficiency test can be explained by daily average of the percentage of students attending class after adjusting for the effect of average teacher salary, and instructional spending per pupil.

    33) The average length of stay in a hospital is useful for planning purposes. Suppose that the following is the distribution of the length of stay in a hospital (in days) after a minor operation:

    Days          2     3     4     5     6
    Probability   .05   .20   .40   .20   ?

    The average length of stay is:

    a) 0.15 days b) 0.20 days c) 4.0 days d) 4.2 days e) 4.3 days
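    The missing probability in question 33 is whatever makes the distribution sum to 1, and the average length of stay is the probability-weighted sum of the days. A minimal sketch:

```python
days = [2, 3, 4, 5, 6]
known_probs = [0.05, 0.20, 0.40, 0.20]

# Probabilities must sum to 1, which pins down the value marked "?".
missing_prob = 1 - sum(known_probs)
probs = known_probs + [missing_prob]

mean_stay = sum(d * p for d, p in zip(days, probs))
print(round(missing_prob, 2), round(mean_stay, 2))
```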

    34) The sale of luxury boats has been found to be extremely dependent on whether or not consumers think the economy is shrinking. The specific question is whether or not more than half of all consumers think the economy has entered a recession. In a sample of 1600 randomly selected consumers, 845 answered they believed the economy was in a recession. A 95% confidence interval for the true population proportion of consumers who believed the economy was in recession is closest to:

    a. (0.5176, 0.5486) b. (0.4397, 0.5040) c. (0.5036, 0.5526) d. (0.4960, 0.5603)

    (counted both as correct)
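    The interval in question 34 can be reproduced with the usual large-sample formula p_hat +/- z * sqrt(p_hat(1 - p_hat)/n). This sketch assumes that normal-approximation interval; small differences in the last digit against the printed choices come from rounding.

```python
from math import sqrt
from statistics import NormalDist

n, successes, conf_level = 1600, 845, 0.95

p_hat = successes / n
z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
margin = z * sqrt(p_hat * (1 - p_hat) / n)

ci_low, ci_high = p_hat - margin, p_hat + margin
print(round(ci_low, 4), round(ci_high, 4))
```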

  • The next 2 questions are based on the following information: An insurance company analyst is interested in analyzing the dollar value of damage in automobile accidents. She collects data from 115 accidents, and records the amount of damage as well as the age of the driver. The results of her regression analysis are listed below.

    SUMMARY OUTPUT

    Regression Statistics
    Multiple R           0.187
    R Square             0.035
    Adjusted R Square    0.026
    Standard Error       5652.090 (what we call se)
    Observations         115.000

    ANOVA
                 Df    SS                MS               F       Significance F
    Regression   1     130433116.219     130433116.219    4.083   0.046
    Residual     113   3609911959.868    31946123.539
    Total        114   3740345076.087

                 Coefficients   Standard Error   t Stat   P-value
    Intercept    10725.802      1535.215         6.987    0.000
    Age          69.964         34.625           2.021    0.046
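    Several entries in the regression output are tied together by simple identities: R Square = SSR/SST, F = MSR/MSE, Standard Error = sqrt(MSE), and t Stat = coefficient / standard error. The sketch below recomputes them from the printed numbers as a consistency check; variable names are illustrative.

```python
from math import sqrt

ss_reg, ss_total = 130433116.219, 3740345076.087
ms_reg, ms_resid = 130433116.219, 31946123.539
age_coef, age_se = 69.964, 34.625

r_square = ss_reg / ss_total    # share of variation explained
f_stat = ms_reg / ms_resid      # overall F statistic
std_error = sqrt(ms_resid)      # what the output calls se
t_age = age_coef / age_se       # t statistic for the Age slope

print(round(r_square, 3), round(f_stat, 3), round(std_error, 2), round(t_age, 3))
```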

    35) How would you best explain the y-intercept in this situation?

    A) For each additional 1-year increase in the age of the driver, we would expect damage to increase by $10,726.

    B) For each additional 1-year increase in the age of the driver, we would expect damage to increase by $70.

    C) It makes no sense to explain the intercept in this situation, since we can not have a driver with age of zero.

    D) The average amount of damage was $10,726.

  • 36) On average, what would be the dollar value of an accident involving a 25-year-old driver?

    A) $11,836.56 B) $10,795.47 C) $13,372.58 D) $12,474.90
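    The prediction in question 36 plugs age = 25 into the fitted line damage = intercept + slope * age, using the coefficients from the output. A minimal sketch (the function name is hypothetical):

```python
intercept, slope = 10725.802, 69.964

def predicted_damage(age):
    """Estimated average damage in dollars for a driver of the given age."""
    return intercept + slope * age

damage_at_25 = predicted_damage(25)
print(round(damage_at_25, 2))
```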

    37) The residual is defined as the difference between the:

    A) actual value of y and the estimated value of y B) actual value of x and the estimated value of x C) actual value of y and the estimated value of x D) actual value of x and the estimated value of y

    Use the following graph for the next question

    38) The value of a to the nearest hundredth is

    a) 0.973 b) 0.975 c) 1.36 d) 2 e) 2.1 f) None of the above

  • 39) A teacher determined that the class average on an exam is 65 and the standard deviation is 7. After the class completes a review assignment, he adjusts the marks by adding 10 to each exam. An analysis of the two sets of marks showed that

    a) The standard deviation decreased while the mean increased b) The standard deviation increased and the mean increased c) The standard deviation stayed the same while the mean increased d) The standard deviation and the mean stayed the same.

    40) The number of hours for which a lightbulb works before it burns out is normally distributed with mean 5000 hours and standard deviation 200 hours. What is the probability that a lightbulb burns out in under 4800 hours?

    a) 0.2345 b) 0.1587 c) 0.0124 d) 0.3245 e) None of the above

    41) The number of hours for which a lightbulb works before it burns out is normally distributed with mean 5000 hours and standard deviation 200 hours. Suppose you have a room where you install five of these lightbulbs at the same time. What is the probability that no lightbulbs burn out before 4800 hours (use your answer from the question above)?

    a) .0001 b) .0002 c) .0003 d) .1587 e) None of the above
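    Questions 40 and 41 chain together: first the normal probability of one bulb failing before 4800 hours, then the probability that none of five bulbs does. The sketch assumes the five bulbs fail independently, which the question does not state explicitly.

```python
from statistics import NormalDist

lifetime = NormalDist(mu=5000, sigma=200)

# Question 40: P(X < 4800), i.e. one standard deviation below the mean.
p_early = lifetime.cdf(4800)

# Question 41: all five bulbs survive past 4800 hours (independence assumed).
p_none_early = (1 - p_early) ** 5

print(round(p_early, 4), round(p_none_early, 4))
```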

    42) To determine the reliability of experts used in interpreting the results of polygraph examinations in criminal investigations, 280 cases were studied. The results were:

    If the hypotheses were Ho: suspect is innocent versus Ha: suspect is guilty, then the probability of making a wrong decision is:

    a. 0.05 b. .086 c. .067 d. .032 e. None of the above

  • 43) In a multiple regression analysis involving 40 observations and 5 independent variables, SST= 350 and SSE = 30. The coefficient of determination (R2) is:

    a) .9408 b) .8571 c) .9143 d) .8529
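    The coefficient of determination in question 43 follows from R2 = 1 - SSE/SST; the number of observations and predictors is not needed for the unadjusted value. A minimal sketch:

```python
sst, sse = 350, 30

ssr = sst - sse             # variation explained by the regression
r_squared = 1 - sse / sst   # equivalently ssr / sst
print(round(r_squared, 4))
```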

    (The next 2 questions are based on the following information.) Sam Pull attended the Career Showcase at the University of Chicago. The following table depicts the pdf of the random variable Y representing the number of job offers Sam receives.

    y    p(y)
    0    0.07
    1    0.53
    2    0.26
    3    0.11
    4    0.03

    44) The expected number of job offers Sam receives is

    a) 1.07 b) 1.50 c) 1.93 d) 2.00 e) None of the above

    45) Sam's parents give him $1000 for just trying to interview, plus $50 for each job offer he receives. What is the expected amount of money Sam receives?

    a) $2050 b) $1053.5 c) $1075 d) $1100 e) None of the above
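    Questions 44 and 45 use the expected value of a discrete random variable and its linearity: E[1000 + 50Y] = 1000 + 50 E[Y]. A short sketch with the table for Y:

```python
# Distribution of Y, the number of job offers.
pmf = {0: 0.07, 1: 0.53, 2: 0.26, 3: 0.11, 4: 0.03}

expected_offers = sum(y * p for y, p in pmf.items())

# Linearity of expectation: fixed $1000 plus $50 per offer.
expected_money = 1000 + 50 * expected_offers
print(round(expected_offers, 2), round(expected_money, 2))
```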

  • Short Answer (points as marked)

    1. (28 points, 7 points each) At a semiconductor plant, 60% of the workers are skilled and 80% of the workers are full-time. Ninety percent (90%) of the skilled workers are full-time.

    a) What is the probability that an employee selected at random is a skilled full-time employee ?

    P(S) = 0.6, P(F) = 0.8, and P(F|S) = 0.90. Hence P(F and S) = P(F|S)P(S) = (0.9)(0.6) = 0.54

    b) What is the probability that an employee selected at random is a skilled worker or a full-time worker ?

    P(S or F) = P(S)+P(F)-P(S and F) = 0.6+0.8-0.54=0.86

    c) What percentage of the full-time workers are skilled ?

    P(S|F) = P(S and F)/P(F) = .54/.8 = .675

    d) Is being a skilled worker and being a full-time worker independent or dependent events ? (explain).

    Dependent. P(S)P(F) = (.6)(.8) = .48, which does not equal P(S and F) = 0.54
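    The four parts of problem 1 use the multiplication rule, the addition rule, the definition of conditional probability, and the product test for independence. A minimal sketch that runs through them in order (names are illustrative):

```python
from math import isclose

p_s, p_f, p_f_given_s = 0.6, 0.8, 0.9

p_s_and_f = p_f_given_s * p_s                 # a) multiplication rule
p_s_or_f = p_s + p_f - p_s_and_f              # b) addition rule
p_s_given_f = p_s_and_f / p_f                 # c) conditional probability
independent = isclose(p_s_and_f, p_s * p_f)   # d) product test for independence

print(round(p_s_and_f, 2), round(p_s_or_f, 2), round(p_s_given_f, 3), independent)
```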

  • 2. (48 points, 6 points each) We have data on the sales of 950 single-family homes in Springfield, MA. We wish to explain and predict the price of a single-family home (the Y variable, in thousands of dollars) using the following predictor variables:

    Variable Name   Description
    s_p             Sale price in dollars (response variable)
    inv             Sale date inventory of homes on market
    bath            Number of bathrooms
    ltsz            Lot size in acres
    hssz            Sq. ft. of living area
    bsemt           1 if basement, 0 otherwise
    a_c             1 if central a/c, 0 otherwise
    f_place         1 if fireplace, 0 otherwise
    garsz_a         1 if garage, 0 otherwise
    dw              1 if dishwasher, 0 otherwise
    dr              1 if dining room, 0 otherwise
    fr              1 if family room, 0 otherwise
    age5            1 if age
  • a. Besides the intercept, which variables are important in explaining selling price ? (explain).

    All variables with |t| > 1.96 or pvalue < .05

  • c. How much more does a house with a fireplace go for (everything else being equal)?

    $9019.883

    d. Do you even need the dishwasher variable in the model ? Explain.

    No, as |t| < 1.96 and pvalue > .05, so this variable is not significant.

    e. What is the value of R2 ?

    R2=SSR/SST = 4.8714/8.0748=0.6032

    f. Test the hypothesis H0: β_hssz = 10 versus Ha: β_hssz ≠ 10

    The test statistic is t=(11.446-10)/2.311 = 0.625. Since 0.625 < 1.96 we fail to reject the null hypothesis.

    g. For the model as is, if we use it for predictions, how accurate would our predictions be (put the appropriate units on your answer) ?

    +/- 1.96 se = +/- 1.96*18509.72 = +/- $36279.06

    h. Give a 95% confidence interval for the intercept in the model.

    -2606.02 +/- 1.96*(5166.2554) = -2606.02 +/- 10125.86, i.e. from -12731.88 to 7519.84
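    Parts f, g and h all reuse the same ingredients: an estimate, its standard error, and the large-sample critical value 1.96. The sketch below recomputes them from the numbers quoted in the answers; with 950 homes the normal value 1.96 is used in place of a t critical value, as in the solutions.

```python
Z_95 = 1.96  # large-sample two-sided 95% critical value

# f) Two-sided test of the hssz slope against the hypothesized value 10.
b_hssz, se_hssz = 11.446, 2.311
t_stat = (b_hssz - 10) / se_hssz
reject = abs(t_stat) > Z_95

# g) Approximate 95% prediction accuracy from the regression standard error.
s_e = 18509.72
prediction_margin = Z_95 * s_e

# h) 95% confidence interval for the intercept.
b0, se_b0 = -2606.02, 5166.2554
ci_low, ci_high = b0 - Z_95 * se_b0, b0 + Z_95 * se_b0

print(round(t_stat, 3), reject)
print(round(prediction_margin, 2))
print(round(ci_low, 2), round(ci_high, 2))
```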