Stats without Tears
Solutions for Chapter 8
Updated 1 Apr 2015
Copyright © 2013–2017 by Stan Brown
Sample means are ND with mean 800 hours and SD 5 hours. The sketch is at right.
Common mistake: The correct standard deviation is 5 hours, not 50. You’re not sketching the population of light bulbs. Rather, you’re now interested in the distribution of average lifetimes in samples of 100 bulbs. (The axis is the x̅ axis, not the x axis.)
780 hours, the sample mean that the problem asks about, is 20 hours below the population mean of 800. 20/5 = 4 standard errors, so you should have marked 780 hours at four standard deviations below the mean.
A sample mean of 780 is less than the population mean of 800 hours. Therefore you compute the probability of a sample mean of 780 hours or less. It will be surprising (unusual, unexpected) if the probability is under 5%.
P(x̅ ≤ 780) = normalcdf(−10^99, 780, 800, 50/√(100)) = 3.1686E-5 → P(x̅ ≤ 780) = 0.00003
(You can also give the probability as <0.0001.) Yes, this is surprising.
Common mistake: Don’t give the probability as 3.1686. Probabilities are never greater than 1.
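If you don't have a TI calculator handy, you can check the same probability in Python. This is a minimal sketch, assuming Python 3.8 or later and using the standard library's statistics.NormalDist as a stand-in for normalcdf. The variable names are illustrative, not part of the original solution, and the last digits may differ slightly from the calculator's.

```python
from math import sqrt
from statistics import NormalDist

mu = 800      # claimed population mean lifetime, hours
sigma = 50    # population SD, hours
n = 100       # bulbs per sample

sem = sigma / sqrt(n)                 # standard error of the mean: 5 hours, not 50
z = (780 - mu) / sem                  # 780 hours is 4 standard errors below the mean
p_low = NormalDist(mu, sem).cdf(780)  # P(x-bar <= 780)

print(f"SEM = {sem}, z = {z}, P(x-bar <= 780) = {p_low:.7f}")
```

The printed probability is far below 0.05, which is why the result counts as surprising.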
(b) If the manufacturer’s claim is true, there are only three chances in a hundred thousand of getting a sample mean this low. It’s very unlikely that the manufacturer’s claim is true.
Answer: normally distributed with mean = 0.72, standard deviation (standard error) = 0.020
Common mistake: Don’t write n≥30 when testing the normal approximation. The n≥30 test applies to numeric data, but in this problem you have binomial data.
(b) 350/500 = 0.70 exactly, and 370/500 = 0.74 exactly. In a sample of 500, finding 350 to 370 successes is the same as finding 70% to 74% successes.
If you stored the computed SEP in part (a), then your screen will look like the one on the left. Otherwise, it will look like the one on the right:
Answer: P(70% ≤ p̂ ≤ 74%) = 0.6808.
Remark: Always check for reasonableness. 70% and 74% are one standard error below and above the mean, so you know from the Empirical Rule that about 68% of the data should be within that region.
Remark: The problem wanted you to use the normal approximation, but it’s always good to check answers by a different method if possible. 70%×500 = 350; 74%×500 = 370. MATH200A part 3 with n=500, p=.72, from 350 to 370, gives a probability of 0.7044, pretty good agreement.
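The comparison between the normal approximation and the exact binomial probability can also be sketched in Python. This assumes Python 3.8 or later; statistics.NormalDist plays the role of normalcdf, and the exact sum built with math.comb plays the role of MATH200A part 3. The names are illustrative only.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.72
sep = sqrt(p * (1 - p) / n)           # standard error of the proportion, about 0.020

# Normal approximation: P(0.70 <= p-hat <= 0.74)
model = NormalDist(p, sep)
p_normal = model.cdf(0.74) - model.cdf(0.70)

# Exact binomial: P(350 <= x <= 370), adding up each individual probability
p_exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(350, 371))

print(f"SEP = {sep:.4f}, normal = {p_normal:.4f}, exact binomial = {p_exact:.4f}")
```

The two probabilities should differ by only a couple of percentage points, which is the "pretty good agreement" mentioned in the remark.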
P(x̅ ≤ $31,000) = normalcdf(−10^99, 31000, 32400, 19000/√(1000)) = 0.0099, almost exactly 1%. That would be pretty unlikely if the population mean was still $32,400, so the city manager is most likely correct.
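Here is a hedged Python check of that normalcdf result, assuming Python 3.8 or later. The figures are the ones in the calculation above; the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 32400, 19000, 1000          # dollars, dollars, sample size

sem = sigma / sqrt(n)                      # standard error, roughly $600
p_low = NormalDist(mu, sem).cdf(31000)     # P(x-bar <= $31,000)

print(f"SEM = {sem:.2f}, P(x-bar <= 31000) = {p_low:.4f}")
```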
Remark: This problem was adapted from Freedman, Pisani, Purves (2007, 415) [see “Sources Used” at end of book].
|Outcome||Gain x ($)||Probability P(x)|
|Red||+10||18/38|
|Black or Green||−10||20/38|
|Total||n/a||38/38 = 1|
(a) The model is at right. You could list green and black separately, but since they have the same outcome there’s no need to do that. It’s important to have the probabilities as exact fractions, not approximate decimals.
(b) Put the x’s in L1 and the P’s in L2. Running 1-Var Stats on L1, with L2 as the probability list, gives μ = −$0.53, σ = $9.99.
Interpretation: In the long run, a player who bets $10 on red will lose an average of 53¢ per bet.
Remark: Notice that the SD is about 20 times the mean. This is why gambling is so exciting for the player: there’s a lot of variability from one bet to the next.
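The mean and SD of the player's probability model can be reproduced with exact fractions in Python. This is a sketch under one stated assumption: the red outcome pays +$10 with probability 18/38, which is what the total of 38/38 and the 20/38 for black or green imply. Names are illustrative.

```python
from fractions import Fraction
from math import sqrt

# (gain in dollars, probability); the red row is inferred from the 38/38 total
model = [
    (10, Fraction(18, 38)),    # red: player wins $10
    (-10, Fraction(20, 38)),   # black or green: player loses $10
]

total_p = sum(p for _, p in model)                       # must be exactly 1

mean = sum(x * p for x, p in model)                      # mu = sum of x * P(x)
variance = sum((x - mean) ** 2 * p for x, p in model)    # sigma^2 = sum of (x - mu)^2 * P(x)
sd = sqrt(variance)

print(f"mu = {mean} = {float(mean):.4f}, sigma = {sd:.4f}")
```

Keeping the probabilities as Fraction objects mirrors the advice in part (a): exact fractions avoid rounding error in the mean.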
(c) With n = 10,000, the sampling distribution of x̅ is normally distributed. (10n = 10×10,000 = 100,000, less than the total number of bets while the casino is in business. The bets placed in a given day are not random, but they are representative of all possible bets and therefore effectively random.) The mean of the sampling distribution is the mean of the population: μx̅ = +$0.53. (Whatever players lose, the casino wins, so the mean is the opposite of a player’s mean.) The standard error of the mean is σ/√n = 9.986139979/√10000; σx̅ ≈ $0.10.
Remark: This is why gambling is predictable for the operators: the SD is small compared to the mean.
(d) 10,000×$.5263157895 = $5,263.16
To lose money, the casino has to make less than $0.00. Zero is more than five standard errors below the mean (has a z-score below −5), so you know right off that it would be unusual for the casino to lose money. normalcdf confirms that:
P(lose on 10,000 bets) = 6.8×10⁻⁸. The casino has essentially no risk (7 chances in 100 million) of losing money on 10,000 bets.
Remember the elevator example. A total of $2000 on 10,000 bets is an average of 2000/10,000 = $0.20 per bet. Use normalcdf to compute the probability of doing that well:
P(make ≥$2000) = 0.9995. Not only is the casino virtually certain not to lose money, it’s almost certain to make a handsome profit, as long as people come in to place bets.
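The casino's side of the calculation can be sketched in Python as follows, assuming Python 3.8 or later. The mean and SD per bet are the values computed in part (b), with the sign of the mean flipped for the casino; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu = 10 / 19            # casino's mean gain per $10 bet, about $0.5263
sigma = 9.986139979     # SD per bet, from part (b)
n = 10_000              # number of bets

sem = sigma / sqrt(n)                 # standard error of the mean, about $0.10
model = NormalDist(mu, sem)           # sampling distribution of the mean gain per bet

expected_total = n * mu               # expected profit on 10,000 bets
p_lose = model.cdf(0)                 # P(x-bar < $0): casino loses money overall
p_2000 = 1 - model.cdf(2000 / n)      # P(x-bar >= $0.20): casino makes at least $2000

print(f"expected profit = ${expected_total:,.2f}")
print(f"P(lose) = {p_lose:.2e}, P(make >= $2000) = {p_2000:.4f}")
```

Converting the $2000 total to a $0.20 average per bet is the key step: the sampling distribution describes means, not totals.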
P(∑x > 75.6) = P(x̅ > 5.04) = normalcdf(75.6/15, 10^99, 5.00, 0.05/√(15)) = 9.7295E-4 ≈ 0.0010, about one chance in a thousand.
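The same total-to-mean conversion can be checked in Python. This is a minimal sketch assuming Python 3.8 or later, using only the numbers that appear in the normalcdf call above; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 5.00, 0.05, 15     # population mean, population SD, sample size
total = 75.6                      # the total the problem asks about

xbar = total / n                  # a total of 75.6 is a sample mean of 5.04
sem = sigma / sqrt(n)             # standard error of the mean
p_over = 1 - NormalDist(mu, sem).cdf(xbar)   # P(x-bar > 5.04)

print(f"x-bar = {xbar:.2f}, SEM = {sem:.4f}, P = {p_over:.4f}")
```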
Answer: P(x > 43.0) = 0.1634
(b) The sampling distribution of x̅ is ND, even for this small sample, because the population is ND. The standard error is σx̅ = 5.1/√14 ≈ 1.4.
P(x̅ > 43.0) = normalcdf(43, 10^99, 38, 5.1/√(14)) = 1.2212E-4 → P(x̅ > 43.0) = 0.0001 or 0.01%
Remark: This sketch is not very well proportioned, because it makes the probability look much larger than it actually is.
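To see how much the sample size shrinks the probability, you can compute both answers side by side. This sketch assumes Python 3.8 or later and takes the mean 38 and SD 5.1 from the normalcdf call in part (b); the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 38, 5.1, 14

# One individual: use the population SD
p_one = 1 - NormalDist(mu, sigma).cdf(43)

# Mean of a sample of 14: use the standard error instead
sem = sigma / sqrt(n)                       # about 1.4
p_mean = 1 - NormalDist(mu, sem).cdf(43)

print(f"P(x > 43) = {p_one:.4f}, SEM = {sem:.2f}, P(x-bar > 43) = {p_mean:.6f}")
```

The only difference between the two lines is the spread, σ versus σ/√n, and that alone takes the probability from about 16% down to about 0.01%.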
The standard error of the mean is σx̅ = 3.5/√1000, about 0.11. The sampling distribution of the mean is normal because data are numeric and n = 1000, greater than 30. (Treat the sample as random because it’s a “typical neighborhood”. And a thousand households is less than 10% of all the households that there are.)
P(x̅ > 12.778) = normalcdf(12.778, 10^99, 12.5, 3.5/√(1000)) = 0.0060
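A small helper function makes this kind of right-tail calculation reusable. This is a sketch, assuming Python 3.8 or later; prob_mean_above is a hypothetical name chosen for illustration, not a calculator or library function.

```python
from math import sqrt
from statistics import NormalDist

def prob_mean_above(xbar, mu, sigma, n):
    """P(sample mean > xbar) when the sampling distribution is normal."""
    sem = sigma / sqrt(n)                  # standard error of the mean
    return 1 - NormalDist(mu, sem).cdf(xbar)

p_over = prob_mean_above(12.778, mu=12.5, sigma=3.5, n=1000)
print(f"P(x-bar > 12.778) = {p_over:.4f}")
```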
Therefore the sampling distribution can be approximated by a normal distribution.
The standard error of the proportion or SEP is σp̂ = √[pq/n] = √[.0171(1−.0171)/11037] ≈ 0.0012
If you use my shortcut, your screen will look like the one at the left; if not, it will look like the one at the right.
Either way, the probability is 2.2013×10⁻¹⁰, or 0.000 000 000 2. There are only two chances in ten billion of getting a sample proportion of 0.94% or less with sample size 11,037, if the true population proportion is 1.71%. That’s pretty darn unlikely, so based on this experiment you can rule out coincidence and decide that aspirin does reduce the chance of a heart attack among adult males.
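For a Python cross-check of that tiny probability, here is a minimal sketch assuming Python 3.8 or later. It uses the rounded sample proportion 0.0094 quoted above; the variable names are illustrative, and the result may differ from the calculator's in the later digits.

```python
from math import sqrt
from statistics import NormalDist

p = 0.0171        # heart-attack proportion assumed for the population
n = 11037         # sample size
p_hat = 0.0094    # sample proportion, 0.94%

sep = sqrt(p * (1 - p) / n)                  # standard error of the proportion, about 0.0012
z = (p_hat - p) / sep                        # more than 6 standard errors below p
p_low = NormalDist(p, sep).cdf(p_hat)        # P(p-hat <= 0.0094)

print(f"SEP = {sep:.4f}, z = {z:.2f}, P = {p_low:.2e}")
```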
The standard error of the mean or SEM is σx̅ = σ/√n = 2.92/√16 = 0.73″.
μx̅ ± 2σx̅ = 69.3 ± 2×.73 = 67.84 to 70.76.
Sample means between those values would not be surprising, and therefore a sample mean would be surprising if it is under 67.84″ or over 70.76″.
Alternative solution: That back-of-the-envelope calculation is good enough, but you could also get a more precise answer:
L = invNorm(0.025, 69.3, 2.92/√(16)) = 67.87
H = invNorm(1−0.025, 69.3, 2.92/√(16)) = 70.73
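Both the back-of-the-envelope bounds and the invNorm bounds can be reproduced in Python. This sketch assumes Python 3.8 or later, where NormalDist.inv_cdf serves as the counterpart of invNorm; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 69.3, 2.92, 16        # inches, inches, sample size
sem = sigma / sqrt(n)                # standard error of the mean: 0.73

# Rough bounds: two standard errors either side of the mean
rough_low, rough_high = mu - 2 * sem, mu + 2 * sem

# More precise bounds: cut off 2.5% in each tail
model = NormalDist(mu, sem)
low = model.inv_cdf(0.025)
high = model.inv_cdf(1 - 0.025)

print(f"rough: {rough_low:.2f} to {rough_high:.2f}")
print(f"precise: {low:.2f} to {high:.2f}")
```

The two intervals differ by only a few hundredths of an inch, which is why the quick ±2 standard errors rule is good enough here.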
What does the sampling distribution look like? The center is μp̂ = p = 0.45. The standard error is σp̂ = √[.45×(1−.45)/1504] ≈ 0.013. Check requirements to make sure that a normal model can be used for the sampling distribution:
P(x ≥ 737) = P(p̂ ≥ 49%) = normalcdf(737/1504, 10^99, .45, √(.45*(1−.45)/1504)) ≈ 9E-4 or 0.0009.
Can you draw a conclusion? Yes, you can. In a population with 45% unfavorable rating of the Tea Party, there are only 9 chances in 10,000 of getting a sample as unfavorable as this one (or more unfavorable). That’s pretty unlikely, so you conclude that the true unfavorable rating in October was most likely more than 45% of all Americans. (In Chapter 9, you’ll learn how to estimate that proportion from a sample.)
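The whole calculation, including a requirements check, can be sketched in Python. This assumes Python 3.8 or later. The "at least 10 expected successes and 10 expected failures" test is an assumed rule of thumb, since the requirements list itself is not reproduced here; the names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

p, n, x = 0.45, 1504, 737

# Assumed rule of thumb for the normal model: np and nq both at least 10
normal_ok = n * p >= 10 and n * (1 - p) >= 10

sep = sqrt(p * (1 - p) / n)                 # standard error of the proportion, about 0.013
p_hat = x / n                               # sample proportion, about 49%
p_high = 1 - NormalDist(p, sep).cdf(p_hat)  # P(p-hat >= 737/1504)

print(f"normal model OK: {normal_ok}")
print(f"SEP = {sep:.3f}, p-hat = {p_hat:.4f}, P = {p_high:.4f}")
```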
Updates and new info: http://BrownMath.com/swt/