
Stats without Tears
Solutions for Chapter 8

Updated 1 Apr 2015
Copyright © 2013–2017 by Stan Brown



1 This is numeric data. You have a random sample, and it’s less than 10% of the households in a country. Despite the skew, with sample size so far above 30 you can be sure that the shape of the sampling distribution is approximately normal.

The mean of the sampling distribution is μ_x̄ = μ = $48,000.

The SD of the sampling distribution of the mean, a/k/a the standard error of the mean, is σ_x̄ = σ/√n = $2000/√64 → σ_x̄ = $250.
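If you would rather check the arithmetic in software than on the calculator, a minimal Python sketch follows. The function name `standard_error_of_mean` is an illustrative label, not something from the text.

```python
from math import sqrt

def standard_error_of_mean(sigma, n):
    """Standard error of the mean: population SD divided by sqrt(sample size)."""
    return sigma / sqrt(n)

mu_xbar = 48_000                              # mean of the sampling distribution = population mean
sem = standard_error_of_mean(2_000, 64)       # 2000 / 8
print(mu_xbar, sem)                           # 48000 250.0
```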
2 (a) First, describe the distribution and sketch the situation. For the population, you’re given μ = 800 and σ = 50; the sample size is n = 100.

Sample means are ND with mean 800 hours and SD 5 hours.

[Sketch: normal curve, μ = 800, SEM = 50/√100 = 5, left tail past 780 shaded]

Common mistake: The correct standard deviation is 5 hours, not 50. You’re not sketching the population of light bulbs. Rather, you’re now interested in the distribution of average lifetimes in samples of 100 bulbs. (The axis is the x̄ axis, not the x axis.)

780 hours, the sample mean that the problem asks about, is 20 hours below the population mean of 800. 20/5 = 4 standard errors, so you should have marked 780 hours at four standard deviations below the mean.

A sample mean of 780 is less than the population mean of 800 hours. Therefore you compute the probability of a sample mean of 780 hours or less. It will be surprising (unusual, unexpected) if the probability is under 5%.

P(x̄ ≤ 780) = normalcdf(−10^99, 780, 800, 50/√(100)) = 3.1686E-5 → P(x̄ ≤ 780) = 0.00003

(You can also give the probability as <0.0001.) Yes, this is surprising.

Common mistake: Don’t give the probability as 3.1686. Probabilities are never greater than 1.

(b) If the manufacturer’s claim is true, there are only three chances in a hundred thousand of getting a sample mean this low. It’s very unlikely that the manufacturer’s claim is true.
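The left-tail computation in part (a) can be reproduced off the calculator. This is only a sketch: it assumes Python's standard-library `statistics.NormalDist` as a stand-in for normalcdf, and the last digits may differ slightly from the calculator screen.

```python
from math import sqrt
from statistics import NormalDist

# Sampling distribution of the mean for samples of n = 100 bulbs
mu, sigma, n = 800, 50, 100
sampling_dist = NormalDist(mu, sigma / sqrt(n))   # mean 800, SD (standard error) 5

z = (780 - mu) / sampling_dist.stdev              # 780 sits 4 standard errors below the mean
p_low = sampling_dist.cdf(780)                    # P(x̄ ≤ 780), the shaded left tail

print(z)                                          # -4.0
print(f"{p_low:.5f}")                             # 0.00003
```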

3 (a) “Describe the distribution” means shape, center and spread. You can always get center and spread, but if the test for normal approximation fails then you can’t say anything about the shape.

[Sketch: normal curve, mean = 0.72, standard error = 0.02, middle 0.70 to 0.74 shaded]

Answer: normally distributed with mean = 0.72, standard deviation (standard error) = 0.020

Common mistake: Don’t write n≥30 when testing the normal approximation. The n≥30 test applies to numeric data, but in this problem you have binomial data.

(b) 350/500 = 0.70 exactly, and 370/500 = 0.74 exactly. In a sample of 500, finding 350 to 370 successes is the same as finding 70% to 74% successes.

If you stored the computed SEP in part (a), then your screen will look like the one on the left. Otherwise, it will look like the one on the right:

Left screen: √(.72*(1−.72)/500)→X gives .0200798406; then normalcdf(.70, .74, .72, X) gives .6807614304

Right screen: √(.72*(1−.72)/500) gives .0200798406; then normalcdf(.70, .74, .72, √(.72*(1−.72)/500)) gives .6807614304

Answer: P(70% ≤ p̂ ≤ 74%) = 0.6808.

Remark: Always check for reasonableness. 70% and 74% are one standard error below and above the mean, so you know from the Empirical Rule that about 68% of the data should be within that region.

Remark: The problem wanted you to use the normal approximation, but it’s always good to check answers by a different method if possible. 70%×500 = 350; 74%×500 = 370. MATH200A part 3 with n=500, p=.72, from 350 to 370, gives a probability of 0.7044, pretty good agreement.
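The comparison in that last remark can be sketched in Python. This is an assumption-laden stand-in, not the MATH200A program: the normal approximation uses `statistics.NormalDist`, and the exact figure is a direct sum of binomial probabilities for 350 through 370 successes.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.72
sep = sqrt(p * (1 - p) / n)                 # standard error of the proportion, about 0.0201

# Normal approximation: P(0.70 ≤ p̂ ≤ 0.74)
model = NormalDist(p, sep)
approx = model.cdf(0.74) - model.cdf(0.70)

# Exact binomial: P(350 ≤ x ≤ 370), summing the probability of each count
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(350, 371))

print(round(approx, 4), round(exact, 4))    # compare with 0.6808 and 0.7044 in the text
```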

4 The sampling distribution of x̄ is ND because the sample size of 1000 is greater than 30 and the random sample is smaller than 10% of the population (10% of 100,000 households is 10,000 households). The SEM is σ_x̄ = 19000/√1000 ≈ $601.

[Sketch: normal distribution shaded left of 31,000; probability = 0.0099]

P(x̄ ≤ $31,000) = normalcdf(−10^99, 31000, 32400, 19000/√(1000)) = 0.0099, almost exactly 1%. That would be pretty unlikely if the population mean were still $32,400, so the city manager is most likely correct.

Remark: This problem was adapted from Freedman, Pisani, Purves (2007, 415) [see “Sources Used” at end of book].

5 (a) The model is in the table below. You could list green and black separately, but since they have the same outcome there’s no need to do that. It’s important to have the probabilities as exact fractions, not approximate decimals.

Outcome           x      P(x)
Red              +10     18/38
Black or Green   −10     20/38
Total            n/a     38/38 = 1

 

(b) x’s in L1, P’s in L2. 1-VarStats L1,L2 gives μ = −$0.53, σ = $9.99.

[Calculator screen: result of 1-VarStats]

Interpretation: In the long run, a player who bets $10 on red will lose an average of 53¢ per bet.

Remark: Notice that the SD is about 20 times the mean. This is why gambling is so exciting for the player: there’s a lot of variability from one bet to the next.
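As a cross-check on 1-VarStats, the mean and SD of the player's model can be computed directly from the definitions. The sketch below is an illustration only; it keeps the probabilities as exact fractions using Python's `fractions.Fraction`, and the variable names are hypothetical.

```python
from fractions import Fraction
from math import sqrt

# Player's probability model for a $10 bet on red: outcome -> probability
model = {+10: Fraction(18, 38), -10: Fraction(20, 38)}
assert sum(model.values()) == 1                  # probabilities must total 1

# Mean: sum of x·P(x).  Variance: sum of (x − μ)²·P(x).
mean = sum(x * p for x, p in model.items())
variance = sum((x - mean) ** 2 * p for x, p in model.items())
sd = sqrt(variance)

print(mean, round(float(mean), 2), round(sd, 2))   # -10/19 -0.53 9.99
```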

 

(c) With n = 10,000, the sampling distribution of x̄ is normally distributed. (10n = 10×10,000 = 100,000, less than the total number of bets while the casino is in business. The bets placed in a given day are not random, but they are representative of all possible bets and therefore effectively random.) The mean of the sampling distribution is the mean of the population: μ_x̄ = +$0.53. (Whatever players lose, the casino wins, so the mean is the opposite of a player’s mean.) The standard error of the mean is σ/√n = 9.986139979/√10000; σ_x̄ ≈ $0.10.

Remark: This is why gambling is predictable for the operators: the SD is small compared to the mean.

(d) 10,000×$.5263157895 = $5,263.16

(e) To lose money, the casino has to make less than $0.00. Zero is more than five standard errors below the mean (has a z-score below −5), so you know right off that it would be unusual for the casino to lose money. normalcdf confirms that:

[Sketch: ND with mean 0.53, standard error 0.10, and a tiny probability in the far left tail]

[Calculator screen: normalcdf gives 6.817836E−8]

P(lose on 10,000 bets) = 6.8×10⁻⁸. The casino has essentially no risk (7 chances in 100 million) of losing money on 10,000 bets.

 

(f) Remember the elevator example. A total of $2000 on 10,000 bets is an average of 2000/10,000 = $0.20 per bet. Use normalcdf to compute the probability of doing that well or better:

[Sketch: ND with mean 0.53, standard error 0.10, and almost the whole distribution shaded]

[Calculator screen: normalcdf gives .9995]

P(make ≥$2000) = 0.9995. Not only is the casino virtually certain not to lose money, it’s almost certain to make a handsome profit, as long as people come in to place bets.
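Parts (c) through (f) all use the same sampling distribution, so they can be checked together. The following Python sketch is one way to do that, assuming `statistics.NormalDist` in place of normalcdf; the per-bet mean and SD are the unrounded values from parts (b) and (d), and the variable names are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist

mu_casino = 0.5263157895        # casino's mean gain per $10 bet (opposite of the player's mean)
sigma = 9.986139979             # SD of a single bet
n = 10_000

sem = sigma / sqrt(n)                           # (c) standard error, about $0.10
mean_dist = NormalDist(mu_casino, sem)          # sampling distribution of x̄

expected_total = n * mu_casino                  # (d) expected profit on 10,000 bets
p_lose = mean_dist.cdf(0)                       # (e) P(x̄ < 0): casino loses money
p_make_2000 = 1 - mean_dist.cdf(2000 / n)       # (f) P(x̄ ≥ 0.20): profit of $2000 or more

print(round(sem, 2), round(expected_total, 2))  # 0.1 5263.16
print(p_lose, round(p_make_2000, 4))
```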

6 Given: μ = 5.00, σ = 0.05, n = 15. Needed: P(∑x > 75.6). A sample weighing 75.6 lb total will have a sample mean of 75.6/15 = 5.04 lb, so this is really just another problem in finding the probability of turning up a sample mean in a given range.

[Sketch: normal curve, μ = 5.0, SEM = 0.05/√15, right tail past 5.04 shaded]

P(∑x > 75.6) = P(x̄ > 5.04) = normalcdf(75.6/15, 10^99, 5.00, 0.05/√(15)) = 9.7295E-4 ≈ 0.0010, about one chance in a thousand.
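The key step here is turning a question about a total into a question about a mean. A short Python sketch of that conversion, again assuming `statistics.NormalDist` as a substitute for normalcdf:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 5.00, 0.05, 15
total = 75.6

xbar = total / n                                     # a total of 75.6 lb is a mean of 5.04 lb
p_over = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(xbar)   # right tail: P(x̄ > 5.04)

print(f"{p_over:.4f}")                               # 0.0010
```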

7 (a) This part is a standard Chapter 7 problem about individuals, not samples, so the axis is x rather than x̄.

[Sketch: normal distribution, shaded to right of 43, probability 0.1634]

normalcdf(43, 10^99, 38, 5.1) = .1634462878

Answer: P(x > 43.0) = 0.1634

(b) The sampling distribution of x̄ is ND, even for this small sample, because the population is ND. The standard error is σ_x̄ = 5.1/√14 ≈ 1.4.

[Sketch: normal distribution, shaded to right of 43, probability 0.0001]

P(x̄ > 43.0) = normalcdf(43, 10^99, 38, 5.1/√(14)) = 1.2212E-4 → P(x̄ > 43.0) = 0.0001 or 0.01%

Remark: This sketch is not very well proportioned, because it makes the probability look much larger than it actually is.
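The contrast between parts (a) and (b) comes down to which standard deviation goes into the normal model: σ for one individual, σ/√n for a sample mean. A minimal Python sketch of both, assuming `statistics.NormalDist` in place of normalcdf:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 38, 5.1, 14

# (a) one individual: use the population SD
p_individual = 1 - NormalDist(mu, sigma).cdf(43)

# (b) mean of a sample of 14: use the standard error σ/√n
p_sample_mean = 1 - NormalDist(mu, sigma / sqrt(n)).cdf(43)

print(round(p_individual, 4), round(p_sample_mean, 4))   # 0.1634 0.0001
```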

8 12,778 KW shared among 1000 households is 12778/1000 = 12.778 KW per household on average. “Fail to supply enough power” means that the households are using more power than that. You need P(x̄ > 12.778) for n = 1000.

[Sketch: normal distribution shaded right of 12.778, probability 0.0060]

The standard error of the mean is σ_x̄ = 3.5/√1000, about 0.11. The sampling distribution of the mean is normal because data are numeric and n = 1000, greater than 30. (Treat the sample as random because it’s a “typical neighborhood”. And a thousand households is less than 10% of all the households that there are.)

P(x̄ > 12.778) = normalcdf(12.778, 10^99, 12.5, 3.5/√(1000)) = 0.0060

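9 p = 0.0171, n = 11,037, and you want to find P(p̂ ≤ 0.0094).

[Sketch: normal distribution shaded left of 0.0094; probability is 2×10⁻¹⁰]

First check that the sampling distribution of p̂ is a ND: np = 11,037 × 0.0171 ≈ 189 and nq = 11,037 × (1 − 0.0171) ≈ 10,848 are both far above 10, and 10n = 110,370 is far less than the number of adult males.

Therefore the sampling distribution can be approximated by a normal distribution.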

The standard error of the proportion or SEP is σ_p̂ = √[pq/n] = √[.0171(1−.0171)/11037] ≈ 0.0012

If you use my shortcut, your screen will look like the one at the left; if not, it will look like the one at the right.

Left screen: short form of the computation (see text).

Right screen: √(.0171*(1−.0171)/11037) gives .0012340342; then normalcdf(−10^99, .0094, .0171, √(.0171*(1−.0171)/11037)) gives 2.2013E−10

Either way, the probability is 2.2013×10⁻¹⁰, or 0.000 000 000 2. There are only two chances in ten billion of getting a sample proportion of 0.94% or less with sample size 11,037, if the true population proportion is 1.71%. That’s pretty darn unlikely, so based on this experiment you can rule out coincidence and decide that aspirin does reduce the chance of a heart attack among adult males.
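For a proportion, the check and the computation can be done in one place. The sketch below is a rough Python equivalent, not the calculator shortcut itself; it assumes the common rule of thumb that np and nq should each be at least 10, and uses `statistics.NormalDist` in place of normalcdf.

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.0171, 11_037
q = 1 - p

# Rule-of-thumb check (the threshold of 10 is an assumption): enough expected
# successes and failures for a normal model of p̂
normal_ok = n * p >= 10 and n * q >= 10

sep = sqrt(p * q / n)                        # standard error of the proportion
p_low = NormalDist(p, sep).cdf(0.0094)       # P(p̂ ≤ 0.0094), the left tail

print(normal_ok, round(sep, 4), p_low)       # p_low is on the order of 2×10⁻¹⁰
```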

10 Heights are ND, so the sampling distribution is also. By the Empirical Rule or 68–95–99.7 Rule, 95% of a ND falls within 2 SD of the mean. The distribution that concerns you in this problem is the sampling distribution of x̄, not the original distribution of individual men’s heights. Therefore, the SD that concerns you is the standard error of the mean, not the SD of men’s heights.

The standard error of the mean or SEM is σ_x̄ = σ/√n = 2.92/√16 = 0.73″.

μ_x̄ ± 2σ_x̄ = 69.3 ± 2×.73 = 67.84 to 70.76.

Sample means between those values would not be surprising, and therefore a sample mean would be surprising if it is under 67.84″ or over 70.76″.

Alternative solution: That back-of-the-envelope calculation is good enough, but you could also get a more precise answer:

L = invNorm(0.025, 69.3, 2.92/√(16)) = 67.87

H = invNorm(1−0.025, 69.3, 2.92/√(16)) = 70.73
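Both versions of the answer can be checked in Python. This sketch assumes `NormalDist.inv_cdf` from the standard library as the counterpart of invNorm; the names `rough_low`, `rough_high`, `low` and `high` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 69.3, 2.92, 16
sem = sigma / sqrt(n)                         # 0.73 inches
dist = NormalDist(mu, sem)                    # sampling distribution of x̄

# Empirical Rule version: mean ± 2 standard errors
rough_low, rough_high = mu - 2 * sem, mu + 2 * sem

# More precise version: cut off 2.5% in each tail
low = dist.inv_cdf(0.025)
high = dist.inv_cdf(1 - 0.025)

print(round(rough_low, 2), round(rough_high, 2))   # 67.84 70.76
print(round(low, 2), round(high, 2))               # 67.87 70.73
```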

11 This is like the Swain v. Alabama example. You have to convert the sample counts into a proportion: p̂ = 737/1504 ≈ 49%. The problem is really asking you for P(p̂ ≥ 49%) in a sample of 1504 with population proportion of 45%.

What does the sampling distribution look like? The center is μ_p̂ = p = 0.45. The standard error is σ_p̂ = √[.45×(1−.45)/1504] ≈ 0.013.

[Sketch: sampling distribution of p̂ with center 0.45, standard error 0.013, and right-hand tail from 0.49 shaded]

Check requirements to make sure that a normal model can be used for the sampling distribution: np = 1504 × 0.45 ≈ 677 and nq = 1504 × (1 − 0.45) ≈ 827 are both far above 10, and 10n = 15,040 is far less than the number of Americans.

P(x ≥ 737) = P(p̂ ≥ 49%) = normalcdf(737/1504, 10^99, .45, √(.45*(1−.45)/1504)) ≈ 9E-4 or 0.0009.

Can you draw a conclusion? Yes, you can. In a population with 45% unfavorable rating of the Tea Party, there are only 9 chances in 10,000 of getting a sample as unfavorable as this one (or more unfavorable). That’s pretty unlikely, so you conclude that the true unfavorable rating in October was most likely more than 45% of all Americans. (In Chapter 9, you’ll learn how to estimate that proportion from a sample.)
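The conversion from a count to a proportion, followed by the right-tail probability, might look like this in Python. It is a sketch under the same assumption as before, namely `statistics.NormalDist` standing in for normalcdf.

```python
from math import sqrt
from statistics import NormalDist

p, n, x = 0.45, 1504, 737

p_hat = x / n                                  # sample proportion, about 0.49
sep = sqrt(p * (1 - p) / n)                    # standard error of the proportion, about 0.013
p_right = 1 - NormalDist(p, sep).cdf(p_hat)    # P(p̂ ≥ 737/1504), the right tail

print(round(p_hat, 2), round(sep, 3), round(p_right, 4))   # 0.49 0.013 0.0009
```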


Updates and new info: https://BrownMath.com/swt/
