
# Stats without Tears: Solutions for Chapter 11

Updated 1 Jan 2016

1

(a) Use MATH200A part 5 and select `2-pop binomial`. You have no prior estimates, so enter 0.5 for p̂1 and p̂2. E is 0.03, and C-Level is 0.95. Answer: you need at least 2135 per sample, meaning 2135 people under 30 and 2135 people aged 30 and older.

Caution! Even if you don’t identify the groups, you must at least say “per sample”. Plain “2135” makes it look like you need only that many people in the two groups combined, or around 1068 per group, and that is very wrong.

Caution! You must compute this as a two-population case. If you compute a sample size for just one group or the other, you get 1068, which is just about half of the correct value.

If you don’t have the program, you have to use the formula: [p̂1(1−p̂1)+p̂2(1−p̂2)]·(zα/2/E)². You don’t have any prior estimates, so p̂1 and p̂2 are both equal to 0.5. Add p̂1(1−p̂1) + p̂2(1−p̂2) = .25 + .25 to get .5.

Next, 1−α = 0.95, so α = 0.05 and α/2 = 0.025. zα/2 = z0.025 = invNorm(1−0.025). Divide that by E (.03), square, and multiply by the result of the computation with the p̂’s. That gives about 2134.1, which rounds up to 2135 per sample.
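The formula can also be checked away from the calculator. This is a minimal Python sketch, not part of MATH200A: the function name `two_pop_sample_size` is made up for illustration, and `NormalDist().inv_cdf` from the standard library plays the role of invNorm.

```python
from math import ceil
from statistics import NormalDist


def two_pop_sample_size(p1, p2, margin, conf_level):
    """Per-sample size needed to estimate p1 - p2 to within `margin`.

    Hypothetical helper: [p1(1-p1) + p2(1-p2)] * (z_(alpha/2) / E)^2,
    rounded up to the next whole person.
    """
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # same as invNorm(1 - alpha/2)
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil(variance_sum * (z / margin) ** 2)


# No prior estimates, so both proportions are taken as 0.5.
print(two_pop_sample_size(0.5, 0.5, 0.03, 0.95))  # 2135 per sample
```

Always round up, never to nearest: a sample that is even slightly too small gives a margin of error larger than the one you asked for.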

(b) Using MATH200A Program part 5 with .3, .45, .03, .95 gives 1953 per sample.

Alternative solution: Using the formula, .3(1−.3)+.45(1−.45) = .4575. Multiply by (invNorm(1−.05/2)/.03)² as before to get 1952.74157 → 1953 per sample.

Again, you must do this as two-population binomial. If you do the under-30 group and the 30+ group separately, you get sample sizes of 897 and 1057, which are way too small. If your samples are that size, the margins of error for under-30 and 30+ will each be 3%, but the margin of error for the difference, which is what you care about, will be around 4.2%, and that’s greater than the desired 3%.
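To see why the separately computed sample sizes fall short, you can compute the margin of error of the difference directly. The sketch below assumes the usual unpooled standard error for a difference of proportions; `diff_margin` is an illustrative name, not a calculator command.

```python
from math import sqrt
from statistics import NormalDist


def diff_margin(p1, n1, p2, n2, conf_level=0.95):
    """Margin of error for p1 - p2 with samples of size n1 and n2 (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)


# Sample sizes from treating each group as a one-population problem.
too_small = diff_margin(0.30, 897, 0.45, 1057)
# Sample sizes from the two-population computation.
correct = diff_margin(0.30, 1953, 0.45, 1953)

print(round(too_small, 3))  # about 0.042, worse than the desired 0.03
print(correct <= 0.03)      # True
```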

2

(a) You have numeric data in two independent samples. You’re testing the difference between the means of two populations, Case 4 in Inferential Statistics: Basic Cases. (The data aren’t paired because you have no reason to associate any particular Englishman with any particular Scot.)

(1) Population 1 = English; population 2 = Scots.

H0: μ1 = μ2 (or μ1−μ2 = 0)

H1: μ1 > μ2 (or μ1−μ2 > 0)

α = 0.05

The problem states that samples were random. For English, r=.9734 and crit=.9054; for Scots, r=.9772 and crit=.9054. Both r’s are greater than crit, so both are nearly normally distributed. The stacked boxplot shows no outliers. And obviously the samples of 8 are far less than 10% of the populations of England and Scotland.

English numbers in L1, Scottish numbers in L2. 2-SampTTest with Data; L1, L2, 1, 1, μ1>μ2, Pooled:No

Outputs: t=1.57049305 → t = 1.58, p=.0689957991 → p = 0.0690, df=13.4634, x̅1=6.54, x̅2=4.85, s1=1.91, s2=2.34, n1=8, n2=8

p > α. Fail to reject H0.

At the 0.05 level of significance, we can’t say whether English or Scots have a stronger liking for soccer. Or, We can’t say whether English or Scots have a stronger liking for soccer (p = 0.0690).

(b) 2-SampTInt, C-Level=.90
Results: (−.2025, 3.5775)
We’re 90% confident that, on a scale from 1=hate to 10=love, the average Englishman likes soccer between 0.2 points less and 3.6 points more than the average Scot.

3

(a) This is the difference of proportions in two populations, Case 5 in Inferential Statistics: Basic Cases.

(1) Population 1 = English, population 2 = Scots.

H0: p1 = p2 (or p1−p2 = 0)

H1: p1 ≠ p2 (or p1−p2 ≠ 0)

α = 0.05

Populations of England and Scotland are greater than 10×150 = 1500 and 10×200 = 2000. England: 105 successes, 150−105 = 45 failures, both ≥ 10. Scotland: 160 successes, 200−160 = 40 failures, both ≥ 10. The samples were stated to be random.

2-PropZTest x1=105, n1=150, x2=160, n2=200, p1≠p2

Results: z=−2.159047761 → z = −2.16, p=.030846351 → p = 0.0308, p̂1 = 0.70, p̂2 = 0.80, p̂ = 0.7571428571

p < α. Reject H0 and accept H1.

The English and Scots are not equally likely to be soccer fans, at the 0.05 level of significance; in fact the English are less likely to be soccer fans. Or, The English and Scots are not equally likely to be soccer fans (p = 0.0308); in fact the English are less likely to be soccer fans.
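If you want to see what 2-PropZTest is doing, the sketch below follows the standard pooled two-proportion z test. It is an illustration in Python, not the calculator’s actual code, and `two_prop_z_test` is a made-up name.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_test(x1, n1, x2, n2):
    """Two-tailed pooled z test for H0: p1 = p2 (hypothetical helper)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # blended proportion, assuming H0 is true
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    p_value = 2 * NormalDist().cdf(-abs(z))  # area in both tails
    return z, p_value, pooled


z, p_value, pooled = two_prop_z_test(105, 150, 160, 200)
print(round(z, 2), round(p_value, 4), round(pooled, 4))  # -2.16 0.0308 0.7571
```

The blended proportion is used only in the hypothesis test, because the test assumes the two populations share one proportion.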

(b) 2-PropZInt with C-Level = .95 → (−.1919, −.0081)

That’s the estimate for p1−p2, English minus Scots. Since both endpoints are negative, the English are less likely than Scots to be soccer fans. With 95% confidence, Scots are more likely than English to be soccer fans, by 0.8 to 19.2 percentage points.

(c) [(−.0081) − (−.1919)] / 2 = 0.0919, a little over 9 percentage points.
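The interval and its margin of error can be reproduced with a short sketch. It assumes the usual unpooled (Wald) interval for a difference of proportions, which matches the numbers above; `two_prop_z_interval` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(x1, n1, x2, n2, conf_level=0.95):
    """Confidence interval for p1 - p2, returned as (low, high, margin). Hypothetical helper."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    # Unpooled standard error: each sample keeps its own proportion.
    margin = z * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin, margin


low, high, margin = two_prop_z_interval(105, 150, 160, 200)
print(round(low, 4), round(high, 4), round(margin, 4))  # -0.1919 -0.0081 0.0919
```

The margin of error is half the width of the interval, which is why subtracting the endpoints and dividing by 2 gives the same 0.0919.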

(d) MATH200A part 5, 2-pop binomial, p̂1=.7, p̂2=.8, E=.04, C-Level .95 gives 889 per sample.

By formula, zα/2 = z0.025 = invNorm(1−0.025) = 1.96.
n1 = n2 = [.7(1−.7)+.8(1−.8)]×(1.96/.04)² = 888.37 → 889 per sample

4

(a) This is before-and-after paired data, Case 3 in Inferential Statistics: Basic Cases. You’re testing the mean difference.

(1) d = After−Before

H0: μd = 0, running makes no difference in HDL

H1: μd > 0, running increases HDL

Remark: If this were a research study, they would probably test for a difference in HDL, not just an increase. Maybe this study was done by a fitness center or a running-shoe company. They would want to find an increase, and HDL decreasing or staying the same would be equally uninteresting to them.

α = 0.05

Before in L1, After in L2, L3=L2−L1

Random sample. Five women is obviously less than 10% of all women. Box-whisker (L3) shows no outliers. Normality check (L3): r(.9131)>crit(.8804).

T-Test 0, L3, 1, μ>0

Results: t=3.059874484 → t = 3.06, p=.0188315555 → p = 0.0188, d̅=4.6, s=3.36, n=5

p < α. Reject H0 and accept H1.

At the 0.05 level of significance, running 4 miles daily for six months raises HDL level. Or, Running 4 miles daily for six months raises HDL level (p = 0.0188).
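As a sanity check on the test statistic, you can recompute t from the summary numbers. This sketch uses the rounded values d̅ = 4.6 and s = 3.36, so it agrees with the calculator only to about two decimal places. It stops at t and df because Python’s standard library has no t distribution; the p-value still comes from the calculator. The function name is made up for illustration.

```python
from math import sqrt


def paired_t_statistic(mean_diff, sd_diff, n, mu0=0.0):
    """t statistic and degrees of freedom for a test on mean difference (hypothetical helper)."""
    standard_error = sd_diff / sqrt(n)
    t = (mean_diff - mu0) / standard_error
    return t, n - 1


# Rounded summary statistics of the After-Before differences.
t, df = paired_t_statistic(4.6, 3.36, 5)
print(round(t, 2), df)  # 3.06 4
```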

(b) TInterval with C-Level .9 gives (1.3951, 7.8049).

Interpretation: You are 90% confident that running an average of four miles a day for six months will raise HDL by 1.4 to 7.8 points for the average woman.

Caution! Don’t write something like “I’m 90% confident that HDL will be 1.4 to 7.8”. The confidence interval is not about the HDL level; it’s about the change in HDL level.

Remark: Notice the correspondence between hypothesis test and confidence interval. The one-tailed HT at α = 0.05 is equivalent to a two-tailed HT at α = 0.10, and the complement of that is a CI at 1−α = 0.90 or a 90% confidence level. Since the HT did find a statistically significant effect, you know that the CI will not include 0. If the HT had failed to find a significant effect, then the CI would have included 0. See Confidence Interval and Hypothesis Test.

5

(a) Each participant either had a heart attack or didn’t, and the doctors were all independent in that respect. This is binomial data. You’re testing the difference in proportions between two populations, Case 5 in Inferential Statistics: Basic Cases.

(1) Population 1: Aspirin takers; population 2: non-aspirin takers.

H0: p1 = p2, taking aspirin makes no difference

H1: p1 ≠ p2, taking aspirin makes a difference

α = 0.001

SRS. 10n1 = 10×11,037 = 110,370. According to A Census of Actively Licensed Physicians in the United States, 2010 (Young (2011) [see “Sources Used” at end of book]), in that year there were 850,085 actively licensed physicians in the US. Even if we assume half were women and there were fewer doctors in 1982 when the study began, 10n1 is still lower than the population size. 10n2 = 10×11,034 = 110,340, also within the limit. Treatment group: 139 successes, 11037−139 = 10898 failures, both ≥ 10. Placebo group: 239 successes, 11034−239 = 10795 failures, both ≥ 10.

2-PropZTest: x1=139, n1=11037, x2=239, n2=11034, p1≠p2

Results: z=−5.19, p-value = 2×10⁻⁷, p̂1 = .0126, p̂2 = .0217, p̂ = .0171

p < α. Reject H0 and accept H1.

At the 0.001 level of significance, aspirin does make a difference to the likelihood of heart attack. In fact it reduces it. Or, Aspirin makes a difference to the likelihood of heart attack (p < 0.0001). In fact, aspirin reduces the risk.

Remark: The study was conducted from 1982 to 1988 and was stopped early because the results were so dramatic. For a non-technical summary, see Physicians’ Health Study (2009) [see “Sources Used” at end of book]. More details are in the original article from the New England Journal of Medicine (Steering Committee 1989 [see “Sources Used” at end of book]).

(b) `2-PropZInt` with C-Level .95 gives (−.0125, −.0056). We’re 95% confident that 325 mg of aspirin every other day reduces the chance of heart attack by 0.56 to 1.25 percentage points.

Caution! You’re estimating the change in heart-attack risk, not the risk of heart attack. Saying something like “with aspirin, the risk of heart attack is 0.56 to 1.25%” would be very wrong.

6

(a) You’re estimating the difference in means between two populations. This is Case 4 in Inferential Statistics: Basic Cases. Requirements:

• Random samples (given).
• Sample sizes both ≥ 30.
• 10×30 = 300 and 10×32 = 320 are less than the numbers of houses in the two counties.

Population 1 = Cortland County houses, population 2 = Broome County houses.
2-SampTInt, 134296, 44800, 30, 127139, 61200, 32, .95, No

results: (−20004, 34318)

June is 95% confident that the average house in Cortland County costs \$20,004 less to \$34,318 more than the average house in Broome County.
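The pieces behind that interval can be sketched from the summary statistics. This assumes that the unpooled option (the final “No”) uses the Welch-Satterthwaite degrees of freedom, which is the usual approach; `welch_se_and_df` is an illustrative name. The critical t value is not computed here because Python’s standard library has no t distribution.

```python
from math import sqrt


def welch_se_and_df(s1, n1, s2, n2):
    """Standard error and Welch-Satterthwaite df for a difference of means (hypothetical helper)."""
    v1 = s1 ** 2 / n1  # variance of the first sample mean
    v2 = s2 ** 2 / n2  # variance of the second sample mean
    se = sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df


diff = 134296 - 127139  # Cortland mean minus Broome mean: 7157
se, df = welch_se_and_df(44800, 30, 61200, 32)
print(diff, round(se), round(df, 1))
```

The standard error comes out near \$13,560 with roughly 57 degrees of freedom. The critical t for 95% confidence at that df is about 2.00, so the margin of error is about \$27,000 on either side of \$7,157, in line with the calculator’s interval.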

(b) A 95% confidence interval is the complement of a significance test for ≠ at α = 0.05. Since 0 is in the interval, you know the p-value would be >0.05 and therefore June can’t tell, at the 0.05 significance level, whether there is any difference in average house price in the two counties or not.

If both ends of the interval were positive, that would indicate a difference in averages at the 0.05 level, and you could say Cortland’s average is higher than Broome’s. Similarly, if both ends were negative you could say Cortland’s average is lower than Broome’s. But as it is, nada.

Remark: Obviously Broome County is cheaper in the sample. But the difference is not great enough to be statistically significant. Maybe the true mean in Broome really is less than in Cortland; maybe they’re equal; maybe Broome is more expensive. You simply can’t tell from these samples.

7

The immediate answer is that those are proportions in the sample, not the proportions among all voters. This is two-population binomial data, Case 5 in Inferential Statistics: Basic Cases.
Requirements check:

• Random samples, OK.
• Each sample 10n = 10×1000 = 10,000. There are far more than 10,000 voters nationally; OK.
• The two samples were independent, OK.
• Red: 520 successes and 1000−520 = 480 failures, OK.
Blue: 480 successes and 1000−480 = 520 failures, OK.

Population 1 = Red voters, population 2 = Blue voters.
2-PropZInt 520, 1000, 480, 1000, .95
Results: (−.0038, .08379), p̂1=.52, p̂2=.48

With 95% confidence, the Red candidate is somewhere between 0.4 percentage points behind Blue and 8.4 ahead of Blue. The confidence interval contains 0, and so it’s impossible to say whether either one is leading.

Remark: Newspapers often report the sample proportions p̂1 and p̂2 as though they were population proportions, but now you know that they aren’t. A different poll might have similar results, or it might have samples going the other way and showing Blue ahead of Red.

8

(a) For a confidence interval, each sample must have at least 10 successes and at least 10 failures. Sample 1 has only 7 successes. Requirements are not met, and you cannot compute a confidence interval with 2-PropZInt.

(b) For a hypothesis test, we often use “at least 10 successes and 10 failures in each sample” as a shortcut requirements test, but the real requirement is at least 10 successes and 10 failures expected in each sample, using the blended proportion p̂. If the shortcut procedure fails, you must check the real requirement. In this problem, the blended proportion is

p̂ = (x1+x2)/(n1+n2) = (7+18)/(28+32) = 25/60, about 42%.

For sample 1, with n1 = 28, you would expect 28×25/60 ≈ 11.7 successes and 28−11.7 = 16.3 failures. For sample 2, with n2 = 32, you would expect 32×25/60 ≈ 13.3 successes and 32−13.3 = 18.7 failures. Because all four of these expected numbers are at least 10, it’s valid to compute a p-value using 2-PropZTest.
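The expected-count check is easy to automate. Here is a minimal sketch with a made-up function name, `expected_counts`, that applies the blended proportion to each sample size.

```python
def expected_counts(x1, n1, x2, n2):
    """Expected (successes, failures) in each sample under H0: p1 = p2 (hypothetical helper)."""
    pooled = (x1 + x2) / (n1 + n2)  # blended proportion
    return [(n * pooled, n * (1 - pooled)) for n in (n1, n2)]


counts = expected_counts(7, 28, 18, 32)
requirement_met = all(count >= 10 for pair in counts for count in pair)

for successes, failures in counts:
    print(round(successes, 1), round(failures, 1))
print(requirement_met)  # True
```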

## What’s New

• 1 Jan 2016: Retake the screen shots here, here, and here, for the new version of MATH200A Program part 4.
• 5 Dec 2015: Correct “%” to “percentage points”, here.
• 19 Jan 2015: Add a problem that requires the formal requirements check for Case 5.
• (intervening changes suppressed)
• 29 Apr 2013: New document.