Stats without Tears
Solutions to Review Problems
Updated 1 Jan 2016
(What’s New?)
Copyright © 2007–2017 by Stan Brown
Write your answer to each question. There’s no work to be shown. Don’t bother with a complete sentence if you can answer with a word, number, or phrase.
Common mistake: Binomial is a subtype of qualitative data, so it’s not really a synonym. Discrete and continuous are subtypes of numeric data.
The Gambler’s Fallacy is believing that the die is somehow “due for a 6”. The Law of Large Numbers says that in the long run the proportion of 6’s will tend toward 1/6, but it doesn’t tell us anything at all about any particular roll.
Common mistake: You must specify which is population 1 and which is population 2.
Common mistake: The data type is binomial: a student is in trouble, or not. There are no means, so μ is incorrect in the hypotheses.
This is a binomial PD.
Remark: The significance level α is the level of risk of a Type I error that you can live with. If you can live with more risk, you can reach more conclusions.
Complementary events can’t happen at the same time and one or the other must happen. Example: rolling a die and getting an odd or an even. Complementary events are a subtype of disjoint events.
For a small to moderate-sized set of numeric data, you might prefer a stemplot.
Remark: C is wrong because “model good” is H_{0}. D is also wrong: every hypothesis test, without exception, compares a p-value to α. For E, df is the number of cells minus 1. F is backward: in every hypothesis test you reject H_{0} when your sample is very unlikely to have occurred by random chance.
Remark: As stated, what you can prove depends partly on your H_{1}. There are three things it could be:
Regardless of H_{1}, if p-value > α your conclusion will be D or similar to it.
Common mistake: Conclusion A is impossible because it’s the null hypothesis and you never accept the null hypothesis.
Conclusion B is also impossible. Why? Because “no more than” translates to ≤. But you can’t have ≤ in H_{1}, and H_{1} is the only hypothesis that can be accepted (“proved”) in a hypothesis test.
Remark: A Type I error is a wrong result, but it is not necessarily the result of a mistake by the experimenter or statistician.
(b) The size is unknown, but certainly in the millions. You also could call it infinite, or uncountable. Common mistake: Don’t confuse size of population with size of sample. The population size is not the 487 from whom you got surveys, and it’s not the 321 churchgoers in your sample.
(c) The sample size n is the 321 churchgoers from whom you collected surveys. Yes, you collected 487 surveys in all, but you have to disregard the 166 that didn’t come from churchgoers, because they are not your target group. Common mistake: 227 isn’t the sample size either. It’s x, the number of successes within the sample.
(d) No. You want to know the attitudes of churchgoers, so it is correct sampling technique to include only churchgoers in your sample.
If you wanted to know about Americans in general, then it would be selection bias to include only churchgoers, since they are more likely than non-churchgoers to oppose teaching evolution in public schools.
Common mistake: Your answer will probably be worded differently from that, but be careful that it is a conditional probability: If H_{0} is true, then there’s a p-value chance of getting a sample this extreme or more so. The p-value isn’t the chance that H_{0} is true.
Remark: If you are at all shaky about this, review What Does the p-Value Mean?
binomcdf(100,.08,5)
(b) This is a binomial distribution, for exactly the same reasons.
MATH200A part 3, or binompdf(100,.08,5)
(c) The probability of success is p = 0.08 on every trial, but you don’t have a fixed number of trials. This is a geometric distribution.
geometpdf(.08,5)
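The three calculator results can be cross-checked with the binomial and geometric formulas in plain Python (a sketch, not part of the original solution):

```python
# Cross-check of binomcdf, binompdf, and geometpdf in plain Python.
from math import comb

n, p = 100, 0.08

def binom_pmf(k, n, p):
    # P(exactly k successes in n independent trials)
    return comb(n, k) * p**k * (1 - p)**(n - k)

# (a) binomcdf(100, .08, 5): P(at most 5 successes in 100 trials)
p_at_most_5 = sum(binom_pmf(k, n, p) for k in range(6))

# (b) binompdf(100, .08, 5): P(exactly 5 successes in 100 trials)
p_exactly_5 = binom_pmf(5, n, p)

# (c) geometpdf(.08, 5): P(first success on trial 5) -- no fixed number
# of trials, so the geometric model applies: (1-p)^4 * p
p_first_on_5 = (1 - p)**4 * p

print(round(p_at_most_5, 4), round(p_exactly_5, 4), round(p_first_on_5, 4))
```
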
Remark: There is no specific claim, so this is not a hypothesis test.
Caution: The percentages must add to 100%. Therefore you must have complete data on all categories to display a pie chart. Also, if multiple responses from one subject are allowed, then a pie chart isn’t suitable, and you should use some other presentation, such as a bar graph.
Remark: This problem tests for several very common mistakes by students. Always make sure that
This leaves you with G and K as possibilities. Either can be correct, depending on your textbook. For example, the textbooks used at TC3 always put a plain = sign in H_{0} regardless of H_{1}, so for TC3 students the correct answer is G. Students at some other institutions might have K as the correct answer.
Remark: The ZTest is wrong because you don’t know the SD of the selling price of all 2006 Honda Civics in the US. The 1PropZTest and χ² test are for non-numeric data. There is no such thing as a 1PropTTest.
Example: “812 of 1000 Americans surveyed said they believe in ghosts” is an example of descriptive statistics: the numbers of yeses and noes in the sample were counted. “78.8% to 83.6% of Americans believe in ghosts (95% confidence)” is an example of inferential statistics: sample data were used to make an estimate about the population. “More than 60% of Americans believe in ghosts” is another example of inferential statistics: sample data were used to test a claim and make a statement about a population.
Remark: Remember that the confidence interval derives from the central 95% or 90% of the normal distribution. The central 90% is obviously less wide than the central 95%, so the interval will be less wide.
Example: You want to know the average amount of money a full-time TC3 student spends on books in a semester. The population is all full-time TC3 students. You randomly select a group of students and ask each one how much s/he spent on books this semester. That group is your sample.
Remark: This is unpaired numeric data, Case 4.
(b) For binomial data, requirements are slightly different between CI and HT. Here you are doing a hypothesis test.
Common mistake: For a hypothesis test, you need expected successes and failures. It’s incorrect to use actual successes (150) and failures (350).
Common mistake: Some students answer this question with “n > 30”. That’s true, but not relevant here. Sample size 30 is important for numeric data, not binomial data.
Common mistake: You cannot do a 2SampZTest because you do not know the standard deviations of the two populations.
(1) Population 1 = Judge Judy’s decisions; Population 2 = Judge Wapner’s decisions
H_{0}: μ_{1} = μ_{2}, no difference in awards
H_{1}: μ_{1} > μ_{2}, Judge Judy gives higher awards
(2) α = 0.05
(RC)
(3–4) 2SampTTest: x̅1=650, s1=250, n1=32, x̅2=580, s2=260, n2=32, μ_{1}>μ_{2}, Pooled: No
Results: t = 1.10, p-value = 0.1383
(5) p > α. Fail to reject H_{0}.
(6) At the 0.05 level of significance, we can’t tell whether Judge Judy was more friendly to plaintiffs (average award higher than Judge Wapner’s) or not.
Some instructors do a preliminary F-test. It gives p = 0.9089 > 0.05, so after that test you would use Pooled:Yes in the 2SampTTest and get p = 0.1553.
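If you want to see where the calculator’s t = 1.10 comes from, the unpooled (Welch) t statistic and its degrees of freedom can be recomputed from the summary statistics. A plain-Python sketch, not part of the original solution:

```python
# Recompute the Welch (unpooled) 2-sample t statistic from summary stats.
import math

xbar1, s1, n1 = 650, 250, 32   # Judge Judy's awards
xbar2, s2, n2 = 580, 260, 32   # Judge Wapner's awards

se = math.sqrt(s1**2 / n1 + s2**2 / n2)   # standard error of the difference
t = (xbar1 - xbar2) / se                  # matches the TI's t = 1.10

# Welch-Satterthwaite degrees of freedom, which the TI uses to find the
# p-value from the t distribution (about 61.9 here).
v1, v2 = s1**2 / n1, s2**2 / n2
df = (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
print(round(t, 2), round(df, 1))
```
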
Solution: This is one-population numeric data, and you don’t know the standard deviation of the population: Case 1. Put the data in L1, and 1VarStats L1 tells you that x̅ = 4.56, s = 1.34, n = 8.
(1) H_{0}: μ = 4, 4% or less improvement in drying time
H_{1}: μ > 4, better than 4% decrease in drying time
Remark: Why is a decrease in drying time tested with > and not <? Because the data show the amount of decrease. If there is a decrease, the amount of decrease will be positive, and you are interested in whether the average decrease is greater than 4 (4%).
(2) α = 0.05
(RC) (You don’t have to show these graphs on your exam paper; just show the numeric test for normality and mention that the modified boxplot shows no outliers.)
(3–4) TTest: μ_{o}=4, x̅=4.5625, s=1.34..., n=8, μ>μ_{o}
Results: t = 1.19, p = 0.1370
(5) p > α. Fail to reject H_{0}.
(6) At the 0.05 significance level, we can’t tell whether the average drying time improved by more than 4% or not.
(b) TInterval: CLevel=.95
Results: (3.4418, 5.6832)
(There’s no need to repeat the requirements check or to write down all the sample statistics again.)
With 95% confidence, the true mean decrease in drying time is between 3.4% and 5.7%.
n = 5, p = 0.28, from = 0, to = 0. Answer: 0.1935
Alternative solution: If you don’t have the program, you can compute the probability that one rabbit has short hair (1−.28 = 0.72), then that all the rabbits have short hair (0.72^5 = 0.1935), which is the same as the probability that none of the rabbits have long hair.
(b) The complement of “one or more” is none, so you can use the previous answer.
P(one or more) = 1−P(none) = 1−0.1935 = 0.8065
Alternative solution: MATH200A part 3 with n=5, p=.28, from=1, to=5; probability = 0.8065
(c) Again, use MATH200A part 3 to compute binomial probability: n = 5, p = 0.28, from = 4, to = 5. Answer: 0.0238
Alternative solution: If you don’t have the program, do binompdf(5, .28) and store into L3, then sum(L3,5,6) or L3(5)+L3(6) = 0.0238. Avoid the dreaded off-by-one error! For x=4 and x=5 you want L3(5) and L3(6), not L3(4) and L3(5).
For n=5, P(x≥4) = 1−P(x≤3). So you can also compute the probability as 1−binomcdf(5, .28, 3) = 0.0238.
(d) For this problem you must know the formula:
μ = np = 5×0.28 = 1.4 per litter of 5, on average
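All four parts of the rabbit problem can be cross-checked with the binomial formula in a few lines of Python (a sketch using only the standard library, not part of the original solution):

```python
# Binomial cross-check: litter of 5, P(long hair) = 0.28 per rabbit.
from math import comb

n, p = 5, 0.28

def binom_pmf(k):
    # P(exactly k long-haired rabbits in the litter)
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_none = binom_pmf(0)                      # (a) no long-haired rabbits
p_at_least_1 = 1 - p_none                  # (b) complement of "none"
p_4_or_5 = binom_pmf(4) + binom_pmf(5)     # (c) four or more
mean = n * p                               # (d) mu = np = 1.4 per litter

print(round(p_none, 4), round(p_at_least_1, 4), round(p_4_or_5, 4), mean)
```
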
Common mistake: It might be tempting to do this problem as a goodness-of-fit, Case 6, taking the Others row as the model and the doctors’ choices as the observed values. But that would be wrong. Both the Doctors row and the Others row are experimental data, and both have some sampling error around the true proportions. If you take the Others row as the model, you’re saying that the true proportions for all non-doctors are precisely the same as the proportions in this sample. That’s rather unlikely.
(1) H_{0}: Doctors eat different breakfasts in the same proportions as others.
H_{1}: Doctors eat different breakfasts in different proportions from others.
(2) α = 0.05
(3–4) χ²-Test gives χ² = 9.71, df = 4, p = 0.0455
(RC)
(5) p < α. Reject H_{0} and accept H_{1}.
(6) Yes, doctors do choose breakfast differently from other self-employed professionals, at the 0.05 significance level.
(b) 70−67.6 = 2.4″, and therefore z = −1. By the Empirical Rule, 68% of data lie between z = ±1. Therefore 100−68 = 32% lie outside z = ±1 and 32%/2 = 16% lie below z = −1. Therefore 67.6″ is the 16th percentile.
Alternative solution: Use the big chart to add up the proportion of men below 67.6″ or below z = −1. That is 0.15+2.35+13.5 = 16%.
(c) z = (74.8−70)/2.4 = +2. By the Empirical Rule, 95% of men fall between z = −2 and z = +2, so 5% fall below z = −2 or above z = +2. Half of those, 2.5%, fall above z = +2, so 100−2.5 = 97.5% fall below z = +2. 97.5% of men are shorter than 74.8″.
Alternative solution: You could also use the big chart to find that P(z > 2) = 2.35+0.15 = 2.5%, and then P(z < 2) = 100−2.5 = 97.5%.
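The Empirical Rule and the chart give quick approximations (16% and 2.5%); the exact normal model agrees to within about a quarter of a percentage point. A sketch using statistics.NormalDist from the Python 3.8+ standard library (not part of the original solution):

```python
# Exact normal-model check of the Empirical Rule answers.
from statistics import NormalDist

heights = NormalDist(mu=70, sigma=2.4)   # men's heights, inches

p_below = heights.cdf(67.6)        # z = -1; Empirical Rule says about 16%
p_above = 1 - heights.cdf(74.8)    # z = +2; Empirical Rule says about 2.5%
print(round(100 * p_below, 1), round(100 * p_above, 1))
```
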
(b) Compute the class marks or midpoints: 575, 725, and so on. Put them in L1 and the frequencies in L2. Use 1VarStats L1,L2 and get n = 219. See Summary Numbers on the TI-83.
(c) Further data from 1VarStats L1,L2: x̅ = 990.1 and s = 167.3
Common mistake: If you answered x̅ = 950 you probably did 1VarStats L1 instead of 1VarStats L1,L2. Your calculator depends on you to supply one list when you have a simple list of numbers and two lists when you have a frequency distribution.
(d) f/n = 29/219 ≈ 0.13 or 13%
invNorm(0.85, 57.6, 5.2) = 62.98945357 → 63.0 mph
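invNorm finds the value at a given percentile of a normal distribution; NormalDist.inv_cdf in the Python 3.8+ standard library does the same job. A sketch, not part of the original solution:

```python
# invNorm(0.85, 57.6, 5.2): the 85th percentile of speeds.
from statistics import NormalDist

speeds = NormalDist(mu=57.6, sigma=5.2)
p85 = speeds.inv_cdf(0.85)   # value with 85% of the distribution below it
print(round(p85, 1))         # → 63.0
```
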
MATH200A/sample size/binomial: p̂ = .2, E = 0.04, CLevel = 0.90; answer: 271.
Common mistake: The margin of error is E = 4% = 0.04, not 0.4.
Alternative solution: See Sample Size by Formula and use the formula n = p̂(1−p̂)(z_{α/2}/E)². With the estimated population proportion p̂ = 0.2, you get z_{α/2} = z_{0.05} = invNorm(1−0.05) = 1.6449, and n = 270.5543 → 271
(b) If you have no prior estimate, use p̂ = 0.5. The other inputs are the same, and the answer is 423.
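The sample-size formula n = p̂(1−p̂)(z_{α/2}/E)², always rounded up, reproduces both answers. A standard-library sketch, not part of the original solution:

```python
# Sample size for estimating a proportion within margin of error E.
from math import ceil
from statistics import NormalDist

def sample_size(p_hat, E, conf_level):
    # z_{alpha/2}: critical value leaving alpha/2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return ceil(p_hat * (1 - p_hat) * (z / E) ** 2)   # always round up

print(sample_size(0.20, 0.04, 0.90))   # with prior estimate p-hat = 0.2
print(sample_size(0.50, 0.04, 0.90))   # no prior estimate: use 0.5
```
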
You expect positive correlation because points trend upward to the right (or, because y tends to increase as x increases). Even before plotting, you could probably predict a positive correlation because you assume higher calories come from fat; but you can’t just assume that without running the numbers.
(b) See Step 2 of Scatterplot, Correlation, and Regression on TI-83/84.
r = .8863314629 → r = 0.8863
a = .0586751909 → a = 0.0587
b = −3.440073602 → b = −3.4401
ŷ = 0.0587x − 3.4401
Common mistake: The symbol is ŷ, not y.
(c) The y intercept is −3.4401. It is the number of grams of fat you expect in the average zero-calorie serving of fast food. Clearly this is not a meaningful concept.
Remark: Remember that you can’t trust the regression outside the neighborhood of the data points. Here x varies from 130 to 640. The y intercept occurs at x = 0. That is pretty far outside the neighborhood of the data points, so it’s not surprising that its value is absurd.
(d) See Finding ŷ from a Regression on TI-83/84. Trace at x = 310 and read off ŷ = 14.749... ≈ 14.7 grams fat. This is different from the actual data point (x=310, y=25) because ŷ is based on a trend reflecting all the data. It predicts the average fat content for all 310-calorie fast-food items.
Alternative solution: ŷ = .0586751909(310) − 3.440073602 = 14.749 ≈ 14.7.
(e) The residual at any (x,y) is y−ŷ. At x = 310, y = 25 and ŷ = 14.7 from the previous part. The residual is y−ŷ = 10.3
Remark: If there were multiple data points at x = 310, you would calculate one residual for each point.
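The prediction and the residual follow directly from the fitted coefficients. A plain-Python sketch, not part of the original solution:

```python
# Prediction and residual from the fitted regression line.
a, b = 0.0586751909, -3.440073602   # slope and intercept from LinReg(ax+b)

def y_hat(x):
    # predicted grams of fat for an x-calorie serving
    return a * x + b

pred = y_hat(310)        # predicted fat at 310 calories
residual = 25 - pred     # observed y minus predicted y-hat
print(round(pred, 1), round(residual, 1))
```
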
(f) From the LinReg(ax+b) output, R² = 0.7855834621 → R² = 0.7856
About 79% of the variation in fat content is associated with variation in calorie content. The other 21% comes from lurking variables such as protein and carbohydrate count and from sampling error.
(g) See Decision Points for Correlation Coefficient. Since 0.8863 is positive and 0.8863 > 0.602, you can say that there is some positive correlation in the population, and higher-calorie fast foods do tend to be higher in fat.
(1) d = After − Before
H_{0}: μ_{d} = 0, no improvement
H_{1}: μ_{d} > 0, improvement in number of sit-ups
Remark: Why After−Before instead of the other way round? Since we expect After to be greater than Before, doing it this way you can expect the d’s to be mostly positive (if H_{1} is true). Also, it feels more natural to set things up so that an improvement is a positive number. But if you do d = Before−After and H_{1}: μ_{d} < 0, you get the same p-value.
(2) α = 0.01
(RC) The plots are shown here for comparison to yours, but you don’t need to copy these plots to an exam paper.
(3–4) TTest: μ_{o}=0, List:L4, Freq:1, μ>μ_{o}
Results: t = 2.74, p = 0.0169, x̅ = 4.4, s = 4.3, n = 7
(5) p > α. Fail to reject H_{0}.
(6) At the 0.01 significance level, we can’t say whether the physical fitness course improves people’s ability to do sit-ups or not.
(b) normalcdf(−10^99, 24, 27, 4/√5) = .0467662315 → 0.0468 or about a 5% chance
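The 4/√5 is the standard error σ/√n of a sample mean, with σ = 4 and n = 5 (an inference from the calculator inputs). A standard-library sketch of the same computation, not part of the original solution:

```python
# P(sample mean < 24) when the sampling distribution is N(27, 4/sqrt(5)).
from math import sqrt
from statistics import NormalDist

xbar_dist = NormalDist(mu=27, sigma=4 / sqrt(5))   # sigma/sqrt(n) = 4/sqrt(5)
p = xbar_dist.cdf(24)
print(round(p, 4))   # about 0.0468
```
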
(1) H_{0}: Nebraska preferences are the same as national proportions.
H_{1}: Nebraska preferences are different from national proportions.
(2) α = 0.05
(3–4) US percentages in L1, Nebraska observed counts in L2. MATH200A part 6. The result is χ² = 12.0093 → 12.01, df = 4, p-value = 0.0173
Common mistake: Some students convert the Nebraska numbers to percentages and perform a χ² test that way. The χ² test model can equally well be percentages or whole numbers, but the observed numbers must be actual counts.
(RC)
(5) p < α. Reject H_{0} and accept H_{1}.
(6) Yes, at the 0.05 significance level Nebraska preferences in vacation homes are different from those for the US as a whole.
(1) Population 1 = Course, Population 2 = No course
H_{0}: μ_{1} = μ_{2}, no benefit from diabetic course
H_{1}: μ_{1} < μ_{2}, reduced blood sugar from diabetic course
(2) α = 0.01
(RC) Independent random samples, both n’s > 30
(3–4) 2SampTTest: x̅1=6.5, s1=.7, n1=50, x̅2=7.1, s2=.9, n2=50, μ_{1}<μ_{2}, Pooled:No
Results: t = −3.72, p = 1.7E−4 or 0.0002
Though we do not, some classes use the preliminary 2SampFTest. That test gives p = 0.0816 > 0.05. Those classes would use Pooled:Yes in 2SampTTest and get p = 0.00016551 and the same conclusion.
(5) p < α. Reject H_{0} and accept H_{1}.
(6) At the 0.01 level of significance, the course in diabetic self-care does lower patients’ blood sugar, on average.
(b) For two-population numeric data, paired data do a good job of controlling for lurking variables. You would test each person’s blood sugar, then enroll all thirty patients in the course and test their blood sugar six months after the end of the course. Your variable d is blood sugar after the course minus blood sugar before, and your H_{1} is μ_{d} < 0.
One potential problem is that all 30 patients receive a heightened level of attention, so you have to worry about the placebo effect. (With the original experiment, the control group did not receive the extra attention of being in the course, so any difference from the attention is accounted for in the different results between control group and treatment group.)
It seems unlikely that the placebo effect would linger for six months after the end of a short course, but you can’t rule out the possibility. There are two answers to that. You could retest the patients after a year, or two years. Or, you could ask whether it really matters why patients do better. If they do better because of the course itself, or because of the attention, either way they’re doing better. A short course is relatively inexpensive. If it works, why look a gift horse in the mouth? In fact, medicine is beginning to take advantage of the placebo effect in some treatments.
(1) H_{0}: μ = 2.5 years
H_{1}: μ > 2.5 years
(2) α = 0.05
(RC) random sample, normal with no outliers (given)
(3–4) TTest: μ_{o}=2.5, x̅=3, s=.5, n=6, μ>μ_{o}
Results: t = 2.45, p = 0.0290
(5) p < α. Reject H_{0} and accept H_{1}.
(6) Yes, at the 0.05 significance level, the mean duration of pain for all persons with the condition is greater than 2.5 years.
(1) Population 1 = men, Population 2 = women
H_{0}: p_{1} = p_{2}, men and women equally likely to refuse promotions
H_{1}: p_{1} > p_{2}, men more likely to refuse promotions
(2) α = 0.05
(RC)
(3–4) 2PropZTest: x1=60, n1=200, x2=48, n2=200, p1>p2
Results: z = 1.351474757 → z = 1.35, p = .0882717604 → p-value = 0.0883, p̂1=.3, p̂2=.24, p̂=.27
(5) p > α. Fail to reject H_{0}.
(6) At the 0.05 level of significance, we can’t determine whether the percentage of men who have refused promotions to spend time with their family is more than, the same as, or less than the percentage of women.
(b) 2PropZInt with the above inputs and CLevel=.95 gives (−.0268, .14682). The English sentence needs to state both magnitude and direction, something like this: Regarding men and women who refused promotion for family reasons, we’re 95% confident that men were between 2.7 percentage points less likely than women, and 14.7 percentage points more likely.
Common mistake: With two-population confidence intervals, you must state the direction of the difference, not just the size of the difference.
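The 2PropZInt result is the difference of sample proportions plus or minus z times the unpooled standard error. A standard-library sketch of that arithmetic, not part of the original solution:

```python
# 95% confidence interval for p1 - p2 (difference of proportions).
from math import sqrt
from statistics import NormalDist

x1, n1 = 60, 200    # men who refused promotions
x2, n2 = 48, 200    # women who refused promotions
p1, p2 = x1 / n1, x2 / n2

z = NormalDist().inv_cdf(0.975)               # 95% confidence: z ≈ 1.96
se = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)        # unpooled SE for the interval
lo = (p1 - p2) - z * se
hi = (p1 - p2) + z * se
print(round(lo, 4), round(hi, 4))
```
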
If the middle 95% runs from 70 to 130, then the mean must be μ = (70+130)÷2 → μ = 100
In a normal distribution, 95% of the population is within 2 standard deviations of the mean. The range 70 to 100 (or 100 to 130) is therefore two SDs. 2σ = 100−70 = 30 → σ = 15
(1) H_{0}: p = .75
H_{1}: p < .75
(2) α = 0.05
(RC)
(3–4) 1PropZTest: p_{o}=.75, x=40, n=65, prop<p_{o}
Results: z = −2.506402059 → z = −2.51, p = .006098358 → p-value = 0.0061, p̂ = .6154
(5) p < α. Reject H_{0} and accept H_{1}.
(6) At the 0.05 level of significance, less than 75% of claims do settle within 2 months.
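The 1PropZTest statistic is easy to verify by hand; under H_{0} the standard error uses the claimed p_{o} = 0.75, not the sample proportion. A standard-library sketch, not part of the original solution:

```python
# One-proportion z test, left-tailed: H0: p = 0.75, H1: p < 0.75.
from math import sqrt
from statistics import NormalDist

p0, x, n = 0.75, 40, 65
p_hat = x / n                                  # sample proportion ≈ 0.6154
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)     # SE uses the claimed p0
p_value = NormalDist().cdf(z)                  # left tail, since H1 is "<"
print(round(z, 2), round(p_value, 4))
```
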
P(Brand A and mislabeled) = P(Brand A) × P(mislabeled | Brand A)
and similarly for brand B.
P(mislabeled) = 0.40 × 0.025 + 0.60 × 0.015 = 0.019 or just under 2%
Alternative solution: The formulas can be confusing, and often there’s a way to do without them. You could also do this as a matter of proportions:
Out of 1000 shoes, 400 are Brand A and 600 are Brand B.
Out of 400 Brand A shoes, 2.5% are mislabeled. 0.025×400 = 10 brand A shoes mislabeled.
Out of 600 Brand B shoes, 1.5% are mislabeled. 0.015×600 = 9 brand B shoes mislabeled.
Out of 1000 shoes, 10 + 9 = 19 are mislabeled. 19/1000 is 1.9% or 0.019.
This is even easier to do if you set up a two-way table, as shown below. The brand shares (40%, 60%) and mislabeling rates (2.5%, 1.5%) are given in the problem; the other values are derived from them.

                  | Brand A          | Brand B            | Total
Mislabeled        | 40% × 2.5% = 1%  | 60% × 1.5% = 0.9%  | 1% + 0.9% = 1.9%
Correctly labeled | 40% − 1% = 39%   | 60% − 0.9% = 59.1% | 39% + 59.1% = 98.1%
Total             | 40%              | 60%                | 100%
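The total-probability computation takes only a few lines of plain Python (a sketch, not part of the original solution):

```python
# P(mislabeled) by the total-probability rule.
p_A, p_B = 0.40, 0.60      # brand shares
p_mis_given_A = 0.025      # P(mislabeled | Brand A)
p_mis_given_B = 0.015      # P(mislabeled | Brand B)

# Weight each brand's mislabeling rate by that brand's share.
p_mislabeled = p_A * p_mis_given_A + p_B * p_mis_given_B
print(round(p_mislabeled, 3))   # 0.019, just under 2%
```
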
Solution: This is paired numeric data, Case 3.
Common mistake: You must do this as paired data. Doing it as unpaired data will not give the correct p-value.
(1) d = A − B
H_{0}: μ_{d} = 0, no difference in smoothness
H_{1}: μ_{d} ≠ 0, a difference in smoothness
Remark: You must define d as part of your hypotheses.
(2) α = 0.10
(RC)
(3–4) TTest: μ_{o}=0, List:L3, Freq:1, μ≠μ_{o}
Results: t = 1.73, p = 0.1173, x̅ = 1, s = 1.83, n = 10
(5) p > α. Fail to reject H_{0}.
(6) At the 0.10 level of significance, it’s impossible to say whether the two brands of razors give equally smooth shaves or not.
Solution: (a) Use MATH200A part 3 with n=2, p=.9, from=1, to=1. Answer: 0.18
You could also use binompdf(2, .9, 1) = 0.18.
Alternative solution: The probability that exactly one is tainted is the sum of two probabilities: (i) that the first is tainted and the second is not, and (ii) that the first is not tainted and the second is. Symbolically,
P(exactly one) = P(first and second^{C}) + P(first^{C} and second)
P(exactly one) = 0.9×0.1 + 0.1×0.9
P(exactly one) = 0.09 + 0.09 = 0.18
Solution: (b) When sampling without replacement, the probabilities change. You have the same two scenarios — first but not second, and not first but second — but the numbers are different.
P(exactly one) = P(first and second^{C}) + P(first^{C} and second)
P(exactly one) = (9/10)×(1/9) + (1/10)×(9/9)
P(exactly one) = 1/10 + 1/10 = 2/10 = 0.2
Common mistake: Many, many students forget that both possible orders have to be considered: first but not second, and second but not first.
Common mistake: You can’t use binomial distribution in part (b), because when sampling without replacement the probability changes from one trial to the next.
For example, if the first card is an ace then the probability the second card is also an ace is 3/51, but if the first card is not an ace then the probability that the second card is an ace is 4/51. Symbolically, P(A_{2} | A_{1}) = 3/51 but P(A_{2} | not A_{1}) = 4/51.
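Part (b) can also be checked with a hypergeometric count. The batch size of 10 is an assumption inferred from the fractions 9/10 and 1/9 in the solution (9 tainted out of 10, draw 2 without replacement). A standard-library sketch, not part of the original solution:

```python
# Sampling without replacement: hypergeometric count of favorable outcomes.
from math import comb

total, tainted, drawn = 10, 9, 2   # batch size inferred from 9/10 and 1/9

# P(exactly 1 tainted) = C(9,1)*C(1,1) / C(10,2) = 9/45
p_exactly_one = comb(tainted, 1) * comb(total - tainted, 1) / comb(total, drawn)

# Part (a), with replacement, is binomial: both orders counted.
p_with_replacement = 2 * 0.9 * 0.1

print(p_exactly_one, round(p_with_replacement, 2))
```
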
(a) p̂_{T} = 128/300 = 0.4267. p̂_{C} = 135/400 = 0.3375. p̂_{T}−p̂_{C} = 0.0892 or about 8.9%
Remark: The point estimate is descriptive statistics, and requirements don’t enter into it. But the confidence interval is inferential statistics, so you must verify that each sample is random, each sample has at least 10 successes and 10 failures, and each sample is less than 10% of the population it came from.
(b) 2PropZInt: The 98% confidence interval is 0.0029 to 0.1754 (about 0.3% to 17.5%), meaning that with 98% confidence Tompkins viewers are more likely than Cortland viewers, by 0.3 to 17.5 percentage points, to prefer a movie over TV.
(c) E = 0.1754−0.0892 = 0.0862 or about 8.6%
You could also compute it as 0.0892−0.0029 = 0.0863 or (0.1754−0.0029)/2 = 0.0853. All three methods get the same answer except for a rounding difference.
(1) Population 1 = no treatment, Population 2 = special treatment
H_{0}: p_{1} = p_{2}, no difference in germination rates
H_{1}: p_{1} ≠ p_{2}, there’s a difference in germination rates
(2) α = 0.05
(RC)
(3–4) 2PropZTest: x1=80, n1=80+20, x2=135, n2=135+15, p_{1}≠p_{2}
Results: z = −2.23, p-value = 0.0256, p̂1 = .8, p̂2 = .9, p̂ = .86
(5) p < α. Reject H_{0} and accept H_{1}.
(6) Yes, at the 0.05 significance level, the special treatment made a difference in germination rate. Specifically, seeds with the special treatment were more likely to germinate than seeds that were not treated.
Remark: p < α in Two-Tailed Test: What Does It Tell You? explains how you can reach a one-tailed result from a two-tailed test.
Alternative solution: You could also do this as a test of homogeneity, Case 7. The χ²Test gives χ² = 4.98, df = 1, p=0.0256
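The homogeneity version of the test can be reproduced with SciPy; the Yates continuity correction must be turned off so the result matches the uncorrected TI χ²-Test (a sketch, not part of the original solution):

```python
# Chi-square test of homogeneity on the 2x2 germination table.
from scipy.stats import chi2_contingency

observed = [[80, 20],    # no treatment: germinated, failed
            [135, 15]]   # special treatment: germinated, failed

# correction=False: no Yates correction, matching the TI's chi-square test
chi2, p, df, expected = chi2_contingency(observed, correction=False)
print(round(chi2, 2), df, round(p, 4))
```
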
Updates and new info: http://BrownMath.com/swt/