Stats without Tears
Solutions to Review Problems
Updated 17 Nov 2020
(What’s New?)
Copyright © 2007–2023 by Stan Brown, BrownMath.com
Write your answer to each question. There’s no work to be shown. Don’t bother with a complete sentence if you can answer with a word, number, or phrase.
Common mistake: Binomial is a subtype of qualitative data so it’s not really a synonym. Discrete and continuous are subtypes of numeric data.
The Gambler’s Fallacy is believing that the die is somehow “due for a 6”. The Law of Large Numbers says that in the long run the proportion of 6’s will tend toward 1/6, but it doesn’t tell us anything at all about any particular roll.
Common mistake: You must specify which is population 1 and which is population 2.
Common mistake: The data type is binomial: a student is in trouble, or not. There are no means, so μ is incorrect in the hypotheses.
This is a binomial PD.
Remark: The significance level α is the level of risk of a Type I error that you can live with. If you can live with more risk, you can reach more conclusions.
Complementary events can’t happen at the same time and one or the other must happen. Example: rolling a die and getting an odd or an even. Complementary events are a subtype of disjoint events.
For a small- to moderate-sized set of numeric data, you might prefer a stemplot.
Remark: C is wrong because “model good” is H0. D is also wrong: every hypothesis test, without exception, compares a p-value to α. For E, df is number of cells minus 1. F is backward: in every hypothesis test you reject H0 when your sample is very unlikely to have occurred by random chance.
Remark: As stated, what you can prove depends partly on your H1. There are three things it could be:
Regardless of H1, if p-value>α your conclusion will be D or similar to it.
Common mistake: Conclusion A is impossible because it’s the null hypothesis and you never accept the null hypothesis.
Conclusion B is also impossible. Why? because “no more than” translates to ≤. But you can’t have ≤ in H1, and H1 is the only hypothesis that can be accepted (“proved”) in a hypothesis test.
Remark: A Type I error is a wrong result, but it is not necessarily the result of a mistake by the experimenter or statistician.
(b) The size is unknown, but certainly in the millions. You also could call it infinite, or uncountable. Common mistake: Don’t confuse size of population with size of sample. The population size is not the 487 from whom you got surveys, and it’s not the 321 churchgoers in your sample.
(c) The sample size n is the 321 churchgoers from whom you collected surveys. Yes, you collected 487 surveys in all, but you have to disregard the 166 that didn’t come from churchgoers, because they are not your target group. Common mistake: 227 isn’t the sample size either. It’s x, the number of successes within the sample.
(d) No. You want to know the attitudes of churchgoers, so it is correct sampling technique to include only churchgoers in your sample.
If you wanted to know about Americans in general, then it would be selection bias to include only churchgoers, since they are more likely than non-churchgoers to oppose teaching evolution in public schools.
Common mistake: Your answer will probably be worded differently from that, but be careful that it is a conditional probability: If H0 is true, then there’s a p-value chance of getting a sample this extreme or more so. The p-value isn’t the chance that H0 is true.
Remark: If you are at all shaky about this, review What Does the p-Value Mean?
binomcdf(100,.08,5)
(b) This is a binomial distribution, for exactly the same reasons.
MATH200A part 3, or binompdf(100,.08,5)
(c) The probability of success is p = 0.08 on every trial, but you don’t have a fixed number of trials. This is a geometric distribution.
geometpdf(.08,5)
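If you want to double-check these calculator results with software, here is a minimal Python sketch using SciPy (assuming SciPy is installed); it is only a cross-check, not the calculator method used in the course:

    from scipy import stats

    # Cross-check of the calculator commands above
    n, p = 100, 0.08
    print(stats.binom.cdf(5, n, p))   # binomcdf(100,.08,5): P(at most 5 successes)
    print(stats.binom.pmf(5, n, p))   # binompdf(100,.08,5): P(exactly 5 successes)
    print(stats.geom.pmf(5, p))       # geometpdf(.08,5): P(first success on trial 5)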
Remark: There is no specific claim, so this is not a hypothesis test.
Caution: The percentages must add to 100%. Therefore you must have complete data on all categories to display a pie chart. Also, if multiple responses from one subject are allowed, then a pie chart isn’t suitable, and you should use some other presentation, such as a bar graph.
Remark: This problem tests for several very common mistakes by students. Always make sure that
This leaves you with G and K as possibilities. Either can be correct, depending on your textbook. The most common practice is always to put a plain = sign in H0 regardless of H1, which makes G the correct answer. But some textbooks or profs prefer ≤ or ≥ in H0 for one-tailed tests, which makes K the correct answer.
Remark: The Z-Test is wrong because you don’t know the SD of the selling price of all 2006 Honda Civics in the US. The 1-PropZTest and χ²-test are for non-numeric data. There is no such thing as a 1-PropTTest.
Example: “812 of 1000 Americans surveyed said they believe in ghosts” is an example of descriptive statistics: the numbers of yeses and noes in the sample were counted. “78.8% to 83.6% of Americans believe in ghosts (95% confidence)” is an example of inferential statistics: sample data were used to make an estimate about the population. “More than 60% of Americans believe in ghosts” is another example of inferential statistics: sample data were used to test a claim and make a statement about a population.
Remark: Remember that the confidence interval derives from the central 95% or 90% of the normal distribution. The central 90% is obviously narrower than the central 95%, so the interval will be narrower.
Example: You want to know the average amount of money a full-time TC3 student spends on books in a semester. The population is all full-time TC3 students. You randomly select a group of students and ask each one how much s/he spent on books this semester. That group is your sample.
Remark: This is unpaired numeric data, Case 4.
(b) For binomial data, requirements are slightly different between CI and HT. Here you are doing a hypothesis test.
Common mistake: For hypothesis test, you need expected successes and failures. It’s incorrect to use actual successes (150) and failures (350).
Common mistake: Some students answer this question with “n > 30”. That’s true, but not relevant here. Sample size 30 is important for numeric data, not binomial data.
Common mistake: You cannot do a 2-SampZTest because you do not know the standard deviations of the two populations.
(1) Population 1 = Judge Judy’s decisions; Population 2 = Judge Wapner’s decisions.
    H0: μ1 = μ2, no difference in awards
    H1: μ1 > μ2, Judge Judy gives higher awards
(2) α = 0.05
(RC) (requirements check not reproduced here)
(3–4) 2-SampTTest: x̅1=650, s1=250, n1=32, x̅2=580, s2=260, n2=32, μ1>μ2, Pooled: No
    Results: t = 1.10, p-value = .1383
(5) p > α. Fail to reject H0.
(6) At the 0.05 level of significance, we can’t tell whether Judge Judy was more friendly to plaintiffs (average award higher than Judge Wapner’s) or not.
Some instructors have you do a preliminary F-test. It gives p=0.9089>0.05, so after that test you would use Pooled:Yes in the 2-SampTTest and get p=0.1553.
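If you’d like to verify the unpooled (Welch) test with software, here is a short Python/SciPy sketch working from the summary statistics; it is only a cross-check of the 2-SampTTest result:

    from math import sqrt
    from scipy import stats

    # Welch (unpooled) two-sample t test from summary statistics
    x1, s1, n1 = 650, 250, 32   # Judge Judy's awards
    x2, s2, n2 = 580, 260, 32   # Judge Wapner's awards

    se2 = s1**2/n1 + s2**2/n2
    t = (x1 - x2) / sqrt(se2)
    df = se2**2 / ((s1**2/n1)**2/(n1-1) + (s2**2/n2)**2/(n2-1))   # Welch-Satterthwaite df
    p = stats.t.sf(t, df)                                         # one-tailed, H1: mu1 > mu2
    print(round(t, 2), round(p, 4))                               # about 1.10 and 0.1383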
Solution: This is one-population numeric data, and you don’t know the standard deviation of the population: Case 1. Put the data in L1, and 1-VarStats L1 shows x̅ = 4.56, s = 1.34, n = 8.
(1) H0: μ = 4, 4% or less improvement in drying time
    H1: μ > 4, better than 4% decrease in drying time
    Remark: Why is a decrease in drying time tested with > and not <? Because the data show the amount of decrease. If there is a decrease, the amount of decrease will be positive, and you are interested in whether the average decrease is greater than 4 (4%).
(2) α = 0.05
(RC) (You don’t have to show these graphs on your exam paper; just show the numeric test for normality and mention that the modified boxplot shows no outliers.)
(3–4) T-Test: μo=4, x̅=4.5625, s=1.34…, n=8, μ>μo
    Results: t = 1.19, p = 0.1370
(5) p > α. Fail to reject H0.
(6) At the 0.05 significance level, we can’t tell whether the average drying time improved by more than 4% or not.
(b) TInterval: C-Level=.95
Results: (3.4418, 5.6832)
(There’s no need to repeat the requirements check or to write down all the sample statistics again.)
With 95% confidence, the true mean decrease in drying time is between 3.4% and 5.7%.
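As a software cross-check, the same test and interval can be computed from the summary statistics with SciPy. This is a sketch only, and it uses the rounded s shown above, so the last digit may differ slightly from the calculator:

    from math import sqrt
    from scipy import stats

    xbar, s, n, mu0 = 4.5625, 1.34, 8, 4       # s is rounded; the calculator keeps more digits
    t = (xbar - mu0) / (s / sqrt(n))
    p = stats.t.sf(t, df=n-1)                  # one-tailed, H1: mu > 4
    tstar = stats.t.ppf(0.975, df=n-1)         # critical t for 95% confidence
    moe = tstar * s / sqrt(n)
    print(round(t, 2), round(p, 4))            # about 1.19 and 0.137
    print(round(xbar - moe, 2), round(xbar + moe, 2))   # about (3.44, 5.68)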
Use MATH200A part 3 with n = 5, p = 0.28, from = 0, to = 0. Answer: 0.1935
Alternative solution: If you don’t have the program, you can compute the probability that one rabbit has short hair (1−.28 = 0.72), then that all the rabbits have short hair (0.72^5 = 0.1935), which is the same as the probability that none of the rabbits have long hair.
(b) The complement of “one or more” is none, so you can use the previous answer.
P(one or more) = 1−P(none) = 1−0.1935 = 0.8065
Alternative solution: MATH200A part 3 with n=5, p=.28, from=1, to=5; probability = 0.8065
(c) Again, use MATH200A part 3 to compute binomial probability: n = 5, p = 0.28, from = 4, to = 5. Answer: 0.0238
Alternative solution: If you don’t have the program, do binompdf(5, .28) and store into L3, then sum(L3,5,6) or L3(5)+L3(6) = 0.0238. Avoid the dreaded off-by-one error! For x=4 and x=5 you want L3(5) and L3(6), not L3(4) and L3(5).
For n=5, P(x≥4) = 1−P(x≤3). So you can also compute the probability as 1−binomcdf(5, .28, 3) = 0.0238.
(d) For this problem you must know the formula:
μ = np = 5×0.28 = 1.4 per litter of 5, on average
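If you prefer to check parts (a)–(d) with software rather than MATH200A, a brief Python/SciPy sketch (assuming SciPy is installed) reproduces the same numbers:

    from scipy import stats

    # Binomial model: 5 rabbits per litter, each long-haired with probability 0.28
    n, p = 5, 0.28
    print(stats.binom.pmf(0, n, p))       # (a) none long-haired, about 0.1935
    print(1 - stats.binom.pmf(0, n, p))   # (b) one or more, about 0.8065
    print(stats.binom.sf(3, n, p))        # (c) four or more, about 0.0238
    print(stats.binom.mean(n, p))         # (d) mean = np = 1.4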
Common mistake: It might be tempting to do this problem as a goodness-of-fit, Case 6, taking the Others row as the model and the doctors’ choices as the observed values. But that would be wrong. Both the Doctors row and the Others row are experimental data, and both have some sampling error around the true proportions. If you take the Others row as the model, you’re saying that the true proportions for all non-doctors are precisely the same as the proportions in this sample. That’s rather unlikely.
(1) H0: Doctors eat different breakfasts in the same proportions as others.
    H1: Doctors eat different breakfasts in different proportions from others.
(2) α = 0.05
(3–4) χ²-Test gives χ² = 9.71, df = 4, p = 0.0455
(RC) (requirements check not reproduced here)
(5) p < α. Reject H0 and accept H1.
(6) Yes, doctors do choose breakfast differently from other self-employed professionals, at the 0.05 significance level.
(b) 67.6−70 = −2.4″, which is exactly one standard deviation below the mean, so z = −1. By the Empirical Rule, 68% of data lie between z = ±1. Therefore 100−68 = 32% lie outside z = ±1 and 32%/2 = 16% lie below z = −1. Therefore 67.6″ is the 16th percentile.
Alternative solution: Use the big chart to add up the proportion of men below 67.6″ or below z = −1. That is 0.15+2.35+13.5 = 16%.
(c) z = (74.8−70)/2.4 = +2. By the Empirical Rule, 95% of men fall between z = −2 and z = +2, so 5% fall below z = −2 or above z = +2. Half of those, 2.5%, fall above z = +2, so 100−2.5 = 97.5% fall below z = +2. 97.5% of men are shorter than 74.8″.
Alternative solution: You could also use the big chart to find that P(z > 2) = 2.35+0.15 = 2.5%, and then P(z < 2) = 100−2.5 = 97.5%.
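The Empirical Rule gives only approximate percentages; if you want the exact normal values, a short SciPy check (software only, not required for the course) is:

    from scipy import stats

    # Exact normal probabilities for a mean of 70" and SD of 2.4"
    print(stats.norm.cdf(67.6, 70, 2.4))   # about 0.1587, the Empirical Rule's "16%"
    print(stats.norm.cdf(74.8, 70, 2.4))   # about 0.9772, the Empirical Rule's "97.5%"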
(b) Compute the class marks or midpoints: 575, 725, and so on. Put them in L1 and the frequencies in L2. Use 1-VarStats L1,L2 and get n = 219. See Summary Numbers on the TI-83.
(c) Further data from 1-VarStats L1,L2: x̅ = 990.1 and s = 167.3
Common mistake: If you answered x̅ = 950 you probably did 1-VarStats L1 instead of 1-VarStats L1,L2. Your calculator depends on you to supply one list when you have a simple list of numbers and two lists when you have a frequency distribution.
(d) f/n = 29/219 ≈ 0.13 or 13%
invNorm(0.85, 57.6, 5.2) = 62.98945357 → 63.0 mph
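The same 85th percentile can be checked in Python with SciPy’s equivalent of invNorm (just a cross-check):

    from scipy import stats
    # 85th percentile of a normal distribution with mean 57.6 mph and SD 5.2 mph
    print(stats.norm.ppf(0.85, 57.6, 5.2))   # about 62.99, rounds to 63.0 mph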
MATH200A/sample size/binomial: p̂ = .2, E = 0.04, C-Level = 0.90; answer: 271.
Common mistake: The margin of error is E = 4% = 0.04, not 0.4.
Alternative solution: See Sample Size by Formula and use the formula n = p̂(1−p̂)·(zα/2/E)². With the estimated population proportion p̂ = 0.2 in the formula, you get zα/2 = z0.05 = invNorm(1−0.05) = 1.6449, and n = 270.5543 → 271.
(b) If you have no prior estimate, use p̂ = 0.5. The other inputs are the same, and the answer is 423.
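If you’d like to automate the formula, here is a small Python sketch; the helper name sample_size is just for illustration, and SciPy supplies the critical z:

    from math import ceil
    from scipy import stats

    def sample_size(p_hat, E, conf):
        """Sample size for a proportion: n = p-hat*(1 - p-hat)*(z/E)^2, always rounded up."""
        z = stats.norm.ppf(1 - (1 - conf) / 2)
        return ceil(p_hat * (1 - p_hat) * (z / E) ** 2)

    print(sample_size(0.2, 0.04, 0.90))   # part (a): 271
    print(sample_size(0.5, 0.04, 0.90))   # part (b): 423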
You expect positive correlation because points trend upward to the right (or, because y tends to increase as x increases). Even before plotting, you could probably predict a positive correlation because you assume higher calories come from fat; but you can’t just assume that without running the numbers.
(b) See Step 2 of Scatterplot, Correlation, and Regression on TI-83/84.
r = .8863314629 → r = 0.8863
a = .0586751909 → a = 0.0587
b = −3.440073602 → b = −3.4401
ŷ = 0.0587x − 3.4401
Common mistake: The symbol is ŷ, not y.
(c) The y intercept is −3.4401. It is the number of grams of fat you expect in the average zero-calorie serving of fast food. Clearly this is not a meaningful concept.
Remark: Remember that you can’t trust the regression outside the neighborhood of the data points. Here x varies from 130 to 640. The y intercept occurs at x = 0. That is pretty far outside the neighborhood of the data points, so it’s not surprising that its value is absurd.
(d) See How to Find ŷ from a Regression on TI-83/84. Trace at x = 310 and read off ŷ = 14.749… ≈ 14.7 grams fat. This is different from the actual data point (x=310, y=25) because ŷ is based on a trend reflecting all the data. It predicts the average fat content for all 310-calorie fast-food items.
Alternative solution: ŷ = .0586751909(310) − 3.440073602 = 14.749 ≈ 14.7.
(e) The residual at any (x,y) is y−ŷ. At x = 310, y = 25 and ŷ = 14.7 from the previous part. The residual is y−ŷ = 10.3
Remark: If there were multiple data points at x = 310, you would calculate one residual for each point.
(f) From the LinReg(ax+b) output, R² = 0.7855834621 → R² = 0.7856
About 79% of the variation in fat content is associated with variation in calorie content. The other 21% comes from lurking variables such as protein and carbohydrate count and from sampling error.
(g) See Decision Points for Correlation Coefficient. Since 0.8863 is positive and 0.8863 > 0.602, you can say that there is some positive correlation in the population, and higher-calorie fast foods do tend to be higher in fat.
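To see the arithmetic behind parts (d)–(e) in one place, here is a tiny Python check that uses the slope and intercept from part (b); it is a sketch, not part of the calculator procedure:

    # Prediction and residual from the regression line found in part (b)
    a, b = 0.0586751909, -3.440073602    # slope and intercept from LinReg(ax+b)
    x, y = 310, 25                       # the observed 310-calorie item
    y_hat = a * x + b                    # predicted fat, about 14.7 g
    residual = y - y_hat                 # about 10.3 g
    print(round(y_hat, 1), round(residual, 1))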
(1) d = After − Before
    H0: μd = 0, no improvement
    H1: μd > 0, improvement in number of sit-ups
    Remark: Why After−Before instead of the other way round? Since we expect After to be greater than Before, doing it this way you can expect the d’s to be mostly positive (if H1 is true). Also, it feels more natural to set things up so that an improvement is a positive number. But if you do d = Before−After and H1: μd < 0, you get the same p-value.
(2) α = 0.01
(RC) (The plots are shown here for comparison to yours, but you don’t need to copy these plots to an exam paper.)
(3–4) T-Test: μo=0, List:L4, Freq:1, μ>μo
    Results: t = 2.74, p = 0.0169, x̅ = 4.4, s = 4.3, n = 7
(5) p > α. Fail to reject H0.
(6) At the 0.01 significance level, we can’t say whether the physical fitness course improves people’s ability to do sit-ups or not.
(b) normalcdf(-10^99, 24, 27, 4/√5) = .0467662315 → 0.0468 or about a 5% chance
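The same probability for the sample mean can be checked with SciPy (a cross-check of the normalcdf result above):

    from math import sqrt
    from scipy import stats
    # P(sample mean < 24) when the population has mean 27, SD 4, and n = 5
    print(stats.norm.cdf(24, loc=27, scale=4/sqrt(5)))   # about 0.0468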
(1) H0: Nebraska preferences are the same as national proportions.
    H1: Nebraska preferences are different from national proportions.
(2) α = 0.05
(3–4) US percentages in L1, Nebraska observed counts in L2. MATH200A part 6.
    The result is χ² = 12.0093 → 12.01, df = 4, p-value = 0.0173
    Common mistake: Some students convert the Nebraska numbers to percentages and perform a χ² test that way. The χ² test model can equally well be percentages or whole numbers, but the observed numbers must be actual counts.
(RC) (requirements check not reproduced here)
(5) p < α. Reject H0 and accept H1.
(6) Yes, at the 0.05 significance level Nebraska preferences in vacation homes are different from those for the US as a whole.
(1) Population 1 = Course, Population 2 = No course
    H0: μ1 = μ2, no benefit from diabetic course
    H1: μ1 < μ2, reduced blood sugar from diabetic course
(2) α = 0.01
(RC) Independent random samples, both n’s > 30
(3–4) 2-SampTTest: x̅1=6.5, s1=.7, n1=50, x̅2=7.1, s2=.9, n2=50, μ1<μ2, Pooled:No
    Results: t = −3.72, p = 1.7E−4 or 0.0002
    Though we do not, some classes use the preliminary 2-SampFTest. That test gives p=0.0816>0.05. Those classes would use Pooled:Yes in 2-SampTTest and get p=0.00016551 and the same conclusion.
(5) p < α. Reject H0 and accept H1.
(6) At the 0.01 level of significance, the course in diabetic self-care does lower patients’ blood sugar, on average.
(b) For two-population numeric data, paired data do a good job of controlling for lurking variables. You would test each person’s blood sugar, then enroll all thirty patients in the course and test their blood sugar six months after the end of the course. Your variable d is blood sugar after the course minus blood sugar before, and your H1 is μd < 0.
One potential problem is that all 30 patients receive a heightened level of attention, so you have to worry about the placebo effect. (With the original experiment, the control group did not receive the extra attention of being in the course, so any difference from the attention is accounted for in the different results between control group and treatment group.)
It seems unlikely that the placebo effect would linger for six months after the end of a short course, but you can’t rule out the possibility. There are two answers to that. You could re-test the patients after a year, or two years. Or, you could ask whether it really matters why patients do better. If they do better because of the course itself, or because of the attention, either way they’re doing better. A short course is relatively inexpensive. If it works, why look a gift horse in the mouth? In fact, medicine is beginning to take advantage of the placebo effect in some treatments.
(1) H0: μ = 2.5 years
    H1: μ > 2.5 years
(2) α = 0.05
(RC) Random sample, normal with no outliers (given)
(3–4) T-Test: μo=2.5, x̅=3, s=.5, n=6, μ>μo
    Results: t = 2.45, p = 0.0290
(5) p < α. Reject H0 and accept H1.
(6) Yes, at the 0.05 significance level, the mean duration of pain for all persons with the condition is greater than 2.5 years.
(1) Population 1 = men, Population 2 = women
    H0: p1 = p2, men and women equally likely to refuse promotions
    H1: p1 > p2, men more likely to refuse promotions
(2) α = 0.05
(RC) (requirements check not reproduced here)
(3–4) 2-PropZTest: x1=60, n1=200, x2=48, n2=200, p1>p2
    Results: z=1.351474757 → z = 1.35, p=.0882717604 → p-value = .0883, p̂1=.3, p̂2=.24, p̂=.27
(5) p > α. Fail to reject H0.
(6) At the 0.05 level of significance, we can’t determine whether the percentage of men who have refused promotions to spend time with their family is more than, the same as, or less than the percentage of women.
(b) 2-PropZInt with the above inputs and C-Level=.95 gives (−.0268, .14682). The English sentence needs to state both magnitude and direction, something like this: Regarding men and women who refused promotion for family reasons, we’re 95% confident that men were anywhere from 2.7 percentage points less likely than women to 14.7 percentage points more likely.
Common mistake: With two-population confidence intervals, you must state the direction of the difference, not just the size of the difference.
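For reference, the 2-PropZTest in steps 3–4 can be reproduced in Python by pooling the two sample proportions, as in this sketch (a software check only):

    from math import sqrt
    from scipy import stats

    # Two-proportion z test with the pooled proportion, as 2-PropZTest uses
    x1, n1, x2, n2 = 60, 200, 48, 200
    p1, p2 = x1/n1, x2/n2
    p_pool = (x1 + x2) / (n1 + n2)
    z = (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1/n1 + 1/n2))
    p_value = stats.norm.sf(z)               # one-tailed, H1: p1 > p2
    print(round(z, 2), round(p_value, 4))    # about 1.35 and 0.0883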
If the middle 95% runs from 70 to 130, then the mean must be μ = (70+130)÷2 → μ = 100
95% of a normal population lies within 2 standard deviations of the mean. The range 70 to 100 (or 100 to 130) is therefore two SD. 2σ = 100−70 = 30 → σ = 15
(1) H0: p = .75
    H1: p < .75
(2) α = 0.05
(RC) (requirements check not reproduced here)
(3–4) 1-PropZTest: po=.75, x=40, n=65, prop<po
    Results: z=−2.506402059 → z = −2.51, p=.006098358 → p-value = 0.0061, p̂=.6154
(5) p < α. Reject H0 and accept H1.
(6) At the 0.05 level of significance, less than 75% of claims do settle within 2 months.
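The 1-PropZTest can be verified with a few lines of Python (a sketch using SciPy; the calculator remains the course method):

    from math import sqrt
    from scipy import stats

    # One-proportion z test: H0 p = 0.75 versus H1 p < 0.75
    p0, x, n = 0.75, 40, 65
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = stats.norm.cdf(z)              # left-tailed
    print(round(z, 2), round(p_value, 4))    # about -2.51 and 0.0061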
P(Brand A and mislabeled) = P(Brand A) × P(mislabeled | Brand A)
and similarly for brand B.
P(mislabeled) = 0.40 × 0.025 + 0.60 × 0.015 = 0.019 or just under 2%
Alternative solution: The formulas can be confusing, and often there’s a way to do without them. You could also do this as a matter of proportions:
Out of 1000 shoes, 400 are Brand A and 600 are Brand B.
Out of 400 Brand A shoes, 2.5% are mislabeled. 0.025×400 = 10 brand A shoes mislabeled.
Out of 600 Brand B shoes, 1.5% are mislabeled. 0.015×600 = 9 brand B shoes mislabeled.
Out of 1000 shoes, 10 + 9 = 19 are mislabeled. 19/1000 is 1.9% or 0.019.
This is even easier to do if you set up a two-way table, as shown below. The brand shares (40% and 60%) and the mislabeling rates (2.5% and 1.5%) are given in the problem; the other values are derived from them.
                  | Brand A           | Brand B            | Total
Mislabeled        | 40% × 2.5% = 1%   | 60% × 1.5% = 0.9%  | 1% + 0.9% = 1.9%
Correctly labeled | 40% − 1% = 39%    | 60% − 0.9% = 59.1% | 39% + 59.1% = 98.1%
Total             | 40%               | 60%                | 100%
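The same total-probability calculation takes only a few lines in Python, if you want to see it spelled out:

    # Total probability of a mislabeled shoe, combining the two brands
    p_A, p_B = 0.40, 0.60              # brand shares
    mis_A, mis_B = 0.025, 0.015        # mislabeling rate within each brand
    p_mislabeled = p_A * mis_A + p_B * mis_B
    print(p_mislabeled)                # 0.019, i.e. 1.9%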
Solution: This is paired numeric data, Case 3.
Common mistake: You must do this as paired data. Doing it as unpaired data will not give the correct p-value.
(1) d = A−B
    H0: μd = 0, no difference in smoothness
    H1: μd ≠ 0, a difference in smoothness
    Remark: You must define d as part of your hypotheses.
(2) α = 0.10
(RC) (requirements check not reproduced here)
(3–4) T-Test: μo=0, List:L3, Freq: 1, μ≠μo
    Results: t = 1.73, p = 0.1173, x̅ = 1, s = 1.83, n = 10
(5) p > α. Fail to reject H0.
(6) At the 0.10 level of significance, it’s impossible to say whether the two brands of razors give equally smooth shaves or not.
Solution: (a) Use MATH200A part 3 with n=2, p=0.9, from=1, to=1. Answer: 0.18
You could also use binompdf(2, .9, 1) = 0.18.
Alternative solution: The probability that exactly one is tainted is sum of two probabilities: (i) that the first is tainted and the second is not, and (ii) that the first is not tainted and the second is. Symbolically,
P(exactly one) = P(first and not second) + P(not first and second)
P(exactly one) = 0.9×0.1 + 0.1×0.9
P(exactly one) = 0.09 + 0.09 = 0.18
Solution: (b) When sampling without replacement, the probabilities change. You have the same two scenarios — first but not second, and not first but second — but the numbers are different.
P(exactly one) = P(first and not second) + P(not first and second)
P(exactly one) = (9/10)×(1/9) + (1/10)×(9/9)
P(exactly one) = 1/10 + 1/10 = 2/10 = 0.2
Common mistake: Many, many students forget that both possible orders have to be considered: first but not second, and second but not first.
Common mistake: You can’t use binomial distribution in part (b), because when sampling without replacement the probability changes from one trial to the next.
For example, if the first card is an ace then the probability the second card is also an ace is 3/51, but if the first card is not an ace then the probability that the second card is an ace is 4/51. Symbolically, P(A2|A1) = 3/51 but P(A2| not A1) = 4/51.
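If the lot really is 10 items of which 9 are tainted, as the fractions above suggest, then part (b) is a hypergeometric probability, and you can check it with SciPy (a sketch under that assumption, not part of the original solution):

    from scipy import stats
    # Draw 2 without replacement from 10 items of which 9 are tainted:
    # P(exactly one of the two is tainted)
    print(stats.hypergeom.pmf(1, 10, 9, 2))   # = 0.2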
(a) p̂T = 128/300 = 0.4267. p̂C = 135/400 = 0.3375. p̂T−p̂C = 0.0892 or about 8.9%
Remark: The point estimate is descriptive statistics, and requirements don’t enter into it. But the confidence interval is inferential statistics, so you must verify that each sample is random, each sample has at least 10 successes and 10 failures, and each sample is less than 10% of the population it came from.
The problem states that the samples were random, which takes care of the first requirement. There were 128 successes and 300−128 = 172 failures in Tompkins, 135 successes and 400−135 = 265 failures in Cortland, so the second requirement is met.
What about the third requirement? You don’t know the populations of the counties, but remember that you can work it backwards. 10×300 = 3000 (Tompkins) and 10×400 = 4000 (Cortland), and surely the two counties must have populations greater than 3000 and 4000, so the third requirement must be met.
(b) 2-PropZInt: The 98% confidence interval is 0.0029 to 0.1754 (about 0.3% to 17.5%), meaning that with 98% confidence Tompkins viewers are more likely than Cortland viewers, by 0.3 to 17.5 percentage points, to prefer a movie over TV.
(c) E = 0.1754−0.0892 = 0.0862 or about 8.6%
You could also compute it as 0.0892−0.0029 = 0.0863 or (0.1754−0.0029)/2 = 0.0853. All three methods get the same answer except for a rounding difference.
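Here is a short Python sketch of the unpooled two-proportion interval behind 2-PropZInt, in case you want to verify the endpoints and margin of error with software:

    from math import sqrt
    from scipy import stats

    # Unpooled 98% confidence interval for p1 - p2, as 2-PropZInt computes it
    x1, n1, x2, n2 = 128, 300, 135, 400      # Tompkins, then Cortland
    p1, p2 = x1/n1, x2/n2
    se = sqrt(p1*(1-p1)/n1 + p2*(1-p2)/n2)
    z_star = stats.norm.ppf(0.99)            # 98% confidence leaves 1% in each tail
    E = z_star * se
    print(round(p1 - p2 - E, 4), round(p1 - p2 + E, 4))   # about (0.0029, 0.1754)
    print(round(E, 4))                                    # margin of error, about 0.086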
(1) Population 1 = no treatment, Population 2 = special treatment
    H0: p1 = p2, no difference in germination rates
    H1: p1 ≠ p2, there’s a difference in germination rates
(2) α = 0.05
(RC) (requirements check not reproduced here)
(3–4) 2-PropZTest: x1=80, n1=80+20, x2=135, n2=135+15, p1≠p2
    Results: z = −2.23, p-value = 0.0256, p̂1 = .8, p̂2 = .9, p̂ = .86
(5) p < α. Reject H0 and accept H1.
(6) Yes, at the 0.05 significance level, the special treatment made a difference in germination rate. Specifically, seeds with the special treatment were more likely to germinate than seeds that were not treated.
    Remark: p < α in Two-Tailed Test: What Does It Tell You? explains how you can reach a one-tailed result from a two-tailed test.
Alternative solution: You could also do this as a test of homogeneity, Case 7. The χ²-Test gives χ² = 4.98, df = 1, p=0.0256
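If you want to check the homogeneity version with software, SciPy’s chi-square test on the 2×2 table gives the same statistic, provided you turn off the Yates continuity correction (a sketch only):

    from scipy import stats

    # Germinated / not germinated counts for the two groups
    observed = [[80, 20],      # no treatment
                [135, 15]]     # special treatment
    chi2, p, df, expected = stats.chi2_contingency(observed, correction=False)
    print(round(chi2, 2), df, round(p, 4))   # about 4.98, 1, 0.0256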
Updates and new info: https://BrownMath.com/swt/