
# Stats without Tears: Solutions to All Exercises

Updated 17 Nov 2020

View or Print: These pages change automatically for your screen or printer. Underlined text, printed URLs, and the table of contents become live links on screen; and you can use your browser’s commands to change the size of the text or search for key words. If you print, I suggest black-and-white, two-sided printing.

## Solutions for Chapter 1

Because this textbook helps you, please donate at BrownMath.com/donate.
1 Sampling error is another name for sample variability, the fact that each sample is different from the next because no sample perfectly represents the population it was drawn from. Nonsampling errors are problems in setting up or carrying out the data collection, such as poorly worded survey questions and failure to randomize.

Nothing can eliminate sampling error, but you can reduce it by increasing your sample size. (Most nonsampling errors can be avoided by proper experimental design and technique.)
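That claim can be seen in a quick simulation. This Python sketch is not from the text; the population values are invented (a normal population with mean 100 and SD 15), but the effect it shows holds for any population: the spread of sample means shrinks as the sample size grows.

```python
# Sketch (not from the text): simulating sample variability to see why a
# larger sample reduces sampling error. The population here is hypothetical.
import random
import statistics

random.seed(42)
population = [random.gauss(100, 15) for _ in range(100_000)]

def spread_of_sample_means(sample_size, trials=500):
    """Draw many samples; the spread of their means reflects sampling error."""
    means = [statistics.fmean(random.sample(population, sample_size))
             for _ in range(trials)]
    return statistics.stdev(means)

small = spread_of_sample_means(10)    # n = 10
large = spread_of_sample_means(100)   # n = 100
print(small, large)  # the spread is markedly smaller for the larger n
```

With ten times the sample size, the spread of the sample means drops by roughly a factor of √10 ≈ 3.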

2 (a) systematic sample.
(b) It is probably a good sample of that gynecologist’s patients, since there’s no reason to think that one month is different from another. But it’s a bad sample of pregnant women in general, because it suffers from selection bias. This gynecologist’s patients may use prenatal vitamins differently from pregnant women who see other gynecologists or who don’t have a regular gynecologist.
(c) observational study
3 (a) completely randomized
(c) no food, Gro-Mor, Magi-Grow
(d) 13 heights at the end of the 13 weeks (You could also make a case for growth rate.)
(e) the 150 bulbs
(f) selection of plant food
(g) the group that gets no plant food
4 Each family answered the question “How many children do you have?”
(a) The variable is number of children.
(b) It is a discrete variable.
(c) It summarizes population data, and therefore it is a parameter.

Although “numeric” or “quantitative” is correct, it’s not an adequate answer because it is not as specific as possible. Discrete and continuous data are treated differently in descriptive statistics, so it matters which type you have.

Students are sometimes fooled by the decimal. Always ask yourself what was the original question asked or the original measurement taken from each member of the sample.

5 (a) The sample is the 80 people in your focus group. (It is not the drinks. It’s also not the people’s preferences: Their preferences are the data or sample data.)
(b) The sample size is 80, because that’s the number of people you took data from. It’s not 55: That’s just the number who gave one particular response.
(c) The population is not stated explicitly, but you can infer that it’s cola drinkers in general, or Whoopsie Cola drinkers in general.
(d) You don’t know how many cola drinkers (or Whoopsie Cola drinkers) there are. You can’t know, since people change their soft-drink habits all the time. You can say that the population is indefinitely large, or you can say that it’s infinite. (You can say that the population is uncountable, but don’t say that the population size is uncountable.)

Common mistake: Students sometimes answer “80” for population size, but this is not correct. You took data from 80 people, so those 80 people are your sample and 80 is your sample size.

6 (a) sampling error (or sample variability) (b) increase sample size

7 What can be done to reduce response bias? Interviewers should be trained to be absolutely neutral in voice and facial expression, which is how the Kinsey team gathered data on sexual behavior. Or the question can be asked on a written questionnaire, so that the subject isn’t looking another person in the face when answering. The question can also be made less threatening: “Have you ever left an infant alone in the house, even for just a minute?”

8
• Random sample: get a list of the resident students. On your calculator, do `randInt(1,2000)` 50 times, not counting duplicates, and interview the students who came up in those positions.
• Systematic sample: You can’t station yourself in the cafeteria because that would exclude all students who don’t use it. Instead, station yourself at the main entrance to the dorm complex (or station yourself and confederates at the main entrance to each dorm) and interview every 20th person. Why k=20 and not 2000/50 = 40? Because whenever you’re there, you’re bound to miss a sizable proportion of students.

To select the first person to survey, use `randInt(1,20)`. Remember that a systematic survey begins with a randomly selected person from 1 to k, not 1 to 50 (sample size) or 1 to 2000 (population size).
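The rule just described — random start from 1 to k, then every k-th person — can be sketched in Python. This is an illustration, not part of the text’s calculator workflow; the function name is mine.

```python
# Sketch of the systematic-sample rule above: random start in 1..k,
# then every k-th position after that.
import random

def systematic_positions(k, total):
    """Positions to interview: random start in 1..k, then every k-th person."""
    start = random.randint(1, k)          # like randInt(1,20) on the calculator
    return list(range(start, total + 1, k))

random.seed(1)
positions = systematic_positions(20, 2000)
print(positions[:5], len(positions))
```

With k = 20 and 2000 students, every run selects 100 positions, each exactly 20 apart; only the starting point varies.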

Notice that I didn’t suggest a time frame. What do you think would be a good time to do this?

An alternative procedure might be to walk through the dorms (assuming you can get in) and interview the students in every 20th room. You may get better coverage that way than if you wait for them to come to you.

• Cluster sample: Randomly select 25 rooms, and interview both of the students in those rooms. (This is a single-stage cluster.)

Best balance? Probably the cluster sample. The true random sample is a lot of work for a sample of 50, because after selecting the names you have to track the students down. The systematic sample, no matter how you do it, is going to miss a lot of students, and you have that time-period problem. With the cluster sample, you can time it for when students are likely to be home, and you can go back to follow up on those you missed.

But nothing is perfect, in this life where we are born to trouble as the sparks fly upward. The cluster sample works if the students were randomly assigned to rooms. When students pick their own roommates, they tend to pick people with similar attitudes, interests, and activities. That means those two are more similar to each other than other students, and there’s no way you can treat that cluster sample as a random sample. The cluster would probably be safe for freshmen, where the great majority would be randomly assigned, but less so for students in later years.

9 No, you can’t reach that conclusion, because you can never conclude causation from an observational study. You would have to do an experiment, where people were randomly assigned to watch Fox News or to watch no news at all, and then see if there was a difference in how much they knew about the world.

Students often answer questions like this with hand-waving arguments, either coming up with reasons why it’s a plausible conclusion or coming up with reasons why it isn’t. This is statistics, and we have to follow the facts. Whatever you may think about Fox News, the fact is that observational studies can’t prove causation.

10 (a) It excludes people who don’t use the bus. This means that people who are dissatisfied with the bus are systematically under-represented. Your survey will probably show that willingness to pay is higher than it actually is.
(b) sampling bias
11 “Random” doesn’t mean unplanned; it takes planning. This is a bogus sample. If you want a more formal statistical word, call it a convenience sample, an opportunity sample or a non-probability sample.
12 (a) This is attribute data or qualitative data or non-numeric data. Don’t be fooled by the number 42: the original question asked was “Do you have at least one streaming device?” and that’s a yes/no question.

Alternative: the more specific answer binomial data, which you may have heard in the lecture though it’s not in the book till Chapter 6.

(b) This is descriptive statistics because it’s reporting data actually measured: 42% of the sample. If it said “42% of Americans”, then it would be inferential because you know not every American was asked, so the investigators must have extrapolated from a sample to the population.

(c) It is a statistic because it is a number that summarizes data from a sample.

13
• The first people who present themselves are chosen. You should randomly select from among all volunteers. (Better still would be to randomly select from among all patients, and ask the selected individuals to volunteer.)
• Participants are not randomly assigned to control and experimental groups. This is always bad, but it’s especially bad when you accept a block of volunteers in order.
• The experiment is not double blind, only single blind. When doctors know who is getting a placebo and who is getting medicine, they may treat the two groups differently, consciously or unconsciously.

All of these are nonsampling errors.

14 2.145E-4 is 0.0002145, and 0.0004 is larger than that.
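A one-line check of that comparison, for anyone who wants to see the scientific notation converted by machine:

```python
# 2.145E-4 in scientific notation is 0.0002145, which is less than 0.0004.
a = 2.145e-4
b = 0.0004
print(a == 0.0002145, a < b)  # → True True
```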
15 It’s spurious precision. (That much precision could be appropriate if you had surveyed a few hundred thousand households.)

To fix it, round to one decimal place: 1.9. (Don’t make the common mistake of “rounding” to 1.8.)

16 (a) Non-numeric. (It has the form of a number, but think about the average area code in a group and you’ll realize an area code is not a number.)
(b) Continuous.
(c) Discrete.
(d) Non-numeric.
(e) Non-numeric.
(f) Discrete. (or continuous if you allow answers like 6.3)
17 (a) was done for you.
(b) Measurement: Amount of each dinner check. Continuous.
(c) Question: “Did you experience bloating and stomach pain?” Non-numeric.
(d) Measurement: Number of people in each party. Discrete.

## Solutions for Chapter 2

1 2

There’s no scale to interpret the quantities. And if one fruit in each row is supposed to represent a given quantity, then banana and apple have the same frequency, yet banana looks like its frequency is much greater.

3

90% of 15 is 13.5, 80% is 12, 70% is 10.5, and 60% is 9.

```
Scores     Grade  Tally   Frequency
13.5–15      A    ||          2
12–13.4      B    |           1
10.5–11.9    C    |||||       5
9–10.4       D    |||         3
0–8.9        F    ||||        4
```

Alternatives: Instead of a title below the category axis, you could have a title above the graph. You could order the grades from worst to best (F through A) instead of alphabetically as I did here. And you could list the class boundaries as 13.5–15, 12–13.5, 10.5–12, and so on, with the understanding that a score of 12 goes into the 12–13.5 class, not the 10.5–12 class. (Data points “on the cusp” always go into the higher class.)

4

(a) The variable is discrete, “number of deaths in a corps in a given year”.
(b) Alternatives: Some authors would draw a histogram (bars touching) or even a pie chart. Those are okay but not the best choice.

5
```
Commuting Distance
 0 | 5 9 8 1
 1 | 5 2 2 1 9 6 2 8 7 6 5 7
 2 | 3 2 6 1 6 4 0
 3 | 1
 4 | 5
Key: 2 | 3 = 23 km
```
6

Relative frequency is f/n. f = 25, and n = 35+10+25+45+20 = 135. Dividing 25/135 gives 0.185185… ≈ 0.19 or 19%
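The same arithmetic, spelled out in Python as a check:

```python
# Relative frequency is f/n, where n is the total of all class frequencies.
freqs = [35, 10, 25, 45, 20]   # class frequencies from the problem
f = 25                          # frequency of the class in question
n = sum(freqs)                  # 135
rel = f / n
print(n, round(rel, 2))         # → 135 0.19
```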

7 (a) Bar graph, histogram, stemplot. A bar graph or histogram can be used for any ungrouped discrete data. (Some authors use one, some use the other. I like the bar graph for ungrouped discrete data.) A stemplot, or stem-and-leaf diagram, can be used when you have a moderate data range without too many data points.
(b) Histogram.
(c) Bar graph, pie chart.
8

skewed right

9 (a) Group the data when you have a lot of different values.
(b) The classes must all be the same width, and there must be no gaps.
10 (a) See the histogram at right. Important features:

• The bars are labeled at their edges, not their centers, because this is a grouped histogram.
• Both axes are titled.
• The horizontal axis has a real-world title. (Sometimes you also need an overall title for the graph, but here the axis title says all that needs to be said.)

(b) 480.0−470.0 = 10.0 or just plain “10”.

Don’t make the common mistake of subtracting 479.9−470.0. Subtract consecutive lower bounds, always.

(c) skewed left

## Solutions for Chapter 3

1 When the data set is skewed, the median is better. Outliers tend to skew a data set, so usually the median is a better choice when you have outliers.
2 15% of people have cholesterol equal to or less than yours, so yours is on the low end. Though you might not really celebrate by eating high-cholesterol foods, there is no cause for concern.
3 (a) It uses only the two most extreme values.
(b) It uses only two values, but they are not the most extreme, so it is resistant.
(c) It uses all the numbers in the data set.
(d) Any two of: It is in the same units as the original data, it can be used in comparing z-scores from different data sets, you can predict what percentage of the data set will be within a certain number of SD from the mean.
4 (a) s is standard deviation of a sample; σ is standard deviation of a population.
(b) μ is mean of a population; x̅ is mean of a sample.
(c) N is population size or number of members of the population; n is sample size or number of members of the sample.
5 You were 1.87 standard deviations above average. This is excellent performance. 1.87 is almost 2, and in a normal distribution, z = +2 would be better than 95+2.5 = 97.5% of the students. 1.87 is not quite up there, but close. (In Chapter 7, you’ll learn how to compute that a z-score of 1.87 is better than 96.9% of the population.)
6 Since the weights are normally distributed, 99.7% (“almost all”) of them will be within three SD above and below the mean. 3σ above and below is a total range of 6σ. The actual range of “almost all” the apples was 8.50−4.50 = 4.00 ounces. 6σ = 4.00; therefore σ = 0.67 ounces.

Alternative solution: In a normal distribution, the mean is half way between the given extremes: μ = (4.50+8.50)/2 = 6.50. Then the distance from the mean to 8.50 must be three SD: 8.50−6.50 = 2.00 = 3σ; σ = 0.67 ounces.
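Both routes to σ can be checked in a couple of lines:

```python
# Two ways to get σ from "almost all" (99.7%) of a normal distribution:
# range/6, or (max − mean)/3, since 99.7% spans ±3σ around the mean.
lo, hi = 4.50, 8.50
sigma_from_range = (hi - lo) / 6    # 4.00 / 6
mu = (lo + hi) / 2                  # mean is midway: 6.50
sigma_from_half = (hi - mu) / 3     # 2.00 / 3
print(round(sigma_from_range, 2), round(sigma_from_half, 2))  # → 0.67 0.67
```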

7
| Ages | Midpoint (L1) | Frequency (L2) |
|---|---|---|
| 20–29 | 25 | 34 |
| 30–39 | 35 | 58 |
| 40–49 | 45 | 76 |
| 50–59 | 55 | 187 |
| 60–69 | 65 | 254 |
| 70–79 | 75 | 241 |
| 80–89 | 85 | 147 |
(a) This is a grouped distribution, so you need the class midpoints, as shown at right. Enter the midpoints in L1 and the frequencies in L2.

Caution! The midpoints are not midway between lower and upper bounds, such as (20+29)/2 = 24.5. They are midway between successive lower bounds, such as (20+30)/2 = 25.

`1-VarStats L1,L2` (Check n first!)

x̅ = 63.85656971 → x̅ = 63.86

s = 15.43533244 → s = 15.44

n = 997

Common mistake: People tend to run 1-VarStats L1, leaving off the L2, which just gives statistics of the seven numbers 25, 35, …, 85. Always check n first. If you check n and see that n = 7, you realize that can’t possibly be right since the frequencies obviously add up to more than 7. You fix your mistake and all is well.

(b) You need the original data to make a boxplot, and here you have only the grouped data. A boxplot of a grouped distribution doesn’t show the shape of the data set accurately, because only class midpoints are taken into account. The class midpoints are good enough for approximating the mean and SD of the data, but not the five-number summary that is pictured in the boxplot.
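The 1-VarStats results for part (a) can be reproduced in Python. This sketch (not from the text) uses the standard frequency-weighted formulas, which is what the calculator computes for x̅ and Sx:

```python
# Weighted mean and sample SD from class midpoints (L1) and frequencies (L2),
# mirroring 1-VarStats L1,L2.
import math

midpoints = [25, 35, 45, 55, 65, 75, 85]
freqs     = [34, 58, 76, 187, 254, 241, 147]

n = sum(freqs)                                              # 997
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
ss = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
s = math.sqrt(ss / (n - 1))                                 # sample SD, like Sx
print(n, round(mean, 2), round(s, 2))                       # → 997 63.86 15.44
```

Checking `n` first works here too: `sum(freqs)` is 997, not 7, which confirms the frequencies were actually used.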

8
| Course | Credits (L2) | Grade | Quality Points (L1) |
|---|---|---|---|
| Statistics | 3 | A | 4.0 |
| Calculus | 4 | B+ | 3.3 |
| Microsoft Word | 1 | C− | 1.7 |
| Microbiology | 3 | B− | 2.7 |
| English Comp | 3 | C | 2.0 |
You need the weighted average, so put the quality points in L1 and the credits in L2. (No, you can’t do it the other way around. The quality points are the numeric forms of your grades, and you have to give them weights according to the number of credits in each course.)

`1-VarStats L1,L2`

n = 14 (This is the number of credits attempted. If you get 5, you forgot to include L2 in the command.)

x̅ = 2.93
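The weighted average can be confirmed directly, quality points weighted by credits:

```python
# Weighted GPA: quality points (L1) weighted by credit hours (L2).
points  = [4.0, 3.3, 1.7, 2.7, 2.0]   # quality points
credits = [3, 4, 1, 3, 3]             # credit hours

n = sum(credits)                       # 14 credits attempted
gpa = sum(p * c for p, c in zip(points, credits)) / n
print(n, round(gpa, 2))                # → 14 2.93
```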

9 You don’t have the individual quiz scores, but remember what the average means: it’s the total divided by the number of data points. If your quiz average is 86%, then on 10 quizzes you must have a total of 86×10 = 860 percentage points. If you need an 87% average on 11 quizzes, you need 11×87 = 957 percentage points. 957−860 = 97; you can still skip the final exam if you get a 97 on the last quiz.
10
(a)

```
Commute Distance, km   Frequency
 0–9                        4
10–19                      12
20–29                       7
30–39                       1
40–49                       1
Total                      25
```

(b) The class width is 10 (not 9). The class midpoints are 5, 15, 25, 35, 45 (not 4.5, 14.5, etc.).

(c) Class midpoints in one list such as L2 and frequencies in another list such as L3. This is a sample, so the symbols are x̅, s, n, not μ, σ, N. `1-VarStats L2,L3` gives x̅ = 18.2 km, s = 9.5 km, n = 25.

(d) Data in a list such as L1. `1-VarStats L1` gives x̅ = 17.6 km, Median = 17, s = 9.0 km, n = 25

(e) (f) Mean, because the data are nearly symmetric. Or, median, because there is an outlier.
Comment: The stemplot made the data look skewed, but that was just an artifact of the choice of classes. The boxplot shows that the data are nearly symmetric, except for that outlier. This is why the mean and median are close together. This is a good illustration that sometimes there is no uniquely correct answer. It’s why your justification or explanation is an important part of your answer.

(g) The five-number summary, from MATH200A part 2 [`TRACE`], is 1, 12, 17, 22.5, 45. There is one outlier, 45.
(The five-number summary includes the actual min and max, whether they are outliers or not.)

11 Since 500 equals the mean, its z-score is 0. For 700, compute the z-score as z = (700−500)/100 = 2. So you need the probability of data falling between the mean and two SD above the mean. Make a sketch and shade this area.

Draw an auxiliary line at z = −2. You know that the area between z = −2 and z = +2 is 95%, so the area between z = 0 and z = 2 is half that, 47.5% or 0.475.
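The empirical rule gives 0.475; the exact normal curve gives nearly the same answer. This check uses Python’s standard-library normal distribution (not part of the text’s calculator workflow):

```python
# Exact area between z = 0 and z = 2 under the standard normal curve,
# compared with the 47.5% from the empirical (68-95-99.7) rule.
from statistics import NormalDist

z_area = NormalDist().cdf(2) - NormalDist().cdf(0)
print(round(z_area, 3))   # ≈ 0.477, close to the empirical rule's 0.475
```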

12 To compare apples and oranges, compute their z-scores:

zJ = (2070−1500)/300 = 570/300 = 1.90

zM = (129−100)/15 = 29/15 = about 1.93

Because she has the higher z-score, according to the tests Maria is more intelligent.

Remark: The difference is very slight. Quite possibly, on another day Jacinto might do slightly better and Maria slightly worse, reversing their ranking.
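The two z-scores side by side, as a quick check:

```python
# z = (x − mean) / SD for each test; the higher z-score wins the comparison.
z_jacinto = (2070 - 1500) / 300   # → 1.9
z_maria   = (129 - 100) / 15      # → about 1.93
winner = "Maria" if z_maria > z_jacinto else "Jacinto"
print(round(z_jacinto, 2), round(z_maria, 2), winner)
```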

13
| Test Scores | Frequencies, f (L2) | Class Midpoints, x (L1) |
|---|---|---|
| 470.0–479.9 | 15 | 475.0 |
| 480.0–489.9 | 22 | 485.0 |
| 490.0–499.9 | 29 | 495.0 |
| 500.0–509.9 | 50 | 505.0 |
| 510.0–519.9 | 38 | 515.0 |

Start with the class marks or midpoints, as shown in the table. (Class midpoints are halfway between successive lower bounds: (470+480)/2 = 475. You can’t calculate them between lower and upper bounds: (470+479.9)/2 = 474.95 is wrong.)

Put class midpoints in a list, such as L1, and frequencies go in another list, such as L2. (Either label the columns with the lists you use, as I did here, or state them explicitly: “class marks in L1, frequencies in L2”.)

`1-VarStats L1,L2` (Always write down the command that you used.)

(a) n = 154

(b) x̅ = 499.81 (before rounding, 499.8051948)

(c) s = 12.74 (before rounding, 12.74284519)

Be careful with symbols. Use the correct one for sample or population, whichever you have.

Common mistake: The SD is 12.74 (Sx), not 12.70 (σ), because this is a sample and not the population.
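The Sx-versus-σx distinction can be made concrete. This sketch (not from the text) computes both from the grouped data above; the only difference is dividing by n−1 versus n:

```python
# Sample SD (Sx) divides the sum of squares by n−1; population SD (σx)
# divides by n. Same data, two different answers.
import math

midpoints = [475.0, 485.0, 495.0, 505.0, 515.0]
freqs     = [15, 22, 29, 50, 38]

n = sum(freqs)                                            # 154
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n   # ≈ 499.81
ss = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
sx     = math.sqrt(ss / (n - 1))   # sample SD → 12.74
sigmax = math.sqrt(ss / n)         # population SD → 12.70
print(round(sx, 2), round(sigmax, 2))
```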

14 The mean is much greater than the median. This usually means that the distribution is skewed right, like incomes at a corporation.

## Solutions for Chapter 4

1 64% of the variation in salary is associated with variation in age.

Common mistake: Don’t use any form of the word “correlation” in your answer. Your friend wouldn’t understand it, but it’s wrong anyway. Correlation is the interpretation of r, not R². Yes, r is related to R², but R² as such is not about correlation.

Common mistake: R² tells you how much of the variation in y is associated with variation in x, not the other way around. It’s not accurate to say 64% of variation in age is associated with variation in salary.

Common mistake: Don’t say “explained by” to non-technical people. The regression shows an association, but it does not show that growing older causes salary increases.

2 (a) We know that power boats kill manatees, so the boat registrations must be the explanatory variable (x) and the manatee power-boat kills must be the response variable (y). (Although this is an observational study, the cause of death is recorded, so we do know that the boats cause these manatee deaths.)

(b) Yes

(c) The results of `LinReg(ax+b) L1,L2,Y1` are shown at right. The correlation coefficient is r = 0.91

(d) ŷ = 0.1127x − 35.1786
Note: ŷ, not y. Note: −35.1786, not +−35.1786.

(e) The slope is 0.1127. An increase of 1000 power-boat registrations is associated with an increase of about 0.11 manatee deaths, on average.
It’s every 1000 boats, not every boat, because the original table is in thousands. Always be specific: “increase”, not just “change”.

Remark: Although this is mathematically accurate, people may not respond well to 0.11 as a number of deaths, which obviously is a discrete variable. You might multiply by 100 and say that 100,000 extra registrations are associated with 11 more manatee deaths on average; or multiply by 10 and round a bit to say that 10,000 extra registrations are associated with about one more manatee death on average.

(f) The y intercept is −35.1786. Mathematically, if there were no power boats there would be about minus 35 manatees killed by power boats. But this is not applicable because x=0 (no boats) is far outside the range of x in the data set.

(g) R² = 0.83. About 83% of variation in manatee deaths from power boats is associated with the variation in registrations of power boats.
It’s R², not r². And don’t use any form of the word “correlate” in your answer.

100% of manatee power-boat deaths come from power boats, so why isn’t the association 100%? The other 17% is lurking variables plus natural variability. For instance, maybe the weather was different in some years, so owners were more or less likely to use their boats. Maybe a campaign of awareness in some years caused some owners to lower their speeds in known manatee areas.

(h) ŷ = 27.8

(i) y−ŷ = 34−27.8 = 6.2

(j) Remember that x is in thousands, so a million boats is x = 1000. But x=1000 is far outside the data range, so the regression can’t be used to make a prediction.

3 The decision point for n=10 is 0.632, and |r| = 0.57. |r| < d.p., and therefore you can’t reach a conclusion. From the sample data, it’s impossible to say whether there is any association between TV watching and GPA for TC3 students in general.
Note: Always state the decision point and show the comparison to r.
4 (a) Yes
The point (0,6) is hard to see behind the y axis, but it’s there.

(b) The results of `LinReg(ax+b) L3,L4,Y2` are shown at right. ŷ = −3.5175x+6.4561

(c) The slope is −3.5175. Increasing the dial setting by one unit decreases temperature by about 3.5°.
Again, state whether y increases or decreases with increasing x.

(d) The y intercept is 6.4561. A dial setting of 0 corresponds to about 6.5°.

(e) r = −0.99

(f) R² = 0.98. About 98% of variation in temperature is associated with variation in dial setting.

This seems almost too good to be true, as though the data were just made up. ☺ But it’s hard to think of many lurking variables. Maybe it happened that some measurements were taken just after the compressor shut off, and others were taken just before the compressor was ready to switch on again in response to a temperature rise.

(g) ŷ = 2.9°

5 For n = 12, the decision point is 0.576. |r| = 0.85 is greater than that, so there is an association. Increased study time is associated with increased exam score for statistics students in general.
6 No. There’s a lurking variable here: age. Older pupils tend to have larger feet and also tend to have increased reading ability.
7

r, the linear correlation coefficient, would be roughly zero. Taking the plot as a whole, as x increases, y is about equally likely to increase or decrease. A straight line would be a terrible model for the data.

Clearly there is a strong correlation, but it is not a linear correlation. Probably a good model for this data set would be a quadratic regression, ŷ = ax²+bx+c. Though we study only linear regressions, your calculator can perform quadratic and many other types.

8 The coefficient of determination, R², answers this question. For linear correlations, R² is indeed the square of the correlation coefficient r. r = 0.30 ⇒ R² = 0.09. Therefore 9% of the variation in IQ is associated with variation in income.

Remark: Don’t say “caused by” variation in family income. Correlation is not causation. You can think of some reasons why it might be plausible that wealthier families are more likely to produce smarter children, or at least children who do better on standardized tests, but you can’t be sure without a controlled experiment.

Remark: Though it’s an interesting fact, the correlation in twins’ IQ scores is not needed for this problem. In real life, an important part of solving problems and making decisions is focusing on just the relevant information and not getting distracted.

## Solutions for Chapter 5


### Problem Set 1

1 (a) There are three coins, and each has two possible outcomes, so the sample space will have 2³ = 8 entries.

(b)

S = { HHH, HTH, THH, TTH, HHT, HTT, THT, TTT }

(c) Three outcomes out of eight equally likely outcomes: P(2H) = 3/8

Common mistake: Sometimes students write the sample space correctly but miss one of the combinations of 2 heads. I wish I could offer some “magic bullet” for counting correctly, but the only advice I have is just to be really careful.
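One way around miscounting by hand is to enumerate the sample space by machine:

```python
# Enumerate the three-coin sample space and count the outcomes with
# exactly two heads, as a check on the hand count.
from itertools import product

space = ["".join(toss) for toss in product("HT", repeat=3)]
two_heads = [s for s in space if s.count("H") == 2]
print(len(space), two_heads, len(two_heads) / len(space))
# → 8 ['HHT', 'HTH', 'THH'] 0.375
```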

2
| Service type | Prob. |
|---|---|
| Landline and cell | 58.2% |
| Landline only | 37.4% |
| Cell only | 2.8% |
| No phone | 1.6% |
| Total | 100.0% |

(a) In a probability model, the probabilities must add to 1 (= 100%). The given probabilities add to 62.6%. What is the missing 37.4%? They’ve accounted for cell and landline, cell only, and nothing; the remaining possibility is landline only. The model is shown at right.

(b) P(Landline) = P(Landline only) + P(Landline and cell)

P(Landline) = 37.4% + 58.2% = 95.6%

Remark: “Landline” and “cell” are not disjoint events, because a given household could have both. But “landline only” and “landline and cell” are disjoint, because a given household can’t both have a landline with a cell phone and a landline without one.

3 No, because the events are not disjoint. The figures are for being struck or attacked, not killed. You’d have to be pretty unlucky to be struck by lightning and attacked by a shark in the same year, but it could happen. If the question were about being killed by lightning or by a shark, then the events would be disjoint and you could add the probabilities.
4 (a) P(not A) = 1−P(A) = 1−0.7 → P(not A) = 0.3

(b) That A and B are complementary means that one or the other must happen, but not both. Therefore P(B) = P(not A) → P(B) = 0.3

(c) Since the events are complementary, they can’t both happen: P(A and B) = 0

Common mistake: Many students get (c) wrong, giving an answer of 1. If events are complementary, they can’t both happen at the same time. That means P(A and B) must be 0, the probability of something impossible.

Maybe those students were thinking of P(A or B). If A and B are complementary, then one or the other must happen, so P(A or B) = P(A) + P(B) = 1. But part (c) was about probability and, not probability or.

5 Yes, because the events are disjoint or mutually exclusive: a person might have both cancer and heart disease, but the death certificate will list one cause of death. (1/5 + 1/7 ≈ 34%.)
6 P(divorced | man) is the probability that a randomly selected man is divorced, or the proportion of men who are divorced. P(man | divorced) is the probability that a randomly selected divorced person is a man, or the proportion of divorced persons that are men.
7 If the probability of a future event is zero, then that event is impossible. If the probability of a past event is zero, that just means that it didn’t happen in the cases that were studied, not that it couldn’t have happened.

This is the difference between theoretical and empirical probability. A truly impossible event has a theoretical probability of zero. But the 0 out of 412 figure is an empirical probability (based on past experience). Empirical probabilities are just estimates of the “real” theoretical probability. From the empirical 0/412, you can tell that the theoretical probability is very low, but not necessarily zero. In plain language, an unresolved complaint is unlikely, but just because it hasn’t happened yet doesn’t mean it can’t happen.

8 13/52 or 1/4

Common mistake: Students often try some sort of complicated calculation here. You would have to do that if conditions were stated on all five of those cards, but they weren’t. Think about it: any card has a 1/4 chance of being a spade.

9 S = { HH, HT, TH, TT }
(a) Three outcomes (HH, HT, TH) have at least one head. One of the three has both coins heads. Therefore the probability is 1/3.
(b) Two outcomes (HH, HT) have heads on the first coin. One of the two has both coins heads. Therefore the probability is 1/2.
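Both conditional probabilities can be confirmed by enumerating the restricted sample spaces:

```python
# Problem 9 by enumeration: condition (a) keeps outcomes with at least one
# head; condition (b) keeps outcomes whose first coin is a head.
from itertools import product
from fractions import Fraction

space = ["".join(t) for t in product("HT", repeat=2)]   # HH, HT, TH, TT

at_least_one = [s for s in space if "H" in s]            # HH, HT, TH
p_a = Fraction(at_least_one.count("HH"), len(at_least_one))

first_heads = [s for s in space if s[0] == "H"]          # HH, HT
p_b = Fraction(first_heads.count("HH"), len(first_heads))

print(p_a, p_b)   # → 1/3 1/2
```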
10

(a) 0.0171 × 0.0171 = 0.0003

(b) The events are not independent. When a married couple are at home together or out together, any attack that involves one of them will involve the other also.

11 (a) P(divorced) = 22.8/219.7 ≈ 0.1038

(b) About 10.38% of American adults in 2006 were divorced. If you randomly selected an American adult in 2006, there was a 0.1038 probability that he or she was divorced.

(c) Empirical or experimental

(d) P(divorcedC) = 1−P(divorced) = 1−22.8/219.7 ≈ 0.8962
About 89.62% of American adults in 2006 were not divorced (or, had a marital status other than divorced).

(e) P(man and married) = 63.6/219.7 ≈ 0.2895 (You can’t use a formula on this one.)

(f) Add up P(man) and P(not man but married):

P(man or married) = 106.2/219.7 + 64.1/219.7 ≈ 0.7751

Alternative solution: By formula:

P(man or married) = P(man) + P(married) − P(man and married)

P(man or married) = 106.2/219.7 + 127.7/219.7 − 63.6/219.7 = 0.7751

Remember, math “or” means one or the other or both.

(g) What proportion of males were never married? 30.3/106.2 = 28.53%.

(h) P(man | married) uses the sub-subgroup of men within the subgroup of married persons.

P(man | married) = 63.6/127.7 = 0.4980

49.80% of married persons were men.

Remark: You might be surprised that it’s under 50%. Isn’t polygamy illegal in the US? Yes, it is. But the table considers only resident adults. Women tend to marry slightly earlier than men, so fewer grooms than brides are under 18. Also, soldiers deployed abroad are more likely to be male.

(i) P(married | man) uses the sub-subgroup of married persons within the subgroup of men.

P(married | man) = 63.6/106.2 = 0.5989

59.89% of men were married.

12 P(five cards, all diamonds) = (13/52) × (12/51) × (11/50) × (10/49) × (9/48) ≈ 0.0005
(I was surprised that the probability is that high, about once every 2000 hands. And the probability of being dealt a five-card flush of any suit is four times that, about once in every 500 hands.)
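The product of those five conditional probabilities can be computed exactly with fractions:

```python
# P(five diamonds in a row without replacement):
# (13/52)(12/51)(11/50)(10/49)(9/48), done exactly.
from fractions import Fraction
from math import prod

p_diamonds = prod(Fraction(13 - i, 52 - i) for i in range(5))
p_any_flush = 4 * p_diamonds          # any of the four suits
print(float(p_diamonds), float(p_any_flush))
# ≈ 0.0005 (about 1 in 2000) and ≈ 0.002 (about 1 in 500)
```

Note that the exact fraction equals C(13,5)/C(52,5), the counting-method answer, which is a nice cross-check.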
13

(a) 3 of 20 M&Ms are yellow, so 17 are not yellow. You want the probability of three non-yellows in a row:
(17/20)×(16/19)×(15/18) ≈ 0.5965

(b) The probability is zero, since there are only two reds to start with.

14 You’re being asked about all three possibilities: two fail, one fails, none fail. Therefore the three probabilities must add up to 1, and you need to compute only two of them. It’s also important to note that the companies are independent: whether one fails has nothing to do with whether the other fails. (Without knowing that the companies are independent, you could not compute the probability that both fail.)

(a) Since the companies are independent, you can use the simple multiplication rule:

P(A bankrupt and W bankrupt) = P(A bankrupt) × P(W bankrupt)

P(A bankrupt and W bankrupt) = .9 × .8 = 0.72

At this point you could compute (b), but it’s a little messy because you need the probability that A fails and W is okay, plus the probability that A is okay and W fails. (c) looks easier, so do that first.

(c) “Neither bankrupt” means both are okay. Again, the events are independent so you can use the simple multiplication rule.

P(neither bankrupt) = P(A okay and W okay)

P(A okay) = 1−.9 = 0.1; P(W okay) = 1−.8 = 0.2

P(neither bankrupt) = .1 × .2 = 0.02

(b) is now a piece of cake.

P(only one bankrupt) = 1 − P(both bankrupt) − P(none bankrupt)

P(only one bankrupt) = 1 − .72 − .02 = 0.26

Remark: If you have time, it’s always good to check your work and work out (b) the long way. You have only independent events (whether A is okay or fails, whether W is okay or fails) and disjoint events (A fails and W okay, A okay and W fails). The “okay” probabilities were computed in part (c).

P(only one bankrupt) = (A bankrupt and W okay) or (A okay and W bankrupt)

P(only one bankrupt) = (.9 × .2) + (.1 × .8) = 0.26

Common mistake: When working this out the long way, students often solve only half the problem. But when you have probability of exactly one out of two, you have to consider both A-and-not-W and W-and-not-A.

You can’t use the “or” formula here, even if you studied it. That computes the probability of one or the other or both, but you need the probability of one or the other but not both.

Remark: If you computed all three probabilities the long way, pause a moment to check your work by adding them to make sure you get 1. Whenever possible, check your work with a second type of computation.
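As an optional check in Python (the problem itself needs only arithmetic), you can compute all three probabilities both ways and confirm they sum to 1:

```python
p_a, p_w = 0.9, 0.8  # P(A fails), P(W fails) — independent events

both = p_a * p_w                              # (a) both bankrupt: 0.72
neither = (1 - p_a) * (1 - p_w)               # (c) neither bankrupt: 0.02
only_one = p_a * (1 - p_w) + (1 - p_a) * p_w  # (b) long way, both sequences

# the three outcomes cover all possibilities, so they must sum to 1
total = both + neither + only_one
```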

15 (a) (You can assume independence because it’s a small sample from a large population.) P(red1 and red2 and red3) = 0.13×0.13×0.13 = 0.0022

(b) P(red) = 0.13; P(redC) = 1−0.13 = 0.87.
P(red1C and red2C and red3C) = 0.87×0.87×0.87 or 0.87³ = 0.6585

Common mistake: Students sometimes compute 1−.13³. But .13³ is the probability that all three are red, so 1−.13³ is the probability that fewer than three (0, 1, or 2) are red. You need the probability that zero are red, not the probability that 0, 1, or 2 are red. Think carefully about where your “not” condition must be applied!

(c) The complement is your friend with “at least” problems. The complement of “at least one is green” is “none of them is green”, which is the same as “every one is something other than green.”
P(green) = 0.16, P(non-green) = 1−0.16 = 0.84.
P(≥1 green of 3) = 1 − P(0 green of 3) = 1 − P(3 non-green of 3) = 1−0.84³ ≈ 0.4073

(d) (Sequences are the most practical way to solve this one.)
(A) G1 and G2C and G3C; (B) G1C and G2 and G3C; (C) G1C and G2C and G3
.16×(1−.16)×(1−.16) + (1−.16)×.16×(1−.16) + (1−.16)×(1−.16)×.16 ≈ 0.3387

16 In “at least” and “no more than” probability problems, the complement is often your friend. The complement of “at least one had not attended” is “all had attended”. If the fans are randomly selected, their attendance is independent and you can use the simple multiplication rule.

P(all 5 attended) = 0.45^5 = 0.0185

P(at least 1 had not attended) = 1 − 0.0185 = 0.9815

17 Sequences are the way to go here:

(cherry1 and orange2) or (orange1 and cherry2)

Common mistake: There are two ways to get one of each: cherry followed by orange and orange followed by cherry. You have to consider both probabilities.

There are 11+9 = 20 sourballs in all, and Grace is choosing the sourballs without replacement (one would hope!), so the probabilities are:

(11/20)×(9/19) + (9/20)×(11/19) = 99/190 or about 0.5211
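An optional Python sketch of the same two sequences, using exact fractions (again, just a cross-check of the hand arithmetic):

```python
from fractions import Fraction

cherry, orange = 11, 9
total = cherry + orange  # 20 sourballs

# one of each flavor, in either order, without replacement
p = (Fraction(cherry, total) * Fraction(orange, total - 1)
     + Fraction(orange, total) * Fraction(cherry, total - 1))

print(p, float(p))  # 99/190, about 0.5211
```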

18 The complement is your friend, and the complement of “win at least once in 5 years” is “win 0 times in 5 years” or “lose 5 times in 5 years”.

P(win ≥1) = 1−P(win 0) = 1−P(lose 5).

P(lose) = 1−P(win) = 1−(1/500) = 499/500

P(lose 5) = [P(lose)]^5 = (499/500)^5 = 0.9900

P(win ≥1) = 1−P(lose 5) = 1−0.9900 = 0.0100 or 1.00%

Common mistake: If you compute 1−(499/500)^5 in one step and get 0.00996008, be careful with your rounding! 0.00996… rounds to 0.0100 or 1%, not 0.0010 or 0.1%.

Common mistake: 1/500 + 1/500 + … is wrong. You can add probabilities only when events are disjoint, and wins in the various years are not disjoint events. It is possible (however unlikely) to win more than once; otherwise it would make no sense for the problem to talk about winning “at least once”.

Common mistake: You can’t multiply 5 by anything. Take an analogy: the probability of heads in one coin flip is 50%. Does that mean that the probability of heads in four flips is 4×50% = 200%? Obviously not! Any process that leads to a probability >1 must be incorrect.

Common mistake: 1−(1/500)^5 is wrong. (1/500)^5 is the probability of winning five years in a row, so 1−(1/500)^5 is the probability of winning 0 to 4 times. What the problem asks is the probability of winning 1 to 5 times.
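The complement calculation is two lines in Python, if you want an optional check on the arithmetic and the rounding:

```python
p_win = 1 / 500

p_lose_5 = (1 - p_win) ** 5        # lose five years in a row ≈ 0.9900
p_win_at_least_1 = 1 - p_lose_5    # complement: win at least once

print(round(p_win_at_least_1, 4))  # 0.01, i.e. 1.00%
```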

19 (a), (b), and (c) are all the possibilities there are, so the probabilities must total 1. You can compute two of them and then subtract from 1 to get the third.

(a) P(not first and not second) = P(not first) × P(not second) = (1−.7)×(1−.6) = 0.12

(c) P(first and second) = P(first) × P(second) = .7×.6 = 0.42

(b) 1−.12−.42 = 0.46

Alternative: You could compute (b) directly too, using sequences:

P(exactly one copy recorded) =

P(first and not second) + P(second and not first) =

P(first)×(1−P(second)) + P(second)×(1−P(first)) =

.7×(1−.6) + .6×(1−.7) = 0.46

A very common mistake on problems like this is writing down only one of the sequences. When you have exactly one success (or exactly any definite number), almost always there are multiple ways to get to that outcome.

You can’t use the “or” formula here, even if you studied it. That computes the probability of one or the other or both, but you need the probability of one or the other but not both.

### Problem Set 2

20 (a) P(ticket on route A) = P(taking route A) × P(speed trap on route A) = 0.2×0.4 = 0.08. In the same way, the probabilities of getting a ticket on routes B, C, D are 0.1×0.3 = 0.03, 0.5×0.2 = 0.10, and 0.2×0.3 = 0.06. He can’t take more than one route to work on a given day, so those are disjoint events. The probability that he gets a ticket on any one morning is therefore 0.08+0.03+0.10+0.06 = 0.27.

(b) The probability of not getting a ticket on a given morning is 1−0.27 = 0.73. The probability of getting no tickets on five mornings in a row is therefore 0.73^5 ≈ 0.2073 or about 21%.
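Both parts can be checked with a short Python sketch (an optional verification of the disjoint-routes sum and the five-day streak):

```python
# (route probability, speed-trap probability) for routes A through D
routes = [(0.2, 0.4), (0.1, 0.3), (0.5, 0.2), (0.2, 0.3)]

# he takes exactly one route per day, so the routes are disjoint
# and the ticket probabilities add
p_ticket = sum(p_route * p_trap for p_route, p_trap in routes)  # 0.27

p_clean_week = (1 - p_ticket) ** 5  # no ticket five mornings in a row
```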

21 Two events A and B are independent if P(A|B) = P(A).

P(man) = 106.2/219.7 ≈ 0.4834

P(man|divorced) = 9.7/22.8 ≈ 0.4254

Since P(man|divorced) ≠ P(man), the events are not independent.

Alternative solution:  You could equally well show that P(divorced|man) ≠ P(divorced):

P(divorced|man) = 9.7/106.2 ≈ 0.0913

P(divorced) = 22.8/219.7 ≈ 0.1038

22 What’s the probability of ten of the same flip in a row? In other words, given either result, what’s the probability that the next nine will be the same? That must be (1/2)^9 = 1/512. You therefore expect this to happen about once in every 500 flips, or about twice in every thousand.
23 P(open door) = P(unlocked) + P(locked)×P(right key)
P(open door) = 0.5 + 0.5×(2/5) = 0.7

## Solutions for Chapter 6

1 (a) 0, 1, 2, 3, 4, 5
(b) There are five trials, each die is either a two or not a two, and the dice are independent. This fits the binomial model.
2
x (\$)       P(x)
9,999,995    1/10,000,000
95           1/125
5            1/20
−5           0.9419999

(a) The probability model is shown above. (I computed the probability of losing \$5 as 1−[1/10000000+1/125+1/20].)

(b) \$ in L1, probabilities in L2. `1-VarStats L1,L2` yields μ = −2.70. The expected value of a ticket is −\$2.70. This is a bad deal for you. (It’s a very good deal for the lottery company. They’ll make \$2.70 per ticket, on average.)

Common mistakes: Students sometimes give hand-waving arguments such as the top prize being very unlikely, or the lottery company always getting to keep the ticket price, but these are not relevant. The only thing that determines whether it’s a good or bad deal for the player is the expected value μ.
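If you'd like to verify the calculator's `1-VarStats` result, here's an optional Python sketch of the same expected-value computation, using exact fractions so no rounding creeps in:

```python
from fractions import Fraction

# net winnings and their probabilities, from the model above
outcomes = [
    (9_999_995, Fraction(1, 10_000_000)),
    (95, Fraction(1, 125)),
    (5, Fraction(1, 20)),
]
p_lose = 1 - sum(p for _, p in outcomes)  # 0.9419999
outcomes.append((-5, p_lose))

mu = sum(x * p for x, p in outcomes)  # expected value of one ticket
print(float(mu))  # -2.7
```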

3 (a) This is a geometric model: repeated failures until a success, with p = 0.066.

μ = 1/p = 1/.066 ≈ 15.2

Over the course of her undead existence, taking each night’s hunt as a separate experience, the average of all nights has her first getting an O negative drink from her fifteenth victim.

(b) `geometcdf(.066,10)` = .4947936946 ≈ 0.4948. Velma has almost a 50% chance of getting O negative blood within her first ten victims.

(You could also do this as a binomial, n = 10, p = 0.066, x = 1 to 10.)

(c) This is a binomial model with n = 10, p = 0.066, and x = 2. Use MATH200A part 3 or `binompdf(10,.066,2)` = .1135207874 ≈ 0.1135. Velma has just over an 11% chance of getting exactly two O negative victims within her first ten.
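The TI commands `geometpdf`, `geometcdf`, and `binompdf` are easy to reproduce from their formulas; this optional Python sketch checks all three parts of the problem:

```python
from math import comb

def geometpdf(p, x):
    """P(first success on trial x)."""
    return (1 - p) ** (x - 1) * p

def geometcdf(p, x):
    """P(first success within the first x trials)."""
    return 1 - (1 - p) ** x

def binompdf(n, p, x):
    """P(exactly x successes in n trials)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

mu = 1 / 0.066                        # (a) ≈ 15.2 victims, on average
within_ten = geometcdf(0.066, 10)     # (b) ≈ 0.4948
exactly_two = binompdf(10, 0.066, 2)  # (c) ≈ 0.1135
```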

4 This is a geometric distribution. You’re looking for someone who is opposed to universal background checks, so p = 1−.92 = 0.08.

(a) geometpdf(.08, 3) = .067712 → 0.0677

(b) geometcdf(.08, 3) = .221312 → 0.2213

(You could also do this as a binomial with n = 3, p = 0.08, x = 1 to 3.)

5

(a) This is a binomial distribution: each student passes or not, whether one student passes has nothing to do with whether anyone else passes, and there are a fixed seven trials.

μ = np = 7*0.8 ⇒ μ = 5.6 people

σ = √[npq] = √[7*0.8*(1−0.8)] = 1.058300524 ⇒ σ = 1.1 people

(b) Binomial again, n = 7, p = 0.8, x = 4 to 6. Use `binompdf`-`sum` or MATH200A part 3 to find P(4 ≤ x ≤ 6) = 0.7569.

(c) Geometric model: p = 0.8, x = 3. `geometpdf(.8,3)` = 0.0320

(d) `geometcdf(.8,2)` = 0.9600

Alternative solution: Binomial probability with n = 2, p = 0.8, x = 1 to 2 gives the same answer.

6 This is binomial data, p = .49. For a sample of 40, expected value is μ = np = 40×.49 = 19.6. 13 is less than 19.6, so asking whether 13 is surprising is really asking whether 0 to 13 is surprising; see Surprise!

`binomcdf(40,.49,13)` or MATH200A Program part 3 with n=40, p=.49, x=0 to 13 gives .0259693307  →  0.0260, less than 5%, so you would be surprised though maybe not flabbergasted. ☺
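An optional Python check of the same cumulative probability, built from the binomial formula (the same thing `binomcdf` computes):

```python
from math import comb

def binomcdf(n, p, x):
    """Cumulative binomial P(0 to x successes), like the TI-84 command."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

p_low = binomcdf(40, 0.49, 13)  # P(0 to 13 successes) ≈ 0.0260
print(p_low < 0.05)             # True: surprising at the 5% level
```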

7 (a) Probability of one equals proportion of all, and therefore a randomly selected 22-year-old male has a 0.1304% chance of dying in the next year. That’s the only “prize”, so multiply it by its probability to find fair price: 100000×0.001304 = \$130.40

(b) The company’s gross profit is \$180.00−130.40 = \$49.60, about 28%. But it could very well cost the company that much to sell the policy, pay the agent’s commission, and enter the policy in the computer. Also, all policies must bear part of the company’s general overhead costs. The price is not necessarily unfair in the plain English sense.

8

(a) x’s in L1, P’s in L2. `1-VarStats L1,L2` yields μ = 2 (exactly) and σ = 1.095353824 or σ ≈ 1.1. Interpretation: In the long run, on average you expect to get two heads per group of five flips. You expect most groups of five flips will yield between μ−σ = 1 head and μ+σ = 3 heads.

(b) (I wouldn’t use this part as a regular quiz question.) The long-term average is 2 heads out of 5 flips, which is p = 2/5 = 40%. Obviously coin flips are independent, so the probability of heads must be the same every time. Therefore you have a binomial model with n = 5 and p = 0.4.

9

(a) Binomial probability with n = 5, p = 0.7, x = 3 to 5. MATH200A part 3 5, .7, 3, 5 yields .83692 or P(x ≥ 3) = 0.8369. Or, `binompdf(5,.7)→L6` and then `sum(L6,4,6)` to get the same answer. Or, use the complement: 1−`binomcdf(5,.7,2)`.

(b) You need the mean of the binomial distribution:

μ = np = 10×0.7 = 7

(c) 5 is less than the expected number, so you compute P(x≤5):

MATH200A part 3 10, .7, 0, 5 yields 0.1503, or

`binomcdf(10,.7,5)` = 0.1503, not surprising

Common mistake: Don’t just compute P(x=5), which is 0.1029. When you want to know whether a result is unusual or surprising, you have to find the probability of that result or one even further from the expected value.

10 (a) Geometric model, p = 0.34. μ = 1/.34 ≈ 2.94. About three
(b) `binompdf(5,.34,0)` = .1252332576, about a 12.5% chance
11 Your words will vary, but you should have the idea that the binomial model is a fixed number of trials with varying number of successes, whereas the geometric model is a varying number of trials that ends with the first success.
12 Your words will vary, but you should have the idea that a pdf is the probability of a specific outcome, and the cdf is the cumulative probability of all outcomes 0 through a specified number. I’m not so concerned that you know what pdf and cdf stand for, as long as you understand what they mean and when to use each.

## Solutions for Chapter 7

1
• On any given trip, there’s a 9% chance that Chantal’s commute will be less than 17 minutes.
• 9% of Chantal’s commutes are shorter than 17 minutes.
2 P(x ≥ 76.5) = normalcdf(76.5, 10^99, 69.3, 2.92) = .0068362782.
Here are the two interpretations, from Interpreting Probability Statements in Chapter 5:
• The probability that a randomly selected man is 76.5″ or taller is 0.0068 or 0.68%.
• Only 0.68% of men are 76.5″ tall or taller.
3 “Have boundaries, find probability.”

P(64 ≤ x ≤ 67) = normalcdf(64, 67, 64.1, 2.75) = 0.3686871988 → 0.3687

36.87% of women are 64″ to 67″ tall.
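The `normalcdf` computation can be cross-checked with Python's standard library, using `statistics.NormalDist` (an optional sketch, not part of the book's TI workflow):

```python
from statistics import NormalDist

women = NormalDist(mu=64.1, sigma=2.75)

# area between the two boundaries, same as normalcdf(64, 67, 64.1, 2.75)
p = women.cdf(67) - women.cdf(64)
print(round(p, 4))  # 0.3687
```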

4 5% probability in the two tails means 2.5% or 0.025 in each tail.

x1 = invNorm(.025, 69.3, 2.92) = 63.57690516

x2 = invNorm(1−.025, 69.3, 2.92) = 75.02309484

Heights under 63.6″ or over 75.0″ would be considered unusual.
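`invNorm` has a standard-library counterpart too, `NormalDist.inv_cdf`; this optional sketch reproduces both boundaries:

```python
from statistics import NormalDist

men = NormalDist(mu=69.3, sigma=2.92)

low = men.inv_cdf(0.025)       # like invNorm(.025, 69.3, 2.92)
high = men.inv_cdf(1 - 0.025)  # like invNorm(1−.025, 69.3, 2.92)
print(round(low, 1), round(high, 1))  # 63.6 75.0
```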

5 The area to left is given as 15% or 0.15, and you need the boundary.

P15 = invNorm(.15, 69.3, 2.92) = 66.27361453

You must be at least 66″ or 5′6″ tall. Also acceptable: at least 66¼ inches, or at least 66.3 inches.

6 (a) By the definition of percentile, the number of the desired percentile is also the area to left.

P25 = invNorm(.25, 64.1, 2.75) = 62.24515319 → P25 = 62.2″

P75 = invNorm(.75, 64.1, 2.75) = 65.95484681 → P75 = 66.0″

(b) Q3 is P75 and Q1 is P25, so the IQR is P75−P25 = 65.95484681−62.24515319 = 3.70969362 → IQR = 3.7″.

(c) 1.35σ = 1.35×2.75 = 3.7125 → 3.7″, matching the IQR as expected. (The match isn’t perfect, because 1.35 is a rounded number.)

7 Use MATH200A Program part 4. The screens are shown at right. The points fall reasonably close to a line. r = 0.9595 and crit = 0.9383. r > crit, and therefore you can say that the normal model is a good fit to the data.
8 The percentile is the percent of the population that scored ≤735.

P(x ≤ 735) = normalcdf(−10^99, 735, 500, 100) = 0.9906.

A score of 735 is at the 99th percentile.

9 2% or 0.02 is area to right, but invNorm needs area to left, so you subtract from 1.

x1 = invNorm(1−.02, 1500, 300) = 2116.124673

You must score at least 2117. (If you round to 2116, you get a number that is a bit less than the computed minimum. While rounding usually makes sense, there are situations where you have to round up, or round down, instead of following the usual rule.)

10 z0.01 = invNorm(1−0.01, 0, 1) = 2.326347877 → z0.01 = 2.33
11 P(x < 60) = normalcdf(−10^99, 60, 69.3, 2.92) = 7.240062385E−4
P(x < 60) = 7.24×10−4 or (better) 0.0007

Common mistake: The probability is not 7.24! That’s not just wrong, it’s very wrong — probabilities are never greater than 1. “E−4” on your calculator comes at the end of the number, but it’s critical info. It means “times 10 to the minus 4th power”, so the probability is 7×10−4 or 0.0007.

• The probability that a randomly selected man is under 60″ tall is 0.0007 or 0.07%.
• 0.07% of men are under 60″ tall.
12 The plot is pretty clearly not a straight line — there’s a sharp bend around the second and third data points. The numbers confirm this: r = .8363, crit = .9121, r < crit, and therefore the normal model is not a good fit for this data set.
13 The middle 90% leaves 10% in the two tails, or 5% in each tail.

xm1 = invNorm(.05, 69.3, 2.92) = 64.49702741

xm2 = invNorm(1−.05, 69.3, 2.92) = 74.10297259

xf1 = invNorm(.05, 64.1, 2.75) = 59.57665253

xf2 = invNorm(1−.05, 64.1, 2.75) = 68.62334747

Men must be 64.5 to 74.1 inches tall; women must be 59.6 to 68.6 inches tall.

## Solutions for Chapter 8

1 This is numeric data. You have a random sample, and it’s less than 10% of the households in the country. Despite the skew, with sample size so far above 30 you can be sure that the shape of the sampling distribution is approximately normal. The mean of the sampling distribution is μx̄ = μ = \$48,000. The SD of the sampling distribution of the mean, a/k/a standard error of the mean, is σx̄ = σ/√n = \$2000/√64 → σx̄ = \$250.
2 (a) First, describe the distribution and sketch the situation. For the population, you’re given μ = 800, σ = 50, n = 100.
• Center: The mean of the sampling distribution is the same as the mean of the population, 800 hours.
• Spread: The standard error of the mean is σx̄ = σ/√n = 50/√100 = 5 hours.
• Shape: You have a random sample, 10n = 10×100 = 1000 is certainly less than the total number of light bulbs, and your sample size is comfortably larger than 30. Therefore you can use the normal model for the sampling distribution.

Sample means are ND with mean 800 hours and SD 5 hours. The sketch is at right.

Common mistake: The correct standard deviation is 5 hours, not 50. You’re not sketching the population of light bulbs. Rather, you’re now interested in the distribution of average lifetimes in samples of 100 bulbs. (The axis is the x̄ axis, not the x axis.)

780 hours, the sample mean that the problem asks about, is 20 hours below the population mean of 800. 20/5 = 4 standard errors, so you should have marked 780 hours at four standard deviations below the mean.

A sample mean of 780 is less than the population mean of 800 hours. Therefore you compute the probability of a sample mean of 780 hours or less. It will be surprising (unusual, unexpected) if the probability is under 5%.

P(x̄ ≤ 780) = normalcdf(−10^99, 780, 800, 50/√(100)) = 3.1686E-5 → P(x̄ ≤ 780) = 0.00003

(You can also give the probability as <0.0001.) Yes, this is surprising.

Common mistake: Don’t give the probability as 3.1686. Probabilities are never greater than 1.

(b) If the manufacturer’s claim is true, there are only three chances in a hundred thousand of getting a sample mean this low. It’s very unlikely that the manufacturer’s claim is true.
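The whole calculation — standard error, then the left-tail probability — can be sketched in Python as an optional cross-check of the calculator work:

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 800, 50, 100
sem = sigma / sqrt(n)   # standard error of the mean: 5 hours

sampling = NormalDist(mu, sem)
p = sampling.cdf(780)   # P(x-bar <= 780): about 3 chances in 100,000
```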

3 (a) “Describe the distribution” means shape, center and spread. You can always get center and spread, but if the test for normal approximation fails then you can’t say anything about the shape.
• μp̂ = p = 0.72
• σp̂ or SEP = √(pq/n) = √(.72×(1−.72)/500) = 0.0200798406
• Expected “yes” per sample: np = 500×.72 = 360; expected “no” = 500−360 = 140; both are well above 10. You have a random sample, and 10×500 = 5000 is far less than the American population. Therefore the normal approximation is valid.

Answer: normally distributed with mean = 0.72, standard deviation (standard error) = 0.020

Common mistake: Don’t write n≥30 when testing the normal approximation. The n≥30 test applies to numeric data, but in this problem you have binomial data.

(b) 350/500 = 0.70 exactly, and 370/500 = 0.74 exactly. In a sample of 500, finding 350 to 374 successes is the same as finding 70% to 74% successes.

If you stored the computed SEP in part (a), then your screen will look like the one on the left. Otherwise, it will look like the one on the right. Answer: P(70% ≤ p̂ ≤ 74%) = 0.6808.

Remark: Always check for reasonableness. 70% and 74% are one standard error below and above the mean, so you know from the Empirical Rule that about 68% of the data should be within that region.

Remark: The problem wanted you to use the normal approximation, but it’s always good to check answers by a different method if possible. 70%×500 = 350; 74%×500 = 370. MATH200A part 3 with n=500, p=.72, from 350 to 370, gives a probability of 0.7044, pretty good agreement.
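An optional Python sketch of the normal approximation, confirming both the standard error and the ≈68% probability:

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.72, 500
sep = sqrt(p * (1 - p) / n)  # standard error of the proportion ≈ 0.0201

sampling = NormalDist(p, sep)
prob = sampling.cdf(0.74) - sampling.cdf(0.70)  # ≈ 0.68, per the Empirical Rule
```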

4 The sampling distribution of x̄ is ND because the sample size of 1000 is greater than 30 and the random sample is smaller than 10% of the population (10% of 100,000 households is 10,000 households). The SEM is σx̄ = 19000/√1000 ≈ \$601.

P(x̄ ≤ \$31,000) = normalcdf(−10^99, 31000, 32400, 19000/√(1000)) = 0.0099, almost exactly 1%. That would be pretty unlikely if the population mean were still \$32,400, so the city manager is most likely correct.

Remark: This problem was adapted from Freedman, Pisani, Purves (2007, 415) [see “Sources Used” at end of book].

5
x      P(x)
+10    18/38
−10    20/38
n/a    38/38 = 1

(a) The model is at right. You could list green and black separately, but since they have the same outcome there’s no need to do that. It’s important to have the probabilities as exact fractions, not approximate decimals. (b) x’s in L1, P’s in L2. `1-VarStats L1,L2` gives μ = −\$0.53, σ = \$9.99. Interpretation: In the long run, a player who bets \$10 on red will lose an average of 53¢ per bet.

Remark: Notice that the SD is about 20 times the mean. This is why gambling is so exciting for the player: there’s a lot of variability from one bet to the next.

(c) With n = 10,000, the sampling distribution of x̄ is normally distributed. (10n = 10×10,000 = 100,000, less than the total number of bets while the casino is in business. The bets placed in a given day are not random, but they are representative of all possible bets and therefore effectively random.) The mean of the sampling distribution is the mean of the population: μx̄ = +\$0.53. (Whatever players lose, the casino wins, so the mean is the opposite of a player’s mean.) The standard error of the mean is σ/√n = 9.986139979/√10000; σx̄ ≈ \$0.10.

Remark: This is why gambling is predictable for the operators: the SD is small compared to the mean.

(d) 10,000×\$.5263157895 = \$5,263.16

(e) To lose money, the casino has to make less than \$0.00. Zero is more than five standard errors below the mean (has a z-score below −5), so you know right off that it would be unusual for the casino to lose money. `normalcdf` confirms that: P(lose on 10,000 bets) = 6.8×10−8. The casino has essentially no risk (7 chances in 100 million) of losing money on 10,000 bets.

(f) Remember the elevator example. A total of \$2000 on 10,000 bets is an average of 2000/10,000 = \$0.20 per bet. Use `normalcdf` to compute the probability of doing that well or better: P(make ≥\$2000) = 0.9995. Not only is the casino virtually certain not to lose money, it’s almost certain to make a handsome profit, as long as people come in to place bets.

6 Given: μ = 5.00, σ = 0.05, n = 15. Needed: P(∑x>75.6). A sample weighing 75.6 lb total will have a sample mean of 75.6/15 = 5.04 lb, so this is really just another problem in finding the probability of turning up a sample mean in a given range.
• μx̄ = μ = 5.00 lb
• The SEM is σx̄ = 0.05/√15 ≈ 0.013 lb.
• The sample means are normally distributed, even for this small sample, because the original population is normally distributed.

P(∑x > 75.6) = P(x̄ > 5.04) = normalcdf(75.6/15, 10^99, 5.00, 0.05/√(15)) = 9.7295E-4 ≈ 0.0010, about one chance in a thousand.

7 (a) This part is a standard Chapter 7 problem about individuals, not samples, so the axis is x rather than x̄. Answer: P(x > 43.0) = 0.1634

(b) The sampling distribution of x̄ is ND, even for this small sample, because the population is ND. The standard error is σx̄ = 5.1/√14 ≈ 1.4.

P(x̄ > 43.0) = normalcdf(43, 10^99, 38, 5.1/√(14)) = 1.2212E-4 → P(x̄ > 43.0) = 0.0001 or 0.01%

Remark: This sketch is not very well proportioned, because it makes the probability look much larger than it actually is.

8 12,778 KW shared among 1000 households is 12778/1000 = 12.778 KW per household on average. “Fail to supply enough power” means that the households are using more power than that. You need P(x̄ > 12.778) for n = 1000.

The standard error of the mean is σx̄ = 3.5/√1000, about 0.11. The sampling distribution of the mean is normal because data are numeric and n = 1000, greater than 30. (Treat the sample as random because it’s a “typical neighborhood”. And a thousand households is less than 10% of all the households that there are.)

P(x̄ > 12.778) = normalcdf(12.778, 10^99, 12.5, 3.5/√(1000)) = 0.0060

9 p = 0.0171, n = 11,037, and you want to find P(p̂ ≤ 0.0094). First check that the sampling distribution of p̂ is a ND:
• The doctors were randomized between treatment and placebo groups.
• 10×11,037 = 110,370. There are more adult males than that.
• np = 11037×.0171 = about 189; nq = 11037−189 = 10848. Both are well above 10.

Therefore the sampling distribution can be approximated by a normal distribution.

The standard error of the proportion or SEP is σp̂ = √(pq/n) = √(.0171×(1−.0171)/11037) ≈ 0.0012

If you use my shortcut, your screen will look like the one at the left; if not, it will look like the one at the right. Either way, the probability is 2.2013×10−10, or 0.000 000 000 2. There are only two chances in ten billion of getting a sample proportion of 0.94% or less with sample size 11,037, if the true population proportion is 1.71%. That’s pretty darn unlikely, so based on this experiment you can rule out coincidence and decide that aspirin does reduce the chance of a heart attack among adult males.

10 Heights are ND, so the sampling distribution is also. By the Empirical Rule or 68–95–99.7 Rule, 95% of a ND falls within 2 SD of the mean. The distribution that concerns you in this problem is the sampling distribution of x̄, not the original distribution of individual men’s heights. Therefore, the SD that concerns you is the standard error of the mean, not the SD of men’s heights.

The standard error of the mean or SEM is σx̄ = σ/√n = 2.92/√16 = 0.73″.

μ ± 2σx̄ = 69.3 ± 2×.73 = 67.84 to 70.76.

Sample means between those values would not be surprising, and therefore a sample mean would be surprising if it is under 67.84″ or over 70.76″.

Alternative solution: That back-of-the-envelope calculation is good enough, but you could also get a more precise answer:

L = invNorm(0.025, 69.3, 2.92/√(16)) = 67.87

H = invNorm(1−0.025, 69.3, 2.92/√(16)) = 70.73

11 This is like the Swain v. Alabama example. You have to convert the sample counts into a proportion: p̂ = 737/1504 ≈ 49%. The problem is really asking you for P(p̂ ≥ 49%) in a sample of 1504 with a population proportion of 45%. What does the sampling distribution look like? The center is μp̂ = p = 0.45. The standard error is σp̂ = √(0.45×(1−.45)/1504) ≈ 0.013. Check requirements to make sure that a normal model can be used for the sampling distribution:

• Random sample? Yes, given.
• Sample less than 10% of population? 10×1504 = 15,040, compared to millions of American adults, OK.
• Sample large enough? Yes, 0.45×1504 ≈ 677 successes and 1504−677 ≈ 827 failures expected, both above 10.

P(x ≥ 737) = P(p̂ ≥ 49%) = normalcdf(737/1504, 10^99, .45, √(.45*(1−.45)/1504)) ≈ 9E-4 or 0.0009.

Can you draw a conclusion? Yes, you can. In a population with 45% unfavorable rating of the Tea Party, there are only 9 chances in 10,000 of getting a sample as unfavorable as this one (or more unfavorable). That’s pretty unlikely, so you conclude that the true unfavorable rating in October was most likely more than 45% of all Americans. (In Chapter 9, you’ll learn how to estimate that proportion from a sample.)

## Solutions for Chapter 9

1 You make probability statements about things that can change if you repeat the experiment. There’s a 1/6 chance of rolling doubles, because you’ll get doubles about 1/6 of the times that you roll two dice. But the mean of the population is one definite number. It doesn’t change from one experiment to the next. Your estimate changes, because it’s based on your sample and no sample is perfect. But the thing you’re trying to estimate, mean or proportion, is what it is even though you don’t know it exactly.

(Statisticians would say, “the population mean or proportion is not a random variable.” By that, they mean just what I said in less technical language.)

2

Answer: A confidence interval for numeric data is an estimate of the average, and tells you nothing about individuals. Correct his conclusion to: I’m 90% confident that the average food expense for all TC3 students is between \$45.20 and \$60.14 per week.

Remark: Use all or a similar word to show that you’re estimating the mean for the population, not just the sample of 40 students. There’s no need to estimate the mean of the sample, because you know the exact sample mean for your sample.

Remark: Be clear in your mind that you’re estimating the average spending per student at \$45–60 a week. Some individual students will quite likely spend outside that range, so your interpretation shouldn’t say anything about individual student spending.

3

Answer: It’s the use of the word average. When you collect data points that are all yes/no or success/failure, you have a sample proportion p̂, equal to the number of successes divided by sample size, and you can estimate a population proportion. There is no “average” with non-numeric data.

Your 90% confidence estimate is simply that 27% to 40% usually or always prepare their own food.

4

This is a confidence interval about a mean, Case 1 in Inferential Statistics: Basic Cases.

Requirements: random sample, OK. 10n = 10×40 = 400 is less than total number of batteries made; OK. n = 40 >30, OK.

TInterval 1756, 142, 40, .95

(1710.6, 1801.4)

Neveready is 95% confident that the average Neveready A cell, operating a wireless mouse, lasts 1711 to 1801 minutes (28½ to 30 hours).

Common mistake: Don’t make any statement about 95% of the batteries! Your CI is about your estimate of one number, the average life of all batteries. Your CI has a margin of error of ±15 minutes; the 95% range for all batteries would be about 4 to 5 hours.

5 (a) p̂ = 5067/10000 = 0.5067

Don’t make the term “point estimate” harder than it is! The point estimate for the population mean (or proportion, standard deviation, etc.) is just the sample mean (or proportion, standard deviation, etc.).

(b) The sample is his actual data, the 10,000 flips. Therefore the sample size is n = 10,000. The population is what he wants to know about, all possible flips. The population size is infinite or “indefinitely large”.

6

This is sample size for a confidence interval about a proportion, Case 2 in Inferential Statistics: Basic Cases. Since you have no prior estimate, use 0.5 for p̂.

With the MATH200A program (recommended): MATH200A/Sample size/Binomial, p̂ = .5, E = .035, C-Level = .95; sample size is at least 784.

If you’re not using the program, the formula is n = p̂(1−p̂)·(zα/2/E)².
1−α = .95 ⇒ α/2 = 0.025.
z0.025 = invNorm(1−.025, 0, 1) ≈ 1.96.
Divide by .035, square the result, and multiply by .5*(1−.5).

Answer: at least 784. Remember — you’re not rounding, you’re going up to a whole number.
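The hand calculation — find the critical z, divide by E, square, multiply by p̂(1−p̂), then go up to a whole number — can be sketched in Python as an optional check:

```python
from math import ceil
from statistics import NormalDist

p_hat, E, conf = 0.5, 0.035, 0.95

z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # z_0.025 ≈ 1.96
n = ceil(p_hat * (1 - p_hat) * (z / E) ** 2)  # ceil: go UP, don't round off

print(n)  # 784
```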
7

This is a confidence interval about a proportion, Case 2 in Inferential Statistics: Basic Cases.

Requirements:

• Random sample, OK.
• 10n = 10×100 = 1000 < 68,917, OK.
• 40 successes, 100−40 = 60 failures, both > 10, OK.

Common mistake: Don’t say “n > 30” or “n ≥ 30”. That’s true, but it doesn’t help you with binomial data. For computing a confidence interval about a proportion from binomial data, the “sample size large enough” condition is at least 10 successes and at least 10 failures, not sample size at least 30.

1-PropZInt 40, 100, .9 → (.31942, .48058), p̂ = .4

31.9% to 48.1% of all claims at that office have been open for more than a year (90% confidence).
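The interval the calculator's 1-PropZInt produces is just p̂ ± z·√(p̂(1−p̂)/n); this optional Python sketch reproduces it:

```python
from math import sqrt
from statistics import NormalDist

x, n, conf = 40, 100, 0.90
p_hat = x / n

z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # ≈ 1.645 for 90% confidence
E = z * sqrt(p_hat * (1 - p_hat) / n)         # margin of error

lo, hi = p_hat - E, p_hat + E
print(round(lo, 4), round(hi, 4))  # 0.3194 0.4806
```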

8

This is a confidence interval about a mean, Case 1 in Inferential Statistics: Basic Cases.

Requirements check:

• Random sample, OK.
• 10×40 = 400, less than the number of times she could commute (past, present, and future), OK.
• Sample size 40 > 30, OK.

TInterval 17.7, 1.8, 40, .95 → (17.124, 18.276)
She’s 95% confident that the average of all her commutes is 17.1 to 18.3 minutes.

9

This is a confidence interval about a mean, Case 1 in Inferential Statistics: Basic Cases.

Requirements check:

• Random sample, OK.
• 10×15 = 150 is less than total number of women in their 20s.
• MATH200A/Normality: r=.9667, CRIT=.9383, r>CRIT, OK.
• MATH200A/Box-whisker: no outliers, OK.

TInterval L6, 1, .95 → (62.918, 65.016), x̅=63.96666667, s=1.894226818, n=15

The average height of women aged 20–29 is 62.9 to 65.0 inches (95% confidence).

Remark: Since adult women’s heights are known to be normally distributed, you could get away without checking for normality and outliers in this sample. But it does no harm to check every time.

10

This is a confidence interval about a mean, Case 1 in Inferential Statistics: Basic Cases.

Requirements check:

• Random sample, OK.
• 10×18 = 180. There are far more than 180 male students; OK.
• MATH200A/box-whisker: no outliers, OK.
• MATH200A/Normality check: r=.9787, CRIT=.9461, r>CRIT, OK.

TInterval L5, 1, .9 → (97.757, 98.343), x̅ = 98.05, s =.7155828558 → 0.72, n = 18.
(a) Fred is 90% confident that the average body temperature of healthy male students is 97.8 to 98.3 °F.

(b) He’s 90% confident that the average body temperature is not more than 98.3°, so 98.6° as normal (average) temperature is inconsistent with his data.

(c) E = 98.343−98.05 = 0.3°, or E = 98.05−97.757 = 0.3°, or (98.343−97.757)/2 = 0.3°.

(d) With the MATH200A program (recommended): MATH200A/Sample size/Num unknown σ: s=.7155828558, E=.1, C-Level=.95, n≥202. He will need at least 202 in his sample.

If you’re not using the program: Confidence level = 1−α = 0.95 ⇒ α = 0.05 ⇒ α/2 = 0.025. z0.025 = invNorm(1−.025). Multiply by s, divide by E, and square the result. This gives 197. But the t distribution is more spread out than the normal (z) distribution, so you probably want to bump that number up a bit, say to 200 or so.

11

This problem is about a confidence interval about a proportion, Case 2 in Inferential Statistics: Basic Cases.

(a) Requirements check:

• Random sample, OK.
• 10×500 = 5000. A city of 6.4 million must have more than 5000 in that age range. OK.
• 219 successes, 500−219 = 281 failures. Both > 10, OK.

1-PropZInt, 219, 500, .9 → (.4015, .4745), p̂ = .438

You’re 90% confident that 40.2% to 47.5% of Metropolis adults aged 50–75 have had a colonoscopy in the past ten years.

(b) MATH200A/sample size/binomial, p̂ = .438, E = .02, C-Level = .9 → at least 1665

12

This is a confidence interval about a mean, Case 1 in Inferential Statistics: Basic Cases.

Requirements check:

• Random sample, OK.
• 10×20 = 200, less than the total number of cash deposits, OK.
• MATH200A/normality check, r=.9864, CRIT=.9503, r>CRIT, OK.
• MATH200a/box-whisker, no outliers, OK.

TInterval L4, 1, .95 → (179.86, 198.93), x̅ = 189.40, s = 20.37, n = 20

You’re 95% confident that the average of all cash deposits is between \$179.86 and \$198.93.

Common mistake: Don’t say that 95% of deposits are between those values — if you look at the sample you’ll see that’s pretty unlikely. You’re estimating the average, not the individual deposits in the population.

13

This is a confidence interval about a proportion, Case 2 in Inferential Statistics: Basic Cases.

Requirements check:

• Systematic sample, OK.
• Sample 10n = 10×1000 = 10,000, less than the number of voters; OK.
• 520 successes and 1000−520 = 480 failures, OK.

1-PropZInt 520, 1000, .95 → (.48904, .55096), p̂ = .52

With 95% confidence, 48.9% to 55.1% of voters voted Snake. At the 95% confidence level, we can’t tell whether more or less than 50% of voters voted for Abe Snake.

## Solutions for Chapter 10

Because this textbook helps you,
BrownMath.com/donate.

### Problem Set 1

1

1. Hypotheses.
2. Significance level.
RC. Requirements check.
3–4. Test statistic and p-value.
5. Decision rule (or, conclusion in statistics language).
6. Conclusion (in English).

2

It keeps you honest. If you could select a significance level after computing the p-value, you could always get the result you want, regardless of evidence.

3

Answers will vary here. But your answer should include the key idea: if H0 is true, the p-value is the chance of getting the sample you got, or a sample even further from H0, purely by random chance. For more correct statements, and common incorrect statements, see What Does the p-Value Mean?

4

(a) It’s too wishy-washy. When p<α, you can reach a conclusion. Correction: The accelerant makes a difference, at the 0.05 significance level.

(b) You can never prove the null hypothesis of “no difference”. You can’t even say “The accelerant may make no difference,” because that’s only part of the truth: it equally well may make a difference. You must say something like, “At the 0.05 significance level it’s impossible to say whether the accelerant makes a difference or not.”

5

(a) A Type I error is rejecting the null hypothesis when it’s actually true. In this case, a Type I error would be concluding “the accelerant makes paint dry faster” when actually it makes no difference. This would lead you to launch the product and expose yourself to a lot of warranty claims.

(b) A Type II error is failing to reject the null hypothesis when it’s actually false. In this case, a Type II error would be concluding “the accelerant doesn’t make paint dry faster” when actually it does. This would lead you to keep the product off the market even though it could add to your sales and would perform as promised.

6

They are not necessarily mistakes. Type I and II errors are an unavoidable part of sample variability. Nothing can prevent them entirely. The only way to make them both less likely at the same time is to use a larger sample size.

That said, if you make mistakes in data collection or analysis you definitely make Type I or Type II errors (or both of them) more likely.

7

Make your significance level α smaller. The side effect is making a Type II error more likely.

8

Your own words will vary from mine, but the main difference is that when p > α you can’t reach a conclusion. Accepting H0 is wrong because it reaches the conclusion that H0 is true. Failing to reject H0 is correct because it leaves both possibilities open.

It’s like a jury verdict of “not guilty beyond a reasonable doubt.” The jury is not saying the defendant didn’t do it. They are saying that either he didn’t do it or he did it but the prosecution didn’t present enough evidence to convince them.

A hypothesis test can end up rejecting H0 or failing to reject it, but the result can never be to accept H0.

9

H0: μ = 500
H1: μ ≠ 500

Remark: It must be ≠, not > or <, because the claim is that the mean is 500 minutes, and a difference in either direction would destroy the claim.

10

(a) p > α; fail to reject H0. At the 0.01 significance level, we can’t determine whether the directors are stealing from the company or not.

(b) p < α; reject H0 and accept H1. At the 0.01 level of significance, we find that the directors are stealing from the company.

11

α is the probability of a Type I error that you can tolerate. A Type I error in this case is determining that the defendant is guilty (calling H0 false) when actually he’s innocent (H0 is really true), and the consequence would be putting an innocent man to death. You specify a low α to make it less likely this will happen. Of the given choices, 0.001 is best.

12

This is binomial data, a Case 2 test of proportion in Inferential Statistics: Basic Cases.

(1) H0: p = .1, 10% of TC3 students driving alcohol impaired
H1: p > .1, more than 10% of TC3 students driving alcohol impaired
(2) α = 0.05
(RC) Systematic sample (counts as random), OK. npo = 120×.10 = 12 successes and n−npo = 120−12 = 108 failures expected, OK. 10n = 10×120 = 1200, and there are many more students than that at TC3, OK.
(3–4) 1-PropZTest: .1, 18, 120, >po
Results: z=1.825741858 → z = 1.83, p=.0339445194 → p = 0.0339, p̂ = .15
(5) p < α. Reject H0 and accept H1.
(6) At the 0.05 significance level, more than 10% of TC3 students were alcohol impaired on the most recent Friday or Saturday night when they drove. Or, More than 10% of TC3 students were alcohol impaired on the most recent Friday or Saturday night when they drove (p = 0.0339).
13

This is binomial data (against or not against): a Case 2 test of population proportion in Inferential Statistics: Basic Cases.

Requirements check: Random sample? NO, this is a self-selected sample, consisting only of those who returned the poll. (That could be overcome by following up on those who did not return the poll, but nobody did that.)

The 10n≤N requirement also fails. 10n = 10×380 = 3800, much larger than the 1366 population size.

Answer: No, you cannot do any inferential procedure because the requirements are not met.

14 (a) The sample size is 325. Why not the 500 she talked to? Because she was studying the habits of the primary grocery shoppers. The 325 were members of that population and could therefore be part of her sample; the rest of the 500 were not.

(b) The population is all persons who do the primary grocery shopping in their households. We don’t know the precise number, but it is surely in the millions since there are millions of households. We can say that it is indefinitely large.

(c) The number 182 is x, the number of successes in the sample.

(d) She wanted to know whether the true proportion is greater than 40%, so her alternative hypothesis is H1: p > 0.4 and po is 0.4.

(e) No. The researcher is interested in the habits of the primary grocery shoppers in households; therefore she must sample only people who are primary grocery shoppers in their households. If you even thought about saying Yes, please go back to Chapter 1 and review what bias actually means.

15

(a) This is inference about the proportion in one population, Case 2 in Inferential Statistics: Basic Cases.

(1) H0: p = 2/3, the chance of winning is 2/3 if you switch doors.
H1: p ≠ 2/3, the chance of winning is different from 2/3 if you switch doors.
Remark: You need to test for ≠, not <. You’re asked whether the claim of 2/3 is correct, and if it’s wrong it could be wrong in either direction. It doesn’t matter that the sample data happen to show a smaller proportion than 2/3.
(2) α = 0.05
(RC) Random sample? Effectively, yes. Sample less than 10% of population? Yes, 10×30 = 300, and in the long run there would be far more than 300 contestants who switched doors. Sample large enough? In a sample of 30, if H0 is true you expect 30×(2/3) = 20 successes and 30−20 = 10 failures, so the answer is yes. Common mistake: Don’t say “n ≥ 30.” That’s true, but it’s irrelevant for tests of proportions. The sample size of at least 30 is useful when you’re testing the mean of numeric data.
(3–4) 1-PropZTest, 2/3, 18, 30, ≠
Results: z = −.77, p-value = 0.4386, p̂ = 0.6
(5) p > α. Fail to reject H0.
(6) We can’t determine whether the claim “switching doors gives a 2/3 chance of winning” is true or false (p = 0.4386). Or, At the 0.05 significance level, we can’t determine whether the probability of winning after switching doors is equal to 2/3 or different from 2/3.
Remark: It’s true that you can’t disprove the claim, but it’s also true that you can’t prove it. This is where a confidence interval gives useful information.

(b) Requirements have already been checked.
1-PropZInt 18, 30, .95. Results: (.4247, .7753), p̂ = .6.
We’re 95% confident that the true probability of winning if you switch doors is between 42.5% and 77.5%.

(c) It’s possible that the true probability of winning if you switch doors is 1/3 (33.3%) or even worse, but it’s very unlikely. Why? You’re 95% confident that it’s at least 42.5%. Therefore you’re better than 95% confident that the true probability if you switch is better than the 1/3 probability if you don’t switch doors. Switching is extremely likely to be the good strategy.

16 The null hypothesis is always “no effect”, “nothin’ goin’ on here.” In this case “no effect” is “not spam”, so H0 is “This piece of mail is not spam.”

(a) A Type I error is rejecting the null hypothesis when it’s actually true. Here, a Type I error means deciding a piece of mail is spam when it’s actually not, so if Heather’s spam filter makes a Type I error then it will delete a piece of real mail. A Type II error is failing to reject H0 when it’s actually false, treating a piece of spam as real mail, so a Type II error would let a piece of spam mail into Heather’s in-box.

(b) Most people would rather see a piece of spam (Type II) than miss a piece of real mail (Type I), so a Type I error is more serious in this situation. Lower significance levels make Type I errors less likely (and Type II errors more likely), so a lower α is appropriate here.

17 This is a test of one population proportion, Case 2 in Inferential Statistics: Basic Cases.
(1) H0: p = .304
H1: p < .304, less than 30.4% of Ithaca households own cats.
(2) α = 0.05
(RC) Random sample? Systematic, OK. Sample too large? 10×215 = 2150. Without knowing how many households are in Ithaca, we can be sure it’s more than 2150. Sample large enough? In a sample of 215, according to H0 you expect 215×.304 ≈ 65 successes and 215−65 = 150 failures, OK.
(3–4) 1-PropZTest .304, 54, 215, <
Results: z = −1.68, p-value = 0.0461, p̂ = 0.2512
(5) p < α. Reject H0 and accept H1.
(6) At the 0.05 significance level, fewer than 30.4% of Ithaca households own cats. Or, Fewer than 30.4% of Ithaca households own cats (p = 0.0461).
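The 1-PropZTest here can be reproduced in a few lines of Python (a sketch with my own function name, using only the standard library; the TI calculator remains the course tool):

```python
from math import sqrt
from statistics import NormalDist

def prop_z_test_left(p0, x, n):
    """Left-tailed 1-PropZTest: H1 is p < p0."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # SE uses p0, per H0
    return z, NormalDist().cdf(z)  # p-value = area in the left tail

z, p = prop_z_test_left(0.304, 54, 215)
print(round(z, 2), round(p, 4))  # ≈ -1.68, 0.0461
```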

### Problem Set 2

18

(a) The population parameter is missing. It should be either μ or p, but since a proportion can’t be greater than 1 it must be μ.
Correction: H0: μ = 14.2;  H1: μ > 14.2

(b) H0 must have an = sign. Correction: H0: μ = 25;  H1: μ > 25

(c) You used sample data in your hypotheses. Correction: H0:μ=750; H1:μ>750

(d) You were supposed to test “makes a difference”, not “is faster than”. Never do a one-tailed test (> or <) unless the other direction is impossible or of no interest at all. It’s possible that your “accelerant” could actually increase drying time, and if it does you’d definitely want to know.
Correction: H0: μ = 4.3 hr;  H1: μ ≠ 4.3 hr

19

This is numeric data, and you don’t know the standard deviation (SD) of the population. In Inferential Statistics: Basic Cases this is Case 1, a test of population mean.

(1) H0: μ = 3.8, the mean pollution this year is no different from last year
H1: μ < 3.8, the mean pollution this year is lower than last year
(2) α = 0.01
(RC) Random sample? Yes. 10n ≤ N? 10×10 = 100, and if there were daily readings the population size is greater than that. Sample size ≥ 30? NO! Normally distributed? MATH200A part 4 yields r=.9784, crit=.9179; r>crit, therefore normal. Outliers? MATH200A part 2 shows none.
(3–4) T-Test: 3.8, L1, 1, <μo
Results: t = −4.749218419 → t = −4.75, p = 5.2266779E−4 → p = 0.0005, x̅ = 3.21, s = .3928528138 → s = 0.39, n = 10
Common mistake: Don’t write “p = 5.2267” or anything equally silly. A p-value is a probability, and probabilities are never greater than 1.
(5) p < α. Reject H0 and accept H1.
(6) At the 0.01 level of significance, the mean pollution is lower this year than last year. Or, The mean pollution this year is lower than last year (p = 0.0005).
20

This is numeric data with unknown SD of the population, Case 1 (test of population mean) in Inferential Statistics: Basic Cases.

(1) H0: μ = 32.0, quarts are being properly filled
H1: μ < 32.0, Dairylea is shorting the public
Remark: Your H1 uses <, not ≠, because the problem asks if Dairylea has a legal problem. Yes, they might be overfilling, but that would not be a legal problem.
(2) α = 0.05. This is just a business situation, not a matter of life and death. (You could justify a lower α if you can show serious consequences from making a mistake, such as a multimillion libel suit brought by the company against the investigator.)
(RC) Random sample. 10×10 = 100 quarts, much smaller than the total number produced. The population is normally distributed, so the small sample size is OK.
(3–4) T-Test: 32, 31.8, .6, 10, <μo
Results: t=−1.054092553 → t = −1.05, p=.159657788 → p = 0.1597
(5) p > α. Fail to reject H0.
(6) At the 0.05 level of significance, we can’t determine whether Dairylea is giving short volume or not. Or, We can’t determine from this sample whether Dairylea is giving short volume or not (p = 0.1597).
Remark: You never accept the null hypothesis. But in many cases you may proceed as though it’s true. Here, since you can’t prove a case against the dairy, you don’t file charges, make a press release, organize a boycott, etc. You behave exactly as you would behave if you had proof the dairy was honest. But you don’t conclude that Dairylea is giving full measure, either. All your hypothesis test tells you is that it could go either way.
21

This is numeric data with unknown SD of population. You’re testing a population mean, Case 1 in Inferential Statistics: Basic Cases.

(1) H0: μ = 870, no difference in strength
H1: μ ≠ 870, new glue’s average strength is different
Remark: You’re testing different here, not better. It’s possible that the new glue bonds more poorly, and that would be interesting information, either guiding further research or perhaps leading to a new product (think Post-It Notes).
(2) α = 0.05
(RC) Random sample. n = 30. 10×30 = 300, and in principle you could make more than 300 trials.
(3–4) T-Test: 870, 892.2, 56.0, 30, μ≠μo
Results: t=2.17132871 → t = 2.17, p=.038229895 → p = 0.0382
(5) p < α. Reject H0 and accept H1.
(6) At the 0.05 level of significance, the new glue has a different mean strength from the company’s best seller. In fact, it is stronger. Or, The new glue has a different mean strength from the company’s best seller (p = 0.0382). In fact, it is stronger.
Remark: When you are testing ≠, and p<α, you give the two-tailed interpretation “different from”, and then continue with a one-tailed interpretation. See p < α in Two-Tailed Test: What Does It Tell You?
22

This is binomial data (each person either has a bachelor’s or doesn’t) for a one-population test of proportion: Case 2 in Inferential Statistics: Basic Cases

(a) Requirements:

• Random sample, yes.
• 10n = 10×120 = 1200, and there are many more than 1200 residents of Tompkins County aged 25+.
• Sample has 52 successes and 120−52 = 68 failures, both ≥ 10, check.

1-PropZInt: x=52, n=120, C-Level=.95
Results: (.34467, .52199); p̂=.4333333333 → p̂ = .4333

We’re 95% confident that 34.5 to 52.2% of Tompkins County residents aged 25+ have at least a bachelor’s degree.

(b) Requirements have already been checked. A two-tailed test at the 0.05 level is equivalent to a confidence interval at the 95% level. The statewide proportion of 32.8% is outside the 95% CI for Tompkins County, and therefore at the 0.05 significance level, the proportion of bachelor’s degrees among Tompkins County residents aged 25+ is different from the statewide proportion of 32.8%. In fact, Tompkins County’s proportion is higher.

23

This is numeric data, with population SD unknown: test of population mean, Case 1 in Inferential Statistics: Basic Cases.

(1) H0: μ = 625, no difference in strength
H1: μ > 625, Whizzo stronger than Stretchie
Remark: Here you test for >, not ≠. Even though Whizzo might be less strong, you don’t care unless it’s stronger.
(2) α = 0.01
(RC) Random sample. 10n = 10×8 = 80, less than Whizzo’s production of bungee cords. n<30, so test for normality. MATH200A part 4 gives r=.9569, crit=.9054; r>crit, therefore ND. MATH200A part 2 shows no outliers.
(3–4) T-Test: 625, L1, 1, >μo
Results: t=3.232782217 → t = 3.23, p=.0071980854 → p = 0.0072, x̅ = 675, s=43.74602023 → s = 43.7, n = 8
(5) p < α. Reject H0 and accept H1.
(6) At the 0.01 level of significance, Whizzo is stronger on average than Stretchie. Or, Whizzo is stronger on average than Stretchie (p = 0.0072).
24

This is numeric data, with σ unknown: test of population mean, Case 1 in Inferential Statistics: Basic Cases.

(1) H0: μ = 6
H1: μ > 6
(2) α = 0.05
(RC) Systematic sample. n = 100 > 30. 10n = 10×100 = 1000, less than the number of TC3 students.
(3–4) T-Test: 6, 6.75, 3.3, 100, >μo
Results: t=2.272727273 → t = 2.27, p=.0126021499 → p = 0.0126
(5) p < α. Reject H0 and accept H1.
(6) TC3 students do average more than six hours a week in volunteer work, at the 0.05 level of significance. Or, TC3 students do average more than six hours a week in volunteer work (p = 0.0126).
25

Binomial data (head or tail) implies Case 2, test of population proportion on Inferential Statistics: Basic Cases. A fair coin has heads 50% likely, or p = 0.5.

(1) H0: p = 0.5, the coin is fair
H1: p ≠ 0.5, the coin is biased
Common mistake: You must test ≠, not >. An unfair coin would produce more or less than 50% heads, not necessarily more than 50%. Yes, this time he got more than 50% heads, but your hypotheses are never based on your sample data.
(2) α = 0.05
(RC) Random sample? Yes, it’s coin flips. npo = 10000×.5 = 5000 successes and 10000−5000 = 5000 failures expected. 10n = 10×10,000 = 100,000. It would be possible to flip the coin more than 100,000 times.
(3–4) 1-PropZTest, .5, 5067, 10000, prop≠po
Results: z = 1.34, p = .1802454677 → p-value = 0.1802, p̂ = .5067
(5) p > α. Fail to reject H0.
(6) At the 0.05 level of significance, we can’t tell whether the coin is fair or biased. Or, We can’t determine from this experiment whether the coin is fair or biased (p = 0.1802).
Common mistake: You can’t say that the coin is fair, because that would be accepting H0. You can’t say “there is insufficient evidence to show that the coin is biased”, because there is also insufficient evidence to show that it’s fair.
Remark: “Fail to reject H0” situations are often emotionally unsatisfying. You want to reach some sort of conclusion, but when p>α you can’t. What you can do is compute a confidence interval: 1-PropZInt: 5067, 10000, .95; results: (.4969, .5165). You’re 95% confident that the true proportion of heads for this coin (in the infinity of all possible flips) is 49.69% to 51.65%. So if the coin is biased at all, it’s not biased by much.
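A two-tailed 1-PropZTest doubles the tail area beyond |z|. Here is that computation for the coin data as a Python sketch (function name mine; standard library only):

```python
from math import sqrt
from statistics import NormalDist

def prop_z_test_two_sided(p0, x, n):
    """Two-tailed 1-PropZTest: H1 is p ≠ p0."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # both tails
    return z, p_value

z, p = prop_z_test_two_sided(0.5, 5067, 10000)
print(round(z, 2), round(p, 4))  # ≈ 1.34, 0.1802
```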
26

You have numeric data, and you don’t know the SD of the population, so this is a Case 1 test of population mean in Inferential Statistics: Basic Cases.

(a) Check requirements: random sample, n = 45 > 30, and there are more than 10×45 = 450 people with headaches.

TInterval: x̅=18, s=8, n=45, C-Level=.95

Results: (15.597, 20.403)

We’re 95% confident that the average time to relief for all headache sufferers using PainX is 15.6 to 20.4 minutes.

(b) Requirements have already been checked. A two-tailed test (a test for “different”) at the 0.05 level is equivalent to a confidence interval at the 1−0.05 = .95 = 95% confidence level. Since the 95% CI includes 20, the mean time for aspirin, we cannot determine, at the 0.05 significance level, whether PainX offers headache relief to the average person in a different time than aspirin or not.

## Solutions for Chapter 11

Because this textbook helps you,
BrownMath.com/donate.
1

(a) Use MATH200A part 5 and select `2-pop binomial`. You have no prior estimates, so enter 0.5 for p̂1 and p̂2. E is 0.03, and C-Level is 0.95. Answer: you need at least 2135 per sample, 2135 people under 30 and 2135 people aged 30 and older. Here’s what it looks like, using MATH200A part 5:  Caution! Even if you don’t identify the groups, at least you must say “per sample”. Plain “2135” makes it look like you need only that many people in the two groups combined, or around 1068 per group, and that is very wrong.

Caution! You must compute this as a two-population case. If you compute a sample size for just one group or the other, you get 1068, which is just about half of the correct value. If you don’t have the program, you have to use the formula: [p̂1(1−p̂1)+p̂2(1−p̂2)]·(zα/2/E)². You don’t have any prior estimates, so p̂1 and p̂2 are both equal to 0.5. Work out p̂1(1−p̂1) + p̂2(1−p̂2) = .5×.5 + .5×.5 = .5.

Next, 1−α = 0.95, so α = 0.05 and α/2 = 0.025. zα/2 = z0.025 = invNorm(1−0.025). Divide that by E (.03), square, and multiply by the result of the computation with the p̂’s.
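The two-population sample-size formula can also be checked with a short Python sketch (function name mine; standard library only); it covers both the no-prior-estimate case and part (b):

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_prop(p1, p2, E, conf):
    """n PER SAMPLE for a CI on p1 - p2: [p1(1-p1)+p2(1-p2)]*(z/E)^2."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((p1 * (1 - p1) + p2 * (1 - p2)) * (z / E) ** 2)

print(sample_size_two_prop(0.5, 0.5, 0.03, 0.95))   # no prior estimates → 2135
print(sample_size_two_prop(0.3, 0.45, 0.03, 0.95))  # part (b) → 1953
```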

(b) Using MATH200A Program part 5 with .3, .45, .03, .95 gives 1953 per sample.

Alternative solution: Using the formula, .3(1−.3)+.45(1−.45) = .4575. Multiply by (invNorm(1−.05/2)/.03)² as before to get 1952.74157 → 1953 per sample.

Again, you must do this as two-population binomial. If you do the under-30 group and the 30+ group separately, you get sample sizes of 897 and 1057, which are way too small. If your samples are that size, the margins of error for under-30 and 30+ will each be 3%, but the margin of error for the difference, which is what you care about, will be around 4.2%, and that’s greater than the desired 3%.

2

(a) You have numeric data in two independent samples. You’re testing the difference between the means of two populations, Case 4 in Inferential Statistics: Basic Cases. (The data aren’t paired because you have no reason to associate any particular Englishman with any particular Scot.)

(1) Population 1 = English; population 2 = Scots.
H0: μ1 = μ2 (or μ1−μ2 = 0)
H1: μ1 > μ2 (or μ1−μ2 > 0)
(2) α = 0.05
(RC) The problem states that samples were random. For English, r=.9734 and crit=.9054; for Scots, r=.9772 and crit=.9054. Both r’s are greater than crit, so both are nearly normally distributed. The stacked boxplot shows no outliers. And obviously the samples of 8 are far less than 10% of the populations of England and Scotland.
(3–4) English numbers in L1, Scottish numbers in L2. 2-SampTTest with Data; L1, L2, 1, 1, μ1>μ2, Pooled:No
Outputs: t=1.57049305 → t = 1.57, p=.0689957991 → p = 0.0690, df=13.4634, x̅1=6.54, x̅2=4.85, s1=1.91, s2=2.34, n1=8, n2=8
(5) p > α. Fail to reject H0.
(6) At the 0.05 level of significance, we can’t say whether English or Scots have a stronger liking for soccer. Or, We can’t say whether English or Scots have a stronger liking for soccer (p = 0.0690).

(b) 2-SampTInt, C-Level=.90
Results: (−.2025, 3.5775)
We’re 90% confident that, on a scale from 1=hate to 10=love, the average Englishman likes soccer between 0.2 points less and 3.6 points more than the average Scot.

3

(a) This is the difference of proportions in two populations, Case 5 in Inferential Statistics: Basic Cases.

(1) Population 1 = English, population 2 = Scots.
H0: p1 = p2 (or p1−p2 = 0)
H1: p1 ≠ p2 (or p1−p2 ≠ 0)
(2) α = 0.05
(RC) Populations of England and Scotland are greater than 10×150 = 1500 and 10×200 = 2000. England: 105 successes, 150−105 = 45 failures, both ≥ 10. Scotland: 160 successes, 200−160 = 40 failures, both ≥ 10. The samples were stated to be random.
(3–4) 2-PropZTest x1=105, n1=150, x2=160, n2=200, p1≠p2
Results: z=−2.159047761 → z = −2.16, p=.030846351 → p = 0.0308, p̂1 = 0.70, p̂2 = 0.80, p̂ = 0.7571428571
(5) p < α. Reject H0 and accept H1.
(6) The English and Scots are not equally likely to be soccer fans, at the 0.05 level of significance; in fact the English are less likely to be soccer fans. Or, The English and Scots are not equally likely to be soccer fans (p = 0.0308); in fact the English are less likely to be soccer fans.
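The 2-PropZTest pools the two samples into one blended proportion for the standard error. Here is that calculation as a Python sketch (function name mine; standard library only):

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_test(x1, n1, x2, n2):
    """Two-tailed 2-PropZTest using the pooled (blended) proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)  # blended proportion, assuming H0: p1 = p2
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

z, p = two_prop_z_test(105, 150, 160, 200)
print(round(z, 2), round(p, 4))  # ≈ -2.16, 0.0308
```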

(b) 2-PropZInt with C-Level = .95 → (−.1919, −.0081)

That’s the estimate for p1−p2, English minus Scots. Since that’s negative, English like soccer less than Scots do. With 95% confidence, Scots are more likely than English to be soccer fans, by 0.8 to 19.2 percentage points.

(c) [(−.0081) − (−.1919)] / 2 = 0.0919, a little over 9 percentage points.

(d) MATH200A part 5, 2-pop binomial, p̂1=.7, p̂2=.8, E=.04, C-Level .95 gives 889 per sample

By formula, zα/2 = z0.025 = invNorm(1−0.025) = 1.96.
n1 = n2 = [.7(1−.7)+.8(1−.8)]×(1.96/.04)² = 888.37 → 889 per sample

4

(a) This is before-and-after paired data, Case 3 in Inferential Statistics: Basic Cases. You’re testing the mean difference.

(1) d = After−Before
H0: μd = 0, running makes no difference in HDL
H1: μd > 0, running increases HDL
Remark: If this was a research study, they would probably test for a difference in HDL, not just an increase. Maybe this study was done by a fitness center or a running-shoe company. They would want to find an increase, and HDL decreasing or staying the same would be equally uninteresting to them.
(2) α = 0.05
(RC) Before in L1, After in L2, L3=L2−L1. Random sample. Five women is obviously less than 10% of all women. Box-whisker (L3) shows no outliers. Normality check (L3): r(.9131)>crit(.8804).
(3–4) T-Test 0, L3, 1, μ>0
Results: t=3.059874484 → t = 3.06, p=.0188315555 → p = 0.0188, d̅=4.6, s=3.36, n=5
(5) p < α. Reject H0 and accept H1.
(6) At the 0.05 level of significance, running 4 miles daily for six months raises HDL level. Or, Running 4 miles daily for six months raises HDL level (p = 0.0188).

(b) TInterval with C-Level .9 gives (1.3951, 7.8049).

Interpretation: You are 90% confident that running an average of four miles a day for six months will raise HDL by 1.4 to 7.8 points for the average woman.

Caution! Don’t write something like “I’m 90% confident that HDL will be 1.4 to 7.8”. The confidence interval is not about the HDL level, it’s about the change in HDL level.

Remark: Notice the correspondence between hypothesis test and confidence interval. The one-tailed HT at α = 0.05 is equivalent to a two-tailed HT at α = 0.10, and the complement of that is a CI at 1−α = 0.90 or a 90% confidence level. Since the HT did find a statistically significant effect, you know that the CI will not include 0. If the HT had failed to find a significant effect, then the CI would have included 0. See Confidence Interval and Hypothesis Test.

5

(a) Each participant either had a heart attack or didn’t, and the doctors were all independent in that respect. This is binomial data. You’re testing the difference in proportions between two populations, Case 5 in Inferential Statistics: Basic Cases.

(1) Population 1: aspirin takers; population 2: non-aspirin takers.
H0: p1 = p2, taking aspirin makes no difference
H1: p1 ≠ p2, taking aspirin makes a difference
(2) α = 0.001
(RC) SRS. 10n1 = 10×11,037 = 110,370. According to A Census of Actively Licensed Physicians in the United States, 2010 (Young 2011 [see “Sources Used” at end of book]), in that year there were 850,085 actively licensed physicians in the US. Even if we assume half were women and there were fewer doctors in 1982 when the study began, still 10n1 is lower. 10n2 = 10×11,034 = 110,340, also within the limit. Treatment group: 139 successes, 11037−139 = 10,898 failures, both ≥ 10. Placebo group: 239 successes, 11034−239 = 10,795 failures, both ≥ 10.
(3–4) 2-PropZTest: x1=139, n1=11037, x2=239, n2=11034, p1≠p2
Results: z=−5.19, p-value = 2×10⁻⁷, p̂1 = .0126, p̂2 = .0217, p̂ = .0171
(5) p < α. Reject H0 and accept H1.
(6) At the 0.001 level of significance, aspirin does make a difference to the likelihood of heart attack. In fact it reduces it. Or, Aspirin makes a difference to the likelihood of heart attack (p < 0.0001). In fact, aspirin reduces the risk.

Remark: The study was conducted from 1982 to 1988 and was stopped early because the results were so dramatic. For a non-technical summary, see Physicians’ Health Study (2009) [see “Sources Used” at end of book]. More details are in the original article from the New England Journal of Medicine (Steering Committee 1989 [see “Sources Used” at end of book]).

(b) `2-PropZInt` with C-Level .95 gives (−.0125, −.0056). We’re 95% confident that 325 mg of aspirin every other day reduces the chance of heart attack by 0.56 to 1.25 percentage points.
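The 2-PropZInt for the difference in risks can be reproduced with the unpooled standard-error formula; here is a Python sketch (function name mine; standard library only):

```python
from math import sqrt
from statistics import NormalDist

def two_prop_z_interval(x1, n1, x2, n2, conf):
    """CI for p1 - p2, as 2-PropZInt computes it (unpooled SE)."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    e = z * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - e, (p1 - p2) + e

lo, hi = two_prop_z_interval(139, 11037, 239, 11034, 0.95)
print(round(lo, 4), round(hi, 4))  # ≈ -0.0125, -0.0056
```

Both endpoints negative means the aspirin group’s heart-attack risk is lower, matching the interpretation above.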

Caution! You’re estimating the change in heart-attack risk, not the risk of heart attack. Saying something like “with aspirin, the risk of heart attack is 0.56 to 1.25%” would be very wrong.

6

(a) You’re estimating the difference in means between two populations. This is Case 4 in Inferential Statistics: Basic Cases. Requirements:

• Random samples (given).
• Sample sizes both >30.
• 10×30 = 300 and 10×32 = 320 are less than the numbers of houses in the two counties.

Population 1 = Cortland County houses, population 2 = Broome County houses.
2-SampTInt, 134296, 44800, 30, 127139, 61200, 32, .95, No

results: (−20004, 34318)

June is 95% confident that the average house in Cortland County costs \$20,004 less to \$34,318 more than the average house in Broome County.

(b) A 95% confidence interval is the complement of a significance test for ≠ at α = 0.05. Since 0 is in the interval, you know the p-value would be >0.05 and therefore June can’t tell, at the 0.05 significance level, whether there is any difference in average house price in the two counties or not.

If both ends of the interval were positive, that would indicate a difference in averages at the 0.05 level, and you could say Cortland’s average is higher than Broome’s. Similarly, if both ends were negative you could say Cortland’s average is lower than Broome’s. But as it is, nada.

Remark: Obviously Broome County is cheaper in the sample. But the difference is not great enough to be statistically significant. Maybe the true mean in Broome really is less than in Cortland; maybe they’re equal; maybe Broome is more expensive. You simply can’t tell from these samples.

7

The immediate answer is that those are proportions in the sample, not the proportions among all voters. This is two-population binomial data, Case 5 in Inferential Statistics: Basic Cases.
Requirements check:

• Random samples, OK.
• Each sample 10n = 10×1000 = 10,000. There are far more than 10,000 voters nationally; OK.
• The two samples were independent, OK.
• Red: 520 successes and 1000−520 = 480 failures, OK.
Blue: 480 successes and 1000−480 = 520 failures, OK.

Population 1 = Red voters, population 2 = Blue voters.
2-PropZInt 520, 1000, 480, 1000, .95
Results: (−.0038, .08379), p̂1 = .52, p̂2 = .48

With 95% confidence, the Red candidate is somewhere between 0.4 percentage points behind Blue and 8.4 ahead of Blue. The confidence interval contains 0, and so it’s impossible to say whether either one is leading.

Remark: Newspapers often report the sample proportions p̂1 and p̂2 as though they were population proportions, but now you know that they aren’t. A different poll might have similar results, or it might have samples going the other way and showing Blue ahead of Red.
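The poll interval uses the same unpooled standard error as any `2-PropZInt`. A quick Python sketch of the computation (1.959964 is invNorm(0.975)):

```python
import math

# Re-check of the poll interval: Red minus Blue, 95% confidence.
p1, n1 = 520 / 1000, 1000   # Red
p2, n2 = 480 / 1000, 1000   # Blue

se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
z_star = 1.959964
lo = (p1 - p2) - z_star * se
hi = (p1 - p2) + z_star * se
print(round(lo, 4), round(hi, 4))   # about -0.0038 to 0.0838
```

Because the interval straddles 0, neither candidate can be called the leader.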

8 (a) For a confidence interval, each sample must have at least 10 successes and at least 10 failures. Sample 1 has only 7 successes. Requirements are not met, and you cannot compute a confidence interval with 2-PropZInt.

(b) For a hypothesis test, we often use “at least 10 successes and 10 failures in each sample” as a shortcut requirements test, but the real requirement is at least 10 successes and 10 failures expected in each sample, using the blended proportion p̂. If the shortcut procedure fails, you must check the real requirement. In this problem, the blended proportion is

p̂ = (x1+x2)/(n1+n2) = (7+18)/(28+32) = 25/60, about 42%.

For sample 1, with n1 = 28, you would expect 28×25/60 ≈ 11.7 successes and 28−11.7 = 16.3 failures. For sample 2, with n2 = 32, you would expect 32×25/60 ≈ 13.3 successes and 32−13.3 = 18.7 failures. Because all four of these expected numbers are at least 10, it’s valid to compute a p-value using 2-PropZTest.
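The “real requirement” check above is simple arithmetic, sketched here in Python with the numbers from this problem:

```python
# Expected successes and failures in each sample, using the blended
# proportion p-hat = (x1+x2)/(n1+n2).
x1, n1 = 7, 28
x2, n2 = 18, 32

p_blend = (x1 + x2) / (n1 + n2)   # 25/60, about 0.4167
expected = [n1 * p_blend, n1 * (1 - p_blend),
            n2 * p_blend, n2 * (1 - p_blend)]
print([round(e, 1) for e in expected])   # [11.7, 16.3, 13.3, 18.7]
assert all(e >= 10 for e in expected)    # requirement met
```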

## Solutions for Chapter 12

1 There is no difference. What matters in a model is the relative sizes of the predictions for the categories. 40% is 1.6 times 25%, just as 40 is 1.6 times 25.
2 This is attribute data, one population, more than two possible responses: Case 6, goodness-of-fit, in Inferential Statistics: Basic Cases. There are 6 categories, therefore 5 degrees of freedom.
(1) H0: The 25:25:20:15:8:7 model for ice cream preference is good. H1: The 25:25:20:15:8:7 model for ice cream preference is bad. α = 0.05.
Use MATH200A part 6. df=5, χ²=9.68, p-value = 0.0849. Here are the input and output data screens. (If you have MATH200A V6, you’ll see the p-value, degrees of freedom, and χ² test statistic on the same screen as the graph.)
Common mistake: When a model is given in percentages, some students like to convert the observed numbers to percentages. Never do this! The observed numbers are always actual counts, and their total is always the actual sample size.
Remark: You could give the model as decimals, .25, .20, .15 and so on. But for the model, all that matters is the relative size of each category to the others, so it’s simpler to use whole-number ratios.
Common mistake: If you do convert the percentages to decimals, remember that 8% and 7% are 0.08 and 0.07, not 0.8 and 0.7.
L3 shows the expected counts, and the lowest is 70, so all are ≥ 5. The problem says that the 1000 people were a random sample. There are millions of ice cream lovers, so the sample of 1000 is less than 10% of the population.
p > α. Fail to reject H0. At the 0.05 level of significance, you can’t say whether the model is good or bad. Or, It’s impossible to determine from this sample whether the model is good or bad (p = 0.0849).
Remark: For Case 6 only, you could write your non-conclusion as something like “the model is not inconsistent with the data” or “the data don’t disprove the model.”
Remark: The χ² test keeps you from jumping to false conclusions. Eyeballing the observed and expected numbers (L2 and L3), you might think they’re fairly far off and the model must be wrong. Yet the test gives a largish p-value.
Remark: If it had gone the other way — if p was less than α — you would say something like “At the .05 level of significance, the model is inconsistent with the data” or “the data disprove the model” or simply “the model is wrong”.
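The expected counts in L3 come from expected = n × (model share). A short Python sketch of that step (the observed counts aren’t reprinted here, so this reproduces only the requirements check):

```python
# Expected counts for a goodness-of-fit model given as whole-number ratios.
model = [25, 25, 20, 15, 8, 7]
n = 1000
expected = [n * m / sum(model) for m in model]
print(expected)          # [250.0, 250.0, 200.0, 150.0, 80.0, 70.0]
assert min(expected) == 70 and all(e >= 5 for e in expected)
```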
3

Solution: Use Case 7, 2-way table, in Inferential Statistics: Basic Cases.

(1) H0: Gun opinion is independent of party. H1: Gun opinion depends on party. α = .05.
Put the two rows and three columns in Matrix A. (Don’t enter the totals.) Select χ²-Test from the menu. Outputs are χ² = 26.13, df = 2, p=2.118098E-6 → p = 0.000 002 or < .0001.
The problem states that the sample was random. With millions of party members, the samples are under 10% of the population. Check the B matrix and find that the lowest expected count is 106.45; therefore, all expected counts are above the minimum of 5.
Alternative: Use MATH200A part 7 for steps 3–4 and RC.
p < α; reject H0 and accept H1. At the .05 level of significance, gun opinion depends on party. Or, Gun opinion depends on party (p<0.0001).
Remark: “Depends on” does not mean that’s the only factor. But if you don’t like “depends on”, you could say “is not independent of”. Or you could say, “party affiliation is a factor in a person’s opinion on gun control.”
4 This is goodness of fit to a model, Case 6 in Inferential Statistics: Basic Cases. Your H0 model is 1:1:1:1:1, or any model with five equal numbers.
(1) H0: Preferences among all first graders are equal. H1: First graders prefer the five occupations unequally. α = 0.05.
MATH200A part 6 with {1,1,1,1,1} or similar in L1 and the observed data in L2. χ²=12.9412 → χ² = 12.94, df = 4, p=.011567 → p = 0.0116.
Random sample: given. There are many, many first graders, far more than 10×425 = 4250. All L3’s (expected counts) are 85, so all are ≥ 5.
p < α. Reject H0 and accept H1. At the 0.05 significance level, first graders in general have unequal preferences among the five occupations. Or, First graders in general have unequal preferences among the five occupations (p = 0.0116).
5 This is Case 7 in Inferential Statistics: Basic Cases.
(1) H0: Egg consumption and age at menarche are independent. H1: Egg consumption and age at menarche are not independent. α = 0.01.
Enter the 3×3 table in Matrix A. Use MATH200A part 7 or `χ²-Test`. Results: χ² = 3.13, df = 4, p=.535967 → p = 0.5360.
Random sample: given. At a glance, it looks like the sample size is around 100, but it’s obviously less than 10% of the number of women. The expected values (Matrix B) show one value, 4.8148, which is below 5. You can say that it’s just barely below 5, and it’s the only one, so the requirement is effectively met. That’s true, but it’s also a moot point because of the high p-value.
p > α. Fail to reject H0. At the 0.01 level of significance, we can’t determine whether egg consumption and age at menarche are independent or not. Or, We can’t determine whether egg consumption and age at menarche are independent or not (p = 0.5360).
Remark: The large p-value makes it really tempting to declare that the two variables are independent. But that would be accepting H0, which we must never do. It’s always possible that there is a connection and we were just unlucky enough that this particular sample didn’t show it. Some researchers would say “There is insufficient evidence to reject the hypothesis of independence.” Strictly speaking, that’s the same error. However, when the audience is researchers, rather than the non-technical public, it may be understood that they’re not really accepting H0, only failing to reject it pending the outcome of a further study.
6 This is a goodness-of-fit problem, Case 6 in Inferential Statistics: Basic Cases.
(1) H0: Age distribution of grand jurors matches age distribution of county. H1: Age distribution of grand jurors does not match age distribution of county. α = 0.05.
The county percentages are the model and go in L1. The numbers of jurors (not percentages) go in L2. Reminder: don’t include the total row. Results: χ²=61.2656 → χ² = 61.27, df = 3, p-value = 3.2×10⁻¹³ or p < 0.0001.
Because you’re not generalizing, the random-sample rule and the under-10% rule don’t matter. You need only check that all expected counts are ≥ 5, and since the lowest is 10.56, the requirements are met.
p < α. Reject H0 and accept H1. At the 0.05 significance level, the age distribution of grand jurors is different from the age distribution in the county. Or, The age distribution of grand jurors is different from the age distribution in the county (p < 0.0001).
Remark: There are a lot of reasons for this. Judges tend to be older and tend to prefer jurors closer to their own age. Also, older candidates are more likely to be retired, which means they are less likely to be exempt by reason of their occupation.
7 This is a 2-way table, specifically a test of independence. Use Case 7 in Inferential Statistics: Basic Cases.
(1) H0: Population size of chosen residence town is independent of population size of town raised in. H1: Population size of chosen residence town depends on population size of town raised in. α = 0.05.
Enter the 3×3 array in Matrix A. (Never enter the totals in a 2-way table hypothesis test.) Use MATH200A part 7 or the calculator’s `χ²-Test` menu selection. Results: df = 4, χ² = 35.74, p-value=3.271956E-7 → p-value = 0.000 000 3 or p-value < 0.0001.
Simple random sample: given. 500 men is obviously far below 10% of the total number. All expected counts (Matrix B) are 14.364 or greater, ≥ 5.
p < α. Reject H0 and accept H1. At the 0.05 significance level, there is an association between the size of town men choose to live in and the size of town they grew up in. Or, There is an association between the size of town men choose to live in and the size of town they grew up in (p < 0.0001).
8 This is a 2-way table, specifically a test of homogeneity. You have seven populations, represented by the seven treatments in the experiment: seven ways to pre-treat and treat a cold. If Echinacea is effective, the proportions of infection from the various treatments should be significantly different. Use Case 7 in Inferential Statistics: Basic Cases.
(1) H0: The tested treatments with Echinacea make no difference to the proportion who catch cold. H1: The treatments do make a difference. … α = 0.01.
There were seven treatments and two outcomes, so enter your 7×2 matrix and run a `χ²-Test` or MATH200A part 7. Results: χ² = 4.74, df = 6, p-value = 0.5769.
Common mistake: Never enter the totals in a two-way test.
Random sample? Yes, randomized experimental design. ✔ Sample less than 10% of population? Yes, the population of people exposed to the common cold is indefinitely large. ✔ All expected values ≥ 5? Yes, Matrix B shows all values at least 5.6. ✔
p > α. Fail to reject H0. At the 0.01 significance level, we can’t determine whether Echinacea is effective against the common cold or not. Or, We can’t determine whether Echinacea is effective against the common cold or not (p = 0.5769).
Remark: Researchers might write something like “Echinacea made no significant difference to infection rates in our study” with the p-value or significance level. It’s understood that this does not prove Echinacea ineffective — this particular study fails to reach a conclusion. But as additional studies continue to find p > α, our confidence in the null hypothesis increases.
Remark: If you used MATH200A part 7, there’s some interesting information in Matrix C. The top left 7 rows and 2 columns are the χ² contributions for each of the seven treatments and two outcomes. All are quite low, in light of the rule of thumb that only numbers above 4 or so are significant, even at the less stringent 0.05 level.

The last two rows are the total numbers and percentages of people who did and didn’t catch cold: 349 (87.5%) and 50 (12.5%). If Echinacea is ineffective, you’d expect to see about that same infection rate for each of the seven treatments. Sure enough, compute the rates from the rows of the data table, and you’ll find that they vary between 81% and 92%.

The third column is the total subjects in each of the seven treatments, and the overall total. Of course you were given those in the data table, but it’s always a good idea to use this information to check your data entry.

The fourth column is the percentage of subjects who were assigned to each of the seven treatments, totaling 100% of course.

## Solutions to Review Problems


### Problem Set 1: Short Answers

Write your answer to each question. There’s no work to be shown. Don’t bother with a complete sentence if you can answer with a word, number, or phrase.

1 Disjoint events cannot be independent. Why? Disjoint events, by definition, can’t happen on the same trial. That means if A happens, P(B) = 0. But if A and B are independent, whether A happens has no effect on the probability of B. With disjoint events, whether A happens does affect the probability of B. Therefore disjoint events can’t be independent.
2 (a) C
(b) For numeric data with sample size under 30, you check for outliers by making a box-whisker plot and check for normality by making a normal probability plot.
3 qualitative = attribute, non-numeric, categorical. Examples: political party affiliation, gender.
quantitative = numeric. Examples: height, number of children.

Common mistake: Binomial is a subtype of qualitative data so it’s not really a synonym. Discrete and continuous are subtypes of numeric data.

4 equal to 1/6. The die has no memory: each trial is independent of all the others.

The Gambler’s Fallacy is believing that the die is somehow “due for a 6”. The Law of Large Numbers says that in the long run the proportion of 6’s will tend toward 1/6, but it doesn’t tell us anything at all about any particular roll.

5 (a) pop. 1 = control, pop 2 = music
H0: p2 = p1 and H1: p2 < p1
Or: H0: p2 − p1 = 0 and H1: p2 − p1 < 0
(b) Case 5, Difference between Two Pop. Proportions; or 2-PropZTest

Common mistake: You must specify which is population 1 and which is population 2.

Common mistake: The data type is binomial: a student is in trouble, or not. There are no means, so μ is incorrect in the hypotheses.

6 Check this against the definition:
• Are there a fixed number of trials? Yes, you are rolling five dice, n = 5.
• Are there only two outcomes, success and failure? Yes, each die is either a 3 or not.
• Is the probability of success the same from trial to trial; are the trials independent? Yes, p = 1/6 for each die, and the dice are independent.

This is a binomial PD.

7 A

Remark: The significance level α is the level of risk of a Type I error that you can live with. If you can live with more risk, you can reach more conclusions.

8 B,D — B if p < α, D if p > α
9 “Disjoint” means the same as “mutually exclusive”: two events that can’t happen at the same time. Example: rolling a die and getting a 3 or a 6.

Complementary events can’t happen at the same time and one or the other must happen. Example: rolling a die and getting an odd or an even. Complementary events are a subtype of disjoint events.

10 For any set of continuous data, or discrete data with many different values. If the variable is discrete with only a few different answers, you could use a bar graph or an ungrouped histogram.

For a small- to moderate-sized set of numeric data, you might prefer a stemplot.

11 For mutually exclusive (disjoint) events. Example: if you draw one card from a standard deck, the probability that it is red is ½. The probability that it is a club is ¼. The events are disjoint; therefore the probability that it is red or a club is ½+¼ = ¾.
12 A, B

Remark: C is wrong because “model good” is H0. D is also wrong: every hypothesis test, without exception, compares a p-value to α. For E, df is number of cells minus 1. F is backward: in every hypothesis test you reject H0 when your sample is very unlikely to have occurred by random chance.

13 Continuous data are measurements and answer “how much” questions. Examples: height, salary
Discrete data usually count things and answer “how many” questions. Example: number of credit hours carried
14 C, D

Remark: As stated, what you can prove depends partly on your H1. There are three things it could be:

• If H1: p > 0.01 (machine producing too many defectives) and you calculate p-value<α, your conclusion is C.
• If H1: p ≠ 0.01 (machine not operating as specified) and you calculate p-value<α, your conclusion depends on the sample data. Your conclusion is either C above or the unlisted conclusion “the machine is producing fewer defectives than allowed”, in other words performing better than specified. See p < α in Two-Tailed Test: What Does It Tell You? for reminders about interpreting a two-tailed test when p-value<α.
• There is little reason to choose H1: p < 0.01 (the machine is performing better than specified), but if that is H1 and p-value<α then your conclusion is that the machine is performing better than specified; that conclusion is not listed above.

Regardless of H1, if p-value>α your conclusion will be D or similar to it.

Common mistake: Conclusion A is impossible because it’s the null hypothesis and you never accept the null hypothesis.

Conclusion B is also impossible. Why? because “no more than” translates to ≤. But you can’t have ≤ in H1, and H1 is the only hypothesis that can be accepted (“proved”) in a hypothesis test.

15 You can’t. You can reduce the likelihood of a Type I error by setting the significance level α to a lower number, but the possibility of a Type I error is inherent in the sampling process.

Remark: A Type I error is a wrong result, but it is not necessarily the result of a mistake by the experimenter or statistician.

16 (a) The population, the group you want to know something about, is all churchgoers. Common mistake: Not churchgoers who think evolution should be taught, but all churchgoers. “Churchgoers who think evolution should be taught” is a subgroup of that population, and you want to know what proportion of the whole population is in that subgroup.

(b) The size is unknown, but certainly in the millions. You also could call it infinite, or uncountable. Common mistake: Don’t confuse size of population with size of sample. The population size is not the 487 from whom you got surveys, and it’s not the 321 churchgoers in your sample.

(c) The sample size n is the 321 churchgoers from whom you collected surveys. Yes, you collected 487 surveys in all, but you have to disregard the 166 that didn’t come from churchgoers, because they are not your target group. Common mistake: 227 isn’t the sample size either. It’s x, the number of successes within the sample.

(d) No. You want to know the attitudes of churchgoers, so it is correct sampling technique to include only churchgoers in your sample.

If you wanted to know about Americans in general, then it would be selection bias to include only churchgoers, since they are more likely than non-churchgoers to oppose teaching evolution in public schools.

17 In your experiment, there was some difference between the average performance of Drug A and Drug B. The p-value is the chance of getting a difference that large or larger if Drug A and Drug B are actually equally effective. Using the number, you can say that if there is no difference between Drug A and Drug B, then there’s a 6.78% probability of getting this big a difference between samples, or even a bigger difference.

Common mistake: Your answer will probably be worded differently from that, but be careful that it is a conditional probability: If H0 is true, then there’s a p-value chance of getting a sample this extreme or more so. The p-value isn’t the chance that H0 is true.

18 (a) There are a fixed n = 100 trials, and the probability of success is p = 0.08 on every trial. This is a binomial distribution.
MATH200A part 3, or `binomcdf(100,.08,5)`

(b) This is a binomial distribution, for exactly the same reasons.
MATH200A part 3, or `binompdf(100,.08,5)`

(c) The probability of success is p = 0.08 on every trial, but you don’t have a fixed number of trials. This is a geometric distribution.
`geometpdf(.08,5)`

19 C

Remark: There is no specific claim, so this is not a hypothesis test.

20 r must be between −1 and +1 inclusive. (Symbolically, −1 ≤ r ≤ +1, or |r| ≤ 1.) A value of r = 0 indicates no linear correlation. But this doesn’t necessarily mean no correlation, because another type of correlation might still be present. Example: the noontime height of the sun in the sky plotted against day of the year will show near zero linear correlation but very strong sine-wave correlation.
21 p̂, proportion of a sample (In this case, p̂ = 4/5 = 0.8 or 80%.)
22 attribute or qualitative, specifically binomial (“Are you satisfied with the food service?”)
23 Attribute (qualitative or categorical) data. This compact form of graph makes it easy to compare the relative sizes of all the categories. (A bar graph is also a common choice for qualitative data.)

Caution: The percentages must add to 100%. Therefore you must have complete data on all categories to display a pie chart. Also, if multiple responses from one subject are allowed, then a pie chart isn’t suitable, and you should use some other presentation, such as a bar graph.

24 When the data are skewed, prefer the median.
25 Because you can never accept the null hypothesis; only the alternative hypothesis can be accepted.
26 G (Possibly K, depending on your textbook; see below.)

Remark: This problem tests for several very common mistakes by students. Always make sure that

• Your hypotheses include a population parameter (rules out A, E, and I because they have no symbol; rules out B, D, F, H, J, L because their symbols aren’t a population parameter)
• The “=” sign is in H0 (rules out A–D)

This leaves you with G and K as possibilities. Either can be correct, depending on your textbook. The most common practice is always to put a plain = sign in H0 regardless of H1, which makes G the correct answer. But some textbooks or profs prefer ≤ or ≥ in H0 for one-tailed tests, which makes K the correct answer.

27 C
28 In an experiment, you assign subjects to two or more treatment groups, and through techniques like randomization or matched pairs you control for variables other than the one you’re interested in. By contrast, in an observational study you gather current or past data, with no element of control; the possibility of lurking variables severely limits the type of conclusions you can draw. In particular, you can’t conclude anything about causation from an observational study.
29 1−α, or (1−α)100% is also acceptable.
30 B

Remark: The Z-Test is wrong because you don’t know the SD of the selling price of all 2006 Honda Civics in the US. The 1-PropZTest and χ²-test are for non-numeric data. There is no such thing as a 1-PropTTest.

31 descriptive: presentation of actual sample measurements
inferential: estimate or statement about population made on the basis of sample measurements

Example: “812 of 1000 Americans surveyed said they believe in ghosts” is an example of descriptive statistics: the numbers of yeses and noes in the sample were counted. “78.8% to 83.6% of Americans believe in ghosts (95% confidence)” is an example of inferential statistics: sample data were used to make an estimate about the population. “More than 60% of Americans believe in ghosts” is another example of inferential statistics: sample data were used to test a claim and make a statement about a population.

32 C

Remark: Remember that the confidence interval derives from the central 95% or 90% of the normal distribution. The central 90% is obviously less wide than the central 95%, so the interval will be less wide.

33 A sample is a subgroup of the population, specifically the subgroup from which you take measurements. The population is the entire group of interest.

Example: You want to know the average amount of money a full-time TC3 student spends on books in a semester. The population is all full-time TC3 students. You randomly select a group of students and ask each one how much s/he spent on books this semester. That group is your sample.

34 D

Remark: This is unpaired numeric data, Case 4.

35 (a) A. This is binomial because each respondent was asked “Did you feel strong peer pressure to have sex?” There is one population, high-school seniors, so this is Case 2.

(b) For binomial data, requirements are slightly different between CI and HT. Here you are doing a hypothesis test.

• Random sample? ✔
• At least 10 successes and 10 failures expected? npo = 500×0.25 = 125, and 500−125 = 375, both >10. ✔

Common mistake: For hypothesis test, you need expected successes and failures. It’s incorrect to use actual successes (150) and failures (350).

• Check that the sample is not too large: 10n = 10×500 = 5000, and far more than 5000 students graduate from US high schools each year. ✔

Common mistake: Some students answer this question with “n > 30”. That’s true, but not relevant here. Sample size 30 is important for numeric data, not binomial data.

### Problem Set 2: Calculations

36 Numeric data, two populations, independent samples with σ unknown: Case 4 (2-SampTTest).

Common mistake: You cannot do a 2-SampZTest because you do not know the standard deviations of the two populations.

(1) Population 1 = Judge Judy’s decisions; Population 2 = Judge Wapner’s decisions.
H0: μ1 = μ2, no difference in awards; H1: μ1 > μ2, Judge Judy gives higher awards. α = 0.05.
Requirements: random samples; sample sizes are both above 30, so there’s no worry about whether the population data are normal.
2-SampTTest: x̅1=650, s1=250, n1=32, x̅2=580, s2=260, n2=32, μ1>μ2, Pooled: No. Results: t=1.10, p-value = .1383.
p > α. Fail to reject H0. At the 0.05 level of significance, we can’t tell whether Judge Judy was more friendly to plaintiffs (average award higher than Judge Wapner’s) or not.
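The t statistic itself is just arithmetic on the summary statistics; here is a Python sketch (the p-value needs a t distribution, which the calculator supplies):

```python
import math

# Unpooled (Welch) two-sample t statistic from summary stats.
x1bar, s1, n1 = 650, 250, 32   # Judge Judy
x2bar, s2, n2 = 580, 260, 32   # Judge Wapner

se = math.sqrt(s1**2 / n1 + s2**2 / n2)
t = (x1bar - x2bar) / se
print(round(t, 2))   # about 1.10
```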

Some instructors have you do a preliminary F-test. It gives p=0.9089>0.05, so after that test you would use Pooled:Yes in the 2-SampTTest and get p=0.1553.

37 normalcdf(20.5, 10^99, 14.8, 2.1) = .00332. Then multiply by population size 10,000 to obtain 33.2, or about 33 turkeys.
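Python’s standard library can stand in for normalcdf here; a quick sketch of the same computation:

```python
from statistics import NormalDist

# Stdlib version of normalcdf(20.5, 1E99, 14.8, 2.1): P(X > 20.5) for
# X ~ N(14.8, 2.1), then scale by the flock of 10,000.
p = 1 - NormalDist(14.8, 2.1).cdf(20.5)   # about 0.00332
print(round(10000 * p, 1))                # about 33.2 turkeys
```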
38

Solution: This is one-population numeric data, and you don’t know the standard deviation of the population: Case 1. Put the data in L1, and 1-VarStats L1 tells you that x̅ = 4.56, s = 1.34, n = 8.

(1) H0: μ = 4, 4% or less improvement in drying time. H1: μ > 4, better than 4% decrease in drying time.
Remark: Why is a decrease in drying time tested with > and not <? Because the data are the percent decreases themselves, so a bigger improvement is a bigger number.
α = 0.05.
Requirements: random sample. The normality check (MATH200A part 4) gives r > CRIT(.9054); therefore data are ND. Box-whisker (MATH200A part 2) shows no outliers. (You don’t have to show these graphs on your exam paper; just show the numeric test for normality and mention that the modified boxplot shows no outliers.)
T-Test: μo=4, x̅=4.5625, s=1.34…, n=8, μ>μo. Results: t = 1.19, p = 0.1370.
p > α. Fail to reject H0. At the 0.05 significance level, we can’t tell whether the average drying time improved by more than 4% or not.

(b) TInterval: C-Level=.95
Results: (3.4418, 5.6832)

(There’s no need to repeat the requirements check or to write down all the sample statistics again.)

With 95% confidence, the true mean decrease in drying time is between 3.4% and 5.7%.
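The TInterval can be re-checked by hand from the rounded statistics in part (a). In this sketch, t* = 2.3646 for df = 7 is taken from a t table; because s is shown rounded (1.34), the endpoints differ from the calculator’s in the last decimal:

```python
import math

# TInterval from summary stats: x-bar +/- t* * s / sqrt(n).
xbar, s, n = 4.5625, 1.34, 8
t_star = 2.3646                      # t critical value, df = 7, 95%
me = t_star * s / math.sqrt(n)
lo, hi = xbar - me, xbar + me
print(round(lo, 2), round(hi, 2))    # about (3.44, 5.68)
```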

39 (a) This is a binomial probability distribution: each rabbit has long hair or not, and the probability for any given rabbit doesn’t change if the previous rabbit had long hair. Use MATH200A part 3.

n = 5, p = 0.28, from = 0, to = 0. Answer: 0.1935

Alternative solution: If you don’t have the program, you can compute the probability that one rabbit has short hair (1−.28 = 0.72), then that all the rabbits have short hair (0.72^5 = 0.1935), which is the same as the probability that none of the rabbits have long hair.

(b) The complement of “one or more” is none, so you can use the previous answer.

P(one or more) = 1−P(none) = 1−0.1935 = 0.8065

Alternative solution: MATH200A part 3 with n=5, p=.28, from=1, to=5; probability = 0.8065

(c) Again, use MATH200A part 3 to compute binomial probability: n = 5, p = 0.28, from = 4, to = 5. Answer: 0.0238

Alternative solution: If you don’t have the program, do binompdf(5, .28) and store into L3, then sum(L3,5,6) or L3(5)+L3(6) = 0.0238.  Avoid the dreaded off-by-one error! For x=4 and x=5 you want L3(5) and L3(6), not L3(4) and L3(5).

For n=5, P(x≥4) = 1−P(x≤3). So you can also compute the probability as 1−binomcdf(5, .28, 3) = 0.0238.

(d) For this problem you must know the formula:

μ = np = 5×0.28 = 1.4 per litter of 5, on average
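Parts (a) through (d) can all be sketched with the binomial formula directly, instead of the MATH200A program; a stdlib Python version:

```python
import math

# Binomial model: n = 5 rabbits, p = 0.28 chance of long hair.
n, p = 5, 0.28

def pmf(k):
    """P(exactly k long-haired rabbits in a litter of n)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

p_none = pmf(0)                     # (a) about 0.1935
p_at_least_one = 1 - p_none         # (b) about 0.8065
p_four_or_more = pmf(4) + pmf(5)    # (c) about 0.0238
mean = n * p                        # (d) 1.4 per litter on average
print(round(p_none, 4), round(p_at_least_one, 4), round(p_four_or_more, 4))
```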

40 This is Case 7, a 2×5 table. (The total row and total column aren’t part of the data.)

Common mistake: It might be tempting to do this problem as a goodness-of-fit, Case 6, taking the Others row as the model and the doctors’ choices as the observed values. But that would be wrong. Both the Doctors row and the Others row are experimental data, and both have some sampling error around the true proportions. If you take the Others row as the model, you’re saying that the true proportions for all non-doctors are precisely the same as the proportions in this sample. That’s rather unlikely.

(1) H0: Doctors eat different breakfasts in the same proportions as others. H1: Doctors eat different breakfasts in different proportions from others. α = 0.05 χ²-Test gives χ² = 9.71, df = 4, p=0.0455 random sample Matrix B shows that all the expected counts are ≥5. (As an alternative, you could use MATH200A part 7.) p < α. Reject H0 and accept H1. Yes, doctors do choose breakfast differently from other self-employed professionals, at the 0.05 significance level.
41 (a) z = (x−μ)/σ ⇒ −1.2 = (x−70)/2.4 ⇒ x = 67.1″
or: x = zσ + μ ⇒ x = −1.2×2.4 + 70 = 67.1″

(b) 70−67.6 = 2.4″, and therefore z = −1. By the Empirical Rule, 68% of data lie between z = ±1. Therefore 100−68 = 32% lie outside z = ±1 and 32%/2 = 16% lie below z = −1. Therefore 67.6″ is the 16th percentile.

Alternative solution: Use the big chart to add up the proportion of men below 67.6″ or below z = −1. That is 0.15+2.35+13.5 = 16%.

(c) z = (74.8−70)/2.4 = +2. By the Empirical Rule, 95% of men fall between z = −2 and z = +2, so 5% fall below z = −2 or above z = +2. Half of those, 2.5%, fall above z = +2, so 100−2.5 = 97.5% fall below z = +2. 97.5% of men are shorter than 74.8″.

Alternative solution: You could also use the big chart to find that P(z > 2) = 2.35+0.15 = 2.5%, and then P(z < 2) = 100−2.5 = 97.5%.
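The Empirical-Rule answers can be checked against the exact normal model N(70, 2.4) using Python’s standard library (the exact values 15.87% and 97.72% round to the Empirical Rule’s 16% and 97.5%):

```python
from statistics import NormalDist

# Exact-normal check of parts (a)-(c) for heights ~ N(70, 2.4).
men = NormalDist(mu=70, sigma=2.4)

x = 70 + (-1.2) * 2.4      # (a) height at z = -1.2, about 67.1 in.
pct_b = men.cdf(67.6)      # (b) about 0.1587 (Empirical Rule: 16%)
pct_c = men.cdf(74.8)      # (c) about 0.9772 (Empirical Rule: 97.5%)
print(round(x, 1), round(pct_b, 4), round(pct_c, 4))
```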

42 (a) The histogram is shown at left. You must show the scale for both axes and label both axes. The scale for the horizontal axis is predetermined: you label the edges of the histogram bars and not their centers. You have some latitude for the scale of the vertical axis, as long as you include zero, show consistent divisions, and have your highest mark greater than 89. For example, 0 to 100 in increments of 20 would also work.

(b) Compute the class marks or midpoints: 575, 725, and so on. Put them in L1 and the frequencies in L2. Use `1-VarStats L1,L2` and get n = 219.
See Summary Numbers on the TI-83.

(c) Further data from `1-VarStats L1,L2`: x̅ = 990.1 and s = 167.3

Common mistake: If you answered x̅ = 950 you probably did `1-VarStats L1` instead of `1-VarStats L1,L2`. Your calculator depends on you to supply one list when you have a simple list of numbers and two lists when you have a frequency distribution.

(d) f/n = 29/219 ≈ 0.13 or 13%

43 The 85th percentile is the speed such that 85% of drivers are going slower and 15% are going faster.

invNorm(0.85, 57.6, 5.2) = 62.98945357 → 63.0 mph
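The same percentile can be found with Python's `statistics.NormalDist.inv_cdf`, the standard-library equivalent of invNorm (an aside, not part of the TI-83 solution):

```python
from statistics import NormalDist

# 85th percentile of speeds, mean 57.6 mph, SD 5.2 mph
speed_85th = NormalDist(mu=57.6, sigma=5.2).inv_cdf(0.85)
print(round(speed_85th, 1))  # → 63.0 mph
```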

44 (a) This is binomial data (each person either would or would not take the bus), hence Case 2, One population proportion.

MATH200A/sample size/binomial: p̂ = .2, E = 0.04, C-Level = 0.90. The answer is n = 271.

Common mistake: The margin of error is E = 4% = 0.04, not 0.4.

Alternative solution: See Sample Size by Formula and use the formula at right. With the estimated population proportion p̂ = 0.2 in the formula, you get zα/2 = z0.05 = invNorm(1−0.05) = 1.6449, and n = 270.5543 → 271

(b) If you have no prior estimate, use p̂ = 0.5. The other inputs are the same, and the answer is 423.
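The sample-size formula n = p̂(1−p̂)·(zα/2/E)², rounded up, is easy to wrap in a small Python function; it reproduces both answers. (This is an aside, not the textbook's MATH200A method.)

```python
from math import ceil
from statistics import NormalDist

def sample_size(p_hat, margin, conf_level):
    """Minimum n for a binomial confidence interval: p_hat*(1-p_hat)*(z/E)^2, rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # z(alpha/2)
    return ceil(p_hat * (1 - p_hat) * (z / margin) ** 2)

print(sample_size(0.2, 0.04, 0.90))  # part (a) → 271
print(sample_size(0.5, 0.04, 0.90))  # part (b) → 423
```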

45 (a) For the procedure, see Step 1 of Scatterplot, Correlation, and Regression on TI-83/84. Your plot should look like the one at right.

You expect positive correlation because points trend upward to the right (or, because y tends to increase as x increases). Even before plotting, you could probably predict a positive correlation because you assume higher calories come from fat; but you can’t just assume that without running the numbers.

(b) See Step 2 of Scatterplot, Correlation, and Regression on TI-83/84.
r = .8863314629 → r = 0.8863
a = .0586751909 → a = 0.0587
b = −3.440073602 → b = −3.4401

ŷ = 0.0587x − 3.4401

Common mistake: The symbol is ŷ, not y.

(c) The y intercept is −3.4401. It is the number of grams of fat you expect in the average zero-calorie serving of fast food. Clearly this is not a meaningful concept.

Remark: Remember that you can’t trust the regression outside the neighborhood of the data points. Here x varies from 130 to 640. The y intercept occurs at x = 0. That is pretty far outside the neighborhood of the data points, so it’s not surprising that its value is absurd.

(d) See How to Find ŷ from a Regression on TI-83/84. Trace at x = 310 and read off ŷ = 14.749… ≈ 14.7 grams fat. This is different from the actual data point (x=310, y=25) because ŷ is based on a trend reflecting all the data. It predicts the average fat content for all 310-calorie fast-food items.

Alternative solution: ŷ = .0586751909(310) − 3.440073602 = 14.749 ≈ 14.7.

(e) The residual at any (x,y) is y − ŷ. At x = 310, y = 25 and ŷ = 14.7 from the previous part. The residual is y − ŷ = 25 − 14.7 = 10.3

Remark: If there were multiple data points at x = 310, you would calculate one residual for each point.
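Parts (d) and (e) are one line each in any language; here's a Python check using the regression coefficients found in part (b) (an aside, not part of the TI-83 solution):

```python
a, b = 0.0586751909, -3.440073602  # slope and intercept from LinReg(ax+b)
x, y = 310, 25                     # the actual data point
y_hat = a * x + b                  # predicted fat grams at 310 calories
residual = y - y_hat
print(round(y_hat, 1), round(residual, 1))  # → 14.7 10.3
```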

(f) From the `LinReg(ax+b)` output, R² = 0.7855834621 → R² = 0.7856. About 79% of the variation in fat content is associated with variation in calorie content. The other 21% comes from lurking variables such as protein and carbohydrate count and from sampling error.

(g) See Decision Points for Correlation Coefficient. Since 0.8863 is positive and 0.8863 > 0.602, you can say that there is some positive correlation in the population, and higher-calorie fast foods do tend to be higher in fat.

46 invNorm(1-.06, 2.0, 0.1) = 2.1555, about 2.16 mm
47 This is paired data, Case 3. (Each individual gives you two numbers, Before and After.)
(1) d = After − Before. H0: μd = 0, no improvement. H1: μd > 0, improvement in number of sit-ups.

Remark: Why After−Before instead of the other way round? Since we expect After to be greater than Before, doing it this way you can expect the d’s to be mostly positive (if H1 is true). Also, it feels more natural to set things up so that an improvement is a positive number. But if you do d = Before−After and H1: μd < 0, you get the same p-value.

α = 0.01

Requirements: random sample. Enter the seven differences — 1, 4, 0, 6, 7, 12, 1 — into a statistics list. A normal probability plot (MATH200A part 4) shows a straight line with r(.957) > CRIT(.8978), so the data are normal. The modified box-whisker plot (MATH200A part 2) shows no outliers. (The plots are shown here for comparison to yours, but you don’t need to copy these plots to an exam paper.)

T-Test: μo=0, List:L4, Freq:1, μ>μo. Results: t = 2.74, p = 0.0169, x̅ = 4.4, s = 4.3, n = 7.

p > α. Fail to reject H0. At the 0.01 significance level, we can’t say whether the physical fitness course improves people’s ability to do sit-ups or not.
48 (a) normalcdf(-10^99, 24, 27, 4) = .2266272794 → 0.2266 or about a 23% chance

(b) normalcdf(-10^99, 24, 27, 4/√5) = .0467662315 → 0.0468 or about a 5% chance
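Both parts follow the same pattern — only the standard deviation changes, from σ for one value to σ/√n for a sample mean. A Python check (an aside, not part of the TI-83 solution):

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 27, 4, 5
p_one = NormalDist(mu, sigma).cdf(24)             # (a) one value below 24
p_mean = NormalDist(mu, sigma / sqrt(n)).cdf(24)  # (b) mean of n=5 below 24
print(round(p_one, 4), round(p_mean, 4))  # → 0.2266 0.0468
```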

49 Here you have a model (the US population) and you’re testing an observed sample (Nebraska) for consistency with that model. One tipoff is that you are given the size of the Nebraska sample but for the US you have only percentages, not actual numbers of people. This is Case 6, goodness of fit to a model.
(1) H0: Nebraska preferences are the same as national proportions. H1: Nebraska preferences are different from national proportions.

α = 0.05

US percentages in L1, Nebraska observed counts in L2. MATH200A part 6. The result is χ² = 12.0093 → 12.01, df = 4, p-value = 0.0173.

Common mistake: Some students convert the Nebraska numbers to percentages and perform a χ² test that way. The χ² test model can equally well be percentages or whole numbers, but the observed numbers must be actual counts.

Requirements: random sample. L3 shows the expected values, and they are all above 5.

p < α. Reject H0 and accept H1. Yes, at the 0.05 significance level Nebraska preferences in vacation homes are different from those for the US as a whole.
50 This is unpaired numeric data, Case 4.
(1) Population 1 = Course, Population 2 = No course. H0: μ1 = μ2, no benefit from diabetic course. H1: μ1 < μ2, reduced blood sugar from diabetic course.

α = 0.01

Requirements: independent random samples, both n’s > 30.

2-SampTTest: x̅1=6.5, s1=.7, n1=50, x̅2=7.1, s2=.9, n2=50, μ1<μ2, Pooled:No. Results: t = −3.72, p = 1.7E−4 or 0.0002.

Though we do not, some classes use the preliminary 2-SampFTest. That test gives p = 0.0816 > 0.05. Those classes would use Pooled:Yes in 2-SampTTest and get p = 0.00016551 and the same conclusion.

p < α. Reject H0 and accept H1. At the 0.01 level of significance, the course in diabetic self-care does lower patients’ blood sugar, on average.
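The test statistic from 2-SampTTest can be verified by hand from the summary statistics; a Python sketch, computing only t (the p-value needs the t distribution, which Python's standard library doesn't provide):

```python
from math import sqrt

# Summary statistics from the problem
xbar1, s1, n1 = 6.5, 0.7, 50   # course group
xbar2, s2, n2 = 7.1, 0.9, 50   # no-course group
t = (xbar1 - xbar2) / sqrt(s1**2 / n1 + s2**2 / n2)  # unpooled (Welch) t
print(round(t, 2))  # → -3.72
```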

(b) For two-population numeric data, paired data do a good job of controlling for lurking variables. You would test each person’s blood sugar, then enroll all thirty patients in the course and test their blood sugar six months after the end of the course. Your variable d is blood sugar after the course minus blood sugar before, and your H1 is μd < 0.

One potential problem is that all 30 patients receive a heightened level of attention, so you have to worry about the placebo effect. (With the original experiment, the control group did not receive the extra attention of being in the course, so any difference from the attention is accounted for in the different results between control group and treatment group.)

It seems unlikely that the placebo effect would linger for six months after the end of a short course, but you can’t rule out the possibility. There are two answers to that. You could re-test the patients after a year, or two years. Or, you could ask whether it really matters why patients do better. If they do better because of the course itself, or because of the attention, either way they’re doing better. A short course is relatively inexpensive. If it works, why look a gift horse in the mouth? In fact, medicine is beginning to take advantage of the placebo effect in some treatments.

51 This is a test on the mean of one population, with population standard deviation unknown: Case 1.
(1) H0: μ = 2.5 years. H1: μ > 2.5 years.

α = 0.05

Requirements: random sample, normal with no outliers (given).

T-Test: μo=2.5, x̅=3, s=.5, n=6, μ>μo. Results: t = 2.45, p = 0.0290.

p < α. Reject H0 and accept H1. Yes, at the 0.05 significance level, the mean duration of pain for all persons with the condition is greater than 2.5 years.
52 (a) Each man or woman was asked a yes/no question, so you have binomial data for two populations: Case 5.
(1) Population 1 = men, Population 2 = women. H0: p1 = p2, men and women equally likely to refuse promotions. H1: p1 > p2, men more likely to refuse promotions.

α = 0.05

Requirements: independent random samples. For each sample, 10n = 10×200 = 2000 is far less than the total number of men or women. Men: 60 yes, 200−60 = 140 no; women: 48 yes, 200−48 = 152 no; all are ≥ 10. (The formal requirement uses the blended proportion p̂ = (60+48)/(200+200) = .27, so men have .27×200 = 54 expected yes and 200−54 = 146 expected no, and women have the same; again, all are ≥ 10.)

2-PropZTest: x1=60, n1=200, x2=48, n2=200, p1>p2. Results: z=1.351474757 → z = 1.35, p=.0882717604 → p-value = .0883, p̂1=.3, p̂2=.24, p̂=.27.

p > α. Fail to reject H0. At the 0.05 level of significance, we can’t determine whether the percentage of men who have refused promotions to spend time with their family is more than, the same as, or less than the percentage of women.
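The 2-PropZTest numbers can be reproduced with the pooled-proportion formula; a Python sketch (an aside, not part of the TI-83 solution):

```python
from math import sqrt
from statistics import NormalDist

x1, n1, x2, n2 = 60, 200, 48, 200   # men and women who refused promotions
p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)      # blended proportion, 0.27
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
p_value = 1 - NormalDist().cdf(z)   # right-tailed, H1: p1 > p2
print(round(z, 2), round(p_value, 4))  # → 1.35 0.0883
```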

(b) 2-PropZInt with the above inputs and C-Level=.95 gives (−.0268, .14682). The English sentence needs to state both magnitude and direction, something like this: Regarding men and women who refused promotion for family reasons, we’re 95% confident that men were between 2.7 percentage points less likely than women, and 14.7 percentage points more likely.

Common mistake: With two-population confidence intervals, you must state the direction of the difference, not just the size of the difference.

53 This problem depends on the Empirical Rule and knowing that the normal distribution is symmetric.

If the middle 95% runs from 70 to 130, then the mean must be μ = (70+130)÷2 → μ = 100

95% of any population are within 2 standard deviations of the mean. The range 70 to 100 (or 100 to 130) is therefore two SD. 2σ = 100−70 = 30 → σ = 15

54 This is binomial data, Case 2. (The members of the sample are insurance claims, and each claim either is settled or is not.)
(1) H0: p = .75. H1: p < .75.

α = 0.05

Requirements: random sample. 10n = 10×65 = 650, obviously less than the total number of claims filed in the state. 65×0.75 = 48.75 expected successes and 65−48.75 = 16.25 expected failures, both ≥ 10.

Common mistake: Don’t use the actual successes and failures, 40 and 65−40 = 25. That would be right for a confidence interval, but for a hypothesis test you assume H0 is true and so you must use the proportion 0.75 from your null hypothesis.

1-PropZTest: po=.75, x=40, n=65, prop<po. Results: z = −2.51, p-value = 0.0061, p̂ = .6154.

p < α. Reject H0 and accept H1. At the 0.05 significance level, fewer than 75% of claims are settled within the stated time.
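For reference, the 1-PropZTest statistic uses the hypothesized p0 = 0.75 in the standard error, exactly as the common-mistake note says; a Python sketch (an aside):

```python
from math import sqrt
from statistics import NormalDist

p0, x, n = 0.75, 40, 65
p_hat = x / n                  # 0.6154
se = sqrt(p0 * (1 - p0) / n)   # SE uses p0 from H0, not p-hat
z = (p_hat - p0) / se
p_value = NormalDist().cdf(z)  # left-tailed, H1: p < 0.75
print(round(z, 2), round(p_value, 4))  # → -2.51 0.0061
```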
55 P(mislabeled) = P(Brand A and mislabeled) + P(Brand B and mislabeled) because those are disjoint events. But whether a pair is mislabeled is dependent on the brand, so

P(Brand A and mislabeled) = P(Brand A) × P(mislabeled | Brand A)

and similarly for brand B.

P(mislabeled) = 0.40 × 0.025 + 0.60 × 0.015 = 0.019 or just under 2%

Alternative solution: The formulas can be confusing, and often there’s a way to do without them. You could also do this as a matter of proportions:

Out of 1000 shoes, 400 are Brand A and 600 are Brand B.

Out of 400 Brand A shoes, 2.5% are mislabeled. 0.025×400 = 10 brand A shoes mislabeled.

Out of 600 Brand B shoes, 1.5% are mislabeled. 0.015×600 = 9 brand B shoes mislabeled.

Out of 1000 shoes, 10 + 9 = 19 are mislabeled. 19/1000 is 1.9% or 0.019.
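The total-probability computation is just two products and a sum; a Python check (an aside):

```python
# P(mislabeled) = P(A)*P(mislabeled|A) + P(B)*P(mislabeled|B)
p_A, p_B = 0.40, 0.60
p_mis_given_A, p_mis_given_B = 0.025, 0.015
p_mislabeled = p_A * p_mis_given_A + p_B * p_mis_given_B
print(round(p_mislabeled, 3))  # → 0.019
```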

This is even easier to do if you set up a two-way table, as shown below. The given values are the brand shares (40% and 60%) and the mislabeling rates (2.5% and 1.5%); everything else is derived from them.

|                   | Brand A         | Brand B            | Total              |
|-------------------|-----------------|--------------------|--------------------|
| Mislabeled        | 40% × 2.5% = 1% | 60% × 1.5% = 0.9%  | 1% + 0.9% = 1.9%   |
| Correctly labeled | 40% − 1% = 39%  | 60% − 0.9% = 59.1% | 39% + 59.1% = 98.1% |
| Total             | 40%             | 60%                | 100%               |
56

Solution: This is paired numeric data, Case 3.

Common mistake: You must do this as paired data. Doing it as unpaired data will not give the correct p-value.

(1) d = A−B. H0: μd = 0, no difference in smoothness. H1: μd ≠ 0, a difference in smoothness.

Remark: You must define d as part of your hypotheses.

α = 0.10

Requirements: random sample. Compute the ten differences (positive or negative, as shown above) and put them in a statistics list. Use MATH200A part 4 for the normal probability plot to show the data are normal. MATH200A part 2 gives a modified boxplot showing no outliers.

T-Test: μo=0, List:L3, Freq:1, μ≠μo. Results: t = 1.73, p = 0.1173, x̅ = 1, s = 1.83, n = 10.

p > α. Fail to reject H0. At the 0.10 level of significance, it’s impossible to say whether the two brands of razors give equally smooth shaves or not.
57 The key to this is recognizing the difference between with and without replacement. While (a) and (b) are both technically without replacement, recall that when the sample is less than 5% of a large population, as it is in (a), you treat the sample as drawn with replacement. But in (b), the sample of two is drawn from a population of only ten bills, so you must use computations for without replacement.

Solution: (a) Use MATH200A part 3 with n=2, p=0.9, from=1, to=1. Answer: 0.18

You could also use binompdf(2, .9, 1) = 0.18.

Alternative solution: The probability that exactly one is tainted is sum of two probabilities: (i) that the first is tainted and the second is not, and (ii) that the first is not tainted and the second is. Symbolically,

P(exactly one) = P(first and not second) + P(not first and second)

P(exactly one) = 0.9×0.1 + 0.1×0.9

P(exactly one) = 0.09 + 0.09 = 0.18

Solution: (b) When sampling without replacement, the probabilities change. You have the same two scenarios — first but not second, and not first but second — but the numbers are different.

P(exactly one) = P(first and not second) + P(not first and second)

P(exactly one) = (9/10)×(1/9) + (1/10)×(9/9)

P(exactly one) = 1/10 + 1/10 = 2/10 = 0.2

Common mistake: Many, many students forget that both possible orders have to be considered: first but not second, and second but not first.

Common mistake: You can’t use binomial distribution in part (b), because when sampling without replacement the probability changes from one trial to the next.
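The contrast between the two parts shows up clearly in code: part (a) is a binomial probability, while part (b) must multiply conditional probabilities that change after the first draw. A Python sketch (an aside):

```python
# (a) Effectively with replacement: binomial, n=2 trials, p=0.9, exactly one tainted,
#     counting both orders (tainted-then-clean and clean-then-tainted)
p_with = 2 * 0.9 * 0.1

# (b) Without replacement from 10 bills, 9 of them tainted
p_without = (9/10) * (1/9) + (1/10) * (9/9)

print(round(p_with, 2), round(p_without, 2))  # → 0.18 0.2
```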

58 This is numeric data for one population with σ unknown: Case 1. Requirements are met because the original population (yields per acre) is normal. The T-Interval yields (80.952, 90.048). 81.0 < μ < 90.0 (90% confidence) or 85.5±4.5 (90% confidence)
59 No, because the five trials are not independent.

For example, if the first card is an ace then the probability the second card is also an ace is 3/51, but if the first card is not an ace then the probability that the second card is an ace is 4/51. Symbolically, P(A2|A1) = 3/51 but P(A2| not A1) = 4/51.

60 This is two-population binomial data, Case 5.

(a) p̂T = 128/300 = 0.4267. p̂C = 135/400 = 0.3375. p̂T − p̂C = 0.0892 or about 8.9%

Remark: The point estimate is descriptive statistics, and requirements don’t enter into it. But the confidence interval is inferential statistics, so you must verify that each sample is random, each sample has at least 10 successes and 10 failures, and each sample is less than 10% of the population it came from.

The problem states that the samples were random, which takes care of the first requirement. There were 128 successes and 300−128 = 172 failures in Tompkins, 135 successes and 400−135 = 265 failures in Cortland, so the second requirement is met.

What about the third requirement? You don’t know the populations of the counties, but remember that you can work it backwards. 10×300 = 3000 (Tompkins) and 10×400 = 4000 (Cortland), and surely the two counties must have populations greater than 3000 and 4000, so the third requirement must be met.

(b) 2-PropZInt: The 98% confidence interval is 0.0029 to 0.1754 (about 0.3% to 17.5%), meaning that with 98% confidence Tompkins viewers are more likely than Cortland viewers, by 0.3 to 17.5 percentage points, to prefer a movie over TV.

(c) E = 0.1754−0.0892 = 0.0862 or about 8.6%

You could also compute it as 0.0892−0.0029 = 0.0863 or (0.1754−0.0029)/2 = 0.0863. All three methods get the same answer except for a rounding difference.
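The interval and its margin can be reproduced from the two sample proportions; a Python sketch using the same formula as 2-PropZInt (an aside):

```python
from math import sqrt
from statistics import NormalDist

pT, nT = 128 / 300, 300   # Tompkins
pC, nC = 135 / 400, 400   # Cortland
diff = pT - pC
z = NormalDist().inv_cdf(1 - 0.02 / 2)   # 98% confidence → z ≈ 2.326
E = z * sqrt(pT * (1 - pT) / nT + pC * (1 - pC) / nC)
print(round(diff - E, 4), round(diff + E, 4), round(E, 4))  # → 0.0029 0.1754 0.0862
```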

61 This is binomial data for two populations, Case 5. (The members of the samples are seeds, and a given seed either germinated or didn’t.) Note: sample sizes are 80+20 = 100 and 135+15 = 150.
(1) Population 1 = no treatment, Population 2 = special treatment. H0: p1 = p2, no difference in germination rates. H1: p1 ≠ p2, there’s a difference in germination rates.

α = 0.05

Requirements: independent random samples. 10n1 = 10×100 = 1000 and 10n2 = 10×150 = 1500; obviously there are far more than 1500 seeds of this type. In sample 1, 80 successes and 20 failures; in sample 2, 135 successes and 15 failures; all are at least 10. (The formal requirement uses the blended proportion p̂ = (80+135)/(100+150) = 0.86 to find expected successes and failures. For sample 1, 0.86×100 = 86 and 100−86 = 14; for sample 2, 0.86×150 = 129 and 150−129 = 21. All are at least 10.)

2-PropZTest: x1=80, n1=80+20, x2=135, n2=135+15, p1≠p2. Results: z = −2.23, p-value = 0.0256, p̂1 = .8, p̂2 = .9, p̂ = .86.

p < α. Reject H0 and accept H1. Yes, at the 0.05 significance level, the special treatment made a difference in germination rate. Specifically, seeds with the special treatment were more likely to germinate than seeds that were not treated.

Remark: p < α in Two-Tailed Test: What Does It Tell You? explains how you can reach a one-tailed result from a two-tailed test.

Alternative solution: You could also do this as a test of homogeneity, Case 7. The χ²-Test gives χ² = 4.98, df = 1, p=0.0256