
Stats without Tears
Solutions for Chapter 1

Updated 1 June 2015 (What’s New?)
Copyright © 2013–2017 by Stan Brown


1 Sampling error is another name for sample variability, the fact that each sample is different from the next because no sample perfectly represents the population it was drawn from. Nonsampling errors are problems in setting up or carrying out the data collection, such as poorly worded survey questions and failure to randomize.

Nothing can eliminate sampling error, but you can reduce it by increasing your sample size. (Most nonsampling errors can be avoided by proper experimental design and technique.)
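If you want to see sample size at work, here is a small simulation sketch. The population (10,000 roughly normal values), the sample sizes 25 and 400, and the seed are all made up for illustration; none of them comes from the exercise.

```python
import random
import statistics


def spread_of_sample_means(population, n, trials, rng):
    """Standard deviation of the means of `trials` random samples of size n."""
    means = [statistics.mean(rng.sample(population, n)) for _ in range(trials)]
    return statistics.stdev(means)


rng = random.Random(1)  # fixed seed (an assumption) so the run is repeatable
# Hypothetical population: 10,000 values scattered around 100.
population = [rng.gauss(100, 15) for _ in range(10_000)]

small = spread_of_sample_means(population, 25, 500, rng)
large = spread_of_sample_means(population, 400, 500, rng)
print(f"n=25:  sample means vary by about {small:.2f}")
print(f"n=400: sample means vary by about {large:.2f}")
```

The sample means still vary at n=400, so the sampling error hasn't gone away, but the spread should be noticeably smaller than at n=25.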

2 (a) systematic sample.
(b) It is probably a good sample of that gynecologist’s patients, since there’s no reason to think that one month is different from another. But it’s a bad sample of pregnant women in general, because it suffers from selection bias. This gynecologist’s patients may use prenatal vitamins differently from pregnant women who see other gynecologists or who don’t have a regular gynecologist.
(c) observational study
3 (a) completely randomized
(b) the plant food administered
(c) no food, Gro-Mor, Magi-Grow
(d) 13 heights at the end of the 13 weeks (You could also make a case for growth rate.)
(e) the 150 bulbs
(f) selection of plant food
(g) the group that gets no plant food
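
One way to carry out the random assignment in a completely randomized design can be sketched as follows. The bulb labels 1 to 150, the equal groups of 50, and the seed are assumptions for illustration; the solution above says only that there are 150 bulbs and three treatments.

```python
import random

treatments = ["no food", "Gro-Mor", "Magi-Grow"]
bulbs = list(range(1, 151))  # assumed labels: bulbs numbered 1..150

rng = random.Random(2015)  # fixed seed (an assumption) for a repeatable shuffle
rng.shuffle(bulbs)         # chance alone decides the order of the bulbs

group_size = len(bulbs) // len(treatments)  # 50 each, assuming equal groups
assignment = {
    treatment: sorted(bulbs[i * group_size:(i + 1) * group_size])
    for i, treatment in enumerate(treatments)
}

for treatment, group in assignment.items():
    print(treatment, len(group), group[:5])
```

Every bulb lands in exactly one group, and the experimenter has no say in which, so any difference in the bulbs themselves is spread across the groups by chance.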
4 Each family answered the question “How many children do you have?”
(a) The variable is number of children.
(b) It is a discrete variable.
(c) It summarizes population data, and therefore it is a parameter.

Although “numeric” or “quantitative” is correct, it’s not an adequate answer because it is not as specific as possible. Discrete and continuous data are treated differently in descriptive statistics, so it matters which type you have.

Students are sometimes fooled by the decimal. Always ask yourself what question was originally asked of each member of the sample, or what measurement was originally taken.
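
A tiny sketch of why the decimal is misleading: the family counts below are invented for illustration, not taken from the exercise. Every answer is a whole number, yet the average is not.

```python
# Hypothetical answers to "How many children do you have?"
children = [2, 0, 3, 1, 2, 4, 1, 2]

# Each data value is a count, so the variable is discrete.
print(all(isinstance(count, int) for count in children))  # True

# The summary number can still carry a decimal.
average = sum(children) / len(children)
print(average)  # 1.875
```

The decimal comes from the averaging, not from the data.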

5 (a) The sample is the 80 people in your focus group. (It is not the drinks. It’s also not the people’s preferences: Their preferences are the data or sample data.)
(b) The sample size is 80, because that’s the number of people you took data from. It’s not 55: That’s just the number who gave one particular response.
(c) The population is not stated explicitly, but you can infer that it’s cola drinkers in general, or Whoopsie Cola drinkers in general.
(d) You don’t know how many cola drinkers (or Whoopsie Cola drinkers) there are. You can’t know, since people change their soft-drink habits all the time. You can say that the population is indefinitely large, or you can say that it’s infinite. (You can say that the population is uncountable, but don’t say that the population size is uncountable.)

Common mistake: Students sometimes answer “80” for population size, but this is not correct. You took data from 80 people, so those 80 people are your sample and 80 is your sample size.

6 (a) sampling error (or sample variability)
(b) increase sample size
7 You’re asking people to admit to socially disapproved behavior. People tend to shade their answers toward socially acceptable behavior.

What can be done to reduce response bias? Interviewers should be trained to be absolutely neutral in voice and facial expression, which is how the Kinsey team gathered data on sexual behavior. Or the question can be asked on a written questionnaire, so that the subject isn’t looking another person in the face when answering. The question can also be made less threatening: “Have you ever left an infant alone in the house, even for just a minute?”


Best balance? Probably the cluster sample. The true random sample is a lot of work for a sample of 50, because after selecting the names you have to track the students down. The systematic sample, no matter how you do it, is going to miss a lot of students, and you have that time-period problem. With the cluster sample, you can time it for when students are likely to be home, and you can go back to follow up on those you missed.

But nothing is perfect, in this life where we are born to trouble as the sparks fly upward. The cluster sample works if the students were randomly assigned to rooms. When students pick their own roommates, they tend to pick people with similar attitudes, interests, and activities. That means those two are more similar to each other than to other students, and there’s no way you can treat that cluster sample as a random sample. The cluster sample would probably be safe for freshmen, where the great majority would be randomly assigned, but less so for students in later years.
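
To make the three kinds of sample concrete, here is a rough sketch. The dorm of 500 two-person rooms, the student labels, and the seed are hypothetical, and the systematic sample here steps through a roster, which may not be how the exercise sets it up.

```python
import random

rng = random.Random(8)  # fixed seed (an assumption) for repeatability

# Hypothetical dorm: 500 rooms with two students each.
rooms = {room: [f"student-{2 * room}", f"student-{2 * room + 1}"]
         for room in range(500)}
roster = [student for pair in rooms.values() for student in pair]
sample_size = 50

# Simple random sample: every set of 50 students is equally likely.
simple = rng.sample(roster, sample_size)

# Systematic sample: random starting point, then every k-th name.
k = len(roster) // sample_size
start = rng.randrange(k)
systematic = roster[start::k]

# Cluster sample: pick 25 rooms at random and take everyone in them.
chosen_rooms = rng.sample(sorted(rooms), sample_size // 2)
cluster = [student for room in chosen_rooms for student in rooms[room]]

print(len(simple), len(systematic), len(cluster))
```

All three give 50 students, but the cluster sample means only 25 doors to knock on, which is where the saving in effort comes from. It also shows the catch: the 50 students arrive as 25 pairs of roommates.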

9 No, you can’t reach that conclusion, because you can never conclude causation from an observational study. You would have to do an experiment, where people were randomly assigned to watch Fox News or to watch no news at all, and then see if there was a difference in how much they knew about the world.

Students often answer questions like this with hand-waving arguments, either coming up with reasons why it’s a plausible conclusion or coming up with reasons why it isn’t. This is statistics, and we have to follow the facts. Whatever you may think about Fox News, the fact is that observational studies can’t prove causation.

10 (a) It excludes people who don’t use the bus. This means that people who are dissatisfied with the bus are systematically under-represented. Your survey will probably show that willingness to pay is higher than it actually is.
(b) sampling bias
11 “Random” doesn’t mean unplanned; it takes planning. This is a bogus sample. If you want a more formal statistical word, call it a convenience sample, an opportunity sample or a non-probability sample.
12 (a) This is attribute data or qualitative data or non-numeric data. Don’t be fooled by the number 42: the original question asked was “Do you have at least one streaming device?” and that’s a yes/no question.

Alternative: the more specific answer binomial data, which you may have heard in the lecture though it’s not in the book till Chapter 6.

(b) This is descriptive statistics because it’s reporting data actually measured: 42% of the sample. If it said “42% of Americans”, then it would be inferential because you know not every American was asked, so the investigators must have extrapolated from a sample to the population.

(c) It is a statistic because it is a number that summarizes data from a sample.


All of these are nonsampling errors.

14 2.145E-4 is 0.0002145, and 0.0004 is larger than that.
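
If you don’t trust yourself to count decimal places, you can let software do the conversion and the comparison. This short check uses only the two numbers given in the answer.

```python
a = float("2.145E-4")  # E-4 means "times 10 to the power -4"
b = 0.0004

print(f"{a:.7f}")  # 0.0002145
print(b > a)       # True
```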
15 It’s spurious precision. (That much precision could be appropriate if you had surveyed a few hundred thousand households.)

To fix it, round to one decimal place: 1.9. (Don’t make the common mistake of “rounding” to 1.8.)
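
The difference between rounding and chopping off digits can be sketched like this. The value 1.8643 is a stand-in chosen for illustration, not the figure from the exercise; any value from 1.85 up to just under 1.9 behaves the same way.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_UP

value = Decimal("1.8643")  # hypothetical figure, not taken from the exercise

rounded = value.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
truncated = value.quantize(Decimal("0.1"), rounding=ROUND_DOWN)

print(rounded)    # 1.9  (correct: 1.8643 is closer to 1.9 than to 1.8)
print(truncated)  # 1.8  (the common mistake: just dropping the extra digits)
```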

16 (a) Non-numeric. (It has the form of a number, but think about the average area code in a group and you’ll realize an area code is not a number.)
(b) Continuous.
(c) Discrete.
(d) Non-numeric.
(e) Non-numeric.
(f) Discrete (or continuous, if you allow answers like 6.3).
17 (a) was done for you.
(b) Measurement: Amount of each dinner check. Continuous.
(c) Question: “Did you experience bloating and stomach pain?” Non-numeric.
(d) Measurement: Number of people in each party. Discrete.
