Stats without Tears
9. Estimating Population Parameters
Updated 24 Dec 2017
(What’s New?)
Copyright © 2013–2024 by Stan Brown, BrownMath.com
In Chapter 8, you learned what sort of samples to expect from a known population. In the rest of the course, you’ll learn how to use a sample to make statements about an unknown population. This is inferential statistics.
In inferential statistics, there are two types of things you want to do: test whether some claim is true, and estimate the size of some effect. In this chapter you’ll construct confidence intervals that estimate population means and proportions; Chapter 10 starts you on testing claims.
In the Physicians’ Health Study [see “Sources Used” at end of book], about 22,000 male doctors were randomly assigned to take aspirin or a placebo every night. Of 11,037 in the treatment group, 104 had heart attacks and 10,933 did not. Can you say how likely it is for people in general (or, at least, male doctors in general) to have a heart attack if they take aspirin nightly?
As always, probability of one equals proportion of all. So you could just as well ask, what proportion of people who take aspirin would be expected to have heart attacks?
Before statistics class, you would divide 104/11037 = 0.0094 and say that 0.94% of people taking nightly aspirin would be expected to have heart attacks. This is known as a point estimate.
But you are in statistics class. You know that a sample can’t perfectly represent the population, and therefore all you can say is that the true proportion of heart attacks in the population of aspirin takers is around 0.94%. Can you be more specific?
Yes, you can. You can compute a confidence interval for the proportion of heart attacks to be expected among aspirin takers, based on your sample, and that’s the subject of this chapter. We’ll get back to the doctors and their aspirin later, but first, let’s do an example with M&Ms.
Example 1: You take a random sample of 605 plain M&Ms, and 87 of them are red. What can you say about the proportion of reds in all plain M&Ms?
A point estimate of a population parameter is the single best available number, and in fact it’s nothing more than the corresponding sample statistic.
In this example, your point estimate for population proportion is sample proportion, 87/605 = 14.4%, and you conclude “Somewhere around 14.4% of all plain M&Ms are red.”
The sample proportion is a point estimate of the proportion in the population, the sample mean is a point estimate of the mean of the population, the sample standard deviation is a point estimate of the standard deviation of the population, and so on.
A confidence interval estimate of a population parameter is a statement of bounds on that parameter and includes your level of confidence that the parameter actually falls within those bounds.
For instance, you could say “I’m 95% confident that 11.6% to 17.2% of plain M&Ms are red.” 95% is your confidence level (symbol: 1−α, “one minus alpha”), and 11.6% and 17.2% are the boundaries of your estimate or the endpoints of the interval.
As an alternative to endpoint form, you could write a confidence interval as a point estimate and a margin of error, like this: “I’m 95% confident that the proportion of red in plain M&Ms is 14.4% ± 2.8%.” 14.4% is your point estimate, and 2.8% is your margin of error (symbol: E), also known as the maximum error of the estimate. Since the confidence interval extends one margin of error below the point estimate and one margin of error above the point estimate, the margin of error is half the width of the confidence interval.
For all the cases you’ll study in this course, the point estimate — the mean or proportion of your sample — is at the middle of the confidence interval. But that’s not true for some other cases, such as estimating the standard deviation of a population. For those cases, computing the margin of error is uglier.
As you might expect, your TI-83/84 and lots of statistical packages can compute confidence intervals for you. But before doing it the easy way, let’s take a minute to understand what’s behind computing a confidence interval.
You can compute an interval to any level of confidence you desire, but 95% is most common by far, so let’s start there. How do you use those 87 reds in a sample of 605 M&Ms to estimate the proportion of reds in the population, and have 95% confidence in your answer?
In Chapter 8, you learned how to find the sampling distribution of p̂. Given the true proportion p in the population, you could then determine how likely it is to get a sample proportion p̂ within various intervals. To find a confidence interval, you simply run that backward.
You don’t know the proportion of reds in all plain M&Ms, so call it p. You know that, if the sample size is large enough, sample proportions are ND and there’s a 95% chance that any given sample proportion will be within 2 standard errors on either side of p, whatever p is.
The standard error of the proportion is σ_{p̂} = √(pq/n). You don’t know p; that’s what you’re trying to find. Are you stuck? No, you have an estimate for p. Your point estimate for the population proportion p is the sample proportion p̂ = 87/605. You can estimate the standard error of the proportion (the SEP) by using the statistics of your sample:
σ_{p̂} ≈ √[(87/605)(1−87/605)/605] = 0.0142656 or about 1.4%
Two standard errors is 0.0285312 → 0.029 or 2.9%.
How good is this estimate? For decent-sized samples, it’s quite good. For example, suppose the true population proportion p is 50% or 0.5. For a sample of n = 625, the SEP is √[.5(1−.5)/625] = 0.0200 or 2.00%. Your sample proportion is very, very, very unlikely to be as far away as 40% or 0.4, but even if it is then you would estimate the SEP as √[.4(1−.4)/625] = 0.0196 or 1.96%, which is extremely close.
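If you’d like to check that arithmetic away from the calculator, here is a minimal Python sketch. Python isn’t part of this course, and the function name `sep` is just an illustrative choice; the numbers are the ones from the paragraphs above.

```python
import math

def sep(p, n):
    """Standard error of the proportion: sqrt(p * (1 - p) / n)."""
    return math.sqrt(p * (1 - p) / n)

# M&Ms sample: 87 reds out of 605, using p-hat in place of the unknown p
sep_mm = sep(87 / 605, 605)
print(round(sep_mm, 7))            # about 0.0142656

# How much does a poor estimate of p change the SEP? (n = 625)
sep_true = sep(0.5, 625)           # SEP if p really is 0.5
sep_est = sep(0.4, 625)            # SEP estimated from an unlucky p-hat of 0.4
print(round(sep_true, 4), round(sep_est, 4))
```

The two values on the last line differ only in the third decimal place, which is the point of the paragraph above: the estimated SEP is not very sensitive to p̂.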
Different authors use the term “standard error” slightly differently. Some use it only for the standard deviation of the sampling distribution, which you never know exactly because you never know the population parameters exactly. Others use it only for the estimate based on sample statistics, which I computed just above. Still others use it for either computation. In practice it doesn’t make a lot of difference. I don’t see much point to getting too fussy about the terminology, given that only one of them can be computed anyway.
Any given sample proportion is 95% likely to be within two standard errors or 2.9% of the population proportion:
p−0.029 ≤ p̂ ≤ p+0.029 (probability = 95%)
Now the magic reverso: Given a sample proportion, you’re 95% confident that the population proportion is within 2.9% of that sample proportion:
p̂−0.029 ≤ p ≤ p̂+0.029 (95% confidence)
In this case, your sample proportion is 87/605 ≈ 0.144:
0.144−0.029 ≤ p ≤ 0.144+0.029 (95% confidence)
0.115 ≤ p ≤ 0.173 (95% confidence)
So your 95% confidence interval is 0.115 to 0.173, or 11.5% to 17.3%.
p−0.029 ≤ p̂ ≤ p+0.029
Multiply by −1. When you multiply by a negative, you have to reverse the inequality signs.
−p+0.029 ≥ −p̂ ≥ −p−0.029
Rewrite in conventional order, from smallest to largest.
−p−0.029 ≤ −p̂ ≤ −p+0.029
Now add p+p̂ to all three “sides”.
p̂−0.029 ≤ p ≤ p̂+0.029
You might have noticed that I changed from 95% probability to 95% confidence. What’s up with that? Well, the sample proportion is a random variable — different samples will have different sample proportions p̂, and you can compute the probability of getting p̂ in any particular range.
But the population proportion p is not a random variable. It has one definite value, even though you don’t know what that definite value is. Probability statements about a definite number make about as much sense as discussing the probability of precipitation for yesterday. The population proportion is what it is, and you have some level of confidence that your estimated range includes that true value.
What does “95% confident” mean, then? Simply this: In the long run, when you do everything right, 95% of your 95% intervals will actually include the population proportion, and the other 5% won’t. 5% is 5/100 = 1/20, so in the long run about one in 20 of your 95% confidence intervals will be wrong, just because of sample variability.
Probability of one = proportion of all, so there’s one chance in twenty that this interval is wrong, meaning that it doesn’t contain the true population proportion, even if you did everything right. If that makes you too nervous, you can use a higher confidence level, but you can never reach 100% confidence.
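One way to see what “95% of your 95% intervals” means is to simulate it. The sketch below is only an illustration: it pretends the true proportion is 0.15 (a made-up value, since in real life you never know p), draws many random samples of 605, and counts how often the interval of two standard errors either side of p̂ contains the true value. The seed and the number of trials are arbitrary choices.

```python
import math
import random

random.seed(1)            # fixed seed so the run is repeatable

P_TRUE = 0.15             # hypothetical true proportion (unknown in real life)
N = 605                   # sample size, as in the M&Ms example
TRIALS = 2000             # number of simulated samples
SE_COUNT = 2              # two standard errors, the rough figure used so far

hits = 0
for _ in range(TRIALS):
    # Draw one sample and count the successes
    successes = sum(random.random() < P_TRUE for _ in range(N))
    p_hat = successes / N
    margin = SE_COUNT * math.sqrt(p_hat * (1 - p_hat) / N)
    # Did this sample's interval capture the true proportion?
    if p_hat - margin <= P_TRUE <= p_hat + margin:
        hits += 1

coverage = hits / TRIALS
print(f"{coverage:.1%} of the intervals contained the true proportion")
```

The coverage should come out near 95%, though not exactly, because the simulation itself is subject to sample variability.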
There’s one more wrinkle. That margin of error of 0.029 was 2σ_{p̂}, two standard errors. The figure of 2 standard errors for the middle 95% of a ND comes from the Empirical Rule or 68–95–99.7 Rule, so it’s only approximately right.
But you can be a little more precise. In Chapter 7 you learned to find the middle any percent, and that lets you generalize to any confidence level:
Quantity | This Example | General Case |
---|---|---|
Confidence level (middle area of the ND) | 95% | 1−α |
Area in the two tails combined | 100%−95% = 5% or 0.05 | 1−(1−α) = α |
Area in each tail | 0.05/2 = 0.025 | α/2 |
The boundaries are | ±z_{0.025} = invNorm(1−0.025) = 1.9600 | ±z_{α/2} |
The margin of error is | E = 1.96σ_{p̂} | E = z_{α/2} σ_{p̂} |
And you compute it as | E = 1.96 · √(p̂(1−p̂)/n) | E = z_{α/2} · √(p̂(1−p̂)/n) |
The margin of error on a 1−α confidence interval is z_{α/2} standard errors. (This will be important when you determine necessary sample size, below.)
The margin of error on a 95% confidence interval is close to 2σ_{p̂}, but more accurately it’s 1.96σ_{p̂}. For the proportion of red M&Ms, where the SEP was σ_{p̂} = 0.0142656, the margin of error is 1.96σ_{p̂} = 0.0279606 → 0.028 or 2.8%. Since the point estimate was 14.4%, you’re 95% confident that the proportion of reds in plain M&Ms is within 14.4%±2.8%, or 11.6% to 17.2%.
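The same computation can be sketched in Python, as a rough stand-in for what the calculator does. The function name `one_prop_z_interval` is an illustrative choice, not anything from the TI-83/84; `NormalDist().inv_cdf` from Python’s standard library plays the role of invNorm.

```python
import math
from statistics import NormalDist

def one_prop_z_interval(successes, n, conf_level):
    """Return (low, high, p_hat, margin) for a one-proportion z interval."""
    p_hat = successes / n
    alpha = 1 - conf_level
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)      # z_{alpha/2}
    margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin, p_hat, margin

# Red M&Ms: 87 out of 605, 95% confidence
low, high, p_hat, margin = one_prop_z_interval(87, 605, 0.95)
print(f"p-hat = {p_hat:.4f}, E = {margin:.4f}")
print(f"95% CI: {low:.3f} to {high:.3f}")             # about 0.116 to 0.172
```

Returning the margin of error along with the endpoints makes it easy to write the interval in either form.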
You’ve seen that there are two ways to state a confidence interval: from ____ to ____ with ____% confidence, or ____ ± ____ with _____% confidence. Mathematically these are equivalent, but psychologically they’re very different. The first form is better than the second.
What’s wrong with the ____ ± ____ form? It’s easy to misinterpret.
If you say “I’m 95% confident that the proportion of reds in plain M&Ms is within 14.4%±2.8%”, some people will read 14.4% and stop — they’ll think that the population proportion is 14.4%. And even people who get past that will probably think that there’s something special about 14.4%, that somehow it’s more likely to be the true proportion of reds among all plain M&Ms. But 14.4% is just a value of a random variable, namely the proportion of reds in this sample. Another sample would almost certainly have a different p̂ and therefore a different midpoint for the interval.
It’s much better to use the endpoint form, because the endpoint form is harder to misinterpret. When you say “I’m 95% confident that the proportion of reds in plain M&Ms is 11.6% to 17.2%”, you lead the reader, even the non-technical reader, to understand that the proportion could be anything in that range, and even that there’s a slight chance that it’s outside that range.
Requirements check (RC): This is an essential step — do it before you compute the confidence interval. Computing the CI assumes that the sampling distribution of p̂ is a ND, but “assumes” in statistics means you don’t assume, you check it.
The requirements are stated in Chapter 8 as simple random sample (or equivalent), np and nq both ≥ about 10, 10n ≤ N. You don’t know p, but for binomial data it’s okay to use p̂ as an estimate. But np̂ is just the number of yeses or successes in your sample, and nq̂ is just the number of noes or failures in your sample, so you really don’t need to do any multiplications.
Here’s how you check the requirements:
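For the M&Ms sample, those checks can be sketched in Python as below. This is only an illustration: the function name is made up, the population size is a hypothetical placeholder (the text doesn’t give the number of plain M&Ms in the world, only that it’s vastly more than 10 × 605 = 6050), and whether the sample is random is something you have to judge, not compute.

```python
def check_requirements(successes, n, population_size, is_random=True):
    """Return a dict of requirement name -> True/False for a proportion CI."""
    failures = n - successes
    return {
        "random sample (or equivalent)": is_random,
        "at least about 10 successes": successes >= 10,
        "at least about 10 failures": failures >= 10,
        "10n <= N": 10 * n <= population_size,
    }

N_ASSUMED = 1_000_000     # hypothetical stand-in for "all plain M&Ms"

results = check_requirements(87, 605, N_ASSUMED)
for name, ok in results.items():
    print(f"{name}: {'OK' if ok else 'FAILED'}")

all_ok = all(results.values())
print("All requirements met?", all_ok)
```

Notice that the function works with the counts of successes (87) and failures (605 − 87 = 518) directly, so no multiplication by p̂ is needed.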
Your TI-83 or TI-84 can easily compute confidence intervals for a population proportion. With binomial data, this is Case 2 in Inferential Statistics: Basic Cases. (Excel can do it too, but it’s significantly harder in Excel.)
Example 2:
Let’s do the red M&Ms, since you already know the answer.
See the requirements check above.
Press [STAT] [◄] to get to the STAT TESTS menu, and scroll up or down to find 1-PropZInt. (Caution: you don’t want 1-PropZTest. That’s reserved for Chapter 10.)
Enter the number of successes in the sample, the sample size, and the confidence level. Easy-peasy!
Write down the screen name and your inputs, then proceed to the output screen and write down just the new stuff:
Here’s how you show your work:
1-PropZInt 87, 605, .95 (not PropZInt, please!)
(.11584, .17176), p̂ = .1438016529
There’s no need to write n=605 because you already wrote it down from the input screen.
Interpretation: I’m 95% confident that 11.6% to 17.2% of plain M&Ms are red.
You can vary that in several ways. For instance, some people like to put the confidence level last: 11.6% to 17.2% of plain M&Ms are red (95% confidence). Or they may choose more formal language: We’re 95% confident that the true proportion of reds in plain M&Ms is 11.6% to 17.2%.
I’ve already pooh-poohed the margin-of-error form, but sometimes you have to write it that way, for instance if your boss or your thesis advisor demands it. You can get it easily from the TI-83/84 output screen.
The center of the interval, the point estimate, is given: 14.38%. To find the margin of error, subtract that from the upper bound of the interval, or subtract the lower bound from it: .17176−.1438 = .02796, or .1438−.11584 = .02796. Either way it’s 2.8%. You can then express the CI as 14.4%±2.8% with 95% confidence.
Example 3: What about the male doctors who started this section? 104 out of 11037 of the doctors taking nightly aspirin had heart attacks. Assuming that male doctors are representative of adults in general, in terms of heart-attack risk, what can you say about the chance of heart attack for anyone who takes aspirin nightly? Use confidence level 1−α = 95%.
Solution: Requirements check (RC):
1-PropZInt 104, 11037, .95
(.00762, .01123), p̂ = .0094228504
Conclusion: People who take nightly aspirin have a 0.76% to 1.12% chance of heart attack (95% confidence).
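Example: Suppose that your sample was only 50 doctors, and none of them had heart attacks. With zero successes, the sample fails the requirement of at least about 10 successes, so 1-PropZInt doesn’t apply. A common fallback is the rule of three: with no successes in a sample of n, the 95% confidence interval runs from zero to about 3/n. Here 3/50 = 6%, so you would be 95% confident that people who take nightly aspirin have a zero to 6% chance of heart attack.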
The equation for margin of error is packed with information: E = z_{α/2} · √(p̂(1−p̂)/n)
You can see that a larger sample size n means a narrower confidence interval, but the sample size is inside the square-root sign so you don’t get as much benefit as you might hope for. If you take a sample four times as big, the square root of 4 is 2 and so your interval is half as wide, not ¼ as wide.
You can see also that you get a narrower interval if you’re willing to live with a lower confidence level. The lower your confidence level, the smaller z_{α/2} will be, and therefore the narrower your confidence interval.
The bottom line is that there’s a three-way tension among sample size, confidence level, and margin of error. You can choose any two of those, but then you have to live with the third. (p̂ doesn’t come into it. Although p̂ does contribute to the standard error and therefore to the margin of error, you can’t choose what p̂ you’re going to get in a sample.)
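Here is a small Python sketch of that three-way tension, using the M&Ms proportion. It assumes, purely for illustration, that a larger sample would give the same p̂ = 87/605; a real sample almost certainly wouldn’t. The function name is an illustrative choice.

```python
import math
from statistics import NormalDist

def margin_of_error(p_hat, n, conf_level):
    """E = z_{alpha/2} * sqrt(p_hat * (1 - p_hat) / n)."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return z_crit * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 87 / 605          # assumed to stay the same as n changes

e_base = margin_of_error(p_hat, 605, 0.95)
e_quad = margin_of_error(p_hat, 4 * 605, 0.95)    # four times the sample
e_90 = margin_of_error(p_hat, 605, 0.90)          # lower confidence level
e_99 = margin_of_error(p_hat, 605, 0.99)          # higher confidence level

print(f"n = 605,  95%: E = {e_base:.4f}")
print(f"n = 2420, 95%: E = {e_quad:.4f}  (half as wide, not a quarter)")
print(f"n = 605,  90%: E = {e_90:.4f}")
print(f"n = 605,  99%: E = {e_99:.4f}")
```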
If you want to get a confidence interval at your preferred confidence level with (no more than) a specified margin of error, how big a sample do you need? MATH200A Program part 5 will compute this for you, but let’s look at the formula first.
(See Getting the Program for instructions on getting the MATH200A program into your calculator.)
The equation at the start of this section shows the margin of error you get for a given sample size and confidence level. You can solve for the sample size n, like this:
E = z_{α/2} · √(p̂(1−p̂)/n) ⇒ n = p̂(1−p̂) · [z_{α/2} / E]²
In the formula, p̂ is your prior estimate if you have one. This can be the result of a past study, or a reasonable estimate if it has some logical basis. If you don’t have a prior estimate, use 0.5.
Using .5 as your prior estimate, you’re guaranteed that your sample won’t be too small, though it may be larger than necessary. Why not just use .5 all the time? Because taking samples always costs time and usually costs money, so you don’t want a larger sample than necessary.
Example 4: In a sample of 605 plain M&Ms, 87 were red. The 95% confidence interval had a 2.8% margin of error. How big a sample would you need to reduce the margin of error to 2%?
With the MATH200A program (recommended):
Run MATH200A part 5 and choose the binomial sample-size option. The next screen wants your estimated p, your desired margin of error, and your desired confidence level. Your prior estimate is 87/605, from your earlier study. Your margin of error is 2% = .02 (not .2 !), and your confidence level is 95% = .95.
The output screen echoes back your inputs, in case you forgot to write down the input screen, and then tells you that the sample size must be at least 1183 M&Ms. Notice the inequalities: for a margin of error of .02 (2%) or less, you need a sample size 1183 or more. (z Crit is critical z or z_{α/2}, the number of standard errors associated with your chosen confidence level.)
If you’re not using the program:
Marshal your data: prior estimate p̂ = 87/605, desired margin of error E = 0.02, and confidence level 1−α = 0.95. You need z_{α/2}. Get α/2 from the confidence level: 1−α = 0.95 ⇒ α = 0.05 ⇒ α/2 = 0.025.
z_{α/2} is the z-score such that the area to the right is α/2. In this problem, α/2 = 0.025, so you’re computing z_{0.025}. You’ll use invNorm, but invNorm wants area to left and 0.025 is an area to right, so you compute invNorm(1−0.025) = 1.9600.
Now, to avoid re-entering that z value, chain your calculations. The formula says you need to divide by E, so divide by .02. Then square the fraction. Finally, multiply by p̂ and (1−p̂). You get 1182.4…, and therefore your required sample size is 1183.
Caution! Your answer is 1183, not 1182. You don’t round the result of a sample-size calculation. If it comes out to a whole number (unusual), that’s your answer. Otherwise, you round up to the next whole number.
Why? Smaller sample size makes larger margin of error. n = 1182.4… corresponds to E = 0.02 exactly. A sample of 1182 would be just slightly under 1182.4…, and your margin of error would be just slightly over 0.02. But 0.02 was the maximum acceptable margin of error, so 1182.4… is the minimum acceptable sample size. You can’t take a fraction of an M&M in your sample, so you have to go up to the next whole number.
There’s no requirements check in sample-size problems. These are planning how to take your sample; requirements apply to your sample once you have it.
Example 5: You’re taking the first political poll of the season, and you’d like to know what fraction of adults favor your candidate. You decide you can live with a 90% confidence level and a 3% margin of error. How many adults do you need in your random sample?
Solution: Since you have no prior estimate for p, make p̂ = .5.
With the MATH200A program (recommended): | If you’re not using the program: |
---|---|
MATH200A/sample size/binomial: p̂=.5, E=.03, C-Level=.9, n ≥ 752 | 1−α = .9, E = .03, p̂ = .5. 1−α = .9 ⇒ α = 0.1 ⇒ α/2 = 0.05. z_{0.05} = invNorm(1−.05). Divide by E, which is .03. Square the result. Multiply by p̂ times (1−p̂). n = 751.5… → 752 |
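The sample-size calculation for a proportion is short enough to sketch in Python. This is not the MATH200A program, just the same formula with an illustrative function name; the default prior of 0.5 follows the advice above for when you have no prior estimate.

```python
import math
from statistics import NormalDist

def sample_size_proportion(margin, conf_level, p_prior=0.5):
    """Smallest n whose margin of error is no more than `margin`."""
    z_crit = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    n_exact = p_prior * (1 - p_prior) * (z_crit / margin) ** 2
    return math.ceil(n_exact)          # always round UP, never to nearest

n_ex4 = sample_size_proportion(0.02, 0.95, 87 / 605)   # red M&Ms, E = 2%
n_ex5 = sample_size_proportion(0.03, 0.90)             # poll, no prior estimate
n_ex4_no_prior = sample_size_proportion(0.02, 0.95)    # M&Ms without the prior

print(n_ex4, n_ex5, n_ex4_no_prior)
```

The third call shows why a prior estimate is worth having: without one, the required M&Ms sample roughly doubles.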
Numeric data are pretty much the same deal as binomial data, though there are a couple of wrinkles:
The second one is a problem, because you almost never know the standard deviation of the population. Therefore, we won’t be working any problems for this case. Instead, I’ll give you a little more theory to lay the groundwork for the next section, which explains how we get around this knowledge gap.
If you know the standard deviation of the population — and you hardly ever do — then your confidence interval is
x̅ − z_{α/2} · σ/√n ≤ μ ≤ x̅ + z_{α/2} · σ/√n
If you’re ever in this situation, you can compute a confidence interval on your TI-83/84 by choosing ZInterval in the STAT TESTS menu.
The margin of error is E = z_{α/2} · σ/√n, so the required sample size for a margin of error E with confidence level 1−α is
n = [ z_{α/2} · σ / E]²
(You can also use MATH200A part 5.)
“Houston, we have a problem!” A confidence interval is founded on the sampling distribution of the mean or proportion. Everything in Chapter 8 on the sampling distribution of the mean was based on knowing the standard deviation of the population. But you almost never know the standard deviation of the population. How to resolve this?
The solution comes from William Gosset, who worked for Guinness in Dublin as a brewer. (I swear I am not making this up.) In 1908 he published a paper called The Probable Error of a Mean [see “Sources Used” at end of book]. For competitive reasons, the Guinness company wouldn’t let him use his own name, and he chose the pen-name “Student”. The t distribution that he described in his paper has been known as Student’s t ever since.
While looking for Gosset’s original paper, I stumbled on Probable Error of a Mean, The (“Student”) (Moulton [see “Sources Used” at end of book]). It’s a fascinating look at what Gosset did and didn’t accomplish, and how this classic paper was virtually ignored for years. Things didn’t start to happen till Gosset sent a copy of his tables to R. A. Fisher with the remark that Fisher was the only one who would ever use them! It was Fisher who really got the whole world using Student’s t distribution.
Gosset knew that the standard error of the mean is σ/√n, but he didn’t know σ. He wondered what would happen if he estimated the standard error as s/√n, and did some experiments to answer that question. Since s varies from one sample to the next, this new t distribution spreads out more than the ND. Its peak is shallower, and its tails are fatter.
Actually, there’s no such thing as “the” t distribution. There’s a different t for each sample size. The larger the sample, the closer that t distribution is to a normal distribution, but it’s never quite normal.
For technical reasons, t distributions aren’t identified by sample size, but rather by degrees of freedom (symbol df or Greek ν, “nu”). df = n−1. Here are two t distributions:
[Figure: standard normal distribution (solid) compared with Student’s t for df = 4, n = 5]
[Figure: standard normal distribution (solid) compared with Student’s t for df = 29, n = 30]
What do you see? Student’s t for 4 degrees of freedom is quite a bit more spread out than the ND: 12.2% of sample means are more than two standard errors from the mean, versus only 5% for the ND.
At this scale, Student’s t for 29 degrees of freedom looks identical to the ND, but it’s not quite the same. You can see that 6% of sample means are more than two standard errors from the mean, versus 5% for the ND.
You don’t really need a list of properties of Student’s t, because your calculator is going to do the work for you. It’s enough to know this:
The logic of confidence intervals for numeric data is the same whether you know the standard deviation of the population or not. Even the requirements are the same. The only difference is between using a z and a t.
x̅ − t_{α/2} · s/√n ≤ μ ≤ x̅ + t_{α/2} · s/√n (1-α confidence)
(It’s understood that you have to use the right number of degrees of freedom, df = n−1, in finding critical t.)
Example 6: You’re auditing a bank.
You take a random sample of 50 cash deposits and find a mean of $189.56
and standard deviation of $42.17.
(a) Estimate the mean of all cash deposits, with 95% confidence.
(b) The bank’s accounting department tells you that the
average cash deposit is over $210.00. Is that believable?
Solution: You want to compute a confidence interval about the mean of all deposits. You have numeric data, and you don’t know the standard deviation of the population, σ. This is Case 1 in Inferential Statistics: Basic Cases. In your sample, n = 50, x̅ = 189.56, and s = 42.17.
First, check the requirements (RC):
(Since the sample is large enough, there’s no need to verify normality or check for outliers.)
Now calculate the interval.
On your TI-83/84, in the STAT TESTS menu, select 8:TInterval. The difference between Data and Stats is whether you have all the data points, or just summary statistics. In this case you have only the stats, so cursor onto Stats and press [ENTER]. (The lower part of the screen may change.)
Enter your sample statistics and your desired confidence level. Write down your inputs before you select Calculate:
TInterval 189.56, 42.17, 50, .95
Proceed to the output screen, and write down everything new. There isn’t much:
(177.58, 201.54)
Finally, write your interpretation. I’m 95% confident that the average of all cash deposits is between $177.58 and $201.54.
Caution! Don’t say anything like “95% of deposits are between $177.58 and $201.54.” Your confidence interval is an estimate of the true average of all deposits, and it’s not about the individual deposits. With a standard deviation of $42 and change, you would predict that 95% of deposits are within 2×42.17 = $84.34 either side of the mean, which is a much wider interval.
Now turn to part (b). Management claims that the average of all cash deposits is > $210.00. Is that believable? Well, it’s not impossible, but it’s unlikely. You’re 95% confident that the average of all deposits is between $177.58 and $201.54, which means you’re 95% confident that it’s not < $177.58 or > $201.54. But they’re claiming $210, which is outside your confidence interval. Again, they’re unlikely to be correct — there’s less than a 5% likelihood (100%−95% = 5%).
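If you’re curious how a t interval could be computed without a calculator, here is a rough Python sketch. Python’s standard library has no t distribution, so this sketch finds critical t numerically (Simpson’s rule for the area, then bisection); that is an assumption of this example, not necessarily how the TI-83/84 does it, and all the function names are illustrative.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """Area to the left of x (for x >= 0), by Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(conf_level, df):
    """Critical t with area (1 - conf_level)/2 in each tail, by bisection."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 100.0                # assumes the answer is below 100
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def t_interval(mean, sd, n, conf_level):
    """Confidence interval for a population mean from summary statistics."""
    margin = t_critical(conf_level, n - 1) * sd / math.sqrt(n)
    return mean - margin, mean + margin

# Bank deposits: x-bar = 189.56, s = 42.17, n = 50, 95% confidence
low, high = t_interval(189.56, 42.17, 50, 0.95)
print(f"95% CI: ${low:.2f} to ${high:.2f}")

claim_is_inside = low <= 210.00 <= high
print("Is $210.00 inside the interval?", claim_is_inside)
```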
Example 7: In a random sample from the 237 vehicles on a used-car lot, the following weights in pounds were found:
2500 3250 4000 3500 2900 4500 3800 3000 5000 2200
Estimate the average weight of vehicles on the lot, with 90% confidence.
Solution: Check the requirements first. You have a small sample (n < 30), so you have to verify that the data are ND and there are no outliers. Here are the results of normality check and box-whisker plot in MATH200A:
There’s not much you can write for the box-whisker plot, but you can show the normality test numerically:
Now proceed to your TInterval. This time you have the actual data, so you choose Data on the screen. Specify your data list. Freq (frequency) should already be set to 1; if not, first press the [ALPHA] key once, and then [1] [ENTER].
Enter your confidence level, and write down your inputs:
Enter your confidence level, and write down your inputs:
TInterval L1, 1, .90
When you have raw data, everything on the output screen is new:
(2956, 3974)
x̅ = 3465, s = 878.1, n = 10
You’re 90% confident that the average weight of all vehicles on the lot is between 2956 and 3974 pounds.
Again, this is an estimate of the average weight of the population (the 237 cars on the lot). In your interpretation, you can’t say anything about the weights of individual vehicles, because you don’t know anything about the weights of individual vehicles, apart from your sample.
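When you have raw data, the sample statistics come first and the interval second. The Python sketch below follows that order for the vehicle weights. To keep it short, it takes critical t for 9 degrees of freedom and 90% confidence as 1.8331, a value from a standard t table that this example supplies (the chapter itself doesn’t quote it).

```python
import math
from statistics import mean, stdev

weights = [2500, 3250, 4000, 3500, 2900, 4500, 3800, 3000, 5000, 2200]

T_CRIT = 1.8331           # t table value for df = 9, 90% confidence (assumed)

n = len(weights)
x_bar = mean(weights)     # sample mean
s = stdev(weights)        # sample standard deviation (divides by n - 1)

margin = T_CRIT * s / math.sqrt(n)
low, high = x_bar - margin, x_bar + margin

print(f"x-bar = {x_bar}, s = {s:.1f}, n = {n}")
print(f"90% CI: {low:.0f} to {high:.0f} pounds")
```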
Why do you have to check for outliers? If your sample passes the normality check, isn’t that enough? No! If a sample passes the normality check, it still might have outliers.
No sample is perfectly normal, so you’re not actually deciding “is it normal or not?” Instead, you’re finding the strength of evidence against normality. The smaller r is, the stronger the evidence against a ND. If r < crit, the evidence is so strong that you say the data are non-normal. But if r > crit, you can’t say that the data are definitely normal, only that you can’t rule out a ND based on this test. But outliers make the evidence against the normal model too strong, so if outliers are present then you can’t treat the data as normal.
This “fail to prove” is similar to what you saw in Chapter 4 with decision points: you could prove that the correlation was non-zero, but you couldn’t prove that it was zero. Starting in Chapter 10, you’ll see that this is how inferential statistics works whenever you’re testing some proposition.
Why are outliers a problem? Well, your confidence interval depends on the mean and standard deviation of your sample. But x̅ and s are sensitive to outliers. (That sensitivity goes down as sample size goes up, so you don’t have to worry with samples bigger than about 30.)
To make this clearer, let’s look at an example. I drew these 15 points from a moderately skewed population:
157 171 182 189 201 208 217 219
229 242 247 252 265 279 375
The normality test shows r > crit. So far so good. But the box plot shows a big honkin’ outlier:
How big a difference does it make? Quite a lot, unfortunately. Here are the 95% confidence intervals for the original sample, and the sample with the outlier removed. The means are different, the standard deviations are really different, and the high ends of the confidence intervals are pretty different too. (The screens don’t show the margins of error, but they too are quite different: (258.45-199.28)/2 = 29.6 and (239.36-197.5)/2 = 20.9.)
Do you say that the outlier increased the mean by almost 5% and the SD by almost 50%, moved the confidence interval and made it wider? That’s not really fair — the sample is what it is (assuming you’ve ruled out a mistake in data entry). If you start throwing out points, you no longer have a random sample. On the other hand, that one point does seem to carry an awful lot of weight, and it doesn’t seem right to have results depend so heavily on one point.
So what do you do? If you can, you take another sample, preferably a larger one. Larger samples are less likely to have outliers in the first place, and outliers that do occur have less influence on the results.
But taking a new sample may not be practical. An alternative — not really great, but better than nothing — is to do the analysis both ways, once with the full sample and once with the outlier(s) excluded. That will at least give a sense of how much the outliers affect the results.
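Doing the analysis both ways is easy to sketch in Python with the 15 points above. As an assumption of this example, the critical t values for 95% confidence (2.1448 for df = 14 and 2.1604 for df = 13) are taken from a standard t table rather than computed, and the function name is illustrative.

```python
import math
from statistics import mean, stdev

data = [157, 171, 182, 189, 201, 208, 217, 219,
        229, 242, 247, 252, 265, 279, 375]

def t_interval_from_data(sample, t_crit):
    """Return (x_bar, s, low, high) for a t interval on raw data."""
    n = len(sample)
    x_bar = mean(sample)
    s = stdev(sample)
    margin = t_crit * s / math.sqrt(n)
    return x_bar, s, x_bar - margin, x_bar + margin

# Critical t for 95% confidence, from a standard t table (assumed values)
T_DF14 = 2.1448           # full sample, n = 15
T_DF13 = 2.1604           # outlier removed, n = 14

full = t_interval_from_data(data, T_DF14)
trimmed = t_interval_from_data(data[:-1], T_DF13)   # drop the last point, 375

for label, (x_bar, s, low, high) in (("with outlier", full),
                                     ("without outlier", trimmed)):
    print(f"{label}: mean {x_bar:.2f}, SD {s:.2f}, CI {low:.2f} to {high:.2f}")
```

Printing the two results side by side makes it obvious how much weight that one point carries.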
Example 8: For the vehicle weights, your margin of error in a 90% CI was 3974−3465 = 509 pounds. How many vehicles would you need in your sample to get a 95% confidence interval with a margin of error of 500 pounds?
With the MATH200A program (recommended):
Solution: Run MATH200A part 5 for numeric data. When you enter the last piece of information, you’ll notice that the calculator takes several seconds to come up with an answer; this is normal because it has to do an iterative calculation (fancy words for trial and error).
Critical t for a 95% CI with 14 degrees of freedom (n = 15) is 2.14, larger than critical z of 1.96 because the t distribution is more spread out. But of course what you really care about is the bottom line: to keep margin of error no greater than 500 pounds in a 95% CI, you need to sample at least 15 vehicles.
How is this computed? Start with the margin of error and solve for sample size:
E = t_{α/2}·s/√n ⇒ n = [t_{α/2}·s/E]²
The problem here is that t_{α/2} depends on df, which depends on n, so you haven’t really isolated sample size on the left side. The only way to solve this equation precisely is by a process of trial and error, and that’s what MATH200A does.
If you’re not using the program:
What if you don’t have the program? Since t is not super different from the normal distribution, you can alter the above formula and use z in place of t: n = [z_{α/2}·s/E]². But the t distribution is more spread out than the normal (z) distribution, so your answer may be smaller than the actual necessary sample size. If you do that and you get > about 30, it’s probably nearly right for the t distribution. If your answer is small, you should increase it so that the TInterval doesn’t come out with too large a margin of error.
You calculate z_{α/2} exactly as you did in the sample-size formula for a confidence interval about a proportion. For example, with a 95% CI, 1−α = 0.95, α = 0.05, and α/2 = 0.025. z_{α/2} = z_{0.025} = invNorm(1−.025) = 1.9600, so using z for t you compute sample size [1.96·878.1/500]² = 11.8… → 12. That’s well under 30, so you want to bump it up a bit.
I’m deliberately glossing over this, because the program is a lot easier. But if you want more, check out Case 1 in How Big a Sample Do I Need? That page gives you all the details of the method, with worked-out examples.
At first glance, this procedure is less precise than the successive approximations done by MATH200A. But in fairness, there’s one more source of imprecision that neither method can avoid. Unlike binomial data, where small variations in the prior estimate p̂ made little difference to the computed sample size, for numeric data variations in the standard deviation do make a difference in computed sample size. Since s is squared in the formula, it can be a big difference. This can swamp any pettifogging details about t versus z.
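To make the trial and error concrete, here is a rough Python sketch. It is not the MATH200A program: it simply tries n = 2, 3, 4, … until the margin of error is small enough, finding critical t numerically (Simpson’s rule plus bisection, an assumption of this example because Python’s standard library has no t distribution). It also computes the z shortcut for comparison. All function names are illustrative.

```python
import math
from statistics import NormalDist

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=1000):
    """Area to the left of x (for x >= 0), by Simpson's rule on [0, x]."""
    h = x / steps
    total = t_pdf(0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(conf_level, df):
    """Critical t by bisection; assumes the answer is below 100."""
    target = 1 - (1 - conf_level) / 2
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def sample_size_mean(s, margin, conf_level):
    """Smallest n with t_{alpha/2} * s / sqrt(n) <= margin (trial and error)."""
    n = 2
    while t_critical(conf_level, n - 1) * s / math.sqrt(n) > margin:
        n += 1
    return n

s, E = 878.1, 500                      # vehicle weights, desired margin of error

n_t = sample_size_mean(s, E, 0.95)     # trial and error with t
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
n_z = math.ceil((z_crit * s / E) ** 2) # shortcut using z in place of t

print("Using t:", n_t)
print("Using z:", n_z)
```

The gap between the two answers is the “bump it up a bit” mentioned above.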
(The online book has live links.)
1-PropZInt to compute a CI estimate of the population proportion p. See Inferential Statistics: Basic Cases for the requirements.
TInterval to compute a CI estimate of the population mean μ. See Inferential Statistics: Basic Cases for requirements.
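If you’d like to see what a 1-PropZInt amounts to away from the calculator, here is a rough Python sketch using the aspirin numbers from the start of the chapter (104 heart attacks among 11,037 men). It assumes the usual normal-approximation formula, p̂ ± z_{α/2}·√(p̂(1−p̂)/n), and it assumes a 95% confidence level, which is my choice for the illustration. The function name is invented, and the sketch does not check the requirements for you.

```python
import math
from statistics import NormalDist

def one_prop_z_interval(x, n, conf=0.95):
    """CI for a population proportion from x successes in a sample of n."""
    p_hat = x / n                                   # point estimate
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # critical z
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n) # margin of error
    return p_hat - margin, p_hat + margin

low, high = one_prop_z_interval(104, 11037, 0.95)
print(f"{low:.2%} to {high:.2%}")
```

The point estimate of 0.94% sits in the middle of the interval, and the margin of error is the distance from the middle to either end.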
Write out your solutions to these exercises, showing your work for all computations. Then check your solutions against the solutions page and get help with anything you don’t understand.
Caution! If you don’t see how to start a problem, don’t peek at the solution — you won’t learn anything that way. Ask your instructor or a tutor for a hint. Or just leave it and go on to a different problem for now. You may find when you return to that “impossible” problem that you see how to do it after all.
The Neveready Company tested 40 randomly selected A-cell batteries to see how long they would operate a wireless mouse. They found a mean of 1756 minutes (29 hours, 16 minutes) and standard deviation (SD) 142 minutes. With 95% confidence, what’s the average life of all Neveready A cells in wireless mice?
You’re planning to conduct a poll about people’s attitudes toward a hot political issue, and you have absolutely no idea what proportion will be in favor and what proportion will be opposed. If you want a margin of error no more than 3.5% at 95% confidence, how large must your sample be?
The Department of Veterans Affairs is under fire for slow processing of veterans’ claims. An investigator for the Nightly Show randomly selected 100 claims (out of 68,917 at one office) and found that 40 of them had been open for more than a year. Find a 90% confidence interval for the proportion of all claims that have been open for more than a year.
For her statistics project, Sandra kept track of her commute times for 40 consecutive mornings (8 weeks). Treat this as a random sample. Her mean commute time was 17.7 minutes and her SD was 1.8 minutes. Find a 95% confidence interval for her average time on all commutes, not just this sample of 40.
Fifteen women in their 20s were randomly selected for health screening. As part of this, their heights in inches were recorded:
62.5 63 67 63.5 62 63 65 64.5 66.5 64.5 62.5 62 61.5 64.5 67.5
Construct a 95% confidence interval for the average height of women aged 20–29.
For his statistics project, Fred measured the body temperature of 18 randomly selected healthy male students. Here are his figures in °F:
98.3 97.7 98.6 98.5 97.5 98.6 98.2 96.9 97.9
96.9 97.8 99.3 98.6 99.2 96.9 97.8 97.9 98.3
(a) Write a 90% confidence interval for the average body temperature of healthy male students.
(b) What does this say about the famous “normal” temperature of 98.6°?
(c) What is his margin of error?
(d) To get an answer to within 0.1° with 95% confidence, how many students would he have to sample?
The Colorectal Cancer Screening Guidelines (CDC 2014 [see “Sources Used” at end of book]) recommend a colonoscopy every ten years for adults aged 50 to 75. A public-health researcher interviews a simple random sample of 500 adults aged 50–75 in Metropolis (pop. 6.4 million) and finds that 219 of them have had a colonoscopy in the past ten years.
(a) What proportion of all Metropolis adults in that age range have had a colonoscopy in the past ten years, at the 90% level of confidence?
(b) Still at the 90% confidence level, what sample size would be required to get an estimate within a margin of error of 2%, if she uses her sample proportion as a prior estimate?
The next year, you go back to audit the bank again. This time, you take a random sample of 20 cash deposits. Here are the amounts:
192.68 188.24 152.37 211.73 201.57 167.79 177.19 191.15 209.22 178.49 185.90 226.31 192.38 190.23 156.13 224.07 191.78 203.45 186.40 160.83
Construct a 95% confidence interval for the average of all cash deposits at the bank.
Not wanting to wait for the official results, Abe Snake commissioned an exit poll of voters. In a systematic sample of 1000 voters, 520 (52%) said they voted for Abe Snake. (14,000 people voted in the election.) That sounds good, but can he be confident of victory, at the 95% level?
Updates and new info: https://BrownMath.com/swt/