
Stats without Tears
Solutions for Chapter 3

Updated 11 June 2014 (What’s New?)
Copyright © 2013–2017 by Stan Brown



1 When the data set is skewed, the median is better. Outliers tend to skew a data set, so usually the median is a better choice when you have outliers.
2 15% of people have cholesterol equal to or less than yours, so yours is on the low end. Though you might not really celebrate by eating high-cholesterol foods, there is no cause for concern.
3 (a) It uses only the two most extreme values.
(b) It uses only two values, but they are not the most extreme, so it is resistant.
(c) It uses all the numbers in the data set.
(d) Any two of: It is in the same units as the original data, it can be used in comparing z-scores from different data sets, you can predict what percentage of the data set will be within a certain number of SD from the mean.
4 (a) s is standard deviation of a sample; σ is standard deviation of a population.
(b) μ is mean of a population; x̄ is mean of a sample.
(c) N is population size or number of members of the population; n is sample size or number of members of the sample.
5 You were 1.87 standard deviations above average. This is excellent performance. 1.87 is almost 2, and in a normal distribution, z = +2 would be better than 95+2.5 = 97.5% of the students. 1.87 is not quite up there, but close. (In Chapter 7, you’ll learn how to compute that a z-score of 1.87 is better than 96.9% of the population.)
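If you want to check the 96.9% figure now, here is a minimal Python sketch. It assumes the scores follow a normal distribution, and it uses Python's standard `statistics.NormalDist` rather than the calculator method taught in Chapter 7.

```python
from statistics import NormalDist

z = 1.87  # your z-score from the problem

# Fraction of a standard normal distribution that lies below z
fraction_below = NormalDist().cdf(z)

print(f"A z-score of {z} is better than {fraction_below:.1%} of the population")
```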
6 Since the weights are normally distributed, 99.7% (“almost all”) of them will be within three SD above and below the mean. 3σ above and below is a total range of 6σ. The actual range of “almost all” the apples was 8.50−4.50 = 4.00 ounces. 6σ = 4.00; therefore σ = 0.67 ounces.

Alternative solution: In a normal distribution, the mean is half way between the given extremes: μ = (4.50+8.50)/2 = 6.50. Then the distance from the mean to 8.50 must be three SD: 8.50−6.50 = 2.00 = 3σ; σ = 0.67 ounces.
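Both solutions are simple arithmetic, and a short Python sketch shows that they agree. The variable names are just for illustration.

```python
low, high = 4.50, 8.50  # weights (ounces) that bracket "almost all" the apples

# First solution: "almost all" spans 3 SD below to 3 SD above the mean, or 6 SD
sigma_from_range = (high - low) / 6

# Alternative solution: the mean is halfway between, and 8.50 is 3 SD above it
mu = (low + high) / 2
sigma_from_mean = (high - mu) / 3

print(round(sigma_from_range, 2), round(sigma_from_mean, 2))
```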

7
Ages       Midpoint (L1)   Frequency (L2)
20 – 29         25               34
30 – 39         35               58
40 – 49         45               76
50 – 59         55              187
60 – 69         65              254
70 – 79         75              241
80 – 89         85              147
(a) This is a grouped distribution, so you need the class midpoints, as shown in the table. Enter the midpoints in L1 and the frequencies in L2.

Caution! The midpoints are not midway between lower and upper bounds, such as (20+29)/2 = 24.5. They are midway between successive lower bounds, such as (20+30)/2 = 25.

1-VarStats L1,L2 (Check n first!)

x̄ = 63.85656971 → x̄ = 63.86

s = 15.43533244 → s = 15.44

n = 997

Common mistake: People tend to run 1-VarStats L1, leaving off the L2, which just gives statistics of the seven numbers 25, 35, …, 85. Always check n first. If you check n and see that n = 7, you realize that can’t possibly be right since the frequencies obviously add up to more than 7. You fix your mistake and all is well.
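To see what 1-VarStats L1,L2 is doing with the two lists, here is a minimal Python sketch of the same computation. The function name `grouped_stats` is hypothetical; the formulas are the ordinary frequency-weighted mean and the sample standard deviation with n − 1 in the denominator.

```python
from math import sqrt

def grouped_stats(midpoints, freqs):
    """Return (n, mean, sample SD) for a grouped frequency distribution."""
    n = sum(freqs)  # total number of data points, NOT the number of classes
    mean = sum(x * f for x, f in zip(midpoints, freqs)) / n
    # Each squared deviation counts once per data point in its class
    ss = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs))
    s = sqrt(ss / (n - 1))
    return n, mean, s

midpoints = [25, 35, 45, 55, 65, 75, 85]    # L1
freqs = [34, 58, 76, 187, 254, 241, 147]    # L2

n, mean, s = grouped_stats(midpoints, freqs)
print(n, round(mean, 2), round(s, 2))
```

As on the calculator, look at n first: if it came out as 7, the frequencies were left out.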

(b) You need the original data to make a boxplot, and here you have only the grouped data. A boxplot of a grouped distribution doesn’t show the shape of the data set accurately, because only class midpoints are taken into account. The class midpoints are good enough for approximating the mean and SD of the data, but not the five-number summary that is pictured in the boxplot.

8
Course            Credits (L2)   Grade   Quality Points (L1)
Statistics             3          A             4.0
Calculus               4          B+            3.3
Microsoft Word         1          C−            1.7
Microbiology           3          B−            2.7
English Comp           3          C             2.0
You need the weighted average, so put the quality points in L1 and the credits in L2. (No, you can’t do it the other way around. The quality points are the numeric forms of your grades, and you have to give them weights according to the number of credits in each course.)

1-VarStats L1,L2

n = 14 (This is the number of credits attempted. If you get 5, you forgot to include L2 in the command.)

x̄ = 2.93
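The weighted average is easy to check by hand or with a few lines of Python. In this sketch the list names are only illustrative: multiply each grade's quality points by its credits, add them up, and divide by the total credits.

```python
quality_points = [4.0, 3.3, 1.7, 2.7, 2.0]  # L1: numeric form of each grade
credits = [3, 4, 1, 3, 3]                   # L2: the weights

total_credits = sum(credits)
gpa = sum(q * c for q, c in zip(quality_points, credits)) / total_credits

print(total_credits, round(gpa, 2))
```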

9 You don’t have the individual quiz scores, but remember what the average means: it’s the total divided by the number of data points. If your quiz average is 86%, then on 10 quizzes you must have a total of 86×10 = 860 percentage points. If you need an 87% average on 11 quizzes, you need 11×87 = 957 percentage points. 957−860 = 97; you can still skip the final exam if you get a 97 on the last quiz.
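The same reasoning as a minimal Python sketch, with variable names chosen only for illustration:

```python
current_average = 86   # percent, over the quizzes taken so far
quizzes_taken = 10
target_average = 87    # percent, needed over all 11 quizzes

points_so_far = current_average * quizzes_taken         # 860
points_needed = target_average * (quizzes_taken + 1)    # 957
score_needed = points_needed - points_so_far

print(score_needed)
```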
10
(a)
Commute Distance, km
 0- 9     4
10-19    12
20-29     7
30-39     1
40-49     1
Total    25
(b) The class width is 10 (not 9). The class midpoints are 5, 15, 25, 35, 45 (not 4.5, 14.5, etc.).

(c) Class midpoints in one list such as L2 and frequencies in another list such as L3. This is a sample, so symbols are x̄, s, n, not μ, σ, N.
1-VarStats L2,L3
x̄ = 18.2 km
s = 9.5 km
n = 25

(d) Data in a list such as L1. 1-VarStats L1 gives x̄ = 17.6 km, Median = 17 km, s = 9.0 km, n = 25

(e) [Figure: box-whisker plot]

(f) Mean, because the data are nearly symmetric. Or, median, because there is an outlier.
Comment: The stemplot made the data look skewed, but that was just an artifact of the choice of classes. The boxplot shows that the data are nearly symmetric, except for that outlier. This is why the mean and median are close together. This is a good illustration that sometimes there is no uniquely correct answer. It’s why your justification or explanation is an important part of your answer.

(g) The five-number summary, from MATH200A part 2 [TRACE], is 1, 12, 17, 22.5, 45. There is one outlier, 45.
(The five-number summary includes the actual min and max, whether they are outliers or not.)

11 [Figure: normal curve, shaded from 500 to 700 (z = 0 to 2), auxiliary line at z = −2, central area 95%, shaded area 47.5%]

Since 500 equals the mean, its z score is 0. For 700, compute the z score as z = (700−500)/100 = 2. So you need the probability of data falling between the mean and two SD above the mean. Make a sketch and shade this area.

Draw an auxiliary line at z = −2. You know that the area between z = −2 and z = +2 is 95%, so the area between z = 0 and z = 2 is half that, 47.5% or 0.475.
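As a cross-check, here is a minimal Python sketch that compares the Empirical Rule answer with the area computed from the normal curve itself. It assumes a normal distribution with mean 500 and SD 100, as in the problem, and uses Python's standard `statistics.NormalDist`. The computed area comes out slightly larger, because the 95% in the rule is a rounded figure.

```python
from statistics import NormalDist

scores = NormalDist(mu=500, sigma=100)

# Empirical Rule: 95% lies within 2 SD of the mean, so half of that lies above it
empirical = 0.95 / 2

# Area under the normal curve between 500 and 700
computed = scores.cdf(700) - scores.cdf(500)

print(f"Empirical Rule: {empirical:.3f}   computed: {computed:.4f}")
```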

12 To compare apples and oranges, compute their z scores:

zJ = (2070−1500)/300 = 570/300 = 1.90

zM = (129−100)/15 = 29/15 = about 1.93

Because she has the higher z score, according to the tests Maria is more intelligent.

Remark: The difference is very slight. Quite possibly, on another day Jacinto might do slightly better and Maria slightly worse, reversing their ranking.
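The two z scores can be computed with one small helper. This is only a sketch: the function name `z_score` is hypothetical, and the means and SDs are the ones used in the calculation above.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

z_jacinto = z_score(2070, mean=1500, sd=300)
z_maria = z_score(129, mean=100, sd=15)

print(round(z_jacinto, 2), round(z_maria, 2))
print("Higher z score:", "Maria" if z_maria > z_jacinto else "Jacinto")
```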

13
Test Scores      Frequencies, f (L2)   Class Midpoints, x (L1)
470.0–479.9              15                    475.0
480.0–489.9              22                    485.0
490.0–499.9              29                    495.0
500.0–509.9              50                    505.0
510.0–519.9              38                    515.0
Start with the class marks or midpoints, as shown in the table. (Class midpoints are halfway between successive lower bounds: (470+480)/2 = 475. You can’t calculate them between lower and upper bounds, (470+479.9)/2 = 474.95.)

Put class midpoints in a list, such as L1, and frequencies go in another list, such as L2. (Either label the columns with the lists you use, as I did here, or state them explicitly: “class marks in L1, frequencies in L2”.)

1-VarStats L1,L2 (Always write down the command that you used.)

(a) n = 154

(b) x̄ = 499.81 (before rounding, 499.8051948)

(c) s = 12.74 (before rounding, 12.74284519)

Be careful with symbols. Use the correct one for sample or population, whichever you have.

Common mistake: The SD is 12.74 (Sx), not 12.70 (σ), because this is a sample and not the population.
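To see where the two SD values come from, here is a minimal Python sketch. It expands the grouped table into one midpoint per data point (an approximation, since the original scores are not available) and then applies Python's standard `statistics.stdev` (sample, divides by n − 1) and `statistics.pstdev` (population, divides by n).

```python
from statistics import mean, pstdev, stdev

class_midpoints = [475.0, 485.0, 495.0, 505.0, 515.0]  # L1
frequencies = [15, 22, 29, 50, 38]                     # L2

# Repeat each midpoint as many times as its frequency
expanded = [x for x, f in zip(class_midpoints, frequencies) for _ in range(f)]

n_scores = len(expanded)
x_bar = mean(expanded)
sample_sd = stdev(expanded)       # Sx on the calculator: use this for a sample
population_sd = pstdev(expanded)  # the calculator's σx: only for a whole population

print(n_scores, round(x_bar, 2), round(sample_sd, 2), round(population_sd, 2))
```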

14 The mean is much greater than the median. This usually means that the distribution is skewed right, like incomes at a corporation.


Updates and new info: https://BrownMath.com/swt/
