
Stats without Tears
6. Discrete Probability Models

Updated 18 Jan 2017 (What’s New?)
Copyright © 2013–2017 by Stan Brown


Intro: In Chapter 5, you looked at the probabilities of specific events. In this chapter, you’ll take a more global view and look at the probabilities of all possible outcomes of a given trial.


6A.  Random Variables

The random variable is one of the main concepts of statistics, and we’ll be dealing with random variables from now till the end of the course.


A variable is “the characteristic measured or observed when an experiment is carried out or an observation is made.”

Upton and Cook (2008, 401) [see “Sources Used” at end of book]

If the results of that procedure depend on chance, completely or partly, you have a random variable. Each outcome of the procedure is a value of the variable. We use a capital letter like X for a variable, and a lower-case letter like x for each value of the variable.

As you learned in Chapter 1, numeric variables can be discrete or continuous. A discrete random variable can have only specific values, typically whole numbers. A continuous random variable can have infinitely many values, either across all the real numbers or within some interval.

In this chapter, you’ll be concerned with discrete random variables. In the next chapter, you’ll look at one particular type of continuous random variable, the normal distribution.

Example 1: You roll three dice. The number of sixes that appear is a random variable, and the total number of spots on the upper faces is another random variable. These are both discrete.

Example 2: You randomly select a household and ask the family income for last year. This is a continuous random variable.

Example 3: You randomly select twelve TC3 students, measure their heights, and take the average. “Height of a student” is a continuous random variable, and “average height in a 12-student sample” is another continuous random variable.

Example 4: You randomly select 40 families and ask the number of children in each. “Number of children in family” is a discrete random variable, and “average number of children in a sample of 40 families” is a continuous random variable.

6B.  Discrete Probability Distributions

Definition: A discrete probability distribution or DPD (also known as a discrete probability model) lists all possible values of a discrete random variable and gives their probabilities. The distribution can be shown in a table, a histogram, or a formula. Like any probabilities, the probabilities in a DPD can be determined theoretically or experimentally.
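To make "determined theoretically" concrete, here is a minimal Python sketch (Python and all the variable names are choices made for this illustration, not something the chapter uses). It builds the DPD for the number of sixes in a roll of three dice, as in Example 1, by listing every equally likely roll.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Enumerate all 6**3 = 216 equally likely rolls of three dice.
rolls = list(product(range(1, 7), repeat=3))

# X = number of sixes in a roll; tally how many rolls give each value of X.
counts = Counter(roll.count(6) for roll in rolls)

# Turn the tallies into exact probabilities, one per possible value x.
sixes_dpd = {x: Fraction(n, len(rolls)) for x, n in sorted(counts.items())}

for x, prob in sixes_dpd.items():
    print(f"P({x}) = {prob} = {float(prob):.4f}")

print("total probability:", sum(sixes_dpd.values()))
```

The dictionary plays the role of the table: one entry per possible value, and the probabilities total exactly 1.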

Prize          Value, x    Chance of Winning, P(x)
Two Camaros    $100,000    1 in 5,000,000
Cash           $10,000     1 in 1,000,000
Apple iPad     $1,000      1 in 500,000
Various        $500        1 in 250,000
Gift card      $5          0.9999928

Example 5: In March 2013, Royal Auto sent me one of those “Win big!” flyers with a fake car key taped to it. The various prizes, and chances of winning, are shown in the table above.

This is a discrete probability distribution. The discrete variable X is “prize value”, and the five possible values of X are $100,000 down to $5.

Remember the two interpretations of probability: probability of one = proportion of all. From the table, you can equally well say that any person’s chance of winning a $500 prize is 1/250,000 = 0.000 004 = 0.0004%, or that in the long run 0.0004% of all the people who participate in the promotion will win a $500 prize.

A discrete probability distribution must list all possible outcomes. The total probability for all possible outcomes in any situation is 1. Therefore, for any discrete probability distribution, the probabilities must add up to 1 or 100%.

6B1.  Mean and Standard Deviation of a DPD

Definitions: Suppose you do a probability experiment a lot of times. (For the Royal Auto example, suppose bazillions of people show up to claim prizes.) Each outcome will be a discrete value. The mean of the discrete probability distribution, μ, is the mean of the outcomes from an indefinitely large number of trials, and the standard deviation of the discrete probability distribution, σ, is the standard deviation of the outcomes from an indefinitely large number of trials. The mean of any probability distribution is also called the expected value, because it’s the expected average outcome in the long run.

How do you find the mean and SD of a discrete probability distribution? Well, one interpretation of probability is long-term relative frequency, so you can treat a discrete probability distribution as a relative frequency distribution. (You can also think of the probabilities as weights, with the mean as the weighted average.) On the TI-83/84, that means good old 1-Var Stats, just like in Chapter 3.

Textbooks all list the formulas, so if you want to know them here they are. But in fact everybody uses software except in the simplest cases.

μ = ∑ x·P(x)     σ = √[ ∑ (x²·P(x)) − μ²]

For ∑, see ∑ Means Add ’em Up in Chapter 1.

Example 6: To find the mean and SD of the distribution of winnings in the Royal Auto sweepstakes, put the x’s in one list and the P(x)’s in another list. Caution: When the probability is a fraction, enter the fraction, not an approximate decimal. The calculator will display an approximate decimal, but it will do its calculations on a much more precise value.

After entering the x’s and p’s, press [STAT] [►] [1] and specify your two lists, such as 1-Var Stats L1,L2. (Yes, the order matters: the x list must be first and the P(x) list second.) When you get your results, check n first. In a discrete probability distribution, n represents the total of the probabilities, so it must be exactly 1. If it’s just approximately 1, you made a mistake in entering your probabilities.

[Figures: TI-83 Stat Edit input screen; TI-83 output screen, see text]

The mean of the distribution is μ = $5.03, and the standard deviation is σ = $45.85.

Interpretation: In the long run, the dealership will have to pay out $5.03 per person in prizes. The SD is a little harder to get a grasp on, but notice that it’s more than nine times the mean. This tells you that there is a lot of variability in outcome from one person to the next. In general, the mean tells you the long-term average outcome, and the SD tells you the unpredictability of any particular trial. You can look at the SD as a measure of risk.
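For readers working without a TI-83/84, the two formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `dpd_mean_sd` and the dictionary layout are made up here, and the probabilities come from the Royal Auto table. Exact fractions stand in for the chapter's advice to enter fractions rather than rounded decimals.

```python
from fractions import Fraction
from math import sqrt

def dpd_mean_sd(dist):
    """Return (mu, sigma) for a dict mapping each value x to its P(x)."""
    total = sum(dist.values())
    if total != 1:  # same idea as checking n on the calculator
        raise ValueError(f"probabilities add up to {total}, not 1")
    mu = sum(x * p for x, p in dist.items())
    variance = sum(x * x * p for x, p in dist.items()) - mu ** 2
    return float(mu), sqrt(variance)

# Royal Auto prizes, with each chance entered as an exact fraction.
royal_auto = {
    100_000: Fraction(1, 5_000_000),
    10_000: Fraction(1, 1_000_000),
    1_000: Fraction(1, 500_000),
    500: Fraction(1, 250_000),
    5: Fraction(9_999_928, 10_000_000),
}

mu, sigma = dpd_mean_sd(royal_auto)
print(f"mu = ${mu:.2f}, sigma = ${sigma:.2f}")  # mu = $5.03, sigma = $45.85
```

The `ValueError` is deliberately strict: a total that is only approximately 1 signals a data-entry mistake, just as it does on the calculator.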

A couple of notes about the calculator output: The calculator knows that a DPD is a population, so it gives you σ and not s for the SD. It should give you μ for the mean, but instead it displays x̄, so you need to make the change. I’ve already mentioned that the sum of the probabilities (n) must be exactly 1, not just approximately 1.

6B2.  Comparing DPDs: Parking Choices

Example 7: When visiting the city, should you park in a lot or on the street? On a quarter of your visits (25%), you park for an hour or less, which costs $10 in a lot; for parking more than an hour they charge a flat $14. If you park on the street, you might receive a simple $30 parking ticket (p = 20%), or a $100 citation for obstruction of traffic (p = 5%), but of course you might get neither. Which should you do?

(Adapted from Paulos 2004 [see “Sources Used” at end of book].)

You have two probability models here, one for the outcomes of parking in a lot, and one for street parking. Begin by putting the two models into tables:

Parking in lot        x       P(x)
≤ 1 hour              $10     0.25
> 1 hour              $14

Parking on street     x       P(x)
Parking ticket        $30     0.20
Obstruction ticket    $100    0.05
No ticket

The problem leaves out some things that you can figure for yourself. Remember that every probability model includes all outcomes, and the probabilities add up to 1. If there’s a 25% chance of parking up to an hour, there must be a 100−25 = 75% chance of parking more than an hour. And on the street, if you have a 20+5 = 25% chance of getting some kind of ticket, you have a 100−25 = 75% chance of getting neither. The cost of getting neither ticket is zero.

Now you can fill in the empty cells in the tables.

Parking in lot        x       P(x)
≤ 1 hour              $10     0.25
> 1 hour              $14     0.75
Total                         1.00

Parking on street     x       P(x)
Parking ticket        $30     0.20
Obstruction ticket    $100    0.05
No ticket             $0      0.75
Total                         1.00

I showed the total probability to emphasize that it’s 1. Never compute the total of the outcomes (x’s), because that wouldn’t mean anything.

How do these tables help you make up your mind where to park? By themselves, they don’t. But they let you compute μ and σ, and that will help you decide.

I placed the x’s and P(x)’s for the parking lot in L1 and L2, and did 1-Var Stats L1,L2. I placed the x’s and P(x)’s for street parking in L3 and L4 and did 1-Var Stats L3,L4. Here are the results:

[Figure: parameters for parking lot: mean=13, SD=1.732050808]
[Figure: parameters for street parking: mean=11, SD=23.64318084]

As always, look first at n. If it’s not exactly 1, find your mistake in entering the probabilities.

Now you can interpret these results. Parking in the lot is a bit more expensive in the long run (μ = $13.00 per day versus μ = $11.00 per day). But there are no nasty surprises (σ = $1.73, little variation from day to day). Parking on the street is much riskier (σ = $23.64), meaning that what happens today can be wildly different from what happened yesterday.
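One way to get a feel for "what happens today can be wildly different from what happened yesterday" is to simulate a long run of visits. The sketch below is a Monte Carlo illustration in Python, not the chapter's method; the seed, the number of simulated days, and the variable names are arbitrary choices.

```python
import random
from statistics import mean, pstdev

rng = random.Random(2017)   # fixed seed so the run is repeatable
days = 100_000

# Street parking: possible cost on one visit and its probability.
street_costs = [30, 100, 0]
street_probs = [0.20, 0.05, 0.75]
street = rng.choices(street_costs, weights=street_probs, k=days)

# Parking lot: $10 for an hour or less, $14 otherwise.
lot = rng.choices([10, 14], weights=[0.25, 0.75], k=days)

print("first ten street days:", street[:10])
print(f"street: mean {mean(street):.2f}, SD {pstdev(street):.2f}")
print(f"lot:    mean {mean(lot):.2f}, SD {pstdev(lot):.2f}")
```

The simulated means and SDs should land close to the calculator's values, while the first ten street days show how a typical run mixes many $0 days with an occasional fine.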

So what should you do? Statistics can give you information, but part of your decision is always your own temperament. If you like stability and predictability — if you are risk averse — you’ll opt for the parking lot. If it’s more important to you to save $2 a day on average, and you can accept occasionally getting hit with a nasty fine, you’ll choose to park on the street.

6B3.  Fair Price of a Game

Definitions: The fair price of a game is the price that would make all parties come out even in the long run. (We’re not just talking traditional games here. A game is any activity where the participants stand to gain or lose money or something else of value. Usually chance contributes to the outcome, but not necessarily.)

The fair price of a game is the price that would make the expected value or mean value of the probability distribution equal to zero, the break-even point.

(“Fair price” is one of those math words that look like English but mean something different. You should expect to pay more than the fair price because the operator of the game — the insurance company or casino or stockbroker — also has to cover selling and administrative expenses.)

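There are two ways to compute the fair price:

  • Method 1: Multiply each prize by its probability of winning, and add up the products.
  • Method 2: Find the mean μ of the probability distribution at the actual price, and add μ to the actual price.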

Die shows    x                P(x)
6            $60−12 = $48     1/6
Other        −$12             5/6
Total                         6/6 = 1

Example 8: Take a really simple bar game: a stranger offers to pay you $60 if you roll a 6 with a standard six-sided die, but you have to pay him $12 per roll. Find the fair price of this game.

Method 1: The only prize is $60, and you have a 1/6 chance of winning it. $60×(1/6) = $10.

Method 2: Amounts in L1, probabilities in L2; 1-Var Stats L1,L2. Verify that n=1, and read off the mean of −$2. The actual price is $12, so the fair price is $12 + (−$2) = $10.

Naturally, the two methods always give the same answer. Method 2 is easier if you already know the mean of the probability distribution; otherwise Method 1 is easier.

Example 9: A lottery has a $6,000,000 grand prize with probability of winning 1 in 3,000,000. It also has a $10 consolation prize with probability of winning 1 in 1000. What is the fair price of your $5 lottery ticket?

Solution: You don’t need μ, so Method 1 is easier: multiply each prize by its probability and add up the products. $6,000,000×(1/3,000,000) + $10×(1/1000) → fair price is $2.01.
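Both methods are easy to check in code. The following Python sketch is an illustration only: the function names are invented here, and the losing outcome of the die game (−$12 with probability 5/6) is inferred from the rules of Example 8.

```python
from fractions import Fraction

def fair_price_from_prizes(prizes):
    """Method 1: add up prize * probability over (prize, probability) pairs."""
    return sum(prize * p for prize, p in prizes)

def fair_price_from_mean(actual_price, net_outcomes):
    """Method 2: actual price plus the mean of the player's net outcomes."""
    mu = sum(x * p for x, p in net_outcomes)
    return actual_price + mu

# Example 8: win $60 on a six, pay $12 per roll.
die_method_1 = fair_price_from_prizes([(60, Fraction(1, 6))])
die_method_2 = fair_price_from_mean(
    12, [(60 - 12, Fraction(1, 6)), (-12, Fraction(5, 6))]
)

# Example 9: the two lottery prizes.
lottery = fair_price_from_prizes(
    [(6_000_000, Fraction(1, 3_000_000)), (10, Fraction(1, 1000))]
)

print(die_method_1, die_method_2, float(lottery))
```

Using `Fraction` keeps the arithmetic exact, so the two methods agree to the penny rather than to within rounding.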

Why does a lottery ticket that is worth $2.01 actually cost $5.00? In effect, the lottery is paying out about 2.01/5.00 ≈ 40% of ticket sales in prizes. Some of the 60% that the lottery commission keeps will cover the lottery’s own expenses, and the rest is paid to the state treasury. This is actually fairly typical: most lotteries pay out in prizes less than half of what they take in. By contrast, the illegal “numbers game” pays out about 70%, or at least it did in the 1980s in Cleveland. (Don’t ask me how I know that!)

6C.  Bernoulli Trials

In the examples so far of probability models, I’ve had to give you a table of probabilities. But there are many subtypes of discrete probability distribution where the probabilities can be calculated by a formula. The rest of this chapter will look at part of one family, discrete probability distributions that come from Bernoulli trials.


Repeated trials of a process or an event are called Bernoulli trials if they have both of these characteristics:

  1. Each trial has only two possible outcomes. We call those “success” and “failure”. However, “success” is not necessarily a desirable outcome. Success simply means the outcome you’re interested in, and failure is the other outcome.
  2. The probability of success, denoted p, is the same for every trial. This is another way of saying that the trials are independent. (Even if they’re not independent, you can usually treat the trials as independent if the sample is a small part of the population, not more than about 10%.)

If the probability of success on each trial is p, then the probability of failure on each trial is 1−p, or q for short.

Bernoulli trials are named after Jacob Bernoulli, a Swiss mathematician. He developed the binomial distribution, which you’ll meet later in this chapter.

Example 10: You randomly interview 30 people to find out which party they will vote for in the next election. These are not Bernoulli trials, because there are more than two possible outcomes. (New York State ballots often have six or more parties listed, though some parties just endorse the Republican or Democratic candidate.)

Example 11: On reflection, you realize that you don’t care which party a given voter will choose. All you care about is whether they are voting for your candidate or not, so you randomly select 30 registered voters and ask, “Will you be voting for Abe Snake for President?” (Yes, that’s a real thing.) These are Bernoulli trials, because there are only two answers, and the probability of voting for Abe Snake is the same for each randomly selected person. (p equals the proportion of Abe Snake voters in the population. Remember, proportion of all = probability of one.)

Actually, this overlooks the undecided or “swing” voters. These become fewer as the election gets closer, but in real life they can’t be overlooked because they may be a larger proportion than the leading candidate’s lead.

Example 12: You draw cards from a deck until you get a heart. These are not Bernoulli trials. Although there are only two outcomes, heart and other suit, the probability changes with each draw because you have removed a card from the deck.

Variation: You replace each card and reshuffle the deck before drawing the next card. Then these become Bernoulli trials because the probability of drawing a heart is 25% on every trial.

Variation: You have five decks shuffled together, instead of one 52-card pack. You don’t replace cards after drawing them. You can treat these as Bernoulli trials even without replacement, because you won’t be drawing enough cards to alter the probabilities significantly.

How do I know? Five packs is 260 cards, and 10% of 260 is 26. On the first card, P(heart) = 25%. It’s quite unlikely that you’d have no hearts by the 26th card (0.04% chance), but if you did, the probability of a heart on the 27th card would be: 5×13/(5×52−26) ≈ 27.8%. That’s not much different from the original 25%.

(You don’t have to take my word for these probabilities. Use the sequences method from Chapter 5 to compute them.)
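If you would rather let a computer multiply out the sequence, here is a small Python sketch of the five-deck calculation. The variable names are made up for this illustration; the numbers (five decks, 26 draws) come from the paragraph above.

```python
from fractions import Fraction

decks = 5
cards = 52 * decks     # 260 cards in the combined pack
hearts = 13 * decks    # 65 of them are hearts
draws = 26             # 10% of the pack

# Sequence of 26 non-hearts in a row, drawn without replacement:
# (195/260) * (194/259) * ... one factor per card drawn.
p_no_hearts = Fraction(1)
for i in range(draws):
    p_no_hearts *= Fraction(cards - hearts - i, cards - i)

# If that happened, all 65 hearts remain among the 234 cards left.
p_heart_on_27th = Fraction(hearts, cards - draws)

print(f"P(no hearts in first {draws} cards) = {float(p_no_hearts):.2%}")
print(f"P(heart on card 27 | no hearts so far) = {float(p_heart_on_27th):.1%}")
```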

Although this sample without replacement violates independence, it doesn’t violate it by very much, not enough to worry about. This bears out what I said earlier: Trials without replacement can still be treated as independent when the sample is small relative to the population.

6D.  The Geometric Model

Example 13: According to the AVMA (2014) [see “Sources Used” at end of book] 30.4% of US households own one or more cats. Suppose you randomly select some households.
(a) How likely is it that the first time you find cat owners is in the fifth household?
(b) How likely is it that your first cat-owning household will be somewhere in the first five you survey?

Although you could compute these individual probabilities using techniques from Chapter 5, there’s a specific model called the geometric model that makes it a lot easier to compute. Also, using the geometric model you can get an overview of the probabilities for various outcomes, which you’d miss by computing probabilities of specific events using the previous chapter’s techniques. If trials are independent, and you want the probability of a string of failures before your first success, you’re using a geometric model.


The geometric model, also known as the geometric probability distribution, is a kind of discrete probability distribution that applies to Bernoulli trials when you try, and try, and try again until you get a success. P(x) is the probability that your first success will come on your xth attempt, after x−1 failures.

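Expanding on the definition of Bernoulli trials, you can say that a geometric model is one where each trial has only two possible outcomes, the probability of success p is the same on every trial, and you keep making trials until you get your first success.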

The probability of success on any given trial, p, completely describes a geometric model.

[Figure: geometric distribution for p=0.304]

Here’s a picture of part of the geometric model for cat-owning households, with p = 0.304.

How do you read this? The horizontal axis is x, the number of the trial that gives your first success, and the vertical axis is P(x), the probability of that outcome.

For example, there’s a hair over a 30% chance that you’ll find cat owners in your first household, P(1) = 30.4%. There’s about a 21% chance that the first household won’t own cats but the second household will, P(2) ≈ 21%. Skipping a bit to x = 6, there’s just about a 5% chance that the first five households won’t have cats but the sixth will, P(6) ≈ 5%. And so forth.

x = 1 is always the most likely outcome, and larger x values are successively less and less likely. This is true for every geometric distribution, not just this particular one with p = 0.304.

The geometric model never actually ends. The probabilities eventually get too small to show in the picture, but no matter what x you pick, the probability is still greater than 0.
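To reproduce a rough version of that picture, the Python sketch below prints a text histogram for p = 0.304. It assumes only what the definition says: P(x) is the chance of x−1 failures followed by a success. The 12-trial cutoff and the bar scale are arbitrary choices for display.

```python
p = 0.304      # chance that a household owns cats
q = 1 - p      # chance that it doesn't

# P(x) for x = 1..12: x-1 failures in a row, then one success.
probs = [q ** (x - 1) * p for x in range(1, 13)]

for x, prob in enumerate(probs, start=1):
    bar = "#" * round(prob * 100)   # one '#' per percentage point
    print(f"x={x:2d}  P(x)={prob:.4f}  {bar}")

print(f"P(first success within 12 trials) = {sum(probs):.4f}")
```

The bars shrink steadily but never reach zero, and the running total creeps toward 1 without getting there in any finite number of trials.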

6D1.  Computing Probabilities

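Your TI-83/84 calculator has two menu selections for the geometric model:

  • geometpdf(p,x) answers the question “what’s the probability that my first success comes on exactly the xth trial?”
  • geometcdf(p,x) answers the question “what’s the probability that my first success comes on or before the xth trial?”

They’re both in the [2nd VARS makes DISTR] menu.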

(If you have a calculator in the TI-89 family, use the [F5] Distr menu. Select Geometric Pdf and Geometric Cdf.)

Let’s use the calculator to find the answers for Example 13. Here p, the probability of success in any given household, is 30.4% or 0.304.

Part (a) wants the probability of four failures followed by a success on the fifth try. For that you use geometpdf. Press [2nd VARS makes DISTR] [▲] [▲] to get to geometpdf, and press [ENTER].

With the “wizard” interface: Enter p and x. Press [ENTER] twice, and your screen will look like the result screen below.

[Figure: geometpdf “wizard” screen, with p=.304 and x=5]

With the classic interface: After entering p and x, press [)] [ENTER] to get the answer.

[Figure: geometpdf of .304 and 5 yields .0713362938]

geometpdf(.304,5) = .0713362938 → 0.0713

There’s about a 7% chance you won’t find any cat owners in the first four households but you will in the fifth household.

(You could calculate this the long way. The probability of four failures followed by a success is (1−.304)⁴×.304. But the geometric model is easier. That’s the point of a model: one general rule works well enough for all cases, so you don’t have to treat each situation as a special case with its own unique methods.)

Part (b) wants the probability of a success occurring anywhere in the first five trials. This is a geometcdf problem. Press [2nd VARS makes DISTR] [▲] to get to geometcdf, and press [ENTER].

With the “wizard” interface: Enter p and x. Press [ENTER] twice, and your screen will look like the result screen below.

[Figure: geometcdf “wizard” on TI-84, showing p=.304 and x=5]

With the classic interface: After entering p and x, press [)] [ENTER] to get the answer.

[Figure: geometcdf of .304 and 5 yields .8366774327]

geometcdf(.304,5) = .8366774327 → 0.8367

There’s almost an 84% chance you will find at least one cat-owning household among the first five.

(Doing this the long way, you would use the complement. The complement of “at least one cat-owning household in the first five” is “no cat-owning households in the first five”. The probability that a given household doesn’t own a cat is q = 1−.304 = 0.696, and the probability that five in a row don’t own cats is 0.696⁵. Therefore the original probability you wanted is 1−(.696⁵) = 0.8367.)

You don’t actually need formulas for the geometric model, but if you’re curious about what your calculator is doing, here they are:
       geometpdf(p,x) = q^(x−1)·p     geometcdf(p,x) = 1 − q^x
where q = 1−p as usual. You can see that the two “long way” paragraphs above actually used those formulas.
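As a cross-check on the calculator, the two formulas translate directly into Python. The function names below simply mirror the TI-83/84 menu names; they are written for this illustration and are not part of any library.

```python
def geometpdf(p, x):
    """P(first success comes on trial x): x-1 failures, then a success."""
    q = 1 - p
    return q ** (x - 1) * p

def geometcdf(p, x):
    """P(first success comes on or before trial x): complement of x failures."""
    q = 1 - p
    return 1 - q ** x

# Example 13, with p = 0.304 for cat-owning households.
part_a = geometpdf(0.304, 5)
part_b = geometcdf(0.304, 5)
print(f"(a) {part_a:.4f}   (b) {part_b:.4f}")

# The cdf is just the pdf values for x = 1..5 added up.
summed = sum(geometpdf(0.304, x) for x in range(1, 6))
print(f"sum of pdf values for x = 1..5: {summed:.4f}")
```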

6D2.  Mean and Standard Deviation of a Geometric Distribution

The geometric distribution is completely specified by p, so you can compute the mean and standard deviation quite easily:

μ = 1/p          σ = μ √q  or  (1/p) √(1−p)

Example 14: 30.4% of US households own cats. How many households do you expect you’ll need to visit to find a cat-owning household?

Solution: The expected value of a distribution is the mean. μ = 1/p = 1/.304 = 3.289473684, so μ ≈ 3.3. Interpretation: On average, you expect to have to visit between 3 and 4 households to find the first cat owners.

Caution! The expected value (mean) is not the most likely value (mode). Take a look back at the histogram, and you’ll see that the most likely value is 1: you’re more likely to get lucky on the first trial than on any other specific trial. But the distribution is highly skewed right, so the average gets pulled toward the higher numbers.

[Figure: mean and standard deviation on TI-84; see text for numbers]

To compute the SD, just multiply the mean by √q. A handy technique is called chaining calculations. After first calculating the mean, press the [×] key, and the calculator knows you are multiplying the previous answer by something. Here you see that σ = 2.7.

Interpreting σ is a bit harder. The geometric distribution is a type of discrete probability distribution, so you interpret its standard deviation the same way as for any other DPD. In this particular example, σ is almost as large as μ, so you expect a lot of variability. If you and a lot of co-workers go out independently looking for households with cats, the group average number of visits will be 3.3 households, but there will be a lot of variability between different workers’ experience. You can’t use the Empirical Rule here because the geometric model is not a bell curve, but you can at least say you won’t be surprised to find workers who get lucky on the first house (μ−σ ≈ 0.5), and workers who have to visit six houses or more (μ+σ ≈ 6.0).
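The shortcut formulas for μ and σ can be checked against the general DPD formulas by brute force. The Python sketch below does that for p = 0.304. It is an illustration with invented function names, and it truncates the never-ending distribution at 500 trials, an arbitrary cutoff beyond which the probabilities are negligibly small.

```python
from math import sqrt

def geometric_mean_sd(p):
    """Shortcut formulas: mu = 1/p and sigma = mu * sqrt(q)."""
    q = 1 - p
    mu = 1 / p
    return mu, mu * sqrt(q)

def geometric_mean_sd_by_summing(p, terms=500):
    """General DPD formulas applied to P(x) = q**(x-1) * p for x = 1..terms."""
    q = 1 - p
    table = [(x, q ** (x - 1) * p) for x in range(1, terms + 1)]
    mu = sum(x * px for x, px in table)
    variance = sum(x * x * px for x, px in table) - mu ** 2
    return mu, sqrt(variance)

mu, sigma = geometric_mean_sd(0.304)
mu_check, sigma_check = geometric_mean_sd_by_summing(0.304)
print(f"formula: mu = {mu:.4f}, sigma = {sigma:.4f}")
print(f"summing: mu = {mu_check:.4f}, sigma = {sigma_check:.4f}")
```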

6D3.  Making a Decision

Some people find it very hard to make choices because they feel they must consider all the pros and cons of every possibility. Others look at possibilities one at a time and take the first one that’s acceptable. Studies such as The Tyranny of Choice (Roets, Schwartz, Guan 2012 [see “Sources Used” at end of book]) show that the first group may make better choices objectively, but the second group is happier with the items they choose.

Example 15: You have to buy a new sofa. You’d be content with 55% of the sofas out there. Let’s assume that your Web search presents sofas in an order that has nothing to do with your preferences. There are hundreds to choose from, so you decide to adopt the “first one that’s acceptable” strategy. How likely is it that you’d order the third sofa you’d see?

Solution: This is a geometric model, with two failures followed by one success. p = 55%. geometpdf(.55,3) = .111375. There’s about an 11% chance you’d order the third sofa.

6D4.  Baseball

Example 16: Larry’s batting average is .260. During which time at bat would he expect to get his first hit of the game? How likely is he to get his first hit within his first four times at bat?

Solution: This is a geometric model with p = 0.260. The mean or expected value is 1/p = 1/.26 = 3.85, about 4. On average, his first hit each game will come on his fourth time at bat. For the second question, geometcdf(.26,4) = .70013424; there’s about a 70% chance he’ll get his first hit within his first four times at bat.
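A simulation offers an independent check on both answers. The Python sketch below plays many imaginary games, letting Larry keep batting until his first hit (which is what the geometric model assumes, even though a real game gives him only a few at-bats). The seed, the number of games, and the function name are arbitrary choices for this illustration.

```python
import random

def at_bats_until_first_hit(p, rng):
    """Count Bernoulli trials up to and including the first success."""
    tries = 1
    while rng.random() >= p:   # a miss, which happens with probability 1 - p
        tries += 1
    return tries

rng = random.Random(260)       # fixed seed so the run is repeatable
games = 100_000
results = [at_bats_until_first_hit(0.260, rng) for _ in range(games)]

average = sum(results) / games
within_four = sum(1 for r in results if r <= 4) / games
print(f"average at-bat of first hit: {average:.2f}  (model: {1 / 0.260:.2f})")
print(f"share with a hit by at-bat 4: {within_four:.3f} (model: {1 - 0.74 ** 4:.3f})")
```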

6E.  The Binomial Model

In the previous section, we looked at the geometric model, where you just keep trying until you get a success. In this section, we’ll look at the binomial model, where you have a fixed number of trials and a varying number of successes.


The binomial model, also known as the binomial probability distribution or BPD, is a kind of discrete probability distribution that applies to Bernoulli trials when you have a fixed number of trials, n.

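Expanding on the definition of Bernoulli trials, you can say that a binomial model is one where each trial has only two possible outcomes, the probability of success p is the same on every trial, and the number of trials n is fixed in advance.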

Example 17: Cats again! 30.4% of US households own one or more cats. You visit five households, selected randomly.
(a) What’s the chance that no more than two have cats?
(b) What’s the chance that exactly two have cats?
(c) What’s the chance that at least two have cats?
(d) What’s the chance that two to four have cats?

[Figure: binomial distribution for n = 5, p = 0.304]

This problem fits the binomial model: n = 5 trials, each household does or does not have cats, and the probability p = 30.4% is the same for each household.

A picture of this binomial distribution is shown in the figure, and you can see some differences from the picture of the geometric distribution.

How do you read the picture? There’s about a 16% probability that none of the five households will have cats, about 36% that one of the five will have cats, and so on. (Why 36% and not 30.4%? Because there’s a greater chance of “winning” one out of five than one out of one.)
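The heights of the bars can be computed directly. This section doesn't spell out a formula, so the Python sketch below assumes the standard binomial formula, P(x) = C(n,x)·p^x·q^(n−x); the function name is invented for this illustration.

```python
from math import comb

def binomial_probabilities(n, p):
    """P(x) for x = 0..n, assuming the standard formula C(n,x) * p**x * q**(n-x)."""
    q = 1 - p
    return [comb(n, x) * p ** x * q ** (n - x) for x in range(n + 1)]

# Five households, each with a 30.4% chance of owning cats.
cats = binomial_probabilities(5, 0.304)
for x, prob in enumerate(cats):
    print(f"P({x}) = {prob:.4f}")
print(f"total = {sum(cats):.4f}")
```

Unlike the geometric model, the list has a definite end: with n = 5 there are exactly six possible outcomes, 0 through 5.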

In this book we’re more concerned with computing probabilities, but it can be nice to get an overall picture of a distribution. I made this particular graph by using @RISK from Palisade Corporation, but you can also make histograms of binomial distributions by using MATH200A Program part 1(5).

6E1.  Computing Probabilities

Here you have a choice. Your TI-83/84 calculator comes with two menu selections for the binomial model, but the MATH200A program gives you a simpler interface. Here’s a quick overview of both, before we start on computations:

With the MATH200A program (recommended):

MATH200A Program part 3 gives you one interface for all binomial probability calculations. The program might already be on your calculator from Chapter 3 boxplots, but if it’s not, see Getting the Program for instructions.

To find binomial probability with the program, press [PRGM]. If you see MATH200A in the list, press its menu number; otherwise, press [▲] or [▼] to get to MATH200A, and press [ENTER].

That puts the program name on your home screen. Press [ENTER] again to run the program, and yet again to dismiss the title screen. You’ll then see a menu. Press [3] for binomial probability.

[Figure: MATH200A menu screen]

If you’re not using the program:

These are both in the [2nd VARS makes DISTR] menu:

  • binomcdf(n,p,x) answers the question “what’s the probability of no more than x successes in n trials (0 to x successes)?” (The “cdf” stands for cumulative distribution function, because the cdf functions accumulate the probabilities for a range of outcomes.)
  • binompdf(n,p,x) answers the question “what’s the probability of exactly x successes in n trials?” (The “pdf” stands for probability distribution function, because the probability for any particular number of successes is a function of [determined by] that number.)

(Got a TI-89 family calculator? Use the [F5] Distr menu. Select Binomial Pdf or Binomial Cdf. The Cdf function can handle any range of successes, not just 0 to x. See Binomial Probability Distribution on TI-89 for full instructions.)

Now let’s use your TI-83/84 to answer the questions in Example 17. You have five trials, so n = 5. The probability of success on any given household is 30.4%, so p = 0.304.

(a) What’s the probability that no more than two of the five randomly selected households have cats?

With the MATH200A program (recommended):

Press [PRGM], select MATH200A, and press [3] in the MATH200A menu.

[Figure: MATH200A binomial input screen: n=5, p=.304, from=0, to=2]
[Figure: MATH200A binomial results screen: n=5, p=.304, x=0 to 2, probability=.8315878479]

Enter n and p. “No more than two cats” is from 0 to 2 cats, so enter those values when prompted. The program echoes back your inputs and shows the computed probability. To show your work, write down the screen name, the inputs, and the result.

Conclusion: P(x ≤ 2) or P(0 ≤ x ≤ 2) = 0.8316.

If you’re not using the program:

The probability that no more than two of your five households have cats (in other words, the probability that 0 to 2 have cats) is binomcdf(5,.304,2). Press [2nd VARS makes DISTR] and scroll up to binomcdf.

[Figure: binomcdf(5, .304, 2) = .8315878479]

If you don’t have the “wizard” interface, or you have it turned off, binomcdf( will appear on your screen. Enter n, p, and the desired maximum number of successes, in that order, then the closing paren and [ENTER].

[Figure: “wizard” interface input screen: trials=5, p=.304, x value=2]

If you have the “wizard” interface, you get a menu screen, but you enter the same information. Press [ENTER] once on Paste and then again when the command is pasted to your home screen.

Either way, write down the binomcdf command and the argument numbers to show your work.

Conclusion: P(x ≤ 2) or P(0 ≤ x ≤ 2) = 0.8316.

(b) What’s the probability that exactly two of five randomly selected households are cat owners?

With the MATH200A program (recommended):

[Figure: MATH200A binomial input screen: n=5, p=.304, number of successes from 2 to 2]
[Figure: MATH200A binomial results screen: n=5, p=.304, x=2 to 2, probability=.3115838118]

You need a specific number of successes, instead of a range. It’s almost exactly the same deal: you just enter the same number for from and to. In this example, to get the probability of exactly two successes, enter number of successes from 2 to 2.

Conclusion: P(x = 2) or P(2) = 0.3116.

If you’re not using the program:

[Figure: binompdf(5, .304, 2) = .3115838118]

The probability of exactly two cat-owner households in five is binompdf(5,.304,2). Press [2nd VARS makes DISTR] and then press [▲] several times to get to binompdf. (Caution! pdf, not cdf.) Press [ENTER], type in the numbers, and press [)] [ENTER].

(The “wizard” interface screen is the same as it was for binomcdf.)

Conclusion: P(x = 2) or P(2) = 0.3116.

(c) What’s the probability that at least two of the five randomly selected households have cats?

With the MATH200A program (recommended): If you’re not using the program:

“At least two”, in a sample of five, means from two to five successes. Enter those values in MATH200A part 3. Here’s the results screen:

MATH200A binomial results screen: n=5, p=.304, x=2 to 5, probability=.4799959639

Conclusion: P(x ≥ 2) or P(2 ≤ x ≤ 5) = 0.4800.

This one is a little trickier. You could find P(2), P(3), P(4), and P(5) and add them up by hand, but that’s tedious and error prone, and it can introduce rounding errors. Instead, you’ll make the calculator add them up for you.

“Wizard” interface screen for binompdf: trials=5, p=.304, x value left blank

First, get all the probabilities for 0 through n successes into a statistics list. To do this, use binompdf (not cdf) but with only the n and p arguments. (If you have the “wizard” interface, leave x value blank.)

After the closing paren, don’t press [ENTER] just yet. Instead, press the [STO→] key and select a statistics list, such as [2nd 6 makes L6]. Then press [ENTER]. This puts the probabilities for 0 successes, 1 success, and so on to 5 successes into L6. (If you want, you could examine them with [], or on the [STAT] edit screen.)

Calculator screen: binompdf(5, .304) stored to L6; sum(L6, 3, 6) = .4799959639

Now you need to sum the desired range of cells. You want 2 ≤ x ≤ 5. But the lowest possible x is 0, and the cells in statistics lists are numbered starting at 1. So to get x from 2 through 5, you need cells 3 through 6. When summing part of a list, add 1 to your desired x values.

Press [2nd STAT makes LIST] [] [5] to paste sum(, then [2nd 6 makes L6] [,] 3 [,] 6 [)] [ENTER].

Your answer: P(x ≥ 2) or P(2 ≤ x ≤ 5) = 0.4800.

Beware of off-by-one errors when you solve problems with phrases like at least and no more than. Always test the “edge conditions”. “Okay, I need at least 2, and that’s 2 through 5, not 3 through 5. Oh yeah, add 1 for the statistics list in the TI-83, so I’m summing cells 3 through 6, not 2 through 5.”
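The same store-then-sum idea can be sketched in Python, but the edge conditions move. This is only an illustration, not the calculator procedure, and the variable names are invented. Python lists are numbered from 0, so position x holds P(x) and there is no “add 1” step. However, a slice stops just before its end index, so x from 2 through 5 is written 2:6.

```python
from math import comb

n, p = 5, 0.304

# probs[x] holds P(x) for x = 0, 1, ..., n (the analogue of binompdf(5,.304) -> L6)
probs = [comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]

# "At least 2" means x = 2 through 5.
# The slice 2:6 includes index 2 and excludes index 6.
p_at_least_two = sum(probs[2:6])
print(f"{p_at_least_two:.4f}")  # 0.4800
```

Whatever tool you use, test the edges the same way: does the range really start at 2, and does it really end at 5?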

Alternative solution: Do you remember solving “at least” problems in Chapter 5? What was the lesson there? With laborious probability problems, the complement is your friend. What’s the complement of “at least two”? It’s “fewer than two”, which is the same as “no more than one”.

Shaky on the logic of complements? Use the enumeration method from Chapter 5: 0 1 2 3 4 5 or 0 1 | 2 3 4 5.

Find the probability of ≤1 household with cats, and subtract from 1:

P(x ≥ 2) = 1 − P(x ≤ 1)

P(x ≥ 2) = 1 − binomcdf(5, .304, 1)

P(x ≥ 2) = .4799959639 → 0.4800
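Here is a short Python sketch of the complement method, for anyone who wants to confirm that it matches the direct sum. The helper name binom_cdf is an assumption of this example (it is meant to behave like the calculator’s binomcdf), and it uses only the standard library.

```python
from math import comb

def binom_cdf(n, p, x):
    """Probability of 0 through x successes in n trials (like binomcdf)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p = 5, 0.304

# Complement: "at least two" is everything except "no more than one"
p_complement = 1 - binom_cdf(n, p, 1)

# Direct route: add P(2) + P(3) + P(4) + P(5)
p_direct = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(2, n + 1))

print(f"{p_complement:.4f}  {p_direct:.4f}")  # 0.4800  0.4800
```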

(d) What’s the chance for two to four cat-owning households in your random sample of five households?

With the MATH200A program (recommended):

Nothing new here: just use good old MATH200A part 3. Here’s the results screen:

MATH200A binomial results screen: n=5, p=.304, x=2 to 4, probability=.4773995859

P(2 ≤ x ≤ 4) = 0.4774.

If you’re not using the program:

Calculator screen: sum(L6, 3, 5) = .4773995859

You need x from 2 through 4, but remember you always add 1 when summing binomial probabilities from a statistics list, so you put 3 to 5 in your sum command. (You’re still using the same distribution, so there’s no need to repeat binompdf.)

P(2 ≤ x ≤ 4) = 0.4774.

Alternative solution: You can also do it without summing. If you think about it, the probability for x from 2 to 4 is the probability for x from 0 to 4, with x below 2 (x no more than 1) removed: 0 1 | 2 3 4. In symbols,

P(0 ≤ x ≤ 1) + P(2 ≤ x ≤ 4) = P(0 ≤ x ≤ 4)

and by subtracting that first term you get

P(2 ≤ x ≤ 4) = P(0 ≤ x ≤ 4) − P(0 ≤ x ≤ 1)

Your probability is the result of subtracting two cumulative probabilities, the cdf from 0 to 4 minus the cdf from 0 to 1:

binomcdf(5, .304, 4) − binomcdf(5, .304, 1) = .4773995859

This is tricky, I admit. You have to set that x value correctly in the second binomcdf, so this method is not much better than the other one. About all it has going for it is that it avoids storing values in a list and then using sum.
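If the tricky part is getting the x value right in the second cdf, it can help to see the subtraction wrapped up once and reused. The following Python sketch does that. The names binom_cdf and binom_range are hypothetical helpers written for this example, not calculator or MATH200A functions.

```python
from math import comb

def binom_cdf(n, p, x):
    """Probability of 0 through x successes in n trials (like binomcdf)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

def binom_range(n, p, lo, hi):
    """P(lo <= x <= hi) as cdf(hi) minus cdf(lo - 1).
    The second cdf stops one BELOW lo, because lo itself must stay in."""
    below = binom_cdf(n, p, lo - 1) if lo > 0 else 0.0
    return binom_cdf(n, p, hi) - below

# Two to four cat-owning households out of five
p_two_to_four = binom_range(5, 0.304, 2, 4)
print(f"{p_two_to_four:.4f}")  # 0.4774
```

Notice that for x from 2 to 4 the second cdf uses 1, not 2, which is exactly the value used in the calculator solution.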

You don’t actually need a formula for the binomial model, but if you’re curious about what your calculator is doing, here it is:
       binompdf(n,p,x) = nCx · p^x · q^(n−x)
Why? p^x is the probability of getting successes on all of the first x trials. q is the probability of failure on one trial, and therefore q^(n−x) is the probability of failure on the remaining n−x trials, after the x out of n successes. But in a binomial probability model, you care how many successes and failures there are, not in what order they occur. To account for the fact that order doesn’t matter, the formula has to multiply by nCx, “the number of ways to choose x objects out of n”. (If you want to know more about nCx, search “combinations” at your favorite math site.)

Unlike the geometric case, there’s no simple formula for binomcdf. Your calculator just has to compute probabilities for x = 0, 1, and so on and add them up.
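To make “compute the probabilities and add them up” concrete, here is a rough Python sketch that builds the pdf values for the cat example and keeps a running total. It illustrates the idea only; it makes no claim about how the TI-83 is actually programmed, and the variable names are invented.

```python
from itertools import accumulate
from math import comb

n, p = 5, 0.304
q = 1 - p

# pdf[x] = nCx * p^x * q^(n-x) for x = 0, 1, ..., n
pdf = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]

# cdf[x] = pdf[0] + pdf[1] + ... + pdf[x], a running total
cdf = list(accumulate(pdf))

for x in range(n + 1):
    print(f"x={x}  pdf={pdf[x]:.4f}  cdf={cdf[x]:.4f}")
```

The cdf entry for x = 2 should match the 0.8316 found for “no more than two”, and the last cdf entry should be 1, apart from tiny floating-point error.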

6E2.  Baseball Again!

Example 18: Larry’s batting average is .260. How likely is it that he’ll get more than one hit in four times at bat?

Calculator screen: binompdf(4, .26) stored to L6; sum(L6, 3, 5) = .27870128

Solution: This is a binomial model with n = 4, p = 0.26, x = 2 to 4. You can use MATH200A part 3 or the binompdf-sum technique to get .27870128. P(x > 1) = 0.2787 or about 28%. (The program is completely straightforward, so I’m showing only the tricky binompdf-sum sequence here.)

Alternative solution: If you don’t have the program, can you see how to use the complement to solve this problem more easily? Check your answer against mine to be sure that your method is correct.

6E3.  Mean and Standard Deviation of a Binomial Distribution

The binomial distribution depends on the proportion in the population (p) and your sample size (n). You can compute the mean and SD quite easily:

μ = np          σ = √[npq]

What are the mean and SD of the number of cat-owning households in a random sample of five households?

μ = np = 5 × 0.304 = 1.52

σ = √[npq] = √[5 × .304 × (1−.304)] = 1.028552381

Conclusion: μ = 1.5 and σ = 1.0.

Interpretation: in a sample of five households, the expected number of cat-owning households is 1.5. Or, if you take a whole lot of samples of five households, on average you will find that 1.5 households per sample own cats. The SD is 1.0. You can’t use the Empirical Rule, but you can say that you expect most of the samples of five to contain μ±2σ = 1.5±2×1.0 = 0 to 3 cat-owning households.
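As a cross-check, the Python sketch below computes μ and σ two ways: with the shortcut formulas, and the long way from the full list of probabilities. The long way assumes the usual definitions for any discrete probability distribution (mean = sum of x·P(x), variance = sum of (x − μ)²·P(x)). All names are chosen just for this example.

```python
from math import comb, sqrt

n, p = 5, 0.304
q = 1 - p

# Shortcut formulas for a binomial distribution
mu = n * p
sigma = sqrt(n * p * q)

# The long way: weight every possible x by its probability
probs = [comb(n, x) * p**x * q**(n - x) for x in range(n + 1)]
mu_long = sum(x * px for x, px in enumerate(probs))
sigma_long = sqrt(sum((x - mu_long) ** 2 * px for x, px in enumerate(probs)))

print(f"{mu:.2f} {sigma:.4f}")            # 1.52 1.0286
print(f"{mu_long:.2f} {sigma_long:.4f}")  # 1.52 1.0286
```

The two routes agree, which is why you never need the long way for a binomial model.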

6E4.  Surprised?

Example 19: 30.4% of US households own one or more cats. You visit ten random households and seven of them own cats. Are you surprised at this result?


A result is surprising or unusual or unexpected if it has low probability, given what you think you know about the population in question. The threshold for “low probability” can vary in different problems, but a typical choice is 5%.

When we ask whether a result is surprising (unusual, unexpected), we are really talking about that result or one even further from the expected value.

You think you know that 30.4% of US households own cats. A sample of ten doesn’t seem very large; how do you decide whether seven successes seems reasonable or unreasonable?

First, what’s the expected value? That’s μ = np = 10×.304 = 3.04.

Next, what does “that result or one further from the expected value” mean? The expected value is 3.04, seven is greater than 3.04, so we’re talking about seven or more successes, x = 7 to 10.

Calculator screen: 1 − binomcdf(10, .304, 6) = .0114590334
MATH200A output screen: n=10, p=.304, x=7 to 10, probability=.0114590334

Find the probability of that result or one even further from the expected value. That’s easiest with MATH200A part 3: set n=10, p=.304, x=7 to 10. You can also do it with binomcdf: seven or more successes is the complement of zero to six successes (0 1 2 3 4 5 6 | 7 8 9 10). Either way, the probability is 0.0115 or just over 1%.

Draw your conclusion. If 30.4% of US households own cats, finding seven or more cat houses in a random sample of ten households is unusual (surprising, unexpected).
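The steps of this example (expected value, direction, tail probability, comparison with the threshold) can be sketched in Python as follows. This is one possible way to lay it out, not an official procedure. The 5% threshold is the “typical choice” mentioned above, and all names are made up for the illustration.

```python
from math import comb

def binom_pmf(n, p, x):
    """Probability of exactly x successes in n trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p, observed = 10, 0.304, 7
threshold = 0.05          # typical cutoff for "low probability"

mu = n * p                # expected number of successes, 3.04 here

# "That result or one even further from the expected value"
if observed >= mu:
    tail = range(observed, n + 1)   # observed or more
else:
    tail = range(0, observed + 1)   # observed or fewer

prob = sum(binom_pmf(n, p, x) for x in tail)
surprising = prob < threshold
print(f"{prob:.4f}", surprising)  # 0.0115 True
```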

That was a trivial example. But in real life, when a result is unexpected it can cast doubt on what you’ve been told. Here’s an example.

6E5.  A Life-or-Death Example

Example 20: In Talladega County, Alabama, in 1962, an African American man named Robert Swain was accused of rape. 26% of eligible jurors in the county were African American, but the 100-man jury panel for Swain’s trial included only 8 African Americans. (Through exemptions and peremptory challenges, all were excluded from the final 12-man jury.) Swain was convicted and sentenced to death.

Swain’s lawyer appealed, on grounds of racial bias in jury selection. The Supreme Court ruled in 1965 that “The overall percentage disparity has been small and reflects no studied attempt to include or exclude a specified number of blacks.”

What do you think of that ruling? If 100 men in the county were randomly selected, is eight out of 100 in the jury pool unexpected (unusual, surprising)?

Solution: This is a binomial model: every man in the county either is or is not African American, the sample size is a fixed 100, and in a random sample there’s the same 26% chance that any given man is African American.

To determine whether eight in 100 is unexpected, ask what is expected. For binomial data, μ = np = 100×.26 = 26; in a sample of 100, you expect 26 African Americans.

MATH200A output screen: n=100, p=.26, x=0 to 8, probability=4.734795002E−6

Okay, 26 is expected, 8 is less than 26, so “further away from expected” means 8 or fewer, and you compute the probability for x = 0 to 8.

Use binomcdf(100,.26,8) or MATH200A part 3. Either way you get a probability of 4.734795002E-6, or about 0.000 005, five chances in a million. That is unexpected. It’s so unlikely that we have to question the county’s claim that the selection was random.

Unfortunately, Mr. Swain’s lawyer didn’t consult a statistician.
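For readers who want to reproduce the jury-panel probability without a TI-83, here is a minimal Python sketch. It assumes the same binomial model as the solution above (n = 100, p = 0.26), and the helper name binom_cdf is invented for this example.

```python
from math import comb

def binom_cdf(n, p, x):
    """Probability of 0 through x successes in n trials (like binomcdf)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x + 1))

n, p, observed = 100, 0.26, 8

expected = n * p                    # about 26 African Americans expected
prob = binom_cdf(n, p, observed)    # chance of 8 or fewer by random selection

print(f"{prob:.3e}")  # 4.735e-06
```

That is roughly five chances in a million, far below the usual 5% threshold for a surprising result.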

What Have You Learned?

Key ideas:

(The online book has live links to all of these.)


Exercises for Chapter 6

Write out your solutions to these exercises, showing your work for all computations. Then check your solutions against the solutions page and get help with anything you don’t understand.

Caution! If you don’t see how to start a problem, don’t peek at the solution — you won’t learn anything that way. Ask your instructor or a tutor for a hint. Or just leave it and go on to a different problem for now. You may find when you return to that “impossible” problem that you see how to do it after all.

1 You roll five dice and count the number of twos that appear.
(a) List the possible values of the discrete random variable, X = “number of twos in five dice”.
(b) What type of probability model is appropriate? Why?
2 A lottery has a 1 in 10 million chance of paying $10,000,000, a 1 in 125 chance of paying $100, and a 1 in 20 chance of paying $10. A ticket costs $5, and you do not get that money back if you win a prize.
(a) Construct a discrete probability distribution.
(b) Is this a good deal or a bad deal for you? Explain.
3 Blood Types [see “Sources Used” at end of book] at the Stanford School of Medicine’s Web site lists the relative frequencies of blood types in the US. (There’s also a nice chart of what blood types you can safely receive, based on your own blood type.) Only 6.6% of the US have O negative blood.

Velma the Vampire will drink anything, but she prefers O negative. She doesn’t know a victim’s blood type until she tastes it.
(a) How many does she expect to drain before she gets some O negative?
(b) How likely is it that she’ll find her first O negative within her first ten victims?
(c) How likely is it that exactly two of her first ten victims will be O negative?

4 In January 2013, a CBS News story by Sarah Dutton and others [see “Sources Used” at end of book] reported poll results: 92% of American adults favored universal background checks for gun buyers.
(a) If TC3 students are representative of American adults when the poll was taken, what’s the chance that you’ll have to ask three TC3 students before the third one opposes universal background checks?
(b) How likely is it that you’d find a student opposing universal background checks somewhere in the first three you ask, not necessarily in third position?
5 Suppose 80% of students who register for Elizabethan Sonnets complete the course successfully.
(a) Imagine taking many, many samples of seven people, with replacement. What would be the expected number and standard deviation of the number of people that would finish successfully, per sample of seven?
(b) At the end of the semester, imagine a random group of seven students who originally registered for the course. Find the probability that four to six of them completed it successfully.
(c) What’s the chance that, when you ask each person in turn, the third person you ask is the first one who successfully completed the course?
(d) What’s the chance that the first person that you find who successfully completed the course is one of the first two you ask?
6 In a June 2013 poll, the Pew Research Center (2013b) [see “Sources Used” at end of book] found that 49% of American adults approved of President Obama’s job performance. In a random sample of 40 American adults, taken at the same time, would you be surprised if 13 approved his performance? Why or why not?
7 According to the Social Security Administration (2010) [see “Sources Used” at end of book], 0.1304% of 22-year-old males are expected to die in the next year.
(a) What is the fair price of a $100,000 one-year term life insurance policy on a 22-year-old male? (To keep things simple, assume that the company will charge the same price to every 22-year-old male, without regard to lifestyle or health factors.)
(b) The company actually charges $180.00 for this policy, more than the fair price. Is this unfair? Explain.
8 A coin is weighted — the chance of heads is not 50%. On five flips of that coin, the probability of various numbers of heads is shown by this model:
x       0        1        2        3        4        5
P(x)    0.0778   0.2591   0.3456   0.2305   0.0768   0.0102

(a) Find and interpret the mean and standard deviation of this probability model.
(b) For an extra challenge, can you use your answer from part (a) to construct a simpler probability model for five flips of this coin?

9 Long experience shows that a particular drug will help 70% of the people who take it.
(a) If you take a random sample of five people, what is the probability that the drug helps at least three?
(b) If you take many samples of 10 people, what’s the average number of people per sample that the drug will help?
(c) In a random sample of 10 people, would you be surprised if the drug helps only five? Why or why not?
10 In April 2013, the Pew Research Center [see “Sources Used” at end of book] released poll results for the question “Which of the following best describes how you feel about doing your taxes?” Surprisingly (to me, anyway), 34% said they like or love doing their taxes.
(a) How many Americans would you expect to have to ask to find one who likes or loves doing her taxes?
(b) If you ask five random Americans, what’s the probability that none of them will say they like or love doing their taxes?
11 In a sentence or two, write down the difference between the geometric and binomial models. (Write it, don’t just think it. It’s easy to tell yourself you understand something, but the rubber meets the road when you have to put your understanding into words on paper.)
12 In a sentence or two, write down the difference between pdf and cdf.
