Updated 14 Nov 2020

Normality Check on TI-83/84

Copyright © 2012–2023 by Stan Brown, BrownMath.com

Summary: In inferential statistics, you need to know whether small data sets are normally distributed. Actually, no real-life data set is exactly normal, but you can use your TI-83/84 to test whether a data set is close enough to normally distributed. The main tool for this is a normal probability plot. The closer the data set is to normal, the closer that plot will be to a straight line.

But just looking at a plot, you may not be sure whether it’s “close enough” to a straight line, especially with smaller data sets. Most of the time, you need to make some fairly gnarly computations to answer that question.

This note shows you how to make the plot and do the computations for an example, and then in an appendix it gives the theory behind all of this.

Alternatives: MATH200A Program part 4 automates this whole procedure, giving you the normal probability plot with n, r, and CRIT.

Example — Vehicle Weights

Consider these vehicle weights (in pounds):

2950 4000 3300 3350 3500 3550 3500 2900 3250 3350

Construct a plot to decide whether these vehicle weights seem to be normally distributed.

The Procedure

Step 1: Enter the numbers in L1.

Enter the data points. [STAT] [1] selects the list-edit screen.
 
Cursor onto the label L1 at top of first column, then [CLEAR] [ENTER] erases the list.
 
Enter the data values. (The order doesn’t matter.)

Step 2: Clear other plots.

In this step, you disable any other plots and graphs that could overlay your normal probability plot.

Press [Y=] to open the list of equations and plots.
Look at the plots across the top, and look at the column of = signs. If any are enabled (highlighted), disable them. Use the arrow keys to get to any highlight, and press [ENTER] to remove the highlight.
 
Caution! There are ten equations. Use [▼] to scroll down and check them all, down to Y0.

Step 3: Set up the normal probability plot.

Clear the grid and enable coordinate display for later use in tracing. Press [2nd ZOOM makes FORMAT].
 
If GridOn is highlighted, press [▼ 3 times] [ENTER].
 
If CoordOff is highlighted, use the arrow keys to get to CoordOn, and press [ENTER].
Select Plot1. Press [2nd Y= makes STAT PLOT] [ENTER].
Select the normal probability plot. Press [▼] [► 5 times] [ENTER] [▼]. The bottom part of the screen may change when you do this.
Select list 1, data axis X, and squares for plotting. Press [2nd 1 makes L1] [ENTER] [ENTER] [▼] [ENTER].

Step 4: Display the normal probability plot.

Press [ZOOM] [9], which is ZoomStat or “zoom to statistics”.

If this plot is close to a straight line, then the data set is close to a normal distribution. But how close is close enough? If you’re very lucky, the plot will be an obvious straight line, or it will be very far from a straight line. Then you can declare the data normal or not normal, and stop. But usually the plot is iffy, and you could go either way just from looking at it.

What about the plot for our example? There are certainly some bumps, but is the plot too far from a straight line? It’s hard to say. This is one of those iffy cases that you usually get.

The solution is to compute a correlation coefficient and a critical value, and compare them. The correlation coefficient is the same one you already know from an earlier chapter. The critical value is not the decision point from that chapter, for reasons explained in the appendix.

Step 5: Compute the y’s and r.


The x’s in the plot are your data values. You have an xy scatterplot, so it must have an r. The problem is that the calculator doesn’t give you any easy way to get at the y’s or the r, so you have to start from scratch.

(Again, MATH200A part 4 does all these calculations for you, so it’s really worth your while to download it. But if you can’t do that, keep reading and follow along.)

Sort the data points. Press [2nd STAT makes LIST] [►] [1] to paste the SortA( command. Press [2nd 1 makes L1] [)] [ENTER]. The calculator responds Done.

The next calculations could all be combined into one very long command storing into L2, but it would be really easy to make mistakes in such a long formula. Instead, I’ve broken the calculations into chunks.

Get the numbers 1, 2, 3, … into L2. Press [2nd STAT makes LIST] [►] [5] to paste the seq( command.
 
Press [x,T,θ,n] [,] [x,T,θ,n] [,] [1] [,] and your number of data points. (For this example there are 10 data points.)
 
Finish with [)] [STO>] [2nd 2 makes L2] [ENTER].
Get the normal probabilities into L3. (The normal probabilities are the probabilities of getting each data point or a lower one by random selection, if the data points are normally distributed.)

The appendix gives the formula (i−0.375)/(n+0.25), where i is the numbers 1, 2, 3, … and n is the sample size. (By the way, this is slightly different from the formula that the TI-83/84 uses.)

Press [(] [2nd 2 makes L2] [−] .375 [)] [÷] [(]. Enter the number of data points, and then finish with [+] .25 [)] [STO>] [2nd 3 makes L3] [ENTER].
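(For this data set, n = 10, so the first stored probability is (1−0.375)/(10+0.25) ≈ 0.061 and the last is (10−0.375)/(10+0.25) ≈ 0.939.)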
Find z-scores that correspond to those probabilities. These are the z-scores that your data would have if they are normally distributed, and they are the y’s that your calculator used for the normal probability plot above. Press [2nd STAT makes LIST] [►] [5] to paste the seq( command.
 
Press [2nd VARS makes DISTR] [3] to paste invNorm(. Press [2nd 3 makes L3] [(] [x,T,θ,n] [)] [)] — notice the double parenthesis — then [,] [x,T,θ,n] [,] [1] [,].
 
Enter the number of data points, then finish with [)] [STO>] [2nd 4 makes L4] [ENTER].
Now at long last you’re ready to compute r. This is the correlation coefficient of the points in the normal probability plot, and it tells you how close those points lie to a straight line. Press [STAT] [►] [4] to paste LinReg(ax+b). Press [2nd 1 makes L1] [,] [2nd 4 makes L4] [ENTER].
 
(If r doesn’t appear and you get only a and b, run the DiagnosticOn command as explained in the Setup step of Scatterplot, Correlation, and Regression on TI-83/84.)
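
If you want to double-check the calculator’s list arithmetic off the TI, the whole of Step 5 is easy to replicate in Python. This is only a minimal sketch, assuming NumPy and SciPy are installed; scipy.stats.norm.ppf plays the role of invNorm(.

```python
import numpy as np
from scipy.stats import norm

# Vehicle weights from the example, in pounds (order doesn't matter)
data = [2950, 4000, 3300, 3350, 3500, 3550, 3500, 2900, 3250, 3350]

x = np.sort(data)                 # SortA(L1): data in ascending order
n = len(x)
i = np.arange(1, n + 1)           # seq(X,X,1,n), the list L2
p = (i - 0.375) / (n + 0.25)      # normal probabilities, the list L3
z = norm.ppf(p)                   # expected z-scores (invNorm), the list L4

r = np.corrcoef(x, z)[0, 1]       # correlation coefficient of the plot
print(round(r, 4))                # 0.9599 for this data set
```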

Step 6: Compute CRIT and compare it to r.

The correlation coefficient of 0.9599 (about 0.96) seems pretty good, but is it good enough? To answer this question, you have to compare it to a critical value. If r > CRIT, you can treat the data set as normally distributed. If r < CRIT, the data set is not normally distributed.

This rule is a little bit of an oversimplification, but it’s good enough as a rule of thumb. For a more precise statement, see the Theory appendix below.

The formula for CRIT is 1.0063 − 0.6118/n + 1.3505/n² − 0.1288/√n, where n is the sample size. Enter that formula in your calculator, but with your actual number of data points in place of n. This data set has 10 points.

Whew! r = 0.9599 and CRIT = 0.9179. r > CRIT, and therefore you can say that the data set is close enough to a normal distribution.
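
Here is the same comparison as a short Python sketch. It’s only a sketch: the function name ryan_joiner_crit is mine, and the r value is the one computed in Step 5.

```python
from math import sqrt

def ryan_joiner_crit(n):
    """Critical value at the 0.05 significance level (Ryan & Joiner)."""
    return 1.0063 - 0.6118/n + 1.3505/n**2 - 0.1288/sqrt(n)

r = 0.9599                        # correlation coefficient from Step 5
n = 10                            # number of data points
crit = ryan_joiner_crit(n)
print(round(crit, 4))             # 0.9179
print("close enough to normal" if r > crit else "not normal")
```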

Appendix — The Theory

The basic idea isn’t too bad. You make an xy scatterplot where the x’s are the data points, sorted in ascending order, and the y’s are the expected z-scores for a normal distribution. (I’m going to abbreviate “normally distributed” or “normal distribution” as ND to save wear and tear on my keyboard and your eyes.)

Why would you expect that to be a straight line? Recall the formula for a z-score: z = (x − x̄)/s. Breaking the one fraction into two, you have z = x/s − x̄/s. That’s just a linear equation, with slope 1/s and intercept −x̄/s. So an xz plot of any theoretical ND, plotting each data point’s z-score against the actual data value, would be a straight line.
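
You can check that algebra numerically. The little Python sketch below (assuming NumPy is available) confirms that the z-scores of a data set are exactly a linear function of the data values, with slope 1/s and intercept −x̄/s.

```python
import numpy as np

x = np.array([2950, 4000, 3300, 3350, 3500, 3550, 3500, 2900, 3250, 3350])
xbar = x.mean()                   # sample mean
s = x.std(ddof=1)                 # sample standard deviation

z = (x - xbar) / s                # z-score of each data value
line = x/s - xbar/s               # the same thing written as a line in x
print(np.allclose(z, line))       # True: slope 1/s, intercept -xbar/s
```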

Further, if your actual data points are ND, then their actual z-scores will match their expected-for-a-normal-distribution z-scores, and therefore a scatterplot of expected z-scores against actual data values will also be a straight line.

Now, in real life no data set is ever exactly a ND, so you won’t ever see a perfectly straight line. Instead, you say that the closer the points are to a straight line, the closer the data set is to normal. If the data points are too far from a straight line — if their correlation coefficient r is lower than some critical value — then you reject the idea that the data set is ND.

Okay, so you have to plot the data points against what their z-scores should be if this is a ND, and specifically for a sample of n points from a ND, where n is your sample size. This must be built up in a sequence of steps:

  1. Divide the normal curve (mentally) into n regions of equal probability and take one probability from each region. For technical reasons, the probability number you use for region i is (i−.375)/(n+.25). This formula is in many textbooks, and also in Ryan and Joiner’s paper Normal Probability Plots and Tests for Normality [full citation at https://BrownMath.com/swt/sources.htm#so_Ryan1976].
  2. Compute the expected z-scores for those probabilities. Working with the calculator, that’s just invNorm of (i−.375)/(n+.25).
  3. Plot those expected z-scores against the data values. This xy plot (or xz plot) has a correlation coefficient r, computed just like any other correlation coefficient.
  4. Compare the r for your data set to the critical value for the size of your data set. Ryan and Joiner determined that the critical value for sample size n, at the 0.05 significance level, is 1.0063 − .1288/√n − .6118/n + 1.3505/n². To make it a little easier on the calculator I rearranged it as 1.0063 − .6118/n + 1.3505/n² − .1288/√n.
    In the same paper, they gave formulas for critical values at other significance levels (all three formulas are sketched in code just after this list):

    1.0071 − 0.1371/√n − 0.3682/n + 0.7780/n² at α=0.10

    0.9963 − 0.0211/√n − 1.4106/n + 3.1791/n² at α=0.01
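
Here is a sketch of all three formulas in Python, extending the Step 6 sketch. The coefficients come straight from the formulas above; the function itself is just for illustration.

```python
from math import sqrt

# Coefficients (a, b, c, d) in a - b/sqrt(n) - c/n + d/n**2, by significance level
COEFFS = {
    0.10: (1.0071, 0.1371, 0.3682, 0.7780),
    0.05: (1.0063, 0.1288, 0.6118, 1.3505),
    0.01: (0.9963, 0.0211, 1.4106, 3.1791),
}

def ryan_joiner_crit(n, alpha=0.05):
    a, b, c, d = COEFFS[alpha]
    return a - b/sqrt(n) - c/n + d/n**2

for alpha in (0.10, 0.05, 0.01):
    print(alpha, round(ryan_joiner_crit(10, alpha), 4))
# At alpha = 0.05 with n = 10 this gives 0.9179, matching Step 6.
```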

The closer the points are to a straight line, the closer the data set is to fitting a normal model. In other words, a larger r indicates a ND, and a smaller r indicates a non-ND. You can draw one of two conclusions:

  • If r is greater than the critical value, the data are consistent with a normal model: you fail to reject normality and can treat the data set as ND.
  • If r is less than the critical value, the points are too far from a straight line: you reject normality and conclude that the data set is not ND.

So the bottom line is, if r > CRIT, treat the data as normal, and if r < CRIT, don’t.

The normal probability plot is just one of many possible ways to determine whether a data set fits the normal model. Another method, the D’Agostino-Pearson test, uses numerical measures of the shape of a data set called skewness and kurtosis to test for normality. For details, see Assessing Normality in Measures of Shape: Skewness and Kurtosis.
