The Normal Probability Plot is used to help judge whether or not a sample of numeric data

comes from a normal distribution. If it does not, you can often determine the type of departure

from normality by examining the way in which the data deviate from the normal reference line.

Sample StatFolio: probplot.sgp

Sample Data

The file bottles.sgd contains the measured bursting strength of n = 100 glass bottles, similar to a

dataset contained in Montgomery (2005). The table below shows a partial list of the data from

that file:

strength

255

232

282

260

255

233

240

255

254

259

235

262

The data to be analyzed consist of a single numeric column containing n = 2 or more

observations.

Data: numeric column containing the data to be summarized.

Select: subset selection.

The Analysis Summary shows the number of observations in the data column. Also displayed are the largest and smallest values.

Probability Plot - strength

Data variable: strength

100 values ranging from 225.0 to 282.0

This pane displays the probability plot.

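[Figure: normal probability plot of strength. Horizontal axis: strength (220 to 300). Vertical axis: percentage (0.1 to 99.9). Annotations on the plot: n: 100, Median: 255.0, Sigma: 8.14815, W: 0.97781, P: 0.4162]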

The plot is constructed in the following manner:

The data are sorted from smallest to largest and the order statistics are determined. By

definition, the j-th order statistic is the j-th smallest observation in the sample, denoted by

x(j).

The data are then plotted at the positions

$$\left( x_{(j)},\ \Phi^{-1}\!\left( \frac{j - 0.375}{n + 0.25} \right) \right) \qquad (1)$$

where $\Phi^{-1}(u)$ indicates the inverse standard normal distribution evaluated at u.

If desired, a straight line is fit to the data and added to the plot.
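
The construction steps above can be sketched in a few lines of Python. This is an illustrative sketch only, not the program's own implementation: the function names normal_plot_positions and least_squares_line are hypothetical, the twelve values are the partial list from the table (not the full 100-value file), and the least-squares fit is an assumption, since the text does not say how the reference line is fit.

```python
from statistics import NormalDist


def normal_plot_positions(data):
    """Return (x_(j), z_j) pairs with z_j = Phi^{-1}((j - 0.375) / (n + 0.25))."""
    xs = sorted(data)  # order statistics x_(1) <= ... <= x_(n)
    n = len(xs)
    if n < 2:
        raise ValueError("need at least 2 observations")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (x, std_normal.inv_cdf((j - 0.375) / (n + 0.25)))
        for j, x in enumerate(xs, start=1)
    ]


def least_squares_line(points):
    """Fit x = intercept + slope * z by ordinary least squares (an assumed method)."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    z_bar = sum(z for _, z in points) / n
    slope = sum((z - z_bar) * (x - x_bar) for x, z in points) / sum(
        (z - z_bar) ** 2 for _, z in points
    )
    return slope, x_bar - slope * z_bar


# The twelve values shown in the partial data listing.
strength = [255, 232, 282, 260, 255, 233, 240, 255, 254, 259, 235, 262]
points = normal_plot_positions(strength)
slope, intercept = least_squares_line(points)

for x, z in points:
    print(f"{x:6.1f}  {z:8.4f}")
print(f"fitted line: x = {intercept:.2f} + {slope:.2f} * z")
```

Under this fit, the intercept estimates the center of the data and the slope estimates its spread. The vertical axis of the plot shows percentages rather than z values, but the spacing of the points is the same.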

The normal probability plot is created in such a way that, if the data are random samples from a

normal distribution, they should lie approximately along a straight line. In the above plot, the

deviation of the values from the reference line at both ends indicates that the data may come

from a distribution with relatively longer tails than a normal distribution.
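
To see why departures at both ends suggest longer tails, the sketch below compares idealized "data" (exact quantiles rather than a random sample) from a normal distribution and from a logistic distribution, which has longer tails than the normal. Everything here is an assumption made for illustration: the function name tail_departures is hypothetical, the logistic distribution is only a stand-in for a long-tailed population, and the reference line is drawn through the quartile points, which may differ from the line used in the plot above.

```python
import math
from statistics import NormalDist


def tail_departures(quantile, n=100):
    """Departure of the smallest and largest idealized points from a reference line.

    quantile(p) is the inverse CDF of the distribution being examined.
    The reference line passes through the lower and upper quartile points
    (an assumed choice of line).
    """
    std_normal = NormalDist()
    probs = [(j - 0.375) / (n + 0.25) for j in range(1, n + 1)]
    z = [std_normal.inv_cdf(p) for p in probs]  # normal scores
    x = [quantile(p) for p in probs]            # idealized order statistics

    z_lo, z_hi = std_normal.inv_cdf(0.25), std_normal.inv_cdf(0.75)
    x_lo, x_hi = quantile(0.25), quantile(0.75)
    slope = (x_hi - x_lo) / (z_hi - z_lo)
    intercept = x_lo - slope * z_lo

    low = x[0] - (intercept + slope * z[0])      # negative: smaller than the line predicts
    high = x[-1] - (intercept + slope * z[-1])   # positive: larger than the line predicts
    return low, high


def logistic_quantile(p):
    """Inverse CDF of the standard logistic distribution."""
    return math.log(p / (1 - p))


low, high = tail_departures(logistic_quantile)
normal_low, normal_high = tail_departures(NormalDist(250, 8).inv_cdf)

print(f"logistic: low end {low:+.3f}, high end {high:+.3f}")
print(f"normal:   low end {normal_low:+.3f}, high end {normal_high:+.3f}")
```

For the normal case both departures are essentially zero, so the points follow the line. For the longer-tailed case the smallest point falls below the line and the largest falls above it, which is the pattern of deviation at both ends described above.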


