
Ask Dr. Math - Questions and Answers from our Archives

Outliers in a Box-And-Whisker Plot


Date: 12/07/2000 at 07:21:19
From: Bob Tucek
Subject: Outliers in a box-and-whisker plot

I am teaching box and whisker plots to my seventh grade students. We 
have calculated 1.5 times the IQR and added it to the upper quartile 
and subtracted it from the lower quartile. If any data is beyond those 
points, it is an outlier.

The question is, "Would it be an outlier if the point were exactly 
1.5*IQR away from one of the quartiles?" My instincts tell me that it 
would not be; that the point would need to be farther away than that. 
This appears to be confirmed by the TI-83 calculator, since it does 
not graph the point as an outlier.

However, I used a worksheet for an assignment that had only one point 
that was exactly equal to 1.5*IQR away, and none farther away. The 
first question on the page asks them to identify the outlier. This 
implies that it would be an outlier. I have looked in every book I 
have, searched your archives and the archives of other math sites 
online and can't find a clarification anywhere. They all explain how 
to find them, but aren't specific enough to answer my question.  

Thank you,
Bob Tucek


Date: 12/07/2000 at 11:58:54
From: Doctor TWE
Subject: Re: Outliers in a box-and-whisker plot

Hi Bob - thanks for writing to Dr. Math.

Outliers are data points that are outside the range of the data values 
that we want to describe. Outliers can be due to an error in 
measurement of the value, a value from a different population, or 
simply a rare chance event. In any case, where we draw the line for 
"outside" is somewhat arbitrary. (Why 1.5*IQR? Why not 2*IQR? 
Or sqrt(10)*IQR? Or (pi/2)*IQR?) When taking statistical measurements, 
there is a "gray area," and methods for finding outliers are simply 
guidelines to help us find errant points - they're not intended to be 
absolute.

The farther a data point lies from the middle of the data, the more 
suspect it is. I would describe a data point that is exactly 1.5*IQR 
away from a quartile as a "borderline outlier." The idea is to 
recognize that these data points are more likely to be "tainted."
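
To make the borderline case concrete, here is a minimal Python sketch 
of the 1.5*IQR rule described in the question. The function names, the 
sample data, and the `inclusive` flag are illustrative assumptions, 
and the quartiles use the median-of-halves convention, which is only 
one of several definitions in use.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: for an odd number of values the middle value is left
    out of both halves. Other quartile conventions give other results.
    """
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return median(lower), median(upper)

def iqr_outliers(data, k=1.5, inclusive=False):
    """List the points beyond the fences Q1 - k*IQR and Q3 + k*IQR.

    With inclusive=True, a point exactly on a fence is also listed.
    """
    q1, q3 = quartiles(data)
    spread = k * (q3 - q1)
    low, high = q1 - spread, q3 + spread
    if inclusive:
        return [x for x in data if x <= low or x >= high]
    return [x for x in data if x < low or x > high]

# Hypothetical data: Q1 = 4, Q3 = 8, IQR = 4, so the fences are -2 and 14.
data = [2, 4, 4, 6, 6, 8, 8, 14]
strict = iqr_outliers(data)                  # 14 is on the fence, not beyond it
loose = iqr_outliers(data, inclusive=True)   # 14 is counted
print(strict, loose)
```

The only difference between the two answers is whether the comparison 
is "greater than" or "greater than or equal to," which is exactly the 
choice the worksheet and the calculator appear to make differently.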

The method you describe is, in fact, only one way of finding outliers. 
Another method is to define an outlier as any data point where the 
absolute value of the z-score is greater than 3 (i.e., it lies more 
than 3 standard deviations away from the mean). This definition would 
create a different set of boundaries for outliers.
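
As a rough illustration of the z-score rule, the sketch below flags 
points whose absolute z-score exceeds 3. The function name and both 
data sets are made up for this example, and the use of the sample 
standard deviation (rather than the population one) is an assumption.

```python
from statistics import mean, stdev

def z_score_outliers(data, threshold=3.0):
    """List the points lying more than `threshold` sample standard
    deviations away from the mean."""
    m = mean(data)
    s = stdev(data)  # sample standard deviation (divides by n - 1)
    if s == 0:
        return []    # all values are equal, so nothing stands out
    return [x for x in data if abs(x - m) / s > threshold]

# Hypothetical data sets.
small = [2, 4, 4, 6, 6, 8, 8, 14]
large = [10] * 20 + [100]

print(z_score_outliers(small))
print(z_score_outliers(large))
```

In the eight-value list, 14 sits exactly 1.5*IQR above the upper 
quartile (taking the quartiles as 4 and 8), yet its z-score is only 
about 2, so the two methods draw their lines in different places.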

Incidentally, my college statistics textbook (_Statistics for 
Engineering and the Sciences_, 4th edition; W. Mendenhall and T. 
Sincich; Prentice-Hall; 1995) describes suspect outliers as 
"observations that fall between the inner fences and the outer 
fences," where the inner fences lie 1.5*IQR beyond the quartiles and 
the outer fences lie 3*IQR beyond them. To me, this implies that a data 
point exactly on the inner fence would not be considered a suspect 
outlier (since it is not "between" the fences). The book then describes 
highly suspect outliers as "observations that fall outside the outer 
fences." How, then, are we to interpret a data point that falls exactly on 
the outer fence? It is, strictly speaking, neither "between the 
fences" nor "beyond the outer fence." [Perhaps we can call it a 
"somewhat highly suspect outlier."] An important thing to note is 
that in the end-of-chapter summary, it describes both of these as 
"rules of thumb for detecting outliers."

The bottom line: The reliability of data points in our data set is not 
an "all-or-nothing" situation, but rather colored in shades of gray. 
Where we choose to "draw the line" is somewhat arbitrary and can be 
determined using different methods. So pick a method, and just be 
consistent.
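
One way to pick a convention and stay consistent is to write it down 
once. The sketch below encodes the inner and outer fences at 1.5*IQR 
and 3*IQR beyond the quartiles and reports points that land exactly on 
a fence separately, since the strict wording leaves them unclassified. 
The function name, the labels, and the quartile values are assumptions 
chosen for illustration.

```python
def classify(x, q1, q3):
    """Label x relative to the inner (1.5*IQR) and outer (3*IQR) fences.

    q1 and q3 are supplied by the caller, because quartile definitions
    vary. The exact equality tests are safe only when the values are
    exactly representable, as they are in the examples below.
    """
    iqr = q3 - q1
    inner, outer = 1.5 * iqr, 3.0 * iqr

    # Distance of x beyond the nearest quartile (0 if inside the box).
    if x < q1:
        dist = q1 - x
    elif x > q3:
        dist = x - q3
    else:
        dist = 0.0

    if dist == 0:
        return "not flagged"
    if dist > outer:
        return "highly suspect"
    if dist == outer:
        return "on outer fence"
    if dist > inner:
        return "suspect"
    if dist == inner:
        return "on inner fence"
    return "not flagged"

# Hypothetical quartiles Q1 = 4 and Q3 = 8: inner fences at -2 and 14,
# outer fences at -8 and 20.
for value in (5, 14, 17, 20, 25):
    print(value, classify(value, 4, 8))
```

Whether an "on fence" point is then folded into the category above or 
below it is the arbitrary choice; what matters is that the same choice 
is applied to every data set.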

I hope this helps. If you have any more questions, write back.

- Doctor TWE, The Math Forum
  http://mathforum.org/dr.math/   
    


For more on the meanings of "quartile" and mathematicians' 
disagreements about them, see

  Defining Quartiles
  http://mathforum.org/library/drmath/view/60969.html

- Doctor Melissa, The Math Forum
  http://mathforum.org/dr.math/   
    
Associated Topics:
College Statistics
High School Statistics

Ask Dr. MathTM
© 1994-2013 The Math Forum
http://mathforum.org/dr.math/