Date: Dec 8, 2012 11:07 PM
Author: neuronet
Subject: How to fix ROC curve with point below diagonal?

I am building receiver operating characteristic (ROC) curves to evaluate classifiers using the area under the curve (AUC) (more details on that at end if you are interested). Unfortunately, points on the curve often go below the diagonal. For example, I end up with graphs that look like the one linked here (ROC curve in blue, identity line in grey):

http://i.stack.imgur.com/F4ZbX.gif

The third point (0.3, 0.2) falls below the diagonal. To calculate AUC I want to fix such recalcitrant points. The standard way to do this, for a point (fp, tp) on the curve, is to replace it with the point (1-fp, 1-tp), which is equivalent to inverting the classifier's predictions at that threshold.

For instance, in our example, our troublesome point A (0.3, 0.2) becomes point B (0.7, 0.8), which I have indicated in red:

http://i.imgur.com/s1IVT.gif
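To make the arithmetic concrete, here is a minimal sketch of that replacement in Python (not the Matlab I am actually using). The function name `reflect_roc_point` is just something I made up for illustration; it is not part of any library.

```python
def reflect_roc_point(fp, tp):
    """Reflect an ROC point through (0.5, 0.5).

    This corresponds to inverting the classifier's decisions at that
    threshold: positives become negatives and vice versa.
    """
    return (1.0 - fp, 1.0 - tp)


# Point A from the plot above.
point_a = (0.3, 0.2)

# Point B, the reflected point (approximately (0.7, 0.8) in floating point).
point_b = reflect_roc_point(*point_a)
print(point_b)
```

Reflecting twice gives back the original point, so the operation only makes sense for points that are actually below the diagonal.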

This is about as far as my books go in treating this issue. However, the problem is that if you now incorporate point B into a new ROC with point A removed, you end up with a nonmonotonic ROC curve as follows (red is the new ROC curve, and dotted blue line is the old one):

http://i.imgur.com/34OL4.gif
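The same problem can be reproduced numerically. The sketch below is in Python rather than Matlab, and the curve in it is hypothetical: only the point (0.3, 0.2) comes from my actual plot, and the other points are made up so that the example runs. The helper `is_monotonic` is also my own illustrative name.

```python
def is_monotonic(points):
    """Return True if neither fp nor tp ever decreases along the curve."""
    return all(
        fp2 >= fp1 and tp2 >= tp1
        for (fp1, tp1), (fp2, tp2) in zip(points, points[1:])
    )


# Hypothetical ROC points (fp, tp), ordered by threshold.
# Only (0.3, 0.2) is taken from my real curve; the rest are invented.
old_roc = [(0.0, 0.0), (0.1, 0.2), (0.3, 0.2), (0.5, 0.6), (1.0, 1.0)]

# Replace every point strictly below the diagonal with (1 - fp, 1 - tp),
# keeping it at the same position in the list.
new_roc = [
    (1.0 - fp, 1.0 - tp) if tp < fp else (fp, tp)
    for fp, tp in old_roc
]

print(is_monotonic(old_roc))  # the original curve is monotonic
print(is_monotonic(new_roc))  # the "repaired" curve is not
```

With these made-up points, the reflected point ends up to the right of and above its successor (0.5, 0.6), which is exactly the backtracking visible in the red curve.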

And here I am stuck. How can I fix this ROC curve?

Do I need to re-run my classifier with the data or classes somehow transformed to take into account this weird behavior? I have looked over reference [1] (below) but frankly it seems to be addressing a larger problem than this.

In terms of some details: I still have all the original threshold values, fp values, and tp values (and the output of the original classifier for each data point, an output which is just a scalar from 0 to 1 that is a probability estimate of class membership). I am doing this in Matlab starting with the perfcurve function.

------------------
[1] Flach and Wu (2005). Repairing concavities in ROC curves. Proceedings of the 19th International Joint Conference on Artificial Intelligence.