Topic: Q: Statistical Significance - Neural Networks?
Replies: 4   Last Post: Feb 25, 2014 1:31 PM

 Richard Ulrich Posts: 2,961 Registered: 12/13/04
Re: Q: Statistical Significance - Neural Networks?
Posted: Feb 23, 2014 9:36 PM

On Sun, 23 Feb 2014 15:49:12 -0800, "JoeCL" <joecl@earthlink.net>
wrote:

>Hello Everyone,
>
>I'm working with artificial neural networks and I need to derive some
>indication of how well the networks are identifying intended targets versus
>non-targets.
>
>The thing is that all targets are definitely not created equal. Some targets
>are much more important than others to get right, while others are of lower
>importance.

I'm wondering whether you are confounding "importance"
with frequency ... where you might consider the rare ones
important, OR you might consider the common ones important.

>
>I looked at this problem from a coin-toss perspective and used an online
>binomial calculator
>(http://stattrek.com/online-calculator/binomial.aspx#TopPage) to try to say
>that the chances of the neural network identifying targets by dumb luck were
>exceedingly unlikely. But this is based on all targets being of equal value
>and assuming a simple correctness measure (fraction right out of the total
>targets the network thought it found).

You might want to look at something like "Measures of
Association for Cross Classifications" by Goodman and Kruskal
(JASA, 1954; also published as a book, 1979).

For symmetrical measures of success or effect, the odds ratio
has the advantage of being very transparent. If you do have
data for which it is important not to miss certain outcomes,
you may prefer some asymmetrical measure. However, if
you want a test on a single outcome, almost every test
comes very close to the result of the simple contingency
chi-squared test on the 2x2 table of Predicted (Yes/No) by
Actual (Yes/No).
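
As a rough illustration of the odds ratio and the simple 2x2 chi-squared test, here is a minimal Python sketch. The function name and the cell labels (tp, fp, fn, tn) are hypothetical and not from the thread. The p-value relies on the fact that a chi-squared variable with 1 degree of freedom is a squared standard normal, and no continuity correction is applied.

```python
import math

def two_by_two_summary(tp, fp, fn, tn):
    """Odds ratio, Pearson chi-squared, and p-value for a 2x2 table.

    Hypothetical cell labels:
      tp = predicted Yes, actual Yes    fp = predicted Yes, actual No
      fn = predicted No,  actual Yes    tn = predicted No,  actual No
    """
    n = tp + fp + fn + tn
    pred_yes, pred_no = tp + fp, fn + tn
    act_yes, act_no = tp + fn, fp + tn
    if 0 in (pred_yes, pred_no, act_yes, act_no):
        raise ValueError("every row and column total must be positive")

    # Odds ratio: cross-product of the table; infinite if an off-diagonal cell is 0.
    odds_ratio = (tp * tn) / (fp * fn) if fp * fn > 0 else math.inf

    # Pearson chi-squared for a 2x2 table (1 degree of freedom, uncorrected).
    chi2 = n * (tp * tn - fp * fn) ** 2 / (pred_yes * pred_no * act_yes * act_no)

    # Upper tail of chi-squared with 1 df: P(Z^2 > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, chi2, p_value
```

With many cases the p-value will usually be tiny whenever the network has any real skill, so the odds ratio is likely the more informative number to report.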

If you have a large number of targets, I think you must
already have a pretty good idea that you will have *some*
success, so a marginal test of "significance", I think,
should not be very interesting.

Again, you should be looking for *measures* of success,
not *tests* of success.

>
>If, as I said, the targets are of unequal value, then a correctness measure
>is only part of what's needed and some kind of weighting is involved.
>
>Do you have any thoughts about how to derive statistics that demonstrate the
>statistical significance of a neural network's "correctness" versus sheer
>chance when the targets are of unequal importance? And then, how would I
>state the results?

(Repeating myself....)
I think you are probably off-base when you try to force
the question into the shape of "statistical significance",
especially if you are concerned with matching some
criterion ("p < 0.01"). Neural networks, as far as my limited
knowledge goes, are poorly suited for direct tests because
fitting them searches over far too many possibilities. And that
should be the case, even more so, when there are
"multiple targets."

For a test, you need to measure success under cross-validation
(on cases held out from the fitting), not the success of the
immediate fit. Find the measures that tell you something
interesting, odds ratio or otherwise. Is it enough
to look at the separate targets, or do you need something
overall? Are you satisfied by selecting out a subset of
targets, in order to reflect their extra importance?
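
To make the cross-validation point concrete, here is a small sketch that pools held-out confusion counts over k folds. Everything in it is an assumption for illustration: the helper names are hypothetical, and a one-feature threshold rule stands in for the neural network, which would be refit on each training fold in the same way. It also assumes every training fold contains both classes.

```python
def k_fold_indices(n, k):
    """Yield (train_idx, test_idx) for k contiguous folds over range(n)."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

def fit_threshold(xs, ys):
    """Stand-in 'model': midpoint between the two class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def cross_validated_counts(xs, ys, k=5):
    """Pool (tp, fp, fn, tn) over held-out folds only."""
    tp = fp = fn = tn = 0
    for train, test in k_fold_indices(len(xs), k):
        # Fit on the training fold only; score on cases the fit never saw.
        thr = fit_threshold([xs[i] for i in train], [ys[i] for i in train])
        for i in test:
            pred = 1 if xs[i] > thr else 0
            if pred == 1 and ys[i] == 1:
                tp += 1
            elif pred == 1 and ys[i] == 0:
                fp += 1
            elif pred == 0 and ys[i] == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn
```

The pooled counts can then be fed into whatever measure is of interest (an odds ratio, an error count), and the same routine can be run on a subset of the more important targets.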

If your fit is pretty good, you might focus on the count
of actual *errors* (or "important errors"?) instead of
some more complicated statistic.
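
One simple way to count errors and "important errors" side by side is sketched below. The importance weights are an assumption: the thread does not say how importance would be quantified, so the function name and the weights here are purely hypothetical.

```python
def weighted_error_summary(actual, predicted, importance):
    """Return (raw error count, share of total importance that was missed).

    importance: one non-negative weight per case (hypothetical scale).
    """
    if not (len(actual) == len(predicted) == len(importance)):
        raise ValueError("all three sequences must have the same length")
    errors = [a != p for a, p in zip(actual, predicted)]
    error_count = sum(errors)
    # Sum the importance of only those cases the network got wrong.
    missed_weight = sum(w for e, w in zip(errors, importance) if e)
    total_weight = sum(importance)
    return error_count, missed_weight / total_weight
```

Reporting both numbers shows whether the mistakes fall mostly on the cases that matter or on the ones that do not.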

>
>With much appreciation for any insights.

Hope this helps.

--
Rich Ulrich
