Using Nearest function
Posted: Jun 7, 2010 8:06 AM
This is my first attempt at writing Mathematica code, and I am getting strange results that are probably due to a bug I cannot find.
The code creates a test set of size 1000 with 2 classes, each class drawn from its own bivariate Gaussian distribution, then creates 6 training sets with the same 2 classes and sizes 10^i, i = 1..6.
I then run the nearest neighbor algorithm on each training set against the test set and compute the error rate.
However, as the table at the end shows, the error rates don't make much sense: most are close to 0.5, so I might as well flip a coin instead of running the algorithm. Unfortunately I cannot spot the bug in the code.
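Thanks.
****************************************************
(* Code *)
Needs["MultivariateStatistics`"];
m = 6; testSize = 1000;
(* One bivariate Gaussian per class, both with identity covariance *)
MN1 = MultinormalDistribution[{0.5, 0.5}, {{1, 0}, {0, 1}}];
MN2 = MultinormalDistribution[{-0.5, -0.5}, {{1, 0}, {0, 1}}];
(* n points: the first n/2 from class 1, the last n/2 from class 2 *)
RandomVector[n_] := Join[Array[RandomReal[MN1] &, n/2], Array[RandomReal[MN2] &, n/2]];
testSet = RandomVector[testSize];
(* Training sets of sizes 10, 100, ..., 10^m *)
trainingSets = Map[Function[x, RandomVector[x]], NestList[10 # &, 10, m - 1]];
(* Class from list position: positions up to testSize/2 are class 1 *)
classOf[i_] = If[i <= (testSize/2), 1, 2];
(* Fraction of test points whose nearest training point has a different class *)
NN[trainingSet_] := Module[{nnFunc = Nearest[trainingSet -> Automatic]},
  N[Fold[Plus, 0, MapIndexed[If[classOf[First[nnFunc[#1]]] != classOf[First[#2]], 1, 0] &, testSet]]/testSize]]
Grid[{Prepend[NestList[10 # &, 10, m - 1], "m"],
  Prepend[Map[Function[trainingSet, NN[trainingSet]], trainingSets], "error rate"]}, Frame -> All]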
m           10     100    1000    10000   100000   1000000
error rate  0.5    0.5    0.322   0.484   0.499    0.501
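One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: Nearest[trainingSet -> Automatic] returns positions in the training set, while classOf turns a position into a class using testSize/2. The two agree only when the training set has the same size as the test set, which would fit the pattern above, where 1000 is the only size with an error rate away from 0.5. The names classOfIndex and nnError below are hypothetical and not part of the original code; testSet and testSize are assumed to be defined as in the post.

```mathematica
(* Toy check: with data -> Automatic, Nearest returns positions in data. *)
pts = {{0, 0}, {1, 1}, {5, 5}, {6, 6}};
Nearest[pts -> Automatic][{5.2, 5.1}]   (* position of {5, 5}, i.e. {3} *)

(* Hypothetical helper: class from a position in a list of n points whose
   first half is class 1 and second half is class 2. *)
classOfIndex[i_, n_] := If[i <= n/2, 1, 2];

(* Hypothetical variant of NN; assumes testSet and testSize from the post.
   The neighbor's class is looked up against the training set's own length. *)
nnError[trainingSet_] := Module[
  {nnFunc = Nearest[trainingSet -> Automatic], nTrain = Length[trainingSet]},
  N[Total[MapIndexed[
      If[classOfIndex[First[nnFunc[#1]], nTrain] !=
         classOfIndex[First[#2], testSize], 1, 0] &,
      testSet]]/testSize]]
```

Under these assumptions, Map[nnError, trainingSets] could stand in for the Map over NN in the Grid call, so the error rates for all six sizes can be compared with the table above.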