Math Forum » Discussions » Software » comp.soft-sys.matlab

Topic: Minimum data set size required for Kruskal-Wallis test?
Replies: 2   Last Post: May 24, 2013 8:18 PM

Kate J.

Posted: May 22, 2013 9:11 PM

I'm attempting to perform the nonparametric Kruskal-Wallis test (kruskalwallis() function) on 3 sets of data generated from 3 different testing conditions. In the past, I've successfully performed this test on large data sets from a different project (with 100+ members in each set). Currently, each of my sets only has 3 to 5 values. (I'm always comparing sets of equal size.)

The problem: code that previously ran the Kruskal-Wallis analysis successfully on the larger data sets now produces error messages on my current, much smaller data sets. Is there a minimum set size required to perform Kruskal-Wallis analysis?

Here is my code:

dataSetA = [21.4 27.2 31.8];
dataSetB = [54.0 57.0 59.4];
dataSetC = [30.6 48.2 35.2];

myData = [dataSetA dataSetB dataSetC];
[p,table,stats] = kruskalwallis(mydata)
c1 = multcompare(stats)

The plot that is generated contains only a single boxplot instead of 3 (I know that a boxplot for only 3 values is dicey...), and here is the MATLAB screen output:

p = 1

table =
    'Source'     'SS'     'df'    'MS'     'Chi-sq'    'Prob>Chi-sq'
    'Columns'    [  0]    [14]    [  0]    [     0]    [          1]
    'Error'      [279]    [ 0]    [NaN]          []               []
    'Total'      [279]    [14]       []          []               []

stats =
gnames: '1'
n: [1 1 1 1 1 1 1 1 1 1 1 1 1 1 1]
source: 'kruskalwallis'
meanranks: 8
sumt: 12

Note: Intervals can be used for testing but are not simultaneous confidence intervals.
??? Subscripted assignment dimension mismatch.

Error in ==> multcompare>makeM at 564
MM(:,2) = sqrt(diag(gcov));

Error in ==> multcompare at 475
[M,MM,hh] = makeM(gmeans, gcov, crit, gnames, mname, dodisp);


Since collecting this particular data is time-consuming, it would be good to be able to get an idea about approximately how large my data sets will need to be in order for this type of analysis to work (as it *is* possible for me to collect more, if necessary); otherwise, I should consider other forms of statistical analysis.
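
The output above hints that group layout, rather than sample size, may be the issue. stats.n lists 15 groups of one observation each and df is 14, which is what results when a 1-by-15 row vector is passed in: kruskalwallis treats each column of its input as a separate group. The code also assigns myData but passes mydata; MATLAB is case-sensitive, so an older 15-element variable may have been tested instead. A minimal sketch, assuming the intent is three groups of three values each, arranges the data as one column per group:

```matlab
% Same values as in the post; each row vector is one testing condition.
dataSetA = [21.4 27.2 31.8];
dataSetB = [54.0 57.0 59.4];
dataSetC = [30.6 48.2 35.2];

% Transpose so that each condition becomes one column (3-by-3 matrix).
% kruskalwallis treats every column of a matrix as a separate group.
myData = [dataSetA' dataSetB' dataSetC'];

% Note the consistent capitalization of myData: MATLAB is case-sensitive.
[p, tbl, stats] = kruskalwallis(myData);

% Pairwise comparisons on the mean ranks of the three groups.
c1 = multcompare(stats);
```

If the groups ever differ in size, an alternative is to keep all values in one vector and pass a grouping vector as the second argument to kruskalwallis. Very small groups should not by themselves make the function fail, although the chi-square approximation behind the p-value is rough at these sizes.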

Thanks in advance for your insights!
