Search All of the Math Forum:
Views expressed in these public forums are not endorsed by
Drexel University or The Math Forum.



Re: how to find unique rows and calculate the average.
Posted:
Apr 10, 2013 5:10 PM


dpb <none@non.net> wrote in message <kk2k8a$h1m$1@speranza.aioe.org>... > On 4/9/2013 4:32 PM, Alan Hitch wrote: > > I have a matrix (2563x19) with text and numeric characters. The first > > row is a header. > > > > e.g. [Site Marshtype Date Species Throw Season Year UniqueID Total Pass1 > > Pass2 Pass3 Pass4 Pass5 Pass6 Pass7 Pass8 Pass9 Pass10 > > 3BAB Brackish 27Oct04 PAPU 2 Fall 2004 3BAB38287PAPU 35 25 9 0 1 0 0 0 > > 0 0 0 > > 3BAB Brackish 27Oct04 POLA 2 Fall 2004 3BAB38287POLA 79 46 19 9 5 0 0 > > 0 0 0 0 > > 3BAB Brackish 27Oct04 GAAF 2 Fall 2004 3BAB38287GAAF 64 45 13 6 0 0 0 > > 0 0 0 0 > > 3BAB Brackish 27Oct04 CYVA 2 Fall 2004 3BAB38287CYVA 11 2 4 2 3 0 0 0 > > 0 0 0 > > 3BAB Brackish 27Oct04 LUPA 2 Fall 2004 3BAB38287LUPA 20 11 2 4 2 1 0 0 > > 0 0 0] > > > > I would like to get average values (Total) of particular rows that have > > identical column values (column of interest is UniqueID) and create a > > new matrix with the calculated mean and associated UniqueID. For each > > UniqueID there are between 2 and 3 values. > > > > Like this: > > > > [3BAB38287PAPU 234 > > 3BAB38287POLA 456 > > 3BAB38287GAAF 35] > > > > Could someone help me with this? I have tried a lot of things and > > nothing seems to be working. > > Well, not seeing what you tried can't try to fix it, but one > straightforward approach... > > >> type 'alan.dat' > > 3BAB Brackish 27Oct04 PAPU 2 Fall 2004 3BAB38287PAPU 35 25 9 0 1 0 0 > 0 0 0 0 > 3BAB Brackish 27Oct04 POLA 2 Fall 2004 3BAB38287POLA 79 46 19 9 5 0 0 > 0 0 0 0 > 3BAB Brackish 27Oct04 GAAF 2 Fall 2004 3BAB38287GAAF 64 45 13 6 0 0 0 > 0 0 0 0 > 3BAB Brackish 27Oct04 CYVA 2 Fall 2004 3BAB38287CYVA 11 2 4 2 3 0 0 > 0 0 0 0 > 3BAB Brackish 27Oct04 LUPA 2 Fall 2004 3BAB38287LUPA 20 11 2 4 2 1 0 > 0 0 0 0 > > >> fid=fopen('alan.dat','rt'); > >> C=textscan(fid,[repmat('%*s',1,7) '%s' > repmat('%d',1,11)],'collectoutput',1); > >> fid=fclose(fid); > >> u=unique(C{1}); > >> for i=1:length(u),m=mean(C{2}(strcmp(C{1},u(i)),1));disp([u(i) m]),end > '3BAB38287CYVA' [11] > '3BAB38287GAAF' [64] > '3BAB38287LUPA' [20] > '3BAB38287PAPU' [35] > '3BAB38287POLA' [79] > >> > > Note I didn't worry about the header and for simplicity of demo just > threw away all the data columns up to the uniqeID one. Then used > 'collectoutput' to put rest in a single cell. > > Then we found the unique IDs that are in the file (of which there is > only one of each in the sample and ran a loop over it to find the mean > by using a logical vector returned by strcmp as the row index into the > data. I chose the first column arbitrarily. > > Salt to suit for your case, of course... > >  dpb,
Thanks for responding. I neglected to tell you that the data is a cell array with different data types and textscan only works with double or string. I read the data in using xlsread and was using the raw data. I am relatively new to MatLab. How do I convert the cell array into a double or string? I tried using cell2mat but the contents of my cellarray are different data types.
Thanks again Alan



