Drexel dragonThe Math ForumDonate to the Math Forum



Search All of the Math Forum:

Views expressed in these public forums are not endorsed by Drexel University or The Math Forum.


Math Forum » Discussions » Software » comp.soft-sys.matlab

Topic: how to find unique rows and calculate the average.
Replies: 9   Last Post: Apr 12, 2013 6:31 PM

Advanced Search

Back to Topic List Back to Topic List Jump to Tree View Jump to Tree View   Messages: [ Previous | Next ]
dpb

Posts: 8,257
Registered: 6/7/07
Re: how to find unique rows and calculate the average.
Posted: Apr 9, 2013 10:54 PM
  Click to see the message monospaced in plain text Plain Text   Click to reply to this topic Reply

On 4/9/2013 4:32 PM, Alan Hitch wrote:
> I have a matrix (2563x19) with text and numeric characters. The first
> row is a header.
>
> e.g. [Site Marshtype Date Species Throw Season Year UniqueID Total Pass1
> Pass2 Pass3 Pass4 Pass5 Pass6 Pass7 Pass8 Pass9 Pass10
> 3BAB Brackish 27-Oct-04 PAPU 2 Fall 2004 3BAB38287PAPU 35 25 9 0 1 0 0 0
> 0 0 0
> 3BAB Brackish 27-Oct-04 POLA 2 Fall 2004 3BAB38287POLA 79 46 19 9 5 0 0
> 0 0 0 0
> 3BAB Brackish 27-Oct-04 GAAF 2 Fall 2004 3BAB38287GAAF 64 45 13 6 0 0 0
> 0 0 0 0
> 3BAB Brackish 27-Oct-04 CYVA 2 Fall 2004 3BAB38287CYVA 11 2 4 2 3 0 0 0
> 0 0 0
> 3BAB Brackish 27-Oct-04 LUPA 2 Fall 2004 3BAB38287LUPA 20 11 2 4 2 1 0 0
> 0 0 0]
>
> I would like to get average values (Total) of particular rows that have
> identical column values (column of interest is UniqueID) and create a
> new matrix with the calculated mean and associated UniqueID. For each
> UniqueID there are between 2 and 3 values.
>
> Like this:
>
> [3BAB38287PAPU 234
> 3BAB38287POLA 456
> 3BAB38287GAAF 35]
>
> Could someone help me with this? I have tried a lot of things and
> nothing seems to be working.


Well, not seeing what you tried can't try to fix it, but one
straightforward approach...

>> type 'alan.dat'

3BAB Brackish 27-Oct-04 PAPU 2 Fall 2004 3BAB38287PAPU 35 25 9 0 1 0 0
0 0 0 0
3BAB Brackish 27-Oct-04 POLA 2 Fall 2004 3BAB38287POLA 79 46 19 9 5 0 0
0 0 0 0
3BAB Brackish 27-Oct-04 GAAF 2 Fall 2004 3BAB38287GAAF 64 45 13 6 0 0 0
0 0 0 0
3BAB Brackish 27-Oct-04 CYVA 2 Fall 2004 3BAB38287CYVA 11 2 4 2 3 0 0
0 0 0 0
3BAB Brackish 27-Oct-04 LUPA 2 Fall 2004 3BAB38287LUPA 20 11 2 4 2 1 0
0 0 0 0

>> fid=fopen('alan.dat','rt');
>> C=textscan(fid,[repmat('%*s',1,7) '%s'

repmat('%d',1,11)],'collectoutput',1);
>> fid=fclose(fid);
>> u=unique(C{1});
>> for i=1:length(u),m=mean(C{2}(strcmp(C{1},u(i)),1));disp([u(i) m]),end

'3BAB38287CYVA' [11]
'3BAB38287GAAF' [64]
'3BAB38287LUPA' [20]
'3BAB38287PAPU' [35]
'3BAB38287POLA' [79]
>>

Note I didn't worry about the header and for simplicity of demo just
threw away all the data columns up to the uniqeID one. Then used
'collectoutput' to put rest in a single cell.

Then we found the unique IDs that are in the file (of which there is
only one of each in the sample and ran a loop over it to find the mean
by using a logical vector returned by strcmp as the row index into the
data. I chose the first column arbitrarily.

Salt to suit for your case, of course...

--




Point your RSS reader here for a feed of the latest messages in this topic.

[Privacy Policy] [Terms of Use]

© Drexel University 1994-2014. All Rights Reserved.
The Math Forum is a research and educational enterprise of the Drexel University School of Education.