Math Forum » Discussions » Software » comp.soft-sys.matlab

Topic: Downsampling very large text file
Replies: 7   Last Post: Apr 18, 2013 9:02 AM

dpb

Posts: 8,217
Registered: 6/7/07
Re: Downsampling very large text file
Posted: Apr 10, 2013 4:13 PM

On 4/10/2013 1:13 PM, bram wrote:
> Hi all,
>
> For an experiment I measured vast amounts of data because of a high sample
> rate. I've taken out the important parts; now I need to downsample the data
> to get a general trend.
>
> The data is in a text file; it has 8 columns and so many rows that the text
> files are around 50GB. Columns are separated by tabs. The first 84 rows are
> sensor properties.
>
> I'm trying to read 200 000 rows, average them into a single value, write
> that to a new array, then take the next ones... and so on.
>
> Can you guys help me along? I've used quite some MATLAB but never
> analysed such data files.
>
> I've made a start with textscan, which seems to be working nicely, but I'm
> having trouble making proper for-loops in combination with the ~feof test.


Should be a piece o' cake... here's a sample w/ a very short file, but the
idea's the same...

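NtoAvg = 4;                     % how many records for each average
nCols  = 8;                     % number of columns in your file
nMax   = 1e4;                   % rows to preallocate (a guess; grow m if you run out)
m   = zeros(nMax,nCols);
fmt = repmat('%f',1,nCols);     % one %f per column
i   = 0;
fid = fopen('yourfile.dat','rt');
while ~feof(fid)
  C = textscan(fid,fmt,NtoAvg,'collectoutput',1,'delimiter','\t');
  i = i+1;
  m(i,:) = mean(C{1},1);        % average down the rows, one value per column
end
fid = fclose(fid);
m = m(1:i,:);                   % drop the unused preallocated rows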


Note that the above still works if the last set has fewer than NtoAvg
records; it will just average over a smaller number. I'm presuming this
won't matter.
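
One case worth checking in a short final set is a chunk holding exactly one record. With no dimension argument, mean() works along the first non-singleton dimension, so a single row gets averaged across its columns instead. A minimal illustration, with made-up values:

```matlab
x = [1 2 3 4];     % a final chunk containing a single 4-column record
a = mean(x);       % 2.5: averages across the row, collapsing the columns
b = mean(x, 1);    % [1 2 3 4]: averages down dimension 1, one value per column
```

Passing the dimension explicitly, as in mean(C{1},1), keeps one average per column no matter how many rows were read.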

Obviously you'll need some logic to check whether you need more room and
grow the array if so, etc., but the idea is as simple as the above.
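
As a rough sketch of that growth logic (the block size blockRows and the stand-in loop are assumptions for illustration, not part of the sample above), the array can be extended a block at a time and trimmed at the end:

```matlab
nCols     = 8;                    % columns in the data (from the original question)
blockRows = 1000;                 % assumed growth increment; tune to taste
m = zeros(blockRows, nCols);      % initial allocation
i = 0;
for k = 1:2500                    % stand-in for the textscan read loop
    row = k * ones(1, nCols);     % stand-in for mean(C{1},1)
    i = i + 1;
    if i > size(m, 1)             % out of room: append another block of rows
        m = [m; zeros(blockRows, nCols)];
    end
    m(i, :) = row;
end
m = m(1:i, :);                    % drop the unused preallocated rows
```

Growing in blocks rather than one row at a time keeps the number of reallocations small.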

NB that feof() isn't [yet] T after the last record is read if the number
of records in the file happens to be an exact multiple of NtoAvg, because
the end-of-file condition isn't detected until the next read. In that (I
think unlikely) case you'll get a record of NaN because the cell array
will be empty. You can use a test on isempty() to stop that from
happening, but since it probably would be noticeable in runtime I didn't
include it above. If wanted, it would look like so...

...
C=textscan(fid,fmt,NtoAvg,'collectoutput',1,'delimiter','\t');
if ~isempty(C{1})          % skip the empty read at end of file
  i=i+1;
  m(i,:) = mean(C{1},1);   % average down the rows
end
...
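
Putting the pieces together, here is a small self-contained sketch that can be run to try the idea out. The temporary file, its two header lines, and the two-column layout are all made up for the test; the real file would use 84 header lines, 8 columns, and a much larger NtoAvg. Skipping the header with fgetl() is one option assumed here, not something from the sample above.

```matlab
% Build a small test file: 2 header lines, then 8 rows of 2 tab-separated columns.
fname = [tempname '.dat'];
fid = fopen(fname, 'wt');
fprintf(fid, 'sensor property A\nsensor property B\n');
fprintf(fid, '%g\t%g\n', [1:8; 11:18]);   % each matrix column becomes one file row
fclose(fid);

nHdr   = 2;                    % header lines to skip (84 in the original question)
NtoAvg = 4;                    % rows per average; 8 rows is an exact multiple
nCols  = 2;                    % columns in this test file
fmt = repmat('%f', 1, nCols);
m = zeros(10, nCols);          % more than enough room for this test
i = 0;

fid = fopen(fname, 'rt');
for k = 1:nHdr
    fgetl(fid);                % read and discard one header line
end
while ~feof(fid)
    C = textscan(fid, fmt, NtoAvg, 'CollectOutput', 1, 'Delimiter', '\t');
    if ~isempty(C{1})          % guard against an empty read at end of file
        i = i + 1;
        m(i, :) = mean(C{1}, 1);   % average down the rows
    end
end
fclose(fid);
m = m(1:i, :);                 % keep only the rows actually filled
delete(fname);                 % clean up the test file
```

With the guard in place, m ends up with exactly two rows here, one per group of four records, whether or not the final read comes back empty.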
