dpb
Re: textscan: handling spaces in fixed-width data
Posted:
Jan 3, 2013 10:08 AM
On 1/3/2013 7:23 AM, David wrote:
> Hello everyone, please consider the following simple example, with
> fixed-width-data (here: 1-digit per data):
>
> myString = '12 45 78';
> a = textscan(myString,repmat('%1d',1,9));
> b = textscan(myString,repmat('%1d',1,9),'TreatAsEmpty',' ');
>
> What I want to have is something like
> {1,2,<empty>,4,5,<empty>,7,8,<empty>}, i.e. spaces should be treated as
> empty/undefined value.
>
> But, what I get for a AND for b (they are identical!) is:
> {1,2,4,5,7,8,<empty>,<empty>,<empty>}
>
> Maybe I did not understand the "TreatAsEmpty"-option very well?! Do you
> have any advice how to use textscan for this?
>
> I have to read huge amounts of data, so the rowwise evaluation as
> characters/strings in combination with str2double seems not to be a good
> solution for me.
You've hit the conundrum of scanning fixed-width text input in Matlab: it simply doesn't know how (im[ns]ho it's just broken).
About all you can do is the workaround you describe above, or read the data as character and do a mass substitution of the blanks w/ another character that won't be silently eaten no matter what...
So, something like
    str = strrep(myString,' ','b');
    b = textscan(str,'%1d','TreatAsEmpty','b');
Trouble is that even here the return value will be 0, not [], so if zero is a possible value as well you're still stuck, because textscan()'s 'EmptyValue' argument is ignored for non-delimited files as well.
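For the single-digit case in the example, one way around the zero ambiguity is to skip textscan() and use plain character arithmetic. This is only a sketch and it assumes every field is exactly one character wide and holds either a digit or a blank; the variable name vals is just for illustration.

```matlab
% Sketch: parse one-character fixed-width fields without textscan().
% Assumption: each character is either a digit '0'..'9' or a blank.
myString = '12 45 78';

vals = double(myString) - double('0');   % map '0'..'9' to 0..9
vals(myString == ' ') = NaN;             % blanks become NaN instead of 0

% vals is now [1 2 NaN 4 5 NaN 7 8]
```

Because the result is double, a blank field shows up as NaN and stays distinguishable from a genuine zero. It is fully vectorized, so it should also cope w/ large inputs read in as one character array.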
textscan() is useful but it still has warts...of course, w/ fixed-width fields, there's no parsing in Matlab that doesn't. :(
In reality, probably your best bet is to create the file as delimited, as painful as that sounds.
If you're comfortable w/ Fortran, another workaround is a mex-file; in the past I've written ones that use Fortran FORMAT statements to deal w/ the problem.
I don't know why TMW won't take this on as a challenge despite it being an issue since day 1...other things are more glamorous it seems.
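If the file can't be regenerated, the delimiters can also be inserted in memory after reading the text as character. A rough sketch, assuming a constant field width w and a string whose length is a multiple of w (the names w, fields and delimited are made up for the illustration):

```matlab
% Sketch: turn fixed-width text into comma-delimited text in memory.
% Assumptions: constant field width w; numel(myString) is a multiple of w.
w = 1;                                 % field width in characters
myString = '12 45 78';

fields = reshape(myString, w, []);     % one field per column
fields(end+1, :) = ',';                % append a delimiter after each field
delimited = fields(:).';               % flatten column-wise into a row
delimited(end) = [];                   % drop the trailing delimiter

% delimited is now '1,2, ,4,5, ,7,8'
% It can then be passed to textscan() as delimited text; how the
% blank fields come back should be checked on your release.
c = textscan(delimited, '%f', 'Delimiter', ',');
```

Reading w/ '%f' rather than '%d' is deliberate here, since a floating-point column can hold NaN for the empty fields while an integer one cannot.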