dpb
Re: Pull out specific numbers from unstructured text file
Posted:
Feb 9, 2013 4:50 PM
On 2/9/2013 1:12 PM, Stan wrote:
> ^^^^^Okay I think I don't understand Lines 5,7,8 in your shortcut code:
>
> Line 1:
> fid=fopen(....,'rt');
> Line 2:
> l=' ';
> Line 3:
> while 1
> Line 4:
> l=fgetl(fid);
> Line 5:
> if strfind(l,'Nmoves')>0,break,end
> Line 6:
> end
> Line 7:
> Nmoves=sscanf(l,'Nmoves=%d');
> Line 8:
> Nrequired=fscanf(fid,'Nrequired=%d');
> Line 9:
> fid=fclose(fid);
>
> My explanation is:
>
> while 1
> .
> .
> .
> end
>
> This is for lines 4-6 and this reads the file. If fgetl encounters the
> end-of-file indicator, it returns -1. So, as long as it returns 1 (i.e.
> anywhere before the end of the file), this statement is saying the while
> loop should perform the actions inside the if statement.
Not quite--the '1' in the WHILE construct is a constant and never changes--only finding the string 'Nmoves=' somewhere in the file will break the loop.
For the condition in the WHILE to have any effect, it would have to test the variable l after fgetl() returns it. I chose not to do that 'cuz I presumed you'd only use this on an appropriate file, and it would take reading the first line outside the loop or otherwise initializing l before the loop starts. An alternative that would be a little cleaner in case the string isn't in the file would be to use while ~feof(fid), which would at least die gracefully on EOF (eventually).
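A minimal sketch of that while ~feof(fid) alternative follows. The scratch file, its contents, and the `found` flag are made up for illustration; the file deliberately contains no 'Nmoves' line so the loop ends at end-of-file instead of spinning forever.

```matlab
% Hypothetical scratch file with no 'Nmoves' line in it.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'line one\nline two\n');
fclose(fid);

fid = fopen(fname, 'rt');
found = false;                 % illustrative flag, not in the original code
while ~feof(fid)               % stop when the file runs out
    l = fgetl(fid);            % one record at a time
    % ischar() guards against the numeric -1 that fgetl returns at EOF
    if ischar(l) && ~isempty(strfind(l, 'Nmoves'))
        found = true;
        break
    end
end
fclose(fid);
delete(fname);

if ~found
    disp('Nmoves not found; loop ended at end of file');
end
```

Testing `found` afterwards keeps the later sscanf() from being handed a line that was never the one wanted.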
> My explanation for line 5:
> If 'Nmoves' is found in the string l (where l is the contents of the
> file that have been read up to that point) then stop reading at that line.
Essentially--it breaks the loop having found the desired string and therefore the first line to parse (on the assumption the string pattern only exists for the line desired or at least it is the first occurrence). Note that 'l' holds only the content of the line just read, not everything read so far--the strfind() simply scans that one line for a match and returns the starting index of each match, or an empty result if there is none (which the IF treats as false).
>
> My explanations for lines 7 and 8:
> 7: Scan l for 'Nmoves=%d'.
> 8. Scan fid for 'Nrequired=%d'.
Well, depends on what you mean by "scan" -- they both do input conversion, matching the input against the formatting string according to its rules. The rule for a literal string is to match that string in the input and essentially ignore those matching characters. %d is to convert a field as a decimal number. sscanf() works from a string variable ('l' in this case, which we previously filled w/ the desired line from the file, so now we're getting the desired value into a variable) while fscanf() takes input from the file which has been connected via fopen() and associated w/ a valid file handle (fid is just a convenient variable name for that).
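To make the literal-match rule concrete, here is a small sketch using sscanf() on made-up strings. The sample values and the variable name `bad` are assumptions for illustration only.

```matlab
% The sample line is made up for illustration.
l = 'Nmoves=42';

% 'Nmoves=' is literal text: it must match the input and is then discarded.
% %d converts the digits that follow into a number.
Nmoves = sscanf(l, 'Nmoves=%d');
disp(Nmoves)

% If the literal text does not match, nothing is converted and the
% result comes back empty, which is easy to test for.
bad = sscanf('Nrequired=7', 'Nmoves=%d');
disp(isempty(bad))
```

Checking isempty() on the result is a cheap way to confirm the line really had the expected layout before using the value.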
> Questions:
> In line 8, why did you change from l to fid?
Because we need to scan another line, and that's done w/ one source code line directly from the file via fscanf(), whereas before we had used fgetl() to suck up each record in its entirety while searching for the first target line. By your file, the next line was the location of the next value wanted, so there was no need for any more searching to find another randomly placed record--it was given to be the next.
> What is the connection between line 5 and lines 7,8?
> How does it know, after line 5 (i.e. after reaching the end of the line
> containing Nmoves), that it needs to search for the next two lines?
You described the file format and said the next line after the one containing "Nmoves" was the next desired field to be parsed.
You still don't seem to grasp that fgetl() reads a whole record through the \n (newline), which it consumes but strips from the result, and returns that record in the character variable 'l', and that the first sscanf() is parsing that string--nothing else has happened in the file at that point (after the sscanf() that is). _THEN_, we went back to the file and got as much of the next record as required to get the next variable by the use of fscanf().
fscanf(), however, unlike fgetl() does _NOT_ automagically read the entire record _UNLESS_ and _IFF_ the format string provided tells it to do that. Your initial description didn't say anything about reading anything except these two values so I did just that--read records until found the first one desired, then read just what was needed to get the variable value requested from the following record. Period. End of story. That's why later when you came back and said "Oh, that's not the end of what's needed" I said what I gave you was a shortcut specifically for the first problem outlined.
Now, the problem is that to read the rest of the desired records w/ fscanf() you've got to write specific formatting strings to handle them (a pita since they're not symmetric in much of any useful way), and you've got to allow for the fact that the file position marker is still in the middle of the Nrequired record, as above.
So, as noted in my previous response, given you want to do the other stuff I'd suggest it's simpler to revert to fgetl/sscanf pairs.
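A rough sketch of what fgetl/sscanf pairs could look like is below. The scratch file and its contents are invented stand-ins for your file, and the sketch assumes each wanted line begins with the literal text shown in its format string.

```matlab
% Hypothetical scratch file standing in for the real data file.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'some header text\nNmoves=12\nNrequired=7\nmore stuff\n');
fclose(fid);

fid = fopen(fname, 'rt');
Nmoves = [];
Nrequired = [];
while ~feof(fid)
    l = fgetl(fid);                            % whole record into a string
    if ~ischar(l), break, end                  % fgetl returns -1 at EOF
    if ~isempty(strfind(l, 'Nmoves'))
        Nmoves = sscanf(l, 'Nmoves=%d');       % parse the string, not the file
        l = fgetl(fid);                        % next whole record
        Nrequired = sscanf(l, 'Nrequired=%d'); % same pattern again
        break
    end
end
fclose(fid);
delete(fname);

fprintf('Nmoves = %d, Nrequired = %d\n', Nmoves, Nrequired);
```

Because every fgetl() consumes a complete record, the file position always sits at the start of a line, so each further value is just one more fgetl/sscanf pair.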
Again, take the sample code and your example file and just type the while loop in at the command line and look at what the contents of 'l' are and then what happens if you follow the fscanf() call w/ a fgetl() to understand the difference...
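Something like the following could serve as that experiment. It is only a sketch: the scratch file contents are made up, and the size argument 1 passed to fscanf() is an addition (not in the shortcut code) so that it stops after a single value rather than reapplying the format.

```matlab
% Hypothetical scratch file; the Nrequired line carries extra text on purpose.
fname = [tempname '.txt'];
fid = fopen(fname, 'wt');
fprintf(fid, 'header\nNmoves=12\nNrequired=7 and more text\nlast line\n');
fclose(fid);

fid = fopen(fname, 'rt');
l = fgetl(fid);                              % first record
l = fgetl(fid);                              % second record, read in full
Nmoves = sscanf(l, 'Nmoves=%d');             % parsed from the string l
Nrequired = fscanf(fid, 'Nrequired=%d', 1);  % parsed straight from the file
rest = fgetl(fid);                           % whatever fscanf left unread
fclose(fid);
delete(fname);

% Look at what each call actually returned.
fprintf('l         = "%s"\n', l);
fprintf('Nmoves    = %d\n', Nmoves);
fprintf('Nrequired = %d\n', Nrequired);
fprintf('rest      = "%s"\n', rest);
```

Comparing `l` with `rest` shows the difference: fgetl() hands back a full record each time, while fscanf() stops as soon as its format is satisfied and leaves the remainder of the line for the next read.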
Also read
doc fscanf
doc fgetl
and friends carefully...