Math Forum » Discussions » Software » comp.soft-sys.math.mathematica

Topic: Obtaining Random Line from a File
Replies: 9   Last Post: Feb 21, 2013 5:46 AM

Albert Retey

Re: Obtaining Random Line from a File
Posted: Feb 18, 2013 5:59 AM

On 17.02.2013 10:08, David Bailey wrote:
> On 16/02/2013 06:07, Ramiro Barrantes wrote:
>> Hello,
>>
>> I would like to get a random line from a file, I know this can be done
>> with Mathematica but I am playing with using sed to see if it goes
>> faster, say I want to get line 1000
>>
>> In mathematica it would be:
>>
>> <<"! sed -n 1000p filename.txt"
>>
>> However, I am trying to put the filename as a variable, say
>>
>> filename="hugefile.txt"
>>
>> cmd="! sed -n 1000p "<>filename
>> <<cmd
>>
>> does not work.
>>
>> How can I do this?
>>
>> Lastly, I am getting a random line using Mathematica doing:
>>
>> getRandomLine[file_, n_] :=
>> Block[{i = RandomInteger[{1, n}], str = OpenRead[file], res},
>> Skip[str, "String", i];
>> res = Read[str, Expression];
>> Close[str];
>> res[[2]]
>> ]
>>
>> However, it is very slow so I was going to try with sed. Any suggestions?
>>
>> Thanks in advance,
>> Ramiro
>>
>>
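
On the `<<cmd` part of the question: `<<` takes whatever follows it
literally as a file name, so `<<cmd` looks for a file called cmd instead
of using the string stored in the variable. A minimal sketch of one way
around that, assuming a Unix-like system with sed on the path and a file
name without spaces (the file name is the one from the question, the rest
is only illustrative). The sed address is written before the command
(`1000p`), which is the order sed expects:

```mathematica
filename = "hugefile.txt";
cmd = "!sed -n 1000p " <> filename;

(* Get[cmd] is the function form of <<cmd and evaluates its argument first,
   but it would try to parse the line as a Mathematica expression. *)

(* ReadList runs the command and returns its output as a list of strings,
   one per line, so the single requested line comes back as {"..."}. *)
ReadList[cmd, String]
```

Whether this is actually faster than the Skip/Read version would have to
be timed on the real file.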

> I would stick with Mathematica to do this job! How big is the file
> (number of lines and number of bytes)? If it will fit inside Mathematica
> comfortably, I'd see how it works to read it all in as a list of strings
> and pick the one you want:
>
> xx=ReadList["C:\\some file",String];//Timing
>
> Then you have an array of strings, and you can select what you want
> directly.
>
> Remember, the basic problem with reading at an arbitrary position in a
> text file, is that if the line lengths are not the same, any algorithm
> has to read every line before the one you want!


If he just wants an arbitrary line, that's not true: choosing a position
in the file at random and searching for, e.g., the previous and next line
breaks would also pick a random line. Of course, longer lines would be
chosen with higher probability than shorter ones, but it isn't clear from
the question whether that would be a problem for what the OP is trying
to do...
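
A rough sketch of that idea, under some assumptions: the function name
randomLineByOffset is made up, the file is non-empty and uses "\n" line
breaks, and instead of searching backwards for the previous line break it
simply discards the rest of the line it lands in and returns the next
one. That is a slightly different variant, so here a line's chance of
being picked is proportional to the length of the line before it:

```mathematica
(* randomLineByOffset is a hypothetical helper, not a built-in *)
randomLineByOffset[file_String] :=
  Module[{size, str, line},
    size = FileByteCount[file];
    str = OpenRead[file];
    (* jump to a uniformly random byte offset in the file *)
    SetStreamPosition[str, RandomInteger[{0, size - 1}]];
    (* throw away the remainder of the line we landed in *)
    Read[str, String];
    line = Read[str, String];
    (* ran off the end of the file: wrap around to the first line *)
    If[line === EndOfFile,
      SetStreamPosition[str, 0];
      line = Read[str, String]];
    Close[str];
    line]
```

Only about two lines are read per call, so the cost should not depend on
where in the file the chosen line sits.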

> If you create this file,
> you should consider packing the lines to make them all the same length -
> then you could access what you want very efficiently (but with a little
> more coding!)


... and slightly (?) higher memory requirements...
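
For the equal-length idea quoted above, a minimal sketch, assuming every
line has been padded to the same number of bytes (single-byte characters,
"\n" line breaks). The names fixedWidthLine, randomFixedWidthLine and
recLen are made up for illustration; recLen is the length of one padded
line including its line break:

```mathematica
(* hypothetical helper: return line i of a file whose lines are all
   exactly recLen bytes long, counting the trailing "\n" *)
fixedWidthLine[file_String, recLen_Integer?Positive, i_Integer?Positive] :=
  Module[{str, line},
    str = OpenRead[file];
    (* line i starts at a byte offset that can be computed directly *)
    SetStreamPosition[str, (i - 1) recLen];
    line = Read[str, String];
    Close[str];
    line]

(* pick one of the lines with equal probability *)
randomFixedWidthLine[file_String, recLen_Integer?Positive] :=
  fixedWidthLine[file, recLen,
    RandomInteger[{1, Quotient[FileByteCount[file], recLen]}]]
```

The returned string still contains the padding, so the caller would have
to strip it, e.g. with StringTrim if the padding is whitespace.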

hth,

albert




