Math Forum » Discussions » Software » comp.soft-sys.math.mathematica

Topic: Importing a file and extracting data
Replies: 2   Last Post: Jun 15, 2013 4:13 AM

David Bailey

Re: Importing a file and extracting data
Posted: Jun 15, 2013 4:13 AM

On 14/06/2013 09:54, howardfink@gmail.com wrote:
> I have a series of files of this form:
>
> June 7, 2013
> Tc+Naphthalene vs Temperature (oC)
>
> Run 1
> Tc_Naph_84_2C 3.740 ns
> Tc_Naph_87_1C 3.731 ns
> Tc_Naph_89_9C 3.720 ns
> Tc_Naph_92_9C 3.704 ns
> Tc_Naph_94_7C 3.687 ns
> Tc_Naph_97_6C 3.694 ns
>
>
> Run 2
> Tc_Naph_83_2C 3.758 ns
> Tc_Naph_83_4C 3.750 ns
> Tc_Naph_86_4C 3.728 ns
> Tc_Naph_88_1C 3.725 ns
> Tc_Naph_90_2C 3.716 ns
> Tc_Naph_93_1C 3.704 ns
> Tc_Naph_94_7C 3.673 ns
> Tc_Naph_97_7C 3.684 ns
> Tc_Naph_97_9C 3.665 ns
>
>
>
>
> I used an Import command to read in the file, but now I am just sitting and
> staring, without a clue how to get the 84_2 converted to the number 84.2,
> etc. and ending up with two lists: Run 1 and Run 2, consisting of pairs of
> temperature and time. The temperature will eventually be converted to
> 1/absolute temperature.
>
> I've read lots and lots of help, thumbed through dozens of pages of a
> Mathematica 5 manual, and don't know where to start. I'm trying to help a
> 90-year-old chemistry professor, who is currently using a calculator, but
> there will be dozens of runs of this experiment.
>

Import is designed to read text or binary data formatted in a standard
form - e.g. CSV. Clearly your files have an ad-hoc format, so you can't
expect Mathematica (or anything else) to read them without some effort!

The job is complicated by the fact that you (or the prof) used "_"
rather than a decimal point, and that the number is joined on to other
textual data. I am going to assume that the units (C and ns) are the
same throughout, and can be discarded, and that all the temperatures
have a decimal part, even if it is 0.

First define a couple of functions:

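(* Convert one {{run title}, {sample lines}} pair into {run title, parsed samples} *)
dataConvert[{{run_}, samples_}] := {run, Map[process, samples]};

(* Turn one sample line into {label, temperature, time} *)
process[line_] := Module[{tmp},
   (* Rewrite the _digits_digitsC part as a decimal number set off by commas
      and drop the ns unit, so the line reads as the body of a list *)
   tmp = StringReplace[
     "\"" <> line, {"_" ~~ a : (DigitCharacter ..) ~~ "_" ~~
        b : (DigitCharacter ..) ~~ "C" :>
       "\"," <> a <> "." <> b <> ",", " ns" :> ""}];
   ToExpression["{" <> tmp <> "}"]
   ];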
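
To see what the replacement rules do before ToExpression is applied, here is a small check on a single line. The sample line is copied from the data file in the question; the variable names line and rewritten are only for illustration.

```mathematica
(* One sample line from the data file in the question *)
line = "Tc_Naph_84_2C 3.740 ns";

(* Same rules as in process: a quote is prepended, the _84_2C part becomes
   a closing quote followed by ,84.2, and the trailing ns is removed *)
rewritten = StringReplace["\"" <> line,
   {"_" ~~ a : (DigitCharacter ..) ~~ "_" ~~ b : (DigitCharacter ..) ~~ "C" :>
     "\"," <> a <> "." <> b <> ",",
    " ns" :> ""}];

(* Wrapping the rewritten text in braces makes it valid list syntax *)
ToExpression["{" <> rewritten <> "}"]
(* {"Tc_Naph", 84.2, 3.74} *)
```

The intermediate string consists of the quoted label, then 84.2, then 3.740, separated by commas, which is why it can be evaluated as a list. Note that ToExpression evaluates whatever text it is given, so this approach is only sensible for files you trust.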

It is best to avoid Import if the data is not in a recognised format,
and to read it in as strings, discarding the empty lines, and extracting
the data in the first two lines:

In[4]:= data = ReadList["c:\\maths\\data.dat", String];

In[7]:= data = DeleteCases[data, s_String /; StringMatchQ[s, WhitespaceCharacter ...]];

In[5]:= fileDate = data[[1]]

Out[5]= "June 7, 2013 "

In[9]:= fileTitle = data[[2]]

Out[9]= "Tc+Naphthalene vs Temperature (oC)"

Break up the rest by detecting the 'Run' lines:
In[29]:= tmp =
Partition[SplitBy[data[[3 ;;]], StringMatchQ[#, "Run" ~~ ___] &], 2]

Out[29]= {{{"Run 1"}, {"Tc_Naph_84_2C 3.740 ns",
"Tc_Naph_87_1C 3.731 ns", "Tc_Naph_89_9C 3.720 ns",
"Tc_Naph_92_9C 3.704 ns", "Tc_Naph_94_7C 3.687 ns",
"Tc_Naph_97_6C 3.694 ns"}}, {{"Run 2"}, {"Tc_Naph_83_2C 3.758 \
ns", "Tc_Naph_83_4C 3.750 ns", "Tc_Naph_86_4C 3.728 ns",
"Tc_Naph_88_1C 3.725 ns", "Tc_Naph_90_2C 3.716 ns",
"Tc_Naph_93_1C 3.704 ns", "Tc_Naph_94_7C 3.673 ns",
"Tc_Naph_97_7C 3.684 ns", "Tc_Naph_97_9C 3.665 ns"}}}

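The SplitBy/Partition step may be easier to follow on a toy list. This is only a sketch; the names lines, isRun and groups are made up for the illustration.

```mathematica
(* A made-up list with the same shape as the real data *)
lines = {"Run 1", "a", "b", "Run 2", "c"};

(* True for lines that start with Run, False otherwise *)
isRun = StringMatchQ[#, "Run" ~~ ___] &;

(* SplitBy starts a new sublist whenever the value of isRun changes *)
groups = SplitBy[lines, isRun]
(* {{"Run 1"}, {"a", "b"}, {"Run 2"}, {"c"}} *)

(* Partition then pairs each title sublist with the samples that follow it *)
Partition[groups, 2]
(* {{{"Run 1"}, {"a", "b"}}, {{"Run 2"}, {"c"}}} *)
```

One caveat: a run with no samples would put two Run lines next to each other, SplitBy would group them into one sublist, and the pairing would go wrong. The approach assumes every run has at least one sample line.
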
Now apply the previous functions to produce a nested list structure of
strings and real numbers:

In[38]:= Map[dataConvert, tmp]

Out[38]= {{"Run 1", {{"Tc_Naph", 84.2, 3.74}, {"Tc_Naph", 87.1,
3.731}, {"Tc_Naph", 89.9, 3.72}, {"Tc_Naph", 92.9,
3.704}, {"Tc_Naph", 94.7, 3.687}, {"Tc_Naph", 97.6,
3.694}}}, {"Run 2", {{"Tc_Naph", 83.2, 3.758}, {"Tc_Naph", 83.4,
3.75}, {"Tc_Naph", 86.4, 3.728}, {"Tc_Naph", 88.1,
3.725}, {"Tc_Naph", 90.2, 3.716}, {"Tc_Naph", 93.1,
3.704}, {"Tc_Naph", 94.7, 3.673}, {"Tc_Naph", 97.7,
3.684}, {"Tc_Naph", 97.9, 3.665}}}}
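
From this structure, the pairs of temperature and time asked for in the question can be picked out with Part, and the temperatures converted to 1/absolute temperature. The following is a sketch only: result is a shortened copy of the output above, the names pairs, run1, run2, toInverseKelvin and inverse are made up, and it assumes the temperatures are in degrees Celsius.

```mathematica
(* Shortened copy of the nested result above, for illustration only *)
result = {{"Run 1", {{"Tc_Naph", 84.2, 3.74}, {"Tc_Naph", 87.1, 3.731}}},
   {"Run 2", {{"Tc_Naph", 83.2, 3.758}, {"Tc_Naph", 83.4, 3.75}}}};

(* For each run keep columns 2 and 3 of every sample: {temperature, time} *)
pairs = Map[#[[2, All, {2, 3}]] &, result];

(* One list of pairs per run *)
{run1, run2} = pairs;
(* run1 is {{84.2, 3.74}, {87.1, 3.731}} *)

(* Assumption: temperatures are in Celsius, so add 273.15 to get kelvin *)
toInverseKelvin[{tC_, t_}] := {1/(tC + 273.15), t};

(* Apply the conversion to every pair, i.e. at level 2 of pairs *)
inverse = Map[toInverseKelvin, pairs, {2}];
```

Each element of inverse is then a list of {1/T, time} pairs for one run, ready to be passed to ListPlot or Fit.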

Clearly it is best in future to record data in an easier format -
whatever language you use to process it!

You may want to look up StringReplace and StringExpression to understand
the above, and to help you with other problems of this type.

David Bailey
http://www.dbaileyconsultancy.co.uk




