Text Processing
Jérôme
jerome at jolimont.fr
Tue Dec 20 15:03:21 EST 2011
On Tue, 20 Dec 2011 11:17:15 -0800 (PST), Yigit Turgut wrote:
> Hi all,
>
> I have a text file containing such data ;
>
> A B C
> -------------------------------------------------------
> -2.0100e-01 8.000e-02 8.000e-05
> -2.0000e-01 0.000e+00 4.800e-04
> -1.9900e-01 4.000e-02 1.600e-04
>
> But I only need Section B, and I need to change the notation to ;
>
> 8.000e-02 = 0.08
> 0.000e+00 = 0.00
> 4.000e-02 = 0.04
>
> Text file is approximately 10MB in size. I looked around to see if
> there is a quick and dirty workaround but there are lots of modules,
> lots of options.. I am confused.
>
> Which module is most suitable for this task ?
You could try to do it yourself.
You'd need to know what separates the data. Tab characters? Spaces?
Example:
Input file
----------
A B C
-------------------------------------------------------
-2.0100e-01    8.000e-02    8.000e-05
-2.0000e-01    0.000e+00    4.800e-04
-1.9900e-01    4.000e-02    1.600e-04
Python code
-----------
# Open file
with open('test1.plt', 'r') as f:
    b_values = []
    # Skip as many lines as needed: two header lines here,
    # the third read fetches the first data line
    line = f.readline()
    line = f.readline()
    line = f.readline()
    while line:
        # Column B sits between the first and the second separator.
        # This assumes the columns are separated by exactly four spaces.
        #start = line.find(u"\u0009", 0) + 1  # seek Tab
        start = line.find("    ", 0) + 4  # seek 4 spaces
        #end = line.find(u"\u0009", start)
        end = line.find("    ", start)
        b_values.append(float(line[start:end].strip()))
        line = f.readline()
print(b_values)
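The list holds plain floats, so the notation asked for in the question (0.08, 0.00, 0.04) is only a matter of formatting. A minimal sketch, assuming two decimals are always wanted; the list is typed in by hand here so the snippet runs on its own:

```python
# Values as the parsing loop would produce them for the sample file
b_values = [0.08, 0.0, 0.04]

# Fixed-point notation with two decimals
formatted = ["{:.2f}".format(value) for value in b_values]

print(formatted)  # ['0.08', '0.00', '0.04']
```

The formatted strings can then be written to an output file one per line.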
It gets trickier if the number of spaces is not constant. I would then try
regular expressions. Perhaps a regexp would be more efficient in any case.
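A rough sketch of the regular expression idea, assuming every data line holds exactly three numbers separated by any mix of spaces and tabs. The names NUMBER, ROW and extract_column_b are made up for this example, and the sample lines stand in for the real file:

```python
import re

# One number, with optional sign, decimal point and exponent
NUMBER = r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?"
# A data line: three numbers separated by whitespace; only column B is captured
ROW = re.compile(r"^\s*" + NUMBER + r"\s+(" + NUMBER + r")\s+" + NUMBER + r"\s*$")

def extract_column_b(lines):
    """Return column B as floats, ignoring lines that are not data rows."""
    values = []
    for line in lines:
        match = ROW.match(line)
        if match:
            values.append(float(match.group(1)))
    return values

# Sample lines with irregular separators (spaces and tabs mixed)
sample = [
    "A \t B   C\n",
    "-------\n",
    "-2.0100e-01   8.000e-02 8.000e-05\n",
    "-2.0000e-01\t0.000e+00     4.800e-04\n",
    "-1.9900e-01 4.000e-02  1.600e-04\n",
]
print(extract_column_b(sample))
```

Header and separator lines do not match the pattern, so no readline() skipping is needed. An open file can be passed directly, since iterating over it yields lines. If the columns are only ever separated by whitespace, line.split()[1] would do the same job without a regexp.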
--
Jérôme