Reading 'scientific' csv using Pandas?
Peter Otten
__peter__ at web.de
Mon Nov 19 06:32:35 EST 2018
Martin Schöön wrote:
> My pandas is up to date.
>
> In your example you use the US convention of using "." for decimals
> and "," to separate data. This works perfect for me too.
>
> However, my data files use European conventions: decimal "," and TAB
> to separate data:
>
> col1 col2
> 1,1 0
> 10,24e-05 1
> 9,492e-10 2
>
> I use
>
> EUData = pd.read_csv('file.csv', skiprows=1, sep='\t',
> decimal=',', engine='python')
>
> to read from such files. This works so so. 'Common floats' (3,1415 etc)
> works just fine but 'scientific' stuff (1,6023e23) does not work.
With
>>> with open("file.csv", "w") as f:
... f.write("col1\tcol2\n"
... "1,1\t0\n"
... "10,24e-05\t1\n"
... "9,492e-10\t2\n")
...
40
the following works on my system:
>>> pd.read_csv("file.csv", delimiter="\t", decimal=",")
col1 col2
0 1.100000e+00 0
1 1.024000e-04 1
2 9.492000e-10 2
[3 rows x 2 columns]
The version is a bit old, though:
>>> pd.__version__
'0.13.1'
The engine="python" produces an exception over here:
"""
ValueError: The 'decimal' option is not supported with the 'python' engine
"""
Maybe you can try and omit that option?
If that doesn't work you can specify a converter:
>>> pd.read_csv("file.csv", sep="\t", converters={0: lambda s:
float(s.replace(",", "."))})
col1 col2
0 1.100000e+00 0
1 1.024000e-04 1
2 9.492000e-10 2
[3 rows x 2 columns]
More information about the Python-list
mailing list