[Tutor] Problem on parsing data
jarod_v6 at libero.it
Mon Mar 13 17:19:32 EDT 2017
I have a CSV file with "," as the separator.
When I split each line on ",", the rows come out with different numbers of columns:
some have 30, some have 50, depending on how many "," the line contains.
In [105]: dimension_columns = []
In [106]: with open(nomi) as f:
   .....:     for i in f:
   .....:         lines = i.rstrip("\n").split(",")
   .....:         if "#" not in lines[0]:
   .....:             dimension_columns.append(len(lines))
   .....:
In [108]: set(dimension_columns)
Out[108]:
{30,
32,
33,
34,
35,
36,
37,
38,
39,
40,
41,
42,
43,
44,
45,
46,
47,
48,
49,
50,
53,
54,
59}
The last columns are quoted strings ("........."), but they contain some "," characters, so the script above splits them as well.
In [99]: lines
Out[99]:
['chr10',
'19896830',
'19896830',
'C',
'A',
'"intergenic"',
'"ARL5B(dist=929890)',
'PLXDC2(dist=208542)"',
'NA',
'NA',
'"Score=458;Name=lod=97"',
'NA',
'NA',
'"0.83"',
'"rs7909976"',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'NA',
'"chr10\t19896830\trs7909976\tC\tA\t.\tREJECT\tDB\tGT:AD:BQ:DP:FA\t0:0',
'69:.:69:1.00\t0/1:0',
'37:32:37:1.00"']
What can I do to parse this file better, so that it splits only on the commas outside the quoted strings?
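
One possible direction, as a minimal sketch rather than a tested solution for the real file: Python's standard csv module treats a "," inside a double-quoted field as part of that field, so it may give the same number of columns on every row. The sample text and the parse_rows helper below are hypothetical, made up to mimic the rows shown above, and the sketch assumes Python 3 and that the file quotes its fields consistently with '"'.

```python
import csv
import io

# Hypothetical two-line sample mimicking the rows above (not the real file).
sample = (
    '#chrom,start,end,ref,alt,func,gene\n'
    'chr10,19896830,19896830,C,A,"intergenic",'
    '"ARL5B(dist=929890),PLXDC2(dist=208542)"\n'
)


def parse_rows(handle):
    """Yield each non-comment row as a list of fields.

    csv.reader keeps commas that sit inside double-quoted fields
    and strips the surrounding quotes from those fields.
    """
    reader = csv.reader(handle, delimiter=",", quotechar='"')
    for row in reader:
        # Same comment test as the original loop: skip rows whose
        # first field contains "#"; also skip empty lines.
        if not row or "#" in row[0]:
            continue
        yield row


rows = list(parse_rows(io.StringIO(sample)))
dimension_columns = {len(row) for row in rows}
print(dimension_columns)
print(rows[0][-1])
```

For the real file, the io.StringIO(sample) part would be replaced by something like open(nomi, newline="") inside a with block. If the set of row lengths still has more than one value, the quoting in the file is probably not consistent and would need a closer look.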