[Tutor] Problem on parsing data

jarod_v6 at libero.it jarod_v6 at libero.it
Mon Mar 13 17:19:32 EDT 2017





I have a CSV file with "," as the separator.

If I split each line on ",", I get rows with different numbers of columns, some with 30 and some with 50, depending on how many "," the line contains.

In [105]: dimension_columns = []

In [106]: with open(nomi) as f:
   .....:     for i in f:
   .....:         lines = i.rstrip("\n").split(",")
   .....:         if "#" not in lines[0]:
   .....:             dimension_columns.append(len(lines))
   .....:


In [108]: set(dimension_columns)
Out[108]: 
{30,
 32,
 33,
 34,
 35,
 36,
 37,
 38,
 39,
 40,
 41,
 42,
 43,
 44,
 45,
 46,
 47,
 48,
 49,
 50,
 53,
 54,
 59}

The last columns are quoted strings ("........."), but they contain some "," characters, so this script splits them as well.

In [99]: lines
Out[99]: 
['chr10',
 '19896830',
 '19896830',
 'C',
 'A',
 '"intergenic"',
 '"ARL5B(dist=929890)',
 'PLXDC2(dist=208542)"',
 'NA',
 'NA',
 '"Score=458;Name=lod=97"',
 'NA',
 'NA',
 '"0.83"',
 '"rs7909976"',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 'NA',
 '"chr10\t19896830\trs7909976\tC\tA\t.\tREJECT\tDB\tGT:AD:BQ:DP:FA\t0:0',
 '69:.:69:1.00\t0/1:0',
 '37:32:37:1.00"']

How can I parse this file better, so that it splits only on the commas outside the quoted strings?
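
One possible direction, sketched here only as an assumption and not tested on the real file, is the standard library's csv module, which treats a comma inside a double-quoted field as part of that field. The sample line below is a shortened, hypothetical row modelled on the output above, and count_columns is a helper name made up for illustration. It also assumes the quoted fields follow the usual CSV quoting rules.

```python
import csv
import io

# Hypothetical, shortened row modelled on the one shown above.
sample = (
    'chr10,19896830,19896830,C,A,"intergenic",'
    '"ARL5B(dist=929890),PLXDC2(dist=208542)",NA\n'
)


def count_columns(handle):
    """Return the set of column counts, skipping rows whose first field contains '#'."""
    counts = set()
    # csv.reader keeps commas that sit inside double-quoted fields.
    for row in csv.reader(handle, delimiter=",", quotechar='"'):
        if row and "#" not in row[0]:
            counts.add(len(row))
    return counts


# Parse the sample from memory; for a real file, the csv documentation
# recommends opening it with open(nomi, newline="").
rows = list(csv.reader(io.StringIO(sample)))
print(rows[0])
print(count_columns(io.StringIO(sample)))
```

With this approach the two gene names stay together in a single field, and the surrounding quotes are removed from the parsed values.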
