Out of memory while reading excel file
Mahmood Naderan
nt_mahmood at yahoo.com
Thu May 11 03:19:03 EDT 2017
I wrote this:
a = np.zeros((p.max_row, p.max_column), dtype=object)
for y, row in enumerate(p.rows):
    for cell in row:
        print (cell.value)
        a[y] = cell.value
        print (a[y])
For one of the cells, I see
NM_198576.3
['NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3'
'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3' 'NM_198576.3']
There are 50 copies of NM_198576.3 in a[y], and 50 is the number of columns in my Excel file (p.max_column).
The Excel file looks like this:
CHR1 11,202,100 NM_198576.3 PASS 3.08932 G|B|C - . . .
Note that in each row, some cells contain only '-' or '.'. I want to read all cells as strings. Then I will write the matrix to a file, and my main code (Java) will process it. I chose openpyxl for reading Excel files because Apache POI (a Java package for manipulating Excel files) consumes a huge amount of memory even for medium-sized files.
So my Python script only transforms an xlsx file into a txt file, keeping the cell positions and formats.
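For reference, here is a minimal sketch of how such an xlsx-to-txt conversion might look. It rests on some assumptions. The repetition shown above is consistent with a[y] = cell.value assigning one value to the whole row y, so the sketch skips the NumPy array and streams each row straight to the output. The helper names (cell_to_text, write_rows_as_text, xlsx_to_text) are hypothetical. A tab-separated layout is assumed, with empty cells written as empty strings. It also assumes an openpyxl version that supports read_only=True and iter_rows(values_only=True). Note that str() gives the stored value, so display formatting such as thousands separators may not be reproduced.

```python
import csv
import io


def cell_to_text(value):
    """Return a cell value as a string; empty cells (None) become ''."""
    return "" if value is None else str(value)


def write_rows_as_text(rows, out, delimiter="\t"):
    """Write each row (an iterable of cell values) as one delimited line.

    Rows are consumed one at a time, so the whole sheet is never held
    in memory. Returns the number of rows written.
    """
    writer = csv.writer(out, delimiter=delimiter, lineterminator="\n")
    count = 0
    for row in rows:
        writer.writerow([cell_to_text(value) for value in row])
        count += 1
    return count


def xlsx_to_text(xlsx_path, txt_path):
    """Hypothetical wrapper: convert the active sheet of xlsx_path to text."""
    # Imported here so the helpers above can be used without openpyxl.
    from openpyxl import load_workbook

    # read_only=True asks openpyxl to stream the sheet lazily.
    wb = load_workbook(xlsx_path, read_only=True)
    try:
        ws = wb.active
        with open(txt_path, "w", newline="", encoding="utf-8") as out:
            # values_only=True yields plain tuples of values, not Cell objects.
            return write_rows_as_text(ws.iter_rows(values_only=True), out)
    finally:
        wb.close()
```

Calling xlsx_to_text("input.xlsx", "output.txt") (both file names are placeholders) would then write one line per spreadsheet row, with column positions preserved by the tab separators.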
Any suggestions?
Regards,
Mahmood