Newbie - converting csv files to arrays in NumPy - Matlab vs. Numpy comparison

oyekomova oyekomova at hotmail.com
Sun Jan 14 01:39:34 CET 2007


Thanks for your note. I have 1 GB of RAM. Also, Matlab has no problem
reading the file into memory. I am just running Istvan's code that
was posted earlier.

import time, csv, random
from numpy import array


def make_data(rows=1E6, cols=6):
    # Write `rows` lines of `cols` comma-separated random floats.
    fp = open('data.txt', 'wt')
    counter = range(cols)
    for row in xrange( int(rows) ):
        vals = map(str, [ random.random() for x in counter ] )
        fp.write( '%s\n' % ','.join( vals ) )
    fp.close()


def read_test():
    start  = time.clock()
    reader = csv.reader( file('data.txt') )
    # Build a list of lists of Python floats, then copy into a NumPy array.
    data   = [ map(float, row) for row in reader ]
    data   = array(data, dtype = float)
    print 'Data size', len(data)
    print 'Elapsed', time.clock() - start


#make_data()
read_test()
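For what it's worth, the intermediate list of lists in read_test() holds
millions of boxed Python float objects, which costs far more RAM than the
final array does. A lower-memory sketch (written for a current Python and
NumPy; the read_lowmem helper name is my own) that fills a pre-allocated
array row by row, so the peak memory stays close to the array itself:

```python
import numpy as np

def read_lowmem(fname, rows, cols):
    # Pre-allocate the target array up front so no intermediate
    # list of boxed Python floats is ever built.
    data = np.empty((rows, cols), dtype=float)
    with open(fname) as fp:
        for i, line in enumerate(fp):
            # Parse one line at a time straight into the array row.
            data[i] = [float(v) for v in line.split(',')]
    return data
```

This assumes you know the row count in advance (or make one cheap pass
over the file to count lines first).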



On Jan 13, 5:47 pm, "sturlamolden" <sturlamol... at yahoo.no> wrote:
> oyekomova wrote:
> > Thanks to everyone for their excellent suggestions. I was able to
> > achieve the following results with all your suggestions. However, I am
> > unable to cross file size of 6 million rows. I would appreciate any
> > helpful suggestions on avoiding memory errors. None of the solutions
> > posted was able to cross this limit.
>
> The error message means you are running out of RAM.
>
> With 6 million rows and 6 columns, the size of the data array is (only)
> 274 MiB. I have no problem allocating it on my laptop. How large is the
> csv file and how much RAM do you have?
>
> Also it helps to post the whole code you are trying to run. I don't
> care much for guesswork.
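The 274 MiB figure above checks out: 6 million rows of 6 float64 values,
at 8 bytes per value, works out to

```python
rows, cols, itemsize = 6 * 10**6, 6, 8   # float64 is 8 bytes
mib = rows * cols * itemsize / 2.0**20
print(mib)   # about 274.7 MiB
```

so the array alone is not the problem; it is the temporary list-of-lists
built while parsing that pushes the process past 1 GB.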



