Newbie - converting csv files to arrays in NumPy - Matlab vs. Numpy comparison

oyekomova oyekomova at
Wed Jan 10 20:48:06 CET 2007

Thanks for your help. I compared the following NumPy code with
csvread in Matlab on a very large csv file. Matlab read the file in
577 seconds. The code below, on the other hand, kept running for over 2
hours. Can this program be made more efficient? FYI - The csv file was
a simple 6 column file with a header row and more than a million rows.

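import csv
from numpy import array
import time

# start the timer before any file work
start = time.time()
file_to_read = file('somename.csv','r')
read_from = csv.reader(file_to_read)

# skip the header row so float() only sees numbers
read_from.next()

# now the real data
datalist = [ map(float, row) for row in read_from ]
data = array(datalist, dtype = float)

elapsed = time.time() - start
print elapsed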
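
One possible way to cut the overhead, sketched here under the assumption of Python 3 and a reasonably recent NumPy, is to hand the parsing to numpy.loadtxt rather than building the nested list by hand. The in-memory sample and its column names are made up for illustration; for the real file, the filename would be passed in place of the StringIO object. No timing against csvread has been measured for this sketch.

```python
import io

import numpy as np

# Hypothetical stand-in for the large file: a header row plus 6 numeric columns.
sample = io.StringIO(
    "a,b,c,d,e,f\n"
    "1,2,3,4,5,6\n"
    "7.5,8,9,10,11,12\n"
)

# skiprows=1 drops the header row; delimiter="," splits fields on commas.
# ndmin=2 keeps the result two-dimensional even if there is only one data row.
data = np.loadtxt(sample, delimiter=",", skiprows=1, ndmin=2)

print(data.shape)
print(data.dtype)
```

The result is already a float array, so no separate array(datalist, dtype = float) step is needed.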

Robert Kern wrote:
> oyekomova wrote:
> > I would like to know how to convert a csv file with a header row into a
> > floating point array without the header row.
> Use the standard library module csv. Something like the following is a cheap and
> cheerful solution:
> import csv
> import numpy
> def float_array_from_csv(filename, skip_header=True):
>     f = open(filename)
>     try:
>         reader = csv.reader(f)
>         floats = []
>         if skip_header:
>             reader.next()
>         for row in reader:
>             floats.append(map(float, row))
>     finally:
>         f.close()
>     return numpy.array(floats)
> --
> Robert Kern
> "I have come to believe that the whole world is an enigma, a harmless enigma
>  that is made terrible by our own mad attempt to interpret it as though it had
>  an underlying truth."
>   -- Umberto Eco
