fastest way to read a text file into a numpy array

Michael Selik michael.selik at gmail.com
Tue Jun 28 10:29:56 EDT 2016


On Tue, Jun 28, 2016 at 10:08 AM Hedieh Ebrahimi <hemla21 at gmail.com> wrote:

> File 1 has :
> x1,y1,z1
> x2,y2,z2
> ....
>
> and file2 has :
> x1,y1,z1,value1
> x2,y2,z2,value2
> x3,y3,z3,value3
> ...
>
> I need to read the coordinates from file 1 and then interpolate a value
> for each of them from file 2, using the closest coordinate available. The
> problem is that file 2 has around 5M lines, so I was wondering what would
> be the fastest approach?
>

Is this a one-time task, or something you'll need to repeat frequently?
How many points need to be interpolated?
How do you define distance? Euclidean 3D distance? k-nearest neighbors?

5 million points can probably fit into memory, so it's not so bad.
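A quick back-of-envelope check (assuming 4 columns of float64 per row, which is
an assumption about the file's contents):

```python
# 5 million rows x 4 float64 columns x 8 bytes each
rows, cols, bytes_per_float = 5_000_000, 4, 8
mb = rows * cols * bytes_per_float / 1e6
print(mb, "MB")  # 160.0 MB -- comfortable on most modern machines
```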

NumPy is a good option for broadcasting the distance function across all 5
million labeled points for each unlabeled point. Given that file format,
NumPy can probably read the file directly into an array.
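A minimal sketch of that broadcasting approach, with tiny made-up arrays
standing in for the two files (the coordinates and values here are
illustrative, not from the actual data):

```python
import numpy as np

# file 2: rows of x, y, z, value
labeled = np.array([[0.0, 0.0, 0.0, 10.0],
                    [1.0, 1.0, 1.0, 20.0],
                    [2.0, 2.0, 2.0, 30.0]])
# file 1: rows of x, y, z
query = np.array([[0.9, 1.1, 1.0],
                  [2.1, 1.9, 2.0]])

coords = labeled[:, :3]                        # (n, 3)
diffs = coords[None, :, :] - query[:, None, :]  # (m, n, 3) via broadcasting
dists = np.sqrt((diffs ** 2).sum(axis=2))       # Euclidean 3D distances
nearest = dists.argmin(axis=1)                  # index of closest labeled point
values = labeled[nearest, 3]
print(values)  # [20. 30.]
```

With 5 million labeled rows, the full (m, n, 3) intermediate would be huge, so
in practice you'd broadcast one unlabeled point (or a small chunk) at a time,
keeping only the running minimum.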

http://stackoverflow.com/questions/3518778/how-to-read-csv-into-record-array-in-numpy
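For that comma-separated format, reading straight into an array might look
like this (io.StringIO stands in for the real file handle, and the sample
rows are made up):

```python
import io
import numpy as np

# A stand-in for file 2: x, y, z, value per line
file2 = io.StringIO("0.0,0.0,0.0,10.0\n"
                    "1.0,1.0,1.0,20.0\n")
arr = np.loadtxt(file2, delimiter=",")
print(arr.shape)  # (2, 4)
```

For 5M lines, np.loadtxt can be slow; the Stack Overflow thread above also
discusses faster readers for large CSVs.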


More information about the Python-list mailing list