file data => array(s)
Steven D'Aprano
steve+comp.lang.python at pearwood.info
Wed Dec 14 18:27:58 EST 2011
On Wed, 14 Dec 2011 14:20:40 -0800, Eric wrote:
> I'm trying to read some file data into a set of arrays. The file data
> is just four columns of numbers, like so:
>
> 1.2 2.2 3.3 0.5
> 0.1 0.2 1.0 10.1
> ... and so on
>
> I'd like to read this into four arrays, one array for each column.
> Alternatively, I guess something like this is okay too:
>
> [[1.2, 2.2, 3.3, 0.5], [0.1, 0.2, 1.0, 10.1], ... and so on]
First thing: due to the fundamental nature of binary floating point
numbers, if you convert text like "0.1" to a float, you don't get exactly
0.1; you get the nearest representable value, roughly
0.1000000000000000055 (older Pythons display it as 0.10000000000000001).
That is because a float is built from a sum of the fractions 1/2, 1/4,
1/8, ..., and no finite sum of those adds up to exactly 1/10.
If this fact disturbs you, you can import the decimal module and use
decimal.Decimal instead; otherwise forget I said anything and continue
using float. I will assume you're happy with floats.
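To make the difference concrete, here is a small sketch comparing the two. The sample string and variable names are only illustrative; nothing beyond the standard library is assumed.

```python
from decimal import Decimal

text = "0.1"

as_float = float(text)      # nearest binary double to 1/10
as_decimal = Decimal(text)  # exactly the decimal value 0.1

# Decimal(some_float) exposes the exact value stored in the float.
exact_float_value = Decimal(as_float)

# Adding ten copies lets the float rounding error show up in the total;
# the Decimal total stays exact.
float_sum = sum([as_float] * 10)
decimal_sum = sum([as_decimal] * 10)

print(exact_float_value)
print(float_sum == 1.0)             # False
print(decimal_sum == Decimal("1"))  # True
```

For plain numeric columns like the ones in the question, the float error is normally far below anything that matters, which is why the rest of this post sticks with float.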
Assuming the file is small, say, less than 50MB, I'd do it like this:
# Works in Python 2.x and 3.x
with open(filename, 'r') as f:
    text = f.read()  # Grab the whole file at once.
# list() is needed in Python 3, where map() returns an iterator.
numbers = list(map(float, text.split()))
That gives you a single list [1.2, 2.2, 3.3, 0.5, 0.1, 0.2, ...] which
you can now split into groups of four. There are lots of ways to do this.
Here's an inefficient way which hopefully will be simple to understand:
result = []
while numbers != []:
    result.append(numbers[0:4])
    del numbers[0:4]  # Note: this empties the original list as it goes.
Here is a much more efficient method which is only a tad harder to
understand:
result = []
for start in range(0, len(numbers), 4):
    result.append(numbers[start:start+4])
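Since the original question asked for one array per column, here is a minimal sketch that wraps the slicing loop in a function and then transposes the rows. The helper name chunk and the inline sample text are my own, made up for illustration; in real use the text would come from the file.

```python
def chunk(numbers, size=4):
    """Split a flat list into consecutive groups of `size` items."""
    return [numbers[start:start + size]
            for start in range(0, len(numbers), size)]

# Sample data standing in for the contents of the file.
text = "1.2 2.2 3.3 0.5\n0.1 0.2 1.0 10.1\n"
numbers = [float(s) for s in text.split()]

rows = chunk(numbers)

# zip(*rows) pairs up the first items of every row, then the second items,
# and so on, which turns rows into columns. zip stops at the shortest row,
# so a short final row would silently drop columns.
columns = [list(col) for col in zip(*rows)]

print(rows)     # [[1.2, 2.2, 3.3, 0.5], [0.1, 0.2, 1.0, 10.1]]
print(columns)  # [[1.2, 0.1], [2.2, 0.2], [3.3, 1.0], [0.5, 10.1]]
```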
And just for completeness, here is an advanced technique using itertools:
from itertools import islice
n = len(numbers)//4
numbers = iter(numbers)
result = [list(islice(numbers, 4)) for i in range(n)]
Be warned that this version throws away any partial group left over at
the end; if you don't want that, change the line defining n to this
instead:
n = len(numbers)//4 + (len(numbers)%4 > 0)
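As a quick check of that adjustment, here is a small self-contained sketch. The function name groups_of_four is hypothetical, chosen just for this example; the expression for n is the one given above, which rounds the group count up because True counts as 1 when added to an int.

```python
from itertools import islice

def groups_of_four(values):
    """Group values into lists of four, keeping a short final group."""
    # Count the groups first: len() no longer works once we hold an iterator.
    n = len(values) // 4 + (len(values) % 4 > 0)
    it = iter(values)
    return [list(islice(it, 4)) for _ in range(n)]

print(groups_of_four([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]))
# [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0]]
```

With exactly eight values the remainder is zero, so n is 2 and no empty trailing group appears.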
--
Steven