On 8/17/07, Barry Wark <barrywark@gmail.com> wrote:
Is there a reason not to add an argument to fromiter that specifies the final size of the n-d array? Reading this discussion, I realized that there are several places in my code where I create 2-D arrays like this:
arr = N.array([d.data() for d in list_of_data_containers]),
where d.data() returns a buffer object.
I would guess that this paradigm causes lots of memory copying. The more efficient solution, I think, would be to preallocate the array and then assign each row in a loop. The list-comprehension version is so much clearer, however, that I've kept it as is in the code.
So, what if I could do something like
arr = N.fromiter((d.data() for d in list_of_data_containers), shape=(x,y)),
I don't know that there's any theoretical problem with doing something like this. There are a couple of practical issues, though. One is that it would significantly increase the implementation complexity of fromiter, which right now is about as simple as it can reasonably be. Someone would need to step forward and write and test the code.

The second issue is with the interface. The interface that you propose isn't really right. The current interface is:
fromiter(iterable, dtype, count=-1)
where count indicates how many items to extract from the iterable (-1 iterates until it is empty). 'shape' as you propose would couple to this in an unnatural way. Adding another keyword argument that indicates just the shape of the elements would make more sense, but it starts to seem a bit clunky:
fromiter(iterable, dtype, count=-1, itemshape=())

For this particular application, there doesn't seem to be any problem with simply defining yourself a little utility function to do this for you:
def from_shaped_iter(iterable, dtype, shape):
    a = numpy.empty(shape, dtype)
    for i, x in enumerate(iterable):
        a[i] = x
    return a
I expect this would have decent performance if the y dimension is reasonably large.

regards,
-tim

with the contract that fromiter will throw an exception if any of the d.data() are not of size y or if there are more than x elements in list_of_data_containers?
Just a thought for discussion.
barry
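As a rough sketch, the utility function suggested in the reply can be extended to enforce the contract described above: raise if a row does not match the expected item shape, or if the iterable yields more rows than were preallocated. The name from_shaped_iter_checked and the error messages are hypothetical, not part of NumPy; raising when too few rows arrive is an extra assumption beyond the contract in the thread.

```python
import numpy as np


def from_shaped_iter_checked(iterable, dtype, shape):
    """Fill a preallocated array row by row, checking each row's shape.

    Hypothetical helper for illustration; not a NumPy function.
    """
    a = np.empty(shape, dtype)
    n_rows = shape[0]
    item_shape = tuple(shape[1:])
    i = -1  # keeps the final row-count check valid for an empty iterable
    for i, x in enumerate(iterable):
        if i >= n_rows:
            raise ValueError("iterable yielded more than %d rows" % n_rows)
        row = np.asarray(x, dtype=dtype)
        # Check explicitly: plain assignment would silently broadcast
        # a scalar or a length-1 row across the whole row.
        if row.shape != item_shape:
            raise ValueError("row %d has shape %r, expected %r"
                             % (i, row.shape, item_shape))
        a[i] = row
    if i + 1 != n_rows:
        # Assumption: a short iterable is also an error, since the
        # remaining rows of np.empty() would be uninitialized.
        raise ValueError("iterable yielded %d rows, expected %d"
                         % (i + 1, n_rows))
    return a


rows = ([i, i + 1, i + 2] for i in range(2))
arr = from_shaped_iter_checked(rows, float, (2, 3))
```

The explicit shape check costs one small comparison per row, which should be negligible when each row is reasonably long.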
On 8/16/07, Robert Kern <robert.kern@gmail.com> wrote:
Geoffrey Zhu wrote:
Hi All,
I want to construct a numpy array based on Python objects. In the code below, opts is a list of tuples.
For example,
opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
If I use a generator like the following:
K=numpy.array(o[2]/1000.0 for o in opts)
It does not work.
I have to use:
numpy.array([o[2]/1000.0 for o in opts])
Is this behavior intended?
Yes. With arbitrary generators, there is no good way to do the kind of mind-reading that numpy.array() usually does with sequences. It would have to unroll the whole generator anyway. fromiter() works for this, but you are restricted to 1-D arrays, for which the mind-reading is a lot easier to implement.
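For the 1-D case in the question, a minimal sketch of the fromiter() approach might look like the following. It reuses the opts list from the original message; the variable name K_counted is only illustrative.

```python
import numpy as np

opts = [('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]

# fromiter() cannot inspect the generator ahead of time,
# so the dtype has to be given explicitly.
K = np.fromiter((o[2] / 1000.0 for o in opts), dtype=float)

# If the length is known, passing count lets fromiter() allocate
# the result once instead of growing it while iterating.
K_counted = np.fromiter((o[2] / 1000.0 for o in opts),
                        dtype=float, count=len(opts))
```

Both calls produce a 1-D float array with one entry per tuple in opts.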
-- Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
-- Umberto Eco
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
-- . __ . |-\ . . tim.hochberg@ieee.org