Re: [Numpy-discussion] numpy.array does not take generators

18 Aug 2007

      On 8/17/07, Barry Wark <barrywark@gmail.com> wrote:
...
Is there a reason not to add an argument to fromiter that specifies
the final size of the n-d array? Reading this discussion, I realized
that there are several places in my code where I create 2-D arrays
like this:
arr = N.array([d.data() for d in list_of_data_containers]),
where d.data() returns a buffer object.
I would guess that this paradigm causes lots of memory copying. The
more efficient solution, I think, would be to preallocate the array
and then assign each row in a loop. It's so much clearer this way,
however, that I've kept it as is in the code.
So, what if I could do something like
arr = N.fromiter((d.data() for d in list_of_data_containers), shape=(x,y)),
I don't know that there's any theoretical problem in terms of doing
something like this. There are a couple of practical issues though. One is
that it would significantly increase the implementation complexity of
fromiter, which right now is about as simple as it can reasonably be.
Someone would need to step forward and write and test the code. The second
issue is with the interface. The interface that you propose isn't really
right. The current interface is:

   fromiter(iterable, dtype, count=-1)

where count indicates how many items to extract from the iterable (-1
iterates until it is empty). 'shape' as you propose would couple to this in
an unnatural way. Adding another keyword argument that indicates just the
shape of the elements would make more sense, but it starts to seem a bit
clunky.

  fromiter(iterable, dtype, count=-1, itemshape=())
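
To make the role of count concrete, here is a small sketch of the current interface only. The iterators and values are made up for illustration, and the proposed itemshape keyword is not used because it does not exist:

```python
import itertools
import numpy

# count=4 pulls exactly four items, so an unbounded iterator is fine here.
squares = (i * i for i in itertools.count())
a = numpy.fromiter(squares, dtype=float, count=4)

# count=-1 (the default) consumes a finite iterable until it is exhausted.
b = numpy.fromiter((i * i for i in range(3)), dtype=float)
```

In both cases the result is 1-D, which is why a 'shape' argument for the whole array would sit awkwardly next to count.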

For this particular application, there doesn't seem to be any problem simply
defining yourself a little utility function to do this for you.

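import numpy

def from_shaped_iter(iterable, dtype, shape):
    # Preallocate the full array, then fill it one row per item.
    a = numpy.empty(shape, dtype)
    for i, x in enumerate(iterable):
        a[i] = x  # IndexError if the iterable yields more than shape[0] items
    return a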

I expect this would have decent performance if the y dimension is reasonably
large.
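
A usage sketch, in case it helps. The rows generator is a hypothetical stand-in for (d.data() for d in list_of_data_containers), and the helper is repeated so the snippet runs on its own:

```python
import numpy

def from_shaped_iter(iterable, dtype, shape):
    a = numpy.empty(shape, dtype)
    for i, x in enumerate(iterable):
        a[i] = x
    return a

# Hypothetical data: four rows (x = 4), each of length three (y = 3).
rows = ([k, k + 1, k + 2] for k in range(4))
arr = from_shaped_iter(rows, float, (4, 3))
```

Note that this helper does not check the contract fully: extra rows raise an IndexError, but if the iterable yields fewer than x rows the remaining rows are left uninitialized.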

regards,

-tim

with the contract that fromiter will throw an exception if any of the
d.data() are not of size y or if there are more than x elements in
list_of_data_containers?
Just a thought for discussion.
barry
On 8/16/07, Robert Kern <robert.kern@gmail.com> wrote:
...
Geoffrey Zhu wrote:
...
Hi All,
I want to construct a numpy array based on Python objects. In the
below code, opts is a list of tuples.
For example,
opts=[ ('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]
If I use a generator like the following:
K=numpy.array(o[2]/1000.0 for o in opts)
It does not work.
I have to use:
numpy.array([o[2]/1000.0 for o in opts])
Is this behavior intended?
Yes. With arbitrary generators, there is no good way to do the kind of
mind-reading that numpy.array() usually does with sequences. It would have to
unroll the whole generator anyway. fromiter() works for this, but you are
restricted to 1-D arrays, for which the mind-reading is a lot easier to
implement.
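
For the example in the question, a minimal sketch of the fromiter() route looks like this. The dtype has to be given explicitly, since fromiter() does not infer it from the items:

```python
import numpy

opts = [('C', 100, 3, 'A'), ('K', 200, 5.4, 'B')]

# The generator is consumed directly into a 1-D float array.
K = numpy.fromiter((o[2] / 1000.0 for o in opts), dtype=float)

# The list-comprehension form from the question, for comparison.
K_list = numpy.array([o[2] / 1000.0 for o in opts])
```

Both forms should give the same 1-D array; the fromiter() version just skips building the intermediate list.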
--
Robert Kern
"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco
_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
-- 
.  __
.   |-\
.
.  tim.hochberg@ieee.org

Timothy Hochberg