fromiter shape argument -- was Re: For loop tips

return uL,asmatrix(fromiter((idx[x] for x in L),dtype=int))
Is it possible for fromiter to take an optional shape (or count) argument in addition to the dtype argument? If both are given, it could preallocate memory and we would only have to iterate over L once.
//Torgil
On 8/29/06, Keith Goodman kwgoodman@gmail.com wrote:
On 8/29/06, Torgil Svensson torgil.svensson@gmail.com wrote:
something like this?
from numpy import asmatrix, fromiter

def list2index(L):
    # sorted unique values, then a value -> position lookup
    uL=sorted(set(L))
    idx=dict((y,x) for x,y in enumerate(uL))
    return uL,asmatrix(fromiter((idx[x] for x in L),dtype=int))
Wow. That's amazing. Thank you.
Using Tomcat but need to do more? Need to support web services, security? Get stuff done quickly with pre-integrated technology to make your job easier Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&da... _______________________________________________ Numpy-discussion mailing list Numpy-discussion@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/numpy-discussion

Torgil Svensson wrote:
return uL,asmatrix(fromiter((idx[x] for x in L),dtype=int))
Is it possible for fromiter to take an optional shape (or count) argument in addition to the dtype argument?
Yes. fromiter(iterable, dtype, count) works.
If both is given it could preallocate memory and we only have to iterate over L once.
Regardless, L is only iterated over once. In general you can't rewind iterators, so that's a requirement. This is accomplished by doing successive overallocation similar to the way appending to a list is handled. By specifying the count up front you save a bunch of reallocs, but no iteration.
-tim
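A minimal sketch of the count argument applied to the list2index function from earlier in the thread. It assumes L is a list (so len(L) is known before iterating) and returns a plain array rather than a matrix; wrap the result in asmatrix if the matrix form is needed.

```python
import numpy as np

def list2index(L):
    """Map each item of L to its index in the sorted list of unique items."""
    uL = sorted(set(L))
    idx = dict((y, x) for x, y in enumerate(uL))
    # count=len(L) lets fromiter allocate the result once instead of
    # growing it step by step; the generator is still consumed only once.
    codes = np.fromiter((idx[x] for x in L), dtype=int, count=len(L))
    return uL, codes

uL, codes = list2index(["b", "a", "c", "a"])
print(uL)
print(codes)
```

Leaving count out gives the same result; the only difference is how the output buffer is allocated.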

Yes. fromiter(iterable, dtype, count) works.
Oh. Thanks. My documentation was probably too old to show this (15 June). If it hasn't been updated since, I'll give Travis a rest about this, at least until 1.0 is released :)
Regardless, L is only iterated over once.
How can this be true? If no size is given, mustn't numpy either loop over L twice or build an internal representation on which it'll iterate or copy in chunks?
I just found out that this works
import numpy,itertools
rec_dt=numpy.dtype(">i4,S10,f8")
rec_iter=itertools.cycle([(1,'s',4.0),(5,'y',190.0),(2,'h',-8)])
numpy.fromiter(rec_iter,rec_dt,10).view(numpy.recarray)
recarray([(1, 's', 4.0), (5, 'y', 190.0), (2, 'h', -8.0), (1, 's', 4.0),
       (5, 'y', 190.0), (2, 'h', -8.0), (1, 's', 4.0), (5, 'y', 190.0),
       (2, 'h', -8.0), (1, 's', 4.0)],
      dtype=[('f0', '>i4'), ('f1', '|S10'), ('f2', '<f8')])
but what's wrong with this?
d2_dt=numpy.dtype("4f8")
d2_iter=itertools.cycle([(1.0,numpy.nan,-1e10,14.0)])
numpy.fromiter(d2_iter,d2_dt,10)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: a float is required
numpy.__version__
'1.0b4'
//Torgil
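One possible workaround for the failing "4f8" case, offered as a sketch rather than an explanation of the error: flatten the tuples into a stream of scalars, read them with a plain float dtype, and reshape afterwards. The names n_rows and n_cols are introduced here for illustration only.

```python
import itertools
import numpy as np

rows = itertools.cycle([(1.0, np.nan, -1e10, 14.0)])
n_rows, n_cols = 10, 4  # illustrative sizes, matching the example above

# Take n_rows tuples from the endless iterator, then chain them into
# one flat stream of floats that fromiter can read item by item.
flat = itertools.chain.from_iterable(itertools.islice(rows, n_rows))
d2 = np.fromiter(flat, dtype=float, count=n_rows * n_cols)
d2 = d2.reshape(n_rows, n_cols)
print(d2.shape)
```

The result is an ordinary 2-d float array instead of an array with a subarray dtype, which is usually what is wanted anyway.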

Torgil Svensson wrote:
Yes. fromiter(iterable, dtype, count) works.
Oh. Thanks. I probably had too old documentation to see this (15 June). If it's not updated since I'll give Travis a rest about this, at least until 1.0 is released :)
Actually I just knew 'cause I wrote it. I don't see a docstring for fromiter, although I thought I wrote one. Maybe I just forgot?
Regardless, L is only iterated over once.
How can this be true? If no size is given, mustn't numpy either loop over L twice or build an internal representation on which it'll iterate or copy in chunks?
Well, it can't in general loop over L twice since the only method that L is guaranteed to have is next(); that's the extent of the iterator protocol. What it does is allocate an initial chunk of memory (the size of which I forget -- I did some tuning) and start filling it up. Once it's full, it does a realloc, which expands the existing chunk of memory, if possible, or returns a new, larger chunk of memory with the data copied into it. Then we iterate on L some more until we fill up the new, larger chunk, in which case we go get another one, etc. This is exactly how list.append works, although in that case the chunk of data is actually a chunk of pointers to objects.
-tim
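The strategy described above can be sketched in Python. This is only an illustration, not the actual C implementation: the function name fromiter_sketch, the initial size of 8, and the doubling growth factor are all assumptions, and unlike realloc this version always copies when it grows.

```python
import numpy as np

def fromiter_sketch(iterable, dtype, initial=8):
    """Fill a growing buffer from an iterator in a single pass."""
    buf = np.empty(initial, dtype=dtype)
    n = 0  # number of slots filled so far
    for item in iterable:
        if n == len(buf):
            # Buffer is full: get a larger one and copy the data over.
            # Doubling is an assumed growth rule, chosen for simplicity.
            grown = np.empty(2 * len(buf), dtype=dtype)
            grown[:n] = buf
            buf = grown
        buf[n] = item
        n += 1
    # Trim the unused tail before returning.
    return buf[:n].copy()

result = fromiter_sketch((i * i for i in range(20)), int)
print(result)
```

The generator is consumed exactly once; only the output buffer is revisited when it has to grow.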
participants (2)
- Tim Hochberg
- Torgil Svensson