[Python-ideas] Float range class

Chris Barker chris.barker at noaa.gov
Fri Jan 9 18:21:25 CET 2015


On Thu, Jan 8, 2015 at 7:33 PM, Nathaniel Smith <njs at pobox.com> wrote:

> On 9 Jan 2015 03:02, "Neil Girdhar" <mistersheik at gmail.com> wrote:
> >
> > I agree with everyone above.  At first I was +1 on this proposal, but
> why not suggest to the numpy people that arange and linspace should return
> Sequence objects rather than numpy arrays?
>
numpy arrays are a python sequence -- what do you mean here? Did you mean
iterator?

This whole conversation made me think a bit about numpy and iterators --
python has been moving toward more use of iterators, maybe numpy should do
the same?

But as I think about it, it's actually a totally different model -- py3 made
range() lazy, because it is most often used that way -- i.e. in a
for loop, list comprehension, or generator expression. When you really want
all the values in memory, you wrap it in list().
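
As a side note on the range() behaviour described above, here is a small
sketch (not from the thread, with arbitrary example values): on Python 3,
range() returns a lazy range object that supports the sequence protocol, and
each loop over it gets a fresh iterator.

```python
r = range(0, 10, 2)

# The range object computes values on demand but still acts like a sequence.
print(len(r))    # 5
print(r[1])      # 2
print(6 in r)    # True

# It can be looped over more than once, unlike a one-shot iterator.
first_pass = list(r)
second_pass = list(r)

# The iterator proper is what iter() hands back; it is exhausted after one pass.
it = iter(r)
consumed = list(it)
leftover = list(it)
```

Here leftover comes out empty because the iterator is spent, while the range
object itself can be reused freely.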

But numpy is, at its core, about arrays, and the iteration happens INSIDE
the array object. e.g. to multiply all the elements of an array by a number
you do:

new_arr = arr * x

then the actual looping (iteration) happens inside numpy, with C data types,
at C speed.

turning that into:

new_arr = [i*x for i in arr]

would push all the work back out into python, killing the point of numpy.
In fact, killing BOTH points of numpy:

1) performance

2) clean readable array expressions -- that is:

    c = np.sqrt(a**2 + b**2)

    rather than:

    c = [ math.sqrt(x) for x in (x+y for x, y in zip( (x**2 for x in a),
 (x**2 for x in b) ) ) ]

    OK, there may be a more readable way to write that...
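
For what it's worth, one more readable pure-Python spelling might look like
the sketch below. The input lists are made-up example values; math.hypot is a
standard-library function that computes sqrt(x**2 + y**2) directly.

```python
import math

# Made-up example inputs standing in for the arrays a and b.
a = [3.0, 5.0, 8.0]
b = [4.0, 12.0, 15.0]

# Pair the elements up once, then do the arithmetic per pair.
c = [math.sqrt(x**2 + y**2) for x, y in zip(a, b)]

# Same idea using the standard library's hypotenuse function.
c_hypot = [math.hypot(x, y) for x, y in zip(a, b)]
```

It is still a Python-level loop, so it reads better but does nothing for the
performance point.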

Anyway, numpy is about arrays, so linspace, arange, etc create arrays.

There may well be a good reason to make a numpy iterator version of these,
for when you HAVE to loop, but that wouldn't help folks that aren't using
numpy anyway.

But sequences (or iterators) over ranges of floating point numbers (and other
"numeric-like" objects) are generally useful for all users of
python, and not entirely trivial to do right -- hence this conversation.
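
To make the "not entirely trivial" part concrete, here is a minimal sketch of
a lazy float range in pure Python. The class name FloatRange and its
(start, stop, num) signature are hypothetical, chosen only for illustration;
this is not the API proposed in the thread. The one design point it shows is
computing each element from its index rather than repeatedly adding a step,
so rounding error does not pile up.

```python
from collections.abc import Sequence


class FloatRange(Sequence):
    """Hypothetical lazy sequence of num evenly spaced floats from start to stop."""

    def __init__(self, start, stop, num):
        if num < 2:
            raise ValueError("num must be at least 2")
        self.start = float(start)
        self.stop = float(stop)
        self.num = num

    def __len__(self):
        return self.num

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices are returned as plain lists to keep the sketch short.
            return [self[i] for i in range(*index.indices(self.num))]
        if index < 0:
            index += self.num
        if not 0 <= index < self.num:
            raise IndexError("FloatRange index out of range")
        # Interpolate between the endpoints based on the index, so the first
        # and last elements are exactly start and stop.
        t = index / (self.num - 1)
        return self.start * (1 - t) + self.stop * t


fr = FloatRange(0.0, 1.0, 11)

# Naive alternative for comparison: add the step over and over.
accumulated = 0.0
for _ in range(10):
    accumulated += 0.1
```

With this sketch fr[-1] is exactly 1.0, whereas the accumulated value ends up
slightly below 1.0 because each addition of 0.1 rounds. A real implementation
would also have to decide about step-based construction, endpoint inclusion,
and non-float numeric types.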

BTW, here is an example of the OP's point about the performance hit of
pulling numpy integers (or floats) out of numpy array objects:

using np.arange() as the source in a list comp:

In [39]: timeit a = [x**2 for x in np.arange(1000)]
1000 loops, best of 3: 907 µs per loop

compared to python xrange() - sorry, on py2 here...

In [40]: timeit a = [x**2 for x in xrange(1000)]
10000 loops, best of 3: 74.9 µs per loop

factor of 12 or so.

but really, for the most part, if you are using numpy, you're going to use
numpy:

In [43]: timeit a = np.arange(1000)**2
100000 loops, best of 3: 3.99 µs per loop

another factor of 18...
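
For anyone wanting to repeat the comparison outside IPython, here is a rough
sketch using the standard timeit module on Python 3, where range() stands in
for xrange(). It assumes numpy is installed; the repeat count and the
function names are arbitrary choices for illustration, and the absolute
numbers will vary by machine and numpy version.

```python
import timeit

import numpy as np

N = 1000        # same size as in the timings above
REPEATS = 100   # arbitrary; raise it for steadier numbers


def mixed():
    # Python-level loop pulling numpy scalars out of an array.
    return [x**2 for x in np.arange(N)]


def pure_python():
    # Python-level loop over plain Python ints.
    return [x**2 for x in range(N)]


def vectorized():
    # The loop runs inside numpy.
    return np.arange(N)**2


# Average seconds per call for each variant.
timings = {
    f.__name__: timeit.timeit(f, number=REPEATS) / REPEATS
    for f in (mixed, pure_python, vectorized)
}

for name, seconds in timings.items():
    print(f"{name:12s} {seconds * 1e6:8.1f} us per call")
```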

So again, we don't need this because numpy linspace and arange are "slow"
but because it's useful when you are not using numpy.


-Chris
-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov