[Python-ideas] Vectorization [was Re: Add list.join() please]

Steven D'Aprano steve at pearwood.info
Fri Feb 1 23:32:55 EST 2019


On Fri, Feb 01, 2019 at 07:02:30PM +0300, Kirill Balunov wrote:
> On Fri, 1 Feb 2019 at 02:24, Steven D'Aprano <steve at pearwood.info> wrote:
[...]
> > Julia has a special "dot" vectorize operator that looks like this:
> >
> >      L .+ 1   # adds 1 to each item in L
> >
> >      func.(L)   # calls func on each item in L
> >
> > https://julialang.org/blog/2017/01/moredots
> >
> > The beauty of this is that you can apply it to any function or operator
> > and the compiler will automatically vectorize it. The function doesn't
> > have to be written to specifically support vectorization.
> >
> >
> IMO, the beauty of a vector type is that it contains homogeneous data.

I didn't say anything about a vector type.

"Vectorization" does not mean "vector type". Please read the link I 
posted; it talks about what Julia does and how it works.

There are two relevant meanings for vectorization here:

https://en.wikipedia.org/wiki/Vectorization

- a style of computer programming where operations are applied to 
  whole arrays instead of individual elements

- a compiler optimization that transforms loops to vector operations

Given that none of my examples involved writing loops by hand, I could 
only be talking about the first.

The link I posted has examples which should be fairly clear even if 
you don't know Julia well.
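
For readers who would rather see the first meaning in Python terms, here 
is a rough sketch. The helper name `dot` is hypothetical, made up for 
this illustration; it is not an existing function or a proposed API, and 
it only approximates what Julia's dot syntax does.

```python
from operator import add


def dot(func, seq, *args):
    """Apply func to each item of seq, passing any extra args unchanged.

    Hypothetical helper: a loose stand-in for Julia's dot syntax.
    """
    return [func(item, *args) for item in seq]


L = [1, 2, 3]

# Rough analogue of Julia's  L .+ 1
added = dot(add, L, 1)

# Rough analogue of Julia's  func.(L), here with the builtin len
lengths = dot(len, ["a", "bb", "ccc"])
```

Neither `add` nor `len` was written with whole sequences in mind. The 
caller asks for element-wise application and the loop stays out of sight.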


> Therefore, it allows you to ensure that the method is present for each
> element in the vector. The first given example is what numpy is all about
> and without some guarantee that L consists of homogeneous data it hardly
> makes sense.

Of course it makes sense. Even numpy supports inhomogeneous data:

py> a = np.array([1, 'spam'])
py> a
array(['1', 'spam'],
      dtype='|S4')

Inhomogeneous data may rule out some optimizations, but that hardly 
means that it "doesn't make sense" to use it.
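
The same point can be sketched with a plain Python list, no numpy 
required. The list contents below are made up for illustration; the only 
assumption is that every element supports the operation being applied.

```python
# A list holding an int, a str, a float and a list: inhomogeneous data.
mixed = [1, "spam", 2.5, [0]]

# Each element supports "* 2", so applying it element-wise is meaningful
# even though no two elements share a type.
doubled = [item * 2 for item in mixed]
```

Each element does its own thing with the operator: numbers are 
multiplied, while the string and the list are repeated.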

Again, if you read the link I posted, they make it clear that Julia can 
vectorize code which supports any type:

    "Our function f accepts any x type"

I don't know Julia well enough to tell whether it supports inhomogeneous 
arrays. My experiments suggest that it forces all the elements to a 
single type. But that's not really the point: you can call the function 
f on an array of one type (say, Spam), then call it again on an array of 
another type (say, Eggs). So long as the function supports both Spam and 
Eggs types, it will just work, without having to re-write your array 
handling code.
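
A minimal Python sketch of that idea follows. The classes `Spam` and 
`Eggs` and the function `f` are hypothetical stand-ins, and the builtin 
`map` plays the part of the element-wise call.

```python
class Spam:
    def __init__(self, n):
        self.n = n


class Eggs:
    def __init__(self, n):
        self.n = n


def f(x):
    """Hypothetical function that accepts either a Spam or an Eggs."""
    return "{}:{}".format(type(x).__name__, x.n)


spams = [Spam(1), Spam(2)]
eggs = [Eggs(3), Eggs(4)]

# The array-handling code is the same line both times; only the data
# differs.
spam_results = list(map(f, spams))
egg_results = list(map(f, eggs))
```

Nothing in the element-wise step needs to know which type it is handed, 
so long as `f` itself copes with it.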


> The second one is just `map`. So I can't catch what you are
> proposing:

I'm not proposing anything; I'm drawing people's attention to something 
another language does to solve an annoyance that Chris has. If someone 
else likes that solution and wishes to make a concrete proposal for 
Python, we can consider it. Otherwise it is just food for thought. It 
may or may not lead anywhere.

 
> 1. To make an operator form of `map`.
> 2. To pull numpy into stdlib.

I cannot imagine how you reached that conclusion from anything I said. I was 
talking about syntax for vectorization, and didn't mention numpy once.

I didn't mention django or beautifulsoup either. I hope that you 
didn't conclude that I wanted to pull them into the stdlib too.


-- 
Steven

