[Python-ideas] Vectorization [was Re: Add list.join() please]
Steven D'Aprano
steve at pearwood.info
Fri Feb 1 23:32:55 EST 2019
On Fri, Feb 01, 2019 at 07:02:30PM +0300, Kirill Balunov wrote:
> Fri, 1 Feb 2019 at 02:24, Steven D'Aprano <steve at pearwood.info>:
[...]
> > Julia has special "dot" vectorize operator that looks like this:
> >
> > L .+ 1 # adds 1 to each item in L
> >
> > func.(L) # calls func on each item in L
> >
> > https://julialang.org/blog/2017/01/moredots
> >
> > The beauty of this is that you can apply it to any function or operator
> > and the compiler will automatically vectorize it. The function doesn't
> > have to be written to specifically support vectorization.
> >
> >
> IMO, the beauty of a vector type is that it contains homogeneous data.
I didn't say anything about a vector type.
"Vectorization" does not mean "vector type". Please read the link I
posted; it talks about what Julia does and how it works.
There are two relevant meanings for vectorization here:
https://en.wikipedia.org/wiki/Vectorization
- a style of computer programming where operations are applied to
  whole arrays instead of individual elements
- a compiler optimization that transforms loops to vector operations
Given that none of my examples involved writing loops by hand, I could
only be talking about the first.
The link I posted has examples which should be fairly clear even if
you don't know Julia well.
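
For anyone who would rather see the first meaning in Python, here is a
rough imitation using an ordinary helper function. It is only a sketch:
`broadcast` is a made-up name, not an existing builtin and not a
proposal, and it makes no attempt at the loop fusion the Julia post
describes.

```python
import operator

def broadcast(func, items, *args):
    """Call func(item, *args) for each item and collect the results.

    Hypothetical helper, for illustration only.
    """
    return [func(item, *args) for item in items]

L = [1, 2, 3]
plus_one = broadcast(operator.add, L, 1)   # rough analogue of  L .+ 1
as_text = broadcast(str, L)                # rough analogue of  str.(L)
```

Neither `operator.add` nor `str` was written with sequences in mind,
which is the point: the element-by-element application lives outside
the function.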
> Therefore, it allows you to ensure that the method is present for each
> element in the vector. The first given example is what numpy is all about
> and without some guarantee that L consists of homogeneous data it hardly
> makes sense.
Of course it makes sense. Even numpy supports inhomogeneous data:
    py> a = np.array([1, 'spam'])
    py> a
    array(['1', 'spam'],
          dtype='|S4')
Inhomogeneous data may rule out some optimizations, but that hardly
means that it "doesn't make sense" to use it.
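
The same point can be made without numpy. A minimal sketch with a plain
list of mixed types (the variable names are just for illustration):

```python
mixed = [1, 'spam', 2.5]

# repr() is defined for every object, so applying it element by element
# works regardless of the mix of types.
reprs = [repr(x) for x in mixed]

# Multiplication by an int is defined for int, str and float alike.
doubled = [x * 2 for x in mixed]
```

Each element only needs to support the operation being applied; the
elements do not need to share a type.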
Again, if you read the link I posted, they make it clear that Julia can
vectorize code which supports any type:
"Our function f accepts any x type"
I don't know Julia well enough to tell whether it supports inhomogeneous
arrays. My experiments suggest that it forces all the elements to a
single type. But that's not really the point: you can call the function
f on an array of one type (say, Spam), then call it again on an array of
another type (say, Eggs). So long as the function supports both Spam and
Eggs types, it will just work, without having to re-write your array
handling code.
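
To make that concrete in Python terms, here is a sketch under some
assumptions of mine: the `Spam` and `Eggs` classes, their `cost` method
and the `apply_each` helper are all invented for the example, with
`apply_each` standing in for a vectorized call such as Julia's f.(items).

```python
class Spam:
    def __init__(self, n):
        self.n = n

    def cost(self):
        return 2 * self.n


class Eggs:
    def __init__(self, n):
        self.n = n

    def cost(self):
        return 3 * self.n


def f(x):
    # f only assumes that its argument has a cost() method.
    return x.cost() + 1


def apply_each(func, items):
    # Hypothetical stand-in for a vectorized call; not an existing API.
    return [func(item) for item in items]


spam_costs = apply_each(f, [Spam(1), Spam(2)])
eggs_costs = apply_each(f, [Eggs(1), Eggs(2)])
```

Neither f nor `apply_each` mentions Spam or Eggs, so the array handling
code is written once and reused for both.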
> The second one is just `map`. So I can't tell what you are
> proposing:
I'm not proposing anything, I'm drawing people's attention to something
another language does to solve an annoyance that Chris has. If someone
else likes that solution and wishes to make a concrete proposal for
Python, we can consider it. Otherwise it is just food for thought. It
may or may not lead anywhere.
> 1. To make an operator form of `map`.
> 2. To pull numpy into stdlib.
I cannot imagine how you got that conclusion from anything I said. I was
talking about syntax for vectorization, and didn't mention numpy once.
I didn't mention django or beautifulsoup either. I hope that you
didn't conclude that I wanted to pull them into the stdlib too.
--
Steven