[Python-ideas] Vectorization [was Re: Add list.join() please]
Steven D'Aprano
steve at pearwood.info
Thu Feb 7 18:42:00 EST 2019
On Thu, Feb 07, 2019 at 03:17:18PM -0500, David Mertz wrote:
> Many apologies if people got one or more encrypted versions of this.
>
> On 2/7/19 12:13 AM, Steven D'Aprano wrote:
>
> It wasn't a concrete proposal, just food for thought. Unfortunately the
> thinking seems to have missed the point of the Julia syntax and run off
> with the idea of a wrapper class.
>
> I did not miss the point! I think adding new syntax à la Julia is a bad
> idea—or at very least, not something we can experiment with today (and
> wrote as much).
I'm sorry, I did not see your comment that you thought new syntax was a
bad idea. If I had, I would have responded directly to that.
Why is it an overtly *bad* (i.e. harmful) idea? As opposed to merely
not sufficiently useful, or unnecessary?
You're certainly right that we can't easily experiment in the
interpreter with new syntax, but we can perform thought-experiments and
we don't need anything but a text editor for that. As far as I'm
concerned, the thought experiment of comparing these two snippets:
((seq .* 2)..name)..upper()
versus
map(str.upper, map(operator.attrgetter('name'), map(lambda a: a*2, seq)))
demonstrates conclusively that even with the ugly double dot syntax,
infix syntax easily and conclusively beats map.
If I recall correctly, the three maps here were originally proposed by
you as examples of why map() alone was sufficient and there was no
benefit to the Julia syntax. I suggested composing them together as a
single operation instead of considering them in isolation.
> Therefore, something we CAN think about and experiment with today is a
> wrapper class.
Again, I apologise, I did not see where you said that this was intended
as a proof-of-concept to experiment with the concept.
[...]
> One of the principles I had in mind in my demonstration is that I want
> to wrap the original collection type (or keep it an iterator if it
> started as one). A number of other ideas here, whether for built-in
> syntax or different behaviors of a wrapper, effectively always reduce
> every sequence to a list under the hood. This makes my approach less
> intrusive to move things in and out of "vector mode." For example:
If the Vector class is only a proof of concept, then we surely don't
need to care about moving things in and out of "vector mode". We can
take it as a given that "the real thing" will work that way: the syntax
will be duck-typed and work with any iterable, and there will not be any
actual wrapper class involved and consequently no need to move things in
and out of the wrapper.
I had taken note of this functionality of the class before, and that was
one of the things which lead me to believe that you thought that a
wrapper class was in and of itself a solution to the problem. If you had
been proposing this Vector class as a viable working solution (or at
least a first alpha version towards a viable solution) then worrying
about round-tripping would be important.
But as a proof-of-concept of the functionality, then:
set( Vector(set_of_stuff) + spam )
list( Vector(list_of_stuff) + spam )
should be enough to play around with the concept.
[...]
> Inasmuch as I want to handle iterator here, it is impossible to do any
> type check upon creating a Vector. For concrete
> `collections.abc.Sequence` objects we could check, in principle. But
> I'd rather it be "we're all adults here" ... or at most provide some
> `check_type_uniformity()` function or method that had to be called
> explicitly.
Why do you care about type uniformity or type-checking the contents of
the iterable?
Comments like this suggest to me that you haven't understood the
idea as I have tried to explain it. I'm sorry that I have failed to
explain it better.
Julia is (if I understand correctly) statically typed, and that allows
it to produce efficient machine code because it knows that it is
iterating over (let's say) an array of 32-bit ints.
While that might be important for the efficiency of the generated
machine code, that's not important for the semantic meaning of the code.
In Python, we duck-type and resolve operations at runtime. We don't
typically validate types in advance:
for x in sequence:
if not isinstance(x, Spam):
raise TypeError('not Spam')
for x in sequence:
process(x)
(except under unusual circumstances). More to the point, when we write a
for-loop:
result = []
for a_string in seq:
result.append(a_string.upper())
we don't expect that the interpreter will validate that the sequence
contains nothing but strings in advance. So if I write this using Julia
syntax:
result = seq..upper()
I shouldn't expect the iterpreter to check that seq contains nothing but
strings either.
--
Steven
More information about the Python-ideas
mailing list