Making the case for "typed" lists/iterators in python

Ulrich Eckhardt ulrich.eckhardt at dominolaser.com
Mon Dec 19 08:16:42 EST 2011


Am 16.12.2011 18:48, schrieb Nathan Rice:
> I realize this has been discussed in the past, I hope that I am
> presenting a slightly different take on the subject that will prove
> interesting.  This is primarily motivated by my annoyance with using
> comprehensions in certain circumstances.
[...]
> Having "typed" lists let you take preexisting string/int/etc methods
> and expose them in a vectorized context and provides an easy way for
> developers to support both vectors and scalars in a single function
> (you could easily "fix" other people's functions dynamically to
> support both).  Additionally, "typed" lists/iterators will allow
> improved code analysis and optimization.  The PyPy people have
> already stated that they are working on implementing different
> strategies for lists composed of a single type, so clearly there is
> already community movement in this direction.
>
> Just compare the above examples to their type-aware counterparts:
>
> L2 = X(L1) L2 = L1.X()
>
> L2 = Z(Y(X(L1))) L2 = L1.X().Y().Z()
>
> Also, this would provide a way to clean up stuff like:
>
> "\n".join(l.capitalize() for l in my_string.split("\n"))
>
> into:
>
> my_string.split("\n").capitalize().join_this("\n")
>
> Before anyone gets up in arms at the idea of statically typed
> python, what I am suggesting here would be looser than that.
> Basically, I believe it would be a good idea in instances where it is
> known that a list of single type is going to be returned, to return a
> list subclass (for example, StringList, IntegerList, etc).  To avoid
> handcuffing people with types, the standard list modification methods
> could be hooked so that if an object of an incorrect type is placed
> in the list, a warning is raised and the list converts to a generic
> object list.  The only stumbling block is that you can't use
> __class__ to convert from stack types to heap types in CPython.  My
> workaround for this would be to have a factory that creates generic
> "List" classes, modifying the bases to produce the correct behavior.
> Then, converting from a typed list to a generic object list would
> just be a matter of removing a member from the bases for a class.
> This of course basically kills the ability to perform type specific
> list optimization in CPython, but that isn't necessarily true for
> other implementations. The additional type information would be
> preserved for code analysis in any case.  The case would be even
> simpler for generators and other iterators, as you don't have to
> worry about mutation.
>
> I'd like to hear people's thoughts on the subject.  Currently we are
> throwing away useful information in many cases that could be used
> for code analysis, optimization and simpler interfaces.

I think there are two aspects to your idea:
1. collections that share a single type
2. accessing multiple elements via a common interface

Both are things that should be considered and I think both are useful in 
some contexts. The former would provide additional guarantees, for 
example, you could savely look up an attribute of the type only once 
while iterating over the sequence and use it for all elements. Also, I 
believe you could save some storage.

The second aspect would mean that you have a single function call that 
targets multiple objects, which is syntactic sugar, but that's a good 
thing. To some extent, this looks like a C++ valarray, see e.g. [1] and 
[2] (note that I don't trust [1] and that [2] is perhaps a bit 
outdated), in case you know C++ and want to draw some inspiration from this.

Anyway, I believe something like that would already be possible today, 
which would give people something they could actually try out instead of 
just musing about:

    class ValarrayWrapper(object):
        def __init__(self, elements):
            self._elements = elements
        def map(self, function):
            tmp = [function(x) for x in self._elements]
            return ValarrayWrapper(tmp)
        def apply(self, function)
            self._elements[:] = [function(x) for x in self._elements]

I could even imagine this to implement "generic" attribute lookup by 
looking at the first element. If it contains the according attribute, 
return a proxy that allows calls to member functions or property access, 
depending on the type of the attribute.


> I believe that "typed" lists that get "demoted" to normal lists with
> a warning on out of type operations preserve this information while
> providing complete backwards compatibility and freedom.

I don't think that a warning helps people write correct code, a 
meaningful error does. Otherwise, with the same argument you could 
convert a tuple on the fly to a list when someone tries to change an 
element of it.


Cheers!

Uli



[1] http://www.cplusplus.com/reference/std/valarray/abs/
[2] http://drdobbs.com/184403620



More information about the Python-list mailing list