[Python-ideas] Adding "Typed" collections/iterators to Python

Nathan Rice nathan.alexander.rice at gmail.com
Mon Dec 19 00:28:27 CET 2011


Where a function is known to return a collection of a single type, I
believe it would be a good idea to return a subclass with type
information and type-specific methods "mixed in".  You could provide the
element type's member methods as collection methods that operate in a
vectorized manner, returning a new collection or iterator with the
results, much like the mathematical functions in NumPy.  This would also
give people a reliable way to make functions operate on both scalar and
vector values.  I believe this could be implemented with a generic
collection "type contract" mix-in, without needing subclasses for
everything under the sun.  A developer who wanted to provide additional
type-specific collection/iterator methods would of course need to
subclass it.

To avoid handcuffing people with types (which is definitely un-pythonic)
and to maintain backwards compatibility, the standard collection
modification methods could be hooked so that if an object of an
incorrect type is added, a warning is issued and the collection
gracefully degrades by removing the mixed-in type information and
methods.  Additionally, a method could be provided that lets the user
"terminate the contract", causing the collection to degrade without a
warning.
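
A minimal sketch of the degrade-with-warning behaviour described above
might look like the following.  The names TypedList, item_type and
terminate_contract are hypothetical, chosen only for illustration, and
only append is hooked; a fuller version would have to cover the other
modification methods as well.

```python
import warnings


class TypedList(list):
    """Hypothetical list that remembers one element type until the contract breaks."""

    def __init__(self, item_type, items=()):
        items = list(items)
        if not all(isinstance(i, item_type) for i in items):
            raise TypeError("initial items must all be %s" % item_type.__name__)
        super().__init__(items)
        # None means the contract no longer holds (the list has degraded).
        self.item_type = item_type

    def terminate_contract(self):
        """Drop the type information on purpose, without a warning."""
        self.item_type = None

    def append(self, item):
        # Only append is hooked in this sketch; extend, insert,
        # __setitem__ and friends would need the same check.
        if self.item_type is not None and not isinstance(item, self.item_type):
            warnings.warn("type contract violated; degrading to an untyped list")
            self.item_type = None
        super().append(item)
```

Appending a non-matching object still succeeds, so existing code keeps
working; the only visible effects are the warning and the loss of the
type information.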

I have several motivations for this:

-- Performing a series of operations using comprehensions or map
tends to be highly verbose in an uninformative way.  Compare the
current method with what would be possible using "typed" collections:

L2 = [X(e) for e in L1]
L3 = [Y(e) for e in L2]
vs
L2 = X(L1) # assuming X has been updated to work in both vector/scalar
L3 = Y(L2) # context...

L2 = [Z(Y(X(e))) for e in L1]
vs
L2 = Z(Y(X(L1)))

L2 = [e.X().Y().Z() for e in L1]
vs
L2 = L1.X().Y().Z() # assuming vectorized versions of member methods
# are folded into the collection via the mixin.

--  Because collections are type agnostic, it is not possible to place
methods on them that are type specific.  This leads to a lot of cases
where Python forces you to read inside out, or the syntax gets
very disjointed in general.  A good example of this is:

"\n".join(l.capitalize() for l in my_string.split("\n"))

which could reduce to something far more readable, such as:

my_string.split("\n").capitalize().join_items("\n")
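
The two motivations above can be approximated in today's Python with a
rough sketch like the one below.  Everything here is an assumption made
for illustration: Vectorized, StrList, wrap and vectorize are
hypothetical names, attribute forwarding is done with __getattr__, and
since str.split cannot be changed its result is wrapped by hand.

```python
import functools


class Vectorized(list):
    """Hypothetical list that forwards unknown method calls to each element."""

    def __getattr__(self, name):
        # Called only when normal lookup fails, so list methods still work.
        if name.startswith("__"):
            raise AttributeError(name)

        def call_on_each(*args, **kwargs):
            return wrap(getattr(e, name)(*args, **kwargs) for e in self)

        return call_on_each


class StrList(Vectorized):
    """Type-specific subclass adding a method that only makes sense for str."""

    def join_items(self, sep):
        return sep.join(self)


def wrap(items):
    """Pick the collection class from the element types (hypothetical helper)."""
    items = list(items)
    if items and all(isinstance(i, str) for i in items):
        return StrList(items)
    return Vectorized(items)


def vectorize(func):
    """Let a one-argument scalar function also accept a Vectorized collection."""

    @functools.wraps(func)
    def wrapper(value):
        if isinstance(value, Vectorized):
            return wrap(func(e) for e in value)
        return func(value)

    return wrapper


@vectorize
def shout(s):
    return s.upper() + "!"


lines = wrap("first line\nsecond line".split("\n"))
text = lines.capitalize().join_items("\n")  # "First line\nSecond line"
loud_one = shout("hi")                      # "HI!"
loud_all = shout(lines)                     # ["FIRST LINE!", "SECOND LINE!"]
```

Forwarding through __getattr__ keeps the sketch short, but it hides
typos in method names until call time; generating the methods from the
element type would be one alternative.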

Besides the benefits to basic language usability (in my opinion), there
are tool and development benefits:

-- The additional type information would simplify static analysis and
provide cues for optimization.  (I'm looking at PyPy here; their list
strategies play to this perfectly.)

-- The warning on "violating the contract" without first terminating
it would be a helpful tool in catching and debugging errors.

I have some thoughts on syntax and specifics that I think would work well,
but I wanted to solicit more feedback before going too far down that path.


Nathan


