[Python-ideas] Adding "Typed" collections/iterators to Python
tjreedy at udel.edu
Tue Dec 20 02:14:49 CET 2011
On 12/19/2011 9:30 AM, Nathan Rice wrote:
> Couple things.
> 1. The "broadcasting" that people seemed to have latched on to is only
> part of what I put forward,
Perhaps because it is the most understandable.
> and I agree it is something that would
> have to be done *correctly* to be beneficial. I would have no issues
> with providing a userspace lib to do this if type decorations were
> included in homogeneous collections/iterables,
The meaning of 'homogeneous' depends on the context -- the purpose and
use of the collection. For some purposes -- str(o), len(c), o in c,
c.index(o), and others -- all objects, collections, or sequences
*are* 'homogeneous' as instances or subclasses of 'object'. On the other
hand, even [-1, 0, 1] is heterogeneous with respect to both sqrt and
log, with the divide different for each. So I do not consider
'homogeneous' to be a property of collections as such.
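
A minimal sketch of the point above, using only the standard math module.
The helper name `accepts` is made up for illustration. It shows that the
same list is uniform for object-level operations but splits differently
under sqrt and log:

```python
import math

values = [-1, 0, 1]


def accepts(func, value):
    """Return True if func(value) succeeds, False on a math domain error."""
    try:
        func(value)
    except ValueError:
        return False
    return True


# Operations defined for every object treat the list as uniform.
as_text = [str(v) for v in values]
size = len(values)
position = values.index(0)

# Domain-restricted functions split the same list in different places.
sqrt_ok = [accepts(math.sqrt, v) for v in values]  # fails only for -1
log_ok = [accepts(math.log, v) for v in values]    # fails for -1 and 0

print(sqrt_ok)
print(log_ok)
```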
Python's current restricted-type mutable sequence factory is
array.array. The types do not even have to be Python types, just machine
storage types. The typecode is part of the object and exposed as an
attribute. Such sequences cannot be 'degraded' because type-checking is
done with all operations. It would not be difficult to make a TypedList
class that did the same, either subclassing or wrapping list.
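
As a rough illustration, here is what array.array does today, followed by
one possible TypedList. The class name comes from the paragraph above, but
its constructor signature and the `item_type` attribute are assumptions of
this sketch, and it guards only the most common mutators (`+=` and some
others are left unchecked):

```python
from array import array

a = array('i', [1, 2, 3])
print(a.typecode)  # the typecode is stored on the object

try:
    a.append(1.5)  # a float is rejected by an integer array
except TypeError as exc:
    print('rejected:', exc)


class TypedList(list):
    """Hypothetical list subclass that only accepts item_type instances."""

    def __init__(self, item_type, iterable=()):
        self.item_type = item_type
        super().__init__(self._check(x) for x in iterable)

    def _check(self, item):
        if not isinstance(item, self.item_type):
            raise TypeError('expected %s, got %s'
                            % (self.item_type.__name__, type(item).__name__))
        return item

    def append(self, item):
        super().append(self._check(item))

    def extend(self, iterable):
        # Check everything first so a bad item leaves the list unchanged.
        super().extend([self._check(x) for x in iterable])

    def insert(self, index, item):
        super().insert(index, self._check(item))

    def __setitem__(self, index, value):
        if isinstance(index, slice):
            value = [self._check(x) for x in value]
        else:
            self._check(value)
        super().__setitem__(index, value)


t = TypedList(int, [1, 2])
t.append(3)
```

A wrapper around a private list, instead of a subclass, would make it easier
to be sure that no unchecked mutator slips through.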
What you have noticed is that iter(array(tc,init)) does not get the
typecode information, so potentially useful information is lost. Your
first concrete proposal might be that the information be kept and that
arrayiterators get a type attribute corresponding to the Python type
that the produced values are converted to. Also, array could expose the
mapping of typecodes to Python types. These changes would allow
experiments that would show the value of your basic idea.
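
Such an experiment can be approximated in user code without touching array
itself. In the sketch below, `TYPECODE_TO_TYPE` and `TypedArrayIterator` are
hypothetical names; the array module does not currently expose either, and
the mapping is a partial one written by hand:

```python
from array import array

# Assumed, partial mapping from typecodes to the Python type of the values
# produced on iteration. This table is not part of the array module.
TYPECODE_TO_TYPE = {
    'b': int, 'B': int, 'h': int, 'H': int,
    'i': int, 'I': int, 'l': int, 'L': int,
    'q': int, 'Q': int, 'f': float, 'd': float,
}


class TypedArrayIterator:
    """Hypothetical iterator that keeps the source array's type information."""

    def __init__(self, arr):
        self.typecode = arr.typecode
        self.type = TYPECODE_TO_TYPE[arr.typecode]
        self._it = iter(arr)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)


data = array('d', [1.0, 2.5])

plain = iter(data)
plain_has_typecode = hasattr(plain, 'typecode')  # the information is gone

typed = TypedArrayIterator(data)
print(typed.typecode, typed.type)
collected = list(typed)
```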
> as long as the
> implementation of the decoration didn't suffer from some form of
> "string failure" (string subclasses are basically worthless as methods
> return strings, not an instance of the class).
This problem is generic to subclassing built-in classes. List would be a
better example here since strings already are specialized sequences.
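
The effect is easy to see with trivial subclasses (`MyList` and `MyStr` are
throwaway names for this illustration). Inherited methods and operators build
their results as the built-in base class, not as the subclass:

```python
class MyList(list):
    pass


class MyStr(str):
    pass


m = MyList([3, 1, 2])
s = MyStr('abc')

# Each result is created by the built-in implementation, which knows
# nothing about the subclass.
result_types = {
    'slice': type(m[:2]),
    'concat': type(m + m),
    'repeat': type(m * 2),
    'copy': type(m.copy()),
    'upper': type(s.upper()),
}
print(result_types)
```

A subclass that wants to survive these operations has to override each of
them and rewrap the result.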
> 2. A "type decorator" on homogeneous collections and iterables has a
> lot of nice little benefits throughout the toolchain.
That is what you need to demonstrate, because it does not seem clear
yet. What would you do with an arrayiterator with a type attribute?
By the way, a 'decorator' in Python is a specific category of callable
used in a specific way. Perhaps you mean 'type attribute'?
> 3. Being able to place methods on a "type decorator" is useful,
'Placing methods' on an attribute or even a callable does not mean much.
You can only concretely add methods to concrete classes, not abstract
> it solves issues like "foo".join() which really wants to be a method on
> string collections.
No it does not. 'String collection' is a category, not a class. Nor can
it be a class without drastically revising Python. It is a category that
cuts across all generic collection classes. So .join has to be a method
of the joiner class.
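
A short illustration of why this works out: because join lives on the joiner,
one method serves every iterable of strings, whatever its class. The variable
names are only for the example:

```python
sources = [
    ['a', 'b', 'c'],              # list
    ('a', 'b', 'c'),              # tuple
    {'a': 1, 'b': 2, 'c': 3},     # dict, which iterates over its keys
    (ch for ch in 'abc'),         # generator
    iter('abc'),                  # plain iterator over a string
]

joined = [', '.join(src) for src in sources]
print(joined)

# The bytes type supplies its own join for iterables of bytes objects.
joined_bytes = b'-'.join([b'a', b'b'])
print(joined_bytes)
```

None of these source classes needed a join method of its own.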
> 4. I wanted to gauge people's feelings before I went through the steps
> involved in writing a PEP. I believe that is the right thing to do,
> so I don't feel the "hand waving" comment is warranted.
To the extent one does not understand what you say, and to the extent
that it seems disconnected from concrete reality, it is easy to see it
as hand waving. That you perhaps did not understand why .join is a
string method points in that direction.
> I've already
> learned people view collections that provide child object methods in
> vector form as a very big change
Because we understand that non-method functions have virtues, and Python
already has collection functions.
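
For instance, applying a method to every element already has several
spellings that work on any iterable, with no type information stored on the
collection. This is only a sketch of existing tools, not a claim about what
the proposal would add:

```python
from operator import methodcaller

words = ['spam', 'eggs']

upper_comp = [w.upper() for w in words]              # comprehension
upper_map = list(map(str.upper, words))              # unbound method with map
upper_mc = list(map(methodcaller('upper'), words))   # method looked up by name
lengths = list(map(len, words))                      # ordinary function

print(upper_comp, upper_map, upper_mc, lengths)
```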
> even if it is backwards compatible; that is fine.
Backwards compatible duplication needs justification.
> I agree that changes to syntax and commonly used modules that impact
> how people interface with them should be carefully vetted. Type
> decorations on homogeneous collections/iterators are effectively
I am still not sure what you are really proposing. You may have the germ
of a useful idea, but I think it needs clarification and a demonstration.
> invisible in that perspective though;
Slowdowns are not invisible. Requiring a type check on every addition to
every built-in collection might result in such.
> the main problem with them as I
> see it is that it involves touching a lot of code to implement, even
> if the actual implementation would be simple.
Changes that touch a lot of code are fairly rare and require major
benefits. One was the switch to new-style classes started in 2.2 and
ended in 3.0. Several people contributed patches. They must have thought
that unifying types and classes into one system was worth it.
In 3.3, the two unicode implementations (one per build) are effectively
combined with a third with a new C-level API. Adding and tweaking the
new API (which continues today) and converting the entire C core and
stdlib codebase to the new API has required something on the order of 50
patches over 3 months, so far. But it improves performance (overall) and
removes the inherent bugs in representing 3-byte chars with two 2-byte
chars and in having different Python builds respond differently to the
same code. Note that the PEP concretely lays out the new C structures
and API and that there was a prototype implementation showing benefits
before it was approved.
Terry Jan Reedy