[Python-ideas] Lessons from typing hinting Whoosh (PEP484)

Andrew Barnert abarnert at yahoo.com
Mon Nov 16 17:16:38 EST 2015


On Nov 16, 2015, at 09:31, Guido van Rossum <guido at python.org> wrote:
> 
> If you really want DocId to be as small as int you should add
> `__slots__ = ()` to the class def. It's still a bit slower (some code
> special-cases exactly int) and it also has some other imperfections --
> e.g. DocId(2) + DocId(2) == int(4), not DocId(4). But that's probably
> okay for a document ID (and the solution is way too messy to use).

Good points. But I don't think any of those things matter for the OP's use case. And surely being able to runtime-check the type and get the same results as compile-time checks, and not requiring any new language features, are advantages?

I'm sure there are cases where the performance matters, but are there enough cases where the performance matters, runtime typing (e.g., in logs and debugger) doesn't matter, and static typing does matter?

As a side note, most of the C++ family have something akin to "typedef" and/or "using" type aliases that are explicitly _not_ treated as different types by the static typing algorithm: if your function wants a DocId and I pass an int, you get neither an error nor a warning, because they are literally just different names for the same type. So, if Python really does need a feature that's stricter than what C++ and friends do, using a name that evokes comparisons to their feature is probably not a good idea. The way C++ does what the OP is asking for is exactly what I suggested: an empty class that publicly inherits the type. (Of course it only works with class types, not simple types like int, in C++. But Python only has class types, so that wouldn't be a problem.)

> But in general this is a pretty verbose option.

Even with __slots__, it's fewer characters, and the same number of concepts, as the OP's proposed solution.

I think it's also more obvious to read: you're saying that DocId is a brand-new type that's a subtype of int (that adds nothing), which by definition means it can be used anywhere an int can be used, but not vice-versa. You don't have to know anything about MyPy or isinstance or anything else to figure out how they'll handle it (except the basic knowledge that Python's type system is generally a sensible OO system).

> If the main purpose is
> for documentation maybe type aliases, only used on annotations and
> other types are enough (`DocId = int`). But yeah, the type checker
> won't track it. (That's a possible future feature though.)
> 
> On Mon, Nov 16, 2015 at 9:19 AM, Andrew Barnert via Python-ideas
> <python-ideas at python.org> wrote:
>> On Nov 15, 2015, at 22:17, Matt Chaput <matt at whoosh.ca> wrote:
>>> 
>>> 2. It would be really nice if we could have "type aliasing" (or whatever it's called). I have a lot of types that are just something like "Tuple[int, int]", so type checking doesn't help much. It would be much more useful if I have a value that Python sees as (e.g.) an int, but have the type system track it as something more specific. Something like this:
>>> 
>>>   DocId = typing.TypeAlias(int)
>>>   DocLength = typing.TypeAlias(int)
>>> 
>>>   def id_and_length() -> Tuple[DocId, Length]:
>>>       docid = 5  # type: DocId
>>>       length = 10  # type: Length
>>>       return docid, length
>> 
>> It sounds like you want a subtype that adds no new semantics or other runtime effects. Like this:
>> 
>>    class DocId(int): pass
>>    class DocLength(int): pass
>> 
>>    def id_and_length() -> Tuple[DocId, DocLength]:
>>        return DocId(5), DocLength(10)
>> 
>> These will behave exactly like int objects, except that you can (dynamically or statically) type check them. It does mean that if someone uses "type(spam[0]) == int" it will fail, but I think if you care either way, you'd actually want it to fail. Meanwhile, "isinstance(spam[0], int)" or "spam[0] + eggs" or even using it in a function that requires something usable as a C long will work as expected. The object will also be the same size as an int in memory (although it will pickle a little bigger). It can't be optimized into a constant at compile time, but I doubt that's ever an issue. And it makes your intentions perfectly clear.
>> _______________________________________________
>> Python-ideas mailing list
>> Python-ideas at python.org
>> https://mail.python.org/mailman/listinfo/python-ideas
>> Code of Conduct: http://python.org/psf/codeofconduct/
> 
> 
> 
> -- 
> --Guido van Rossum (python.org/~guido)


More information about the Python-ideas mailing list