[Python-ideas] "Sum" Type hinting [was: Type hinting for path-related functions]

Stephen J. Turnbull stephen at xemacs.org
Sun May 15 13:20:08 EDT 2016


Steven D'Aprano writes:

 > Are we supposed to know what category theory's sum functor is?

No.  There are Pythonistas who do know; that was for their benefit.
But to clarify what I mean, I'll quote Guido's example (aside to
Guido: yes, that helped!):

 > PEP 484 has type variables with constraints, and the typing module
 > predefines AnyStr = TypeVar('AnyStr', str, bytes). This is then
 > used to define various polymorphic functions, e.g. in os.path:

 > def abspath(path: AnyStr) -> AnyStr: ...

 > This means that the type checker will know that the type of abspath(b'.')
 > is bytes, the type of abspath('.') is str.

My bad here.  This has the effect of what I called a Sum; I just
didn't recognize it as that when I last read PEP 484.

 > I also don't understand what you mean by '"suspicious" unions'
 > (quotes in original) or what makes them suspicious, or "suspicious"
 > as the case may be. What's suspicious about Union[bytes, str]?

Union types don't allow you to trace the source of an unexpected
type backward (or forward).  Thus, all instances of a Union type must
be considered potential sources of the "wrong" type, even if you can
infer some of the types of source variables.  Compare Number, another
union type: consider trying to figure out where an "unexpected" float
came from in a function performing a long sequence of arithmetic
operations including many divisions.  You can't do it knowing only the
types of the arguments to the function; you need to know the values as
well.
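
A minimal sketch of that point, using exponentiation rather than
division because in Python 3 the result type of int ** int really does
depend on the values.  The names Number (a stand-in Union alias) and
power() are made up for illustration:

```python
from typing import Union

# Hypothetical stand-in for a numeric union type.
Number = Union[int, float]


def power(base: int, exponent: int) -> Number:
    # Both arguments are always int, yet the result type varies with
    # the *value* of the exponent, so the annotation can only say Union.
    return base ** exponent


print(type(power(2, 3)).__name__)   # int
print(type(power(2, -3)).__name__)  # float
```

Knowing that both arguments are int tells a checker nothing about
which member of the Union comes back.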

But I shouldn't have taken the OP's Union[bytes,str] seriously.  In
fact the os.path functions and other such functions are defined with
AnyStr, which is a type *variable*, not a Union type.  In typeshed, we
write

def basename(path: AnyStr) -> AnyStr: ...  # stub for os.path.basename

I.e., AnyStr can be either str or bytes, but it must be the same in all
places it appears in the annotated signature.  This allows us to
reason forward from an appearance of bytes or str to determine what
the value of any composition of os or os.path AnyStr->AnyStr functions
would be.  E.g., in

    def foo(path: str) -> str:
        return os.path.basename(os.path.dirname(os.path.realpath(path)))

the composition is str->AnyStr->AnyStr->AnyStr, which resolves to
str->str->str->str, and so foo() typechecks successfully.
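
The same forward reasoning can be observed at run time.  This is only
a sketch: leaf_dir() is a hypothetical helper, and the sample path is
arbitrary and need not exist.

```python
import os.path
from typing import AnyStr


def leaf_dir(path: AnyStr) -> AnyStr:
    # Every step is AnyStr -> AnyStr, so whichever of str or bytes
    # goes in is what comes out.
    return os.path.basename(os.path.dirname(os.path.realpath(path)))


print(type(leaf_dir("/tmp/a/b.txt")).__name__)   # str
print(type(leaf_dir(b"/tmp/a/b.txt")).__name__)  # bytes
```

A checker following PEP 484 should infer str for the first call and
bytes for the second, with no Union in sight.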

Back to typing and the __fspath__ PEP.  The remaining question dealing
with typing of __fspath__ as currently specified is determining the
subtype of DirEntry you're looking at when its __fspath__ gets
invoked.  (A similar problem remains for the name attribute of file
objects, but they aren't scheduled to get a __fspath__.)  But AFAICT
there is no way to do that yet (I can't find DirEntry in the typing
module).  Maybe I can contribute to resolving these issues.
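
One conceivable direction, offered purely as a sketch and not as
anything the PEP specifies: make the entry class generic in AnyStr, so
the checker can carry the str/bytes distinction through __fspath__.
Entry below is a hypothetical stand-in for DirEntry, and os.fspath()
assumes a Python that implements the __fspath__ protocol (3.6 or
later).

```python
import os
from typing import AnyStr, Generic


class Entry(Generic[AnyStr]):
    """Hypothetical stand-in for DirEntry, parameterized by path type."""

    def __init__(self, path: AnyStr) -> None:
        self._path = path

    def __fspath__(self) -> AnyStr:
        # Returns whichever type the entry was created with.
        return self._path


print(type(os.fspath(Entry("spam.txt"))).__name__)   # str
print(type(os.fspath(Entry(b"spam.txt"))).__name__)  # bytes
```

With this shape, Entry[str] and Entry[bytes] are distinct to the
checker, which is the information a plain Union return would lose.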

 > Are you proposing a Sum constructor for the typing module, as an
 > alternative to Union? What will it do?

Not any more.  It would have done basically what constrained TypeVars
like AnyStr do, but more obscurely, and it would have required much
more effort and notation to work with functions with multiple
arguments of different types and the like.
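
For comparison, here is roughly how little notation a constrained
TypeVar needs in those cases.  Both functions are hypothetical examples
of mine, not anything from the stdlib or typeshed:

```python
from typing import AnyStr


def repeat(piece: AnyStr, count: int) -> AnyStr:
    # One AnyStr argument plus an unrelated int argument.
    return piece * count


def add_suffix(path: AnyStr, suffix: AnyStr) -> AnyStr:
    # Two AnyStr arguments: both must be str, or both must be bytes.
    return path + suffix


print(repeat("ab", 2))           # abab
print(add_suffix(b"a", b".py"))  # b'a.py'
```

A call such as add_suffix("a", b".py") is what a checker should reject,
and it also fails at run time with TypeError.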

 > If not, why are you talking about Sum[bytes, str]?

Because I didn't know better then. :-)


