I've opened an issue for adding non-English names to the turtle module's
function names: https://bugs.python.org/issue24990
This would effectively take this code:
t = turtle.Pen()
...and have this code in French be completely equivalent:
t = turtle.Plume()
(Pardon my google-translate French.)
This, of course, is a terrible way for a software module to implement
internationalization, which usually does not apply to the source code names
themselves. But turtle is used as a teaching tool. While professional
developers are expected to obtain proficiency with English, the same does
not apply to school kids who are just taking a small computer programming
unit. Having the turtle module available in their native language (even if
Python keywords are not) would remove a large barrier and let them focus on
the core programming concepts that turtle provides.
The popular Scratch tool has a similar internationalized setup and also has
LOGO-style commands, so most of the translation work is already done.
Are there any design or technical issues I should be aware of before doing
this? It seems like a straightforward "Tortuga = Turtle" assignment of
names, though I would have a set up so that it is easy to add languages to
I have a Google-translated set of translations here:
But of course, a native speaker would have to sign off on it before making
it part of the turtle module API.
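A rough sketch of how such an alias table might be wired up. To keep it runnable without a display, `Pen` below is a small stand-in rather than the real `turtle.Pen`, and `fake_turtle`, `CLASS_ALIASES`, `METHOD_ALIASES`, `install_aliases` and the translated names are all hypothetical (the translations have not been checked by native speakers).

```python
from types import SimpleNamespace


class Pen:
    """Stand-in for turtle.Pen so the sketch runs without Tk."""

    def __init__(self):
        self.x = 0

    def forward(self, distance):
        self.x += distance
        return self.x


# Stand-in for the turtle module namespace (assumption for this sketch).
fake_turtle = SimpleNamespace(Pen=Pen)

# Hypothetical per-language tables: translated name -> existing English name.
CLASS_ALIASES = {
    "fr": {"Plume": "Pen"},
    "es": {"Pluma": "Pen"},
}
METHOD_ALIASES = {
    "fr": {"Pen": {"avancer": "forward"}},
    "es": {"Pen": {"avanzar": "forward"}},
}


def install_aliases(module, language):
    """Bind translated names to the existing classes and methods."""
    for cls_name, methods in METHOD_ALIASES.get(language, {}).items():
        cls = getattr(module, cls_name)
        for alias, original in methods.items():
            setattr(cls, alias, getattr(cls, original))
    for alias, original in CLASS_ALIASES.get(language, {}).items():
        setattr(module, alias, getattr(module, original))


install_aliases(fake_turtle, "fr")
t = fake_turtle.Plume()   # same class object as fake_turtle.Pen
print(t.avancer(10))      # same function as Pen.forward
```

Adding a language is then a matter of adding one more entry to each table, with no new logic.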
I propose adding a function into inspect module that will retrieve
definitions of classes and functions (standard and lambdas) located
inside another function/method.
In my opinion this would be a small but nice and useful addition to the
standard library. It can be implemented using a couple of undocumented
functions from that module (findsource and getblock) without any
In : print(getsource(function))
# Some code
# Some more code
# Even more code
l = lambda x: 42
# Ugh code again
In : for c in function.__code__.co_consts:
....:     if not iscode(c):
....:         continue
....:     name, starts_line = c.co_name, c.co_firstlineno
....:     if not name.startswith('<') or name == '<lambda>':
....:         lines, _ = findsource(c)
....:         source = ''.join(getblock(lines[starts_line-1:]))
....:         print(dedent(source), end='-' * 30 + '\n')
l = lambda x: 42
What do you think?
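For anyone who wants to try the idea outside an interactive session, here is a self-contained sketch. The helper name `nested_definitions` and the sample `function` are mine, not part of inspect. The sample is written to a temporary file because findsource needs real source lines to read.

```python
import importlib.util
import inspect
import os
import tempfile
import textwrap

# Sample module source; written to disk so inspect can find the lines.
SOURCE = '''\
def function():
    # Some code
    class A:
        pass
    # Some more code
    def inner(x):
        return x + 1
    # Even more code
    l = lambda x: 42
    return A, inner, l
'''


def nested_definitions(func):
    """Return dedented source of classes, functions and lambdas inside func."""
    results = []
    for const in func.__code__.co_consts:
        if not inspect.iscode(const):
            continue  # ordinary constants such as numbers and strings
        name = const.co_name
        if name.startswith('<') and name != '<lambda>':
            continue  # skip comprehensions and generator expressions
        lines, _ = inspect.findsource(const)
        block = inspect.getblock(lines[const.co_firstlineno - 1:])
        results.append(textwrap.dedent(''.join(block)))
    return results


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'nested_demo.py')
    with open(path, 'w') as f:
        f.write(SOURCE)
    spec = importlib.util.spec_from_file_location('nested_demo', path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    found = nested_definitions(module.function)

for source in found:
    print(source, end='-' * 30 + '\n')
```

Since findsource and getblock are undocumented, their behaviour may differ between Python versions; treat this as an experiment rather than a supported recipe.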
I've received several long emails from Theo de Raadt (OpenBSD founder)
about Python's default random number generator. This is the random module,
and it defaults to a Mersenne Twister (MT) seeded by 2500 bytes of entropy
taken from os.urandom().
Theo's worry is that while the starting seed is fine, MT is not good when
random numbers are used for crypto and other security purposes. I've
countered that it's not meant for that (you should use
random.SystemRandom() or os.urandom() for that) but he counters that people
don't necessarily know that and are using the default random.random() setup
for security purposes without realizing how wrong that is.
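To make the difference concrete, here is a small sketch contrasting the seeded default generator with the two interfaces named above. The helper name `make_token` and the alphabet are illustrative choices of mine, not an existing API.

```python
import binascii
import os
import random
import string

ALPHABET = string.ascii_letters + string.digits


def make_token(length=32):
    """Build a token from the OS entropy source via random.SystemRandom."""
    rng = random.SystemRandom()
    return ''.join(rng.choice(ALPHABET) for _ in range(length))


# Raw bytes straight from os.urandom(), hex-encoded for display.
raw_token = binascii.hexlify(os.urandom(16)).decode('ascii')

# Mersenne Twister: two generators with the same seed produce the same
# sequence, which is exactly what makes it reproducible (and predictable).
mt_a = random.Random(12345)
mt_b = random.Random(12345)
same = [mt_a.random() for _ in range(3)] == [mt_b.random() for _ in range(3)]

print(make_token())
print(raw_token)
print('identical MT sequences:', same)
```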
There is already a warning in the docs for the random module that it's not
suitable for security, but -- as the meme goes -- nobody reads the docs.
Theo then went into technicalities that went straight over my head,
concluding with a strongly worded recommendation of the OpenBSD version of
arc4random() (which IIUC is based on something called "chacha", not on
"RC4" despite that being in the name). He says it is very fast (but I don't
know what that means).
I've invited Theo to join this list but he's too busy. The two core Python
experts on the random module have given me opinions suggesting that there's
not much wrong with MT, so here I am. Who is right? What should we do? Is
there anything we need to do?
--Guido van Rossum (python.org/~guido)
This is an expansion of the random module enhancement idea I
previously posted to Donald's thread:
I'll write it up as a full PEP later, but I think it's just as useful
in this form for now.
= Defining the problem =
We're moving into an era where the easiest way to publish software is
as a web application, with "deployment" to client systems done at
runtime via a web browser. It's regularly the case that "learn to
program" classes (especially those aimed at adults picking up
programming for the first time) will introduce folks to both a web
development framework and how to deploy web applications on a
developer focused service with a free hosting tier, like Heroku or
It's also the case that we live in an era where there's a lot of
well-intentioned-but-actually-bad advice on the internet when it comes
to generating security sensitive tokens, and the folks receiving that
advice through forums like Stack Overflow aren't necessarily ever
going to see the "don't do that" guidance in the standard library's
random module documentation, or the docs for the cryptography library,
or the docs for a web framework like Flask, Django or Pyramid.
One of the ways we know many of the folks doing web development often
don't take admonitions in documentation seriously is because one of
the most popular web servers for Python on these kinds of services is
Django's "runserver", even though Django's docs specifically say only
to use that for local development. It isn't OK to say "the developers
deserve the consequences that come to them" as in many cases, it isn't
the developers that suffer the consequences, but the users of their
One reason we know weak RNGs can be a problem in practice is because
the same kind of concern exists in PHP web applications, and
shows how the relative predictability of password reset tokens can be
used to compromise administrator accounts.
Rather than playing whackamole with individual web applications (many
of which will be written by inexperienced developers), or attempting
to demonstrate that a deterministic PRNG is "secure enough" for these
use cases (when the research on PHP and deterministic PRNGs in general
indicates that it isn't), it is proposed to migrate Python to a
default random implementation that *is* known to be secure enough for
these kinds of use cases.
At the same time, deterministic random number generation is still
desirable in many situations, and we also don't want to require that
folks learning Python in the future take a crash course
in web application security theory first. Thus, it is also proposed
that the abstraction used to present these differences to end users
minimise the references to the underlying security concepts.
A key outcome of this proposal is that it will retroactively upgrade a
lot of existing instructions on the internet for generating default
passwords and other sensitive tokens in Python from "actively harmful"
to "not necessarily ideal, but at least not wrong if you're using
This *is* a compatibility break for the sake of correcting default
behaviours that are fine when developing applications for local use,
but problematic from a network service security perspective, just as
happened with the introduction of hash randomisation. Unlike the hash
randomisation change, this one is readily addressed in old versions on
a case by case basis, so it is only proposed to make the change in a
future feature release of Python, not in any current maintenance
= Core abstraction =
The core concept of this proposal involves classifying random number
generators in Python as follows:
These terms are chosen to make sense to folks that have *no idea*
about the way different kinds of random number generator work and how
that affects their security properties, but do know whether or not
they need to be able to pass in a particular fixed seed in order to
regenerate the same series of outputs.
The guidance to Python users is then:
* we use the seedless RNG by default as it provides the best balance
of speed and security
* if you need to be able to exactly reproduce output sequences, use
the seedable RNG
* if you know you're doing security sensitive work, use the system RNG
directly to eliminate Python's seedless RNG as a potential source of
Importantly, there are relatively simple answers to the following two
questions (which could be added to the Design FAQ):
Q: Why isn't the seedable RNG the default random implementation (any more)?
A: The same properties that make it possible to provide an explicit
seed to the seedable RNG and get a predictable series of outputs make
it inappropriate for tasks like generating session IDs and password
reset tokens in web applications. Since folks continued to use the
default RNG for those cases, even after years of the core development
team, web framework developers and security engineers saying "Don't do
that, use the system RNG instead", we eventually changed the default
behaviour to just make those cases OK.
Q: Why isn't the system RNG the default implementation?
A: Due to the way operating systems work, calling into the kernel to
get a random number is always going to be slower than generating one
within the Python runtime. The default seedless generator provides
most of the same benefits as using the system RNG directly, but is an
order of magnitude faster as it doesn't need to call into the kernel
= Proposed change for Python 3.6 =
* add a random.SeedlessRandom API that omits the seed(), getstate()
and setstate() methods and uses a cryptographically secure PRNG
internally (such as the ChaCha20 algorithm implemented by OpenBSD)
* rename random.Random to random.SeedableRandom
* make random.Random a subclass of SeedableRandom that deprecates
seed(), getstate() and setstate()
* deprecate the seed(), getstate() and setstate() methods on SystemRandom
* expose the global SeedableRandom instance as random.seedable_random
* expose a global SeedlessRandom instance as random.seedless_random
* expose a global SystemRandom instance as random.system_random
* provide a random.set_default_instance() API that makes it possible
to specify the instance used by the module level methods
* the module level seed(), getstate(), and setstate() functions will
throw RuntimeError if the corresponding method is missing from the
In 3.6, "random.set_default_instance(random.seedless_random)" will opt
in to the CSPRNG when using the module level functions process wide,
while "from random import seedless_random as random" will do so on a
module by module basis.
"from random import system_random as random" also becomes available as
a simple upgrade path for security sensitive modules.
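As a very rough illustration of how the module level functions could delegate to a swappable default instance, here is a standalone sketch. Nothing in it exists in the random module: `SeedlessRandom`, `seedable_random`, `seedless_random` and `set_default_instance` are the names from the proposal, implemented here only as local stand-ins, and random.SystemRandom substitutes for the userspace CSPRNG, which the standard library does not provide.

```python
import random


class SeedlessRandom:
    """Hypothetical seedless RNG: no seed(), getstate() or setstate().

    Assumption: SystemRandom stands in for the proposed userspace CSPRNG.
    """

    def __init__(self):
        self._rng = random.SystemRandom()

    def random(self):
        return self._rng.random()

    def choice(self, seq):
        return self._rng.choice(seq)


seedable_random = random.Random()
seedless_random = SeedlessRandom()
_default = seedable_random  # the proposed 3.6 default


def set_default_instance(instance):
    """Choose which instance backs the module level functions."""
    global _default
    _default = instance


def choice(seq):
    return _default.choice(seq)


def seed(a=None):
    """Module level seed(): fails if the default instance cannot be seeded."""
    method = getattr(_default, 'seed', None)
    if method is None:
        raise RuntimeError('default RNG does not support seeding')
    method(a)


seed(42)
first = choice('abcdef')
seed(42)
print(choice('abcdef') == first)  # reproducible with the seedable default

set_default_instance(seedless_random)
try:
    seed(42)
except RuntimeError as exc:
    print('seed() rejected:', exc)
```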
Appropriate helpers would be added to the six and future projects to
allow single source Python 2/3 projects to easily cope with the change
in behaviour when using the seeded RNG for its intended purposes. For
many projects, compatibility code will consist of the following lines
in a compatibility module:
from random import seedable_random as random
It would also be desirable for the seedless random number generator to
be made available as a PyPI package for use on older Python versions.
= Proposed change for Python 3.7 =
* random.Random becomes an alias for random.SeedlessRandom
* the default instance changes to be random.seedless_random
In 3.7, "random.set_default_instance(random.seedable_random)" will opt
back in to the deterministic PRNG when using the module level
functions process wide, while "from random import seedable_random as
random" will do so on a module by module basis.
= Seedable random number generation =
This is what we have today. The MT random implementation supports
explicit seeding, state retrieval, and state restoration. It doesn't
automatically mix in additional system entropy as it operates.
This is the right choice for use cases like computer games, map
generation, and randomising the order of test execution, as in these
situations, it's desirable to be able to reproduce a past sequence
= Seedless random number generators =
This is the key proposed new addition: a cryptographically secure,
non-deterministic, userspace PRNG. It's faster than the system RNG as
it avoids the need to make a system API call.
The "seedless" name comes from the fact that the inability to feed in
a fixed seed is the most obvious API difference relative to
deterministic RNGs, and hence provides a mental hook for people to
remember which is which, without needing to know the relevant
background security theory (which is arcane enough to be opaque even
to developers with decades of experience and hence isn't something we
want to be inflicting on folks in the process of learning to program).
= System random number generator =
The only proposed change here is providing a default instance to
enable the "from random import system_random as random" pattern.
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
Ok, I reached out to Theo de Raadt to talk to him about what he was suggesting
without Guido having to play messenger and forward fragments of the email
conversation. I'm starting a new thread because this email is rather long, and
I'm hoping to divorce it a bit from the back and forth about a proposal that
wasn't exactly what Theo was suggesting that is being discussed in the other
Essentially, there are three basic types of uses of random (the concept, not
the module). Those are:
1. People/usecases who absolutely need deterministic output given a seed and
for whom security properties don't matter.
2. People/usecases who absolutely need a cryptographically random output and
for whom having a deterministic output is a downside.
3. People/usecases that fall somewhere in between where it may or may not be
security sensitive or it may not be known if it's security sensitive.
The people in group #1 are currently, in the Python standard library, best
served using the MT random source as it provides exactly the kind of determinism
they need. The people in group #2 are currently, in the Python standard
library, best served using os.urandom (either directly or via
However, the third case is the one that Theo's suggestion is attempting to
solve. In the current landscape, the security minded folks will tell these
people to use os.urandom/random.SystemRandom and the performance or otherwise
less security minded folks will likely tell them to just use random.py, leaving
these people with a random that is not cryptographically safe.
The question then is, does it matter if #3 are using a cryptographically safe
source of randomness? The answer is obviously that we don't know, and it's
possible that the user doesn't know. In these cases it's typically best if we
default to the more secure option and expect people to opt in to insecurity.
In the case of randomness, a lot of languages (Python included) don't do that
and instead they opt to pick the more performant option first, often with the
argument (as seen in the other thread) that if people need a cryptographically
secure source of random, they'll know how to look for it and if they don't
know how to look for it, then it's likely they'll have some other security
problem. I think (and I believe Theo thinks) this sort of thinking is short
sighted. Let's take the example of a web application: it's going to need session
identifiers to put into a cookie, you'll want these to be random, and it's not
obvious on the tin for a non-expert that you can't just use the module level
functions in the random module to do this. Other examples are generating API
keys or a password.
Looking on google, the first result for "python random password" is
StackOverflow which suggests:
''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))
However, it was later edited to also include:
''.join(random.SystemRandom().choice(string.ascii_uppercase + string.digits) for _ in range(N))
So it wasn't obvious to the person who answered that question that the random
module's module scoped functions were not appropriate for this use. It appears
that the original answer lasted for roughly 4 years before it was corrected,
so who knows how many people used that in those 4 years.
The second result has someone asking if there is a better way to generate a
random password in Python than:
import os, random, string
length = 13
chars = string.ascii_letters + string.digits + '!@#$%^&*()'
random.seed = (os.urandom(1024))
print ''.join(random.choice(chars) for i in range(length))
This person obviously knew that os.urandom existed and that he should use it,
but failed to correctly identify that the random module's module scoped
functions were not what he wanted to use here.
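To spell out what goes wrong in that second snippet: `random.seed = (os.urandom(1024))` assigns to the name `seed` rather than calling it, so nothing is reseeded and the values still come from the Mersenne Twister. The sketch below shows the effect on a private Random instance (so the module itself is left intact) and then one way to get what was probably intended. The variable names follow the quoted snippet; everything else is illustrative.

```python
import os
import random
import string

length = 13
chars = string.ascii_letters + string.digits + '!@#$%^&*()'

# Reproduce the mistake on a private instance instead of the module.
rng = random.Random(2015)
before = rng.getstate()
rng.seed = os.urandom(1024)      # rebinds the attribute; no reseeding happens
assert rng.getstate() == before  # the generator state is untouched
assert not callable(rng.seed)    # the seed method is now shadowed by bytes

# One way to do what was probably intended: draw from the OS source.
system_rng = random.SystemRandom()
password = ''.join(system_rng.choice(chars) for _ in range(length))
print(password)
```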
The third result has this code:
chars=string.ascii_uppercase + string.ascii_lowercase + string.digits
return ''.join(random.choice(chars) for x in range(size,12))
I'm not going to keep pasting snippets, but going through the results it is
clear that in the bulk of cases, this search turns up code snippets that
suggest there is likely to be a lot of code out there that is unknowingly using
the random module in a very insecure way. I think this is a failing of the
random.py module to provide an API that guides users to be safe which was
attempted to be papered over by adding a warning to the documentation, however
as has been said before, you can't solve a UX problem with documentation.
Then we come to why might we want to not provide a safe random by default for
the folks in the #3 group. As we've seen in the other thread, this basically
boils down to the fact that for a lot of users they don't care about the
security properties and they just want a fast random-esque value. This
particular case is made stronger by the fact that there is a lot of code out
there using Python's random module in a completely safe way that would regress
in a meaningful way if the random module slowed down.
The fact that speed is the primary reason not to give people in #3 a
cryptographically secure source of random by default is where we come back to
the meat of Theo's suggestion. His claim is that invoking os.urandom through
any of the interfaces imposes a performance penalty because it has to round
trip through the kernel crypto sub system for every request. His suggestion is
essentially that we provide an interface to a modern, good, userland
cryptographically secure source of random that is running within the same
process as Python itself. One such example of this is the arc4random function
(which doesn't actually provide ARC4 on OpenBSD, it provides ChaCha, it's not
tied to one specific algorithm) which comes from libc on many platforms.
According to Theo, modern userland CSPRNGs can create random bytes faster than
memcpy which eliminates the argument of speed for why a CSPRNG shouldn't be
the "default" source of randomness.
Thus the proposal is essentially:
* Provide an API to access a modern userland CSPRNG.
* Provide an implementation of random.SomeKindOfRandom that utilizes this.
* Move the MT based implementation of the random module to
* Deprecate the module scoped functions, instructing people to use the new
random.SomeKindofRandom unless they need deterministic random, in which case
This can of course be tweaked one way or the other, but that's the general idea
translated into something actionable for Python. I'm not sure exactly how I
feel about it, but I certainly do think that the current situation is confusing
to end users and leaving them in an insecure state, and that at a minimum we
should move MT to something like random.DeterministicRandom and deprecate the
module scoped functions because it seems obvious to me that the idea of a
"default" random function that isn't safe is a footgun for users.
As an additional consideration, there are security experts who believe that
userland CSPRNGs should not be used at all. One of those is Thomas Ptacek who
wrote a blog post on the subject. In this, Thomas makes the case that a
userland CSPRNG pretty much always depends on the cryptographic security of
the system random, but that it itself may be broken which means you're adding
a second, single point of failure where a mistake can cause you to get
non-random data out of the system. I had asked Theo about this, and he stated
that he disagreed with Thomas about never using a userland CSPRNG and in his
opinion that blog post was mostly warning people away from using something like
MT in the userland and away from /dev/random (which is often the cause of
people reaching for MT because /dev/random blocks which makes programs even
It seems to boil down to a few questions. Do we want to try to protect users by
default, or at least make it more obvious in the API which one they want to use
(I think yes)? If so, do we think that /dev/urandom is "fast enough" for most
people in group #3? If not, do we agree with Theo that a modern userland CSPRNG
is safe enough to use, or do we agree with Thomas that it's not? And if we think
that it is, do we use arc4random, and what do we do on systems that don't have
a modern userland CSPRNG in their libc?
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
[Nathaniel Smith <njs(a)vorpus.org>]
> Yeah, the independent-seed-for-each-thread approach works for any RNG, but
> just like people feel better if they have a 100% certified guarantee that
> the RNG output in a single thread will pass through every combination of
> possible values (if you wait some cosmological time), they also feel better
> if there is some 100% certified guarantee that the RNG values in two threads
> will also be uncorrelated with each other.
> With something like MT, if two threads did end up with nearby seeds, then
> that would be bad: each thread individually would see values that looked
> like high quality randomness, but if you compared across the two threads,
> they would be identical modulo some lag. So all the nice theoretical
> analysis of the single threaded stream falls apart.
> However, for two independently seeded threads to end up anywhere near each
> other in the MT state space requires that you have picked two numbers
> between 0 and 2**19937 and gotten values that were "close". Assuming your
> seeding procedure is functional at all, then this is not a thing that will
> ever actually happen in this universe.
I think it's worse than that. MT is based on a linear recurrence.
Take two streams "far apart" in MT space, and their sum also satisfies
the recurrence. So a possible worry about a third stream isn't _just_
about correlation or overlap with the first two streams, but,
depending on the app, also about correlation/overlap with the sum of
the first two streams. Move to N streams, and there are O(N**2)
direct sums to worry about, and then sums of sums, and ...
Still won't cause a problem in _my_ statistical life expectancy, but I
only have 4 cores ;-)
> So AFAICT the rise of explicitly multi-threaded RNG designs is one of
> those fake problems that exists only so people can write papers about
> solving it. (Maybe this is uncharitable.)
Uncharitable, but fair :-)
> So there exist RNG designs that handle multi-threading explicitly, and it
> shows up on feature comparison checklists. I don't think it should really
> affect Python's decisions at all though.
There are some clean and easy approaches to this based on
crypto-inspired schemes, but giving up crypto strength for speed. If
you haven't read it, this paper is delightful:
In Python there is an operation for floor division: a // b.
Ceiling division can easily be expressed via floor division: -((-a) // b).
But round division is more complicated. This operation is needed in
Fraction.__round__, in a number of methods in the datetime module (see
_divide_and_round). Due to the complexity of the correct Python
implementation, it is slower than plain division.
I propose to add a special function to the math module. This will not only
speed up the Python implementations of the datetime and fractions
modules, but will also encourage users to use the correct algorithm instead of
the obvious but incorrect round(a/b).
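For reference, here is a sketch of the three operations in pure Python. The rounding helper follows the same approach as the private _divide_and_round helper in the pure-Python datetime implementation (round to nearest, ties to even); the function names `ceil_div` and `divide_and_round` are just for this example and are not a proposed spelling for the math module.

```python
def ceil_div(a, b):
    """Ceiling division expressed through floor division."""
    return -((-a) // b)


def divide_and_round(a, b):
    """Divide a by b and round to the nearest integer, ties to even."""
    q, r = divmod(a, b)
    # Compare 2*r with b to see which side of the halfway point r is on.
    # The remainder has the sign of b, so the comparison flips for b < 0.
    r *= 2
    greater_than_half = r > b if b > 0 else r < b
    if greater_than_half or (r == b and q % 2 == 1):
        q += 1
    return q


print(7 // 2, ceil_div(7, 2), divide_and_round(7, 2))

# round(a/b) goes through a float and loses precision for large integers.
a = 2**53 + 1
print(round(a / 1) == a)            # False: the float cannot represent a
print(divide_and_round(a, 1) == a)  # True: integer arithmetic throughout
```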
On 10.09.2015 19:04, Xavier Combelle wrote:
>> I think this is the major misunderstanding here:
>> The random module never suggested that it generates pseudo-random data
>> of crypto quality.
>> I'm pretty sure people doing crypto will know and most others
>> simply don't care :-)
>> Evidence: We used a Wichmann-Hill PRNG as default in random
>> for a decade and people still got their work done. Mersenne
>> was added in Python 2.3 and bumped the period from
>> 6,953,607,871,644 (13 digits) to 2**19937-1 (6002 digits).
> It is not evidence; I have evidence of the opposite:
> some people can and do use random.random() for generating session keys or
> CSRF tokens, and it's an insecure default.
It all depends on what you consider "secure" or "secure enough"
and points directly to another misunderstanding: that "secure"
is a well-defined term :-)
The random module seeds its global Random instance using urandom
(if available on the system), so while the generator itself is
deterministic, the seed used to kick off the pseudo-random series
is not. For many purposes, this is secure enough.
It's also easy to make the output of the random instance more
secure by passing it through a crypto hash function.
But back to the original question: What is "secure" ?
In crypto terms, "secure" usually refers to "computationally
infeasible to calculate before the sun goes dark" (to take one
More realistically, it can be defined as: Based on the public
knowledge known today, it's impossible to run a program which
allows converting the output of a crypto function back to its
inputs within a reasonable time span. And this property will
- based on today's knowledge - hold for at least the next
You may notice the many parameters in these definition attempts.
It all depends on who you ask.
With the advent of new technologies like quantum computers,
it's not at all clear that any of those definitions will still
hold in a couple of years. It's well possible that only quantum
computers will be able to implement the necessary programs
and it'll take a while for mobile phones to catch up and come
with chips implementing those ;-)
Now, leaving aside this bright future, what's reasonable today ?
If you look at tools like untwister:
you can get a feeling for how long it takes to deduce the
seed from an output sequence. Bear in mind that in order
to be reasonably sure that the seed is correct, the available
output sequence has to be long enough.
That's a known plain text attack, so you need access to lots
of session keys to begin with.
The tool is still running on an example set of 1000 32-bit
numbers and it says it'll be done in 1.5 hours, i.e. before
the sun goes down in my timezone. I'll leave it running to
see whether it can find my secret key.
Untwister is only slightly smarter than bruteforce. Given
that MT has a seed size of 32 bits, it's not surprising that
a tool can find the seed within a day.
Perhaps it's time to switch to a better version of MT, e.g.
a 64-bit version (with 64-bit internal state):
or an even faster SIMD variant with better properties and
128 bit internal state:
Esp. the latter will help make brute force attacks practically
BTW: Looking at the sources of the _random module, I found that
the seed function uses the hash of non-integers such as e.g.
strings passed to it as seeds. Given the hash randomization
for strings this will create non-deterministic results, so it's
probably wise to only use 32-bit integers as seed values for
portability, if you need to rely on seeding the global Python
Professional Python Services directly from the Source (#1, Sep 11 2015)
eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
On Fri, Sep 11, 2015, at 09:36, Steven D'Aprano wrote:
> Yes, calling `random.choice` is *significantly better* than calling
> `random.SomethingRandom().choice`. It's better for beginners, it's even
> better for expert users whose random needs are small, and those whose
> needs are greater shouldn't be using the latter anyway.
Why is it that people who need deterministic/seed based random aren't
considered to be "those whose needs are greater"?
On Wed, Sep 9, 2015 at 10:48 PM, Andrew Barnert <abarnert(a)yahoo.com> wrote:
> On Sep 9, 2015, at 21:34, Jukka Lehtosalo <jlehtosalo(a)gmail.com> wrote:
> I'm not sure if I fully understand what you mean by implicit vs. explicit
> ABCs (and the static/runtime distinction). Could you define these terms and
> maybe give some examples of each?
> I just gave examples just one paragraph above.
> A (runtime) implicit ABC is something that uses a __subclasshook__
> (usually implementing a structural check). So, for instance, any type that
> implements __iter__ is-a Iterable, e.g., according to isinstance or
> issubclass or @singledispatch, because that's what
> Iterable.__subclasshook__ checks for.
> A (runtime) explicit ABC is something that isn't implicit, like Sequence:
> no hook, so nothing is-a Sequence unless it either inherits the ABC or
> registers with it.
> You're proposing a parallel but separate distinction at static typing
> time. Any ABC that's a Protocol is checked based on a structural check;
> otherwise, it's checked based on inheritance.
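The runtime half of that distinction can be checked directly with collections.abc. In this sketch, `Countdown` and `Triple` are made-up example classes: the first is accepted as an Iterable purely because it defines __iter__, while the second only becomes a Sequence once it is registered.

```python
from collections.abc import Iterable, Sequence


class Countdown:
    """Defines __iter__ but inherits from nothing special."""

    def __iter__(self):
        return iter((3, 2, 1))


class Triple:
    """Has the sequence methods but does not inherit from Sequence."""

    def __getitem__(self, index):
        return (1, 2, 3)[index]

    def __len__(self):
        return 3


# Implicit ABC: Iterable.__subclasshook__ does a structural check.
print(isinstance(Countdown(), Iterable))  # True

# Explicit ABC: Sequence has no hook, so structure alone is not enough.
print(isinstance(Triple(), Sequence))     # False

# Registration (or inheritance) is what makes it a Sequence.
Sequence.register(Triple)
print(isinstance(Triple(), Sequence))     # True
```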
In my proposal I actually suggest that protocols shouldn't support
isinstance or issubclass (these operations should raise an exception) by
default. A protocol is free to override the default exception-raising
__subclasshook__ to implement a structural check, and a static type checker
would allow isinstance and issubclass for protocols that do this. I'll need
to explain this idea in more detail, as clearly the current explanation is
too easy to misunderstand.
Here's a concrete example:
class X(Protocol):
    def f(self): ...
class A:
    def f(self): print('f')
if isinstance(A(), X): ...  # Raise an exception, because no
                            # __subclasshook__ override in X
Previously I toyed with the idea of having a default implementation of
__subclasshook__ that actually does a structural check, but I'm no longer
sure if that would be desirable, as it's difficult to come up with an
implementation that does the right thing in all reasonable cases. For
example, consider a structural type like this that people might want to use
to work around the current limitations of Callable (it doesn't support
keyword arguments, for example):
def __call__(self, x, y): ...
(This example has some other potential issues that I'm hand-waving away for
Now how would the default isinstance work? Preferably it should only accept
callables that are compatible with the signature, but doing that check is
pretty difficult for arbitrary functions and should probably be out of
scope for the typing module. Just checking whether __call__ exists would be
too general, as the programmer probably expects that he's able to call the
method with the specific arguments the type suggests. Also, sometimes
checking the argument names would be a good thing to do, but sometimes any
names (as long as the number of arguments is compatible) would be fine.
> This means it's now possible to create supertypes that are implicit at
> runtime but explicit at static typing time (which might occasionally be
> useful), or vice-versa (which I can't imagine why you'd ever want).
As I showed above, you wouldn't get the latter unless you really try very
hard (consenting adults and all).
> Besides the obvious negatives in having two not-quite-compatible and
> very-different-looking ways of expressing the same concept, this is going
> to lead to people wanting to know why their type checker is complaining
> about perfectly good code ("I tested that constant with isinstance, and it
> really is-a Spammable, and the type checker is inferring its type properly,
> and yet I get an error passing it to a function that wants a Spammable") or
> allowing blatantly invalid code ("I annotated my function to only take
> Spammable arguments, but someone is passing something that calls the
> fallback implementation of my singledispatch function instead of the
> Spammable overload").
I agree that having the default nominal/explicit isinstance semantics for a
protocol type would be a very bad idea.
> Maybe the solution is to expand your proposal a little: make Protocol
> automatically create a __subclasshook__ (which you listed as an optional
> idea in the proposal), and also change all of the existing stdlib implicit
> ABCs to Protocols and scrap their manual hooks, and also update the
> relevant documentation (e.g., the abc module and the data model section on
> __subclasshook__) to recommend using Protocol instead of implementing a
> manual hook if the only thing you want is structural subtyping. Of course
> the backward compatibility isn't perfect (unless you want to manually munge
> up collections.abc when typing is imported), and people using legacy
> third-party code might need to add stubs (although that seems necessary
> anyway). But for most people, everything should just work as people expect.
> A type is either structurally typed or explicitly (via inheritance or
> registration) typed, both at static typing time and at runtime, and that's
> always expressed by the name Protocol. (But for the rare cases when you
> really need a type check that's looser at runtime, you can still write a
> manual hook to handle that.)
Yeah, this would be nice, but as I argued above, implementing a generic
__subclasshook__ is actually quite tricky.