Python-Dev
May 2017
- 79 participants
- 47 discussions
I'm back, I've re-read the PEP, and I've re-read the long thread with "(no
subject)".
I think Georg Brandl nailed it:
"""
*I like the "sequence and dict flattening" part of the PEP, mostly because
itis consistent and should be easy to understand, but the comprehension
syntaxenhancements seem to be bad for readability and "comprehending" what
the codedoes.The call syntax part is a mixed bag on the one hand it is nice
to be consistent with the extended possibilities in literals (…
[View More]flattening),
but on the other hand there would be small but annoying inconsistencies
anyways (e.g. the duplicate kwarg case above).*
"""
Greg Ewing followed up explaining that the inconsistency between dict
flattening and call syntax is inherent in the pre-existing different rules
for dicts vs. keyword args: {'a':1, 'a':2} results in {'a':2}, while f(a=1,
a=2) is an error. (This form is a SyntaxError; the dynamic case f(a=1,
**{'a': 1}) is a TypeError.)
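(For reference, the pre-existing rules Greg describes, illustrated in a Python 3
session; the comments paraphrase the errors, whose exact wording varies by
version:)
>>> {'a': 1, 'a': 2}        # the last duplicate key silently wins
{'a': 2}
>>> def f(**kwargs): return kwargs
...
>>> f(a=1, a=2)             # SyntaxError: keyword argument repeated
>>> f(a=1, **{'a': 1})      # TypeError: got multiple values for keyword argument 'a'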
For me, allowing f(*a, *b) and f(**d, **e) and all the other combinations
for function calls proposed by the PEP is an easy +1 -- it's a
straightforward extension of the existing pattern, and anybody who knows
what f(x, *a) does will understand f(x, *a, y, *b). Guessing what f(**d,
**e) means shouldn't be hard either. Understanding the edge case for
duplicate keys with f(**d, **e) is a little harder, but the error messages
are pretty clear, and it is not a new edge case.
The sequence and dict flattening syntax proposals are also clean and
logical -- we already have *-unpacking on the receiving side, so allowing
*x in tuple expressions reads pretty naturally (and the similarity with *a
in argument lists certainly helps). From here, having [a, *x, b, *y] is
also natural, and then the extension to other displays is natural: {a, *x,
b, *y} and {a:1, **d, b:2, **e}. This, too, gets a +1 from me.
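(To make this concrete, here are a few of the accepted forms as they behave in
a 3.5+ interpreter; the names are made up and the duplicate-key error in the
last comment is paraphrased:)
>>> def f(*args, **kwargs):
...     return args, kwargs
...
>>> a, b = (1, 2), (3,)
>>> d, e = {'x': 1}, {'y': 2}
>>> f(*a, *b)
((1, 2, 3), {})
>>> f(0, *a, 9, *b)
((0, 1, 2, 9, 3), {})
>>> f(**d, **e) == ((), {'x': 1, 'y': 2})
True
>>> [0, *a, *b]
[0, 1, 2, 3]
>>> {'z': 0, **d, **e} == {'z': 0, 'x': 1, 'y': 2}
True
>>> # f(**d, **{'x': 2}) raises TypeError: multiple values for keyword argument 'x'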
So that leaves comprehensions. IIRC, during the development of the patch we
realized that f(*x for x in xs) is sufficiently ambiguous that we decided
to disallow it -- note that f(x for x in xs) is already somewhat of a
special case because an argument can only be a "bare" generator expression
if it is the only argument. The same reasoning doesn't apply (in that form)
to list, set and dict comprehensions -- while f(x for x in xs) is identical
in meaning to f((x for x in xs)), [x for x in xs] is NOT the same as [(x
for x in xs)] (that's a list of one element, and the element is a generator
expression).
The basic premise of this part of the proposal is that if you have a few
iterables, the new proposal (without comprehensions) lets you create a list
or generator expression that iterates over all of them, essentially
flattening them:
>>> xs = [1, 2, 3]
>>> ys = ['abc', 'def']
>>> zs = [99]
>>> [*xs, *ys, *zs]
[1, 2, 3, 'abc', 'def', 99]
>>>
But now suppose you have a list of iterables:
>>> xss = [[1, 2, 3], ['abc', 'def'], [99]]
>>> [*xss[0], *xss[1], *xss[2]]
[1, 2, 3, 'abc', 'def', 99]
>>>
Wouldn't it be nice if you could write the latter using a comprehension?
>>> xss = [[1, 2, 3], ['abc', 'def'], [99]]
>>> [*xs for xs in xss]
[1, 2, 3, 'abc', 'def', 99]
>>>
This is somewhat seductive, and the following is even nicer: the *xs
position may be an expression, e.g.:
>>> xss = [[1, 2, 3], ['abc', 'def'], [99]]
>>> [*xs[:2] for xs in xss]
[1, 2, 'abc', 'def', 99]
>>>
On the other hand, I had to explore the possibilities here by experimenting
in the interpreter, and I discovered some odd edge cases (e.g. you can
parenthesize the starred expression, but that seems a syntactic accident).
All in all I am personally +0 on the comprehension part of the PEP, and I
like that it provides a way to "flatten" a sequence of sequences, but I
think very few people in the thread have supported this part. Therefore I
would like to ask Neil to update the PEP and the patch to take out the
comprehension part, so that the two "easy wins" can make it into Python 3.5
(basically, I am accepting two-thirds of the PEP :-). There is some time
yet until alpha 2.
I would also like code reviewers (Benjamin?) to start reviewing the patch
<http://bugs.python.org/issue2292>, taking into account that the
comprehension part needs to be removed.
--
--Guido van Rossum (python.org/~guido)
It has been a while since I posted a copy of PEP 1 to the mailing
lists and newsgroups. I've recently done some updating of a few
sections, so in the interest of gaining wider community participation
in the Python development process, I'm posting the latest revision of
PEP 1 here. A version of the PEP is always available on-line at
http://www.python.org/peps/pep-0001.html
Enjoy,
-Barry
-------------------- snip snip --------------------
PEP: 1
Title: PEP Purpose and Guidelines
Version: $Revision: 1.36 $
Last-Modified: $Date: 2002/07/29 18:34:59 $
Author: Barry A. Warsaw, Jeremy Hylton
Status: Active
Type: Informational
Created: 13-Jun-2000
Post-History: 21-Mar-2001, 29-Jul-2002
What is a PEP?
PEP stands for Python Enhancement Proposal. A PEP is a design
document providing information to the Python community, or
describing a new feature for Python. The PEP should provide a
concise technical specification of the feature and a rationale for
the feature.
We intend PEPs to be the primary mechanisms for proposing new
features, for collecting community input on an issue, and for
documenting the design decisions that have gone into Python. The
PEP author is responsible for building consensus within the
community and documenting dissenting opinions.
Because the PEPs are maintained as plain text files under CVS
control, their revision history is the historical record of the
feature proposal[1].
Kinds of PEPs
There are two kinds of PEPs. A standards track PEP describes a
new feature or implementation for Python. An informational PEP
describes a Python design issue, or provides general guidelines or
information to the Python community, but does not propose a new
feature. Informational PEPs do not necessarily represent a Python
community consensus or recommendation, so users and implementors
are free to ignore informational PEPs or follow their advice.
PEP Work Flow
The PEP editor, Barry Warsaw <peps(a)python.org>, assigns numbers
for each PEP and changes its status.
The PEP process begins with a new idea for Python. It is highly
recommended that a single PEP contain a single key proposal or new
idea. The more focussed the PEP, the more successful it tends
to be. The PEP editor reserves the right to reject PEP proposals
if they appear too unfocussed or too broad. If in doubt, split
your PEP into several well-focussed ones.
Each PEP must have a champion -- someone who writes the PEP using
the style and format described below, shepherds the discussions in
the appropriate forums, and attempts to build community consensus
around the idea. The PEP champion (a.k.a. Author) should first
attempt to ascertain whether the idea is PEP-able. Small
enhancements or patches often don't need a PEP and can be injected
into the Python development work flow with a patch submission to
the SourceForge patch manager[2] or feature request tracker[3].
The PEP champion then emails the PEP editor <peps(a)python.org> with
a proposed title and a rough, but fleshed out, draft of the PEP.
This draft must be written in PEP style as described below.
If the PEP editor approves, he will assign the PEP a number, label
it as standards track or informational, give it status 'draft',
and create and check-in the initial draft of the PEP. The PEP
editor will not unreasonably deny a PEP. Reasons for denying PEP
status include duplication of effort, being technically unsound,
not providing proper motivation or addressing backwards
compatibility, or not in keeping with the Python philosophy. The
BDFL (Benevolent Dictator for Life, Guido van Rossum) can be
consulted during the approval phase, and is the final arbitrator
of the draft's PEP-ability.
If a pre-PEP is rejected, the author may elect to take the pre-PEP
to the comp.lang.python newsgroup (a.k.a. python-list(a)python.org
mailing list) to help flesh it out, gain feedback and consensus
from the community at large, and improve the PEP for
re-submission.
The author of the PEP is then responsible for posting the PEP to
the community forums, and marshaling community support for it. As
updates are necessary, the PEP author can check in new versions if
they have CVS commit permissions, or can email new PEP versions to
the PEP editor for committing.
Standards track PEPs consist of two parts, a design document and
a reference implementation. The PEP should be reviewed and
accepted before a reference implementation is begun, unless a
reference implementation will aid people in studying the PEP.
Standards Track PEPs must include an implementation - in the form
of code, patch, or URL to same - before they can be considered
Final.
PEP authors are responsible for collecting community feedback on a
PEP before submitting it for review. A PEP that has not been
discussed on python-list(a)python.org and/or python-dev(a)python.org
will not be accepted. However, wherever possible, long open-ended
discussions on public mailing lists should be avoided. Strategies
to keep the discussions efficient include setting up a separate
SIG mailing list for the topic, having the PEP author accept
private comments in the early design phases, etc. PEP authors
should use their discretion here.
Once the authors have completed a PEP, they must inform the PEP
editor that it is ready for review. PEPs are reviewed by the BDFL
and his chosen consultants, who may accept or reject a PEP or send
it back to the author(s) for revision.
Once a PEP has been accepted, the reference implementation must be
completed. When the reference implementation is complete and
accepted by the BDFL, the status will be changed to `Final.'
A PEP can also be assigned status `Deferred.' The PEP author or
editor can assign the PEP this status when no progress is being
made on the PEP. Once a PEP is deferred, the PEP editor can
re-assign it to draft status.
A PEP can also be `Rejected'. Perhaps after all is said and done
it was not a good idea. It is still important to have a record of
this fact.
PEPs can also be replaced by a different PEP, rendering the
original obsolete. This is intended for Informational PEPs, where
version 2 of an API can replace version 1.
PEP work flow is as follows:
Draft -> Accepted -> Final -> Replaced
  ^
  +----> Rejected
  v
Deferred
Some informational PEPs may also have a status of `Active' if they
are never meant to be completed. E.g. PEP 1.
What belongs in a successful PEP?
Each PEP should have the following parts:
1. Preamble -- RFC822 style headers containing meta-data about the
PEP, including the PEP number, a short descriptive title
(limited to a maximum of 44 characters), the names, and
optionally the contact info for each author, etc.
2. Abstract -- a short (~200 word) description of the technical
issue being addressed.
3. Copyright/public domain -- Each PEP must either be explicitly
labelled as placed in the public domain (see this PEP as an
example) or licensed under the Open Publication License[4].
4. Specification -- The technical specification should describe
the syntax and semantics of any new language feature. The
specification should be detailed enough to allow competing,
interoperable implementations for any of the current Python
platforms (CPython, JPython, Python .NET).
5. Motivation -- The motivation is critical for PEPs that want to
change the Python language. It should clearly explain why the
existing language specification is inadequate to address the
problem that the PEP solves. PEP submissions without
sufficient motivation may be rejected outright.
6. Rationale -- The rationale fleshes out the specification by
describing what motivated the design and why particular design
decisions were made. It should describe alternate designs that
were considered and related work, e.g. how the feature is
supported in other languages.
The rationale should provide evidence of consensus within the
community and discuss important objections or concerns raised
during discussion.
7. Backwards Compatibility -- All PEPs that introduce backwards
incompatibilities must include a section describing these
incompatibilities and their severity. The PEP must explain how
the author proposes to deal with these incompatibilities. PEP
submissions without a sufficient backwards compatibility
treatise may be rejected outright.
8. Reference Implementation -- The reference implementation must
be completed before any PEP is given status 'Final,' but it
need not be completed before the PEP is accepted. It is better
to finish the specification and rationale first and reach
consensus on it before writing code.
The final implementation must include test code and
documentation appropriate for either the Python language
reference or the standard library reference.
PEP Template
PEPs are written in plain ASCII text, and should adhere to a
rigid style. There is a Python script that parses this style and
converts the plain text PEP to HTML for viewing on the web[5].
PEP 9 contains a boilerplate[7] template you can use to get
started writing your PEP.
Each PEP must begin with an RFC822 style header preamble. The
headers must appear in the following order. Headers marked with
`*' are optional and are described below. All other headers are
required.
PEP: <pep number>
Title: <pep title>
Version: <cvs version string>
Last-Modified: <cvs date string>
Author: <list of authors' real names and optionally, email addrs>
* Discussions-To: <email address>
Status: <Draft | Active | Accepted | Deferred | Final | Replaced>
Type: <Informational | Standards Track>
* Requires: <pep numbers>
Created: <date created on, in dd-mmm-yyyy format>
* Python-Version: <version number>
Post-History: <dates of postings to python-list and python-dev>
* Replaces: <pep number>
* Replaced-By: <pep number>
The Author: header lists the names and optionally, the email
addresses of all the authors/owners of the PEP. The format of the
author entry should be
address(a)dom.ain (Random J. User)
if the email address is included, and just
Random J. User
if the address is not given. If there are multiple authors, each
should be on a separate line following RFC 822 continuation line
conventions. Note that personal email addresses in PEPs will be
obscured as a defense against spam harvesters.
Standards track PEPs must have a Python-Version: header which
indicates the version of Python that the feature will be released
with. Informational PEPs do not need a Python-Version: header.
While a PEP is in private discussions (usually during the initial
Draft phase), a Discussions-To: header will indicate the mailing
list or URL where the PEP is being discussed. No Discussions-To:
header is necessary if the PEP is being discussed privately with
the author, or on the python-list or python-dev email mailing
lists. Note that email addresses in the Discussions-To: header
will not be obscured.
Created: records the date that the PEP was assigned a number,
while Post-History: is used to record the dates of when new
versions of the PEP are posted to python-list and/or python-dev.
Both headers should be in dd-mmm-yyyy format, e.g. 14-Aug-2001.
PEPs may have a Requires: header, indicating the PEP numbers that
this PEP depends on.
PEPs may also have a Replaced-By: header indicating that a PEP has
been rendered obsolete by a later document; the value is the
number of the PEP that replaces the current document. The newer
PEP must have a Replaces: header containing the number of the PEP
that it rendered obsolete.
PEP Formatting Requirements
PEP headings must begin in column zero and the initial letter of
each word must be capitalized as in book titles. Acronyms should
be in all capitals. The body of each section must be indented 4
spaces. Code samples inside body sections should be indented a
further 4 spaces, and other indentation can be used as required to
make the text readable. You must use two blank lines between the
last line of a section's body and the next section heading.
You must adhere to the Emacs convention of adding two spaces at
the end of every sentence. You should fill your paragraphs to
column 70, but under no circumstances should your lines extend
past column 79. If your code samples spill over column 79, you
should rewrite them.
Tab characters must never appear in the document at all. A PEP
should include the standard Emacs stanza included by example at
the bottom of this PEP.
A PEP must contain a Copyright section, and it is strongly
recommended to put the PEP in the public domain.
When referencing an external web page in the body of a PEP, you
should include the title of the page in the text, with a
footnote reference to the URL. Do not include the URL in the body
text of the PEP. E.g.
Refer to the Python Language web site [1] for more details.
...
[1] http://www.python.org
When referring to another PEP, include the PEP number in the body
text, such as "PEP 1". The title may optionally appear. Add a
footnote reference that includes the PEP's title and author. It
may optionally include the explicit URL on a separate line, but
only in the References section. Note that the pep2html.py script
will calculate URLs automatically, e.g.:
...
Refer to PEP 1 [7] for more information about PEP style
...
References
[7] PEP 1, PEP Purpose and Guidelines, Warsaw, Hylton
http://www.python.org/peps/pep-0001.html
If you decide to provide an explicit URL for a PEP, please use
this as the URL template:
http://www.python.org/peps/pep-xxxx.html
PEP numbers in URLs must be padded with zeros from the left, so as
to be exactly 4 characters wide; however, PEP numbers in text are
never padded.
Reporting PEP Bugs, or Submitting PEP Updates
How you report a bug, or submit a PEP update depends on several
factors, such as the maturity of the PEP, the preferences of the
PEP author, and the nature of your comments. For the early draft
stages of the PEP, it's probably best to send your comments and
changes directly to the PEP author. For more mature, or finished
PEPs you may want to submit corrections to the SourceForge bug
manager[6] or better yet, the SourceForge patch manager[2] so that
your changes don't get lost. If the PEP author is an SF developer,
assign the bug/patch to him; otherwise assign it to the PEP
editor.
When in doubt about where to send your changes, please check first
with the PEP author and/or PEP editor.
PEP authors who are also SF committers can update the PEPs
themselves by using "cvs commit" to commit their changes.
Remember to also push the formatted PEP text out to the web by
doing the following:
% python pep2html.py -i NUM
where NUM is the number of the PEP you want to push out. See
% python pep2html.py --help
for details.
Transferring PEP Ownership
It occasionally becomes necessary to transfer ownership of PEPs to
a new champion. In general, we'd like to retain the original
author as a co-author of the transferred PEP, but that's really up
to the original author. A good reason to transfer ownership is
because the original author no longer has the time or interest in
updating it or following through with the PEP process, or has
fallen off the face of the 'net (i.e. is unreachable or not
responding to email). A bad reason to transfer ownership is
because you don't agree with the direction of the PEP. We try to
build consensus around a PEP, but if that's not possible, you can
always submit a competing PEP.
If you are interested in assuming ownership of a PEP, send a message
asking to take over, addressed to both the original author and the
PEP editor <peps(a)python.org>. If the original author doesn't
respond to email in a timely manner, the PEP editor will make a
unilateral decision (it's not like such decisions can't be
reversed. :).
References and Footnotes
[1] This historical record is available by the normal CVS commands
for retrieving older revisions. For those without direct access
to the CVS tree, you can browse the current and past PEP revisions
via the SourceForge web site at
http://cvs.sourceforge.net/cgi-bin/cvsweb.cgi/python/nondist/peps/?cvsroot=…
[2] http://sourceforge.net/tracker/?group_id=5470&atid=305470
[3] http://sourceforge.net/tracker/?atid=355470&group_id=5470&func=browse
[4] http://www.opencontent.org/openpub/
[5] The script referred to here is pep2html.py, which lives in
the same directory in the CVS tree as the PEPs themselves.
Try "pep2html.py --help" for details.
The URL for viewing PEPs on the web is
http://www.python.org/peps/
[6] http://sourceforge.net/tracker/?group_id=5470&atid=305470
[7] PEP 9, Sample PEP Template
http://www.python.org/peps/pep-0009.html
Copyright
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:
congrats on 3.5! Alas, windows 7 users are having problems installing it
by Laura Creighton Sept. 15, 2019
webmaster has already heard from 4 people who cannot install it.
I sent them to the bug tracker or to python-list but they seem
not to have gone either place. Is there some guide I should be
sending them to, 'how to debug installation problems'?
Laura
Hello all,
Over in Ubuntu, we've gotten reports about some performance regressions in
Python 2.7 when moving from Trusty (14.04 LTS) to Xenial (16.04 LTS).
Trusty's version is based on 2.7.6 while Xenial's version is based on 2.7.12
with bits of .13 cherry picked.
We've not been able to identify any change in Python itself (or the
Debian/Ubuntu deltas) which could account for this, so the investigation has
led to various gcc compiler options and version differences. In particular
disabling LTO (link-time optimization) seems to have a positive impact, but
doesn't completely regain the loss.
Louis (Cc'd here) has done a ton of work to measure and analyze the problem,
but we've more or less hit a roadblock, so we're taking the issue public to
see if anybody on this mailing list has further ideas. A detailed analysis is
available in this Google doc:
https://docs.google.com/document/d/1zrV3OIRSo99fd2Ty4YdGk_scmTRDmVauBprKL8e…
The document should be public for comments and editing.
If you have any thoughts, or other lines of investigation you think are
worthwhile pursuing, please add your comments to the document.
Cheers,
-Barry
Hi folks,
I was looking at some `dis` output today, and I was wondering if anyone has
investigated optimizing Python (slightly) by adding special-case bytecodes
for common expressions or statements involving constants?
For example, I (and, based on a quick grep of the stdlib, many others)
write "x is None" and "x is not None" very often. Or "return True" or
"return None" or "return 1" and things like that. These all expand into two
bytecodes, which seems pretty non-optimal (LOAD_CONST + COMPARE_OP or
LOAD_CONST + RETURN_VALUE). It seems we could get an easy speedup for these
common cases by adding a peephole optimization and some new opcodes (maybe
COMPARE_IS_SMALL_CONST and RETURN_SMALL_CONST for these cases).
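(For anyone who wants to look at the same thing, a quick way to see the
pattern; the disassembly in the comments is roughly what a 3.6-era CPython
prints, and newer versions use different opcodes here such as IS_OP:)

import dis

def check(x):
    return x is None

dis.dis(check)
# On a 3.6-era CPython this prints roughly:
#     LOAD_FAST     x
#     LOAD_CONST    None
#     COMPARE_OP    is
#     RETURN_VALUE
# i.e. the LOAD_CONST + COMPARE_OP pair mentioned above.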
I'm not proposing to do this yet, as I'd need to benchmark to see how much
of a gain (if any) it would amount to, but I'm just wondering if there's
any previous work on this kind of thing. Or, if not, any other thoughts
before I try it?
Thanks,
Ben
Hi all,
After collecting suggestions in the previous discussion on python-dev
https://mail.python.org/pipermail/python-dev/2017-March/thread.html#147629
and playing with implementation, here is an updated version of PEP 544.
--
Ivan
A link for those who don't like reading long e-mails:
https://www.python.org/dev/peps/pep-0544/
=========================
PEP: 544
Title: Protocols
Version: $Revision$
Last-Modified: $Date$
Author: Ivan Levkivskyi <levkivskyi(a)gmail.com>, Jukka Lehtosalo <
jukka.lehtosalo(a)iki.fi>, Łukasz Langa <lukasz(a)langa.pl>
Discussions-To: Python-Dev <python-dev(a)python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 05-Mar-2017
Python-Version: 3.7
Abstract
========
Type hints introduced in PEP 484 can be used to specify type metadata
for static type checkers and other third party tools. However, PEP 484
only specifies the semantics of *nominal* subtyping. In this PEP we specify
static and runtime semantics of protocol classes that will provide support
for *structural* subtyping (static duck typing).
.. _rationale:
Rationale and Goals
===================
Currently, PEP 484 and the ``typing`` module [typing]_ define abstract
base classes for several common Python protocols such as ``Iterable`` and
``Sized``. The problem with them is that a class has to be explicitly marked
to support them, which is unpythonic and unlike what one would
normally do in idiomatic dynamically typed Python code. For example,
this conforms to PEP 484::
from typing import Sized, Iterable, Iterator
class Bucket(Sized, Iterable[int]):
...
def __len__(self) -> int: ...
def __iter__(self) -> Iterator[int]: ...
The same problem appears with user-defined ABCs: they must be explicitly
subclassed or registered. This is particularly difficult to do with library
types as the type objects may be hidden deep in the implementation
of the library. Also, extensive use of ABCs might impose additional
runtime costs.
The intention of this PEP is to solve all these problems
by allowing users to write the above code without explicit base classes in
the class definition, allowing ``Bucket`` to be implicitly considered
a subtype of both ``Sized`` and ``Iterable[int]`` by static type checkers
using structural [wiki-structural]_ subtyping::
from typing import Iterator, Iterable
class Bucket:
...
def __len__(self) -> int: ...
def __iter__(self) -> Iterator[int]: ...
def collect(items: Iterable[int]) -> int: ...
result: int = collect(Bucket()) # Passes type check
Note that ABCs in the ``typing`` module already provide structural behavior
at runtime: ``isinstance(Bucket(), Iterable)`` returns ``True``.
The main goal of this proposal is to support such behavior statically.
The same functionality will be provided for user-defined protocols, as
specified below. The above code with a protocol class matches common Python
conventions much better. It is also automatically extensible and works
with additional, unrelated classes that happen to implement
the required protocol.
Nominal vs structural subtyping
-------------------------------
Structural subtyping is natural for Python programmers since it matches
the runtime semantics of duck typing: an object that has certain properties
is treated independently of its actual runtime class.
However, as discussed in PEP 483, both nominal and structural
subtyping have their strengths and weaknesses. Therefore, in this PEP we
*do not propose* to replace the nominal subtyping described by PEP 484 with
structural subtyping completely. Instead, protocol classes as specified in
this PEP complement normal classes, and users are free to choose
where to apply a particular solution. See section on `rejected`_ ideas at
the end of this PEP for additional motivation.
Non-goals
---------
At runtime, protocol classes will be simple ABCs. There is no intent to
provide sophisticated runtime instance and class checks against protocol
classes. This would be difficult and error-prone and will contradict the
logic of PEP 484. As well, following PEP 484 and PEP 526 we state that
protocols are **completely optional**:
* No runtime semantics will be imposed for variables or parameters annotated
with a protocol class.
* Any checks will be performed only by third-party type checkers and
other tools.
* Programmers are free to not use them even if they use type annotations.
* There is no intent to make protocols non-optional in the future.
Existing Approaches to Structural Subtyping
===========================================
Before describing the actual specification, we review and comment on existing
approaches related to structural subtyping in Python and other languages:
* ``zope.interface`` [zope-interfaces]_ was one of the first widely used
approaches to structural subtyping in Python. It is implemented by providing
special classes to distinguish interface classes from normal classes,
to mark interface attributes, and to explicitly declare implementation.
For example::
from zope.interface import Interface, Attribute, implementer
class IEmployee(Interface):
name = Attribute("Name of employee")
def do(work):
"""Do some work"""
@implementer(IEmployee)
class Employee:
name = 'Anonymous'
def do(self, work):
return work.start()
Zope interfaces support various contracts and constraints for interface
classes. For example::
from zope.interface import invariant
def required_contact(obj):
if not (obj.email or obj.phone):
raise Exception("At least one contact info is required")
class IPerson(Interface):
name = Attribute("Name")
email = Attribute("Email Address")
phone = Attribute("Phone Number")
invariant(required_contact)
Even more detailed invariants are supported. However, Zope interfaces rely
entirely on runtime validation. Such focus on runtime properties goes
beyond the scope of the current proposal, and static support for invariants
might be difficult to implement. However, the idea of marking an interface
class with a special base class is reasonable and easy to implement both
statically and at runtime.
* Python abstract base classes [abstract-classes]_ are the standard
library tool to provide some functionality similar to structural subtyping.
The drawback of this approach is the necessity to either subclass
the abstract class or register an implementation explicitly::
from abc import ABC
class MyTuple(ABC):
pass
MyTuple.register(tuple)
assert issubclass(tuple, MyTuple)
assert isinstance((), MyTuple)
As mentioned in the `rationale`_, we want to avoid such necessity, especially
in a static context. However, in a runtime context, ABCs are good candidates
for protocol classes and they are already used extensively in the ``typing``
module.
* Abstract classes defined in ``collections.abc`` module [collections-abc]_
are slightly more advanced since they implement a custom
``__subclasshook__()`` method that allows runtime structural checks without
explicit registration::
from collections.abc import Iterable
class MyIterable:
def __iter__(self):
return iter([])
assert isinstance(MyIterable(), Iterable)
Such behavior seems to be a perfect fit for both runtime and static behavior
of protocols. As discussed in `rationale`_, we propose to add static support
for such behavior. In addition, to allow users to achieve such runtime
behavior for *user-defined* protocols, a special ``@runtime`` decorator will
be provided; see the detailed `discussion`_ below.
* TypeScript [typescript]_ provides support for user-defined classes and
interfaces. Explicit implementation declaration is not required and
structural subtyping is verified statically. For example::
interface LabeledItem {
label: string;
size?: number;
}
function printLabel(obj: LabeledItem) {
console.log(obj.label);
}
let myObj = {size: 10, label: "Size 10 Object"};
printLabel(myObj);
Note that optional interface members are supported. Also, TypeScript
prohibits redundant members in implementations. While the idea of
optional members looks interesting, it would complicate this proposal and
it is not clear how useful it will be. Therefore it is proposed to postpone
this; see `rejected`_ ideas. In general, the idea of static protocol
checking without runtime implications looks reasonable, and basically
this proposal follows the same line.
* Go [golang]_ uses a more radical approach and makes interfaces the primary
way to provide type information. Also, assignments are used to explicitly
ensure implementation::
type SomeInterface interface {
SomeMethod() ([]byte, error)
}
if _, ok := someval.(SomeInterface); ok {
fmt.Printf("value implements some interface")
}
Both these ideas are questionable in the context of this proposal. See
the section on `rejected`_ ideas.
.. _specification:
Specification
=============
Terminology
-----------
We propose to use the term *protocols* for types supporting structural
subtyping. The reason is that the term *iterator protocol*,
for example, is widely understood in the community, and coming up with
a new term for this concept in a statically typed context would just create
confusion.
This has the drawback that the term *protocol* becomes overloaded with
two subtly different meanings: the first is the traditional, well-known but
slightly fuzzy concept of protocols such as iterator; the second is the more
explicitly defined concept of protocols in statically typed code.
The distinction is not important most of the time, and in other
cases we propose to just add a qualifier such as *protocol classes*
when referring to the static type concept.
If a class includes a protocol in its MRO, the class is called
an *explicit* subclass of the protocol. If a class is a structural subtype
of a protocol, it is said to implement the protocol and to be compatible
with a protocol. If a class is compatible with a protocol but the protocol
is not included in the MRO, the class is an *implicit* subtype
of the protocol. (Note that one can explicitly subclass a protocol and
still not implement it if a protocol attribute is set to ``None``
in the subclass, see Python [data-model]_ for details.)
The attributes (variables and methods) of a protocol that are mandatory
for another class in order to be considered a structural subtype are called
protocol members.
.. _definition:
Defining a protocol
-------------------
Protocols are defined by including a special new class ``typing.Protocol``
(an instance of ``abc.ABCMeta``) in the base classes list, typically
at the end of the list. Here is a simple example::
from typing import Protocol
class SupportsClose(Protocol):
def close(self) -> None:
...
Now if one defines a class ``Resource`` with a ``close()`` method that has
a compatible signature, it would implicitly be a subtype of
``SupportsClose``, since the structural subtyping is used for
protocol types::
class Resource:
...
def close(self) -> None:
self.file.close()
self.lock.release()
Apart from a few restrictions explicitly mentioned below, protocol types can
be used in every context where normal types can::
def close_all(things: Iterable[SupportsClose]) -> None:
for t in things:
t.close()
f = open('foo.txt')
r = Resource()
close_all([f, r]) # OK!
close_all([1]) # Error: 'int' has no 'close' method
Note that both the user-defined class ``Resource`` and the built-in
``IO`` type (the return type of ``open()``) are considered subtypes of
``SupportsClose``, because they provide a ``close()`` method with
a compatible type signature.
Protocol members
----------------
All methods defined in the protocol class body are protocol members, both
normal and decorated with ``@abstractmethod``. If any parameters of a
protocol method are not annotated, then their types are assumed to be
``Any``
(see PEP 484). Bodies of protocol methods are type checked.
An abstract method that should not be called via ``super()`` ought to raise
``NotImplementedError``. Example::
from typing import Protocol
from abc import abstractmethod
class Example(Protocol):
def first(self) -> int: # This is a protocol member
return 42
@abstractmethod
def second(self) -> int: # Method without a default implementation
raise NotImplementedError
Static methods, class methods, and properties are equally allowed
in protocols.
To define a protocol variable, one can use PEP 526 variable
annotations in the class body. Additional attributes *only* defined in
the body of a method by assignment via ``self`` are not allowed. The rationale
for this is that the protocol class implementation is often not shared by
subtypes, so the interface should not depend on the default implementation.
Examples::
from typing import Protocol, List
class Template(Protocol):
name: str # This is a protocol member
value: int = 0 # This one too (with default)
def method(self) -> None:
self.temp: List[int] = [] # Error in type checker
class Concrete:
def __init__(self, name: str, value: int) -> None:
self.name = name
self.value = value
var: Template = Concrete('value', 42) # OK
To distinguish between protocol class variables and protocol instance
variables, the special ``ClassVar`` annotation should be used as specified
by PEP 526. By default, protocol variables as defined above are considered
readable and writable. To define a read-only protocol variable, one can use
an (abstract) property.
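For illustration, a protocol mixing both kinds of members might look like this
(a sketch with made-up names, written against the ``typing.Protocol`` class
proposed here)::
from typing import ClassVar, Protocol
class Labeled(Protocol):
    kind: ClassVar[str]       # protocol class variable
    @property
    def label(self) -> str:   # read-only protocol instance variable
        ...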
Explicitly declaring implementation
-----------------------------------
To explicitly declare that a certain class implements a given protocol,
it can be used as a regular base class. In this case a class could use
default implementations of protocol members. ``typing.Sequence`` is a good
example of a protocol with useful default methods. Static analysis tools are
expected to automatically detect that a class implements a given protocol.
So while it's possible to subclass a protocol explicitly, it's *not necessary*
to do so for the sake of type-checking.
The default implementations cannot be used if
the subtype relationship is implicit and only via structural
subtyping -- the semantics of inheritance is not changed. Examples::
class PColor(Protocol):
@abstractmethod
def draw(self) -> str:
...
def complex_method(self) -> int:
# some complex code here
class NiceColor(PColor):
def draw(self) -> str:
return "deep blue"
class BadColor(PColor):
def draw(self) -> str:
return super().draw() # Error, no default implementation
class ImplicitColor: # Note no 'PColor' base here
def draw(self) -> str:
return "probably gray"
def complex_method(self) -> int:
# class needs to implement this
nice: NiceColor
another: ImplicitColor
def represent(c: PColor) -> None:
print(c.draw(), c.complex_method())
represent(nice) # OK
represent(another) # Also OK
Note that there is little difference between explicit and implicit
subtypes, the main benefit of explicit subclassing is to get some protocol
methods "for free". In addition, type checkers can statically verify that
the class actually implements the protocol correctly::
class RGB(Protocol):
rgb: Tuple[int, int, int]
@abstractmethod
def intensity(self) -> int:
return 0
class Point(RGB):
def __init__(self, red: int, green: int, blue: str) -> None:
self.rgb = red, green, blue # Error, 'blue' must be 'int'
# Type checker might warn that 'intensity' is not defined
A class can explicitly inherit from multiple protocols and also from normal
classes. In this case methods are resolved using the normal MRO and a type
checker verifies that all subtyping relationships are correct. The semantics
of ``@abstractmethod`` is not changed: all abstract methods must be
implemented by an explicit subclass before it can be instantiated.
Merging and extending protocols
-------------------------------
The general philosophy is that protocols are mostly like regular ABCs,
but a static type checker will handle them specially. Subclassing a protocol
class would not turn the subclass into a protocol unless it also has
``typing.Protocol`` as an explicit base class. Without this base, the class
is "downgraded" to a regular ABC that cannot be used with structural
subtyping. The rationale for this rule is that we don't want to accidentally
have some class act as a protocol just because one of its base classes
happens to be one. We still slightly prefer nominal subtyping over structural
subtyping in the static typing world.
A subprotocol can be defined by having *both* one or more protocols as
immediate base classes and also having ``typing.Protocol`` as an immediate
base class::
from typing import Sized, Protocol
class SizedAndClosable(Sized, Protocol):
def close(self) -> None:
...
Now the protocol ``SizedAndClosable`` is a protocol with two methods,
``__len__`` and ``close``. If one omits ``Protocol`` in the base class list,
this would be a regular (non-protocol) class that must implement ``Sized``.
Alternatively, one can implement ``SizedAndClosable`` protocol by merging
the ``SupportsClose`` protocol from the example in the `definition`_ section
with ``typing.Sized``::
from typing import Sized
class SupportsClose(Protocol):
def close(self) -> None:
...
class SizedAndClosable(Sized, SupportsClose, Protocol):
pass
The two definitions of ``SizedAndClosable`` are equivalent.
Subclass relationships between protocols are not meaningful when
considering subtyping, since structural compatibility is
the criterion, not the MRO.
If ``Protocol`` is included in the base class list, all the other base classes
must be protocols. A protocol can't extend a regular class; see `rejected`_
ideas for the rationale. Note that the rules around explicit subclassing are
different from regular ABCs, where abstractness is simply defined by having at
least one abstract method being unimplemented. Protocol classes must be marked
*explicitly*.
Generic protocols
-----------------
Generic protocols are important. For example, ``SupportsAbs``, ``Iterable``
and ``Iterator`` are generic protocols. They are defined similarly to normal
non-protocol generic types::
class Iterable(Protocol[T]):
@abstractmethod
def __iter__(self) -> Iterator[T]:
...
``Protocol[T, S, ...]`` is allowed as a shorthand for
``Protocol, Generic[T, S, ...]``.
User-defined generic protocols support explicitly declared variance.
Type checkers will warn if the inferred variance is different from
the declared variance. Examples::
T = TypeVar('T')
T_co = TypeVar('T_co', covariant=True)
T_contra = TypeVar('T_contra', contravariant=True)
class Box(Protocol[T_co]):
def content(self) -> T_co:
...
box: Box[float]
second_box: Box[int]
box = second_box # This is OK due to the covariance of 'Box'.
class Sender(Protocol[T_contra]):
def send(self, data: T_contra) -> int:
...
sender: Sender[float]
new_sender: Sender[int]
new_sender = sender # OK, 'Sender' is contravariant.
class Proto(Protocol[T]):
attr: T # this class is invariant, since it has a mutable attribute
var: Proto[float]
another_var: Proto[int]
var = another_var # Error! 'Proto[float]' is incompatible with 'Proto[int]'.
Note that unlike nominal classes, de-facto covariant protocols cannot be
declared as invariant, since this can break transitivity of subtyping
(see `rejected`_ ideas for details). For example::
T = TypeVar('T')
class AnotherBox(Protocol[T]): # Error, this protocol is covariant in T,
def content(self) -> T: # not invariant.
...
Recursive protocols
-------------------
Recursive protocols are also supported. Forward references to the protocol
class names can be given as strings as specified by PEP 484. Recursive
protocols are useful for representing self-referential data structures
like trees in an abstract fashion::
class Traversable(Protocol):
def leaves(self) -> Iterable['Traversable']:
...
Note that for recursive protocols, a class is considered a subtype of
the protocol in situations where the decision depends on itself.
Continuing the previous example::
class SimpleTree:
def leaves(self) -> List['SimpleTree']:
...
root: Traversable = SimpleTree() # OK
class Tree(Generic[T]):
def leaves(self) -> List['Tree[T]']:
...
def walk(graph: Traversable) -> None:
...
tree: Tree[float] = Tree()
walk(tree) # OK, 'Tree[float]' is a subtype of 'Traversable'
Using Protocols
===============
Subtyping relationships with other types
----------------------------------------
Protocols cannot be instantiated, so there are no values whose
runtime type is a protocol. For variables and parameters with protocol types,
subtyping relationships are subject to the following rules:
* A protocol is never a subtype of a concrete type.
* A concrete type ``X`` is a subtype of protocol ``P``
if and only if ``X`` implements all protocol members of ``P`` with
compatible types. In other words, subtyping with respect to a protocol is
always structural.
* A protocol ``P1`` is a subtype of another protocol ``P2`` if ``P1`` defines
all protocol members of ``P2`` with compatible types (see the sketch just
after this list).
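For example, under the last rule the following (made-up) ``SizedCloser``
protocol would be treated by a type checker as a subtype of ``JustSized``,
because it defines all of its members with compatible types; this is only a
sketch against the proposed ``typing.Protocol``::
from typing import Protocol
class JustSized(Protocol):
    def __len__(self) -> int: ...
class SizedCloser(Protocol):
    def __len__(self) -> int: ...
    def close(self) -> None: ...
def shrink(obj: JustSized) -> int:
    return len(obj)
def finish(obj: SizedCloser) -> int:
    obj.close()
    return shrink(obj)  # OK: SizedCloser is structurally a subtype of JustSized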
Generic protocol types follow the same rules of variance as non-protocol
types. Protocol types can be used in all contexts where any other types
can be used, such as in ``Union``, ``ClassVar``, type variables bounds, etc.
Generic protocols follow the rules for generic abstract classes, except for
using structural compatibility instead of compatibility defined by
inheritance relationships.
Unions and intersections of protocols
-------------------------------------
``Union`` of protocol classes behaves the same way as for non-protocol
classes. For example::
from typing import Union, Optional, Protocol
class Exitable(Protocol):
def exit(self) -> int:
...
class Quittable(Protocol):
def quit(self) -> Optional[int]:
...
def finish(task: Union[Exitable, Quittable]) -> int:
...
class DefaultJob:
...
def quit(self) -> int:
return 0
finish(DefaultJob()) # OK
One can use multiple inheritance to define an intersection of protocols.
Example::
from typing import Sequence, Hashable
class HashableFloats(Sequence[float], Hashable, Protocol):
pass
def cached_func(args: HashableFloats) -> float:
...
cached_func((1, 2, 3)) # OK, tuple is both hashable and sequence
If this proves to be a widely used scenario, then a special intersection
type construct could be added in the future as specified by PEP 483;
see `rejected`_ ideas for more details.
``Type[]`` with protocols
-------------------------
Variables and parameters annotated with ``Type[Proto]`` accept only concrete
(non-protocol) subtypes of ``Proto``. The main reason for this is to allow
instantiation of parameters with such type. For example::
class Proto(Protocol):
@abstractmethod
def meth(self) -> int:
...
class Concrete:
def meth(self) -> int:
return 42
def fun(cls: Type[Proto]) -> int:
return cls().meth() # OK
fun(Proto) # Error
fun(Concrete) # OK
The same rule applies to variables::
var: Type[Proto]
var = Proto # Error
var = Concrete # OK
var().meth() # OK
Assigning an ABC or a protocol class to a variable is allowed if it is
not explicitly typed, and such assignment creates a type alias.
For normal (non-abstract) classes, the behavior of ``Type[]`` is
not changed.
``NewType()`` and type aliases
------------------------------
Protocols are essentially anonymous. To emphasize this point, static type
checkers might refuse protocol classes inside ``NewType()`` to avoid an
illusion that a distinct type is provided::
from typing import NewType, Protocol, Iterator
class Id(Protocol):
code: int
secrets: Iterator[bytes]
UserId = NewType('UserId', Id) # Error, can't provide distinct type
In contrast, type aliases are fully supported, including generic type
aliases::
from typing import TypeVar, Reversible, Iterable, Sized
T = TypeVar('T')
class SizedIterable(Iterable[T], Sized, Protocol):
pass
CompatReversible = Union[Reversible[T], SizedIterable[T]]
.. _discussion:
``@runtime`` decorator and narrowing types by ``isinstance()``
--------------------------------------------------------------
The default semantics is that ``isinstance()`` and ``issubclass()`` fail
for protocol types. This is in the spirit of duck typing -- protocols
basically would be used to model duck typing statically, not explicitly
at runtime.
However, it should be possible for protocol types to implement custom
instance and class checks when this makes sense, similar to how ``Iterable``
and other ABCs in ``collections.abc`` and ``typing`` already do it,
but this is limited to non-generic and unsubscripted generic protocols
(``Iterable`` is statically equivalent to ``Iterable[Any]``).
The ``typing`` module will define a special ``@runtime`` class decorator
that provides the same semantics for class and instance checks as for
``collections.abc`` classes, essentially making them "runtime protocols"::
from typing import runtime, Protocol
@runtime
class Closable(Protocol):
def close(self):
...
assert isinstance(open('some/file'), Closable)
Static type checkers will understand ``isinstance(x, Proto)`` and
``issubclass(C, Proto)`` for protocols defined with this decorator (as they
already do for ``Iterable`` etc.). Static type checkers will narrow types
after such checks by the type erased ``Proto`` (i.e. with all variables
having type ``Any`` and all methods having type ``Callable[..., Any]``).
Note that ``isinstance(x, Proto[int])`` etc. will always fail in agreement
with PEP 484. Examples::
from typing import Iterable, Iterator, Sequence
def process(items: Iterable[int]) -> None:
if isinstance(items, Iterator):
# 'items' has type 'Iterator[int]' here
elif isinstance(items, Sequence[int]):
# Error! Can't use 'isinstance()' with subscripted protocols
Note that instance checks are not 100% reliable statically; this is why
this behavior is opt-in. See the section on `rejected`_ ideas for examples.
Using Protocols in Python 2.7 - 3.5
===================================
Variable annotation syntax was added in Python 3.6, so the syntax
for defining protocol variables proposed in the `specification`_ section can't
be used if support for earlier versions is needed. To define these
in a manner compatible with older versions of Python one can use properties.
Properties can be settable and/or abstract if needed::
class Foo(Protocol):
@property
def c(self) -> int:
return 42 # Default value can be provided for property...
@abstractproperty
def d(self) -> int: # ... or it can be abstract
return 0
Also function type comments can be used as per PEP 484 (for example
to provide compatibility with Python 2). The ``typing`` module changes
proposed in this PEP will also be backported to earlier versions via the
backport currently available on PyPI.
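For instance, the ``SupportsClose`` protocol from the `definition`_ section
could be written in a form that also parses on older Pythons, roughly as
follows (a sketch; it assumes the backport exposes ``Protocol`` from
``typing_extensions``)::
from typing_extensions import Protocol
class SupportsClose(Protocol):
    def close(self):
        # type: () -> None
        ...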
Runtime Implementation of Protocol Classes
==========================================
Implementation details
----------------------
The runtime implementation could be done in pure Python without any
effects on the core interpreter and standard library except in the
``typing`` module, and a minor update to ``collections.abc``:
* Define class ``typing.Protocol`` similar to ``typing.Generic``.
* Implement metaclass functionality to detect whether a class is
a protocol or not. Add a class attribute ``_is_protocol = True``
if that is the case. Verify that a protocol class only has protocol
base classes in the MRO (except for object).
* Implement ``@runtime`` that allows ``__subclasshook__()`` to perform
structural instance and subclass checks as in ``collections.abc`` classes
(a rough sketch of this mechanism follows this list).
* All structural subtyping checks will be performed by static type checkers,
such as ``mypy`` [mypy]_. No additional support for protocol validation will
be provided at runtime.
* Classes ``Mapping``, ``MutableMapping``, ``Sequence``, and
``MutableSequence`` in ``collections.abc`` module will support structural
instance and subclass checks (like e.g. ``collections.abc.Iterable``).
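As a rough, self-contained sketch (not the actual ``typing`` implementation)
of how such a ``__subclasshook__``-based structural check can work::
from abc import ABCMeta
def runtime(cls):
    # Treat every non-underscore name defined in the class body as a protocol member.
    members = [name for name in vars(cls) if not name.startswith('_')]
    def _hook(target, other):
        if target is not cls:
            # Don't let explicit subclasses inherit the structural check.
            return NotImplemented
        # Accept any class whose MRO defines every protocol member.
        for name in members:
            if not any(name in vars(base) for base in other.__mro__):
                return NotImplemented
        return True
    cls.__subclasshook__ = classmethod(_hook)
    return cls
@runtime
class Closable(metaclass=ABCMeta):
    def close(self):
        ...
class File:
    def close(self):
        self.closed = True
assert issubclass(File, Closable)
assert isinstance(File(), Closable)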
Changes in the typing module
----------------------------
The following classes in ``typing`` module will be protocols:
* ``Callable``
* ``Awaitable``
* ``Iterable``, ``Iterator``
* ``AsyncIterable``, ``AsyncIterator``
* ``Hashable``
* ``Sized``
* ``Container``
* ``Collection``
* ``Reversible``
* ``Sequence``, ``MutableSequence``
* ``Mapping``, ``MutableMapping``
* ``ContextManager``, ``AsyncContextManager``
* ``SupportsAbs`` (and other ``Supports*`` classes)
Most of these classes are small and conceptually simple. It is easy to see
which methods these protocols implement, and to immediately recognize
the corresponding runtime protocol counterpart.
Practically, few changes will be needed in ``typing`` since some of these
classes already behave the necessary way at runtime. Most of these will need
to be updated only in the corresponding ``typeshed`` stubs [typeshed]_.
All other concrete generic classes such as ``List``, ``Set``, ``IO``,
``Deque``, etc. are sufficiently complex that it makes sense to keep
them non-protocols (i.e. require code to be explicit about them). Also, it is
too easy to leave some methods unimplemented by accident, and explicitly
marking the subclass relationship allows type checkers to pinpoint the missing
implementations.
Introspection
-------------
The existing class introspection machinery (``dir``, ``__annotations__``, etc.)
can be used with protocols. In addition, all introspection tools implemented
in the ``typing`` module will support protocols. Since all attributes need
to be defined in the class body based on this proposal, protocol classes will
have an even better perspective for introspection than regular classes, where
attributes can be defined implicitly -- protocol attributes can't be
initialized in ways that are not visible to introspection
(using ``setattr()``, assignment via ``self``, etc.). Still, some things like
types of attributes will not be visible at runtime in Python 3.5 and earlier,
but this looks like a reasonable limitation.
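For instance, with a protocol like the earlier ``Template`` example, the usual
machinery already works at runtime (a small illustration against the proposed
``typing.Protocol``; exact reprs may differ by Python version)::
from typing import Protocol
class Template(Protocol):
    name: str
    value: int = 0
    def method(self) -> None: ...
print(Template.__annotations__)   # {'name': <class 'str'>, 'value': <class 'int'>}
print('method' in dir(Template))  # True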
There will be only limited support of ``isinstance()`` and ``issubclass()``
as discussed above (these will *always* fail with ``TypeError`` for
subscripted generic protocols, since a reliable answer could not be given
at runtime in this case). But together with other introspection tools this
gives a reasonable perspective for runtime type checking tools.
.. _rejected:
Rejected/Postponed Ideas
========================
The ideas in this section were previously discussed in [several]_
[discussions]_ [elsewhere]_.
Make every class a protocol by default
--------------------------------------
Some languages such as Go make structural subtyping the only or the primary
form of subtyping. We could achieve a similar result by making all classes
protocols by default (or even always). However we believe that it is better
to require classes to be explicitly marked as protocols, for the following
reasons:
* Protocols don't have some properties of regular classes. In particular,
``isinstance()``, as defined for normal classes, is based on the nominal
hierarchy. Making everything a protocol by default and having
``isinstance()`` work as expected would require changing its semantics,
which won't happen.
* Protocol classes should generally not have many method implementations,
as they describe an interface, not an implementation.
Most classes have many method implementations, making them bad protocol
classes.
* Experience suggests that many classes are not practical as protocols
anyway,
mainly because their interfaces are too large, complex or
implementation-oriented (for example, they may include de facto
private attributes and methods without a ``__`` prefix).
* Most actually useful protocols in existing Python code seem to be
implicit.
The ABCs in ``typing`` and ``collections.abc`` are rather an exception,
but
even they are recent additions to Python and most programmers
do not use them yet.
* Many built-in functions only accept concrete instances of ``int``
(and subclass instances), and similarly for other built-in classes. Making
``int`` a structural type wouldn't be safe without major changes to the
Python runtime, which won't happen.
Protocols subclassing normal classes
------------------------------------
The main rationale for prohibiting this is to preserve transitivity of
subtyping; consider this example::

    from typing import Protocol

    class Base:
        attr: str

    class Proto(Base, Protocol):
        def meth(self) -> int:
            ...

    class C:
        attr: str
        def meth(self) -> int:
            return 0
Now, ``C`` is a subtype of ``Proto``, and ``Proto`` is a subtype of
``Base``.
But ``C`` cannot be a subtype of ``Base`` (since the latter is not
a protocol). This situation would be really weird. In addition, there is
an ambiguity about whether attributes of ``Base`` should become protocol
members of ``Proto``.
Support optional protocol members
---------------------------------
We can come up with examples where it would be handy to be able to say
that a method or data attribute does not need to be present in a class
implementing a protocol, but if it is present, it must conform to a specific
signature or type. One could use a ``hasattr()`` check to determine whether
the attribute can be used on a particular instance.
Languages such as TypeScript have similar features and
apparently they are pretty commonly used. The current realistic potential
use cases for protocols in Python don't require these. In the interest
of simplicity, we propose to not support optional methods or attributes.
We can always revisit this later if there is an actual need.
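For illustration, the runtime ``hasattr()`` pattern mentioned above might
look as follows (the ``Renderer`` protocol and its optional ``prepare()``
hook are hypothetical)::

    from typing import Protocol

    class Renderer(Protocol):
        def render(self) -> str:
            ...

    def show(obj: Renderer) -> None:
        # The "optional" hook is probed at runtime rather than being declared
        # as an optional protocol member (a static checker may require a cast
        # or Any for the call below).
        if hasattr(obj, 'prepare'):
            obj.prepare()
        print(obj.render())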
Allow only protocol methods and force use of getters and setters
----------------------------------------------------------------
One could argue that protocols typically only define methods, but not
variables. However, using getters and setters in cases where only a
simple variable is needed would be quite unpythonic. Moreover, the widespread
use of properties (that often act as type validators) in large code bases
is partially due to the previous absence of static type checkers for Python,
a problem that PEP 484 and this PEP aim to solve. For example::

    # without static types
    class MyClass:
        @property
        def my_attr(self):
            return self._my_attr

        @my_attr.setter
        def my_attr(self, value):
            if not isinstance(value, int):
                raise ValidationError("An integer expected for my_attr")
            self._my_attr = value

    # with static types
    class MyClass:
        my_attr: int
Support non-protocol members
----------------------------
There was an idea to make some methods "non-protocol" (i.e. not necessary
to implement, and inherited in explicit subclassing), but it was rejected,
since this complicates things. For example, consider this situation::

    class Proto(Protocol):
        @abstractmethod
        def first(self) -> int:
            raise NotImplementedError

        def second(self) -> int:
            return self.first() + 1

    def fun(arg: Proto) -> None:
        arg.second()
The question is: should this be an error? We think most people would expect
this to be valid. Therefore, to be on the safe side, we need to require both
methods to be implemented in implicit subclasses. In addition, if one looks
at the definitions in ``collections.abc``, there are very few methods that
could be considered "non-protocol". For these reasons, it was decided not to
introduce "non-protocol" methods.
There is only one downside to this: it will require some boilerplate for
implicit subtypes of ``Mapping`` and a few other "large" protocols. But this
applies only to a few "built-in" protocols (like ``Mapping`` and
``Sequence``), and people already subclass them explicitly. Also, such a
style is discouraged for user-defined protocols; it is recommended to create
compact protocols and combine them.
Make protocols interoperable with other approaches
--------------------------------------------------
The protocols as described here are basically a minimal extension to
the existing concept of ABCs. We argue that this is the way they should
be understood, instead of as something that *replaces* Zope interfaces,
for example. Attempting such interoperability would significantly
complicate both the concept and the implementation.
On the other hand, Zope interfaces are conceptually a superset of the
protocols defined here, but they use an incompatible syntax, largely because
before PEP 526 there was no straightforward way to annotate attributes.
In the 3.6+ world, ``zope.interface`` might potentially adopt the
``Protocol`` syntax. In that case, type checkers could be taught to recognize
interfaces as protocols and perform simple structural checks against them.
Use assignments to check explicitly that a class implements a protocol
----------------------------------------------------------------------
In the Go language, explicit checks for implementation are performed
via dummy assignments [golang]_. The same is possible with the
current proposal. Example::

    class A:
        def __len__(self) -> float:
            return ...

    _: Sized = A()  # Error: A.__len__ doesn't conform to 'Sized'
                    # (Incompatible return type 'float')
This approach moves the check away from the class definition, and it almost
requires a comment, since otherwise the code probably would not make sense to
an average reader -- it looks like dead code. Besides, in its simplest form it
requires one to construct an instance of ``A``, which could be problematic if
doing so requires accessing or allocating resources such as files or sockets.
We could work around the latter by using a cast, for example (see the sketch
below), but then the code would be ugly. Therefore we discourage the use of
this pattern.
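One possible shape of the cast-based workaround mentioned above (reusing
class ``A`` from the example) might be::

    from typing import Sized, cast

    # cast() is a no-op at runtime, so no instance of A is created, but the
    # assignment still makes the checker compare A against 'Sized'.
    _: Sized = cast(A, None)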
Support ``isinstance()`` checks by default
------------------------------------------
The problem with this is that instance checks could be unreliable, except in
situations where there is a common signature convention such as ``Iterable``.
For example::

    class P(Protocol):
        def common_method_name(self, x: int) -> int: ...

    class X:
        <a bunch of methods>
        def common_method_name(self) -> None: ...  # Note different signature

    def do_stuff(o: Union[P, X]) -> int:
        if isinstance(o, P):
            return o.common_method_name(1)  # oops, what if it's an X instance?
Another potentially problematic case is assignment of attributes
*after* instantiation::

    class P(Protocol):
        x: int

    class C:
        def initialize(self) -> None:
            self.x = 0

    c = C()
    isinstance(c, P)  # False
    c.initialize()
    isinstance(c, P)  # True

    def f(x: Union[P, int]) -> None:
        if isinstance(x, P):
            # static type of x is P here
            ...
        else:
            # type of x is "int" here?
            print(x + 1)

    f(C())  # oops
We argue that requiring an explicit class decorator would be better, since
one can then attach warnings about problems like this in the documentation.
The user would be able to evaluate whether the benefits outweigh
the potential for confusion for each protocol and explicitly opt in -- but
the default behavior would be safer. Finally, it will be easy to make this
behavior the default later if necessary, while it might be problematic to
make it opt-in after it has been the default.
Provide a special intersection type construct
---------------------------------------------
There was an idea to allow ``Proto = All[Proto1, Proto2, ...]`` as a
shorthand for::

    class Proto(Proto1, Proto2, ..., Protocol):
        pass
However, it is not yet clear how popular/useful it will be, and implementing
this in type checkers for non-protocol classes could be difficult. Finally,
it will be very easy to add this later if needed.
Prohibit explicit subclassing of protocols by non-protocols
-----------------------------------------------------------
This was rejected for the following reasons:
* Backward compatibility: People are already using ABCs, including generic
ABCs from ``typing`` module. If we prohibit explicit subclassing of these
ABCs, then quite a lot of code will break.
* Convenience: There are existing protocol-like ABCs (that will be turned
into protocols) that have many useful "mix-in" (non-abstract) methods.
For example, in the case of ``Sequence`` one only needs to implement
``__getitem__`` and ``__len__`` in an explicit subclass, and one gets
``__iter__``, ``__contains__``, ``__reversed__``, ``index``, and ``count``
for free (see the sketch after this list).
* Explicit subclassing makes it explicit that a class implements a
particular
protocol, making subtyping relationships easier to see.
* Type checkers can warn about missing protocol members or members with
incompatible types more easily, without having to use hacks like dummy
assignments discussed above in this section.
* Explicit subclassing makes it possible to force a class to be considered
a subtype of a protocol (by using ``# type: ignore`` together with an
explicit base class) when it is not strictly compatible, such as when
it has an unsafe override.
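A brief sketch of the convenience point above (``Letters`` is a hypothetical
explicit subclass of ``Sequence``)::

    from collections.abc import Sequence

    class Letters(Sequence):
        def __init__(self, data: str) -> None:
            self._data = data

        def __getitem__(self, index):
            return self._data[index]

        def __len__(self) -> int:
            return len(self._data)

    word = Letters('abc')
    # __iter__, __contains__, __reversed__, index, and count all come for
    # free as mix-in methods inherited from Sequence.
    assert list(reversed(word)) == ['c', 'b', 'a']
    assert word.index('b') == 1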
Covariant subtyping of mutable attributes
-----------------------------------------
Rejected because covariant subtyping of mutable attributes is not safe.
Consider this example::
class P(Protocol):
x: float
def f(arg: P) -> None:
arg.x = 0.42
class C:
x: int
c = C()
f(c) # Would typecheck if covariant subtyping
# of mutable attributes were allowed
c.x >> 1 # But this fails at runtime
It was initially proposed to allow this for practical reasons, but it was
subsequently rejected, since this may mask some hard to spot bugs.
Overriding inferred variance of protocol classes
------------------------------------------------
It was proposed to allow declaring protocols as invariant if they are
actually covariant or contravariant (as is possible for nominal classes,
see PEP 484). However, it was decided not to do this because of several
downsides:
* Declared protocol invariance breaks transitivity of sub-typing. Consider
this situation::

    T = TypeVar('T')

    class P(Protocol[T]):  # Declared as invariant
        def meth(self) -> T:
            ...

    class C:
        def meth(self) -> float:
            ...

    class D(C):
        def meth(self) -> int:
            ...
Now we have that ``D`` is a subtype of ``C``, and ``C`` is a subtype of
``P[float]``. But ``D`` is *not* a subtype of ``P[float]`` since ``D``
implements ``P[int]``, and ``P`` is invariant. One could "cure" this by
looking for protocol implementations in MROs, but this would be too complex
in the general case, and this "cure" requires abandoning the simple idea of
purely structural subtyping for protocols.
* Subtyping checks will always require type inference for protocols. In the
above example a user may complain: "Why did you infer ``P[int]`` for
my ``D``? It implements ``P[float]``!". Normally, inference can be
overruled
by an explicit annotation, but here this will require explicit
subclassing,
defeating the purpose of using protocols.
* Allowing variance to be overridden would make it impossible for type
checkers to give more detailed error messages citing particular conflicts
in member type signatures.
* Finally, explicit is better than implicit in this case. Requiring the user
to declare the correct variance will make the code easier to understand and
will avoid unexpected errors at the point of use.
Support adapters and adaptation
-------------------------------
Adaptation was proposed by PEP 246 (rejected) and is supported by
``zope.interface``; see https://docs.zope.org/zope.interface/adapter.html.
Adaptation is quite an advanced concept, and PEP 484 supports unions and
generic aliases that can be used instead of adapters. This can be illustrated
with the ``Iterable`` protocol: there is an older way of supporting
iteration, namely providing ``__getitem__`` and ``__len__``. If a function
accepts objects that support iteration either way -- via this older style or
via the now standard ``__iter__`` method -- then its argument could be
annotated with a union type::

    class OldIterable(Sized, Protocol[T]):
        def __getitem__(self, item: int) -> T: ...

    CompatIterable = Union[Iterable[T], OldIterable[T]]

    class A:
        def __iter__(self) -> Iterator[str]: ...

    class B:
        def __len__(self) -> int: ...
        def __getitem__(self, item: int) -> str: ...

    def iterate(it: CompatIterable[str]) -> None:
        ...

    iterate(A())  # OK
    iterate(B())  # OK
Since existing tooling provides a reasonable alternative for such cases, it
is proposed not to include adaptation in this PEP.
Backwards Compatibility
=======================
This PEP is almost fully backwards compatible. A few collection classes such
as ``Sequence`` and ``Mapping`` will be turned into runtime protocols, so the
results of ``isinstance()`` checks will change in some edge cases. For
example, a class that implements the ``Sequence`` protocol but does not
explicitly inherit from ``Sequence`` currently returns ``False`` in the
corresponding instance and class checks. With this PEP implemented, such
checks will return ``True``.
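Concretely, the change could be observed as follows (``MySeq`` is a
hypothetical class that provides the full ``Sequence`` interface without
inheriting from it)::

    from collections.abc import Sequence

    class MySeq:
        def __init__(self, data):
            self._data = list(data)

        def __getitem__(self, index):
            return self._data[index]

        def __len__(self):
            return len(self._data)

        def __contains__(self, item):
            return item in self._data

        def __iter__(self):
            return iter(self._data)

        def __reversed__(self):
            return reversed(self._data)

        def index(self, value):
            return self._data.index(value)

        def count(self, value):
            return self._data.count(value)

    # Currently False; with the structural checks proposed by this PEP the
    # same call would return True.
    print(isinstance(MySeq('abc'), Sequence))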
Implementation
==============
A working implementation of this PEP for the ``mypy`` type checker can be
found in the GitHub repo at https://github.com/ilevkivskyi/mypy/tree/protocols,
and the corresponding ``typeshed`` stubs can be found at
https://github.com/ilevkivskyi/typeshed/tree/protocols. Installation steps::

    git clone --recurse-submodules https://github.com/ilevkivskyi/mypy/
    cd mypy && git checkout protocols && cd typeshed
    git remote add proto https://github.com/ilevkivskyi/typeshed
    git fetch proto && git checkout proto/protocols
    cd .. && git add typeshed && sudo python3 -m pip install -U .
The runtime implementation of protocols in the ``typing`` module is
found at https://github.com/ilevkivskyi/typehinting/tree/protocols.
The version of ``collections.abc`` with structural behavior for mappings and
sequences is found at https://github.com/ilevkivskyi/cpython/tree/protocols.
References
==========
.. [typing]
https://docs.python.org/3/library/typing.html
.. [wiki-structural]
https://en.wikipedia.org/wiki/Structural_type_system
.. [zope-interfaces]
https://zopeinterface.readthedocs.io/en/latest/
.. [abstract-classes]
https://docs.python.org/3/library/abc.html
.. [collections-abc]
https://docs.python.org/3/library/collections.abc.html
.. [typescript]
https://www.typescriptlang.org/docs/handbook/interfaces.html
.. [golang]
https://golang.org/doc/effective_go.html#interfaces_and_types
.. [data-model]
https://docs.python.org/3/reference/datamodel.html#special-method-names
.. [typeshed]
https://github.com/python/typeshed/
.. [mypy]
http://github.com/python/mypy/
.. [several]
https://mail.python.org/pipermail/python-ideas/2015-September/thread.html#3…
.. [discussions]
https://github.com/python/typing/issues/11
.. [elsewhere]
https://github.com/python/peps/pull/224
Copyright
=========
This document has been placed in the public domain.
Python FTP Injections Allow for Firewall Bypass (oss-security advisory)
by nospam@curso.re June 20, 2017
Hello,
I have just noticed that an FTP injection advisory has been made public
on the oss-security list.
The author says that an exploit exists, but it won't be published
until the code is patched.
You may already be aware, but it would be good to understand the
position of the core developers on this.
The advisory is linked below (with some excerpts in this message):
http://blog.blindspotsecurity.com/2017/02/advisory-javapython-ftp-injection…
Protocol injection flaws like this have been an area of research of mine
for the past few years and, as it turns out, this FTP protocol
injection allows one to fool a victim's firewall into allowing TCP
connections from the Internet to the vulnerable host's system on any
"high" port (1024-65535). A nearly identical vulnerability exists in
Python's urllib2 and urllib libraries. In the case of Java, this attack
can be carried out against desktop users even if those desktop users do
not have the Java browser plugin enabled.
As of 2017-02-20, the vulnerabilities discussed here have not been patched
by the associated vendors, despite advance warning and ample time to do
so.
[...]
Python's built-in URL fetching library (urllib2 in Python 2 and urllib in
Python 3) is vulnerable to a nearly identical protocol stream injection,
but this injection appears to be limited to attacks via directory names
specified in the URL.
[...]
The Python security team was notified in January 2016. Information
provided included an outline of the possibility of FTP/firewall attacks.
Despite repeated follow-ups, there has been no apparent action on their
part.
Best regards,
-- Stefano
P.S.
I am posting from gmane, I hope that this is OK.
PEP 538 (review round 2): Coercing the legacy C locale to a UTF-8 based locale
by Nick Coghlan June 17, 2017
Hi folks,
Enough changes have accumulated in PEP 538 since the start of the
previous thread that it seems sensible to me to start a new thread
specifically covering the current design (which aims to address all
the concerns raised in the previous thread).
I haven't requoted the PEP in full since it's so long, but will
instead refer readers to the web version:
https://www.python.org/dev/peps/pep-0538/
I also generated a diff covering the full changes to the PEP text:
* https://gist.github.com/ncoghlan/1067805fe673b3735ac854e195747493/revisions
(this is the diff covering the last few days of changes)
Summarising the key technical changes:
* to make the runtime behaviour independent of whether or not locale
coercion took place, stdin and stderr now always have
"surrogateescape" as their error handler in the potential coercion
target locales (a quick way to check this is sketched after this
list). This means Python will behave the same way regardless
of whether the locale gets set externally (e.g. by a parent Python
process or a container image definition) or implicitly during CLI
startup
* for the full locales, the interpreter now sets LC_CTYPE and LANG,
*not* LC_ALL. This means LC_ALL is once again a full locale override,
and also means that CPython won't inadvertently interfere with other
locale categories like LC_MONETARY, LC_NUMERIC, etc
* the reference implementation has been refactored so the bulk of the
new code lives in the shared library and is exposed to the linker via
a couple of underscore prefixed API symbols
(_Py_LegacyLocaleDetected() and _Py_CoerceLegacyLocale()). While the
current PEP still keeps them private, it would be straightforward to
make them public for use in embedding applications if we decided we
wanted to do so.
* locale coercion and warnings are now enabled by default on all
platforms that use the autotools-based build chain - the assumption
that some platforms didn't need them turned out to be incorrect
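As a quick, illustrative way to observe the first point above on an
interpreter built with this PEP (a minimal sketch, not part of the PEP
itself):

    import sys

    # Under the potential coercion target locales, sys.stdin and sys.stderr
    # are expected to report 'surrogateescape' as their error handler.
    for stream in (sys.stdin, sys.stdout, sys.stderr):
        print(stream.name, stream.encoding, stream.errors)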
In addition to being updated to cover the above changes, the Rationale
section of the PEP has also been updated to explain why it doesn't
propose setting PYTHONIOENCODING, and to walk through some examples of
the problems with GNU readline compatibility when the current locale
isn't set correctly.
The essential related changes to the reference implementation can be seen here:
* Always set "surrogateescape" for coercion target locales,
independently of whether or not coercion occurred:
https://github.com/ncoghlan/cpython/commit/188e7807b6d9e49377aacbb287c074e5…
* Stop setting LC_ALL:
https://github.com/python/peps/commit/2f530ce0d1fd24835ac0c6f984f40db70482a…
(There are also some smaller cleanup commits that can be seen by
browsing that branch on GitHub)
Cheers,
Nick.
--
Nick Coghlan | ncoghlan(a)gmail.com | Brisbane, Australia
June 16, 2017
Hi
I would like to change struct.Struct.format type from bytes to str. I
don't expect that anyone uses this attribute, and the struct.Struct()
constructor accepts both bytes and str.
http://bugs.python.org/issue21071
It's just to be convenient: more functions accept str than bytes in
Python 3. Example: print() (python3 -bb raises an exception if you
pass bytes to print).
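For illustration, this is the current behaviour on Python 3 (the proposal
would make the attribute the str '>I' instead):

    >>> import struct
    >>> struct.Struct('>I').format
    b'>I'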
Is anyone opposed to breaking backward compatibility here?
Victor
June 10, 2017
Hi,
I wrote a PEP based on the previous thread "Backport ssl.MemoryBIO on
Python 2.7?". Thanks for Cory Benfield, Alex Gaynor and Nick Coghlan
who helped me to write it!
HTML version:
https://www.python.org/dev/peps/pep-0546/
Inline version below.
Victor
PEP: 546
Title: Backport ssl.MemoryBIO and ssl.SSLObject to Python 2.7
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner <victor.stinner(a)gmail.com>,
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 30-May-2017
Abstract
========
Backport the ssl.MemoryBIO and ssl.SSLObject classes from Python 3 to Python
2.7 to enhance the overall security of Python 2.7.
Rationale
=========
While Python 2.7 is getting closer to its end-of-support date (scheduled for
2020), it is still used on production systems and the Python community is still
responsible for its security. This PEP will help facilitate the future adoption
of :pep:`543` across all supported Python versions, which will improve security
for both Python 2 and Python 3 users.
This PEP does NOT propose a general exception for backporting new
features to Python 2.7 - every new feature proposed for backporting will
still need to be justified independently. In particular, it will need to
be explained why relying on an independently updated backport on the
Python Package Index instead is not an acceptable solution.
PEP 543
-------
:pep:`543` defines a new TLS API for Python which would enhance Python
security by giving Python applications access to the native TLS implementations
on Windows and macOS, instead of using OpenSSL. A side effect is that it gives
access to the system trust store and certificates installed
locally by system administrators, enabling Python applications to use "company
certificates" without having to modify each application and so to correctly
validate TLS certificates (instead of having to ignore or bypass TLS
certificate validation).
For practical reasons, Cory Benfield would like to first implement an
I/O-less class similar to ssl.MemoryBIO and ssl.SSLObject for
:pep:`543`, and to provide a second class based on the first one to use
sockets or file descriptors. This design would help to structure the code
to support more backends and simplify testing and auditing, as well as
implementation. Later, optimized classes using directly sockets or file
descriptors may be added for performance.
While :pep:`543` defines an API, the PEP would only make sense if it
comes with at least one complete and good implementation. The first
implementation would ideally be based on the ``ssl`` module of the Python
standard library, as this is shipped to all users by default and can be used as
a fallback implementation in the absence of anything more targeted.
If this backport is not performed, the only baseline implementation that could
be used would be pyOpenSSL. This is problematic, however, because of the
interaction with pip, which is shipped with CPython on all supported versions.
requests, pip and ensurepip
---------------------------
There are plans afoot to look at moving Requests to a more event-loop-y
model, and doing so basically mandates a MemoryBIO. In the absence of a
Python 2.7 backport, Requests is required to basically use the same
solution that Twisted currently does: namely, a mandatory dependency on
`pyOpenSSL <https://pypi.python.org/pypi/pyOpenSSL>`_.
The `pip <https://pip.pypa.io/>`_ program has to embed all its
dependencies for practical reasons: namely, that it cannot rely on any other
installation method being present. Since pip depends on requests, it means
that it would have to embed a copy of pyOpenSSL. That would imply substantial
usability pain to install pip. Currently, pip doesn't support embedding
C extensions which must be compiled on each platform and so require a C
compiler.
Since Python 2.7.9, Python embeds a copy of pip both for default
installation and for use in virtual environments via the new ``ensurepip``
module. If pip ends up bundling PyOpenSSL, then CPython will end up
bundling PyOpenSSL. Only backporting ``ssl.MemoryBIO`` and
``ssl.SSLObject`` would avoid the need to embed pyOpenSSL, and would fix the
bootstrap issue (python -> ensurepip -> pip -> requests -> MemoryBIO).
Changes
=======
Add ``MemoryBIO`` and ``SSLObject`` classes to the ``ssl`` module of
Python 2.7.
The code will be backported and adapted from the master branch
(Python 3).
The backport also significantly reduces the size of the Python 2/Python
3 difference in the ``_ssl`` module, which makes maintenance easier.
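For context, the I/O-less pattern that these classes enable looks roughly
like this on Python 3.6+ (a simplified sketch of driving a TLS handshake
without a socket; the hostname is illustrative)::

    import ssl

    ctx = ssl.create_default_context()
    incoming = ssl.MemoryBIO()   # bytes received from the network
    outgoing = ssl.MemoryBIO()   # bytes to be sent to the network
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname='example.com')

    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        # The handshake needs more data: send outgoing.read() over the
        # transport, feed the reply into incoming.write(), and retry.
        pass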
Links
=====
* :pep:`543`
* `[backport] ssl.MemoryBIO
<https://bugs.python.org/issue22559>`_: Implementation of this PEP
written by Alex Gaynor (first version written in October 2014)
* :pep:`466`
Discussions
===========
* `[Python-Dev] Backport ssl.MemoryBIO on Python 2.7?
<https://mail.python.org/pipermail/python-dev/2017-May/147981.html>`_
(May 2017)
Copyright
=========
This document has been placed in the public domain.