Hi,
Pathlib's symlink_to() and link_to() methods have different argument
orders, so:
a.symlink_to(b) # Creates a symlink from A to B
a.link_to(b) # Creates a hard link from B to A
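To make the mismatch concrete, here is a minimal sketch using the os-level calls that these pathlib methods wrap (the temp paths are just for illustration):

```python
import os
import tempfile

tmp = tempfile.mkdtemp()
target = os.path.join(tmp, "target.txt")
with open(target, "w") as f:
    f.write("data")

# Path(link).symlink_to(target) is roughly os.symlink(target, link):
# the new link is created at the *receiver* path.
link = os.path.join(tmp, "link")
os.symlink(target, link)
assert os.readlink(link) == target

# Path(target).link_to(hard) is roughly os.link(target, hard):
# here the receiver is the *source* and the argument is the new path --
# the opposite convention from symlink_to().
hard = os.path.join(tmp, "hard.txt")
os.link(target, hard)
assert os.path.samefile(target, hard)
```

So the receiver of symlink_to() is where the link appears, while the receiver of link_to() is what the link points at.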
I don't think link_to() was intended to be implemented this way, as the
docs say "Create a hard link pointing to a path named target.". It's also
inconsistent with everything else in pathlib, most obviously symlink_to().
Bug report here: https://bugs.python.org/issue39291
This /really/ irks me. Apparently it's too late to fix link_to(), so I'd
like to suggest we add a new hardlink_to() method that matches the
symlink_to() argument order. link_to() then becomes deprecated/undocumented.
Any thoughts?
Barney
Everyone,
If you've commented and you're worried you haven't been heard, please add
your issue *concisely* to this new thread. Note that the following issues
are already open and will be responded to separately; please don't bother
commenting on these until we've done so:
- Alternative spellings for '|'
- Whether to add an 'else' clause (and how to indent it)
- A different token for wildcards instead of '_'
- What to do about the footgun of 'case foo' vs. 'case .foo'
(Note that the last two could be combined, e.g. '?foo' or 'foo?' to mark a
variable binding and '?' for a wildcard.)
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-…>
On 06/29/2020 08:13 AM, Keara Berlin wrote:
> Hi all, I didn't mean for there to be significant differences between what I posted here versus in the commit message. Sorry for any confusion around that! Thank you for putting them both in one place here - that is helpful.
To be clear, the proposed change:
> "When writing English, ensure that your comments are clear and easily understandable to other English speakers."
And the commit message:
> Instead of requiring that comments be written in Strunk & White Standard English, require instead that English-language comments be clear and easily understandable by other English speakers. This accomplishes the same goal without upholding relics of white supremacy. Many native English speakers do not use Standard English as their native dialect, so requiring conformation to Standard English centers whiteness in an inappropriate and unnecessary way, and can alienate and put up barriers for people of color and those whose native dialect of English is not Standard English. This change is a simple way to correct that while maintaining the original intent of the requirement.
I find it difficult to express my horror and outrage with this commit message, but let me try: Picture this scene from a movie I watched a long time ago: towards the end of the US Civil War a small band of deserters approach a large home; only one man, his wife, and their baby are home as the man's father and brothers have left to run errands. The leader of the small band approaches the man and asks for water. The man, happily and cheerfully, obliges and draws a bucket of fresh well water for them. When he turns around to give them the bucket of water, the leader runs him through with his saber (stabs him in his guts all the way to the hilt).
That's what it felt like: betrayal.
Before the PEP-8 amendment thread I thought Strunk & White was some popular culture reference, and as such I had no interest in it. However, given the brouhaha that ensued I did some digging to discover for myself what it was. Here is what I have found:
- it has had at least four editions thus far
- it has been modernized as times have changed (the 2000 edition removed the advice
to use masculine pronouns whenever possible, and warns that some will find unnecessary
masculine usage offensive)
- its advice is hotly debated amongst linguists (not surprising)
and perhaps the most relevant:
- White is the last name of the second author.
Of course I don't know if Keara or Guido knew any of this, but it certainly feels to me that the commit message is ostracizing an entire family line because they had the misfortune to have the wrong last name. In fact, it seems like Strunk & White is making changes to be inclusive in its advice -- exactly what I would have thought we wanted on our side ("our side" being the diverse and welcoming side).
According to whichever dictionary Google uses, white supremacy is:
> noun
> the belief that white people are superior to those of all other races, especially the black race, and should therefore dominate society.
Does Keara, Guido, or anyone, have any such examples from Strunk & White?
Finally, what's wrong with having a standard? Communication, especially in written form, is difficult enough without everyone using whatever style/grammar/colloquialisms happen to suit their fancy at the time. As a silly example: when I started using Python, having the first parameter of a class method be `self` irked me, so I used `yo` instead (Spanish for "I") -- it was shorter, and it tickled my fancy. Two years into using Python, I replaced every instance of `yo` in my libraries with `self`; the cognitive dissonance between my code and everyone else's was an unnecessary distraction.
Speaking of unnecessary, I think the change to PEP-8 was unnecessary. I think it was pushed through without any consideration for those against it, and I think the commit message was extremely offensive.
To hopefully stave off some attacks against me:
- I am not white
- I am not Ivy League educated
- Black lives do matter
- Police are terrifying
--
~Ethan~
Hello,
Shouldn't such feedback also be cross-posted to the python-dev mailing
list? Also note the original pull request,
https://github.com/python/peps/pull/1470, and the differences between
what was written in the pull request description and what went into the
commit message.
On Sun, 28 Jun 2020 22:10:14 +0200
"Giampaolo Rodola'" <g.rodola(a)gmail.com> wrote:
> From:
> https://github.com/python/peps/commit/0c6427dcec1e98ca0bd46a876a7219ee4a934…
>
> > Instead of requiring that comments be written in Strunk & White
> > Standard
> English, require instead that English-language comments be clear and
> easily understandable by other English speakers. This accomplishes
> the same goal without upholding relics of white supremacy. Many
> native English speakers do not use Standard English as their native
> dialect, so requiring conformation to Standard English centers
> whiteness in an inappropriate and unnecessary way, and can alienate
> and put up barriers for people of color and those whose native
> dialect of English is not Standard English. This change is a simple
> way to correct that while maintaining the original intent of the
> requirement.
>
> This has nothing to do with making the wording "clear and
> understandable" (I agree on that). It's about, once again, bringing
> race-based politics into Python, and spreading hate towards a
> specific group of people: whites. Whether you're aware of it or not,
> there is a term for this: it's racism. I want to remind everyone that
> most of us here simply want to contribute code. We do it for free,
> and don't want to be involved in "this", because frankly it's
> disgusting. Doing something out of passion and for free, and at the
> same time seeing these sorts of things happening on a regular basis,
> looks and feels like an insult, and will only lead to people leaving
> this place.
>
> On Fri, Jun 26, 2020 at 11:27 PM Keara Berlin <kearaberlin(a)gmail.com>
> wrote:
>
> > Hi all, this is a very small change, but I thought I would field it
> > here to see if anyone has suggestions or ideas. Instead of
> > requiring that comments be written in Strunk & White Standard
> > English, PEP-8 should require instead that English-language
> > comments be clear and easily understandable by other English
> > speakers. This accomplishes the same goal without alienating or
> > putting up barriers for people (especially people of color) whose
> > native dialect of English is not Standard English. This change is a
> > simple way to correct that while maintaining the original intent of
> > the requirement. This change may even make the requirement more
> > clear to people who are not familiar with Strunk & White, since for
> > programmers, the main relevant aspect of that standard is "be clear
> > and concise;" simply saying that instead of referencing Strunk &
> > White may communicate this more effectively. Here is the current
> > line in PEP-8: "When writing English, follow Strunk and White."
> > I propose changing this line to "When writing English, ensure that
> > your comments are clear and easily understandable to other English
> > speakers."
>
>
> --
> Giampaolo - gmpy.dev <https://gmpy.dev/about>
--
Best regards,
Paul mailto:pmiscml@gmail.com
Hi, all.
Py_UNICODE has been deprecated since PEP 393 (Flexible string representation).
The wchar_t* cache in the string object is used only by deprecated APIs.
It wastes 1 word (8 bytes on a 64-bit machine) per string instance.
The deprecated APIs are documented as "Deprecated since version 3.3,
will be removed in version 4.0."
See https://docs.python.org/3/c-api/unicode.html#deprecated-py-unicode-apis
But when PEP 393 was implemented, no one expected that 3.10 would ever
be released.
Can we reschedule the removal?
My proposal is: schedule the removal for Python 3.11, but postpone the
removal if we cannot remove its usage by then.
I grepped for uses of the deprecated APIs in the top 4000 PyPI packages.
result: https://github.com/methane/notes/blob/master/2020/wchar-cache/deprecated-use
step: https://github.com/methane/notes/blob/master/2020/wchar-cache/README.md
I noticed:
* Most of them are generated by Cython.
* I reported it to Cython, so Cython 0.29.21 will fix them. I expect
more than 1 year
between Cython 0.29.21 and Python 3.11rc1.
* Most of them are `PyUnicode_FromUnicode(NULL, 0);`
* We may be able to keep PyUnicode_FromUnicode, but raise an error when length > 0.
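The scan described above can be approximated with a short script like this (the API list and directory layout here are hypothetical; the real steps are in the linked README):

```python
import re
from pathlib import Path

# Hypothetical subset of deprecated Py_UNICODE APIs to look for.
DEPRECATED = re.compile(
    r"\b(PyUnicode_FromUnicode|PyUnicode_AsUnicodeAndSize|PyUnicode_AsUnicode)\b"
)

def scan(root):
    """Count occurrences of each deprecated API in C sources under root."""
    hits = {}
    for path in Path(root).rglob("*.c"):
        for name in DEPRECATED.findall(path.read_text(errors="replace")):
            hits[name] = hits.get(name, 0) + 1
    return hits
```

Running something like this over unpacked sdists gives a per-API count, which is enough to see that most hits are the Cython-generated `PyUnicode_FromUnicode(NULL, 0)` pattern.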
Regards,
--
Inada Naoki <songofacandy(a)gmail.com>
Hi, all.
I proposed PEP 623 to remove Unicode APIs deprecated by PEP 393.
In this thread, I am proposing removal of Py_UNICODE (not Unicode
objects) APIs deprecated by PEP 393.
Please reply with any comments.
## Undocumented, have Py_DEPRECATED
There is no problem with removing them in Python 3.10. I will just do it.
* Py_UNICODE_str*** functions -- already removed in
https://github.com/python/cpython/pull/21164
* PyUnicode_GetMax()
## Documented and have Py_DEPRECATED
* PyLong_FromUnicode
* PyUnicode_AsUnicodeCopy
* PyUnicode_Encode
* PyUnicode_EncodeUTF7
* PyUnicode_EncodeUTF8
* PyUnicode_EncodeUTF16
* PyUnicode_EncodeUTF32
* PyUnicode_EncodeUnicodeEscape
* PyUnicode_EncodeRawUnicodeEscape
* PyUnicode_EncodeLatin1
* PyUnicode_EncodeASCII
* PyUnicode_EncodeCharmap
* PyUnicode_TranslateCharmap
* PyUnicode_EncodeMBCS
These APIs are documented. The documentation has a ``.. deprecated::
3.3 4.0`` directive.
They have also been marked ``Py_DEPRECATED`` since Python 3.6.
Plan: Change the document to ``.. deprecated:: 3.3 3.10`` and remove
them in Python 3.10.
## PyUnicode_EncodeDecimal
It is not documented, and it has not been marked with Py_DEPRECATED.
Plan: Add Py_DEPRECATED in Python 3.9 and remove it in 3.11.
## PyUnicode_TransformDecimalToASCII
It is documented, but doesn't have ``deprecated`` directive. It is not
deprecated by Py_DEPRECATED.
Plan: Add Py_DEPRECATED and ``deprecated 3.3 3.11`` directive in 3.9,
and remove it in 3.11.
## _PyUnicode_ToLowercase, _PyUnicode_ToUppercase
They were deprecated not by PEP 393, but by bpo-12736.
They are documented as deprecated, but don't have ``Py_DEPRECATED``.
Plan: Add Py_DEPRECATED in 3.9, and remove them in 3.11.
Note: _PyUnicode_ToTitlecase has Py_DEPRECATED. It can be removed in 3.10.
--
Inada Naoki <songofacandy(a)gmail.com>
Hi,
PEP available at: https://www.python.org/dev/peps/pep-0620/
<introduction>
This PEP is the result of 4 years of research work on the C API:
https://pythoncapi.readthedocs.io/
It's the third version. The first version (2017) proposed to add a
"new C API" and advised C extension maintainers to opt in to it: it
was basically the same idea as PEP 384 limited C API but in a
different color. Well, I had no idea of what I was doing :-) The
second version (April 2020) proposed to add a new Python runtime built
from the same code base as the regular Python runtime but in a
different build mode, the regular Python would continue to be fully
compatible.
I wrote the third version, the PEP 620, from scratch. It now gives an
explicit and concrete list of incompatible C API changes, and has
better motivation and rationale sections. The main novelty of the PEP
is the new pythoncapi_compat.h header file distributed with Python to
provide new C API functions to old Python versions; the second novelty
is the process to reduce the number of broken C extensions.
Whereas PEPs are usually implemented in a single Python version, the
implementation of this PEP is expected to be done carefully over
multiple Python versions. The PEP lists many changes which are already
implemented in Python 3.7, 3.8 and 3.9. It defines a process to reduce
the number of broken C extensions when introducing the incompatible C
API changes listed in the PEP. The process dictates the rhythm of
these changes.
</introduction>
PEP: 620
Title: Hide implementation details from the C API
Author: Victor Stinner <vstinner(a)python.org>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-June-2020
Python-Version: 3.10
Abstract
========
Introduce C API incompatible changes to hide implementation details.
Once most implementation details are hidden, the evolution of CPython
internals will be less limited by C API backward compatibility issues.
It will be much easier to add new features.
It becomes possible to experiment with more advanced optimizations in CPython
than just micro-optimizations, like tagged pointers.
Define a process to reduce the number of broken C extensions.
The implementation of this PEP is expected to be done carefully over
multiple Python versions. It already started in Python 3.7 and most
changes are already completed. The `Process to reduce the number of
broken C extensions`_ dictates the rhythm.
Motivation
==========
The C API blocks CPython evolutions
-----------------------------------
Adding or removing members of C structures causes multiple backward
compatibility issues.
Adding a new member breaks the stable ABI (PEP 384), especially for
types declared statically (e.g. ``static PyTypeObject MyType =
{...};``). In Python 3.4, the PEP 442 "Safe object finalization" added
the ``tp_finalize`` member at the end of the ``PyTypeObject`` structure.
For ABI backward compatibility, a new ``Py_TPFLAGS_HAVE_FINALIZE`` type
flag was required to announce if the type structure contains the
``tp_finalize`` member. The flag was removed in Python 3.8 (`bpo-32388
<https://bugs.python.org/issue32388>`_).
The ``PyTypeObject.tp_print`` member, deprecated since Python 3.0
released in 2009, has been removed in the Python 3.8 development cycle.
But the change broke too many C extensions and had to be reverted before
3.8 final release. Finally, the member was removed again in Python 3.9.
C extensions rely on the ability to access structure members,
indirectly through the C API, or even directly. Modifying structures
like ``PyListObject`` cannot even be considered.
The ``PyTypeObject`` structure is the one which evolved the most, simply
because there was no other way to evolve CPython than modifying it.
In the C API, all Python objects are passed as ``PyObject*``: a pointer
to a ``PyObject`` structure. Experimenting with tagged pointers in
CPython is blocked by the fact that a C extension can technically
dereference a
``PyObject*`` pointer and access ``PyObject`` members. Small "objects"
can be stored as a tagged pointer with no concrete ``PyObject``
structure.
Replacing the Python garbage collector with a tracing garbage
collector would also require removing the ``PyObject.ob_refcnt``
reference counter, whereas currently the ``Py_INCREF()`` and
``Py_DECREF()`` macros access ``PyObject.ob_refcnt`` directly.
Same CPython design since 1990: structures and reference counting
-----------------------------------------------------------------
When the CPython project was created, it was written with one
principle: keep the implementation simple enough that it can be
maintained by a single developer. CPython's complexity has grown a lot
and many micro-optimizations have been implemented, but CPython's core
design has not changed.
Members of ``PyObject`` and ``PyTupleObject`` structures have not
changed since the "Initial revision" commit (1990)::
    #define OB_HEAD \
        unsigned int ob_refcnt; \
        struct _typeobject *ob_type;

    typedef struct _object {
        OB_HEAD
    } object;

    typedef struct {
        OB_VARHEAD
        object *ob_item[1];
    } tupleobject;
Only names changed: ``object`` was renamed to ``PyObject`` and
``tupleobject`` was renamed to ``PyTupleObject``.
CPython still tracks the lifetime of Python objects using reference
counting, internally and for third party C extensions (through the
Python C API).
All Python objects must be allocated on the heap and cannot be moved.
Why is PyPy more efficient than CPython?
----------------------------------------
The PyPy project is a Python implementation which is 4.2x faster than
CPython on average. PyPy developers chose to not fork CPython, but start
from scratch to have more freedom in terms of optimization choices.
PyPy does not use reference counting, but a tracing garbage collector
which moves objects. Objects can be allocated on the stack (or even not
at all), rather than always having to be allocated on the heap.
Object layouts are designed with performance in mind. For example, a
list strategy stores integers directly as integers, rather than as
objects.
Moreover, PyPy also has a JIT compiler which emits fast code thanks to
the efficient PyPy design.
PyPy bottleneck: the Python C API
---------------------------------
While PyPy is way more efficient than CPython at running pure Python
code, it is as efficient as or slower than CPython at running C
extensions.
Since the C API requires ``PyObject*`` and allows direct access to
structure members, PyPy has to associate a CPython object with each
PyPy object and keep both consistent. Converting a PyPy object to a
CPython object is inefficient. Moreover, reference counting also has to
be implemented on top of the PyPy tracing garbage collector.
These conversions are required because the Python C API is too close to
the CPython implementation: there is no high-level abstraction.
For example, structure members are part of the public C API and nothing
prevents a C extension from directly getting or setting
``PyTupleObject.ob_item[0]`` (the first item of a tuple).
See `Inside cpyext: Why emulating CPython C API is so Hard
<https://morepypy.blogspot.com/2018/09/inside-cpyext-why-emulating-cpython-c…>`_
(Sept 2018) by Antonio Cuni for more details.
Rationale
=========
Hide implementation details
---------------------------
Hiding implementation details from the C API has multiple advantages:
* It becomes possible to experiment with more advanced optimizations in
CPython than just micro-optimizations: for example, tagged pointers,
or replacing the garbage collector with a tracing garbage collector
which can move objects.
* Adding new features in CPython becomes easier.
* PyPy should be able to avoid conversions to CPython objects in more
cases: keep efficient PyPy objects.
* It becomes easier to implement the C API for a new Python
implementation.
* More C extensions will be compatible with Python implementations other
than CPython.
Relationship with the limited C API
-----------------------------------
PEP 384 "Defining a Stable ABI" was implemented in Python 3.4. It
introduced the "limited C API": a subset of the C API. When the limited
C API is used, it becomes possible to build a C extension only once and
use it on multiple Python versions: that's the stable ABI.
The main limitation of the PEP 384 is that C extensions have to opt-in
for the limited C API. Only very few projects made this choice,
usually to ease distribution of binaries, especially on Windows.
This PEP moves the C API towards the limited C API.
Ideally, the C API will become the limited C API and all C extensions
will use the stable ABI, but this is out of this PEP scope.
Specification
=============
Summary
-------
* (**Completed**) Reorganize the C API header files: create
``Include/cpython/`` and
``Include/internal/`` subdirectories.
* (**Completed**) Move private functions exposing implementation
details to the internal
C API.
* (**Completed**) Convert macros to static inline functions.
* (**Completed**) Add new functions ``Py_SET_TYPE()``, ``Py_SET_REFCNT()`` and
``Py_SET_SIZE()``. The ``Py_TYPE()``, ``Py_REFCNT()`` and
``Py_SIZE()`` macros become functions which cannot be used as l-value.
* (**Completed**) New C API functions must not return borrowed
references.
* (**In Progress**) Provide ``pythoncapi_compat.h`` header file.
* (**In Progress**) Make structures opaque, add getter and setter
functions.
* (**Not Started**) Deprecate ``PySequence_Fast_ITEMS()``.
* (**Not Started**) Convert ``PyTuple_GET_ITEM()`` and
``PyList_GET_ITEM()`` macros to static inline functions.
Reorganize the C API header files
---------------------------------
The first consumer of the C API was Python itself. There is no clear
separation between APIs which must not be used outside Python and APIs
which are public on purpose.
Header files must be reorganized into 3 APIs:
* ``Include/`` directory is the limited C API: no implementation
details, structures are opaque. C extensions using it get a stable
ABI.
* ``Include/cpython/`` directory is the CPython C API: less "portable"
API, depends more on the Python version, expose some implementation
details, few incompatible changes can happen.
* ``Include/internal/`` directory is the internal C API: implementation
details, incompatible changes are likely at each Python release.
The creation of the ``Include/cpython/`` directory is fully backward
compatible. ``Include/cpython/`` header files cannot be included
directly and are included automatically by ``Include/`` header files
when the ``Py_LIMITED_API`` macro is not defined.
The internal C API is installed and can be used for specific usages
like debuggers and profilers which must access structure members
without executing code. C extensions using the internal C API are
tightly coupled to a Python version and must be recompiled at each
Python version.
**STATUS**: Completed (in Python 3.8)
The reorganization of header files started in Python 3.7 and was
completed in Python 3.8:
* `bpo-35134 <https://bugs.python.org/issue35134>`_: Add a new
Include/cpython/ subdirectory for the "CPython API" with
implementation details.
* `bpo-35081 <https://bugs.python.org/issue35081>`_: Move internal
headers to ``Include/internal/``
Move private functions to the internal C API
--------------------------------------------
Private functions which expose implementation details must be moved to
the internal C API.
If a C extension relies on a CPython private function which exposes
CPython implementation details, other Python implementations have to
re-implement this private function to support this C extension.
**STATUS**: Completed (in Python 3.9)
Private functions moved to the internal C API in Python 3.8:
* ``_PyObject_GC_TRACK()``, ``_PyObject_GC_UNTRACK()``
Macros and functions excluded from the limited C API in Python 3.9:
* ``_PyObject_SIZE()``, ``_PyObject_VAR_SIZE()``
* ``PyThreadState_DeleteCurrent()``
* ``PyFPE_START_PROTECT()``, ``PyFPE_END_PROTECT()``
* ``_Py_NewReference()``, ``_Py_ForgetReference()``
* ``_PyTraceMalloc_NewReference()``
* ``_Py_GetRefTotal()``
Private functions moved to the internal C API in Python 3.9:
* GC functions like ``_Py_AS_GC()``, ``_PyObject_GC_IS_TRACKED()``
and ``_PyGCHead_NEXT()``
* ``_Py_AddToAllObjects()`` (not exported)
* ``_PyDebug_PrintTotalRefs()``, ``_Py_PrintReferences()``,
``_Py_PrintReferenceAddresses()`` (not exported)
Public "clear free list" functions moved to the internal C API and
renamed to private functions in Python 3.9:
* ``PyAsyncGen_ClearFreeLists()``
* ``PyContext_ClearFreeList()``
* ``PyDict_ClearFreeList()``
* ``PyFloat_ClearFreeList()``
* ``PyFrame_ClearFreeList()``
* ``PyList_ClearFreeList()``
* ``PyTuple_ClearFreeList()``
* Functions simply removed:
* ``PyMethod_ClearFreeList()`` and ``PyCFunction_ClearFreeList()``:
bound method free list removed in Python 3.9.
* ``PySet_ClearFreeList()``: set free list removed in Python 3.4.
* ``PyUnicode_ClearFreeList()``: Unicode free list removed
in Python 3.3.
Convert macros to static inline functions
-----------------------------------------
Converting macros to static inline functions has multiple advantages:
* Functions have well defined parameter types and return type.
* Functions can use variables with a well defined scope (the function).
* Debuggers can put breakpoints on functions and profilers can display
the function name in call stacks. In most cases, this works even
when a static inline function is inlined.
* Functions don't have `macros pitfalls
<https://gcc.gnu.org/onlinedocs/cpp/Macro-Pitfalls.html>`_.
Converting macros to static inline functions should only impact very few
C extensions which use macros in unusual ways.
For backward compatibility, functions must continue to accept any type,
not only ``PyObject*``, to avoid compiler warnings, since most macros
cast their parameters to ``PyObject*``.
Python 3.6 requires C compilers to support static inline functions: the
PEP 7 requires a subset of C99.
**STATUS**: Completed (in Python 3.9)
Macros converted to static inline functions in Python 3.8:
* ``Py_INCREF()``, ``Py_DECREF()``
* ``Py_XINCREF()``, ``Py_XDECREF()``
* ``PyObject_INIT()``, ``PyObject_INIT_VAR()``
* ``_PyObject_GC_TRACK()``, ``_PyObject_GC_UNTRACK()``, ``_Py_Dealloc()``
Macros converted to regular functions in Python 3.9:
* ``Py_EnterRecursiveCall()``, ``Py_LeaveRecursiveCall()``
(added to the limited C API)
* ``PyObject_INIT()``, ``PyObject_INIT_VAR()``
* ``PyObject_GET_WEAKREFS_LISTPTR()``
* ``PyObject_CheckBuffer()``
* ``PyIndex_Check()``
* ``PyObject_IS_GC()``
* ``PyObject_NEW()`` (alias to ``PyObject_New()``),
``PyObject_NEW_VAR()`` (alias to ``PyObject_NewVar()``)
* ``PyType_HasFeature()`` (always call ``PyType_GetFlags()``)
* ``Py_TRASHCAN_BEGIN_CONDITION()`` and ``Py_TRASHCAN_END()`` macros
now call functions which hide implementation details, rather than
accessing directly members of the ``PyThreadState`` structure.
Make structures opaque
----------------------
All structures of the C API should become opaque: C extensions must
use getter or setter functions to get or set structure members. For
example, ``tuple->ob_item[0]`` must be replaced with
``PyTuple_GET_ITEM(tuple, 0)``.
To be able to move away from reference counting, ``PyObject`` must
become opaque. Currently, the reference counter ``PyObject.ob_refcnt``
is exposed in the C API. All structures must become opaque, since they
"inherit" from ``PyObject``. For example, ``PyFloatObject`` inherits
from ``PyObject``::
    typedef struct {
        PyObject ob_base;
        double ob_fval;
    } PyFloatObject;
Making ``PyObject`` fully opaque requires converting ``Py_INCREF()`` and
``Py_DECREF()`` macros to function calls. This change has an impact on
performance. It is likely to be one of the very last changes when making
structures opaque.
Making ``PyTypeObject`` structure opaque breaks C extensions declaring
types statically (e.g. ``static PyTypeObject MyType = {...};``). C
extensions must use ``PyType_FromSpec()`` to allocate types on the heap
instead. Using heap types has other advantages, such as being
compatible with subinterpreters. Combined with PEP 489 "Multi-phase
extension module initialization", it makes the behavior of a C
extension closer to that of a Python module, for example by allowing
the creation of more than one module instance.
Making ``PyThreadState`` structure opaque requires adding getter and
setter functions for members used by C extensions.
**STATUS**: In Progress (started in Python 3.8)
The ``PyInterpreterState`` structure was made opaque in Python 3.8
(`bpo-35886 <https://bugs.python.org/issue35886>`_) and the
``PyGC_Head`` structure (`bpo-40241
<https://bugs.python.org/issue40241>`_) was made opaque in Python 3.9.
Issues tracking the work to prepare the C API to make following
structures opaque:
* ``PyObject``: `bpo-39573 <https://bugs.python.org/issue39573>`_
* ``PyTypeObject``: `bpo-40170 <https://bugs.python.org/issue40170>`_
* ``PyFrameObject``: `bpo-40421 <https://bugs.python.org/issue40421>`_
* Python 3.9 adds ``PyFrame_GetCode()`` and ``PyFrame_GetBack()``
getter functions, and moves ``PyFrame_GetLineNumber`` to the limited
C API.
* ``PyThreadState``: `bpo-39947 <https://bugs.python.org/issue39947>`_
* Python 3.9 adds 3 getter functions: ``PyThreadState_GetFrame()``,
``PyThreadState_GetID()``, ``PyThreadState_GetInterpreter()``.
Disallow using Py_TYPE() as l-value
-----------------------------------
The ``Py_TYPE()`` function gets an object type, its ``PyObject.ob_type``
member. It is implemented as a macro which can be used as an l-value to
set the type: ``Py_TYPE(obj) = new_type``. This code relies on the
assumption that ``PyObject.ob_type`` can be modified directly. It
prevents making the ``PyObject`` structure opaque.
New setter functions ``Py_SET_TYPE()``, ``Py_SET_REFCNT()`` and
``Py_SET_SIZE()`` are added and must be used instead.
The ``Py_TYPE()``, ``Py_REFCNT()`` and ``Py_SIZE()`` macros must be
converted to static inline functions which can not be used as l-value.
For example, the ``Py_TYPE()`` macro::
#define Py_TYPE(ob) (((PyObject*)(ob))->ob_type)
becomes::
    #define _PyObject_CAST_CONST(op) ((const PyObject*)(op))

    static inline PyTypeObject* _Py_TYPE(const PyObject *ob) {
        return ob->ob_type;
    }

    #define Py_TYPE(ob) _Py_TYPE(_PyObject_CAST_CONST(ob))
**STATUS**: Completed (in Python 3.10)
New functions ``Py_SET_TYPE()``, ``Py_SET_REFCNT()`` and
``Py_SET_SIZE()`` were added to Python 3.9.
In Python 3.10, ``Py_TYPE()``, ``Py_REFCNT()`` and ``Py_SIZE()`` can no
longer be used as l-value and the new setter functions must be used
instead.
New C API functions must not return borrowed references
-------------------------------------------------------
When a function returns a borrowed reference, Python cannot track when
the caller stops using this reference.
For example, if the Python ``list`` type is specialized for small
integers to store "raw" numbers directly rather than Python objects,
``PyList_GetItem()`` has to create a temporary Python object. The
problem is deciding when it is safe to delete the temporary object.
The general guideline is to avoid returning borrowed references from
new C API functions.
No function returning borrowed references is scheduled for removal by
this PEP.
**STATUS**: Completed (in Python 3.9)
In Python 3.9, new C API functions returning Python objects only return
strong references:
* ``PyFrame_GetBack()``
* ``PyFrame_GetCode()``
* ``PyObject_CallNoArgs()``
* ``PyObject_CallOneArg()``
* ``PyThreadState_GetFrame()``
Avoid functions returning PyObject**
------------------------------------
The ``PySequence_Fast_ITEMS()`` function gives a direct access to an
array of ``PyObject*`` objects. The function is deprecated in favor of
``PyTuple_GetItem()`` and ``PyList_GetItem()``.
``PyTuple_GET_ITEM()`` can be abused to access directly the
``PyTupleObject.ob_item`` member::
    PyObject **items = &PyTuple_GET_ITEM(tuple, 0);
The ``PyTuple_GET_ITEM()`` and ``PyList_GET_ITEM()`` macros are
converted to static inline functions to disallow that.
**STATUS**: Not Started
New pythoncapi_compat.h header file
-----------------------------------
Making structures opaque requires modifying C extensions to
use getter and setter functions. The practical issue is how to keep
support for old Python versions which don't have these functions.
For example, in Python 3.10, it is no longer possible to use
``Py_TYPE()`` as an l-value. The new ``Py_SET_TYPE()`` function must be
used instead::

    #if PY_VERSION_HEX >= 0x030900A4
        Py_SET_TYPE(&MyType, &PyType_Type);
    #else
        Py_TYPE(&MyType) = &PyType_Type;
    #endif
This pattern may be familiar to developers who ported their Python code
base from Python 2 to Python 3.
Python will distribute a new ``pythoncapi_compat.h`` header file which
provides new C API functions to old Python versions. Example::

    #if PY_VERSION_HEX < 0x030900A4
    static inline void
    _Py_SET_TYPE(PyObject *ob, PyTypeObject *type)
    {
        ob->ob_type = type;
    }
    #define Py_SET_TYPE(ob, type) _Py_SET_TYPE((PyObject*)(ob), type)
    #endif  // PY_VERSION_HEX < 0x030900A4
Using this header file, ``Py_SET_TYPE()`` can be used on old Python
versions as well.
Developers can copy this file into their project, or even copy/paste
only the few functions needed by their C extension.
**STATUS**: In Progress (implemented but not distributed by CPython yet)
The ``pythoncapi_compat.h`` header file is currently developed at:
https://github.com/pythoncapi/pythoncapi_compat
Process to reduce the number of broken C extensions
===================================================
The process to reduce the number of broken C extensions when introducing
the C API incompatible changes listed in this PEP:
* Estimate how many popular C extensions are affected by the
incompatible change.
* Coordinate with maintainers of broken C extensions to prepare their
code for the future incompatible change.
* Introduce the incompatible changes in Python. The documentation must
explain how to port existing code. It is recommended to merge such
changes at the beginning of a development cycle to have more time for
tests.
* Changes which are the most likely to break a large number of C
extensions should be announced on the capi-sig mailing list to notify
C extensions maintainers to prepare their project for the next Python.
* If the change breaks too many projects, reverting the change should be
discussed, taking into account the number of broken packages, their
importance in the Python community, and the importance of the change.
The coordination usually means reporting issues to the projects, or even
proposing changes. It does not require waiting for a new release including
fixes for every broken project.
Since more and more C extensions are written using Cython rather than
directly using the C API, it is important to ensure that Cython is
prepared in advance for incompatible changes. This gives more time for C
extension maintainers to release a new version with code generated by
the updated Cython (for C extensions which distribute the code generated
by Cython).
Future incompatible changes can be announced by deprecating a function
in the documentation and by annotating the function with
``Py_DEPRECATED()``. But making a structure opaque and preventing the
usage of a macro as l-value cannot be deprecated with
``Py_DEPRECATED()``.
The important part is coordination and finding a balance between CPython
evolutions and backward compatibility. For example, breaking a random,
old, obscure and unmaintained C extension on PyPI is less severe than
breaking numpy.
If a change is reverted, we move back to the coordination step to better
prepare the change. Once more C extensions are ready, the incompatible
change can be reconsidered.
Version History
===============
* Version 3, June 2020: PEP rewritten from scratch. Python now
distributes a new ``pythoncapi_compat.h`` header and a process is
defined to reduce the number of broken C extensions when introducing C
API incompatible changes listed in this PEP.
* Version 2, April 2020:
`PEP: Modify the C API to hide implementation details
<https://mail.python.org/archives/list/python-dev@python.org/thread/HKM774XK…>`_.
* Version 1, July 2017:
`PEP: Hide implementation details in the C API
<https://mail.python.org/archives/list/python-ideas@python.org/thread/6XATDG…>`_
sent to python-ideas
Copyright
=========
This document has been placed in the public domain.
Regardless of what side you fall on, I think we can agree that emotions are
running very high at the moment. Nothing is going to change in at least the
next 24 hours, so I am personally asking folks to step back for at least
that long and think about:
1. Is what you want to say going to contribute to the discussion?
2. Is it being phrased in a polite, constructive fashion?
If after 24 hours of reflection you feel your email still meets these
criteria then I would say it's reasonable to send that email.
I'm happy to present a new PEP for the python-dev community to review. This
is joint work with Brandt Bucher, Tobias Kohn, Ivan Levkivskyi and Talin.
Many people have thought about extending Python with a form of pattern
matching similar to that found in Scala, Rust, F#, Haskell and other
languages with a functional flavor. The topic has come up regularly on
python-ideas (most recently yesterday :-).
I'll mostly let the PEP speak for itself:
- Published: https://www.python.org/dev/peps/pep-0622/ (*)
- Source: https://github.com/python/peps/blob/master/pep-0622.rst
(*) The published version will hopefully be available soon.
I want to clarify that the design space for such a match statement is
enormous. For many key decisions the authors have clashed, in some cases we
have gone back and forth several times, and a few uncomfortable compromises
were struck. It is quite possible that some major design decisions will
have to be revisited before this PEP can be accepted. Nevertheless, we're
happy with the current proposal, and we have provided ample discussion in
the PEP under the headings of Rejected Ideas and Deferred Ideas. Please
read those before proposing changes!
I'd like to end with the contents of the README of the repo where we've
worked on the draft, which is shorter and gives a gentler introduction than
the PEP itself:
# Pattern Matching
This repo contains a draft PEP proposing a `match` statement.
Origins
-------
The work has several origins:
- Many statically compiled languages (especially functional ones) have
a `match` expression, for example
[Scala](
http://www.scala-lang.org/files/archive/spec/2.11/08-pattern-matching.html),
[Rust](https://doc.rust-lang.org/reference/expressions/match-expr.html),
[F#](
https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/pattern-m…
);
- Several extensive discussions on python-ideas, culminating in a
summarizing
[blog post](
https://tobiaskohn.ch/index.php/2018/09/18/pattern-matching-syntax-in-pytho…
)
by Tobias Kohn;
- An independently developed [draft
PEP](
https://github.com/ilevkivskyi/peps/blob/pattern-matching/pep-9999.rst)
by Ivan Levkivskyi.
Implementation
--------------
A full reference implementation written by Brandt Bucher is available
as a [fork](https://github.com/brandtbucher/cpython/tree/patma) of
the CPython repo. This is readily converted to a [pull
request](https://github.com/brandtbucher/cpython/pull/2).
Examples
--------
Some [example code](
https://github.com/gvanrossum/patma/tree/master/examples/) is available
from this repo.
Tutorial
--------
A `match` statement takes an expression and compares it to successive
patterns given as one or more `case` blocks. This is superficially
similar to a `switch` statement in C, Java or JavaScript (and many
other languages), but much more powerful.
The simplest form compares a target value against one or more literals:
```py
def http_error(status):
match status:
case 400:
return "Bad request"
case 401:
return "Unauthorized"
case 403:
return "Forbidden"
case 404:
return "Not found"
case 418:
return "I'm a teapot"
case _:
return "Something else"
```
Note the last block: the "variable name" `_` acts as a *wildcard* and
never fails to match.
You can combine several literals in a single pattern using `|` ("or"):
```py
case 401|403|404:
return "Not allowed"
```
Patterns can look like unpacking assignments, and can be used to bind
variables:
```py
# The target is an (x, y) tuple
match point:
case (0, 0):
print("Origin")
case (0, y):
print(f"Y={y}")
case (x, 0):
print(f"X={x}")
case (x, y):
print(f"X={x}, Y={y}")
case _:
raise ValueError("Not a point")
```
Study that one carefully! The first pattern has two literals, and can
be thought of as an extension of the literal pattern shown above. But
the next two patterns combine a literal and a variable, and the
variable is *extracted* from the target value (`point`). The fourth
pattern is a double extraction, which makes it conceptually similar to
the unpacking assignment `(x, y) = point`.
If you are using classes to structure your data (e.g. data classes)
you can use the class name followed by an argument list resembling a
constructor, but with the ability to extract variables:
```py
from dataclasses import dataclass
@dataclass
class Point:
x: int
y: int
def whereis(point):
match point:
case Point(0, 0):
print("Origin")
case Point(0, y):
print(f"Y={y}")
case Point(x, 0):
print(f"X={x}")
case Point():
print("Somewhere else")
case _:
print("Not a point")
```
We can use keyword parameters too. The following patterns are all
equivalent (and all bind the `y` attribute to the `var` variable):
```py
Point(1, var)
Point(1, y=var)
Point(x=1, y=var)
Point(y=var, x=1)
```
Patterns can be arbitrarily nested. For example, if we have a short
list of points, we could match it like this:
```py
match points:
case []:
print("No points")
case [Point(0, 0)]:
print("The origin")
case [Point(x, y)]:
print(f"Single point {x}, {y}")
case [Point(0, y1), Point(0, y2)]:
print(f"Two on the Y axis at {y1}, {y2}")
case _:
print("Something else")
```
We can add an `if` clause to a pattern, known as a "guard". If the
guard is false, `match` goes on to try the next `case` block. Note
that variable extraction happens before the guard is evaluated:
```py
match point:
case Point(x, y) if x == y:
print(f"Y=X at {x}")
case Point(x, y):
print(f"Not on the diagonal")
```
Several other key features:
- Like unpacking assignments, tuple and list patterns have exactly the
same meaning and actually match arbitrary sequences. An important
exception is that they don't match iterators or strings.
(Technically, the target must be an instance of
`collections.abc.Sequence`.)
- Sequence patterns support wildcards: `[x, y, *rest]` and `(x, y,
*rest)` work similarly to wildcards in unpacking assignments. The
name after `*` may also be `_`, so `(x, y, *_)` matches a sequence
of at least two items without binding the remaining items.
- Mapping patterns: `{"bandwidth": b, "latency": l}` extracts the
`"bandwidth"` and `"latency"` values from a dict. Unlike sequence
patterns, extra keys are ignored. A wildcard `**rest` is also
supported. (But `**_` would be redundant, so it is not allowed.)
- Subpatterns may be extracted using the walrus (`:=`) operator:
```py
case (Point(x1, y1), p2 := Point(x2, y2)): ...
```
- Patterns may use named constants. These must be dotted names; a
single name can be made into a constant value by prefixing it with a
dot to prevent it from being interpreted as a variable extraction:
```py
RED, GREEN, BLUE = 0, 1, 2
match color:
case .RED:
print("I see red!")
case .GREEN:
print("Grass is green")
case .BLUE:
print("I'm feeling the blues :(")
```
- Classes can customize how they are matched by defining a
`__match__()` method.
Read the [PEP](
https://github.com/python/peps/blob/master/pep-0622.rst#runtime-specificati…)
for details.
--
--Guido van Rossum (python.org/~guido)
*Pronouns: he/him **(why is my pronoun here?)*
<http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-…>