Hi, folks.
Since the previous discussion was suspended without consensus, I wrote
a new PEP for it. (Thank you Victor for reviewing it!)
This PEP looks very similar to PEP 623 "Remove wstr from Unicode",
but for encoder APIs, not for Unicode object APIs.
URL (not available yet): https://www.python.org/dev/peps/pep-0624/
---
PEP: 624
Title: Remove Py_UNICODE encoder APIs
Author: Inada Naoki <songofacandy(a)gmail.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 06-Jul-2020
Python-Version: 3.11
Abstract
========
This PEP proposes to remove deprecated ``Py_UNICODE`` encoder APIs in
Python 3.11:
* ``PyUnicode_Encode()``
* ``PyUnicode_EncodeASCII()``
* ``PyUnicode_EncodeLatin1()``
* ``PyUnicode_EncodeUTF7()``
* ``PyUnicode_EncodeUTF8()``
* ``PyUnicode_EncodeUTF16()``
* ``PyUnicode_EncodeUTF32()``
* ``PyUnicode_EncodeUnicodeEscape()``
* ``PyUnicode_EncodeRawUnicodeEscape()``
* ``PyUnicode_EncodeCharmap()``
* ``PyUnicode_TranslateCharmap()``
* ``PyUnicode_EncodeDecimal()``
* ``PyUnicode_TransformDecimalToASCII()``
.. note::
   `PEP 623 <https://www.python.org/dev/peps/pep-0623/>`_ proposes to remove
   Unicode object APIs relating to ``Py_UNICODE``. This PEP, on the other
   hand, does not relate to the Unicode object. These PEPs are split because
   they have different motivations and need different discussions.
Motivation
==========
In general, reducing the number of APIs that have been deprecated for
a long time and have few users is a good idea: not only does it
improve the maintainability of CPython, it also helps API users
and other Python implementations.
Rationale
=========
Deprecated since Python 3.3
---------------------------
``Py_UNICODE`` and the APIs using it have been deprecated since Python 3.3.
Inefficient
-----------
All of these APIs are implemented using ``PyUnicode_FromWideChar``: they
first convert the ``Py_UNICODE*`` input into a temporary Unicode object
and then encode it. So these APIs are inefficient when the user wants to
encode data that is already in a Unicode object.
Not used widely
---------------
A search of the top 4000 PyPI packages [1]_ found that only pyodbc uses
these APIs:

* ``PyUnicode_EncodeUTF8()``
* ``PyUnicode_EncodeUTF16()``

pyodbc uses these APIs to encode a Unicode object into a bytes object,
so it is easy to fix. [2]_
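For illustration, here is a minimal sketch of this kind of migration
(not pyodbc's actual patch; see [2]_ for that), assuming the caller
already holds a ``PyObject *unicode``:

.. code-block:: c

   /* Before (deprecated): encode via a Py_UNICODE buffer. */
   PyObject *before = PyUnicode_EncodeUTF16(PyUnicode_AS_UNICODE(unicode),
                                            PyUnicode_GET_SIZE(unicode),
                                            NULL,  /* errors */
                                            0);    /* native byte order */

   /* After: encode the Unicode object directly... */
   PyObject *after = PyUnicode_AsUTF16String(unicode);

   /* ...or via the generic API, which also covers codecs such as UTF-7
      that have no dedicated PyUnicode_As*String() function. */
   PyObject *generic = PyUnicode_AsEncodedString(unicode, "utf-16", NULL);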
Alternative APIs
================
There are alternative APIs that accept ``PyObject *unicode`` instead of
``Py_UNICODE *``. Users can migrate to them:
========================================  =========================================
Deprecated API                            Alternative APIs
========================================  =========================================
``PyUnicode_Encode()``                    ``PyUnicode_AsEncodedString()``
``PyUnicode_EncodeASCII()``               ``PyUnicode_AsASCIIString()`` \(1)
``PyUnicode_EncodeLatin1()``              ``PyUnicode_AsLatin1String()`` \(1)
``PyUnicode_EncodeUTF7()``                \(2)
``PyUnicode_EncodeUTF8()``                ``PyUnicode_AsUTF8String()`` \(1)
``PyUnicode_EncodeUTF16()``               ``PyUnicode_AsUTF16String()`` \(3)
``PyUnicode_EncodeUTF32()``               ``PyUnicode_AsUTF32String()`` \(3)
``PyUnicode_EncodeUnicodeEscape()``       ``PyUnicode_AsUnicodeEscapeString()``
``PyUnicode_EncodeRawUnicodeEscape()``    ``PyUnicode_AsRawUnicodeEscapeString()``
``PyUnicode_EncodeCharmap()``             ``PyUnicode_AsCharmapString()`` \(1)
``PyUnicode_TranslateCharmap()``          ``PyUnicode_Translate()``
``PyUnicode_EncodeDecimal()``             \(4)
``PyUnicode_TransformDecimalToASCII()``   \(4)
========================================  =========================================
Notes:
(1)
   ``const char *errors`` parameter is missing.

(2)
   There is no public alternative API, but the generic
   ``PyUnicode_AsEncodedString()`` can be used instead.

(3)
   ``const char *errors, int byteorder`` parameters are missing.

(4)
   There is no direct replacement, but ``Py_UNICODE_TODECIMAL``
   can be used instead. CPython itself uses
   ``_PyUnicode_TransformDecimalAndSpaceToASCII`` to convert
   from Unicode to numbers.
Plan
====
Python 3.9
----------
Add ``Py_DEPRECATED(3.3)`` to the following APIs. This change has already
been committed [3]_. All other APIs have already been marked
``Py_DEPRECATED(3.3)``.

* ``PyUnicode_EncodeDecimal()``
* ``PyUnicode_TransformDecimalToASCII()``
Document all APIs as "will be removed in version 3.11".
Python 3.11
-----------
The following APIs are removed:
* ``PyUnicode_Encode()``
* ``PyUnicode_EncodeASCII()``
* ``PyUnicode_EncodeLatin1()``
* ``PyUnicode_EncodeUTF7()``
* ``PyUnicode_EncodeUTF8()``
* ``PyUnicode_EncodeUTF16()``
* ``PyUnicode_EncodeUTF32()``
* ``PyUnicode_EncodeUnicodeEscape()``
* ``PyUnicode_EncodeRawUnicodeEscape()``
* ``PyUnicode_EncodeCharmap()``
* ``PyUnicode_TranslateCharmap()``
* ``PyUnicode_EncodeDecimal()``
* ``PyUnicode_TransformDecimalToASCII()``
Alternative ideas
=================
Instead of just removing the deprecated APIs, we may be able to reuse their
names with different signatures.
Make some private APIs public
------------------------------
``PyUnicode_EncodeUTF7()`` doesn't have a public alternative API.
Some other APIs have public alternatives, but those are missing the
``const char *errors`` or ``int byteorder`` parameters.
We can rename some private APIs and make them public to cover the missing
APIs and parameters:
============================= ================================
Rename to Rename from
============================= ================================
``PyUnicode_EncodeASCII()`` ``_PyUnicode_AsASCIIString()``
``PyUnicode_EncodeLatin1()`` ``_PyUnicode_AsLatin1String()``
``PyUnicode_EncodeUTF7()`` ``_PyUnicode_EncodeUTF7()``
``PyUnicode_EncodeUTF8()`` ``_PyUnicode_AsUTF8String()``
``PyUnicode_EncodeUTF16()`` ``_PyUnicode_EncodeUTF16()``
``PyUnicode_EncodeUTF32()`` ``_PyUnicode_EncodeUTF32()``
============================= ================================
Pros:
* We have a more consistent API set.
Cons:
* We have more public APIs to maintain.
* Existing public APIs are enough for most use cases, and
``PyUnicode_AsEncodedString()`` can be used in other cases.
Replace ``Py_UNICODE*`` with ``Py_UCS4*``
-----------------------------------------
We can replace ``Py_UNICODE`` (a typedef of ``wchar_t``) with
``Py_UCS4``. Since the builtin codecs support UCS-4, we don't need to
convert the ``Py_UCS4*`` string to a Unicode object.
Pros:
* We have a more consistent API set.
* Users can encode a UCS-4 string in C without creating a Unicode object.
Cons:
* We have more public APIs to maintain.
* Applications which use UTF-8 or UTF-32 cannot use these APIs
  anyway.
* Other Python implementations may not have a builtin codec for UCS-4.
* If we change the Unicode internal representation to UTF-8, we need
to keep UCS-4 support only for these APIs.
Replace ``Py_UNICODE*`` with ``wchar_t*``
-----------------------------------------
We can replace ``Py_UNICODE`` with ``wchar_t``.
Pros:
* We have a more consistent API set.
* Backward compatible.
Cons:
* We have more public APIs to maintain.
* They are inefficient on platforms where ``wchar_t*`` is UTF-16,
  because the built-in codecs support only UCS-1, UCS-2, and UCS-4
  input.
Rejected ideas
==============
Using runtime warning
---------------------
These APIs don't release the GIL for now. Emitting a warning from
such APIs is not safe. See this example:
.. code-block:: c

   PyObject *u = PyList_GET_ITEM(list, i);  // u is a borrowed reference.
   PyObject *b = PyUnicode_EncodeUTF8(PyUnicode_AS_UNICODE(u),
                                      PyUnicode_GET_SIZE(u), NULL);
   // Assumes u is still a living reference.
   PyObject *t = PyTuple_Pack(2, u, b);
   Py_DECREF(b);
   return t;
If we emit a Python warning from ``PyUnicode_EncodeUTF8()``, warning
filters and other threads may mutate the ``list``, and ``u`` can become
a dangling reference after ``PyUnicode_EncodeUTF8()`` returns.
Additionally, since we are not changing behavior but removing C APIs, a
runtime ``DeprecationWarning`` might not be helpful for Python
developers. We should warn extension developers instead.
Discussions
===========
* `Plan to remove Py_UNICODE APIs except PEP 623
<https://mail.python.org/archives/list/python-dev@python.org/thread/S7KW2U6I…>`_
* `bpo-41123: Remove Py_UNICODE APIs except PEP 623:
<https://bugs.python.org/issue41123>`_
References
==========
.. [1] Source package list chosen from top 4000 PyPI packages.
(https://github.com/methane/notes/blob/master/2020/wchar-cache/package_list.…)
.. [2] pyodbc -- Don't use PyUnicode_Encode API #792
(https://github.com/mkleehammer/pyodbc/pull/792)
.. [3] Uncomment Py_DEPRECATED for Py_UNICODE APIs (GH-21318)
(https://github.com/python/cpython/commit/9c3840870814493fed62e140cfa43c2883…)
Copyright
=========
This document has been placed in the public domain.
--
Inada Naoki <songofacandy(a)gmail.com>
Hi all,
Right now, when a debugger is active, the number of local variables can
affect the tracing speed quite a lot.
For instance, having tracing set up in a program such as the one below takes
4.64 seconds to run, yet changing all the variables to have the same name
-- i.e.: changing all assignments to `a = 1` (such that there's only a single
variable in the namespace) -- makes it take 1.47 seconds (on my machine)... the
higher the number of variables, the slower the tracing becomes.
```
import time

t = time.time()

def call():
    a = 1
    b = 1
    c = 1
    d = 1
    e = 1
    f = 1

def noop(frame, event, arg):
    return noop

import sys
sys.settrace(noop)

for i in range(1_000_000):
    call()

print('%.2fs' % (time.time() - t,))
```
This happens because `PyFrame_FastToLocalsWithError` and
`PyFrame_LocalsToFast` are called inside the `call_trampoline` (
https://github.com/python/cpython/blob/master/Python/sysmodule.c#L946).
So, I'd like to simply remove those calls.
Debuggers can call `PyFrame_LocalsToFast` when needed -- otherwise
mutating non-current frames doesn't work anyway. As a note, pydevd already
has such a call:
https://github.com/fabioz/PyDev.Debugger/blob/0d4d210f01a1c0a8647178b2e665b…
and PyPy also has a counterpart.
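For reference, here's a minimal sketch (CPython-specific; `set_local` is
just an illustrative name, not an existing API) of the kind of explicit
call a debugger can make when it mutates a frame's locals:

```
import ctypes

def set_local(frame, name, value):
    frame.f_locals[name] = value
    # Write the f_locals dict back into the frame's fast-locals array;
    # without this call, the assignment above is silently discarded
    # when the frame resumes.
    ctypes.pythonapi.PyFrame_LocalsToFast(ctypes.py_object(frame),
                                          ctypes.c_int(0))
```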
As for `PyFrame_FastToLocalsWithError`, I don't really see any reason to
call it at all.
i.e.: code such as the one below prints the `a` variable from the `main()`
frame regardless of that call, and I checked all pydevd tests and nothing seems
to be affected (it seems that accessing f_locals already does this:
https://github.com/python/cpython/blob/cb9879b948a19c9434316f8ab6aba9c4601a…,
so, I don't see much reason to call it at all).
```
def call():
    import sys
    frame = sys._getframe()
    print(frame.f_back.f_locals)

def main():
    a = 1
    call()

if __name__ == '__main__':
    main()
```
Does anyone see any issue with this?
If it's not controversial, is a PEP needed, or would just an issue to
track it be enough to remove those 2 lines?
Thanks,
Fabio
Hi,
Pathlib's symlink_to() and link_to() methods have different argument
orders, so:
a.symlink_to(b) # Creates a symlink from A to B
a.link_to(b) # Creates a hard link from B to A
I don't think link_to() was intended to be implemented this way, as the
docs say "Create a hard link pointing to a path named target.". It's also
inconsistent with everything else in pathlib, most obviously symlink_to().
Bug report here: https://bugs.python.org/issue39291
This /really/ irks me. Apparently it's too late to fix link_to(), so I'd
like to suggest we add a new hardlink_to() method that matches the
symlink_to() argument order. link_to() then becomes deprecated/undocumented.
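Something like this minimal sketch is what I have in mind (names and
placement hypothetical, built on os.link(), with the same argument order
as symlink_to()):

```
import os
import pathlib

# Hypothetical sketch of the proposed method, for illustration only.
def hardlink_to(self: pathlib.Path, target):
    """Make this path a hard link to the same file as *target*."""
    os.link(target, self)

# With this, the two methods read consistently:
#   a.symlink_to(b)   # a becomes a symlink pointing to b
#   a.hardlink_to(b)  # a becomes a hard link to b's file
```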
Any thoughts?
Barney
Take as an example a function designed to process a tree of nodes similar to that which might be output by a JSON parser. There are 4 types of node:
- A node representing JSON strings
- A node representing JSON numbers
- A node representing JSON arrays
- A node representing JSON dictionaries
The function transforms a tree of nodes, beginning at the root node, and proceeding recursively through each child node in turn. The result is a Python object, with the following transformation applied to each node type:
- A JSON string `->` Python `str`
- A JSON number `->` Python `float`
- A JSON array `->` Python `list`
- A JSON dictionary `->` Python `dict`
I have implemented this function using 3 different approaches:
- The visitor pattern
- `isinstance` checks against the node type
- Pattern matching
Here is the implementation using the visitor pattern:
```
from typing import List, Tuple

class NodeVisitor:
    def visit_string_node(self, node: "StringNode"):
        pass

    def visit_number_node(self, node: "NumberNode"):
        pass

    def visit_list_node(self, node: "ListNode"):
        pass

    def visit_dict_node(self, node: "DictNode"):
        pass

class Node:
    def visit(self, visitor: NodeVisitor):
        raise NotImplementedError()

class StringNode(Node):
    value: str

    def visit(self, visitor: NodeVisitor):
        return visitor.visit_string_node(self)

class NumberNode(Node):
    value: str

    def visit(self, visitor: NodeVisitor):
        return visitor.visit_number_node(self)

class ListNode(Node):
    children: List[Node]

    def visit(self, visitor: NodeVisitor):
        return visitor.visit_list_node(self)

class DictNode(Node):
    children: List[Tuple[str, Node]]

    def visit(self, visitor: NodeVisitor):
        return visitor.visit_dict_node(self)

class Processor(NodeVisitor):
    def process(self, root_node: Node):
        return root_node.visit(self)

    def visit_string_node(self, node: StringNode):
        return node.value

    def visit_number_node(self, node: NumberNode):
        return float(node.value)

    def visit_list_node(self, node: ListNode):
        return [child_node.visit(self) for child_node in node.children]

    def visit_dict_node(self, node: DictNode):
        return {key: child_node.visit(self) for key, child_node in node.children}

def process(root_node: Node):
    processor = Processor()
    return processor.process(root_node)
```
Here is the implementation using `isinstance` checks against the node type:
```
from typing import List, Tuple

class Node:
    pass

class StringNode(Node):
    value: str

class NumberNode(Node):
    value: str

class ListNode(Node):
    children: List[Node]

class DictNode(Node):
    children: List[Tuple[str, Node]]

def process(root_node: Node):
    def process_node(node: Node):
        if isinstance(node, StringNode):
            return node.value
        elif isinstance(node, NumberNode):
            return float(node.value)
        elif isinstance(node, ListNode):
            return [process_node(child_node) for child_node in node.children]
        elif isinstance(node, DictNode):
            return {key: process_node(child_node) for key, child_node in node.children}
        else:
            raise Exception('Unexpected node')
    return process_node(root_node)
```
Finally here is the implementation using pattern matching:
```
from typing import List, Tuple

class Node:
    pass

class StringNode(Node):
    value: str

class NumberNode(Node):
    value: str

class ListNode(Node):
    children: List[Node]

class DictNode(Node):
    children: List[Tuple[str, Node]]

def process(root_node: Node):
    def process_node(node: Node):
        match node:
            case StringNode(value=str_value):
                return str_value
            case NumberNode(value=number_value):
                return float(number_value)
            case ListNode(children=child_nodes):
                return [process_node(child_node) for child_node in child_nodes]
            case DictNode(children=child_nodes):
                return {key: process_node(child_node) for key, child_node in child_nodes}
            case _:
                raise Exception('Unexpected node')
    return process_node(root_node)
```
Here are the lengths of the different implementations:
- Pattern matching `->` 37 lines
- `isinstance` checks `->` 36 lines
- The visitor pattern `->` 69 lines
The visitor pattern implementation is by far the most verbose solution, weighing in at almost twice the length of the alternative implementations due to the large amount of boilerplate that is necessary to achieve double dispatch. The pattern matching and `isinstance` check implementations are very similar in length for this trivial example.
In each implementation, there are 2 operations performed on each node.
- Determine the type of the node
- Destructure the node to extract the desired data
The visitor pattern and `isinstance` check implementations separate these 2 operations, whereas the pattern matching approach combines the operations together. I believe that it is the declarative nature of pattern matching, where the operations of determining the type of the node and destructuring the node are combined into a single clause, which allows pattern matching to express a concise solution to the problem. In this trivial example, the advantage of pattern matching over the alternative of using a sequence of `if`-`elif`-`else` statements is not as obvious as it would be when compared to a more complex example, where a sub-tree of nodes might be matched based on their type and be destructured in a single clause.
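For instance, here is a sketch (using the node classes defined above,
with a hypothetical "name" key) of how a single clause can check the
types of a small sub-tree and destructure it at once:

```
# One case clause matches a DictNode whose single entry maps the
# (hypothetical) key "name" to a StringNode, and extracts its value.
def extract_name(node: Node):
    match node:
        case DictNode(children=[("name", StringNode(value=name))]):
            return name
        case _:
            return None
```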
I have seen elsewhere an argument that pattern matching should not be accepted into Python as it introduces a pseudo-DSL that is separate from the rest of the language. I agree that pattern matching might be viewed as a pseudo-DSL, but I believe that it is a good thing if it allows the solution to certain classes of problems to be expressed in a concise manner. People often raise similar objections to operator overloading in other languages, whereas the presence of operator overloading in Python allows mathematical expressions involving custom numeric types such as vectors to be expressed in a natural way. Furthermore, Python has a regular expression module which implements its own DSL for the purpose of matching string patterns. Regular expressions, in a similar way to pattern matching, allow string patterns to be expressed in a concise and declarative manner.
I really hope that the Steering Council accepts pattern matching into Python. I think that it allows for processing of heterogeneous graphs of objects using recursion in a concise, declarative manner. I would like to thank the authors of the Structural Pattern Matching PEP for their hard work in designing this feature and developing an implementation of it. I believe that it will be a wonderful addition to the language that I am very much looking forward to using.
Hi folks,
I know this is a high-volume list so sorry for being yet another voice
screaming into your mailbox, but I do not know how else to handle this.
A few months ago, I opened a pull request fixing packed bitfields in
ctypes struct/union, which right now are being incorrectly created. The
offsets and sizes are wrong in some cases; fields which should be
packed end up with padding between them.
bpo: https://bugs.python.org/issue29753
PR: https://github.com/python/cpython/pull/19850
Since ctypes has no maintainer, it has been very hard to track down
someone who is up to reviewing this. So, if this issue sparks some
interest for you, or you just want to help me out, please have a look :)
Cheers,
Filipe Laíns
Hi everyone,
PEP 634/5/6 presents a possible implementation of pattern matching for
Python.
Much of the discussion around PEP 634, and PEP 622 before it, seems to
imply that PEP 634 is synonymous with pattern matching; that if you
reject PEP 634 then you are rejecting pattern matching.
That simply isn't true.
Can we discuss whether we want pattern matching in Python and
the broader semantics first, before dealing with low level details?
Do we want pattern matching in Python at all?
---------------------------------------------
Pattern matching works really well in statically typed, functional
languages.
The lack of mutability, constrained scope and the ability of the
compiler to distinguish let variables from constants means that pattern
matching code has fewer errors, and can be compiled efficiently.
Pattern matching works less well in dynamically-typed, functional
languages and statically-typed, procedural languages.
Nevertheless, it works well enough for it to be a popular feature in
both erlang and rust.
In dynamically-typed, procedural languages, however, it is not clear (at
least not to me) that it works well enough to be worthwhile.
That is not to say that pattern matching could never be of value in Python,
but PEP 635 fails to demonstrate that it can (although it does a better
job than PEP 622).
Should match be an expression, or a statement?
----------------------------------------------
Do we want a fancy switch statement, or a powerful expression?
Expressions have the advantage of not leaking (like comprehensions in
Python 3), but statements are easier to work with.
Can pattern matching make it clear what is assigned?
----------------------------------------------------
Embedding the variables to be assigned into a pattern, makes the pattern
concise, but requires discarding normal Python syntax and inventing a
new sub-language. Could we make patterns fit Python better?
Is it possible to make assignment to variables clear, and unambiguous,
and allow the use of symbolic constants at the same time?
I think it is, but PEP 634 fails to do this.
How should pattern matching be integrated with the object model?
----------------------------------------------------------------
What special method(s) should be added? How and when should they be called?
PEP 634 largely disregards the object model, meaning it has many special
cases, and is inefficient.
The semantics must be well defined.
-----------------------------------
Language extensions PEPs should define the semantics of those
extensions. For example, PEP 343 and PEP 380 both did.
https://www.python.org/dev/peps/pep-0343/#specification-the-with-statement
https://www.python.org/dev/peps/pep-0380/#formal-semantics
PEP 634 just waves its hands and talks about undefined behavior, which
horrifies me.
In summary,
I would ask anyone who wants pattern matching added to Python not to
support PEP 634.
PEP 634 just isn't a good fit for Python, and we deserve something better.
Cheers,
Mark.
On behalf of the PyPA and the pip team, I am pleased to announce that we have just released pip 20.3, a new version of pip. You can install it by running `python -m pip install --upgrade pip`.
This is an important and disruptive release -- we [explained why in a blog post last year](https://pyfound.blogspot.com/2019/12/moss-czi-support-pip.html). We
even made [a video about it](https://www.youtube.com/watch?v=B4GQCBBsuNU).
## Highlights
* **DISRUPTION**: Switch to the new dependency resolver by default. (#9019) Watch out for changes in handling editable
installs, constraints files, and more:
https://pip.pypa.io/en/latest/user_guide/#changes-to-the-pip-dependency-res…
* **DEPRECATION**: Deprecate support for Python 3.5 (to be removed in pip 21.0) (#8181)
* **DEPRECATION**: pip freeze will stop filtering the pip, setuptools, distribute and wheel packages from pip freeze output in a future version. To keep the previous behavior, users should use the new `--exclude` option. (#4256)
* Substantial improvements in new resolver for performance, output and
error messages, avoiding infinite loops, and support for constraints files.
* Support for PEP 600: Future ‘manylinux’ Platform Tags for Portable
Linux Built Distributions. (#9077)
* Documentation improvements: Resolver migration guide, quickstart
guide, and new documentation theme.
* Add support for macOS Big Sur compatibility tags. (#9138)
The new resolver is now *on by default*. It is significantly stricter
and more consistent when it receives incompatible instructions, and
reduces support for certain kinds of constraints files, so some
workarounds and workflows may break. Please see [our guide on how to
test and migrate, and how to report issues](https://pip.pypa.io/en/latest/user_guide/#changes-to-the-pip-dependency-resolver-in-20-3-2020). You
can use the deprecated (old) resolver, using the flag
`--use-deprecated=legacy-resolver`, until we remove it in the pip 21.0
release in January 2021.
You can find more details (including deprecations and removals) [in the
changelog](https://pip.pypa.io/en/stable/news/).
## User experience
Command-line output for this version of pip, and documentation to help
with errors, is significantly better, because you worked with our
experts to test and improve it. [Contribute to our user experience work: sign up to become a member of the UX Studies group](https://bit.ly/pip-ux-studies) (after you join, we'll notify you about future UX surveys and interviews).
## What to expect in 21.0
We aim to release pip 21.0 in January 2021, per our [usual release cadence](https://pip.pypa.io/en/latest/development/release-process/#release-cadence). You can expect:
* Removal of [Python 2.7](https://pip.pypa.io/en/latest/development/release-process/#python-2-support) and 3.5 support
* Further improvements in the new resolver
* Removal of legacy resolver support
## Thanks
As with all pip releases, a significant amount of the work was
contributed by pip's user community. Huge thanks to all who have
contributed, whether through code, documentation, issue reports and/or
discussion. Your help keeps pip improving, and is hugely appreciated.
Specific thanks go to Mozilla (through its [Mozilla Open Source
Support](https://www.mozilla.org/en-US/moss/) Awards) and to the [Chan
Zuckerberg Initiative](https://chanzuckerberg.com/eoss/) DAF, an
advised fund of Silicon Valley Community Foundation, for their funding
that enabled substantial work on the new resolver.
That funding went to [Simply Secure](https://simplysecure.org/)
(specifically Georgia Bullen, Bernard Tyers, Nicole Harris, Ngọc
Triệu, and Karissa McKelvey), [Changeset
Consulting](https://changeset.nyc/) (Sumana Harihareswara),
[Atos](https://www.atos.net) (Paul F. Moore), [Tzu-ping
Chung](https://uranusjr.com), [Pradyun Gedam](https://pradyunsg.me/),
and Ilan Schnell. Thanks also to Ernest W. Durbin III at the Python
Software Foundation for liaising with the project.
-Sumana Harihareswara, pip project manager
(Context: Continuing to prepare for the core dev sprint next week. Since
the sprint is near, *I'd greatly appreciate any quick comments, feedback
and ideas!*)
Following up my collection of past beginning contributor experiences, I've
collected these experiences in a dedicated GitHub repo[1] and written a
(subjective!) summary of main themes that I recognize in the stories, which
I've also included in the repo[2].
A "TL;DR" bullet list of those main themes:
* Slow/no responsiveness
* Long, slow process
* Hard to find where to contribute
* Mentorship helps a lot, but is scarce
* A lot to learn to get started
* It's intimidating
More specifically, something that has come up often is that maintaining
momentum for new contributors is crucial for them to become long-term
contributors. Most often, this comes up in relation to the first two
points: suggestions or PRs receive no attention at all
("ignored") or stop receiving attention at some point ("lost to the void").
Unfortunately, the probability of this is pretty high for any issue/PR, so
for a new contributor this is almost guaranteed to happen while working on
one of their first few contributions. I've seen this happen many times, and
have found that I have to personally follow promising contributors' work to
ensure that this doesn't happen to them. I've also seen contributors learn
to actively seek out core devs when these situations arise, which is often
a successful tactic, but shouldn't be necessary so often.
Now, this is in large part a result of the fact that we core devs are not a
very large group, made up almost entirely of volunteers working on this in
their spare time. Last I checked, the total amount of paid development time
dedicated to developing Python is less than 3 full-time positions (i.e.
~100 hours a week).
The situation being problematic is clear enough that the PSF had concrete
plans to hire paid developers to review issues and PRs. However, those
plans have been put on hold indefinitely, since the PSF's funding has
shrunk dramatically since the COVID-19 outbreak (no PyCon!).
So, what can be done? Besides raising more funds (see a note on this
below), I think we can find ways to reduce how often issues/PRs become
"stalled". Here are some ideas:
1. *Generate reminders for reviewers when an issue or PR becomes "stalled"
due to them.* Personally, I've found that both b.p.o. and GitHub make it
relatively hard to remember to follow up on all of the many issues/PRs
you've taken part in reviewing. It takes considerable attention and
discipline to do so consistently, and reminders like these would have
helped me. Many (many!) times, all it took to get an issue/PR moving
forward (or closed) was a simple "ping?" comment.
2. *Generate reminders for contributors when an issue or PR becomes
"stalled" due to them.* Similar to the above, but I consider these separate.
3. *Advertise something like a "2-for-1" standing offer for reviews.* This
would give contributors an "official", acceptable way to get attention for
their issue/PR, other than "begging" for attention on a mailing list. There
are good ways for new contributors to be of significant help despite being
new to the project, such as checking whether old bugs are still relevant,
searching for duplicate issues, or applying old patches to the current code
and creating a PR. (This would be similar to Martin v. Löwis's 5-for-1
offer in 2012[3], which had little success but led to some interesting
followup discussion[4].)
4. *Encourage core devs to dedicate some of their time to working through
issues/PRs which are "ignored" or "stalled".* This would require first
generating reliable lists of issues/PRs in such states. This could be in
various forms, such as predefined GitHub/b.p.o. queries, a dedicated
web-page, a periodic message similar to b.p.o.'s "weekly summary" email, or
dedicated tags/labels for issues/PRs. (Perhaps prioritize "stalled" over
"ignored".)
- Tal Einat
[1]: https://github.com/taleinat/python-contribution-feedback
[2]:
https://github.com/taleinat/python-contribution-feedback/blob/master/Takeaw…
[3]:
https://mail.python.org/archives/list/python-dev@python.org/message/7DLUN4Y…
[4]:
https://mail.python.org/archives/list/python-dev@python.org/thread/N4MMHXXO…
Hello,
As was mentioned many times on the list, PEP634-PEP636 are thoroughly
prepared and good materials, many thanks to their authors. PEP635
"Motivation and Rationale" (https://www.python.org/dev/peps/pep-0635/)
stands out among the 3, however: while reading it, chances are that you'll
get a feeling of "residue" accumulating section over section. By the
end of reading, you may get a well-formed feeling that you've read a
half-technical, half-marketing material, which is intended to "sell" a
particular idea among many other very viable ideas, by shoehorning some
concepts, downplaying other ideas, and at the same time light-heartedly
presenting the drawbacks of its own subject.
Just to give one example, literally at the very beginning, at the
"Pattern Matching and OO" section (3rd heading) it says:
> Pattern matching is complimentary to the object-oriented paradigm.
It's not until the very end of the document, in "History and Context",
that it tells the whole truth:
> With its emphasis on abstraction and encapsulation, object-oriented
> programming posed a serious challenge to pattern matching.
You may wonder how "complimentary" and "posed a serious challenge"
relate to each other. While they're definitely not contradictory,
starting the document with light-hearted "complimentary" can be seen as
trying to set the stage where readers don't pay enough attention to the
problem. And it kinda worked: only now [1] is the wider community
discovering the implications of the "Class Patterns" choices. (As a note,
PEP635 does well on explaining them, and I'm personally sold on that, but
it's a *tough* choice, not an *obvious* one.)
There're many more examples like that in PEP635; it would take too
much space to list them all. However, PEP635 refers to the paper:
> Kohn et al., Dynamic Pattern Matching with Python
> https://doi.org/10.1145/3426422.3426983 (Accepted by DLS 2020. The
> link will go live after Nov. 17; a preview PDF can be obtained from
> the first author.)
As that citation suggests, the paper is not directly linked from the
PEP635. But the preprint is now accessible from the conference page,
https://conf.researchr.org/home/dls-2020?#event-overview (direct link
as of now: https://gvanrossum.github.io//docs/PyPatternMatching.pdf).
That paper is written at a much higher academic standard, and is a pleasure
to read. I recommend it to everyone who read PEP635 (note that it was
written with PEP622 in mind, but the conceptual differences are minor). With
it, I noticed just 2 obvious issues:
Section 4.3. Named Constants
> It would clearly be desirable to allow named constants in patterns
> as a replacement and extension of literals. However, Python has no
> concept of a constant, i.e. all variables are mutable (even where
> the values themselves are immutable).
So, unlike PEP635, the paper pinpoints exactly the root of PEP634's
problems: the lack of constants in Python (on the language level). This is
just the point which was raised on the mailing list as well
(https://mail.python.org/archives/list/python-dev@python.org/message/WV2UA4A…).
Under strict academic peer review, the paper would have been returned
for elaboration, with a note like: "Given that nowadays many dynamic
languages (PHP, JavaScript, Lua, etc.) have support for constants, and
lack of constants poses a serious usability challenge to your proposal,
please explain why you chose to proceed anyway (and apply workarounds),
instead of first introducing the concept of constants to the language.
(Given that amount of work to implement pattern matching is certainly
an order of magnitude larger than to introduce constants)."
But the paper wasn't returned for elaboration, so we'll keep wondering
why the authors chose such a backward process.
Section 6.1. Scope
> The granularity of the scope of local variables is at the level of
> functions and frames. [...]
> The only way to restrict the scope of a variable to part of a
> function’s body (such as a case clause) would be to actively delete
> the variable when leaving the block. This would, however, not restore
> any previous value of a local variable in the function’s scope.
This is a misconception ("the only way") which is repeated almost word
for word in PEP635. If anything, the above describes how
pseudo-scoping is currently implemented for exception vars in the "except
Exception as e:" clause (more info:
https://mail.python.org/pipermail/python-dev/2019-January/155991.html),
which is hardly a "best practice", and definitely not the only way.
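To make this concrete, here is a quick runnable illustration of that
except-clause behavior:

```
# The except clause del'etes its target name on exit and does NOT
# restore any previous value bound to the same name.
e = "outer value"
try:
    raise ValueError("boom")
except ValueError as e:
    pass  # here, e is the ValueError instance

try:
    print(e)
except NameError:
    print("e was deleted on leaving the except block; outer value lost")
```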
How to support multiple variable scopes in one stack frame is not
rocket science at all. One just needs to remember how C has done it since
~forever. And that's: (for unoptimized programs) variables in different
scopes live in disjoint subsections of a frame. (For optimized
programs, variables with disjoint liveness can share the same
locations in a frame).
The only reasons for not implementing the same solution in Python would
be inertia of thought and "but it's not done anywhere else in
Python". Yeah, but nobody del'eted local variables behind users' backs
either, before somebody started to do that for exception clause
variables. And with pattern matching, one way or another, something
will be done for the first time in Python. For
example, nobody before could imagine that one can write "Point(x, y)",
and get x and y assigned, and now we're facing just that [1]. (Again,
I personally love it, though think that "Point(>x, >y)" is an
interesting option to address the tensions).
In that regard, the current PEP634 and friends miss too many interesting
and useful opportunities (constants in language core and true scoping
for match'es, to name a few). Well, that happens. But they try to
shoehorn too much of "we didn't do" into "it's not possible" or "it
doesn't make sense", or "let's work around it in ad-hoc ways", and that
raises eyebrows, leading to concerns that the proposals may actually
still be "raw".
[1] People expressing surprise at "Class Patterns" syntax:
https://mail.python.org/archives/list/python-dev@python.org/message/F66J72J…
https://mail.python.org/archives/list/python-dev@python.org/message/Q2ARJUL…
--
Best regards,
Paul mailto:pmiscml@gmail.com
> I'd love to have an easy way to keep them in the loop.
I'm one of the maintainers on https://github.com/docker-library/python
(which is what results in https://hub.docker.com/_/python), and I'd
love to have an easy way to keep myself in the loop too! O:)
Is there a lower-frequency mailing list where things like this are
normally posted that I could follow?
(I don't want to be a burden, although we'd certainly really love to
have more upstream collaboration on that repo -- we do our best to
represent upstream as correctly/accurately as possible, but we're not
experts!)
> would it make sense to add a packaging section to our documentation or
> to write an informational PEP?
FWIW, I love the idea of an explicit "packaging" section in the docs
(or a PEP), but I've maintained that for other projects before and
know it's not always easy or obvious. :)
♥,
- Tianon
4096R / B42F 6819 007F 00F8 8E36 4FD4 036A 9C25 BF35 7DD4
PS. thanks doko for giving me a link to this thread! :D