Hi,
Pathlib's symlink_to() and link_to() methods have different argument
orders, so:
a.symlink_to(b) # Creates a symlink from A to B
a.link_to(b) # Creates a hard link from B to A
I don't think link_to() was intended to be implemented this way, as the
docs say "Create a hard link pointing to a path named target.". It's also
inconsistent with everything else in pathlib, most obviously symlink_to().
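A minimal sketch of the two orders, for illustration only. It assumes a platform where unprivileged symlinks and hard links work, the file names a/b/c are made up, and it calls os.link() directly (which, as far as I can tell, is what link_to() forwards to) so that it does not depend on the installed pathlib version:

```python
import os
import tempfile
from pathlib import Path


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        a = Path(tmp) / "a"
        b = Path(tmp) / "b"
        c = Path(tmp) / "c"
        b.write_text("payload")

        # a.symlink_to(b): the new link is created at the receiver a
        # and points at the argument b.
        a.symlink_to(b)

        # b.link_to(c) amounts to os.link(b, c): the new link is created
        # at the *argument* c, and the receiver b must already exist.
        os.link(b, c)

        return a.is_symlink(), c.is_symlink(), c.samefile(b), a.read_text()


result = demo()
```

So with symlink_to() the receiver is the thing being created, while with link_to() the receiver is the thing that already exists.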
Bug report here: https://bugs.python.org/issue39291
This /really/ irks me. Apparently it's too late to fix link_to(), so I'd
like to suggest we add a new hardlink_to() method that matches the
symlink_to() argument order. link_to() then becomes deprecated/undocumented.
Any thoughts?
Barney
Hi all,
It's finally time to schedule the last releases in Python 2's life. There will be two more releases of Python 2.7: Python 2.7.17 and Python 2.7.18.
Python 2.7.17 release candidate 1 will happen on October 5th followed by the final release on October 19th.
I'm going to time Python 2.7.18 to coincide with PyCon 2020 in April, so attendees can enjoy some collective catharsis. We'll still say January 1st is the official EOL date.
Thanks to Sumana Harihareswara, there's now a FAQ about the Python 2 sunset on the website: https://www.python.org/doc/sunset-python-2/
Regards,
Benjamin
Browser Link: https://www.python.org/dev/peps/pep-0616/
PEP: 616
Title: String methods to remove prefixes and suffixes
Author: Dennis Sweeney <sweeney.dennis650(a)gmail.com>
Sponsor: Eric V. Smith <eric(a)trueblade.com>
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 19-Mar-2020
Python-Version: 3.9
Post-History: 30-Aug-2002
Abstract
========
This is a proposal to add two new methods, ``cutprefix`` and
``cutsuffix``, to the APIs of Python's various string objects. In
particular, the methods would be added to Unicode ``str`` objects,
binary ``bytes`` and ``bytearray`` objects, and
``collections.UserString``.
If ``s`` is one of these objects, and ``s`` has ``pre`` as a prefix, then
``s.cutprefix(pre)`` returns a copy of ``s`` in which that prefix has
been removed. If ``s`` does not have ``pre`` as a prefix, an
unchanged copy of ``s`` is returned. In summary, ``s.cutprefix(pre)``
is roughly equivalent to ``s[len(pre):] if s.startswith(pre) else s``.
The behavior of ``cutsuffix`` is analogous: ``s.cutsuffix(suf)`` is
roughly equivalent to
``s[:-len(suf)] if suf and s.endswith(suf) else s``.
Rationale
=========
There have been repeated issues [#confusion]_ on the Bug Tracker
and StackOverflow related to user confusion about the existing
``str.lstrip`` and ``str.rstrip`` methods. These users are typically
expecting the behavior of ``cutprefix`` and ``cutsuffix``, but they
are surprised that the parameter for ``lstrip`` is interpreted as a
set of characters, not a substring. This repeated issue is evidence
that these methods are useful, and the new methods allow a cleaner
redirection of users to the desired behavior.
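The confusion can be reproduced with a short sketch. The two helper functions below are stand-ins written for this illustration (the proposed methods do not exist yet), and the sample strings are arbitrary:

```python
def cutprefix(s, pre):
    # Stand-in for the proposed str.cutprefix.
    return s[len(pre):] if s.startswith(pre) else s


def cutsuffix(s, suf):
    # Stand-in for the proposed str.cutsuffix.
    return s[:-len(suf)] if suf and s.endswith(suf) else s


# rstrip/lstrip treat the argument as a *set of characters* ...
stripped = "happy.py".rstrip(".py")        # removes every trailing '.', 'p', 'y'
lstripped = "test_tests".lstrip("test_")   # removes every leading 't', 'e', 's', '_'

# ... whereas the proposed methods remove one exact substring.
cut = cutsuffix("happy.py", ".py")
lcut = cutprefix("test_tests", "test_")

print(repr(stripped), repr(cut))     # 'ha' 'happy'
print(repr(lstripped), repr(lcut))   # '' 'tests'
```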
As another testimonial for the usefulness of these methods, several
users on Python-Ideas [#pyid]_ reported frequently including similar
functions in their own code for productivity. The implementation
often contained subtle mistakes regarding the handling of the empty
string (see `Specification`_).
Specification
=============
The builtin ``str`` class will gain two new methods with roughly the
following behavior::
    def cutprefix(self: str, pre: str, /) -> str:
        if self.startswith(pre):
            return self[len(pre):]
        return self[:]
    def cutsuffix(self: str, suf: str, /) -> str:
        if suf and self.endswith(suf):
            return self[:-len(suf)]
        return self[:]
The only difference between the real implementation and the above is
that, as with other string methods like ``replace``, the
methods will raise a ``TypeError`` if any of ``self``, ``pre`` or
``suf`` is not an instance of ``str``, and will cast subclasses of
``str`` to builtin ``str`` objects.
Note that without the check for the truthiness of ``suf``,
``s.cutsuffix('')`` would be mishandled and always return the empty
string due to the unintended evaluation of ``self[:-0]``.
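The pitfall is easy to check with plain functions. In this sketch, ``cutsuffix_unchecked`` is a deliberately broken variant written only for illustration; neither function is the real implementation:

```python
def cutsuffix_unchecked(s, suf):
    # Deliberately omits the "suf and" guard to show the failure mode.
    if s.endswith(suf):
        return s[:-len(suf)]   # with suf == '' this is s[:-0], i.e. s[:0]
    return s[:]


def cutsuffix(s, suf):
    # Guarded version, following the sketch in the specification.
    if suf and s.endswith(suf):
        return s[:-len(suf)]
    return s[:]


broken = cutsuffix_unchecked("foobar", "")
guarded = cutsuffix("foobar", "")
print(repr(broken), repr(guarded))   # '' 'foobar'
```

Since ``-0 == 0``, the slice ``s[:-0]`` is ``s[:0]``, which is empty; every string ends with the empty string, so the unguarded branch is always taken.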
Methods with the corresponding semantics will be added to the builtin
``bytes`` and ``bytearray`` objects. If ``b`` is either a ``bytes``
or ``bytearray`` object, then ``b.cutsuffix()`` and ``b.cutprefix()``
will accept any bytes-like object as an argument.
Note that the ``bytearray`` methods return a copy of ``self``; they do
not operate in place.
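A rough sketch of the intended semantics for the binary types follows. The helper ``cutprefix`` here is hypothetical and written as a free function; converting the argument with ``bytes()`` is merely one simple way to accept bytes-like objects and is not claimed to be how the real methods would do it:

```python
def cutprefix(b, pre):
    # Hypothetical helper mirroring the str sketch for bytes/bytearray.
    pre = bytes(pre)           # accept any bytes-like object, e.g. a memoryview
    if b.startswith(pre):
        return b[len(pre):]
    return b[:]                # for a bytearray, slicing makes a new object


buf = bytearray(b"foobar")
trimmed = cutprefix(buf, memoryview(b"foo"))
untouched = cutprefix(buf, b"baz")

print(trimmed)                                  # bytearray(b'bar')
print(untouched == buf, untouched is buf)       # True False
print(buf)                                      # bytearray(b'foobar')
```

The original ``bytearray`` is left unmodified in both cases, which is the "copy, not in place" behavior described above.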
The following behavior is considered a CPython implementation detail,
and is not guaranteed by this specification::
    >>> x = 'foobar' * 10**6
    >>> x.cutprefix('baz') is x is x.cutsuffix('baz')
    True
    >>> x.cutprefix('') is x is x.cutsuffix('')
    True
That is, for CPython's immutable ``str`` and ``bytes`` objects, the
methods return the original object when the affix is not found or if
the affix is empty. Because these types test for equality using
shortcuts for identity and length, the following equivalent
expressions are evaluated at approximately the same speed, for any
``str`` objects (or ``bytes`` objects) ``x`` and ``y``::
    >>> (True, x[len(y):]) if x.startswith(y) else (False, x)
    >>> (True, z) if x != (z := x.cutprefix(y)) else (False, x)
The two methods will also be added to ``collections.UserString``,
where they rely on the implementation of the new ``str`` methods.
Motivating examples from the Python standard library
====================================================
The examples below demonstrate how the proposed methods can make code
one or more of the following:
Less fragile:
    The code will not depend on the user to count the length of a
    literal.
More performant:
    The code does not require a call to the Python built-in
    ``len`` function.
More descriptive:
    The methods give a higher-level API for code readability, as
    opposed to the traditional method of string slicing.
refactor.py
-----------
- Current::
    if fix_name.startswith(self.FILE_PREFIX):
        fix_name = fix_name[len(self.FILE_PREFIX):]
- Improved::
    fix_name = fix_name.cutprefix(self.FILE_PREFIX)
c_annotations.py
----------------
- Current::
    if name.startswith("c."):
        name = name[2:]
- Improved::
    name = name.cutprefix("c.")
find_recursionlimit.py
----------------------
- Current::
    if test_func_name.startswith("test_"):
        print(test_func_name[5:])
    else:
        print(test_func_name)
- Improved::
    print(test_func_name.cutprefix("test_"))
deccheck.py
-----------
This is an interesting case because the author chose to use the
``str.replace`` method in a situation where only a prefix was
intended to be removed.
- Current::
    if funcname.startswith("context."):
        self.funcname = funcname.replace("context.", "")
        self.contextfunc = True
    else:
        self.funcname = funcname
        self.contextfunc = False
- Improved::
    if funcname.startswith("context."):
        self.funcname = funcname.cutprefix("context.")
        self.contextfunc = True
    else:
        self.funcname = funcname
        self.contextfunc = False
- Arguably further improved::
    self.contextfunc = funcname.startswith("context.")
    self.funcname = funcname.cutprefix("context.")
test_i18n.py
------------
- Current::
    if creationDate.endswith('\\n'):
        creationDate = creationDate[:-len('\\n')]
- Improved::
    creationDate = creationDate.cutsuffix('\\n')
shared_memory.py
----------------
- Current::
    reported_name = self._name
    if _USE_POSIX and self._prepend_leading_slash:
        if self._name.startswith("/"):
            reported_name = self._name[1:]
    return reported_name
- Improved::
    if _USE_POSIX and self._prepend_leading_slash:
        return self._name.cutprefix("/")
    return self._name
build-installer.py
------------------
- Current::
    if archiveName.endswith('.tar.gz'):
        retval = os.path.basename(archiveName[:-7])
        if ((retval.startswith('tcl') or retval.startswith('tk'))
                and retval.endswith('-src')):
            retval = retval[:-4]
- Improved::
    if archiveName.endswith('.tar.gz'):
        retval = os.path.basename(archiveName[:-7])
        if retval.startswith(('tcl', 'tk')):
            retval = retval.cutsuffix('-src')
Depending on personal style, ``archiveName[:-7]`` could also be
changed to ``archiveName.cutsuffix('.tar.gz')``.
test_core.py
------------
- Current::
    if output.endswith("\n"):
        output = output[:-1]
- Improved::
    output = output.cutsuffix("\n")
cookiejar.py
------------
- Current::
    def strip_quotes(text):
        if text.startswith('"'):
            text = text[1:]
        if text.endswith('"'):
            text = text[:-1]
        return text
- Improved::
    def strip_quotes(text):
        return text.cutprefix('"').cutsuffix('"')
- Current::
    if line.endswith("\n"): line = line[:-1]
- Improved::
    line = line.cutsuffix("\n")
fixdiv.py
---------
- Current::
    def chop(line):
        if line.endswith("\n"):
            return line[:-1]
        else:
            return line
- Improved::
    def chop(line):
        return line.cutsuffix("\n")
test_concurrent_futures.py
--------------------------
In the following example, the meaning of the code changes slightly,
but in context, it behaves the same.
- Current::
    if name.endswith(('Mixin', 'Tests')):
        return name[:-5]
    elif name.endswith('Test'):
        return name[:-4]
    else:
        return name
- Improved::
    return name.cutsuffix('Mixin').cutsuffix('Tests').cutsuffix('Test')
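To make the change in meaning concrete, here is a small sketch comparing the two versions. ``cutsuffix`` is a stand-in helper for the proposed method, ``current`` and ``improved`` wrap the two snippets above, and the class names are invented for illustration:

```python
def cutsuffix(s, suf):
    # Stand-in for the proposed str.cutsuffix.
    return s[:-len(suf)] if suf and s.endswith(suf) else s


def current(name):
    if name.endswith(('Mixin', 'Tests')):
        return name[:-5]
    elif name.endswith('Test'):
        return name[:-4]
    else:
        return name


def improved(name):
    return cutsuffix(cutsuffix(cutsuffix(name, 'Mixin'), 'Tests'), 'Test')


# Names with a single recognized suffix behave identically ...
same = [current(n) == improved(n)
        for n in ('ThreadPoolMixin', 'ExecutorTests', 'WaitTest', 'Plain')]

# ... but the chained version keeps stripping when suffixes are stacked.
stacked = (current('FooTestMixin'), improved('FooTestMixin'))
print(same, stacked)   # [True, True, True, True] ('FooTest', 'Foo')
```

The original removes at most one suffix, while the chained calls may remove up to three; the two agree only as long as no name stacks several of these suffixes.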
msvc9compiler.py
----------------
- Current::
    if value.endswith(os.pathsep):
        value = value[:-1]
- Improved::
    value = value.cutsuffix(os.pathsep)
test_pathlib.py
---------------
- Current::
    self.assertTrue(r.startswith(clsname + '('), r)
    self.assertTrue(r.endswith(')'), r)
    inner = r[len(clsname) + 1 : -1]
- Improved::
    self.assertTrue(r.startswith(clsname + '('), r)
    self.assertTrue(r.endswith(')'), r)
    inner = r.cutprefix(clsname + '(').cutsuffix(')')
Rejected Ideas
==============
Expand the lstrip and rstrip APIs
---------------------------------
Because ``lstrip`` takes a string as its argument, it could be viewed
as taking an iterable of length-1 strings. The API could therefore be
generalized to accept any iterable of strings, which would be
successively removed as prefixes. While this behavior would be
consistent, it would not be obvious for users to have to call
``'foobar'.lstrip(('foo',))`` for the common use case of a
single prefix.
Allow multiple prefixes
-----------------------
Some users discussed the desire to be able to remove multiple
prefixes, calling, for example, ``s.cutprefix('From: ', 'CC: ')``.
However, this adds ambiguity about the order in which the prefixes are
removed, especially in cases like ``s.cutprefix('Foo', 'FooBar')``.
After this proposal, this can be spelled explicitly as
``s.cutprefix('Foo').cutprefix('FooBar')``.
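The ambiguity shows up as soon as one prefix is itself a prefix of another. In this sketch, ``cutprefix`` is a stand-in function for the proposed method and the sample string is arbitrary:

```python
def cutprefix(s, pre):
    # Stand-in for the proposed str.cutprefix.
    return s[len(pre):] if s.startswith(pre) else s


s = "FooBarBaz"

# Shorter prefix first: 'Foo' is removed, then 'FooBar' no longer matches.
short_first = cutprefix(cutprefix(s, "Foo"), "FooBar")

# Longer prefix first: 'FooBar' is removed, then 'Foo' no longer matches.
long_first = cutprefix(cutprefix(s, "FooBar"), "Foo")

print(short_first, long_first)   # BarBaz Baz
```

Chaining single-prefix calls forces the caller to pick one of these orders explicitly.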
Remove multiple copies of a prefix
----------------------------------
This is the behavior that would be consistent with the aforementioned
expansion of the ``lstrip/rstrip`` API -- repeatedly applying the
function until the argument is unchanged. This behavior is attainable
from the proposed behavior via the following::
    >>> s = 'foo' * 100 + 'bar'
    >>> while s != (s := s.cutprefix("foo")): pass
    >>> s
    'bar'
The above can be modified by chaining multiple ``cutprefix`` calls
together to achieve the full behavior of the ``lstrip``/``rstrip``
generalization, while being explicit in the order of removal.
While the proposed API could later be extended to include some of
these use cases, to do so before any observation of how these methods
are used in practice would be premature and may lead to choosing the
wrong behavior.
Raising an exception when not found
-----------------------------------
There was a suggestion that ``s.cutprefix(pre)`` should raise an
exception if ``not s.startswith(pre)``. However, this does not match
the behavior and feel of other string methods. A ``required=False``
keyword could be added, but this violates the KISS
principle.
Alternative Method Names
------------------------
Several alternative method names have been proposed. Some are listed
below, along with commentary for why they should be rejected in favor
of ``cutprefix`` (the same arguments hold for ``cutsuffix``).
``ltrim``
    "Trim" does in other languages (e.g. JavaScript, Java, Go,
    PHP) what ``strip`` methods do in Python.
``lstrip(string=...)``
    This would avoid adding a new method, but for different
    behavior, it's better to have two different methods than one
    method with a keyword argument that selects the behavior.
``cut_prefix``
    All of the other methods of the string API, e.g.
    ``str.startswith()``, use ``lowercase`` rather than
    ``lower_case_with_underscores``.
``cutleft``, ``leftcut``, or ``lcut``
    The explicitness of "prefix" is preferred.
``removeprefix``, ``deleteprefix``, ``withoutprefix``, etc.
    All of these might have been acceptable, but they have more
    characters than ``cut``. Some suggested that the verb "cut"
    implies mutability, but the string API already contains verbs
    like "replace", "strip", "split", and "swapcase".
``stripprefix``
    Users may benefit from the mnemonic that "strip" means working
    with sets of characters, while other methods work with
    substrings, so re-using "strip" here should be avoided.
Reference Implementation
========================
See the pull request on GitHub [#pr]_.
References
==========
.. [#pr] GitHub pull request with implementation
(https://github.com/python/cpython/pull/18939)
.. [#pyid] Discussion on Python-Ideas
(https://mail.python.org/archives/list/python-ideas@python.org/thread/RJARZS…)
.. [#confusion] Comment listing Bug Tracker and StackOverflow issues
(https://mail.python.org/archives/list/python-ideas@python.org/message/GRGAF…)
Copyright
=========
This document is placed in the public domain or under the
CC0-1.0-Universal license, whichever is more permissive.
..
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
coding: utf-8
End:
Hi,
As an experiment, I thought I would try moving the thread state (what
you get from _PyThreadState_GET() ) to TLS.
https://github.com/python/cpython/compare/master...markshannon:threadstate_…
It works, passing all the tests, and seems sound.
It is a small patch (< 50 lines) and doesn't increase the overall code size.
My branch is GCC/Clang only, so will need a bit of extra code for
Windows. It should only need a few more lines; I haven't done it as I
don't have a Windows machine to test it on.
This is a *much* cleaner approach to removing the global variable than
adding lots of extra parameters all over the place.
Cheers,
Mark.
Hi,
The Python Steering Council accepts PEP 585 "Type Hinting Generics In
Standard Collections":
https://www.python.org/dev/peps/pep-0585/
Congrats Łukasz Langa for your tenacity! (PEP written one year ago.)
Thanks also to everyone who was involved in the discussion to help to
get a better PEP ;-)
It seems Guido van Rossum is already working on implementing the PEP:
* https://github.com/python/cpython/pull/18239
* https://bugs.python.org/issue39481
Victor
--
Night gathers, and now my watch begins. It shall not end until my death.
The steering council wants to remind folks that if you have witnessed or
experienced any conduct that you think may go against the PSF Code of
Conduct, please report those incidents to conduct(a)python.org. This
includes reporting micro-aggressions like feeling dismissed, so that any
pattern of such behaviour can be detected and handled as a larger issue. If
you are on the fence about reporting something, we encourage you to report
the incident and let the Conduct WG decide how to handle the
report.
https://www.python.org/psf/conduct/
https://www.python.org/psf/conduct/reporting/
Serhiy had the idea of having Enum._convert also modify the __str__ and
__repr__ of newly created enumerations to display the module name instead
of the enumeration name (https://bugs.python.org/msg325007):
--> socket.AF_UNIX
<AddressFamily.AF_UNIX: 1> ==> <socket.AF_UNIX: 1>
--> print(socket.AF_UNIX)
AddressFamily.AF_UNIX ==> socket.AF_UNIX
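A rough sketch of what the resulting output would look like. This is a hand-written stand-in with the module name hard-coded in a made-up MODULE_NAME constant; it is not how Enum._convert works and not a proposed implementation:

```python
import enum

MODULE_NAME = "socket"  # assumption: the module the constants were converted from


class AddressFamily(enum.IntEnum):
    AF_UNIX = 1

    def __repr__(self):
        # Show the module name instead of the enumeration's class name.
        return f"<{MODULE_NAME}.{self.name}: {self.value}>"

    def __str__(self):
        return f"{MODULE_NAME}.{self.name}"


print(repr(AddressFamily.AF_UNIX))   # <socket.AF_UNIX: 1>
print(AddressFamily.AF_UNIX)         # socket.AF_UNIX
```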
Thoughts?
--
~Ethan~