I'm not sure if this has been asked / suggested before.
I'm wondering if there is any interest in conditional or loop-based `with`
statements. I think it could be done without a syntax change.
### Napkin proposal
Context managers could define `__if__` or `__for__`, and if those dunder
methods were defined, then the `with` statement would either behave like a
conditional or a loop.
If `__if__` was defined then
```
with Ctx():
    print('hi')
```
would only print `hi` if `__if__` returned True. This doesn't require a
syntax change.
The `__for__` variant would likely need a minor syntax change.
```
with item in Ctx():
    print(item)
```
The `__for__` method is a generator that generates arguments of a loop. The
item will be printed as many times as there are items generated by
`__for__`.
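For anyone who wants to experiment, the intended behavior can be approximated today with plain helper functions. This is only a sketch: `run_conditional_with`, `run_loop_with`, `Flag`, and `Countdown` are hypothetical names, and it assumes that `__if__` is consulted after `__enter__` and that a single `__enter__`/`__exit__` pair wraps the whole loop. The napkin proposal does not pin down either point.

```python
class Flag:
    """Hypothetical context manager that opts in to the proposed `__if__`."""

    def __init__(self, enabled):
        self.enabled = enabled

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # never swallow exceptions

    def __if__(self):
        return self.enabled


class Countdown:
    """Hypothetical context manager that opts in to the proposed `__for__`."""

    def __init__(self, start):
        self.start = start

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False

    def __for__(self):
        # Generates the loop arguments: start, start - 1, ..., 1
        yield from range(self.start, 0, -1)


def run_conditional_with(ctx, body):
    """Emulates `with ctx: body` when ctx defines `__if__` (assumed semantics)."""
    with ctx as value:
        if ctx.__if__():
            body(value)


def run_loop_with(ctx, body):
    """Emulates `with item in ctx: body` when ctx defines `__for__` (assumed semantics)."""
    with ctx:
        for item in ctx.__for__():
            body(item)


seen = []
run_conditional_with(Flag(False), lambda ctx: seen.append("never"))
run_conditional_with(Flag(True), lambda ctx: seen.append("hi"))
run_loop_with(Countdown(3), seen.append)
print(seen)  # ['hi', 3, 2, 1]
```

The body has to be passed as a callable here because today's `with` statement has no way to skip or repeat its block, which is exactly the gap the proposal is about.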
### Use Cases
This would simplify usage of my Timerit and ubelt libraries.
The timerit module defines ``timerit.Timerit``, which is an iterable
object. Its ``__iter__`` method generates ``timerit.TimerTimer`` objects,
which are context managers.
    >>> import math
    >>> from timerit import Timerit
    >>> for timer in Timerit(num=200, verbose=2):
    >>>     with timer:
    >>>         math.factorial(10000)
The timer context manager measures how much time its body takes by
"tic"-ing on ``__enter__`` and "toc"-ing on ``__exit__``. The underlying
object has access to the context manager, so it is able to read its
measurement. These measurements are stored and then we compute some
statistics on them, notably the minimum, mean, and standard deviation of
grouped (batched) running times.
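To make the pattern concrete, here is a stripped-down sketch of an iterable that yields timer context managers. It is not the actual timerit implementation: `MiniTimer` and `MiniTimerit` are made-up names, and batching and the standard deviation are left out to keep it short.

```python
import math
import time


class MiniTimer:
    """Context manager that records how long its body took."""

    def __init__(self):
        self.elapsed = None

    def __enter__(self):
        self._tic = time.perf_counter()  # "tic"
        return self

    def __exit__(self, exc_type, exc, tb):
        self.elapsed = time.perf_counter() - self._tic  # "toc"
        return False


class MiniTimerit:
    """Iterable that hands out one MiniTimer per loop iteration."""

    def __init__(self, num):
        self.num = num
        self.times = []

    def __iter__(self):
        for _ in range(self.num):
            timer = MiniTimer()
            yield timer
            # Control returns here once the loop body has finished, so the
            # outer object can read the measurement from the timer.
            if timer.elapsed is not None:
                self.times.append(timer.elapsed)

    def min(self):
        return min(self.times)

    def mean(self):
        return sum(self.times) / len(self.times)


ti = MiniTimerit(num=5)
for timer in ti:
    with timer:
        math.factorial(1000)

print(f"min={ti.min():.6f}s mean={ti.mean():.6f}s over {len(ti.times)} runs")
```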
Unfortunately the syntax is one line and one indent bulkier than I would
prefer.
However, a more concise version of the syntax is available.
    >>> import math
    >>> from timerit import Timerit
    >>> for _ in Timerit(num=200, verbose=2):
    >>>     math.factorial(10000)
In this case the measurement is made in the `__iter__` method of the
``Timerit`` object itself, which I believe has slightly more overhead than
the with-statement version. (I should test to determine if this is the case.)
In the case where it does make a difference, a cool syntax might look like:
    >>> import math
    >>> from timerit import Timerit
    >>> with timer in Timerit(num=200, verbose=2):
    >>>     math.factorial(10000)
The other use case is my ``ubelt.Cacher`` class. Currently it requires 4
lines of boilerplate syntax.
    >>> import ubelt as ub
    >>> # Defines a cache name and dependencies, note the use of `ub.hash_data`.
    >>> cacher = ub.Cacher('name', cfgstr=ub.hash_data('dependencies'))  # boilerplate:1
    >>> # Calling tryload will return your data on a hit and None on a miss
    >>> data = cacher.tryload()  # boilerplate:2
    >>> # Check if you need to recompute your data
    >>> if data is None:  # boilerplate:3
    >>>     # Your code to recompute data goes here (this is not boilerplate).
    >>>     data = 'mydata'
    >>>     # Cache the computation result (pickle is used by default)
    >>>     cacher.save(data)  # boilerplate:4
But a conditional ``with`` syntax would reduce boilerplate to 3 lines.
    >>> import ubelt as ub
    >>> with ub.Cacher('name', cfgstr=ub.hash_data('dependencies')) as cacher:
    >>>     data = 'mydata'
    >>>     cacher.save(data)
    >>> data = cacher.data
I'm sure there are a lot of viable syntax variations, but does the idea of
a conditional or loop aware "with" statement seem like a reasonable
language proposal?
--
-Dr. Jon Crall (him)
Forgive me if this idea has been discussed before. I searched the mailing lists, the CPython repo, and the issue tracker and was unable to find anything.
I have found myself a few times in a position where I have a repeated argument that uses the `append` action, along with some convenience arguments that append a specific const to that same dest (e.g. `--filter-x` being made equivalent to `--filter x` via `append_const`). This is particularly useful in CLI apps that expose some kind of powerful-but-verbose filtering capability, while also providing shorter aliases for common invocations. I'm sure there are other use cases, but this is the one I'm most familiar with.
The natural extension of this filtering idea is convenience args that set two const values (e.g. `--filter-x-y` being equivalent to `--filter x --filter y`), but there is no `extend_const` action to enable this.
While this is possible (and rather straightforward) to add via a custom action, I feel like this should be a built-in action instead. `append` has `append_const`, so it seems intuitive and reasonable to expect `extend` to have `extend_const` too (anecdotally, the first time I came across this need I simply tried using `extend_const` without checking the docs, assuming it already existed).
Please see this gist for a working example that may help explain the idea and intended use case more clearly: https://gist.github.com/roganartu/7c2ec129d868ecda95acfbd655ef0ab2
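For readers who don't want to open the gist, a custom action along these lines could look roughly like the sketch below. This is my own minimal version and not necessarily what the gist contains: `ExtendConstAction` and `build_parser` are hypothetical names, and only documented `argparse.Action` hooks are used.

```python
import argparse


class ExtendConstAction(argparse.Action):
    """Hypothetical `extend_const`: extend dest with every item in const."""

    def __init__(self, option_strings, dest, const, **kwargs):
        # nargs=0 means the flag consumes no command-line values,
        # just like the built-in append_const action.
        super().__init__(option_strings, dest, nargs=0, const=const, **kwargs)

    def __call__(self, parser, namespace, values, option_string=None):
        # Copy first so a shared default list is never mutated in place.
        items = list(getattr(namespace, self.dest, None) or [])
        items.extend(self.const)
        setattr(namespace, self.dest, items)


def build_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument("--filter", dest="filters", action="append", default=[])
    parser.add_argument("--filter-x", dest="filters",
                        action="append_const", const="x")
    parser.add_argument("--filter-x-y", dest="filters",
                        action=ExtendConstAction, const=["x", "y"])
    return parser


args = build_parser().parse_args(["--filter", "a", "--filter-x-y"])
print(args.filters)  # ['a', 'x', 'y']
```

A built-in `extend_const` would make the custom class unnecessary and let the last `add_argument` call read `action="extend_const"`.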
The fact that Python does not use UTF-8 as the default encoding when
opening text files is an obstacle for many Windows users, especially
beginners in programming.
If you search for UnicodeDecodeError, you will see that many Windows
users have encountered the problem.
The following list is only a small part of the search results.
* https://qiita.com/Yuu94/items/9ffdfcb2c26d6b33792e
* https://www.mikan-partners.com/archives/3212
* https://teratail.com/questions/268749
* https://github.com/neovim/pynvim/issues/443
* https://www.coder.work/article/1284080
* https://teratail.com/questions/271375
* https://qiita.com/shiroutosan/items/51358b24b0c3defc0f58
* https://github.com/jsvine/pdfplumber/issues/304
* https://ja.stackoverflow.com/questions/69281/73612
* https://trend-tracer.com/pip-error/
Looking at the errors, the following are the most common cases.
* UnicodeDecodeError is raised when trying to open a text file written
in UTF-8, such as JSON.
* UnicodeEncodeError is raised when trying to save text data retrieved
from the web, etc.
* A user runs `pip install` and `setup.py` reads a README.md or LICENSE
  file written in UTF-8 without `encoding="UTF-8"`.
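The first two cases can be reproduced on any platform by naming a legacy encoding explicitly. In the sketch below, `cp1252` is only a stand-in for whatever ANSI code page a Windows machine uses by default, and `demo` is a made-up helper name.

```python
import os
import tempfile

LEGACY = "cp1252"  # assumption: stand-in for a legacy Windows ANSI code page
TEXT = '{"greeting": "こんにちは"}'


def demo():
    results = {}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.json")
        with open(path, "w", encoding="utf-8") as f:
            f.write(TEXT)

        # Case 1: reading a UTF-8 file with the legacy encoding.
        try:
            with open(path, encoding=LEGACY) as f:
                f.read()
            results["read"] = "ok"
        except UnicodeDecodeError:
            results["read"] = "UnicodeDecodeError"

        # Case 2: saving non-ASCII text with the legacy encoding.
        try:
            with open(os.path.join(tmp, "out.txt"), "w", encoding=LEGACY) as f:
                f.write(TEXT)
            results["write"] = "ok"
        except UnicodeEncodeError:
            results["write"] = "UnicodeEncodeError"

        # Passing encoding="utf-8" explicitly avoids both problems.
        with open(path, encoding="utf-8") as f:
            results["explicit"] = f.read()
    return results


print(demo())
```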
Users can use UTF-8 mode to solve these problems.
I wrote a section for UTF-8 mode in the "3. Using Python on Windows" document.
https://docs.python.org/3/using/windows.html#utf-8-mode
However, UTF-8 mode is still not very well known. How can we make
UTF-8 mode more user-friendly?
Right now, UTF-8 mode can be enabled using the `-Xutf8` option or the
`PYTHONUTF8` environment variable. This is a hurdle for beginners. In
particular, Jupyter users may not use the command line at all.
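As a quick way to check whether either switch took effect, `sys.flags.utf8_mode` can be inspected in a child interpreter. The helper name `utf8_mode_flag` is made up for this sketch.

```python
import os
import subprocess
import sys


def utf8_mode_flag(extra_args=(), extra_env=None):
    """Start a child interpreter and report its sys.flags.utf8_mode value."""
    env = dict(os.environ)
    env.pop("PYTHONUTF8", None)  # start from a clean slate
    env.update(extra_env or {})
    cmd = [sys.executable, *extra_args,
           "-c", "import sys; print(sys.flags.utf8_mode)"]
    out = subprocess.run(cmd, env=env, capture_output=True, text=True, check=True)
    return int(out.stdout.strip())


print("with -X utf8:     ", utf8_mode_flag(["-X", "utf8"]))
print("with PYTHONUTF8=1:", utf8_mode_flag(extra_env={"PYTHONUTF8": "1"}))
print("with neither:     ", utf8_mode_flag())
```

The value printed for the last line depends on the platform, locale, and Python version, which is part of why the setting is hard for beginners to reason about.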
Is it possible to enable UTF-8 mode in a configuration file like `pyvenv.cfg`?
* Users could enable UTF-8 mode per install and per venv.
* But it is difficult to write the settings file when Python is installed
  system-wide (not per user), or when it is the Windows Store Python.
* Users could still enable UTF-8 mode in a venv, but many beginners don't
  need a venv.
Is it possible to make it easier to configure?
* Put a checkbox in the installer?
* Provide a small tool to allow configuration after installation?
  * python3 -m utf8mode enable|disable?
    * Accessible only for CLI users
  * Add "Enable UTF-8 mode" and "Disable UTF-8 mode" to Start menu?
Any ideas are welcome.
--
Inada Naoki <songofacandy(a)gmail.com>
Hello,
I'm Francis, a beginner programmer from Ghana, West Africa.
I was wondering why the list .index() method doesn't return negative
indices as well as positive indices.
Although ambiguity will make it difficult to implement, in certain cases,
it would be useful if it could return the negative index.
For instance, if one creates an if statement that checks whether an element
is the last item in a list as follows:
    listy = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    if listy.index(10) == -1:
        print("Monty Python")
I understand that the same effect can be achieved with the index notation -
as in
    if listy[-1] == 10:
        print("Monty Python")
- but the way that came naturally to me was to use the .index method rather
than index notation, and it took a very long time for me to figure out why
my code was not working (mostly because I'm a beginner).
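For what it's worth, here is a small sketch of what happens today and how a negative index could be derived by hand. `negative_index` is just a made-up helper name, not something in the standard library.

```python
listy = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# .index() always returns the non-negative position of the first match,
# so the comparison with -1 can never be true.
print(listy.index(10))        # 9
print(listy.index(10) == -1)  # False


def negative_index(seq, value):
    """Return the position of the first match, counted from the end."""
    return seq.index(value) - len(seq)


print(negative_index(listy, 10))      # -1
print(negative_index([7, 8, 7], 7))   # -3, because the first 7 is matched
```

The last line shows the ambiguity: with duplicates, the first match is not the one nearest the end.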
So do what you will, I guess. :)
Hello,
This is a great PEP. It turns out to be applicable in a variety of scenarios.
Case in point: Matthew Rahtz and I are working on PEP 646: Variadic Generics (https://www.python.org/dev/peps/pep-0646/; discussion on typing-sig). It is a type system feature that allows specifying an arbitrary tuple of type variables instead of a single type variable.
We plan to use your proposed syntax to represent unpacking a tuple type. This would be analogous to `*` for unpacking a tuple value:
+ `Tensor[int, *Ts, str]` and `Tensor[*Ts1, *Ts2]`
+ such variadic classes would be declared as `class Tensor(Generic[T, *Ts, T2]):`
Questions:
1. Does PEP 637 support unpacking multiple `*` arguments?
- e.g., Tensor[int, *Ts, str, *Ts2]
2. Does PEP 637 allow a positional argument after a `*`?
- e.g., Generic[T, *Ts, T2]
PEP 637 says "Sequence unpacking is allowed inside subscripts", so it looks like these should be allowed (just as in function calls). But I wanted to confirm it explicitly since this is our core use case and there was no example with multiple sequences being unpacked.
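To illustrate the function-call analogy we have in mind, the sketch below uses only syntax that is valid today: multiple `*` unpackings and a positional argument after a `*` in a call, plus the same thing inside a subscript written with an explicit tuple. `collect` and `Echo` are throwaway names for this example, and plain tuples stand in for `Ts1` and `Ts2`.

```python
def collect(*args):
    """Return the positional arguments exactly as the call site passed them."""
    return args


class Echo:
    """Returns whatever key it is subscripted with."""

    def __getitem__(self, key):
        return key


ts1 = (float, bytes)  # stand-in for *Ts1
ts2 = (bool,)         # stand-in for *Ts2

# Function call: two unpackings, with a positional argument between them.
from_call = collect(int, *ts1, str, *ts2)

# Subscript with an explicit tuple display, which already accepts stars.
from_subscript = Echo()[(int, *ts1, str, *ts2)]

print(from_call == from_subscript)  # True
```

Our hope is that `Echo()[int, *ts1, str, *ts2]`, without the extra parentheses, would produce the same key.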
3. We also wanted to ask - how's your implementation going?
We'll be starting implementation in typing.py soon. Since there's some overlap we wanted to make sure we're not duplicating your work, and that there won't be any merge conflicts later. Do you have a fork we might be able to get early access to? We're also targeting the 3.10 release for our implementation.
I'd be happy to provide additional details if needed.
Best,
Pradeep Kumar Srinivasan
Matthew Rahtz
Currently, Python allows variable documentation via PEP 526
<https://www.python.org/dev/peps/pep-0526/>. For most functions with short
parameter lists that can fit in a reasonable column limit, I prefer the
traditional declaration style with Google-style doc strings:
    def connect_to_next_port(self, minimum: int) -> int:
        """Connects to the next available port.

        Args:
            minimum: A port value greater or equal to 1024.

        Returns:
            The new minimum port.

        Raises:
            ConnectionError: If no available port is found.
        """
        ...code...
However, when a signature gets too long, I prefer to list the parameters
vertically:
    def request(
            method: Method, url: Str,
            params: Dict = None, data: Dict = None, json: Str = None,
            headers: Dict = None, cookies: Dict = None,
            files: Dict = None, ...) -> Response:
        """Constructs and sends a Request

        Args: ...
        """
In which case, it would be nice to in-line some documentation instead of
repeating the whole parameter list in the doc string. Something like:
    def request(
            method: Method,
            # method for the new Request: ``GET``, ``POST``, etc.
            url: Str,
            # URL for the request
            params: Dict = None,
            ...) -> Response:
        """Constructs and sends a Request
        """
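For comparison, comments inside a signature are already legal but are invisible at runtime. One existing way to keep a description next to each parameter and still read it back is `typing.Annotated` (Python 3.9+). The sketch below is only an illustration of that alternative, not the syntax I am proposing; `param_docs` is a made-up helper and the function body is a stub.

```python
from typing import Annotated, get_type_hints


def request(
    method: Annotated[str, "method for the new Request: GET, POST, etc."],
    url: Annotated[str, "URL for the request"],
) -> str:
    """Constructs and sends a Request (stub body for illustration only)."""
    return f"{method} {url}"


def param_docs(func):
    """Collect the first Annotated metadata entry of each annotated name."""
    hints = get_type_hints(func, include_extras=True)
    return {
        name: hint.__metadata__[0]
        for name, hint in hints.items()
        if hasattr(hint, "__metadata__")
    }


for name, doc in param_docs(request).items():
    print(f"{name}: {doc}")
```

This works, but it is noisier than a plain inline comment, which is why dedicated in-line documentation would still be nice.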
Hi, all.
I am rewriting PEP 597 to introduce a new EncodingWarning, which is a
subclass of DeprecationWarning and is used to warn about a future change
of the default encoding.
But I don't think we can change the default encoding of
`io.TextIOWrapper` and built-in `open()` anytime soon. It is a
disruptive change. It may take 10 or more years.
To ease the pain caused by "default encoding is not UTF-8 (almost)
only on Windows" (*), I came up with another idea. This idea is not
mutually exclusive with PEP 597, but I want to include it in the PEP
because both ideas use EncodingWarning.
(*) Imagine that a new Python user writes a text file with notepad.exe
(whose default encoding is already UTF-8 without BOM) or VS Code, and
tries to read it in Jupyter Notebook. They will see a UnicodeDecodeError.
They might not even know what an encoding is yet.
## 1. Add `io.open_text()`, builtin `open_text()`, and `pathlib.Path.open_text()`.
All functions are the same as `io.open()` or `Path.open()`, except:
* Default encoding is "utf-8".
* "b" is not allowed in the mode option.
These functions have two benefits:
* `open_text(filename)` is shorter than `open(filename,
  encoding="utf-8")`. It's easy to type, especially with autocompletion.
* The type annotation for the returned value is simpler than for `open`.
  It is always TextIOWrapper.
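A rough pure-Python sketch of what such a function might look like, built on today's `io.open()`. The parameter list is my assumption (the exact signature is not decided) and `roundtrip` is just a demo helper.

```python
import io
import os
import tempfile


def open_text(file, mode="r", buffering=-1, encoding="utf-8",
              errors=None, newline=None):
    """Like io.open(), but text-only and UTF-8 by default (sketch)."""
    if "b" in mode:
        raise ValueError("binary mode is not allowed in open_text()")
    return io.open(file, mode, buffering, encoding, errors, newline)


def roundtrip(text):
    """Write and re-read `text` without ever naming an encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "note.txt")
        with open_text(path, "w") as f:
            f.write(text)
        with open_text(path) as f:
            return type(f).__name__, f.read()


print(roundtrip("こんにちは"))
```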
## 2. Change the default encoding of `pathlib.Path.read_text()`.
For convenience and consistency with `Path.open_text()`, change the
default encoding of `Path.read_text()` to "utf-8" with a regular
deprecation period.
* Python 3.10: `Path.read_text()` emits EncodingWarning when the
  encoding option is omitted.
* Python 3.13: `Path.read_text()` changes the default encoding to "utf-8".
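The first step of that timeline could behave roughly like the wrapper below. This is only a sketch of the idea: `ProposedEncodingWarning` and the free function `read_text` are stand-ins I made up so the example does not depend on any particular Python version.

```python
import pathlib
import tempfile
import warnings


class ProposedEncodingWarning(DeprecationWarning):
    """Stand-in for the EncodingWarning described in this proposal."""


def read_text(path, encoding=None, errors=None):
    """Path.read_text() plus a warning when encoding is omitted (sketch)."""
    if encoding is None:
        warnings.warn(
            "read_text() without an explicit encoding will default to "
            "'utf-8' in a future version",
            ProposedEncodingWarning,
            stacklevel=2,
        )
    return pathlib.Path(path).read_text(encoding=encoding, errors=errors)


def demo():
    """Return the warning categories raised without and with an encoding."""
    with tempfile.TemporaryDirectory() as tmp:
        path = pathlib.Path(tmp, "hello.txt")
        path.write_text("hello", encoding="utf-8")
        with warnings.catch_warnings(record=True) as caught:
            warnings.simplefilter("always")
            read_text(path)                    # warns
            read_text(path, encoding="utf-8")  # silent
        return [w.category.__name__ for w in caught]


print(demo())
```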
If PEP 597 is accepted, users can pass `encoding="locale"` instead of
`encoding=locale.getpreferredencoding(False)` when they need to use
locale encoding.
We might change more places where the default encoding is used. But it
should be done slowly and carefully.
---
What do you think about this idea? Is it worth adding a new
built-in function?
Regards,
--
Inada Naoki <songofacandy(a)gmail.com>
Sorry for posting multiple threads so quickly.
Microsoft provides a per-process UTF-8 code page. It can be enabled with
a manifest file.
https://docs.microsoft.com/ja-jp/windows/uwp/design/globalizing/use-utf8-co…
How about providing both a "UTF-8 version" and an "ANSI version" of the
Python binaries?
This idea can provide a smoother transition of the default encoding.
1. Provide UTF-8 version since Python 3.10
2. (Some years later) Recommend UTF-8 version
3. (Some years later) Provide only UTF-8 version
4. (Some years later, maybe) Change the default encoding
The upsides of this idea are:
* We don't need to emit a warning for `open(filename)`.
* We can see the download stats.
In particular, the last point is a huge advantage compared to the current
UTF-8 mode (e.g. PYTHONUTF8=1).
We can know how many users need the legacy behavior in new Python
versions. That is very important information for us.
Of course, there are some downsides:
* Windows team needs to maintain more versions.
* More divisions for "Python on Windows" environment.
Regards,
--
Inada Naoki <songofacandy(a)gmail.com>
I've created a helper class in my own library that enhances the
existing dataclass:
a) __init__ accepts keyword-only arguments,
b) Optional[...] attribute without a specified default value would
default to None in __init__.
I think this could be useful in stdlib. I'm thinking a dataclass
decorator parameter like "init_kwonly" (default=False to provide
backward compatibility) that if True would implement this behavior.
Thoughts?
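To show the behavior I mean, here is a condensed sketch written as a decorator on top of the existing `dataclasses.dataclass`. It is a simplification, not the helper class from my library: `kwonly_dataclass` and `User` are hypothetical names, and it assumes Optional fields come after required ones so the generated `__init__` stays valid.

```python
import dataclasses
import functools
from typing import Optional, Union, get_args, get_origin, get_type_hints


def _is_optional(tp):
    """True for Optional[X], i.e. a Union that includes NoneType."""
    return get_origin(tp) is Union and type(None) in get_args(tp)


def kwonly_dataclass(cls):
    # (b) Optional[...] attributes without a default get None.
    for name, tp in get_type_hints(cls).items():
        if _is_optional(tp) and not hasattr(cls, name):
            setattr(cls, name, None)

    cls = dataclasses.dataclass(cls)
    generated_init = cls.__init__

    # (a) __init__ accepts keyword arguments only.
    @functools.wraps(generated_init)
    def __init__(self, **kwargs):
        generated_init(self, **kwargs)

    cls.__init__ = __init__
    return cls


@kwonly_dataclass
class User:
    name: str
    email: Optional[str]


print(User(name="ada"))
```

With an "init_kwonly" parameter on the stdlib decorator, the wrapper around `__init__` would not be needed.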
My previous thread was hijacked by the "auto guessing" idea, so I am
splitting off this thread for pathlib.
Path.open() was added in Python 3.4. Path.read_text() and
Path.write_text() were added in Python 3.5.
Their history is shorter than that of built-in open(). Changing their
default encoding should be easier than for built-in open() and TextIOWrapper.
New default encodings are:
* read_text() default encoding is "utf-8-sig"
* write_text() default encoding is "utf-8"
* open() default encoding is "utf-8-sig" when mode is "r" or None,
"utf-8" otherwise.
Of course, we need a regular deprecation period.
When encoding is omitted, they emit DeprecationWarning (or
EncodingWarning, which is a subclass of DeprecationWarning) for three
versions (Python 3.10 to 3.12).
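The reason for choosing "utf-8-sig" on the reading side is that it drops a leading BOM when there is one and otherwise behaves like "utf-8". A small sketch (`demo` is just a helper name for this example):

```python
import pathlib
import tempfile


def demo():
    """Read a BOM-prefixed and a plain UTF-8 file with both codecs."""
    with tempfile.TemporaryDirectory() as tmp:
        with_bom = pathlib.Path(tmp, "bom.txt")
        plain = pathlib.Path(tmp, "plain.txt")
        with_bom.write_bytes(b"\xef\xbb\xbfhello")  # UTF-8 BOM + text
        plain.write_bytes(b"hello")
        return {
            "bom/utf-8": with_bom.read_text(encoding="utf-8"),
            "bom/utf-8-sig": with_bom.read_text(encoding="utf-8-sig"),
            "plain/utf-8-sig": plain.read_text(encoding="utf-8-sig"),
        }


for label, text in demo().items():
    print(label, repr(text))
```

Writing with "utf-8-sig" would add a BOM, which is why write_text() and the non-read modes of open() get plain "utf-8" in the list above.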
What do you think about this idea?
Should we "change all at once" rather than "step-by-step"?
Regards,
--
Inada Naoki <songofacandy(a)gmail.com>