Hi,
I was refactoring some code today and ran into an issue that always bugs me with
Python modules. It bugged me enough this time that I spent an hour banging out this
potential proposal to add a new contextual keyword. Let me know what you think!
Theia
--------------------------------------------------------------------------------
A typical pattern for a python module is to have an __init__.py that looks
something like:
from .foo import (
    A,
    B,
    C,
)
from .bar import (
    D,
    E,
)

def baz():
    pass

__all__ = [
    "A",
    "B",
    "C",
    "D",
    "E",
    "baz",
]
This is annoying for a few reasons:
1. It requires name duplication
a. It's easy for the top-level imports to get out of sync with __all__,
meaning that __all__, instead of being useful for documentation, is
actively misleading
b. This encourages people to do `from .bar import *`, which screws up many
linting tools like flake8, since they can't introspect the names, and
also potentially allows definitions that have been deleted to
accidentally persist in __all__.
2. Many symbol-renaming tools won't pick up on the names in __all__, as they're
strings.
Prior art:
================================================================================
# Rust
Rust distinguishes between "use", which is a private import, "pub use", which is
a globally public import, and "pub(crate) use", which is a library-internal
import ("crate" is Rust's word for library)
# Javascript
In Javascript modules, there's an "export" keyword:
export function foo() { ... }
And there's a pattern called the "barrel export" that looks similar to a Python
import, but additionally exports the imported names:
export * from "./foo"; // re-exports all of foo's definitions
Additionally, a module can be gathered and exported by name, but not in one line:
import * as foo from "./foo";
export { foo };
# Python decorators
People have written utility Python decorators that allow exporting a single
function, such as this SO answer: https://stackoverflow.com/a/35710527/1159735
import sys

def export(fn):
    mod = sys.modules[fn.__module__]
    if hasattr(mod, '__all__'):
        mod.__all__.append(fn.__name__)
    else:
        mod.__all__ = [fn.__name__]
    return fn
This allows you to write:
@export
def foo():
    pass
# __all__ == ["foo"]
However, this doesn't allow re-exporting imported values.
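One could extend the decorator idea to cover re-exports by passing names
explicitly. A minimal sketch (CPython-specific, since it uses sys._getframe;
note it still duplicates the names as strings, which is exactly the problem
this proposal wants to remove):

import sys

def export_names(*names):
    # Append the given names to the calling module's __all__.
    g = sys._getframe(1).f_globals
    g.setdefault('__all__', []).extend(names)

# In __init__.py:
#     from .foo import A, B, C
#     export_names('A', 'B', 'C')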
# Python implicit behavior
Python already has a rule that, if __all__ isn't declared, all
non-underscore-prefixed names are automatically exported. This is /ok/, but it's
not very explicit (Zen) -- it's easy to accidentally "import sys" instead of
"import sys as _sys" -- it makes doing the wrong thing the default state.
Proposal:
================================================================================
Add a contextual keyword "export" that has meaning in three places:
1. Preceding an "import" statement, which directs all names imported by that
statement to be added to __all__:
import sys
export from . import foo
export from .bar import (
    A,
    B,
    C,
    D,
)
# __all__ == ["foo", "A", "B", "C", "D"]
2. Preceding a "def", "async def", or "class" keyword, directing that function
or class's name to be added to __all__:
def private(): pass
export def foo(): pass
export async def async_foo(): pass
export class Foo: pass
# __all__ == ["foo", "async_foo", "Foo"]
3. Preceding a bare name at top-level, directing that name to be added to
__all__:
x = 1
y = 2
export y
# __all__ == ["y"]
# Big Caveat
For this scheme to work, __all__ needs to not be auto-populated with names.
While the behavior is possibly surprising, I think the best way to handle this is
to have __all__ not auto-populate if an "export" keyword appears in the file.
While this is somewhat-implicit behavior, it seems reasonable to me to expect that
if a user uses "export", they are opting in to the new way of managing __all__.
Likewise, I think manually assigning __all__ when using "export" should raise
an error, as it would overwrite all previous exports and be very confusing.
Hello,
I've been chasing down various synchronization bugs in a large codebase
I'm working on.
In the process I began to realize how useful it would be to have some
sort of descriptor (a name if you will) attached to some of my
primitives.
In this code base, I've a number of threading.Event objects that get
passed around to check for conditions. In a perfect world, every
developer would have used the same nomenclature and been consistent
everywhere. Alas....
Currently there isn't a great way built into the language for me to say
"which Event is this <threading.Event object at 0x7f66da73ed60> ?"
unless I already have a mapping of the addresses to variable names in
my logs.
Would a `name=` keyword-only argument for things like Lock, Event,
Condition, Barrier, Semaphore, etc. be welcome (for threading,
multiprocessing, asyncio, etc.)?
Inside the code base, such an attribute would let me do things like:
mycondition.wait()
print(f'Met condition {mycondition.name}')
or
print(f'Waiting on barrier {mybarrier.name}')
mybarrier.wait()
And similar.
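As a stopgap today, one can subclass the primitives. A minimal sketch
(NamedEvent is a name I made up, not an existing class):

import threading

class NamedEvent(threading.Event):
    """An Event that carries a descriptive name for logging."""
    def __init__(self, name):
        super().__init__()
        self.name = name

    def __repr__(self):
        return f'<NamedEvent {self.name!r} set={self.is_set()}>'

shutdown_requested = NamedEvent('shutdown_requested')
print(repr(shutdown_requested))  # <NamedEvent 'shutdown_requested' set=False>

But doing this for every primitive across threading, multiprocessing, and
asyncio is exactly the kind of boilerplate a built-in `name=` would remove.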
Simplifying deep introspection into which sync primitives are which
feels like it would be a benefit as the async world grows within
Python.
Thoughts?
Pat
I have just been writing some code in which I define a custom log level with the logging module. I wanted to call the new log level as a method on the logger, but currently the logging library lacks a nice way to do this. I know the docs (https://docs.python.org/3/howto/logging.html#custom-levels) discourage creating custom log levels, so perhaps that is why there is no way to do this. This StackOverflow solution (https://stackoverflow.com/questions/2183233/how-to-add-a-custom-loglevel-to…) allows for the syntax I was after, but it is hard to understand for someone looking at the code.
import logging

CUSTOM_LEVEL = 31
CUSTOM_LEVEL_NAME = 'CUSTOM_WARNING'
logging.addLevelName(CUSTOM_LEVEL, CUSTOM_LEVEL_NAME)

def custom_warning(self, message, *args, **kwargs):
    if self.isEnabledFor(CUSTOM_LEVEL):
        self._log(CUSTOM_LEVEL, message, args, **kwargs)

logging.Logger.custom_warning = custom_warning

logger = logging.getLogger(__name__)
logger.warning('a warning here!')
logger.custom_warning('a custom warning!')
I would propose something like
logger.addCustomLevel(CUSTOM_LEVEL, CUSTOM_LEVEL_NAME)
There are some obvious limitations to this approach: all level names would also have to be valid method names, otherwise something like
logger.addCustomLevel(CUSTOM_LEVEL, "1 2 3")
would be an issue.
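A rough sketch of what addCustomLevel could do under the hood (this is a
hypothetical API; the method name and the lower-casing of the level name
are my assumptions):

import logging

def addCustomLevel(self, level, name):
    # Register the level name and attach a same-named logging
    # method to this logger instance.
    logging.addLevelName(level, name)

    def log_at_level(message, *args, **kwargs):
        if self.isEnabledFor(level):
            self._log(level, message, args, **kwargs)

    setattr(self, name.lower(), log_at_level)

logging.Logger.addCustomLevel = addCustomLevel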
What does everybody think?
Frequently, while globbing, one needs to work with multiple extensions. I’d
like to propose for fnmatch.filter to handle a tuple of patterns (while
preserving the single-pattern str functionality, à la str.endswith), as a
first step toward glob.glob and glob.iglob accepting multiple patterns as well.
Here is the implementation I came up with:
https://github.com/python/cpython/compare/master...andresdelfino:fnmatch-mu…
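For illustration, the proposed behavior is roughly equivalent to this
helper (a sketch, independent of the linked patch):

import fnmatch

def filter_multi(names, patterns):
    # Keep a name if it matches any of the given patterns.
    return [n for n in names
            if any(fnmatch.fnmatch(n, pat) for pat in patterns)]

filter_multi(['a.py', 'b.txt', 'c.md'], ('*.py', '*.md'))  # ['a.py', 'c.md']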
If this is deemed reasonable, I’ll write tests and documentation updates.
Any opinion?
Hi,
Python supports IPv4-mapped IPv6 addresses as defined by RFC 4038:
"the IPv6 address ::FFFF:x.y.z.w represents the IPv4 address x.y.z.w.”
The current behavior is as follows:
from ipaddress import ip_address
addr = ip_address('::ffff:8.8.4.4') # IPv6Address('::ffff:808:404')
addr.ipv4_mapped # IPv4Address('8.8.4.4')
Note that the textual representation of the IPv6Address is *not* in IPv4-mapped format.
It prints ::ffff:808:404 instead of ::ffff:8.8.4.4.
This is technically correct, but it’s somewhat frustrating as it makes it harder to read IPv4s embedded in IPv6 addresses.
My proposal would be to check, in __str__, whether an IPv6 address is IPv4-mapped, and to return the appropriate representation:
from ipaddress import ip_address
addr = ip_address('::ffff:8.8.4.4')
# Current behavior
str(addr) # '::ffff:808:404'
repr(addr) # IPv6Address('::ffff:808:404')
# Proposed behavior
str(addr) # '::ffff:8.8.4.4'
repr(addr) # IPv6Address('::ffff:8.8.4.4')
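The check itself is cheap, since IPv6Address already exposes ipv4_mapped.
A sketch of the proposed logic as a standalone helper (not the actual patch):

from ipaddress import IPv6Address

def ipv6_str(addr: IPv6Address) -> str:
    # Prefer the dotted-quad form when the address is IPv4-mapped.
    mapped = addr.ipv4_mapped
    if mapped is not None:
        return f'::ffff:{mapped}'
    return str(addr)  # fall back to the current representation

print(ipv6_str(IPv6Address('::ffff:8.8.4.4')))  # ::ffff:8.8.4.4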
A few data points:
- Julia prints ::ffff:808:404 (current behavior)
- C (glibc) and ClickHouse print ::ffff:8.8.4.4 (proposed behavior)
Any thoughts?
Maxime
Hi all,
Deno is a JavaScript runtime with a very nice feature: top-level await. I think it would be nice to have this feature in Python as well; it would make using async/await more convenient.
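For comparison, today every await has to live inside a coroutine that
something like asyncio.run drives (a sketch):

import asyncio

async def main():
    await asyncio.sleep(1)   # awaits must currently live inside a coroutine

asyncio.run(main())

# With top-level await, a script could presumably write just:
#     await asyncio.sleep(1)
# (Note: the asyncio REPL, `python -m asyncio`, already allows this
# interactively.)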
What do you think? Share your ideas, let's discuss...
Back in the late 90s (!) I worked on a reimagining of the Python
virtual machine as a register-based VM based on 1.5.2. I got part of
the way with that, but never completed it. In the early 2010s, Victor
Stinner got much further using 3.4 as a base. The idea (and dormant
code) has been lying around in my mind (and computers) these past
couple decades, so I took another swing at it starting in late 2019
after retirement, mostly as a way to keep my head in the game. While I
got a fair bit of the way, it stalled. I've picked it up and put it
down a number of times in the past year, often needing to resolve
conflicts because of churn in the current Python virtual machine.
Though I kept getting things back in sync, I realize this is not a
one-person project, at least not this one person. There are several
huge chunks of Python I've ignored over the past 20 years, and not
just the internals. (I've never used async anything, for example.) If
it is ever to truly be a viable demonstration of the concept, I will
need help. I forked the CPython repo and have a branch (register2) of
said fork which is currently synced up with the 3.10 (currently
master) branch:
https://github.com/smontanaro/cpython/tree/register2
I started on what could only very generously be called a PEP which you
can read here. It includes some of the history of this work as well as
details about what I've managed to do so far:
https://github.com/smontanaro/cpython/blob/register2/pep-9999.rst
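For readers unfamiliar with the distinction: CPython's current VM pushes
and pops operands on a stack, while a register VM names its operands
directly. A rough illustration (the register form below is hypothetical
notation, not actual output):

import dis

dis.dis("a + b")
# Stack machine (current CPython; exact opcodes vary by version):
#     LOAD_NAME   a
#     LOAD_NAME   b
#     BINARY_ADD        # pops two operands, pushes the result
#
# A register machine would express the same thing roughly as:
#     BINARY_ADD  r2, r0, r1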
If you think any of this is remotely interesting (whether or not you
think you'd like to help), please have a look at the "PEP". Because
this covers a fair bit of the CPython implementation, chances to
contribute in a number of areas exist, even if you have never delved
into Python's internals. Questions/comments/pull requests welcome.
Skip Montanaro
Hi,
(This is a great language, sincere thanks to all!)
I took a quick glance at https://devguide.python.org/langchanges/ (20.3.
Suggesting new features and language changes) while writing this post.
Please ignore this if similar proposals have been discussed earlier.
Suggestion:
1. Big Set, Big Dictionary -
At design time, I am considering this question: with limited RAM
(for example 4 GB), how can I store a dictionary/set larger than RAM?
Is it possible to introduce a Big Set / Big Dictionary that stores data on
hard disk (based on some partitioning mechanism, as in big data systems),
with the same access mechanism as the current set and dictionary?
Background:
I am working on machine learning NLP and would like to use such a feature
in a data generator.
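For what it's worth, the stdlib's shelve module already provides a
disk-backed, dict-like mapping that may cover part of this (keys must be
strings; the file name below is made up):

import shelve

# Values are pickled to a file on disk rather than held in RAM.
with shelve.open('big_mapping.db') as d:
    d['sample_key'] = list(range(1_000_000))
    print(len(d['sample_key']))  # 1000000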
Regards,
Vijay
I often find that Python lacks a nice way to say "only pass an argument
under this condition". (See previous python-list email in "Idea: Deferred
Default Arguments?")
Example 1: Defining a list with conditional elements
include_bd = True
current_way = ['a'] + (['b'] if include_bd else []) + ['c'] + (['d'] if include_bd else [])
new_way = ['a', 'b' if include_bd, 'c', 'd' if include_bd]
also_new_way = list('a', 'b' if include_bd, 'c', 'd' if include_bd)
Example 2: Deferring to defaults of called functions
def is_close(a, b, precision=1e-9):
    return abs(a - b) < precision

def approach(pose, target, step=0.1, precision=None):
    # Defers to the default precision if not otherwise specified
    # (using the proposed syntax):
    velocity = step * (target - pose) \
        if not is_close(pose, target, precision if precision is not None) \
        else 0
    return velocity
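For comparison, the usual workaround today is to build a kwargs dict so the
callee's own default applies. A sketch reusing is_close from Example 2
(approach_today is a name I made up):

def approach_today(pose, target, step=0.1, precision=None):
    # Only pass precision through when the caller actually supplied one.
    kwargs = {} if precision is None else {'precision': precision}
    if is_close(pose, target, **kwargs):
        return 0
    return step * (target - pose)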
Not sure if this has been discussed, but I cannot see any clear downside to
adding this, and it has some clear benefits (duplicated default arguments
and **kwargs are the scourge of many real-world code-bases).