The thread on operators as first-class citizens keeps getting vague ideas about assignment overloading that wouldn't actually work, or don't even make sense. I think it's worth writing down the simplest design that would actually work, so people can see why it's not a good idea (or explain why they think it would be anyway).
In pseudocode, just as x += y means this:

    xval = globals()['x']
    try:
        result = xval.__iadd__(y)
    except AttributeError:
        result = xval.__add__(y)
    globals()['x'] = result

… x = y would mean this:

    try:
        xval = globals()['x']
        result = xval.__iassign__(y)
    except (LookupError, AttributeError):
        result = y
    globals()['x'] = result
If you don't understand why this would work or why it wouldn't be a great idea (or want to nitpick details), read on; otherwise, you can skip the rest of this message.
First, why is there even a problem? Because Python doesn't even have "variables" in the same sense that languages like C++ that allow assignment overloading do.
In C++, a variable is an "lvalue", a location with identity and type, and an object is just a value that lives in a location. So assignment is an operation on variables: x = 2 is the same as XClass::operator=(&x, 2).
In Python, an object is a value that lives wherever it wants, with identity and type, and a variable is just a name that can be bound to a value in a namespace. So assignment is an operation on namespaces, not on variables: x = 2 is the same as dict.__setitem__(globals(), 'x', 2).
The same thing is true for more complicated assignments. For example, a.x = 2 is just an operation on a's namespace instead of the global namespace: type(a).__setattr__(a, 'x', 2). Likewise, a.b['x'] = 2 is type(a.b).__setitem__(a.b, 'x', 2), and so on.
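To make the "assignment is a namespace operation" point concrete, here is a minimal sketch. The class A and the dict ns are hypothetical names introduced only for illustration; ns stands in for a module's global namespace.

```python
class A:
    pass

a = A()
a.b = {}

# A module-level `x = 2` is a store into the namespace dict.
ns = {}
exec("x = 2", ns)                 # the statement form
dict.__setitem__(ns, "y", 2)      # the equivalent explicit store

# `a.x = 2` is a store into a's namespace, routed through its type.
type(a).__setattr__(a, "x", 2)

# `a.b['x'] = 2` is a store into the container a.b.
type(a.b).__setitem__(a.b, "x", 2)

print(ns["x"], ns["y"], a.x, a.b)
```

In none of these cases does the old value of the target get a say; only the namespace (or container) does.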
But Python allows overloading augmented assignment. How does that work? There's a perfectly normal namespace lookup at the start and namespace store at the end—but in between, the existing value of the target gets to specify the value being assigned.
Immutable types like int don't define __iadd__, and __add__ creates and returns a new object. So, x += y ends up the same as x = x + y.
But mutable types like list define an __iadd__ that mutates self in-place and then returns self, so x gets harmlessly rebound to the same object it was already bound to. So x += y ends up the same as x.extend(y); x = x.
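A short sketch of the two cases, using only built-in types. The names alias and before are just there to observe identity and the old value.

```python
# Mutable case: list.__iadd__ mutates in place and returns self,
# so the name is rebound to the very same object.
x = [1, 2]
alias = x
x += [3]
same_list = x is alias          # True: no new list was created

# Immutable case: int defines no __iadd__, so += falls back to
# __add__, which returns a new object that the name is rebound to.
n = 1
before = n
n += 1

print(same_list, alias, hasattr(int, "__iadd__"), before, n)
```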
The exact same technique would work for overloading normal assignment. The only difference is that x += y is illegal if x is unbound, while x = y obviously has to be legal (and mean there is no value to intercept the assignment). So, the fallback happens when xval doesn't define __iassign__, but also when x isn't bound at all.
So, for immutable types like int, and almost all mutable types like list (and when x is unbound), x = y does the same thing it always did.
But special types that want to act like transparent mutable handles define an __iassign__ that mutates self in place and returns self, so x gets harmlessly rebound to the same object. So x = y ends up the same as, say, x.set_target(y); x = x.
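The proposed semantics can be emulated today with an explicit helper, which may make the behavior easier to poke at. This is only a sketch: assign and Handle are hypothetical names, __iassign__ is not a real special method, and the helper looks the hook up on the type (as real special methods are) rather than on the instance.

```python
def assign(namespace, name, value):
    """Emulate the proposed `name = value` on a namespace dict."""
    try:
        current = namespace[name]                      # LookupError if unbound
        result = type(current).__iassign__(current, value)
    except (LookupError, AttributeError):
        result = value                                 # ordinary rebinding
    namespace[name] = result


class Handle:
    """Hypothetical transparent handle with an __iassign__ hook."""

    def __init__(self, target):
        self.target = target

    def set_target(self, value):
        self.target = value

    def __iassign__(self, value):
        self.set_target(value)
        return self                                    # name stays bound to self


ns = {}
assign(ns, "x", 1)        # x unbound: plain binding
assign(ns, "x", 2)        # int has no __iassign__: plain rebinding
h = Handle("old")
ns["h"] = h
assign(ns, "h", "new")    # intercepted: h is mutated, not replaced
print(ns["x"], ns["h"] is h, h.target)
```

Like the pseudocode, this swallows an AttributeError raised inside __iassign__ itself, which is one of the details a real design would have to pin down.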
This all works the same if the variables are local rather than global, or for more complicated targets like attribution or subscription, and even for target lists; the intercept still happens the same way, between the (more complicated) lookup and storage steps.
Now, why is this a bad idea?
First, the benefit of __iassign__ is a lot smaller than __iadd__. A sizable fraction of "x += y" statements are for mutable "x" values, but only a rare handful of "x = y" statements would be for special handle "x" values. Even the same cost for a much smaller benefit would be a much harder sell.
But the runtime performance cost difference is huge. If augmented assignment weren't overloadable, it would still have to look up the value, look up and call a special method on it, and store the value. The only cost overloading adds is trying two special methods instead of one, which is tiny. But regular assignment doesn't have to do a value lookup or a special method call at all, only a store; adding those steps would roughly double the cost of every new variable assignment, and even more for every reassignment. And assignments are very common in Python, even within inner loops, so we're talking about a huge slowdown to almost every program out there.
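A rough way to see the asymmetry on CPython is to count the bytecode instructions that load the target name before the store. This is a sketch under assumptions: name_loads is a hypothetical helper, and because exact opcode names differ between CPython versions it only looks for opcodes whose names start with LOAD.

```python
import dis


def name_loads(source, name):
    """Count instructions that load `name` in the compiled module code."""
    code = compile(source, "<sketch>", "exec")
    return sum(
        1
        for ins in dis.get_instructions(code)
        if ins.opname.startswith("LOAD") and ins.argval == name
    )


plain = name_loads("x = y", "x")        # plain assignment never reads x
augmented = name_loads("x += y", "x")   # augmented assignment reads x first
print(plain, augmented)
```

Overloadable plain assignment would have to add that load, plus a special-method lookup, to every assignment statement.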
Also, the fact that assignment always means assignment makes Python code easier both for humans to skim, and for automated programs to process. Consider, for example, a static type checker like mypy. Today, x = 2 means that x must now be an int, always. But if x could be a Signal object with an overloaded __iassign__, then, x = 2 might mean that x must now be an int, or it might mean that x must now be whatever type(x).__iassign__ returns.
Finally, the complexity of __iassign__ is at least a little higher than __iadd__. Notice that in my pseudocode above, I cheated: obviously the xval = and result = lines are not supposed to recursively call the same pseudocode, but to directly store a value in a new temporary local variable. In the real implementation, there wouldn't even be such a temporary variable (in CPython, the values would just be pushed on the stack), but for documenting the behavior, teaching it to students, etc., that doesn't matter. Being precise here wouldn't be hugely difficult, but it is a little more difficult than with __iadd__, where there's no similar potential confusion even possible.
On Wednesday, June 19, 2019, 10:54:04 AM PDT, Andrew Barnert via Python-ideas <python-ideas(a)python.org> wrote:
On Jun 18, 2019, at 12:43, nate lust <natelust(a)linux.com> wrote:
I have been following this discussion for a long time, and coincidentally I recently started working on a project that could make use of assignment overloading. (As an aside it is a configuration system for an astronomical data analysis pipeline that makes heavy use of descriptors to work around historical decisions and backward compatibility). Our system makes use of nested chains of objects and descriptors and proxy objects to manage where state is actually stored. The whole system could collapse down nicely if there were assignment overloading. However, this works OK most of the time, but sometimes at the end of the chain things can become quite complicated. I was new to this code base and tasked with making some additions to it, and wished for an assignment operator, but knew the data binding model of python was incompatible from p.
This got me thinking. I didn't actually need to overload assignment per se, data binding could stay just how it was, but if there was a magic method that worked similar to how __get__ works for descriptors but would be called on any variable lookup (if the method was defined) it would allow for something akin to assignment.
What counts as “variable lookup”? In particular:
    class Foo:
        def __init__(self):
            self.value = 6
            self.myself = weakref.ref(self)

        def important_work(self):
            print(self.value)
… why doesn’t every one of those “self” lookups call self.__get_self__()? It’s a local variable being looked up by name, just like your “foo” below, and it finds the same value, which has the same __get_self__ method on its type.
The only viable answer seems to be that it does. So, to avoid infinite circularity, your class needs to use the same kind of workaround used for attribute lookup in classes that define __getattribute__ and/or __setattr__:
    def important_work(self):
        print(object.__get_self__(self).value)

    def __get_self__(self):
        return object.__get_self__(self).myself
But even that won’t work here, because you still have to look up self to call the superclass method on it. I think it would require some new syntax, or at least something horrible involving locals(), to allow you to write the appropriate methods.
def __get_self__(self): return self.myself
Besides recursively calling itself for that “self” lookup, why doesn’t this also call weakref.ref.__get_self__ for that “myself” lookup? It’s an attribute lookup rather than a local namespace lookup, but surely you need that to work too, or as soon as you store a Foo instance in another object it stops overloading.
For this case there’s at least an obvious answer: because weakref.ref doesn’t override that method, the variable lookup doesn’t get intercepted. But notice that this means every single value access in Python now has to do an extra special-method lookup that almost always does nothing, which is going to be very expensive.
def __setattr__(self, name, value): self.value = value
You can’t write __setattr__ methods this way. That assignment statement just calls self.__setattr__('value', value), which will endlessly recurse. That’s why you need something like the object method call to break the circularity.
Also, this will take over the attribute assignments in your __init__ method. And, because it ignores the name and always sets the value attribute, it means that self.myself = is just going to override value rather than setting myself.
To solve both of these problems, you want a standard __setattr__ body here:
def __setattr__(self, name, value): object.__setattr__(self, name, value)
But that immediately makes it obvious that your __setattr__ isn’t actually doing anything, and could just be left out entirely.
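Both behaviors are easy to check in current Python. In this sketch, Naive and Delegating are hypothetical class names; the first reproduces the recursive body quoted above, the second the standard delegating body.

```python
class Naive:
    def __setattr__(self, name, value):
        # This assignment calls __setattr__ again, and again, ...
        self.value = value


class Delegating:
    def __setattr__(self, name, value):
        # Going through object breaks the cycle and honours `name`.
        object.__setattr__(self, name, value)


try:
    Naive().value = 1
    recursed = False
except RecursionError:
    recursed = True

d = Delegating()
d.value = 6
d.myself = "kept"      # stored under its own name, not as `value`
print(recursed, d.value, d.myself)
```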
foo = Foo() # Create an instance
foo # The interpreter would return foo.myself
foo.value # The interpreter would return foo.myself.value
foo = 19 # The interpreter would run foo.myself = 6 which would invoke foo.__setattr__('myself', 19)
For this last one, why would it do that? There’s no lookup here at all, only an assignment.
The only way to make this work would be for the interpreter to lookup the current value of the target on every assignment before assigning to it, so that lookup could be overloaded. If that were doable, then assignment would already be overloadable, and this whole discussion wouldn’t exist.
But, even if you did add that, __get_self__ is just returning the value self.myself, not some kind of reference to it. How can the interpreter figure out that the weakref.ref value it got came from looking up the name “myself” on the Foo instance? (This is the same reason __getattr__ can’t help you override attribute setting, and a separate method __setattr__ is needed.) To make this work, you’d need a __set_self__ to go along with __get_self__. Otherwise, your changes not only don’t provide a way to do assignment overloading, they’d break assignment overloading if it existed.
Also, all of the extra stuff you’re trying to add on top of assignment overloading can already be done today. You just want a transparent proxy: a class whose instances act like a reference to some other object, and delegate all methods (and maybe attribute lookups and assignments) to it. This is already pretty easy; you can define __getattr__ (and __setattr__) to do it dynamically, or you can do some clever stuff to create static delegating methods (and properties) explicitly at object-creation or class-creation time. Then foo.value returns foo.myself.value, foo.important_work() calls the Foo method but foo.__str__() calls foo.myself.__str__(), you can even make it pass isinstance checks if you want. The only thing it can’t do is overload assignment.
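Here is a minimal sketch of the dynamic variant of such a proxy. Proxy and Config are hypothetical names, and a real one would need more care (more special methods, isinstance support, and so on).

```python
class Proxy:
    """Hypothetical transparent proxy forwarding attributes to a target."""

    def __init__(self, target):
        # Bypass our own __setattr__ when storing the target itself.
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        # Only called when normal lookup fails, so _target is found normally.
        return getattr(object.__getattribute__(self, "_target"), name)

    def __setattr__(self, name, value):
        setattr(object.__getattribute__(self, "_target"), name, value)

    def __str__(self):
        # Implicit special-method calls skip __getattr__, so delegate explicitly.
        return str(object.__getattribute__(self, "_target"))


class Config:
    def __init__(self):
        self.value = 6


real = Config()
foo = Proxy(real)
foo.value = 19        # forwarded to real
forwarded = real.value

foo = 42              # plain rebinding: the proxy never hears about it
print(forwarded, foo, real.value)
```

The last assignment is the one thing the proxy cannot intercept, which is exactly the limitation under discussion.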
I think the real problem here is that you’re thinking about references to variables rather than values, and overloading operators on variables rather than values, and neither of those makes sense in Python. Looking up, or assigning to, a local variable named “foo” is not an operation on “the foo variable”, because there is no such thing; it’s an operation on the locals namespace.
Python-ideas mailing list -- python-ideas(a)python.org
As suggested by Toshio Kuratomi at https://bugs.python.org/issue36656, I
am raising this here for inclusion in the shutil module.
Mimicking POSIX, os.symlink() will raise FileExistsError if the link
name to be created already exists.
A common use case is overwriting an existing file (often a symlink) with
a symlink. Naively, one would delete the file named link_name if it
exists, then call symlink(). This "solution" is already 3 lines of code,
and without exception handling it introduces the race condition of a
file named link_name being created between unlink and symlink.
Depending on the functionality required, I suggest:
* os.symlink() - the new link name is expected to NOT exist
* shutil.symlink() - the new symlink replaces an existing file
Handling all possible race conditions (some detailed in issue36656) is
non-trivial, however this is the best that I have come up with so far:
import os, tempfile

def symlink(target, link_name):
    '''Create a symbolic link link_name pointing to target.
    Overwrites link_name if it exists. '''
    # os.replace() may fail if files are on different filesystems
    link_dir = os.path.dirname(link_name)
    # Link to a temporary filename that doesn't exist
    temp_link_name = tempfile.mktemp(dir=link_dir)
    # os.* functions mimic as closely as possible system functions
    # The POSIX symlink() returns EEXIST if link_name already exists
    # Replace link_name with temp_link_name
    # Pre-empt os.replace on a directory with a nicer message
    raise IsADirectoryError(f"Cannot symlink over existing
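The listing above has lost several lines in the archive, so here is a self-contained sketch of the approach its comments describe: create the link under a temporary name in the same directory, then os.replace() it over link_name. The function name symlink_overwrite, the retry loop, the cleanup on failure, and the exact error message are my assumptions, not the original code.

```python
import os
import tempfile


def symlink_overwrite(target, link_name):
    """Create a symlink link_name -> target, replacing link_name if it exists."""
    # Stay in the same directory so os.replace() never crosses filesystems.
    link_dir = os.path.dirname(link_name) or "."

    # Keep trying random names until os.symlink succeeds on an unused one.
    while True:
        temp_link_name = tempfile.mktemp(dir=link_dir)
        try:
            os.symlink(target, temp_link_name)
            break
        except FileExistsError:
            continue

    try:
        # Refuse real directories; a symlink to a directory may be replaced.
        if os.path.isdir(link_name) and not os.path.islink(link_name):
            raise IsADirectoryError(
                f"Cannot symlink over existing directory: {link_name!r}")
        os.replace(temp_link_name, link_name)
    except BaseException:
        # Do not leave the temporary link behind on failure.
        if os.path.islink(temp_link_name):
            os.remove(temp_link_name)
        raise


workdir = tempfile.mkdtemp()
link = os.path.join(workdir, "current")
symlink_overwrite("release-1", link)
symlink_overwrite("release-2", link)   # replaces the existing link
print(os.readlink(link))
```

Because os.symlink() itself fails if the temporary name is taken, the predictability concerns usually raised about mktemp() are reduced here to a retry.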
The documentation (https://docs.python.org/3/library/shutil.html) I
suggest for this is:
Create a symbolic link named link_name pointing to target, overwriting
link_name if it exists. If link_name is a directory, IsADirectoryError is
raised. To not overwrite link_name, use os.symlink().
It would be tempting to do:
But this has a race condition when replacing a symlink that should
*always* exist, eg:
/lib/critical.so -> /lib/critical.so.1.2
When upgrading by:
There is a point in time when /lib/critical.so doesn't exist.
One issue I see with my suggested code is that the file at
temp_link_name could be changed before link_name is replaced with it. This
is mitigated by the randomness introduced by mktemp().
While it is far less likely that a file is accessed with a random and
unknown name than with an existing known name, I seek input on a
solution if this is an unacceptable risk.
* https://bugs.python.org/issue36656 (already mentioned above)
> On Jun 26, 2019, at 7:13 PM, Chris Angelico <rosuav(a)gmail.com> wrote:
> The main advantage of sscanf over a regular expression is that it
> performs a single left-to-right pass over the format string and the
> target string simultaneously, with no backtracking. (This is also its
> main DISadvantage compared to a regular expression.) A tiny amount of
> look-ahead in the format string is the sole exception (for instance,
> format string "%s$%d" would collect a string up until it finds a
> dollar sign, which would otherwise have to be written "%[^$]$%d").
> There is significant value in having an extremely simple parsing tool
> available; the question is, is it worth complicating matters with yet
> another way to parse strings? (We still have fewer ways to parse than
> ways to format strings. I think.)
I agree. Python should have an equivalent of scanf, but perhaps it should have some extensions:
%P - read pickled object
%J - read JSON object
%M - read msgpack object
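To give a feel for the single-pass, no-backtracking behavior described in the quoted text, here is a toy sketch. The function scan is hypothetical and supports only %d, %s and literal characters; it is nowhere near a full scanf, and it does not implement the %P, %J or %M extensions.

```python
def scan(fmt, text):
    """Parse text against a tiny scanf-like format; return a list of values."""
    values = []
    i = j = 0                                  # positions in fmt and text
    while i < len(fmt):
        ch = fmt[i]
        if ch != "%":                          # literal: must match exactly
            if j >= len(text) or text[j] != ch:
                raise ValueError(f"expected {ch!r} at position {j}")
            i += 1
            j += 1
            continue
        if i + 1 >= len(fmt):
            raise ValueError("format ends with a bare %")
        spec = fmt[i + 1]
        i += 2
        start = j
        if spec == "d":
            if j < len(text) and text[j] in "+-":
                j += 1
            while j < len(text) and text[j].isdigit():
                j += 1
            values.append(int(text[start:j]))  # ValueError if no digits
        elif spec == "s":
            # One character of look-ahead: stop at the next literal in fmt.
            stop = fmt[i] if i < len(fmt) and fmt[i] != "%" else None
            while j < len(text) and not text[j].isspace() and text[j] != stop:
                j += 1
            if j == start:
                raise ValueError(f"expected a string at position {j}")
            values.append(text[start:j])
        else:
            raise ValueError(f"unsupported conversion %{spec}")
    return values


print(scan("%s$%d", "price$42"))
print(scan("%d-%d", "10-20"))
```

Both indices only ever move forward, which is the whole point: the format "%s$%d" works because %s peeks at the "$" that follows it, not because anything is retried.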
On Fri, Jun 28, 2019 at 02:44:28AM +1000, Chris Angelico wrote:
> If it's ALWAYS called, then it's almost useless. The wrapper object
> will vanish the moment you attempt to do anything with it, devolving
> instantly to the result of getself.
I don't understand why it is useless. If the wrapper object is no longer
needed, then getself will return the object which is needed, and the
wrapper is superfluous and should be garbage collected.
If the wrapper object is needed, then getself will return self, and it
Are we talking past each other?
As Python focuses on readability, why not use % sign for actual percentages?
rate = 0.1058 # float
rate = 10.58% # percent, similar to above
It does not interfere with modulo operator as modulo follows a different
a = x % y
This looks like a small feature but it will surely set Python a level
higher in terms of readability.
Thanks a lot!
On Mon, Jun 24, 2019 at 12:37:48PM -0700, Andrew Barnert wrote:
> Since you bring up “feeble mail clients”:
> Good mail clients can be configured to collapse and expand quotes, and
> to automatically start long nested quotes collapsed.
Indeed you are correct, and my own mail client supports that as an
optional add-on, which I've tried and discarded as more annoying than
helpful. In my experience the problem is that trying to solve the
problem of quoting via technology is too crude: all the ones I've seen
are all or nothing, hide all context or show all context.
It's really a *human* problem, not a technology problem: as the reader, I
want to see *relevant* context, but not the entire history of the
discussion. I cannot see any automated tool being able to guess what is
relevant any time soon.
> In fact, most
> feeble clients (like Gmail and Yahoo, and the builtin iOS and Android
> apps) do this whether you want it or not.
I can't speak for the others, but my recollection of Gmail is that
replies default to top posting with the quoted text below your response.
> But you’re not going to convince everyone to do it traditionally all
> the time
I'm not trying to do that. But surely it's not too hard to ask that
every few posts somebody in the thread trims quoting to keep it under
I'm not bitching because somebody quoted seven lines rather than three,
I'm pointing out that when you have something like five pages of
quoted text to a paragraph or three of new text, quoting is out of
control. If we can't keep the quoting short in every post, we can hopefully
cull some of it periodically.
We're programmers. We've learned by hard experience to delete dead code,
not to leave it sitting in the program commented out, and not just
because it makes reading the source code harder. It makes it harder to
search, it makes files and diffs larger, it increases the ratio of noise
to signal. Excessive quoting in email is the same, whether your mail
client hides it by default or not, the noise is still there.
> And the one-time hassle of figuring out how to configure your MUA, or
> even switching to a better one
"Better" is subjective, and just because a client is arguably better
in one regard doesn't make it better in others.
`if-unless` expressions in Python
if condition1 expr unless condition2
is an expression that roughly reduces to
expr if condition1 and not condition2 else EMPTY
This definition means that expr is only evaluated if `condition1 and not
condition2` evaluates to true. It also means `not condition2` is only
evaluated if `condition1` is true.
EMPTY is not actually a real Python value-- it's a value that collapses
into nothing when used inside a statement expression:
if False never_called() unless False,
if False never_called() unless False,
]) # => []
if False never_called() unless False,
if False never_called() unless False,
if True 5 unless False,
]) # => [3, 2, 5, 4]
EMPTY is neither a constant exposed to the Python runtime nor a symbol.
It's a compiler-internal value.
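For comparison, the collapsing behavior can be approximated in existing Python with a sentinel and a filter. This is only a sketch: _EMPTY, if_unless and collapse are hypothetical names, the list contents are my own example, and laziness has to be spelled out with callables because there is no syntax support.

```python
_EMPTY = object()   # hypothetical stand-in for the compiler-internal EMPTY


def if_unless(condition1, expr, condition2):
    """Emulate `if condition1 expr unless condition2`.

    expr and condition2 are zero-argument callables so that condition2 is
    only evaluated when condition1 is true, and expr only when both pass.
    """
    if condition1 and not condition2():
        return expr()
    return _EMPTY


def collapse(items):
    """Drop the sentinel, as EMPTY would collapse into nothing."""
    return [item for item in items if item is not _EMPTY]


calls = []


def never_called():
    calls.append("called")
    return 0


result = collapse([
    3,
    if_unless(False, never_called, lambda: False),
    2,
    if_unless(True, lambda: 5, lambda: False),
    4,
])
print(result, calls)
```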
# Use cases
assert if condition1 predicate(object) unless condition2
(This would be more readable with assert expressions.)
# Equivalent syntax in existing Python
As a statement:
if condition1 and not condition2: predicate(object)
predicate(object) if condition1 and not condition2
# Backward compatibility
The `unless` word is only recognized as special inside `if-unless`
statements. The continued use of the word as a variable name is discouraged.