Mailman 3 Move semantics - Python-ideas

Move semantics

3mir33＠gmail.com

Nov. 26, 2020

3:44 a.m.

Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d d = {1: 2} f(d) print(d[1]) # mistake, using of moved value

Show replies by date

William Pickard

November 2020

8:15 a.m.

A little bit more detail may be required to understand why you want this. As I currently read it, you're suggesting something that is mere visual only. def f(d): d['a'] = 2 return d d = {1: 2} a = f(d) print(f'Is d and a the same? {d is a}') # "Is d and a the same? True"

3mir33＠gmail.com

10:59 p.m.

Some examples for what I have in mind: def read_and_close_file(file: Move[FileIO]): """ File will be closed, so we can not use after """ file.read() file.close() @dataclass class Model: x: List[int] def make_list_from_model(model: Move[Model]) -> List[int]: """ This is not a pure function, which is bad, but if we move model here and won't use it after, then it's kinda pure """ model.x.append(1) # x is big and we don't want to copy it, instead we just push another element return model.x def run_thread_unsafe_process(writer: Move[BytesIO]) -> Thread: thread = Thread(target=thread_unsafe_process, args=(writer,)) thread.start() return thread def thread_unsafe_process(writer: BytesIO): while True: sleep(1) writer.write(b'Hello!') file = open('move.py') read_and_close_file(file) file.read() # Mistake, file is already closed, static analyzer cah check that model = Model([1, 2, 3]) y = make_list_from_model(model) print(model.x) # Using of corrupted value, it might be unexpected for caller and cause some mistakes writer = open('buffer.tmp', 'wb') run_thread_unsafe_process(writer) writer.write(b'world') # mistake, can lead to data-race (in this case, two messages, "hello" and "world", might mix up)

Steven D'Aprano

2:01 p.m.

On Fri, Nov 27, 2020 at 06:59:40AM -0000, 3mir33@gmail.com wrote:

...

Some examples for what I have in mind: [...]

...

file = open('move.py') read_and_close_file(file) file.read() # Mistake, file is already closed, static analyzer cah check that

That's not a good example for this proposed Move semantics, because it's not access to `file` itself that is a problem. There would be nothing wrong with this: file = open('move.py') read_and_close(file) assert file.closed() print(file.name) It's only read (or write) after close that is problematic. We already have this hypothetical issue, with no "move" or transfer of ownership required: file = open('move.py') file.close() file.read() So your Move semantics is both *too strict* (prohibiting any access to file, when only a couple of methods are a problem) and *insufficient* (it does nothing to prevent the same use-after-close error in slightly different scenarios). Your example also makes my hackles rise: to my mind, it's just bad, overly fragile code. You are splitting the responsibility for a single conceptual action over distant (in time and space) pieces of code, e.g. - caller opens the file - callee reads and closes it which is just an invitation for trouble. There's a very good reason that the Python community has pretty much settled on the `with open(...) as f` idiom as best practice for file handling, it eliminates this sort of use-after-close entirely. -- Steve

Guido van Rossum

9:07 a.m.

This reminds me of something in C++. Does it exist in other languages? Do you have a more realistic example where this would catch an important type error? On Thu, Nov 26, 2020 at 5:42 AM <3mir33@gmail.com> wrote:

...

Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d

d = {1: 2} f(d) print(d[1]) # mistake, using of moved value _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/WJXFEV... Code of Conduct: http://python.org/psf/codeofconduct/

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

William Pickard

9:34 a.m.

C++ has official move syntax via the R-Value Reference (&&). C#/.NET is the only other language I know of where it can be emulated (via Constructors/Assignment operators). I'm not fluent in Perl and Ruby to try to guess if they can support it or not.

nate lust

9:52 a.m.

All, Most of rust is based around the idea of move semantics. It's really useful in cases wherever you want the object to be mutable, want to avoid side effects from something else still having a reference unintentionally, and avoid a lot of deep copies of the underlying resources. On the project I work on (rubin observatory) we created something that lets us somewhat emulate this in python and it has been useful for debugging and some safety. We used it on a method-chaining interface on query objects; it comes up when you have a system where you want to convert one complex object into a slightly different complex object, and you could do that more efficiently when you know the original doesn't need to exist anymore - e.g. by taking an internal list and appending to it, rather than copying the whole thing and then appending. On Thu, Nov 26, 2020, 12:35 PM William Pickard <lollol222gg@gmail.com> wrote:

...

C++ has official move syntax via the R-Value Reference (&&). C#/.NET is the only other language I know of where it can be emulated (via Constructors/Assignment operators).

I'm not fluent in Perl and Ruby to try to guess if they can support it or not. _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/BATDZV... Code of Conduct: http://python.org/psf/codeofconduct/

Antoine Pitrou

10:06 a.m.

Why are you trying to replicate move semantics in Python? The Python ownership model is entirely different from C++. In C++ terms, every Python object is like a shared_ptr<> (but with additional support for tracking and collecting reference cycles). Generally, when a Python function wants to "take ownership" of a mutable object (say, to mutate it without the caller being surprised), it makes a copy of the object (e.g. dict.copy()). Regards Antoine. On Thu, 26 Nov 2020 17:34:52 -0000 "William Pickard" <lollol222gg@gmail.com> wrote:

...

C++ has official move syntax via the R-Value Reference (&&). C#/.NET is the only other language I know of where it can be emulated (via Constructors/Assignment operators).

I'm not fluent in Perl and Ruby to try to guess if they can support it or not. _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/BATDZV... Code of Conduct: http://python.org/psf/codeofconduct/

nate lust

10:17 a.m.

Yeah, in most places we just use python as you described. In a few localized places it turned out to be useful for detecting issues and working with external resources that were exposed through python objects. I would not say if this proposal were included with typing that it should be taught as common practice. There are certainly other ways to do this as well, but sometimes it would be convenient. On Thu, Nov 26, 2020, 1:08 PM Antoine Pitrou <solipsis@pitrou.net> wrote:

...

Why are you trying to replicate move semantics in Python? The Python ownership model is entirely different from C++. In C++ terms, every Python object is like a shared_ptr<> (but with additional support for tracking and collecting reference cycles).

Generally, when a Python function wants to "take ownership" of a mutable object (say, to mutate it without the caller being surprised), it makes a copy of the object (e.g. dict.copy()).

Regards

Antoine.

On Thu, 26 Nov 2020 17:34:52 -0000 "William Pickard" <lollol222gg@gmail.com> wrote:

...
C++ has official move syntax via the R-Value Reference (&&). C#/.NET is the only other language I know of where it can be emulated (via Constructors/Assignment operators).

I'm not fluent in Perl and Ruby to try to guess if they can support it or not. _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/BATDZV... Code of Conduct: http://python.org/psf/codeofconduct/

_______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/7GHJS3... Code of Conduct: http://python.org/psf/codeofconduct/

Richard Damon

10:40 a.m.

On 11/26/20 12:07 PM, Guido van Rossum wrote:

...

This reminds me of something in C++. Does it exist in other languages?

A key point is that C++ needs 'move' behavior as its variables are buckets of bits, and assigning one variable to another, like in a call, requires copying that whole bucket of bits, and if it refers to other buckets of bits, may need to copy them too. One thing that can be done when making this assignment, if the original isn't going to be needed is to 'steal' those buckets that the first object pointed to, possibly saving a lot of work. Python doesn't need to do this, as it isn't based on a bucket of bits type model, but its names are just bound to objects, and can readily share. Another way to think of the issue is it is mostly an issue with using Call By Value, which is the basic way to pass parameters in C and C++ (there is also a 'Call By Reverence' which is really just a Call by Value, where the value is a pointer, maybe with syntactic sugar to hide the pointerness) Python doesn't use Call by Value, but its method is better described as Call by Binding, the parameters in the function are bound to the pbjects specified in the call. If those objects are mutable, the function can change the object that the caller gave, but can't 'rebind' any variable names in the call statement to new objects. -- Richard Damon

nate lust

11:15 a.m.

I agree that is how python is implemented, and from a memory perspective, this is not needed. What I was talking about was more from a conceptual resource point of view, and not just memory as a resource. Consider a generator, you can have many different bindings to the object, but when it is exhausted, it may not make sense to access that that generator though one of the other bindings. Of course accessing it can raise an exception, or you can have objects you can ask of the resources are still "valid" for whatever the resource is (in this case a generator), rebind a name after a function call with additional checks, etc. It would be convenient if you could mark in source code that you intended a resource to be "moved" and any further access through other bindings are not what was intended. This would help catch logic bugs during type checking, instead of hitting this issue at runtime. On Thu, Nov 26, 2020, 1:42 PM Richard Damon <Richard@damon-family.org> wrote:

...

On 11/26/20 12:07 PM, Guido van Rossum wrote:

...
This reminds me of something in C++. Does it exist in other languages?

A key point is that C++ needs 'move' behavior as its variables are buckets of bits, and assigning one variable to another, like in a call, requires copying that whole bucket of bits, and if it refers to other buckets of bits, may need to copy them too.

One thing that can be done when making this assignment, if the original isn't going to be needed is to 'steal' those buckets that the first object pointed to, possibly saving a lot of work.

Python doesn't need to do this, as it isn't based on a bucket of bits type model, but its names are just bound to objects, and can readily share.

Another way to think of the issue is it is mostly an issue with using Call By Value, which is the basic way to pass parameters in C and C++ (there is also a 'Call By Reverence' which is really just a Call by Value, where the value is a pointer, maybe with syntactic sugar to hide the pointerness)

Python doesn't use Call by Value, but its method is better described as Call by Binding, the parameters in the function are bound to the pbjects specified in the call. If those objects are mutable, the function can change the object that the caller gave, but can't 'rebind' any variable names in the call statement to new objects.

-- Richard Damon _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/5VKCC6... Code of Conduct: http://python.org/psf/codeofconduct/

Steven D'Aprano

12:50 p.m.

On Thu, Nov 26, 2020 at 02:15:50PM -0500, nate lust wrote:

...

It would be convenient if you could mark in source code that you intended a resource to be "moved" and any further access through other bindings are not what was intended. This would help catch logic bugs during type checking, instead of hitting this issue at runtime.

I can't help but feel that if your code relies on the concept of "moving" an object, your concept is already at odds with Python's object and execution model. But putting that aside, is this is even technically possible? Can type checkers track the "owner" of a resource in this way? obj = {} # obj is owned by this code block process(obj) # still owned by this block print(obj) # so this is okay capture(obj) # but now ownership is stolen by capture() print(obj) # and this is a type-error (?!) Maybe I'm just not familiar enough with the type-checking state of art, but I would be amazed if this was possible. By the way, I have intentionally used the term "stolen" rather than "taken" or "transfered to" with respect to moving ownership of `obj`. At the caller's site, there is no obvious difference between the calls to process() which doesn't move obj, and capture() which does. Any innocuous-looking function could "move" ownership, with *or without* the knowledge of the caller. It's true that the Python interpreter itself has no concept of "ownership" in this sense, but to the degree that the coder cares about such ownership, having it moved without their knowledge and consent is surely "theft" :-) -- Steve

nate lust

1:34 p.m.

Steve, I take your point, and like indicated in another part of this thread, for most parts of our stack its standard practices. It is only when managing certain external resources that we take a bit of extra care, and that is in a library that is not outwardly exposed. I agree with you also that I don't even know if this is a thing that is possible in python unless the type checker could assume that a caller was meaning to move a variable when they called a function declared with move. And of course this would still be at the type checking level, and not at execution time. For our use case, we put a move indicator at the point the function is called, and not in the signature of the function itself (because of what you indicated above), signaling the intention was to move it. It's not really much different than f(x); del x (though of course there are details). And so python does provide everything needed already to ensure one logical binding (and of course there are N other solutions people can think of, guards, exceptions, etc). My point was more that if there was a way I could indicate to a type checker that it may be unsafe for anything to use a reference to this object after this function call it would be helpful(as I might have forgotten behavior of functions, or simply not been paying attention). On Thu, Nov 26, 2020 at 3:53 PM Steven D'Aprano <steve@pearwood.info> wrote:

...

On Thu, Nov 26, 2020 at 02:15:50PM -0500, nate lust wrote:

...
It would be convenient if you could mark in source code that you intended a resource to be "moved" and any further access through other bindings are not what was intended. This would help catch logic bugs during type checking, instead of hitting this issue at runtime.

I can't help but feel that if your code relies on the concept of "moving" an object, your concept is already at odds with Python's object and execution model.

But putting that aside, is this is even technically possible? Can type checkers track the "owner" of a resource in this way?

obj = {} # obj is owned by this code block process(obj) # still owned by this block print(obj) # so this is okay capture(obj) # but now ownership is stolen by capture() print(obj) # and this is a type-error (?!)

Maybe I'm just not familiar enough with the type-checking state of art, but I would be amazed if this was possible.

By the way, I have intentionally used the term "stolen" rather than "taken" or "transfered to" with respect to moving ownership of `obj`. At the caller's site, there is no obvious difference between the calls to process() which doesn't move obj, and capture() which does.

Any innocuous-looking function could "move" ownership, with *or without* the knowledge of the caller. It's true that the Python interpreter itself has no concept of "ownership" in this sense, but to the degree that the coder cares about such ownership, having it moved without their knowledge and consent is surely "theft" :-)

-- Steve _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/MNWGVX... Code of Conduct: http://python.org/psf/codeofconduct/

-- Nate Lust, PhD. Astrophysics Dept. Princeton University

Antoine Pitrou

4:10 p.m.

On Fri, 27 Nov 2020 07:50:20 +1100 Steven D'Aprano <steve@pearwood.info> wrote:

...

On Thu, Nov 26, 2020 at 02:15:50PM -0500, nate lust wrote:

...
It would be convenient if you could mark in source code that you intended a resource to be "moved" and any further access through other bindings are not what was intended. This would help catch logic bugs during type checking, instead of hitting this issue at runtime.

I can't help but feel that if your code relies on the concept of "moving" an object, your concept is already at odds with Python's object and execution model.

Lots of code probably relies on it. As soon as you wrap a file-like object in a higher-level construct, for example a ZipFile, you'd better not do anything again with the file-like object (*) but let the ZipFile be the sole owner of that object. Same if you pass an open socket to a HTTP library, or anything similar. (*) except closing it at the end, perhaps. Regards Antoine.

Serhiy Storchaka

12:38 a.m.

26.11.20 19:07, Guido van Rossum пише:

...

This reminds me of something in C++. Does it exist in other languages?

Indeed, this is a term from C++. In older C++ you could pass argument by value, which makes a copy, and by reference, which is equivalent to passing a pointer by value plus syntax sugar. Passing by value may be inefficient if you need to copy more than several bytes. Passing by reference has other problems. It was a source of errors, so passing object created on stack by non-constant reference was forbidden. If you need to modify argument passed by constant reference in the function you need to create a copy, even if the object is never used after passing to the function (like in foo(bar()), the returned value of bar() cannot be used outside of foo(), so it could be allow to modify it). To resolve this problem the move semantic was introduced.

...

Do you have a more realistic example where this would catch an important type error?

I seen and used the following example (with variations) multiple times: class Client: def send(self, msg, token, headers=None): if headers is None: headers = {} # headers = headers.copy() headers["Authorization"] = f"Bearer {token}" headers["Content-Type"] = "text/plain" headers["X-Hash"] = md5sum(msg.encode()) with self.request('POST', self.url, headers=headers, data=msg) as resp: ... client.send('Hello!', mytoken, headers={'X-Type': 'Spam'}) The function sets some standard headers, but you can pass also additional headers. If you only pass a just created dictionary, the function can modify it in-place because its value is not used outside. But if you create the dict once and use it in other places, modifying it inside the function is a non-desired effect. You need to make a copy in the function just because you don't know how it will be used in the user code. If you document that the argument cannot be used after calling the function and add some marks for static type checking, you could avoid copying. Personally I think Python has a little need in such feature. Copied dicts and lists are usually small, and it adds not much overhead in comparison with other overhead that Python does have. And if you need to pass and modify a large collection, documenting this should be enough.

3mir33＠gmail.com

12:55 a.m.

Actually, there is more examples, like working with thread-unsafe objects or file-like protocols. https://pastebin.com/nrRNjFsN

2QdxY4RzWzUUiLuE＠potatochowder.com

4:32 a.m.

On 2020-11-27 at 10:38:06 +0200, Serhiy Storchaka <storchaka@gmail.com> wrote:

...

26.11.20 19:07, Guido van Rossum пише:

...
This reminds me of something in C++. Does it exist in other languages?

...

Indeed, this is a term from C++. In older C++ you could pass argument by value, which makes a copy, and by reference, which is equivalent to passing a pointer by value plus syntax sugar. Passing by value may be inefficient if you need to copy more than several bytes. Passing by reference has other problems. It was a source of errors, so passing object created on stack by non-constant reference was forbidden. If you need to modify argument passed by constant reference in the function you need to create a copy, even if the object is never used after passing to the function (like in foo(bar()), the returned value of bar() cannot be used outside of foo(), so it could be allow to modify it). To resolve this problem the move semantic was introduced.

...
Do you have a more realistic example where this would catch an important type error?

I seen and used the following example (with variations) multiple times:

class Client: def send(self, msg, token, headers=None): if headers is None: headers = {} # headers = headers.copy() headers["Authorization"] = f"Bearer {token}" headers["Content-Type"] = "text/plain" headers["X-Hash"] = md5sum(msg.encode()) with self.request('POST', self.url, headers=headers, data=msg) as resp: ...

client.send('Hello!', mytoken, headers={'X-Type': 'Spam'})

The function sets some standard headers, but you can pass also additional headers. If you only pass a just created dictionary, the function can modify it in-place because its value is not used outside. But if you create the dict once and use it in other places, modifying it inside the function is a non-desired effect. You need to make a copy in the function just because you don't know how it will be used in the user code. If you document that the argument cannot be used after calling the function and add some marks for static type checking, you could avoid copying.

Personally I think Python has a little need in such feature. Copied dicts and lists are usually small, and it adds not much overhead in comparison with other overhead that Python does have. And if you need to pass and modify a large collection, documenting this should be enough.

For an opposing point of view, why not write that method as follows: class Client: def send(self, msg, token, additional_headers=None): headers = {"Authorization" : f"Bearer {token}", "Content-Type" : "text/plain", "X-Hash" : md5sum(msg.encode()), **(additional_headers or {})} ... Then I only modify a just created dictionary and never some dictionary of questionable origin and/or future. Obviously, there's a difference in behavior if additional_headers contains Authorization or X-Hash or Content-Type, but I can't say which behavior is correct without more context, and I can't say which is more efficient (in time or in space) without knowing how often additional_headers is not None or how big it usually is. I come from old(er) school (1980s, 1990s) embedded systems, and who "owns" a particular mutable data structure and how/where it gets mutated always came up long before we wrote any code. No, I'm not claiming that pre-ansi C and assembler are more productive or less runtime error prone than newer languages, but is this feature only necessary because "modern" software development no longer includes a design phase or adequate documentation? I understand the advantages of automated systems (the compiler and/or static analyzer) over manual ones (code review, documentation, writing unit tests), but ISTM that if you're writing code and don't know whether or not you can mutate a given dictionary, or you're calling a function and don't know whether or not that function might mutate that dictionary, then the battle is already lost. Memory management implementation details is a long way from executable pseudo code. (30 years is a long time, too.)

Antoine Pitrou

5:32 a.m.

On Fri, 27 Nov 2020 07:32:17 -0500 2QdxY4RzWzUUiLuE@potatochowder.com wrote:

...

I come from old(er) school (1980s, 1990s) embedded systems, and who "owns" a particular mutable data structure and how/where it gets mutated always came up long before we wrote any code. No, I'm not claiming that pre-ansi C and assembler are more productive or less runtime error prone than newer languages, but is this feature only necessary because "modern" software development no longer includes a design phase or adequate documentation?

"Modern" software development is just like older software development in that regard: sometimes it includes a design phase and/or adequate (i.e. sufficiently precise) documentation, sometimes it doesn't.

...

Memory management implementation details is a long way from executable pseudo code. (30 years is a long time, too.)

This isn't really about memory management, though. Regards Antoine.

Ned Batchelder

5:56 a.m.

On 11/27/20 8:32 AM, Antoine Pitrou wrote:

...

On Fri, 27 Nov 2020 07:32:17 -0500 2QdxY4RzWzUUiLuE@potatochowder.com wrote:

...
I come from old(er) school (1980s, 1990s) embedded systems, and who "owns" a particular mutable data structure and how/where it gets mutated always came up long before we wrote any code. No, I'm not claiming that pre-ansi C and assembler are more productive or less runtime error prone than newer languages, but is this feature only necessary because "modern" software development no longer includes a design phase or adequate documentation? "Modern" software development is just like older software development in that regard: sometimes it includes a design phase and/or adequate (i.e. sufficiently precise) documentation, sometimes it doesn't.

...
Memory management implementation details is a long way from executable pseudo code. (30 years is a long time, too.) This isn't really about memory management, though.

Maybe it would help to clarify what it *is* about. The original proposal makes no mention of the problem being solved. --Ned.

Antoine Pitrou

6 a.m.

On Fri, 27 Nov 2020 08:56:18 -0500 Ned Batchelder <ned@nedbatchelder.com> wrote:

...

On 11/27/20 8:32 AM, Antoine Pitrou wrote:

...
On Fri, 27 Nov 2020 07:32:17 -0500 2QdxY4RzWzUUiLuE@potatochowder.com wrote:

...
I come from old(er) school (1980s, 1990s) embedded systems, and who "owns" a particular mutable data structure and how/where it gets mutated always came up long before we wrote any code. No, I'm not claiming that pre-ansi C and assembler are more productive or less runtime error prone than newer languages, but is this feature only necessary because "modern" software development no longer includes a design phase or adequate documentation? "Modern" software development is just like older software development in that regard: sometimes it includes a design phase and/or adequate (i.e. sufficiently precise) documentation, sometimes it doesn't.

...
Memory management implementation details is a long way from executable pseudo code. (30 years is a long time, too.) This isn't really about memory management, though.

Maybe it would help to clarify what it *is* about. The original proposal makes no mention of the problem being solved.

Perhaps you could start by reading other messages posted by Serhiy and I in this thread, for example. I realize that trying to avoid redundant discussions is against the ethics and traditions of python-ideas, but still. Regards Antoine.

2QdxY4RzWzUUiLuE＠potatochowder.com

6:07 a.m.

On 2020-11-27 at 14:32:11 +0100, Antoine Pitrou <solipsis@pitrou.net> wrote:

...

On Fri, 27 Nov 2020 07:32:17 -0500 2QdxY4RzWzUUiLuE@potatochowder.com wrote:

...
I come from old(er) school (1980s, 1990s) embedded systems, and who "owns" a particular mutable data structure and how/where it gets mutated always came up long before we wrote any code. No, I'm not claiming that pre-ansi C and assembler are more productive or less runtime error prone than newer languages, but is this feature only necessary because "modern" software development no longer includes a design phase or adequate documentation?

"Modern" software development is just like older software development in that regard: sometimes it includes a design phase and/or adequate (i.e. sufficiently precise) documentation, sometimes it doesn't.

Fair enough. :-) (In my case, memory was a scarce resource and we emphasized documentation about it.)

...

...
Memory management implementation details is a long way from executable pseudo code. (30 years is a long time, too.)

This isn't really about memory management, though.

It reminds me of "use after free" errors. Whenever Rust comes up, I think of memory management. Call me biased. :-) Also, in Serhiy's example,¹ if Client.send "moved" headers, then it could also free the memory associated with that dictionary at the end (although I realize that that may not always be the case). Many times, freeing memory sooner is better. ¹ https://mail.python.org/archives/list/python-ideas@python.org/message/VBB3XG...

Bruce Leban

1:50 p.m.

On Thu, Nov 26, 2020 at 5:43 AM <3mir33@gmail.com> wrote:

...

Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ```

You say "moved and cannot be used" which is either incomplete, ambiguous, contradictory, or some combination of those. Moved where and who can't use it? What you really are talking about is "taking ownership." But then you're omitting the key detail that it also gives ownership back via the return value. Otherwise what you would want to indicate is that the function destroys the value. Since the semantics of Python are that objects are always passed by reference, it's implied that control of the object is passed to the called function. I think What might be useful are declarations that: - The object is not modified (i.e., read only). If a function that declares a parameter as read-only passes that parameter to one that does not use that declaration, that can be identified as an error. - The object is destroyed or consumed by the function. Any use of the object after calling the function could be identified as an error. --- Bruce

Guido van Rossum

2:14 p.m.

On Thu, Nov 26, 2020 at 1:51 PM Bruce Leban <bruce@leban.us> wrote:

...

What might be useful are declarations that:

- The object is not modified (i.e., read only). If a function that declares a parameter as read-only passes that parameter to one that does not use that declaration, that can be identified as an error. - The object is destroyed or consumed by the function. Any use of the object after calling the function could be identified as an error.

I don't think Python is ready for const arguments (the first bullet). "Const propagation" can be a real problem. The second seems to be what the OP is after, and I now can see the use case -- in fact in libraries for scientific computing and/or machine learning I think Iv'e seen are a bunch of hacks to imply transfer of ownership that might benefit from this. But I'm not yet sufficiently familiar with that field to be able to point you to examples. Hopefully there are readers here who can. (Nate?) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

nate lust

2:30 p.m.

Sorry I think I was not being clear due to differing understandings or using of the definitions of terms (likely on my part). The second built point is what I was considering, "ownership" of an object is moved out of the scope/frame/however you think about it, into the one you are calling. The only point I would say is that it does not need imply it is destroyed or consumed the function itself may return "ownership" with a return value, just that it is an error to access it through that binding, whatever the underlying implementation is doing (shared references, actual shuffling of pointers on a stack etc). Rust uses this behavior extensively in its compiler to reason about lifetimes of objects. I will see if I can dig up some of the hacks I have seen on this to use as examples of behaviors over this holliday. I can't remember which package I saw it in, but I remember one did it so It could efficiently reuse things like large numerical arrays since it could be sure nothing else is going to try to use the original object and potentially get nonsense as result. On Thu, Nov 26, 2020 at 5:15 PM Guido van Rossum <guido@python.org> wrote:

...

On Thu, Nov 26, 2020 at 1:51 PM Bruce Leban <bruce@leban.us> wrote:

...
What might be useful are declarations that:

- The object is not modified (i.e., read only). If a function that declares a parameter as read-only passes that parameter to one that does not use that declaration, that can be identified as an error. - The object is destroyed or consumed by the function. Any use of the object after calling the function could be identified as an error.

I don't think Python is ready for const arguments (the first bullet). "Const propagation" can be a real problem.

The second seems to be what the OP is after, and I now can see the use case -- in fact in libraries for scientific computing and/or machine learning I think Iv'e seen are a bunch of hacks to imply transfer of ownership that might benefit from this. But I'm not yet sufficiently familiar with that field to be able to point you to examples. Hopefully there are readers here who can. (Nate?)

-- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...> _______________________________________________ Python-ideas mailing list -- python-ideas@python.org To unsubscribe send an email to python-ideas-leave@python.org https://mail.python.org/mailman3/lists/python-ideas.python.org/ Message archived at https://mail.python.org/archives/list/python-ideas@python.org/message/VPD2Q5... Code of Conduct: http://python.org/psf/codeofconduct/

-- Nate Lust, PhD. Astrophysics Dept. Princeton University

Ned Batchelder

5:12 p.m.

On 11/26/20 6:44 AM, 3mir33@gmail.com wrote:

...

Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d

d = {1: 2} f(d) print(d[1]) # mistake, using of moved value

Maybe I'm behind the times on the latest thinking in programming languages. I'm not familiar with "move semantics". Can someone explain what is wrong with the code above? What is the mistake? `print(d[1])` will work perfectly fine. The function changes the 'a' key, and then we access the 1 key. Should the example have used the same key in both places? Thanks, --Ned.

William Pickard

5:47 p.m.

To help you understand what they're requesting, do you at least understand C++'s move schematics. If not, here's a good place to read about it: https://www.fluentcpp.com/2018/02/06/understanding-lvalues-rvalues-and-their...

MRAB

6:02 p.m.

On 2020-11-27 01:12, Ned Batchelder wrote:

...

On 11/26/20 6:44 AM, 3mir33@gmail.com wrote:

...
Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d

d = {1: 2} f(d) print(d[1]) # mistake, using of moved value

Maybe I'm behind the times on the latest thinking in programming languages. I'm not familiar with "move semantics". Can someone explain what is wrong with the code above? What is the mistake? `print(d[1])` will work perfectly fine. The function changes the 'a' key, and then we access the 1 key. Should the example have used the same key in both places?

d is moved into function f. f does return d when it exits, moving it out back again, but as the caller doesn't bind to it, it's discarded. Move semantics is a way of ensuring that there's only one "owner", which makes garbage collection simpler; when an object is discarded (dereferenced) by its the owner, it can be garbage-collected.

Ned Batchelder

6:30 p.m.

On 11/26/20 9:02 PM, MRAB wrote:

...

On 2020-11-27 01:12, Ned Batchelder wrote:

...
On 11/26/20 6:44 AM, 3mir33@gmail.com wrote:

...
Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d

d = {1: 2} f(d) print(d[1]) # mistake, using of moved value

Maybe I'm behind the times on the latest thinking in programming languages. I'm not familiar with "move semantics". Can someone explain what is wrong with the code above? What is the mistake? `print(d[1])` will work perfectly fine. The function changes the 'a' key, and then we access the 1 key. Should the example have used the same key in both places?

d is moved into function f. f does return d when it exits, moving it out back again, but as the caller doesn't bind to it, it's discarded.

It's not discarded, it's still referenced by d in the outer scope. --Ned.

MRAB

7:41 p.m.

On 2020-11-27 02:30, Ned Batchelder wrote:

...

On 11/26/20 9:02 PM, MRAB wrote:

...
On 2020-11-27 01:12, Ned Batchelder wrote:

...
On 11/26/20 6:44 AM, 3mir33@gmail.com wrote:

...
Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d

d = {1: 2} f(d) print(d[1]) # mistake, using of moved value

Maybe I'm behind the times on the latest thinking in programming languages. I'm not familiar with "move semantics". Can someone explain what is wrong with the code above? What is the mistake? `print(d[1])` will work perfectly fine. The function changes the 'a' key, and then we access the 1 key. Should the example have used the same key in both places?

d is moved into function f. f does return d when it exits, moving it out back again, but as the caller doesn't bind to it, it's discarded.

It's not discarded, it's still referenced by d in the outer scope.

No, it's not any more, and that's the point. It was _moved_ into the function, and although the function returned it, it was discarded because the caller didn't bind it to keep hold of it.

Brendan Barnwell

8:39 p.m.

On 2020-11-26 19:41, MRAB wrote:

...

On 2020-11-27 02:30, Ned Batchelder wrote:

...
On 11/26/20 9:02 PM, MRAB wrote:

...
On 2020-11-27 01:12, Ned Batchelder wrote:

...
On 11/26/20 6:44 AM, 3mir33@gmail.com wrote:

...
Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d

d = {1: 2} f(d) print(d[1]) # mistake, using of moved value

Maybe I'm behind the times on the latest thinking in programming languages. I'm not familiar with "move semantics". Can someone explain what is wrong with the code above? What is the mistake? `print(d[1])` will work perfectly fine. The function changes the 'a' key, and then we access the 1 key. Should the example have used the same key in both places?

d is moved into function f. f does return d when it exits, moving it out back again, but as the caller doesn't bind to it, it's discarded.

It's not discarded, it's still referenced by d in the outer scope.

No, it's not any more, and that's the point. It was _moved_ into the function, and although the function returned it, it was discarded because the caller didn't bind it to keep hold of it.

This doesn't make sense as a description of the actual semantics of the Python code. The value isn't "moved", it is passed like all arguments, and the function mutates it, so the mutated dict IS accessible from the calling code. If we're talking about adding something to typing module that would make this raise an error in type checking, that's one thing, but (I hope) no one is suggesting that that would have any impact on what the code actually does. There's just (hypothetically) a type hint saying `d` "shouldn't" be used after the call. To say it "was moved" and "is not [referenced] any more" is incorrect. Those are just things that are type-hinted, not things that actually happen. If you have: def f(x: int) -> str: x = "hello" return 3 f("goodbye") . . .it's not correct to say that you "didn't" call f with a string value or that a new string value "wasn't" assigned or that an int "wasn't" returned. All of those things happened. They're a violation of the type hints, but those are just hints and what happens still happens. Type hints don't say anything about what happens, only what is "supposed" to happen. -- Brendan Barnwell "Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail." --author unknown

Guido van Rossum

8:45 p.m.

On Thu, Nov 26, 2020 at 7:45 PM MRAB <python@mrabarnett.plus.com> wrote:

...

...
It's not discarded, it's still referenced by d in the outer scope.

No, it's not any more, and that's the point. It was _moved_ into the function, and although the function returned it, it was discarded because the caller didn't bind it to keep hold of it.

Sounds like one of you is describing current semantics and the other is explaining the proposed new semantics. :-) -- --Guido van Rossum (python.org/~guido) *Pronouns: he/him **(why is my pronoun here?)* <http://feministing.com/2015/02/03/how-using-they-as-a-singular-pronoun-can-c...>

Ned Batchelder

3:25 a.m.

On 11/26/20 11:45 PM, Guido van Rossum wrote:

...

On Thu, Nov 26, 2020 at 7:45 PM MRAB <python@mrabarnett.plus.com <mailto:python@mrabarnett.plus.com>> wrote:

> It's not discarded, it's still referenced by d in the outer scope. > No, it's not any more, and that's the point. It was _moved_ into the function, and although the function returned it, it was discarded because the caller didn't bind it to keep hold of it.

Sounds like one of you is describing current semantics and the other is explaining the proposed new semantics. :-)

Yes, I see that now. As Chris points out elsewhere in the thread, this proposal would have the type annotations change the actual behavior of the code. Has this been done before? It also seems that it has to change the behavior of the caller of the function, when the compiler won't have access to the definition of the function. This proposal seems to run counter to a number of fundamental Python dynamics. --Ned.

Serhiy Storchaka

3:54 a.m.

...

On 11/26/20 11:45 PM, Guido van Rossum wrote: Yes, I see that now. As Chris points out elsewhere in the thread, this proposal would have the type annotations change the actual behavior of the code. No, it will not change the runtime behavior. But it can make a restriction which was in the documentation only to be enforced by the

27.11.20 13:25, Ned Batchelder пише: linter. If it is documented, that you should never use a dict after passing it as argument to f(), the code that uses it has a bug. With the proposed feature MyPy could warn you about this bug. It is discussable how much useful this feature is. Python is not C++, and some microoptimizations important in C++ are not worth in Python.

Ned Batchelder

4:44 a.m.

On 11/27/20 6:54 AM, Serhiy Storchaka wrote:

...

...
On 11/26/20 11:45 PM, Guido van Rossum wrote: Yes, I see that now. As Chris points out elsewhere in the thread, this proposal would have the type annotations change the actual behavior of the code. No, it will not change the runtime behavior. But it can make a restriction which was in the documentation only to be enforced by the

27.11.20 13:25, Ned Batchelder пише: linter.

If it is documented, that you should never use a dict after passing it as argument to f(), the code that uses it has a bug. With the proposed feature MyPy could warn you about this bug.

If I understand what you are saying, this would be a dramatic change in Python semantics which 1) would break many projects, and 2) wouldn't need a type annotation because it isn't something you could turn on and off. So mypy wouldn't warn you about it, pylint would. I think I mistook this proposal for a simple thing, when it is far from simple. Thanks for the clarification. --Ned.

Chris Angelico

8:53 p.m.

On Fri, Nov 27, 2020 at 2:46 PM MRAB <python@mrabarnett.plus.com> wrote:

...

On 2020-11-27 02:30, Ned Batchelder wrote:

...
On 11/26/20 9:02 PM, MRAB wrote:

...
On 2020-11-27 01:12, Ned Batchelder wrote:

...
On 11/26/20 6:44 AM, 3mir33@gmail.com wrote:

...
Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d

d = {1: 2} f(d) print(d[1]) # mistake, using of moved value

Maybe I'm behind the times on the latest thinking in programming languages. I'm not familiar with "move semantics". Can someone explain what is wrong with the code above? What is the mistake? `print(d[1])` will work perfectly fine. The function changes the 'a' key, and then we access the 1 key. Should the example have used the same key in both places?

d is moved into function f. f does return d when it exits, moving it out back again, but as the caller doesn't bind to it, it's discarded.

It's not discarded, it's still referenced by d in the outer scope.

No, it's not any more, and that's the point. It was _moved_ into the function, and although the function returned it, it was discarded because the caller didn't bind it to keep hold of it.

Currently, type hints and annotations don't change actual behaviour like this. Is this proposal to have it actually get discarded, or merely to have a type checker flag a warning if you continue to use it? I strongly disagree with the former. When you call a function, it doesn't matter what the function is, you just call it. It's the function's responsibility to do anything special. That means there can't be an unbind operation in the caller on the basis of a flag on the function. The latter is also more closely aligned with the way that Python currently handles "oops you used this after it should have been discarded". Consider:

...

...
...
d = {1:2, 3:4} i = iter(d) next(i) 1 del d[1] next(i) Traceback (most recent call last): File "<stdin>", line 1, in <module> RuntimeError: dictionary changed size during iteration i <dict_keyiterator object at 0x7f28c19244f0>

The iterator still exists, but has been internally flagged as "hey, this thing is broken now, don't use it". If that's the meaning of "move semantics" (imagine that "del d[1]" is replaced with some function that is listed as 'consuming' i), then that does at least make reasonable sense. ChrisA

Valentin Berlier

2:39 a.m.

This thread is a mess. Move semantics is nothing more than creating a shallow copy that steals the inner state of a previous instance. It's an optimization, and moving out of a variable never makes the previous instance unusable: void f1(std::vector<int>&& vec); void f2() { std::vector<int> vec; f1(std::move(vec)); vec.push_back(2); // This compiles just fine } Since move semantics are a pass-by-value optimization, they can't exist in python where calling any kind of function simply shares bindings. In python no matter how hard you try you'll never run into the "hollow" instances that move semantics leave behind. Some of the use-cases described in this thread are actually about borrow checking: making it possible to terminate the semantic lifetime of a variable by passing it to a function. Manually propagating lifetime restrictions through type annotations is definitely possible, and could be supported with a mypy plugin, but it will be very verbose, and without any kind of "constness" in the language the added safety will be pretty limited.

3mir33＠gmail.com

3:27 a.m.

Your example is specific to C++ only. In Rust it is not just an optimization, it's about ownership. Example from Rust: fn f1(vec: Vec<i32>) {} fn f2() { let mut vec = vec![1]; f1(vec); vec.push(2); // This will not compile (borrow of moved value), because f1 took ownership of vec }

Valentin Berlier

3:55 a.m.

The C++ example specifically shows that if you're talking about ownership and lifetimes, you're not talking about move semantics. As you pointed out, the example wouldn't work in Rust specifically because Rust has a borrow checker, and not just move semantics. A compiler with a borrow checker will perform move optimizations, which at runtime result in behavior similar to C++ move semantics. So I'm pointing out that in this thread, we're really talking about borrow checking with declarative lifetimes more than move semantics.

Serhiy Storchaka

3:56 a.m.

26.11.20 13:44, 3mir33@gmail.com пише:

...

Add something like Move type hint to typing module. It will tell the analyzer that the input parameter of the function is moved and can not be used after. For example: ``` def f(d: Move[dict]) -> dict: d['a'] = 2 return d

d = {1: 2} f(d) print(d[1]) # mistake, using of moved value

See also discussion "add a list.swap() method" https://mail.python.org/archives/list/python-ideas@python.org/thread/WIBV45J... That proposition was not passed, but if that feature be added, it would need a type annotation proposed in this thread.

1528

Age (days ago)

1529

Last active (days ago)

List overview

Download

38 comments

15 participants

participants (15)

2QdxY4RzWzUUiLuE＠potatochowder.com
3mir33＠gmail.com
Antoine Pitrou
Brendan Barnwell
Bruce Leban
Chris Angelico
Guido van Rossum
MRAB
nate lust
Ned Batchelder
Richard Damon
Serhiy Storchaka
Steven D'Aprano
Valentin Berlier
William Pickard

Move semantics

tags

participants (15)