On 2020-11-27 at 10:38:06 +0200,
Serhiy Storchaka
26.11.20 19:07, Guido van Rossum wrote:
This reminds me of something in C++. Does it exist in other languages?
Indeed, this is a term from C++. In older C++ you could pass an argument by value, which makes a copy, or by reference, which is equivalent to passing a pointer by value plus syntactic sugar. Passing by value may be inefficient if you need to copy more than several bytes. Passing by reference has other problems. It was a source of errors, so passing a temporary object created on the stack by non-constant reference was forbidden. If you need to modify an argument passed by constant reference inside the function, you need to create a copy, even if the object is never used after being passed to the function (as in foo(bar()): the value returned by bar() cannot be used outside of foo(), so modifying it could be allowed). To resolve this problem, move semantics were introduced.
Do you have a more realistic example where this would catch an important type error?
I have seen and used the following example (with variations) multiple times:

    class Client:
        def send(self, msg, token, headers=None):
            if headers is None:
                headers = {}
            # headers = headers.copy()
            headers["Authorization"] = f"Bearer {token}"
            headers["Content-Type"] = "text/plain"
            headers["X-Hash"] = md5sum(msg.encode())
            with self.request('POST', self.url, headers=headers, data=msg) as resp:
                ...

    client.send('Hello!', mytoken, headers={'X-Type': 'Spam'})

The function sets some standard headers, but you can also pass additional headers. If you only pass a newly created dictionary, the function can modify it in place because its value is not used anywhere else. But if you create the dict once and use it in other places, modifying it inside the function is an undesired side effect. You need to make a copy in the function just because you don't know how it will be used in the user code. If you document that the argument cannot be used after calling the function and add some marks for static type checking, you could avoid copying.
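The aliasing problem described above can be seen in a small sketch. The function names add_auth_in_place and add_auth_copy are hypothetical stand-ins for the send method, with the network request left out:

```python
def add_auth_in_place(token, headers=None):
    """Hypothetical helper: mutates the caller's dict (no defensive copy)."""
    if headers is None:
        headers = {}
    headers["Authorization"] = f"Bearer {token}"
    return headers


def add_auth_copy(token, headers=None):
    """Hypothetical helper: copies first, so the caller's dict is untouched."""
    headers = dict(headers or {})
    headers["Authorization"] = f"Bearer {token}"
    return headers


shared = {"X-Type": "Spam"}  # created once, reused elsewhere

add_auth_copy("abc", shared)
print(shared)  # {'X-Type': 'Spam'}

add_auth_in_place("abc", shared)
print(shared)  # {'X-Type': 'Spam', 'Authorization': 'Bearer abc'}
```

With a dict literal passed directly at the call site, the two helpers are indistinguishable to the caller; the difference only shows up when the dict outlives the call.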
Personally, I think Python has little need for such a feature. Copied dicts and lists are usually small, and copying them adds little overhead compared with the other overhead Python already has. And if you need to pass and modify a large collection, documenting this should be enough.
For an opposing point of view, why not write that method as follows:

    class Client:
        def send(self, msg, token, additional_headers=None):
            headers = {"Authorization": f"Bearer {token}",
                       "Content-Type": "text/plain",
                       "X-Hash": md5sum(msg.encode()),
                       **(additional_headers or {})}
            ...

Then I only modify a just-created dictionary and never some dictionary of questionable origin and/or future.
Obviously, there's a difference in behavior if additional_headers contains Authorization or X-Hash or Content-Type, but I can't say which behavior is correct without more context, and I can't say which is more efficient (in time or in space) without knowing how often additional_headers is not None or how big it usually is.
I come from old(er) school (1980s, 1990s) embedded systems, and who "owns" a particular mutable data structure and how/where it gets mutated always came up long before we wrote any code. No, I'm not claiming that pre-ANSI C and assembler are more productive or less runtime-error-prone than newer languages, but is this feature only necessary because "modern" software development no longer includes a design phase or adequate documentation?
I understand the advantages of automated systems (the compiler and/or static analyzer) over manual ones (code review, documentation, writing unit tests), but ISTM that if you're writing code and don't know whether or not you can mutate a given dictionary, or you're calling a function and don't know whether or not that function might mutate that dictionary, then the battle is already lost.
Memory management implementation details are a long way from executable pseudocode. (30 years is a long time, too.)
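The difference in behavior mentioned above, when the caller's headers contain a key such as Content-Type, can be sketched as follows. The helper names are hypothetical, the X-Hash header is left out, and the request itself is omitted:

```python
def headers_assigned_last(token, headers=None):
    # Hypothetical stand-in for the first version: the standard headers are
    # assigned after the caller's, so they overwrite caller-supplied keys.
    headers = dict(headers or {})  # copy, to keep this example side-effect free
    headers["Authorization"] = f"Bearer {token}"
    headers["Content-Type"] = "text/plain"
    return headers


def headers_unpacked_last(token, additional_headers=None):
    # Hypothetical stand-in for the second version: the caller's headers are
    # unpacked last in the dict display, so they overwrite the standard ones.
    return {"Authorization": f"Bearer {token}",
            "Content-Type": "text/plain",
            **(additional_headers or {})}


extra = {"Content-Type": "application/json"}
print(headers_assigned_last("abc", extra)["Content-Type"])  # text/plain
print(headers_unpacked_last("abc", extra)["Content-Type"])  # application/json
```

Unpacking additional_headers first in the dict display, before the standard keys, would give the standard headers priority instead, so either behavior is available without mutating the caller's dictionary.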