
Step 1. get rid of + for strings, lists, etc. (string/list concatenation is not addition)
Step 2. add concatenation operator for strings, lists, and basically anything that can be iterated. effectively an overloadable itertools.chain. (list cat list = new list, not iterator, but effectively makes itertools.chain useless.)
Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23, 22 cat 33 = 2233, etc. if you need bitwise concatenation, you're already in bitwise "hack" land so do it yourself. (no idea why bitwise is considered hacky as I use it all the time, but oh well)
Step 4. make it into python 4, since it breaks backwards compatibility.

On Fri, Jun 30, 2017 at 9:33 AM, Soni L. <fakedme+py@gmail.com> wrote:
Nope. Practicality beats purity. Something like this exists in REXX ("+" means addition, and "||" means concatenation), and it doesn't help (though it's necessary there as REXX doesn't have distinct data types for strings and numbers). String concatenation might as well be addition. It's close enough and it makes perfect sense. Get a bunch of people together and ask them this: "If 5+6 means 11, then what does 'hello' + 'world' mean?". Most of them will assume it means concatenation. ChrisA

On Thu, Jun 29, 2017 at 08:33:12PM -0300, Soni L. wrote:
Step 1. get rid of + for strings, lists, etc. (string/list concatenation is not addition)
I agree that using + for concatenation is sub-optimal, and that & would be a better choice, but we're stuck with it. And honestly it's not *that* big a deal that I would break backwards compatibility for this. Fixing the "problem" is more of a pain than just living with it.
Chaining is not concatenation. Being able to concatenate two strings (or two tuples, two lists) and get an actual string rather than a chained iterator is a good thing.

    word = (stem + suffix).upper()

Being able to chain arbitrary iterables and get an iterator is also a good thing:

    chain(astring, alist, atuple)

If we had a chaining operator, it too would have to accept arbitrary iterables and return an iterator.
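The distinction above can be checked in current Python. This is only a small sketch; the variable values are made up for illustration and are not from the original message.

```python
from itertools import chain

stem, suffix = "walk", "ing"

# Concatenation keeps the operand type, so str methods remain available.
word = (stem + suffix).upper()

# Chaining accepts mixed iterables and returns a lazy iterator instead.
astring, alist, atuple = "ab", [1, 2], (3.0,)
chained = chain(astring, alist, atuple)

# The iterator has no str methods; it can only be consumed.
has_upper = hasattr(chained, "upper")
items = list(chained)
```

Here `word` is a real string, while `chained` only becomes a concrete list once it is consumed.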
Step 3. add decimal concatenation operator for numbers: 2 cat 3 == 23,
When would you need that? What use-case for concatenation of numbers is there, and why is it important enough to use an operator instead of a custom function? The second part is the most critical -- I'm sure there are uses for concatenating digits to get integers, although I can't think of any right now -- but ASCII operators are in short supply, so why would we use one for such a specialised and rarely used function? Things would be different if we had a dedicated concatenation operator: then we could allow things like 1 & '1' returning '11', say. But we don't, and I don't expect that allowing this is important enough to force the backwards compatibility break.
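For comparison, the "custom function" alternative might look roughly like the sketch below. The name `concat_digits` and the restriction to non-negative integers are assumptions made for this example; nothing like it was proposed in the thread.

```python
def concat_digits(a: int, b: int) -> int:
    """Hypothetical helper: join the decimal digits of a and b."""
    if a < 0 or b < 0:
        # Assumption for this sketch: the sign of a negative operand is undefined.
        raise ValueError("only non-negative integers are supported")
    return int(str(a) + str(b))


small = concat_digits(2, 3)
larger = concat_digits(22, 33)
```

A plain function like this needs no new syntax, which is the point being made about operators being in short supply.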
Python 4 will not be a major backwards incompatible version like Python 3 was. It will be just a regular evolutionary (rather than revolutionary) upgrade from 3.9. When I want to talk about major backwards incompatibilities, I talk about "Python 5000", by analogy to "Python 3000". -- Steve

On 2017-06-29 09:48 PM, Steven D'Aprano wrote:
astring cat alist is undefined for string (since strings are very specific about types), so it would return a list. alist cat atuple would return a list, because the list comes first. This is *EFFECTIVELY* equivalent to chaining, since iterating the results of these concatenations produces the *exact* same results as iterating their chainings. (And don't say "performance" - CPython has a GIL, and Python makes many convenience-over-performance tradeoffs like this.)
Since we'd have a concatenation operator, why not extend them to integers? No reason not to, really. In practice tho, it would never be used. This was never about integers, even if I did mention them.
This isn't a *major* backwards incompatibility. Unlike with unicode/strings, a dumb static analysis program can trivially replace + with the concatenation operator, whatever that may be. Technically, nothing forces us to remove + from strings and such and the itertools stuff - we could just make them deprecated in python 4, and remove them in python 5. (PS: I don't propose using literally "cat" for concatenation. That was just a placeholder.)

On Fri, Jun 30, 2017 at 12:14 PM, Soni L. <fakedme+py@gmail.com> wrote:
It wouldn't be quite that trivial though. If all you do is replace "+" with "&", you've broken anything that uses numeric addition. Since this is a semantic change, you can't defer it to run-time (the way a JIT compiler like PyPy could), and you can't afford to have it be "mostly right but might have edge cases" like something based on type hints would be. So there'd be some human work involved, as with the bytes/text distinction.

What you're wanting to do is take one operator ("+") and split it into two roles (addition and concatenation). That means getting into the programmer's head, so it can't be completely automated. Even if it CAN be fully automated, though, what would you gain? You've made str+str no longer valid - to what end?

Here's a counter-proposal: Start with your step 2, and create a new __concat__ magic method and corresponding operator. Then str gets a special case:

    class str(str): # let's pretend
        def __concat__(self, other):
            return self + str(other)

And tuple gets a special case:

    class tuple(tuple): # pretend again
        def __concat__(self, other):
            return *self, *other

And maybe a few others (list, set, possibly dict). For everything else, object() will handle them:

    class object(object): # mind-bending
        def __concat__(self, other):
            return itertools.chain(iter(self), iter(other))

Since this isn't *changing* the meaning of anything, it's backwards compatible. You gain an explicit concatenation operator, the default case is handled by Python's standard mechanisms, the special cases are handled by Python's standard mechanisms, and it's all exactly what people would expect. Then the use of '+' to concatenate strings can be deprecated without removal (or, more likely, kept fully supported by the language but deprecated in style guides), and you've mostly achieved what you sought.

Your challenge: Find a suitable operator to use. It wants to be ASCII, and it has to be illegal syntax in current Python versions. It doesn't have to be a single character, but it should be short (two is okay, three is the number thou shalt stop at, four thou shalt not count, and five is right out) and easily typed, since string concatenation is incredibly common. It should ideally evoke "concatenation", but that isn't strictly necessary (the link between "@" and "matrix multiplication" is tenuous at best). Good luck. :)

For my part, I'm -0.5 on my own counter-proposal, but that's a fair slab better than the -1000 that I am on the version that breaks backward compatibility for minimal real gain. ChrisA
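Since no such operator or `__concat__` hook exists in Python, the counter-proposal can only be approximated today. The sketch below uses an ordinary function, `concat`, as a hypothetical stand-in for the operator; the dispatch order and the extra `list` case are assumptions based on the description above.

```python
import itertools


def concat(left, right):
    """Hypothetical stand-in for the proposed operator and its __concat__ hook."""
    if isinstance(left, str):
        # str special case: coerce the right operand to text.
        return left + str(right)
    if isinstance(left, tuple):
        # tuple special case: build a new tuple from both operands.
        return (*left, *right)
    if isinstance(left, list):
        # One of the "few others": a list on the left yields a list.
        return [*left, *right]
    # Default case: fall back to lazy chaining of any two iterables.
    return itertools.chain(iter(left), iter(right))


text = concat("a", 1)
pair = concat((1,), [2])
merged = concat([1], (2,))
lazy = list(concat(range(2), "x"))
```

Because existing uses of `+` are untouched, a helper like this changes no current behaviour, which mirrors the backwards-compatibility argument made in the message.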

On Thu, Jun 29, 2017 at 11:14:46PM -0300, Soni L. wrote:
This would be strongly unacceptable to me. If iterating over the items was the *only* thing people ever did with sequences, that *might* be acceptable, but it isn't. We do lots of other things, depending on what the sequence is:

- we sort lists, append to them, slice them, etc;
- we convert strings to uppercase, search them, etc.

It is important to know that if you concatenate something to a string, it will either give a string, or noisily fail, rather than silently convert to a different type that doesn't support string operations. Now admittedly that rule can be broken by third-party classes (since they can overload operators to do anything), but that's more of a problem in theory than in practice. Your suggestion would make it a problem for builtins as well as (badly-behaved?) third-party classes.

For when you don't care about the type, you just want it to be an iterator, that's where chaining is useful, and whether it is a chain function or a chain operator, it should accept any iterable and return an iterator. Concatenation is not the same as general chaining, although they are related. Concatenation should return the same type as its operands. Chaining can just return an arbitrary iterator.
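The "give a string, or noisily fail" behaviour can be observed with the builtins as they are today. In this sketch, `try_concat` is a hypothetical helper that exists only to capture the exception for inspection.

```python
def try_concat(a, b):
    """Hypothetical helper: return a + b, or the TypeError it raises."""
    try:
        return a + b
    except TypeError as exc:
        return exc


# Same-type operands give back the same type, so str methods still work.
same_type = try_concat("abc", "def").upper()

# Mixed operands fail loudly instead of silently becoming a list.
mixed = try_concat("abc", [1, 2])
failed_noisily = isinstance(mixed, TypeError)
```

Under the proposal being criticised, the second call would instead return a list, and a later `.upper()` call would fail far from the place where the types were mixed.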
(And don't say "performance" - CPython has a GIL, and Python makes many convenience-over-performance tradeoffs like this.)
Are you aware that CPython doesn't just have a GIL because the core devs think it would be funny to slow the language down? The GIL actually makes CPython faster: so far, all attempts to remove the GIL have made CPython slower. So your "performance" tradeoff goes the other way: without the GIL, Python code would be slower. (Perhaps the Gilectomy will change that in the future, but at the moment, it is fair to say that the GIL is an optimization that makes Python faster, not slower.)
Since we'd have a concatenation operator, why not extend them to integers? No reason not to, really.
That's the wrong question. Never ask "why not add this to the language?", the right question is "why should we add this?". We don't just add bloat and confusing, useless features to the language because nobody can think of a reason not to. Features have a cost: they cost developer effort to program and maintain, they cost effort to maintain the tests and documentation and to fix bugs, they cost users effort to learn about them and deal with them. Every feature has to pay its own way: the benefits have to outweigh the costs. -- Steve

On 30 Jun 2017, at 03:14, Soni L. <fakedme+py@gmail.com> wrote:
This isn't a *major* backwards incompatibility. Unlike with unicode/strings, a dumb static analysis program can trivially replace + with the concatenation operator, whatever that may be. Technically, nothing forces us to remove + from strings and such and the itertools stuff - we could just make them deprecated in python 4, and remove them in python 5.
No it can’t, not unless you’re defining concatenation as identical to numeric addition (which I saw in your original post you are not). For example:

    def mymethod(a, b):
        return a + b

What should the static analysis program do here? Naturally, it’s unclear. The only way to be even remotely sure in the current Python world where type hinting is optional and gradual is to do what PyPy does, which is to run the entire program and JIT it, and even then PyPy puts in guards to confirm that it doesn’t get caught out if and when an assumption is wrong.

So yes, I’d say this is at least as bad as the unicode/bytes divide in terms of static analysis: unless you make type hinting mandatory for any function including the symbol “+”, there is no automatic transformation that can be made here. Cory
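To make the ambiguity concrete, the sketch below calls the same `mymethod` with numbers, strings and lists, then inspects the syntax tree a rewriting tool would see. The call arguments are made up for illustration.

```python
import ast


def mymethod(a, b):
    return a + b


# One function body, three different meanings chosen at run time.
numeric = mymethod(5, 6)
text = mymethod("hello", "world")
sequence = mymethod([1], [2])

# The source contains a single Add node with no type information attached.
tree = ast.parse("a + b", mode="eval")
operator_name = type(tree.body.op).__name__
```

A tool working only from the source sees the same `Add` node in every case, so it has no way to tell which calls would need the new operator.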

On 30 June 2017 at 09:33, Soni L. <fakedme+py@gmail.com> wrote:
Step 4. make it into python 4, since it breaks backwards compatibility.
If a Python 4.0 ever happens, it will abide by the usual feature release compatibility restrictions (i.e. anything that it drops will have gone through programmatic deprecation in preceding 3.x releases). This means there won't be any abrupt changes in syntax or semantics the way there were for the 3.0 transition. http://www.curiousefficiency.org/posts/2014/08/python-4000.html goes into more detail on that topic (although some time after I wrote that article, we decided that there probably *will* just be a 3.10, rather than switching the numbering to 4.0) Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia

Just as an aside, if a concatenation operator *was* included, a suitable operator would be "++"; this is the concatenation operator in languages like Haskell (for strings) and the majority of Scala cases. Alternatively there is "<>", the monoidal append operator in Haskell, which retains a certain similarity. I suggest these purely for their accepted usage, which means they should be more reasonable to identify. Jamie On 30 Jun 2017 12:35 pm, "Victor Stinner" <victor.stinner@gmail.com> wrote:

On Fri, Jun 30, 2017 at 12:51:26PM +0100, Jamie Willis wrote:
Just as an aside, if a concatenation operator *was* included, a suitable operator would be "++",
As mentioned earlier in this thread, that is not possible in Python as syntactically `x ++ y` would be parsed as `x + (+y)` (the plus binary operator followed by the plus unary operator).
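The parse described above can be confirmed with the standard `ast` module. This is a small check rather than anything from the original message.

```python
import ast

# Parse the expression and look at the tree the compiler would see.
expr = ast.parse("x ++ y", mode="eval").body

is_binary_add = isinstance(expr, ast.BinOp) and isinstance(expr.op, ast.Add)
right_is_unary_plus = (
    isinstance(expr.right, ast.UnaryOp) and isinstance(expr.right.op, ast.UAdd)
)

# With numbers this is already valid Python today: 1 + (+2).
value = 1 ++ 2
```

Because `x ++ y` already has a meaning, giving it a new one would silently change the behaviour of existing code.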
"<>" is familiar to many people as "not equal" in various programming languages, including older versions of Python. I'm not entirely sure what connection "<>" has to append, it seems pretty arbitrary to me, although in fairness nearly all operators are arbitrary symbols if you go back far enough. -- Steve

On Fri, Jun 30, 2017 at 03:10:08PM +0200, "Sven R. Kunze" <srkunze@mail.de> wrote:
'+' is the perfect concat operator. I love Python for this feature.
+1 from me <bigwink>
Regards, Sven
Oleg. -- Oleg Broytman http://phdru.name/ phd@phdru.name Programmers don't die, they just GOSUB without RETURN.

On 6/30/2017 9:24 AM, Oleg Broytman wrote:
and me. I think extending it to chain iterators is an intriguing idea. It would not be the first time syntax was implemented with more than one special method. When the boolean value of an object is needed, first .__bool__, then .__len__ are used. iter() first tries .__iter__, then .__getitem__. When counts are expressed in their original unary notation, addition is concatenation. If one thinks of a sequence as a unary representation of its length*, then concatenation is addition. *This is a version of the mathematical idea of cardinal number. Whether intentionally or by accident, or perhaps, whether by analysis or intuition, I think Guido got this one right. -- Terry Jan Reedy
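Both fallbacks mentioned above can be seen with two small classes. `Sized` and `Indexed` are hypothetical names invented for this sketch.

```python
class Sized:
    """Defines only __len__, so truth testing falls back to the length."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n


class Indexed:
    """Defines only __getitem__, so iter() falls back to indexing from 0."""

    def __init__(self, data):
        self.data = data

    def __getitem__(self, index):
        # The IndexError raised past the end is what stops the iteration.
        return self.data[index]


empty_is_false = bool(Sized(0))
nonempty_is_true = bool(Sized(3))
letters = list(iter(Indexed("abc")))

# Unary counting: concatenating tallies of 2 and 3 gives a tally of 5.
tally = len([1] * 2 + [1] * 3)
```

The last line illustrates the unary-notation remark: when a number is represented by the length of a sequence, concatenating sequences adds the numbers.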

participants (11)
- Chris Angelico
- Clint Hepner
- Cory Benfield
- Jamie Willis
- Nick Coghlan
- Oleg Broytman
- Soni L.
- Steven D'Aprano
- Sven R. Kunze
- Terry Reedy
- Victor Stinner