Inconsistency in splitting strings

Hey there,

This is my first letter here, so I'll try my best to explain the problem. I have run into this "problem" several times: I find str.split() a bit confusing. It is a corner case, but one that can come up every day.

    print("".split())        # prints []
    print("".split(" "))     # prints ['']
    print("".split("\t"))    # prints ['']
    print("".split("\n"))    # prints ['']
    print("".splitlines())   # prints []

So whether split() is called with or without a separator matters a lot, even when the separator is one of the whitespace characters that split() uses by default. I think it is quite annoying.

My idea is to return a list with an empty string in all cases mentioned above.

Best,
Gabor

On 10/20/2020 11:52 AM, Antal Gábor wrote:
With probably millions (at least!) of uses of str.split in real code, there's no way we can change this behavior. You might be able to argue for an additional parameter to control this, but I personally think the chances of confusion are too high.

What I usually do is write a wrapper for str.split if I don't like how it's behaving.

Eric
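A minimal sketch of the kind of wrapper Eric mentions, assuming the behavior wanted is an empty list for an empty input no matter which separator is passed. The name split_or_empty is hypothetical and does not come from the thread.

```python
def split_or_empty(text, sep=None, maxsplit=-1):
    """Hypothetical wrapper: like str.split, but "" always gives []."""
    if text == "":
        return []
    # Non-empty input is passed through to str.split unchanged.
    return text.split(sep, maxsplit)


print(split_or_empty(""))           # []
print(split_or_empty("", " "))      # []
print(split_or_empty("a b", " "))   # ['a', 'b']
```

Only the empty-string case is special-cased here; a string such as " " still goes through str.split with the given separator.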

Interesting.

I agree that this is inconsistent and confusing (and I'm quite curious how the implementation ended up this way).

But I have literally NEVER been bitten by this -- perhaps it's because I WAS bitten by it way back when, and then started the habit of ignoring empty strings before I split() -- I have a lot of code like:

    line = line.strip()
    if line:
        # do the splitting, or whatever
        ...

But as Eric says -- it is way too late to change this now -- at least the default behavior.

-CHB

On Tue, Oct 20, 2020 at 9:08 AM Eric V. Smith <eric@trueblade.com> wrote:

--
Christopher Barker, PhD

Python Language Consulting
  - Teaching
  - Scientific Software Development
  - Desktop GUI and Web Development
  - wxPython, numpy, scipy, Cython

On Tue, Oct 20, 2020 at 8:57 AM Antal Gábor <antalgabor1993@gmail.com> wrote:
My idea is to return a list with an empty string in all cases mentioned above.
This will never be fixed in 3.x, but if it's fixed in The Version That Must Not Be Named, my preference would be that they all return [] because then it's easy to write "or ['']" if you want the other behavior, while it's much more of a pain to fix the other way around.
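The asymmetry described above can be sketched as follows. Both helper names are hypothetical and exist only for illustration; the first relies on an empty list being falsy, the second shows why "or" does not help in the opposite direction.

```python
def split_keep_one(text):
    # split() with no separator returns [] for empty input; an empty list
    # is falsy, so "or" substitutes [''] in a single expression.
    return text.split() or [""]


def split_drop_empty(text, sep):
    # split(sep) returns [''] for empty input, and [''] is truthy,
    # so an explicit check on the input is needed instead of "or".
    return text.split(sep) if text else []


print(split_keep_one(""))          # ['']
print(split_keep_one("a b"))       # ['a', 'b']
print(split_drop_empty("", ","))   # []
print(split_drop_empty("a,b", ","))  # ['a', 'b']
```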

For some explanation, see this StackOverflow answer by Raymond Hettinger: https://stackoverflow.com/a/16645307/11461120
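As background, the Python documentation for str.split describes two different rules depending on whether a separator is given. A short illustration of that documented behavior, with made-up sample strings and hypothetical variable names:

```python
# No separator (sep=None): runs of whitespace count as a single separator,
# and leading/trailing whitespace is ignored, so no empty strings appear.
whitespace_mode = "  a   b  ".split()
empty_whitespace_mode = "".split()

# Explicit separator: every occurrence delimits a field, so consecutive
# separators and empty input both produce empty strings.
explicit_mode = "a,,b".split(",")
explicit_space = "a  b".split(" ")
empty_explicit_mode = "".split(",")

print(whitespace_mode)        # ['a', 'b']
print(empty_whitespace_mode)  # []
print(explicit_mode)          # ['a', '', 'b']
print(explicit_space)         # ['a', '', 'b']
print(empty_explicit_mode)    # ['']
```

Seen this way, "".split(" ") returning [''] is the explicit-separator rule applied to a string that contains zero separators and therefore exactly one (empty) field.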

participants (5)
- Antal Gábor
- Ben Rudiak-Gould
- Christopher Barker
- Dennis Sweeney
- Eric V. Smith