[Python-bugs-list] [ python-Bugs-223261 ] string.split
noreply@sourceforge.net
Tue, 03 Dec 2002 10:56:01 -0800
Bugs item #223261, was opened at 2000-11-23 17:38
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=223261&group_id=5470
Category: Python Library
Group: Not a Bug
Status: Closed
Resolution: Invalid
Priority: 5
Submitted By: Dimitri Papadopoulos (papadopo)
Assigned to: Nobody/Anonymous (nobody)
Summary: string.split
Initial Comment:
Hi,
The following program:
#!/usr/bin/python
import string
print len(string.split('', ' '))
print len(string.split(' ', ' '))
prints:
1
2
This is of course sort of an undefined case, but I think it should be:
0
0
or at the very least:
0
1
Perl always prints 0.
----------------------------------------------------------------------
Comment By: David Jeske (jeske)
Date: 2002-12-03 18:56
Message:
Logged In: YES
user_id=7266
Here is a better definition of the problem. string.split and
string.join are not reciprocal in this case.
a = []
b = string.split(string.join(a,","),",")
assert a == b, "this is not reciprocal"
Currently, b is equal to [""]
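The asymmetry can be reproduced in a few lines. This is a minimal sketch that assumes a modern Python, where the `str.join` and `str.split` methods stand in for `string.join` and `string.split`; the helper name `round_trip` is hypothetical.

```python
def round_trip(items, sep=","):
    """Join a list of strings with sep, then split the result on sep."""
    return sep.join(items).split(sep)

# Non-empty lists survive the round trip unchanged.
print(round_trip(["1", "2", "3"]))  # ['1', '2', '3']

# The empty list does not: joining gives "", and "".split(",") gives [""].
print(round_trip([]))  # ['']
```

Note that `[]` and `[""]` both join to the empty string, so no definition of split can return the original list for both inputs.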
----------------------------------------------------------------------
Comment By: David Jeske (jeske)
Date: 2002-12-03 18:17
Message:
Logged In: YES
user_id=7266
The fact that it is documented does not make it useful. I
wouldn't argue for Perl's strange behavior, however, it is
strange that splitting an empty string behaves differently
when providing an explicit split parameter...
string.split("") --> []
string.split("",",") --> [""]
This behavior is both non-intuitive, and non-useful... for
example, this loop is bad:
a = "1,2,3"
b = ","
for a_item in string.split(a,b):
    print int(a_item)
Because if a is empty and b is not None, then there is an
edge case where you have to handle an empty string falling
through. This should not be the case.
Doing string.split("",B) should ALWAYS result in the empty
list.
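The behavior requested here can be had without changing the library by wrapping the call. A sketch only, assuming the modern `str.split` method; `split_fields` is a hypothetical name, not a standard library function.

```python
def split_fields(text, sep=","):
    """Split text on sep, but return [] when text is empty."""
    if text == "":
        return []
    return text.split(sep)

for raw in ("1,2,3", ""):
    # The empty string yields no items, so int("") is never attempted.
    print([int(item) for item in split_fields(raw)])
```

The wrapper changes only the empty-string case; an input such as "," still produces two empty fields.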
----------------------------------------------------------------------
Comment By: Tim Peters (tim_one)
Date: 2000-11-26 23:01
Message:
Closed as Not-a-Bug; see /F's remarks.
Note that Perl special-cases the snot out of a single blank used as a separator: "As a special case, specifying a PATTERN of space (' ') will split on white space just as split() with no arguments does. Thus, split(' ') can be used to emulate awk's default behavior, whereas split(/ /) will give you as many null initial fields as there are leading spaces. A split() on /\s+/ is like a split(' ') except that any leading whitespace produces a null first field. A split() with no arguments really does a split(' ', $_) internally."
That's why Perl acts so differently -- it's not splitting on a single blank at all! Try splitting on / / in Perl and see what happens (which *really* splits on a blank):
@a = split / /, " ";
@b = split / /, "";
$a = @a;
$b = @b;
print "$a $b\n";
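For comparison, Python draws the same line between whitespace splitting and splitting on a literal blank, but selects between them by the presence or absence of the separator argument. A short sketch, assuming the modern `str.split` method in place of `string.split`:

```python
# No separator: split on runs of whitespace, dropping empty fields.
print("".split())        # []
print(" ".split())       # []
print(" a  b ".split())  # ['a', 'b']

# Explicit single blank: every blank is a field boundary.
print("".split(" "))        # ['']
print(" ".split(" "))       # ['', '']
print(" a  b ".split(" "))  # ['', 'a', '', 'b', '']
```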
----------------------------------------------------------------------
Comment By: Nobody/Anonymous (nobody)
Date: 2000-11-25 17:47
Message:
not a bug: it's behaving exactly as defined by the documentation:
If the second argument sep is present and not None, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string.
</F>
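The documented rule can be checked directly against the two cases in the original report. A minimal sketch, assuming modern Python string methods; `field_count` is a hypothetical helper name.

```python
def field_count(text, sep):
    """Number of items the documented rule predicts for text.split(sep)."""
    # str.count counts non-overlapping occurrences of sep.
    return text.count(sep) + 1

for text in ("", " ", "a b", "  "):
    fields = text.split(" ")
    assert len(fields) == field_count(text, " ")
    print(repr(text), len(fields))
```

For '' and ' ' this prints lengths 1 and 2, matching the output in the initial comment.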
----------------------------------------------------------------------