[Python-bugs-list] [ python-Bugs-223261 ] string.split

noreply@sourceforge.net noreply@sourceforge.net
Tue, 03 Dec 2002 10:56:01 -0800


Bugs item #223261, was opened at 2000-11-23 17:38
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=223261&group_id=5470

Category: Python Library
Group: Not a Bug
Status: Closed
Resolution: Invalid
Priority: 5
Submitted By: Dimitri Papadopoulos (papadopo)
Assigned to: Nobody/Anonymous (nobody)
Summary: string.split

Initial Comment:
Hi,

The following program:

#!/usr/bin/python
import string
print len(string.split('', ' '))
print len(string.split(' ', ' '))

prints:
1
2

This is of course sort of an undefined case, but I think it should be:
0
0
or at the very least:
0
1

Perl always prints 0.
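
For readers on a current interpreter, the two calls above can be reproduced roughly as follows. This is only a sketch: it assumes the str.split method behaves like the module-level string.split function from the report when an explicit separator is given.

```python
# Sketch: str.split stands in for string.split (an assumption made for
# illustration; the report itself uses the string module).
empty_parts = "".split(" ")   # split the empty string on a blank
blank_parts = " ".split(" ")  # split a single blank on a blank

print(len(empty_parts), empty_parts)
print(len(blank_parts), blank_parts)
```

The lengths match the 1 and 2 reported above: the first call yields one empty field, the second yields an empty field on each side of the separator.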


----------------------------------------------------------------------

Comment By: David Jeske (jeske)
Date: 2002-12-03 18:56

Message:
Here is a better definition of the problem. string.split and 
string.join are not reciprocal in this case.

import string

a = []

# join an empty list, then split the result on the same separator
b = string.split(string.join(a, ","), ",")

assert a == b, "this is not reciprocal"

Currently, b is equal to [""]
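
One way a caller might get the round trip described above is a small wrapper that maps the empty string to the empty list. The name split_fields is hypothetical, and the sketch uses the str methods rather than the string module.

```python
def split_fields(text, sep):
    """Hypothetical helper: like text.split(sep), but '' maps to []."""
    if text == "":
        return []
    return text.split(sep)


fields = []
# Joining an empty list gives "", which the helper maps back to [].
round_trip = split_fields(",".join(fields), ",")
print(round_trip == fields)
```

Note the trade-off: ",".join([""]) is also the empty string, so no split can tell [] and [""] apart after joining. The helper simply picks [] for that ambiguous input.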



----------------------------------------------------------------------

Comment By: David Jeske (jeske)
Date: 2002-12-03 18:17

Message:
The fact that it is documented does not make it useful. I 
wouldn't argue for Perl's strange behavior; however, it is 
strange that splitting an empty string behaves differently 
when providing an explicit split parameter...

string.split("")  --> []
string.split("",",") --> [""]

This behavior is both non-intuitive and non-useful... for 
example, this loop is bad:

import string

a = "1,2,3"
b = ","
# each field is converted with int(); an empty field would fail here
for a_item in string.split(a, b):
  print int(a_item)

Because if a is empty and b is not None, string.split 
returns [""], and int("") raises ValueError, so you have to 
handle the empty string as a special case. This should not 
be the case. 

Doing string.split("",B) should ALWAYS result in the empty 
list.
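
Under the behavior as documented, a loop like the one above could guard against the empty field itself. The following is a sketch, with parse_ints as a hypothetical helper name and the str.split method assumed in place of string.split.

```python
def parse_ints(text, sep=","):
    """Hypothetical helper: parse separated integers, skipping empty fields."""
    return [int(field) for field in text.split(sep) if field]


print(parse_ints("1,2,3"))
print(parse_ints(""))
```

Skipping empty fields also silently accepts input such as "1,,3", so a stricter caller may prefer to test for the empty string once, before splitting.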



----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2000-11-26 23:01

Message:
Closed as Not-a-Bug; see /F's remarks.

Note that Perl special-cases the snot out of a single blank used as a separator:  "As a special case, specifying a PATTERN of space (' ') will split on white space just as split() with no arguments does. Thus, split(' ') can be used to emulate awk's default behavior, whereas split(/ /) will give you as many null initial fields as there are leading spaces. A split() on /\s+/ is like a split(' ') except that any leading whitespace produces a null first field. A split() with no arguments really does a split(' ', $_) internally."

That's why Perl acts so differently -- it's not splitting on a single blank at all!  Try splitting on / / in Perl and see what happens (which *really* splits on a blank):

@a = split / /, " ";
@b = split / /, "";
$a = @a;
$b = @b;
print "$a $b\n";
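
Python draws a similar line between whitespace splitting and splitting on a literal blank, which may help when comparing the two languages. A short sketch using the str.split method (the sample string is made up for illustration):

```python
text = "  a b "  # made-up sample with leading and trailing blanks

# No separator: split on runs of whitespace, dropping empty fields.
whitespace_fields = text.split()

# Explicit separator: every single blank delimits a field.
literal_fields = text.split(" ")

print(whitespace_fields)
print(literal_fields)
```

The first call gives the two words only; the second keeps an empty field for each leading or trailing blank.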


----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2000-11-25 17:47

Message:
not a bug: it's behaving exactly as defined by the documentation:

If the second argument sep is present and not None, it specifies a string to be used as the word separator. The returned list will then have one more item than the number of non-overlapping occurrences of the separator in the string.
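
The documented rule can be checked directly. This sketch assumes the str.split and str.count methods; expected_length is a hypothetical name, and the rule only applies to a non-empty separator.

```python
def expected_length(text, sep):
    """Item count predicted by the documented rule (sep must be non-empty)."""
    # str.count counts non-overlapping occurrences of sep.
    return text.count(sep) + 1


samples = ["", " ", "a b", " a  b "]
lengths = [len(text.split(" ")) for text in samples]
predicted = [expected_length(text, " ") for text in samples]
print(lengths == predicted)
```

For the two inputs in the original report, the rule predicts 0 + 1 = 1 item and 1 + 1 = 2 items, which is what was observed.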

</F>

----------------------------------------------------------------------
