string/re behaviour not consistent with null string

Inyeol Lee inyeol_lee at yahoo.com
Tue Aug 6 21:56:45 EDT 2002


I've seen a long thread on '' in 'abc' in Python-Dev, and checked the null
string behavior of other string methods and re functions. Here is what I found:

1. Many of them assume a null string between normal characters and
   at the start/end of the string:

   "abc".count("") -> 4
   "abc".endswith("") -> 1
   "abc".find("") -> 0
   "abc".index("") -> 0
   "abc".rfind("") -> 3
   "abc".rindex("") -> 3
   "abc".startswith("") -> 1

   re.search("", "abc").span() -> (0, 0)
   re.match("", "abc").span() -> (0, 0)
   re.findall("", "abc") -> ['', '', '', '']
   re.sub("", "_", "abc") -> '_a_b_c_'
   re.subn("", "_", "abc") -> ('_a_b_c_', 4)
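
The checks in item 1 can be reproduced with a short script. This is only a
sketch, assuming a current Python 3 interpreter rather than the 2.2 used
above, so startswith/endswith print True instead of 1. The dictionary names
are illustrative and not from the original post.

```python
import re

s = "abc"

# str methods: the null string is found at each of the len(s) + 1 positions
str_results = {
    "count": s.count(""),
    "find": s.find(""),
    "index": s.index(""),
    "rfind": s.rfind(""),
    "rindex": s.rindex(""),
    "startswith": s.startswith(""),
    "endswith": s.endswith(""),
}

# re functions: the empty pattern matches at every position as well
re_results = {
    "search": re.search("", s).span(),
    "match": re.match("", s).span(),
    "findall": re.findall("", s),
    "sub": re.sub("", "_", s),
    "subn": re.subn("", "_", s),
}

for name, value in {**str_results, **re_results}.items():
    print(name, "->", value)
```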

2. Some of them raise an exception:

   "" in "abc"
   "abc".replace("", "_")
   "abc".split("")
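
Which of these calls raises depends on the Python version, so a small probe
is handy for checking a given interpreter. This is a minimal sketch; `probe`
and `checks` are hypothetical names, not anything from the standard library.

```python
def probe(call):
    """Run a zero-argument callable; report its value or the exception type."""
    try:
        return ("ok", call())
    except Exception as exc:
        return ("raised", type(exc).__name__)

# Each entry wraps one of the three expressions from item 2 in a lambda
# so that the exception, if any, is raised inside probe().
checks = {
    '"" in "abc"': lambda: "" in "abc",
    '"abc".replace("", "_")': lambda: "abc".replace("", "_"),
    '"abc".split("")': lambda: "abc".split(""),
}

results = {label: probe(call) for label, call in checks.items()}
for label, outcome in results.items():
    print(label, "->", outcome)
```

On a current Python 3 interpreter only the split() call still raises
(ValueError); the other two return True and '_a_b_c_'.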

3. One of them ignores the null string match:

   re.split("", "abc") -> ['abc']

(I couldn't test re.finditer, but it seems to behave the same as re.findall.)
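
For comparison, a split that does not ignore null string matches can be
sketched on top of re.finditer. `split_including_empty` is a hypothetical
helper written for illustration, not part of the re module, and it assumes
that re.finditer reports a zero-width match at every position for the empty
pattern, as re.findall does in item 1.

```python
import re

def split_including_empty(pattern, string):
    """Split string at every match of pattern, zero-width matches included."""
    pieces = []
    last = 0
    for m in re.finditer(pattern, string):
        # Keep the text between the previous match and this one.
        pieces.append(string[last:m.start()])
        last = m.end()
    # Keep whatever follows the final match.
    pieces.append(string[last:])
    return pieces

print(split_including_empty("", "abc"))
print(split_including_empty(",", "a,b"))
```

With the empty pattern this yields a leading and a trailing null string
around the individual characters, which matches the "null string between
characters and at the start/end" view taken by the functions in item 1.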


It looks like "" in "abc" will return True instead of raising an exception
in 2.3. Then how about changing the others too? Since s.replace() and s.split()
currently raise an exception, breakage of existing code might not be that
significant. For re.split(), maybe it's too late to change...

Inyeol


