[Patches] [ python-Patches-421214 ] splitlist() for raw and unicode strings

noreply@sourceforge.net noreply@sourceforge.net
Fri, 04 May 2001 12:23:49 -0700


Patches item #421214, was updated on 2001-05-03 18:29
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=305470&aid=421214&group_id=5470

Category: core (C code)
Group: None
Status: Closed
Resolution: Rejected
Priority: 5
Submitted By: Gustavo Niemeyer (niemeyer)
Assigned to: M.-A. Lemburg (lemburg)
Summary: splitlist() for raw and unicode strings

Initial Comment:
This patch implements a feature that I have wanted in Python
for a long time: a splitlist() method for normal and unicode
strings.

This allows one to do something like:

"one, two and three".splitlist([",", "and"])

and get:

["one", " two ", " three"]
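
For readers without the patch at hand, the behaviour described above can be approximated in pure Python. This is only a minimal sketch: the patch itself is not shown in this thread, so the name splitlist_sketch and the details (scan left to right, first matching separator in list order wins, empty separators ignored) are assumptions, not the patch's actual semantics.

```python
def splitlist_sketch(text, seps):
    """Split text at every occurrence of any separator in seps.

    Hypothetical stand-in for the proposed str.splitlist(); the exact
    rules of the real patch are assumed, not known.
    """
    parts = []
    start = 0   # beginning of the piece currently being collected
    pos = 0     # current scan position
    while pos < len(text):
        for sep in seps:
            # Skip empty separators so the scan always makes progress.
            if sep and text.startswith(sep, pos):
                parts.append(text[start:pos])
                pos += len(sep)
                start = pos
                break
        else:
            # No separator starts here; move one character forward.
            pos += 1
    parts.append(text[start:])
    return parts


print(splitlist_sketch("one, two and three", [",", "and"]))
# ['one', ' two ', ' three']
```

Under these assumptions surrounding whitespace is preserved, so the last item keeps the space that follows "and".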


----------------------------------------------------------------------

>Comment By: Gustavo Niemeyer (niemeyer)
Date: 2001-05-04 12:23

Message:
Logged In: YES 
user_id=7887

Yes, I knew this could be done... but this is *many* times
slower than splitlist(). Short code isn't necessarily fast
code (especially in a high-level language). Btw, what is
splitlines() doing there? ;-)

Anyway... you know what's best for the language.

Thanks!

----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-05-04 11:41

Message:
Logged In: YES 
user_id=38388

Sorry, but the feedback I got from python-dev is too
negative to check this patch in.

Here's the alternative code which pretty much does the same
thing using a function (by Fredrik Lundh):

import re

def tokenize(string, seps):
    return re.split("|".join(map(re.escape, seps)), string)

The good news is that you will probably be able to subclass
strings in one of the next releases (perhaps even Python
2.2).
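
A short usage sketch of the regular-expression approach may help here. One detail worth knowing is that regex alternation tries its alternatives from left to right, so when one separator is a prefix of another (for example "-" and "->"), the shorter one can shadow the longer one. The variant below sorts the separators longest first; the name tokenize_longest_first is hypothetical and the sorting step is an addition for illustration, not part of the code quoted above.

```python
import re


def tokenize_longest_first(string, seps):
    """Split string on any of seps using a single regular expression."""
    # Longest separators first, so "->" is tried before "-".
    ordered = sorted(seps, key=len, reverse=True)
    # re.escape makes characters such as "." or "|" match literally.
    pattern = "|".join(map(re.escape, ordered))
    return re.split(pattern, string)


print(tokenize_longest_first("one, two and three", [",", "and"]))
# ['one', ' two ', ' three']
print(tokenize_longest_first("a.b|c", [".", "|"]))
# ['a', 'b', 'c']
print(tokenize_longest_first("x->y-z", ["-", "->"]))
# ['x', 'y', 'z']
```

When no separator is a prefix of another, the sorting makes no difference and the result matches the two-line function above.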

----------------------------------------------------------------------

Comment By: Gustavo Niemeyer (niemeyer)
Date: 2001-05-04 06:53

Message:
Logged In: YES 
user_id=7887

Ok. Thanks.

About the method name, I'm just trying to follow the naming
convention already there. There are already methods named
split() and splitlines(), so I thought splitlist() would be
a good "brother".


----------------------------------------------------------------------

Comment By: M.-A. Lemburg (lemburg)
Date: 2001-05-04 00:54

Message:
Logged In: YES 
user_id=38388

I like this idea, but will have to check with the code bloat
police first.

BTW, I'd rename .splitlist() to .tokenize() since that's
what the method is really about (it is very similar to C's
strtok()).

----------------------------------------------------------------------
