[Python-Dev] "".tokenize() ?

M.-A. Lemburg mal@lemburg.com
Sat, 05 May 2001 10:13:30 +0200


Tim Peters wrote:
> 
> [MAL]
> > Gustavo Niemeyer submitted a patch which adds a tokenize like
> > method to strings and Unicode:
> >
> > "one, two and three".tokenize([",", "and"])
> > -> ["one", " two ", "three"]
> >
> > I like this method -- should I review the code and then check it in ?
> 
> -1 here.  Easily enough done via other means, and you just *know* different
> people will want different variants of tokenization (e.g., nobody in their
> right mind will want " two " coming back from that example, and, given that
> it does, that it doesn't also return " three" is baffling).

Ok. I rejected the patch with a mild suggestion to tackle this by
subclassing strings in Python 2.2 ;-)
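
A rough sketch of what the subclassing route could look like, written for
present-day Python rather than 2.2. The class name TokenString and the
strip option are hypothetical and are not taken from Gustavo's patch; the
separators are matched as plain substrings, so "and" would also split
inside a word such as "sandwich".

```python
import re


class TokenString(str):
    """Hypothetical str subclass; not the API from the rejected patch."""

    def tokenize(self, separators, strip=False):
        # With no separators there is nothing to split on.
        if not separators:
            return [str(self)]
        # Try longer separators first so overlapping ones match greedily.
        ordered = sorted(separators, key=len, reverse=True)
        pattern = "|".join(re.escape(sep) for sep in ordered)
        tokens = re.split(pattern, self)
        if strip:
            # Optional variant: drop the whitespace around each token.
            tokens = [token.strip() for token in tokens]
        return tokens


text = TokenString("one, two and three")
raw_tokens = text.tokenize([",", "and"])
clean_tokens = text.tokenize([",", "and"], strip=True)
```

Here raw_tokens keeps the surrounding blanks on every token, and
clean_tokens is the stripped variant, which shows how easily each user
can pick the tokenization flavour they want.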

> > PS: Haven't gotten any response regarding the .decode() method yet...
> > should I take this as "no objections" ?
> 
> +1 from me:  it's the other half of the existing .encode() method, and the
> current lack of symmetry is icky.

Right.

If I hear no strong objections, I'll check in the .decode()
method next week.
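
To illustrate the symmetry, here is a minimal round trip as it looks in
present-day Python 3, where .encode() lives on str and .decode() on bytes.
This is an assumption about the modern API, not the 2.2 interface under
discussion, and the sample text is made up.

```python
# Sample text with non-ASCII characters (made up for this sketch).
text = "Gr\u00fc\u00dfe"

# .encode() turns the text into bytes using the named codec ...
encoded = text.encode("utf-8")

# ... and .decode() is the other half, turning the bytes back into text.
decoded = encoded.decode("utf-8")
```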

-- 
Marc-Andre Lemburg
______________________________________________________________________
Company & Consulting:                           http://www.egenix.com/
Python Software:                        http://www.lemburg.com/python/