[Python-bugs-list] [ python-Feature Requests-801847 ] Adding rsplit() to string and unicode objects.

Mon Sep 22 10:41:37 EDT 2003

Feature Requests item #801847, was opened at 2003-09-06 19:52
Message generated for change (Settings changed) made by rhettinger
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=801847&group_id=5470

>Category: None
>Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Sean Reifschneider (jafo)
>Assigned to: Nobody/Anonymous (nobody)
Summary: Adding rsplit() to string and unicode objects.

Initial Comment:
I'm attaching patches to the library and documentation
for implementing rsplit() on string and unicode
objects.  This works like split(), but works from the
right.

   ./python -c 'print u"foo, bar, baz".rsplit(None, 1)'
   [u'foo, bar,', u'baz']

This was supposed to be against the CVS code, but I've
had a heck of a time getting it checked out -- my
checkout has been hung for half an hour now.

The code patch is against the 2.3 release; the docs
patch is against the CVS.  My checkout got to docs, but
I didn't have the code to a point where I could build
and test it.

Sean

----------------------------------------------------------------------

Comment By: Jeremy Fincher (jemfinch)
Date: 2003-09-22 08:10

Message:
Logged In: YES 
user_id=99508

As a comment on the ease with which a programmer can get rsplit
wrong, note that rhettinger's rsplit implementation is not correct:
compare rsplit('foobarbaz', 'bar') with 'foobarbaz'.split('bar').
He forgot to reverse the separator if it's not None.
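
To make the point above concrete, here is a minimal sketch that puts the
posted recipe next to a variant that also reverses the separator. The names
rsplit_recipe and rsplit_fixed are hypothetical, chosen only for this
illustration, and the code is written for a current Python 3 interpreter
rather than the 2.3 release discussed in the thread.

```python
def rsplit_recipe(s, sep=None, maxsplit=-1):
    # The recipe as posted: reverse the string, split from the left,
    # then reverse the list and each chunk. The separator is passed
    # through unchanged, which is the bug being pointed out.
    return [chunk[::-1] for chunk in s[::-1].split(sep, maxsplit)[::-1]]


def rsplit_fixed(s, sep=None, maxsplit=-1):
    # Same approach, but a non-None separator is reversed as well so
    # that it can match inside the reversed string.
    rsep = sep[::-1] if sep is not None else None
    return [chunk[::-1] for chunk in s[::-1].split(rsep, maxsplit)[::-1]]


print(rsplit_recipe('foobarbaz', 'bar'))       # ['foobarbaz']
print(rsplit_fixed('foobarbaz', 'bar'))        # ['foo', 'baz']
print(rsplit_fixed('foo, bar, baz', None, 1))  # ['foo, bar,', 'baz']
```

The unreversed separator goes unnoticed with single-character or
palindromic separators, which is probably why the original example with
sep=None looked fine.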

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-09-11 18:55

Message:
Logged In: YES 
user_id=80475

Guido, do you care to pronounce on this one?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-09-11 01:07

Message:
Logged In: YES 
user_id=21627

There is PEP 2, which suggests writing a library PEP for
proposals to extend the library. Now, this probably would be
overkill for a single string method. However, I feel that
there are already too many string methods, so I won't accept
that patch. I'm not rejecting it, either, because I see that
other maintainers might have a different opinion. In short,
you should propose your change to python-dev, finding out
what "a majority" of the maintainers thinks; you might also
propose it on python-list, trying to collect reactions from
users. It would then be good to summarize these discussions
here (instead of carrying them out here).

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2003-09-10 15:40

Message:
Logged In: YES 
user_id=81797

I realize that rsplit() can be implemented, because, well, I
implemented it.

The standard library is there to provide ready-to-use
functionality so that users of Python can concentrate on
their program instead of on re-inventing the
wheel.  find() can be implemented with a short loop, split()
can be implemented with find(), join() can be implemented
with a short loop.  Many things can be implemented with a
little additional effort on the part of the user to develop
or locate the code they're wanting.

These little things can add up quickly and can have quite a
dramatic impact on the programming experience in Python.
Having to find or implement these functions distracts
from the code at hand and costs time spent finding,
implementing, testing, and maintaining the code in question.

One of Python's strengths is a rich standard library.  So,
what are the guidelines for determining when it's rich
enough?  Why is it ok to suggest that users should get
distracted from their code to go implement something else?
Is there a policy that I'm not aware of that new
functionality should be put in the cookbook instead of the
standard library?  Why is it being ignored that some
programmers would find implementing rsplit() challenging?

I'm not trying to be difficult here, I honestly can't
understand the apparent change from having a rich library to
a "batteries not included" stance.  The response I got from
#python when I mentioned having submitted the patch
indicates to me that other experienced Python developers
expect there to be an rsplit().

So, why is there so much resistance to adding something to
the library?  What are the guidelines for determining if
something should be in the library?

Sean

----------------------------------------------------------------------

Comment By: Raymond Hettinger (rhettinger)
Date: 2003-09-10 14:35

Message:
Logged In: YES 
user_id=80475

I would classify this more as a technique than a fundamental
string operation implemented by all stringlike objects
(including UserString).  Accordingly, I recommend that the
patch be closed and a recipe posted in the ASPN cookbook -
something along the lines of:

>>> def rsplit(s, sep=None, maxsplit=-1):
...     return [chunk[::-1] for chunk in s[::-1].split(sep, maxsplit)[::-1]]
...
>>> rsplit(u"foo, bar, baz", None, 1)
[u'foo, bar,', u'baz']

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2003-09-10 14:15

Message:
Logged In: YES 
user_id=81797

os.path.basename/os.path.dirname is an example of where you
could use rsplit. One of the other #python folks said he had
recently wanted rsplit for an application where he was
getting the domain name and user part from a list of e-mail
addresses, but found that some entries contained an "@" in
the user part.
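
As a rough illustration of the two use cases mentioned above, the sketch
below assumes an interpreter in which str.rsplit() is available as a
built-in method (it was added in Python 2.4, after this discussion). The
helper names split_address and split_path are hypothetical, and the sample
address and path are made up for the example.

```python
def split_address(address):
    # Split on the LAST "@" so that an "@" inside the user part
    # stays with the user part.
    user, domain = address.rsplit("@", 1)
    return user, domain


def split_path(path):
    # A simplified dirname/basename pair for "/"-separated paths that
    # contain at least one "/". os.path handles many more edge cases;
    # this only shows the right-to-left split.
    head, tail = path.rsplit("/", 1)
    return head, tail


print(split_address("odd@name@example.com"))  # ('odd@name', 'example.com')
print("odd@name@example.com".split("@", 1))   # ['odd', 'name@example.com']
print(split_path("usr/local/lib/python"))     # ('usr/local/lib', 'python')
```

The second print shows what a left-to-right split with maxsplit=1 does to
the same address: the domain ends up glued to part of the user name.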

Sean

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-09-10 12:08

Message:
Logged In: YES 
user_id=21627

I questioned the usefulness because I could not think of a
meaningful application. Now I see what a potential
application could be, but I doubt its generality, because
that approach would break if there could be two fields that
have commas in them.

I also disagree that symmetry can motivate usefulness: I
also doubt that all of the r* functions are useful, but they
cannot be removed for backwards compatibility. The fact that
rsplit would fit together with the other r* functions
indicates that adding rsplit would provide symmetry, not
that it would provide usefulness.

----------------------------------------------------------------------

Comment By: Sean Reifschneider (jafo)
Date: 2003-09-07 19:56

Message:
Logged In: YES 
user_id=81797

Can you provide more details about why the usefulness of
this function is in question?

First I would like to tell you the story of it coming to be,
then I will answer your incomplete question with a
(probably) incomplete answer.  I had a device which sent me
comma-separated fields, but one of the fields in the middle
could contain a comma.  The answer that seemed obvious to me
was to use split with a maxsplit to get the fields up to
that field, and then an rsplit with a maxsplit on the
remainder.  When I mentioned on #python that I was
implementing rsplit, 4 other Python users replied
right away that they had been wanting it.

To answer your question, it's useful because people using
strings are used to having r*() functions like rfind and
rstrip.  The lack of rsplit is kind of glaring in this
context.  Really, though, it's useful because otherwise
people have to implement it themselves -- often badly.
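
The split-then-rsplit approach described above could look roughly like
the following sketch. It assumes an interpreter where str.rsplit() exists
as a built-in (Python 2.4 and later); the function name parse_record, its
parameters, and the sample record are hypothetical and not taken from the
actual device.

```python
def parse_record(line, n_head, n_tail):
    # Peel off the first n_head fields from the left...
    parts = line.split(",", n_head)
    head, rest = parts[:n_head], parts[n_head]
    # ...then the last n_tail fields from the right. Whatever remains
    # in the middle is the one field that is allowed to contain commas.
    tail_parts = rest.rsplit(",", n_tail)
    return head + tail_parts


record = "7,temp,reading a, b, c,ok,99"
print(parse_record(record, 2, 2))
# ['7', 'temp', 'reading a, b, c', 'ok', '99']
```

This only works when exactly one field may contain the separator and the
number of fields on either side of it is known in advance.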

Sean

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2003-09-07 14:49

Message:
Logged In: YES 
user_id=21627

Why is this function useful?

----------------------------------------------------------------------
