[Python-bugs-list] [ python-Feature Requests-681533 ] Additional string stuff.

SourceForge.net noreply@sourceforge.net
Wed, 05 Mar 2003 19:03:12 -0800


Feature Requests item #681533, was opened at 2003-02-06 03:14
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=355470&aid=681533&group_id=5470

Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Jeremy Fincher (jemfinch)
Assigned to: Nobody/Anonymous (nobody)
Summary: Additional string stuff.

Initial Comment:
In a lot of my programs that mess with strings, I end up 
somewhere making a variable "ascii" via 
(string.maketrans('', '')).  It's just a 256-character string 
comprising the ascii character set. 
 
I use it, oftentimes, simply to be able to turn the 'translate' 
method on strings into a 'delete' method -- if I want to, 
say, remove all the spaces in aString, I'd do 
(aString.translate(ascii, string.whitespace)). 
 
Certainly an ascii variable in the string module couldn't 
hurt, would fit in with ascii_letters, etc. and would at least 
standardize the name of the full ascii set sure to be 
present in many Python programs. 
 
A little further out there, but I think just as useful, would 
be a delete method on strings.  So "foo bar 
baz".delete(string.whitespace) would return "foobarbaz".  
It would be equivalent to "foo bar baz".translate(ascii, 
string.whitespace), or the wildly inefficient: 
 
def delete(s, deleteChars): 
    l = [] 
    for c in s: 
        if c not in deleteChars: 
            l.append(c) 
    return ''.join(l) 
 
Anyway, that's all I can think of.  Do with it what you will. 
 
Jeremy 
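
For readers on a current interpreter, here is a minimal sketch of the proposed
delete behaviour. It assumes Python 3, which postdates this thread; the names
delete and delete_slow are illustrative helpers, not existing string methods.

```python
import string


def delete(s, delete_chars):
    """Return s with every character in delete_chars removed."""
    # str.maketrans('', '', chars) maps each listed character to None,
    # which str.translate treats as "drop this character".
    return s.translate(str.maketrans('', '', delete_chars))


def delete_slow(s, delete_chars):
    """Character-by-character equivalent, kept only for comparison."""
    return ''.join(c for c in s if c not in delete_chars)


result = delete("foo bar baz", string.whitespace)
print(result)
```

Both helpers should agree on any input; the translate-based one avoids the
per-character Python loop.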

----------------------------------------------------------------------

>Comment By: Jeremy Fincher (jemfinch)
Date: 2003-03-05 22:03

Message:
Logged In: YES 
user_id=99508

Ah, yes, you're right.  I never knew ASCII was only 7 bits per 
character.  Perhaps string.all_characters?  I just definitely think 
it should be publicly available so there can be some 
consistency between applications that need a string of all 256 
8-bit characters. 
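
As a rough illustration of the "string of all 256 8-bit characters" idea, the
sketch below assumes Python 3, where 8-bit data lives in bytes objects. The
name all_characters is the hypothetical one floated above, not an attribute of
the string module.

```python
# Hypothetical name from the comment above; not part of the string module.
all_characters = bytes(range(256))

data = b"foo bar baz"

# An identity table maps every byte to itself, so translate only deletes.
via_table = data.translate(all_characters, b" ")

# Passing None as the table also means "no mapping, delete only".
via_none = data.translate(None, b" ")

print(via_table, via_none)
```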

----------------------------------------------------------------------

Comment By: Jack Jansen (jackjansen)
Date: 2003-03-05 09:13

Message:
Logged In: YES 
user_id=45365

Note that "ascii" is definitely a bad name, as it means different things to different people. To Python it usually means "7-bit ASCII" (as in the unicode "ascii" codec). To you it apparently means "8-bit something-or-other".

I have no opinion on whether this feature is a good idea, but if it is I would suggest a name with "any" or "all" in it, and possibly "8bit" too.
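
The 7-bit reading of "ascii" can be checked directly against the codec. This is
a small sketch assuming Python 3; is_seven_bit is an illustrative helper, not a
library function.

```python
def is_seven_bit(data):
    """Return True if every byte in data decodes under the 'ascii' codec."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError:
        return False
    return True


low_half = bytes(range(128))    # byte values 0-127
full_range = bytes(range(256))  # every 8-bit value

low_ok = is_seven_bit(low_half)
full_ok = is_seven_bit(full_range)
print(low_ok, full_ok)
```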

----------------------------------------------------------------------

Comment By: Jeremy Fincher (jemfinch)
Date: 2003-03-04 13:25

Message:
Logged In: YES 
user_id=99508

What's the status on this? 
 
I checked string.py, and there actually already is a value 
that's the 256 ASCII characters, but it's called _idmap.  Might 
we consider changing the name of that to "ascii"?  I'd be 
happy to make the patch. 
 
Jeremy 

----------------------------------------------------------------------

Comment By: Jeremy Fincher (jemfinch)
Date: 2003-02-07 01:47

Message:
Logged In: YES 
user_id=99508

I guess that's what I get for not reading the documentation :) 
 
Oh well, the other two suggestions stand :) 
 
Jeremy 

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2003-02-06 21:15

Message:
Logged In: YES 
user_id=31435

Just noting that you can pass None for sep if you want to 
explicitly ask for the default behavior.

>>> "      a  b   c".split(None, 1)
['a', 'b   c']
>>>
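
To make the contrast concrete, the sketch below compares sep=None with an
explicit single-space separator on the same input. It assumes Python 3; the
variable names are illustrative.

```python
text = "      a  b   c"

# sep=None: runs of whitespace count as one separator and leading
# whitespace is skipped before the first field.
by_whitespace = text.split(None, 1)

# sep=" ": every single space is a separator, so the first field is
# the empty string that precedes the first space.
by_space = text.split(" ", 1)

print(by_whitespace)
print(by_space)
```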


----------------------------------------------------------------------

Comment By: Jeremy Fincher (jemfinch)
Date: 2003-02-06 04:19

Message:
Logged In: YES 
user_id=99508

Let me also make one other suggestion:  the split method on a 
string object has, by default, behavior that can't be replicated 
by passing an argument as a separator.  That is, the default 
separator acts like re.split(r'\s+'), but it's impossible to pass 
any value into the method to achieve that same result. 
 
The problem arises when a user wants to use the maxsplit 
parameter to the method.  Because maxsplit is a positional 
parameter instead of a keyword parameter, the user *must* 
declare a separator to split on, and thus loses the ability to split 
on whitespace-in-general.  If maxsplit were changed from being 
a positional parameter to being a keyword parameter, then a 
programmer wouldn't have to give up the default behavior of 
the split method in order to pass it a maxsplit. 
 
At present, negative maxsplit values don't differ in any way 
from split's default behavior (with no maxsplit parameter given).  
Thus, the keyword maxsplit could default to -1 with no 
breakage of code.  I can't see any place where changing 
maxsplit to a keyword parameter would break any existing 
code. 
 
Jeremy 
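
A short sketch of the behaviours discussed here, assuming Python 3, where
str.split does accept maxsplit as a keyword. It also shows one place where the
default split and re.split(r'\s+') differ: leading whitespace produces an empty
first field in the regex version.

```python
import re

text = " a  b   c"

default = text.split()             # whitespace-in-general, no limit
negative = text.split(None, -1)    # negative maxsplit means "no limit"
keyword = text.split(maxsplit=1)   # keyword form; default separator kept
regex = re.split(r"\s+", text)     # keeps an empty field before the leading space

print(default, negative, keyword, regex)
```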

----------------------------------------------------------------------
