pretty strange behavior of "strip"
MRAB
google at mrabarnett.plus.com
Fri Dec 5 10:59:10 EST 2008
rdmurray at bitdance.com wrote:
> On Thu, 4 Dec 2008 at 20:54, Terry Reedy wrote:
>>> 'toc.html'
>>> >>> test[4].strip('.html')
>>> 'oc'
>>>
>>> Can't figure out what is going on, really.
>>
>> What I can't figure out is why, when people cannot figure out what is
>> going on with a function (or method in this case), they do not look
>> it up in the doc. (If you are an exception and did, what confused you?)
>> Can you enlighten me?
>
> I'm a little embarrassed to admit this, since I've been using python for
> many years, but until I read these posts I did not understand how strip
> used its string argument, and I _have_ read the docs. I can't tell you
> what confused the OP, but I can tell you what confused me.
>
> I have often wished that in 'split' I could specify a _set_ of characters
> on which the string would be split, in the same way the default list
> of whitespace characters causes a split where any one (or more) of
> them appears. But instead the string argument is a multi-character
> separator. (Which is sometimes useful and I wouldn't want to lose the
> ability to specify a multi-character separator!)
>
> My first experience in using the string argument was with split, so when I
> ended up using it with strip, by analogy I assumed that the string passed
> to strip would also be a multi-character string, and thus stripped only
> if the whole string appeared exactly. Reading the documentation did
> not trigger me to reconsider that assumption. I guess I'm just lucky that
> I haven't run into any bugs (but I think I've used the string argument
> to strip only once or twice in my career).
>
> It would be lovely if both the split and strip methods would have a
> second string argument that would use the string in the opposite sense
> (as a set for split, as a sequence match for strip).
>
> In the meantime the docs could be clarified by replacing:
>
> the characters in the string will be stripped
>
> with
>
> all occurrences of any of the characters in the string will be
> stripped
>
> --RDM
>
> PS: the OP might want to look at the os.path.splitext function.
>
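For anyone reading this later, here is a small sketch of the behaviour
discussed above: strip() treats its argument as a set of characters,
while split() treats it as one multi-character separator. The variable
names and the sample strings other than 'toc.html' are made up for
illustration, and using re.split() with a character class is just one
way to split on a set of characters, not something proposed in the
thread.

```python
import os.path
import re

name = 'toc.html'

# strip() removes any leading/trailing characters found in the argument,
# so the leading 't' goes as well as the trailing '.html'.
stripped = name.strip('.html')

# os.path.splitext() separates the extension as a whole unit.
root, ext = os.path.splitext(name)

# Removing an exact suffix by hand: check for it, then slice it off.
suffix = '.html'
without_suffix = name[:-len(suffix)] if name.endswith(suffix) else name

# split() with an argument uses the whole string as the separator...
whole_sep = 'a, b, c'.split(', ')

# ...whereas a regex character class splits on any one (or more) of a
# set of characters.
char_set = re.split(r'[,;]+', 'a,b;;c')

print(stripped, root, ext, without_suffix, whole_sep, char_set)
```

Here stripped comes out as 'oc' (as in the original post), while root
and without_suffix are both 'toc'.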
If I had thought about it early enough I could have suggested that in
Python 3 split() and strip() should accept either a string or a set of
strings. It's still possible to extend split() in the future, but
changing the behaviour of strip() with a string argument would break
existing code, something which might have been OK as part of changes in
Python 3. Unfortunately I don't have access to the time machine! :-)
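
As a rough sketch of what "a set of strings" could mean in practice,
the two helpers below do it in user code. The names split_any and
strip_any are hypothetical (they are not part of Python), and the exact
semantics, such as stripping repeatedly from both ends, are my own
assumption rather than a worked-out proposal.

```python
import re

def split_any(text, separators):
    """Split text on any of the given separator strings (hypothetical helper).

    Assumes separators is a non-empty collection of non-empty strings.
    """
    # Try longer separators first so that 'ab' is preferred over 'a'.
    ordered = sorted(separators, key=len, reverse=True)
    pattern = '|'.join(re.escape(sep) for sep in ordered)
    return re.split(pattern, text)

def strip_any(text, affixes):
    """Remove any of the given whole strings from both ends, repeatedly
    (hypothetical helper)."""
    affixes = [a for a in affixes if a]  # ignore empty strings
    changed = True
    while changed:
        changed = False
        for affix in affixes:
            if text.startswith(affix):
                text = text[len(affix):]
                changed = True
            if text.endswith(affix):
                text = text[:-len(affix)]
                changed = True
    return text

print(split_any('a, b;c', {', ', ';'}))
print(strip_any('toc.html', {'.html'}))
```

With these, strip_any('toc.html', {'.html'}) gives 'toc', since the
leading 't' is no longer treated as a member of a character set.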