[ python-Bugs-1388489 ] bug in rstrip & lstrip

SourceForge.net noreply at sourceforge.net
Fri Dec 23 02:23:25 CET 2005


Bugs item #1388489, was opened at 2005-12-23 01:43
Message generated for change (Comment added) made by doerwalter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1388489&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Jason Whitlark (jcdelta)
Assigned to: Nobody/Anonymous (nobody)
Summary: bug in rstrip & lstrip

Initial Comment:
quick detail:
<snip>
Python 2.4.2 (#1, Dec  9 2005, 22:48:42)
[GCC 3.3.6 (Gentoo 3.3.6, ssp-3.3.6-1.0, pie-8.7.8)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> "net.tpl".rstrip('.tpl')
'ne'
>>> "foo.tpl".rstrip('.tpl')
'foo'
</snip>

I ran the following code to test this:
<snip>
26 - [jwhitlark at Snowflake]: ~/pythonBugTest
0> cat testForRStripBug.py
#! /usr/bin/python

for word in open('/opt/openoffice/share/dict/ooo/en_US.dic', 'r'):
    word = word.split('/')[0]
    testWord = (word + '.tpl').rstrip('.tpl')
    if word != testWord:
        print word, testWord
</snip>

And came up with the attached file of incorrect
matches.  Out of 62075 words in the en_US.dic, 6864 do
not match.  Here is the frequency count of the last
letter of the original word, the only pattern I could
discern so far:
<snip>
0> ./freqCount.py < run1
{'p': 566, 'l': 2437, 't': 3861}
</snip>
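
freqCount.py itself is not included in the report. A rough sketch of
what such a count might look like, assuming each line of run1 holds the
original word followed by the stripped word (the output format of the
test script above). It is written in Python 3 syntax, and the function
name last_letter_counts is hypothetical:

```python
from collections import Counter


def last_letter_counts(lines):
    """Count the last letter of the first field on each non-empty line.

    Hypothetical stand-in for freqCount.py; the real script is not shown.
    """
    counts = Counter()
    for line in lines:
        fields = line.split()
        if fields:
            # fields[0] is the original word; tally its final character.
            counts[fields[0][-1]] += 1
    return dict(counts)


if __name__ == "__main__":
    sample = ["net ne", "cat ca", "help he"]
    print(last_letter_counts(sample))
```

To mimic the invocation above, the function could be fed sys.stdin
instead of the sample list.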

No other letters seem to be clipped.  Why this should
be so, I have no idea.  I would guess that the error
was in function do_xstrip in
python/trunk/Objects/stringobject.c, but C is not my
strong suit.  I will be looking at it further when I
have time, but if anyone knows how to fix this, please
help.

----------------------------------------------------------------------

>Comment By: Walter Dörwald (doerwalter)
Date: 2005-12-23 02:23

Message:

This is not a bug. The documentation
(http://docs.python.org/lib/string-methods.html) says that:
"The chars argument is a string specifying the set of
characters to be removed", i.e. "net.tpl".rstrip(".tpl")
strips every ".", "t", "p" and "l" character from the right
end of the string, *not* every occurrence of the character
sequence ".tpl". This seems to be a frequent
misunderstanding, so if you can suggest improvements to the
docstring or the documentation, please do so.
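
A minimal sketch of the difference, in Python 3 syntax. The helper
strip_suffix is a hypothetical name used only for illustration; it is
not part of the standard library, and it is only one possible way to
remove a literal suffix:

```python
def strip_suffix(text, suffix):
    """Remove `suffix` once from the end of `text`, if present.

    Hypothetical helper for illustration only.
    """
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text


if __name__ == "__main__":
    # rstrip treats its argument as a set of characters, so the order
    # of the characters in the argument makes no difference.
    print("net.tpl".rstrip(".tpl"))         # ne
    print("net.tpl".rstrip("lpt."))         # ne
    # Removing the literal suffix leaves the rest of the word intact.
    print(strip_suffix("net.tpl", ".tpl"))  # net
    print(strip_suffix("foo.tpl", ".tpl"))  # foo
```

On Python 3.9 and later, str.removesuffix provides the same behaviour
as this helper.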

----------------------------------------------------------------------
