[Tutor] string codes

Thu Nov 28 02:34:28 CET 2013

On 27/11/13 21:20, spir wrote:

>> py> s.startswith("bcd", 1, -1) and s.endswith("bcd", 1, -1)
>> True
>
> Hum, I don't understand the reasoning, here.
> * First, why use the end-index param? (Here in this case, or in any
> other)? It contradicts the idea of starting with in my view, but also is
> useless for sub-equality checking.

Not so. If you are looking for a substring and already know the string
ends with that substring, you want the end index to exclude the known
match at the end. And it is still a startswith because you are checking
from the start of the slice given by the start and end indexes.

One use case would be in checking a list of filenames that have a fixed 
format. For example, I usually name my files with a common tag followed 
by a number followed by the type:

CuteKitten0001.jpg to CuteKitten0130.jpg

Now I can use a startswith() start point of len('CuteKitten') to find 
the files in the range 0010-0019.
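
As a rough sketch of that idea (the list of names here is made up for
illustration, it is not read from a real directory):

```python
# Hypothetical file names following the CuteKitten#### pattern.
names = ["CuteKitten%04d.jpg" % n for n in range(1, 131)]

prefix = "CuteKitten"

# Start matching just past the common tag. The digits "001" are
# shared by exactly the numbers 0010 to 0019.
matches = [n for n in names if n.startswith("001", len(prefix))]

print(len(matches))
print(matches[0], matches[-1])
```

No slice of each name is built here; startswith() just begins its
comparison at the given offset.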

Or if a braindead program always adds a file extension and I've 
mistakenly specified one so that some files are like

filename.txt.txt

I can search for endswith '.txt' but omit the last .txt, which
all (or many) files will have.
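
Something like the following would do it. The names are invented, and
the sketch assumes every name already ends in '.txt', since the end
index simply skips the last four characters without checking them:

```python
# Hypothetical names, all ending in ".txt" (an assumption of this sketch).
names = ["notes.txt.txt", "report.txt", "data.csv.txt"]

ext = ".txt"

# Look for ".txt" at the end of everything *before* the final extension.
doubled = [n for n in names if n.endswith(ext, 0, -len(ext))]

print(doubled)
```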

I'm sure there are other cases, and I agree there are other ways to 
achieve the same end, but the start,end indexes are useful on occasion.
Having said that, I've only ever used the start index in practice. But 
it's nice to have the option.

> * Second, imagining we'd like to use it anyway, shouldn't it be adjusted
> according to the length of the checked substring? In other words, why do
> you use '-1'?

To stop the check just before the last character. It could be, say,
-3 or -4 in the filename examples above. The reason I used it was
that it was what you used in your original post...

> All in all, startswith plus start-index only seems to work fine, I
> guess. What is wrong? string.find also works (someone suggested it on
> the python-ideas mailing list) but requires both start- and end-
> indexes. Also, startswith returns true/false, which is more adequate
> conceptually for a checking instruction.

Yes, it depends on whether you are after a predicate or a search result.
find() is for locating a substring in a string. 'in' is the preferred 
predicate for general testing. startswith/endswith are for the special 
cases where you know which end the substring will be at and
(presumably!) are optimised for those cases.
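
A small side-by-side comparison, using a made-up string, shows the
difference in what each one gives you back:

```python
s = "abcde"

anywhere = "bcd" in s            # predicate: present anywhere?
position = s.find("bcd")         # search: index of first match, or -1
missing = s.find("xyz")          # -1 because there is no match
at_start = s.startswith("bcd")   # predicate anchored at the start
at_end = s.endswith("cde")       # predicate anchored at the end

print(anywhere, position, missing, at_start, at_end)
```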

>>>> s = "abcde"
>>>> s.startswith("bcd", 1)
> True
>>>> s.find("bcd", 1, 4)
> 1
>>>> s = "abcdefghi"
>>>> s.startswith("bcd", 1)
> True
>>>> s.startswith("bcd", 2)
> False

You missed the cases using end index:

 >>> s.startswith('bcd',1,3)
False
 >>> s.startswith('bcd',1,4)
True

If you have a string made up of fixed-size records or fields, then
testing against only the specific substring may be important.
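
For instance, with an invented record layout (8 characters of date,
then a 3 character status code, then free text), restricting the test
to the status columns avoids a false hit in the free text:

```python
# Hypothetical layout: columns 0-7 date, 8-10 status, 11 onwards text.
records = [
    "20131128ERRdisk full",
    "20131128OK ERR appears only in the text",
]

STATUS_START, STATUS_END = 8, 11

# Only the status field is examined, not the rest of the record.
errors = [r for r in records
          if r.startswith("ERR", STATUS_START, STATUS_END)]

# A plain 'in' test would match both records.
loose = [r for r in records if "ERR" in r]

print(len(errors), len(loose))
```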

> I really want to know if I'm missing an obvious logical point, here;
> else, I will stupidly use startswith, which seems to do the job, and
> this w/o unduly creating unneeded string objects.

Again, I'd suggest you are overstressing this temporary string thing.
Testing against a slice is the easiest (and most flexible) way
and really is not much of an overhead. And if you just want to
know if a substring exists, use 'in'.
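
For comparison, here is the slice test next to the startswith form on
a sample string. The two agree here because the slice is exactly as
long as the substring:

```python
s = "abcdefghi"
sub = "bcd"

# Slice test: builds a short temporary string, then compares.
by_slice = s[1:1 + len(sub)] == sub

# Same check without the temporary string.
by_startswith = s.startswith(sub, 1, 1 + len(sub))

# Plain existence test.
exists = sub in s

print(by_slice, by_startswith, exists)
```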

HTH
-- 
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos