[Tutor] string codes
Alan Gauld
alan.gauld at btinternet.com
Thu Nov 28 02:34:28 CET 2013
On 27/11/13 21:20, spir wrote:
>> py> s.startswith("bcd", 1, -1) and s.endswith("bcd", 1, -1)
>> True
>
> Hum, I don't understand the reasoning, here.
> * First, why use the end-index param? (Here in this case, or in any
> other)? It contradicts the idea of starting with in my view, but also is
> useless for sub-equality checking.
Not so. If you are searching for a substring and already know that the
string ends with it, you want the end index to exclude that known match
at the end. And it is still a startswith because you are checking from
the start of the range you specified.
One use case would be in checking a list of filenames that have a fixed
format. For example I usually name my files with a common tag followed
by a number followed by the type:
CuteKitten0001.jpg to CuteKitten0130.jpg
Now I can call startswith() with a start index of len('CuteKitten') to
find the files in the range 0010-0019.
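A minimal sketch of that filename check. The list of names is generated here purely for illustration, and the prefix '001' is an assumption about how the 0010-0019 range would be picked out:

```python
# Hypothetical file names: common tag + 4-digit number + file type.
tag = "CuteKitten"
names = ["%s%04d.jpg" % (tag, n) for n in range(1, 131)]

# Start the test just after the tag; the prefix "001" then matches
# the numbers 0010 through 0019.
matches = [name for name in names if name.startswith("001", len(tag))]

print(matches[0], matches[-1], len(matches))
# CuteKitten0010.jpg CuteKitten0019.jpg 10
```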
Or if a braindead program always adds a file extension and I've
mistakenly specified one myself, so that some files are named like
filename.txt.txt
I can test for endswith('.txt') but use the end index to omit the last
.txt, which all (or many) files will have.
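Something like the following would do it. The file names are made up, and using -len(ext) as the end index is just one way to skip the final extension:

```python
# Made-up file names; only the first has the doubled extension.
names = ["notes.txt.txt", "report.txt", "data.csv.txt"]

ext = ".txt"
# An end index of -len(ext) stops the test before the final extension,
# so only names with a second ".txt" in front of it match.
doubled = [name for name in names if name.endswith(ext, 0, -len(ext))]

print(doubled)
# ['notes.txt.txt']
```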
I'm sure there are other cases, and I agree there are other ways to
achieve the same end, but the start,end indexes are useful on occasion.
Having said that, I've only ever used the start index in practice. But
it's nice to have the option.
> * Second, imagining we'd like to use it anyway, shouldn't it be adjusted
> according to the length of the checked substring? In other words, why do
> you use '-1'?
-1 is the index of the last character; used as the end index it stops
the check just before that character. It could be, say, -3 or -4 in the
filename examples above. The reason I used it was that it was
what you used in your original post...
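A short sketch of what the -1 end index does, using the same 5-character string as the example quoted at the top (the variable names are only for illustration):

```python
s = "abcde"

# As an end index, -1 stops the test just before the last character,
# so both calls below only look at s[1:-1], which is "bcd".
middle = s[1:-1]
starts = s.startswith("bcd", 1, -1)
ends = s.endswith("bcd", 1, -1)

# The final "e" lies outside the tested range, so this is False.
ends_with_cde = s.endswith("cde", 1, -1)

print(middle, starts, ends, ends_with_cde)
# bcd True True False
```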
> All in all, startswith plus start-index only seems to work fine, I
> guess. What is wrong? string.find also works (someone suggested it on
> the python-ideas mailing list) but requires both start- and end-
> indexes. Also, startswith returns true/false, which is more adequate
> conceptually for a checking instruction.
Yes, it depends on whether you are after a predicate or a search result.
find() is for locating a substring in a string. 'in' is the preferred
predicate for general testing. startswith/endswith are for the special
cases where you know which end the substring will be at and (presumably!)
are optimised for those cases.
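To put the three side by side, here is a small illustration on a sample string:

```python
s = "abcde"

# Predicate: is the substring anywhere in s?
present = "bcd" in s

# Search: where is it? find() returns -1 when it is absent.
position = s.find("bcd")
missing = s.find("xyz")

# Positional predicates: is it at this particular place?
at_index_1 = s.startswith("bcd", 1)
at_the_end = s.endswith("cde")

print(present, position, missing, at_index_1, at_the_end)
# True 1 -1 True True
```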
>>>> s = "abcde"
>>>> s.startswith("bcd", 1)
> True
>>>> s.find("bcd", 1, 4)
> 1
>>>> s = "abcdefghi"
>>>> s.startswith("bcd", 1)
> True
>>>> s.startswith("bcd", 2)
> False
You missed the cases using end index:
>>> s.startswith('bcd',1,3)
False
>>> s.startswith('bcd',1,4)
True
If you have a fixed record size string then testing against
only the specific substring may be important.
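As a rough sketch of the fixed record case: the record layout below (an 8-character id, a 10-character name and a 3-character code) is entirely hypothetical, chosen only to show a test confined to one field.

```python
# Hypothetical fixed-width record: 8-char id, 10-char name, 3-char code.
ID_END = 8
NAME_END = ID_END + 10
record = "00012345" + "Smith".ljust(10) + "GBP"

# Test only the name field, record[ID_END:NAME_END].
is_smith = record.startswith("Smith", ID_END, NAME_END)

# "GBP" is in the record, but not in the field being tested.
is_gbp = record.startswith("GBP", ID_END, NAME_END)

# A prefix that runs past the end index cannot match.
overrun = record.startswith("Smith".ljust(10) + "G", ID_END, NAME_END)

print(is_smith, is_gbp, overrun)
# True False False
```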
> I really want to know if I'm missing an obvious logical point, here;
> else, I will stupidly use startswith, which seems to do the job, and
> this w/o unduly creating unneeded string objects.
Again I'd suggest you are overstressing this temporary string thing.
Testing against a slice is the easiest (and most flexible) way
and really is not much of an overhead. And if you just want to
know if a substring exists use 'in'.
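For what it's worth, a quick check that the slice test and startswith() with a start index agree at every position of a sample string (this shows equivalence only and makes no claim about speed):

```python
s = "abcdefghi"
sub = "bcd"

# Compare the slice test with startswith() at every start position.
by_slice = [s[i:i + len(sub)] == sub for i in range(len(s))]
by_method = [s.startswith(sub, i) for i in range(len(s))]

print(by_slice == by_method)   # True
print(by_slice.index(True))    # 1
print(sub in s)                # True
```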
HTH
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
http://www.flickr.com/photos/alangauldphotos