[Python-ideas] string codes & substring equality

Wed Nov 27 15:53:26 CET 2013

On 27 November 2013 14:32, spir <denis.spir at gmail.com> wrote:
> In both cases, I guess ordinary idiomatic Python code actually _creates_ a
> new string object, as a substring of length 1 or more, which is otherwise
> useless; for instance:
>
>     if s[i] == char:
>         # match ok -- object s[i] unneeded
>
>     if s[i:j] == substr:
>         # match ok -- object s[i:j] unneeded
>
> What is actually needed is just to check for equality (or another check
> about a code, see below).

I almost never index or slice strings, so this isn't an issue I've
encountered. My first thought is that your approach may well be
sub-optimal, and something that avoids indexing might be better - but
without knowing the details of what you're trying to do it's hard to
say for sure. Also, I'd do some profiling to check that this really is
the performance bottleneck before worrying too much about optimising
it.

But assuming you've done that and there's a real issue here, you could
probably use str.find:

    if s.find(char, i, i+1) != -1:
        # match ok

    if s.find(substr, i, j) != -1:
        # match ok (assumes j - i == len(substr))

It's a bit of a hack, as find in theory scans forward from the start
point - but if the search window is exactly as long as the search
string (that is, j - i == len(substr)), there is only one position it
can try, so it won't scan. And after all, low-level performance tweaks
generally *are* somewhat hackish, sacrificing obviousness for speed, in
my experience :-)
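
As a rough sketch of that trick wrapped in a helper (the name matches_at
is made up for illustration, and it assumes i is non-negative), computing
the end of the window from len(sub) guarantees the window and the search
string have the same length:

```python
def matches_at(s, sub, i):
    """Return True if `sub` occurs in `s` starting exactly at index `i`.

    Hypothetical helper, not part of the str API. Assumes i >= 0.
    """
    # The search window s[i:i+len(sub)] is exactly as long as `sub`, so
    # find() has only one candidate position and cannot scan ahead.
    return s.find(sub, i, i + len(sub)) != -1


s = "hello world"
print(matches_at(s, "world", 6))  # same answer as s[6:11] == "world"
print(matches_at(s, "o", 4))      # same answer as s[4] == "o"
print(matches_at(s, "world", 5))  # window is " worl", so no match
```

One difference from plain indexing: when i is past the end of the string,
s[i] raises IndexError, whereas the find-based check just reports no match
for a non-empty search string.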

Paul