[Python-ideas] string codes & substring equality
Terry Reedy
tjreedy at udel.edu
Fri Nov 29 00:36:36 CET 2013
On 11/28/2013 6:43 AM, spir wrote:
> All right, thank you all for the exchange, the issue of substring
> comparison for equality is solved, with either .startswith(substr, i) or
> .find(substr, i,j). But there remains the problem of getting codes
> (Unicode code points) at arbitrary indexes in a string.
Do you mean ord(code[i])? We already have that.
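For concreteness, here is a minimal sketch of the spellings that already exist. The sample string and the indexes are made up for illustration; only str.startswith, str.find and ord come from the thread.

```python
# Illustrative sample text; the escapes are U+00E9 and U+00F6.
s = "h\u00e9llo w\u00f6rld"

# Code point at an arbitrary index: index first, then ord().
code = ord(s[1])

# Substring equality at a given position, without slicing out a copy.
starts_here = s.startswith("llo", 2)

# find() restricted to the window [6, 9) returns the match index or -1.
found_at = s.find("w\u00f6r", 6, 9)
```

Both substring checks work on the original string and take the offset as an argument, which is the solution the quoted message settles on.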
> Is it weird to consider a .code(i) string method?
No and yes. There are hundreds, even thousands, of simple compositions that
different people might like baked into the language to speed up a
particular application. Some numerical users might like Python to have
the C equivalent of
def muladd(a,b,c): return a * b + c # or maybe
def muladd(a,b,c): return a + b * c
> What would be its implementation cost?
What would be the implementation, maintenance, learning, and usability
cost of adding thousands of such little methods?
> I would really have good usage for it,
I believe use of ord is rather rare, as builtins go.
In 2.7, it works with both (byte) strings and unicode.
In 3.3, it is neither needed nor usable on indexed bytes, as indexing directly returns
ordinals (b'abc'[1] == 98). So if the text you are parsing is limited to
ASCII or any small ASCII superset, such as latin-1, you might do better
using the bytes encoding.
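A small sketch of the bytes route on Python 3, assuming the text fits in latin-1. The sample text is illustrative only.

```python
# Illustrative text that fits in latin-1; the escape is U+00E9.
text = "caf\u00e9"
data = text.encode("latin-1")

# Indexing bytes yields an int directly, so no ord() call is involved.
first = b'abc'[1]
last = data[3]

# For latin-1, each byte value equals the Unicode code point of the character.
same = (last == ord(text[3]))
```

This only holds for single-byte encodings such as ASCII and latin-1. With a multi-byte encoding such as UTF-8, byte indexes no longer line up with character indexes.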
If your text potentially includes any unicode char and if you have
measurements that show that the extra cost of the intermediate single
char is really a bottleneck, then add the composed function privately.
Or perhaps you could use ctypes to access the innards of a string and
see if that is faster.
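The privately composed function could be as small as the sketch below. The name code_at and the sample string are hypothetical, and the timeit call only shows one way to take the measurement; it is not a claim about what the numbers will be.

```python
import timeit

def code_at(s, i):
    """Return the Unicode code point of the character at index i of s.

    Hypothetical private helper: a plain composition of indexing and ord().
    """
    return ord(s[i])

# Illustrative input: 1000 ASCII characters followed by U+20AC.
sample = "x" * 1000 + "\u20ac"

# Measure before optimizing: total seconds for 100000 calls.
elapsed = timeit.timeit(lambda: code_at(sample, 1000), number=100_000)
```

If the measured time is not a bottleneck in the real parser, there is nothing to speed up.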
> certainly numerous other use cases exist.
More than a hand wave is needed to demonstrate that.
--
Terry Jan Reedy