[pypy-dev] Speeds of various utf8 operations
Maciej Fijalkowski
fijall at gmail.com
Sun Mar 5 14:24:24 EST 2017
This is used when checking for spaces in unicode, so the input is already known to be valid UTF-8.
On Sun, Mar 5, 2017 at 11:14 AM, Armin Rigo <armin.rigo at gmail.com> wrote:
> Hi Maciej,
>
> On 4 March 2017 at 19:01, Maciej Fijalkowski <fijall at gmail.com> wrote:
>> def next_codepoint_pos(code, pos):
>>     chr1 = ord(code[pos])
>>     if chr1 < 0x80:
>>         return pos + 1
>>     if 0xC2 <= chr1 <= 0xDF:
>>         return pos + 2
>>     if chr1 >= 0xE0 and chr1 <= 0xEF:
>>         return pos + 3
>>     return pos + 4
>
> If you don't want error checking, then you can simplify the range
> checks here a bit. Maybe it gives some more gains, but who knows:
>
> def next_codepoint_pos(code, pos):
>     chr1 = ord(code[pos])
>     if chr1 < 0x80:
>         return pos + 1
>     if chr1 <= 0xDF:
>         return pos + 2
>     if chr1 <= 0xEF:
>         return pos + 3
>     return pos + 4
>
>
> A bientôt,
>
> Armin.
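
For illustration, here is a minimal sketch of how the simplified version quoted above might be used to walk a string that is already known to be valid UTF-8. It assumes Python 3 and a bytes object (so indexing yields an int and ord() is not needed), and count_codepoints is a hypothetical helper added only for this example, not something from the PyPy source:

```python
def next_codepoint_pos(code, pos):
    """Return the index just past the code point that starts at pos.

    Assumes `code` is a bytes object holding valid UTF-8 and that `pos`
    points at a lead byte. No error checking is done.
    """
    chr1 = code[pos]  # indexing bytes gives an int in Python 3
    if chr1 < 0x80:
        return pos + 1  # ASCII, one byte
    if chr1 <= 0xDF:
        return pos + 2  # two-byte sequence
    if chr1 <= 0xEF:
        return pos + 3  # three-byte sequence
    return pos + 4      # four-byte sequence


def count_codepoints(code):
    """Hypothetical helper: count code points by hopping between lead bytes."""
    pos = 0
    count = 0
    while pos < len(code):
        pos = next_codepoint_pos(code, pos)
        count += 1
    return count


# One code point each of 1, 2, 3 and 4 bytes.
text = "a\u00e9\u20ac\U0001F600"
data = text.encode("utf-8")
assert count_codepoints(data) == len(text)
```

Because only the lead byte is inspected, malformed input (for example, starting on a continuation byte) would silently give a wrong position, which is the trade-off of dropping the range checks.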