Re: [pypy-dev] Unicode encode/decode speed

Feb. 17, 2013


On Sun, Feb 17, 2013 at 11:43 AM, Armin Rigo <arigo@tunes.org> wrote:
...
Hi,
On Tue, Feb 12, 2013 at 7:14 PM, Eleytherios Stamatogiannakis
<estama@gmail.com> wrote:
...
Also we are looking into adding a special ffi.string_decode_UTF8 in CFFI's
backend to reduce the number of calls that are needed to go from utf8_char*
to PyPy's unicode.
A first note: I'm wondering why you need to convert from
utf-8-that-contains-only-ascii, to unicode, and back.  What is the
point of having unicode strings in the first place?  Can't you just
pass plain non-unicode strings around your whole program?
If not, then indeed, it would make (a bit of) sense to have ways to
convert directly between "char *" and unicode strings, in both
directions, assuming utf-8.  This could be done with an API like:

    ffi.encode_utf8(unicode_string) -> new_char*_cdata
    ffi.encode_utf8(unicode_string, target_char*_cdata, maximum_length)
    ffi.decode_utf8(char*_cdata, [length]) -> unicode_string

Alternatively, we could accept unicode strings wherever a "char*" is
expected and encode them to utf-8, but that sounds a bit too magical.
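
To make the semantics of the proposed calls concrete, here is a minimal
sketch. It is not CFFI code: the ffi.encode_utf8 / ffi.decode_utf8 functions
above are only a proposal in this thread, so the sketch emulates them with
the standard-library ctypes module as a stand-in. The helper names
(encode_utf8, encode_utf8_into, decode_utf8), the error behaviour when the
target buffer is too small, and the choice to return the byte count are all
assumptions made for illustration. It is written for Python 3, where the
"unicode" type is str.

```python
import ctypes


def encode_utf8(unicode_string):
    """Hypothetical helper: return a new NUL-terminated char buffer
    containing the UTF-8 encoding of unicode_string."""
    data = unicode_string.encode("utf-8")
    # create_string_buffer(bytes) allocates len(data) + 1 bytes and
    # appends the trailing NUL itself.
    return ctypes.create_string_buffer(data)


def encode_utf8_into(unicode_string, target, maximum_length):
    """Hypothetical helper: encode into an existing char buffer.

    Assumption: raise ValueError if the encoded bytes plus the trailing
    NUL do not fit in maximum_length bytes. Returns the number of bytes
    written, not counting the NUL."""
    data = unicode_string.encode("utf-8")
    if len(data) + 1 > maximum_length:
        raise ValueError("target buffer too small for encoded string")
    ctypes.memmove(target, data + b"\0", len(data) + 1)
    return len(data)


def decode_utf8(char_p, length=None):
    """Hypothetical helper: decode a char buffer holding UTF-8 data.

    Without a length, read up to the first NUL byte; with a length,
    read exactly that many bytes."""
    if length is None:
        raw = ctypes.string_at(char_p)
    else:
        raw = ctypes.string_at(char_p, length)
    return raw.decode("utf-8")
```

The point of the proposal is that each direction becomes one call across
the C boundary instead of a separate copy followed by a separate
encode/decode step; the sketch only mirrors the calling convention, not any
speed benefit.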
A bientôt,
Armin.
We should add rffi.charp2unicode too

Maciej Fijalkowski