[pypy-dev] RPython unicode support design question

Carl Friedrich Bolz cfbolz at gmx.de
Tue Nov 6 13:07:29 CET 2007


Hi all,

Maciek is working on adding unicode support to RPython, and he and I 
have been discussing some of the design issues. One open question is 
how mixing of strings and unicode should work in RPython.

My proposal is the following: forbid _all_ mixing of strings and unicode 
in RPython. Mixing of strings and unicode would result in a SomeObject 
(e.g. "hello" + u"world" wouldn't be allowed). The only way to convert 
between a byte string and a unicode object would be the encode and 
decode methods. I guess that in the beginning the only allowed encoding 
would be "ascii" (we can think about supporting other encodings later, 
or leave that to something in pypy/rlib).
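
To make the rule concrete, here is a minimal sketch of what user code 
would look like. It is only an illustration: it runs as plain Python 3, 
where bytes and str already refuse to mix at run time, as a stand-in for 
a check that in RPython would happen in the annotator. The helper names 
(greet, to_wire) are made up for the example.

```python
# Illustration only: plain Python 3 semantics stand in for the proposed
# RPython rule. In RPython the mixing would be rejected at annotation
# time, not at run time as it is here.

def greet(raw_name):
    # raw_name is a byte string; it must be decoded explicitly
    # before it can be combined with a unicode object.
    name = raw_name.decode("ascii")
    return u"hello " + name

def to_wire(text):
    # Going from unicode back to a byte string is explicit as well.
    return text.encode("ascii")

# Implicit mixing is not allowed.
try:
    b"hello" + u"world"
    mixing_rejected = False
except TypeError:
    mixing_rejected = True

result = greet(b"world")
wire = to_wire(result)
```

The point of the sketch is that every crossing between the two types 
is visible in the source as a decode or encode call.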

This has several advantages: it forces the programmer to separate 
unicode and byte strings cleanly, and it makes some parts of the 
implementation less messy.

Does that make sense to everybody? Are there other proposals?

Cheers,

Carl Friedrich


