1 Oct
2008
1 Oct
'08
4:22 a.m.
On Tue, Sep 30, 2008 at 8:06 PM, <glyph@divmod.com> wrote:
The proposal of using U+0000 seems like it would have been almost the same from such a wrapper's perspective, except (A) people using the filesystem APIs without the benefit of such a wrapper would have been even more screwed, and (B) there are a few nasty corner-cases when dealing with surrogate (i.e. invalid, in UTF-8) code points which I'm not quite sure what it would have done with.
Surrogates in UTF-8 *should* be treated as errors, but current python is far too lax. That actually leads to another problem: improving validating will change what gets escaped and what doesn't. http://bugs.python.org/issue3297 http://bugs.python.org/issue3672 -- Adam Olsen, aka Rhamphoryncus