[Python-ideas] Py3k invalid unicode idea
Terry Reedy
tjreedy at udel.edu
Thu Oct 9 20:42:48 CEST 2008
Dillon Collins wrote:
> On Thursday 09 October 2008, Stephen J. Turnbull wrote:
> With my proposal, unicode strings would have a valid flag, and one could
> easily modify PyUnicode_AS_UNICODE to return NULL (and a UnicodeError) if the
> string is invalid, and make a PyUnicode_AS_RAWUNICODE that wouldn't. Or you
> could simply document that libraries need to call a PyUnicode_ISVALID to
> determine whether or not the string contains invalid codes.
Would it make any sense to have a Filename subclass, a BadFilename
subclass, or more generally a PUAcode subclass for any unicode generated
by the core that uses the Private Use Area (PUA)? In either of the latter
two cases, any app that uses the PUA itself could know not to mix PUAcode
instances into its own unicode. And leakage without re-encoding into
bytes could be inhibited more easily.
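
A rough sketch of what I have in mind, written in pure Python only to
show the shape of the idea. The names PUAcode, has_pua and safe_to_mix
are made up for illustration; nothing like them exists in the core, and
the code point used in the usage lines is arbitrary rather than whatever
the core would actually emit.

```python
# Hypothetical sketch: PUAcode, has_pua and safe_to_mix are illustrative
# names, not part of Python or of any concrete proposal.

# The three private-use ranges defined by Unicode.
_PUA_RANGES = (
    (0xE000, 0xF8FF),      # Private Use Area in the BMP
    (0xF0000, 0xFFFFD),    # Supplementary Private Use Area-A
    (0x100000, 0x10FFFD),  # Supplementary Private Use Area-B
)


def has_pua(text):
    """Return True if text contains any private-use code point."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in _PUA_RANGES)


class PUAcode(str):
    """A str subclass tagging text whose PUA code points came from the core."""

    def __add__(self, other):
        # Keep the tag on concatenation so it is not silently lost.
        return PUAcode(str.__add__(self, other))

    def __radd__(self, other):
        # Handles plain_str + PUAcode, which would otherwise return a bare str.
        return PUAcode(str.__add__(other, self))


def safe_to_mix(app_text, incoming):
    """An app that uses the PUA itself can refuse core-tagged strings."""
    return not (isinstance(incoming, PUAcode) and has_pua(app_text))


# Usage: a filename the core could not decode cleanly (arbitrary PUA char).
fn = PUAcode("bad\uE080name")
path = "dir/" + fn + ".txt"           # still a PUAcode instance
mixable = safe_to_mix("\uE000app", fn)  # False: the app uses the PUA too
```

Only concatenation keeps the tag here; slicing, formatting and the other
str methods would hand back a plain str, so a real version would need
support from the core to inhibit leakage properly.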
tjr