[Python-Dev] PEP 383 update: utf8b is now the error handler
Antoine Pitrou
solipsis at pitrou.net
Wed May 6 11:17:43 CEST 2009
Martin v. Löwis <martin <at> v.loewis.de> writes:
>
> > I don't personally care (I already was aware of UTF-8B), but there are
> > plenty of others who do.
>
> I think it is a fairly bad name, because it is easy to confuse it with
> the "surrogates" error handler (unless you suggest to rename that also).
I didn't bother to say it at the time, but I think "surrogates" is a pretty bad
name. It should be more indicative of what it does, e.g. "surrogates-pass", or
"surrogates-accept".
> > > It's a security risk. If U+DCXX would map to \xXX, then somebody could
> > > embed U+DC2E U+DC2E U+DC2F into a character string; even if this gets
> > > sanitized, nobody would expect that this will actually access ../
Agreed, this is an annoying security breach. The whole point of the PEP is that
application developers do not have to care about filename encoding issues,
which is defeated if they have to check for strange (illegal) combinations of
characters.
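To make the concern concrete, here is a minimal sketch of the restricted mapping, in which only bytes 0x80-0xFF are escaped to U+DC80-U+DCFF. It assumes the handler name "surrogateescape" used in released Python 3, which is not one of the names discussed in this thread; the helper functions are hypothetical and only for illustration.

```python
# Sketch only: "surrogateescape" is the handler name in released Python 3,
# assumed here; the helper names below are hypothetical.

def escape_roundtrip(raw: bytes) -> bytes:
    """Decode undecodable bytes to lone surrogates, then encode them back."""
    text = raw.decode("ascii", "surrogateescape")
    return text.encode("ascii", "surrogateescape")


def smuggles_ascii(text: str) -> bool:
    """Return True if the handler would encode this text without error."""
    try:
        text.encode("ascii", "surrogateescape")
    except UnicodeEncodeError:
        return False
    return True


# A non-ASCII byte is escaped to U+DC80..U+DCFF and round-trips unchanged.
assert b"\xe9".decode("ascii", "surrogateescape") == "\udce9"
assert escape_roundtrip(b"caf\xe9") == b"caf\xe9"

# U+DC2E U+DC2E U+DC2F lies outside that range, so encoding is refused
# instead of quietly producing b"../".
assert not smuggles_ascii("\udc2e\udc2e\udc2f")
```

With this restriction, a sanitized string cannot be turned back into ASCII path characters by the error handler.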
By the way, what are the ASCII characters that are not supported by Shift-JIS?
Not many, I suppose? (If I read the Wikipedia entry correctly, it's only the
backslash and the tilde.)
Regards
Antoine.