[Python-Dev] PEP 383 update: utf8b is now the error handler
"Martin v. Löwis"
martin at v.loewis.de
Wed May 6 09:53:33 CEST 2009
> > > Second, I suggest "surrogate-replace" as the name of the error handler
> > > rather than "utf8b".
> >
> > I think this is bike-shedding.
>
> I don't personally care (I already was aware of UTF-8B), but there are
> plenty of others who do.
I think it is a fairly bad name, because it is easy to confuse with
the "surrogates" error handler (unless you suggest renaming that as well).
> You have to fix the existing uses of
> the obsolete "python-escape", anyway.
Indeed - but only in the PEP. In the implementation, it's already utf8b
throughout. Now it is also in the PEP; thanks for pointing that out.
> > It's a security risk. If U+DCXX would map to \xXX, then somebody could
> > embed U+DC2E U+DC2E U+DC2F into a character string; even if this gets
> > sanitized, nobody would expect that this will actually access ../
>
> The odds that anybody will actually take notice of U+002E U+002E
> U+002F in a string are sufficiently small that any number of exploits
> have already been based on it. I agree that there is some additional
> risk from this if people make the check for "../" before they prepend
> "\udc2e\udc2e\udc2f", but I think that risk is very small compared to
> the pain of having an error handler whose raison d'etre is to not raise
> exceptions go ahead and raise them anyway.
The problem is that functions like normpath will recognize ../, and
that applications rely on them for file name sanitization. If they could
be tricked into writing outside of their target folders, this would
be a huge security risk.
OTOH, I don't care about breaking applications on misconfigured systems.
People using SJIS as their locale encoding have bigger problems
than Python raising exceptions.
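
To make the ../ concern concrete, here is a minimal sketch. It assumes
the handler discussed in this thread as "utf8b" behaves like the one
that later shipped in Python 3.1 under the name "surrogateescape"; the
helper names (roundtrip, encodes_to_dotdot) and the sample strings are
hypothetical and chosen only for illustration.

```python
import posixpath

# Assumption: "utf8b" in this thread corresponds to the handler that
# later shipped as "surrogateescape" (Python 3.1 and newer).
HANDLER = "surrogateescape"


def roundtrip(raw: bytes) -> bytes:
    """Decode bytes (undecodable ones become lone surrogates), then encode back."""
    text = raw.decode("utf-8", HANDLER)
    return text.encode("utf-8", HANDLER)


def encodes_to_dotdot(name: str) -> bool:
    """Return True only if `name` encodes to the bytes b'../'."""
    try:
        return name.encode("utf-8", HANDLER) == b"../"
    except UnicodeEncodeError:
        # Only U+DC80..U+DCFF map back to bytes; U+DC2E and U+DC2F do not,
        # so the encoder raises instead of producing ASCII '.' or '/'.
        return False


# An undecodable byte >= 0x80 survives the round trip unchanged.
assert roundtrip(b"caf\xe9") == b"caf\xe9"

# normpath collapses a literal "../" ...
assert posixpath.normpath("target/../x") == "x"

# ... but sees nothing to collapse in the surrogate spelling,
smuggled = "\udc2e\udc2e\udc2f"
assert posixpath.normpath("target/" + smuggled) == "target/" + smuggled

# which is why the encoder must refuse to turn it into b"../" afterwards.
assert not encodes_to_dotdot(smuggled)
```

If U+DC2E were instead mapped to the byte 0x2E, the last assertion would
fail: the string would pass normpath untouched and still reach the file
system as ../.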
> See also my reply to Lino Mastrodomenico.
URL?
> But you're writing the PEP, so this battle will have to be deferred.
> Eventually Python will have to take a stand on Unicode conformance,
> but it's not urgent yet.
I think it's always applications that are conforming or not, rather
than libraries. Libraries should make it possible to write conforming
applications. They may refuse to support certain non-conforming
applications (although users will then replace the library with one
that does let them do what they want). Libraries can never enforce
that applications conform to some standard.
> Sorry! I suggest substituting the paragraph above for the paragraph
> which begins "The encode error handler interface presently requires..."
> at line 129.
Ah, ok. This was Glenn Linderman's text before - now it's yours :-)
> I think I forgot to do this before: "I hereby dedicate all text
> I suggest for inclusion in the PEP to the public domain."
:-)
Martin