[New-bugs-announce] [issue4426] UTF7 decoding is far too strict

Nick Barnes report at bugs.python.org
Tue Nov 25 12:11:56 CET 2008


New submission from Nick Barnes <Nick.Barnes at pobox.com>:

UTF-7 decoding raises an exception for any character not in the RFC 2152
"Set D" (directly encoded characters).  In particular, it raises an
exception for characters in "Set O" (optional direct characters), such
as < = > [ ] and @.  These characters can legitimately appear in
UTF-7-encoded text and should be decoded as themselves.  As it stands,
the UTF-7 decoder cannot reliably decode any UTF-7 text other than that
produced by Python's own UTF-7 encoder.
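
A minimal reproduction sketch follows. The helper name try_decode and the
sample byte strings are illustrative assumptions, not part of the original
report; the Set D and Set O character lists are taken from RFC 2152. On an
interpreter with the strict decoder, the helper should return the exception;
on one that accepts Set O, it returns the decoded text.

```python
import string

# RFC 2152 character sets (as listed in the RFC).
SET_D = set(string.ascii_letters + string.digits + "'(),-./:?")
SET_O = set('!"#$%&*;<=>@[]^_`{|}')


def try_decode(data):
    """Decode UTF-7 bytes; return the text, or the exception if decoding fails.

    Hypothetical helper, used here only to show both outcomes side by side.
    """
    try:
        return data.decode("utf-7")
    except UnicodeDecodeError as exc:
        return exc


# Each sample contains Set O characters written directly, which RFC 2152 permits.
samples = [
    b"<=>[]@",
    b"a<b and x@y.org [ok]",
    b"Hi Mom -+Jjo--!",  # example from RFC 2152: +Jjo- is U+263A
]

for raw in samples:
    print(repr(try_decode(raw)))
```

Every byte in the first two samples belongs to Set D, Set O, or is a space,
so a decoder that follows the RFC has no reason to reject them.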

Looking at the source of unicodeobject.c, the call to the SPECIAL macro
on line 1009 has its second and third arguments hardcoded to zero.
Changing the second argument to 1 might fix this, but I am not sure.

----------
components: Unicode
messages: 76405
nosy: Nick Barnes
severity: normal
status: open
title: UTF7 decoding is far too strict
type: behavior
versions: Python 2.6

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue4426>
_______________________________________

