<br><br><div><span class="gmail_quote">On 1/2/07, <b class="gmail_sendername">Barry Warsaw</b> &lt;<a href="mailto:barry@python.org">barry@python.org</a>&gt; wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
-----BEGIN PGP SIGNED MESSAGE-----<br>Hash: SHA1<br><br>On Jan 2, 2007, at 10:53 PM, Anthony Baxter wrote:<br><br>> Additionally, base32 and base16 are not supported by codecs,<br>> according to the docs, and neither is the ability to specify
<br>> alternate character mappings (I don't know how heavily used the<br>> last is, though).<br><br>Which reminds me of another problem I have with the codecs module.<br>codecs are a potential security issue because they are a backdoor way
<br>to get modules imported. For example, if I get an email with a<br>specified charset, the natural thing is to want to use .decode() to<br>turn that into a unicode. The problem is that Python can be<br>essentially tricked into importing any module that way. Try this:
<br><br>Python 2.4.3 (#1, Jun 12 2006, 19:42:21)<br>[GCC 4.0.1 (Apple Computer, Inc. build 5341)] on darwin<br>Type "help", "copyright", "credits" or "license" for more information.
<br> >>> import sys<br> >>> sys.modules['smtplib']<br>Traceback (most recent call last):<br> File "&lt;stdin&gt;", line 1, in ?<br>KeyError: 'smtplib'<br> >>> 'foo'.decode('smtplib')
<br>Traceback (most recent call last):<br> File "&lt;stdin&gt;", line 1, in ?<br>LookupError: unknown encoding: smtplib<br> >>> sys.modules['smtplib']<br>&lt;module 'smtplib' from '/opt/local/Library/Frameworks/
<br>Python.framework/Versions/2.4/lib/python2.4/smtplib.pyc'&gt;<br> >>></blockquote><div><br> </div><br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
Okay, so this doesn't open any more holes than are already opened,<br>but I worry that the style of encouraging codec use will make such<br>potential holes more widespread. The problem is that it's very<br>difficult to code defensively around this because you can't really
<br>whitelist or blacklist the set of valid codecs, and you can't<br>feasibly audit every importable module to see if it has nasty import<br>side effects.<br><br>It would help if codec lookup wasn't import based, or somehow any
<br>import side-effects could be isolated.</blockquote><div><br>I bet that having the codecs module use an absolute import, one that names the encodings package explicitly, would deal with a lot of your worries.<br></div><br>-Brett</div>
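A minimal sketch of one defensive option on the application side, assuming the program can enumerate the charsets it is willing to accept: check the untrusted charset name against a fixed set before it ever reaches the codec lookup, so no import is attempted for unknown names. The names `ALLOWED_CHARSETS` and `safe_decode` are hypothetical, the set of charsets is only an illustration, and the code assumes Python 3 `bytes` input rather than the Python 2.4 strings shown in the session above.

```python
# Hypothetical allowlist; a real application would pick its own set.
ALLOWED_CHARSETS = frozenset({"ascii", "us-ascii", "utf-8", "iso-8859-1", "latin-1"})


def safe_decode(data, charset, fallback="ascii"):
    """Decode bytes using only charsets named in ALLOWED_CHARSETS.

    The name is normalised with plain string operations, so an untrusted
    value such as 'smtplib' is rejected before any codec lookup happens.
    """
    name = charset.strip().lower()
    if name not in ALLOWED_CHARSETS:
        # Unknown or unexpected charset: fall back to a fixed, known codec.
        name = fallback
    # errors="replace" substitutes U+FFFD for bytes that cannot be decoded.
    return data.decode(name, errors="replace")
```

For example, `safe_decode(payload, "smtplib")` would decode `payload` as ASCII instead of handing the name `smtplib` to the codec registry. This does not address the lookup mechanism itself, which is what the absolute-import suggestion is about.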