Mailman 3 [issue18870] eval() uses latin-1 to decode str - docs

Aug. 29, 2013

Merlijn van Deen added the comment:

On the lowest level, this affects exec, eval(), compile() and input() (!). On a higher level, more modules are affected:

modules ast, codeop, compiler, cProfile, dis, distutils (not sure), doctest, idlelib, ihooks, pdb, pkgutil, plat-mac, py_compile, rexec, runpy and timeit all call compile()

modules bdb, compiler, gettext, idlelib, lib2to3, lib-tk.turtle, logging, mhlib, pdb, plat-irix5, plat-mac, rexec, rlcompleter and warnings all call eval()

and modules Bastion, bdb, code, collections, cProfile, distutils, doctest, idlelib, ihooks, imputil, pdb, plat-irix5, plat-irix6, plat-mac, profile, rexec, site, timeit and trace all call exec.

Not all of them necessarily take user-supplied code - I haven't checked that.

After checking tests/test_pep263.py, it seems the behavior is a bit more complicated than I initially thought: a str parameter is considered latin-1 unless either
 a) a UTF-8 BOM is present, in which case it is considered UTF-8
 b) a '# encoding: XXX' line is present, in which case it is considered
    to be in that encoding
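
The rule above can be sketched roughly as follows. This is only an illustration of the behavior as I described it, not CPython's actual implementation: guess_source_encoding is a hypothetical helper, the cookie pattern is the one given in PEP 263, and checking just the first two lines is a simplification of what the tokenizer does.

```python
import codecs
import re

# Cookie pattern from PEP 263; it matches lines such as
# "# -*- coding: utf-8 -*-" as well as "# encoding: cp1252".
_COOKIE = re.compile(br"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def guess_source_encoding(source):
    """Return the encoding the rule above would pick for a byte string.

    Hypothetical helper for illustration only.
    """
    # Case a): a UTF-8 BOM means the source is treated as UTF-8.
    if source.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # Case b): an encoding declaration on one of the first two lines.
    for line in source.splitlines()[:2]:
        match = _COOKIE.match(line)
        if match:
            return match.group(1).decode("ascii")
    # Otherwise the bytes are treated as latin-1.
    return "latin-1"
```

For example, guess_source_encoding(b"# encoding: cp1252\nx = 1") picks cp1252, while a byte string with neither a BOM nor a declaration falls through to latin-1.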

In any case, I have attached a doc patch for exec, eval(), compile(), and ast.literal_eval(), because I think these are the most widely used. I think input() does not need a doc change because it explicitly refers to eval().

I ignored the subtleties noted above for the doc patch, simplifying to 'pass either a Unicode or a latin-1 encoded string'.
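
To illustrate why the encoding matters, here is a small sketch of what happens when UTF-8 bytes are decoded as latin-1. It does not call eval() at all; it only shows the decoding step, and the variable names are made up for the example.

```python
text = u"caf\xe9"  # the word "café"

utf8_bytes = text.encode("utf-8")      # e-acute becomes two bytes: C3 A9
latin1_bytes = text.encode("latin-1")  # e-acute becomes one byte: E9

# Decoding as latin-1 never raises, because every byte value maps to a
# character. UTF-8 input is therefore silently misread, not rejected.
misread = utf8_bytes.decode("latin-1")
roundtrip = latin1_bytes.decode("latin-1")

print(repr(misread))      # five characters instead of four
print(roundtrip == text)  # latin-1 bytes survive the round trip
```

So a caller who passes UTF-8 encoded bytes without a BOM or an encoding line gets no error, just different characters in any string literals.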

----------
keywords: +patch
Added file: http://bugs.python.org/file31513/doc_18870.patch

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue18870>
_______________________________________
