[issue18870] eval() uses latin-1 to decode str
Merlijn van Deen added the comment:

On the lowest level, this affects exec, eval(), compile() and input() (!). On a higher level, more modules are affected:

modules ast, codeop, compiler, cProfile, dis, distutils (not sure), doctest, idlelib, ihooks, pdb, pkgutil, plat-mac, py_compile, rexec, runpy and timeit all call compile()

modules bdb, compiler, gettext, idlelib, lib2to3, lib-tk.turtle, logging, mhlib, pdb, plat-irix5, plat-mac, rexec, rlcompleter and warnings all call eval()

and modules Bastion, bdb, code, collections, cProfile, distutils, doctest, idlelib, ihooks, imputil, pdb, plat-irix5, plat-irix6, plat-mac, profile, rexec, site, timeit and trace all call exec.

Not all of them necessarily take user-supplied code - I haven't checked that.

After checking tests/test_pep263.py, it seems the behavior is a bit more complicated than I initially thought: a str parameter is considered latin-1 unless either

a) a UTF-8 BOM is present, in which case it is considered UTF-8, or
b) an `# encoding: XXX` line is present, in which case it is considered to be in that encoding.

In any case, I have attached a doc patch for exec, eval(), compile() and ast.literal_eval(), because I think these are the most widely used. I think input() does not need a doc change because its documentation explicitly refers to eval(). I ignored the subtleties noted above for the doc patch, simplifying to 'pass either a Unicode or a latin-1 encoded string'.

----------
keywords: +patch
Added file: http://bugs.python.org/file31513/doc_18870.patch

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue18870>
_______________________________________
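
To make the decoding rule in the comment concrete (latin-1 unless a UTF-8 BOM or an encoding line is present), here is a minimal sketch. It is not the interpreter's actual code. It is written for Python 3, with a bytes object standing in for a 2.x str, and the helper names `assumed_source_encoding` and `decode_like_eval` are made up for illustration. The cookie pattern is the one given in PEP 263, and limiting the search to the first two lines is PEP 263's rule, not something stated in the comment.

```python
import codecs
import re

# Encoding-declaration pattern from PEP 263, compiled for byte strings.
_COOKIE_RE = re.compile(br"^[ \t\f]*#.*?coding[:=][ \t]*([-_.a-zA-Z0-9]+)")


def assumed_source_encoding(source):
    """Guess which encoding a byte string would be treated as (hypothetical helper)."""
    # Rule a): a UTF-8 BOM means the source is treated as UTF-8.
    if source.startswith(codecs.BOM_UTF8):
        return "utf-8"
    # Rule b): an encoding line on line 1 or 2 names the encoding.
    for line in source.splitlines()[:2]:
        match = _COOKIE_RE.match(line)
        if match:
            return match.group(1).decode("ascii")
    # Otherwise the fallback described in the comment applies.
    return "latin-1"


def decode_like_eval(source):
    """Decode a byte string using the rules above (hypothetical helper)."""
    encoding = assumed_source_encoding(source)
    if source.startswith(codecs.BOM_UTF8):
        source = source[len(codecs.BOM_UTF8):]  # drop the BOM before decoding
    return source.decode(encoding)
```

With this sketch, the UTF-8 bytes for "é" without a BOM or encoding line come back as the two latin-1 characters "Ã©", which is the surprise the issue title refers to. Prefixing the same bytes with a BOM yields "é".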