[issue5604] imp.find_module() mixes UTF8 and MBCS
Andrew Svetlov
report at bugs.python.org
Sun Apr 5 06:59:12 CEST 2009
Andrew Svetlov <andrew.svetlov at gmail.com> added the comment:
Continuing to work on this problem, I figured out the following:
* on Windows it's impossible to convert filenames to the file system
encoding without losing information.
* Windows works properly only with unicode (wchar_t) characters.
* all other systems work fine with utf-8 (or another filesystem
encoding).
* it's very error-prone to change 'char*' to 'PyUnicode*'.
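
A small illustration of the first point, as a sketch only: the filename
is made up, and cp1252 stands in for a Windows ANSI code page because
the "mbcs" codec is available only on Windows.

```python
# Hypothetical filename mixing Cyrillic and Latin characters.
name = "модуль_module.py"

# cp1252 is a stand-in for a narrow Windows code page (assumption:
# the real "mbcs" codec behaves similarly for unmappable characters).
narrow = name.encode("cp1252", errors="replace")
round_trip = narrow.decode("cp1252")

# UTF-8 can represent every character, so this round trip is lossless.
utf8_round_trip = name.encode("utf-8").decode("utf-8")

print(round_trip == name)       # False: the Cyrillic part became "?"
print(utf8_round_trip == name)  # True
```

Once the characters have been replaced, the original filename cannot be
recovered from the narrow byte string.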
To solve this problem I assume:
* every char* in the Python API is utf-8.
* if a call to an operating system API such as fopen is needed, call
imp_fopen instead; this function will do the needed conversions. Inside
import.c there are 4 such calls: fopen, open_exclusive, unlink, stat. I
want to write stubs for these calls.
* loaders for dynamic modules (aka 'C extensions') also have to expect
utf-8 as the pathname parameter, not 'filesystem encoded'.
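
The stub idea can be sketched at the Python level as follows. This is
not the C code from the patch, only an illustration of the approach:
imp_fopen is the name used above, while imp_path, imp_open_exclusive,
imp_stat and imp_unlink are hypothetical names chosen for this example,
as is the filename.

```python
import os
import sys
import tempfile


def imp_path(utf8_path):
    """Convert a UTF-8 encoded path into the form the OS call expects."""
    text = utf8_path.decode("utf-8")
    if sys.platform == "win32":
        # On Windows, CPython passes str paths to the wide-character API.
        return text
    # Elsewhere, re-encode to the filesystem encoding.
    return os.fsencode(text)


def imp_fopen(utf8_path, mode="rb"):
    return open(imp_path(utf8_path), mode)


def imp_open_exclusive(utf8_path):
    # Fails if the file already exists; O_BINARY exists only on Windows.
    flags = os.O_CREAT | os.O_EXCL | os.O_WRONLY | getattr(os, "O_BINARY", 0)
    fd = os.open(imp_path(utf8_path), flags, 0o666)
    return os.fdopen(fd, "wb")


def imp_stat(utf8_path):
    return os.stat(imp_path(utf8_path))


def imp_unlink(utf8_path):
    os.unlink(imp_path(utf8_path))


# Usage: callers only ever handle UTF-8 bytes.
with tempfile.TemporaryDirectory() as tmp:
    utf8_path = os.path.join(tmp, "модуль.pyc").encode("utf-8")
    with imp_open_exclusive(utf8_path) as f:
        f.write(b"data")
    size = imp_stat(utf8_path).st_size
    with imp_fopen(utf8_path) as f:
        content = f.read()
    imp_unlink(utf8_path)
    still_exists = os.path.exists(imp_path(utf8_path))
```

The point of the design is that the conversion lives in one place, so
the rest of the import code never needs to know which encoding the
platform wants.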
The attached patch covers Windows only (STILL NOT CONVERTED TO OTHER
OSes). On Windows it works (regression tests passed).
If this solution is acceptable for 3.1 (as far as I know, Brett Cannon
is working on the excellent importlib, but that library will replace the
imp functionality only in 3.2), I can continue the conversion.
Unfortunately I cannot test the py3k trunk on non-Windows machines, but
I can 'make all OS calls as expected' and wait for the buildbot results.
Please review import_patch_4th_edition.zip, and if I am going in the
wrong direction, let me know.
----------
nosy: +brett.cannon
Added file: http://bugs.python.org/file13618/import_patch_4th_edition.zip
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue5604>
_______________________________________