Alexandre Delanoë, 16.12.2013 16:57:
On Friday, 13 December 2013, at about 07:09, Stefan Behnel wrote:
It seems to me that this has nothing specifically to do with lxml. Your problem is badly/differently encoded file names. The solution might be to make sure you only use one encoding for file names, i.e. the one that your operating system (read: file system) expects.
You are right, but the parser is supposed to parse files from different operating systems.
Well, it does, as long as you stay within the bounds of each operating system. Note that the parser is only ever running on one operating system at a time. (Although, as I said, the problem you describe is not an OS issue but a file system issue.)

As soon as you start transferring files between different systems, it's your own responsibility to adapt the files and/or their names as needed. For example, you may have to adapt the encoding that a file system is mounted with in order to integrate it properly into the currently running system.

Basically, using different encodings on the same file system is just screaming for trouble in all sorts of places. Imagine the case where a directory name is encoded in one encoding and a file name in that directory uses a different encoding. Then there is simply no way to decode the complete file path any more.

Stefan
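
To illustrate the mixed-encoding case described above, here is a minimal Python 3 sketch. The directory and file names are made up for the example, and the choice of Latin-1 and UTF-8 is an assumption; any two incompatible encodings would show the same effect. It only manipulates byte strings and does not touch the file system.

```python
# Hypothetical path: the directory name was written as Latin-1,
# the file name inside it as UTF-8.
dir_bytes = "données".encode("latin-1")     # b'donn\xe9es'
file_bytes = "résumé.xml".encode("utf-8")   # b'r\xc3\xa9sum\xc3\xa9.xml'
raw_path = dir_bytes + b"/" + file_bytes


def try_decode(raw, encoding):
    """Return the decoded text, or None if the bytes are invalid in that encoding."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Decoding the whole path as UTF-8 fails on the Latin-1 directory name.
as_utf8 = try_decode(raw_path, "utf-8")

# Decoding the whole path as Latin-1 never fails, but garbles the UTF-8 file name.
as_latin1 = try_decode(raw_path, "latin-1")

# Only per-component decoding recovers the names, and that requires knowing
# each component's encoding in advance, which the file system does not record.
parts = raw_path.split(b"/")
per_component = parts[0].decode("latin-1") + "/" + parts[1].decode("utf-8")

print(as_utf8)        # None
print(as_latin1)      # données/rÃ©sumÃ©.xml
print(per_component)  # données/résumé.xml
```

No single encoding decodes the full path correctly here, which is why sticking to one encoding per file system avoids the problem in the first place.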