[Python-Dev] Improve open() to support reading file starting with an unicode BOM
Eric Smith
eric at trueblade.com
Fri Jan 8 22:40:47 CET 2010
>> Shouldn't this encoding guessing be a separate function that you call
>> on either a file or a seekable stream ?
>>
>> After all, detecting encodings is just as useful to have for non-file
>> streams.
>
> Other stream sources typically have out-of-band ways to signal the
> encoding: only when reading from the filesystem do we pretty much
> *have* to guess, and in that case the BOM / signature is the best
> heuristic we have. Also, some non-file streams are not seekable, and so
> can't be guessed via a pre-pass.
But what if the file were in (for example) a zip file? I think you
definitely want to have access to this functionality outside of open().
Eric.
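
To make the zip-file case concrete, here is a minimal sketch of BOM sniffing as a standalone function, kept separate from open(). The names guess_encoding and read_text are hypothetical and are not part of any proposal in this thread. The sketch assumes only that the caller can supply the leading bytes of the data, so it works for a zip member as well as for an ordinary file.

```python
import codecs
import io
import zipfile

# Check the longer BOMs first: the UTF-32-LE BOM starts with the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(prefix, default="utf-8"):
    """Return (encoding, bom_length) for the leading bytes of a stream.

    Hypothetical helper: falls back to `default` when no BOM is found.
    """
    for bom, name in _BOMS:
        if prefix.startswith(bom):
            return name, len(bom)
    return default, 0


def read_text(raw, default="utf-8"):
    """Decode a whole binary stream, skipping a leading BOM if present."""
    data = raw.read()
    encoding, skip = guess_encoding(data[:4], default)
    return data[skip:].decode(encoding)


# Build a zip archive in memory whose member starts with a UTF-16-LE BOM.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("note.txt", codecs.BOM_UTF16_LE + "héllo".encode("utf-16-le"))

# Read the member back through the same helper a plain file would use.
with zipfile.ZipFile(buf) as zf:
    with zf.open("note.txt") as member:
        text = read_text(member)
```

Because guess_encoding only looks at a byte prefix, it does not care whether those bytes came from the filesystem, a zip member, or some other stream. A non-seekable source would need to buffer the bytes it has already consumed, which read_text sidesteps here by reading everything at once.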