[Python-Dev] Improve open() to support reading file starting with a Unicode BOM

Eric Smith eric at trueblade.com
Fri Jan 8 22:40:47 CET 2010


>> Shouldn't this encoding guessing be a separate function that you call
>> on either a file or a seekable stream ?
>>
>> After all, detecting encodings is just as useful to have for non-file
>> streams.
>
> Other stream sources typically have out-of-band ways to signal the
> encoding:  only when reading from the filesystem do we pretty much
> *have* to guess, and in that case the BOM / signature is the best
> heuristic we have.  Also, some non-file streams are not seekable, and so
> can't be guessed via a pre-pass.

But what if the file were in (for example) a zip file? I think you
definitely want to have access to this functionality outside of open().
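
To illustrate, here is a minimal sketch of what such a standalone helper
might look like. The names guess_encoding_from_bom and read_text_from_zip
are hypothetical, made up for this example; nothing like them is being
claimed to exist in the standard library. The sketch only relies on the
BOM constants in the codecs module and on zipfile.ZipFile.read().

```python
import codecs
import zipfile

# Check the longer signatures first: the UTF-32 LE BOM begins with the
# same two bytes as the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding_from_bom(data, default=None):
    """Return a codec name based on a leading BOM in data, else default.

    Hypothetical helper: it takes bytes, so it does not care whether they
    came from a file, a zip member, or any other source.
    """
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default


def read_text_from_zip(archive, member, default="utf-8"):
    """Decode one zip member, using its BOM (if any) to pick the codec.

    Hypothetical helper; falling back to UTF-8 is an assumption of this
    sketch, not something proposed in the thread.
    """
    with zipfile.ZipFile(archive) as zf:
        raw = zf.read(member)
    # The "utf-8-sig", "utf-16" and "utf-32" codecs consume the BOM.
    return raw.decode(guess_encoding_from_bom(raw, default))
```

Because the detection step works on bytes rather than on a file name, the
same function serves open(), zip members, and any other source that can
hand over its first few bytes.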

Eric.



