[Python-Dev] Improve open() to support reading file starting with a Unicode BOM
Victor Stinner
victor.stinner at haypocalc.com
Fri Jan 8 23:10:32 CET 2010
On Friday 08 January 2010 22:40:47, Eric Smith wrote:
> >> Shouldn't this encoding guessing be a separate function that you call
> >> on either a file or a seekable stream ?
> >>
> >> After all, detecting encodings is just as useful to have for non-file
> >> streams.
> >
> > Other stream sources typically have out-of-band ways to signal the
> > encoding: only when reading from the filesystem do we pretty much
> > *have* to guess, and in that case the BOM / signature is the best
> > heuristic we have. Also, some non-file streams are not seekable, and so
> > can't be guessed via a pre-pass.
>
> But what if the file were in (for example) a zip file? I think you
> definitely want to have access to this functionality outside of open().
FYI my patch (encoding="BOM") is implemented in TextIOWrapper, and
TextIOWrapper takes a binary stream as input, not a filename.
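
To illustrate why working at the TextIOWrapper level covers the non-file case, here is a rough sketch of BOM detection on an arbitrary binary stream. It is not the patch itself: encoding="BOM" is only a proposal, and the helper name open_text_with_bom, its default parameter and the _BOMS table are made up for this example. It peeks at the first bytes rather than seeking, so it should also work on streams that are not seekable.

```python
import codecs
import io

# Longer UTF-32 signatures come before the UTF-16 ones they start with.
# Known ambiguity: UTF-16-LE text whose first character is U+0000 looks
# like a UTF-32-LE BOM; this sketch does not try to resolve that.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def open_text_with_bom(binary_stream, default="utf-8"):
    """Hypothetical helper: wrap a binary stream in a TextIOWrapper,
    choosing the codec from a leading BOM if one is present."""
    # peek() lets us look at the first bytes without consuming them.
    if hasattr(binary_stream, "peek"):
        buffered = binary_stream
    else:
        buffered = io.BufferedReader(binary_stream)
    head = buffered.peek(4)[:4]

    encoding = default
    for bom, name in _BOMS:
        if head.startswith(bom):
            # These codecs consume the BOM themselves while decoding.
            encoding = name
            break
    return io.TextIOWrapper(buffered, encoding=encoding)


# Usage with an in-memory stream instead of a file on disk.
data = codecs.BOM_UTF16_LE + "héllo".encode("utf-16-le")
text = open_text_with_bom(io.BytesIO(data)).read()
```

Since the helper only needs a binary stream, it should in principle accept the object returned by zipfile.ZipFile.open() as well, which is the zip file case raised above.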
--
Victor Stinner
http://www.haypocalc.com/