[Python-Dev] Improve open() to support reading file starting with an unicode BOM

"Martin v. Löwis" martin at v.loewis.de
Sun Jan 10 00:40:15 CET 2010

>> How does the requirement that it be implemented as a codec miss the
>> point?
> If we want it to be the default, it must be able to fall back on the current
> locale-based algorithm if no BOM is found. I don't think it would be easy for a
> codec to do that.

Yes. However, Victor apparently *doesn't* want it to be the default at
the moment; he wants the user to specify encoding="BOM". In that case it
isn't the default, and it is easy to implement as a codec.
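
To make the codec route concrete, here is a minimal sketch of what such
a codec might look like. Everything in it is an assumption for
illustration: the codec name "bom", the helper names, and the choice of
UTF-8 when no BOM is present are not taken from any actual patch. It
only covers one-shot decoding; a codec usable through open() would
presumably also need an incremental decoder.

```python
import codecs

# BOMs mapped to a codec that consumes the BOM while decoding.
# UTF-32 comes first because the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_bom(data, default="utf-8"):
    """Return the codec name implied by a leading BOM, else `default` (an assumption)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default


def bom_decode(data, errors="strict"):
    """Stateless decode: pick the real codec from the BOM, then delegate to it."""
    data = bytes(data)
    return codecs.decode(data, sniff_bom(data), errors), len(data)


def bom_encode(text, errors="strict"):
    """Encoding direction: write UTF-8 with a BOM (an arbitrary choice for this sketch)."""
    return codecs.encode(text, "utf-8-sig", errors), len(text)


def _search(name):
    # Hypothetical codec name; the lookup machinery lowercases it before calling us.
    if name == "bom":
        return codecs.CodecInfo(name="bom", encode=bom_encode, decode=bom_decode)
    return None


codecs.register(_search)
```

With that registered, data.decode("bom") dispatches on the BOM, and
such a module could just as well live on PyPI.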

>> FWIW, I agree with Walter that if it is provided through the encoding=
>> argument, it should be a codec. If it is built into the open function
>> (for whatever reason), it must be provided by some other parameter.
> Why not simply encoding=None?

I don't mind. Please re-read Walter's message - it only said that
*if* this is activated through encoding="BOM", *then* it must be
a codec, and could be on PyPI. I don't think Walter was talking about
the case "it is not activated through encoding='BOM'" *at all*.

> The default value should provide the most useful
> behaviour possible. Forcing users to choose between two different autodetection
> strategies (encoding=None and another one) is a little insane IMO.

That wouldn't disturb me much. There are a lot of things in that area
that are a little insane, starting with Microsoft Windows :-)

