[Python-ideas] Support WHATWG versions of legacy encodings

Thu Jan 11 04:01:04 EST 2018

On Thu, Jan 11, 2018 at 7:58 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> On 11.01.2018 01:22, Nick Coghlan wrote:
>> On 11 January 2018 at 05:04, M.-A. Lemburg <mal at egenix.com> wrote:
>>> For the stdlib, I think we should stick to standards and
>>> not go for spreading non-standard ones.
>>>
>>> So -1 on adding WHATWG encodings to the stdlib.
>>
>> We already support HTML5 in the standard library, and saying "We'll
>> accept WHATWG's definition of HTML, but not their associated text
>> encodings" seems like a strange place to draw a line when it comes to
>> standards support.
>
> There's a problem with these encodings: they are mostly meant
> for decoding (broken) data, but as soon as we have them in the stdlib,
> people will also start using them for encoding data, producing more
> corrupted data.
>
> Do you really think it's a good idea to support this natively
> in Python?
>
> The other problem is that WHATWG considers its documents "living
> standards", i.e. they are subject to change and don't come with
> a version number (apart from a date).
>
> This makes sense when you look at their mostly decoding-only
> nature, but, again, for encoding it creates an interoperability problem.

Would it be viable to have them in the stdlib for decoding only? To
have them simply not work for encoding?
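
To make the question concrete, here is a rough sketch of what a
decode-only codec could look like using the existing codecs.register()
hook. This is not a proposed stdlib implementation. The codec name
"demo_decode_only_1252" and the helper functions are made up for
illustration, and the decoder is only a stand-in: it reuses cp1252 and
passes the five bytes cp1252 leaves undefined (0x81, 0x8D, 0x8F, 0x90,
0x9D) through as the matching C1 control characters, which is how the
WHATWG windows-1252 table treats them.

```python
import codecs

# Hypothetical codec name, used only for this sketch.
CODEC_NAME = "demo_decode_only_1252"

# Bytes that Python's cp1252 codec leaves undefined.
_UNDEFINED = {0x81, 0x8D, 0x8F, 0x90, 0x9D}


def _decode(data, errors="strict"):
    """Decode leniently: undefined cp1252 bytes become C1 controls."""
    data = bytes(data)
    text = "".join(
        chr(b) if b in _UNDEFINED else bytes([b]).decode("cp1252")
        for b in data
    )
    # A codec's decode function returns (text, number of bytes consumed).
    return text, len(data)


def _encode(text, errors="strict"):
    """Refuse to encode, so the codec cannot produce new data."""
    raise NotImplementedError(f"{CODEC_NAME} is a decode-only codec")


def _search(name):
    # The registry passes a normalized (lowercased) name to search functions.
    if name == CODEC_NAME:
        return codecs.CodecInfo(name=CODEC_NAME, encode=_encode, decode=_decode)
    return None


codecs.register(_search)

decoded = b"caf\xe9 \x80 \x81".decode(CODEC_NAME)

try:
    "text".encode(CODEC_NAME)
    encode_refused = False
except NotImplementedError:
    encode_refused = True
```

With something along these lines, bytes.decode() works while
str.encode() fails loudly, so nobody can use the codec to produce new
data by accident.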

ChrisA