[New-bugs-announce] [issue39154] "utf8-sig" missing from codecs (inconsistency)

Peter Ludemann report at bugs.python.org
Sun Dec 29 12:42:10 EST 2019


New submission from Peter Ludemann <peter.ludemann at gmail.com>:

In general, 'utf8' and 'utf-8' are interchangeable in the codecs (and in many parts of the Python library). However, 'utf8-sig' is not accepted as an alias for 'utf-8-sig', even though it happens to be generated by lib2to3.tokenize.detect_encoding.

>>> import codecs
>>> codecs.getincrementaldecoder('utf-8-sig')()
<encodings.utf_8_sig.IncrementalDecoder object at 0x7fecbcdbbc10>
>>> codecs.getincrementaldecoder('utf8-sig')()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.6/codecs.py", line 987, in getincrementaldecoder
    decoder = lookup(encoding).incrementaldecoder
LookupError: unknown encoding: utf8-sig

----------
components: Unicode
messages: 358994
nosy: Peter Ludemann, ezio.melotti, vstinner
priority: normal
severity: normal
status: open
title: "utf8-sig" missing from codecs (inconsistency)
type: behavior
versions: Python 3.6, Python 3.7, Python 3.8

_______________________________________
Python tracker <report at bugs.python.org>
<https://bugs.python.org/issue39154>
_______________________________________

