[issue9411] configparser doesn't support specifying encoding in read()

Thu Jul 29 10:08:21 CEST 2010

New submission from Łukasz Langa <lukasz at langa.pl>:

By default, configparser classes simply `open()` and `read()` the files specified in the list passed to `.read()`. As a result, these calls use the platform's default encoding, so a file written in any other encoding may be misread or fail to decode.

The existing workaround is to open each file manually with the desired encoding and pass the files to `readfp()` one by one. This is needlessly complex and adds boilerplate to every caller.
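For illustration, the manual workaround might look roughly like the sketch below. The helper name `read_with_encoding` and the sample file are hypothetical. The sketch calls `read_file()`, the name under which later Python releases expose `readfp()`; on an interpreter that only has `readfp()`, that call would be substituted.

```python
import configparser
import os
import tempfile


def read_with_encoding(parser, filenames, encoding):
    """Hypothetical helper: feed files to the parser one by one,
    opening each with an explicit encoding.

    Returns the list of files that could be opened, skipping the rest.
    """
    read_ok = []
    for filename in filenames:
        try:
            with open(filename, encoding=encoding) as fp:
                # read_file() is the later name for readfp()
                parser.read_file(fp, source=filename)
        except OSError:
            # Missing or unreadable file: skip it
            continue
        read_ok.append(filename)
    return read_ok


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.ini")
    missing = os.path.join(tmp, "missing.ini")
    with open(path, "w", encoding="utf-8") as f:
        f.write("[user]\nname = Łukasz\n")

    parser = configparser.ConfigParser()
    loaded = read_with_encoding(parser, [path, missing], "utf-8")
    name = parser["user"]["name"]
```

Every caller that needs a non-default encoding has to carry a loop like this, which is the boilerplate the report refers to.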

Please find attached a patch that adds an `encoding=` argument to the `read()` method. By default it uses `sys.getdefaultencoding()`, so the behaviour is backwards compatible. We might consider switching the default to 'UTF-8', but many INI files from Windows land are encoded in Windows-specific code pages.

Anyway, the currently proposed implementation is compatible and enables specifying an `encoding` explicitly. The patch includes a new unit test and some minor fixes for behaviour exposed by this test.
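A minimal sketch of how the proposed argument would be used, assuming an interpreter whose `read()` accepts the `encoding` keyword as described here. The file name, its contents and the choice of `cp1250` are made up for the example.

```python
import configparser
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "windows.ini")

    # Simulate an INI file saved in a Windows-specific code page
    with open(path, "w", encoding="cp1250") as f:
        f.write("[user]\nname = Łukasz\n")

    parser = configparser.ConfigParser()
    # The encoding is passed once and applied to every file in the list
    read_ok = parser.read([path], encoding="cp1250")
    name = parser["user"]["name"]
```

Compared with the per-file workaround, the caller keeps the usual list-based `read()` call and only names the encoding.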

----------
files: cfgparser.3
messages: 111899
nosy: brian.curtin, ezio.melotti, georg.brandl, lukasz.langa, michael.foord
priority: normal
severity: normal
status: open
title: configparser doesn't support specifying encoding in read()
Added file: http://bugs.python.org/file18247/cfgparser.3

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9411>
_______________________________________