[Python-Dev] Relaxing Unicode error handling

Stuart Bishop stuart at stuartbishop.net
Tue Jan 6 21:14:46 EST 2004


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1


On 03/01/2004, at 10:44 PM, Martin v. Loewis wrote:

> People keep complaining that they get Unicode errors
> when funny characters start showing up in their inputs.
>
> In some cases, these people would apparently be happy
> if Python would just "keep going", IOW, they want to
> get moji-bake (garbled characters), instead of
> exceptions that abort their applications.

I think the key word here is 'apparently'. I also think
that giving people this knob to tweak will simply shuffle
their problem from one place to another. This will be a net
loss, since we will end up with these users trying to repair
their damaged data, as well as developers having to maintain
systems or integrate with libraries where this knob was twiddled
either through ignorance or as a quick fix. If you are writing
an XML processing application, it might appear quite sane to set
unicode.errorhandling to 'xmlcharref' but it will bite you on
the bum eventually when that project grows.
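
To make the data-loss argument concrete, here is a minimal sketch using
the per-call `errors` argument of `str.encode` as it exists in current
Python 3. It does not use the global `unicode.errorhandling` property,
which is only a proposal in this thread. The sample string and the
helper name `encode_ascii` are illustrative assumptions.

```python
# Sample text containing characters that ASCII cannot represent.
text = "caf\u00e9 \u65e5\u672c"


def encode_ascii(s, errors):
    """Encode s to ASCII using the named error handler (hypothetical helper)."""
    return s.encode("ascii", errors)


# 'strict' (the default) raises instead of silently damaging the data.
try:
    encode_ascii(text, "strict")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True

# 'replace' substitutes '?' for each unencodable character.
replaced = encode_ascii(text, "replace")

# 'ignore' drops each unencodable character entirely.
ignored = encode_ascii(text, "ignore")

# 'xmlcharrefreplace' writes numeric character references instead.
charref = encode_ascii(text, "xmlcharrefreplace")

# Neither lossy result can be decoded back to the original string.
replace_round_trips = replaced.decode("ascii") == text
ignore_round_trips = ignored.decode("ascii") == text

# The character-reference form only round-trips if every consumer knows
# to unescape it; a plain decode returns different text.
charref_round_trips = charref.decode("ascii") == text
```

With 'replace' and 'ignore' the original characters are gone for good,
and with 'xmlcharrefreplace' the output is only correct for consumers
that expect XML-style escapes, which is the kind of problem that tends
to surface later, far from the code that chose the handler.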

> I'd like to add a static property unicode.errorhandling,
> which defaults to "strict". Applications could set this
> to "replace" or "ignore", silencing the errors, and
> risking loss of data.

I'd be happier with a command line argument for this sort of thing.
This way, it is explicit that 'I want to run this application with
lossy Unicode conversions' without having to worry about something
you didn't write flipping the switch.
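
As a rough application-level approximation of that idea (the suggestion
above is about an interpreter option, which this does not implement),
the sketch below makes lossy conversion an explicit opt-in on the
command line. The flag name `--lossy-unicode` and the helpers
`build_parser`, `error_policy` and `to_latin1` are hypothetical names
chosen for illustration.

```python
import argparse


def build_parser():
    """Build a parser with a hypothetical opt-in flag for lossy conversion."""
    parser = argparse.ArgumentParser(description="Convert text to Latin-1.")
    parser.add_argument(
        "--lossy-unicode",
        choices=["replace", "ignore"],
        default=None,
        help="allow lossy conversion using the named error handler",
    )
    return parser


def error_policy(argv):
    """Return the error handler to use: 'strict' unless the user opted in."""
    args = build_parser().parse_args(argv)
    return args.lossy_unicode or "strict"


def to_latin1(text, errors):
    """Encode text to Latin-1, passing the policy explicitly on every call."""
    return text.encode("latin-1", errors)


# Without the flag, the policy stays strict.
default_policy = error_policy([])

# With the flag, the person running the program has chosen lossy output.
lossy_policy = error_policy(["--lossy-unicode", "replace"])
lossy_output = to_latin1("\u65e5", lossy_policy)
```

Because the policy is passed as an argument rather than stored in a
shared global, a library imported by the program has no switch to flip.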

- --  
Stuart Bishop <stuart at stuartbishop.net>
http://www.stuartbishop.net/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (Darwin)

iD8DBQE/+2uZAfqZj7rGN0oRAsyJAJ4yyO1/VwF8131Ev8IN44KtpAH1MQCfRlhI
6e65Oclei4FonvO5gHBbEjc=
=ikNP
-----END PGP SIGNATURE-----



