How do I automate the removal of all non-ascii characters from my code?
rhodri at wildebst.demon.co.uk
Mon Sep 12 23:39:41 CEST 2011
On Mon, 12 Sep 2011 15:47:00 +0100, jmfauth <wxjmfauth at gmail.com> wrote:
> On 12 sep, 10:49, Steven D'Aprano <steve
> +comp.lang.pyt... at pearwood.info> wrote:
>> Even with a source code encoding, you will probably have problems with
>> source files including \xe2 and other "bad" chars. Unless they happen to
>> fall inside a quoted string literal, I would expect to get a
> This is absurd and complete nonsense. The purpose
> of a coding directive is to inform the engine, which
> is processing a text file, about the "language" it
> has to speak. It can be an html, py or tex file.
> If you have a problem, it's probably a mismatch between
> your coding directive and the real coding of the
> file. Typical case: ascii/utf-8 without signature.
Now read what Steven wrote again. The issue is that the program contains
characters that are syntactically illegal. The "engine" can translate a
character perfectly correctly as a smart quote or a non-breaking space
or an e-umlaut or whatever, but that doesn't make the character legal!
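
A minimal sketch of the distinction, assuming Python 3 (where non-ASCII
letters such as an e-umlaut are allowed in identifiers, so smart quotes and
a non-breaking space are used as the "bad" characters instead). The helper
name compiles() and the HEADER constant are made up for illustration. Each
snippet is encoded as UTF-8 behind a matching coding directive, so decoding
always succeeds; only the position of the character decides whether the
source is legal.

```python
# Hypothetical helper: prepend a coding directive, encode the source as
# UTF-8 bytes, and report whether Python accepts it.
HEADER = "# -*- coding: utf-8 -*-\n"


def compiles(source_text):
    """Return True if the snippet compiles, False if it is a SyntaxError."""
    data = (HEADER + source_text).encode("utf-8")
    try:
        # compile() decodes the bytes according to the coding directive.
        compile(data, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Smart quotes (U+201C, U+201D) inside an ordinary string literal: legal.
inside_literal = 's = "\u201chi\u201d"\n'

# The same smart quotes used as if they were string delimiters: illegal.
as_delimiters = 's = \u201chi\u201d\n'

# A non-breaking space (U+00A0) where a plain space belongs: illegal.
nbsp_in_code = 'x\u00a0= 1\n'

for name, snippet in [
    ("inside_literal", inside_literal),
    ("as_delimiters", as_delimiters),
    ("nbsp_in_code", nbsp_in_code),
]:
    print(name, compiles(snippet))
```

The coding directive is correct in all three cases. The last two fail
anyway, because the decoded character is not one the grammar allows at
that point.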
Rhodri James *-* Wildebeest Herder to the Masses