[Idle-dev] Unicode within IDLE

Fredrik Lundh <effbot@telia.com>
Mon, 3 Apr 2000 16:41:08 +0200


Neil Hodgson wrote:
> I was pleased to find that IDLE is already using a Unicode based text
> widget so is able to display non-roman strings.
>
> However, the syntax colouring is disturbed by this. It appears to be
> confused by the difference in length when measuring bytes or characters.

the regular expression engine used by IDLE's tokenizer
doesn't (yet) understand UTF-8.

in other words, it sees your UTF-8 character as two
distinct 8-bit characters, while the Text widget sees
it as a single character.
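
a minimal sketch of that mismatch, written in present-day Python 3
syntax (an assumption; it is not the code IDLE uses). the sample
string and the variable names are made up for illustration. it shows
how an offset found in the encoded bytes lands one position too far
once a two-byte UTF-8 character precedes it, and one possible way to
map the byte offset back to a character offset:

```python
import re

# hypothetical sample line: one non-ASCII character, then ASCII
text = "\u00e9 = 1"              # U+00E9, e with acute accent
data = text.encode("utf-8")      # U+00E9 becomes two bytes in UTF-8

# the Text widget counts characters; a byte-oriented matcher counts bytes
char_count = len(text)
byte_count = len(data)

# position of the digit as seen by each side
char_offset = re.search(r"\d", text).start()
byte_offset = re.search(rb"\d", data).start()

# decoding the prefix converts a byte offset into a character offset
mapped_offset = len(data[:byte_offset].decode("utf-8"))

print(char_count, byte_count)                   # 5 6
print(char_offset, byte_offset, mapped_offset)  # 4 5 4
```

any colour tag placed at the byte offset ends up shifted right by one
character for every two-byte character earlier in the text.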

this shouldn't be very hard to fix, once 'sre' is up to
snuff.  maybe in the next alpha.

</F>