Linguistically correct Python text rendering
David Opstad
opstad at batnet.com
Wed Feb 25 10:01:52 EST 2004
In article <m3u11ffk1f.fsf at pc150.maths.bris.ac.uk>,
Michael Hudson <mwh at python.net> wrote:
> But it seems to be impossible to programmatically determine which
> encoding the terminal being printed to at a given moment is using (and
> the user can fiddle this at run time). If I'm wrong about this, I'd
> like to know.
The encoding issue is peripheral to my point; sorry if I wasn't clear
in my original message. It doesn't matter what the encoding is. The main
issue is that for some writing systems (e.g. Arabic), simply outputting
the characters of a Unicode string in their stored order, irrespective
of encoding, will produce garbled results.
> What more would you have us do?
Well, for those writing systems whose presentation forms are included in
Unicode, how about a further processing step, so that at a minimum, if I
start with an Arabic string like "abc", I can get out an Arabic string
like "CBA" in which bidi reordering has happened and contextual
substitution has been done? Then, outputting the processed Unicode
string using stdout will work without further intervention (assuming a
font for the writing system is present, of course).
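To make the "abc" to "CBA" idea concrete, here is a rough sketch of such a processing step. It is not anything Python provides; the names FORMS, shape and to_visual are made up for illustration, the table covers only three letters, and the reordering handles only strings that are right-to-left throughout. Real text needs the full Unicode bidirectional algorithm and complete joining data.

```python
import unicodedata

# Illustrative table only: base letter -> (isolated, final, initial, medial)
# code points from the Arabic Presentation Forms-B block. ALEF joins only
# to the preceding letter, so it has no initial or medial form.
FORMS = {
    "\u0627": ("\uFE8D", "\uFE8E", None, None),          # ALEF
    "\u0628": ("\uFE8F", "\uFE90", "\uFE91", "\uFE92"),  # BEH
    "\u062A": ("\uFE95", "\uFE96", "\uFE97", "\uFE98"),  # TEH
}
ISOLATED, FINAL, INITIAL, MEDIAL = range(4)


def joins_to_next(ch):
    """True if ch is in the table and can connect to the following letter."""
    forms = FORMS.get(ch)
    return forms is not None and forms[INITIAL] is not None


def shape(text):
    """Contextual substitution: pick a presentation form for each letter."""
    out = []
    for i, ch in enumerate(text):
        forms = FORMS.get(ch)
        if forms is None:
            out.append(ch)  # not in our tiny table: leave untouched
            continue
        joined_before = i > 0 and joins_to_next(text[i - 1])
        joined_after = (
            joins_to_next(ch) and i + 1 < len(text) and text[i + 1] in FORMS
        )
        if joined_before and joined_after:
            index = MEDIAL
        elif joined_before:
            index = FINAL
        elif joined_after:
            index = INITIAL
        else:
            index = ISOLATED
        out.append(forms[index])
    return "".join(out)


def to_visual(text):
    """Shape, then reorder from logical to visual (left-to-right) order.

    Assumption: the whole string is right-to-left, so reordering is a
    plain reversal. Anything else is rejected rather than guessed at.
    """
    shaped = shape(text)
    if all(unicodedata.bidirectional(c) in ("R", "AL") for c in shaped):
        return shaped[::-1]
    raise ValueError("mixed-direction text needs the full bidi algorithm")
```

For example, to_visual("\u0628\u0627\u0628") (BEH, ALEF, BEH in logical order) gives the isolated BEH, final ALEF and initial BEH forms, in that left-to-right order. This only helps on a display that does no bidi handling of its own; one that does would presumably reorder the string a second time.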
It's probably irrational of me, I admit, but I'd love to see Python
correctly render *any* Unicode string, not just the subsets requiring no
reordering or contextual processing.
Dave