[Python-3000] Displaying strings containing unicode escapes

Terry Reedy tjreedy at udel.edu
Thu May 1 19:49:37 CEST 2008


"Martin v. Löwis" <martin at v.loewis.de> wrote in message
news:4819F21D.8070808 at v.loewis.de...
| > > I think "standard repertoire based on Unicode" may be confusing the
| > > issue.
| >
| > By "standard repertoire" I mean that all Pythons will show the same
| > characters the same way, while "based on Unicode" is intended to mean
| > looking at TR#36 and TR#39 in picking the repertoires.
|
| I don't think either TR#36 or TR#39 are applicable here. This is not
| identifier syntax; there may be various symbols and whatnot in the
| string, which should also be rendered as-is.
|
| The escaping that repr() does is *not* to achieve unambiguity,
| but to achieve printability.

I agree with Martin that chasing 'unambiguity' is something of a chimera.
Whether the glyphs for two Unicode chars are identical depends on the
display system.  As I type these here, 1 (one) and l (el) are barely
distinguishable, depending on reading lens and distance.  Should one be
escaped?  I think not.  I have had displays in which they are pixel for
pixel identical, but also ones which made them clearly different.  Ditto for
0 (zero) and O (Oh).  A and <Alpha> *could* be made to look different on
modern high-definition outputs.  I suspect they already have been or will
be.
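
To make the A versus <Alpha> case concrete, here is a minimal sketch, assuming a Python 3 interpreter and the standard unicodedata module. The variable names are only illustrative. It shows that the two characters are distinct code points with distinct names, whatever a given display does with their glyphs.

```python
import unicodedata

# Illustrative names: Latin capital A and Greek capital Alpha.
latin_a = "A"
greek_alpha = "\u0391"

# The code points and names differ even where the glyphs do not.
for ch in (latin_a, greek_alpha):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# The strings compare unequal regardless of how they are rendered.
print(latin_a == greek_alpha)
```

Whether the two look alike on screen is a property of the font and output device, not of the string itself.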

I think standard Python should somehow have two options: escape everything 
but ASCII (for unambiguity and old display systems) and escape nothing that 
is potentially printable (leaving partially capable systems to fare as they 
will).  In-between solutions will ultimately be programmer and system 
specific.
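
A rough sketch of what the two options look like, assuming the behavior of the ascii() and repr() builtins in Python 3 as later released (ascii() escapes everything outside ASCII, repr() escapes only what str.isprintable() rejects). The sample string and variable names are made up for illustration.

```python
# Illustrative sample: a Latin-1 letter, a Greek letter, and a newline.
sample = "L\u00f6wis \u0391\n"

# Option 1: escape everything but ASCII.
ascii_form = ascii(sample)
print(ascii_form)      # 'L\xf6wis \u0391\n'

# Option 2: escape only what is not printable (here, just the newline).
printable_form = repr(sample)
print(printable_form)

# Both forms evaluate back to the same string.
print(eval(ascii_form) == eval(printable_form) == sample)
```

How the second form actually appears still depends on the terminal and font, which is the "fare as they will" part.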

Terry Jan Reedy




