A challenge to the ASCII proponents.
Steven D'Aprano
steve at cyber.com.au
Sun Jul 20 21:56:51 EDT 2003
Alan Kennedy <alanmk at hotmail.com> wrote in message news:<3F1AAC0A.7A6326B at hotmail.com>...
> Alan Kennedy:
>
> > The final point I'd like to make [explicit] is: nobody had to ask
> > me how or why my xml snippet worked: there were no tricks. Nobody
> > asked for debugging information, or for reasons why they couldn't
> > see it:
Sorry Alan, but when I follow your instructions and save your XML to
disk and open it in Opera 6.01 on Win 98, I get this:
XML parsing failed: not well-formed (1:0)
At least it renders visibly in my browser, although I don't think it's
rendering the way you wished. <grin>
(For the record, this is the contents of the XML file, triple-quoted
for your convenience:
"""<?xml version="1.0" encoding="utf-8"?>
<verb>γίγνωσκω</verb>""")
[snip]
> In summary:
>
> 1. I managed to make a greek word, using the original greek glyphs,
> appear on everyone's "rendering surface", by posting a 7-bit clean XML
> snippet. Another poster widened the software coverage even further by
> posting a 7-bit clean HTML snippet. Both of our 7-bit markup snippets
> travelled safely throughout the entirety of UseNet, including all the
> 7-bit relays and gateways.
I couldn't see either one rendered correctly, whether in Opera's
newsreader or in the Google archive.
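For anyone who wants to check what a 7-bit clean version of such a
snippet looks like on the wire, here is a minimal sketch. It assumes
modern Python 3 and assumes the non-ASCII characters are written as
numeric character references; the function name to_7bit_xml is made up
for illustration, and the code points are my reading of the word above.

```python
import xml.etree.ElementTree as ET

# The Greek verb from the snippet, written with escapes so that this
# source file is itself 7-bit clean (code points are an assumption).
word = "\u03b3\u03af\u03b3\u03bd\u03c9\u03c3\u03ba\u03c9"

def to_7bit_xml(text):
    """Wrap text in a <verb> element using only ASCII bytes.

    Hypothetical helper: every non-ASCII character becomes a numeric
    character reference such as &#947;. It does NOT escape '&' or '<',
    so it is only suitable for text without markup characters.
    """
    body = text.encode("ascii", "xmlcharrefreplace").decode("ascii")
    return '<?xml version="1.0" encoding="utf-8"?>\n<verb>' + body + "</verb>"

snippet = to_7bit_xml(word)
print(snippet)

# A conforming XML parser should turn the references back into the word.
parsed = ET.fromstring(snippet.encode("ascii")).text
print(parsed == word)
```

Whether a given newsreader or browser then displays the result is a
separate question, as the Opera error above shows.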
> 2. The only other person who managed it, without using markup, was
> Martin von Loewis, who is so good at this stuff that he confidently
> makes statements like "what I did was right: it was Google that got it
> wrong". Martin used the UTF-8 character set, i.e. a non-ASCII,
> non-7-bit-clean character set, to achieve this. Although I'm sure
> Martin could have managed it with UTF-7 as well.
Martin's effort did work for me in Opera's newsreader, but not in the
Google Groups archive. But we already knew that Google broke it.
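The difference between the two encodings mentioned above is easy to
check. This is only a sketch, assuming modern Python 3 and its built-in
"utf-8" and "utf-7" codecs; is_7bit_clean is a name I made up.

```python
# a, o and u with two dots on top, written as escapes
text = "\u00e4\u00f6\u00fc"

def is_7bit_clean(data):
    """Hypothetical helper: True if every byte fits in 7 bits."""
    return all(byte < 128 for byte in data)

as_utf8 = text.encode("utf-8")
as_utf7 = text.encode("utf-7")

print(is_7bit_clean(as_utf8))   # False: UTF-8 uses bytes above 127 here
print(is_7bit_clean(as_utf7))   # True: UTF-7 stays within 7-bit ASCII
print(as_utf7.decode("utf-7") == text)
```

Being 7-bit clean only gets the bytes through the relays; the reader's
software still has to know how to decode them.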
> 3. If anybody else was willing to give it a try, they don't seem to
> have had enough confidence in their knowledge of encodings, MIME,
> transports, NNTP, etc, etc, to have actually hit the "send" button, in
> case it didn't work. Which doesn't bode well for the average person in
> the street: if the technology specialists in this newsgroup don't feel
> in command of the issue, what hope for everyone else?
Exactly. Which brings us back to Ben's suggestion: when writing for a
general audience using unknown systems, stick to ASCII, or at least
follow your rich text with a description of what your reader should
see:
"""And I can use Umlauts (äöü) -- you should see a, o and u all in
lowercase with two dots on top."""
It's a mess and I despair. It would be nice if everyone used bug-free
XML-aware newsreaders, browsers and mail clients, but the majority
don't. That's why I always practice defensive writing whenever I use
any character I can't see on my keyboard, and spell it out in ASCII.
That's not very satisfactory, but it's better than some random
percentage of your audience seeing "?????".
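The spelling-out can be partly automated. A minimal sketch, assuming
modern Python 3 and the standard unicodedata module; spell_out is a
hypothetical helper, not anything from this thread.

```python
import unicodedata

def spell_out(text):
    """Return an ASCII description of each distinct non-ASCII character.

    Hypothetical helper: descriptions use the official Unicode character
    names, in order of first appearance.
    """
    seen = []
    for ch in text:
        if ord(ch) > 127 and ch not in seen:
            seen.append(ch)
    return [
        "U+%04X %s" % (ord(ch), unicodedata.name(ch, "UNNAMED CHARACTER"))
        for ch in seen
    ]

notes = spell_out("Umlauts (\u00e4\u00f6\u00fc)")
for line in notes:
    print(line)
```

The official names are clumsier than "a with two dots on top", but they
are pure ASCII and unambiguous, so they survive any transport.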
--
Steven D'Aprano