[Doc-SIG] Non-ASCII Characters in reST (was: [Doc-SIG] docutils feedback)

Aahz aahz@pythoncraft.com
Tue, 18 Jun 2002 09:04:55 -0400


On Tue, Jun 18, 2002, Schlaepfer, Ueli (ESEC ZG) wrote:
> David Goodger wrote:
>> 
>> This is an input encoding issue, the solution to which (you guessed
>> it!) hasn't been implemented yet.  I'm not even sure what the solution
>> should be.  Although I've worked with different "character sets" in
>> the past (such as Japanese SJIS, and Chinese and Korean encodings),
>> the encoding was always known beforehand.  With Docutils, it won't be.
>> Anyone with Unicode encoding/decoding experience, I'd appreciate some
>> advice.
> 
> Emacs does some  guesswork concerning file encoding --  should we have a
> look at that for a starter?

Again, see PEP 263.  As David said, there are a lot of problems with it,
but it *does* start with Emacs as its base.
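
For illustration only, here is a rough sketch of how a PEP 263-style
declaration could be looked up in a document's first two lines before
decoding it.  The helper names (declared_encoding, decode_document), the
"ascii" fallback, and the idea of putting the declaration in a reST
comment are all assumptions made for this example; none of this is
something Docutils does.

```python
import codecs
import re

# Pattern modelled on the PEP 263 declaration, e.g. "-*- coding: latin-1 -*-".
CODING_RE = re.compile(rb"coding[:=]\s*([-\w.]+)")


def declared_encoding(data, default="ascii"):
    """Return the encoding named in the first two lines of data, else default.

    Hypothetical helper: the declaration is only searched for, never guessed.
    """
    for line in data.splitlines()[:2]:
        match = CODING_RE.search(line)
        if match:
            name = match.group(1).decode("ascii")
            codecs.lookup(name)  # raises LookupError for an unknown encoding
            return name
    return default


def decode_document(data, default="ascii"):
    """Decode raw bytes using the declared encoding, else the default."""
    return data.decode(declared_encoding(data, default))
```

A document whose first line is ".. -*- coding: latin-1 -*-" would then be
decoded as Latin-1, and one with no declaration would fall back to the
default and fail loudly on any non-ASCII byte instead of guessing.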

> The  language is much  less of  an issue,  I think.   Stating it  in the
> document  if it's  not  what the  default  would be  is  easy enough  to
> understand, and such a statement won't  turn into a lie as easily as the
> encoding.  A  command-line option is a  must, though; I  don't expect an
> American to state  that his documents are in English,  but I won't state
> it if mine are  in German either.  So I need a  way to tell the frontend
> what language it should use.

You edit the document to add the language.  Otherwise, what if you're
processing multiple documents in a single run, all in different
languages?

BTW, right-justified text looks ugly in a monospaced font.
-- 
Aahz (aahz@pythoncraft.com)           <*>         http://www.pythoncraft.com/

Project Vote Smart: http://www.vote-smart.org/