[Doc-SIG] German Umlauts

grubert at users.sourceforge.net
Tue Jul 8 10:24:26 EDT 2003


On Mon, 7 Jul 2003, Christian Tismer wrote:

> > On Mon, 7 Jul 2003, Dinu Gherman wrote:
> >
> >
> >>Christian Tismer:
> >>
> >>
> >>>Where I have problems are so-called "Umlauts".
> >>>I know this is kinda Unicode issue, but I'd just
> >>>like to know if and how this is handled in RST?
> >>
> >>I'm much of a reST newbie myself, but I do see Umlauts using the
> >>Latin-1 encoding and a call like this:
> >>
> >>   rest = docutils.core.publish_string(
> >>         text,
> >>         writer_name='html',
> >>         settings_overrides={'input_encoding': 'latin-1',
> >>                             'output_encoding': 'latin-1'})
> >>
> >>
> >>>The simple HTML way of &uuml; doesn't work (why?).
> >>
> >>Obviously, normal HTML snippets are not just recognised without
> >>using some kind of magic directives or escaping mechanisms...
> >>Finding that out is on my todo list as well...
>
> Sure, I meant it like a suggestion.
>
> grubert at users.sourceforge.net wrote:
>
> > being ignorant as long as possible, my 2cent:
> >
> > 0.01: we have two encodings
> >
> >       a. the input encoding: which tells the reader (reST parser)
> >          what to expect.
> >       b. the output encoding: which tells the writer what to
> >          produce.
> >
> > 0.02: the html writer does handle the smallest possible number
> >       of html character encoding &<>" and as a bonus @ will be
> >       written as &#64; to maybe fool some foolish spamrobots.
> >
> >       believing from seeing: when i make a document here (latin1 or iso-8859-1)
> >
> >       a) running through: html.py without specifying an encoding gives me
> >          ``Ã?`` for an Ä.
> >       b) option "-i iso-8859" produces "Ä" for "Ä".
> >          and ``<?xml version="1.0" encoding="utf-8" ?>`` in the header.
>
> Thanks a lot, I already found out that it is
> possible to specify encoding schemes.
> Ok, I should have been more specific:
> I'm using ReST in a Zope ZWiki.
> How do I change the encoding for Zope?
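
on the ``Ã?`` in a) above: my guess is that the writer produced utf-8 and
the bytes were then displayed as latin-1. a small sketch in plain python
(the function name is made up for illustration):

```python
def misread_utf8_as_latin1(text):
    """Encode text as UTF-8, then decode the bytes as if they were Latin-1."""
    return text.encode("utf-8").decode("latin-1")


# U+00C4 is the capital A umlaut; written as an escape so the example
# does not depend on the encoding of this file
garbled = misread_utf8_as_latin1("\u00c4")
print([hex(ord(ch)) for ch in garbled])
```

the umlaut turns into two characters: U+00C3 (Ã) and the control character
U+0084, which many viewers show as ``?``.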

how do you call docutils?

1. above is the example for a custom publisher,

2. if you use html.py (maybe not a bad idea: let zope cache the page and decide
   when a rebuild is necessary), use the command line or the configuration file.

3. ZReST http://docutils.sourceforge.net/sandbox/richard/ZReST/

which problem do you have with each of these options?

> Is it possible to set that per page, or do I have to set something by
> environment variables, global for Zope?

depends :-)
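
if you go the custom publisher way, a per-page setting could look roughly
like this. it is only a sketch: ``encoding_overrides`` and ``render_page``
are names i made up, they are not part of docutils or zope, and where the
page's encoding is stored (a page property, say) is left open. the docutils
call is the same ``publish_string`` call quoted above.

```python
def encoding_overrides(input_encoding, output_encoding="utf-8"):
    """Build the settings_overrides dict for one page (hypothetical helper)."""
    return {
        "input_encoding": input_encoding,
        "output_encoding": output_encoding,
    }


def render_page(source, input_encoding, output_encoding="utf-8"):
    """Render one reST page to HTML using that page's own encodings."""
    # imported here so the helper above can be used without docutils installed
    import docutils.core

    return docutils.core.publish_string(
        source,
        writer_name="html",
        settings_overrides=encoding_overrides(input_encoding, output_encoding),
    )
```

a page stored as latin-1 would then be rendered with
``render_page(page_bytes, "latin-1")``, and another page on the same site
could pass a different encoding.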

> What would be the most flexible default encoding for a multilingual website?

depends on the assumed clients. i would say utf-8 supports the widest
range of characters, but i am unsure how many browsers support it.
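
to illustrate the "widest range" part with plain python (the helper name
and the sample words are just for illustration): german fits into latin-1,
greek does not, and utf-8 takes both.

```python
def can_encode(text, encoding):
    """Return True if every character of text can be written in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# escapes are used so the example does not depend on this file's encoding
german = "g\u00f6rm\u00e4n"  # the word with o-umlaut and a-umlaut
greek = "\u03b5\u03bb\u03bb\u03b7\u03bd\u03b9\u03ba\u03ac"  # "greek" in greek

for name, sample in (("german", german), ("greek", greek)):
    for encoding in ("latin-1", "utf-8"):
        print(name, encoding, can_encode(sample, encoding))
```

this says nothing about browser support, only about which characters each
encoding can represent.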

what is multilingual: english, görmän, ...

i will check with david about encoding umlauts.

-- 
 BINGO: dramatically coordinate business infrastructures
 --- Engelbert Gruber -------+
  SSG Fintl,Gruber,Lassnig  /
  A6170 Zirl   Innweg 5b   /
  Tel. ++43-5238-93535 ---+


