[Python-Dev] The "i" string-prefix: I18n'ed strings

Fri Apr 7 11:37:40 CEST 2006

Martin Blais wrote:
> Hi all
> 
> I got an evil idea for Python this morning -- Guido: no, it's not
> about linked lists :-) -- and I'd like to bounce it here.  But
> first, a bit of context.

This has been discussed a few times before, see e.g.

http://mail.python.org/pipermail/python-list/2000-January/020346.html

In summary, the following points were made in the various
discussions (this is from memory, so I may have forgotten
a few points):

* the string literal modifiers r"" and u"" are really only a kludge
  which should not be extended to other uses

* being able to register such modifiers would result in unreadable
  and unmaintainable code, since the purpose of the used modifiers
  wouldn't be clear to the reader of a code snippet

* writing i"" instead of _("") saves two keystrokes - not really
  enough to warrant the change

* if you want to do it right, you'd also have to add iu"",
  ir"" for completeness

* internationalization requires a lot more than just calling
  a function: context and domains are very important when it
  comes to translating strings in i18n efforts; these can
  easily be added to a function call as parameters, but not
  to a string modifier

* there are lots of tools to do string extraction using the
  _("") notation (which also works in C); for i"" such tools
  would have to be rewritten
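
A minimal sketch of the point about context and domains, assuming a
hypothetical in-memory catalog rather than any real gettext API. The
names CATALOG and translate() are made up for illustration; the only
claim is that a function call has room for extra arguments, while a
string prefix does not.

```python
# Hypothetical catalog keyed by (domain, context, message id).
# A real application would load this from translation files.
CATALOG = {
    ("menus", "file", "Open"): "Ouvrir",
    ("status", "door", "Open"): "Ouvert",
}


def translate(msgid, domain="messages", context=None):
    """Look up msgid; fall back to the untranslated string if missing."""
    return CATALOG.get((domain, context, msgid), msgid)


# The same English word needs different translations depending on
# where it is used; the extra parameters carry that information.
menu_label = translate("Open", domain="menus", context="file")
door_state = translate("Open", domain="status", context="door")
untranslated = translate("Close")
```

With a prefix such as i"Open" there is no obvious place to put the
domain or the context.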

> In the context of writing i18n apps, programmers have to "mark"
> strings that may be internationalized in a way that
> 
> - a special hook gets called at runtime to perform the lookup in a
> catalog of translations set up for a specific language;
> 
> - they can be extracted by an external tool to produce the keys of all
> the catalogs, so that translators can update the list of keys to
> translate and produce the values in the target languages.
> 
> Usually, you bind a function to a short name, like _() and N_(), and
> it looks kind-of like this::
> 
>     _("My string to translate.")
> 
>     or
> 
>     N_("This is marked for translation") # N_() is a noop.
> 
> pygettext does the work of extracting those patterns from the files,
> doing all the parsing manually, i.e. it does not use runtime Python
> introspection to do this at all; it is just a simple text-parsing
> algorithm (which works pretty well).  I'm simplifying things a bit,
> but that is the gist of how it works, for those not familiar with
> i18n.
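
For readers unfamiliar with the convention, here is a minimal sketch of
the _() and N_() bindings described above. It assumes no catalog is
installed, so the standard library's gettext.NullTranslations simply
returns each string unchanged. DEFERRED is a hypothetical name used
only for illustration.

```python
import gettext

# With no catalog loaded, NullTranslations returns messages unchanged.
translations = gettext.NullTranslations()
_ = translations.gettext


def N_(message):
    """Mark a string for extraction without translating it (a no-op)."""
    return message


# Marked at import time, translated later when the language is known.
DEFERRED = N_("This is marked for translation")

greeting = _("My string to translate.")
later = _(DEFERRED)
```

With a real catalog installed, _() would return the translated text,
while N_() would still return its argument untouched.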
> 
> 
> This morning I woke up staring at the ceiling and the only thing in my
> mind was "my web app code is ugly".  I had visions of LISP parentheses
> with constructs like
> 
>    ...
>    A(P(_("Click here to forget"), href="...
>    ...
> 
> (In my example, I built a library not unlike stan for creating HTML,
> which is where classes A and P come from.)  I find the i18n markup a
> bit annoying, especially when there are many i18n strings close
> together.  My point is: adding parentheses around almost all strings
> gets tiresome and clutters the otherwise divine aesthetics of Python
> source code.
> 
> (Okie, that's enough for context.)
> 
> 
> So I had the following idea: would it not be nice if there existed a
> string-prefix 'i' -- a string prefix like those for raw (r'...') and
> unicode (u'...') strings -- that would mark the string as being for
> i18n?   Something like this (reusing my example above)::
> 
>    A(P(i"Click here to forget", href="...
> 
> Notes:
> 
> - We could then use the spiffy new AST to build a better parser to
> extract those strings from the source code as well.
> 
> - We could also have a prefix "I" for strings to be marked but not
> runtime-translated, to replace the N_() strings.
> 
> - This implies that we would have to introduce some way for these
> strings to call a custom function at runtime.
> 
> - My impression is that this process of i18n is common enough that it
> does not "move" very much, and that there aren't 18 ways to go about
> it either, so that it would be reasonable to consider adding it to the
> language.   This may be completely wrong; I am by no means an i18n
> expert, so please tell me if this is not the case.
> 
> - Potential issue: do we still need other prefixes when 'i' is used,
> and if so, how do we combine them...
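
The AST-based extraction mentioned in the first note can be sketched
against the existing _() and N_() markers, since i"" is not valid
syntax. This is only an illustration using the standard ast module
(Python 3.8 or later for ast.Constant), not a description of how
pygettext works. extract_marked_strings() and SAMPLE are hypothetical
names.

```python
import ast


def extract_marked_strings(source, markers=("_", "N_")):
    """Return sorted (lineno, text) pairs for marker("literal") calls."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id in markers
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            found.append((node.lineno, node.args[0].value))
    return sorted(found)


# The source is only parsed, never executed, so _ and N_ need not exist.
SAMPLE = '''
title = _("Click here to forget")
label = N_("This is marked for translation")
other = len("not marked")
'''

messages = extract_marked_strings(SAMPLE)
```

Only string literals passed directly to a marker are collected; the
plain string given to len() is ignored.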
> 
> 
> Okay, let's push it a bit further:  how about if there was some kind
> of generic mechanism built into Python for adding new string-prefixes
> which invoke callbacks when the string with the prefix is evaluated?
> This could be used to implement what I'm suggesting above, and beyond.
> Something like this::
> 
>    import i18n
>    i18n.register_string_prefix('i', _)
>    i18n.register_string_prefix('I', N_)
> 
> I'm not sure what else we might be able to do with this, you may have
> other useful ideas.
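
Since no such registration hook exists in Python, the proposal can only
be approximated with ordinary function calls. The sketch below is a toy
emulation: register_string_prefix() is a module-level stand-in for the
proposed i18n.register_string_prefix, and evaluate_prefixed() and the
FRENCH catalog are hypothetical names invented for illustration.

```python
# Toy registry emulating the proposed hook; not a real Python feature.
PREFIX_HANDLERS = {}


def register_string_prefix(prefix, callback):
    """Associate a one-letter prefix with a callable taking the string."""
    PREFIX_HANDLERS[prefix] = callback


def evaluate_prefixed(prefix, text):
    """Stand-in for what would happen when a prefixed literal is evaluated."""
    return PREFIX_HANDLERS[prefix](text)


# Hypothetical one-entry catalog, for illustration only.
FRENCH = {"Click here to forget": "Cliquez ici pour oublier"}


def _(message):
    return FRENCH.get(message, message)


def N_(message):
    return message


register_string_prefix("i", _)
register_string_prefix("I", N_)

translated = evaluate_prefixed("i", "Click here to forget")
marked_only = evaluate_prefixed("I", "Click here to forget")
```

Note that what a given prefix does depends entirely on what was
registered elsewhere in the program.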
> 
> 
> Any comments welcome.
> 
> cheers,

-- 
Marc-Andre Lemburg
eGenix.com
