[New-bugs-announce] [issue12964] Two improvements for the locale aliasing engine

Sinisa Segvic report at bugs.python.org
Mon Sep 12 14:41:27 CEST 2011


New submission from Sinisa Segvic <sinisa.segvic at fer.hr>:

Hi,

There appears to be some room for improvement
in the current stable implementation
of the Python locale aliasing engine.

Sometimes one wishes to override
the default system locale.
For instance, a program that is supposed
to sort people according to national rules
should run correctly even when the system
does not default to the national locale.

Judging from the Python manuals and
provided the desired national locale is installed,
this should be doable in at least the following two ways:

import locale
ianaLanguageSubtag = 'en'     # IANA subtag of the desired national locale
locale.setlocale(locale.LC_ALL,
  (ianaLanguageSubtag, locale.getpreferredencoding())) #(1)
locale.setlocale(locale.LC_ALL,
  locale.normalize(ianaLanguageSubtag))                #(2)


For quick reference, this is
the relevant part of the manual:
http://docs.python.org/release/3.2/library/locale.html
'''
locale.setlocale(category, locale=None)
  ...
  If (the locale) is a tuple, it is converted
  to a string using the locale aliasing engine.
  ...
'''

The locale aliasing engine binds
the IANA language subtags to POSIX locales.
Its effects can be directly observed
through locale.normalize:
>>> import locale
>>> locale.normalize('hr')
'hr_HR.ISO8859-2'
>>> locale.normalize('en')
'en_US.ISO8859-1'


My first objection concerns the Windows behaviour
of the calls (1) and (2) above.
Neither of the two calls *works*, since Windows
does not recognize strings such as 'en_US.ISO8859-1'.
Instead, Windows provides its own locale nomenclature:
http://msdn.microsoft.com/en-us/library/x99tb11d%28VS.80%29.aspx
Consequently, the following *works*:

locale.setlocale(locale.LC_ALL, 'English_United States.1252')

IMHO this issue should be fixed, perhaps by providing
an alternate definition of the locale alias dictionary
(locale.locale_alias).
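
A minimal sketch of how such an alternate table might be used
from user code. WINDOWS_LOCALE_ALIAS, candidate_locale_names and
set_locale_for are hypothetical names, not part of the locale module,
and the table holds only the single 'en' entry taken from the
working call above:

```python
import locale
import sys

# Hypothetical alias table; only the 'en' entry comes from the
# working example above. A real table would need many more entries.
WINDOWS_LOCALE_ALIAS = {
    'en': 'English_United States.1252',
}

def candidate_locale_names(language_subtag, platform=sys.platform):
    """Return locale names to try, most platform-specific first."""
    candidates = []
    if platform.startswith('win') and language_subtag in WINDOWS_LOCALE_ALIAS:
        candidates.append(WINDOWS_LOCALE_ALIAS[language_subtag])
    # Always fall back to the POSIX-style name from the aliasing engine.
    candidates.append(locale.normalize(language_subtag))
    return candidates

def set_locale_for(language_subtag, category=locale.LC_ALL):
    """Try each candidate name until setlocale accepts one."""
    for name in candidate_locale_names(language_subtag):
        try:
            return locale.setlocale(category, name)
        except locale.Error:
            continue
    raise locale.Error('no usable locale for %r' % language_subtag)
```

Whether set_locale_for('en') succeeds still depends on
which locales are installed on the machine.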


My second objection concerns the behaviour on Linux,
where the call (1) above always works,
while (2) in some cases might not work.
It happens that the call (2)
requests an outdated 8-bit encoding
although UTF8 has gained worldwide acceptance.
The call results in a locale error
whenever the desired national locale
is installed only in its UTF8 variant.

This might be fixed by changing the encodings
in the locale.locale_alias from 8-bit variants to UTF8.
Note however that the problem could be circumvented
by issuing the call (1), so this would be
less important than the Windows fix proposed above.
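
Until the alias table changes, the same effect can be approximated
in user code. The following is only a sketch: with_utf8_encoding and
set_locale_preferring_utf8 are hypothetical helpers, and the spelling
'UTF-8' in the locale name is an assumption about what the system accepts.

```python
import locale

def with_utf8_encoding(locale_name):
    """Return locale_name with its encoding part replaced by UTF-8."""
    # Split off an optional "@modifier" suffix such as "@euro".
    base, sep, modifier = locale_name.partition('@')
    # Drop whatever encoding follows the first dot, if any.
    language_territory = base.split('.', 1)[0]
    return language_territory + '.UTF-8' + sep + modifier

def set_locale_preferring_utf8(language_subtag, category=locale.LC_ALL):
    """Try the UTF-8 variant first, then the name from locale.normalize."""
    normalized = locale.normalize(language_subtag)
    for candidate in (with_utf8_encoding(normalized), normalized):
        try:
            return locale.setlocale(category, candidate)
        except locale.Error:
            continue
    raise locale.Error('no installed locale for %r' % language_subtag)
```

For example, with_utf8_encoding('hr_HR.ISO8859-2') gives 'hr_HR.UTF-8'.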

Source code references:
  .../Python-3.2.2/Lib/locale.py
  locale.locale_alias
  locale.normalize
  locale.setlocale

comp.lang.python discussion:
  http://groups.google.com/group/comp.lang.python/browse_thread/thread/3591d496cf108ad2#

Cheers,

Sinisa

----------
components: Library (Lib), Windows
messages: 143902
nosy: ssegvic
priority: normal
severity: normal
status: open
title: Two improvements for the locale aliasing engine
type: behavior
versions: Python 3.2

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue12964>
_______________________________________

