[Python-Dev] Strange error importing a Pickle from 2.7 to 3.2

M.-A. Lemburg mal at egenix.com
Thu Feb 24 10:02:10 CET 2011


Alexander Belopolsky wrote:
> On Wed, Feb 23, 2011 at 6:32 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> Alexander Belopolsky wrote:
> ..
>>> In what sense is "Latin-1" the official name?  The IANA charset
>>> registry has the following listing
>>>
>>>
>>> Name: ISO_8859-1:1987                                    [RFC1345,KXS2]
>>> MIBenum: 4
>>> Source: ECMA registry
>>> Alias: iso-ir-100
>>> Alias: ISO_8859-1
>>> Alias: ISO-8859-1 (preferred MIME name)
>>> Alias: latin1
> ..
>> "Latin-1" is short for "Latin Alphabet No. 1" and
>> started out as ECMA-94 in 1985 and 1986:
> 
> This does not explain your preference of "Latin-1" over "Latin1".

This is not my preference. See e.g. Wikipedia
http://en.wikipedia.org/wiki/ISO/IEC_8859-1

It is common practice to replace spaces in descriptive names with
a hyphen to come up with an identifier string (even Google
does or undoes this when searching the net).

Replacing spaces with an empty string is also an option, but
doesn't read as well.

> Both are perfectly valid abbreviations for "Latin Alphabet No. 1".
> The spelling without "-" has the advantage of being a valid Python
> identifier and a module name.

The hyphens are converted to underscores by the lookup function
in the encodings package. That turns the name into a valid
Python module name.
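
A minimal sketch of that normalization, assuming a current CPython 3
standard library. It uses encodings.normalize_encoding(), the alias table
in encodings.aliases, and codecs.lookup(); the variable names are only
illustrative and nothing here is taken from the thread itself.

```python
import codecs
import encodings
import importlib
from encodings import aliases

# Punctuation such as "-" is collapsed to "_" during normalization,
# which yields a name usable as a module name inside the encodings package.
normalized = encodings.normalize_encoding("latin-1")
print(normalized)

# The normalized name can be imported as a regular codec module.
codec_module = importlib.import_module("encodings." + normalized)
print(codec_module.__name__)

# The hyphen-less spelling is resolved through the alias table instead.
alias_target = aliases.aliases["latin1"]
print(alias_target)

# Either way, the different spellings end up at the same codec.
codec_names = {codecs.lookup(s).name for s in ("Latin-1", "latin-1", "latin1")}
print(codec_names)
```

Under these assumptions both spellings work as codec names in Python, so
the choice between them is one of readability rather than function.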

>  The IANA registration for "latin1" and
> lack of that for "latin-1" most likely indicates that the former is
> more commonly found in machine readable metadata.

I don't know why you put so much emphasis on machine-readable metadata.
Python source code is machine readable, the Internet is machine
readable, all documents found there are machine readable.

As I said earlier on: the IANA registry is just that - a registry
of names with the purpose of avoiding name clashes in the respective
name space. As such, it is not a standard, but merely a tool
to map various aliases to a canonical name.

The fact that an alias is registered says nothing about whether
it's in widespread use or not, e.g. "csISOLatin1" gives me only
6810 hits on Google.

I get 788,000 hits for 'latin1 -"latin-1"' on Google, while
'latin-1' gives 2,600,000 hits. Looks like "latin-1" is still
the preferred way to write that encoding name.

-- 
Marc-Andre Lemburg
eGenix.com
