unicode and dbf files
sjmachin at lexicon.net
Mon Oct 26 20:21:45 CET 2009
On Oct 27, 3:22 am, Ethan Furman <et... at stoneleaf.us> wrote:
> John Machin wrote:
> > On Oct 24, 4:14 am, Ethan Furman <et... at stoneleaf.us> wrote:
> >>John Machin wrote:
> >>>On Oct 23, 3:03 pm, Ethan Furman <et... at stoneleaf.us> wrote:
> >>>>John Machin wrote:
> >>>>>On Oct 23, 7:28 am, Ethan Furman <et... at stoneleaf.us> wrote:
> > Try this:
> Wow. Question, though: all those codepages mapping to 437 and 850 --
> are they really all the same?
437 and 850 *are* codepages. You mean "all those language driver IDs
mapping to codepages 437 and 850". A codepage merely gives an
encoding. An LDID is like a locale; it includes other things besides
the encoding. That's why many Western European languages map to the
same codepage, first 437 then later 850 then 1252 when Windows came along.
> >> '\x68' : ('cp895', 'Kamenicky (Czech) MS-DOS'), # iffy
> > Indeed iffy. Python doesn't have a cp895 encoding, and it's probably
> > not alone. I suggest that you omit Kamenicky until someone actually
> > wants it.
> Yeah, I noticed that. Tentative plan was to implement it myself (more
> for practice than anything else), and also to be able to raise a more
> specific error ("Kamenicky not currently supported" or some such).
The error idea is fine, but I don't get the "implement it yourself for
practice" bit ... practice what? You plan a long and fruitful career
implementing codecs for YAGNI codepages?
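For the error idea, a minimal sketch follows. It assumes the only requirement is to find out whether Python ships a codec under a given name and to complain in specific terms if not. The names UnsupportedCodepage and require_codec are made up for illustration; codecs.lookup is the standard library call.

```python
import codecs


class UnsupportedCodepage(LookupError):
    """Hypothetical error: Python has no codec for the requested codepage."""


def require_codec(encoding, description):
    """Return Python's canonical codec name, or raise UnsupportedCodepage.

    `encoding` is a codec name such as 'cp850'; `description` is the
    human-readable text stored next to it in the LDID table.
    """
    try:
        return codecs.lookup(encoding).name
    except LookupError:
        raise UnsupportedCodepage(
            "%s (%s) not currently supported" % (description, encoding))
```

With this, require_codec('cp895', 'Kamenicky (Czech) MS-DOS') raises an error naming Kamenicky, as long as nobody has registered a cp895 codec in the running interpreter.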
> >> '\x7b' : ('iso2022_jp', 'Japanese Windows'), # wag
> > Try cp936.
> You mean 932?
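The difference between the two numbers is easy to check from Python. In this short sketch the sample text is an arbitrary choice. cp932 is Microsoft's Japanese codepage; Python treats cp936 as an alias for its GBK (Simplified Chinese) codec.

```python
import codecs

# Canonical codec names that Python resolves the two codepage names to.
japanese = codecs.lookup('cp932').name
chinese = codecs.lookup('cp936').name

# Three kanji meaning "Japanese language", used purely as sample data.
text = u'\u65e5\u672c\u8a9e'
raw = text.encode('cp932')

# Decoding with the matching codepage round-trips; the other one does not.
round_trip = raw.decode('cp932')
wrong = raw.decode('cp936', 'replace')
```

Here round_trip equals text, while wrong is a different string, so a Japanese Windows LDID wants cp932.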
> Very helpful indeed. Many thanks for reviewing and correcting.
> Learning to deal with unicode is proving more difficult for me than
> learning Python was to begin with! ;D
?? As far as I can tell, the topic has been about mapping from
something like a locale to the name of an encoding, i.e. all about the
pre-Unicode mishmash and nothing to do with dealing with unicode ...
BTW, what are you planning to do with an LDID of 0x00?
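One possible way to handle it, as a sketch only: treat 0x00 as "the file does not declare a codepage" and make the caller choose. The table is a small illustrative excerpt keyed by integer rather than by one-character string, and it should be checked against the full LDID list. The function name and the default parameter are assumptions, not anything settled in this thread.

```python
# Illustrative excerpt of an LDID table: {ldid: (encoding, description)}.
LDID_TO_ENCODING = {
    0x01: ('cp437', 'U.S. MS-DOS'),
    0x02: ('cp850', 'International MS-DOS'),
    0x03: ('cp1252', 'Windows ANSI'),
    0x7b: ('cp932', 'Japanese Windows'),
}


def encoding_for_ldid(ldid, default=None):
    """Map a language driver ID byte (as an int) to a Python codec name.

    Assumption: 0x00 means no codepage was recorded, so the caller must
    pass `default` explicitly; otherwise the lookup refuses to guess.
    """
    if ldid == 0x00:
        if default is None:
            raise ValueError("LDID 0x00: file does not declare a codepage")
        return default
    try:
        return LDID_TO_ENCODING[ldid][0]
    except KeyError:
        raise ValueError("unknown LDID 0x%02x" % ldid)
```

For example, encoding_for_ldid(0x00, default='cp437') returns 'cp437', while encoding_for_ldid(0x00) raises ValueError instead of silently picking an encoding.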