unicode string problems

Martin v. Loewis martin at v.loewis.de
Mon Apr 1 18:27:38 EST 2002


bokr at oz.net (Bengt Richter) writes:

> But it does make me think, should _all_ strings be subtypes
> of a raw octet-string type according to their encoding? Then one
> could visualize automatic inter-encoding promotions analogous to
> numeric promotions, and if i/o sources and sinks have encoding
> designators, Gonçalo's f.write("Março 2002" + march.Name()) should
> "just work" if the output encoding permits.

That assumes that the output encoding is known, or can be
determined. As things stand, it can't: you don't know the encoding of f,
and you don't know the encoding of "Março". Furthermore, knowing the
encoding of f wouldn't help, since the addition has to be performed
before write is invoked.
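
To illustrate the problem, here is a minimal sketch. It assumes, purely
for the sake of the example, that the literal was saved as ISO-8859-1 and
that the other string arrives as UTF-8; neither encoding is stated in the
original code, and the variable names are made up. Byte strings are
spelled with explicit b"..." types so the sketch also runs on later
Python versions.

```python
# Two byte strings holding text in different (assumed) encodings.
literal_bytes = u"Mar\u00e7o 2002".encode("iso-8859-1")  # assumed source encoding
name_bytes = u" M\u00e4rz".encode("utf-8")               # assumed encoding of the other string

# The addition is plain byte concatenation. It happens before any
# write call, so the encoding of the output file cannot influence it.
combined = literal_bytes + name_bytes


def try_decode(data, encoding):
    """Return the decoded text, or None if data is not valid in encoding."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


as_utf8 = try_decode(combined, "utf-8")         # fails: 0xE7 is not followed by a continuation byte
as_latin1 = try_decode(combined, "iso-8859-1")  # decodes, but the second half is garbled
```

Whichever single encoding the output is assumed to have, one half of the
combined string comes out wrong.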

In an all-Unicode approach, f would have been obtained from
codecs.open, and the string literal would have been a Unicode literal.
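
A sketch of what that could look like follows. The function march_name
is a hypothetical stand-in for march.Name() and is assumed to return a
Unicode string; the file names and the two encodings are likewise
chosen only for illustration.

```python
import codecs
import os
import tempfile


def march_name():
    # Hypothetical stand-in for march.Name(); assumed to return Unicode.
    return u"Mar\u00e7o"


def write_heading(path, encoding):
    # codecs.open returns a file object that encodes Unicode on write.
    f = codecs.open(path, "w", encoding=encoding)
    try:
        # Both operands are Unicode, so the addition is well defined;
        # the encoding is applied only when the result is written.
        f.write(u"Mar\u00e7o 2002 " + march_name())
    finally:
        f.close()


def raw_bytes(path):
    # Read the file back without decoding, to inspect what was written.
    f = open(path, "rb")
    try:
        return f.read()
    finally:
        f.close()


tmpdir = tempfile.mkdtemp()
latin1_path = os.path.join(tmpdir, "latin1.txt")
utf8_path = os.path.join(tmpdir, "utf8.txt")
write_heading(latin1_path, "iso-8859-1")
write_heading(utf8_path, "utf-8")
```

The same write call produces different bytes in the two files, because
the encoding is a property of f rather than of the strings.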

Regards,
Martin



