String formatting for complex writing systems
Leo.Kislov at gmail.com
Wed Jun 27 12:14:47 CEST 2007
On Jun 27, 3:10 am, Leo Kislov <Leo.Kis... at gmail.com> wrote:
> On Jun 27, 12:20 am, Andy <fukaz... at gmail.com> wrote:
> > Hi guys,
> > I'm writing a piece of software for a Thai friend. At the end it
> > is supposed to print on paper some report with tables of text and
> > numbers. When I test it in English, the columns are aligned nicely,
> > but when he tests it with Thai data, the columns are all crooked.
> > The problem here is that in the Thai writing system sometimes two or
> > more characters together take up a single space, for example งิ
> > (u"\u0E07\u0E34"). This is why when I use something like u"%10s"
> > % ..., it just doesn't work as expected.
> > Is anybody aware of an alternative string format function that can
> > deal with this kind of writing properly?
> In the general case it's impossible to write such a function for many
> Unicode characters without feedback from the rendering library.
> Assuming you use a *fixed-width* font for both English and Thai, the
> following function will return how many columns your text will use:
> from unicodedata import category
>
> def columns(self, s):
>     return sum(1 for c in s if category(c) != 'Mn')
That should of course be written as def columns(s). Need to learn to
proofread before posting :)
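
For reference, here is a minimal sketch of the corrected function together with a padding helper that could stand in for u"%10s". The name pad_left is hypothetical and not part of any library, and the sketch uses Python 3 string literals. It rests on the same assumption as above: a fixed-width font in which nonspacing marks (Unicode category 'Mn') occupy no column of their own. It does not account for double-width characters or for anything the rendering library does differently.

```python
from unicodedata import category


def columns(s):
    # Count every character except nonspacing marks (category 'Mn'),
    # which are assumed to be drawn on top of the preceding character.
    return sum(1 for c in s if category(c) != 'Mn')


def pad_left(s, width):
    # Hypothetical helper: right-align s in `width` columns, like "%10s",
    # but pad according to displayed columns rather than len(s).
    return ' ' * max(0, width - columns(s)) + s


thai = '\u0e07\u0e34'        # the example from the original question
print(len(thai))             # number of code points
print(columns(thai))         # number of columns under the assumption above
print(repr(pad_left(thai, 10)))
```

With the example string from the question, len() counts two code points while columns() counts one, so pad_left adds one more space than "%10s" would.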