[Python-Dev] Unicode character property methods
M.-A. Lemburg
mal@lemburg.com
Tue, 07 Mar 2000 10:14:25 +0100
Guido van Rossum wrote:
> [MAL about adding .isdecimal(), .isdigit() and .isnumeric()]
> > Some more examples from the unicodedata module (which makes
> > all fields of the database available in Python):
> >
> > >>> unicodedata.decimal(u"3")
> > 3
> > >>> unicodedata.decimal(u"²")
> > 2
> > >>> unicodedata.digit(u"²")
> > 2
> > >>> unicodedata.numeric(u"²")
> > 2.0
> > >>> unicodedata.numeric(u"\u2155")
> > 0.2
> > >>> unicodedata.numeric(u'\u215b')
> > 0.125
>
> Hm, very Unicode centric. Probably best left out of the general
> string methods. Isspace() seems useful, and an isdigit() that is only
> true for ASCII '0' - '9' also makes sense.
Well, how about having all three on Unicode objects
and only .isdigit() on string objects?
> What about "123".isdigit()? What does Java say? Or do these only
> apply to single chars there? I think "123".isdigit() should be true
> if "abc".islower() is true.
In the current uPython implementation u"123".isdigit() is true;
same for the other two methods.
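
For readers on a current interpreter, the three proposed predicates can be tried directly. The following is a minimal sketch that assumes Python 3, where str is the Unicode type and carries all three methods. The helper name classify is purely illustrative, and the results reflect today's CPython, not necessarily the prototype discussed in this thread.

```python
import unicodedata


def classify(text):
    """Return (isdecimal, isdigit, isnumeric) for a string.

    Hypothetical helper for illustration only; it just bundles the
    three str methods so they can be compared side by side.
    """
    return (text.isdecimal(), text.isdigit(), text.isnumeric())


# An ASCII digit satisfies all three predicates.
print(classify("3"))        # (True, True, True)

# SUPERSCRIPT TWO is a digit and numeric, but not a decimal digit.
print(classify("\u00b2"))   # (False, True, True)

# VULGAR FRACTION ONE FIFTH is numeric only.
print(classify("\u2155"))   # (False, False, True)

# Multi-character strings are true only if every character qualifies.
print(classify("123"))      # (True, True, True)
print(classify("abc"))      # (False, False, False)

# The unicodedata module still exposes the underlying values.
print(unicodedata.digit("\u00b2"))     # 2
print(unicodedata.numeric("\u215b"))   # 0.125
# The optional second argument is returned when no value is defined.
print(unicodedata.decimal("\u00b2", None))
```

The all-characters rule for "123" mirrors how islower() treats "abc", which is the consistency Guido asks for above.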
> > > > Similar APIs are already available through the unicodedata
> > > > module, but could easily be moved to the Unicode object
> > > > (they cause the builtin interpreter to grow a bit in size
> > > > due to the new mapping tables).
> > > >
> > > > BTW, string.atoi et al. are currently not mapped to
> > > > string methods... should they be ?
> > >
> > > They are mapped to int() c.s.
> >
> > Hmm, I just noticed that int() and friends don't like
> > Unicode... shouldn't they use the "t" parser marker
> > instead of requiring a string or tp_int compatible
> > type?
>
> Good catch. Go ahead.
Done. float(), int() and long() now accept charbuf-compatible
objects as arguments.
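
A rough modern analogue of this change, assuming Python 3 (where long() no longer exists and str is Unicode): int() and float() accept text containing Unicode decimal digits, and also accept bytes. The helper name parse_numbers is hypothetical and only wraps the two builtins.

```python
def parse_numbers(value):
    """Convert one str or bytes value with both int() and float().

    Hypothetical helper for illustration; it adds nothing beyond
    calling the two builtins on the same argument.
    """
    return int(value), float(value)


# Plain ASCII text.
print(parse_numbers("123"))                 # (123, 123.0)

# ARABIC-INDIC DIGITS ONE, TWO, THREE are decimal digits, so they parse.
print(parse_numbers("\u0661\u0662\u0663"))  # (123, 123.0)

# bytes objects are accepted as well.
print(parse_numbers(b"42"))                 # (42, 42.0)

# SUPERSCRIPT TWO is a digit but not a decimal digit, so int() rejects it.
try:
    int("\u00b2")
except ValueError as exc:
    print("rejected:", exc)
```

Only characters for which isdecimal() is true are usable by int(), so the distinction between the three predicates matters for conversion as well.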
--
Marc-Andre Lemburg
______________________________________________________________________
Business: http://www.lemburg.com/
Python Pages: http://www.lemburg.com/python/