[Python-Dev] Unicode character property methods

Tue, 07 Mar 2000 10:14:25 +0100

Guido van Rossum wrote:
> [MAL about adding .isdecimal(), .isdigit() and .isnumeric()]
> > Some more examples from the unicodedata module (which makes
> > all fields of the database available in Python):
> >
> > >>> unicodedata.decimal(u"3")
> > 3
> > >>> unicodedata.decimal(u"²")
> > 2
> > >>> unicodedata.digit(u"²")
> > 2
> > >>> unicodedata.numeric(u"²")
> > 2.0
> > >>> unicodedata.numeric(u"\u2155")
> > 0.2
> > >>> unicodedata.numeric(u'\u215b')
> > 0.125
> 
> Hm, very Unicode centric.  Probably best left out of the general
> string methods.  Isspace() seems useful, and an isdigit() that is only
> true for ASCII '0' - '9' also makes sense.

Well, how about having all three methods on Unicode objects
and only .isdigit() on string objects?
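
To make the difference between the three properties concrete, here is a
minimal sketch built on the unicodedata lookups quoted above. The helper
name classify() is hypothetical, and the optional default argument is
used so that a missing property shows up as None instead of raising.
One caveat: in current CPython releases, unicodedata.decimal(u"²")
raises ValueError, because the superscript two carries a digit value but
no decimal value, so the results may differ from the session quoted above.

```python
import unicodedata

def classify(ch):
    """Hypothetical helper: return (decimal, digit, numeric) for one
    character, with None wherever the database defines no value."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )

# ASCII digit, superscript two, vulgar fraction one fifth
for ch in (u"3", u"\u00b2", u"\u2155"):
    print(repr(ch), classify(ch))
```

Roughly speaking, every decimal character is also a digit, and every
digit is also numeric, but not the other way round.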

> What about "123".isdigit()?  What does Java say?  Or do these only
> apply to single chars there?  I think "123".isdigit() should be true
> if "abc".islower() is true.

In the current uPython implementation, u"123".isdigit() is true;
the same holds for .isdecimal() and .isnumeric().
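
As a rough illustration of the whole-string semantics, the sketch below
compares .isdigit() with a hypothetical helper, all_digits(), that spells
out the rule as "non-empty, and every character is a digit". It assumes
a Python version where string literals with the u prefix are Unicode
strings that have these methods.

```python
def all_digits(text):
    # Hypothetical helper: true only for a non-empty string in which
    # every single character has the digit property.
    return len(text) > 0 and all(ch.isdigit() for ch in text)

samples = [u"123", u"12a", u"", u"\u00b2\u00b3"]
results = [(s, s.isdigit(), all_digits(s)) for s in samples]

for text, builtin, spelled_out in results:
    print(repr(text), builtin, spelled_out)
```

Under this rule the empty string is not a digit string, and a single
non-digit character makes the whole test fail.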

> > > > Similar APIs are already available through the unicodedata
> > > > module, but could easily be moved to the Unicode object
> > > > (they cause the builtin interpreter to grow a bit in size
> > > > due to the new mapping tables).
> > > >
> > > > BTW, string.atoi et al. are currently not mapped to
> > > > string methods... should they be?
> > >
> > > They are mapped to int() and friends.
> >
> > Hmm, I just noticed that int() and friends don't like
> > Unicode... shouldn't they use the "t" parser marker
> > instead of requiring a string or tp_int compatible
> > type?
> 
> Good catch.  Go ahead.

Done. float(), int() and long() now accept charbuf-compatible
objects as their argument.
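
A small sketch of what this looks like from the Python side, assuming
Python 3, where every str is Unicode and long() has been folded into
int(). The helper parse_number() is hypothetical and only illustrates
the conversion calls; it is not part of the change described above.

```python
def parse_number(text):
    """Hypothetical helper: try int() first, then fall back to float()."""
    try:
        return int(text)
    except ValueError:
        return float(text)

print(parse_number(u"42"))               # plain ASCII digits
print(parse_number(u"3.5"))              # falls back to float()
print(parse_number(u"\u0661\u0662\u0663"))  # Arabic-Indic decimal digits

# Superscript two is a digit but not a decimal, so int() rejects it.
try:
    int(u"\u00b2")
    superscript_ok = True
except ValueError:
    superscript_ok = False
```

Note that int() only accepts characters with a decimal value, which is
narrower than what .isdigit() reports as true.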

-- 
Marc-Andre Lemburg
______________________________________________________________________
Business:                                      http://www.lemburg.com/
Python Pages:                           http://www.lemburg.com/python/