[Python-bugs-list] [ python-Bugs-652104 ] int(u"\u1234") raises UnicodeEncodeError

noreply@sourceforge.net noreply@sourceforge.net
Wed, 11 Dec 2002 08:39:25 -0800


Bugs item #652104, was opened at 2002-12-11 17:07
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=652104&group_id=5470

Category: Unicode
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: Guido van Rossum (gvanrossum)
Assigned to: Martin v. Löwis (loewis)
>Summary: int(u"\u1234") raises UnicodeEncodeError

Initial Comment:
In Python 2.2, int() of a unicode string containing
non-digit characters raises ValueError, like all other
attempts to convert an invalid string or unicode to
int. But in Python 2.3, it appears that int() of a
unicode string is implemented differently and can now
raise UnicodeEncodeError:

>>> int(u"\u1234")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'decimal' codec can't encode
character '\u1234' in position 0: invalid decimal
Unicode string
>>> 

I think it's important that int() of a string or
unicode argument raises only ValueError to indicate
invalid input; otherwise one ends up writing bare
excepts around string-to-int conversions (as it is too
much trouble to keep track of which Python versions
can raise which exceptions).
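
For illustration, here is a minimal sketch of the
version-portable pattern this implies; the helper name
to_int is hypothetical and not part of the report:

# Hypothetical helper: catch ValueError only, which is
# sufficient as long as every invalid-input failure from
# int() is a ValueError (or a subclass of it) in every
# Python version.
def to_int(s, default=None):
    try:
        return int(s)
    except ValueError:
        return default

print to_int(u"123")     # prints 123
print to_int(u"\u1234")  # prints None; the failure is caught as a ValueError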


----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2002-12-11 17:39

Message:
Logged In: YES 
user_id=21627

I don't see the problem:

>>> try:
...   int(u"\u1234")
... except ValueError:
...   print "caught"
...
caught
>>> issubclass(UnicodeEncodeError,ValueError)
True
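
As a further illustration, a minimal sketch along the
same lines, assuming the PEP 293 exception classes that
are new in 2.3:

# UnicodeEncodeError and UnicodeDecodeError both derive from
# UnicodeError, which in turn derives from ValueError, so a plain
# "except ValueError" catches all of them.
for exc in (UnicodeEncodeError, UnicodeDecodeError, UnicodeError):
    print exc.__name__, issubclass(exc, ValueError)   # prints True for each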


----------------------------------------------------------------------

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=652104&group_id=5470