[Python-Dev] unicode Exception messages in py2.7

Chris Barker chris.barker at noaa.gov
Fri Nov 15 01:41:35 CET 2013


On Thu, Nov 14, 2013 at 3:58 PM, Steven D'Aprano <steve at pearwood.info> wrote:

> It's not a given that the current behaviour *is* a bug.

I'll concede that it's not a bug unless someone has said somewhere that
unicode messages should work... but that's kind of a semantic
argument.

I have to say it's a very odd choice to me that it suppresses the
message, rather than raising an encoding error, like what happens
everywhere else the default encoding is used.
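
For comparison, here is a rough sketch of the kind of encoding error meant
here. It is not code from this thread: try_ascii is a made-up helper, and
the ASCII encode is written out explicitly (in Python 2 it would happen
implicitly through the default encoding, which is ASCII unless changed) so
that the snippet behaves the same way on Python 3.

```python
def try_ascii(text):
    """Encode text as ASCII; return the error's class name if that fails."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError as exc:
        # This is the loud failure that exception display does not give you.
        return type(exc).__name__


print(try_ascii(u"plain"))
print(try_ascii(u"caf\xe9"))
```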

In fact, I noticed that the message can be anything that can be
stringified, which makes it particularly wacky that you can't use a
unicode object.
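
A small illustration of the "anything that can be stringified" point. The
Celsius class is invented for the example; the only behaviour relied on is
that str() of an exception with a single argument returns str() of that
argument.

```python
class Celsius(object):
    """Hypothetical non-string object used as an exception message."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return "%.1f C" % self.degrees


# None of these arguments is a string, yet each yields a usable message.
for arg in (42, [1, 2], Celsius(21.5)):
    print(str(ValueError(arg)))
```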

> Exception
> messages in 2 are byte-strings, not Unicode.

well, they are anything that you can call str() on anyway...

> Trying to use Unicode
> instead is not, as far as I can tell, supported behaviour.

clearly not

> If the exception message cannot be converted to a byte-string,
> suppressing the display of the message seems like perfectly reasonable
> behaviour to me:

well, yes and no -- the fact is that unicode objects ARE special --
and it wouldn't hurt to treat them that way. And I'm not sure that
suppressing the message when you've passed in a weird object that
raises an exception when you try to convert it to a string makes sense
either -- suppressing an exception is really not a good idea in
general -- you really should have a good reason for it. I'm guessing
that this was put in to save a lot of crashing from unicode objects,
but what do I know?

Actually, when I think about it, Exceptions being raised when you call
str() on something are probably pretty rare -- if you define a class
with no __str__ method, you get a default string version -- there
can't be many use-cases where you want to make sure no one tries to
make a string out of your object...
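
To make that concrete, a minimal sketch with two invented classes: Plain
defines no __str__ and still gets the default string form, while
Unprintable has to go out of its way to make str() fail. The describe
helper is also hypothetical and only reports which case occurred.

```python
class Plain(object):
    """No __str__ defined: str() falls back to the default repr-style text."""


class Unprintable(object):
    """Deliberately refuses to be converted to a string."""

    def __str__(self):
        raise TypeError("refusing to be stringified")


def describe(obj):
    """Return str(obj), or a short note naming the error if str() raises."""
    try:
        return str(obj)
    except Exception as exc:
        return "str() failed: %s" % type(exc).__name__


print(describe(Plain()))
print(describe(Unprintable()))
```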

> although it would be nice if a newline was used so the prompt was bumped
> to the next line.

yup -- that would be good.

> The point is, I'm not convinced that this is a bug at all.

OK -- to clarify the discussion a bit:

I think we all agree that this is not a fatal bug that MUST be fixed.

Is this something that could be improved, or is the current behavior
the best we could have, given the limitations of strings and unicode in
py2 anyway?

If it's not a desirable change, then we're done -- sorry for the noise.

If it is a desirable change, then is the benefit worth the possible
breakage of code? To assess that, you need to trade off the size of
the benefit against the amount of breakage.

I think it would be a pretty nice benefit.

I can't see that it would cause a lot of breakage.

Any idea how we could assess how much code or how many tests are out
there in the world that this would affect?

I contend that it wouldn't be much because:

If I had thought to write a test for this, I would have thought to fix
my code so that it would either never use a unicode object for a
message, or, like I have done in my code, encode it when passing it in
to the Exception.
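
A sketch of that "encode it when passing it in" workaround, under some
assumptions: as_exception_message is a hypothetical name, UTF-8 is an
arbitrary choice of encoding, and the version check is only there so the
same snippet leaves text untouched on Python 3, where messages are already
unicode.

```python
import sys

# type(u"") is `unicode` on Python 2 and `str` on Python 3.
TEXT_TYPE = type(u"")


def as_exception_message(msg, encoding="utf-8"):
    """Return msg in a form Python 2 can display as an exception message."""
    if sys.version_info[0] == 2 and isinstance(msg, TEXT_TYPE):
        # Encode explicitly rather than relying on the default encoding.
        return msg.encode(encoding)
    return msg


try:
    raise ValueError(as_exception_message(u"bad value: caf\xe9"))
except ValueError as exc:
    caught = exc
```

The cost of this approach is that every raise site has to remember to call
the helper.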

There is certainly a chance that some doctests would break, if people
had not looked carefully at them -- i.e. doctests that were meant to
check that the exception was raised, but whose authors did not notice
that the message didn't get through.
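
For what it's worth, doctest can be told to compare only the exception
type, which is one way such a test could be insulated from any change in
how the message is displayed. The sketch below is an illustration, not
something proposed in this thread; EXAMPLE and run_example are invented
names, while IGNORE_EXCEPTION_DETAIL is doctest's own option.

```python
import doctest

# The directive tells doctest to ignore everything after "ValueError:",
# so only the exception type has to match.
EXAMPLE = '''
>>> raise ValueError("some message")  # doctest: +IGNORE_EXCEPTION_DETAIL
Traceback (most recent call last):
    ...
ValueError: text that is never compared
'''


def run_example():
    """Run the doctest text above and return doctest's result tuple."""
    test = doctest.DocTestParser().get_doctest(EXAMPLE, {}, "example", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    return runner.run(test)


results = run_example()
print(results.failed, results.attempted)
```

A doctest written without that option compares the message text too, and
those are the ones that could break.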

How many are there? Who knows?

-Chris

-- 

Christopher Barker, Ph.D.
Oceanographer

Emergency Response Division
NOAA/NOS/OR&R            (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chris.Barker at noaa.gov

