[Patches] [ python-Patches-532180 ] fix xmlrpclib float marshalling bug

Fri, 28 Mar 2003 15:25:54 -0800

Patches item #532180, was opened at 2002-03-19 23:28
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=305470&aid=532180&group_id=5470

Category: Library (Lib)
Group: Python 2.3
>Status: Closed
>Resolution: Rejected
Priority: 5
Submitted By: Brian Quinlan (bquinlan)
Assigned to: Fredrik Lundh (effbot)
Summary: fix xmlrpclib float marshalling bug

Initial Comment:
As it stands now, xmlrpclib can send doubles, such as 
1.#INF, that are not part of the XML-RPC standard. 
This patch causes a ValueError to be raised instead.

----------------------------------------------------------------------

>Comment By: Martin v. Löwis (loewis)
Date: 2003-03-29 00:25

Message:
Logged In: YES 
user_id=21627

I'll conclude that it is a lot of tedious work for no
reason, and close this patch.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-21 00:55

Message:
Logged In: YES 
user_id=31435

Python's internal format buffers are too small to use C %f 
in its full generality, so you're suggesting something 
there that's much harder to get done than you suspect.  
Note that %f isn't a cure-all anyway, as in either Python or 
C, e.g., '%f' % 1e-10 throws away all information, 
producing a string of zeroes.  What you did is usually much 
better than that.
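
The '%f' point can be checked directly. This is a minimal sketch 
in current Python; the variable names are illustrative only. '%f' 
keeps six digits after the decimal point, so a small value 
collapses to zero, while repr() keeps enough digits to recover it.

```python
# '%f' always prints six digits after the decimal point.
tiny = 1e-10
fixed = '%f' % tiny    # fixed-point formatting
short = repr(tiny)     # repr() formatting

print(fixed)                   # 0.000000
print(short)                   # 1e-10
print(float(fixed) == tiny)    # False: the value was lost
print(float(short) == tiny)    # True: the value survives
```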

Let's wait to hear what /F wants to do.  If he's inclined 
to take this part of the spec at face value, I can work 
with him to write a "conforming" float->string that's 
numerically sound.  Else it's a lot of tedious work for no 
reason.
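
For illustration only, here is one way an exponent-free 
float->string could be sketched. It assumes a current Python, 
where repr() returns a string that converts back to the same 
float and the decimal module is available; neither assumption 
comes from this thread, and to_decimal_notation is a hypothetical 
name, not part of xmlrpclib.

```python
import math
from decimal import Decimal

def to_decimal_notation(value):
    """Format a finite float without an exponent (hypothetical helper)."""
    if not math.isfinite(value):
        raise ValueError("cannot marshal %r as an XML-RPC double" % value)
    # repr() yields a string that converts back to the same float;
    # Decimal's 'f' format then rewrites it in positional notation.
    text = format(Decimal(repr(value)), 'f')
    if '.' not in text:
        text += '.0'   # the quoted grammar asks for a period
    return text

print(to_decimal_notation(1e20))
print(len(to_decimal_notation(1e-300)))
```

The price is the one already discussed: very large or very small 
doubles turn into strings hundreds of characters long.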

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-21 00:24

Message:
Logged In: YES 
user_id=108973

OK, this floating point stuff is over my head.

Is it OK that it loses accuracy?  
- No
Is it OK that it produces 16 trailing zeroes for 1e-250?
- Yes
Is it OK that it raises OverflowError for the normal double 
1e-300?  
- No

Would exposing and using the C %f specifier, along with 
repr, make for identical roundtrips?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 23:53

Message:
Logged In: YES 
user_id=31435

I don't use XML-RPC, so I'm assigning this to /F (it was 
his code at the start, and he wants to keep it in synch 
with his company's version).

Formatting floats is a difficult job if you pay attention 
to accuracy.  The original code had the property that 
converting a Python float to an XML-RPC string, then back 
to a float again, reproduced the original input exactly.  
The code in the patch enjoys that property only by 
accident; much of the time a roundtrip conversion using it 
won't reproduce the number that was passed in.  Is that 
OK?  There's no way to tell, since the XML-RPC spec has 
scant idea what it's doing here, so leaves important 
questions unanswered.  OTOH, it seems to me that the 
*point* of this protocol is to transport values across 
boxes, so of course it should move heaven and earth to 
transport them faithfully.
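
The roundtrip property can be tested mechanically. The sketch 
below is illustrative: roundtrips and six_places are hypothetical 
names, and six_places is a stand-in for any fixed-digit converter, 
not the code from the patch. It shows how such a converter can 
reproduce some inputs by accident and lose others.

```python
def roundtrips(value, to_text):
    """Return True if value survives float -> string -> float unchanged."""
    return float(to_text(value)) == value

def six_places(value):
    # Stand-in for a lossy converter; not the code from the patch.
    return '%.6f' % value

for x in (0.5, 0.1, 1.0 / 3.0, 1e-10):
    # Columns: value, roundtrip via repr(), roundtrip via six_places()
    print(x, roundtrips(x, repr), roundtrips(x, six_places))
```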

Is it OK that it loses accuracy?  Is it OK that it produces 
16 trailing zeroes for 1e-250?  Is it OK that it raises 
OverflowError for the normal double 1e-300?  No matter 
what's asked, the spec has no answers.

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-20 21:48

Message:
Logged In: YES 
user_id=108973

Oops, I already wrote the converter (see new patch). I'm 
not very concerned about sending 300 character strings for 
large doubles, but I guess someone might be. I am concerned 
about how large and ugly the code is.

XML-RPC is very poorly specified but the grammar for 
doubles seems reasonably clear (silly, but clear).

If you don't like my double marshalling code, could you 
please just check in your infinity/NaN detection code (also 
part of my patch)?

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 21:13

Message:
Logged In: YES 
user_id=31435

If you think XML-RPC users are keen to see multi-hundred 
character strings produced for ordinary doubles, Python 
isn't going to be much help (you'll have to write your own 
float -> string conversion); or if you think they're happy 
to get an exception if they want to pass (e.g.) 1e20, you 
can keep using repr() and complain because repr(1e20) 
produces an exponent.

"decimal format" is simply two extremely common words 
pasted together <+.9 wink>.  I expect the Python docs here 
ended up so vague because whoever wrote this part of the 
docs didn't know the full story and didn't have time to 
figure it out.

But I expect the same is true of the part of this spec 
dealing with doubles (it doesn't define what it means 
by "double-precision", and then goes on to say stuff that 
doesn't make sense for what C or Java mean by double, or by 
what IEEE-754 means by double precision -- it's off in its 
own world, so if you take it at face value you'll have to 
guess what the world is, and implement it yourself).

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-20 20:32

Message:
Logged In: YES 
user_id=108973

I think that we should be flexible about the data that we 
accept but rigorous about the data that we generate. So the 
sign should always be sent but not required on input. 

"decimal format" appears in the Python documentation 
(http://www.python.org/doc/current/lib/typesseq-strings.html), 
so it is probably a documentation bug if the meaning is not 
widely known.

I parsed it as "not exponential format".

My question was whether the %f Python format specifier 
simply mapped to the C %f format specifier. But, based on 
the output of a simple C program, that does not appear to 
be the case.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 20:04

Message:
Logged In: YES 
user_id=31435

Well, Brian, the spec clearly disallows 1.0 too -- if you 
want to take that spec seriously, you can implement what it 
says and we'll redirect the complaints to your personal 
email account <wink>.

I can't parse your question about the C library (like, I 
don't know what you mean by "decimal format").

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-20 19:57

Message:
Logged In: YES 
user_id=108973

Whether it was intended or not, the spec clearly disallows 
it. 

I noticed the %f behavior too, which is interesting because 
the Python docs say: 
f Floating point decimal format

I wonder if it is the underlying C library refusing to 
write large float values in decimal format.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 19:08

Message:
Logged In: YES 
user_id=31435

Ack, I take part of that back:  it's Python's 
implementation of '%f' that can produce exponent notation.  
There's no simple way to get the effect of C's %f from 
Python.  It's clear as mud whether "the spec" *intended* to 
outlaw exponent notation.

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 18:53

Message:
Logged In: YES 
user_id=31435

"%f" can produce exponent notation too, which is also not 
allowed by this pseudo-spec.

r = repr(some_double)
if 'n' in r or 'N' in r:
    raise ValueError(...)

is robust, will work fine x-platform, and isn't insane 
<wink>.
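
A self-contained version of the check above might look like the 
following. It is a sketch only: marshal_double is a hypothetical 
name and the error message is made up for illustration.

```python
def marshal_double(value):
    """Return repr(value), rejecting infinities and NaNs (hypothetical helper)."""
    r = repr(value)
    # 'inf', 'nan', '1.#INF', '-1.#IND' and similar all contain n or N;
    # the repr of an ordinary float never does.
    if 'n' in r or 'N' in r:
        raise ValueError("cannot marshal %s as an XML-RPC double" % r)
    return r

print(marshal_double(1.5))
for bad in (float('inf'), float('-inf'), float('nan')):
    try:
        marshal_double(bad)
    except ValueError as exc:
        print(exc)
```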

----------------------------------------------------------------------

Comment By: Brian Quinlan (bquinlan)
Date: 2002-03-20 18:31

Message:
Logged In: YES 
user_id=108973

Eric Kidd's XML-RPC C uses sprintf("%f") for marshalling 
and strtod for unmarshalling.

Let me design a more robust patch. 

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 17:23

Message:
Logged In: YES 
user_id=31435

The spec appears worse than useless to me here -- whoever 
wrote it just made stuff up.  They don't appear to know 
anything about floats or about grammar specification.  Do 
you really want to allow "+." and disallow "1.0"?  This 
seems a case where the spec is so braindead that nobody (in 
their mind <wink>) will implement it as given.  What do 
other implementations do?

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-20 17:03

Message:
Logged In: YES 
user_id=21627

You are right. An even better patch would check for
compliance with the protocol. Currently, the xmlrpc spec says

#  There is no representation for infinity or negative 
# infinity or "not a number". At this time, only decimal
# point notation is allowed, a plus or a minus, followed by
# any number of numeric characters, followed by a period 
# and any number of numeric characters. Whitespace is not 
# allowed. The range of allowable values is 
# implementation-dependent, is not specified.

That would be best validated with a regular expression.
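
A minimal sketch of such a check, assuming the most literal 
reading of the quoted text (a mandatory sign, then digits, a 
period, and more digits). The spec gives no formal grammar, so 
the pattern is an interpretation, and the names are hypothetical.

```python
import re

# Literal reading of the quoted spec text: sign, digits, period, digits.
# "Any number of" is taken to include zero, which is an assumption.
STRICT_DOUBLE = re.compile(r"[+-][0-9]*\.[0-9]*")

def is_strict_xmlrpc_double(text):
    """Return True if text matches the literal reading of the spec."""
    return STRICT_DOUBLE.fullmatch(text) is not None

for candidate in ("+1.5", "-0.25", "+.", "1.0", "1e+20", "inf"):
    print(candidate, is_strict_xmlrpc_double(candidate))
```

Read this literally, the pattern accepts "+." and rejects "1.0", 
because the unsigned form has no leading plus or minus.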

----------------------------------------------------------------------

Comment By: Tim Peters (tim_one)
Date: 2002-03-20 16:02

Message:
Logged In: YES 
user_id=31435

Note that the patch only catches "the problem" on a 
platform whose C library can't read back its own float 
output.  Windows is in that class, but many other platforms 
aren't.

It would be better to see whether 'n' or 'N' appear in the 
repr() (that would catch variations of 'inf', 'INF', 'NaN' 
and 'IND', while no "normal" float contains n).

----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2002-03-20 08:28

Message:
Logged In: YES 
user_id=21627

It seems repr of the float is computed twice in every case.
I recommend saving the result of the first computation.

----------------------------------------------------------------------
