[Python-bugs-list] Re: PR#110

guido@CNRI.Reston.VA.US guido@CNRI.Reston.VA.US
Tue, 19 Oct 1999 08:57:34 -0400 (EDT)


> Uhm, I just got a notice that you fixed the problem.  However, looking
> at the CVS tree, it looks like you added:
>         /* and remove any padding */
>         bin_len -= npad;
>         if (bin_len < 0)
>                 bin_len = 0;
> 
> This may fix the symptom, but it doesn't fix the underlying problems:

Your bug report didn't indicate that you had a problem with this.

> A run of four pad characters causes the loss of encoded data, e.g.:
>   'SGVsbG8K'      decoded is 'Hello\n'
>   'SGVsbG8K===='  decoded is 'Hello'
> 
> Illegal pad sequences can appear anywhere:
>   'a=aaa'  decodes to 'h\006', but
>   is actually nonsense, according to the
>   specification.
> 
> Characters above chr(127) are remapped into the 7-bit ASCII range (by
> masking with 0x7f) instead of being ignored:
>   'aaaa' and 'aaa\xe1' both decode to 'i\246\232',
>   but the second form should ignore the '\xe1' and
>   raise an "Incorrect padding" error.
> 
> The patch included with this message ought to make binascii's base64
> decoding routine RFC 2045 compliant.  Please see section 6.8 at
> "http://www.faqs.org/rfcs/rfc2045.html" for details on what RFC 2045 is.
> 
> The problems that I fixed with binascii are:
>  -- Illegal padding now generates an exception.
> 	(binascii.Error: Invalid pad sequence)
> 
>  -- Padding now indicates EOI (as per RFC)
>     Previously, padding could be used anywhere
>     in input, violating the RFC.
> 
>  -- Padding no longer removes characters from
>     data string (resulting in lost data/strings
>     with negative lengths).
> 
>  -- Illegal characters are now ignored, instead of
>     possibly being remapped to a valid character.
> 
> Base64 decoding should (hopefully!) be RFC 2045 compliant with this
> patch!  If you can, let me know any problems you find in the patch.

I have one reservation: raising an exception for invalid input seems
like overkill.  I'd rather ignore illegal padding (just as you are
ignoring non-ASCII characters).  That way, a garbled file may be
recoverable, at least partially.  The rest of your patch description
seems fine.
Remember, the general Internet philosophy is "be strict in what you
send, but generous in what you accept."  (Or words to that effect.)
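
To make the "generous in what you accept" behaviour concrete, here is a
minimal pure-Python sketch of a lenient decoder.  It is not the
binascii.c patch and not how binascii is implemented; the function name
lenient_b64decode and the exact policy (ignore any byte outside the
base64 alphabet, ignore a '=' that appears where padding is not legal,
treat a '=' after two or three pending characters as end of input) are
assumptions chosen for illustration only.

```python
import string

# The 64-character base64 alphabet from RFC 2045, section 6.8.
_ALPHABET = (string.ascii_uppercase + string.ascii_lowercase
             + string.digits + "+/").encode("ascii")
_VALUES = {char: index for index, char in enumerate(_ALPHABET)}
_PAD = ord("=")


def lenient_b64decode(data):
    """Decode base64 bytes, skipping anything that is not valid input.

    Hypothetical helper for illustration; never raises on garbled data.
    """
    out = bytearray()
    quantum = []  # 6-bit values collected for the current 4-character group
    for byte in data:
        if byte == _PAD:
            if len(quantum) >= 2:
                break       # padding in a legal position: end of input
            continue        # stray padding: ignore it, lose no data
        value = _VALUES.get(byte)
        if value is None:
            continue        # not in the alphabet (e.g. 0xe1): ignore, do not mask
        quantum.append(value)
        if len(quantum) == 4:
            bits = ((quantum[0] << 18) | (quantum[1] << 12)
                    | (quantum[2] << 6) | quantum[3])
            out += bits.to_bytes(3, "big")
            quantum.clear()
    # Flush a trailing partial group; a single leftover character carries
    # fewer than 8 bits and is dropped.
    if len(quantum) == 2:
        out.append((quantum[0] << 2) | (quantum[1] >> 4))
    elif len(quantum) == 3:
        bits = (quantum[0] << 10) | (quantum[1] << 4) | (quantum[2] >> 2)
        out += bits.to_bytes(2, "big")
    return bytes(out)
```

Under this policy 'SGVsbG8K====' decodes to the same bytes as
'SGVsbG8K', and the '\xe1' in 'aaa\xe1' is skipped rather than being
masked with 0x7f into 'a' (0xe1 & 0x7f == 0x61).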

> Here's how to apply the patch in this message:
>  1) Go into your Python source directory & go into the Modules/
>     directory.
>  2) Make sure you have an unmodified Python 1.5.2 binascii.c file.

I wish you had made the change relative to the latest CVS version, but 
this is okay.

>  3) Type: patch -N binascii.c binascii.patch
> 
> Anyway, I just want to say that I love using Python (my favorite
> language for 3 years now!).  This is the first time that I've actually
> had the chance to help advance an open-source style project, and I hope
> that I've done things right.  Thanks!

Thanks!  I have one request (with pain in my heart): please resend me
your patch with, in the body of the message, the disclaimer suggested
on this webpage: http://www.python.org/1.5/bugrelease.html

--Guido van Rossum (home page: http://www.python.org/~guido/)