[Python-bugs-list] [ python-Bugs-473009 ] binascii_b2a_base64() improper str limit

noreply@sourceforge.net noreply@sourceforge.net
Sun, 21 Oct 2001 18:26:43 -0700


Bugs item #473009, was opened at 2001-10-19 21:42
You can respond by visiting: 
http://sourceforge.net/tracker/?func=detail&atid=105470&aid=473009&group_id=5470

Category: Python Library
Group: Python 2.1.1
Status: Open
Resolution: None
Priority: 5
Submitted By: Dave Cinege (dcinege)
Assigned to: Nobody/Anonymous (nobody)
Summary: binascii_b2a_base64() improper str limit

Initial Comment:
Modules/binascii.c
binascii_b2a_base64() contains the following 
restrictive code:
	if ( bin_len > BASE64_MAXBIN ) {
		PyErr_SetString(Error, "Too much data for base64 line");
		return NULL;
	}

This is an error. The base64 method of encoding data 
has no length limitation. The MIME message RFC places 
such a limitation on base64 encoded data. The 
function should not assume its only input must be 
MIME compatible. The base64 python module itself
is designed for MIME I/O only, and properly limits 
itself. The binascii function should be left raw.

binascii_a2b_base64() properly accepts input of any 
size.

How I came across this bug: I use base64 to ascii
armor binary data in log entries in a distributed 
network monitoring system. For the sake of ease of 
parsing (human and machine) all log entries are 
delimited by a single line. I commonly have unbroken 
base64 encoded fields of 64KB in size or greater.

Unfortunately I am unable to encode this data like 
this:
result64 = binascii.b2a_base64(s)
I must do this:
result64 = re.sub('[ |\n]','',base64.encodestring(s))
Which is *much* slower.  : <
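
For comparison, here is a minimal sketch of the two approaches described above, assuming a current Python 3 interpreter (3.6 or later), where binascii.b2a_base64() accepts input of any size and takes a newline keyword, and where base64.encodestring() has been renamed base64.encodebytes(). The payload is made-up test data, not the reporter's log format.

```python
import base64
import binascii

# 64 KiB of arbitrary binary data, standing in for one log field (made-up data).
payload = bytes(range(256)) * 256

# Raw encoder: a single unbroken line. newline=False (Python 3.6+) drops the
# trailing "\n" that b2a_base64 appends by default.
one_line = binascii.b2a_base64(payload, newline=False)

# Higher-level call that likewise never inserts line breaks.
also_one_line = base64.b64encode(payload)

# MIME-style encoder: emits 76-character lines, so the newlines have to be
# stripped afterwards, as in the workaround from the report.
mime_stripped = base64.encodebytes(payload).replace(b"\n", b"")
```

All three results are the same bytes; only the first two avoid the extra pass over the output.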

I feel this is an outright bug and should be 
corrected. If there is some argument for backward 
compatibility, an optional function argument should be 
present to allow bypassing this limitation.




----------------------------------------------------------------------

>Comment By: Guido van Rossum (gvanrossum)
Date: 2001-10-21 18:26

Message:
Logged In: YES 
user_id=6380

I'm with David. It's up to the higher level code (e.g. the
base64 module) to avoid writing lines longer than 76
characters; the underlying function in binascii doesn't have
to act as a policeman here. There may be other applications
of the same encoding where the 76-char limit does not apply.
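
The division of labour described here can be sketched as follows: the raw encoder produces one unbroken string, and a higher-level helper applies the 76-character MIME limit on top of it. The helper name encode_mime_lines is hypothetical, and the sketch assumes Python 3.6+ for the newline keyword; it is not the actual implementation of the base64 module.

```python
import base64
import binascii

MIME_LINE_LENGTH = 76  # line limit from RFC 2045 section 6.8


def encode_mime_lines(data: bytes) -> bytes:
    """Hypothetical helper: base64-encode, then wrap at 76 characters."""
    # Step 1: the low-level function encodes everything with no limit.
    raw = binascii.b2a_base64(data, newline=False)
    # Step 2: the line-length policy is applied by the caller.
    lines = [
        raw[i:i + MIME_LINE_LENGTH]
        for i in range(0, len(raw), MIME_LINE_LENGTH)
    ]
    return b"".join(line + b"\n" for line in lines)


sample = bytes(range(256)) * 3
wrapped = encode_mime_lines(sample)
```

Callers that do not need MIME output simply skip step 2.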

----------------------------------------------------------------------

Comment By: Dave Cinege (dcinege)
Date: 2001-10-20 21:34

Message:
Logged In: YES 
user_id=314434

>Can you cite any relevant standard that defines base64 to 
>work in that way? Base64 is defined in RFC 2045 section 
>6.8., which clearly says

>The encoded output stream must be represented in lines 
>of no more than 76 characters each.

This is difficult to do because base64 itself has not (yet) been
separately defined in its own RFC. It should be, and this issue has
been brought up recently on the W3 lists.

For example:
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0212.html
http://lists.w3.org/Archives/Public/w3c-ietf-xmldsig/2001AprJun/0210.html

The part of the RFC you have quoted is relevant to the use of base64
encoding in the context of MIME, the purpose clearly being to
ensure compatibility with email (SMTP, POP3, MUA, etc.) standards.

However, this 76 character line length rule is irrelevant when dealing
with arbitrary binary data not meant for MIME encapsulated transmission.
This is clearly seen in the description of the actual base64 algorithm
itself:

   The encoding process represents 24-bit groups of input bits as output
   strings of 4 encoded characters.  Proceeding from left to right, a
   24-bit input group is formed by concatenating 3 8bit input groups.
   These 24 bits are then treated as 4 concatenated 6-bit groups, each
   of which is translated into a single digit in the base64 alphabet.
   When encoding a bit stream via the base64 encoding, the bit stream
   must be presumed to be ordered with the most-significant-bit first.
   That is, the first bit in the stream will be the high-order bit in
   the first 8bit byte, and the eighth bit will be the low-order bit in
   the first 8bit byte, and so on.
...
   In base64
   data, characters other than those in Table 1, line breaks, and other
   white space probably indicate a transmission error, about which a
   warning message or even a message rejection might be appropriate
   under some circumstances.
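
The grouping described in the excerpt can be sketched in a few lines of Python. This is an illustration only, not the code in Modules/binascii.c; the function name encode_base64 is hypothetical, and the "=" padding of a short final group follows the usual RFC 2045 convention, which the excerpt above does not cover. Note that nothing in the algorithm involves a line length.

```python
import base64

# The 64-character base64 alphabet (Table 1 of RFC 2045).
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_base64(data: bytes) -> str:
    """Hypothetical helper: encode bytes as one unbroken base64 string."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Form a 24-bit group from up to three 8-bit bytes, most significant
        # bit first; a short final group is filled with zero bits.
        group = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        # Treat the 24 bits as four 6-bit groups, each mapped to one digit.
        digits = [ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # n input bytes give n + 1 significant digits; the rest become "=".
        keep = len(chunk) + 1
        out.append("".join(digits[:keep]) + "=" * (4 - keep))
    return "".join(out)
```

For example, encode_base64(b"abc") yields "YWJj", the same four characters that open the Authorization header later in this message.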

Additionally, the use of 'unlimited length' base64 encoding of binary data
has reached critical mass. As a broad-based example, HTTP based authorization
'encrypts' the username:password in base64. However, no length limit can
be used, else it would arbitrarily limit the amount of data that could
be passed without interfering with the HTTP protocol itself.
For example (lines should not appear wrapped):

'Logging in' to a webserver with
Username:
	abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
Password:
	test

Will have the web browser send the AUTH request header as follows:
Authorization: Basic YWJjZGVmZ2hpamtsbW5vcHFyc3R1dnd4eXpBQkNERUZHSElKS0xNTk9QUVJTVFVWV1hZWjAxMjM0NTY3ODlhYmNkZWZnaGlqa2xtbm9wcXJzdHV2d3h5ekFCQ0RFRkdISUpLTE1OT1BRUlNUVVZXWFlaMDEyMzQ1Njc4OTp0ZXN

The latter field is an 'unlimited' length base64 encoding. 
(Testing done with KDE Konqueror, other browsers may vary)
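
A rough sketch of how such a header value is built, assuming Python 3 and the example username and password given above. The variable names are illustrative, and this shows the general HTTP Basic scheme rather than what Konqueror does internally.

```python
import base64
import string

# The 124-character example username: a-z, A-Z, 0-9, repeated twice.
alnum = string.ascii_lowercase + string.ascii_uppercase + string.digits
username = alnum * 2
password = "test"

# HTTP Basic sends base64("username:password") as one unbroken token.
credentials = f"{username}:{password}".encode("ascii")
token = base64.b64encode(credentials).decode("ascii")
header = f"Authorization: Basic {token}"
```

Here the token is well over 76 characters long and contains no line breaks, since a break would split the header field.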

Due to its simple application you will find many a reference stating:
''The Base64 algorithm has become "the standard" for encoding binary data.''
Clearly, line length limitations are counterproductive to such use.









----------------------------------------------------------------------

Comment By: Martin v. Löwis (loewis)
Date: 2001-10-20 06:30

Message:
Logged In: YES 
user_id=21627

Can you cite any relevant standard that defines base64 to 
work in that way? Base64 is defined in RFC 2045 section 
6.8., which clearly says

  The encoded output stream must be represented in lines 
of no more than 76 characters each.



----------------------------------------------------------------------
