[Python-bugs-list] rfc822.Message.readheaders bug (PR#3)

skip@mojam.com
Mon, 12 Jul 1999 15:36:04 -0400 (EDT)


Full_Name: Skip Montanaro
Version: 1.5.2
OS: Unix
Submission from: eric.cnri.reston.va.us (132.151.1.38)
Submitted by: guido


[resubmitted by GvR]

I think there's a bug in rfc822.Message.readheaders (v. 1.5.2).  That
method splits each header line into a name and a value and stores the
pair in a dict:

    headerseen = self.isheader(line)
    if headerseen:
        # It's a legal header line, save it.
        list.append(line)
        self.dict[headerseen] = string.strip(line[len(headerseen)+2:])
        continue

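Note the "len(headerseen)+2" used as the starting index of the slice.  I
think that should be "len(headerseen)+1".  The code appears to assume that
a space always follows the colon separating the name from the value, so a
header written without one (e.g. "Subject:hello") loses the first
character of its value.  Since the value is passed through string.strip
anyway, "+1" handles both forms.  My reading of the relevant section of
RFC 822 suggests that a single colon is the only separator between a field
name and its value: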

     3.2.  HEADER FIELD DEFINITIONS

          These rules show a field meta-syntax, without regard for the
     particular  type  or internal syntax.  Their purpose is to permit
     detection of fields; also, they present to  higher-level  parsers
     an image of each field as fitting on one line.

     field       =  field-name ":" [ field-body ] CRLF

     field-name  =  1*<any CHAR, excluding CTLs, SPACE, and ":">

     field-body  =  field-body-contents
                    [CRLF LWSP-char field-body]

     field-body-contents =
                   <the ASCII characters making up the field-body, as
                    defined in the following sections, and consisting
                    of combinations of atom, quoted-string, and
                    specials tokens, or else consisting of texts>

I got the above from http://www.faqs.org/rfcs/rfc822.html.
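
To make the difference concrete, here is a minimal standalone sketch of the
two slices.  It is not the rfc822 module's code: header_value is a
hypothetical helper, line.split(":", 1)[0].lower() is only a stand-in for
what self.isheader(line) returns, and the str.strip method is used in place
of string.strip so the snippet runs on current Python.

```python
def header_value(line, name, offset):
    # Mirror the slice in readheaders: skip the field name plus
    # `offset` more characters, then strip surrounding whitespace.
    # (Hypothetical helper, for illustration only.)
    return line[len(name) + offset:].strip()


for line in ("Subject: hello\r\n", "Subject:hello\r\n"):
    # Stand-in for self.isheader(line): the lowercased field name.
    name = line.split(":", 1)[0].lower()
    print(repr(line),
          "+2 ->", repr(header_value(line, name, 2)),
          "+1 ->", repr(header_value(line, name, 1)))
```

With a space after the colon both offsets give 'hello'.  Without the space,
"+2" gives 'ello' while "+1" still gives 'hello'.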