[issue4773] HTTPMessage not documented and has inconsistent API across 2.6/3.0
Brad Miller
report at bugs.python.org
Fri Mar 27 02:00:27 CET 2009
Brad Miller <bonelake at gmail.com> added the comment:
On Thu, Mar 26, 2009 at 4:29 PM, Barry A. Warsaw <report at bugs.python.org> wrote:
>
> Barry A. Warsaw <barry at python.org> added the comment:
>
> I propose that you only document the getitem header access API. I.e.
> the thing that info() gives you can be used to access the message
> headers via message['content-type']. That's an API common to both
> rfc822.Messages (the ultimate base class of mimetools.Message) and
> email.message.Message.
>
As I've found myself in the awkward position of having to explain the new
3.0 API to my students, I've thought about this and have some
ideas/questions.
I'm also willing to help with the documentation or any enhancements.
>>> x = urllib.request.urlopen('http://knuth.luther.edu/python/test.html')
>>> x['Date']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'addinfourl' object is unsubscriptable
I wish I knew what an addinfourl object was.
>>> x.info()['Date']
'Fri, 27 Mar 2009 00:41:34 GMT'
>>> x.headers['Date']
'Fri, 27 Mar 2009 00:41:34 GMT'
>>> x.headers.keys()
['Date', 'Server', 'Last-Modified', 'ETag', 'Accept-Ranges',
'Content-Length', 'Connection', 'Content-Type']
Using x.headers over x.info() makes the most sense to me, but I don't know
that I can give any good rationale. Which would we want to document?
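To make the getitem-style access concrete without depending on a live URL, here is a minimal sketch. It assumes the headers object behaves like an email.message.Message, and it builds one from a hand-written header block (the `raw` string is made up for illustration, reusing the values shown above):

```python
from email import message_from_string

# Hypothetical header block, standing in for what a server would send.
raw = (
    "Date: Fri, 27 Mar 2009 00:41:34 GMT\n"
    "Content-Type: text/html; charset=ISO-8859-1\n"
    "\n"
)
headers = message_from_string(raw)

# Header lookup by name is case-insensitive.
date = headers["date"]

# A missing header yields None rather than raising KeyError.
missing = headers["X-Not-There"]

# keys() lists the header names in their original order and case.
names = headers.keys()
```

If this holds for the object urlopen() returns, then documenting only `message['content-type']` style access would cover the common cases.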
>>> x.headers['Content-Type']
'text/html; charset=ISO-8859-1'
I guess technically this is correct, since the charset is part of the
Content-Type header in HTTP, but it does make life difficult for what I think
will be a pretty common use case in this new urllib: read from the URL (as
bytes) and then decode the bytes into a string using the appropriate
character set.
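A rough sketch of that use case follows. The helper name `decode_body` and the UTF-8 fallback are my own assumptions, not anything urllib provides; the headers are built by hand so the example runs without a network connection:

```python
from email.message import Message


def decode_body(raw, headers, default="utf-8"):
    """Decode raw bytes using the charset named in Content-Type.

    Falls back to `default` (an assumption of this sketch) when the
    header carries no charset parameter.
    """
    charset = headers.get_content_charset() or default
    return raw.decode(charset)


# Stand-in for the headers of a real response.
headers = Message()
headers["Content-Type"] = "text/html; charset=ISO-8859-1"

# Bytes as they would arrive from the server in ISO-8859-1.
body = "caf\xe9".encode("iso-8859-1")
text = decode_body(body, headers)
```

With a real response the call would presumably look like `decode_body(x.read(), x.headers)`.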
As you follow this road, you have the confusing option of these three calls:
>>> x.headers.get_charset()
>>> x.headers.get_content_charset()
'iso-8859-1'
>>> x.headers.get_charsets()
['iso-8859-1']
I think it should be a bug that get_charset() does not return anything in
this case. It is not at all clear why get_content_charset() and
get_charset() should have different behavior.
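For what it is worth, here is a small sketch of the difference as I understand it from email.message.Message in current Python 3 (I have not checked this against 3.0 itself): get_charset() seems to report only a Charset object attached via set_charset(), while get_content_charset() reads the charset parameter of the Content-Type header.

```python
from email import message_from_string
from email.message import Message

# A parsed message: the charset appears only in the Content-Type header.
parsed = message_from_string(
    "Content-Type: text/html; charset=ISO-8859-1\n\n"
)
attached = parsed.get_charset()             # nothing attached, so None
from_header = parsed.get_content_charset()  # header parameter, lowercased
all_charsets = parsed.get_charsets()        # one entry per message part

# A message built locally: set_charset() attaches a Charset object
# and also writes a matching Content-Type header.
built = Message()
built.set_charset("iso-8859-1")
built_attached = built.get_charset()
built_from_header = built.get_content_charset()
```

So for headers parsed off the wire, get_content_charset() looks like the call to recommend.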
Brad
>
> ----------
> nosy: +barry
>
> _______________________________________
> Python tracker <report at bugs.python.org>
> <http://bugs.python.org/issue4773>
> _______________________________________
>
----------
Added file: http://bugs.python.org/file13430/unnamed