[Python-checkins] r80092 - python/branches/py3k/Doc/library/urllib.request.rst
orsenthil at gmail.com
Mon Apr 19 10:12:52 CEST 2010
On Sat, Apr 17, 2010 at 12:05:00PM -0400, R. David Murray wrote:
> Senthil, I think that we are in general considering Python 3 a "clean
> start", and avoiding mentioning how things were done in Python 2 except
> where it is important for compatibility (eg: pickle). I think the
> mention of how Python 2 did it actually muddies the explanation of how
> one should do it. I would either drop the mention of Python 2, or
> move it to a footnote (I favor just dropping it).
> How about this:
> Note that urlopen returns a bytes object. This is because there is no way
> for urlopen to automatically determine the encoding of the byte stream
> it receives from the HTTP server. In general, a program will decode
> the returned bytes object to a string once it determines or guesses
> the appropriate encoding.
Yes, I get your point, David. My write-up was aimed at the specific
bug report, which asked for the note to be explicit and helpful to
newcomers. Perhaps the urllib2 how-to tutorial can provide the specific
details, and this note can be written along the lines that you
suggested.
> Aside: I was curious how one went about determining the encoding, and
> found this fascinating document that seems to show just how non-trivial
> doing so is:
> And I thought email was a pain to parse. Little did I know.
It is interesting to see how other clients approach guessing the
correct encoding.
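
For reference, the decoding step described in the proposed note could be
sketched roughly as below. This is only an illustration, not text for the
docs: the helper names charset_from_content_type and fetch_text are made
up here, and falling back to UTF-8 when the server declares no charset is
an assumption, not something urlopen does for you.

```python
from email.message import Message
from urllib.request import urlopen


def charset_from_content_type(content_type, default="utf-8"):
    """Return the charset named in a Content-Type value, or `default`.

    Hypothetical helper: the UTF-8 fallback is a guess, not a guarantee.
    """
    if not content_type:
        return default
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header does not carry one.
    return msg.get_content_charset() or default


def fetch_text(url, default="utf-8"):
    """Fetch `url` and decode the body with the declared or default charset."""
    with urlopen(url) as response:
        raw = response.read()  # bytes, not str
        content_type = response.headers.get("Content-Type")
    encoding = charset_from_content_type(content_type, default)
    # errors="replace" keeps a wrong guess from raising UnicodeDecodeError.
    return raw.decode(encoding, errors="replace")
```

A caller would then write something like text = fetch_text(url). Pages
that declare their encoding only inside the document body (or not at all)
still need the kind of guessing discussed above.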