python3 urlopen(...).read() returns bytes
ajaksu
ajaksu at gmail.com
Tue Dec 23 01:00:36 EST 2008
On Dec 22, 9:05 pm, Christian Heimes <li... at cheimes.de> wrote:
> ajaksu wrote:
>
> > That said, a "decode to declared HTTP header encoding" version of
> > urlopen could be useful to give some users the output they want (text
> > from network io) or to make it clear why bytes is the safe way.
>
> Yeah, your idea sounds both useful and feasible. A patch is welcome! :)
Would monkeypatching what urlopen returns be good enough or should we
aim at a cleaner implementation?
Glenn, does this sketch work for you?
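import socket
from urllib.request import urlopen

def urlopen_text(url, data=None,
                 timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    response = urlopen(url, data, timeout)
    # Keep the original bytes-returning methods before replacing them.
    _readline = response.readline
    _readlines = response.readlines
    _read = response.read
    # get_charsets() yields None when the header declares no charset;
    # falling back to ISO-8859-1 here is an assumption.
    charset = response.headers.get_charsets()[0] or 'iso-8859-1'

    def readline(limit=-1):
        content = _readline(limit)
        return str(content, encoding=charset)
    response.readline = readline

    def readlines(hint=None):
        content = _readlines(hint)
        return [str(line, encoding=charset) for line in content]
    response.readlines = readlines

    def read(n=-1):
        # A size limit may split a multibyte character and make decoding fail.
        content = _read() if n is None or n < 0 else _read(n)
        return str(content, encoding=charset)
    response.read = read

    return response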
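
On the "cleaner implementation" question, one possible alternative to monkeypatching is to wrap the response in io.TextIOWrapper, which decodes incrementally. This is only a sketch: as_text is a hypothetical helper name, the utf-8 default is an assumption, and io.BytesIO stands in for the urlopen() result so the example runs without network access. It assumes the real response behaves like a readable binary stream.

```python
import io

def as_text(binary_stream, charset="utf-8"):
    """Wrap a readable binary stream so read()/readline() return str."""
    # TextIOWrapper decodes incrementally, so a multibyte character that
    # straddles a buffer boundary is still decoded correctly.
    return io.TextIOWrapper(binary_stream, encoding=charset)

# io.BytesIO stands in for the object returned by urlopen().
fake_response = io.BytesIO("caf\xe9\nsecond line\n".encode("utf-8"))
text = as_text(fake_response)
print(repr(text.readline()))  # 'café\n'
print(repr(text.read()))      # 'second line\n'
```

Compared with replacing read/readline/readlines by hand, this leaves the original object untouched and avoids decoding partial reads one chunk at a time.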
Any comments/suggestions are very welcome. I could use some help from
people that know urllib on the best way to get the charset. Maybe
after some sleep I can code it in a less awful way :)
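
On getting the charset: the header objects in the email package offer get_content_charset(), which returns the declared charset in lower case or a fallback when none is declared. A minimal sketch, assuming the response headers are an email.message.Message (or subclass); declared_charset and the ISO-8859-1 default are hypothetical choices for illustration, not part of urllib.

```python
from email.message import Message

def declared_charset(content_type, default="iso-8859-1"):
    """Return the charset named in a Content-Type header value, else default."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lower-cases the value and returns failobj
    # when the header carries no charset parameter.
    return msg.get_content_charset(failobj=default)

print(declared_charset("text/html; charset=UTF-8"))  # utf-8
print(declared_charset("text/html"))                 # iso-8859-1
```

Under the same assumption, the equivalent call on a live response would be response.headers.get_content_charset(failobj=...), which avoids indexing into the list returned by get_charsets().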
Daniel