python3 urlopen(...).read() returns bytes

ajaksu ajaksu at
Tue Dec 23 07:00:36 CET 2008

On Dec 22, 9:05 pm, Christian Heimes <li... at> wrote:
> ajaksu schrieb:
> > That said, a "decode to declared HTTP header encoding" version of
> > urlopen could be useful to give some users the output they want (text
> > from network io) or to make it clear why bytes is the safe way.
> Yeah, your idea sounds both useful and feasible. A patch is welcome! :)

Would monkeypatching what urlopen returns be good enough or should we
aim at a cleaner implementation?

Glenn, does this sketch work for you?

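import socket
from urllib.request import urlopen

# timeout default assumed to match urlopen's own default
def urlopen_text(url, data=None, timeout=socket._GLOBAL_DEFAULT_TIMEOUT):
    response = urlopen(url, data, timeout)
    _readline = response.readline
    _readlines = response.readlines
    _read = response.read
    # Fall back to UTF-8 (an assumption) when the headers declare no charset.
    charset = response.headers.get_charsets()[0] or 'utf-8'
    def decode(content):
        # readlines() may go through the patched readline() and already get str.
        return content if isinstance(content, str) else str(content, encoding=charset)
    def readline(limit = -1):
        return decode(_readline(limit))
    response.readline = readline
    def readlines(hint = None):
        return [decode(line) for line in _readlines(hint)]
    response.readlines = readlines
    def read(n = None):
        # Caution: a partial read can split a multi-byte character.
        return decode(_read(n))
    response.read = read
    return response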

Any comments/suggestions are very welcome. I could use some help from
people that know urllib on the best way to get the charset. Maybe
after some sleep I can code it in a less awful way :)
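
On the "cleaner implementation" and "best way to get the charset" questions, one possible direction (only a sketch, not a tested patch) is to leave the response object alone and wrap it in io.TextIOWrapper, taking the charset from the headers with get_content_charset(). The helper names declared_charset and wrap_text, the UTF-8 fallback, and the 10 second timeout default are all made up for illustration.

```python
import io
from email.message import Message
from urllib.request import urlopen


def declared_charset(headers, default="utf-8"):
    """Return the charset named in the Content-Type header, or `default`.

    `headers` is any email.message.Message, which includes the
    `headers` attribute of an HTTP response returned by urlopen().
    """
    # get_content_charset() returns None when no charset parameter is present.
    return headers.get_content_charset() or default


def wrap_text(raw, headers, default="utf-8"):
    """Wrap a binary stream in a text layer using the declared charset."""
    return io.TextIOWrapper(raw, encoding=declared_charset(headers, default))


def urlopen_text(url, data=None, timeout=10.0, default="utf-8"):
    """Like urlopen(), but read(), readline() and iteration yield str."""
    response = urlopen(url, data, timeout)
    return wrap_text(response, response.headers, default)


# Offline demonstration with a fake body and fake headers.
fake_headers = Message()
fake_headers["Content-Type"] = "text/html; charset=ISO-8859-1"
fake_body = io.BytesIO("café\n".encode("iso-8859-1"))
text_stream = wrap_text(fake_body, fake_headers)
first_line = text_stream.readline()
```

Because the text wrapper does the decoding incrementally, a partial read(n) counts characters rather than bytes and should not split a multi-byte sequence, which the monkeypatched version cannot guarantee. The downside is that the caller gets a TextIOWrapper back rather than the response itself, so response-specific attributes have to be reached through its .buffer attribute.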

