[New-bugs-announce] [issue25439] Add type checks to urllib.request.Request

Ezio Melotti report at bugs.python.org
Mon Oct 19 11:31:23 CEST 2015


New submission from Ezio Melotti:

Currently urllib.request.Request seems to accept invalid types silently, only to fail later on with unhelpful errors when the request is passed to urlopen.

This might cause users to go through something like this:

>>> r = Request(b'https://www.python.org')
>>> # so far, so good
>>> urlopen(r)
Traceback (most recent call last):
  ...
urllib.error.URLError: <urlopen error unknown url type: b'https>

This unhelpful error might lead users to think https is not supported, whereas the problem is that the url should have been str, not bytes.

The same problem applies to post data:

>>> r = Request('https://www.python.org', {'post': 'data'})
>>> # so far, so good
>>> urlopen(r)
Traceback (most recent call last):
  ...
TypeError: memoryview: dict object does not have the buffer interface
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  ...
ValueError: Content-Length should be specified for iterable data of type <class 'dict'> {'post': 'data'}

This error seems to indicate that Content-Length should be specified somewhere, but AFAICT from the docs, the data should be bytes or None -- so let's try to urlencode it:

>>> r = Request('https://www.python.org', urlencode({'post': 'data'}))
>>> # so far, so good
>>> urlopen(r)
Traceback (most recent call last):
  ...
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.

OK, maybe I should use bytes in the dict:

>>> r = Request('https://www.python.org', urlencode({b'post': b'data'}))
>>> # so far, so good
>>> urlopen(r)
Traceback (most recent call last):
  ...
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.

That didn't work either: I also needed to encode the output of urlencode().
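
For reference, here is a minimal sketch of a combination that the errors above point towards: urlencode() returns str, so its output is encoded to bytes before being passed as data. The choice of ASCII is an assumption; it should be safe here because urlencode() percent-escapes non-ASCII characters. No request is actually sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# urlencode() produces a str such as 'post=data'; Request expects bytes.
body = urlencode({'post': 'data'}).encode('ascii')

# Passing data makes this a POST request; urlopen(r) would send it.
r = Request('https://www.python.org', body)
```

Only the final urlopen(r) call touches the network, so the construction itself can be checked offline.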


Most of these problems could be prevented if Request() raised for non-str URLs, and non-bytes (and non-None) POST data.  Unless there is some valid reason to accept invalid types, I think they should be rejected early.
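
To illustrate what "rejected early" could look like, here is a rough sketch written as a wrapper function rather than a patch to Request itself. The name checked_request and the error messages are hypothetical, and the check is deliberately stricter than what urlopen accepts today (the errors above mention iterables of bytes, which this sketch would reject).

```python
from urllib.request import Request


def checked_request(url, data=None, **kwargs):
    """Hypothetical helper: validate types, then build a Request."""
    # Reject bytes (or anything else that is not str) as the URL.
    if not isinstance(url, str):
        raise TypeError(
            "url must be str, not %s" % type(url).__name__)
    # Reject str, dict, etc. as POST data; only bytes or None pass.
    if data is not None and not isinstance(data, bytes):
        raise TypeError(
            "data must be bytes or None, not %s" % type(data).__name__)
    return Request(url, data, **kwargs)
```

With something like this, checked_request(b'https://www.python.org') and checked_request('https://www.python.org', {'post': 'data'}) would both fail at construction time with a TypeError naming the offending type, instead of failing later inside urlopen.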

----------
components: Library (Lib)
keywords: easy
messages: 253173
nosy: ezio.melotti
priority: normal
severity: normal
stage: needs patch
status: open
title: Add type checks to urllib.request.Request
type: enhancement
versions: Python 3.6

_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue25439>
_______________________________________

