[issue20270] urllib.parse doesn't work with empty port

Serhiy Storchaka
Wed Jan 15 13:26:25 CET 2014

New submission from Serhiy Storchaka:

According to RFC 3986, the port subcomponent is defined as zero or more decimal digits, delimited from the host by a single colon. That is, 'python.org:' is a valid (but not normalized) form, and an empty port is equivalent to an absent port.

>>> import urllib.parse
>>> p = urllib.parse.urlparse('http://python.org:')
>>> p.hostname
>>> p.port  # should return None
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/serhiy/py/cpython-3.3/Lib/urllib/parse.py", line 156, in port
    port = int(port, 10)
ValueError: invalid literal for int() with base 10: ''
>>> urllib.parse.splitport('python.org:')  # should return ('python.org', None)
('python.org:', None)
>>> urllib.parse.splitnport('python.org:')  # should return ('python.org', -1)
('python.org', None)
>>> urllib.parse.splitnport('python.org:', 80)  # should return ('python.org', 80)
('python.org', None)

The proposed patch fixes this. It also adds tests for urllib.parse.splitport().
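
For illustration only, the expected behavior listed in the session above can be sketched as a standalone function. This is not the attached patch: the name split_port_lenient, the regular expression, and the default parameter are all hypothetical, chosen to show how an empty port can be treated the same as an absent one.

```python
import re

# Hypothetical helper for illustration; not the code from the attached patch.
# The port group accepts zero or more ASCII digits, matching the RFC 3986
# rule that the port is "zero or more decimal digits" after a single colon.
_PORT_RE = re.compile(r'(.*):([0-9]*)', re.DOTALL)


def split_port_lenient(host, default=None):
    """Split 'host:port' into (host, port).

    An empty port ('python.org:') is treated like an absent port:
    the trailing colon is dropped and `default` is returned as the port.
    """
    match = _PORT_RE.fullmatch(host)
    if match:
        name, port = match.groups()
        if port:
            return name, int(port, 10)
        # Colon present but no digits: equivalent to no port at all.
        return name, default
    # No trailing ':digits' part, so the whole string is the host.
    return host, default
```

With this sketch, split_port_lenient('python.org:') gives ('python.org', None), split_port_lenient('python.org:', 80) gives ('python.org', 80), and split_port_lenient('python.org:8080') gives ('python.org', 8080).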

components: Library (Lib)
files: urllib_parse_empty_port.patch
keywords: patch
messages: 208155
nosy: orsenthil, serhiy.storchaka
priority: normal
severity: normal
stage: patch review
status: open
title: urllib.parse doesn't work with empty port
type: behavior
versions: Python 2.7, Python 3.3, Python 3.4
Added file: http://bugs.python.org/file33480/urllib_parse_empty_port.patch


