[New-bugs-announce] [issue9374] urlparse should parse query and fragment for arbitrary schemes
Nick Welch
report at bugs.python.org
Sun Jul 25 00:58:41 CEST 2010
New submission from Nick Welch <mackstann at gmail.com>:
While the netloc/path parts of URLs are scheme-specific, and urlparse can be forgiven for refusing to parse them for unknown schemes, the query and fragment parts are standardized across all schemes, so they should be parsed for unrecognized schemes too.
According to Wikipedia:
------------------
Internet standard STD 66 (also RFC 3986) defines the generic syntax to be used in all URI schemes. Every URI is defined as consisting of four parts, as follows:
<scheme name> : <hierarchical part> [ ? <query> ] [ # <fragment> ]
------------------
http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax
Here is a demonstration of what urlparse currently does:
>>> import urlparse
>>> urlparse.urlsplit('myscheme://netloc/path?a=b#frag')
SplitResult(scheme='myscheme', netloc='', path='//netloc/path?a=b#frag', query='', fragment='')
>>> urlparse.urlsplit('http://netloc/path?a=b#frag')
SplitResult(scheme='http', netloc='netloc', path='/path', query='a=b', fragment='frag')
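For comparison, here is a minimal sketch of the scheme-independent split that the generic syntax quoted above implies. It is not urlparse's code or a proposed patch. The helper name split_query_fragment is hypothetical, and it assumes only that the fragment begins at the first '#' and the query at the first '?' before it:

```python
def split_query_fragment(url):
    """Split a URL into (rest, query, fragment) without looking at the scheme.

    Hypothetical helper for illustration only; not part of urlparse.
    """
    # The fragment is everything after the first '#'.
    rest, _, fragment = url.partition('#')
    # The query is everything after the first '?' in what precedes the fragment.
    rest, _, query = rest.partition('?')
    return rest, query, fragment


print(split_query_fragment('myscheme://netloc/path?a=b#frag'))
# ('myscheme://netloc/path', 'a=b', 'frag')
```

The remaining 'myscheme://netloc/path' part is left untouched, since splitting it into netloc and path is the scheme-specific step.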
----------
components: Library (Lib)
messages: 111511
nosy: Nick.Welch
priority: normal
severity: normal
status: open
title: urlparse should parse query and fragment for arbitrary schemes
type: behavior
versions: Python 2.6
_______________________________________
Python tracker <report at bugs.python.org>
<http://bugs.python.org/issue9374>
_______________________________________