[New-bugs-announce] [issue43883] Making urlparse WHATWG conformant
report at bugs.python.org
Sun Apr 18 15:43:37 EDT 2021
New submission from Senthil Kumaran <senthil at uthcode.com>:
Mike Lissner reported that a set of test suites exercising extreme conditions with URLs, in conformance with url.spec.whatwg.org,
was maintained here:
These test cases were run against the urlparse and urljoin functions.
The basic idea is to iterate over the test cases and try joining and parsing them. The script wound up messier than I wanted b/c there's a fair bit of normalization you have to do (e.g., the test cases expect blank paths to be '/', while urlparse returns an empty string), but you'll get the idea.
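A minimal sketch of that approach, not the script from the report: the test-case shape (dicts with "input", "base", and "pathname" keys), the sample cases, and the helpers normalized_path and run are all assumptions made for illustration. It joins each input against its base, parses the result, and normalizes a blank path to '/' before comparing.

```python
from urllib.parse import urljoin, urlparse

# Hypothetical test cases; the real suite's format and contents are not
# shown in the report, so this shape is an assumption.
CASES = [
    {"input": "http://example.com", "base": None, "pathname": "/"},
    {"input": "b/c?x=1", "base": "http://example.com/a/", "pathname": "/a/b/c"},
    {"input": "../d", "base": "http://example.com/a/b/c", "pathname": "/a/d"},
]


def normalized_path(url):
    """Return the parsed path, mapping urlparse's empty path to '/'."""
    path = urlparse(url).path
    return path or "/"


def run(cases):
    """Join, parse, and compare each case; return (passed, total)."""
    passed = 0
    for case in cases:
        # Only join when the case supplies a base URL.
        if case["base"]:
            joined = urljoin(case["base"], case["input"])
        else:
            joined = case["input"]
        if normalized_path(joined) == case["pathname"]:
            passed += 1
    return passed, len(cases)


if __name__ == "__main__":
    passed, total = run(CASES)
    print(f"Done. {passed}/{total} successes.")
```

Only the path component is compared here; a fuller harness would need similar normalization for the other components before the comparison is fair.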
The bad news is that of the roughly 600 test cases, fewer than half pass. Some more normalization would fix some more of this, and I don't imagine all of these have security concerns (I haven't thought through it, honestly, but there are issues with domain parsing too that look meddlesome). For now I've taken it as far as I can, and it should be a good start, I think.
The final numbers the script cranks out are:
Done. 231/586 successes. 1 skipped.
stage: needs patch
title: Making urlparse WHATWG conformant
Python tracker <report at bugs.python.org>