On Feb 16, 2020, at 07:21, Antoine Pitrou <antoine@python.org> wrote:
FWIW, I agree with Senthil here. A slight behaviour change in 3.9 is fine, especially in an area where the "right" semantics are not immediately obvious. What we want to avoid is breaking behaviour changes in bugfix releases.
I agree totally that we don't want to break behavior in bugfix releases and I have no problem with making breaking changes in feature releases (3.9.0) as warranted.
My point was that, after looking at this a bit, it seems to me that making this change does not address some of the underlying problems with the urlparse API and that it makes things *much* worse for the many users who are understandably expecting urlparse to sanely handle schemeless URL strings, the most commonly seen URL format today.
Note that we strongly imply that we sanely handle them by offering the "scheme=" parameter to urlparse. Another example: prior to 3.7.6 and 3.8.1:
urlparse("www.google.com:8080", scheme="http")
ParseResult(scheme='http', netloc='', path='www.google.com:8080', params='', query='', fragment='')
That isn't what users would expect; what they would expect is how things work with an explicit scheme (note the swapping of netloc and path).
urlparse("https://www.google.com:8080", scheme="http")
ParseResult(scheme='https', netloc='www.google.com:8080', path='', params='', query='', fragment='')
But at least there is a relatively simple workaround that users have discovered as witnessed by the requests code snippet I cited earlier: use the path field if netloc is empty.
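For illustration, here is a minimal sketch of that kind of workaround. It is not the requests code itself; the helper name host_from_parsed and its default_scheme argument are made up for this example. It falls back to the leading segment of the path when netloc comes back empty.

```python
from urllib.parse import urlparse


def host_from_parsed(url, default_scheme="http"):
    """Hypothetical helper: return (scheme, host part) for a URL string.

    Falls back to the path field when netloc is empty, in the spirit of
    the workaround described above.
    """
    parsed = urlparse(url, scheme=default_scheme)
    if parsed.netloc:
        # An explicit scheme (or a leading "//") was present.
        return parsed.scheme, parsed.netloc
    # No netloc: the host ended up at the start of the path field.
    host = parsed.path.split("/", 1)[0]
    return parsed.scheme, host


print(host_from_parsed("https://www.google.com:8080/search"))
print(host_from_parsed("www.google.com/search"))
# Note: for "www.google.com:8080" the result depends on the Python
# version, which is exactly the problem discussed in this thread.
```

The fallback only helps as long as the schemeless string still lands in the path field, so it is no help once the text before the colon is parsed as a scheme.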
Now with the change in 3.8.1 and 3.7.6, the behavior is very different and pretty useless even with an explicit scheme="http" parameter:
urlparse("www.google.com:8080", scheme="http")
ParseResult(scheme='www.google.com', netloc='', path='8080', params='', query='', fragment='')
i.e. www.google.com://8080
While that may be what strict adherence to the RFC dictates, most users aren't going to expect or desire results like that. So while the change may fix some cases, it's only making matters worse. What kind of workaround do you use for that result?
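One possible answer, sketched here only as an illustration and not something proposed in this thread, is to stop relying on the path fallback and instead prefix schemeless strings with "//" before parsing, so that urlparse treats the leading part as the netloc. The helper name parse_schemeless and the "://" heuristic are assumptions made for this example; the heuristic would mangle scheme-only forms such as "mailto:..." and any schemeless string that happens to contain "://" later on.

```python
from urllib.parse import urlparse


def parse_schemeless(url, default_scheme="http"):
    """Hypothetical helper, not part of urllib.

    Prefixes "//" to strings that carry no "scheme://" part so that
    urlparse puts "host:port" into netloc rather than scheme/path.
    """
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    return urlparse(url, scheme=default_scheme)


result = parse_schemeless("www.google.com:8080")
print(result.scheme, result.netloc, result.port)
```

With the "//" prefix the characters before the colon can no longer be read as a scheme, so the default scheme is applied and the host and port end up in netloc.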
In another open issue concerning a different urlparse problem, Victor noted (4 months ago) that "there are 124 open issues with "urllib" in their title and 12 open issues with "urlparse" in their title" and hit a bit of a dead end with a proposed fix.
https://bugs.python.org/issue36338#msg355322
Rather than continuing with this change in 3.9, which introduces yet another, even more unexpected behavior, I think we should first try to address what appears to me to be the (a?) root cause issue: urlparse's API is not suited for parsing both strictly RFC-compliant URLs (which are clearly not well understood) *and* today's schemeless URLs, which have evolved over the years to become the most commonly encountered form of URL. Users want and need both. The merged change makes the previous situation worse, IMHO.
On 16/02/2020 at 13:13, Senthil Kumaran wrote:
On Sun, Feb 16, 2020 at 2:20 AM Ned Deily <nad@python.org> wrote:
For 3.9.0, I recommend we reconsider this change (temporarily reverting it) and consider whether an API change to accommodate the various use cases would be better
For 3.9, I am ready to defend the patch, even at the cost of breaking the parsing of undefined behavior. We should keep it. The patch simplifies a lot of corner cases and fixes the reported bugs. We don't guarantee backward compatibility between major versions, so I assume users will be careful when relying upon this undefined behavior and will take corrective action on their side before upgrading to 3.9.
We want patch releases to be backward compatible. That was the user-complaint.
Thanks, Senthil
python-committers mailing list -- python-committers@python.org To unsubscribe send an email to python-committers-leave@python.org https://mail.python.org/mailman3/lists/python-committers.python.org/ Message archived at https://mail.python.org/archives/list/python-committers@python.org/message/S... Code of Conduct: https://www.python.org/psf/codeofconduct/
-- Ned Deily nad@python.org