Mailman 3 Overriding stdlib http package - Python-Dev

Overriding stdlib http package

Demian Brecht

14 Jan 2015

4:32 p.m.

Hi all,

As part of the work I'm doing on httplib3 (now that I've actually gotten a bit of time), one of the things I'm trying to get done is injection of httplib3 over http in order to not have to modify all import paths in modules and such. Here's the gist of what I have so far: https://gist.github.com/demianbrecht/bc6530a40718e4fcbf90.

It's greatly simplified over importlib2's inject mechanism, but I'm assuming that's largely due to requirements of that package (i.e. Python 2) in contrast to this one.

My questions are: Does this look sane? Is there anything that I might be not accounting for? It /does/ seem to work as expected when running tests, but I'm curious if there's anything that I might be missing that might jump out at someone more intimately familiar with the mechanics of importlib.

Thanks, Demian
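The gist is not reproduced in this thread, so the following is only a generic sketch of the kind of import-path injection described above, not Demian's code. It assumes a meta path finder that redirects one top-level name to another module. The names `demo_http_alias` and `demo_replacement` are hypothetical stand-ins for `http` and the fork.

```python
import importlib
import importlib.abc
import importlib.machinery
import sys
import types


class AliasLoader(importlib.abc.Loader):
    """Loader that hands back an existing module under a different name."""

    def __init__(self, target_name):
        self.target_name = target_name

    def create_module(self, spec):
        # Reuse the target module object for the alias. Note that the import
        # machinery rebinds __spec__ on whatever module is returned here.
        return importlib.import_module(self.target_name)

    def exec_module(self, module):
        # The target is already initialised; there is nothing left to execute.
        pass


class AliasFinder(importlib.abc.MetaPathFinder):
    """Meta path finder that redirects one top-level name to another."""

    def __init__(self, alias, target):
        self.alias = alias
        self.target = target

    def find_spec(self, fullname, path=None, target=None):
        if fullname != self.alias:
            return None  # leave every other name to the remaining finders
        return importlib.machinery.ModuleSpec(fullname, AliasLoader(self.target))


# Hypothetical replacement module, registered so import_module() can find it.
replacement = types.ModuleType("demo_replacement")
replacement.MARKER = "fork"
sys.modules["demo_replacement"] = replacement

finder = AliasFinder("demo_http_alias", "demo_replacement")
sys.meta_path.insert(0, finder)  # must precede the default finders to win
try:
    import demo_http_alias
    same_object = demo_http_alias is replacement
finally:
    # Undo the demo so the interpreter state is left as it was found.
    sys.meta_path.remove(finder)
    sys.modules.pop("demo_http_alias", None)
    sys.modules.pop("demo_replacement", None)
```

A finder is only consulted for names that are not already in `sys.modules`, so a hook like this has to be installed before the first import of the name it redirects.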

Ian Cordasco

14 Jan

4:37 p.m.

I think this belongs on python-list, not python-dev. On Wed, Jan 14, 2015 at 10:32 AM, Demian Brecht <demianbrecht@gmail.com> wrote:

...

Hi all,

As part of the work I'm doing on httplib3 (now that I've actually gotten a bit of time), one of the things I'm trying to get done is injection of httplib3 over http in order to not have to modify all import paths in modules and such. Here's the gist of what I have so far: https://gist.github.com/demianbrecht/bc6530a40718e4fcbf90.

It's greatly simplified over importlib2's inject mechanism, but I'm assuming that's largely due to requirements of that package (i.e. Python 2) in contrast to this one.

My questions are: Does this look sane? Is there anything that I might be not accounting for? It /does/ seem to work as expected when running tests, but I'm curious if there's anything that I might be missing that might jump out at someone more intimately familiar with the mechanics of importlib.

Thanks, Demian

_______________________________________________ Python-Dev mailing list Python-Dev@python.org https://mail.python.org/mailman/listinfo/python-dev Unsubscribe: https://mail.python.org/mailman/options/python-dev/graffatcolmingov%40gmail....

Antoine Pitrou

4:54 p.m.

On Wed, 14 Jan 2015 08:32:23 -0800 Demian Brecht <demianbrecht@gmail.com> wrote:

...

Hi all,

As part of the work I'm doing on httplib3 (now that I've actually gotten a bit of time), one of the things I'm trying to get done is injection of httplib3 over http in order to not have to modify all import paths in modules and such. Here's the gist of what I have so far: https://gist.github.com/demianbrecht/bc6530a40718e4fcbf90.

Why don't you simply monkeypatch sys.modules, e.g.:

import myhttplib

sys.modules['http'] = myhttplib

or doesn't it work as desired?

Regards

Antoine.
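A minimal sketch of the `sys.modules` patch suggested above, wrapped in a context manager so the original entry is restored afterwards. `module_override` and the `MARKER` attribute are hypothetical names used only for illustration, and the replacement here is an empty stand-in module rather than a real fork.

```python
import contextlib
import sys
import types

_MISSING = object()


@contextlib.contextmanager
def module_override(name, replacement):
    """Temporarily make `import name` resolve to `replacement`."""
    previous = sys.modules.get(name, _MISSING)
    sys.modules[name] = replacement
    try:
        yield replacement
    finally:
        # Put back whatever was registered before, or remove our entry.
        if previous is _MISSING:
            sys.modules.pop(name, None)
        else:
            sys.modules[name] = previous


# Stand-in for the forked package; a real fork would be imported instead.
fake_http = types.ModuleType("http")
fake_http.MARKER = "replacement"

with module_override("http", fake_http):
    import http  # resolved from sys.modules, so this binds the stand-in
    marker = http.MARKER
```

A replacement for a package such as `http` would also need to provide the same submodules (`http.client` and so on). A plain module without `__path__` cannot satisfy `import http.client` unless that submodule is already in `sys.modules`.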

Tres Seaver

5:04 p.m.

On 01/14/2015 11:54 AM, Antoine Pitrou wrote:

...

On Wed, 14 Jan 2015 08:32:23 -0800 Demian Brecht <demianbrecht@gmail.com> wrote:

...
Hi all,

As part of the work I'm doing on httplib3 (now that I've actually gotten a bit of time), one of the things I'm trying to get done is injection of httplib3 over http in order to not have to modify all import paths in modules and such. Here's the gist of what I have so far: https://gist.github.com/demianbrecht/bc6530a40718e4fcbf90.

Why don't you simply monkeypatch sys.modules, e.g.:

import myhttplib

sys.modules['http'] = myhttplib

or doesn't it work as desired?

Doesn't that leave any prior imports broken (using the original module)?

Tres.

Antoine Pitrou

5:20 p.m.

On Wed, 14 Jan 2015 12:04:22 -0500 Tres Seaver <tseaver@palladion.com> wrote:

...

On 01/14/2015 11:54 AM, Antoine Pitrou wrote:

...
On Wed, 14 Jan 2015 08:32:23 -0800 Demian Brecht <demianbrecht@gmail.com> wrote:

...
Hi all,

As part of the work I'm doing on httplib3 (now that I've actually gotten a bit of time), one of the things I'm trying to get done is injection of httplib3 over http in order to not have to modify all import paths in modules and such. Here's the gist of what I have so far: https://gist.github.com/demianbrecht/bc6530a40718e4fcbf90.

Why don't you simply monkeypatch sys.modules, e.g.:

import myhttplib

sys.modules['http'] = myhttplib

or doesn't it work as desired?

Doesn't that leave any prior imports broken (using the original module)?

Not sure. Any fiddling with the import system is better done at startup, anyway. Regards Antoine.
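The question about prior imports can be checked with a small experiment. This is a sketch using a throwaway module name (`demo_pkg`, hypothetical) rather than `http`. It suggests that names bound before the patch keep pointing at the original module object, while later imports see the replacement.

```python
import sys
import types

# Register an "original" module under a throwaway name.
original = types.ModuleType("demo_pkg")
original.VERSION = "original"
sys.modules["demo_pkg"] = original

# Imports made before the patch.
import demo_pkg as early
from demo_pkg import VERSION as early_version

# Now swap the sys.modules entry, as in the monkeypatch suggestion.
replacement = types.ModuleType("demo_pkg")
replacement.VERSION = "replacement"
sys.modules["demo_pkg"] = replacement

# Imports made after the patch.
import demo_pkg as late

early_is_original = early is original
late_is_replacement = late is replacement

# Clean up the throwaway entry.
del sys.modules["demo_pkg"]
```

So the earlier imports are not broken as such, but the process ends up with two different module objects in use under one name, which is one reason to patch at startup.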

Demian Brecht

5:37 p.m.

Hm, I /did/ try that but ran into issues. Swapping the custom finder for the monkey patch now seems to work as expected though. Could be that I was doing something else at the time that caused it not to work. I'll keep running with that and will ping the thread if the issues surface again. Thanks! On 2015-01-14 8:54 AM, Antoine Pitrou wrote:

...

Why don't you simply monkeypatch sys.modules, e.g.:

import myhttplib

sys.modules['http'] = myhttplib

or doesn't it work as desired?

Ionel Cristian Mărieș

6:38 p.m.

You could do the sys.modules patch as Antoine suggested in a .pth file, so that it's triggered at startup. Eg, very similar: https://github.com/xando/subprocess.run/blob/ab02d165802b2ad57dd0d16c1169ab0... Thanks, -- Ionel Cristian Mărieș, blog.ionelmc.ro On Wed, Jan 14, 2015 at 6:32 PM, Demian Brecht <demianbrecht@gmail.com> wrote:
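As a rough illustration of the .pth approach: the `site` module executes any line in a .pth file that starts with `import`, which is what makes a startup-time patch possible. The sketch below writes such a file into a temporary directory and processes it with `site.addsitedir` instead of touching a real site-packages directory. The file name `demo_override.pth` and the module name `demo_pth_target` are made up for the example.

```python
import os
import site
import sys
import tempfile

# One line, starting with "import", so that site executes it.
PTH_LINE = (
    "import sys, types; "
    "sys.modules.setdefault('demo_pth_target', types.ModuleType('demo_pth_target'))\n"
)

with tempfile.TemporaryDirectory() as site_dir:
    pth_path = os.path.join(site_dir, "demo_override.pth")
    with open(pth_path, "w", encoding="utf-8") as handle:
        handle.write(PTH_LINE)

    # Process the directory the way site-packages is processed at startup.
    site.addsitedir(site_dir)
    patched = "demo_pth_target" in sys.modules

    # addsitedir() also appended the directory to sys.path; undo that.
    sys.path[:] = [
        entry for entry in sys.path
        if os.path.abspath(entry) != os.path.abspath(site_dir)
    ]

# Remove the throwaway module registered by the .pth line.
sys.modules.pop("demo_pth_target", None)
```

In a real deployment the .pth file would live in site-packages and its import line would install the replacement module, so the patch is in place before application code runs.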

...

Hi all,

As part of the work I'm doing on httplib3 (now that I've actually gotten a bit of time), one of the things I'm trying to get done is injection of httplib3 over http in order to not have to modify all import paths in modules and such. Here's the gist of what I have so far: https://gist.github.com/demianbrecht/bc6530a40718e4fcbf90.

It's greatly simplified over importlib2's inject mechanism, but I'm assuming that's largely due to requirements of that package (i.e. Python 2) in contrast to this one.

My questions are: Does this look sane? Is there anything that I might be not accounting for? It /does/ seem to work as expected when running tests, but I'm curious if there's anything that I might be missing that might jump out at someone more intimately familiar with the mechanics of importlib.

Thanks, Demian

Guido van Rossum

7:35 p.m.

Why do you want to hack the existing http modules? This is not a rhetorical question. The answer may lead us to redesign the existing http modules to be more flexible so that the higher-level problem you are trying to solve by hacking http import can be solved instead by using an interface provided by the stdlib http module. -- --Guido van Rossum (python.org/~guido)

Demian Brecht

8:07 p.m.

On 2015-01-14 11:35 AM, Guido van Rossum wrote:

...

Why do you want to hack the existing http modules?

This is not a rhetorical question. The answer may lead us to redesign the existing http modules to be more flexible so that the higher-level problem you are trying to solve by hacking http import can be solved instead by using an interface provided by the stdlib http module.

Sorry, this venture began in core-mentorship, so a little context may be of use: My end goal is to become a maintainer of the http package. As I'm not a core dev, Nick had suggested making a friendly fork of the package in order to facilitate progress without being bound to the non-core dev contributor workflow (it can, at times, be a little painful getting reviews and such completed on orphaned packages).

So, the question that I was trying to answer isn't directly related to the http package in particular, but how to override stdlib modules in general with a third party package in order to facilitate out of band development while making minimal changes to package code (i.e. changing all absolute import package names in test and module code) to ease upstream merging.

That all said, this would likely be a moot issue if I had commit privileges ;) But it might be nice to figure out a good workflow should this come up again with any other new contributors looking to take ownership of an orphaned module.

Guido van Rossum

8:25 p.m.

Aha. Glad I asked. You would arguably get a more useful response if you asked on core-mentorship and explained some of that background (for those of us who rely on external memory :-). The stdlib intentionally makes what you are trying to do hard (so library writers don't have to worry about stdlib modules being overridden with hacks at the whim of other library writers or app writers). I'm not sure how commit privileges would help you -- can't you just fork the CPython (I'm sure there's already a Bitbucket mirror that you can fork easily) and do your work there? Even with commit privileges you wouldn't be committing partial work unreviewed. On Wed, Jan 14, 2015 at 12:07 PM, Demian Brecht <demianbrecht@gmail.com> wrote:

...

On 2015-01-14 11:35 AM, Guido van Rossum wrote:

...
Why do you want to hack the existing http modules?

This is not a rhetorical question. The answer may lead us to redesign the existing http modules to be more flexible so that the higher-level problem you are trying to solve by hacking http import can be solved instead by using an interface provided by the stdlib http module.

Sorry, this venture began in core-mentorship, so a little context may be of use: My end goal is to become a maintainer of the http package. As I'm not a core dev, Nick had suggested making a friendly fork of the package in order to facilitate progress without being bound to the non-core dev contributor workflow (it can, at times, be a little painful getting reviews and such completed on orphaned packages).

So, the question that I was trying to answer isn't directly related to the http package in particular, but how to override stdlib modules in general with a third party package in order to facilitate out of band development while making minimal changes to package code (i.e. changing all absolute import package names in test and module code) to ease upstream merging.

That all said, this would likely be a moot issue if I had commit privileges ;) But it might be nice to figure out a good workflow should this come up again with any other new contributors looking to take ownership of an orphaned module.

-- --Guido van Rossum (python.org/~guido)

Demian Brecht

9 p.m.

On 2015-01-14 12:25 PM, Guido van Rossum wrote:

...

I'm not sure how commit privileges would help you -- can't you just fork the CPython (I'm sure there's already a Bitbucket mirror that you can fork easily) and do your work there? Even with commit privileges you wouldn't be committing partial work unreviewed.

The friendly module fork allows for others to easily (or at least the intention is to do it easily) use the module with the new, backwards compatible features as a drop in replacement for the stdlib module. Giving others the ability to do this would lend itself to the adoption of the module and bug reports and such before upstream patches are produced. That said, the main downside to the friendly fork is the patch submission process: After changes have been merged to the fork, there's bound to be churn during the upstream patch submission, which would likely lead to something that looks like:

...

Implement feature/bug fix [1]
Commit changes to httplib3
Generate patch for CPython
Import patch to local CPython
Run unit tests [1]
Generate hg patch (patchA) for submission to bug tracker
Upload patchA
patchA is reviewed
Implement review changes and generate patchB [1]
Upload patchB
[...wait for merge...]
Merge delta of patchB and patchA to httplib3
Test/upload new PyPI package

I see commit privileges helping in two ways:

1. I've experienced lag on a few occasions between review and merge. I'm assuming that this is largely due to a lack of dotted line maintainer of the http package (although I believe that the general consensus is that Senthil is the de facto maintainer of the package). Commit privileges would help in getting the patches merged once reviews are complete.

2. It would help my own workflow. While feature development can be done in httplib3, I do also tend to swap between issues in the bug tracker and large feature work. Because I have two lines of work (CPython/bug tracker and Github), I run into issues around where these changes should be made: Should the bug fixes live in CPython/bug tracker or should I fix the issue in httplib3 and go through the submission workflow above? Either way, I'm signing myself up for a good deal of headache managing the httplib3 work, especially when development work across feature branches is dependent on patches submitted to CPython.

I definitely don't mind the extra work if there are no other options, but my end goal is to be a maintainer of the http package and core developer, not to maintain a third party fork.

Brett Cannon

9:19 p.m.

On Wed Jan 14 2015 at 4:08:52 PM Demian Brecht <demianbrecht@gmail.com> wrote:

...

On 2015-01-14 12:25 PM, Guido van Rossum wrote:

...
I'm not sure how commit privileges would help you -- can't you just fork the CPython (I'm sure there's already a Bitbucket mirror that you can fork easily) and do your work there? Even with commit privileges you wouldn't be committing partial work unreviewed.

The friendly module fork allows for others to easily (or at least the intention is to do it easily) use the module with the new, backwards compatible features as a drop in replacement for the stdlib module.

But as Guido pointed out, we _like_ it being difficult to do because we don't want this kind of substitution happening as code ends up depending on bugs and quirks that you may fix.

...

Giving others the ability to do this would lend itself to the adoption of the module and bug reports and such before upstream patches are produced.

That said, the main downside to the friendly fork is the patch submission process: After changes have been merged to the fork, there's bound to be churn during the upstream patch submission, which would likely lead to something that looks like:

...
Implement feature/bug fix [1]
Commit changes to httplib3
Generate patch for CPython
Import patch to local CPython
Run unit tests [1]
Generate hg patch (patchA) for submission to bug tracker
Upload patchA
patchA is reviewed
Implement review changes and generate patchB [1]
Upload patchB
[...wait for merge...]
Merge delta of patchB and patchA to httplib3
Test/upload new PyPI package

I see commit privileges helping in two ways:

1. I've experienced lag on a few occasions between review and merge. I'm assuming that this is largely due to a lack of dotted line maintainer of the http package (although I believe that the general consensus is that Senthil is the de facto maintainer of the package). Commit privileges would help in getting the patches merged once reviews are complete.

2. It would help my own workflow. While feature development can be done in httplib3, I do also tend to swap between issues in the bug tracker and large feature work. Because I have two lines of work (CPython/bug tracker and Github), I run into issues around where these changes should be made: Should the bug fixes live in CPython/bug tracker or should I fix the issue in httplib3 and go through the submission workflow above? Either way, I'm signing myself up for a good deal of headache managing the httplib3 work, especially when development work across feature branches is dependent on patches submitted to CPython.

I definitely don't mind the extra work if there are no other options, but my end goal is to be a maintainer of the http package and core developer, not to maintain a third party fork.

How many other modules are dependent on the http module in the stdlib that are going to be affected by your changes? One option is you fork http **and** any modules in the stdlib that are dependent on it. You don't really have to change the other modules beyond their import statement of using http -- you can even do `import http3 as http` or something to minimize the changes -- but you at least don't have to monkeypatch sys.modules for others to gain from your http changes. Plus as you patch stuff in http you may find you have/want to patch other dependent modules as well and so you will have already done that.
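One way the import-aliasing idea above might look in a forked dependent module is sketched here. It goes slightly beyond the suggestion by falling back to the stdlib package when the fork is not installed. `load_http` and the package name `my_http_fork` are hypothetical, chosen so as not to collide with any real distribution.

```python
import importlib


def load_http(preferred="my_http_fork"):
    """Return the forked http package if it is installed, else the stdlib one."""
    try:
        return importlib.import_module(preferred)
    except ImportError:
        return importlib.import_module("http")


# The rest of the module keeps referring to the name `http`, so the only
# line that differs from the stdlib version is this import.
http = load_http()
```

Because nothing is written to `sys.modules` under the name `http`, code elsewhere in the process that imports the stdlib package is unaffected.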

Donald Stufft

9:35 p.m.

...

On Jan 14, 2015, at 4:19 PM, Brett Cannon <brett@python.org> wrote:

But as Guido pointed out, we _like_ it being difficult to do because we don't want this kind of substitution happening as code ends up depending on bugs and quirks that you may fix.

Not all of us, I hate the default order of sys.path. --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

Demian Brecht

9:58 p.m.

On 2015-01-14 1:19 PM, Brett Cannon wrote:

...

But as Guido pointed out, we _like_ it being difficult to do because we don't want this kind of substitution happening as code ends up depending on bugs and quirks that you may fix.

I can understand the reasoning.

...

How many other modules are dependent on the http module in the stdlib that are going to be affected by your changes? One option is you fork http **and** any modules in the stdlib that are dependent on it. You don't really have to change the other modules beyond their import statement of using http -- you can even do `import http3 as http` or something to minimize the changes -- but you at least don't have to monkeypatch sys.modules for others to gain from your http changes. Plus as you patch stuff in http you may find you have/want to patch other dependent modules as well and so you will have already done that.

It looks like there are 5 other modules dependent on the http package. If I understand what you're proposing, it pretty much defeats the purpose of what I'm trying to accomplish with a standalone httplib3 package. That said, considering the points that you and Guido have both made, I think that the best course of action is to either just fork CPython as a whole or to continue with httplib3 but abandon overriding sys.modules, develop features detached from the stdlib and worry about fixing dependencies when integrating changes upstream.
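The message does not say how the dependent modules were counted. One possible way to reproduce such a count is to scan the stdlib sources for imports of the package, as sketched below. `imports_package` and `stdlib_importers` are hypothetical helper names, and the result will vary by Python version.

```python
import ast
import pathlib
import sysconfig


def imports_package(source, package):
    """Return True if `source` imports `package` or one of its submodules."""
    prefix = package + "."
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            names = [node.module]
        else:
            continue  # not an absolute import statement
        if any(name == package or name.startswith(prefix) for name in names):
            return True
    return False


def stdlib_importers(package):
    """List stdlib source files, outside `package` itself, that import it."""
    stdlib = pathlib.Path(sysconfig.get_path("stdlib"))
    hits = []
    for path in sorted(stdlib.rglob("*.py")):
        relative = path.relative_to(stdlib)
        # Skip the package itself, the test suite and third-party installs.
        if relative.parts[0] in (package, "test", "site-packages"):
            continue
        try:
            source = path.read_text(encoding="utf-8")
            if imports_package(source, package):
                hits.append(str(relative))
        except (SyntaxError, UnicodeDecodeError, ValueError):
            continue  # unreadable or deliberately invalid source file
    return hits
```

Calling `stdlib_importers("http")` walks the whole standard library directory, so it can take a few seconds.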

Brett Cannon

15 Jan

4:30 p.m.

On Wed Jan 14 2015 at 4:58:20 PM Demian Brecht <demianbrecht@gmail.com> wrote:

...

On 2015-01-14 1:19 PM, Brett Cannon wrote:

...
But as Guido pointed out, we _like_ it being difficult to do because we don't want this kind of substitution happening as code ends up depending on bugs and quirks that you may fix.

I can understand the reasoning.

...
How many other modules are dependent on the http module in the stdlib that are going to be affected by your changes? One option is you fork http **and** any modules in the stdlib that are dependent on it. You don't really have to change the other modules beyond their import statement of using http -- you can even do `import http3 as http` or something to minimize the changes -- but you at least don't have to monkeypatch sys.modules for others to gain from your http changes. Plus as you patch stuff in http you may find you have/want to patch other dependent modules as well and so you will have already done that.

It looks like there are 5 other modules dependent on the http package. If I understand what you're proposing, it pretty much defeats the purpose of what I'm trying to accomplish with a standalone httplib3 package.

That said, considering the points that you and Guido have both made, I think that the best course of action is to either just fork CPython as a whole or to continue with httplib3 but abandon overriding sys.modules, develop features detached from the stdlib and worry about fixing dependencies when integrating changes upstream.

If I were you I would fork and then for bugfixes send them upstream to us while you develop API additions independently. That way if your fork gains traction you can come to us and say "my fork has a stable API, has existed for (at least) a year, and the community seems to have rallied behind it", at which point we can look at drawing it in. And if you fix enough bugs we might make you maintainer anyway while you work out API design with the community outside of the stdlib.

Nick Coghlan

24 Jan

7:57 a.m.

On 15 January 2015 at 07:35, Donald Stufft <donald@stufft.io> wrote:

...

On Jan 14, 2015, at 4:19 PM, Brett Cannon <brett@python.org> wrote:

But as Guido pointed out, we _like_ it being difficult to do because we don't want this kind of substitution happening as code ends up depending on bugs and quirks that you may fix.

Not all of us, I hate the default order of sys.path.

It's mostly an opinion that arises from debugging other people's problems after they've managed to import the wrong thing without realising it (cf. the "don't use 'socket.py' as the name of your script for learning about how TCP sockets work" problem). We're aware that annoys power users, but they're far better equipped to handle the problem than if we inverted the situation. Cheers, Nick. -- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
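The "socket.py" problem can be reproduced in miniature. The sketch below assumes a directory at the front of `sys.path` (as the script directory normally is) containing a file named after a stdlib module. It uses `colorsys` as a harmless stand-in for `socket` and restores the interpreter state afterwards.

```python
import importlib
import os
import sys
import tempfile

NAME = "colorsys"  # small stdlib module standing in for "socket"

# Forget any previously imported copy so the import below is a fresh lookup.
saved_module = sys.modules.pop(NAME, None)

with tempfile.TemporaryDirectory() as script_dir:
    # A user file that happens to share its name with a stdlib module.
    with open(os.path.join(script_dir, NAME + ".py"), "w", encoding="utf-8") as handle:
        handle.write("SHADOW = True\n")

    sys.path.insert(0, script_dir)  # mimics the script directory at sys.path[0]
    importlib.invalidate_caches()
    try:
        shadow = importlib.import_module(NAME)
        is_shadow = getattr(shadow, "SHADOW", False)
        shadow_file = shadow.__file__
    finally:
        # Restore sys.path and sys.modules to their previous state.
        sys.path.remove(script_dir)
        sys.modules.pop(NAME, None)
        if saved_module is not None:
            sys.modules[NAME] = saved_module
```

The imported module is the user file, not the stdlib one, and nothing in the import itself warns about the substitution.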

Donald Stufft

3:17 p.m.

...

On Jan 24, 2015, at 2:57 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:

On 15 January 2015 at 07:35, Donald Stufft <donald@stufft.io> wrote:

...
On Jan 14, 2015, at 4:19 PM, Brett Cannon <brett@python.org> wrote:

But as Guido pointed out, we _like_ it being difficult to do because we don't want this kind of substitution happening as code ends up depending on bugs and quirks that you may fix.

Not all of us, I hate the default order of sys.path.

It's mostly an opinion that arises from debugging other people's problems after they've managed to import the wrong thing without realising it (cf. the "don't use 'socket.py' as the name of your script for learning about how TCP sockets work" problem). We're aware that annoys power users, but they're far better equipped to handle the problem than if we inverted the situation.

It’s not just power users that it’s good for, it makes it harder for even beginners to use things like backports of modules. For example unittest2 and explaining to people the difference between unittest and unittest2 and that unittest2 isn’t actually any different than unittest on newer versions of Python.

Or, for example, PEP 453 could have been like 100x better if it would have been reasonable to just add pip to the stdlib but still enabling the ability to install an upgraded version of it that would take precedence.

Or you have things like pdb++ which needs to replace the pdb import because a lot of tools only have a flag like --pdb and do not provide a way to switch it to a different import. The sys.path ordering means that pdb++ has to do hacks in its setup.py[1] which means it won’t be compatible with Wheel files or with a world where sdists don’t use a setup.py.

The current situation is that if you install something as an egg (which setuptools does by default anyways) then setuptools will put it before the stdlib and it’ll take precedence. This is a nice situation because it means that if you do run into a problem then it’s easier to debug because ``python -c "import module; print(module.__file__)"`` will always return the same answer in the “broken” environment. The alternative is often either a different name (which confuses people as to the relation) or monkey patching which means that module.__file__ might either be wrong if they just monkey patched the file and it always means that the behavior is going to change depending on what you’ve imported which is way more confusing than being able to override the stdlib.

--- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
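The debugging step mentioned above (asking where a module comes from) can also be done without importing the module, using `importlib.util.find_spec`. This is a small sketch; `resolve_origin` is a hypothetical helper name.

```python
import importlib.util
import sys


def resolve_origin(name):
    """Return the location a top-level import of `name` would load from."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# Where would "pdb" come from in this environment?
print("pdb resolves to:", resolve_origin("pdb"))

# The search order that decides which copy wins when a name exists twice.
for index, entry in enumerate(sys.path):
    print(index, entry)
```

If a replacement was installed by patching `sys.modules` instead of through `sys.path`, the reported origin reflects whatever spec the patched module carries, which is the inconsistency described in the message.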

Donald Stufft

3:21 p.m.

...

On Jan 24, 2015, at 10:17 AM, Donald Stufft <donald@stufft.io> wrote:

Or you have things like pdb++ which needs to replace the pdb import because a lot of tools only have a flag like --pdb and do not provide a way to switch it to a different import. The sys.path ordering means that pdb++ has to do hacks in its setup.py[1] which means it won’t be compatible with Wheel files or with a world where sdists don’t use a setup.py.

Sorry, forgot to link this https://bitbucket.org/antocuni/pdb/src/4669c3747a396e3766173feb40ebece32ab08... --- Donald Stufft PGP: 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA

Demian Brecht

26 Jan

7:49 p.m.

On 2015-01-24 7:17 AM, Donald Stufft wrote:

...

It’s not just power users that it’s good for, it makes it harder for even beginners to use things like backports of modules.

What about cases where new module versions are put in as dependencies of other packages and they stomp standard library packages unbeknownst to the user installing the higher level package? For example, let's say packageB overrides stdlib's packageA. packageC requires packageB, which stomps packageA at import time. Now, author of packageD requires packageC but is unaware of the fact that packageB overrides packageA, but heavily uses packageA directly and expects the stdlib behavior, not the modified behavior in packageB. (Hope I got the hierarchy right in that description ;)) This would likely cause unexpected behavior and I can only assume that it would likely be quite difficult to track down, even for a power user. The same logic applies to unrelated stdlib modules that depend on the stdlib behavior of packageA as Brett pointed out. As someone who's recently faced the problem, while making it easier would have been immediately beneficial to me as the module author, I can understand the reasoning behind making this a difficult thing to do. I /do/ think that it might be worthwhile to invest some time in making it easier to do while still satisfying the safety of other packages, but I would venture to say it would definitely be non-trivial.
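The dependency chain described above can be simulated with four throwaway modules. In this sketch `demo_a` plays the stdlib packageA, `demo_b` the overriding packageB, `demo_c` the intermediate packageC and `demo_d` the unsuspecting packageD. All names and the `behavior` function are hypothetical.

```python
import importlib
import os
import sys
import tempfile
import textwrap

SOURCES = {
    # Stand-in for the stdlib package.
    "demo_a": "def behavior():\n    return 'stdlib'\n",
    # Stand-in for the fork: replaces demo_a as a side effect of being imported.
    "demo_b": textwrap.dedent("""\
        import sys
        import types

        _fork = types.ModuleType("demo_a")
        _fork.behavior = lambda: "fork"
        sys.modules["demo_a"] = _fork
    """),
    # Depends on the fork, so importing it triggers the override.
    "demo_c": "import demo_b\n",
    # Uses demo_a directly and expects the stdlib behaviour.
    "demo_d": "import demo_c\nimport demo_a\nRESULT = demo_a.behavior()\n",
}

with tempfile.TemporaryDirectory() as root:
    for name, source in SOURCES.items():
        with open(os.path.join(root, name + ".py"), "w", encoding="utf-8") as handle:
            handle.write(source)

    sys.path.insert(0, root)
    importlib.invalidate_caches()
    try:
        demo_d = importlib.import_module("demo_d")
        observed = demo_d.RESULT
    finally:
        # Remove the demo modules and path entry again.
        sys.path.remove(root)
        for name in SOURCES:
            sys.modules.pop(name, None)

print(observed)  # prints "fork": demo_d never saw the stdlib behaviour
```

Nothing in `demo_d` mentions `demo_b`, which is what would make the changed behaviour hard to trace in a real dependency tree.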
