Python including packages

Hey all, after this talk <http://pyfound.blogspot.com/2019/05/amber-brown-batteries-included-but.html> on how useful the standard library is, the topic has been discussed in multiple channels. I just wanted to present my idea on the same. Why not keep the essentials (ensurepip) and strip out everything else? When someone imports a package like datetime, we can catch the error (ImportError) and install it. Or something similar. Please do let me know your feedback on why this might or might not be a good option so that we can come up with better solutions for these issues. Thank you!
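
For illustration, the "catch the ImportError and install it" idea could be sketched roughly as below. This is only a minimal sketch, not a proposal for how CPython would do it: the names `pip_install` and `import_or_install` are hypothetical, and it assumes the import name is also the name of the distribution on the package index, which is not true in general.

```python
import importlib
import subprocess
import sys


def pip_install(name):
    """Install `name` with pip into the running interpreter's environment."""
    # Assumption: the import name is also the distribution name on the index.
    subprocess.check_call([sys.executable, "-m", "pip", "install", name])


def import_or_install(name, installer=pip_install):
    """Import `name`; on ImportError, call `installer(name)` and retry once."""
    try:
        return importlib.import_module(name)
    except ImportError:
        installer(name)
        # Let the import system notice files that were just written to disk.
        importlib.invalidate_caches()
        return importlib.import_module(name)
```

Something like `dt = import_or_install("datetime")` would then stand in for `import datetime`. The installer is a parameter so the behaviour can be tried out without touching the network.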

On 08Jul2019 11:40, Siddharth Prajosh <sprajosh@gmail.com> wrote:
Are you thinking this happens at runtime? And is your objective to ship a much smaller Python standard library and load whatever is actually required as it is discovered?

The usual difficulty is that there's no general way to fetch packages in every environment. The obvious case is the offline environment, with no network access.

Another trickiness is that while we usually try not to import stuff conditionally, sometimes that happens. Which means you might run your programme and autoimport most things, but still miss something which only gets imported in a special circumstance.

_However_, there's something to be said for the convenience. Had you considered writing a module which plugs into the import machinery to auto-pip-install on ImportError? Then you could test your ideas.

Finally, there are some security considerations. A prime cause of an import error is simply misspelling a module name. If that misspelling matches a known module, that gets fetched. AND RUN. If the module used in error is malicious, that's a really nasty failure mode. Even a module with a similar name and similar but not identical semantics could cause undesired (eg damaging, or just silently buggy) behaviour for the user.

There have been real world examples of malicious packages put into package repositories. If I recall (and my memory is fuzzy here), quite a few in the JavaScript world, and I think there was a known one in the PyPI repo.

Leaving aside the "use a likely misspelling" situation, the other situation is where a known module is withdrawn and a malicious person installs something evil under the previously trustworthy name.

These issues make me cautious about automatically importing anything that seems to be missing. I'm more comfortable treating ImportErrors as stuff to inspect. Perhaps I misspelled something. Perhaps I've failed to install something important. Perhaps I'm using a feature I didn't really plan to install.

Cheers,
Cameron Simpson <cs@cskk.id.au>
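
A module that plugs into the import machinery, as suggested above, might look something like the following. It is a hedged sketch for experimenting, not a recommended implementation: `AllowlistInstallFinder` and `pip_install` are made-up names, the allowlist is one possible way to limit the misspelling risk described above (it does nothing about a trusted name being taken over), and it again assumes import names equal distribution names.

```python
import importlib
import importlib.abc
import importlib.util
import subprocess
import sys


def pip_install(name):
    """Install `name` with pip into the running interpreter's environment."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", name])


class AllowlistInstallFinder(importlib.abc.MetaPathFinder):
    """Last-resort finder: install a missing top-level module, then retry.

    Only names in `allowed` are ever installed, so a misspelled import
    still fails with ImportError instead of fetching an unknown package.
    """

    def __init__(self, allowed, installer=pip_install):
        self.allowed = frozenset(allowed)
        self.installer = installer
        self._busy = False  # guards against re-entering ourselves

    def find_spec(self, fullname, path=None, target=None):
        # `path` is None only for top-level imports; submodules are skipped.
        if self._busy or path is not None or fullname not in self.allowed:
            return None
        self._busy = True
        try:
            self.installer(fullname)
            importlib.invalidate_caches()
            # Ask the regular finders again now that the files exist.
            return importlib.util.find_spec(fullname)
        finally:
            self._busy = False


# Hypothetical usage: append, so this runs only after every normal finder
# has failed to locate the module.
# sys.meta_path.append(AllowlistInstallFinder({"requests"}))
```

Appending to `sys.meta_path` rather than inserting at the front means installed modules are still found the usual way, and the finder is consulted only for imports that would otherwise raise ImportError.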

Thanks, Cameron Simpson, for the feedback! The security issue you mentioned is really serious, and it's something I hadn't thought about. I do this a lot for my side projects and the random stuff I automate, hence the suggestion. Again, thanks for taking the time.

On Mon, Jul 8, 2019 at 1:14 PM Cameron Simpson <cs@cskk.id.au> wrote:
Another trickiness is that while we usually try to not conditionally

Please don't! The Python standard library could of course be improved with such a model, but the cost outweighs the benefit by a large margin.

First, this proposal is not backward compatible. Basically, it will break any script that uses such a library for the first time when no internet connection is available.

Second, distributing the standard library with Python is a solved problem. A new solution will bring new "bugs", in particular on Debian at least, which has its own way of distributing Python; if such a feature is introduced, it will create incompatibilities with other kinds of Python. The current "bug" I'm thinking of is that by default Debian doesn't ship tkinter, which breaks any installation instructions given by a software developer.

Third, the stdlib is a real strength of Python, and even if it has some limitations, it does the job well enough for a large number of cases. To my knowledge, the only significant weakness of the stdlib is that requests does a far better job than urllib, but urllib already does a very good job and is definitely usable.

Fourth, it is a jump into the unknown whose consequences will probably only become apparent when it is too late to go back, and as such it should be taken very carefully.

Fifth, as far as I understand, the core reason for the initial talk is that Twisted will support Python 2 forever, and as such demands that the whole Python ecosystem find workarounds to allow this attitude. Breaking compatibility with Python 2 was a failure, but now we must live with it. If I understand correctly, the path chosen is to help migration as much as possible, to force developers to migrate if they want either new features or security updates, and to leave Python 2 users with no supported solution if they want to stick with Python 2. As such, asking that the future of Python be driven by keeping Python 2 compatibility forever goes against all the effort to cope with the Python 2 to Python 3 transition, and will basically be a burden forever on the evolution of the whole Python ecosystem.

In my reading, pretty much all of the talk is just as weak, but my mail is already very long.

On Mon, 8 Jul 2019 at 08:13, Siddharth Prajosh <sprajosh@gmail.com> wrote:

On 8 Jul 2019, at 12:19, Xavier Combelle <xavier.combelle@gmail.com> wrote:
Fifth, as far as I understand, the core reason for the initial talk is that Twisted will support Python 2 forever, and as such demands that the whole Python ecosystem find workarounds to allow this attitude.

As I understand it, Twisted has done its last feature release for Python 2, and that branch will only get critical fixes going forward. The Python 3 version is where the new features will land.

Barry

participants (4): Barry Scott, Cameron Simpson, Siddharth Prajosh, Xavier Combelle