Python including packages
Hey all, since this talk http://pyfound.blogspot.com/2019/05/amber-brown-batteries-included-but.html on how useful standard libraries are, the topic has come up in multiple channels. I just wanted to present my idea on it. Why not keep the essentials (ensurepip) and strip out everything else? When someone imports a package like datetime, we can catch the error (ImportError) and install it. Or something similar. Please let me know why this might or might not be a good option, so that we can come up with better solutions to these issues. Thank you!
On 08Jul2019 11:40, Siddharth Prajosh wrote:
Why not keep the essentials (ensurepip) and strip off everything else. When someone imports a package like datetime, we can catch the error (ImportError) and install it. Or something similar.
Are you thinking this happens at runtime? And is your objective to ship
a much smaller Python standard library and load whatever is actually
required as discovered?
The usual difficulty is that there's no general way to fetch packages in
every environment. The obvious case is the offline environment, with
no network access.
Another trickiness is that while we usually try to not conditionally
import stuff, sometimes that happens. Which means you might run your
programme and autoimport most things, but still miss something which
only gets imported in a special circumstance.
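A minimal sketch of such a conditional import, to make the point concrete. The function name `export` and its `fmt` parameter are hypothetical, chosen only for illustration. A trial run that only ever uses the default format would never reveal that `csv` and `io` are needed:

```python
def export(rows, fmt="text"):
    """Render rows as text; hypothetical helper used only for illustration."""
    if fmt == "csv":
        # These imports happen only on this branch, so a test run that
        # never asks for CSV output would not trigger (or auto-install) them.
        import csv
        import io

        buf = io.StringIO()
        csv.writer(buf).writerows(rows)
        return buf.getvalue()
    return "\n".join(" ".join(str(cell) for cell in row) for row in rows)
```

Calling `export(rows)` succeeds without ever touching `csv`; only `export(rows, fmt="csv")` does, possibly long after deployment.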
_However_, there's something to be said for the convenience.
Had you considered writing a module which plugs into the import
machinery to auto-pip-install on ImportError? Then you could test your
ideas.
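For experimenting, something like the following might serve as a starting point. It is only a sketch under stated assumptions, not a recommended implementation: the names `AutoInstallFinder`, `pip_install` and `enable` are made up here, and it assumes the module name equals the package name on the index, which is often not true. It appends a finder to the end of `sys.meta_path`, so it is consulted only after the normal finders have failed.

```python
import importlib
import importlib.abc
import subprocess
import sys


def pip_install(name):
    """Run pip for the current interpreter; return True on success.

    Assumption: the module name is also the distribution name.
    """
    result = subprocess.run([sys.executable, "-m", "pip", "install", name])
    return result.returncode == 0


class AutoInstallFinder(importlib.abc.MetaPathFinder):
    """Hypothetical last-resort finder: tries to install what is missing."""

    def __init__(self, installer=pip_install):
        self.installer = installer
        self.attempted = set()  # names already tried, to avoid retry loops

    def find_spec(self, fullname, path=None, target=None):
        # Only handle top-level names, and only once per name.
        if "." in fullname or fullname in self.attempted:
            return None
        self.attempted.add(fullname)
        if not self.installer(fullname):
            return None  # the usual ModuleNotFoundError is raised
        importlib.invalidate_caches()
        # Ask the regular finders again now that the package may exist.
        for finder in sys.meta_path:
            if finder is self:
                continue
            find_spec = getattr(finder, "find_spec", None)
            spec = find_spec(fullname, path, target) if find_spec else None
            if spec is not None:
                return spec
        return None


def enable(installer=pip_install):
    """Append the finder to sys.meta_path and return it."""
    finder = AutoInstallFinder(installer)
    sys.meta_path.append(finder)
    return finder
```

After `enable()`, a failing `import foo` would trigger `pip install foo` and then retry. The `installer` argument is there so the idea can be tried with a stub instead of real network access. Note that this runs whatever gets installed, which is exactly the security concern.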
Finally, there are some security considerations.
A prime cause for an import error is simply misspelling a module name.
If that misspelling matches a known module, that gets fetched. AND RUN.
If the module used in error is malicious that's a really nasty failure
mode. Even a module with a similar name and similar but not identical
semantics could cause undesired (e.g. damaging, or just silently buggy)
behaviour for the user.
There have been real world examples of malicious packages put into
package repositories. If I recall (and my memory is fuzzy here), quite a
few in the JavaScript world and I think there was a known one in the
PyPI repo.
Leaving aside the "use a likely misspelling" situation, the other
situation is where a known module is withdrawn and a malicious person
installs something evil under the previously trustworthy name.
These issues make me cautious about automatically importing anything
that seems to be missing.
I'm more comfortable treating ImportErrors as stuff to inspect. Perhaps
I misspelled something. Perhaps I've failed to install something
important. Perhaps I'm using a feature I didn't really plan to install.
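As a rough illustration of inspecting rather than auto-installing, the sketch below reports a missing module and suggests near matches, and never fetches anything. The helper name `checked_import` and the `KNOWN_MODULES` list are hypothetical; a real list would come from whatever you have actually decided to trust.

```python
import difflib
import importlib

# Illustrative allowlist only; not a real or complete set of trusted names.
KNOWN_MODULES = ["datetime", "json", "requests", "subprocess"]


def checked_import(name, known=KNOWN_MODULES):
    """Import `name`, or fail with a hint about likely misspellings."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError as exc:
        if exc.name != name:
            raise  # something that `name` depends on is missing instead
        # Suggest close matches from the trusted list; install nothing.
        suggestions = difflib.get_close_matches(name, known, n=3, cutoff=0.8)
        hint = f" Did you mean: {', '.join(suggestions)}?" if suggestions else ""
        raise ImportError(
            f"No module named {name!r}.{hint} Nothing was installed.",
            name=name,
        ) from exc
```

With this, `checked_import("datetmie")` raises an ImportError that points at `datetime`, leaving the decision to a human.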
Cheers,
Cameron Simpson
Thanks, Cameron Simpson, for the feedback!
The security issue you mentioned is really serious, and I hadn't
thought about it. I do this a lot for my side projects and the random
stuff I automate, which is why I suggested it.
Again, thanks for taking the time.
Please don't!
The Python standard library could of course be improved with such a model, but
the costs outweigh the benefits by a large margin.
First, this proposal is not backward compatible. Basically, it will break
any script that uses such a library for the first time when no internet
connection is available.
Second, distributing the standard library with Python is a solved problem.
A new solution will bring new "bugs", in particular on Debian at least, which
has its own way of distributing Python; if such a feature is introduced, it will
create incompatibilities with other Python distributions. The existing "bug" I'm
thinking of is that by default Debian doesn't ship tkinter, which breaks any
installation instructions given by a software developer.
Third, the stdlib is a real strength of Python, and even if it has
some limitations, it does the job well enough in a large number of cases. To
my knowledge, the only significant weakness of the stdlib is that requests
does a far better job than urllib, but urllib already does a very good job
and is definitely usable.
Fourth, it is a leap into the unknown whose consequences will probably only
become apparent when it is too late to go back, and as such it should be
approached very carefully.
Fifth, as far as I understand, the core reason for the initial talk is that
Twisted will support Python 2 forever, and as such demands that the whole
Python ecosystem find workarounds to allow this attitude. Breaking
compatibility with Python 2 was a failure, but now we must live with it. If I
understand correctly, the path chosen is to help migration as much as
possible, to force developers to migrate if they want either new features or
security updates, and to leave Python 2 users with no supported solution if they
want to stick with Python 2. As such, asking that the future of Python be
driven by keeping Python 2 compatibility forever just goes against all the
effort made to cope with the Python 2 to Python 3 transition, and will basically
be a burden forever on the evolution of the whole Python ecosystem. In my reading,
pretty much all of the talk is just as weak, but my mail is already very long.
On 8 Jul 2019, at 12:19, Xavier Combelle wrote:
Fifth, as far as I understand, the core reason for the initial talk is that Twisted will support Python 2 forever, and as such demands that the whole Python ecosystem find workarounds to allow this attitude.
As I understand it, Twisted has done its last feature release for Python 2, and that branch will only get critical fixes going forward. The Python 3 version will be where the new features land.
Barry
participants (4): Barry Scott, Cameron Simpson, Siddharth Prajosh, Xavier Combelle