Hi pythonistas. A couple of months ago I opened an issue in the bug tracker for adding a new syscall to the os module. It's based on new developments in the Linux kernel. Here's the link:
https://bugs.python.org/issue26826
After two and a half months I managed to create a nice patch with unit tests and documentation (yay!), but then the issue went cold. I would like to know how to proceed to get the issue going again.
Cheers,
-- Marcos.
-- (Not so) Random fortune: Terrorism isn't a crime against people or property. It's a crime against our minds, using the death of innocents and destruction of property to make us fearful. -- Bruce Schneier
I wonder if the issue isn't that there are so many Linux syscalls that we probably should have a process for deciding which ones are worth supporting in the os module, and that process should not necessarily start with a patch review. What fraction of Linux syscalls do we currently support? What fraction of BSD syscalls? How much of this is better served by a 3rd party module? Certainly it's not rocket science to write a C extension module that wraps a syscall or a bunch of them.
On Wed, Aug 3, 2016 at 10:23 AM, Marcos Dione <mdione@grulic.org.ar> wrote:
Hi pythonistas. A couple of months ago I opened an issue in the bug tracker for adding a new syscall to the os module. It's based on new developments in the Linux kernel. Here's the link:
https://bugs.python.org/issue26826
After two and a half months I managed to create a nice patch with unit tests and documentation (yay!), but then the issue went cold. I would like to know how to proceed to get the issue going again.
Cheers,
-- Marcos.
-- --Guido van Rossum (python.org/~guido)
On Wed, Aug 03, 2016 at 10:46:13AM -0700, Guido van Rossum wrote:
I wonder if the issue isn't that there are so many Linux syscalls that we probably should have a process for deciding which ones are worth supporting in the os module, and that process should not necessarily start with a patch review. [...] Certainly it's not rocket science to write a C extension module that wraps a syscall or a bunch of them.
I agree, but also notice that some of these syscalls, especially those which are optimizations for certain situations like this one or sendfile(), could also be used by the rest of Python's core modules if they're available. In this case in particular, it could be used to speed up copyfile(), copy(), copy2() and probably copytree() from the shutil module. In fact, if this patch goes in, I'm planning to implement such optimizations.
Then again, are people really concerned about the speed of those file copy functions? Or are we just offering a solution in search of a problem? (I honestly don't know. At Dropbox we don't use Python for scripting much, we use it to write dynamic web servers. Static files are served by a CDN so e.g. sendfile() is not interesting to us either.)
On Wed, Aug 3, 2016 at 11:16 AM, Marcos Dione <mdione@grulic.org.ar> wrote:
On Wed, Aug 03, 2016 at 10:46:13AM -0700, Guido van Rossum wrote:
I wonder if the issue isn't that there are so many Linux syscalls that we probably should have a process for deciding which ones are worth supporting in the os module, and that process should not necessarily start with a patch review. [...] Certainly it's not rocket science to write a C extension module that wraps a syscall or a bunch of them.
I agree, but also notice that some of these syscalls, especially those which are optimizations for certain situations like this one or sendfile(), could also be used by the rest of Python's core modules if they're available. In this case in particular, it could be used to speed up copyfile(), copy(), copy2() and probably copytree() from the shutil module. In fact, if this patch goes in, I'm planning to implement such optimizations.
-- --Guido van Rossum (python.org/~guido)
On Wed, Aug 03, 2016 at 11:31:46AM -0700, Guido van Rossum wrote:
Then again are people really concerned about the speed of those file copy functions? Or are we just offering a solution in search of a problem?
At kernel level: clearly yes, otherwise their BDFL would not allow those[1] patches to go in. Now, should/could Python benefit from them? I personally think it should. That's why I developed the patch in the first place.
As for the cost, in terms of maintainability, I just noticed that copy() and copy2() use copyfile(), and that copytree() uses copy2(), so only one function would need to be modified. True, the code will be ~50% more complex (it needs to check the availability of the function and its suitability for the parameters given; copy_file_range() only works on files on the same filesystem[1]).
Hmm... Maybe you're right. Maybe, to keep Python's own code simple, we could skip these optimizations, and leave them in a 3rd party module.
--
[1] I still don't understand why all these optimizations are exposed separately for the specific cases in which they work; I would expect one function that would take care of the details, but at least provide copy functionality without using user space buffers but kernel ones. But then, I'm not a kernel developer, just a puny Python one...
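For readers following along, here is a minimal sketch of the "check availability, then fall back" logic described above. It assumes a Python where os.copy_file_range() exists (it was added to the os module in Python 3.8, well after this thread). The helper name copy_file and its chunk_size parameter are hypothetical; this is neither the patch from the issue nor what shutil actually does.

```python
import os


def copy_file(src_path, dst_path, chunk_size=1 << 20):
    """Hypothetical helper: copy a file, preferring an in-kernel copy.

    Falls back to a plain read/write loop when os.copy_file_range() is
    missing or refuses the given files (for example with EXDEV when they
    live on different filesystems, or ENOSYS on old kernels).
    """
    use_fast = hasattr(os, "copy_file_range")
    src_fd = os.open(src_path, os.O_RDONLY)
    try:
        dst_fd = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                if use_fast:
                    try:
                        # Both descriptors advance their own file offsets.
                        copied = os.copy_file_range(src_fd, dst_fd, chunk_size)
                    except OSError:
                        # Not usable here; continue in user space from the
                        # current offsets of both descriptors.
                        use_fast = False
                        continue
                else:
                    data = os.read(src_fd, chunk_size)
                    copied = len(data)
                    view = memoryview(data)
                    while view:
                        # os.write() may write fewer bytes than requested.
                        written = os.write(dst_fd, view)
                        view = view[written:]
                if copied == 0:  # end of the source file
                    break
        finally:
            os.close(dst_fd)
    finally:
        os.close(src_fd)
```

Working on raw file descriptors keeps the fallback simple: if the fast path fails part-way, the user-space loop resumes from wherever the two offsets already are.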
On Wed, Aug 3, 2016, at 16:32, Marcos Dione wrote:
(it needs to check the availability of the function and the suitability for the parameters given; copy_file_range() only works on files on the same filesystem[1]). Hmm...
What is the benefit of using copy_file_range over sendfile in this scenario? Or does sendfile not work with regular files on Linux? Maybe os.sendfile should use copy_file_range if available/applicable, and the shutil functions can use it?
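As background to the question above: per the sendfile(2) man page, Linux accepts a regular file as the destination since kernel 2.6.33 (older kernels required a socket). A rough, Linux-only sketch of a file-to-file copy through os.sendfile() could look like the following; the helper name sendfile_copy is hypothetical, and other platforms generally require the output descriptor to be a socket.

```python
import os


def sendfile_copy(src_path, dst_path, chunk_size=1 << 20):
    """Hypothetical helper: copy a regular file with os.sendfile() (Linux)."""
    with open(src_path, "rb") as fsrc, open(dst_path, "wb") as fdst:
        offset = 0
        while True:
            # Signature is os.sendfile(out_fd, in_fd, offset, count);
            # the return value is the number of bytes actually sent.
            sent = os.sendfile(fdst.fileno(), fsrc.fileno(), offset, chunk_size)
            if sent == 0:  # nothing left to send
                break
            offset += sent
    return offset
```

This does not settle which call is preferable; it only shows that the sendfile route is available for regular files on a recent enough Linux.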
On 4 August 2016 at 06:32, Marcos Dione <mdione@grulic.org.ar> wrote:
Maybe you're right. Maybe, to keep Python's own code simple, we could skip these optimizations, and leave them in a 3rd party module.
Having the scandir package on PyPI made it possible for folks to quantify the benefits of the new os.scandir() call for different workloads before we committed to adding it to the standard library. It also had the benefit of allowing folks to achieve the speedups by installing the library and changing their code if that was easier for them than waiting for a new release of CPython and upgrading to it (e.g. that's common for library authors who focus on Linux: they can often require an extra C dependency without too much hassle, but also frequently target the older versions of Python shipped by long term support distros).
It seems to me that a dedicated "os_linux" addon module on PyPI could serve a dual purpose in making updated os module APIs available on older versions of Python, as well as in providing a venue for folks to experiment with the performance of new syscalls before proposing them for stdlib inclusion.
Cheers, Nick.
-- Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia
On 8/3/2016 1:23 PM, Marcos Dione wrote:
Hi pythonistas. A couple of months ago I opened an issue in the bug tracker for adding a new syscall to the os module. It's based on new developments in the Linux kernel. Here's the link:
I suggest that at some point you summarize Guido's questions and any other discussion here on the issue for the benefit of people not reading this thread. Adding a link to the thread (in the archive) would be even better.
-- Terry Jan Reedy
On 03/08/2016, Marcos Dione <mdione@grulic.org.ar> wrote:
Hi pythonistas. A couple of months ago I opened an issue in the bug tracker for adding a new syscall to the os module. It's based on new developments in the Linux kernel. Here's the link:
To give more context, this is about adding support for Linux’s copy_file_range() system call.
After two and a half months I managed to create a nice patch with unit tests and documentation (yay!), but then the issue went cold. I would like to know how to proceed to get the issue going again.
I thought the main problem remaining was getting consensus about adding it, since a couple of people mentioned waiting for glibc support. I don’t have much personal opinion on this, but FWIW I don’t see much disadvantage in adding it now. I did mean to look over your latest patch, but that has drifted towards the bottom of my list of things to do :)
participants (6)
- Guido van Rossum
- Marcos Dione
- Martin Panter
- Nick Coghlan
- Random832
- Terry Reedy