Improve os.times() resolution
It turns out we could use resource.getrusage(), which provides microsecond resolution (tested on Linux and macOS):

    import os, resource

    for x in range(10000000):  # warm up
        pass

    for x in range(5):
        a = os.times()
        b = resource.getrusage(resource.RUSAGE_SELF)
        print(a.user, a.system)
        print(b.ru_utime, b.ru_stime)

...it prints:

    0.39 0.01
    0.394841 0.011963999999999999
    0.39 0.01
    0.394899 0.011966
    0.39 0.01
    0.394908 0.011966
    0.39 0.01
    0.394936 0.011967
    0.39 0.01
    0.394963 0.011968

getrusage(RUSAGE_CHILDREN) can be used to calculate "children_user" and "children_system". I see 2 possibilities here:

1) doc fix, mentioning that resource.getrusage provides a better resolution
2) if available (it should always be, as it's a POSIX standard), just use getrusage in Modules/posixmodule.c. It seems we can check availability by reusing the HAVE_SYS_RESOURCE_H and HAVE_SYS_TIME_H definitions, which are already in place.

I'm not sure what's best to do, as os.* functions usually expose the original C function with the same name. But given that the "elapsed" field is not part of the times(2) struct, and that on Windows "elapsed", "children_user" and "children_system" are set to 0, it appears there may be some room for flexibility here.

Thoughts?

--
Giampaolo - http://grodola.blogspot.com
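For context, the two-decimal values from os.times() follow from times(2) reporting CPU time in clock ticks. A minimal sketch, assuming a POSIX system where os.sysconf() accepts the "SC_CLK_TCK" name (the helper name times_granularity is made up for illustration):

```python
import os

# Assumption: POSIX system; "SC_CLK_TCK" is the sysconf name for the
# number of clock ticks per second used by times(2).
ticks_per_second = os.sysconf("SC_CLK_TCK")


def times_granularity():
    """Return the smallest step, in seconds, that os.times() can report."""
    return 1.0 / ticks_per_second


print("ticks per second:", ticks_per_second)
print("os.times() granularity: %.6f s" % times_granularity())
```

The tick rate is commonly 100, which would match the 0.01 s steps in the output above; the value on any given system should be checked rather than assumed.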
Have you checked how much overhead the two functions have? That seems like an obvious way this proposal could go south.
On 24 Mar 2019, at 13:15, Giampaolo Rodola' <g.rodola@gmail.com> wrote:
> It turns out we could use resource.getrusage() which provides micro seconds (tested on Linux and macOS):
> [...]
On Sun, Mar 24, 2019 at 2:29 PM Anders Hovmöller <boxed@killingar.net> wrote:
> Have you checked how much overhead the two functions have? That seems like an obvious way this proposal could go south.
Without patch:

    $ ./python -m timeit -s "import os" "os.times()"
    500000 loops, best of 5: 546 nsec per loop

With patch:

    $ ./python -m timeit -s "import os" "os.times()"
    200000 loops, best of 5: 1.23 usec per loop

The patch:

diff --git a/Modules/posixmodule.c b/Modules/posixmodule.c
index 3f76018357..ad91ed702a 100644
--- a/Modules/posixmodule.c
+++ b/Modules/posixmodule.c
@@ -8035,6 +8035,14 @@ os_times_impl(PyObject *module)
 #else /* MS_WINDOWS */
 {
+#if defined(HAVE_SYS_RESOURCE_H)
+    struct rusage ruself;
+    struct rusage ruchildren;
+    if (getrusage(RUSAGE_SELF, &ruself) == -1)
+        return posix_error();
+    if (getrusage(RUSAGE_CHILDREN, &ruchildren) == -1)
+        return posix_error();
+#endif
 
 
     struct tms t;
     clock_t c;
@@ -8043,10 +8051,18 @@ os_times_impl(PyObject *module)
     if (c == (clock_t) -1)
         return posix_error();
     return build_times_result(
+
+#if defined(HAVE_SYS_RESOURCE_H)
+                         doubletime(ruself.ru_utime),
+                         doubletime(ruself.ru_stime),
+                         doubletime(ruchildren.ru_utime),
+                         doubletime(ruchildren.ru_stime),
+#else
                          (double)t.tms_utime / ticks_per_second,
                          (double)t.tms_stime / ticks_per_second,
                          (double)t.tms_cutime / ticks_per_second,
                          (double)t.tms_cstime / ticks_per_second,
+#endif
                          (double)c / ticks_per_second);
 }
 #endif /* MS_WINDOWS */

--
Giampaolo - http://grodola.blogspot.com
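The numbers above come from a patched C build. A rough Python-level comparison can be run without rebuilding the interpreter; this is only a sketch, assuming a POSIX system where the resource module is available. The helper names (via_times, via_getrusage, per_call_usec) are invented for illustration, and the figures include Python call overhead, so they will not match the timings above:

```python
import os
import resource
import timeit


def via_times():
    # One call returns user, system, children and elapsed times.
    return os.times()


def via_getrusage():
    # Two calls are needed to cover both the process and its children.
    return (resource.getrusage(resource.RUSAGE_SELF),
            resource.getrusage(resource.RUSAGE_CHILDREN))


def per_call_usec(func, number=100_000):
    """Best-of-5 average time per call, in microseconds."""
    best = min(timeit.repeat(func, number=number, repeat=5))
    return best / number * 1e6


if __name__ == "__main__":
    print("os.times():     %.2f usec" % per_call_usec(via_times))
    print("2x getrusage(): %.2f usec" % per_call_usec(via_getrusage))
```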
On Sun, Mar 24, 2019 at 5:16 AM Giampaolo Rodola' <g.rodola@gmail.com> wrote:
> It turns out we could use resource.getrusage() which provides micro seconds (tested on Linux and macOS):
> [...]
I'd just document that resource.getrusage() provides better data. That is what man pages for times(3) have done as well. It is good to keep the os module close to the underlying library/system calls and leave it to higher level code to abstract and choose as deemed appropriate.

-gps
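As an illustration of the "higher level code" approach, here is a minimal sketch of a wrapper that prefers getrusage() and falls back to os.times() where the resource module is missing. The names CpuTimes and cpu_times are hypothetical and not part of the standard library:

```python
import os
from collections import namedtuple

try:
    import resource  # POSIX only; absent on Windows
except ImportError:
    resource = None

# Hypothetical names for illustration; not part of the standard library.
CpuTimes = namedtuple(
    "CpuTimes", "user system children_user children_system")


def cpu_times():
    """Return CPU times in seconds, using getrusage() when available."""
    if resource is not None:
        me = resource.getrusage(resource.RUSAGE_SELF)
        kids = resource.getrusage(resource.RUSAGE_CHILDREN)
        return CpuTimes(me.ru_utime, me.ru_stime,
                        kids.ru_utime, kids.ru_stime)
    # Fallback: coarser resolution, same field meanings.
    t = os.times()
    return CpuTimes(t.user, t.system, t.children_user, t.children_system)


print(cpu_times())
```

This keeps os.times() itself a thin wrapper over times(2), while callers that want finer resolution opt in explicitly.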
participants (3)
- Anders Hovmöller
- Giampaolo Rodola'
- Gregory P. Smith