Unfortunately, the C version of pickle lacks the extensibility of the pure Python version, so the pure Python version has to be used in some cases. One such example is the cloudpickle project, which extends pickle to support many more types, such as local functions. cloudpickle is often used by distributed executors to allow shipping Python code for remote execution on a cluster.
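[Editorial aside, not part of the original message: a minimal sketch of what that extensibility means in practice. The `make_adder` helper is purely illustrative; the stdlib pickle cannot serialize a locally defined function by reference, while cloudpickle serializes it by value.]

    import pickle

    import cloudpickle  # third-party: pip install cloudpickle


    def make_adder(n):
        # A function defined inside another function is not importable by name,
        # so the stdlib pickle (C or pure Python) cannot serialize it by reference.
        def adder(x):
            return x + n
        return adder


    add_two = make_adder(2)

    try:
        pickle.dumps(add_two)
    except (pickle.PicklingError, AttributeError) as exc:
        print("stdlib pickle fails:", exc)

    # cloudpickle serializes the code object and closure by value instead,
    # and the resulting payload can still be loaded with the plain stdlib pickle.
    payload = cloudpickle.dumps(add_two)
    restored = pickle.loads(payload)
    print(restored(3))  # -> 5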
On Wed, 5 Apr 2017 01:31:20 +1000 Nick Coghlan firstname.lastname@example.org wrote:
On 4 April 2017 at 21:43, Victor Stinner email@example.com wrote:
2017-04-04 12:06 GMT+02:00 Serhiy Storchaka firstname.lastname@example.org:
I consider it as a benchmark of Python interpreter itself.
Don't we have enough benchmarks to test the Python interpreter?
I would prefer to have more realistic use cases than "reimplement pickle in pure Python".
"unpickle_pure_python" name can be misleading as well to users exploring speed.python.org data, no?
The split benchmark likely made more sense in Python 2, when "import pickle" gave you the pure Python version by default, and you had to do "import cPickle as pickle" to get the accelerated version - you'd get very different performance characteristics based on which import the application used.
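[Editorial aside: for reference, the common Python 2 idiom for picking the implementation looked roughly like this.]

    # Prefer the C accelerated pickler; fall back to the pure Python module.
    try:
        import cPickle as pickle
    except ImportError:
        import pickle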
It makes significantly less sense now that Python 3 always uses the accelerated version by default and only falls back to pure Python if the accelerator module is missing for some reason. If anything, the appropriate cross-version comparison would be between the pure Python version in 2.7 and the accelerated version in 3.x, since that reflects the performance change you get when you do "import pickle".
However, that argument only applies to whether or not to include it in the default benchmark set used to compare the overall performance across versions and implementations - it's still valid as a microbenchmark looking for major regressions in the speed of the pure Python fallback.
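[Editorial aside, not part of the original message: one way to exercise that pure Python fallback deliberately, roughly what a pure-Python pickle microbenchmark has to do, is to block the accelerator before pickle is first imported. A sketch, not the exact benchmark code:]

    import sys

    # Setting a sys.modules entry to None makes "import _pickle" raise
    # ModuleNotFoundError (a subclass of ImportError), so pickle.py falls back
    # to its pure Python Pickler/Unpickler. Must run before pickle is imported.
    sys.modules["_pickle"] = None

    import pickle

    print(pickle.Pickler)                         # <class 'pickle._Pickler'> (pure Python)
    print(pickle.loads(pickle.dumps([1, 2, 3])))  # the wire format is unchanged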
On 5 April 2017 at 01:47, Antoine Pitrou email@example.com wrote:
Unfortunately, the C version of pickle lacks the extensibility of the pure Python version, so the pure Python version has to be used in some cases. One such example is the cloudpickle project, which extends pickle to support many more types, such as local functions.
cloudpickle is often used by distributed executors to allow shipping Python code for remote execution on a cluster.
Perhaps a more suitable benchmark could be formulated based on that?
That way the benchmark could be pinned to a particular version of cloudpickle, and it would be testing a known real-world use case that explicitly requires the pure Python version of the underlying pickle module.
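[Editorial aside: a sketch of what such a benchmark might look like, using the pyperf harness. The workload (a list of closures over local state) and the benchmark name are illustrative assumptions, not an existing benchmark.]

    import time

    import cloudpickle  # would be pinned to a specific version for the benchmark
    import pyperf


    def make_tasks(n):
        # Closures over local state: the kind of object that requires cloudpickle,
        # which (as noted above) builds on the pure Python pickler.
        tasks = []
        for i in range(n):
            def task(x, offset=i):
                return x * offset
            tasks.append(task)
        return tasks


    TASKS = make_tasks(100)


    def bench_cloudpickle_dumps(loops):
        # pyperf passes the number of loops; return the total elapsed time.
        t0 = time.perf_counter()
        for _ in range(loops):
            cloudpickle.dumps(TASKS)
        return time.perf_counter() - t0


    if __name__ == "__main__":
        runner = pyperf.Runner()
        runner.bench_time_func("cloudpickle_dumps_closures", bench_cloudpickle_dumps)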