[Python-Dev] Can I make marshal.dumps() slower but stabler?
INADA Naoki
songofacandy at gmail.com
Thu Jul 12 04:15:39 EDT 2018
On Thu, Jul 12, 2018 at 3:22 PM Serhiy Storchaka <storchaka at gmail.com> wrote:
>
> 12.07.18 08:43, INADA Naoki wrote:
> > I'm working on making pyc stable, via stabilizing marshal.dumps()
> > https://bugs.python.org/issue34093
>
> This is not enough to make pyc stable. The order of elements in frozensets
> is still arbitrary.
But we can set PYTHONHASHSEED to make that order, and so the pyc, stable.
Currently, the dependency on reference counts (refcnt) is the only known
issue for reproducible pyc builds.
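
A minimal sketch of the PYTHONHASHSEED point, assuming CPython and that
spawning a child interpreter is acceptable. The helper name `pyc_digest`
and the sample source are made up for illustration. The child compiles a
snippet whose constants include a frozenset and prints a hash of the
marshal dump, so digests from separate runs can be compared.

```python
import hashlib
import os
import subprocess
import sys

# Child program: the set literal in the `in` test is stored by the compiler
# as a frozenset constant, whose element order depends on string hashing.
CHILD = r"""
import hashlib, marshal
source = 'def f(x):\n    return x in {"spam", "eggs", "ham", "bacon"}\n'
code = compile(source, "<sample>", "exec")
print(hashlib.sha256(marshal.dumps(code)).hexdigest())
"""


def pyc_digest(hash_seed):
    """Return the SHA-256 of the marshalled code object in a fresh interpreter.

    hash_seed is passed through PYTHONHASHSEED (an integer, or "random").
    """
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # With a fixed seed the two digests are expected to match; with
    # "random" they may differ from run to run.
    print(pyc_digest(0), pyc_digest(0))
    print(pyc_digest("random"), pyc_digest("random"))
```

This only addresses hash ordering. It says nothing about the refcnt
dependency discussed in this thread.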
>
> > Sadly, it makes marshal.dumps() 40% slower.
> > Luckily, this overhead is small (only 4%) for the dumps(compile(source)) case.
>
> What about the memory consumption?
No overhead, because we already use the same hashtable for w_ref.
I just made it two-pass instead of one-pass.
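
The two-pass idea can be sketched in pure Python. This is a toy
serializer, not the C code in marshal.c: the token format and the names
`count_occurrences`, `encode` and `stable_encode` are invented for
illustration. Pass 1 counts how often each object occurs inside the tree
being dumped; pass 2 marks an object as shareable only when that count
is greater than one, so references held from outside the tree no longer
affect the output.

```python
def count_occurrences(obj, counts=None):
    """Pass 1: count occurrences of each object (by identity) in the tree."""
    if counts is None:
        counts = {}
    key = id(obj)
    counts[key] = counts.get(key, 0) + 1
    # Descend only on the first visit; later visits just bump the counter.
    if counts[key] == 1 and isinstance(obj, (tuple, list)):
        for item in obj:
            count_occurrences(item, counts)
    return counts


def encode(obj, counts, refs=None, out=None):
    """Pass 2: emit tokens, giving a ref slot only to objects seen 2+ times."""
    if refs is None:
        refs = {}
    if out is None:
        out = []
    key = id(obj)
    if key in refs:
        # Already written once: emit a back-reference instead of the value.
        out.append(("REF", refs[key]))
        return out
    shared = counts[key] > 1
    if shared:
        refs[key] = len(refs)
    if isinstance(obj, (tuple, list)):
        out.append(("SEQ", len(obj), shared))
        for item in obj:
            encode(item, counts, refs, out)
    else:
        out.append(("ATOM", obj, shared))
    return out


def stable_encode(obj):
    """Run both passes; the result depends only on sharing inside the tree."""
    return encode(obj, count_occurrences(obj))


if __name__ == "__main__":
    shared = ("x", 1)
    tree = (shared, shared, [2])
    before = stable_encode(tree)
    extra_refs = [shared] * 10   # more outside references to `shared`
    assert stable_encode(tree) == before
    print(before)
```

Identity-based counting still depends on which equal objects happen to
be the same object (for example interned strings), so this is only a
picture of the approach, not a proof of stability.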
>
> > So my question is: May I remove the unstable but faster code?
> >
> > Or should I make this optional, so that we maintain two complex code paths?
> > If so, should this option be enabled by default or not?
>
> My concern is that even if it is not made optional, this will complicate
> the code.
Even when it's not optional, it adds a near duplicate of w_object that
counts references within the object tree.
https://github.com/python/cpython/pull/8226/commits/e170116e80dfd27f923c88fc11e42f0d6f687a00
>
> > For example, xmlrpc uses marshal. But xmlrpc has significant overhead
> > other than marshaling, as in the dumps(compile(source)) case. So I expect
> > marshal.dumps() performance is not critical for it either.
>
> xmlrpc doesn't use the marshal module. It uses the terms marshalling and
> unmarshalling, but with a different meaning.
>
Oh, I just grepped and misunderstood.
> > Is there any real application for which marshal.dumps() performance is critical?
> EVE Online is a well known example.
>
Do they use marshal version>=3?
Version 3 introduced FLAG_REF, which already added significant runtime
overhead.
If marshaling speed is very important, version<=2 should be used.
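
For reference, the format version is the optional second argument of
marshal.dumps(). A small sketch, assuming CPython 3.4 or later; the
sample data is arbitrary. It dumps the same tree with a format version
that has no FLAG_REF (version 2 here) and with the current default, then
checks whether a tuple referenced twice is still one object after
loading.

```python
import marshal

shared = ("spam", 1.5)
tree = [shared, shared]               # the same tuple object, referenced twice

old_format = marshal.dumps(tree, 2)   # version 2: no FLAG_REF, tuple written twice
new_format = marshal.dumps(tree)      # default: marshal.version

old_loaded = marshal.loads(old_format)
new_loaded = marshal.loads(new_format)

if __name__ == "__main__":
    print("default version:", marshal.version)
    print("sizes:", len(old_format), len(new_format))
    # Without refs, the two items come back as separate, equal tuples.
    print("version 2 keeps identity:", old_loaded[0] is old_loaded[1])
    print("default keeps identity:", new_loaded[0] is new_loaded[1])
```

Both dumps load back to an equal list. They differ only in whether
sharing is recorded.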
> What if we write a script which loads .pyc files and stabilizes them? This
> could solve the problem for applications which need stable .pyc files,
> with zero impact on common use.
>
Hmm, which of these do you mean?
* Adding marshal.dump_stable_pyc() and using it like
`marshal.dump_stable_pyc(marshal.loads(code))`
* Implementing a pure Python marshal.dumps in distutils
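
Either way, the plumbing of such a script would look roughly like the
sketch below. It assumes the 16-byte pyc header of CPython 3.7+ (PEP
552), and the function name `rewrite_pyc` is hypothetical. It uses the
ordinary marshal.dumps() as a stand-in for whatever stable dumper would
be chosen, so by itself it does not guarantee a stable result.

```python
import importlib.util
import marshal
import types

HEADER_SIZE = 16  # magic, flags, and two more 4-byte fields (CPython 3.7+)


def rewrite_pyc(path):
    """Load the code object from a .pyc file and write it back re-marshalled.

    Returns True if the marshalled payload changed. marshal.dumps() is only
    a placeholder here for a stable dumper.
    """
    with open(path, "rb") as f:
        data = f.read()
    header, payload = data[:HEADER_SIZE], data[HEADER_SIZE:]
    if header[:4] != importlib.util.MAGIC_NUMBER:
        raise ValueError("pyc was written by a different Python version")

    code = marshal.loads(payload)
    if not isinstance(code, types.CodeType):
        raise ValueError("pyc payload is not a code object")

    # The header fields describe the source file, not the payload,
    # so the header can be kept unchanged.
    new_payload = marshal.dumps(code)
    with open(path, "wb") as f:
        f.write(header + new_payload)
    return new_payload != payload
```

It could be run over the files produced by compileall as a post-build
step, leaving normal imports untouched.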
--
INADA Naoki <songofacandy at gmail.com>