[Python-ideas] Move optional data out of pyc files

Eric Fahlgren ericfahlgren at gmail.com
Tue Apr 10 20:43:42 EDT 2018


On Tue, Apr 10, 2018 at 5:03 PM, Steven D'Aprano <steve at pearwood.info>
wrote:

>
>     __pycache__/spam.cpython-38.pyc
>     __pycache__/spam.cpython-38-doc.pyc
>     __pycache__/spam.cpython-38-lno.pyc
>     __pycache__/spam.cpython-38-ann.pyc
>

Our product uses the docstrings for auto-generated help, so we need to
keep those.  We also allow users to write plugins and scripts, and getting
valid feedback in tracebacks is essential for our support people, so we'll
keep the lno files, too.  Annotations can probably go.

Looking at one of our little pyc files, I see:

-rwx------+ 1 efahlgren admins  9252 Apr 10 17:25 ./lm/lib/config.pyc*

Since disk blocks are typically 4096 bytes, that's really a 12k file.
Let's say it's 8k of byte code, 1k of doc, a bit of lno.  So the proposed
layout would give:

config.pyc -> 8k
config-doc.pyc -> 4k
config-lno.pyc -> 4k

So now I've increased disk usage from 12k to 16k, i.e. by a third (yeah
yeah, I know, I picked that small file on purpose to illustrate the point,
but it's not unusual).
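
For anyone who wants to check the block arithmetic, here is a rough sketch.
It assumes 4096-byte blocks, takes the 8k/1k/"a bit" split above at face
value (the 36 bytes for lno is simply what is left of the 9252), and
ignores any extra per-file header the split files would carry.  The names
on_disk and split_sizes are made up for the example.

```python
import math

BLOCK = 4096  # assumed filesystem block size


def on_disk(size_bytes, block=BLOCK):
    """Bytes actually allocated: the size rounded up to whole blocks."""
    return math.ceil(size_bytes / block) * block


# Current layout: one 9252-byte pyc file.
combined = on_disk(9252)

# Hypothetical split; the sizes are guesses that add up to 9252 bytes.
split_sizes = {
    "config.pyc": 8192,      # byte code
    "config-doc.pyc": 1024,  # docstrings
    "config-lno.pyc": 36,    # line numbers (the remainder)
}
split = sum(on_disk(size) for size in split_sizes.values())

increase = (split - combined) / combined
print(f"combined: {combined} bytes, split: {split} bytes, "
      f"increase: {increase:.0%}")
```

Measured against the original 12k that is an increase of one third; the
extra 4k is a quarter of the new 16k total.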

These files are often opened over a network, at least for user plugins.
This can take a really, really long time on some of our poorly connected
machines, like 1-2 seconds per file (no kidding, it's horrible).  Now,
instead of opening just one file in 1-2 seconds, we open three, which
triples that time just to do the stat+open, and there is probably another
stat on top to make sure there's no "ann" file lying about.  Ouch.
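
A back-of-the-envelope version of that, assuming every stat+open on the
network share costs about the same 1-2 seconds and that the lookups happen
one after another (both are assumptions, not measurements; total_seconds
is a made-up helper):

```python
def total_seconds(lookups, per_lookup):
    """Total wait if each lookup costs the same and they run in sequence."""
    return lookups * per_lookup


for per_lookup in (1.0, 2.0):
    today = total_seconds(1, per_lookup)       # single pyc
    split = total_seconds(3, per_lookup)       # pyc + doc + lno
    with_probe = total_seconds(4, per_lookup)  # plus a stat for the absent ann file
    print(f"{per_lookup:.0f}s per lookup: {today:.0f}s today, "
          f"{split:.0f}s split, {with_probe:.0f}s with the ann probe")
```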

-1 from me.