[Python-ideas] Move optional data out of pyc files

Chris Angelico rosuav at gmail.com
Wed Apr 11 00:21:17 EDT 2018


On Wed, Apr 11, 2018 at 1:02 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Wed, Apr 11, 2018 at 10:08:58AM +1000, Chris Angelico wrote:
>
>> File system limits aren't usually an issue; as you say, even FAT32 can
>> store a metric ton of files in a single directory. I'm more interested
>> in how long it takes to open a file, and whether doubling that time
>> will have a measurable impact on Python startup time. Part of that
>> cost can be reduced by using openat(), on platforms that support it,
>> but even with a directory handle, there's still a definite non-zero
>> cost to opening and reading an additional file.
>
> Yes, it will double the number of files. Actually quadruple it, if the
> annotations and line numbers are in separate files too. But if most of
> those extra files never need to be opened, then there's no cost to them.
> And whatever extra cost there is, is amortized over the lifetime of the
> interpreter.

Yes, if they really never need to be opened. My question is whether
that assumption actually holds. Consider a very common use-case: an OS-provided
Python interpreter whose files are all owned by 'root'. Those will be
distributed with .pyc files for performance, but you don't want to
deprive the users of help() and anything else that needs docstrings
etc. So... are the docstrings lazily loaded or eagerly loaded? If
eagerly, you've doubled the number of file-open calls to initialize
the interpreter. (Or quadrupled, if you need annotations and line
numbers and they're all separate.) If lazily, things are a lot more
complicated than the original description suggested, and there'd need
to be some semantic changes here.
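
To make the eager/lazy difference concrete, here is a rough sketch that
just counts file opens. Everything in it is made up for illustration:
the ".doc" side files, read_side_file(), LazyDoc and demo() are
hypothetical names, not anything from the actual proposal or from
CPython's import machinery. It uses os.open(..., dir_fd=...) (the
openat() route) where the platform supports it and falls back to a
plain path otherwise.

```python
import os
import tempfile

OPEN_COUNT = 0  # number of side files opened so far


def read_side_file(directory, name, dir_fd=None):
    """Read one hypothetical side file and count the open call."""
    global OPEN_COUNT
    OPEN_COUNT += 1
    if dir_fd is not None:
        # openat()-style: resolve the name relative to an open directory.
        fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    else:
        fd = os.open(os.path.join(directory, name), os.O_RDONLY)
    with os.fdopen(fd, "r", encoding="utf-8") as f:
        return f.read()


class LazyDoc:
    """Defer reading the side file until the docstring is first requested."""

    def __init__(self, directory, name, dir_fd=None):
        self._args = (directory, name, dir_fd)
        self._loaded = False
        self._text = None

    @property
    def text(self):
        if not self._loaded:
            self._text = read_side_file(*self._args)
            self._loaded = True
        return self._text


def demo(n_modules=5):
    """Return (eager opens, lazy opens at startup, lazy opens after one lookup, text)."""
    global OPEN_COUNT
    with tempfile.TemporaryDirectory() as d:
        names = []
        for i in range(n_modules):
            name = f"mod{i}.doc"
            with open(os.path.join(d, name), "w", encoding="utf-8") as f:
                f.write(f"Docstring for mod{i}")
            names.append(name)

        # Use a directory handle only where os.open supports dir_fd.
        use_dir_fd = os.open in os.supports_dir_fd
        dir_fd = os.open(d, os.O_RDONLY) if use_dir_fd else None
        try:
            # Eager: every side file is opened during "startup".
            OPEN_COUNT = 0
            for name in names:
                read_side_file(d, name, dir_fd)
            eager_opens = OPEN_COUNT

            # Lazy: nothing is opened until somebody asks for a docstring.
            OPEN_COUNT = 0
            docs = [LazyDoc(d, name, dir_fd) for name in names]
            lazy_startup_opens = OPEN_COUNT
            text = docs[2].text
            docs[2].text  # second access reuses the cached text
            lazy_after_lookup = OPEN_COUNT
        finally:
            if dir_fd is not None:
                os.close(dir_fd)
    return eager_opens, lazy_startup_opens, lazy_after_lookup, text


if __name__ == "__main__":
    print(demo())
```

The counts only show how many opens each strategy does, not what they
cost; that would need real timing on a real installation. The lazy
version also needs the side file to still be there and still match at
lookup time, which is part of what I mean by semantic changes.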

> Serhiy is experienced enough that I think we should assume he's not
> going to push this optimization into production unless it actually does
> reduce startup time. He has proven himself enough that we should assume
> competence rather than incompetence :-)

Oh, I'm definitely assuming that he knows what he's doing :-) Doesn't
mean I can't ask the question though.

ChrisA

