On 04/11/18 06:21, Chris Angelico wrote:
On Wed, Apr 11, 2018 at 1:02 PM, Steven D'Aprano wrote:
On Wed, Apr 11, 2018 at 10:08:58AM +1000, Chris Angelico wrote:
File system limits aren't usually an issue; as you say, even FAT32 can store a metric ton of files in a single directory. I'm more interested in how long it takes to open a file, and whether doubling that time will have a measurable impact on Python startup time. Part of that cost can be reduced by using openat(), on platforms that support it, but even with a directory handle, there's still a definite non-zero cost to opening and reading an additional file.
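To make the openat() point concrete: in Python the same mechanism is reachable through the dir_fd argument of os.open(). The following is only a rough sketch, not anything proposed in this thread; the file names mod.pyc and mod.doc are made up for illustration, and the fallback branch covers platforms where dir_fd is not supported.

```python
import os
import tempfile

# Hypothetical file names: a code file plus a separate docstring file.
NAMES = ("mod.pyc", "mod.doc")


def read_relative(dir_fd, name):
    """Read a file by name relative to an already-open directory descriptor."""
    # With dir_fd set, os.open() uses openat() on platforms that provide it.
    fd = os.open(name, os.O_RDONLY, dir_fd=dir_fd)
    with os.fdopen(fd, "rb") as f:
        return f.read()


def read_by_path(directory, name):
    """Fallback: resolve the full path on every open."""
    with open(os.path.join(directory, name), "rb") as f:
        return f.read()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        for name, payload in zip(NAMES, (b"code", b"docstrings")):
            with open(os.path.join(tmp, name), "wb") as f:
                f.write(payload)
        if os.open in os.supports_dir_fd:
            # Open the directory once, then open each file relative to it.
            dir_fd = os.open(tmp, os.O_RDONLY)
            try:
                return [read_relative(dir_fd, n) for n in NAMES]
            finally:
                os.close(dir_fd)
        return [read_by_path(tmp, n) for n in NAMES]


print(demo())
```

Either way, the second file still costs one extra open and one extra read; the directory handle only saves the repeated path lookup.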
Yes, it will double the number of files. Actually quadruple it, if the annotations and line numbers are in separate files too. But if most of those extra files never need to be opened, then there's no cost to them. And whatever extra cost there is, is amortized over the lifetime of the interpreter.
Yes, if they are actually not needed. My question was about whether that is truly valid. Consider a very common use-case: an OS-provided Python interpreter whose files are all owned by 'root'. Those will be distributed with .pyc files for performance, but you don't want to deprive the users of help() and anything else that needs docstrings etc.
Currently in Fedora, we ship *both* optimized and non-optimized pycs to make sure Python works nicely both with and without -O, without root privileges. So splitting the docstrings into a separate file would be, for us, a benefit in terms of file size.
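For reference, the import system already keeps these variants side by side under distinct file names. A small sketch using importlib.util.cache_from_source(); the source name example.py is a placeholder, and the cache tag in the output depends on the interpreter running it.

```python
import importlib.util

SOURCE = "example.py"  # placeholder source file name, not from the thread


def pyc_variants(source=SOURCE):
    """Return the cache path used for each optimization level."""
    return {
        # An empty string means "no optimization suffix".
        "": importlib.util.cache_from_source(source, optimization=""),
        "-O": importlib.util.cache_from_source(source, optimization=1),
        "-OO": importlib.util.cache_from_source(source, optimization=2),
    }


for flag, path in pyc_variants().items():
    print(flag or "(none)", path)
```

Only the -OO variant drops docstrings; -O keeps them and removes assert statements and code guarded by __debug__.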
So... are the docstrings lazily loaded or eagerly loaded? If eagerly, you've doubled the number of file-open calls to initialize the interpreter. (Or quadrupled, if you need annotations and line numbers and they're all separate.) If lazily, things are a lot more complicated than the original description suggested, and there'd need to be some semantic changes here.
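To illustrate the lazy case, here is a toy sketch that defers the extra file open until the text is first requested. It is purely an assumption about how such loading could look: the LazyDoc class and the mod.doc file name are invented, and it says nothing about how __doc__ itself would have to change.

```python
import os
import tempfile


class LazyDoc:
    """Hypothetical holder that reads a docstring file on first access only."""

    def __init__(self, path):
        self._path = path
        self._text = None
        self.opens = 0  # number of times the file was actually opened

    def get(self):
        if self._text is None:
            with open(self._path, encoding="utf-8") as f:
                self._text = f.read()
            self.opens += 1
        return self._text


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mod.doc")  # made-up file name
        with open(path, "w", encoding="utf-8") as f:
            f.write("Return the answer.")
        doc = LazyDoc(path)
        opens_before = doc.opens  # nothing has been read at "startup"
        first = doc.get()         # first access opens the file
        second = doc.get()        # later accesses reuse the cached text
        return opens_before, first, second, doc.opens


print(demo())
```

In the eager case the open would happen in the constructor, so every import would pay for it whether or not anyone ever calls help().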
Serhiy is experienced enough that I think we should assume he's not going to push this optimization into production unless it actually does reduce startup time. He has proven himself enough that we should assume competence rather than incompetence :-)
Oh, I'm definitely assuming that he knows what he's doing :-) Doesn't mean I can't ask the question though.
ChrisA