
On Fri, Jul 3, 2015 at 6:20 AM, Nick Coghlan <ncoghlan@gmail.com> wrote:
> On 3 July 2015 at 06:25, Neil Girdhar <mistersheik@gmail.com> wrote:
>> Why would it require "a lot of extra memory"? A program's text size is measured in megabytes, and the AST is typically more compact than the code as text. A few megabytes is nothing.
>
> It's more complicated than that.
>
> What happens when we multiply that "nothing" by 10,000 concurrent processes across multiple servers? Is it still nothing? How about 10,000,000?
I guess we find a way to share data between the processes?
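The arithmetic does bite if nothing is shared (even 3 MB per process times 10,000 processes is ~30 GB), but with the "fork" start method the parsed data can live in pages the workers inherit copy-on-write. A minimal sketch of what I mean (only a partial answer, since CPython's reference counting gradually un-shares pages it touches):

    import ast
    import multiprocessing as mp

    # Parse once in the parent; "fork" children inherit these pages
    # copy-on-write instead of each building their own copy.
    SOURCE = "def f(x):\n    return x * 2\n"
    TREE = ast.parse(SOURCE)

    def count_nodes(_):
        return sum(1 for _ in ast.walk(TREE))

    if __name__ == "__main__":
        mp.set_start_method("fork")  # POSIX only; not available on Windows
        with mp.Pool(4) as pool:
            print(pool.map(count_nodes, range(4)))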
> What does keeping the extra data around do to our CPU level cache efficiency? Is there a key data structure we're adding a new pointer to? What does *that* do to our performance?
Why would a few megabytes of data affect your CPU level cache? If I have a Python program that generates a data structure that's a few megabytes, does it slow down the rest of the program?
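That seems easy enough to test crudely; here's the kind of micro-benchmark I have in mind (results will be noisy and depend on the working set, so take it as a sketch):

    import timeit

    def hot_loop():
        # Unrelated work whose speed we compare before and after the allocation.
        s = 0
        for i in range(10 ** 6):
            s += i
        return s

    before = min(timeit.repeat(hot_loop, number=1, repeat=5))
    blob = [str(i) for i in range(200000)]  # a few megabytes of small objects
    after = min(timeit.repeat(hot_loop, number=1, repeat=5))
    print("before: %.4fs  after: %.4fs" % (before, after))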
> Where are the AST objects being kept? Do they become part of the serialised form of the affected object? If yes, what does that do to the wire protocol overhead for inter-process communication, or to the size of cached bytecode files? If no, does that mean these objects may be missing the AST data when deserialised?
When do you send code objects on the wire? I'm not even sure if pickle supports that yet.
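A quick check suggests pickle indeed refuses them, while marshal (the serializer behind .pyc files) handles them fine:

    import marshal
    import pickle

    def f(x):
        return x + 1

    code = f.__code__

    try:
        pickle.dumps(code)
    except TypeError as exc:
        print("pickle:", exc)  # pickle has no support for code objects

    # marshal is what .pyc files use, so it round-trips them:
    blob = marshal.dumps(code)
    restored = marshal.loads(blob)
    print("marshal round-trip:", restored.co_name, len(blob), "bytes")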
> When we're talking about sufficiently central data structures, a few *bytes* can end up counting as "a lot". Code and function objects aren't quite *that* central (unlike, say, tuple instances), but adding things to them can still have a significant impact (hence the ability to avoid creating docstrings).
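(That last point is the -OO machinery, I believe; compile()'s optimize argument shows the same effect in-process:)

    src = "def f():\n    'docstring'\n"

    for level in (0, 2):
        ns = {}
        exec(compile(src, "<demo>", "exec", optimize=level), ns)
        print(level, repr(ns["f"].__doc__))

    # Prints:
    # 0 'docstring'
    # 2 None   (optimize=2 matches python -OO, which drops docstrings)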
Thanks, I'm interested in learning more about this. There are a lot of messages in this discussion. Was there a final consensus about how the AST for a given code object should be calculated? Was it by re-parsing the source? Was it an import hook? Something else? I want to do this in a personal project. I realize we may not get the AST by default, but it would be nice to know how best to determine it myself.
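In the meantime, the simplest route I can see is re-parsing the source myself, roughly like this (a sketch; it only works where inspect.getsource() can locate the source, so not for code defined in the REPL or built from strings):

    import ast
    import inspect
    import textwrap

    def get_ast(obj):
        # Re-parse the object's source; dedent so methods parse cleanly.
        source = textwrap.dedent(inspect.getsource(obj))
        return ast.parse(source)

    def example(x):
        return x * 2

    print(ast.dump(get_ast(example)))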
> Cheers, Nick.
>
> --
> Nick Coghlan | ncoghlan@gmail.com | Brisbane, Australia