[Python-ideas] Enabling access to the AST for Python code
Neil Girdhar
mistersheik at gmail.com
Fri Jul 3 22:42:55 CEST 2015
On Fri, Jul 3, 2015 at 6:20 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 3 July 2015 at 06:25, Neil Girdhar <mistersheik at gmail.com> wrote:
> > Why would it require "a lot of extra memory"? A program text size is
> > measured in megabytes, and the AST is typically more compact than the
> > code as text. A few megabytes is nothing.
>
> It's more complicated than that.
>
> What happens when we multiply that "nothing" by 10,000 concurrent
> processes across multiple servers? Is it still nothing? How about
> 10,000,000?
>
I guess we find a way to share data between the processes?
>
> What does keeping the extra data around do to our CPU level cache
> efficiency? Is there a key data structure we're adding a new pointer
> to? What does *that* do to our performance?
>
Why would a few megabytes of data affect your CPU level cache? If I have a
Python program that generates a data structure that's a few megabytes, does
it slow down the rest of the program?
>
> Where are the AST objects being kept? Do they become part of the
> serialised form of the affected object? If yes, what does that do to
> the wire protocol overhead for inter-process communication, or to the
> size of cached bytecode files? If no, does that mean these objects may
> be missing the AST data when deserialised?
>
When do you send code objects on the wire? I'm not even sure if pickle
supports that yet.
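
For reference, a minimal sketch of what the standard library does with a code
object today. It assumes stock CPython with no custom copyreg reducers
registered; greet is a hypothetical function used only for illustration.
marshal (the format behind cached bytecode files) round-trips a code object,
while pickle is expected to refuse it.

```python
import marshal
import pickle
import types


def greet():
    # Hypothetical function whose code object we try to serialise.
    return "hello"


code = greet.__code__

# marshal can serialise code objects and load them back.
payload = marshal.dumps(code)
restored_code = marshal.loads(payload)
restored_func = types.FunctionType(restored_code, globals())

# pickle has no default reducer for code objects, so this is expected to fail.
try:
    pickle.dumps(code)
    pickle_ok = True
except (TypeError, pickle.PicklingError):
    pickle_ok = False

print(restored_func(), pickle_ok)
```

Note that marshal data is specific to the Python version that wrote it, so it
is not a general wire format.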
> When we're talking about sufficiently central data structures, a few
> *bytes* can end up counting as "a lot". Code and function objects
> aren't quite *that* central (unlike, say, tuple instances), but adding
> things to them can still have a significant impact (hence the ability
> to avoid creating docstrings).
>
Thanks, I'm interested in learning more about this.
There are a lot of messages in this discussion. Was there a final
consensus about how the AST for a given code object should be calculated?
Was it re-parsing the source? Was it an import hook? Something else? I
want to do this with a personal project. I realize we may not get the AST
by default, but it would be nice to know how I should best determine it
myself.
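
To make the "re-parsing the source" option concrete, here is a minimal sketch,
not a settled answer from this thread. It assumes the source text is still
available (in real use it might come from inspect.getsource() on the module)
and matches a code object to its def node by name and first line number.
find_function_node and SOURCE are hypothetical names for illustration.

```python
import ast


def find_function_node(source, code):
    """Re-parse source and return the def node that produced code, or None."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Assumption: no decorators. For decorated functions,
            # co_firstlineno can point at the first decorator line instead
            # of the def line, so this simple match may miss them.
            if node.name == code.co_name and node.lineno == code.co_firstlineno:
                return node
    return None


# Stand-in for a module's source text.
SOURCE = "def add(a, b):\n    return a + b\n"

namespace = {}
exec(compile(SOURCE, "<example>", "exec"), namespace)

node = find_function_node(SOURCE, namespace["add"].__code__)
print(ast.dump(node.body[0]))
```

This only works while the source is around; lambdas, nested functions that
share a name and line, and code built with exec() would need more care.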
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncoghlan at gmail.com | Brisbane, Australia
>