[Python-ideas] Extending language syntax
Chris Angelico
rosuav at gmail.com
Tue Nov 12 15:52:31 CET 2013
On Tue, Nov 12, 2013 at 9:00 PM, Andrew Barnert <abarnert at yahoo.com> wrote:
> On Nov 11, 2013, at 21:27, Chris Angelico <rosuav at gmail.com> wrote:
>
>> On Tue, Nov 12, 2013 at 4:03 PM, Gregory Salvan <apieum at gmail.com> wrote:
>>> I suggested generating value object by freezing object states and to have
>>> object identities defined on their values instead of their memory
>>> allocation.
>>
>> I can see some potential in this if there were a way to say "This will
>> never change, don't refcount it or GC-check it"; that might improve
>> performance across a fork (or across threads), but it'd take a lot of
>> language support. Effectively, you would be forfeiting the usual GC
>> memory saving "this isn't needed, get rid of it" and fixing it in
>> memory someplace. The question would be: Is the saving of not writing
>> to that memory (updating refcounts, or marking for a mark/sweep GC, or
>> whatever the equivalent is for each Python implementation) worth the
>> complexity of checking every object to see if it's a frozen one?
>
> Is there any implementation (like one of the PyPy sub projects) that uses refcounting, with interlocked increments if two interpreter threads are live but plain adds otherwise? In such an implementation, I think the cost of checking a second flag to avoid the interlocked increment would, at least on many platforms (including x86, x86_64, and arm9), be comparatively very cheap, and if used widely could provide big benefits.
>
> I believe CPython and standard PyPy just use plain adds under the GIL, and Jython and IronPython leave all the gc up to the underlying VM, so it would probably be a lot harder to get enough benefit there without a lot more effort.
The only Python I actually work with is CPython, so I can't say what
does or doesn't exist. But plausibly, an object could carry a one-bit
flag that says "this is permanent, don't GC it", and then it wouldn't
get its refcount updated (in CPython), or wouldn't get marked (in a
mark/sweep GC), or whatever. Then, if you have an entire page of
frozen objects, and you fork() using the standard Linux (and other)
semantics of copy-on-write, you would never need to write to that
page, so it would stay shared between parent and child.
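A rough sketch of the flag idea, purely as a toy model in Python. The
ToyObject class and its refcount/frozen/writes fields are made up for
illustration; this is not how CPython lays out an object header, only
a way to count how many writes a frozen object would avoid:

```python
class ToyObject:
    """Toy stand-in for an object header: a refcount plus a frozen flag."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1
        self.frozen = False
        self.writes = 0  # how many times the "header" was written to

    def incref(self):
        if self.frozen:
            return  # permanent object: skip the write entirely
        self.refcount += 1
        self.writes += 1

    def decref(self):
        if self.frozen:
            return  # never freed, so nothing to update
        self.refcount -= 1
        self.writes += 1


normal = ToyObject("normal")
permanent = ToyObject("permanent")
permanent.frozen = True

# Simulate 1000 short-lived references to each object.
for obj in (normal, permanent):
    for _ in range(1000):
        obj.incref()
        obj.decref()

print(normal.writes, permanent.writes)  # prints: 2000 0
```

The frozen object's header is never touched, which is what would keep
its page clean after a fork(). The price is the extra "if self.frozen"
test on every incref and decref, frozen or not.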
The reason this might require that the objects be immutable is this:
When an object is marked as frozen, everything it references can also
automatically be marked frozen. (By definition, they'll always be in
use too.) That would only work, though, if the set of objects thus
referenced doesn't change.
PI = complex(3.14159)
sys.freeze(PI) # should freeze the float value 3.14159
PI.real = 3.141592653589793
# awkward
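To make the awkwardness concrete, here is a small sketch of the
transitive marking, again as a toy. (In real Python a complex is
already immutable and "PI.real = ..." raises AttributeError, so the
sketch uses a made-up mutable Node class; Node and freeze() are
hypothetical names, not an existing API.)

```python
class Node:
    """Toy mutable object that holds references to other Nodes."""

    def __init__(self, name, *refs):
        self.name = name
        self.refs = list(refs)
        self.frozen = False


def freeze(root):
    """Mark root and everything reachable from it as frozen."""
    stack = [root]
    while stack:
        node = stack.pop()
        if node.frozen:
            continue  # already visited; also guards against cycles
        node.frozen = True
        stack.extend(node.refs)


old_real = Node("3.14159")
pi = Node("PI", old_real)
freeze(pi)  # marks both pi and old_real

# The equivalent of "PI.real = 3.141592653589793" on a mutable object:
new_real = Node("3.141592653589793")
pi.refs[0] = new_real

print(old_real.frozen, new_real.frozen)  # prints: True False
```

After the rebind, the old value is still pinned as frozen even though
nothing refers to it any more, and the new value is referenced by a
frozen object without being frozen itself. Neither problem can arise
if the frozen object is immutable.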
I can imagine that this might potentially offer some *huge* benefits
in a system that does a lot of forking (maybe a web server?), but all
I have to go on is utter and total speculation. And the cost of
checking "Is this frozen? No, update its refcount" everywhere means
that there's a penalty even if fork() is never used. So actually, this
might work out best as a "special-build" Python, and maybe only as a
toy/experiment.
ChrisA