[Python-Dev] Encoding of PyFrameObject members

Maciej Fijalkowski fijall at gmail.com
Sat Feb 7 09:45:34 CET 2015


On Sat, Feb 7, 2015 at 12:48 AM, Francis Giraldeau
<francis.giraldeau at gmail.com> wrote:
> 2015-02-06 6:04 GMT-05:00 Armin Rigo <arigo at tunes.org>:
>
>> Hi,
>>
>> On 6 February 2015 at 08:24, Maciej Fijalkowski <fijall at gmail.com> wrote:
>> > I don't think it's safe to assume f_code is properly filled by the
>> > time you might read it, depending a bit where you find the frame
>> > object. Are you sure it's not full of garbage?
>>
>>
>> Yes, before discussing how to do the utf8 decoding, we should realize
>> that it is really unsafe code starting from the line before.  From a
>> signal handler you're only supposed to read data that was written to
>> "volatile" fields.  So even PyEval_GetFrame(), which is done by
>> reading the thread state's "frame" field, is not safe: this is not a
>> volatile.  This means that the compiler is free to do crazy things
>> like *first* write into this field and *then* initialize the actual
>> content of the frame.  The uninitialized content may be garbage, not
>> just NULLs.
>
>
> Thanks for these comments. Of course accessing frames within a signal
> handler is racy. I confirm that code encoded in non-ascii is not accessible
> from the utf8 buffer pointer. However, a call to PyUnicode_AsUTF8() encodes
> the data and caches it in the unicode object. Later access returns the byte
> buffer without memory allocation and re-encoding.
>
> I think it is possible to solve both safety problems by registering a
> handler with PyEval_SetProfile(). On function entry, the handler will call
> PyUnicode_AsUTF8() on the required frame members to make sure the utf8
> encoded string is available. Then, we increment the refcount of the frame
> and assign it to a thread local pointer. On function return, the refcount is
> decremented. These operations occur in the normal context and they are not
> racy. The signal handler will use the thread local frame pointer instead of
> calling PyEval_GetFrame(). Does that sound good?
>
> Thanks again for your feedback!
>
> Francis

You still didn't explain what you are trying to achieve, nor addressed
Armin's questions about volatile. However, you can't access thread
locals from signal handlers (in some cases the access mallocs: thread
locals are built lazily if you're inside the .so, e.g. if python is
built with --enable-shared).
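
To make the constraint concrete, here is a minimal sketch of the only
pattern that is safe without further assumptions: the handler writes a
"volatile sig_atomic_t" flag and nothing else, and the frame is
inspected later from normal context. All names here (sample_requested,
on_signal, install_sampler, take_pending_sample) are hypothetical and
not part of CPython; the comment only marks where calls such as
PyEval_GetFrame() or PyUnicode_AsUTF8() could go.

```c
#include <signal.h>

/* Hypothetical flag: the only state the handler is allowed to touch. */
static volatile sig_atomic_t sample_requested = 0;

/* Signal handler: no thread locals, no malloc, no reads of frame data. */
static void on_signal(int signum)
{
    (void)signum;
    sample_requested = 1;
}

/* Install the handler for signum; returns 0 on success, -1 on failure. */
int install_sampler(int signum)
{
    return signal(signum, on_signal) == SIG_ERR ? -1 : 0;
}

/* Called from normal (non-signal) context, e.g. a profile hook.
 * Returns 1 if a sample was pending and clears the flag, else 0.
 * A signal arriving between the check and the clear is coalesced
 * with the current one, which is acceptable for a sampling flag. */
int take_pending_sample(void)
{
    if (!sample_requested)
        return 0;
    sample_requested = 0;
    /* Here it would be safe to call PyEval_GetFrame(),
     * PyUnicode_AsUTF8(), etc., since we are not in the handler. */
    return 1;
}
```

The trade-off is that the sample is taken at the next point where
take_pending_sample() runs, not at the exact instruction where the
signal arrived.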

