[Python-Dev] Encoding of PyFrameObject members
M.-A. Lemburg
mal at egenix.com
Fri Feb 6 11:44:46 CET 2015
On 06.02.2015 00:27, Francis Giraldeau wrote:
> I need to access frame members from within a signal handler for tracing
> purposes. My first attempt to access co_filename was like this (omitting
> error checking):
>
> PyFrameObject *frame = PyEval_GetFrame();
> PyObject *ob = PyUnicode_AsUTF8String(frame->f_code->co_filename);
> char *str = PyBytes_AsString(ob);
>
> However, the function PyUnicode_AsUTF8String() calls PyObject_Malloc(),
> which is not reentrant. If the signal handler nests over PyObject_Malloc(),
> it causes a segfault, and it could also deadlock.
>
> Instead, I access members directly:
> char *str = PyUnicode_DATA(frame->f_code->co_filename);
> size_t len = PyUnicode_GET_DATA_SIZE(frame->f_code->co_filename);
>
> Is it safe to assume that unicode objects co_filename and co_name are
> always UTF-8 data for loaded code? I looked at the PyTokenizer_FromString()
> and it seems to convert everything to UTF-8 upfront, and I would like to
> make sure this assumption is valid.
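The macros won't work in all cases, as they don't pay attention
to the different kinds used in the Unicode implementation: the
data returned by PyUnicode_DATA() is stored as 1, 2 or 4 bytes
per character, and is only valid UTF-8 if the string is pure ASCII.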
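
To illustrate why the kind matters, here is a small standalone model of
kind-aware access. It does not use the CPython headers; the names
char_kind, KIND_* and read_char are made up for this sketch and only
mimic the idea behind PyUnicode_KIND() and PyUnicode_READ():

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the three storage widths a string can use. */
enum char_kind { KIND_1BYTE = 1, KIND_2BYTE = 2, KIND_4BYTE = 4 };

/* Read the code point at index i from a buffer stored with the given kind. */
static uint32_t read_char(enum char_kind kind, const void *data, size_t i)
{
    switch (kind) {
    case KIND_1BYTE:
        return ((const uint8_t *)data)[i];
    case KIND_2BYTE:
        return ((const uint16_t *)data)[i];
    default:
        return ((const uint32_t *)data)[i];
    }
}
```

In this model, "café" stored with the 1-byte kind holds the single byte
0xE9 for the last character, whereas its UTF-8 encoding would be the two
bytes 0xC3 0xA9. Treating the raw buffer as UTF-8 therefore only works
for ASCII-only strings.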
I don't think there's any API you can use to extract the
underlying data without going through PyObject_Malloc()
at some point (you may be lucky if there already is a
UTF-8 version available, but it's not guaranteed).
I guess your best bet is to write your own UTF-8
codec which then copies the data to a buffer that
you can control. Have a look at Objects/stringlib/codecs.h:
utf8_encode.
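
A rough sketch of what such a codec could look like, assuming the code
points have already been read into a 32-bit array and the output goes to
a fixed buffer owned by the caller, so that nothing is allocated. The
function name ucs4_to_utf8 and its signature are invented for this
example and are not part of CPython:

```c
#include <stddef.h>
#include <stdint.h>

/* Encode n code points from src into dst as UTF-8 without allocating.
 * cap is the size of dst in bytes, including room for the final NUL.
 * Returns the number of bytes written (excluding the NUL), or -1 if the
 * output does not fit or a code point is above U+10FFFF.
 * Surrogates are encoded like any other 3-byte value; a strict codec
 * would reject them. */
static long ucs4_to_utf8(const uint32_t *src, size_t n, char *dst, size_t cap)
{
    size_t out = 0;

    if (cap == 0)
        return -1;
    for (size_t i = 0; i < n; i++) {
        uint32_t ch = src[i];
        size_t need = ch < 0x80 ? 1 : ch < 0x800 ? 2 : ch < 0x10000 ? 3 : 4;

        /* Always keep one byte free for the terminating NUL. */
        if (ch > 0x10FFFF || out + need >= cap)
            return -1;
        switch (need) {
        case 1:
            dst[out++] = (char)ch;
            break;
        case 2:
            dst[out++] = (char)(0xC0 | (ch >> 6));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
            break;
        case 3:
            dst[out++] = (char)(0xE0 | (ch >> 12));
            dst[out++] = (char)(0x80 | ((ch >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
            break;
        default:
            dst[out++] = (char)(0xF0 | (ch >> 18));
            dst[out++] = (char)(0x80 | ((ch >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((ch >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (ch & 0x3F));
            break;
        }
    }
    dst[out] = '\0';
    return (long)out;
}
```

The destination could be a static or stack buffer sized for the longest
filename you expect; on overflow the function reports an error instead
of truncating in the middle of a character.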
Alternatively, you can copy the data to a Py_UCS4 buffer
which you allocate using code such as this (untested,
adapted from the UTF-8 encoder):
    /* rep is the str object, e.g. frame->f_code->co_filename */
    Py_UCS4 *buf, *p;
    enum PyUnicode_Kind repkind;
    void *repdata;
    Py_ssize_t repsize, k;

    if (PyUnicode_READY(rep) < 0)
        goto error;
    repkind = PyUnicode_KIND(rep);
    repdata = PyUnicode_DATA(rep);
    repsize = PyUnicode_GET_LENGTH(rep);

    /* keep the start of the buffer so that it can be freed later */
    buf = malloc((repsize + 1) * sizeof(Py_UCS4));
    if (buf == NULL)
        goto error;
    p = buf;
    for (k = 0; k < repsize; k++) {
        *p++ = PyUnicode_READ(repkind, repdata, k);
    }
    /* 0-terminate */
    *p = 0;
    ...
    free(buf);
--
Marc-Andre Lemburg
eGenix.com