[Python-ideas] improving C structs layout

Charles-François Natali cf.natali at gmail.com
Wed May 8 14:39:29 CEST 2013


Hi,

I was recently looking at the PyThreadState data structure (for issue
#17912, but it's unimportant), and noticed that the layout of the
members leaves some holes (due to alignment).
While it doesn't matter too much for PyThreadState (because of
trailing padding), I wondered whether other structures in the code
base could benefit from a better layout.
So I ran pahole [1], and found the following structures:

$ pahole -P python
PyMemberDef     40      32      8
wrapperbase     56      48      8
unicode_formatter_t     136     128     8
_expr   56      52      4
_stmt   72      68      4
_excepthandler  40      36      4
_node   40      32      8
compiler_unit   448     440     8
tok_state       992     984     8

The first column is the current size in bytes, the second column the
size after a more judicious layout, and the third column the number of
bytes saved.

For example, with wrapperbase.
Before:
$ pahole -C wrapperbase python
struct wrapperbase {
        char *                     name;                 /*     0     8 */
        int                        offset;               /*     8     4 */

        /* XXX 4 bytes hole, try to pack */

        void *                     function;             /*    16     8 */
        wrapperfunc                wrapper;              /*    24     8 */
        char *                     doc;                  /*    32     8 */
        int                        flags;                /*    40     4 */

        /* XXX 4 bytes hole, try to pack */

        PyObject *                 name_strobj;          /*    48     8 */

        /* size: 56, cachelines: 1, members: 7 */
        /* sum members: 48, holes: 2, sum holes: 8 */
        /* last cacheline: 56 bytes */
};

After:
$ pahole -C wrapperbase -R python
struct wrapperbase {
        char *                     name;                 /*     0     8 */
        int                        offset;               /*     8     4 */
        int                        flags;                /*    12     4 */
        void *                     function;             /*    16     8 */
        wrapperfunc                wrapper;              /*    24     8 */
        char *                     doc;                  /*    32     8 */
        PyObject *                 name_strobj;          /*    40     8 */

        /* size: 48, cachelines: 1, members: 7 */
        /* last cacheline: 48 bytes */
};   /* saved 8 bytes! */
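
The same effect can be checked without pahole. The following is only a
minimal sketch: object_stub and wrapperfunc_stub are hypothetical
stand-ins for PyObject and wrapperfunc (only their pointer sizes matter
here), and the struct and function names are made up for illustration.
It declares both member orders and reports sizeof and offsetof for each.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */
#include <stdio.h>

/* Hypothetical stand-ins for PyObject and wrapperfunc. */
typedef struct object_stub object_stub;
typedef object_stub *(*wrapperfunc_stub)(object_stub *, object_stub *, void *);

/* Member order as in the "Before" listing. */
struct wrapperbase_before {
    char *name;
    int offset;                 /* followed by a hole where pointers are 8-byte aligned */
    void *function;
    wrapperfunc_stub wrapper;
    char *doc;
    int flags;                  /* followed by a second hole */
    object_stub *name_strobj;
};

/* Member order as in the "After" listing: the two ints sit side by side. */
struct wrapperbase_after {
    char *name;
    int offset;
    int flags;
    void *function;
    wrapperfunc_stub wrapper;
    char *doc;
    object_stub *name_strobj;
};

/* Bytes saved per instance by the reordering on the current platform. */
size_t bytes_saved(void)
{
    return sizeof(struct wrapperbase_before) - sizeof(struct wrapperbase_after);
}

/* Print size and the offset of 'flags' for both layouts. */
void print_layouts(void)
{
    /* Moving 'flags' next to 'offset' adds no padding, so it can never grow the struct. */
    assert(sizeof(struct wrapperbase_after) <= sizeof(struct wrapperbase_before));

    printf("before: size %zu, flags at offset %zu\n",
           sizeof(struct wrapperbase_before),
           offsetof(struct wrapperbase_before, flags));
    printf("after:  size %zu, flags at offset %zu\n",
           sizeof(struct wrapperbase_after),
           offsetof(struct wrapperbase_after, flags));
    printf("saved:  %zu bytes per instance\n", bytes_saved());
}
```

Calling print_layouts() from a main() on a typical 64-bit platform
(8-byte pointers, 4-byte int) should report the same 56 and 48 byte
sizes as pahole. On a 32-bit platform both layouts are likely to have
the same size, since there are no holes to begin with.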

While some of the structs above aren't worth the trouble (like
tok_state), I think some might be interesting candidates.
This could lead to reduced memory usage (well, of course it depends on
the number of instances), and better cache usage/locality of
reference.
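
To give a rough idea of both effects, here is a small sketch. It
assumes a 64-byte cache line and a hypothetical contiguous array of
instances starting on a cache line boundary. Neither assumption comes
from measurements of CPython, and the function names are made up for
illustration. It computes the total footprint of the array and counts
how many elements span two cache lines.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE 64   /* assumed cache line size in bytes */

/* Total footprint of 'count' contiguous elements of 'elem_size' bytes. */
size_t total_bytes(size_t count, size_t elem_size)
{
    return count * elem_size;
}

/*
 * Number of elements whose first and last byte fall in different cache
 * lines, assuming the array starts on a cache line boundary.
 */
size_t straddling_elements(size_t count, size_t elem_size)
{
    size_t straddling = 0;

    assert(elem_size > 0);
    for (size_t i = 0; i < count; i++) {
        size_t first = i * elem_size;          /* offset of the first byte */
        size_t last = first + elem_size - 1;   /* offset of the last byte */
        if (first / CACHE_LINE != last / CACHE_LINE)
            straddling++;
    }
    return straddling;
}
```

With the _node sizes from the table above, 1000 instances take 40000
bytes at 40 bytes each and 32000 bytes at 32 bytes each. Under the
stated assumptions, straddling_elements(8, 40) counts 4 elements that
cross a cache line boundary, while straddling_elements(8, 32) counts
none, because two 32-byte elements fill a 64-byte line exactly.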

So what do you think, is it worth it?

cf

[1] https://github.com/acmel/dwarves


