
On Fri, 30 Mar 2018 15:28:47 +0900 INADA Naoki <songofacandy@gmail.com> wrote:
# Possible optimizations by 48bit pointer
## PyASCIIObject
    [snip]
            unsigned int ready:1;
            /* Padding to ensure that PyUnicode_DATA() is always aligned to
               4 bytes (see issue #19537 on m68k). */
            unsigned int :24;
        } state;
        wchar_t *wstr;              /* wchar_t representation (null-terminated) */
    } PyASCIIObject;
Currently, state is 8 bits plus 24 bits of padding. I think we can pack state and wstr into a single 64-bit word.
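
For illustration, the packing idea could look roughly like the sketch below. This is not CPython code: the type `packed_state_wstr` and the helpers `pack_state_wstr`, `unpack_state` and `unpack_wstr` are hypothetical names, and the sketch assumes a platform where user-space pointers fit in the low 48 bits (typical of x86-64 with 4-level paging, but not guaranteed everywhere).

```c
#include <assert.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical layout: bits 0..47 hold the wstr pointer,
   bits 48..55 hold the 8 state bits, bits 56..63 are unused. */
#define PACK_PTR_BITS 48
#define PACK_PTR_MASK ((UINT64_C(1) << PACK_PTR_BITS) - 1)

typedef uint64_t packed_state_wstr;

/* Combine the 8 state bits and the pointer into one 64-bit word. */
static packed_state_wstr pack_state_wstr(uint8_t state, wchar_t *wstr)
{
    uint64_t bits = (uint64_t)(uintptr_t)wstr;
    /* Assumption: the upper 16 bits of the pointer are zero. */
    assert((bits & ~PACK_PTR_MASK) == 0);
    return ((uint64_t)state << PACK_PTR_BITS) | bits;
}

/* Recover the 8 state bits. */
static uint8_t unpack_state(packed_state_wstr p)
{
    return (uint8_t)(p >> PACK_PTR_BITS);
}

/* Recover the pointer by masking off everything above bit 47. */
static wchar_t *unpack_wstr(packed_state_wstr p)
{
    return (wchar_t *)(uintptr_t)(p & PACK_PTR_MASK);
}
```

Every access to wstr would then need the extra mask, and every access to state the extra shift, which is part of the cost to weigh against the 8 bytes saved per object.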

We could also simply nuke wstr. I frankly don't think it's very important. It's only used when calling system functions taking a wchar_t argument, as an « optimization ». I'd be willing to guess that modern workloads aren't bottlenecked by the overhead of those system functions...

Of course, the question is whether all this matters. Is it important to save 8 bytes on each unicode object? Only testing would tell.

Regards

Antoine.