[Python-Dev] Timing for removing legacy Unicode APIs deprecated by PEP 393
Serhiy Storchaka
storchaka at gmail.com
Wed Apr 18 15:16:23 EDT 2018
13.04.18 16:27, INADA Naoki wrote:
> Then, I want to reschedule the removal of these APIs.
> Can we remove them in 3.8? 3.9? or 3.10?
> I prefer sooner as possible.
I suppose that many users will start porting to Python 3 only in 2020,
after the 2.7 EOL. After that point we no longer have to support
compatibility with 2.7 and can start emitting deprecation warnings at
runtime. One or two releases later we can make the corresponding public
API always fail and remove the private API and data fields.
> Slightly off topic, there are 4bytes alignment gap in the unicode object,
> on 64bit platform.
>
> typedef struct {
> .....
> struct {
> unsigned int interned:2;
> unsigned int kind:3;
> unsigned int compact:1;
> unsigned int ascii:1;
> unsigned int ready:1;
> unsigned int :24;
> } state; // 4 bytes
>
> // implicit 4 bytes gap here.
>
> wchar_t *wstr; // 8 bytes
> } PyASCIIObject;
>
> So, I think we can reduce 12 bytes instead of 8 bytes when removing wstr.
> Or we can reduce 4 bytes soon by moving `wstr` before `state`.
>
> Off course, it needs siphash support 4byte aligned data instead of 8byte.
There are other functions which expect that data is aligned to
sizeof(long) or 8 bytes.
Siphash hashing is special because it is called not just for strings and
bytes, but for memoryview, which doesn't guarantee any alignment.
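To illustrate the alignment concern: one common way to read an 8-byte word from a buffer with no alignment guarantee is to go through memcpy instead of casting the pointer. This is only a generic sketch, not the actual siphash code in CPython, and the helper name load_u64_unaligned is made up for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: load a 64-bit word from a pointer that may not be
 * 8-byte aligned. memcpy is well defined for any alignment, whereas
 * dereferencing a misaligned uint64_t pointer is undefined behavior. */
uint64_t load_u64_unaligned(const unsigned char *p)
{
    uint64_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* copies 8 bytes in native byte order */
    return v;
}
```

Code that reads through a cast pointer instead would need its input aligned, which is the constraint mentioned above for the other functions.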
Note that after removing the wchar_t* field the gap will not go away,
because the size of the structure should be a multiple of the alignment
of the first field (which is a pointer).
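A minimal sketch to check this with offsetof and sizeof. The structs below are simplified stand-ins that I wrote for the example (header_t, with_wstr_t and without_wstr_t are hypothetical names, not the real CPython definitions); only the field order follows the quoted struct.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the object header (assumption, not CPython's). */
typedef struct {
    ptrdiff_t refcnt;   /* stands in for Py_ssize_t ob_refcnt */
    void *type;         /* stands in for the type pointer */
} header_t;

typedef struct {
    unsigned int interned:2;
    unsigned int kind:3;
    unsigned int compact:1;
    unsigned int ascii:1;
    unsigned int ready:1;
    unsigned int :24;
} state_t;

/* Layout as quoted: state followed by wstr. */
typedef struct {
    header_t head;
    ptrdiff_t length;
    ptrdiff_t hash;
    state_t state;
    wchar_t *wstr;
} with_wstr_t;

/* Same layout with the wstr field removed. */
typedef struct {
    header_t head;
    ptrdiff_t length;
    ptrdiff_t hash;
    state_t state;
} without_wstr_t;

/* The struct size is always rounded up to pointer alignment. */
static_assert(sizeof(without_wstr_t) % _Alignof(void *) == 0,
              "struct size is a multiple of pointer alignment");

/* Padding bytes between state and wstr in the current layout. */
size_t gap_before_wstr(void)
{
    return offsetof(with_wstr_t, wstr)
           - (offsetof(with_wstr_t, state) + sizeof(state_t));
}

/* Padding bytes after state once wstr is removed. */
size_t tail_padding_without_wstr(void)
{
    return sizeof(without_wstr_t)
           - (offsetof(without_wstr_t, state) + sizeof(state_t));
}

/* Bytes actually saved by removing wstr. */
size_t bytes_saved(void)
{
    return sizeof(with_wstr_t) - sizeof(without_wstr_t);
}
```

On a typical 64-bit platform (4-byte unsigned int, 8-byte pointers) I would expect both padding functions to return 4 and bytes_saved() to return 8 rather than 12: the gap just moves to the end of the structure.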