We're seeing strange problems when trying to do reproducible builds of some Python 3.6 modules.
Namely, from one build to another, there will be something like the following difference in the
generated .pyc file:
00004e40 da 07 5f 5f 61 6c 6c 5f 5f da 0a 5f 5f 61 75 74 |..__all__..__aut|
-00004e50 68 6f 72 5f 5f da 07 64 65 63 69 6d 61 6c 72 0c |hor__..decimalr.|
+00004e50 68 6f 72 5f 5f 5a 07 64 65 63 69 6d 61 6c 72 0c |hor__Z.decimalr.|
00004e60 00 00 00 72 43 00 00 00 72 08 00 00 00 72 41 00 |...rC...r....rA.|
This specific one is in the top-level co_names segment. Both 0x5a and 0xda encode
TYPE_SHORT_ASCII_INTERNED; the only difference is that 0xda has FLAG_REF (0x80) set and 0x5a does not.
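
To make the byte-level reading concrete, here is a minimal sketch that splits a marshal type byte
into its type code and the FLAG_REF bit. The constant 0x80 is what I read in marshal.c; the helper
name split_type_byte is made up for illustration.

```python
# FLAG_REF value as defined in CPython's Python/marshal.c.
FLAG_REF = 0x80


def split_type_byte(byte):
    """Return (type_code_char, has_flag_ref) for one marshal type byte."""
    type_code = chr(byte & 0x7F)        # low 7 bits hold the type code
    has_ref = bool(byte & FLAG_REF)     # high bit marks "add to ref table"
    return type_code, has_ref


if __name__ == "__main__":
    for b in (0x5A, 0xDA):
        code, has_ref = split_type_byte(b)
        print(f"0x{b:02x}: type={code!r} FLAG_REF={has_ref}")
```

Both bytes decode to the type code 'Z', so the string itself is the same in both builds; only the
decision to register it in the reference table differs.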
I'm also seeing off-by-one differences in reference ids, i.e., the number appearing after TYPE_REF.
This does not happen in all cases, but it seems that when a "part" is affected, all references in
that "part" are changed (for some value of "part"; all the knowledge of pycs I have was gained from
about an hour of reading marshal.c). So that seems to imply that there's a reference that is
sometimes included and sometimes not.
This is most often found in __init__.py. Often this affects optimized pycs, but we can see it in
un-optimized ones as well.
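
For readers who have not looked at the format, a small sketch of how a TYPE_REF entry comes about:
when the same object is marshalled twice, the second occurrence is written as 'r' followed by a
4-byte little-endian index into the reference table. The helper trailing_ref_index is hypothetical,
and this assumes the default marshal version of a Python 3.4+ interpreter. The exact index depends
on whether the enclosing tuple was itself given FLAG_REF, which is the kind of shift described above.

```python
import marshal


def trailing_ref_index(data):
    """If the stream ends with a TYPE_REF entry, return its index, else None."""
    if data[-5:-4] != b"r":             # 'r' is TYPE_REF in marshal.c
        return None
    return int.from_bytes(data[-4:], "little")


if __name__ == "__main__":
    name = "decimal"
    # The same string object appears twice, so the second one becomes a reference.
    data = marshal.dumps((name, name))
    print(data.hex())
    print("index after TYPE_REF:", trailing_ref_index(data))
```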
The issue is rare -- 99% of all pycs are stable -- but when it occurs, it's easy to replicate it in
the same place. This also happens on different machines, so that seems to rule out hardware memory
errors.
The pycs in question are generated by normal "setup.py build" -> "setup.py install". It happens on
Python 3.6 but not on Python 2.7. I'm not sure about Python 3.5 because we don't currently use it.
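
In case it helps anyone reproduce this, a rough sketch of how the differing offsets between two
builds of the same pyc can be listed. The function names and the two-path command line are
placeholders, not part of our build setup.

```python
import sys


def diff_offsets(a, b):
    """Return the offsets at which two byte strings differ.

    If the lengths differ, every offset past the shorter one is reported too.
    """
    common = min(len(a), len(b))
    offsets = [i for i in range(common) if a[i] != b[i]]
    offsets.extend(range(common, max(len(a), len(b))))
    return offsets


def diff_files(path_a, path_b):
    """Read two files (e.g. the same .pyc from two builds) and compare them."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        return diff_offsets(fa.read(), fb.read())


if __name__ == "__main__" and len(sys.argv) == 3:
    for off in diff_files(sys.argv[1], sys.argv[2]):
        print(f"0x{off:08x}")
```

Applied to the 16-byte row shown in the dump above, this reports a single differing byte at
position 5 within the row, i.e. file offset 0x4e55.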
It doesn't seem to depend on hash seed - the instability is observed even with PYTHONHASHSEED set to
zero. What seems to fix it, however, is running the build on disorderfs, which ensures that the
filesystem entries are in the same order.
Any ideas why something like this would happen and why it would be correlated with filesystem ordering?
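
A quick way to see whether directory order differs between two build hosts, sketched under the
assumption that the build walks directories with something like os.listdir (I have not verified
which call setup.py actually uses). os.listdir is documented to return entries in arbitrary order,
so comparing it with the sorted order shows whether a given tree is affected. The function name
listing_orders is made up.

```python
import os
import sys


def listing_orders(path):
    """Return (raw_order, sorted_order) of the entries in a directory."""
    raw = os.listdir(path)      # order is filesystem-dependent
    return raw, sorted(raw)


if __name__ == "__main__":
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    raw, ordered = listing_orders(target)
    print("raw order matches sorted order:", raw == ordered)
    for entry in raw:
        print(entry)
```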