[Wheel-builders] Building manylinux1 wheels with newer toolchains

Robert T. McGibbon rmcgibbo at gmail.com
Sat Jul 8 17:20:26 EDT 2017


On Sat, Jul 8, 2017 at 3:37 PM, Geoffrey Thomas <geofft at ldpreload.com>
wrote:

> In the general case, does this actually change the API/ABI that your
> application uses, or does it just change the ABI that your application
> _claims_ to expect? That is, does your memcpy.c emit code that actually
> intends to call memcpy@GLIBC_2.2.5, or does it emit code that intends to
> call memcpy@@GLIBC_2.14 but lies about it because of .symver?
>

Looking at the resulting libraries with `nm`, it actually emits code that
calls memcpy@GLIBC_2.2.5, to the best of my understanding.
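
A check of this kind can also be scripted. The sketch below is only an
illustration: it assumes the output of `nm -D` has been captured as text and
that undefined symbols show up as `U name@GLIBC_x.y` (some binutils versions
need `--with-symbol-versions` to print the version suffix). The helper names
are hypothetical, and the default limit of (2, 5) is the glibc ceiling that
PEP 513 gives for manylinux1.

```python
import re

# Assumed line shape: "                 U memcpy@GLIBC_2.14"
_UNDEFINED = re.compile(r"\bU\s+(\w+)@+GLIBC_([0-9.]+)")


def glibc_versions(nm_output):
    """Map each undefined glibc symbol to its version as a tuple of ints."""
    found = {}
    for line in nm_output.splitlines():
        match = _UNDEFINED.search(line)
        if match:
            version = tuple(int(part) for part in match.group(2).split("."))
            found[match.group(1)] = version
    return found


def too_new(nm_output, limit=(2, 5)):
    """Return the symbols whose required glibc version exceeds the limit."""
    versions = glibc_versions(nm_output)
    return sorted(name for name, ver in versions.items() if ver > limit)


# Hand-written sample text, not output from a real build
sample = (
    "                 U memcpy@GLIBC_2.14\n"
    "                 U malloc@GLIBC_2.2.5\n"
    "0000000000001139 T main\n"
)
print(too_new(sample))
```

Anything the helper reports is a candidate for the .symver treatment; an
empty list only means no symbol matched the assumed line shape.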

>
> In the particular case of memcpy, I think this is fine because the ABI
> change from 2.2.5 to 2.14 is forwards-compatible, if I'm reading the
> manpage and glibc commit 0354e355 right. glibc used to have a memcpy that
> was safe to call with overlapping regions; the spec says memcpy requires
> the regions not to overlap, and you should use memmove if they overlap. In
> glibc 2.13, they optimized it assuming the regions didn't overlap, which
> broke older programs. So in glibc 2.14, they aliased memcpy@GLIBC_2.2.5
> to memmove, and added the symbol version GLIBC_2.14 to the new, optimized
> memcpy.
>
> For an application calling memcpy correctly, I think this means _either_
> memcpy@GLIBC_2.2.5 or memcpy@@GLIBC_2.14 is fine to call, and if you have
> the option of either, memcpy@@GLIBC_2.14 is going to be faster (which is
> why newer toolchains default to it), but the two have the same calling
> convention and everything.
>
> This means that, if the only incompatibility is just memcpy, this approach
> should work -- but also you can probably define a weak symbol named memcpy@@GLIBC_2.14
> that just relocates to memcpy@GLIBC_2.2.5 (and perhaps auditwheel can
> stuff this symbol into your ELF objects, without needing to change the
> compilation process). If the final system's libc provides memcpy@@GLIBC_2.14,
> then you'll still get the faster version.
>

I'm not sure exactly how to do this, but it sounds like it would be a neat
trick as well.


>
> Is this the only incompatible symbol worth worrying about? If there are
> others that actually changed ABI in a backwards-incompatible way (that is,
> you can't call a program compiled with the new symbol against the old
> symbol, and glibc provides two disjoint versioned implementations) then I
> suspect this is unsafe.


I'm not sure exactly how general of a solution it is. But if you have a
wheel that really won't work if it uses the old glibc symbols, then it's
not as if compiling it on the manylinux1 centos 5 docker image will help
either.


>
>
> --
> Geoffrey Thomas
> https://ldpreload.com
> geofft at ldpreload.com
>
>
> On Fri, 7 Jul 2017, Robert T. McGibbon wrote:
>
>> Hey all,
>> I think I may have figured out a new way to build manylinux1 on
>> non-CentOS 5 machines with newer toolchains, at least in relatively
>> simple cases. The thing that prevents most libraries from being
>> manylinux1-compatible is that they link against too-recent versioned
>> symbols in glibc. This suggests, then, that we might be able to fix the
>> problem during compilation by forcing the linker to link the
>> libraries against older (manylinux1 compatible) symbols. It seems like
>> there are some assembly + linker tricks (also this) that work to
>> force just that.
>>
>> In order to test this out, I took a look at the symbols that were causing
>> manylinux1 incompatibility in a project of mine when compiled
>> on CentOS 7. In this case, it was just memcpy@@GLIBC_2.14.
>>
>> So, I dropped this file into my project:
>> ```
>> $ cat memcpy.c
>> #include <string.h>
>>
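>> /* Bind references to memcpy in this file to the old versioned symbol. */
>> asm (".symver memcpy, memcpy@GLIBC_2.2.5");
>>
>> /* With -Wl,--wrap=memcpy, the linker redirects memcpy calls here. */
>> void *__wrap_memcpy(void *dest, const void *src, size_t n) {
>>   return memcpy(dest, src, n);
>> }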
>> ```
>>
>> And then modified my setup.py to
>>
>> ```
>> +def manylinux1(extensions):
>> +    for ext in extensions:
>> +        ext.sources.append('memcpy.c')
>> +        ext.extra_link_args.append('-Wl,--wrap=memcpy')
>> +    return extensions
>> +
>>
>>  setup(name='project_name',
>>        author='Robert McGibbon',
>>        author_email='rmcgibbo at gmail.com',
>> -      ext_modules=extensions,
>> +      ext_modules=manylinux1(extensions),
>> ```
>>
>>
>> Lo and behold, it actually works! Obviously one would have to wrap more
>> symbols for other projects that make heavier use of glibc. And since this
>> approach requires changes to the compile step, it can't do anything for
>> wheels that link against external, precompiled libraries that auditwheel
>> grafts into the manylinux wheel. But it's still cool.
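
For projects that need more than one symbol wrapped, the setup.py helper
generalizes in the obvious way. This is a rough sketch, not something tested
on a real build: `wrap_symbols` and its parameters are hypothetical names,
the shim source is assumed to define a `__wrap_<name>` function with a
matching `.symver` line for every symbol listed, and a `SimpleNamespace`
stands in for a setuptools Extension so the example runs without a compiler.

```python
from types import SimpleNamespace


def wrap_symbols(extensions, shim_source, symbols):
    """Add the shim source and one --wrap linker flag per symbol."""
    for ext in extensions:
        if shim_source not in ext.sources:
            ext.sources.append(shim_source)
        for sym in symbols:
            flag = "-Wl,--wrap=" + sym
            # Skip flags that are already present so repeated calls are safe
            if flag not in ext.extra_link_args:
                ext.extra_link_args.append(flag)
    return extensions


# Stand-in for an Extension object; only the two attributes used above
demo = SimpleNamespace(sources=["mod.c"], extra_link_args=[])
wrap_symbols([demo], "memcpy.c", ["memcpy"])
print(demo.sources)
print(demo.extra_link_args)
```

The duplicate checks just keep the helper from appending the same source or
flag twice if it is called more than once on the same extension list.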
>>
>> Has anyone tried this kind of thing before?
>>
>> --
>> -Robert
>>
>>


-- 
-Robert