
On Sat, Dec 30, 2023 at 1:57 PM Dr. Thomas Orgis <thomas.orgis@uni-hamburg.de> wrote:
On Fri, 29 Dec 2023 11:34:04 +0100, Ralf Gommers <ralf.gommers@gmail.com> wrote:
If the library name is libcblas.so it will still be found. If it's also a nonstandard name, then yes it's going to fail. I'd say though that (a) this isn't a real-world situation as far as we know,
It can be more funny. I just noticed on an Ubuntu system (following Debian for sure, here) that there are both
/usr/lib/x86_64-linux-gnu/libblas.so.3
/usr/lib/x86_64-linux-gnu/libcblas.so.3
but those belong to different packages. The first contains BLAS and CBLAS API and is installed from netlib code.
$ readelf -d -s /usr/lib/x86_64-linux-gnu/libblas.so.3 | grep cblas_ | wc -l
184
The second is installed alongside ATLAS.
$ readelf -d -s /usr/lib/x86_64-linux-gnu/libcblas.so.3 | grep cblas_ | wc -l
154
The symbol lists differ: each library has functions that the other lacks.
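Counting lines with `wc -l` shows that the two libraries differ in size, but not which symbols are missing where. A minimal sketch of a set comparison follows, assuming `readelf -s` style text in which the symbol name is the last column. The sample rows and the names `cblas_example_a` and `cblas_example_b` are invented for illustration and are not taken from either Debian package.

```python
def cblas_symbols(readelf_output: str) -> set[str]:
    """Collect cblas_* symbol names from `readelf -s` style text."""
    names = set()
    for line in readelf_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The symbol name is the last column; drop any @VERSION suffix.
        name = fields[-1].split("@", 1)[0]
        if name.startswith("cblas_"):
            names.add(name)
    return names


def compare(a: str, b: str) -> tuple[set[str], set[str]]:
    """Return (symbols only in a, symbols only in b)."""
    sym_a, sym_b = cblas_symbols(a), cblas_symbols(b)
    return sym_a - sym_b, sym_b - sym_a


# Invented sample rows, shaped like `readelf -s` output.
SAMPLE_A = """
    12: 0000000000001a10   215 FUNC    GLOBAL DEFAULT   12 cblas_dgemm
    13: 0000000000001b00    40 FUNC    GLOBAL DEFAULT   12 cblas_example_a
"""
SAMPLE_B = """
     7: 0000000000002a10   215 FUNC    GLOBAL DEFAULT   11 cblas_dgemm@@BASE
     8: 0000000000002b00    40 FUNC    GLOBAL DEFAULT   11 cblas_example_b
"""

if __name__ == "__main__":
    only_a, only_b = compare(SAMPLE_A, SAMPLE_B)
    print("only in A:", sorted(only_a))
    print("only in B:", sorted(only_b))
```

Using sets also avoids counting a symbol twice when it appears in more than one symbol table.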
$ ldd /usr/lib/x86_64-linux-gnu/libcblas.so.3
        linux-vdso.so.1 (0x00007ffcb5720000)
        libatlas.so.3 => /lib/x86_64-linux-gnu/libatlas.so.3 (0x00007fd9b27ee000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fd9b25c6000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fd9b24df000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fd9b2bae000)
I _guess_ this situation would be mostly fine since libblas has enough of the CBLAS symbols to prevent location of the wrong libcblas next to it by the meson search.
Quick followup regarding netlib splits. Debian only recently folded libcblas into libblas, as
https://lists.debian.org/debian-devel/2019/10/msg00273.html
notes. Not that long ago … esp. considering Debian stable. Not sure when this appeared. And of course numpy is the point where things were broken:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=913567
I'm now looking into how Debian actually produces a combined BLAS+CBLAS from netlib, as we're using the CMake build system and I do not see an option to do that. The upstream build produces separate libraries, so I assumed that is a case that one should handle.
Yes, Debian made quite a mess there. We do have a CI job for Netlib on Debian though in NumPy, and indeed it works fine because of the CBLAS symbols already being found inside libblas.so.
But it is a demonstration that any guess that libcblas belongs to libblas just from the name may be wrong in real-world installations.
Letting this sink in some more, I realized the more fundamental reason for treating them together: when we express dependencies, we do so for a *package* (i.e., a packaged version of some project), not for a specific build output like a shared library or a header file. In this case it's a little obscured by BLAS being an interface and the libblas/libcblas mix, but it's still the case that we're looking for multiple installed things from a single package. So we want "MKL" or "Netlib BLAS", where MKL is not only a shared library (or set of them), but for example also the corresponding header file (mkl_cblas.h rather than cblas.h). The situation you are worrying about is basically that of an unknown package with a set of shared libraries and headers that have non-standard names. I'd say that that's then simply a non-supported package, until someone comes to report the situation and we can add support for it (or file a bug against that package and convince the authors not to make such a mess). I think this point is actually important, and I hope you can appreciate it as a packager - we need to depend on packages (things that have URLs to source repos, maintainers, etc.), not random library names.
Here, it might be a strange installation remnant.
$ dpkg -L libatlas3-base
/.
/usr
/usr/lib
/usr/lib/x86_64-linux-gnu
/usr/lib/x86_64-linux-gnu/atlas
/usr/lib/x86_64-linux-gnu/atlas/libblas.so.3.10.3
/usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3.10.3
/usr/lib/x86_64-linux-gnu/libatlas.so.3.10.3
/usr/lib/x86_64-linux-gnu/libcblas.so.3.10.3
/usr/lib/x86_64-linux-gnu/libf77blas.so.3.10.3
/usr/lib/x86_64-linux-gnu/liblapack_atlas.so.3.10.3
/usr/share
/usr/share/doc
/usr/share/doc/libatlas3-base
/usr/share/doc/libatlas3-base/README.Debian
/usr/share/doc/libatlas3-base/changelog.Debian.gz
/usr/share/doc/libatlas3-base/copyright
/usr/lib/x86_64-linux-gnu/atlas/libblas.so.3
/usr/lib/x86_64-linux-gnu/atlas/liblapack.so.3
/usr/lib/x86_64-linux-gnu/libatlas.so.3
/usr/lib/x86_64-linux-gnu/libcblas.so.3
/usr/lib/x86_64-linux-gnu/libf77blas.so.3
/usr/lib/x86_64-linux-gnu/liblapack_atlas.so.3
An eclectic list of redundant libraries. But as it is, this is a case where
1. libatlas has BLAS, libcblas has corresponding CBLAS (only referencing ATLAS-specific ABI).
2. libcblas does _not_ work together with libblas (uses ATL_ symbols).
I'll note that the ATL_ symbols prefixes are an internal implementation detail; we don't prefix our own symbols when calling BLAS APIs. And there is no way Debian lets you install both of Netlib BLAS and ATLAS (it's one or the other, through `update-alternatives`), so this can't really go wrong in the way you imagine I think.
You might consider this full of packaging errors/weirdness, but it's there.
Also, I note that
1. Netlib libblas.so does contain BLAS+CBLAS symbols.
2. Netlib liblapack.so does only contain LAPACK symbols.
3. Netlib liblapacke.so exists and provides LAPACKE symbols on top.
So from my full scenario, only the separate libcblas is missing. This hybrid is somewhat unfortunate … but, well, this liblapacke, you actually could combine with different liblapack implementations.
In any case, I would feel more comfortable if I knew that the build is not locating a library that is there but that it should not use.
We actually do have a CI job for ATLAS on Debian too: https://github.com/numpy/numpy/blob/36eefeabadeb4ebe07b4be3704da93a4c126b6ca.... Note that Debian in addition decided to use custom pkg-config names, so it works with `-Dblas=blas-atlas -Dlapack=lapack-atlas`.
(b) just don't do this as a packager, and (c) if you really must, you can still make it work by providing a custom `cblas.pc`
Well, we can also hack around in your meson build files ;-) The idea is to also prepare custom .pc files for various cases and be able to tell builds to use them. It feels unnecessary to have to prepare another directory with a custom cblas.pc for pkg-config to find during the build, compared to just telling the build to use cblas-foobarbaz as the name of a file installed in the prefix alongside the others.
(see http://scipy.github.io/devdocs/building/blas_lapack.html#using-pkg-config-to... ).
I'm confused … this page doesn't mention blas vs cblas. You mean that, in addition to the blas.pc used there, I need to put a cblas.pc (or libblas.so) beside it? This background magic is bad. There should be no implicit guessing, ever, for a controlled packager build.
Yes, it's not ideal. But again, we're talking about a hypothetical situation, where you already did something bad as a packager, and I'm pointing out a workaround. If you're renaming build outputs from an upstream package from their defaults (e.g. libcblas.so -> libcblas-foobar.so), you are breaking things and get to keep the pieces. The correct name for Netlib BLAS is `libcblas.so` and `cblas.pc`. So provide `cblas.pc`, and you unbreak yourself.
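To make the "provide `cblas.pc`" workaround concrete, here is a minimal sketch that renders such a file. It assumes a conventional `lib`/`include` layout under the prefix. The library name `cblas-foobar` (borrowed from the hypothetical rename in this thread), the prefix `/opt/pkg`, and the version string are all placeholders rather than values from any real package.

```python
def render_pc(name: str, libname: str, prefix: str, version: str) -> str:
    """Render a minimal pkg-config file for a CBLAS library.

    Assumes headers in <prefix>/include and libraries in <prefix>/lib.
    """
    return (
        f"prefix={prefix}\n"
        "libdir=${prefix}/lib\n"
        "includedir=${prefix}/include\n"
        "\n"
        f"Name: {name}\n"
        "Description: CBLAS interface (custom pkg-config file)\n"
        f"Version: {version}\n"
        # Doubled braces keep the pkg-config variable literal in the f-string.
        f"Libs: -L${{libdir}} -l{libname}\n"
        "Cflags: -I${includedir}\n"
    )


if __name__ == "__main__":
    # Hypothetical values: a renamed libcblas-foobar.so installed in /opt/pkg.
    print(render_pc("cblas", "cblas-foobar", "/opt/pkg", "0.0.0"), end="")
```

The output would be saved as `cblas.pc` in a directory listed in `PKG_CONFIG_PATH`, so that a lookup for `cblas` resolves to the renamed library.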
We don't use LAPACKE, so that one can be ignored.
I had the impression that you are upstreaming something to meson about BLAS (and LAPACK) detection. So I am thinking about the full picture for all projects using meson with BLAS dependencies.
Sure, I was talking about NumPy/SciPy. In general, the LAPACKE situation is pretty much identical to the CBLAS one.
I see that combining any subset of BLAS, CBLAS, LAPACK, LAPACKE into a common library is a thing in practice. So I see how you'd want some special detection code to check if just loading a dependency called 'blas' gives you all APIs you need or if you need to add dependencies 'cblas', 'lapack', or 'lapacke', even.
So scipy locates cblas based on the name blas, but doesn't really use cblas.
It does in a few places, like SuperLU.
You are confusing me. I see no cblas usage anywhere in SciPy 1.11.4, did that come later?
...
No cblas in sight. Also, the SuperLU part specifically was what prompted this whole renewed discussion. We noticed breakage since scipy with -Dblas=cblas failed to link _superlu.so to libblas. This only hit on a depending package trying to use it.
I'm sorry, I was just wrong here. Looking closer, the C code in SuperLU that I thought was using CBLAS is actually manually handling Fortran compiler symbol handling and calling the mangled Fortran symbols (it's probably not very robust to unknown compilers): https://github.com/scipy/scipy/blob/main/scipy/sparse/linalg/_dsolve/SuperLU... .
The above command returns _no_ output when I build with -Dblas=cblas. The build works fine, but the resulting shared objects are _all_ left wanting an actual BLAS linkage. I provided a package that offers CBLAS and would pull in BLAS indirectly. Scipy build proceeds with that, but it tricks me with defunct algebra all around, including _superlu.so, where no object makes use of libcblas.
You said scipy isn't using the custom meson stuff yet. I see two occasions of BLAS use in numpy.
$ find /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/ -name '*.so'|while read f; do readelf -d $f|grep blas && echo "^^^^ $f ^^^^"; done
 0x0000000000000001 (NEEDED)             Shared library: [libcblas.so.3]
^^^^ /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/core/_multiarray_umath.so ^^^^
 0x0000000000000001 (NEEDED)             Shared library: [libblas.so.3]
^^^^ /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/linalg/_umath_linalg.so ^^^^
So it needs both CBLAS and BLAS. (Btw. I rather like how I can tell that apart just from looking at the used libs, not the symbols themselves …;-)
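The same check can be scripted so that extension modules with no BLAS linkage at all are reported too, instead of silently producing no output. The sketch below assumes `readelf` from binutils is on `PATH`. It matches only the `(NEEDED)` tag and the bracketed name, because the label text between them depends on the locale. The helper names `needed_libs` and `blas_linkage` are made up for this example.

```python
import re
import subprocess
from pathlib import Path

# Matches a NEEDED entry regardless of the localized label before the bracket.
NEEDED_RE = re.compile(r"\(NEEDED\).*\[([^\]]+)\]")


def needed_libs(readelf_dynamic_output: str) -> list[str]:
    """Extract library names from the NEEDED entries of `readelf -d` text."""
    libs = []
    for line in readelf_dynamic_output.splitlines():
        match = NEEDED_RE.search(line)
        if match:
            libs.append(match.group(1))
    return libs


def blas_linkage(root: str) -> dict[str, list[str]]:
    """Map every *.so under root to its NEEDED libraries containing 'blas'.

    Files with an empty list link to no BLAS-like library directly.
    """
    report = {}
    for so in sorted(Path(root).rglob("*.so")):
        out = subprocess.run(
            ["readelf", "-d", str(so)],
            capture_output=True, text=True, check=True,
        ).stdout
        report[str(so)] = [lib for lib in needed_libs(out) if "blas" in lib]
    return report
```

Note that this only inspects direct `NEEDED` entries. A module that reaches libblas indirectly, for example through liblapack, will show an empty list here even though it resolves its symbols at load time.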
Building again with -Dblas=cblas … FRICK!
$ find /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/ -name '*.so'|while read f; do readelf -d $f|grep blas && echo "^^^^ $f ^^^^"; done
 0x0000000000000001 (NEEDED)             Shared library: [libcblas.so.3]
^^^^ /data/projekte/pkgsrc/work/math/py-numpy/work/.destdir/data/pkg/lib/python3.11/site-packages/numpy/core/_multiarray_umath.so ^^^^
Now _umath_linalg.so is left with undefined BLAS symbols, just like scipy before. I need to hot-fix that in pkgsrc, just after releasing the stable branch. Bad, bad, bad. I really took the message from you before that -Dblas=cblas is what I need to do. Now we ship broken numpy.
Got to fix that right away.
(update from second email:)
Luckily, it isn't. It links to liblapack, which brings in libblas. So that's fine, I suppose. The test output looks the same for both after installing and then testing. So this was some interference.
I hope
=========================== short test summary info ============================
FAILED .destdir/data/pkg/lib/python3.11/site-packages/numpy/distutils/tests/test_log.py::test_log_prefix[info]
FAILED .destdir/data/pkg/lib/python3.11/site-packages/numpy/distutils/tests/test_log.py::test_log_prefix[debug]
2 failed, 36807 passed, 1603 skipped, 1303 deselected, 33 xfailed, 1 xpassed, 62 warnings in 410.43s (0:06:50)
is OK. Those failures do sound like something basic or easy to fix, though. Log prefixing shouldn't be so complicated.
Yep, that should be okay - the two test failures should be pretty harmless.
Numpy is happy with libcblas bringing libblas in and calls it blas, but really uses the cblas interface. This looks a bit confusing.
It is happy, but not for a good result. The build just claims that it works, but produces defunct _umath_linalg. Or is this hidden in practical use by loading _multiarray_umath.so before? I'm no python user myself. What would be a minimal test for this?
Running `python -c "import numpy.linalg; numpy.test()"` should do it and is pretty fast (needs `pytest` and `hypothesis` installed). If you need to avoid those test packages, maybe a simple `python -c "import numpy"` would already be enough, since that does call at least one linear algebra routine.
I may be able to add something to the docs, but there should be no confusion. We need "BLAS with CBLAS symbols". CBLAS should simply not be considered as a completely separate dependency (it's one library with two interfaces). I can't think of a reason to do so, nor of a build system that does it like that. There is no FindCBLAS in CMake for example, it handles it transparently within FindBLAS.
And it's unknown to me right now how CMake really handles CBLAS, as I cannot recall a CMake-using project right now that relies on CBLAS.
The asymmetry with LAPACKE in Debian confuses me, too. What do you consider standard packaging? BLAS+CBLAS together in libblas, but LAPACKE as a separate thing? Inconsistent. I guess I could reduce pkgsrc-specific breakage by merging libcblas into libblas. But keep liblapacke separate? Would that confuse more or less? Should it be all in one or all separate?
Ideally, each packager for any BLAS library should run the default upstream build commands, install into a prefix and bundle up the result in a package. So what is standard == whatever configure/make produces for Netlib BLAS or OpenBLAS, without renaming or deleting anything.
Do you consider BLAS+CBLAS to be one thing, LAPACK+LAPACKE the other? All or nothing?
Yes indeed. At least, I cannot think of any package/project that doesn't provide CBLAS with BLAS. While BLAS and LAPACK clearly are separate from each other, and there are projects that provide only one (e.g., BLIS is BLAS-only).
Wouldn't the world be much simpler with
dependency(blas)
dependency(cblas)
dependency(lapack)
dependency(lapacke)
No, I don't think so.
as requests for the specific API mentioned there, coming as separate pkg-config modules by meson default behaviour if they're present, but some logic that checks first if whatever provided blas (-Dblas=foobar) also provides lapack and hence dependency(lapack) is fulfilled already? Otherwise, look for lapack or whatever -Dlapack=baz was set to. Or is this already what is happening? I didn't have time to read up on everything, perhaps.
Yes, pretty much. E.g., for MKL and Accelerate we know for sure that they contain LAPACK symbols, so we don't even check. For OpenBLAS we know it's the default to include LAPACK but it can be built without, so we need to check symbols, and if they're missing we look for another library.
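The decision described here can be written down as a small function. This is only an illustrative sketch of the logic as stated in this thread, not Meson's or NumPy's actual detection code. The provider table, the function name, and the symbol-probe callback are assumptions made up for the example.

```python
from typing import Callable

# Assumption: tables built only from what this thread states.
ALWAYS_HAS_LAPACK = {"mkl", "accelerate"}  # known to contain LAPACK symbols
MAY_HAVE_LAPACK = {"openblas"}             # includes LAPACK by default, but not always


def needs_separate_lapack(blas_name: str,
                          has_lapack_symbols: Callable[[], bool]) -> bool:
    """Decide whether to go looking for another LAPACK library.

    has_lapack_symbols is a hypothetical probe (e.g. a link test) that is
    only called when the answer is not known in advance.
    """
    name = blas_name.lower()
    if name in ALWAYS_HAS_LAPACK:
        return False                      # no symbol check needed
    if name in MAY_HAVE_LAPACK:
        return not has_lapack_symbols()   # check, fall back if missing
    return True                           # e.g. BLIS is BLAS-only


if __name__ == "__main__":
    print(needs_separate_lapack("mkl", lambda: False))
    print(needs_separate_lapack("openblas", lambda: False))
```

Passing the probe as a callback keeps the expensive check lazy, so it runs only for providers whose contents depend on how they were built.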
If you do dependency(blas with cblas) with modified/future meson … can I override what you try if my -Dblas does not contain cblas, but another library I can provide would?
Yes, the search order and which packages are searched for is fully customizable. But you should be good with the default `-Dblas=auto`. If not, I'd like to understand why.
And I need to ponder if I leave it at -Dblas=$CBLAS_PC for pkgsrc now. It's somewhat wrong, but also more correct, as NumPy _really_ means to use CBLAS API, not BLAS.
My opinion is that this is not right. It may work with pkg-config in your case, but I don't think it'll be robust for the fallback detection that is used when pkg-config is not installed.
See above. It's broken.
I assume that it actually was fine, given your follow-up email about the test suite having only 2 failures. So I'll skip replying to this part.
[...] Meson using CMake behind the scenes is also rather surprising. CMake configs are a nuisance when they take the focus away from pkg-config files … and a package that doesn't actually use cmake for building is normally not getting cmake in the build sandbox in pkgsrc. Maybe that has to be investigated if some packages really require meson to use cmake.
It's a fallback detection method to use CMake; it can be disabled indeed.
The attached output of numpy.test() has 2 failures. I don't know if they are related to BLAS. Re-doing the test with a properly linked numpy now… if they persist — how bad is that? Are those known failures?
They're not a problem, it's a minor thing that's not used at runtime.
OK … lots of errors regarding f2py in the test environment. Summary:
8 failed, 36022 passed, 1603 skipped, 1303 deselected, 33 xfailed, 1 xpassed, 62 warnings, 779 errors in 441.09s (0:07:21)
With the bad linkage, it is:
2 failed, 36807 passed, 1603 skipped, 1303 deselected, 33 xfailed, 1 xpassed, 62 warnings in 404.95s (0:06:44)
Maybe there is interference with the installed numpy. I cannot say if there is anything related to the missing libblas linkage. I'd presume lots of actual tests would fail if that had an effect? How many tests need working _umath_linalg.so?
If something was really wrong, you'd get 100s to 1000s of failures/errors. I think you're fine here.

Cheers,
Ralf