[Cython] patch for #655

Thu Jun 27 20:05:48 CEST 2013

On Thu, Jun 27, 2013 at 10:25 AM, Felix Salfelder <felix at salfelder.org> wrote:
> On Thu, Jun 27, 2013 at 09:23:21AM -0700, Robert Bradshaw wrote:
>> > explicit dependency tracking would imply "manual". which is painful and
>> > error-prone. without running gcc -M (with all flags) you cannot even
>> > guess the headers used transitively. I haven't found a gcc -M call
>> > within the cython source code.
>>
>> Why would it be needed?
>
> well it is not. if I can use something else to track dependencies
> (like autotools), something else takes care of gcc -M. but this now also
> needs to call cython with -M, to know when cython needs to be called
> again.

And you're planning on calling cython manually, cutting distutils out
of the loop completely?

>> > It's still just that "cython does not track (all) build dependencies".
>> > but let's make a short story long:
>
> I'm probably wrong here, and it's that other tool, "distutils", that
> would be responsible. anyhow, it's cython I want to write out
> dependencies.
>
>> > look at /src/module_list.py within the sage project. it contains lots of
>> > references to headers at hardwired paths. these paths are wrong in most
>> > cases, and they require manual messing with build system internals
>> > *because* cythonize does not (can not?) keep track of them.
>>
>> Ah, I know a bit more here. module_list.py is structured so because it
>> grew up organically by people with a wide range of programming
>> backgrounds and one of the explicit goals of cythonize was (among
>> other things) to remove the need for such explicit and error-prone
>> declarations. module_list.py has not been "simplified" yet because it
>> was a moving target (I think it was rebased something like a dozen
>> times over a period of about a year before we decided to just get
>> cythonize() in and do module_list cleanup later).
>>
>> It should be entirely sufficient, even for sage.
>
> sage currently uses hardwired paths for all and everything. in
> particular for header locations. it works right now, but the plan is to
> support packages installed to the host system.

I don't see how that would change anything.

> and: sage is just *my* example, it wasn't the original reason for opening
> #655.
>
>> > building with make (read: autotools) just works the way it always did
>> > (+ some obvious quirks that are not currently included within upstream
>> > autotools) -- after patching cython.
>>
>> Can you explain? Are you saying you can type
>>
>>     cython -M *.pyx
>>     make
>
> no. the input for autotools contains a list of things, that you want.
> for example foo.so. now it creates makefiles that implement the rules
> that achieve this. for example foo.so will be built from foo.c, from
> foo.pyx (if foo.pyx exists, of course).
>
> deep down in the rules, the cython -M (and gcc -M) call just does the
> right thing without you even noticing.
>
>> > (i know, that many people hate autotools, and i don't want to start a rant
>> > about it, but it would be better for everybody if
>> > a) make/autotools was taken seriously
>> > b) the missing functionality will be implemented into cython(ize) some
>> >    day, start with dependencies, then port/reimplement the AC_* macros )
>>
>> One of the goals of Cythonize is to *only* handle the pyx -> c[pp]
>> step, and let other existing tools handle the rest.
>
> That's exactly what i want to do. use cython to translate .pyx->.c[pp]
> and nothing else. the existing tool (make) needs to know when cython
> needs to be called, so it has to know the dependency chain.

It also needs to know how cython needs to be called, and then how gcc
needs to be called (or, would you invoke setup.py when any .pyx file
changes, in which case you don't need more granular rules).

In general, I'm +1 on providing a mechanism for exporting dependencies
for tools to do with whatever they like. I have a couple of issues
with the current approach:

(1) Doing this on a file-by-file basis is quadratic time (which for
something like Sage takes unbearably long as you have to actually read
and parse the entire file to understand its dependencies, and then
recursively merge them up to the leaves). This could be mitigated (the
parsing at least) by writing dep files and re-using them, but it's
still going to be sub-optimal. The exact dependencies may also depend
on the options passed into cythonize (e.g. the specific include
directories, some dynamically computed like numpy.get_include()).

(2) I don't think we need to co-opt gcc's flags for this. A single
flag that writes its output to a named file should be sufficient. No
one expects to be able to pass gcc options to Cython, and Cython can
be used with more C compilers than just gcc.

(3) The implementation is a bit hackish, with global dictionaries and
random printing.
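
To make (1) and (2) a bit more concrete, here is a minimal sketch, not
what Cython or cythonize actually does. It assumes some function that
returns the direct dependencies of one file; that parser, and the names
make_cached_parser, all_deps, format_rule and write_depfile, are all
hypothetical. Each file is parsed at most once, the transitive set is
collected per target without global state, and the result goes to a
single named file as a make-style rule.

```python
from typing import Callable, Dict, List, Set

# Hypothetical type: maps a file path to its *direct* dependencies.
DepParser = Callable[[str], List[str]]


def make_cached_parser(parse: DepParser) -> DepParser:
    """Wrap a direct-dependency parser so each file is parsed only once."""
    cache: Dict[str, List[str]] = {}

    def cached(path: str) -> List[str]:
        if path not in cache:
            cache[path] = list(parse(path))
        return cache[path]

    return cached


def all_deps(source: str, direct_deps: DepParser) -> List[str]:
    """Collect the transitive dependencies of `source`.

    The `seen` set keeps the walk finite even if files reference
    each other in a cycle.
    """
    seen: Set[str] = set()
    stack = [source]
    while stack:
        current = stack.pop()
        for dep in direct_deps(current):
            if dep != source and dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return sorted(seen)


def format_rule(target: str, source: str, deps: List[str]) -> str:
    """Format one make-style rule: `target: source dep1 dep2 ...`."""
    return "%s: %s\n" % (target, " ".join([source] + deps))


def write_depfile(target: str, source: str, deps: List[str], depfile: str) -> None:
    """Write the rule to an explicitly named file instead of printing it."""
    with open(depfile, "w") as f:
        f.write(format_rule(target, source, deps))
```

The cache only removes the repeated parsing. The walk itself is still
done once per target, and nothing here accounts for dependencies that
change with the options passed in (include directories and the like).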

- Robert