[Python-Dev] Split unicodeobject.c into subfiles?

martin at v.loewis.de martin at v.loewis.de
Sun Oct 7 23:34:10 CEST 2012


Quoting Victor Stinner <victor.stinner at gmail.com>:

>> The amount of code will not be reduced, but now you also need to guess what
>> file some piece of functionality may be in.
>
> How do you search a piece of code?

I type /<pattern> in vim, or Ctrl-s (incremental search) in Emacs.

> If you search for a function by its
> name, it does not matter in which file it is defined if you use an IDE or
> vim/emacs with a correct configuration. For example, I type ":tag
> PyUnicode_Format" to go to the PyUnicode_Format() function.

I don't like tag files. I want to search in all source code (including
comments and strings), and I want to do a substring search (not sure
whether that is supported in tag files).

>> Instead of having my text editor
>> (Emacs) search in one file, it will have to search across multiple files -
>> but not across all open buffers, but only some of them (since I will have
>> many other source files open as well).
>
> Does it mean that it would be more practical to merge all C files into
> one unique file?

That would be extreme, of course. It may cause problems with the
responsiveness of the editor, and with compile times; it may also cause
problems with merging in version control. In addition, there might
be naming conflicts which make it impractical (e.g. many structures
containing the same tp_* struct slots, so when you search for tp_new,
for example, you would get too many hits).

But in principle, I don't mind maintaining *very* large source files.
unicodeobject.c isn't really *that* large.


>> What is it that you want to do that can be done easier if it's multiple
>> files?
>
> Another problem with huge files is to handle "dependencies" with
> static functions. If the function A calls the function B which calls
> the function C, you have to order A, B and C "correctly" if these
> functions are private and not declared at the top of the file.
>
> If functions are grouped correctly, you just have to add the function
> to the right file, or reorder the files.

I don't understand. Do you envision that A, B, and C are in separate files?
If so, they cannot all be static anymore, unless you still combine all files
with #include directives, or unless you still put them all in the same file.
I don't see how multiple files give any improvement. It seems to make matters
worse:
- if you put A, B, C in the same file, you have the same issue that you
   had when unicodeobject.c was a large file - you have to order them
   "correctly".
- if you put them in different files, it gets worse: you need to place
   A in a file that gets included after the file that has B, even if it
   would be more logical to put them in the reverse order.
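
To make the ordering issue concrete, here is a minimal sketch of the
single-file case. The names a, b, c and their bodies are hypothetical and
are not taken from unicodeobject.c. It assumes the static functions are
declared at the top of the file, which is the case the quoted text excludes;
with such forward declarations, the definitions can appear in whatever order
reads best, regardless of who calls whom:

```c
/* Forward declarations for file-private (static) helpers.
   Once these are in place, the definitions below may appear in any order. */
static int a(int x);
static int b(int x);
static int c(int x);

/* a() is defined first, although it calls b(), which is defined later. */
static int a(int x)
{
    return b(x) + 1;
}

/* b() calls c(), which is also defined further down. */
static int b(int x)
{
    return c(x) * 2;
}

/* c() is the leaf of the call chain. */
static int c(int x)
{
    return x + 10;
}
```

Without the three declarations at the top, the definitions would have to be
written in the order c, b, a. Splitting them across files does not remove
that constraint; it either forces the functions to stop being static, or
moves the ordering problem into the order of the #include directives.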

> I also prefer short files beacuse it's easier to review/audit a small
> file. My brain cannot store too many functions :-)

This is what I don't understand. Why do you have to remember all functions
when reviewing or auditing a file? You can safely ignore all functions
but the one you are reviewing - whether the other functions are in a different
file or in the same file.

Why can you ignore the functions only if they are stored in a different
file?

Regards,
Martin




