Guido van Rossum firstname.lastname@example.org:
I have now re-read that discussion; it's in the archives starting with this message:
As have I. All the stuff in this thread was before the checkin; you were in fact mistaken about the timing of most of the discussion.
There were several suggestions to merge it with fileinput and some suggestions to restructure it. You seem to have ignored these except the criticism on the name "ccframe" (by choosing an even worse name :-).
I did not ignore these suggestions (one that I took was Greg Ward's suggestion that, after all, just throwing an exception was the right thing). And I was in fact planning to merge this thing with fileinput.
Then I looked at what would have to be done to the documentation of fileinput -- in fact, I edited together a combined fileinput documentation page. The result was a mess that convinced me that this does indeed need to be a separate module. There wasn't enough coherence between the old fileinput stuff and my entry points to even make the *documentation* look like a logical unit, let alone the code.
What is going on here? Is it possible that you are mistaken about the timing of the checkin, and that what you thought was discussion afterwards was discussion before? Or am I somehow missing listmail?
Your mail was probably broken -- it wouldn't be the first time :-(.
In the event, my mail was not broken.
There are two posts in the archives that start with a quote from the checkin mail:
Right...one of which completely misses the point by suggesting that this is a filter framework, and the other one of which is a "me too" basically addressing the naming issue. Guido, you are yourself *notorious* for dismissing naming issues with "that's unimportant" and "we can fix it later". How can you criticize me for doing likewise?
As for process issues...I agree that we need better procedures and criteria for what goes into the library. As you know I've made a start on developing same, but my understanding has been that *you* don't think you'll have the bandwidth for it until 2.2 is out.
That's not an excuse for you to check in random bits of code.
So what, exactly, makes this 'random'?
That, Guido, is not a rhetorical question. We don't have any procedures. We don't have any guidelines. We don't have any history of anything but discussing submissions on python-dev before somebody with commit access checks them in. If no -1 votes and the judgment of somebody with commit privileges who has already got a lot of stuff in the library is not sufficient, *what is*?
I'm not trying to be difficult here, but this points at a weakness in our way of doing things. I want to play nice, but I can't if I don't know your actual rules. I don't know what *would* have been sufficient if what I did was not. I don't think anyone else does, either.
Some comments on the code:
This is the sort of critique I was looking for two weeks ago, not a bunch of bikeshedding about how the thing should be named.
- A framework like this should be structured as a class or set of related classes, not a bunch of functions with function arguments. This would make the documentation easier to read as well; instead of having a bunch of functions you pass in, you customize the framework by overriding methods.
Yes, I thought of this. There's a reason I didn't do it that way. Method override would work just fine as a way to pass in the filename transformer, but not the data transformer.
The problem is this: the driver or "go do it" method of your hypothetical class (the one you'd pass sys.argv[1:]) can't know which overridden method to call in advance, because which one is right would depend on the argument signature of the hook function -- does it take filelike objects, does it take two strings, etc. Actually it's worse than that; two of the cases (the sponge and the line-by-line filtering) aren't even distinguishable by type signature.
So, what the driver function could do is step through three method names looking to see which if any is overridden in the user-created subclass. But would that really be a gain in clarity over having three functions in the module? I'm willing to listen if you think the answer is "yes" and want to tell me why, but it didn't seem so to me.
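For concreteness, here is a minimal sketch of that "step through three method names" dispatch. Everything in it is an assumption made for illustration: the class name FrameworkBase, the three hook names, and their argument lists are hypothetical and are not taken from the module under discussion.

```python
class FrameworkBase:
    """Hypothetical base class; these names are illustrative only."""

    # Candidate hooks, in the order the driver probes them.
    HOOK_NAMES = ("filter_files", "filter_text", "filter_line")

    def filter_files(self, infile, outfile):
        # Hook working on two file-like objects.
        raise NotImplementedError

    def filter_text(self, name, text):
        # "Sponge" hook: receives the whole input as one string.
        raise NotImplementedError

    def filter_line(self, name, line):
        # Line-by-line hook: receives one line at a time.
        raise NotImplementedError

    def find_hook(self):
        """Return the name of the first hook the subclass overrides."""
        for hook in self.HOOK_NAMES:
            # An inherited method is the very same function object as the
            # base-class one; an override is a different object.
            if getattr(type(self), hook) is not getattr(FrameworkBase, hook):
                return hook
        raise TypeError("subclass must override one of %s"
                        % ", ".join(self.HOOK_NAMES))


class Upcase(FrameworkBase):
    # Overrides only the line-by-line hook.
    def filter_line(self, name, line):
        return line.upper()
```

With these definitions, Upcase().find_hook() returns "filter_line". The probing loop is exactly the extra machinery in question: the user gains method override, but the driver still has to guess which of three entry points was meant.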
There's something else I could have done. I could have required that the hook function use specific unique formal argument names in each of the three cases and then had the driver code use inspect to dispatch among them -- but that seemed even more klugey.
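A rough sketch of what that inspect-based dispatch could look like, written against a current Python (inspect.signature did not exist when this thread took place). The parameter-name convention and the labels "files", "sponge", and "line" are invented here purely for illustration.

```python
import inspect

# Hypothetical convention: each kind of hook is recognized solely by the
# names of its formal parameters.
SIGNATURES = {
    ("infile", "outfile"): "files",   # two file-like objects
    ("name", "text"): "sponge",       # whole input as one string
    ("name", "line"): "line",         # one line at a time
}


def classify_hook(hook):
    """Return which kind of hook this is, judged by its parameter names."""
    params = tuple(inspect.signature(hook).parameters)
    try:
        return SIGNATURES[params]
    except KeyError:
        raise TypeError("unrecognized hook signature: %r" % (params,))


def upcase(name, line):
    return line.upper()
```

Here classify_hook(upcase) returns "line". The fragility is visible at once: renaming a parameter from line to text silently changes how the hook is driven.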
Maybe there is a really elegant and low-overhead method of wrapping these functions in a class, and I have just not found it yet. But if so, it is not (as you appear to believe) for lack of looking. If you have an insight that I have missed, I will cheerfully accept instruction on this issue.
- The name "compilerlike" is a really poor choice (there's nothing compiler-like in the code).
No, there isn't. It's called "compilerlike" because it's a framework for making compilerlike interfaces out of functions. But I'm not attached to that name; CompilerFramework or something of that sort would be fine.
- I would like to see failure to open the file handled differently (so the caller can issue a decent error message for inaccessible input files without having to catch all IOError exceptions), but again this is a policy issue that should be customizable.
Originally the code fielded file I/O errors by complaining to stderr and then exiting. At least two respondents argued that it should simply throw an exception and let the caller do policy, and upon reflection I came to agree with this (this is one of those suggestions you thought I was ignoring).
I realize it's tempting to try and embed a range of policy options in the module to save time, but unless we can have reasonable confidence that they will cover all important cases I don't judge the complexity overhead to be worth it. Again, I am open to instruction on this.
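To illustrate the "throw an exception and let the caller do policy" arrangement, here is a minimal sketch. The function names transform_file and main are hypothetical, and the policy shown (report to stderr, count the failure, carry on) is only one of many a caller might choose.

```python
import sys


def transform_file(path, hook):
    """Hypothetical driver step: no error policy here, exceptions propagate."""
    with open(path) as infile:
        return hook(infile.read())


def main(paths, hook):
    """Caller-side policy: report unreadable inputs and keep going."""
    failures = 0
    for path in paths:
        try:
            sys.stdout.write(transform_file(path, hook))
        except OSError as err:  # IOError is an alias of OSError in Python 3
            sys.stderr.write("%s: %s\n" % (path, err.strerror))
            failures += 1
    return failures
```

A caller that would rather abort on the first bad file simply omits the try block; nothing in the driver has to change.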
- The policy of not writing the output if it's identical to the input should be optional. There are contexts (like when the tool is invoked by a Makefile) where not writing the output could be harmful: if you touch the input without changing it, Make would invoke the tool over and over again because the output doesn't get touched by the tool.
Interesting point. A better rule, perhaps, would be to suppress writing of output only if both the content *and* the transformed filename are identical -- that would avoid doing a spurious touch on a no-op modification in place, without confusing Make.
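Stated as code, the proposed rule comes down to a single predicate. This is a sketch only; the function name and argument names are hypothetical.

```python
def should_write(in_name, out_name, in_data, out_data):
    """Skip the write only for a true no-op: same file and same content."""
    return not (in_name == out_name and in_data == out_data)


# In-place run that changed nothing: leave the file (and its mtime) alone.
assert should_write("foo.txt", "foo.txt", "abc", "abc") is False
# Filename transformation requested: write even if the content is identical,
# so that Make sees a fresh output file.
assert should_write("foo.in", "foo.out", "abc", "abc") is True
# Content changed in place: write.
assert should_write("foo.txt", "foo.txt", "abc", "ABC") is True
```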
Moreover, there seem to be some bugs: if the output is the same as the input, the output file is not written even if a filename transformation was requested (making Make even less happy); when a transformation is specified by a string, an undefined variable 'stem' is used. Hasty work, Eric. :-(
I'll take the hit for this; my test framework should have covered that case and didn't, because I was in a hurry to get in before the freeze. However, I know the other cases work because I'm *using* them.
OK, so here's how I see it:
1. I made a minor implementation error with one case; this can be fixed.
2. You were mistaken in believing that (a) there was no discussion or endorsement of the idea beforehand, and that (b) I did not defend or justify the design.
3. Some of the respondents simply missed the point; this thing is *not* a framework for creating filters, and shouldn't be named like one or put in the wrong library bin because of it.
4. There is room for technical debate about the interface design, but no choice I'm aware of that is *obviously* better than three functions -- the class-wrapper approach would have unobvious problems doing the hook function dispatch properly.
5. I was trying to do the right thing, but we sorely lack a useful set of norms for what constitutes `good' vs. `bad' library checkins. I am actively interested in helping solve this problem.