Treating tokenize.Untokenizer as private
I am working through the multiple bugs afflicting tokenize.untokenize, which is described in the tokenize doc and has an even longer docstring. While the function could be implemented as one 70-line function, it happens to be implemented as a 4-line wrapper for a completely undocumented (Untokenizer class with 4 methods. (It is unmentioned in the doc and there are currently no docstrings.) I view the class as a private implementation detail and would like to treat it as such, and perhaps even rename it _Untokenizer to make that clear. The issue arises in #9974. It appears that a fix may require the addition of an instance attribute or .add_whitespace parameter. If there is objection to treating the whole class as private, I would at least like to treat add_whitespace as the private helper that it is. There is no reason to call it directly except for testing. Otherwise, it could just as well have been left inline at the one call site. -- Terry Jan Reedy
On Tue, Feb 18, 2014 at 2:09 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I am working through the multiple bugs afflicting tokenize.untokenize, which is described in the tokenize doc and has an even longer docstring. While the function could be implemented as one 70-line function, it happens to be implemented as a 4-line wrapper for a completely undocumented (Untokenizer class with 4 methods. (It is unmentioned in the doc and there are currently no docstrings.)
I view the class as a private implementation detail and would like to treat it as such, and perhaps even rename it _Untokenizer to make that clear. The issue arises in #9974. It appears that a fix may require the addition of an instance attribute or .add_whitespace parameter. If there is objection to treating the whole class as private, I would at least like to treat add_whitespace as the private helper that it is. There is no reason to call it directly except for testing. Otherwise, it could just as well have been left inline at the one call site.
Is this still an open question, Terry? -eric
On 3/31/2014 2:30 PM, Eric Snow wrote:
On Tue, Feb 18, 2014 at 2:09 PM, Terry Reedy <tjreedy@udel.edu> wrote:
I am working through the multiple bugs afflicting tokenize.untokenize, which is described in the tokenize doc and has an even longer docstring. While the function could be implemented as one 70-line function, it happens to be implemented as a 4-line wrapper for a completely undocumented (Untokenizer class with 4 methods. (It is unmentioned in the doc and there are currently no docstrings.)
I view the class as a private implementation detail and would like to treat it as such, and perhaps even rename it _Untokenizer to make that clear. The issue arises in #9974. It appears that a fix may require the addition of an instance attribute or .add_whitespace parameter. If there is objection to treating the whole class as private, I would at least like to treat add_whitespace as the private helper that it is. There is no reason to call it directly except for testing. Otherwise, it could just as well have been left inline at the one call site.
Is this still an open question, Terry?
It is not currently for #9974 because after consideration of two proposed patches (one part of another issue), I decided that the conditions being guarded against either could not occur or could be checked in the caller, so that no api change was needed. However, I believe the fix for another bug, http://bugs.python.org/issue20387 will require copying the code that correctly formats indents in the old compat method to the newer method, which incorrectly assumes that indents are spaces only. I might end up wishing I could refactor the code. So this may well become a live issue again, and I would still like to know what people think. How do we do code searches (as for use of "Untokenize") these days? -- Terry Jan Reedy
On Mon, Mar 31, 2014 at 1:45 PM, Terry Reedy <tjreedy@udel.edu> wrote:
On 3/31/2014 2:30 PM, Eric Snow wrote:
Is this still an open question, Terry?
It is not currently for #9974 because after consideration of two proposed patches (one part of another issue), I decided that the conditions being guarded against either could not occur or could be checked in the caller, so that no api change was needed.
However, I believe the fix for another bug, http://bugs.python.org/issue20387 will require copying the code that correctly formats indents in the old compat method to the newer method, which incorrectly assumes that indents are spaces only. I might end up wishing I could refactor the code. So this may well become a live issue again, and I would still like to know what people think.
It simply depends on the utility of customizing the default behavior there. I seem to remember subclassing Untokenize for something in a personal project, but I expect doing so wasn't necessary.
How do we do code searches (as for use of "Untokenize") these days?
I believe there was one for which Guido was advocating, but I don't recall it's name. Such a search would be useful though. -eric
On Feb 18, 2014, at 1:09 PM, Terry Reedy <tjreedy@udel.edu> wrote:
While the function could be implemented as one 70-line function, it happens to be implemented as a 4-line wrapper for a completely undocumented (Untokenizer class with 4 methods. (It is unmentioned in the doc and there are currently no docstrings.)
I view the class as a private implementation detail and would like to treat it as such, and perhaps even rename it _Untokenizer to make that clear.
Yes, that would be reasonable. Raymond
participants (3)
-
Eric Snow
-
Raymond Hettinger
-
Terry Reedy