[issue15269] Document dircmp.left and dircmp.right

New submission from Chris Jerdonek <chris.jerdonek@gmail.com>:

The documentation for the filecmp.dircmp class doesn't mention dircmp.left and dircmp.right. Being aware of this up front would make certain simplifications easier to think of. For example, knowing about these attributes opens up the possibility of passing dircmp instances around without having to pass the two paths separately (e.g. in certain recursive algorithms involving dircmp). Knowing this also means you can recover the two paths if using the subdirs attribute (whose values are dircmp instances).

----------
assignee: docs@python
components: Documentation
keywords: easy
messages: 164781
nosy: cjerdonek, docs@python
priority: normal
severity: normal
status: open
title: Document dircmp.left and dircmp.right
versions: Python 3.3

_______________________________________
Python tracker <report@bugs.python.org>
<http://bugs.python.org/issue15269>
_______________________________________

Changes by Chris Jerdonek <chris.jerdonek@gmail.com>:

----------
keywords: +patch
Added file: http://bugs.python.org/file26282/issue-15269-1.patch

Senthil Kumaran <senthil@uthcode.com> added the comment:

Given that we have self.left_list and self.left_only (and self.right_list and self.right_only), I am not sure how adding self.left/self.right is going to add more meaning? It would simply point to the dir1 and dir2 arguments that are being passed. Also, it may not be a helpful feature.

If it is a must, then we can just rename self.left to self._left within the *code*, but otherwise, I do not see any change which is required here. I suggest closing this report as wont-fix.

----------
nosy: +orsenthil

Chris Jerdonek <chris.jerdonek@gmail.com> added the comment:

> Given that we have self.left_list and self.left_only (and self.right_list and self.right_only), I am not sure how adding self.left/self.right is going to add more meaning?

It adds more meaning because you can't construct self.left and self.right from self.left_list, self.left_only, etc. The latter are children inside the two directories (expressed relatively), while the former are the parent directories. So it's strictly different information.

As I said, there are cases where being able to access the left and right directories simplifies the calling code in a way that is not otherwise possible. For example, if you are recursively comparing directories, it is natural to recurse on the dircmp.subdirs attribute, whose values are dircmp instances. The caller did not construct these instances, so there is no other way to obtain the left and right directories that were used to construct those instances. Otherwise, the caller has to construct the dircmp instances manually from common_dirs, which reduces the usefulness of the subdirs attribute and the scenarios in which it can be used.

Secondly, there are cases where it is natural to pass dircmp instances around between functions. If one cannot recover the left and right directories from the dircmp instances, then one would also have to pass the left and right directories to those functions, along with the dircmp instances, or else pass just the left and right directories and then reconstruct the dircmp instances from those arguments, which leads to uglier code.

Thirdly, it is better to reuse existing dircmp instances where possible rather than create new ones, because a new instance has to recalculate its attributes. As the documentation notes, attributes are computed lazily and cached using __getattr__(), so it's better to hold onto existing instances.

If you are still not convinced, I would like the opportunity to provide actual code examples before you close this issue. Then you can tell me if there is an equally simple alternative that does not involve accessing left and right.

I don't see how it hurts to be able to access left and right as attributes and why you would consider concealing this information from the caller behind private attributes. It's a common idiom for constructor arguments to be stored as public attributes. In this case, I think it's worth documenting because the names of the attributes ("left" and "right") used to store the constructor arguments don't match the names of the constructor arguments themselves ("a" and "b").
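The point about attribute names and about recovering paths from subdirs can be checked with a small sketch. The directory layout below (two temporary trees, each with a "sub" directory) is an assumption made up for illustration; only dircmp, its left and right attributes, and subdirs come from the discussion above.

```python
import os
import tempfile
from filecmp import dircmp

with tempfile.TemporaryDirectory() as root:
    # Hypothetical layout: two trees that each contain a "sub" directory.
    dir_a = os.path.join(root, "a")
    dir_b = os.path.join(root, "b")
    os.makedirs(os.path.join(dir_a, "sub"))
    os.makedirs(os.path.join(dir_b, "sub"))

    dcmp = dircmp(dir_a, dir_b)
    # The constructor parameters are named a and b, but the values are
    # stored on the instance as the attributes left and right.
    top_pair = (dcmp.left, dcmp.right)

    # subdirs maps each common directory name to a dircmp instance that
    # the caller never constructed; its two paths are still recoverable.
    sub = dcmp.subdirs["sub"]
    sub_pair = (sub.left, sub.right)
    expected_sub_pair = (os.path.join(dir_a, "sub"),
                         os.path.join(dir_b, "sub"))
```

Here top_pair holds the two paths passed to the constructor, and sub_pair holds the corresponding subdirectory paths, without the caller joining any paths itself.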

Eli Bendersky <eliben@gmail.com> added the comment:

Yes, code samples would help clarify the rationale for this request.

----------
nosy: +eli.bendersky
versions: +Python 3.4 -Python 3.3

Chris Jerdonek <chris.jerdonek@gmail.com> added the comment:

Thanks for taking the time to look at this, Eli. In response to your question, here is one illustrated rationale.

When recursing through a directory using dircmp, it is simplest and cleanest to be able to recurse on the subdirs attribute without having to pass pairs of paths around or reconstruct dircmp instances. You can see this for example in filecmp's own very concise implementation of dircmp.report_full_closure():

    def report_full_closure(self): # Report on self and subdirs recursively
        self.report()
        for sd in self.subdirs.values():
            print()
            sd.report_full_closure()

However, dircmp's reporting functionality is self-admittedly "lousy":

    def report(self): # Print a report on the differences between a and b
        # Output format is purposely lousy
        print('diff', self.left, self.right)
        ...

(Incidentally, observe above that dircmp.report() itself uses the 'left' and 'right' attributes.)

Given the limitations of report_full_closure(), etc, it is natural that one might want to write a custom or replacement reporting function with nicer formatting. When doing this, it would be nice to be able to follow that same clean and concise recursion pattern.
For example:

    from filecmp import dircmp

    def diff(dcmp):
        for sd in dcmp.subdirs.values():
            diff(sd)
        for name in dcmp.diff_files:
            print("%s differs in %s and %s" % (name, dcmp.left, dcmp.right))

    dcmp = dircmp('dir1', 'dir2')
    diff(dcmp)

If one isn't able to access 'left' and 'right' (or if one simply isn't aware of those attributes, which was the case for me at one point), the alternative would be to do something like the following, which is much uglier and less DRY:

    import os
    from filecmp import dircmp

    def diff2(dcmp, dir1, dir2):
        for name, sd in dcmp.subdirs.items():
            subdir1 = os.path.join(dir1, name)
            subdir2 = os.path.join(dir2, name)
            diff2(sd, subdir1, subdir2)
        for name in dcmp.diff_files:
            print("%s differs in %s and %s" % (name, dir1, dir2))

    dcmp = dircmp('dir1', 'dir2')
    diff2(dcmp, dir1='dir1', dir2='dir2')

An example like diff() above might even be worth including in the docs as an example of how subdirs can be used to avoid having to manually call os.path.join(...), etc.

There are also non-recursive situations in which being able to access dircmp.left and dircmp.right makes for cleaner code.
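For readers who want to try the recursion pattern without existing 'dir1' and 'dir2' directories, here is a self-contained variant. It is only a sketch: collect_diffs and write_file are hypothetical helper names, the temporary tree is invented for the demonstration, and the function returns tuples instead of printing so the result can be inspected.

```python
import os
import tempfile
from filecmp import dircmp

def collect_diffs(dcmp):
    """Return (name, left, right) for every differing file, recursively."""
    found = [(name, dcmp.left, dcmp.right) for name in dcmp.diff_files]
    for sd in dcmp.subdirs.values():
        # Each value is already a dircmp for the matching subdirectories.
        found.extend(collect_diffs(sd))
    return found

def write_file(path, text):
    with open(path, "w") as f:
        f.write(text)

with tempfile.TemporaryDirectory() as root:
    dir1 = os.path.join(root, "dir1")
    dir2 = os.path.join(root, "dir2")
    os.makedirs(os.path.join(dir1, "sub"))
    os.makedirs(os.path.join(dir2, "sub"))
    # Different lengths, so the files differ even under a shallow
    # (stat-based) comparison.
    write_file(os.path.join(dir1, "sub", "data.txt"), "left")
    write_file(os.path.join(dir2, "sub", "data.txt"), "right!")

    diffs = collect_diffs(dircmp(dir1, dir2))
    expected = [("data.txt",
                 os.path.join(dir1, "sub"),
                 os.path.join(dir2, "sub"))]
```

The differing file is reported together with the subdirectory pair it was found in, and no path is passed alongside the dircmp instance.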

Eli Bendersky <eliben@gmail.com> added the comment:

Makes sense. I agree that publicly exposing the left/right attributes makes sense. But let's do it properly:

1. Add an example to the documentation
2. Add some tests to Lib/test/test_filecmp.py that verify these attributes behave as expected

In addition, I think it makes a lot of sense to add an optional "stream" argument to the report() and report_*() methods, to at least allow reporting to some custom channel and not solely stdout. The report() method does a lot more than your simple example demonstrates, and it's not very easy to replace its functionality.

Would you like to submit a full patch for this?
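To make the "stream" idea concrete, here is a rough sketch of what reporting to a custom channel could look like from the caller's side. The function report_diffs and its stream parameter are hypothetical and written only for this illustration; they are not part of filecmp, and this is not the design proposed in the thread.

```python
import io
import os
import sys
import tempfile
from filecmp import dircmp

def report_diffs(dcmp, stream=sys.stdout):
    """Hypothetical helper (not a filecmp API): write differing files to stream."""
    for name in sorted(dcmp.diff_files):
        print("%s differs in %s and %s" % (name, dcmp.left, dcmp.right),
              file=stream)
    for _, sd in sorted(dcmp.subdirs.items()):
        report_diffs(sd, stream)

def write_file(path, text):
    with open(path, "w") as f:
        f.write(text)

with tempfile.TemporaryDirectory() as root:
    dir1 = os.path.join(root, "dir1")
    dir2 = os.path.join(root, "dir2")
    os.makedirs(dir1)
    os.makedirs(dir2)
    # Different lengths, so the files differ under a shallow comparison too.
    write_file(os.path.join(dir1, "f.txt"), "a")
    write_file(os.path.join(dir2, "f.txt"), "bb")

    # Report into an in-memory buffer instead of stdout.
    buffer = io.StringIO()
    report_diffs(dircmp(dir1, dir2), stream=buffer)
    output = buffer.getvalue()
    expected = "f.txt differs in %s and %s\n" % (dir1, dir2)
```

Defaulting the parameter to sys.stdout keeps the print-style behaviour for callers that do not care, while tests or logging code can pass any file-like object.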

Chris Jerdonek <chris.jerdonek@gmail.com> added the comment:

Thanks a lot, Eli, and thanks for the suggestions. I would be happy to prepare a full patch.

Regarding the stream argument, I think there are other changes to dircmp that would be more useful (e.g. issue 12932), but I agree that some form of your suggestion makes sense. Would you mind if I created a separate issue for it to discuss there? I have some suggestions on it, and I would be happy to work on it there.

Chris Jerdonek <chris.jerdonek@gmail.com> added the comment:

Attaching a patch to address Eli's requests (1) and (2). Since this patch merely adds documentation and tests for existing functionality, is there any reason why this cannot go into Python 3.3? Thanks.

----------
Added file: http://bugs.python.org/file26516/issue-15269-2.patch

Eli Bendersky <eliben@gmail.com> added the comment:

I think it can go into 3.3 but only if it gets reviewed by another core dev (we're in release candidate stage now). Senthil - can you review the patch together with me?

As for customizing the stream, yes, go ahead and open a new issue for it, and add me there as nosy.

Chris Jerdonek <chris.jerdonek@gmail.com> added the comment:

Sounds good. And for the record, new issue created here: issue 15454

Changes by Chris Jerdonek <chris.jerdonek@gmail.com>:

----------
nosy: +asvetlov

Senthil Kumaran added the comment:

Hi Chris & Eli,

Sorry that I missed this issue. Chris, I agree with your rationale. I can see how having self.left and self.right documented can add value. The diff example was useful. Initially, I did have some doubts about how it could be useful when the args are sent by the user; your example clarified that. Thanks!

As Eli has looked at this one too, I shall commit the patch. Everything is good.

Senthil Kumaran added the comment:

As this is not adding any feature, but just an additional clarification to the existing attribute together with some useful documentation, I believe this can go in 2.7, 3.2 and 3.3. Please correct me if I am wrong here.

----------
versions: +Python 2.7, Python 3.2, Python 3.3 -Python 3.4

Chris Jerdonek added the comment:

Thanks a lot, Senthil. I appreciate it.

Chris Jerdonek added the comment:

Senthil, here is a recent e-mail and response in which I asked about documentation changes and adding tests during feature freeze:

http://mail.python.org/pipermail/python-dev/2012-July/121138.html

Also, here is a recent example of a documentation clarification that went into 2.7, 3.2, and tip:

http://bugs.python.org/issue15554

Thanks again.

Roundup Robot added the comment:

New changeset 7590dec388a7 by R David Murray in branch '3.2':
#15269: document dircmp.left and right, and add tests for them.
http://hg.python.org/cpython/rev/7590dec388a7

New changeset c592e5a8fa4f by R David Murray in branch 'default':
Merge #15269: document dircmp.left and right, and add tests for them.
http://hg.python.org/cpython/rev/c592e5a8fa4f

New changeset e64d4518b23c by R David Murray in branch '2.7':
#15269: document dircmp.left and right.
http://hg.python.org/cpython/rev/e64d4518b23c

----------
nosy: +python-dev

R. David Murray added the comment:

Thanks, Chris.

----------
nosy: +r.david.murray
resolution: -> fixed
stage: -> committed/rejected
status: open -> closed

participants (5): Chris Jerdonek, Eli Bendersky, R. David Murray, Roundup Robot, Senthil Kumaran