Option Read/Write Files without discarding comments

It occurs to me that there are many situations where files are human-authored and can include comments, but when Python reads, modifies and writes those files the comments are lost by default. Some examples include configfile, json and even Python itself, with the special case that docstrings are read without being discarded.

If we wish to generate a commented file (such as an initial version of a config file), then file-type-specific code is needed, but in general if we read in a file the comments are discarded. Usually, of course, this is what we need, but not if we are going to be modifying values and re-writing.

In the case of files that are read and parsed into Python objects, we could quite easily have an option to make comments a docstring on the resulting objects, and a write option to output the docstrings (marked as comments). It would be necessary to have a mechanism for preserving file-level comments that are interleaved with values/objects and those that come after the last value/object.

Of course, in many cases it would be nice to have "find commented object" and "uncomment object" methods, and possibly "comment out object" methods, but I still believe that even without such methods it could be useful to be able to round-trip files that include comments.

Writing this has also sparked a thought: wouldn't it be nice to have a format for Python line comments where the object so commented gets a docstring from the line comment, e.g.

    class StockItem:
        """ Represents an item available in store """
        lineno: str  #! Catalogue Item Number in the format AA-nnnn-AAA
        :

The humble docstring can be so useful, and as far as I know any Python object can carry one, but currently base classes such as int, etc. have read-only __doc__ members.

Of course I am probably going to get told off for 2 related but divergent elements in one thread!

Steve Barnes
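A minimal sketch of the comment loss described above, using the standard library's configparser as one concrete case. The section name, keys and comment text are made up for illustration; the point is only that a read/modify/write cycle drops the comment lines.

```python
import configparser
import io

# Hypothetical config text with human-written comments.
SOURCE = """\
# Connection settings for the staging server
[server]
; port was changed after the firewall update
port = 8080
host = example.org
"""


def round_trip(text: str) -> str:
    """Parse INI text, change one value, and serialise it again."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    parser["server"]["port"] = "9090"
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


result = round_trip(SOURCE)
# The values survive, but both comment lines are gone from the output.
print(result)
```

The parser keeps only sections, keys and values, so there is nothing left for write() to emit where the comments used to be.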

On 8/28/20 6:56 AM, Steve Barnes wrote:
This is just about the first part of your mail: There are external libraries for reading and writing various formats while preserving comments (and even more, e.g. spacing): For Python files, there is RedBaron, for YAML there is ruamel.yaml, and JSON doesn't allow comments as far as I know. For configfile I'm not sure, doesn't look like there is anything... but one could have a look at how comments are represented in RedBaron and ruamel.yaml and write a similar library for configfiles. Cheers

On Fri, Aug 28, 2020 at 5:45 AM Shahriar Heidrich <smheidrich@weltenfunktion.de> wrote:
This is the key point -- each text file format has its own structure, etc.; there is no "standard" that could be generally supported. And on the Python side, there is no standard way to represent arbitrary text files as Python objects. Sure, for the most part, file readers build up a tree of dicts and lists, but where would comments fit into that -- how would you even know which object to attach the comment to? That specification would have to be part of the file format -- kind of like docstrings, actually. There is a spec in Python code for where to put strings that you want to preserve and "attach" to certain objects. But notably, they are NOT comments -- comments, by their very nature, are more free-form.
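The docstring/comment distinction can be seen with the standard library's ast module. This is only a sketch: the StockItem class is borrowed from the first message, the comment text is invented, and ast.unparse needs Python 3.9 or later.

```python
import ast

# Source with one docstring (fixed position, attached to the class)
# and one ordinary comment (free-form, attached to nothing).
SOURCE = '''
class StockItem:
    """Represents an item available in store."""
    # This comment is free-form and belongs to no particular object.
    lineno: str = ""
'''

tree = ast.parse(SOURCE)
class_node = tree.body[0]

# The docstring is recoverable because the language says where it lives.
docstring = ast.get_docstring(class_node)

# Regenerating source from the tree keeps the docstring but not the comment,
# since comments never make it into the syntax tree.
regenerated = ast.unparse(tree)

print(docstring)
print(regenerated)
```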
and JSON doesn't allow comments as far as I know.
It does not -- though many extensions to JSON do -- notably JSON-5 -- which I wish were supported in the stdlib. But this makes a good point: if you want conforming JSON, you can't use comments at all -- so maybe your own standard would be helpful:

    an_object = {"__comment__": " an arbitrary comment about this object",
                 "this": "a regular attribute",
                 "that": [3, 4, 5, 6]}

And then you could have a "CommentedDict" Python object as well.

But having said all that -- I have, in fact, written my own file format that looked a lot like INI files, but had comments that were preserved. It is nice to have a simple way to handle comments in the raw file format, while still capturing them. And the JSON example above is not simple enough for easy text editor action. Maybe a standard way to do it with YAML would be doable.

So *maybe* a set of "commented" versions of the core builtins would be useful.

-CHB

--
Christopher Barker, PhD
Python Language Consulting - Teaching - Scientific Software Development - Desktop GUI and Web Development - wxPython, numpy, scipy, Cython
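One way the "__comment__" convention and a "CommentedDict" could fit together is sketched below with the standard json module. Nothing here is an existing API: CommentedDict, COMMENT_KEY, load_hook and to_plain are hypothetical names, and the reserved key is just the one from the example in the message.

```python
import json

COMMENT_KEY = "__comment__"  # reserved key, taken from the example above


class CommentedDict(dict):
    """Hypothetical dict subclass that carries a free-text comment."""

    def __init__(self, *args, comment=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.comment = comment


def load_hook(obj):
    """object_hook for json.loads: move the reserved key into .comment."""
    comment = obj.pop(COMMENT_KEY, None)
    return CommentedDict(obj, comment=comment)


def to_plain(value):
    """Rebuild plain dicts and lists, putting comments back under the key."""
    if isinstance(value, dict):
        out = {}
        comment = getattr(value, "comment", None)
        if comment is not None:
            out[COMMENT_KEY] = comment
        for key, item in value.items():
            out[key] = to_plain(item)
        return out
    if isinstance(value, list):
        return [to_plain(item) for item in value]
    return value


text = (
    '{"__comment__": " an arbitrary comment about this object", '
    '"this": "a regular attribute", "that": [3, 4, 5, 6]}'
)

loaded = json.loads(text, object_hook=load_hook)
loaded["that"].append(7)            # modify a value
dumped = json.dumps(to_plain(loaded))  # the comment is written back out

print(loaded.comment)
print(dumped)
```

The output stays conforming JSON, at the cost the message already notes: the comment is data under a reserved key rather than something a person can type freely in a text editor.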

On Fri, Aug 28, 2020 at 09:34:43AM -0700, Christopher Barker wrote:
And on the Python side, there is no standard way to represent arbitrary text files as Python objects.
Um, sure there is: unstructured text, which is exactly what Python does. And contrary to Steve Barnes' comments, Python doesn't discard comments when reading and writing text files, because it has no concept of a generic "text file comment". It's all just text, regardless of whether the text includes "#" or "{...}" or "--" or "REM" or ";" or "/* ... */". -- Steve
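A small sketch of that point: when a file is handled as plain text, nothing is treated as a comment, so everything survives a write/read cycle unchanged. The file name and sample lines are arbitrary.

```python
import tempfile
from pathlib import Path

# Lines using several different comment conventions; to Python's text I/O
# they are all just characters.
TEXT = (
    "# a shell-style comment\n"
    "value = 1  ; an INI-style comment\n"
    "/* a C-style comment */\n"
    "REM a batch-file comment\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "example.txt"
    path.write_text(TEXT, encoding="utf-8")
    read_back = path.read_text(encoding="utf-8")

print(read_back == TEXT)
```

Comments are only lost once a format-specific parser turns the text into objects and something later regenerates text from those objects.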

participants (4): Christopher Barker, Shahriar Heidrich, Steve Barnes, Steven D'Aprano