Mailman 3 JSON loading - Python-ideas

Nov. 24, 2020

I was thinking about the "Load JSON file as single line" thread from a bit
back, and had an idea for a neat solution, albeit one that has potential
stylistic issues.

Idea:
Create two new methods on pathlib.Path objects:

    Path.load(loader, *args, **kwargs)
and
    Path.dump(dumper, obj, *args, **kwargs)

Here, `loader` should be an object that implements a method
`load(fileobj: BinaryIO, *args, **kwargs) -> object`,
and `dumper` should be an object that implements
`dump(obj: object, fileobj: BinaryIO, *args, **kwargs)`.

This approach takes advantage of the standard `dump/load` pattern used by
json, pickle and other serialization libraries.

Example usage:

    data = Path("/path/to/file.json").load(json)
    Path("/dest.json").dump(json, my_obj, indent=2)

The implementation should look something like:

    def load(self, loader, *args, **kwargs):
        with self.open('rb') as fh:
            return loader.load(fh, *args, **kwargs)

and the equivalent for `dump`.
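
A minimal sketch of both halves, written as free functions rather than real `Path` methods so it runs on current Python. The names `load_path`, `dump_path`, `_open` and the `mode` parameter are assumptions for illustration and are not part of the proposal. The `mode` parameter is there because `json.dump` writes `str` and so needs a text-mode handle, while `pickle.dump` needs a binary one:

```python
import json
import pickle
import tempfile
from pathlib import Path


def _open(path, mode):
    # Binary handles take no encoding; text handles use UTF-8 in this sketch.
    encoding = None if "b" in mode else "utf-8"
    return Path(path).open(mode, encoding=encoding)


def load_path(path, loader, *args, mode="rb", **kwargs):
    """Open `path` and hand the file object to loader.load()."""
    with _open(path, mode) as fh:
        return loader.load(fh, *args, **kwargs)


def dump_path(path, dumper, obj, *args, mode="wb", **kwargs):
    """Open `path` for writing and hand the file object to dumper.dump()."""
    with _open(path, mode) as fh:
        dumper.dump(obj, fh, *args, **kwargs)


# Round-trip the same object through pickle and json in a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    pickle_file = Path(tmp) / "data.pickle"
    dump_path(pickle_file, pickle, {"a": [1, 2]})
    from_pickle = load_path(pickle_file, pickle)

    json_file = Path(tmp) / "data.json"
    # json.dump() writes str, so this call needs a text-mode handle.
    dump_path(json_file, json, {"a": [1, 2]}, mode="w", indent=2)
    # json.load() accepts a binary handle, so the default "rb" works here.
    from_json = load_path(json_file, json)
```

If the methods always opened the file in binary mode, `json.dump` would fail on write, so a real implementation would probably need some way to choose text or binary.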

Reasons I like this:
--------------------
 * Reading the code, the intent seems clear (generally unambiguous)
 * It's explicit in naming the serialization method
 * There is little boilerplate (compared to context managers, etc.)
 * Implementation is simple & decoupled
 * Switching to another serialization format is trivial:

    my_path.load(json)
    # vs.
    my_path.load(pickle)
    my_path.load(msgpack)
    ...

 * Unlike other similar proposals, this doesn't force the solution to read
the entire file into memory before decoding (although optional support for
`loads`/`dumps` could be added if desired)
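
The optional `loads` fallback mentioned in the last point could be sketched roughly as follows. The helper name `load_any` and the `LoadsOnly` class are hypothetical and exist only to exercise the fallback path:

```python
import json
import tempfile
from pathlib import Path


def load_any(path, loader, **kwargs):
    """Hand the open file to loader.load() if it exists, else use loads()."""
    path = Path(path)
    if hasattr(loader, "load"):
        with path.open("rb") as fh:
            return loader.load(fh, **kwargs)
    # Fallback: read the whole file into memory and decode it in one go.
    return loader.loads(path.read_bytes(), **kwargs)


class LoadsOnly:
    """Hypothetical serializer that only exposes loads()."""

    @staticmethod
    def loads(data, **kwargs):
        return json.loads(data, **kwargs)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data.json"
    target.write_text('{"x": 1}', encoding="utf-8")
    via_load = load_any(target, json)        # uses json.load(fh)
    via_loads = load_any(target, LoadsOnly)  # falls back to loads(bytes)
```

Whether the `load` path actually avoids holding the whole file in memory depends on the serializer; the helper only avoids forcing it.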

Reasons I don't like it:
------------------------
 * Relying on the dump/load convention isn't something I've seen much of
elsewhere
 * It may be hard to type-check (is this a concern?)
 * We're just replacing 2 lines of code with one (although previous
discussions of this topic seemed to reach a consensus that the pattern is
common enough for that to be a good-enough reason).
 * The load/dump naming pattern clashes with pathlib's read/write naming.
I'm not concerned about exactly what these methods are called, but the clash
feels like a side-effect of either "read/write and dump/load being
slightly different operations" or "a tiny existing inconsistency in the
standard library". Either way, the only problem is that we'd be bringing
these existing conventions into the same module, which seems fairly minor.
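
On the type-checking question, one possible approach is a `typing.Protocol` describing the loader. This is only a sketch: `SupportsLoad` and `load_typed` are hypothetical names, and although PEP 544 describes modules as valid protocol implementations, support for passing a module such as `pickle` may vary between type checkers:

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, BinaryIO, Protocol


class SupportsLoad(Protocol):
    """Hypothetical protocol: anything with load(fileobj)."""

    def load(self, fp: BinaryIO, /) -> Any: ...


def load_typed(path: Path, loader: SupportsLoad) -> Any:
    # Only the positional file argument is typed; forwarding arbitrary
    # **kwargs to the loader is the part that is hard to check statically.
    with path.open("rb") as fh:
        return loader.load(fh)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "data.pickle"
    target.write_bytes(pickle.dumps([1, 2, 3]))
    loaded = load_typed(target, pickle)
```

The return type stays `Any` in this sketch, so the caller gets no more static information than from calling `pickle.load` directly.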

Steve

Stestagg
