You don't need JSON at all in order to serialize and deserialize instances of Python objects and primitives.
Pickle (or e.g. Arrow + [parquet,]) handles nested, arbitrarily complex types efficiently, without dataclasses or type annotations.
I don't see the value in using JSON to round-trip from Python to the same Python code.
An external schema is far more useful than embedding part of an ad-hoc nested object schema in type annotations that can't perform, or even specify, data validation.
You can already jsonpickle data classes. If you want to share or just publish data, external schema using a web standard is your best bet.
On Wed, Apr 8, 2020, 3:30 AM Andrew Barnert email@example.com wrote:
On Apr 7, 2020, at 18:10, Wes Turner firstname.lastname@example.org wrote:
*That* should read as "are not sufficient".
Stuffing all of those into annotations is going to be cumbersome,
and it leaves multiple schema definitions to keep synchronized and to validate data against.
I think generating some amalgamation of JSON-LD @context, SHACL, and JSON Schema would be an interesting exercise.
You'd certainly want to add more information to the generated schema than just the corresponding Python types:
- type URI(s) that correspond to the Python primitive types
- JSON schema format strings
- JSON schema length
- TODO JSON schema [...]
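As a rough sketch of what such a generated schema could look like (this is only an illustration, not an existing tool): the code below walks a dataclass and emits a JSON-Schema-style dict. The `x-type-uri` key, the `schema_for` helper, and the `Contact` class are hypothetical names made up for this example; the extra constraints are assumed to be carried in `dataclasses.field(metadata=...)`.

```python
from dataclasses import dataclass, field, fields
from typing import get_type_hints

XSD = "http://www.w3.org/2001/XMLSchema#"

# Python primitive -> (JSON Schema type name, XSD datatype URI)
PRIMITIVES = {
    str: ("string", XSD + "string"),
    int: ("integer", XSD + "integer"),
    float: ("number", XSD + "double"),
    bool: ("boolean", XSD + "boolean"),
}


def schema_for(cls):
    """Build a JSON-Schema-like dict for a dataclass with primitive fields."""
    hints = get_type_hints(cls)
    properties = {}
    for f in fields(cls):
        json_type, type_uri = PRIMITIVES[hints[f.name]]
        # "x-type-uri" is a made-up extension key, not part of JSON Schema.
        prop = {"type": json_type, "x-type-uri": type_uri}
        # Copy constraints such as "format" or "maxLength" from field metadata.
        prop.update(f.metadata)
        properties[f.name] = prop
    return {
        "type": "object",
        "properties": properties,
        "required": [f.name for f in fields(cls)],
    }


@dataclass
class Contact:  # hypothetical example class
    name: str = field(metadata={"maxLength": 64})
    email: str = field(metadata={"format": "email"})
    age: int = field(metadata={"minimum": 0})


schema = schema_for(Contact)
```

Note that the annotations alone only supply the `type` entries; everything else has to come from somewhere besides the Python type.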
Not everything in the world has to be built around RDF semantic triples. In fact, most things don’t have to be. That’s why there are a lot more things out there using plain old JSON Schema for their APIs and formats than using JSON-LD. And even more things just using free form JSON.
And for either of those, type annotations are sufficient. You can serialize any instance of Spam to JSON, and deserialize JSON (that you know represents a Spam) to an equal Spam instance, as long as you know what the name and type of every attribute of Spam are (and all of those types are number/string/bool/null, types that match the same qualifications as Spam, lists of such a type, and dicts mapping str to such a type). Which is guaranteed to be knowable for dataclasses even without any external information. Or any classes with a (correct) accompanying schema. Or just any classes you design around such a serialization system.
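A minimal sketch of that round trip, assuming Python 3.8+ and using only the annotations on the dataclasses. The helper names `to_jsonable` and `from_jsonable` and the `Egg` class are made up for illustration, and Optional/Union fields are not handled here.

```python
import json
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, Dict, List, get_args, get_origin, get_type_hints


def to_jsonable(value: Any) -> Any:
    """Turn a dataclass instance (or nested lists/dicts of them) into JSON-ready data."""
    if is_dataclass(value) and not isinstance(value, type):
        return {f.name: to_jsonable(getattr(value, f.name)) for f in fields(value)}
    if isinstance(value, list):
        return [to_jsonable(v) for v in value]
    if isinstance(value, dict):
        return {k: to_jsonable(v) for k, v in value.items()}
    return value  # number, string, bool, or None


def from_jsonable(tp: Any, data: Any) -> Any:
    """Rebuild a value of annotated type `tp` from decoded JSON data."""
    if is_dataclass(tp):
        hints = get_type_hints(tp)
        return tp(**{f.name: from_jsonable(hints[f.name], data[f.name])
                     for f in fields(tp)})
    origin = get_origin(tp)
    if origin is list:
        (item_tp,) = get_args(tp)
        return [from_jsonable(item_tp, v) for v in data]
    if origin is dict:
        _key_tp, value_tp = get_args(tp)
        return {k: from_jsonable(value_tp, v) for k, v in data.items()}
    return data  # primitives come back from json.loads unchanged


@dataclass
class Egg:
    size: int
    label: str


@dataclass
class Spam:
    name: str
    price: float
    fresh: bool
    eggs: List[Egg]
    tags: Dict[str, str]


spam = Spam("spam", 1.5, True, [Egg(1, "a"), Egg(2, "b")], {"k": "v"})
msg = json.dumps(to_jsonable(spam))
restored = from_jsonable(Spam, json.loads(msg))
assert restored == spam
```

Nothing outside the class definitions is consulted: the annotations say which nested dicts become Egg instances and which stay plain dicts.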
The fact that you don’t have, e.g., a URI with metadata about Spam doesn’t in any way stop any of that from working, or being useful. Type annotations are sufficient for this purpose.
In fact, even type annotations aren’t necessary. For any value that can be pickled, you can just do msg = json.dumps(b64_encode(pickle.dumps(obj))) and obj = pickle.loads(b64_decode(json.loads(msg))), where b64_encode and b64_decode stand for Base64 helpers that convert between bytes and str, and you’ve got working JSON serialization. What type annotations add is JSON serialization that’s human readable/editable, or computer verifiable, or both. You don’t need JSON-LD unless you’re not just building APIs, but meta-indexes of APIs or automatic API generators or something.
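Spelled out with the standard library, the pickle-in-JSON trick looks roughly like this. The function names `dumps_opaque` and `loads_opaque` are made up for the example; `base64.b64encode` returns bytes, so the result has to be decoded to str before `json.dumps` will accept it.

```python
import base64
import json
import pickle


def dumps_opaque(obj):
    """Pickle obj, Base64-encode the bytes, and wrap the text in a JSON string."""
    return json.dumps(base64.b64encode(pickle.dumps(obj)).decode("ascii"))


def loads_opaque(msg):
    """Reverse of dumps_opaque. Unpickling untrusted input is unsafe."""
    return pickle.loads(base64.b64decode(json.loads(msg)))


# Tuples and sets survive, which plain JSON could not represent directly.
obj = {"spam": [1, 2.5, None], "eggs": (True, {"nested": {1, 2, 3}})}
msg = dumps_opaque(obj)
restored = loads_opaque(msg)
assert restored == obj
```

The message is valid JSON, but it is a single opaque string: nobody can read, edit, or validate it without unpickling it.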