[Python-ideas] Saving state or data from Python objects (Was: Serializable method)

Sat Mar 10 15:02:42 CET 2012

On Sat, Mar 10, 2012 at 2:59 PM, Masklinn <masklinn at masklinn.net> wrote:
> but it still ends
> up with the format's semantics not necessarily mapping 1:1 to Python
> semantics

This is the right problem.

If we take the user's point of view and look at where users have a
hard time, there are two different user stories:

1. Saving the current state of a program by magically taking the whole
object space (or a portion of it) and saving it into a file
Goals: Don't make me think
Security: Not important (well, you're saving your program - what did
you expect if everybody can alter it?)
Concerns: Impedance mismatch - what if the source code behind the saved
state changes? - this needs further experiments
Status: pickle or marshal seem to be designed for just this purpose,
but do they provide an easy (one-shot) solution for this scenario?

2. Saving data stored in Python objects into some format
Goals: Transform data into a format that suits an external requirement
Goals: Exchange/alter data outside of the Python process
Goals: Make it easy, transparent and automatic (note: no 'g' here, i.e. not 'automagic')

(May I call this Data Transformation Theory?)

Basically, what we need is to transform data. This requires some assumptions:
1. A transformation can be symmetrical (lossless) or non-symmetrical (lossy)
2. A non-symmetrical transformation can be made symmetrical by providing
additional data about what was lost
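
To make the two assumptions concrete, here is a minimal sketch using the
standard json module. The "__type__" tag and the encode/decode helpers are
my own illustration (not an existing API), and only tuples and lists are
handled:

```python
import json

point = (1, 2)

# Lossy: JSON has no tuple type, so the tuple reads back as a list.
lossy = json.loads(json.dumps(point))


def encode(value):
    """Carry the missing information (the original type) next to the data."""
    if isinstance(value, tuple):
        # "__type__" is a hypothetical tag name chosen for this sketch.
        return {"__type__": "tuple", "items": [encode(v) for v in value]}
    if isinstance(value, list):
        return [encode(v) for v in value]
    return value


def decode(value):
    """Use the extra tag to restore what plain JSON would have lost."""
    if isinstance(value, dict) and value.get("__type__") == "tuple":
        return tuple(decode(v) for v in value["items"])
    if isinstance(value, list):
        return [decode(v) for v in value]
    return value


# Symmetrical: the round trip gives back an equal tuple.
restored = decode(json.loads(json.dumps(encode(point))))
```

Here lossy is the list [1, 2] while restored is the tuple (1, 2) again. The
price of the symmetry is that the output is no longer "plain" JSON for
anybody who does not know about the tag.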

Now about Python serialization:
1. The problem is complex.
2. Up to the point that it becomes complicated
3. So there should be a systematic approach

Systematic approach:
1. Define the scope of serialization (data in Python objects are
fields and their values)
2. Define output format (JSON - for example)
3. Create mapping
3.1   Make sure lossy transformations are marked
3.2   Make sure 'additional data' to make them symmetrical is analysed
4. Create transformation rules
5. Alter rules to warn users (or raise) when a transformation fails,
with an explanation of why
6. Summarize in the documentation the fundamental problems that make
some transformations impossible
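
The steps above could be sketched roughly like this, with JSON as the
target format. Everything here (TransformError, to_json_value, the
allow_lossy flag) is a hypothetical name made up for illustration, and the
rule set is deliberately tiny:

```python
class TransformError(TypeError):
    """Raised when a value cannot be mapped to the target format."""


# Step 3: types that map to JSON without any loss.
LOSSLESS = (str, int, float, bool, type(None))


def to_json_value(value, path="root", allow_lossy=False):
    """Steps 4-5: apply the rules, and explain where and why they fail."""
    if isinstance(value, LOSSLESS):
        return value
    if isinstance(value, list):
        return [to_json_value(v, f"{path}[{i}]", allow_lossy)
                for i, v in enumerate(value)]
    if isinstance(value, dict):
        out = {}
        for key, item in value.items():
            if not isinstance(key, str):
                raise TransformError(
                    f"{path}: key {key!r} is not a string, "
                    "JSON object keys must be strings")
            out[key] = to_json_value(item, f"{path}[{key!r}]", allow_lossy)
        return out
    if isinstance(value, tuple):
        # Step 3.1: marked as lossy, a tuple reads back as a list.
        if not allow_lossy:
            raise TransformError(f"{path}: tuple would read back as a list")
        return [to_json_value(v, f"{path}[{i}]", allow_lossy)
                for i, v in enumerate(value)]
    if hasattr(value, "__dict__"):
        # Step 1: the scope is fields and their values; the class is dropped,
        # so this is marked as lossy too.
        if not allow_lossy:
            raise TransformError(
                f"{path}: {type(value).__name__} would lose its class")
        return to_json_value(vars(value), path, allow_lossy)
    raise TransformError(f"{path}: no rule for type {type(value).__name__}")


class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y


fields_only = to_json_value({"p": Point(1, 2)}, allow_lossy=True)
```

With allow_lossy=True the Point is reduced to its fields; without it, or
with a type that has no rule at all (a set, say), you get a TransformError
whose message names the path and the reason.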

So, with such a Data Transformation Framework it doesn't matter what
the target format is. You can choose any. You can create your own rules
and test your particular chain of objects against the default rules to
see if the transformation is possible and, if not, know exactly why. Full
control -> confidence -> better tools and problem descriptions.
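
As a rough sketch of what "test your objects against the rules" might look
like: below, rules are just a dict from type to converter function. The
names (problems, transform, DEFAULT_RULES) are invented for this example
and do not belong to any existing library:

```python
import datetime

# Hypothetical default rules: exact type -> converter to a JSON-friendly value.
DEFAULT_RULES = {
    str: str,
    int: int,
    float: float,
    bool: bool,
    type(None): lambda v: None,
}


def problems(value, rules, path="root"):
    """Return 'path: reason' strings; an empty list means transformable."""
    if type(value) in rules:
        return []
    found = []
    if isinstance(value, dict):
        for key, item in value.items():
            found += problems(item, rules, f"{path}[{key!r}]")
        return found
    if isinstance(value, (list, tuple)):
        for i, item in enumerate(value):
            found += problems(item, rules, f"{path}[{i}]")
        return found
    return [f"{path}: no rule for type {type(value).__name__}"]


def transform(value, rules):
    """Apply the rules; run problems() first to learn why this might fail."""
    if type(value) in rules:
        return rules[type(value)](value)
    if isinstance(value, dict):
        return {key: transform(item, rules) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [transform(item, rules) for item in value]
    raise TypeError(f"no rule for type {type(value).__name__}")


record = {"name": "build", "started": datetime.date(2012, 3, 10), "tags": {"a"}}

# Against the default rules we learn exactly what is in the way.
default_report = problems(record, DEFAULT_RULES)

# Add our own rules for the offending types and check again.
my_rules = dict(DEFAULT_RULES)
my_rules[datetime.date] = datetime.date.isoformat
my_rules[set] = sorted
custom_report = problems(record, my_rules)
result = transform(record, my_rules)
```

default_report lists two entries, one for the date and one for the set,
each with its path. After the two extra rules are added, custom_report is
empty and transform() goes through.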

Of course, such a framework doesn't exist yet. ;)
--
anatoly t.