.ini vs JSON vs YAML meta data format (Was: Simplify 426: Deprecate Author-email and Maintainer-email)

On Thu, Apr 25, 2013 at 4:58 PM, Daniel Holth <dholth@gmail.com> wrote:
I would prefer to see PEP 390 withdrawn and I think this has been suggested before. The metadata is already sourced from different files depending on your build system.
The .ini format and our parsers for it are really awful. I always resented having to learn it in order to use distutils/setuptools when every other language (RFC822, Python, JSON) is both better and already familiar.
Everyone here has excellent user stories. I am stunned that there is no process to help PEP authors collect them. Original user stories are much better to learn from and to draft new standards with than the same stories reworked into PEP specification text.
FWIW bdist_wheel does something half-PEP-390 inspired with setup.cfg:
[metadata]
provides-extra =
    tool
    signatures
    faster-signatures
requires-dist =
    distribute (>= 0.6.34)
    argparse; python_version == '2.6'
    keyring; extra == 'signatures'
    dirspec; sys.platform != 'win32' and extra == 'signatures'
    ed25519ll; extra == 'faster-signatures'
license-file = LICENSE.txt
(https://bitbucket.org/dholth/wheel/src/tip/setup.cfg?at=default)
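For readers who want to see what reading such a section involves, here is a minimal sketch using the standard-library configparser. It assumes the multi-valued fields are written one entry per indented continuation line; the function name read_metadata and that layout are illustrative assumptions, not necessarily how bdist_wheel itself parses the file.

```python
import configparser

# Assumed layout: one entry per indented continuation line.
SETUP_CFG = """\
[metadata]
provides-extra =
    tool
    signatures
    faster-signatures
requires-dist =
    distribute (>= 0.6.34)
    argparse; python_version == '2.6'
    keyring; extra == 'signatures'
    dirspec; sys.platform != 'win32' and extra == 'signatures'
    ed25519ll; extra == 'faster-signatures'
license-file = LICENSE.txt
"""


def read_metadata(text):
    """Hypothetical helper: return the [metadata] section as a dict of lists."""
    # Interpolation is disabled so a literal '%' in a value is left alone.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    result = {}
    for key, value in parser["metadata"].items():
        # Continuation lines arrive as one newline-separated string.
        result[key] = [line.strip() for line in value.splitlines() if line.strip()]
    return result


metadata = read_metadata(SETUP_CFG)
print(metadata["requires-dist"])
```

Every value comes back as a list, even single-valued fields such as license-file, so callers do not have to special-case the two shapes.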
Why reinvent our own format when YAML with indented sections already exists? Yes, I have to look at examples to write in this format, but it is readable and familiar across a broad range of products. It is at least better than JSON, because it doesn't require wrapping everything in quotes.

anatoly techtonik <techtonik <at> gmail.com> writes:
Why reinvent our own format when YAML with indented sections already exists? Yes, I have to look at examples to write in this format, but it is readable and familiar across a broad range of products. It is at least better than JSON, because it doesn't require wrapping everything in quotes.
I too originally thought YAML might be better, but unfortunately the most mature implementation there is (PyYAML) is still not quite ready, as there are many open issues around dump() and load(). See for example
http://pyyaml.org/ticket/264
"yaml.load() fails to load a dict just saved by yaml.dump()"
That, together with the security issues around YAML (which bit the Rails community not that long ago), means that JSON is probably a better bet for the moment.
Regards,
Vinay Sajip

On Apr 27, 2013, at 5:24 AM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
anatoly techtonik <techtonik <at> gmail.com> writes:
Why reinvent our own format when YAML with indented sections already exists? Yes, I have to look at examples to write in this format, but it is readable and familiar across a broad range of products. It is at least better than JSON, because it doesn't require wrapping everything in quotes.
I too originally thought YAML might be better, but unfortunately the most mature implementation there is (PyYAML) is still not quite ready, as there are many open issues around dump() and load(). See for example
"yaml.load() fails to load a dict just saved by yaml.dump()"
That, together with the security issues around YAML (which bit the Rails community not that long ago), means that JSON is probably a better bet for the moment.
Regards,
Vinay Sajip
Luckily, JSON is only the format that is used inside of the sdist. It does not need to be the user-facing format. Part of the work is making it so that one tool does not own the entire process, so that people are free to make their own tools that generate sdists, wheels and so on. These tools could easily consume YAML and then spit out an sdist/wheel with JSON inside.
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
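The split between a user-facing format and the JSON stored in the archive could look roughly like the sketch below. All names here (load_ini, FRONT_ENDS, to_interchange_json) are hypothetical and not taken from any existing tool; an INI front end stands in for YAML only because it needs nothing outside the standard library.

```python
import configparser
import json


def load_ini(text):
    """Hypothetical front end: read a [metadata] section into a dict of lists."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return {
        key: [line.strip() for line in value.splitlines() if line.strip()]
        for key, value in parser["metadata"].items()
    }


# Each tool registers the user-facing formats it understands. A tool that
# depends on a YAML library could add its own loader under another key.
FRONT_ENDS = {"ini": load_ini, "json": json.loads}


def to_interchange_json(text, fmt):
    """Parse user-facing metadata and emit the JSON stored in the archive."""
    metadata = FRONT_ENDS[fmt](text)
    # Sorted keys give the same output regardless of the input format.
    return json.dumps(metadata, indent=2, sort_keys=True)


ini_source = "[metadata]\nname = example\nrequires-dist =\n    argparse\n    keyring\n"
json_source = '{"name": ["example"], "requires-dist": ["argparse", "keyring"]}'

print(to_interchange_json(ini_source, "ini"))
```

Both sources describe the same metadata, so both produce identical JSON; whatever reads the archive never needs to know which format the author wrote.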

On Sat, Apr 27, 2013 at 12:24 PM, Vinay Sajip <vinay_sajip@yahoo.co.uk> wrote:
anatoly techtonik <techtonik <at> gmail.com> writes:
Why reinvent our own format when YAML with indented sections already exists? Yes, I have to look at examples to write in this format, but it is readable and familiar across a broad range of products. It is at least better than JSON, because it doesn't require wrapping everything in quotes.
I too originally thought YAML might be better, but unfortunately the most mature implementation there is (PyYAML) is still not quite ready, as there are many open issues around dump() and load(). See for example
"yaml.load() fails to load a dict just saved by yaml.dump()"
This one is huge and requires the example to be run to see the actual error. Is that a minimal example? I don't even have an idea what fails there.
That, together with the security issues around YAML (which bit the Rails community not that long ago), means that JSON is probably a better bet for the moment.
Google AppEngine uses it. Ansible uses it. I'd like to know more about that.

This one is huge and requires the example to be run to see the actual error. Is that a minimal example? I don't even have an idea what fails there.
Obviously small files might work, but that's no help. I didn't have the time to spend to make the example smaller. It is not *that* large, but it is the size of metadata that we have to handle. There are plenty of other issues on the tracker relating to dumping()/loading().
Google AppEngine uses it. Ansible uses it. I'd like to know more about that.
That doesn't matter. For dealing with packaging metadata (the application I was using it for) it failed (on more than a trivial number of metadata samples) and that means it's not currently fit for the purpose under discussion.
Regards,
Vinay Sajip

Vinay Sajip <vinay_sajip@yahoo.co.uk> writes:
This one is huge and requires the example to be run to see the actual error. Is that a minimal example? I don't even have an idea what fails there.
Obviously small files might work, but that's no help. I didn't have the time to spend to make the example smaller. It is not *that* large, but it is the size of metadata that we have to handle. There are plenty of other issues on the tracker relating to dumping()/loading().
Looking at the script, it exhibits the problem only when you tweak the style of the emitter, i.e. for aesthetic reasons I presume, so *that* cannot be the main reason IMO in this thread's context. The security issues are a much bigger problem (but there is a safe_load() that maybe can alleviate those); maybe speed issues are also worth a consideration.
ciao, lele.
--
nickname: Lele Gaifax | Quando vivrò di quello che ho pensato ieri
real: Emanuele Gaifas | comincerò ad aver paura di chi mi copia.
lele@metapensiero.it | -- Fortunato Depero, 1929.

Reminder that the JSON file format is a data exchange format and does not need to be the user-facing format. We should use JSON because it's already in the Python stdlib (and has been since 2.6) and it does not suffer from any of these numerous issues.
-----------------
Donald Stufft
PGP: 0x6E3CBCE93372DCFA // 7C6B 7C5D 5E2B 6356 A926 F04F 6E3C BCE9 3372 DCFA
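The property at stake in the dump()/load() discussion, that loading returns exactly what was dumped, can be checked for the stdlib json module with a few lines. This is only a sketch: the sample metadata reuses requirement strings from earlier in the thread, the "name" value is made up, and round_trips is a hypothetical helper.

```python
import json

# Illustrative metadata; "name" is invented, the requirement strings come
# from the setup.cfg example quoted earlier in the thread.
metadata = {
    "name": "example",
    "provides-extra": ["tool", "signatures", "faster-signatures"],
    "requires-dist": [
        "distribute (>= 0.6.34)",
        "argparse; python_version == '2.6'",
        "dirspec; sys.platform != 'win32' and extra == 'signatures'",
    ],
}


def round_trips(data):
    """Return True if dumping and reloading yields an equal object."""
    return json.loads(json.dumps(data)) == data


print(round_trips(metadata))
```

The check only holds for data built from dicts with string keys, lists, strings, numbers, booleans and None. A tuple, for example, is reloaded as a list and no longer compares equal, so metadata should be restricted to those types before it is written out.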
participants (4)
- anatoly techtonik
- Donald Stufft
- Lele Gaifax
- Vinay Sajip