[Python-ideas] PEP 426, YAML in the stdlib and implementation discovery

Brett Cannon brett at python.org
Fri May 31 20:35:48 CEST 2013


On Fri, May 31, 2013 at 12:35 PM, Philipp A. <flying-sheep at web.de> wrote:
> Hi, reading PEP 426, I made a connection to a (IMHO) longstanding issue:
> YAML not being in the stdlib.
>
> I’m no big fan of JSON, because it’s so strict and comparatively verbose
> compared with YAML. I just think YAML is more pythonic, and a better choice
> for any kind of human-written data format.
>
> So i devised 3 ideas:
>
> YAML in the stdlib
> The stdlib shouldn’t get more C code; that’s what I’ve gathered.
> So let’s put a pure-python implementation of YAML into the stdlib.
> Let’s also strictly define the API and make it secure-by-naming™. What i
> mean is let’s use the safe load function that doesn’t instantiate
> user-defined classes (in PyYAML called “safe_load”) as default load function
> “load”, and call the unsafe one by a longer, explicit name (e.g.
> “unsafe_load” or “extended_load” or something)
> Let’s base the parser on generators, since generators are cool, easy to
> debug, and allow us to emit and test the token stream (other than e.g. the
> HTML parser we have)

So YAML is not going to end up in the stdlib. The format is not used
widely enough to warrant being added, nor to justify having to maintain
a parser for such a complicated format.

> Implementation discovery
> People want fast parsing. That’s incompatible with a pure python
> implementation.
> So let’s define (or use, if there is one I’m not aware of) a discovery
> mechanism that allows implementations of certain APIs to register themselves
> as such.
> Let “import yaml” use this mechanism to import a compatible 3rd party
> implementation in preference to the stdlib one
> Let’s define a property of the implementation that tells the user which
> implementation he’s using, and a way to select a specific implementation
> (Although that’s probably easily done by just not doing “import yaml”, but
> “import std_yaml” or “import pyyaml2”)

The standard practice is to place any accelerated code in something
like _yaml and then in yaml.py do a ``from _yaml import *``.
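
A minimal sketch of that layout, with a toy parser so it runs on its
own. The names ``parse_flat`` and ``_toyyaml`` are made up for
illustration (``_toyyaml`` stands in for the ``_yaml`` accelerator
above), and the import names the function explicitly rather than using
``*`` so the override is easy to see:

```python
# toyyaml.py: hypothetical module used only to illustrate the layout.

def parse_flat(text):
    """Pure-Python fallback: parse flat 'key: value' lines into a dict."""
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip()
    return result

try:
    # If the optional accelerator module is installed, its version
    # replaces the pure-Python definition above.
    from _toyyaml import parse_flat  # assumed name, not a real module
except ImportError:
    # No accelerator available: keep the pure-Python version.
    pass
```

Callers just ``import toyyaml`` and get whichever implementation is
available, so no separate discovery mechanism is needed.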

> Allow YAML to be used besides JSON as metadata like in PEP 426. (so
> including either pymeta.yaml or pymeta.json makes a valid package)
> I don’t propose that we exclusively use YAML, but only because I think that
> PEP 426 shouldn’t be hindered from being implemented ASAP by waiting for a
> new std-library to be ready.

But that then creates a possible position where just to read metadata
you must have a 3rd-party library installed, and I view that as a
non-starter.

>
> What do you think?

While I appreciate what you are suggesting, I don't see it happening.

>
> Is there a reason for not including a YAML lib that i didn’t cover?

Yes, see above.

>
> Is there a reason JSON is used other than YAML not being in the stdlib?

It's simpler, it's (nearly) Python syntax, it's faster to parse.

If you don't like JSON and would rather specify metadata using YAML, I
would write a tool that reads YAML and then emits the metadata.json
file. That way you get to write your metadata in the format you want
but without requiring YAML support in the stdlib. But making YAML a
first-class citizen in all of this won't happen as long as YAML is not
in the stdlib, and adding it to the stdlib is not a viable option.
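
Such a tool could be quite small. A rough sketch, assuming PyYAML is
installed as the third-party parser; the function names and the
``pymeta.yaml`` input file name are illustrative only, and the ``load``
parameter is an addition here so the conversion can be exercised
without PyYAML:

```python
import json


def yaml_to_json(yaml_text, load=None):
    """Parse YAML text and return the same data as a JSON string."""
    if load is None:
        # PyYAML is third-party; import it lazily so that only the
        # author running the tool needs it, not consumers of the JSON.
        import yaml
        load = yaml.safe_load  # does not instantiate arbitrary classes
    data = load(yaml_text)
    return json.dumps(data, indent=2, sort_keys=True)


def convert_file(src="pymeta.yaml", dst="metadata.json"):
    """Read YAML metadata from src and write JSON metadata to dst."""
    with open(src, encoding="utf-8") as f:
        text = f.read()
    with open(dst, "w", encoding="utf-8") as f:
        f.write(yaml_to_json(text))
```

Running ``convert_file()`` before building a package would leave only
metadata.json for installers to read, so nothing downstream needs a
YAML parser.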

