Clark C. Evans cce at
Mon Sep 20 21:45:19 CEST 2004


Thank you for taking some serious time looking at PyYaml.  I'm not
surprised you have found problems; the entire code base was written
in a very short amount of time and numerous short-cuts were taken.
Tim Parkin has decided on a complete rewrite of PyYaml and that's
great news. For now, you may want to consider using syck, but even
then, you can probably find exploits if you dig hard enough. Patches
are, of course, warmly received.

Primary comments on this thread:

  - YAML was intended from the first day to be a cross-language
    serialization tool. In a mixed-language environment (we use
    Ruby, Python and on occasion Perl) YAML is very nice to use.

  - Unlike XML, YAML has an information model which closely matches
    the needs of programming languages.  I can't express how
    important this is.  We have spent a great deal of time on the
    model; YAML isn't simply a data format.  We are working on
    transformation languages and other generic tools.

  - YAML was created for human reading / authoring.  We have spent
    an enormous amount of time working with real use cases of data
    to find a very clean expression of structured data.   If you
    like Python's use of whitespace to show structure, you will
    probably like YAML.  While automated generation of YAML isn't
    that pretty, it eventually will be.

  - This is a long term project; YAML is designed with the idea that
    data lives far longer than programs.  We are taking our time. We
    have also strived for 'consensus' when possible.  This may seem
    to slow down specification and implementation work; however, we
    are better for it.  There are lots of people who have provided
    critical insights for YAML and it's been a delightful community.

  - Implementing YAML isn't easy.  At every step of the way the
    consensus has been to keep a clean information model and have
    lots of human presentation options.  The only time we favored an
    implementation issue over presentation is when it would prevent
    YAML from being used in a streaming application.  Therefore we
    have stuck with very minimal look-ahead requirements.  That
    said, if you are looking for an LR(1) grammar for ANTLR or Bison,
    I don't think one exists, but I'll gladly be proven wrong.

  - YAML isn't an efficient binary format.  Pickle or something like
    Jelly's s-expressions will be far faster to parse and load.

  - Finally, everyone working on YAML has a full-time job; we do not
    have grant funding or university backing.  Therefore, implementations
    will take time to mature, especially considering the complexity
    of the tradeoffs.

I hope this is helpful to you.


More information about the Python-list mailing list