Clark C. Evans
cce at clarkevans.com
Mon Sep 20 21:45:19 CEST 2004
Thank you for taking some serious time to look at PyYaml. I'm not
surprised you found problems; the entire code base was written
in a very short amount of time and numerous shortcuts were taken.
Tim Parkin has decided on a complete rewrite of PyYaml, and that's
great news. For now, you may want to consider using syck, but even
then you can probably find exploits if you dig hard enough. Patches
are, of course, warmly received.
Primary comments on this thread:
- YAML was intended from the first day to be a cross-language
serialization tool. In a mixed-language environment (we use
Ruby, Python and on occasion Perl) YAML is very nice to use.
- Unlike XML, YAML has an information model which closely matches
the needs of programming languages. I can't express how
important this is. We have spent a great deal of time on the
model; YAML isn't simply a data format. We are working on
transformation languages, and other generic tools.
- YAML was created for human reading / authoring. We have spent
an enormous amount of time working with real use cases of data
to find a very clean expression of structured data. If you
like Python's use of whitespace to show structure, you will
probably like YAML. While automated generation of YAML isn't
that pretty, it eventually will be.
- This is a long term project; YAML is designed with the idea that
data lives far longer than programs. We are taking our time. We
have also striven for 'consensus' when possible. This may seem
to slow down specification and implementation work; however, we
are better for it. There are lots of people who have provided
critical insights for YAML and it's been a delightful community.
- Implementing YAML isn't easy. At every step of the way the
consensus has been to keep a clean information model and have
lots of human presentation options. The only time we favored an
implementation concern over presentation was when a presentation
option would have prevented
YAML from being used in a streaming application. Therefore we
have stuck with very minimal look-ahead requirements. That
said, if you are looking for an LR(1) grammar for ANTLR or Bison,
I don't think one exists; but I'd gladly be proven wrong.
- YAML isn't an efficient binary format. Pickle or something like
Jelly's s-expressions will be far faster to parse and load.
- Finally, everyone working on YAML has a full-time job; we do not
have grant funding or university backing. Therefore, implementations
will take time to mature; especially considering the complexity
of the tradeoffs.
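
To make the readability versus speed trade-off above concrete, here is
a minimal Python sketch. It is an illustration under stated
assumptions, not PyYaml code: the record and its field names are
hypothetical, the YAML text is written by hand and never parsed, and
only the standard-library pickle module is used.

```python
import pickle

# Hypothetical sample record; the names are illustrative only.
record = {"name": "example", "languages": ["Ruby", "Python", "Perl"]}

# The same data written by hand as YAML text. It is not parsed here,
# so no YAML library is needed to run this sketch.
yaml_text = """\
name: example
languages:
  - Ruby
  - Python
  - Perl
"""

# Pickle produces an opaque, Python-specific byte string...
blob = pickle.dumps(record)
# ...that loads straight back into native Python objects.
restored = pickle.loads(blob)

print(yaml_text)
print(type(blob).__name__, len(blob), "bytes")
print(restored == record)
```

The pickled bytes are only meaningful to Python, while the YAML text
can be read and edited by a person, or loaded by a YAML library in
another language.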
I hope this is helpful to you.