I don't think creating new answer tests is hard as such.  What's difficult is setting up the testing environment, uploading new answers to S3, and figuring out how to use the nose answer testing plugin to actually run the tests.  I think Kacper mentioned that he could add some Jenkins magic to automate most of that process, which should ease the burden of adding new answer test results in the future.  While I like your idea, I still think we should be doing quantitative answer testing of all the frontends.

Do we have datasets for the new 3.0-only frontends?  Can we put them on yt-project.org/data?

I like your idea: it nicely integrates some degree of lightweight testing and, more importantly, documents how to use the frontends and validates that loading them works with the latest development version of yt.  I see no problem with hosting it under the main yt_analysis account.

On Wed, Aug 28, 2013 at 1:46 PM, Matthew Turk <matthewturk@gmail.com> wrote:
Hi all,

I'd like to bring up the topic of frontend tests.

As it stands, we have answer tests for a few frontends, and a large
amount of sample data for other frontends.  The answer tests are fine,
but they are also somewhat ... difficult to write and come up with.  I
believe that they should still exist inside the main repo.

However, what I'd like to propose is a new repository
("frontend_tests" perhaps?) that includes scripts for each frontend
that load data, save images, and that we will then *version* a set of
results images and data inside.  This repository will be allowed to
grow larger than we'd like the main yt repository to grow, and it
would also mean that we could use normal pull request mechanisms for
updating results, rather than AWS S3 uploads with keys and so on.

My idea was that it would include something like a helper function
library (for common routines like "what is the current version of the
code" and "commit a new version of this image") and would also include
image-generating scripts and mechanisms for viewing images.  The idea
is that you would run a script at the top level and it would spit out
a bunch of images or data, and there would be templates of HTML that
you could view old-versus-new.  This could then be integrated into our
CI system (I spoke with Kacper about this previously).  It would serve
three purposes:

1) Display results as a function of the current iteration (these
results would not be quantitatively assessed; this would be the
function of the answer testing we already have)
2) Tell us if loading a dataset or frontend breaks
3) Provide light quantitative analysis of results
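To make the old-versus-new viewing idea concrete, here is a minimal sketch of what one such helper could look like.  The function name (`build_comparison_page`), the two-directory layout, and the HTML template are all illustrative assumptions, not existing yt code:

```python
import os

# Hypothetical template: one table row per image, old result on the left,
# newly generated result on the right.
ROW = '<tr><td><img src="{old}"></td><td><img src="{new}"></td><td>{name}</td></tr>'

def build_comparison_page(old_dir, new_dir):
    """Return an HTML page showing old and new result images side by side.

    Only images present in both directories are paired; anything that
    exists on one side only would be flagged separately in a real tool.
    """
    names = sorted(set(os.listdir(old_dir)) & set(os.listdir(new_dir)))
    rows = [ROW.format(old=os.path.join(old_dir, n),
                       new=os.path.join(new_dir, n),
                       name=n)
            for n in names]
    return "<table>\n%s\n</table>" % "\n".join(rows)
```

The top-level script would then just regenerate the images into a fresh directory and write this page out, so a reviewer (or the CI system) can eyeball regressions before committing the new images via a pull request.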

I'm planning to create a repository similar to this regardless (for
demonstrating frontend scripts) but I'm bringing it to the list to see
if it is alright to host under the main yt_analysis team on Bitbucket
and to integrate into testing.  Does this appeal to anyone?  I imagine
it could be much simpler than creating new answer tests.

yt-dev mailing list