I'm writing to follow up on the discussion of a Dask feature branch at last week's development meeting. I want to summarize the plan, add some notes, and make sure everyone has a chance to give initial feedback before we create the branch.
Generally, the plan is to:
1. create a feature branch named `dask`
2a. submit dask-specific PRs to the `dask` feature branch
2b. submit general PRs that come up in dask development to `main`
3. merge `main` into the `dask` branch weekly (at a minimum; a more frequent cadence is welcome)
The use of a feature branch will allow PR review throughout development, avoiding a massive review when finally merging the `dask` branch back into `main`. It also allows multiple developers to work on Dask development more easily.
Some extra notes and clarification:
Dask-specific vs general PRs:
Some changes to add dask support may be better as PRs to `main`. These changes should be non-breaking and not rely on Dask itself. For example, PR 2934 (https://github.com/yt-project/yt/pull/2934) added pickle support to some selection objects to help with Dask development, but it has general applicability so it was submitted to `main`. If it's not clear whether a PR is general enough, it should be submitted to the `dask` branch and reviewers can suggest re-targeting to `main` if appropriate.
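To illustrate why pickle support is a general-purpose change that also helps Dask: distributed schedulers send objects to worker processes by serializing them, so an object that cannot be pickled cannot be shipped to a worker. Below is a minimal sketch of the pattern using only the standard library. `SphereSelector` is a hypothetical stand-in class, not yt's selection object and not the code from PR 2934.

```python
import pickle


class SphereSelector:
    """Hypothetical stand-in for a selection object; not yt's implementation."""

    def __init__(self, center, radius):
        self.center = tuple(center)
        self.radius = float(radius)
        # Derived state that is not serialized; it is rebuilt on load.
        self._radius2 = self.radius ** 2

    def __getstate__(self):
        # Only the defining parameters are pickled.
        return {"center": self.center, "radius": self.radius}

    def __setstate__(self, state):
        # Rebuild the full object, including the derived state.
        self.__init__(state["center"], state["radius"])

    def contains(self, point):
        # True if the point lies inside (or on) the sphere.
        d2 = sum((p - c) ** 2 for p, c in zip(point, self.center))
        return d2 <= self._radius2


selector = SphereSelector(center=(0.5, 0.5, 0.5), radius=0.25)
# Round trip through pickle, as happens when an object is sent to another process.
restored = pickle.loads(pickle.dumps(selector))
```

Nothing in this sketch depends on Dask, which is the kind of change that fits `main` rather than the feature branch.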
Feature branch name:
I'm using `dask` everywhere here… but it could be more exciting. `thedaskening` perhaps (credit to Madicken for this name!)? Feedback on feature branch name is encouraged :)
Current development makes Dask a hard dependency, but a full Dask install is not required. At present the minimal install requires the `array`, `distributed` and `delayed` dependency sets, e.g.:
python -m pip install "dask[array,delayed,distributed]"
This is not far from a full Dask install, so it may be simpler to just require `dask[complete]`. The complete install adds the dask dataframe, dask bag and dask diagnostic features. The extra dependencies that these subsets include are pandas, fsspec, toolz and bokeh (see https://github.com/dask/dask/blob/master/setup.py). Of those, perhaps bokeh is extraneous enough that we should only require the minimal install (bokeh is only used for Dask's interactive browser dashboard for monitoring Client/cluster activity).
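If we go with the minimal install, one option is to check at runtime which supporting packages are present and report the optional ones separately. The following is only a sketch: the mapping from Dask feature sets to package names is an assumption for illustration (Dask's setup.py is the authoritative source), and `missing_packages` and `summarize` are hypothetical helper names, not existing yt or Dask functions.

```python
from importlib.util import find_spec

# Assumed mapping from Dask feature sets to one top-level package each needs.
# Illustration only; consult Dask's setup.py for the real dependency lists.
REQUIRED = {"array": "numpy", "distributed": "distributed"}
OPTIONAL = {"dataframe": "pandas", "diagnostics": "bokeh"}


def missing_packages(feature_map):
    """Return {feature: package} for packages that cannot be found."""
    return {
        feature: package
        for feature, package in feature_map.items()
        if find_spec(package) is None
    }


def summarize(required=REQUIRED, optional=OPTIONAL):
    """Build a short human-readable report of missing packages."""
    lines = []
    for feature, package in missing_packages(required).items():
        lines.append(f"missing required package '{package}' (dask[{feature}])")
    for feature, package in missing_packages(optional).items():
        lines.append(f"optional package '{package}' not installed (dask[{feature}])")
    return lines


if __name__ == "__main__":
    print("\n".join(summarize()) or "all checked packages found")
```

Using `find_spec` on top-level names checks for a package without importing it, so the check stays cheap even for heavy optional dependencies like bokeh.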
I'm planning to start drafting a YTEP once a feature branch has been started, some development has proceeded, and we have some broader input.
Short Term Development Targets:
A couple of short term work directions I've been planning include:
1. a daskified particle reader, currently in my fork of yt here: https://github.com/chrishavlin/yt/tree/dask_init_particle
2. Daskification of derived quantities and "simpler" chunked operations
Please reply with any comments you may have! I'm excited about getting feedback and moving this work forward!
It's been a while since we've had scheduled triage meetings during the
week. Since we're trying to get the 4.0 release out the door, I think this
is a good opportunity to restart them. We can also use this time to cowork
on new features and bugfixes!
I've created two polls. The first is morning in central US timezone to
hopefully catch times where devs located in Europe can join, and the second
is for afternoon/early evening in the central US timezone so devs on the
west coast and Asia can hopefully join. Vote in whatever poll applies to
you (or both if you find all times work). We will choose times to
accommodate the most people, and I personally would love to see all of you
I've put times for next week, but know that these polls are your *general
availability* weekly for the next few months.
Hello everyone! I hope you're all doing well!
I've been working on converting yt's test suite from nose to pytest for a while now, and I'm excited to tell everyone that it's (provisionally) done! All of the answer tests have now been converted to use pytest, and all of the tests (answer and unit) run with pytest. I've also gone through and compared the results generated by nose with those generated by pytest. For the most part, the results match. There are, however, a few discrepancies (hence the aforementioned provisional completion), and I'm investigating those now (if anyone wants to help, you're more than welcome!). Here are some links to more info if you're interested:
PR (I haven't pushed to this in a bit, but I will soon): https://github.com/yt-project/yt/pull/2817
The code I used to do the nose-pytest comparisons: https://github.com/jcoughlin11/comp_repo
The comparison failures: https://github.com/jcoughlin11/gists
A handy table showing the status of each test set: https://hackmd.io/ScakP9ELTdCxnnNEmIChMQ
Also, in addition to filling everyone in on the state of the PR, I figured now was as good a time as any to invite people who are interested to begin (again) the review process, since this is a rather large set of changes.
If you have any questions or comments, let me know! Thanks, and I hope you all have a nice day!
I have a suspicion that the pmods module is not used anymore (at least it’s unused internally), but your feedback would help in case I’m wrong.
Here’s the PR proposing to remove it: https://github.com/yt-project/yt/pull/3061
There were a few things that came out of the team meeting on Friday, but
one was the consensus that we should move toward prioritizing
things labeled with the yt-4.0 label so that we can move to a release.
The other part of this is that for now, we shouldn't add the yt-4.0 label
to new PRs or issues unless discussed on slack or yt-dev.
PS I think I represented this right but if not, please somebody hop in and
The next yt team meeting will be this Friday, February 19 at 3pm GMT (10am
EST). I will post a link to the meeting in the slack channel just
before the meeting. All are welcome to attend!
It's time for our next yt team meeting! If you'd like to participate, *please
fill out the poll below by Friday, February 12, 2021*.
The yt team meeting is a time where we discuss all things related to the
project, including ongoing development projects, releases, governance,
events, etc. Most of the steering committee will attend, but all are
welcome to come and bring items of discussion.
I have been working on a relaunch of the yt blog with Matt over the past
few weeks (it was previewed at the RHytHM workshop, if you want to watch the
talk about it, linked below)! Check it out at https://blog.yt-project.org/
Some of the things we tried to do when relaunching the blog were:
- Find a format that allowed for easy filtering of posts (in this blog we
have tags, categories, and a search feature)
- Find a format that has a nice post preview (with images)
- Choose something that is accessible to various levels of contribution.
We would love your feedback and contributions on the new blog! What can
make it *even better*? What topics would you like to see on the blog? What
could *you* add to the blog???? I personally think it would be cool to see:
- How you use yt for YOUR SCIENCE
- What cool tips and tricks do you use in yt?
- Do you have a favorite feature? What is it and why?
- Did you write a cool extension that uses yt? Let's hear about it!
- Do you have a very cool colormap theme that you use with yt based on
album covers of an artist??? What does science data look like with them???
- And so much more!
We have two different ways of contributing a blog post: either as an issue
or as a PR. You can check out the differences in the contributor guide.
We've moved the blog repo into the yt project organization; it is now
located at https://github.com/yt-project/blog
You can find the blog hosted at https://blog.yt-project.org/
I can't wait to see all of your posts!!!
PS -- Chris Havlin has already contributed two new posts to the blog! Dask!
Napari! So many cool things! Check them out.