About sixteen months ago, I launched the SciPy Documentation Project and its Marathon. Dozens pitched in and now numpy docs are rapidly approaching a professional level. The "pink wave" ("Needs Review" status) is at 56% today! There is consensus among doc writers that much of the rest can be labeled in the "unimportant" category, so we're close to starting the review push (hold your fire, there is a web site mod to be done first).

We're also nearing the end of the summer, and it's time to look ahead. The path for docs is clear, but the path for SciPy is not. I think our weakest area right now is organization of the project. There is no consensus-based plan for improvement of the whole toward a stated goal, no centralized coordination of work, and no funded work focused on many of our weaknesses, notwithstanding my doc effort and what Enthought does for code.

I define success as popular adoption in preference to commercial packages. I believe in vote-with-your-feet: this goal will not be reached until all aspects of the package and its presentation to the world exceed those of our commercial competition. Scipy is now a grass roots effort, but that takes it only so far. Other projects, such as OpenOffice and Sage, don't follow this model and do produce quality products that compete with commercial offerings, at least on open-source platforms. Before we can even hope for that, we have to do the following:

- Docs
  - Rest of numpy reference pages reviewed and proofed or marked unimportant
  - Scipy reference pages
  - User manual for the whole toolstack
  - Multiple commercial books
- Packaging
  - Personal Package Archive or equivalent for every release of every OS for the full toolstack (There are tools that do this but we don't use them. NSF requires Metronome - http://nmi.cs.wisc.edu/ - for funding most development grants, so right now we're not even on NSF's radar.)
  - Track record of having the whole toolstack installation "just work" in a few command lines or clicks for *everyone*
  - Regular, scheduled releases of numpy and scipy
  - Coordinated releases of numpy, scipy, and stable scikits into PPA system
- Public communication
  - A real marketing plan
  - Executing on that plan
  - Web site geared toward multiple audiences, run by experts at that kind of communication
  - More webinars, conference booths, training, aimed at all levels
  - Demos, testimonials, topical forums, all showcased
- Code
  - A full design review for numpy 2.0
    - No more inconsistencies like median(), lacking "out", degrees option for angle functions?
    - Trimming of financial functions, maybe others, from numpy?
    - Package structure review (eliminate "fromnumeric"?)
    - Goal that this be the last breakage for numpy API (the real 1.0)
  - Scipy
    - Is it maintainable? Should it be broken up?
    - Clear code addition path (or decide never to add more)
    - Docs (see above)
  - Add-on packages
    - Both existence of and good indexing/integration/support for field-specific packages
    - Clearer development path for new packages
    - Central hosting system for packages (svn, mailing lists, web, build integration, etc.)
    - Simultaneous releases of stable packages along with numpy/scipy

I posted a basic improvement plan some years back. The core ideas have not changed; it is linked from the bottom of http://scipy.org/Developer_Zone. I chose our major weakness to begin with and started the doc project, using some money I could justify spending simply for the utility of docs for my own research. I funded the work of two doc coordinators, one each this summer and last.

Looking at http://docs.scipy.org/numpy/stats/, you can see that when a doc coordinator was being paid (summers), work got done. When not, then not. Without publicly announcing what these guys made, I'll be the first to admit that it wasn't a lot.
Yet, those small sums bought a huge contribution to numpy through the work of several dozen volunteers and the major contributions of a few. My conclusion is that active and constant coordination is central to motivating volunteer work, and that without a salary we cannot depend on coordination remaining active.

On the other hand, I have heard Enthought's leaders bemoan the high cost of devoting employee time to this project, and the low returns available from selling support to universities and non-profit research institutes. Their leadership has moved us forward, particularly in the area of code, but has not provided the momentum necessary to carry us forward on all fronts.

It is time for the public and education sectors to kick in some resources and organizational leadership. We are, after all, benefitting immensely. Since the cost of employee time is not so high for us in the public and education sectors, I propose to continue hiring people like Stefan and David as UCF employees or contractors, and to expand to hiring others in areas like packaging and marketing, provided that funding for those hires can be found.

However, my grant situation is no longer as rich as it has been the past two years, and the needs going forward are greater than in the past if we're now to tackle all the points above. So, I will not be hiring another doc guru from my research grants next year. I am confident that others are willing to pitch in financially, but few will pitch in a full FTE, and we need several.

We can (and will) set up a donations site, but donation sites tend to receive pizza money unless a sugar daddy comes along. Those benefitting most from the software, notably education, non-profit research, and government institutions, are *forbidden* from making donations by the terms of their grants. NSF doesn't give you money so you can give it away. We need to provide services they can buy on subcontract and a means for handling payments from them.
Selling support does not solve the problem, as that requires spending most of the income on servicing that particular client. Rather, we need to sell a chunk of documentation or the packaging of a particular release, and then provide the product not just to that client but to everyone.

We can also propose directly for federal and corporate grant funds. I have spoken with several NASA and NSF program managers and with Google's Federal Accounts Representative, and the possibilities for funding are good. But I am not going to do this alone. We need a strong proposal team to be credible.

So, I am seeking a group that is willing to work with me to put up the infrastructure of a funded project, to write grant proposals, and to coordinate a financial effort. Members of this group must have a track record of funded grants, business success, foundation support, etc. We might call it the SciPy Foundation. It could be based at UCF, which has a low overhead rate and has infrastructure (like an HR staff), or it might be independent if we can find a good director willing to devote significant time for relatively low pay compared to what they can likely make elsewhere. I would envision hiring permanent coordinators for docs, packaging, and marketing communications. Enthought appears to have code covered by virtue of having hired Travis, Robert, etc.; how to integrate that with this effort is an open question but not a difficult one, I think, as code is our strongest asset at this point.

I invite discussion of this approach and the task list above on the scipy-dev@scipy.org mailing list. If you are seeing this post elsewhere, please reply only on scipy-dev@scipy.org. If you are eligible to lead funding proposals and are interested in participating in grant writing and management activities related to work in our weak areas, please contact me directly.

Thanks,

--jh--

Prof. Joseph Harrington
Planetary Sciences Group
Department of Physics
MAP 414
4000 Central Florida Blvd.
University of Central Florida
Orlando, FL 32816-2385
jh@physics.ucf.edu
planets.ucf.edu