[Twisted-Python] Let's talk about maintaining Lore (and validity of tickets)
Jean-Paul recently closed a Lore ticket as invalid, and suggested we have a discussion about Lore's future direction. This strikes me as a very good idea, and so I wrote a message which is a bit too long (for which I apologize) to kick that off. The discussion began here: <http://twistedmatrix.com/trac/ticket/6313#comment:7>. In said suggestion, JP said:
I am rejecting this Lore feature as unnecessary for Twisted's current documentation needs.
With regard to this specific point: the bug was discovered when building the documentation for the systemd-howto-5601 branch: <https://buildbot.twistedmatrix.com/builders/documentation/builds/2919/>. Presumably a better error would have facilitated development. So, while better error reporting of this error case may not be "necessary" I think it clearly would have been beneficial in this case and would perhaps be beneficial in similar cases in the future. Is necessity or benefit the standard which keeps a ticket open in our tracker? Do we have a documented standard for how necessary (or beneficial) a ticket must be anywhere?
However if there is genuine interest in enhancing Lore (specifically: obsoleting the lore2sphinx conversion tool), then I'm open to reconsidering that.
I don't think these two paths (lore2sphinx and continuing to maintain lore) are necessarily mutually exclusive. Also I think it implies something about the current state of affairs that isn't accurate - e.g. that the Twisted team has agreed that Sphinx will surely replace Lore and that we are making progress on that process of replacement more than we are maintaining Lore itself.
Unfortunately, I think it will be clear to anyone following its progress that lore2sphinx is unmaintained and the sphinx migration effort is stalled. Nobody has committed to <https://bitbucket.org/khorn/lore2sphinx> in a year and a half, about the same amount of time that <http://twistedmatrix.com/trac/browser/branches/sphinx-conversion-4500> has been idle as well. By contrast, <http://twistedmatrix.com/trac/browser/trunk/twisted/lore> has seen commits - albeit not many - within only a couple of weeks. So, empirically, we're already maintaining lore and lore2sphinx is currently "obsolete"; really the question should be if we want to reverse that path.
I also have no objection if someone wants to complete the lore2sphinx work, but if the lore2sphinx buildbot were to die tomorrow and go offline, I wouldn't be particularly anxious to spend a lot of resources to fix it.
My position on this was always that if someone wanted to improve the documentation, they were welcome to do so, and if they wanted to use Sphinx to do it, that's great too. I just wasn't willing to tolerate any period where our toolchain was broken and we couldn't generate documentation for a release. And a good thing we didn't, by the way! If we had said "go ahead, pull the trigger, whatever, it's OK to break trunk for a little while!" we wouldn't have had any documentation toolchain for the last 2 years.
(I hope that everyone takes this to heart the next time someone wants to break our development process "for a little while, just during the migration" to move to Github, or Jenkins, or Travis-CI or whatever.)
Basically, this ticket is a demonstration of "stumble around in the dark" development in action. We don't need more of that (and I know I'm as guilty as anyone else). If someone wants to turn on a light, great. Otherwise, everyone out of the basement and find something more valuable on which to spend your time.
I don't think that this metaphor is particularly... illuminating. While I can sort of guess what you're talking about, it's all pretty implicit and seems to make several assumptions I am not sure that I agree with.
What's wrong with stumbling around in the dark? If we had a hierarchically-managed product-driven organization, then having focus and a clearly communicated, consistently enforced shared goal would be important to effectively produce that product, but community projects don't seem to work that way. Consensus is important, but even given a consensus, the pool of resources for development that we can allocate via executive decision is fairly small, and is just about sufficient to pay for code reviews of the contributions that we receive and to take care of administrivia, not to do substantial new development. We have to rely on volunteer contributions for that.
I'm also sure our tools have a million boring little niggling bugs that need to be discovered and addressed so that the average experience of using and working on Twisted is as pleasant as possible, and we don't want to discourage people from reporting them; that's also a useful volunteer function. Does it harm any members of the Twisted development team to have other members of said team (by the way, hi rwall, congrats on your commit access) file these sorts of legitimate, but trivial bugs in uninteresting bits of support code in Twisted, like lore or our release-management tools?
To play my own debate opponent here: perhaps it does. The bug tracker is a resource, and new bugs consume the attention of core developers, as we each probably pay attention to see if users are reporting serious problems we should fix. Collectively, that attention is arguably our most precious resource and we should be careful not to waste it. So we don't want the shared resource of the issue tracker to suffer from a tragedy of the commons and get filled up with junk bugs so we can't find the good ones.
Closing tickets as invalid is a way to draw a line around what we're trying to get accomplished and to prevent future attention from being wasted. But attention is worthless without enthusiasm and skill, and having one's tickets closed as invalid does potentially sap one's enthusiasm and thereby one's motivation to acquire further skills. So more determinedly closing things as invalid may be robbing Peter to pay Paul.
Also, in this case, I would question the classification of "invalid"; I like to use "invalid" for bugs which are clearly not actionable. #6313 describes a clear problem (a traceback), and after clarification, a clear course of action (improve the error message). If we don't believe the problem should be fixed, then we should say "wontfix". I think this distinction is important because actually invalid (too vague to be actionable in any way) bugs are in fact a waste of time, and provoke a good deal of pointless discussion before they die. Wontfix bugs are more of a good-faith mistake on the part of the reporter :-).
With tickets such as this one, I think that what we (members of the Inner Circle, I guess; we should have a secret handshake or something) ought to be doing is:
- setting the priority to 'lowest' (while this has very little real practical or process-enforced consequence, it should at least help others not get distracted by it in the future if they're looking for something to do)
- directing the bug reporter to a more useful ticket by linking to something that we wish someone would work on
- once there's a positive pointer towards something more useful, explaining that (maintaining lore/changing the background color of the website/changing the order that we send response headers in HTTP) is peripheral to Twisted's mission of providing awesome internet APIs to programmers everywhere, but that we'd still be happy to receive a patch that addressed the issue with our code, provided that it adheres to all the relevant testing, coding standard, and compatibility requirements and doesn't waste a reviewer's time
It's challenging to put useful comments on tickets, especially apparently pointless or ill-defined tickets. It's also just tiring: a lot of the comments one needs to make are incredibly repetitive and redundant. But, since I believe it's clear that few, if any, people actually get their priorities of what to do for Twisted by scanning the bugtracker for recently-filed open issues, I posit that there's not a lot of value in ticket triage that doesn't make its primary goal the repeated communication of documented project policy, existing consensus, and constant positive suggestions as to what contributors should take as a next step.
In this particular case, that means that "everybody out of the basement" is a vague, confusing, and unhelpful comment that just feels mildly insulting to the other people participating in the discussion on the bug.
A more helpful comment might have read: "I would prefer it if you would work on a high-priority ticket like ticket 84 instead of this one, since I believe the Twisted team has a general consensus that lore will be obsoleted and no-one wants to be responsible for it; see ticket 4500 for more details on one effort to do that."
More generally, I think that when one of us is tempted to shut down a bug like this, a better thing to do would be to write a wiki page or a blog post that can be refined by discussion, and can be an artifact that serves as the point of reference for some rough consensus (like, e.g. <http://twistedmatrix.com/trac/wiki/CompatibilityPolicy>) updated by subsequent discussions, and then link to that discussion.
This does all sort of raise the question of "why do we bother to keep a database of tickets around, anyway", and how we should address the warehousing of a potentially increasing number of hypothetically valid bugs that we just don't care enough about to fix. I haven't really addressed those questions very well here, so I do hope to hear more from all of you about that issue.
So, rwall, hopefully now you'll go close #84 instead of either updating 6313 or responding to this message :).
-glyph
On Fri, Mar 1, 2013 at 2:29 AM, Glyph <glyph@twistedmatrix.com> wrote:
Jean-Paul recently closed a Lore ticket as invalid, and suggested we have a discussion about Lore's future direction. This strikes me as a very good idea, and so I wrote a message which is a bit too long (for which I apologize) to kick that off.
I don't think these two paths (lore2sphinx and continuing to maintain lore) are necessarily mutually exclusive. Also I think it implies something about the current state of affairs that isn't accurate - e.g. that the Twisted team has agreed that Sphinx will surely replace Lore and that we are making progress on that process of replacement more than we are maintaining Lore itself.
Unfortunately, I think it will be clear to anyone following its progress that lore2sphinx is unmaintained and the sphinx migration effort is stalled. Nobody has committed to <https://bitbucket.org/khorn/lore2sphinx> in a year and a half, about the same amount of time that <http://twistedmatrix.com/trac/browser/branches/sphinx-conversion-4500> has been idle as well. By contrast, <http://twistedmatrix.com/trac/browser/trunk/twisted/lore> has seen commits - albeit not many - within only a couple of weeks. So, empirically, we're already maintaining lore and lore2sphinx is currently "obsolete"; really the question should be if we want to reverse that path.
Somewhat orthogonal to your point, but this is incorrect. lore2sphinx was some time ago into "lore2sphinx-ng" and "rstgen".
https://bitbucket.org/khorn/lore2sphinx-ng https://bitbucket.org/khorn/rstgen
This was initially done as an experiment in using a more explicit "formatting model" for the generation for the Sphinx docs (and somewhat due to _your_ prodding, Glyph), and so I didn't initially make a big announcement or anything.
Once it became apparent that it was actually going to work out better, I sent out some emails to those who had expressed interest in helping with the whole lore2sphinx project, though I don't believe I sent out anything to the twisted list in general, as I probably should have. I'll point out that I can count people who have shown interest in moving this forward on one hand, though.
And I've specifically mentioned that I had done said forking to you, Glyph, in IRC ;) (though it's IRC after all...who remembers what happens in IRC?)
I thought I had put a notice up in the readme file in the lore2sphinx repo, but as it isn't there, I presume I either forgot, or never got it merged, or something.
So, totally my bad for not communicating better, but I have NOT given up on converting things from Lore into Sphinx. (Nor do I intend to.)
Thinking about it, I suppose I've been somewhat reticent to do much communicating about any work I do on this, as what seems to happen is that it just gives everyone an excuse to bring up some new objection to actually getting the conversion done. I hadn't really realized this consciously until just now, though.
I also have no objection if someone wants to complete the lore2sphinx work, but if the lore2sphinx buildbot were to die tomorrow and go offline, I wouldn't be particularly anxious to spend a lot of resources to fix it.
My position on this was always that if someone wanted to improve the documentation, they were welcome to do so, and if they wanted to use Sphinx to do it, that's great too. I just wasn't willing to tolerate any period where our toolchain was broken and we couldn't generate documentation for a release. And a good thing we didn't, by the way! If we had said "go ahead, pull the trigger, whatever, it's OK to break trunk for a little while!" we wouldn't have had any documentation toolchain for the last 2 years.
And since we didn't break the toolchain, I've been in no particular hurry. I've accepted that this will take approximately a billion years. So no rush.
On the other hand, I have at several points been willing to make the "cutover", and for various different reasons, been told it wasn't happening until things were closer to "perfect" (for some value of "perfect") than they were at the time.
The current output of the old lore2sphinx branch is functional, though has a few warts (mostly extraneous spaces in the output). These warts were apparently enough to block adoption.
It has been a pretty discouraging effort at times, I have to say, as I seem to garner agreement/support/buy-in/whatever for a particular course of action (e.g. getting 99% of the way there, and then fixing Sphinx markup manually, which was the original plan, way back when), and focus my efforts in that direction. Then when we're ready to proceed on that basis, I've had another task/challenge/set of requirements/whatever added to the work that needs to be done. In fact I still think that if the Twisted community had actually wanted to, we could have switched over to Sphinx at the first PyCon Atlanta (2010?).
Anyway, I'm not giving up. If nothing else, I'll end up with a nice reStructuredText-generating library. And if Twisted never ends up adopting Sphinx as a doc tool, eventually I'll still be able to read the Twisted docs in a format that I can navigate and doesn't hurt my eyes to look at. :)
But I'd really rather see Twisted adopt Sphinx, and get rid of Lore.
Help accepted.
-- Kevin Horn
On Fri, Mar 1, 2013 at 11:44 AM, Kevin Horn <kevin.horn@gmail.com> wrote:
Arg. Why do you always notice the errors _right_ after you send the mail?
Somewhat orthogonal to your point, but this is incorrect. lore2sphinx was SPLIT some time ago into "lore2sphinx-ng" and "rstgen".
On Mar 1, 2013, at 9:44 AM, Kevin Horn <kevin.horn@gmail.com> wrote:
On Fri, Mar 1, 2013 at 2:29 AM, Glyph <glyph@twistedmatrix.com> wrote:
Jean-Paul recently closed a Lore ticket as invalid, and suggested we have a discussion about Lore's future direction. This strikes me as a very good idea, and so I wrote a message which is a bit too long (for which I apologize) to kick that off.
I don't think these two paths (lore2sphinx and continuing to maintain lore) are necessarily mutually exclusive. Also I think it implies something about the current state of affairs that isn't accurate - e.g. that the Twisted team has agreed that Sphinx will surely replace Lore and that we are making progress on that process of replacement more than we are maintaining Lore itself.
Unfortunately, I think it will be clear to anyone following its progress that lore2sphinx is unmaintained and the sphinx migration effort is stalled. Nobody has committed to <https://bitbucket.org/khorn/lore2sphinx> in a year and a half, about the same amount of time that <http://twistedmatrix.com/trac/browser/branches/sphinx-conversion-4500> has been idle as well. By contrast, <http://twistedmatrix.com/trac/browser/trunk/twisted/lore> has seen commits - albeit not many - within only a couple of weeks. So, empirically, we're already maintaining lore and lore2sphinx is currently "obsolete"; really the question should be if we want to reverse that path.
Somewhat orthogonal to your point, but this is incorrect. lore2sphinx was split some time ago into "lore2sphinx-ng" and "rstgen".
Hi Kevin! Long time no see! (Too long, obviously!)
https://bitbucket.org/khorn/lore2sphinx-ng https://bitbucket.org/khorn/rstgen
This was initially done as an experiment in using a more explicit "formatting model" for the generation for the Sphinx docs (and somewhat due to _your_ prodding, Glyph), and so I didn't initially make a big announcement or anything.
I do remember this. The previous output of lore2sphinx really was unreliable enough that it was creating a never-ending treadmill of irrelevant / unpredictable Lore source fixes that were really dragging the whole process out. Thanks for working on improving it.
Once it became apparent that it was actually going to work out better, I sent out some emails to those who had expressed interest in helping with the whole lore2sphinx project, though I don't believe I sent out anything to the twisted list in general, as I probably should have. I'll point out that I can count people who have shown interest in moving this forward on one hand, though.
More discussion on this list would almost always be better. We are a *long* way from too much traffic here. (And, this update is honestly a surprise to me.)
And I've specifically mentioned that I had done said forking to you, Glyph, in IRC ;) (though it's IRC after all...who remembers what happens in IRC?)
Based on this exchange, my understanding was simply that you had started to try to improve lore2sphinx, but then wandered off again.
I thought I had put a notice up in the readme file in the lore2sphinx repo, but as it isn't there, I presume I either forgot, or never got it merged, or something.
So, totally my bad for not communicating better, but I have NOT given up on converting things from Lore into Sphinx. (Nor do I intend to.)
OK. Let's move things along then. Several people showed up on IRC yesterday and voiced an interest in helping out, although what to do next - especially what to do next for a new contributor who does *not* want to try to reverse-engineer the conversion itself - needs to be made much, much clearer.
Thinking about it, I suppose I've been somewhat reticent to do much communicating about any work I do on this, as what seems to happen is that it just gives everyone an excuse to bring up some new objection to actually getting the conversion done. I hadn't really realized this consciously until just now, though.
Communicate constantly. The biggest objection that _I_ have to getting the conversion done at this point is that the people working on it (well, okay: you) are uncommunicative, unreliable and frequently unavailable. ;-) If you were just keeping us all up to date - even just to complain! - I'd be much more sanguine about the whole thing. And apparently some of your misconceptions would have been corrected a lot earlier.
I also have no objection if someone wants to complete the lore2sphinx work, but if the lore2sphinx buildbot were to die tomorrow and go offline, I wouldn't be particularly anxious to spend a lot of resources to fix it.
My position on this was always that if someone wanted to improve the documentation, they were welcome to do so, and if they wanted to use Sphinx to do it, that's great too. I just wasn't willing to tolerate any period where our toolchain was broken and we couldn't generate documentation for a release. And a good thing we didn't, by the way! If we had said "go ahead, pull the trigger, whatever, it's OK to break trunk for a little while!" we wouldn't have had any documentation toolchain for the last 2 years.
And since we didn't break the toolchain, I've been in no particular hurry. I've accepted that this will take approximately a billion years. So no rush.
It does not have to take a billion years. The criteria ought to be clear - and if they aren't, you should have asked for clarification :).
On the other hand, I have at several points been willing to make the "cutover", and for various different reasons, been told it wasn't happening until things were closer to "perfect" (for some value of "perfect") than they were at the time.
Let's be specific: <http://twistedmatrix.com/trac/ticket/5312> is in need of some final code-review. Despite several reviews and an apparently extensive final response pass, it's not currently in review, which means it's still in your court for some reason. There is no reason to hold back on this and try to do *everything* in one big bang: this code just needs to be production-quality and land on trunk _before_ the ReST sources themselves are ready to go.
Probably something needs to happen to the buildbot build steps, too, since there's this nastiness that did an end-run around our development process to get checked in to the buildbot config without tests instead of into twisted with tests, <http://buildbot.twistedmatrix.com/builders/documentation/builds/2994/steps/p...>, and that needs to be replaced with a command that's just like "build the docs, whether they be lore or sphinx or docbook or whatever". But, Tom's got your back here; if you can get this done during his fellowship (see today's post, <http://labs.twistedmatrix.com/2013/03/welcome-our-new-twisted-fellow-tom.htm...>) I estimate you will see a completed reconfiguration within hours.
Once that's done, then it's a matter of putting <http://tm.tl/4500> into code-review with the output of the lore2sphinx builder. That review can be somewhat expedited, and can be done in parallel by lots of people: since there are no unit tests to be worried about, and formatting fixes can be done quickly by multiple people, we don't need a big formal code review.
The current output of the old lore2sphinx branch is functional, though has a few warts (mostly extraneous spaces in the output). These warts were apparently enough to block adoption.
Let's not under-state the problem: thanks to the jaw-droppingly weird arbitrariness of the ReST format, "extraneous spaces" can mean "arbitrarily mangled output". But no, even these "warts" were not enough to block adoption. What blocked adoption is that the painstakingly hand-tweaked lore sources that did not have any more "warts" were left to languish (and bit-rot, and now probably require more manual fixing) while we waited for 2 years for someone to actually finish the sphinx development and release management tools and get them finalized. As I recall we basically finished fixing them all up, at the time.
There were three reasons that I personally kept pressing for a more thorough lore -> sphinx converter. One is not necessarily necessary.
First, and most importantly, is the bit-rot problem: people are working on lore docs in parallel with this effort. And, despite this exchange, I want to be clear that they should keep doing so: nobody should stop working on docs in the meanwhile, since we have no way to tell how much longer this will take. Looking at the modified docs on the sphinx buildbot is challenging, and keeping track of random whitespace jiggling is not documented on <http://twistedmatrix.com/trac/wiki/ReviewProcess#Reviewers:Howtoreviewachang...>. *I* can't even remember how to do the math to associate one of the results in <http://buildbot.twistedmatrix.com/builds/sphinx-html/>. And now that there have been so many changes (as I predicted there might be) we have to figure out what's changed, and re-review to make sure that everything (or at least a big enough majority of everything) is OK to go to trunk. If the tool itself could be verified to produce correct output for all the cases we've encountered where it falls over, we wouldn't have to do this manual verification step; we could just trust that it was right, because it has tests that indicate it's correct. Of course it's possible there might be *some* corner-case it still doesn't handle and that we didn't find, but if the tool is known to be broken in a large number of cases that we just have to magically know to avoid, then it's likely people will keep unknowingly re-introducing those problems.
Second, there are going to be some doc patches in-progress whenever the cutover happens. Now, this is a bit less of a concern, because we can just manually translate one or two paragraphs to the new markup if necessary. But it would still be nice to have a tool that does the job well enough that someone could grab the buildbot output for an in-progress doc fix and keep working on it without having to learn how to re-express everything in Sphinx first.
Third, the output is just hella grody right now. Have a look here, for example: <http://buildbot.twistedmatrix.com/builds/sphinx-html/989-37334/_sources/proj...>. *Tons* of peculiar and unnecessary vertical whitespace, and very ragged right edges where the word wrap doesn't seem to respect line lengths. This means that every change that hits these documents is going to produce a lot of unnecessary delta when authors try to clean up some of this mess to make it nicer to edit.
Spot-checking some of the output now, it seems like the tool must have been upgraded, or we've been lucky, since I can't spot any obvious bit-rot (and I could swear the docs look a lot less grody; the problems I mentioned there). So maybe you've already addressed these problems, or they're not actually that serious any more. But, as I said in the first point, spot-checking isn't enough.
It has been a pretty discouraging effort at times, I have to say, as I seem to garner agreement/support/buy-in/whatever for a particular course of action (e.g. getting 99% of the way there, and then fixing Sphinx markup manually, which was the original plan, way back when), and focus my efforts in that direction. Then when we're ready to proceed on that basis, I've had another task/challenge/set of requirements/whatever added to the work that needs to be done. In fact I still think that if the Twisted community had actually wanted to, we could have switched over to Sphinx at the first PyCon Atlanta (2010?).
By 'actually wanted to' you mean 'be willing to abandon the development process for this one thing'. We do not abandon the development process. Every past attempt at doing so to facilitate some feature has been a road to ruin. Although this process has been frustrating for you, I am still happier with the current outcome (Twisted has perfectly functional documentation in our downloads and on our website) than with the alternative (create a situation where we could not produce a release for two years because the tools were languishing unfinished while we waited for you to say something about it).
I'm sorry that this has been a frustrating process for you. And I'm not just saying that to be polite: I genuinely *am* sorry that our communication has not been clear, and that we have had wasted effort all around because of that. But I am fairly sure that we have had basically the same requirements for this process from day one. Let me state them here:
- We need to have release-automation tools that allow developers to produce a release, including documentation.
- These tools need to be subjected to the same development process as the rest of those tools, which is to say the same process as for the rest of Twisted.
- The documentation itself needs to be able to be generated from any version of trunk.
- While one or two formatting snafus are acceptable to be fixed after the fact, the documentation needs to be in a comprehensible state in every revision of trunk, which means that in order to land on trunk, the ReST output.
Really, most of the work has been done here already. The docs appear to be in a mostly-workable state. lore2sphinx looks like maybe it's doing a good enough job, maybe better than the last time I looked at it. The _major_ hang-up is getting the release management tools over their final hump and just driving the trac tickets to completion. With Tom keeping the review queue basically empty right now, this is an excellent opportunity to get that done.
It may make sense to schedule an event where we all show up on IRC, everyone claims a documentation component, and we all do a final review pass to make sure that the formatting problems aren't too bad before going to trunk with the cut-over. This pre-supposes that the release/building tools are done and on trunk though.
Anyway, I'm not giving up. If nothing else, I'll end up with a nice reStructuredText-generating library. And if Twisted never ends up adopting Sphinx as a doc tool, eventually I'll still be able to read the Twisted docs in a format that I can navigate and doesn't hurt my eyes to look at. :)
But I'd really rather see Twisted adopt Sphinx, and get rid of Lore.
Help accepted.
All right! I hope this exchange has gotten some people fired up to cross the finish line. It's surprisingly close! Thanks for updating us, Kevin - better late than never :).
-glyph
P.S.: apologies for any errors. I didn't even really have the time to write this email, let alone copy-edit it.
On Fri, Mar 1, 2013 at 4:15 PM, Glyph <glyph@twistedmatrix.com> wrote:
On Mar 1, 2013, at 9:44 AM, Kevin Horn <kevin.horn@gmail.com> wrote:
On Fri, Mar 1, 2013 at 2:29 AM, Glyph <glyph@twistedmatrix.com> wrote:
Jean-Paul recently closed a Lore ticket as invalid, and suggested we have a discussion about Lore's future direction. This strikes me as a very good idea, and so I wrote a message which is a bit too long (for which I apologize) to kick that off.
I don't think these two paths (lore2sphinx and continuing to maintain lore) are necessarily mutually exclusive. Also I think it implies something about the current state of affairs that isn't accurate - e.g. that the Twisted team has agreed that Sphinx will surely replace Lore and that we are making progress on that process of replacement more than we are maintaining Lore itself.
Unfortunately, I think it will be clear to anyone following its progress that lore2sphinx is unmaintained and the sphinx migration effort is stalled. Nobody has committed to <https://bitbucket.org/khorn/lore2sphinx> in a year and a half, about the same amount of time that <http://twistedmatrix.com/trac/browser/branches/sphinx-conversion-4500> has been idle as well. By contrast, <http://twistedmatrix.com/trac/browser/trunk/twisted/lore> has seen commits - albeit not many - within only a couple of weeks. So, empirically, we're already maintaining lore and lore2sphinx is currently "obsolete"; really the question should be if we want to reverse that path.
Somewhat orthogonal to your point, but this is incorrect. lore2sphinx was split some time ago into "lore2sphinx-ng" and "rstgen".
Hi Kevin! Long time no see! (Too long, obviously!)
https://bitbucket.org/khorn/lore2sphinx-ng https://bitbucket.org/khorn/rstgen
This was initially done as an experiment in using a more explicit "formatting model" for the generation for the Sphinx docs (and somewhat due to _your_ prodding, Glyph), and so I didn't initially make a big announcement or anything.
I do remember this. The previous output of lore2sphinx really was unreliable enough that it was creating a never-ending treadmill of irrelevant / unpredictable Lore source fixes that were really dragging the whole process out. Thanks for working on improving it.
That "never-ending" series of Lore source fixes took place over the course of a couple of weeks. Doing things that way was not my idea, though it seemed reasonable at the time because the idea was that we would do the cutover at the end of it.
Once it became apparent that it was actually going to work out better, I sent out some emails to those who had expressed interest in helping with the whole lore2sphinx project, though I don't believe I sent out anything to the twisted list in general, as I probably should have. I'll point out that I can count people who have shown interest in moving this forward on one hand, though.
More discussion on this list would almost always be better. We are a *long* way from too much traffic here. (And, this update is honestly a surprise to me.)
And I've specifically mentioned that I had done said forking to you, Glyph, in IRC ;) (though it's IRC after all...who remembers what happens in IRC?)
Based on this exchange, my understanding was simply that you had started to try to improve lore2sphinx, but then wandered off again.
I never "wandered off". Been here the whole time. I've been in #twisted almost continually for about the last 3 years, and in #twisted-dev for about a year (I didn't realize it existed before that). I just got tired of (my perception) talking to myself about doing the conversion. So I was being quiet. Granted, I shouldn't have been, and that's on me. But it's not like I'm hard to get a hold of.
I thought I had put a notice up in the readme file in the lore2sphinx repo, but as it isn't there, I presume I either forgot, or never got it merged, or something.
So, totally my bad for not communicating better, but I have NOT given up on converting things from Lore into Sphinx. (Nor do I intend to.)
OK. Let's move things along then.
Yes, let's.
Several people showed up on IRC yesterday and voiced an interest in helping out, although what to do next - especially what to do next for a new contributor who does *not* want to try to reverse-engineer the conversion itself - needs to be made much, much clearer.
The last day or two have probably not been the best to try and get my attention, especially yesterday, as I essentially worked a 14 hr day trying to meet a deadline. But I see the conversation on IRC. I'll note that no one seems to have considered asking me anything about it. Looks like it was about 4am, though, so perhaps that wouldn't have done much good, as I was asleep. :) But hey...I have email! Ask me! I'll talk your ear off about it! (As an aside, lore2sphinx is in no way a "broken pile of regexes". Not to say that it isn't broken in some really significant ways, because it is, but it doesn't use regexes at all. Just sayin'.)
Thinking about it, I suppose I've been somewhat reticent to do much communicating about any work I do on this, as what seems to happen is that it just gives everyone an excuse to bring up some new objection to actually getting the conversion done. I hadn't really realized this consciously until just now, though.
Communicate constantly. The biggest objection that _I_ have to getting the conversion done at this point is that the people working on it (well, okay: you) are uncommunicative, unreliable and frequently unavailable. ;-) If you were just keeping us all up to date - even just to complain! - I'd be much more sanguine about the whole thing. And apparently some of your misconceptions would have been corrected a lot earlier.
I got tired of complaining. And arguing.
I also have no objection if someone wants to complete the lore2sphinx work, but if the lore2sphinx buildbot were to die tomorrow and go offline, I wouldn't be particularly anxious to spend a lot of resources to fix it.
My position on this was always that if someone wanted to improve the documentation, they were welcome to do so, and if they wanted to use Sphinx to do it, that's great too. I just wasn't willing to tolerate any period where our toolchain was broken and we couldn't generate documentation for a release. And a good thing we didn't, by the way! If we had said "go ahead, pull the trigger, whatever, it's OK to break trunk for a little while!" we wouldn't have had any documentation toolchain for the last 2 years.
And since we didn't break the toolchain, I've been in no particular hurry. I've accepted that this will take approximately a billion years. So no rush.
It does not have to take a billion years. The criteria ought to be clear - and if they aren't, you should have asked for clarification :).
I have asked for clarification more times than I can count about more aspects of this than I can possibly keep track of.
On the other hand, I have at several points been willing to make the "cutover", and for various different reasons, been told it wasn't happening until things were closer to "perfect" (for some value of "perfect") than they were at the time.
Let's be specific: <http://twistedmatrix.com/trac/ticket/5312> is in need of some final code-review. Despite several reviews and an apparently extensive final response pass, it's not currently in review, which means it's still in your court for some reason. There is no reason to hold back on this and try to do *everything* in one big bang: this code just needs to be production-quality and land on trunk _before_ the ReST sources themselves are ready to go.
Despite numerous attempts to prod someone into responding to my requests for clarification ;) on the ticket, I never got any response. Specifically, I could never get an answer on whether the sphinx build tool should require whomever was running it to specify a version or whether the tool should guess. The existing tools (at the time, I haven't looked at the current state of these) do/did both, in different places. And I admit, my impetus for immediacy kind of crashed when I had spent several weeks (I thought) getting everything ready to switch over the docs (in 4500) and then being told "oh we have some release stuff, we need to have a tool for that too". My impression prior to this was that sphinx-build would be used to build the sphinx docs, which turned out to be erroneous. I didn't even know that those tools (twisted.python._release) even existed prior to that point. Anyway, after a while it looked like fixing the lore sources would have to be done all over again, so I started looking into whether the conversion process itself could be improved, so that we didn't have to keep doing that. Also, please elaborate on what you mean by "do *everything* in one big bang". My intention was never to do anything but get the SphinxBuilder working on that branch. Was there something else you thought I was doing? Was there something else I should (or should not) have been doing?
Probably something needs to happen to the buildbot build steps, too, since there's this nastiness that did an end-run around our development process to get checked in to the buildbot config without tests instead of into twisted with tests, <http://buildbot.twistedmatrix.com/builders/documentation/builds/2994/steps/p...>, and that needs to be replaced with a command that's just like "build the docs, whether they be lore or sphinx or docbook or whatever". But, Tom's got your back here; if you can get this done during his fellowship (see today's post, <http://labs.twistedmatrix.com/2013/03/welcome-our-new-twisted-fellow-tom.htm...>) I estimate you will see a completed reconfiguration within hours.
I have no idea about how the buildbots are configured. But the linked buildbot log looks like part of the official release process. http://twistedmatrix.com/trac/wiki/ReleaseProcess#Buildhowtodocumentsforwebs...
Once that's done, then it's a matter of putting <http://tm.tl/4500> into code-review with the output of the lore2sphinx builder. That review can be somewhat expedited, and can be done in parallel by lots of people since there are no unit tests to be worried about, and formatting fixes can be done quickly by multiple people, we don't need a big formal code review.
The current output of the old lore2sphinx branch is functional, though has a few warts (mostly extraneous spaces in the output). These warts were apparently enough to block adoption.
Let's not under-state the problem: thanks to the jaw-droppingly weird arbitrariness of the ReST format, "extraneous spaces" can mean "arbitrarily mangled output". But no, even these "warts" were not enough to block adoption. What blocked adoption is that the painstakingly hand-tweaked lore sources that did *not* have any more "warts" were left to languish (and bit-rot, and now probably require more manual fixing) while we waited for 2 years for someone to actually finish the sphinx development and release management tools and get them finalized. As I recall we basically finished fixing them all up, at the time.
They got left alone because of the release tools hangup. Ideally the release tools would have been done before the whole lore-source-tweaking process, but they weren't. I'll admit my frustration played a part in this, but so did the deafening silence I got when I asked for anyone to comment on the ticket.
There were three reasons that I personally kept pressing for a more thorough lore -> sphinx converter. One is not necessarily necessary.
First, and most importantly, is the bit-rot problem: people are working on lore docs in parallel with this effort. And, despite this exchange, I want to be clear that they should keep doing so: nobody should stop working on docs in the meanwhile, since we have no way to tell how much longer this will take. Looking at the modified docs on the sphinx buildbot is challenging, and keeping track of random whitespace jiggling is not documented on <http://twistedmatrix.com/trac/wiki/ReviewProcess#Reviewers:Howtoreviewachang...>. *I* can't even remember how to do the math to associate one of the results in <http://buildbot.twistedmatrix.com/builds/sphinx-html/>. And now that there have been so many changes (as I predicted there might be) we have to figure out what's changed, and re-review to make sure that everything (or at least a big enough majority of everything) is OK to go to trunk. If the tool itself could be verified to produce correct output for all the cases we've encountered where it falls over, we wouldn't have to do this manual verification step; we could just trust that it was right, because it has tests that indicate it's correct. Of course it's possible there might be *some* corner-case it still doesn't handle and that we didn't find, but if the tool is known to be broken in a large number of cases that we just have to magically know to avoid, then it's likely people will keep unknowingly re-introducing those problems.
Second, there are going to be some doc patches in-progress whenever the cutover happens. Now, this is a bit less of a concern, because we can just manually translate one or two paragraphs to the new markup if necessary. But it would still be nice to have a tool that does the job well enough that someone could grab the buildbot output for an in-progress doc fix and keep working on it without having to learn how to re-express everything in Sphinx first.
This is why I think (at this point) we need to build Sphinx docs for every branch as part of the buildbot process. More below.
Third, the output is just hella grody right now. Have a look here, for example: <http://buildbot.twistedmatrix.com/builds/sphinx-html/989-37334/_sources/proj...>. *Tons* of peculiar and unnecessary vertical whitespace, and very ragged right edges where the word wrap doesn't seem to respect line lengths. This means that every change that hits these documents is going to produce a lot of unnecessary delta when authors try to clean up some of this mess to make it nicer to edit.
Yep, it's ugly. Lore2sphinx-ng does a better job, but isn't finished. More below.
Spot-checking some of the output now, it seems like the tool must have been upgraded, or we've been lucky, since I can't spot any obvious bit-rot (and I could swear the docs look a lot less grody; the problems I mentioned there). So maybe you've already addressed these problems, or they're not actually that serious any more. But, as I said in the first point, spot-checking isn't enough.
It has been a pretty discouraging effort at times, I have to say, as I seem to garner agreement/support/buy-in/whatever for a particular course of action (e.g. getting 99% of the way there, and then fixing Sphinx markup manually, which was the original plan, way back when), and focus my efforts in that direction. Then, when we're ready to proceed on that basis, another task/challenge/set of requirements/whatever gets added to the work that needs to be done. In fact I still think that if the Twisted community had actually wanted to, we could have switched over to Sphinx at the first PyCon Atlanta (2010?).
By 'actually wanted to' you mean 'be willing to abandon the development process for this one thing'.
We do not abandon the development process. Every past attempt at doing so to facilitate some feature has been a road to ruin. Although this process has been frustrating for you, I am still happier with the current outcome (Twisted has perfectly functional documentation in our downloads and on our website) than with the alternative (create a situation where we could not produce a release for two years because the tools were languishing unfinished while we waited for you to say something about it).
You keep saying that I wanted to "abandon the development process", and I'm not sure what you mean by that. My perception has been that I would say "what do we need to do to make this happen"? There would be some hemming and hawing (and at least several times long discussions about how documentation didn't really fit the regular UQDS process) and a sort of plan would be invented. I would proceed according to the plan as I understood it. I would then say "OK, we're ready"! And then be told that some other thing not in the plan needed to be done. The cycle would then repeat.
I'm sorry that this has been a frustrating process for you. And I'm not just saying that to be polite: I genuinely *am* sorry that our communication has not been clear, and that we have had wasted effort all around because of that. But I am fairly sure that we have had basically the same requirements for this process from day one. Let me state them here:
1. We need to have release-automation tools that allow developers to produce a release, including documentation. These tools need to be subjected to the same development process as the rest of those tools, which is to say the same process as for the rest of Twisted.
No this was not brought up until well into the process. I (sort of) understand the desire for this, but it seems pretty weird to be building what is essentially a wrapper for an existing tool, along with tests for said wrapper,
1. The documentation itself needs to be able to be generated from any version of trunk. While one or two formatting snafus are acceptable to be fixed after the fact, the documentation needs to be in a comprehensible state in every revision of trunk, which means that in order to land on trunk, the ReST output.
So...you didn't finish that sentence. I realize you apologized for errors at the end of your mail, but I have a feeling you were going to say something rather important there... :) I'll talk more about this below (I think...depending on what you actually mean to say here).
Really, most of the work has been done here already. The docs appear to be in a mostly-workable state. lore2sphinx looks like maybe it's doing a good enough job, maybe better than the last time I looked at it. The _major_ hang-up is getting the release management tools over their final hump and just driving the trac tickets to completion. With Tom keeping the review queue basically empty right now, this is an excellent opportunity to get that done.
It may make sense to schedule an event where we all show up on IRC, everyone claims a documentation component, and we all do a final review pass to make sure that the formatting problems aren't too bad before going to trunk with the cut-over. This pre-supposes that the release/building tools are done and on trunk though.
Anyway, I'm not giving up. If nothing else, I'll end up with a nice restructuredText-generating library. And if Twisted never ends up adopting Sphinx as a doc tool, eventually I'll still be able to read the Twisted docs in a format that I can navigate and doesn't hurt my eyes to look at. :)
But I'd really rather see Twisted adopt Sphinx, and get rid of Lore.
Help accepted.
All right! I hope this exchange has gotten some people fired up to cross the finish line. It's surprisingly close! Thanks for updating us, Kevin - better late than never :).
Experience shows that it's unlikely to be surprisingly close. I like your optimism though.
-glyph
P.S.: apologies for any errors. I didn't even really have the time to write this email, let alone copy-edit it.
Now that I've replied to all of that, let me give you a rundown of what I've been thinking and planning, so that you have an idea of where I'm coming from.
Here are the various things that I have perceived to be necessary/required in order to get the conversion to happen:
a) The conversion process needs to be able to be run concurrently with Lore for an extended period of time. In other words, Lore would be the "official" version of the docs, and the Sphinx docs would be built in some form of automated fashion until everyone was happy with them and/or ready to deprecate/abandon Lore.
b) Because of a), there needs to be tooling to run lore2sphinx (or whatever) on a regular basis. (This was sort of being done via the Sphinx-building buildbot, but in a very ad-hockery sort of way, which was brittle, broke a couple of times, and needed to be improved.)
c) There needs to be release management tooling to build the Sphinx docs from ReST into whatever formats we want to publish (HTML and PDF to start, maybe others later on).
d) Convert the Lore sources to better ReST documents without all the problems that the current lore2sphinx output has.
I at one time thought this was pretty impractical. My first attempt at a conversion tool tried to use an intermediate object model, but I ran into trouble when trying to combine the various objects. So I abandoned the effort and created what became lore2sphinx, which basically just combined a bunch of strings. I then figured out a way of making the intermediate object thing work, and that was lore2sphinx-ng. Then it became convenient to split out the intermediate object model from the document processing code, so I put all of that into a library and that became rstgen.
(For anyone who is curious, the lore2sphinx-ng repo is forked off from the lore2sphinx repo, primarily because I didn't want to break the Sphinx buildbot by making drastic changes.)
Here's what my plan was prior to this whole discussion getting started again.
1) Finish rstgen, where "finished" in this instance is defined as "is capable of generating all the vanilla docutils and sphinx-specific ReST elements that we need for converting the Twisted documentation".
2) Finish lore2sphinx-ng (which would probably have ended with merging it back into the lore2sphinx repo), where "finished" means that it would be capable of processing all the XHTML Lore tags that were defined in the Lore documentation and used in the Twisted documentation, and generating a tree of rstgen elements, which could then be rendered into ReST. (This would also serve to satisfy b) above, as the CLI in lore2sphinx-ng is less...well, let's just call it broken than lore2sphinx's was/is.)
3) Go back and finish SphinxBuilder (release tooling for building a sphinx project, which is basically a wrapper for sphinx-build, plus some vague "version feature").
4) Get someone to use something less hackish than what's currently building the Sphinx docs on the buildbot, and preferably in such a way that the results of those builds could be published somewhere and have persistent links. Currently the results of what the Sphinx buildbot does are stored for a time, and then go away, so you'll see links to build results in some trac tickets that go nowhere, which is decidedly unhelpful. My plan was that we'd set up something where the Sphinx docs would get generated and published someplace for every buildbot build so that we could always have the current results for the lore to sphinx conversion for the tip of each branch. I have no idea whether this is actually feasible or practical, but it seemed like it would be useful.
5) Proceed with Sphinx docs being built from lore sources, making tweaks as necessary to lore2sphinx(ng) for as long as it took for the generated docs to be good enough to justify switching to Sphinx entirely.
6) Switch to Sphinx entirely.
I really wasn't planning on trying to get people excited about switching to Sphinx again until 1) and 2) were at least "mostly" done (for certain values of done) and I had gone back to finish 3).
So. I guess at this point the question is whether to try and go with what's there (lore2sphinx) or finish up the "new stuff" (lore2sphinx-ng + rstgen). I think 3-6 in my above plan need to happen in any case, and I think those will be much easier with lore2sphinx-ng+rstgen.
I think I have some changes to lore2sphinx and rstgen which I haven't pushed yet. I'll try to get those out there soonish (sometime over the weekend) in case people want to look at them.
IIRC, rstgen has support for most of the vanilla docutils elements, with the notable exception of tables (and maybe definition lists...can't recall whether I finished those). It has a basic level of test coverage (of course you can never have too many tests) for rendering the elements individually, and some tests for elements in combination (particularly nested lists). Footnotes and Citations I think also need some work, which I have a plan for, but haven't implemented yet (I don't think).
The "new" lore2sphinx CLI tool needs more work, but is relatively straightforward. Like the old tool, it's basically an elementtree processor, except instead of spitting out strings that get joined together (which granted was an unholy mess), it generates rstgen elements, which all have a .render() method. After processing a Lore document, you should end up with a rstgen.Document object. You call its render() method, which calls its children's render() methods, etc. and it's turtles all the way down. The framework is there for the new CLI tool, it's mostly a matter of writing a bunch of short methods that take elementtree elements as input and return appropriate rstgen objects.
Obviously these tools aren't finished, but they produce much better output than the old version of lore2sphinx w.r.t. whitespace handling, paragraph wrapping, etc. Some of the code is still pretty messy, but nowhere near the train wreck that the current/old version of lore2sphinx is. By which I mean it _can_ be cleaned up, it just hasn't been yet. In particular there are some places in rstgen where the API is (to me at least) obviously awful, but I haven't gotten around to fixing it yet.
Please review the code. Please feel free to ask questions if you're interested.
Personally, I've gotten over being in a hurry about all this, and I think a robust tool is more likely to succeed in the long run, though finishing it may make the run a bit longer. So I'm for finishing lore2sphinx-ng+rstgen. What are others' opinions? Make the "old" tool work? Or make the "new" tool work?
Damn. Talk about long emails.
-- Kevin Horn
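For anyone trying to picture the design described in the message above (an elementtree processor whose short handler methods return element objects, each with a render() method), here is a rough sketch. It is only an illustration: the class names, the HANDLERS table, and the convert() function are hypothetical and are not taken from rstgen or lore2sphinx-ng, whose real APIs may look quite different.

```python
import textwrap
import xml.etree.ElementTree as ET


def _squash(text):
    """Collapse runs of whitespace so source indentation does not leak out."""
    return " ".join(text.split())


class Heading:
    """Hypothetical element: a title underlined with one repeated character."""

    def __init__(self, text, underline="="):
        self.text = text
        self.underline = underline

    def render(self):
        title = _squash(self.text)
        return title + "\n" + self.underline * len(title)


class Paragraph:
    """Hypothetical element: prose re-wrapped to a fixed width."""

    def __init__(self, text, width=79):
        self.text = text
        self.width = width

    def render(self):
        return textwrap.fill(_squash(self.text), self.width)


class BulletList:
    """Hypothetical element: a flat (non-nested) bullet list."""

    def __init__(self, items):
        self.items = items

    def render(self):
        return "\n".join("- " + _squash(item) for item in self.items)


class Document:
    """Root element: renders its children, separated by blank lines."""

    def __init__(self, children):
        self.children = children

    def render(self):
        return "\n\n".join(child.render() for child in self.children) + "\n"


# One short handler per tag: ElementTree element in, renderable element out.
HANDLERS = {
    "h1": lambda node: Heading("".join(node.itertext())),
    "p": lambda node: Paragraph("".join(node.itertext())),
    "ul": lambda node: BulletList(
        ["".join(li.itertext()) for li in node.findall("li")]
    ),
}


def convert(xhtml):
    """Build a Document from the top-level children of an XHTML fragment."""
    body = ET.fromstring(xhtml)
    return Document([HANDLERS[n.tag](n) for n in body if n.tag in HANDLERS])
```

Because each element normalizes its own whitespace at render time, stray spaces and ragged wrapping in the source markup do not reach the output, which is the property the string-joining approach lacked.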
On Fri, Mar 1, 2013 at 11:35 PM, Kevin Horn <kevin.horn@gmail.com> wrote:
I think I have some changes to lore2sphinx and rstgen which I haven't pushed yet. I'll try to get those out there soonish (sometime over the weekend) in case people want to look at them.
FYI. This turned out not to be the case. What I have is already in the repo(s) on bitbucket. For those who may have lost track in the voluminous emails, they are here: https://bitbucket.org/khorn/lore2sphinx-ng https://bitbucket.org/khorn/rstgen -- Kevin Horn
On the other hand, I have at several points been willing to make the "cutover", and for various different reasons, been told it wasn't happening until things were closer to "perfect" (for some value of "perfect") than they were at the time.
Well, the way the cut-over will eventually happen is that a ticket+branch is given a positive review. So saying that "[you] have [...] been willing to make the cutover" without having put an associated ticket into review sounds somewhat like abandoning the development process.
Despite numerous attempts to prod someone into responding to my requests for clarification ;) on the ticket, I never got any response.
Side note: The best way to get a response to a ticket is probably to put it into review.
Specifically, I could never get an answer on whether the sphinx build tool should require whomever was running it to specify a version or whether the tool should guess. The existing tools (at the time, I haven't looked at the current state of these) do/did both, in different places.
Having a look at the current release automation tools, it looks like the only one that takes a version is `change-versions`, and the rest of the tools use the version from the tree.
*I* can't even remember how to do the math to associate one of the results in <http://buildbot.twistedmatrix.com/builds/sphinx-html/>.
I've updated the buildbot to create a link from the build to the generated documentation.
1. We need to have release-automation tools that allow developers to produce a release, including documentation. These tools need to be subjected to the same development process as the rest of those tools, which is to say the same process as for the rest of Twisted.
I (sort of) understand the desire for this, but it seems pretty weird to be building what is essentially a wrapper for an existing tool, along with tests for said wrapper,
If the command is now, and always will be, just 'sphinx-build .' then we might be able to get away without doing this, but since sphinx isn't under our control, we can't ensure that. Thus, we need somewhere to record how to run sphinx-build. If we have a wrapper, then we have an obvious place to record that information. This also gives us an easy place to add things like using a different template when building docs on the buildbot, as opposed to the release documentation, for example.
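To make the "one place to record how to run sphinx-build" argument concrete, here is a minimal sketch of such a wrapper. It is not the SphinxBuilder from ticket 5312; the function names and the optional version parameter are assumptions made for illustration. The only Sphinx features it relies on are the `-b` (builder) and `-D` (override a conf.py setting) command-line options and the `version` and `release` configuration values.

```python
import subprocess


def sphinx_build_command(source_dir, output_dir, version=None, builder="html"):
    """Return the argument list for a single sphinx-build run.

    Hypothetical helper: keeping the command in one function means there
    is one tested place to change if the invocation ever has to change.
    """
    command = ["sphinx-build", "-b", builder]
    if version is not None:
        # -D overrides a conf.py value for this run only.
        command += ["-D", "version=" + version, "-D", "release=" + version]
    command += [source_dir, output_dir]
    return command


def build_docs(source_dir, output_dir, version=None, builder="html"):
    """Run sphinx-build and raise CalledProcessError if it fails."""
    subprocess.check_call(
        sphinx_build_command(source_dir, output_dir, version, builder)
    )
```

Splitting command construction from execution keeps the first half testable without Sphinx installed, and leaves an obvious spot for buildbot-only options such as an alternate template.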
4) Get someone to use something less hackish than what's currently building the Sphinx docs on the buildbot,
I can help with this.
and preferably in such a way that the results of those builds could be published somewhere and have persistent links.
I'm not sure it makes sense to keep all the old builds around. It only takes ~2 minutes to regenerate them, as needed.
My plan was that we'd set up something where the Sphinx docs would get generated and published someplace for every buildbot build
We currently do this for every trunk revision, and it is possible to do by hand for any branch version. It is straightforward to add this to the list of builds that get done by running force-build.py.
So. I guess at this point the question is whether to try and go with what's there (lore2sphinx) or finish up the "new stuff" (lore2sphinx-ng + rstgen). I think 3-6 in my above plan need to happen in any case, and I think those will be much easier with lore2sphinx-ng+rstgen.
From what has been said, I'd be inclined to take the long approach. There isn't any rush, and it sounds like the final results will be better, if we wait for lore2sphinx-ng + rstgen.
Tom
On Mar 2, 2013, at 12:18 AM, Tom Prince <tom.prince@ualberta.net> wrote:
I've updated the buildbot to create a link from the build to the generated documentation.
Oh my goodness, Tom. You are like a god of buildbot. It did not even occur to me to ask for this, as I assumed it would be too complex. Thanks so much! -glyph
On Mar 1, 2013, at 9:35 PM, Kevin Horn <kevin.horn@gmail.com> wrote:
That "never-ending" series of Lore source fixes took place over the course of a couple of weeks. Doing things that way was not my idea, though it seemed reasonable at the time because the idea was that we would do the cutover at the end of it.
Well, let's go to the video tape. Based on this comment - <http://twistedmatrix.com/trac/ticket/4500#comment:12> - these tickets were closed over a period ranging from 2010/07 to 2011/03. 6 months isn't quite "weeks", but okay I guess it wasn't "never-ending" either :).
I never "wandered off". Been here the whole time. I've been in #twisted almost continually for about the last 3 years, and in #twisted-dev for about a year (I didn't realize it existed before that). I just got tired of (my perception) talking to myself about doing the conversion. So I was being quiet. Granted, I shouldn't have been, and that's on me. But it's not like I'm hard to get a hold of.
Fair enough. I had the inaccurate impression that you "weren't around" but you were just being quiet. You never actually failed to respond, so that's not a fair impression.
OK. Let's move things along then.
Yes, let's.
Right on.
The last day or two have probably not been the best to try and get my attention, especially yesterday, as I essentially worked a 14 hr day trying to meet a deadline.
24 hours is a perfectly reasonable response latency, don't worry about it :).
But I see the conversation on IRC. I'll note that no one seems to have considered asking me anything about it. Looks like it was about 4am, though, so perhaps that wouldn't have done much good, as I was asleep. :)
But hey...I have email! Ask me! I'll talk your ear off about it!
This email was written after said conversation as an explicit attempt to ask about just that, so, there you go :).
(As an aside, lore2sphinx is in no way a "broken pile of regexes". Not to say that it isn't broken in some really significant ways, because it is, but it doesn't use regexes at all. Just sayin'.)
Actually yeah, "regex" is just a curse-word here :). It's the emitter I'm complaining about, anyway, not the parser, so deriding it as a "regex" is in no way accurate.
I got tired of complaining. And arguing.
And since we didn't break the toolchain, I've been in no particular hurry. I've accepted that this will take approximately a billion years. So no rush.
It does not have to take a billion years. The criteria ought to be clear - and if they aren't, you should have asked for clarification :).
I have asked for clarification more times than I can count about more aspects of this than I can possibly keep track of.
Where? On <http://twistedmatrix.com/trac/ticket/5312> I see exactly one un-answered question in your review response, "Is re-raising the exception enough here? Or should I do something entirely different?" Except it never actually got back into review, so it never got bubbled back up to get attention for an official response.
Let's be specific: <http://twistedmatrix.com/trac/ticket/5312> is in need of some final code-review. Despite several reviews and an apparently extensive final response pass, it's not currently in review, which means it's still in your court for some reason. There is no reason to hold back on this and try to do *everything* in one big bang: this code just needs to be production-quality and land on trunk _before_ the ReST sources themselves are ready to go.
Despite numerous attempts to prod someone into responding to my requests for clarification ;) on the ticket, I never got any response.
Like I said, I see one un-answered question. On a ticket which is not in review: according to the development process, that means you still think you have some stuff to do on it, and it's not ready for anyone to take a look at it yet. If you want a response, put it into review and someone will look at it as soon as time allows. Or post here. A comment on a ticket doesn't necessarily show up in anyone's in-box and won't necessarily get a response that isn't a code-review.
Specifically, I could never get an answer on whether the sphinx build tool should require whomever was running it to specify a version or whether the tool should guess. The existing tools (at the time, I haven't looked at the current state of these) do/did both, in different places.
The word "version" does not appear on 4500 at all, and on 5312 the only comment you make related to versions is you saying "Not sure which direction to go here. Deferring to sometime not the middle of the night". It's exarkun asking the question about the versions though, not you :). Sorry to be overly pedantic here: I'm not trying to assign blame, since that is fairly pointless now. I'm just meaning to say that, based on what I see here, I am wondering what we could have improved. I know we chatted on the mailing list, and in person, as well as on the tickets, so not all of this is necessarily public or even written down, but it really seems like you developed an impression of having to repeatedly ask questions and argue about things far more than you actually did :).
And I admit, my impetus for immediacy kind of crashed when I had spent several weeks (I thought) getting everything ready to switch over the docs (in 4500) and then being told "oh we have some release stuff, we need to have a tool for that too". My impression prior to this was that sphinx-build would be used to build the sphinx docs, which turned out to be erroneous. I didn't even know that those tools (twisted.python._release) even existed prior to that point.
The release stuff was new-ish at the time, and is obviously not super publicly documented (it's for "internal" use only on Twisted itself right now). So it's understandable that it didn't get communicated well, but it hardly seems like a reason to tank the whole process.
Anyway, after a while it looked like fixing the lore sources would have to be done all over again, so I started looking into whether the conversion process itself could be improved, so that we didn't have to keep doing that.
That part of the conversation, at least, jibes with my understanding :).
Also, please elaborate on what you mean by "do *everything* in one big bang". My intention was never to do anything but get the SphinxBuilder working on that branch. Was there something else you thought I was doing? Was there something else I should (or should not) have been doing?
My reasoning goes like this: the ticket for the release tools is still not in review, so you must be waiting for something to re-submit it. It looks like you responded to the code, so the only thing I could think you were still waiting for would be for the lore sources themselves to be ready.
I have no idea about how the buildbots are configured. But the linked buildbot log looks like part of the official release process. http://twistedmatrix.com/trac/wiki/ReleaseProcess#Buildhowtodocumentsforwebs...
Yeah. Ugh. I hate that part of that wiki page. But that part can be Tom's problem, since he's responsible for the buildbot :).
[the fixed-up Lore sources] got left alone because of the release tools hangup. Ideally the release tools would have been done before the whole lore-source-tweaking process, but they weren't. I'll admit my frustration played a part in this, but so did the deafening silence I got when I asked for anyone to comment on the ticket.
Where and how did you ask people to comment on the ticket? I don't recall being asked, and I tend to be pretty good about leaving prompts like that in my inbox until I've done what was asked. (Not *perfect*, of course, and if you asked a list then there might have been some bystander effect.) It seems like we might have avoided this whole mess if you had just attached the 'review' keyword :).
You keep saying that I wanted to "abandon the development process", and I'm not sure what you mean by that.
As I recall, we discussed this process in person at PyCon and you were quite keen to just check the documentation in in a broken state, and fix it all up in one gigantic branch while nobody did any Lore work. To be fair, when I described the problems this would create, you did agree that we shouldn't do it that way.
My perception has been that I would say "what do we need to do to make this happen"? There would be some hemming and hawing (and at least several times long discussions about how documentation didn't really fit the regular UQDS process) and a sort of plan would be invented. I would proceed according to the plan as I understood it. I would then say "OK, we're ready"! And then be told that some other thing not in the plan needed to be done. The cycle would then repeat.
The only "cycle" I can either see on the tickets or recall here is where the release tools didn't come in to the initial plan.
No [the need for release automation] was not brought up until well into the process. I (sort of) understand the desire for this, but it seems pretty weird to be building what is essentially a wrapper for an existing tool, along with tests for said wrapper,
OK. I can believe that this did not happen. One problem is that we (the inner-circle old-school Twisted developers) tend to engage in conversations about how a thing might be done while at the same time we discuss what must be done. And we also tend to discuss what policy is (or what all or some of us believe it ought to be in some case, further confusing the issue) without making explicit what the purpose of that requirement is.
I would ask the community to help us with this by doing a couple of things.
If somebody says "X is policy", always ask for a link to it. If there is a link, it'll help you understand it better. If there isn't a link, then the authority telling you it's "policy" might just be remembering that it's the way we've done things since forever and of course it's a good idea. There are definitely things that I have thought were in the coding standard that are not actually written down anywhere, on more than one occasion.
If a meandering discussion is happening - here, on the mailing list, on the ticket - never be afraid to break it up and separate out the different concerns which are being discussed: what is necessary for compliance with our development process, what would be a good idea from a design point of view, how the work might be broken up to get through review more manageably, what other concerns are in play.
Especially, if you ever see a code review where a reviewer says "I think..." without making it clear what you should do, you should always ask, 'is this a requirement of the review or just some thoughts you have'.
There's also the problem of "I think you should..." being interpreted as "You must...". It is very hard to consistently separate design feedback from code review, although we try very hard; but, it's hard to separate it out when reading it as well. So one important point to keep in mind is that, as the author of a proposed change, outside the things that are agreed upon policy consensus, you always have some degree of discretion to disagree with a reviewer. And you should freely do so when submitting anything for re-review. It's best to just do this as quickly as possible, so that it gets back to the reviewer without a whole lot of delay, and they can respond with either "I still disagree, but you're doing the work, so OK go ahead" or "No, you really have to do this, it's required by policy document X, here's a link" ;-).
The documentation itself needs to be able to be generated from any version of trunk. While one or two formatting snafus are acceptable to be fixed after the fact, the documentation needs to be in a comprehensible state in every revision of trunk, which means that in order to land on trunk, the ReST output.
So...you didn't finish that sentence. I realize you apologized for errors at the end of your mail, but I have a feeling you were going to say something rather important there...
Well yes, that was the point of the apology. That was a rather important thing. What I was probably going to say was just:
The ReST output needs to be in good enough shape to be generally readable, with a manageable number of errors. But, we need to be able to *verify* that it has not too many errors.
And I'd already discussed that somewhat above.
Experience shows that it's unlikely to be surprisingly close. I like your optimism though.
Experience just teaches us that it's not done yet. And experience has taught us that about every change, and it was right up until the exact moment when it wasn't right any more ;-).
Now that I've replied to all of that, let me give you a rundown of what I've been thinking and planning, so that you have an idea of where I'm coming from.
Here are the various things that I have perceived to be necessary/required in order to get the conversion to happen:
a) The conversion process needs to be able to be run concurrently with Lore for an extended period of time. In other words, Lore would be the "official" version of the docs, and the Sphinx docs would be built in some form of automated fashion until everyone was happy with them and/or ready to deprecate/abandon Lore.
Your understanding of this requirement is slightly off, I think, although possibly the consequences are the same. As per the difficulties I laid out above, about separating the requirements from the strategies for satisfying said requirements.
The thing that we weren't going to tolerate was any message saying that people should hold off on writing documentation, even for "a little while" while we fixed up the lore conversion, because without a contractual obligation for someone to finish this work, there's really no telling how long "a little while" would be :).
Since the whole point of this sphinx conversion is to appeal to documentation authors who prefer the ReST format as input (it's definitely not to make the docs look nicer, writing a new stylesheet for Lore would have taken 1/100th of the effort and nobody has expressed interest in doing that), creating a period where things were even *less* appealing to documentation authors would defeat the purpose.
Another possible solution to this problem would be to modify Lore so it could process ReST sources, so that we could convert the documentation within the repository piecemeal, and start writing any new docs in ReST, but still have a coherent whole of documentation produced, eventually switching the documentation processor from Lore to Sphinx.
Yet another possible solution would be to modify Sphinx, adding a plugin to process the Lore sources.
As an aside: this is the part of the process which has been so frustrating to me, personally. The two alternate solutions I proposed here (and have proposed before) seem far saner and more manageable in terms of effort, to me. But, everyone I have spoken to about docutils and ReST has told me in no uncertain terms that they are both a pile of heinous hacks that resist any attempt at sensible software-engineering solutions to problems, so we need to resort to hackish system-integration stuff like what we've done. This worries me.
I know that Sphinx's output is well-loved by the Python community, but if it's so hard to call into that we can't reasonably modify it to get an XML DOM that looks like Lore source to Lore, and it's so hard to plug in to it that we can't give it a data structure that it likes from Lore's XML DOM, then how the heck is it being maintained? And if it actually *isn't* that bad, then why haven't I managed to find someone that knows its code well enough to do one or the other of these things? I have no direct knowledge of any of this stuff, because my main interest here is improving the experience of working on Twisted, both for you, Kevin, and for the people who will arguably be helped by the use of Sphinx. Maybe I'm completely wrong and Sphinx is beautifully architected and we could have done this from day 1. But I faintly hope that some Docutils and Sphinx contributor hears that I said "sphinx is garbage" and makes a fool of me by contributing either a lore modification or a sphinx plugin which solves this whole problem so we can do the format or tool migration incrementally :).
b) Because of a), there needs to be tooling to run lore2sphinx (or whatever) on a regular basis. (This was sort of being done via the Sphinx-building buildbot, but in a very ad-hockery sort of way, which was brittle, broke a couple of times, and needed to be improved.)
Hmm. I wasn't aware of that. But it seems like it's running like a charm now.
c) There needs to be release management tooling to build the Sphinx docs from ReST into whatever formats we want to publish (HTML and PDF to start, maybe others later on)
Yup. (ePub? PDF is so last-century... :))
d) Convert the Lore sources to better ReST documents without all the problems that the current lore2sphinx output has.
So, this wasn't *necessary*. If we had gotten through the release automation stuff - and I still don't understand why that's stuck - we could have merged it.
I at one time thought this was pretty impractical. My first attempt at a conversion tool tried to use an intermediate object model, but I ran into trouble when trying to combine the various objects. So I abandoned the effort and created what became lore2sphinx, which basically just combined a bunch of strings. I then figured out a way of making the intermediate object thing work, and that was lore2sphinx-ng. Then it became convenient to split out the intermediate object model from the document processing code, so I put all of that into a library and that became rstgen.
It seems the saving grace here is that rstgen might be a generally useful tool in its own right, with more of a long-term future than lore2sphinx would have had.
(For anyone who is curious, the lore2sphinx-ng repo is forked off from the lore2sphinx repo, primarily because I didn't want to break the Sphinx buildbot by making drastic changes.)
Have a link?
Here's what my plan was prior to this whole discussion getting started again.
1) Finish rstgen, where "finished" in this instance is defined as "is capable of generating all the vanilla docutils and sphinx-specific ReST elements that we need for converting the Twisted documentation".
Sounds like a worthy goal, although I don't think this is necessarily required. Have you been working on it for the last 2 years? Do you have any idea when it might be done? It might be worthwhile to write a *smaller* .
2) Finish lore2sphinx-ng (which would probably have ended with merging it back into the lore2sphinx repo), where "finished" means that it would be capable of processing all the XHTML Lore tags that were defined in the Lore documentation and used in the Twisted documentation, and generating a tree of rstgen elements, which could then be rendered into ReST.
Cool. While this would be handy, especially for people working on documentation branches, it's not necessarily necessary.
(this would also serve to satisfy b) above, as the CLI in lore2sphinx-ng is less...well, let's just call it broken than lore2sphinx's was/is.)
OK.
3) Go back and finish SphinxBuilder (release tooling for building a sphinx project, which is basically a wrapper for sphinx-build, plus some vague "version feature").
This is really the crux; this is the thing you should work on first, I think, even if you're going to keep working on lore2sphinx-ng. Basically the only reason that I was keen to get the lore to sphinx conversion improved in the first place was that creating this tool seemed to be dragging on for quite a while after the "chunk tickets" were done. But now, this tool is almost done, and we could re-do the lore-source review if you wanted to do that. The current lore2sphinx might well be good enough to just go with, especially if the next-generation version is going to take another six months to finish.
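For readers who have not seen what "a wrapper for sphinx-build" amounts to, here is a minimal sketch. It is not the SphinxBuilder code from the ticket; the function names and defaults are hypothetical. It only assumes the standard sphinx-build command line, where `-b` selects the builder and `-D` overrides a conf.py value. It also shows one way to handle the open "version" question: the caller may pass a version explicitly, and if none is given the sketch simply leaves whatever conf.py says in place rather than guessing.

```python
import subprocess


def sphinx_build_command(source_dir, output_dir, builder="html", version=None):
    """Return the argument list for one sphinx-build run (hypothetical helper)."""
    command = ["sphinx-build", "-b", builder]
    if version is not None:
        # -D overrides a conf.py setting from the command line, so the
        # release tooling can stamp the docs without editing conf.py.
        command += ["-D", "version=" + version, "-D", "release=" + version]
    command += [source_dir, output_dir]
    return command


def build_docs(source_dir, output_dir, builder="html", version=None):
    """Run sphinx-build; raises subprocess.CalledProcessError on failure."""
    subprocess.check_call(
        sphinx_build_command(source_dir, output_dir, builder, version))
```

Keeping the command construction separate from the subprocess call means the wrapper can be unit-tested without Sphinx installed, which is most of what tests for such a thin wrapper can usefully cover.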
4) Get someone to use something less hackish than what's currently building the Sphinx docs on the buildbot, and preferably in such a way that the results of those builds could be published somewhere and have persistent links. Currently the results of what the Sphinx buildbot does are stored for a time, and then go away, so you'll see links to build results in some trac tickets that go nowhere, which is decidedly unhelpful. My plan was that we'd set up something where the Sphinx docs would get generated and published someplace for every buildbot build so that we could always have the current results for the lore to sphinx conversion for the tip of each branch. I have no idea whether this is actually feasible or practical, but it seemed like it would be useful.
OK, *this* sounds like really unnecessary turd-polishing ;-). This builder is an interim step; the more interesting step is the builder that just builds the sphinx docs, in the same way that the current builder builds the lore docs. Furthermore, it seems to be working fine. Build results links that go nowhere are a known problem with buildbot, since it does eventually lose most history, and this type of history takes up a fair bit of disk space.
5) Proceed with Sphinx docs being built from lore sources, making tweaks as necessary to lore2sphinx(ng) for as long as it took for the generated docs to be good enough to justify switching to Sphinx entirely.
6) Switch to Sphinx entirely.
I really wasn't planning on trying to get people excited about switching to Sphinx again until 1) and 2) were at least "mostly" done (for certain values of done) and I had gone back to finish 3).
So. I guess at this point the question is whether to try and go with what's there (lore2sphinx) or finish up the "new stuff" (lore2sphinx-ng + rstgen). I think 3-6 in my above plan need to happen in any case, and I think those will be much easier with lore2sphinx-ng+rstgen.
This decision is really determined by time estimates. In any case, work out the sphinx release automation tool first, since we need that regardless of how we switch over.
I think I have some changes to lore2sphinx and rstgen which I haven't pushed yet. I'll try to get those out there soonish (sometime over the weekend) in case people want to look at them.
You might want to send a considerably shorter message just enticing other list members to have a look and maybe help out with that stuff :).
IIRC, rstgen has support for most of the vanilla docutils elements, with the notable exception of tables (and maybe definition lists...can't recall whether I finished those). It has a basic level of test coverage (of course you can never have too many tests) for rendering the elements individually, and some tests for elements in combination (particularly nested lists). Footnotes and Citations I think also need some work, which I have a plan for, but haven't implemented yet (I don't think).
The "new" lore2sphinx CLI tool needs more work, but is relatively straightforward. Like the old tool, it's basically an elementtree processor, except instead of spitting out strings that get joined together (which granted was an unholy mess), it generates rstgen elements, which all have a .render() method. After processing a Lore document, you should end up with a rstgen.Document object. You call its render() method, which calls its children's render() methods, etc. and it's turtles all the way down.
The framework is there for the new CLI tool, it's mostly a matter of writing a bunch of short methods that take elementtree elements as input and return appropriate rstgen objects.
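To make that description concrete, here is a minimal sketch of the general shape: short handler functions that take elementtree elements and return objects with a render() method, collected into a document that renders itself recursively. This is not rstgen's actual API. Every class and function name below is invented for illustration, only three tags are handled, and the input is assumed to be a namespace-free XHTML fragment.

```python
import xml.etree.ElementTree as ET


def _squeeze(text):
    """Collapse runs of whitespace, as a paragraph re-wrapper would."""
    return " ".join(text.split())


class Text:
    def __init__(self, text):
        self.text = text

    def render(self):
        return self.text


class Paragraph:
    def __init__(self, children):
        self.children = children

    def render(self):
        # Each child renders itself; the parent only joins the results.
        return _squeeze("".join(child.render() for child in self.children))


class Emphasis(Paragraph):
    def render(self):
        return "*" + super().render() + "*"


class Title:
    def __init__(self, text):
        self.text = _squeeze(text)

    def render(self):
        return self.text + "\n" + "=" * len(self.text)


class Document:
    def __init__(self, children):
        self.children = children

    def render(self):
        return "\n\n".join(child.render() for child in self.children) + "\n"


def inline_children(element):
    """Convert mixed content: leading text, child elements, and their tails."""
    children = []
    if element.text:
        children.append(Text(element.text))
    for child in element:
        children.append(convert(child))
        if child.tail:
            children.append(Text(child.tail))
    return children


# One short handler per tag; supporting a new tag means adding an entry here.
HANDLERS = {
    "h1": lambda element: Title("".join(element.itertext())),
    "p": lambda element: Paragraph(inline_children(element)),
    "em": lambda element: Emphasis(inline_children(element)),
}


def convert(element):
    handler = HANDLERS.get(element.tag)
    if handler is None:
        # Unknown tags degrade to their plain text in this sketch.
        return Text("".join(element.itertext()))
    return handler(element)


def convert_document(xhtml):
    body = ET.fromstring(xhtml)
    return Document([convert(child) for child in body])
```

Calling `convert_document(source).render()` on a fragment such as `<body><h1>Using Deferreds</h1><p>A <em>Deferred</em> is a promise.</p></body>` produces an underlined title followed by a paragraph with the emphasis marked up. Because whitespace is normalized in one place at render time, rather than while strings are being concatenated, the output stays consistent however the source was indented.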
Obviously these tools aren't finished, but they produce much better output than the old version of lore2sphinx w.r.t. whitespace handling, paragraph wrapping, etc.
Aesthetically, this appeals to me a lot more than going with the messiness of lore2sphinx. But it is _not_ a requirement.
Some of the code is still pretty messy, but nowhere near the train wreck that the current/old version of lore2sphinx is. By which I mean it _can_ be cleaned up, it just hasn't been yet. In particular there are some places in rstgen where the API is (to me at least) obviously awful, but I haven't gotten around to fixing it yet.
Please review the code. Please feel free to ask questions if you're interested.
Personally, I've gotten over being in a hurry about all this, and I think a robust tool is more likely to succeed in the long run, though finishing it may make the run a bit longer. So I'm for finishing lore2sphinx-ng+rstgen.
I think a little false urgency might not hurt here :-). I'm not going to work on the tool - just writing these emails probably blew my Twisted development budget for the next two months ;-) - but I will do my best to quickly clear up any procedural what-needs-to-be-done questions unambiguously. Please ping if anything gets you stuck.
What are others' opinions? Make the "old" tool work? Or make the "new" tool work?
Damn. Talk about long emails.
-glyph
Sorry it's taken me so long to get back to this. But it's gotten to be a Looong email.
On Sat, Mar 2, 2013 at 3:14 AM, Glyph <glyph@twistedmatrix.com> wrote:
On Mar 1, 2013, at 9:35 PM, Kevin Horn <kevin.horn@gmail.com> wrote:
That "never-ending" series of Lore source fixes took place over the course of a couple of weeks. Doing things that way was not my idea, though it seemed reasonable at the time because the idea was that we would do the cutover at the end of it.
Well, let's go to the video tape. Based on this comment - <http://twistedmatrix.com/trac/ticket/4500#comment:12> - these tickets were closed over a period ranging from 2010/07 to 2011/03. 6 months isn't quite "weeks", but okay I guess it wasn't "never-ending" either :).
Hmmm. I recall it as being much shorter. Probably most of the work took place in two "spurts" around the beginning and end of that time, and that's why I remember it that way. But I'm not interested in digging through a bunch of old dates to find out for sure.
(As an aside, lore2sphinx is in no way a "broken pile of regexes". Not to say that it isn't broken in some really significant ways, because it is, but it doesn't use regexes at all. Just sayin'.)
Actually yeah, "regex" is just a curse-word here :). It's the emitter I'm complaining about, anyway, not the parser, so deriding it as a "regex" is in no way accurate.
I figured that was the case, I just wanted to say something so others reading this didn't get the wrong impression about how lore2sphinx is implemented. I mean it's not code I'm very proud of, but it's not _that_ bad :)
<<< snip a bunch of stuff about who said what when, why I thought what I thought, etc. >>>
It boils down to the fact that a bunch of the conversations happened either in person or on IRC. This was mostly because I was in a hurry at the time, usually because I wanted to do something before additions were made to the documentation, which was in a somewhat "known" state (as in I knew how it was going to behave when run through lore2sphinx) at the time.
Also, please elaborate on what you mean by "do *everything* in one big bang". My intention was never to do anything but get the SphinxBuilder working on that branch. Was there something else you thought I was doing? Was there something else I should (or should not) have been doing?
My reasoning goes like this: the ticket for the release tools is still not in review, so you must be waiting for something to re-submit it. It looks like you responded to the code, so the only thing I could think you were still waiting for would be for the lore sources themselves to be ready.
It's been long enough that I can't fully recall my reasoning on this. But _probably_ I decided that if I finished the release tools ticket, someone might use it. Which would be great, except that I think I had decided that before that actually happened I needed to figure out a way to emit nicer output from lore2sphinx. So I left it alone until I had figured out how to do that. At least, that _might_ have been part of my thought process. It really was ages ago.
[the fixed-up Lore sources] got left alone because of the release tools hangup. Ideally the release tools would have been done before the whole lore-source-tweaking process, but they weren't. I'll admit my frustration played a part in this, but so did the deafening silence I got when I asked for anyone to comment on the ticket.
Where and how did you ask people to comment on the ticket? I don't recall being asked, and I tend to be pretty good about leaving prompts like that in my inbox until I've done what was asked. (Not *perfect*, of course, and if you asked a list then there might have been some bystander effect.) It seems like we might have avoided this whole mess if you had just attached the 'review' keyword :).
On IRC.
My perception has been that I would say "what do we need to do to make this happen"? There would be some hemming and hawing (and at least several times long discussions about how documentation didn't really fit the regular UQDS process) and a sort of plan would be invented. I would proceed according to the plan as I understood it. I would then say "OK, we're ready"! And then be told that some other thing not in the plan needed to be done. The cycle would then repeat.
The only "cycle" I can either see on the tickets or recall here is where the release tools didn't come in to the initial plan.
This was the latest of several (3 or 4) according to my recollection/perception. It doesn't really matter now.
No [the need for release automation] was not brought up until well into the process. I (sort of) understand the desire for this, but it seems pretty weird to be building what is essentially a wrapper for an existing tool, along with tests for said wrapper,
OK. I can believe that this did not happen. One problem is that we (the inner-circle old-school Twisted developers) tend to engage in conversations about how a thing might be done while at the same time we discuss what must be done. And we also tend to discuss what policy is (or what all or some of us believe it *ought to be* in some case, further confusing the issue) without making explicit what the *purpose* of that requirement is.
I would ask the community to help us with this by doing a couple of things.
If somebody says "X is policy", always ask for a link to it. If there is a link, it'll help you understand it better. If there *isn't* a link, then the authority telling you it's "policy" might just be remembering that it's the way we've done things since forever and of course it's a good idea. There are definitely things that I have thought were in the coding standard that are not actually written down anywhere, on more than one occasion.
If a meandering discussion is happening - here, on the mailing list, on the ticket - never be afraid to break it up and separate out the different concerns which are being discussed: what is necessary for compliance with our development process, what would be a good idea from a design point of view, how the work might be broken up to get through review more manageably, what other concerns are in play.
Especially, if you ever see a code review where a reviewer says "I think..." without making it clear what you should *do*, you should always ask, 'is this a requirement of the review or just some thoughts you have'.
And when we ask, we should ask on the ticket, and put it back into review, yes? Because I think this was the part (or at least _A_ part) I was really missing here.
There's also the problem of "I think you should..." being interpreted as "You must...". It is *very* hard to consistently separate design feedback from code review, although we try very hard; but, it's hard to separate it out when reading it as well. So one important point to keep in mind is that, as the author of a proposed change, outside the things that are agreed upon policy consensus, you always have some degree of discretion to disagree with a reviewer. And you should freely do so when submitting anything for re-review. It's best to just do this as quickly as possible, so that it gets back to the reviewer without a whole lot of delay, and they can respond with either "I still disagree, but you're doing the work, so OK go ahead" or "No, you really have to do this, it's required by policy document X, here's a link" ;-).
1. The documentation itself needs to be able to be generated from any version of trunk. While one or two formatting snafus are acceptable to be fixed after the fact, the documentation needs to be in a comprehensible state in every revision of trunk, which means that in order to land on trunk, the ReST output.
So...you didn't finish that sentence. I realize you apologized for errors at the end of your mail, but I have a feeling you were going to say something rather important there...
Well yes, that was the point of the apology. That was a rather important thing. What I was probably going to say was just:
The ReST output needs to be in good enough shape to be generally readable, with a manageable number of errors. But, we need to be able to *verify* that it has not too many errors.
And I'd already discussed that somewhat above.
Now that I've replied to all of that, let me give you a rundown of what I've been thinking and planning, so that you have an idea of where I'm coming from.
Here are the various things that I have perceived to be necessary/required in order to get the conversion to happen:
a) The conversion process needs to be able to be run concurrently with Lore for an extended period of time. In other words, Lore would be the "official" version of the docs, and the Sphinx docs would be built in some form of automated fashion until everyone was happy with them and/or ready to deprecate/abandon Lore.
Your understanding of this requirement is slightly off, I think, although possibly the consequences are the same. As per the difficulties I laid out above, about separating the requirements from the strategies for satisfying said requirements.
I've been told that almost verbatim, several times. This is basically what led to the Sphinx buildbot happening. Perhaps I wasn't clear about what I meant.
The thing that we weren't going to tolerate was any message saying that people should hold off on writing documentation, even for "a little while" while we fixed up the lore conversion, because without a contractual obligation for someone to finish this work, there's really no telling how long "a little while" would be :).
Well, when I originally was pushing it, my plan was for that little while to be "today" (this was at PyCon during the only day of sprints I was able to attend), and if it didn't get done, we'd abandon that particular attempt. You and exarkun managed to convince me that even this was probably not a very good idea though.
Since the whole point of this sphinx conversion is to appeal to documentation authors who prefer the ReST format as input (it's definitely not to make the docs look nicer, writing a new stylesheet for Lore would have taken 1/100th of the effort and nobody has expressed interest in doing that), creating a period where things were even *less* appealing to documentation authors would defeat the purpose.
I actually considered the stylesheet thing, but it was really only a passing thought. My personal motivation started with not being able to find things in the documentation. So I started looking at the various Lore tickets to see whether there was something to clean up that would help. And a bunch of them seemed to be asking for things that Sphinx already did. Sphinx was starting to become a common tool, and I had used it on several other projects, and found it pleasant to work with. Also, when I asked about Lore on IRC, I got a lot of "I'm not sure anyone knows how that works these days" and "oh man, I wish we didn't have to support that any more", etc. So I started looking into how to convert the docs over to use Sphinx.
Another possible solution to this problem would be to modify Lore so it could process ReST sources, so that we could convert the documentation within the repository piecemeal, and start writing any new docs in ReST, but still have a coherent whole of documentation produced, eventually switching the documentation processor from Lore to Sphinx.
This would require someone smarter than me. Or at least more versed in formal parsing theory/techniques. Or something. And that would be just to read the docutils sources. I find them...alien. (though less so than when I first started looking at them...I'm not sure if they've improved, or I have)
Yet another possible solution would be to modify Sphinx, adding a plugin to process the Lore sources.
This is more reasonable, but still has problems. Actually the reasonable thing would be to create a docutils piece to process Lore sources, and then maybe some Sphinx extensions on top of that. Or something. Still, it might have been doable. However, I think Lore would have had to be modified as well, and possibly the Lore format expanded to accommodate certain constructs that it just doesn't do right now (mostly I'm thinking of the toctree directive and related stuff).
As an aside: this is the part of the process which has been so frustrating to me, personally. The two alternate solutions I proposed here (and have proposed before) seem far saner and more manageable in terms of effort, to me. But, everyone I have spoken to about docutils and ReST has told me in no uncertain terms that they are both a pile of heinous hacks that resist any attempt at sensible software-engineering solutions to problems, so we need to resort to hackish system-integration stuff like what we've done. This worries me.
Ooookaaaaay....I don't know how to respond to that exactly.
I know that Sphinx's output is well-loved by the Python community, but if it's so hard to call into that we can't reasonably modify it to get an XML DOM that looks like Lore source to Lore, and it's so hard to plug in to it that we can't give it a data structure that it likes from Lore's XML DOM, then how the heck is it being maintained? And if it actually *isn't* that bad, then why haven't I managed to find someone that knows its code well enough to do one or the other of these things?
It would be possible to make Sphinx emit Lore sources, though I'm not sure what that buys. You could do this either through a custom Sphinx "builder", or possibly even just using a custom html template with the html builder. But you'd need ReST sources to feed into Sphinx, so...
You could write a docutils "parser" which parses a document and returns a "nodetree" data structure. This would get you as far as docutils, but AFAIK there is no existing way to get Sphinx to use any parser other than the default ReST one. You could probably create such a thing, which would almost certainly involve modifications to Sphinx, though that's not necessarily a big deal. It might not even be hard. I think this would actually be a lot easier now than when I started down this path, mostly because docutils seems to have better documentation on the nodes that can go in the "nodetree" I mentioned above. Note that I said "seems" because I'm not sure if it's that docutils documentation has gotten more complete, or just that I've bounced around in it enough times to find things. The Docutils docs have the same problem that the Twisted docs have, which is that they are nigh un-navigable. (I also think that the docutils docs should start using Sphinx, but I'm not sure how well that would go over in that camp...)
The main problem with creating such a parser is that Sphinx uses a bunch of docutils extensions to tie together the disparate documents in your project, and Lore, like vanilla docutils, doesn't have much of a concept of being one document among many (at least not from within a document). For example, Sphinx has things to handle tables of contents, cross-document links (with the ability to link to a document section, rather than a specific document, so if the section gets moved to a different document, the link gets adjusted), compilation of glossaries and index entries from across the docs project, etc. So you'd need to add some stuff to Lore to account for this (some is already there). And then we'd have to go through and modify a bunch of the Lore sources anyway.
Like I said, this looks a lot more feasible now than it did when I first looked at it, though I'm not sure whether it's me or docutils/Sphinx that's changed. Probably some of each.
At any rate, back then it seemed awfully difficult, and less interesting.
Hmmm. And you'd also need to make some changes to the way Sphinx picks up files. And probably some other stuff I haven't thought of.
I have no direct knowledge of any of this stuff, because my main interest here is improving the experience of working on Twisted, both for you, Kevin, and for the people who will arguably be helped by the use of Sphinx. Maybe I'm completely wrong and Sphinx is beautifully architected and we could have done this from day 1. But I faintly hope that some Docutils and Sphinx contributor hears that I said "sphinx is garbage" and makes a fool of me by contributing either a lore modification or a sphinx plugin which solves this whole problem so we can do the format or tool migration incrementally :).
b) Because of a), there needs to be tooling to run lore2sphinx (or whatever) on a regular basis. (This was sort of being done via the Sphinx-building buildbot, but in a very ad-hockery sort of way, which was brittle, broke a couple of times, and needed to be improved.)
Hmm. I wasn't aware of that. But it seems like it's running like a charm now.
I think this is because a) exarkun fixed it a couple of times, and b) I stopped making changes to the lore2sphinx repo (which the buildbot pulls from). I'm also referring here to something which is completely non-obvious to anyone who hasn't actually run lore2sphinx by hand, which is that the command line tool was fairly terrible in several ways. This made it harder to use for development than it should have been.
c) There needs to be release management tooling to build the Sphinx docs from ReST into whatever formats we want to publish (HTML and PDF to start, maybe others later on)
Yup. (ePub? PDF is so last-century... :))
d) Convert the Lore sources to better ReST documents without all the problems that the current lore2sphinx output has.
So, this wasn't *necessary*. If we had gotten through the release automation stuff - and I still don't understand why that's stuck - we could have merged it.
Well, I decided it was. Or at least really really desirable.
I at one time thought this was pretty impractical. My first attempt at a conversion tool tried to use an intermediate object model, but I ran into trouble when trying to combine the various objects. So I abandoned the effort and created what became lore2sphinx, which basically just combined a bunch of strings. I then figured out a way of making the intermediate object thing work, and that was lore2sphinx-ng. Then it became convenient to split out the intermediate object model from the document processing code, so I put all of that into a library and that became rstgen.
It seems the saving grace here is that rstgen might be a generally useful tool in its own right, with more of a long-term future than lore2sphinx would have had.
I admit that I have become more interested in the actual problem of "generating ReST" than I once was. And I hope that it will become a generally useful tool. And probably one of the reasons I have been making such relatively slow progress on it is _because_ I'm trying to solve a more general problem than I once was. The original lore2sphinx (the one running on the buildbot now) was very much a minimal-thing-that-could-possibly-work kind of solution. It tried to do just enough to get the job done. It sort of did get the job done, but I was never very satisfied with it.
(For anyone who is curious, the lore2sphinx-ng repo is forked off from the lore2sphinx repo, primarily because I didn't want to break the Sphinx buildbot by making drastic changes.)
Have a link?
I've posted it a couple of times in this thread, though I can hardly blame you for either missing it or losing track of it.
original: https://bitbucket.org/khorn/lore2sphinx
extra-crispy: https://bitbucket.org/khorn/lore2sphinx-ng
Here's what my plan was prior to this whole discussion getting started again.
1) Finish rstgen, where "finished" in this instance is defined as "is capable of generating all the vanilla docutils and sphinx-specific ReST elements that we need for converting the Twisted documentation."
Sounds like a worthy goal, although I don't think this is necessarily required. Have you been working on it for the last 2 years? Do you have any idea when it might be done? It might be worthwhile to write a *smaller* .
I started on rstgen a bit more than a year ago. I was hung up on the problem of how to combine various parts of a document for a while without having the crazy space-handling issues. And also I've been trying to come up with a relatively friendly API, and enough generality that it will end up useful outside of the lore2sphinx context. I really started on l2s-ng last July during "Julython". I've been working on it in fits and starts a few times since then.
2) Finish lore2sphinx-ng (which would probably have ended with merging it back into the lore2sphinx repo), where "finished" means that it would be capable of processing all the XHTML Lore tags that were defined in the Lore documentation and used in the Twisted documentation, and generating a tree of rstgen elements, which could then be rendered into ReST.
Cool.
While this would be handy, especially for people working on documentation branches, it's not necessarily necessary.
(this would also serve to satisfy b) above, as the CLI in lore2sphinx-ng is less...well, let's just call it broken than lore2sphinx's was/is.)
OK.
3) Go back and finish SphinxBuilder (release tooling for building a sphinx project, which is basically a wrapper for sphinx-build, plus some vague "version feature").
This is really the crux; this is the thing you should work on first, I think, even if you're going to keep working on lore2sphinx-ng. Basically the only reason that I was keen to get the lore to sphinx conversion improved in the first place was that creating this tool seemed to be dragging on for quite a while after the "chunk tickets" were done. But now, this tool is almost done, and we could re-do the lore-source review if you wanted to do that. The current lore2sphinx might well be good enough to just go with, especially if the next-generation version is going to take another six months to finish.
I'll take a look at this again soonish (a week? this month? don't know.). Probably it's a matter of:
- merge forward (it has been a while)
- figure out how the other tools guess/determine the Twisted version in the checkout, and make SphinxBuilder do that
- get it reviewed
- commit
But I'll have to remember how to use combinator again (which will be much easier now that the combinator "docs" are on the Twisted wiki...thanks to whoever did that!) Yes, I could probably use Bazaar, but so far every time I've tried that, I've ended up spending waaaaaay too much time just on the VCS. I guess I have some kind of mental block with bzr. I'll get over it someday I suppose.
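To make the "wrapper for sphinx-build, plus a version" idea concrete, here is a minimal sketch. It is not the SphinxBuilder code from the branch: the function name `sphinx_build_command`, the directory names, and the version string are placeholders invented for illustration. It only assumes the standard `sphinx-build` options `-b` (builder name) and `-D` (override a `conf.py` setting), and it builds the argument list without running anything.

```python
def sphinx_build_command(source_dir, output_dir, version, builder="html"):
    """Return the argv list for one sphinx-build run (nothing is executed).

    Hypothetical helper: how the Twisted version is discovered from the
    checkout is left to the caller, since that step is still undecided above.
    """
    return [
        "sphinx-build",
        "-b", builder,                # which Sphinx builder to use
        "-D", f"version={version}",   # override the `version` setting in conf.py
        "-D", f"release={version}",   # override the `release` setting in conf.py
        source_dir,
        output_dir,
    ]


# Placeholder paths and version, purely for illustration.
command = sphinx_build_command("doc", "doc/_build/html", "12.3.0")
```

A release tool could hand a list like this to `subprocess.check_call`. Keeping the command construction separate from execution is one way to make such a wrapper testable without Sphinx installed.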
4) Get someone to use something less hackish than what's currently building the Sphinx docs on the buildbot, and preferably in such a way that the results of those builds could be published somewhere and have persistent links. Currently the results of what the Sphinx buildbot does are stored for a time, and then go away, so you'll see links to build results in some trac tickets that go nowhere, which is decidedly unhelpful. My plan was that we'd set up something where the Sphinx docs would get generated and published someplace for every buildbot build so that we could always have the current results for the lore to sphinx conversion for the tip of each branch. I have no idea whether this is actually feasible or practical, but it seemed like it would be useful.
OK, *this* sounds like really unnecessary turd-polishing ;-). This builder is an interim step; the more interesting step is the builder that just builds the sphinx docs, in the same way that the current builder builds the lore docs. Furthermore, it seems to be working fine. Build results links that go nowhere are a known problem with buildbot, since it does eventually lose most history, and this type of history takes up a fair bit of disk space.
Well, it was mostly motivated by the fact that we were doing a lot of linking to build results that would then cease to exist for a while, and it really annoyed me. It doesn't seem nearly as "necessary" to me now as it once did.
5) Proceed with Sphinx docs being built from lore sources, making tweaks as necessary to lore2sphinx(ng) for as long as it took for the generated docs to be good enough to justify switching to Sphinx entirely.
6) Switch to Sphinx entirely.
I really wasn't planning on trying to get people excited about switching to Sphinx again until 1) and 2) were at least "mostly" done (for certain values of done) and I had gone back to finish 3).
So. I guess at this point the question is whether to try and go with what's there (lore2sphinx) or finish up the "new stuff" (lore2sphinx-ng + rstgen). I think 3-6 in my above plan need to happen in any case, and I think those will be much easier with lore2sphinx-ng+rstgen.
This decision is really determined by time estimates.
In any case, work out the sphinx release automation tool first, since we need that regardless of how we switch over
Got it.
IIRC, rstgen has support for most of the vanilla docutils elements, with the notable exception of tables (and maybe definition lists...can't recall whether I finished those). It has a basic level of test coverage (of course you can never have too many tests) for rendering the elements individually, and some tests for elements in combination (particularly nested lists). Footnotes and citations, I think, also need some work, which I have a plan for but haven't implemented yet (I don't think).
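To illustrate why nested lists are the interesting case for an object model, here is a minimal sketch of one way an element can indent whatever its children render. The class names `Text` and `BulletList` are hypothetical and are not taken from rstgen; the only idea borrowed from the description above is that each element knows how to render itself.

```python
class Text:
    """A run of plain text (hypothetical element, not rstgen's API)."""

    def __init__(self, text):
        self.text = text

    def render(self):
        return self.text


class BulletList:
    """A bullet list; each item is a list of child elements."""

    def __init__(self, items):
        self.items = items

    def render(self):
        lines = []
        for children in self.items:
            # Separate an item's children with a blank line, so a nested
            # list is set off from the paragraph that precedes it.
            body = "\n\n".join(child.render() for child in children)
            first, *rest = body.split("\n")
            lines.append("- " + first)
            # Continuation lines are indented to line up under the item
            # text; blank lines stay empty rather than gaining spaces.
            lines.extend("  " + line if line else "" for line in rest)
        return "\n".join(lines)


inner = BulletList([[Text("inner a")], [Text("inner b")]])
outer = BulletList([[Text("outer"), inner], [Text("second")]])
rendered = outer.render()
```

Because the indentation is applied by the parent to already-rendered child text, the nesting depth never has to be threaded through string concatenation by hand, which is where the whitespace problems of a string-joining approach tend to come from.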
The "new" lore2sphinx CLI tool needs more work, but is relatively straightforward. Like the old tool, it's basically an elementtree processor, except instead of spitting out strings that get joined together (which granted was an unholy mess), it generates rstgen elements, which all have a .render() method. After processing a Lore document, you should end up with a rstgen.Document object. You call its render() method, which calls its children's render() methods, etc. and it's turtles all the way down.
The framework is there for the new CLI tool, it's mostly a matter of writing a bunch of short methods that take elementtree elements as input and return appropriate rstgen objects.
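The shape described above (short methods that take elementtree elements and return renderable objects) could look roughly like the following. This is a sketch under stated assumptions, not the lore2sphinx-ng code: `Section`, `Paragraph`, `Document`, `LoreConverter` and the `handle_<tag>` naming are all hypothetical, and the input is a simplified, namespace-free stand-in for a Lore XHTML document.

```python
import xml.etree.ElementTree as ET


def _flat_text(node):
    """Collapse all text under a node into one whitespace-normalized line."""
    return " ".join("".join(node.itertext()).split())


class Section:
    def __init__(self, title, underline="="):
        self.title = title
        self.underline = underline

    def render(self):
        # A ReST section title is the text followed by an underline that is
        # at least as long as the title.
        return f"{self.title}\n{self.underline * len(self.title)}"


class Paragraph:
    def __init__(self, text):
        self.text = text

    def render(self):
        return self.text


class Document:
    def __init__(self, children):
        self.children = children

    def render(self):
        # Blocks are separated by one blank line; the file ends in a newline.
        return "\n\n".join(child.render() for child in self.children) + "\n"


class LoreConverter:
    """Dispatch each element to a handle_<tag> method, if one exists."""

    def convert(self, root):
        children = []
        for node in root.iter():
            handler = getattr(self, "handle_" + node.tag, None)
            if handler is not None:
                children.append(handler(node))
        return Document(children)

    def handle_h2(self, node):
        return Section(_flat_text(node))

    def handle_p(self, node):
        return Paragraph(_flat_text(node))


source = "<body><h2>Using  Deferreds</h2><p>A Deferred is a\n promise.</p></body>"
document = LoreConverter().convert(ET.fromstring(source))
output = document.render()
```

Supporting another tag in this arrangement means adding one more short `handle_` method, which matches the "bunch of short methods" description; tags with no handler are simply skipped here, whereas a real tool would probably want to report them.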
Obviously these tools aren't finished, but they produce much better output than the old version of lore2sphinx w.r.t. whitespace handling, paragraph wrapping, etc.
Aesthetically, this appeals to me a lot more than going with the messiness of lore2sphinx.
Me too.
But it is _not_ a requirement.
Understood. Though I think it might be a practical requirement, even if it isn't a policy requirement. If that makes sense.
Some of the code is still pretty messy, but nowhere near the train wreck that the current/old version of lore2sphinx is. By which I mean it _can_ be cleaned up, it just hasn't been yet. In particular there are some places in rstgen where the API is (to me at least) obviously awful, but I haven't gotten around to fixing it yet.
Please review the code. Please feel free to ask questions if you're interested.
Personally, I've gotten over being in a hurry about all this, and I think a robust tool is more likely to succeed in the long run, though finishing it may make the run a bit longer. So I'm for finishing lore2sphinx-ng+rstgen.
I think a little false urgency might not hurt here :-). I'm not going to work on the tool - just writing these emails probably blew my Twisted development budget for the next two months ;-)
I can relate... :)
- but I will do my best to quickly clear up any procedural what-needs-to-be-done questions unambiguously. Please ping if anything gets you stuck.
I'll let you know.
-- Kevin Horn
Hi, I am very much interested in completing this project. I applied to GSoC last time with this project and was rejected (there were better students, though). I would be happy to take it on this time. Let me know your thoughts.
On 3/7/13, Kevin Horn <kevin.horn@gmail.com> wrote:
Sorry it's taken me so long to get back to this. But it's gotten to be a Looong email.
On Sat, Mar 2, 2013 at 3:14 AM, Glyph <glyph@twistedmatrix.com> wrote:
On Mar 1, 2013, at 9:35 PM, Kevin Horn <kevin.horn@gmail.com> wrote:
That "never-ending" series of Lore source fixes took place over the course of a couple of weeks. Doing things that way was not my idea, though it seemed reasonable at the time because the idea was that we would do the cutover at the end of it.
Well, let's go to the video tape. Based on this comment - <http://twistedmatrix.com/trac/ticket/4500#comment:12> - these tickets were closed over a period ranging from 2010/07 to 2011/03. 6 months isn't quite "weeks", but okay I guess it wasn't "never-ending" either :).
Hmmm. I recall it as being much shorter. Probably most of the work took place in two "spurts" around the beginning and end of that time, and that's why I remember it that way. But I'm not interested in digging through a bunch of old dates to find out for sure.
(As an aside, lore2sphinx is in no way a "broken pile of regexes". Not to say that it isn't broken in some really significant ways, because it is, but it doesn't use regexes at all. Just sayin'.)
Actually yeah, "regex" is just a curse-word here :). It's the emitter I'm complaining about, anyway, not the parser, so deriding it as a "regex" is in no way accurate.
I figured that was the case, I just wanted to say something so others reading this didn't get the wrong impression about how lore2sphinx is implemented. I mean it's not code I'm very proud of, but it's not _that_ bad :)
<<< snip a bunch of stuff about who said what when, why I thought what I thought, etc. >>>
It boils down to the fact that a bunch of the conversations happened either in person or on IRC. This was mostly because I was in a hurry at the time, usually because I wanted to do something before additions were made to the documentation, which was in a somewhat "known" state (as in I knew how it was going to behave when run through lore2sphinx) at the time.
Also, please elaborate on what you mean by "do *everything* in one big bang". My intention was never to do anything but get the SphinxBuilder working on that branch. Was there something else you thought I was doing? Was there something else I should (or should not) have been doing?
My reasoning goes like this: the ticket for the release tools is still not in review, so you must be waiting for something to re-submit it. It looks like you responded to the code, so the only thing I could think you were still waiting for would be for the lore sources themselves to be ready.
It's been long enough that I can't fully recall my reasoning on this. But _probably_ I decided that if I finished the release tools ticket, someone might use it. Which would be great, except that I think I had decided that before that actually happened I needed to figure out a way to emit nicer output from lore2sphinx. So I left it alone until I had figured out how to do that.
At least, that _might_ have been part of my thought process. It really was ages ago.
[the fixed-up Lore sources] got left alone because of the release tools hangup. Ideally the release tools would have been done before the whole lore-source-tweaking process, but they weren't. I'll admit my frustration played a part in this, but so did the deafening silence I got when I asked for anyone to comment on the ticket.
Where and how did you ask people to comment on the ticket? I don't recall being asked, and I tend to be pretty good about leaving prompts like that in my inbox until I've done what was asked. (Not *perfect*, of course, and if you asked a list then there might have been some bystander effect.) It seems like we might have avoided this whole mess if you had just attached the 'review' keyword :).
On IRC.
My perception has been that I would say "what do we need to do to make this happen"? There would be some hemming and hawing (and at least several times long discussions about how documentation didn't really fit the regular UQDS process) and a sort of plan would be invented. I would proceed according to the plan as I understood it. I would then say "OK, we're ready"! And then be told that some other thing not in the plan needed to be done. The cycle would then repeat.
The only "cycle" I can either see on the tickets or recall here is where the release tools didn't come in to the initial plan.
This was the latest of several (3 or 4) according to my recollection/perception. It doesn't really matter now.
No [the need for release automation] was not brought up until well into the process. I (sort of) understand the desire for this, but it seems pretty weird to be building what is essentially a wrapper for an existing tool, along with tests for said wrapper,
OK. I can believe that this did not happen. One problem is that we (the inner-circle old-school Twisted developers) tend to engage in conversations about how a thing might be done while at the same time we discuss what must be done. And we also tend to discuss what policy is (or what all or some of us believe it *ought to be* in some case, further confusing the issue) without making explicit what the *purpose* of that requirement is.
I would ask the community to help us with this by doing a couple of things.
If somebody says "X is policy", always ask for a link to it. If there is a link, it'll help you understand it better. If there *isn't* a link, then the authority telling you it's "policy" might just be remembering that it's the way we've done things since forever and of course it's a good idea. There are definitely things that I have thought were in the coding standard that are not actually written down anywhere, on more than one occasion.
If a meandering discussion is happening - here, on the mailing list, on the ticket - never be afraid to break it up and separate out the different concerns which are being discussed: what is necessary for compliance with our development process, what would be a good idea from a design point of view, how the work might be broken up to get through review more manageably, what other concerns are in play.
Especially, if you ever see a code review where a reviewer says "I think..." without making it clear what you should *do*, you should always ask, 'is this a requirement of the review or just some thoughts you have'.
And when we ask, we should ask on the ticket, and put it back into review, yes? Because I think this was the part (or at least _A_ part) I was really missing here.
There's also the problem of "I think you should..." being interpreted as "You must...". It is *very* hard to consistently separate design feedback from code review, although we try very hard; but, it's hard to separate it out when reading it as well. So one important point to keep in mind is that, as the author of a proposed change, outside the things that are agreed upon policy consensus, you always have some degree of discretion to disagree with a reviewer. And you should freely do so when submitting anything for re-review. It's best to just do this as quickly as possible, so that it gets back to the reviewer without a whole lot of delay, and they can respond with either "I still disagree, but you're doing the work, so OK go ahead" or "No, you really have to do this, it's required by policy document X, here's a link" ;-).
1. The documentation itself needs to be able to be generated from any version of trunk. While one or two formatting snafus are acceptable to be fixed after the fact, the documentation needs to be in a comprehensible state in every revision of trunk, which means that in order to land on trunk, the ReST output.
So...you didn't finish that sentence. I realize you apologized for errors at the end of your mail, but I have a feeling you were going to say something rather important there...
Well yes, that was the point of the apology. That was a rather important thing. What I was probably going to say was just:
The ReST output needs to be in good enough shape to be generally readable, with a manageable number of errors. But, we need to be able to *verify* that it has not too many errors.
And I'd already discussed that somewhat above.
Now that I've replied to all of that, let me give you a rundown of what I've been thinking and planning, so that you have an idea of where I'm coming from.
Here are the various things that I have perceived to be necessary/required in order to get the conversion to happen:
a) The conversion process needs to be able to be run concurrently with Lore for an extended period of time. In other words, Lore would be the "official" version of the docs, and the Sphinx docs would be built in some form of automated fashion until everyone was happy with them and/or ready to deprecate/abandon Lore.
Your understanding of this requirement is slightly off, I think, although possibly the consequences are the same. As per the difficulties I laid out above, about separating the requirements from the strategies for satisfying said requirements.
I've been told that almost verbatim, several times. This is basically what led to the Sphinx buildbot happening. Perhaps I wasn't clear about what I meant.
The thing that we weren't going to tolerate was any message saying that people should hold off on writing documentation, even for "a little while" while we fixed up the lore conversion, because without a contractual obligation for someone to finish this work, there's really no telling how long "a little while" would be :).
Well, when I originally was pushing it, my plan was for that little while to be "today" (this was at PyCon during the only day of sprints I was able to attend), and if it didn't get done, we'd abandon that particular attempt. You and exarkun managed to convince me that even this was probably not a very good idea though.
Since the whole point of this sphinx conversion is to appeal to documentation authors who prefer the ReST format as input (it's definitely not to make the docs look nicer, writing a new stylesheet for Lore would have taken 1/100th of the effort and nobody has expressed interest in doing that), creating a period where things were even *less* appealing to documentation authors would defeat the purpose.
I actually considered the stylesheet thing, but it was really only a passing thought. My personal motivation started with not being able to find things in the documentation. So I started looking at the various Lore tickets to see whether there was something to clean up that would help. And a bunch of them seemed to be asking for things that Sphinx already did. Sphinx was starting to become a common tool, and I had used it on several other projects, and found it pleasant to work with. Also, when I asked about Lore on IRC, I got a lot of "I'm not sure anyone knows how that works these days" and "oh man, I wish we didn't have to support that any more", etc. So I started looking into how to convert the docs over to use Sphinx.
Another possible solution to this problem would be to modify Lore so it could process ReST sources, so that we could convert the documentation within the repository piecemeal, and start writing any new docs in ReST, but still have a coherent whole of documentation produced, eventually switching the documentation processor from Lore to Sphinx.
This would require someone smarter than me. Or at least more versed in formal parsing theory/techniques. Or something. And that would be just to read the docutils sources. I find them...alien. (though less so that when I first started looking at them...I'm not sure if they've improved, or I have)
Yet another possible solution would be to modify Sphinx, adding a plugin to process the Lore sources.
This is more reasonable, but still has problems. Actually the reasonable thing would be to create a docutils piece to process Lore sources, and then maybe some Sphinx extensions on top of that. Or something. Still, it might have been doable. However, I think Lore would have had to be modified as well, and possibly the Lore format expanded to accommodate certain constructs that it just doesn't do right now (mostly I'm thinking of the toctree directive and related stuff).
As an aside: this is the part of the process which has been so frustrating to me, personally. The two alternate solutions I proposed here (and have proposed before) seem far saner and more manageable in terms of effort, to me. But, everyone I have spoken to about docutils and ReST has told me in no uncertain terms that they are both a pile of heinous hacks that resist any attempt at sensible software-engineering solutions to problems, so we need to resort to hackish system-integration stuff like what we've done. This worries me.
Ooookaaaaay....I don't know how to respond to that exactly.
I know that Sphinx's output is well-loved by the Python community, but if it's so hard to call into that we can't reasonably modify it to get an XML DOM that looks like Lore source to Lore, and it's so hard to plug in to it that we can't give it a data structure that it likes from Lore's XML DOM, then how the heck is it being maintained? And if it actually *isn't* that bad, then why haven't I managed to find someone that knows its code well enough to do one or the other of these things?
It would be possible to make Sphinx emit Lore sources, though I'm not sure what that buys. You could do this either through a custom Sphinx "builder", or possibly even just using a custom html template with the html builder. But you'd need ReST sources to feed into Sphinx, so...
You could write a docutils "parser" which parses a document and returns a "nodetree" data structure. This would get you as far as docutils, but AFAIK there is no existing way to get Sphinx to use any parser other than the default ReST one. You could probably create such a thing, which would almost certainly involve modifications to Sphinx, though that's not necessarily a big deal. It might not even be hard. I think this would actually be a lot easier now than when I started down this path, mostly because docutils seems to have better documentation on the nodes that can go in the "nodetree" I mentioned above. Note that I said "seems" because I'm not sure if it's that docutils documentation has gotten more complete, or just that I've bounced around in it enough times to find things. The Docutils docs have the same problem that the Twisted docs have, which is that they are nigh un-navigable. (I also think that the docutils docs should start using Sphinx, but I'm not sure how well that would go over in that camp...)
The main problem with creating such a parser, is that Sphinx uses a bunch of docutils extensions to tie together the disparate documents in your project, and Lore, like vanilla docutils, doesn't have much of a concept of being one document among many (at least not from within a document). For example, it has things to handle tables of contents, cross document links (with the ability to link to a document section, rather than a specific document, so if it gets moved to a different document, the link gets adjusted), compilation for glossaries and index entries from across the docs project, etc. So you'd need to add some stuff to Lore to account for this (some is already there). And then we'd have to go through and modify a bunch of the Lore sources anyway.
Like I said, this looks a lot more feasible now than it did when I first looked at it, though I'm not sure whether it's me or docutils/Sphinx that's changed. Probably some of each.
At any rate, back then it seemed awfully difficult, and less interesting.
Hmmm. And you'd also need to make some changes to the way Sphinx picks up files. And probably some other stuff I haven't thought of.
I have no direct knowledge of any of this stuff, because my main interest
here is improving the experience of working on Twisted, both for you, Kevin, and for the people who will arguably be helped by the use of Sphinx. Maybe I'm completely wrong and Sphinx is beautifully architected and we could have done this from day 1. But I faintly hope that some Docutils and Sphinx contributor hears that I said "sphinx is garbage" and makes a fool of me by contributing either a lore modification or a sphinx plugin which solves this whole problem so we can do the format or tool migration incrementally :).
b) Because of a), there needs to be tooling to run lore2sphinx (or whatever) on a regular basis. (This was sort of being done via the Sphinx-building buildbot, but in a very ad-hockery sort of way, which was brittle, broke a couple of times, and needed to be improved.)
Hmm. I wasn't aware of that. But it seems like it's running like a charm now.
I think this is because a) exarkun fixed it a couple of times, and b) I stopped making changes to the lore2sphinx repo (which the buildbot pulls from). I'm also referring here to something which is completely non-obvious to anyone who hasn't actually run lore2sphinx by hand, which is that the command line tool was fairly terrible in several ways. This made it harder to use for development than it should have been.
c) There needs to be release management tooling to build the Sphinx docs from ReST into whatever formats we want to publish (HTML and PDF to start, maybe others later on)
Yup. (ePub? PDF is so last-century... :))
d) Convert the Lore sources to better ReST documents without all the problems that the current lore2sphinx output has.
So, this wasn't *necessary*. If we had gotten through the release automation stuff - and I still don't understand why that's stuck - we could have merged it.
Well, I decided it was. Or at least really really desirable.
I at one time thought this was pretty impractical. My first attempt at a conversion tool tried to use an intermediate object model, but I ran into trouble when trying to combine the various objects. So I abandoned the effort and created what became lore2sphinx, which basically just combined a bunch of strings. I then figured out a way of making the intermediate object thing work, and that was lore2sphinx-ng. Then it became convenient to split out the intermediate object model from the document processing code, so I put all of that into a library and that became rstgen.
It seems the saving grace here is that rstgen might be a generally useful tool in its own right, with more of a long-term future than lore2sphinx would have had.
I admit that I have become more interested in the actual problem of "generating ReST" than I once was. And I hope that it will become a generally useful tool.
And probably one of the reasons I have been making such relatively slow progress on it is _because_ I'm trying to solve a more general problem than I once was. The original lore2sphinx (the one running on the buildbot now) was very much a minimal-thing-that-could-possibly-work kind of solution. It tried to do just enough to get the job done. It sort of did get the job done, but I was never very satisfied with it.
(For anyone who is curious, the lore2sphinx-ng repo is forked off from the lore2sphinx repo, primarily because I didn't want to break the Sphinx buildbot by making drastic changes.)
Have a link?
I've posted it a couple of times in this thread, though I can hardly blame you for either missing it or losing track of it.
original: https://bitbucket.org/khorn/lore2sphinx
extra-crispy: https://bitbucket.org/khorn/lore2sphinx-ng
Here's what my plan was prior to this whole discussion getting started again.
1) Finish rstgen, where "finished" in this instance is defined as "is capable of generating all the vanilla docutils and sphinx-specific ReST elements that we need for converting the Twisted documentation."
Sounds like a worthy goal, although I don't think this is necessarily required. Have you been working on it for the last 2 years? Do you have any idea when it might be done? It might be worthwhile to write a *smaller* .
I started on rstgen a bit more than a year ago. I was hung up on the problem of how to combine various parts of a document for a while without having the crazy space-handling issues. And also I've been trying to come up with a relatively friendly API, and enough generality that it will end up useful outside of the lore2sphinx context.
I really started on l2s-ng last July during "Julython". I've been working on it in fits and starts a few times since then.
2) Finish lore2sphinx-ng (which would probably have ended with merging it back into the lore2sphinx repo), where "finished" means that it would be capable of processing all the XHTML Lore tags that were defined in the Lore documentation and used in the Twisted documentation, and generating a tree of rstgen elements, which could then be rendered into ReST.
Cool.
While this would be handy, especially for people working on documentation branches, it's not necessarily necessary.
(this would also serve to satisfy b) above, as the CLI in lore2sphinx-ng is less...well, let's just call it broken than lore2sphinx's was/is.)
OK.
3) Go back and finish SphinxBuilder (release tooling for building a sphinx project, which is basically a wrapper for sphinx-build, plus some vague "version feature").
This is really the crux; this is the thing you should work on first, I think, even if you're going to keep working on lore2sphinx-ng. Basically the only reason that I was keen to get the lore to sphinx conversion improved in the first place was that creating this tool seemed to be dragging on for quite a while after the "chunk tickets" were done. But now, this tool is almost done, and we could re-do the lore-source review if you wanted to do that. The current lore2sphinx might well be good enough to just go with, especially if the next-generation version is going to take another six months to finish.
I'll take a look at this again soonish (a week? this month? don't know.). Probably it's a matter of:
- merge forward (it has been a while)
- figure out how the other tools guess/determine the Twisted version in the checkout, and make SphinxBuilder do that.
- get it reviewed
- commit
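For readers who have not seen the branch, the "wrapper for sphinx-build" idea can be sketched roughly as below. This is not the SphinxBuilder code from the ticket; the function names, the directory layout, and the version string are all made up for illustration. The only things taken as given are the standard sphinx-build options `-b` (builder name) and `-D` (override a conf.py setting). How the version is determined from the checkout is exactly the open item in the list above, so here it is simply passed in.

```python
import subprocess


def sphinx_build_command(source_dir, output_dir, version, builder="html"):
    """Return the argv list for a single sphinx-build run.

    Hypothetical helper: the version is supplied by the caller rather
    than discovered from the checkout.
    """
    return [
        "sphinx-build",
        "-b", builder,               # which builder to use (html, latex, ...)
        "-D", "version=" + version,  # override conf.py's 'version'
        "-D", "release=" + version,  # override conf.py's 'release'
        source_dir,
        output_dir,
    ]


def build_docs(source_dir, output_dir, version, builder="html"):
    """Run sphinx-build; raises CalledProcessError if the build fails."""
    subprocess.check_call(
        sphinx_build_command(source_dir, output_dir, version, builder))


# Placeholder paths and version, for illustration only.
command = sphinx_build_command("doc", "doc/_build/html", "12.3.0")
```

Keeping the command construction separate from the subprocess call means the release tooling can be unit-tested without Sphinx installed.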
But I'll have to remember how to use combinator again (which will be much easier now that the combinator "docs" are on the Twisted wiki...thanks to whoever did that!)
Yes, I could probably use Bazaar, but so far every time I've tried that, I've ended up spending waaaaaay too much time just on the VCS. I guess I have some kind of mental block with bzr. I'll get over it someday I suppose.
4) Get someone to use something less hackish than what's currently building the Sphinx docs on the buildbot, and preferably in such a way that the results of those builds could be published somewhere and have persistent links. Currently the results of what the Sphinx buildbot does are stored for a time, and then go away, so you'll see links to build results in some trac tickets that go nowhere, which is decidedly unhelpful. My plan was that we'd set up something where the Sphinx docs would get generated and published someplace for every buildbot build so that we could always have the current results for the lore to sphinx conversion for the tip of each branch. I have no idea whether this is actually feasible or practical, but it seemed like it would be useful.
OK, *this* sounds like really unnecessary turd-polishing ;-). This builder is an interim step; the more interesting step is the builder that just builds the sphinx docs, in the same way that the current builder builds the lore docs. Furthermore, it seems to be working fine. Build results links that go nowhere are a known problem with buildbot, since it does eventually lose most history, and this type of history takes up a fair bit of disk space.
Well, it was mostly motivated by the fact that we were doing a lot of linking to build results that would then cease to exist for a while, and it really annoyed me. It doesn't seem nearly as "necessary" to me now as it once did.
5) Proceed with Sphinx docs being built from lore sources, making tweaks as necessary to lore2sphinx(ng) for as long as it took for the generated docs to be good enough to justify switching to Sphinx entirely.
6) Switch to Sphinx entirely.
I really wasn't planning on trying to get people excited about switching to Sphinx again until 1) and 2) were at least "mostly" done (for certain values of done) and I had gone back to finish 3).
So. I guess at this point the question is whether to try and go with what's there (lore2sphinx) or finish up the "new stuff" (lore2sphinx-ng + rstgen). I think 3-6 in my above plan need to happen in any case, and I think those will be much easier with lore2sphinx-ng+rstgen.
This decision is really determined by time estimates.
In any case, work out the sphinx release automation tool first, since we need that regardless of how we switch over
Got it.
IIRC, rstgen has support for most of the vanilla docutils elements, with the notable exception of tables (and maybe definition lists...can't recall whether I finished those). It has a basic level of test coverage (of course you can never have too many tests) for rendering the elements individually, and some tests for elements in combination (particularly nested lists). Footnotes and Citations I think also need some work, which I have a plan for, but haven't implemented yet (I don't think).
The "new" lore2sphinx CLI tool needs more work, but is relatively straightforward. Like the old tool, it's basically an elementtree processor, except instead of spitting out strings that get joined together (which granted was an unholy mess), it generates rstgen elements, which all have a .render() method. After processing a Lore document, you should end up with a rstgen.Document object. You call its render() method, which calls its children's render() methods, etc. and it's turtles all the way down.
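To make the "turtles all the way down" rendering concrete, here is a toy sketch of the idea. These classes are stand-ins invented for this example and are not rstgen's actual API; the only thing borrowed from the description above is that every element has a render() method and a container renders by asking its children to render. The point is that each element owns its own whitespace, so nothing has to patch up spacing after strings have been joined.

```python
# Hypothetical stand-ins for illustration; NOT rstgen's real classes.

class Text:
    def __init__(self, value):
        self.value = value

    def render(self):
        # Collapse runs of whitespace (including newlines) to single spaces.
        return " ".join(self.value.split())


class Paragraph:
    def __init__(self, *children):
        self.children = children

    def render(self):
        return " ".join(child.render() for child in self.children)


class BulletList:
    def __init__(self, *items):
        self.items = items

    def render(self):
        lines = []
        for item in self.items:
            body = item.render().splitlines() or [""]
            lines.append("- " + body[0])
            # Continuation lines are indented to line up under the item text.
            lines.extend("  " + line for line in body[1:])
        return "\n".join(lines)


class Document:
    def __init__(self, *children):
        self.children = children

    def render(self):
        # Block-level children are separated by exactly one blank line.
        return "\n\n".join(child.render() for child in self.children) + "\n"


doc = Document(
    Paragraph(Text("Lore   sources\n become"), Text("ReST.")),
    BulletList(Paragraph(Text("first")), Paragraph(Text("second"))),
)
rendered = doc.render()
```

Here the paragraph comes out as one clean line and the list follows after a single blank line, regardless of how messy the whitespace in the input was.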
The framework is there for the new CLI tool; it's mostly a matter of writing a bunch of short methods that take elementtree elements as input and return appropriate rstgen objects.
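As a rough illustration of what "a bunch of short methods" could look like, the sketch below dispatches on the element tag to a handle_<tag> method. The Converter class, the handler naming scheme, and the two node classes are assumptions made up for this example, not code from lore2sphinx-ng or rstgen. The sample input also omits the XHTML namespace that real Lore documents declare, which would otherwise show up in each element's tag.

```python
import xml.etree.ElementTree as ET


# Hypothetical output node types; NOT rstgen's real classes.
class Paragraph:
    def __init__(self, text):
        self.text = text

    def render(self):
        return " ".join(self.text.split())


class Section:
    def __init__(self, title, underline):
        self.title = title
        self.underline = underline

    def render(self):
        title = " ".join(self.title.split())
        return title + "\n" + self.underline * len(title)


class Converter:
    """Walk an ElementTree and call handle_<tag> for each known tag."""

    def convert(self, root):
        nodes = []
        # Simplification: a flat walk in document order. A real converter
        # would recurse so that nested elements are handled by their parent.
        for element in root.iter():
            handler = getattr(self, "handle_" + element.tag, None)
            if handler is not None:
                nodes.append(handler(element))
        return nodes

    def handle_h1(self, element):
        return Section("".join(element.itertext()), "=")

    def handle_h2(self, element):
        return Section("".join(element.itertext()), "-")

    def handle_p(self, element):
        return Paragraph("".join(element.itertext()))


LORE = """<html><body>
<h1>Using   Lore</h1>
<p>Lore documents are
XHTML.</p>
<h2>Details</h2>
</body></html>"""

nodes = Converter().convert(ET.fromstring(LORE))
output = "\n\n".join(node.render() for node in nodes) + "\n"
```

Supporting another Lore tag then means adding one more short handle_<tag> method, which matches the "mostly a matter of writing short methods" description.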
Obviously these tools aren't finished, but they produce much better output than the old version of lore2sphinx w.r.t. whitespace handling, paragraph wrapping, etc.
Aesthetically, this appeals to me a lot more than going with the messiness of lore2sphinx.
Me too.
But it is _not_ a requirement.
Understood. Though I think it might be a practical requirement, even if it isn't a policy requirement. If that makes sense.
Some of the code is still pretty messy, but nowhere near the train wreck that the current/old version of lore2sphinx is. By which I mean it _can_ be cleaned up, it just hasn't been yet. In particular there are some places in rstgen where the API is (to me at least) obviously awful, but I haven't gotten around to fixing it yet.
Please review the code. Please feel free to ask questions if you're interested.
Personally, I've gotten over being in a hurry about all this, and I think a robust tool is more likely to succeed in the long run, though finishing it may make the run a bit longer. So I'm for finishing lore2sphinx-ng+rstgen.
I think a little false urgency might not hurt here :-). I'm not going to work on the tool - just writing these emails probably blew my Twisted development budget for the next two months ;-)
I can relate... :)
- but I will do my best to quickly clear up any procedural what-needs-to-be-done questions unambiguously. Please ping if anything gets you stuck.
I'll let you know.
-- Kevin Horn
-- Cordially Abdul Rauf (haseeb)
participants (4)
- Abdul Rauf
- Glyph
- Kevin Horn
- Tom Prince