Document <paragraph> Paragraph -------- <--- error here <paragraph> Paragraph There are several possibilities for the implementation. 1. Implement horizontal rules as "divisions" or segments. A "division" is a title-less, non-hierarchical section. The first try at an implementation looked like this:: <document> <section name="document"> <title> Document <paragraph> Paragraph <division> <paragraph> Paragraph But the two paragraphs are really at the same level; they shouldn't appear to be at different levels. There's really an invisible "first division". The horizontal rule splits the document body into two segments, which should be treated uniformly. 2. Treating "divisions" uniformly brings us to the second possibility:: <document> <section name="document"> <title> Document <division> <paragraph> Paragraph <division> <paragraph> Paragraph With this change, documents and sections will directly contain divisions and sections, but not body elements. Only divisions will directly contain body elements. Even without a horizontal rule anywhere, the body elements of a document or section would be contained within a division element. This makes the document tree deeper. This is similar to the way HTML treats document contents: grouped within a <BODY> element. 3. Implement them as "transitions", empty elements:: <document> <section name="document"> <title> Document <paragraph> Paragraph <transition> <paragraph> Paragraph A transition would be a "point element", not containing anything, only identifying a point within the document structure. This keeps the document tree flatter, but the idea of a "point element" like "transition" smells bad. A transition isn't a thing itself, it's the space between [#]_ two divisions. .. [#] Cool song by Dave Matthews. First time I heard it, I thought it was Peter Gabriel. Matthews' voice on this song sounds uncannily like Gabriel's, and the style & lyrics wouldn't be out of place either. Totally off topic. 
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From fdrake@acm.org Mon Oct 22 18:24:35 2001 From: fdrake@acm.org (Fred L. Drake) Date: Mon, 22 Oct 2001 13:24:35 -0400 (EDT) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011022172435.28AF528697@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Various updates, including support for the "Site Navigation Bar" in Mozilla 0.9.5. From fdrake@acm.org Mon Oct 22 19:41:46 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Mon, 22 Oct 2001 14:41:46 -0400 Subject: [Doc-SIG] GNU Info documentation for Python 2.1.1 Message-ID: <15316.26730.592110.734796@grendel.zope.com> Milan Zamazal has contributed versions of the Python documentation in GNU Info format for Python 2.1.1. If you've been itching for this, grab your copy today! ftp://ftp.python.org/pub/python/doc/2.1.1/ -Fred -- Fred L. Drake, Jr. <fdrake at acm.org> PythonLabs at Zope Corporation <a href="ftp://ftp.python.org/pub/python/doc/2.1.1/">GNU Info documentation for Python 2.1.1</a> -- The standard Python documentation is once more available in GNU Info format! From gustav@morpheus.demon.co.uk Mon Oct 22 20:45:18 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Mon, 22 Oct 2001 20:45:18 +0100 Subject: [Doc-SIG] How to traverse a document object Message-ID: <f1t8ttss7h3rfvm3r0mrng2ho4detl97d6@4ax.com> I don't know if I'm missing something stupidly obvious here... I want to traverse a document generated by a dps.parsers.restructuredtext.Parser instance. 
I can see no way of doing so short of a manual tree-walk with type checks all the way - you know::

    if type(current_node) == section:
        # process a section, probably recursing on its children
    elif type(current_node) == paragraph:
        # process a paragraph
    # etc, ad nauseam.

This *can't* be the correct way of doing this. The example code in the tools directory isn't any help, as it builds on the asdom() method, which does the trick "for free".

I tried looking at the xml.dom module, but that didn't offer anything that immediately leapt out at me as a suitable "visitor" interface. But I'm not familiar with XML, so I'm probably missing something.

But IMHO, there *must* be a simple way of traversing the document model. At the moment, I can't see it. Can someone enlighten me? (Or if there isn't one, surely it needs implementing - and as it may well need support from the classes in nodes.py, I'd imagine it needs adding sooner rather than later?)

As I say, I'm convinced I've missed something fundamental here...

Paul.

From goodger@users.sourceforge.net Mon Oct 22 22:44:36 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Mon, 22 Oct 2001 17:44:36 -0400
Subject: [Doc-SIG] How to traverse a document object
In-Reply-To: <f1t8ttss7h3rfvm3r0mrng2ho4detl97d6@4ax.com>
Message-ID: <B7FA0B83.194A0%goodger@users.sourceforge.net>

Paul Moore wrote:
> I don't know if I'm missing something stupidly obvious here... I want to
> traverse a document generated by a dps.parsers.restructuredtext.Parser
> instance. I can see no way of doing so short of a manual tree-walk with
> type checks all the way

You're not missing anything. The answer is simple: not implemented yet. (At least not in nodes.py; perhaps Tony's pydps implemented something?)

> (Or if there isn't one, surely it needs implementing - and as it may well need
> support from the classes in nodes.py, I'd imagine it needs adding sooner
> rather than later?)

And the classic open-source answer: please go ahead, patches gratefully accepted! Alternatively, explain what you are trying to do and how it will benefit humankind, and perhaps somebody else's interest will be piqued enough they'll implement it for you. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From gustav@morpheus.demon.co.uk Mon Oct 22 23:55:18 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Mon, 22 Oct 2001 23:55:18 +0100 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <f1t8ttss7h3rfvm3r0mrng2ho4detl97d6@4ax.com> References: <f1t8ttss7h3rfvm3r0mrng2ho4detl97d6@4ax.com> Message-ID: <e659tt8331819fv5s646isknugofh62p9t@4ax.com> On Mon, 22 Oct 2001 20:45:18 +0100, Paul Moore <gustav@morpheus.demon.co.uk> wrote: >I don't know if I'm missing something stupidly obvious here... I want to >traverse a document generated by a dps.parsers.restructuredtext.Parser >instance. I can see no way of doing so short of a manual tree-walk with >type checks all the way - you know:: Hmm. I dug a bit deeper - I can do a bit better, by switching on the tagname attribute. But I can't find a robust way of terminating the recursion. I can't check the "children" attribute (if it's zero, don't recurse) as text nodes don't have this attribute. OK, getattr(node, 'children', 0) works, but that looks ugly. The problem seems to be that the _Node class has no useful attributes or methods to handle tree walks, whereas _Element contains children which are not themselves _Elements (via _TextElement, which again has no attributes to let me notice what's going on). OK, I can check for a tagname of "#text". That seems to work. But there's no way that this feels natural - it smacks too much of magic numbers... Grmph. 
This feels like it should be a natural application for the "Visitor" pattern. The following works::

    # "Acceptor methods" - see the Visitor pattern for details

    # This one has to handle children
    def AcceptNode(self, visitor):
        visitor.Visit(self)
        for child in self.children:
            child.Accept(visitor)

    # This one doesn't handle children
    def AcceptText(self, visitor):
        visitor.Visit(self)

    # Install the methods - this feels as if it's
    # being unacceptably chummy with the node classes -
    # particularly by referring to the _Element class,
    # which has a leading underscore
    import dps.nodes
    dps.nodes._Element.Accept = AcceptNode
    dps.nodes.Text.Accept = AcceptText

    # Define a visitor, which just prints the tagname
    class V:
        def Visit(self, node):
            print node.tagname

    # Walk the document tree
    document.Accept(V())

In fact, this seems fairly nice, except for the fact that [Offtopic]_

a. I'm inserting new methods into the node classes, which is a little presumptuous of me.
b. I need to know the class names of the node classes, which seems to be too tied to implementation specifics.

Nevertheless, it's a pretty good tree walking model. It might be nice to have this in the node classes, except that there may not be a single "correct" walk order (a bit like the normal preorder, postorder, inorder issues).

.. [Offtopic] I would normally indent this list if I was writing "plain" (ie, no markup) text. I'm not sure what effect such indentation would have on reST. I get the impression I'd get an extra "blockquote" element that I didn't want. Is this harmful? (I guess that depends on the output formatter, and so the answer has to be "possibly"). Can it be avoided? In "plain" text, I *really* prefer the look of lists when they are indented.

   Even more offtopic - is this indented enough to be part of the footnote? I think the indentation rules need more clarification... [I did a test, and it is included...]

Sorry, this has all turned into a bit of a brain-dump.
But that's probably because I'm feeling that I'm having to invent something that I expected to be part of the basic infrastructure. Is it simply that no-one's got to the point of needing this implemented yet? It's an "output" issue, and the lack of output generators suggests that that may well be the problem.

Thanks for listening,
Paul.

From Juergen Hermann" <jh@web.de Tue Oct 23 00:28:29 2001
From: Juergen Hermann" <jh@web.de (Juergen Hermann)
Date: Tue, 23 Oct 2001 01:28:29 +0200
Subject: [Doc-SIG] How to traverse a document object
In-Reply-To: <e659tt8331819fv5s646isknugofh62p9t@4ax.com>
Message-ID: <m15voT2-007qasC@smtp.web.de>

On Mon, 22 Oct 2001 23:55:18 +0100, Paul Moore wrote:

>Grmph. This feels like it should be a natural application for the
>"Visitor" pattern.

Same idea here. Actually, it's already there, sort-of. The creation of the dom tree acts like the visitor pattern, only for a specific purpose. If you ask me, the whole thing should be abstracted (as you proposed) and the _dom_node and _rooted_dom_node could be implemented as ONE visitor.

What I missed was a "toSAX" method, which could be easily added then.

Ciao, Jürgen

From lists@itamarst.org Tue Oct 23 11:36:23 2001
From: lists@itamarst.org (Itamar Shtull-Trauring)
Date: Tue, 23 Oct 2001 12:36:23 +0200
Subject: [Doc-SIG] ANN: Teud 1.2, a documentation generator
Message-ID: <3BD54827.3050208@itamarst.org>

Teud - A Python documentation generator
=======================================

What does it do?

Converts python files to XML files containing the documentation for the objects in the python file. The XML files can be transformed into HTML using an XSLT stylesheet, etc. Teud tells you when a method in a class overrides or inherits from another class, and also supports PEP 245 interfaces (__implements__).

What does it mean?

teud is pronounced teh-OOD. It means "documentation" in Hebrew.

What does the output look like?

Well, the output can look however you want, just write a custom XSLT stylesheet. However, the default output with the fancy.css stylesheet that comes with Teud can be seen at:

http://itamarst.org/twisted-docs/twisted.html

How does it decide what to document?

1) If there is __all__, anything listed in it is public and is added to the XML file, anything not listed in it is not considered public but is added.
2) Anything imported from somewhere else (if that can be found out) and not in __all__ is not added to the XML file.
3) If there is no __all__, anything beginning with _ is not public.

The default XSLT does not export non-public objects to the HTML file.

Where do I download it?

For more details and a TODO list, see http://twistedmatrix.com/users/jh.twistd/python/moin.cgi/TeudProject

From goodger@users.sourceforge.net Wed Oct 24 03:07:14 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Tue, 23 Oct 2001 22:07:14 -0400
Subject: [Doc-SIG] How to traverse a document object
In-Reply-To: <e659tt8331819fv5s646isknugofh62p9t@4ax.com>
Message-ID: <B7FB9A90.195A8%goodger@users.sourceforge.net>

Paul Moore wrote:
> a. I'm inserting new methods into the node classes, which is a little
> presumptuous of me.

If you insert methods from outside the module, yes, you *would* be presumptuous. If you edit the nodes.py module and submit a patch, though, you'll be a contributing developer!

> b. I need to know the class names of the node classes, which seems to
> be too tied to implementation specifics.

The document tree is meant to be a specific document/DTD/schema implementation only, not a generic DOM.

> Nevertheless, it's a pretty good tree walking model. It might be nice
> to have this in the node classes, except that there may not be a
> single "correct" walk order (a bit like the normal preorder,
> postorder, inorder issues).

I don't know how you'd do inorder with document trees. ;-)

It's most useful to have a hook on the way in and on the way out of an element.
I'm sure we can come up with a useful set of methods, perhaps without having to reinvent the wheel, using polymorphism and without resorting to magic. SAX comes to mind. Suggestions, references, and/or patches welcome. > .. [Offtopic] I would normally indent this list if I was writing > "plain" (ie, no markup) text. I'm not sure what effect such > indentation would have on reST. I get the impression I'd get an > extra "blockquote" element that I didn't want. Correct. > Is this harmful? Depends on what you mean by "harmful". :-) > Can it be avoided? In "plain" text, I *really* prefer the look > of lists when they are indented. A transform could be written that looks for a block quote containing only a list, and extracts the list from within the block quote. The spec could be changed to specify that this will happen. But what if we *want* a list inside a block quote? How else would we write it? > Even more offtopic - is this indented enough to be part of the > footnote? I think the indentation rules need more > clarification... [I did a test, and it is included...] Yes. OK, the nitty gritties of the indentation rules should be spelled out better. Added to the to-do. > Sorry, this has all turned into a bit of a brain-dump. But that's > probably because I'm feeling that I'm having to invent something > that I expected to be part of the basic infrastructure. Great value for the money though! > Is it simply that no-one's got to the point of needing this > implemented yet? In my case, yes. I haven't tackled the output end yet. What are you trying to do exactly? My interest is half-piqued. Provide me a bit of stimulus and it may become fully-piqued. > Thanks for listening, Any time. Cheaper than psychotherapy. 
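
A minimal sketch of the "hook on the way in and on the way out" idea from the reply above, written in present-day Python. The `Element`, `Text`, and `TagPrinter` classes and the `walk`/`enter`/`leave`/`text` method names are hypothetical stand-ins invented for illustration; they are not the dps.nodes API, and this is only one possible shape for such an interface.

```python
# Hypothetical stand-ins for the DPS node classes; NOT the dps.nodes API.
class Text:
    tagname = "#text"

    def __init__(self, data):
        self.data = data

    def walk(self, visitor):
        # Text nodes are leaves: one callback, no recursion.
        visitor.text(self)


class Element:
    def __init__(self, tagname, *children):
        self.tagname = tagname
        self.children = list(children)

    def walk(self, visitor):
        visitor.enter(self)          # hook on the way in
        for child in self.children:
            child.walk(visitor)
        visitor.leave(self)          # hook on the way out


class TagPrinter:
    """Example visitor: collects an indented outline of the tree."""

    def __init__(self):
        self.depth = 0
        self.lines = []

    def enter(self, node):
        self.lines.append("  " * self.depth + "<" + node.tagname + ">")
        self.depth += 1

    def leave(self, node):
        self.depth -= 1

    def text(self, node):
        self.lines.append("  " * self.depth + node.data)


document = Element(
    "document",
    Element("title", Text("Document")),
    Element("paragraph", Text("Paragraph")),
)
printer = TagPrinter()
document.walk(printer)
print("\n".join(printer.lines))
```

Because text nodes implement the same `walk` method as elements, the caller needs neither a `getattr(node, 'children', 0)` guard nor a check for a "#text" tagname to stop the recursion.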
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From tony@lsl.co.uk Wed Oct 24 09:40:19 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Wed, 24 Oct 2001 09:40:19 +0100 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <B7FB9A90.195A8%goodger@users.sourceforge.net> Message-ID: <002901c15c67$8340e660$545aa8c0@lslp862.int.lsl.co.uk> David Goodger wrote: > If you insert methods from outside the module, yes, you *would* be > presumptuous. Ah, but the advantage we have with the DPS nodes tree is that one *can* be presumptuous - it doesn't get one very far with "classic" DOM! For clarification, my (quick and dirty) HTML output mode works with a dictionary that links tag name to method name [1]_, with a special case for "#text" ('cos it's special). Thus it doesn't actually assume much about what includes what (although things will go strange if the structure is *not* what one would expect from the spec). Since I rather want a Writer class that can be customised easily, this dictionary approach seems simplest for the initial development - it's easy to subclass the Writer and just amend the dictionary entries. .. [1] Actually, the tag name keys a method name and some optional arguments, but that's enough detail for this discussion. I'm sure some visitor model would be nice at some stage, though - once one has understood the thing (!) it works very nicely (cf. the ``compiler`` package). Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "Bounce with the bunny. Strut with the duck. Spin with the chickens now - CLUCK CLUCK CLUCK!" BARNYARD DANCE! by Sandra Boynton My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) 
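
The dictionary-driven writer described in the message above could look roughly like the following sketch. It assumes a made-up `Node` class and made-up method names (`write_children`, `write_paragraph`, and so on); none of this is taken from the actual pydps or DPS code, and escaping of HTML special characters is left out for brevity.

```python
class Node:
    """Hypothetical minimal node; not the real dps.nodes classes."""

    def __init__(self, tagname, children=(), data=""):
        self.tagname = tagname
        self.children = list(children)
        self.data = data


class Writer:
    # Maps tag name -> method name; subclasses amend this table.
    dispatch = {
        "document": "write_children",
        "paragraph": "write_paragraph",
        "emphasis": "write_emphasis",
    }

    def write(self, node):
        if node.tagname == "#text":        # the special case
            return node.data
        # Unknown tags fall back to simply writing their children.
        name = self.dispatch.get(node.tagname, "write_children")
        return getattr(self, name)(node)

    def write_children(self, node):
        return "".join(self.write(child) for child in node.children)

    def write_paragraph(self, node):
        return "<p>" + self.write_children(node) + "</p>\n"

    def write_emphasis(self, node):
        return "<em>" + self.write_children(node) + "</em>"


class StrongWriter(Writer):
    # Copy the inherited table, then change a single entry.
    dispatch = dict(Writer.dispatch, emphasis="write_strong")

    def write_strong(self, node):
        return "<strong>" + self.write_children(node) + "</strong>"


tree = Node("document", [
    Node("paragraph", [
        Node("#text", data="Some "),
        Node("emphasis", [Node("#text", data="text")]),
    ]),
])
default_html = Writer().write(tree)
custom_html = StrongWriter().write(tree)
print(default_html, custom_html, sep="")
```

Customising the output then means subclassing and amending one dictionary entry, without touching the tree-walking code.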
From tony@lsl.co.uk Wed Oct 24 09:55:45 2001
From: tony@lsl.co.uk (Tony J Ibbs (Tibs))
Date: Wed, 24 Oct 2001 09:55:45 +0100
Subject: [Doc-SIG] Representation of Horizontal Rules
In-Reply-To: <B7F61A08.193D3%goodger@users.sourceforge.net>
Message-ID: <002a01c15c69$aaf15a30$545aa8c0@lslp862.int.lsl.co.uk>

David Goodger wrote:
> I'm leaning towards solution #2. Those of you working with the
> document tree, writing code or style sheets, please let me know if
> this will cause problems beyond minor revisions.]

Hmm. OK, I can't see that adding "division" as yet another structure within the document would cause me *problems*, but...

I'll reinsert the quote from the Chicago Manual of Style again, 'cos it seems to be David's inspiration (or, at least, support in his argument):

    Instead of subheads, extra space or a type ornament between paragraphs may be used to mark text divisions or to signal changes in subject or emphasis.

However, personally I would lean strongly to option 3. This is mainly because I "see" such an ornament as a typographic element, more than a structural element - a sort of "gross ellipsis", a giant semicolon, a pause in the flow of the text (which I take to be the meaning of the second alternative in the Chicago Manual's description).

David says:
> A transition isn't a thing itself

and I think it's there that we disagree!

*But* I clearly see that David's argument is supported by the "instead of subheads" clause, and if he (as our benevolent designer(!)) finds that this pushes towards option 2, then I shan't cavil too much.

> 2. Treating "divisions" uniformly brings us to the second
>    possibility::
>
>        <document>
>            <section name="document">
>                <title>
>                    Document
>                <division>
>                    <paragraph>
>                        Paragraph
>                <division>
>                    <paragraph>
>                        Paragraph
>
>    With this change, documents and sections will directly contain
>    divisions and sections, but not body elements. Only divisions will
>    directly contain body elements.
Even without a horizontal rule > anywhere, the body elements of a document or section would be > contained within a division element. This makes the document tree > deeper. This is similar to the way HTML treats document contents: > grouped within a <BODY> element. There's an advantage in this for me, in fact. I want to be able to indicate that a paragraph *after* a title is special (specifically, when doing the title for package, module and class sections). Being able to enclose such a paragraph (or paragraphs) within a division [1]_ makes it *much* easier to do what I want, and without having to add any new classes to the DPS node tree. So although I may object (slightly) to the proposal on the grounds David wants it, I do like the idea for subtly different reasons. .. [1] For my purposes, I *think* all elements will be sprouting the optional presence of a "style" (or some such) attribute, making it easier to indicate (for instance) that this division is being used for "detail" about the module, etc. But more on this if/when it happens... Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From gustav@morpheus.demon.co.uk Wed Oct 24 22:31:08 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Wed, 24 Oct 2001 22:31:08 +0100 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <B7FB9A90.195A8%goodger@users.sourceforge.net> References: <e659tt8331819fv5s646isknugofh62p9t@4ax.com> <B7FB9A90.195A8%goodger@users.sourceforge.net> Message-ID: <4e0ett030c7jqgvjs1s747o94jvoj4dhn9@4ax.com> On Tue, 23 Oct 2001 22:07:14 -0400, David Goodger <goodger@users.sourceforge.net> wrote: >Paul Moore wrote: >> a. I'm inserting new methods into the node classes, which is a little >> presumptuous of me. 
>
>If you insert methods from outside the module, yes, you *would* be
>presumptuous. If you edit the nodes.py module and submit a patch,
>though, you'll be a contributing developer!

Fair enough :-) The real point, I guess, was that I wasn't sure if what I was doing counted as a prototype for a generally useful feature, or just a one-off piece of code for the particular task I was attempting. On reflection, it looks like a general thing (basically, *all* writers will need to tree-walk the nodes, so it's general...)

>> b. I need to know the class names of the node classes, which seems to
>> be too tied to implementation specifics.
>
>The document tree is meant to be a specific document/DTD/schema
>implementation only, not a generic DOM.

I'm not sure I understand what you mean here. That's probably because I know almost nothing of XML, and in particular, I find the terminology confusing (what do you mean by a "generic DOM"?) Regardless, the point is probably moot, as this looks more like a prototype for a patch (which means that being tied to the implementation details is emphatically *not* an issue) as opposed to a separate module.

>> Nevertheless, it's a pretty good tree walking model. It might be nice
>> to have this in the node classes, except that there may not be a
>> single "correct" walk order (a bit like the normal preorder,
>> postorder, inorder issues).
>
>I don't know how you'd do inorder with document trees. ;-)

OK, so I got a bit over-excited there :-)

>It's most useful to have a hook on the way in and on the way out of an
>element. I'm sure we can come up with a useful set of methods, perhaps
>without having to reinvent the wheel, using polymorphism and without
>resorting to magic. SAX comes to mind. Suggestions, references, and/or
>patches welcome.

Yes, (what little I know of) SAX struck me as relevant.
I tried looking in the Python library manual (not XML/SAX specific, I know, but the only documentation I had to hand) and that seemed to imply that SAX was related to the parsing end of things rather than to the tree-walking end. There were some vague hints about tree walker classes and visitors in the PyXML modules, but unless I missed it, there's virtually no documentation for those modules, so I couldn't find out any more. Question: This document model isn't a "real" XML DOM (by my reading of your comments). So we end up reinventing technologies like DOM tree walkers, etc. My naive reaction is "why aren't we using the XML DOM, then, so we get this sort of thing for free"? Presumably there *are* good reasons - I'd be interested to know what they are. (The biggest sign of a problem, I suspect, would be if the asdom() method was heavily used, implying that people habitually generated a "real" DOM to handle this thing - presumably because of failings in the DPS DOM). >> .. [Offtopic] I would normally indent this list if I was writing >> "plain" (ie, no markup) text. I'm not sure what effect such >> indentation would have on reST. I get the impression I'd get an >> extra "blockquote" element that I didn't want. > >Correct. Experimentation had confirmed this for me. >> Is this harmful? > >Depends on what you mean by "harmful". :-) "Not what I want" :-) Seriously, I guess the question is how "blockquote" elements are going to be visibly marked up in the various output formats. Until we actually *have* some more formats, the question is hard to answer. >> Can it be avoided? In "plain" text, I *really* prefer the look >> of lists when they are indented. > >A transform could be written that looks for a block quote containing >only a list, and extracts the list from within the block quote. The >spec could be changed to specify that this will happen. But what if we >*want* a list inside a block quote? How else would we write it? I think that's basically my point. 
A lot depends on what a block quote is intended to signify. I view blockquotes as basically a way of displaying a quotation, or something similar, without the "..." around it. As such, it would be a pretty rare thing. In LaTeX, I'd use the "quote" environment for this. My copy of "HTML - The Definitive Guide" says that <blockquote> tags cause the contents to be set off from the main text, usually with indented right and left margins, and *sometimes in italicised typeface* (my emphasis). And it does point out that it's intended for quotes. This isn't the sort of thing I'd want to use often...

Maybe the blockquote element in the DPS model should be redefined (or just better defined) to clarify the intended use. I can see a number of possibilities:

- It is intended for block quotations, and so the HTML <blockquote> and LaTeX quote environments are appropriate. I can't see this form getting much use.
- It is for general text which is indented on both left and right. This would be more useful, for text to "stand out" from the surroundings. But it doesn't match the HTML <blockquote> tag - so there may be problems implementing this in a HTML writer.
- It is for text indented on the left only. This is something people actually do a lot. It also matches the look of the source text, so it's fairly easy to understand. But again, it may be harder to implement, and it smacks of "visual" formatting, rather than "logical" formatting.

>> Sorry, this has all turned into a bit of a brain-dump. But that's
>> probably because I'm feeling that I'm having to invent something
>> that I expected to be part of the basic infrastructure.
>
>Great value for the money though!

Oh, definitely!

>
>> Is it simply that no-one's got to the point of needing this
>> implemented yet?
>
>In my case, yes. I haven't tackled the output end yet.

I think that's the problem. I can't conceptualise the data structure in isolation - I need to grasp how it translates into real output.
That's not a criticism of what you've done, it's just that I work differently. And I need to get a handle on getting stuff out of the object model before I can completely understand it. That's a real chicken-and-egg problem, though, as I'm trying to understand the model so that I can implement an output program :-) >What are you trying to do exactly? My interest is half-piqued. Provide >me a bit of stimulus and it may become fully-piqued. Write an output processor, at the most basic level. More specifically, write a framework with which output processors can be built fairly simply. My first concrete implementation is likely to be some sort of dump of a summary of the structure - tweakable, so that different levels of structure can be seen. With that, I'll be able to explore the model for a given document. Once I get that, I'm looking to implement a LaTeX output processor. It shouldn't be hard to do others (such as yet another HTML writer). If it is, I'll have done something wrong... One thing that has already become clear to me is that it will be *far* easier to write output processors for "structural" markup languages (HTML, (La)TeX, DocBook, Texinfo, etc) than for "layout" oriented languages (PDF, PostScript, etc). Sufficiently so that I doubt it will ever be realistic to go direct to such formats - you'd have to implement line and page breaking algorithms, etc, etc. >> Thanks for listening, > >Any time. Cheaper than psychotherapy. Gibber, gibber... Paul. 
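
The "tweakable" structure dump mentioned in the message above might be sketched as follows. The tree here is a made-up representation (a `(tagname, children)` tuple per element, a plain string per text node) chosen so the example is self-contained; it is not how the DPS document tree is actually stored, and the `dump` function and its `max_depth` parameter are hypothetical names.

```python
def dump(node, max_depth=None, depth=0):
    """Return outline lines for a tree of (tagname, children) tuples.

    Text nodes are plain strings.  Elements nested deeper than
    max_depth are summarised with a trailing " ..." marker.
    """
    indent = "  " * depth
    if isinstance(node, str):
        return [indent + repr(node)]
    tagname, children = node
    lines = [indent + "<" + tagname + ">"]
    if max_depth is not None and depth >= max_depth:
        if children:
            lines[-1] += " ..."      # children exist but are not shown
        return lines
    for child in children:
        lines.extend(dump(child, max_depth, depth + 1))
    return lines


tree = ("document", [
    ("section", [
        ("title", ["Document"]),
        ("paragraph", ["Paragraph"]),
    ]),
])
full = dump(tree)
shallow = dump(tree, max_depth=1)
print("\n".join(full))
print("\n".join(shallow))
```

Raising or lowering `max_depth` shows more or less of the structure, which is one way to explore the model for a given document before writing a real output processor.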
From gustav@morpheus.demon.co.uk Wed Oct 24 22:31:11 2001
From: gustav@morpheus.demon.co.uk (Paul Moore)
Date: Wed, 24 Oct 2001 22:31:11 +0100
Subject: [Doc-SIG] How to traverse a document object
In-Reply-To: <002901c15c67$8340e660$545aa8c0@lslp862.int.lsl.co.uk>
References: <B7FB9A90.195A8%goodger@users.sourceforge.net> <002901c15c67$8340e660$545aa8c0@lslp862.int.lsl.co.uk>
Message-ID: <nfbettkp28dg98nt9sjf5uld9utrc4hgun@4ax.com>

On Wed, 24 Oct 2001 09:40:19 +0100, "Tony J Ibbs (Tibs)" <tony@lsl.co.uk> wrote:

>David Goodger wrote:
>> If you insert methods from outside the module, yes, you *would* be
>> presumptuous.
>
>Ah, but the advantage we have with the DPS nodes tree is that one *can*
>be presumptuous - it doesn't get one very far with "classic" DOM!

Hmm. Why not - it's a feature of the language (Python) that you can insert methods into predefined classes. That applies to DPS node classes or XML DOM classes equally. You need to know some details of the implementation, but that's where the presumption comes in... (And whether it's a *good* thing is a separate point).

>For clarification, my (quick and dirty) HTML output mode works with a
>dictionary that links tag name to method name [1]_, with a special case
>for "#text" ('cos it's special). Thus it doesn't actually assume much
>about what includes what (although things will go strange if the
>structure is *not* what one would expect from the spec).

Yes, I looked at your stuff and saw that. It's nice and general. There's some "feel" to it that still feels ad hoc to me. Can't put my finger on it, though. It's one of those cases where I suspect there's a "clearly right" solution, and all of the other options lack that "spark" of clarity and "obvious rightness". Om...

>Since I rather want a Writer class that can be customised easily, this
>dictionary approach seems simplest for the initial development - it's
>easy to subclass the Writer and just amend the dictionary entries.

I hadn't thought of subclassing and changing the dictionary. Yes, that makes it nicely reusable. >I'm sure some visitor model would be nice at some stage, though - once >one has understood the thing (!) it works very nicely (cf. the >``compiler`` package). The visitor pattern is nice. However, it relies on co-operation from the class being visited, and getting the form of that co-operation right is hard (you're designing a general tree-walk framework, often without knowing all the ways it might need to be used - after all, you can't modify it in response to client requirements, that's the point.) Paul. From gustav@morpheus.demon.co.uk Wed Oct 24 22:31:12 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Wed, 24 Oct 2001 22:31:12 +0100 Subject: [Doc-SIG] Output writers and XSLT In-Reply-To: <B7FB9A90.195A8%goodger@users.sourceforge.net> References: <e659tt8331819fv5s646isknugofh62p9t@4ax.com> <B7FB9A90.195A8%goodger@users.sourceforge.net> Message-ID: <p4cett0p3a1j4n5qcbnmof6avia656pplb@4ax.com> On Tue, 23 Oct 2001 22:07:14 -0400, David Goodger <goodger@users.sourceforge.net> wrote: >> Can it be avoided? In "plain" text, I *really* prefer the look >> of lists when they are indented. > >A transform could be written that looks for a block quote containing >only a list, and extracts the list from within the block quote. The >spec could be changed to specify that this will happen. But what if we >*want* a list inside a block quote? How else would we write it? You just said "transform". That made me think - we can *already* get XML out of reStructuredText. Given that, can we not use XSLT to generate whatever output formats we need? (You can tell I don't know much about XML, but I've read the brochures, can't you :-) Seriously, what little I know of XSLT implies that it might be possible to do this. I know someone (can't recall the reference just now) posted that they have reST->HTML code which goes via XML and XSLT. Is that a viable general approach? 
Does Python include XSLT processing modules? In the PyXML package, maybe? I dunno. Part of me feels that this is the way XML always works - lots of concepts which help people to think about problems, and lots of technology which never actually gets used to do the job, but which provides ideas to allow people to re-implement bits of it as needed. Cynically y'rs Paul. From Juergen Hermann" <jh@web.de Wed Oct 24 23:32:49 2001 From: Juergen Hermann" <jh@web.de (Juergen Hermann) Date: Thu, 25 Oct 2001 00:32:49 +0200 Subject: [Doc-SIG] Output writers and XSLT In-Reply-To: <p4cett0p3a1j4n5qcbnmof6avia656pplb@4ax.com> Message-ID: <m15wWXx-007qefC@smtp.web.de> On Wed, 24 Oct 2001 22:31:12 +0100, Paul Moore wrote: >I dunno. Part of me feels that this is the way XML always works - lots >of concepts which help people to think about problems, and lots of >technology which never actually gets used to do the job, but which >provides ideas to allow people to re-implement bits of it as needed. Or you know how to use the stuff, and then you get things like this: http://purl.net/wiki/python/TeudViewer?module=dps And all this with _assembling_ the wheels, not reinventing them. ;) Ciao, Jürgen From goodger@users.sourceforge.net Thu Oct 25 03:46:22 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 24 Oct 2001 22:46:22 -0400 Subject: [Doc-SIG] Representation of Horizontal Rules In-Reply-To: <002a01c15c69$aaf15a30$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: <B7FCF53D.195CB%goodger@users.sourceforge.net> > David Goodger wrote: >> I'm leaning towards solution #2. Tony J Ibbs (Tibs) wrote: > However, personally I would lean strongly to option 3. If I squint my left eye, I can see both sides of the argument. Once again, it's a matter of perception. Take a look at a novel or short story, and locate the "transitions", extra vertical whitespace or a line of three well-spaced asterisks.
Is this construct merely the border between two segments of text, or is it an object in its own right, a vertical ellipsis? I tried to find a DTD containing such a construct in either sense, to establish precedent. Neither DocBook nor TEI contains such a beast. Does anybody know of a publicly-available DTD suitable for the markup of typical prose, such as a novel? [Tony, re option 2] > There's an advantage in this for me, in fact. I want to be able to > indicate that a paragraph *after* a title is special (specifically, when > doing the title for package, module and class sections). Being able to > enclose such a paragraph (or paragraphs) within a division [1]_ makes it > *much* easier to do what I want, and without having to add any new > classes to the DPS node tree. So although I may object (slightly) to the > proposal on the grounds David wants it, I do like the idea for subtly > different reasons. Is this misguided embrace of option 2 really just an attempt to force me to choose option 3, in order to foil such a blatant abuse of markup? Tricky! But you can't fool me. I'm on to you... -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Thu Oct 25 04:46:10 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 24 Oct 2001 23:46:10 -0400 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <4e0ett030c7jqgvjs1s747o94jvoj4dhn9@4ax.com> Message-ID: <B7FD0095.195CC%goodger@users.sourceforge.net> [David] >> The document tree is meant to be a specific document/DTD/schema >> implementation only, not a generic DOM. [Paul] > I'm not sure I understand what you mean here. ... > (what do you mean by a "generic DOM"?) DOM is a generic XML data structure.
It contains an ``Element`` class (among others), whose instances
represent all elements. If you want to store a ``list`` element, it
would be an ``Element`` instance whose ``tagName`` attribute was set to
"list". It's not very useful from an object-oriented programming point
of view; you have to switch on the ``tagName`` attribute instead of
using polymorphism.

Another way to put it is to use XML itself as a model. A proper XML
fragment might look like this::

    <list>
        <item>
            <paragraph>
                Item one.
            </paragraph>
        </item>
    </list>

You could just as easily represent the above with a single element,
"element", in this abomination::

    <element tagName="list">
        <element tagName="item">
            <element tagName="paragraph">
                Item one.
            </element>
        </element>
    </element>

Think of each element as a class instance in the data structure, where
the tag name is equivalent to the class, and you'll see the difference.
Proper XML is to the abomination what an application-specific class
library is to DOM.

DOM *is* useful though. The reason DOM *is* used is because it *is*
generic. You don't have to write up an application-specific class
library just to represent an arbitrary data structure. The dps.nodes
classes can only represent a DPS doc tree, nothing else. DOM can
represent *any* XML instance.

> Question: This document model isn't a "real" XML DOM (by my reading of
> your comments). So we end up reinventing technologies like DOM tree
> walkers, etc. My naive reaction is "why aren't we using the XML DOM,
> then, so we get this sort of thing for free"?

It's free, yes, but the cost is too high. It depends on how you want to
build the data structure, and what you want to do with the data
structure once it's complete. In most XML-processing applications, you
parse an already-existing XML file to a data structure, for which DOM is
a valid choice.
The reStructuredText parser is *building* a document tree piecemeal, and it's easier and more powerful to say ``node = nodes.list()`` than it is to say ``node = minidom.Element("list")``, especially when you can customize the ``nodes.list`` class with specialized behaviour. As for processing the data structure once complete, I haven't done much yet but I'm sure there will be advantages if it's made up of custom objects. > (The biggest sign of a problem, I suspect, would be if the asdom() method was > heavily used, implying that people habitually generated a "real" DOM to handle > this thing - presumably because of failings in the DPS DOM). (Let's call it the DPS doc tree, to avoid misunderstandings.) Even if ``asdom()`` is called every time, it's still a win as far as I'm concerned. It's dirt easy to turn a DPS doc tree into a DOM tree, and the effort involved in coding the ``asdom()`` transformation has paid off many times over in the simplicity of the doc tree creation code, like ``node = nodes.list()``. > I view blockquotes as basically a way of displaying a quotation, or something > similar, without the "..." around it. Correct. > As such, it would be a pretty rare thing. I've used block quotes many times. For example, see my 2001-10-17 Doc-SIG post, "horizontal rules & text divisions". Used block quotes twice. > This isn't the sort of thing I'd want to use often... I decided to include block quotes in reStructuredText early on. StructuredText and Setext didn't have them; they both used simple indentation for structural purposes (sections). I believe block quotes are a generally useful construct. I'm *always* quoting stuff. I may not be a typical user, but then again I have an "in" with the guy who wrote the spec. ;-) > Maybe the blockquote element in the DPS model should be redefined (or > just better defined) to clarify the intended use. 
I'm starting to write a document defining the roles of each of the DPS doc tree elements, independently of the markup syntax. I've just barely begun. It's available at http://docstring.sourceforge.net/spec/doctree.txt. > I can see a number of possibilities: > > - It is intended for block quotations, and so the HTML <blockquote> and > LaTeX quote environments are appropriate. I can't see this form > getting much use. I can. And that is the intended role. > - It is for general text which is indented on both left and right. ... > - It is for text indented on the left only. ... These are presentation issues, not descriptive markup ones. > One thing that has already become clear to me is that it will be *far* > easier to write output processors for "structural" markup languages > (HTML, (La)TeX, DocBook, Texinfo, etc) than for "layout" oriented > languages (PDF, PostScript, etc). Sufficiently so that I doubt it will > ever be realistic to go direct to such formats - you'd have to implement > line and page breaking algorithms, etc, etc. If we have a TeX Writer, PostScript and PDF are almost free. Can't wait! -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Thu Oct 25 04:49:14 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 24 Oct 2001 23:49:14 -0400 Subject: [Doc-SIG] Re: Output writers and XSLT In-Reply-To: <p4cett0p3a1j4n5qcbnmof6avia656pplb@4ax.com> Message-ID: <B7FD03F9.195CF%goodger@users.sourceforge.net> Paul Moore wrote: > we can *already* get XML > out of reStructuredText. Given that, can we not use XSLT to generate > whatever output formats we need? Sure can. Remi Bertholet, Paul Wright, and Alan Jaffray already have. See the reStructuredText sandbox for their .xsl style sheets.
> Is that a viable general approach? Maybe. Depends what you mean by "viable general". > Does Python include XSLT processing modules? In the PyXML package, maybe? In PyXML, yes, I believe so. But not in the standard library yet. Which rules it out as a general solution for now. Eventually, I want the "docutils" (umbrella package including DPS or a renaming of DPS) to be part of the standard library, so it can't use stuff from outside. If PyXML were to be included in the stdlib, though... -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From tony@lsl.co.uk Thu Oct 25 09:45:12 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Thu, 25 Oct 2001 09:45:12 +0100 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <B7FD0095.195CC%goodger@users.sourceforge.net> Message-ID: <002c01c15d31$5c04be80$545aa8c0@lslp862.int.lsl.co.uk> David Goodger and Paul Moore were discussing various things: David: > DOM *is* useful though. The reason DOM *is* used is because > it *is* generic. I rather like DOM - it's a bit clunky, but it is generic, and it provides a useful (to me) hook for thinking about (some properties of) XML [the fun, of course, is when you're thinking of XML as tree and "list" at the same time!]. But it is deliberately made so that one does *not* subclass its elements - when constructing a DOM tree you have (well, that's the theory) to ask the document instance to give you new nodes, and so one is restricted to the classes that are already defined [1]_. This means that, as David says, you have to use attribution to discriminate between elements, which is not very convenient for *specific* programming. .. 
[1] there are attempts (sorry, don't have a reference to hand) to automatically "mirror" a DOM tree by Python classes which have "nicer" names, I believe, but that has its own problems. David again: > The reStructuredText parser is *building* a document > tree piecemeal, and it's easier and more powerful to > say ``node = nodes.list()`` than it is to say > ``node = minidom.Element("list")``, As I'm currently working on code to build a DPS tree, I can vouch that it is easier *with* the DPS tree than it would be with DOM - not least because if I can abstract useful building concepts out of what I'm doing, I can submit a patch to David which will make it easier for everyone to do such stuff. Also, as I understand it, in a proper DOM you aren't meant to ask the package to construct an object for you - it is meant to be the document instance that knows how to do this. But I can't be bothered to look up the "correct manner". Then we move on to blockquotes: Paul (and then David) wrote: > > I view blockquotes as basically a way of displaying a > > quotation, or something similar, without the "..." > > around it. > > Correct. > > > As such, it would be a pretty rare thing. Depends entirely on one's application area! I still have plans to start a Wiki next year, using reST as its text format, following on from an apazine/fanzine I used to produce, which discusses matters relating to reading/books/etc. In *that* context, block quotes are quite likely to occur - both quoting external sources, and also quoting other people within discussions. *However*, notwithstanding the HTML "standard"s description, I've yet to see a browser that distinguishes the first two uses of block quotes - that is, quotation versus inset text - by using emphasis. It's another instance, I think, of browser writers following each other rather than the HTML text.
Meanwhile, I happily use blockquotes a lot to inset text for "commentary" purposes - something that is difficult to convey otherwise (of course, I also use parentheses and em-dashes a lot in writing text, so perhaps I'm not a perfect exemplar). David: > I may not be a typical user, but then again I have an "in" with the guy who > wrote the spec. ;-) !!!!! > I'm starting to write a document defining the roles of each > of the DPS doc tree elements, independently of the markup syntax. > I've just barely begun. > It's available at http://docstring.sourceforge.net/spec/doctree.txt. By the way, this is a wonderful thing to have started, even if it is in its early days yet. Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ Give a pedant an inch and they'll take 25.4mm (once they've established you're talking a post-1959 inch, of course) My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From tony@lsl.co.uk Thu Oct 25 09:48:38 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Thu, 25 Oct 2001 09:48:38 +0100 Subject: [Doc-SIG] Representation of Horizontal Rules In-Reply-To: <B7FCF53D.195CB%goodger@users.sourceforge.net> Message-ID: <002d01c15d31$d6c28ee0$545aa8c0@lslp862.int.lsl.co.uk> David Goodger wrote: > Is this misguided embrace of option 2 really just an attempt > to force me to choose option 3, in order to foil such a > blatant abuse of markup? Tricky! > But you can't fool me. I'm on to you... Aha - no, it was actually a sneaky attempt to *accept* option 2 so that I can subvert it for entirely the wrong purposes at a later date - but I'll have to be careful if you're getting *that* close... Tibs -- BikeCode0.2 http://www.tibsnjoan.co.uk/bikecode.html P: [Tibs] Tc B10 K:++ i29:30" h1.65m n1960 H+:~ v~ A+ M+ Rg- B: [AnthroTech] 3tRu U1c w37" Wr19:406 Mfr SAf bDh[Sachs]:C G3x7 8s Lrr1B Cb[Michael] VjsX col[MidnightBlue] T: [BurleyD'Lite] 2c2[Thomas] f++ VsX My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
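The contrast drawn earlier in this thread, between a generic DOM whose
elements are told apart by ``tagName`` and an application-specific class
library that can use polymorphism, can be sketched roughly as follows.
This is only an illustration: the ``Node``, ``list_``, ``item`` and
``paragraph`` classes are hypothetical stand-ins, not the real dps.nodes
classes, and ``describe``/``describe_dom`` are invented for the example.
The DOM half uses the standard library's ``xml.dom.minidom`` and asks the
document instance for new nodes, as Tibs describes.

```python
from xml.dom import minidom

# Generic DOM: every element is an Element; new nodes are requested from
# the document instance, and the element type lives in the tagName string.
doc = minidom.Document()
dom_list = doc.createElement("list")
dom_item = doc.createElement("item")
dom_para = doc.createElement("paragraph")
dom_para.appendChild(doc.createTextNode("Item one."))
dom_item.appendChild(dom_para)
dom_list.appendChild(dom_item)
doc.appendChild(dom_list)


def describe_dom(element):
    """Discriminate between element types by switching on tagName."""
    if element.tagName == "list":
        return "a list with %d item(s)" % len(element.childNodes)
    if element.tagName == "paragraph":
        return "a paragraph"
    return "some other element"


# Application-specific classes (hypothetical, NOT the real dps.nodes).
class Node:
    def __init__(self, *children):
        self.children = list(children)

    def describe(self):
        return "some other element"


class list_(Node):  # trailing underscore only to avoid shadowing the builtin
    def describe(self):
        return "a list with %d item(s)" % len(self.children)


class item(Node):
    pass


class paragraph(Node):
    def describe(self):
        return "a paragraph"


# Building the tree is a single nested expression; behaviour is chosen by
# the class of each node rather than by inspecting a string attribute.
tree = list_(item(paragraph("Item one.")))

dom_description = describe_dom(dom_list)
tree_description = tree.describe()
```

Both halves produce the same description for the same one-item list; the
difference is where the knowledge of "what kind of element is this" lives.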
From fdrake@acm.org Thu Oct 25 16:56:18 2001 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 25 Oct 2001 11:56:18 -0400 (EDT) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011025155618.C21CA28697@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Experimental stylesheet changes -- code is now presented using proportional fonts (mostly since Guido really dislikes Courier). Please send comments on this change to me at python-docs@python.org. From gustav@morpheus.demon.co.uk Thu Oct 25 22:35:42 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Thu, 25 Oct 2001 22:35:42 +0100 Subject: [Doc-SIG] Re: Output writers and XSLT In-Reply-To: <B7FD03F9.195CF%goodger@users.sourceforge.net> References: <p4cett0p3a1j4n5qcbnmof6avia656pplb@4ax.com> <B7FD03F9.195CF%goodger@users.sourceforge.net> Message-ID: <hlogttcnakss6hrl2hb80aeu6cihla7kht@4ax.com> On Wed, 24 Oct 2001 23:49:14 -0400, David Goodger <goodger@users.sourceforge.net> wrote: >Paul Moore wrote: >> we can *already* get XML >> out of reStructuredText. Given that, can we not use XSLT to generate >> whatever output formats we need? > >Sure can. Remi Bertholet, Paul Wright, and Alan Jaffray already have. See >the reStructuredText sandbox for their .xsl style sheets. Interesting... The XSL style sheets don't look over complicated, which is clearly a good thing. But it is another language to learn. >> Is that a viable general approach? > >Maybe. Depends what you mean by "viable general". By "general", I specifically meant can it be used for output formats other than HTML? By "viable", I was thinking of whether it would be available to everyone - which comes down to the next point... >> Does Python include XSLT processing modules? In the PyXML package, maybe? > >In PyXML, yes, I believe so. But not in the standard library yet. Which >rules it out as a general solution for now.
Eventually, I want the >"docutils" (umbrella package including DPS or a renaming of DPS) to be part >of the standard library, so it can't use stuff from outside. If PyXML were >to be included in the stdlib, though... In other words, packages aimed at the core cannot use technologies which are not part of the core, even if such technologies are available in Python, and are appropriate. Grmph. I see your point (although I wish I didn't). OK, so it sounds like XSLT is not a "viable general approach", although docutils can usefully be a good showcase for its capabilities. ("Look - we implemented an output processor in only 100 lines of XSLT code"). I'll continue to look at tree-walking solutions. BTW, in my view, the key output formats which need to be supported are HTML and at least one TeX derivative (probably either LaTeX or Texinfo). HTML for web output, and TeX for hardcopy (PDF via PDFTeX, PostScript, general printed formats via DVI). If and when we have these 2 formats, we can (IMHO) say we have the output side of things covered. Does this seem right? Paul (posting because it takes less brain power than coding...). From gustav@morpheus.demon.co.uk Thu Oct 25 22:35:44 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Thu, 25 Oct 2001 22:35:44 +0100 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <B7FD0095.195CC%goodger@users.sourceforge.net> References: <4e0ett030c7jqgvjs1s747o94jvoj4dhn9@4ax.com> <B7FD0095.195CC%goodger@users.sourceforge.net> Message-ID: <89qgttgkk7sh6t9s9vidrpeumt7tr73bd1@4ax.com> On Wed, 24 Oct 2001 23:46:10 -0400, David Goodger <goodger@users.sourceforge.net> wrote: >[David] >>> The document tree is meant to be a specific document/DTD/schema >>> implementation only, not a generic DOM. > >[Paul] >> I'm not sure I understand what you mean here. ... >> (what do you mean by a "generic DOM"?) > >DOM is a generic XML data structure.
It contains an ``Element`` class (among >others), whose instances represent all elements. If you want to store a >``list`` element, it would be an ``Element`` instance whose ``tagName`` >attribute was set to "list". It's not very useful from an object-oriented >programming point of view; you have to switch on the ``tagName`` attribute >instead of using polymorphism. Got you. That makes complete sense. But why does all my tree walking stuff (and Tibs') spend its time switching on tagname then? Actually, I know the answer to this - coding a "proper" object-oriented tree-walk is hard. It's the sort of thing the Visitor pattern is intended to handle, but as I pointed out in another message, Visitor relies on the right infrastructure being in place in the "visited" hierarchy - and designing that infrastructure is hard. It's probably significant that most of the classes in the DPS doc tree are of the form class paragraph(_TextElement): pass There's no polymorphism here. The class name is *only* relevant in setting the tagname attribute via introspection. In many ways, this tree isn't really object-oriented at all. Let me come back to this later_. >It's free, yes, but the cost is too high. It depends on how you want to >build the data structure, and what you want to do with the data structure >once it's complete. In most XML-processing applications, you parse an >already-existing XML file to a data structure, for which DOM is a valid >choice. The reStructuredText parser is *building* a document tree piecemeal, >and it's easier and more powerful to say ``node = nodes.list()`` than it is >to say ``node = minidom.Element("list")``, especially when you can customize >the ``nodes.list`` class with specialized behaviour. .. _later: OK, I see the point. I was looking at using the tree, not building it. I agree that building trees using DOM is verbose and clumsy (I've seen code for it before).
So maybe building using specialised code, then using asdom() to get a DOM, which can then be processed by standard XML tools, is a valid approach. But as Juergen Hermann pointed out, asdom() is only one (trivial) example of a visitor pattern, so we probably need to factor out the visitor, and reimplement asdom in terms of it. I'll look at this. I do wonder about your comment "especially when you can customize the ``nodes.list`` class with specialized behaviour". Agreed, it's a valid advantage. But you don't *use* that advantage. The only significantly polymorphic aspects of nodes.py are the bits in support of asdom(). [The astext() method works polymorphically, but as I can't see where I might use this for other than a #text node, so making the polymorphism moot, I'm discounting this]. This isn't to criticise your design. I'm still trying to get a handle on it from an output point of view. I had started from the assumption that the DPS doc tree was pretty much inviolate, and I should work with it as it stands. It looks like there are probably changes needed to support output. I'm a bit nervous about fiddling with something that central, though... >As for processing the data structure once complete, I haven't done much yet >but I'm sure there will be advantages if it's made up of custom objects. That's a fairly clear confirmation that you believe that there is a need to incorporate changes in support of output :-) Paul.
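One way the "factor out the visitor" idea above might look is sketched
below. It is an assumption-laden illustration, not the DPS design: the
``Node.walk`` method, the ``Visitor`` base class with its
``default_visit``/``default_depart`` hooks, and the ``HTMLSketch`` writer
are all hypothetical names. The walker looks up ``visit_<tagname>`` and
``depart_<tagname>`` with ``getattr`` and falls back to a default, so a
writer only defines methods for the elements it cares about. HTML
escaping is deliberately left out to keep the sketch short.

```python
class Node:
    """Hypothetical minimal node; the tag name comes from the class name."""

    def __init__(self, *children):
        self.children = list(children)

    @property
    def tagname(self):
        return self.__class__.__name__

    def walk(self, visitor):
        """Call visit_<tagname>, recurse into children, call depart_<tagname>."""
        getattr(visitor, "visit_" + self.tagname, visitor.default_visit)(self)
        for child in self.children:
            if isinstance(child, Node):
                child.walk(visitor)
            else:
                visitor.visit_text(child)  # plain strings stand in for #text
        getattr(visitor, "depart_" + self.tagname, visitor.default_depart)(self)


# Node classes in the "class paragraph(_TextElement): pass" style.
class document(Node): pass
class paragraph(Node): pass
class emphasis(Node): pass


class Visitor:
    """Base visitor: does nothing unless a subclass overrides a method."""

    def default_visit(self, node): pass
    def default_depart(self, node): pass
    def visit_text(self, text): pass


class HTMLSketch(Visitor):
    """Toy writer: collects output fragments; no escaping is performed."""

    def __init__(self):
        self.out = []

    def visit_paragraph(self, node): self.out.append("<p>")
    def depart_paragraph(self, node): self.out.append("</p>")
    def visit_emphasis(self, node): self.out.append("<em>")
    def depart_emphasis(self, node): self.out.append("</em>")
    def visit_text(self, text): self.out.append(text)


tree = document(paragraph("Item ", emphasis("one"), "."))
writer = HTMLSketch()
tree.walk(writer)
html = "".join(writer.out)
```

Under this arrangement a new output format is a new ``Visitor`` subclass,
and the node classes need only the single generic ``walk`` method.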
[I'll comment on the blockquote stuff separately] From gustav@morpheus.demon.co.uk Thu Oct 25 22:35:46 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Thu, 25 Oct 2001 22:35:46 +0100 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <B7FD0095.195CC%goodger@users.sourceforge.net> References: <4e0ett030c7jqgvjs1s747o94jvoj4dhn9@4ax.com> <B7FD0095.195CC%goodger@users.sourceforge.net> Message-ID: <m6sgtt456rqqfatpcbn6gfd81enp0i40b2@4ax.com> On Wed, 24 Oct 2001 23:46:10 -0400, David Goodger <goodger@users.sourceforge.net> wrote: >> I view blockquotes as basically a way of displaying a quotation, or something >> similar, without the "..." around it. > >Correct. OK. >> As such, it would be a pretty rare thing. > >I've used block quotes many times. For example, see my 2001-10-17 Doc-SIG >post, "horizontal rules & text divisions". Used block quotes twice. Fair enough. It's a style thing. Your call, I guess. I suppose that as I tend to think of reST in the context of E-Mail (in this group), quotes aren't as common (because the standard E-Mail ">" quoting convention takes its place). >If we have a TeX Writer, PostScript and PDF are almost free. Can't wait! Your wish is my command... Paul. From gustav@morpheus.demon.co.uk Thu Oct 25 22:35:47 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Thu, 25 Oct 2001 22:35:47 +0100 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <002c01c15d31$5c04be80$545aa8c0@lslp862.int.lsl.co.uk> References: <B7FD0095.195CC%goodger@users.sourceforge.net> <002c01c15d31$5c04be80$545aa8c0@lslp862.int.lsl.co.uk> Message-ID: <nqugtt04ltco4rpidbi56tkorvdbjf4l1m@4ax.com> On Thu, 25 Oct 2001 09:45:12 +0100, "Tony J Ibbs (Tibs)" <tony@lsl.co.uk> wrote: >Meanwhile, I happily use blockquotes a lot to inset text for >"commentary" purposes - something that is difficult to convey otherwise >(of course, I also use parentheses and em-dashes a lot in writing text, >so perhaps I'm not a perfect exemplar).
That's going to come back to haunt you if another output processor
formats blockquotes more "quotishly".

My point was the reverse. I use indented text a lot in plain text, for
things like lists. I'm happy for reST processors to indent lists
automatically for me, but I don't like to lose the ability to format
reST source for readability. I prefer::

    This is how I would normally format a list in plain text:

        1. First item
        2. Second item

to this::

    This is how reST requires it, if I am to avoid spurious blockquotes:

    1. First item
    2. Second item

Maybe simply defining a special case - a blockquote which contains
nothing but a list, is treated as a simple list, without the blockquote.
Of course this makes it impossible to write a blockquote containing only
a list, and it *is* a special case, which is bad in itself.

I'm focussing on the fact that reST should be readable in its raw form,
as well as after processing. Maybe that's not the best view to take, but
I'm not sure how often reST will be read unprocessed. At the moment,
99.999% of the reST I see is unprocessed (not just due to the lack of
output processors - for example, I would never bother putting an E-Mail
through a formatter, I'd read it "raw". And I *like* using reST in
E-Mail - it enhances the expressiveness).

I dunno. Maybe it's simply a case where you have to accept that reST
can't be expected to handle "pure" plain text without *any* concessions
to markup needs...

Paul.

From goodger@users.sourceforge.net Fri Oct 26 01:36:07 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Thu, 25 Oct 2001 20:36:07 -0400
Subject: [Doc-SIG] Re: Output writers and XSLT
In-Reply-To: <hlogttcnakss6hrl2hb80aeu6cihla7kht@4ax.com>
Message-ID: <B7FE2836.196E9%goodger@users.sourceforge.net>

[David;]
>> If PyXML were to be included in the stdlib, though...
[Paul;] > In other words, packages aimed at the core cannot use technologies which > are not part of the core, even if such technologies are available in > Python, and are appropriate. Grmph. I see your point (although I wish I > didn't). I would certainly support a push for inclusion in the core. I can't be the champion though; enough on my plate as it is. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From fdrake@acm.org Fri Oct 26 04:13:46 2001 From: fdrake@acm.org (Fred L. Drake) Date: Thu, 25 Oct 2001 23:13:46 -0400 (EDT) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011026031346.8550628697@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Yet another experimental change to the presentation -- ditch the proportional font for code since some displays (especially interactive sessions and tabular displays) get messed up with proportional fonts. We do try to use monospaced fonts that are less ugly than Courier. As before, feedback on the fonts is welcome at python-docs@python.org. From goodger@users.sourceforge.net Fri Oct 26 04:21:55 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Thu, 25 Oct 2001 23:21:55 -0400 Subject: [Doc-SIG] How to traverse a document object In-Reply-To: <89qgttgkk7sh6t9s9vidrpeumt7tr73bd1@4ax.com> Message-ID: <B7FE4F12.196EF%goodger@users.sourceforge.net> Paul Moore wrote: > But why does all my tree walking stuff (and Tibs') spend its time switching > on tagname then? Actually, I know the answer to this - coding a "proper" > object-oriented tree-walk is hard. I don't know about that. I've got the inkling of an elegant structure glimmering in my mind (including a judicious use of a __getattr__ method for default behaviour). 
I just haven't needed it yet. Maybe you can beat me to it! > It's probably significant that most of the classes in the DPS doc tree > are of the form > > class paragraph(_TextElement): pass > > There's no polymorphism here. The class name is *only* relevant in > setting the tagname attribute via introspection. So far, perhaps so. There's a lot of infrastructure to be added yet. > I do wonder about your comment "especially when you can customize > the ``nodes.list`` class with specialized behaviour". Agreed, it's a > valid advantage. But you don't *use* that advantage. Not *yet*. > I had started from the assumption that the DPS doc tree was pretty much > inviolate, and I should work with it as it stands. Aha! Nothing could be further from the truth. You must leave such assumptions behind! > It looks like there are probably changes needed to support output. I'm > a bit nervous about fiddling with something that central, though... That's where SourceForge patches and peer review work their magic. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Fri Oct 26 04:25:38 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Thu, 25 Oct 2001 23:25:38 -0400 Subject: [Doc-SIG] lists & block quotes In-Reply-To: <m6sgtt456rqqfatpcbn6gfd81enp0i40b2@4ax.com> Message-ID: <B7FE4FF1.196EF%goodger@users.sourceforge.net> Paul Moore wrote: > I prefer:: > > This is how I would normally format a list in plain text: > > 1. First item > 2. Second item > > to this:: > > This is how reST requires it, if I am to avoid spurious blockquotes: > > 1. First item > 2. Second item Ah. You seem to see lists as "belonging" to the referring paragraph. That's one interpretation, perfectly valid. 
Representing that idea properly and completely makes for tricky content
models and processing though.

> I'm focussing on the fact that reST should be readable in its raw form,
> as well as after processing. Maybe that's not the best view to take, but
> I'm not sure how often reST will be read unprocessed.

Are you implying that reStructuredText *will* or *will not* be read much
in its unprocessed form? If the latter, I don't think there's an issue.

In typical HTML renderers, a list is indented relative to the text
before & after. That could be the reason why you prefer lists indented.

> I dunno. Maybe it's simply a case where you have to accept that reST
> can't be expected to handle "pure" plain text without *any* concessions
> to markup needs...

It's a balancing act. Can't please everybody all of the time. This is
one area where the balance has been properly struck though, I think.

--
David Goodger    goodger@users.sourceforge.net
Open-source projects:
- Python Docstring Processing System: http://docstring.sourceforge.net
- reStructuredText: http://structuredtext.sourceforge.net
- The Go Tools Project: http://gotools.sourceforge.net

From goodger@users.sourceforge.net Fri Oct 26 05:30:38 2001
From: goodger@users.sourceforge.net (David Goodger)
Date: Fri, 26 Oct 2001 00:30:38 -0400
Subject: [Doc-SIG] inline hyperlink targets
Message-ID: <B7FE5F2C.196F0%goodger@users.sourceforge.net>

While working on the outline of doctree.txt_, I wanted to categorize the
elements. My first try was with a nested list structure::

    - Root element: document_

    - Body elements:

      - General body elements: paragraph_, literal_block_,
        block_quote_, doctest_block_, table_, figure_, footnote_

      - Lists: bullet_list_, enumerated_list_, definition_list_,
        field_list_, option_list_

But I wanted to be able to refer back to each item from other parts of
the document.
That made me change the structure to one of nested sections:: Root Element ============ document_ Body Elements ============= General Body Elements --------------------- paragraph_, literal_block_, block_quote_, doctest_block_, table_, figure_, footnote_ Lists ----- bullet_list_, enumerated_list_, definition_list_, field_list_, option_list_ This solution rankles. Why let the markup determine the structure of my writing? That's backwards. But the best I could do with the current syntax would be to include a bunch of hyperlink targets. And in order not to break up the list, the targets would have to be inside the list items:: - .. _root element: Root element: document_ - .. _body element: Body elements: - .. _general body elements: General body elements: paragraph_, literal_block_, block_quote_, doctest_block_, table_, figure_, footnote_ - .. _lists: Lists: bullet_list_, enumerated_list_, definition_list_, field_list_, option_list_ Very awkward. Then I came up with the idea of allowing explicit hyperlink targets as inline markup:: - _`Root element`: document_ - _`Body elements`: - _`General body elements`: paragraph_, literal_block_, block_quote_, doctest_block_, table_, figure_, footnote_ - _`Lists`: bullet_list_, enumerated_list_, definition_list_, field_list_, option_list_ This seems to work. It can be rationalized as a natural consequence of the rest of reStructuredText's hyperlink syntax. I have two concerns: - This markup seems a little too noisy at first. But that may be because it's coming at the beginning of a list item, and is followed by a colon (which is *not* part of the markup). An example in the middle of a paragraph helps to alleviate this concern:: The _`quick brown fox` jumped over the lazy dog. - Should the backquotes be required, even for single-word targets? Initially I'd say yes, because leading-underscore terms are so common in code, and in the documentation of said code as well. Trailing-underscore terms are much less common. 
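The proposed ``_`target text``` construct can be sketched as a single regular expression. This is only an illustration of the idea in the message above, not the reStructuredText parser's actual code: the pattern, the ``INLINE_TARGET`` name and the ``find_inline_targets`` helper are all assumptions made for the example. Backquotes are required even for single-word targets, as the message suggests, so leading-underscore identifiers in code are left alone.

```python
import re

# Assumed pattern (hypothetical, not taken from the reST parser):
# an underscore, a backquote, the target text, and a closing backquote.
# The lookbehind rejects an underscore glued to a preceding word or backquote.
INLINE_TARGET = re.compile(r"(?<![\w`])_`([^`]+)`")


def find_inline_targets(text):
    """Return the text of every inline hyperlink target found in ``text``."""
    return [match.group(1) for match in INLINE_TARGET.finditer(text)]


if __name__ == "__main__":
    print(find_inline_targets("The _`quick brown fox` jumped over the lazy dog."))
    print(find_inline_targets("- _`Root element`: document_"))
    # Leading-underscore names without backquotes are not treated as targets.
    print(find_inline_targets("call _private() and see document_"))
```

Trailing-underscore references such as ``document_`` are deliberately ignored here; they would need their own pattern.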
-- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From garth@deadlybloodyserious.com Fri Oct 26 06:34:30 2001 From: garth@deadlybloodyserious.com (Garth T Kidd) Date: Fri, 26 Oct 2001 15:34:30 +1000 Subject: [Doc-SIG] inline hyperlink targets In-Reply-To: <B7FE5F2C.196F0%goodger@users.sourceforge.net> Message-ID: <NBBBIJGOIKKLHHFHILDNCELIKHAA.garth@deadlybloodyserious.com> > Very awkward. Then I came up with the idea of allowing > explicit hyperlink targets as inline markup:: > > - _`Root element`: document_ Oh, cute. +1. Also fantastic for automatic construction of indexes -- whenever you mention a term for the first time, _`blargle` it. From gustav@morpheus.demon.co.uk Fri Oct 26 20:02:00 2001 From: gustav@morpheus.demon.co.uk (Paul Moore) Date: Fri, 26 Oct 2001 20:02:00 +0100 Subject: [Doc-SIG] Re: lists & block quotes In-Reply-To: <B7FE4FF1.196EF%goodger@users.sourceforge.net> References: <m6sgtt456rqqfatpcbn6gfd81enp0i40b2@4ax.com> <B7FE4FF1.196EF%goodger@users.sourceforge.net> Message-ID: <kbcjttgnq8vig2vet28d402vmfi707sdio@4ax.com> On Thu, 25 Oct 2001 23:25:38 -0400, David Goodger <goodger@users.sourceforge.net> wrote: >Ah. You seem to see lists as "belonging" to the referring paragraph. That's >one interpretation, perfectly valid. Representing that idea properly and >completely makes for tricky content models and processing though. Exactly. When I write lists, they are usually preceded by "lead-in" text, so viewing the list as attached to the preceding paragraph seems exactly right. But I take your point that it's tricky to model properly. And your implication that there is an alternative interpretation, with lists as independent entities, explains where my ideas and yours are clashing.
I'll try to start thinking differently about it, and see where that takes me. >> I'm focussing on the fact that reST should be readable in its raw form, >> as well as after processing. Maybe that's not the best view to take, but >> I'm not sure how often reST will be read unprocessed. > >Are you implying that reStructuredText *will* or *will not* be read much in >its unprocessed form? If the latter, Sorry, I wasn't clear. I'm implying that reStructuredText *will* be read in its raw form quite a lot. At the very least, the *author* will spend most of his time reading the raw form he's just typed... In raw form, lists aren't indented unless you type them that way. So my interpretation of list structure results in me tending to type things wrongly (or at least, in a way which gives the wrong document structure). >It's a balancing act. Can't please everybody all of the time. This is one >area where the balance has been properly struck though, I think. Agreed. Both on the fact that it's a balancing act, and on the fact that you've hit the right balance - now that I understand your logic. Thanks for following this through with me. Paul. From tavis@calrudd.com Fri Oct 26 21:28:22 2001 From: tavis@calrudd.com (Tavis Rudd) Date: Fri, 26 Oct 2001 13:28:22 -0700 Subject: [Doc-SIG] font change in the development version of the docs Message-ID: <01102613282200.01979@lucy> Hi, I've just noticed that the font used to list module names, source snippets and method/function names has been changed in the devel version of the docs. IMHO, the old font was much more readable. Also, will this change affect the output from the mkhowto script? If it will there should be an option to use the original fonts.
Cheers, Tavis From jaffray@pobox.com Mon Oct 29 02:20:45 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Sun, 28 Oct 2001 21:20:45 -0500 (EST) Subject: [Doc-SIG] reStructuredText inline markup Message-ID: <Pine.LNX.4.21.0110180226580.3248-100000@starchild.astral.net> In the current reStructuredText specification, inline markup can't be nested and is not extensible. IMHO this is a problem. Nesting is needed in some very simple cases:: Most recent interpretation of the Second Amendment has been based on `*USA vs. Miller* (1939)`__. __ http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=us&vol=307&invol=174 Extensibility beyond the current "roles" mechanism is also necessary. For example, one of my intended applications for rST currently has a frequently-used tag to refer to another user of the system:: I had lunch with <lj user="tikva" text="Rachel"> today. It's a nice semantically meaningful output-format-neutral element. I don't know how I'd do something equivalent in rST. If it weren't inline, I'd use a directive, and if it didn't have an argument, I'd use a role, but as it is, I think I'm stuck. The extensibility problem is relatively easy to address if you allow attributes or arguments in roles:: I had lunch with :lj user=tikva:`Rachel` today. But to me, that syntax looks confusing. I was never too fond of the `` :rolename:`text` `` syntax in the first place, and adding spaces makes it more confusing. I'd like to get something more block-looking to go around the rolename and arguments. I'd also like to require that it be postfix, both for simplicity, and to follow the general principle that the text should be primary and the way it's marked up should be an aside. :: I had lunch with `Rachel`{lj user=tikva} today. My guess is that most non-markup uses for braces are in programming language examples, and most of those will be in literal blocks or inline literals or interpreted text, so stealing the braces won't lead to too much extra escaping. 
The other problem is that lots of inline markup is obtrusive and ugly:: I had lunch with `Rachel`{lj user=tikva icon="badger.png"} today. If we think of these as inline directives, we can steal a play from the hyperlink book, and drag the role out-of-line:: I had lunch with `Rachel`{_} today. .. _Rachel: {lj user=tikva icon="badger.png"} There's even the possibility of:: I had lunch with `Rachel`{__} today. __ {lj user=tikva icon="badger.png"} I'm not sure about nesting. Some cases are easy, but any time you have two constructs that involve backquotes, problems will ensue because backquotes don't nest. But backquotes are a really nice unobtrusive delimiter for sections of text, and it'd be a shame to replace them with braces or angle brackets or something. I have some bad ideas. Anyone have some good ideas? Alan From usc@ieee.org Mon Oct 29 20:10:54 2001 From: usc@ieee.org (Ueli Schläpfer) Date: 29 Oct 2001 21:10:54 +0100 Subject: [Doc-SIG] reStructuredText inline markup In-Reply-To: Alan Jaffray's message of "Sun, 28 Oct 2001 21:20:45 -0500 (EST)" References: <Pine.LNX.4.21.0110180226580.3248-100000@starchild.astral.net> Message-ID: <m2snc2xl4h.fsf@hobbes.dyn.dhs.org> Alan Jaffray <jaffray@pobox.com> writes: > In the current reStructuredText specification, inline markup can't be > nested and is not extensible. IMHO this is a problem. > > Nesting is needed in some very simple cases:: > > Most recent interpretation of the Second Amendment has been based > on `*USA vs. Miller* (1939)`__. > > __ http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=us&vol=307&invol=174 > I'll keep out of this battle -- but you should find plenty of material in the archive. > Extensibility beyond the current "roles" mechanism is also necessary. > For example, one of my intended applications for rST currently has a > frequently-used tag to refer to another user of the system:: > > I had lunch with <lj user="tikva" text="Rachel"> today.
> > It's a nice semantically meaningful output-format-neutral element. > I don't know how I'd do something equivalent in rST. If it weren't > inline, I'd use a directive, and if it didn't have an argument, I'd > use a role, but as it is, I think I'm stuck. [...] Doesn't the current spec_ cover your wishes better than you seem to think? The spec says clearly that the role may be prefix or postfix, and that the interpretation is domain-dependent. .. _spec: *reStructuredText Markup Specification*, CVS version 1.22 of 2001/10/27 So nothing should keep you from supplying your own handler for the ``lj`` role and write:: I had lunch with `text=Rachel user=tikva`:lj: today. That's not more obtrusive than your example, :: I had lunch with <lj user="tikva" text="Rachel"> today. or is it? And it would be entirely up to you to use some mail-header inspired quoting mechanism, i.e.:: I had lunch with `Rachel <tikva>`:lj: today. All that you'd need to do in this case is to write a layer that interfaces with the rfc822 module! (Actually, to me this looks like a nice solution for your case, but there are obvious limits as to how much data you can stuff inline without getting something very obtrusive.) > I had lunch with `Rachel`{lj user=tikva} today. > > My guess is that most non-markup uses for braces are in programming > language examples, and most of those will be in literal blocks or > inline literals or interpreted text, so stealing the braces won't > lead to too much extra escaping. > > The other problem is that lots of inline markup is obtrusive and ugly:: > > I had lunch with `Rachel`{lj user=tikva icon="badger.png"} today. > > If we think of these as inline directives, we can steal a play from > the hyperlink book, and drag the role out-of-line:: > > I had lunch with `Rachel`{_} today. > > .. _Rachel: {lj user=tikva icon="badger.png"} How about:: I had lunch with `Rachel`_ today. .. _Rachel: .. 
lj:: user=tikva icon="badger.png" I'm not sure about the indented directive -- I could find no explicit exclusion, so I assume that it is correct syntax. But is it necessary, or would this do as well (similar to multiple targets for a hyperlink):: I had lunch with `Rachel`_ today. .. _Rachel: .. lj:: user=tikva icon="badger.png" > There's even the possibility of:: > > I had lunch with `Rachel`{__} today. > > __ {lj user=tikva icon="badger.png"} > [...] Just my 2 cents, Ueli From jaffray@pobox.com Mon Oct 29 21:10:31 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Mon, 29 Oct 2001 16:10:31 -0500 (EST) Subject: [Doc-SIG] reStructuredText inline markup In-Reply-To: <m2snc2xl4h.fsf@hobbes.dyn.dhs.org> Message-ID: <Pine.LNX.4.21.0110291527450.19098-100000@starchild.astral.net> On 29 Oct 2001, Ueli Schläpfer wrote: > Alan Jaffray <jaffray@pobox.com> writes: > > > Nesting is needed in some very simple cases:: > > > > Most recent interpretation of the Second Amendment has been based > > on `*USA vs. Miller* (1939)`__. > > > > __ http://caselaw.lp.findlaw.com/scripts/getcase.pl?court=us&vol=307&invol=174 > > > > I'll keep out of this battle -- but you should find plenty of material > in the archive. The only comments I can find, searching on "nested inline doc-sig" or "nested inline restructuredtext", are David's remark of "that way lies madness" with no explanation and Tony's note that it'd be a pain to implement and he wasn't certain how important it was. > Doesn't the current spec_ cover your wishes better than you seem to > think? The spec says clearly that the role may be prefix or postfix, > and that the interpretation is domain-dependent. > > .. _spec: *reStructuredText Markup Specification*, CVS version 1.22 of > 2001/10/27 Is that valid? The reference processor says:: Warning: [level 1] Hyperlink target at line 5 contains whitespace. Perhaps a footnote was intended?
> So nothing should keep you from supplying your own handler for the > ``lj`` role and write:: > > I had lunch with `text=Rachel user=tikva`:lj: today. > > That's not more obtrusive than your example, :: > > I had lunch with <lj user="tikva" text="Rachel"> today. > > or is it? Indeed, but they're both awful. :-) > And it would be entirely up to you to use some mail-header inspired > quoting mechanism, i.e.:: > > I had lunch with `Rachel <tikva>`:lj: today. That works well in this case with only one argument that isn't too distracting when placed inline and has an evocative shorthand. But imagine:: The `text="biohazard" src="biohazard.png" height=20 width=20`:img: symbol must be used on containers used to dispose of medical waste. That really needs to get yanked out of the flow of text. :: The `biohazard`{_} symbol must be used on containers used to dispose of medical waste. .. _biohazard: {img src="biohazard.png" height=20 width=20} Alternately, if you prefer something directive-like:: The `biohazard`:_: symbol must be used on containers used to dispose of medical waste. .. _biohazard:: img src="biohazard.png" height=20 width=20 I don't like that for two reasons. First, it doesn't obey the usual ``directive_name:: directive_args`` pattern. Second, it looks odd when you make the interpreted text into a hyperlink:: The `biohazard`:_:_ symbol ... My brain tokenizes that as ``:_ :_``, while it does the right thing with ``{_}_``, turning it into ``{_} _``. > How about:: > > I had lunch with `Rachel`_ today. > > .. _Rachel: > > .. lj:: user=tikva icon="badger.png" Now we're getting into using that target syntax for not only hrefs and anchors, but also macro inclusion. rST doesn't currently have anything which includes referenced text at the current position in the document.
Not necessarily a bad idea, and macro-including directives would solve the inline markup issue, but it's getting into new territory. Alan From fdrake@acm.org Tue Oct 30 06:23:32 2001 From: fdrake@acm.org (Fred L. Drake) Date: Tue, 30 Oct 2001 01:23:32 -0500 (EST) Subject: [Doc-SIG] [development doc updates] Message-ID: <20011030062332.77F4928697@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Re-arranged the material in Chapter 2 (built-in things) to make it easier to start off with. Functions come before Types and Exceptions, and file objects are promoted one level in the outline, making them easier to find (they now appear in the table of contents instead of being hidden in the "Other Objects" category). From garth@deadlybloodyserious.com Wed Oct 31 02:29:30 2001 From: garth@deadlybloodyserious.com (Garth T Kidd) Date: Wed, 31 Oct 2001 13:29:30 +1100 Subject: [Doc-SIG] reStructuredText inline markup In-Reply-To: <Pine.LNX.4.21.0110291527450.19098-100000@starchild.astral.net> Message-ID: <NBBBIJGOIKKLHHFHILDNMEGNKIAA.garth@deadlybloodyserious.com> Yaaargh. Messy messy messy. I haven't read the spec for a while. What's wrong with user:`uid`? The filter/patterner/munger/renderer/whatever can substitute the full text for an appropriate link, right?
From jaffray@pobox.com Wed Oct 31 03:28:43 2001 From: jaffray@pobox.com (Alan Jaffray) Date: Tue, 30 Oct 2001 22:28:43 -0500 (EST) Subject: [Doc-SIG] reStructuredText inline markup In-Reply-To: <NBBBIJGOIKKLHHFHILDNMEGNKIAA.garth@deadlybloodyserious.com> Message-ID: <Pine.LNX.4.21.0110302217590.1794-100000@starchild.astral.net> On Wed, 31 Oct 2001, Garth T Kidd wrote: > I haven't read the spec for a while. What's wrong with user:`uid`? Huh? That's not even a markup construct in the current spec. Nor does it capture "<TEXT> referring to user <USERNAME>", which was the simplest example I gave of useful inline markup that the current spec can't handle.
Alan From goodger@users.sourceforge.net Wed Oct 31 05:01:38 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 31 Oct 2001 00:01:38 -0500 Subject: [Doc-SIG] nesting (was re: reStructuredText inline markup) In-Reply-To: <m2snc2xl4h.fsf@hobbes.dyn.dhs.org> Message-ID: <B804EFE1.1A337%goodger@users.sourceforge.net> [Alan] > In the current reStructuredText specification, inline markup can't > be nested and is not extensible. IMHO this is a problem. > > Nesting is needed in some very simple cases:: > > Most recent interpretation of the Second Amendment has been > based on `*USA vs. Miller* (1939)`__. > > __ http://caselaw.lp.findlaw.com/scripts/getcase.pl? > court=us&vol=307&invol=174 BTW, the ``__`` syntax is now implemented. There have been some changes to the doc tree though: "reference" replaces "link", and "target" now uses a "refuri" attribute to hold what used to be data (the data is now used for an inline target). > I'm not sure about nesting. Some cases are easy, but any time you > have two constructs that involve backquotes, problems will ensue > because backquotes don't nest. But backquotes are a really nice > unobtrusive delimiter for sections of text, and it'd be a shame to > replace them with braces or angle brackets or something. I have some > bad ideas. Anyone have some good ideas? [Ueli] > I'll keep out of this battle -- but you should find plenty of material > in the archive. [Alan] > The only comments I can find, searching on "nested inline doc-sig" > or "nested inline restructuredtext", are David's remark of "that way > lies madness" with no explanation and Tony's note that it'd be a > pain to implement and he wasn't certain how important it was. You didn't look back far enough, and your search may have been too specific (try searching for only "nest"). Try, for example, Ed Loper's 2001-03-21 post, which details some rules for nested inline markup. I think the complexity is prohibitive for the marginal benefit. 
(And if you can understand that tree without going mad, you're a better man than I. ;-) Inline markup is already fragile. Allowing nested inline markup would only be asking for trouble IMHO. If it proves absolutely necessary, it can be added later. The rules for what can appear inside what must be well thought out first though. -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From goodger@users.sourceforge.net Wed Oct 31 05:27:46 2001 From: goodger@users.sourceforge.net (David Goodger) Date: Wed, 31 Oct 2001 00:27:46 -0500 Subject: [Doc-SIG] extensibility (was re: reStructuredText inline markup) In-Reply-To: <Pine.LNX.4.21.0110291527450.19098-100000@starchild.astral.net> Message-ID: <B804F601.1A339%goodger@users.sourceforge.net> extensibility (was re: reStructuredText inline markup) [Alan] > Extensibility beyond the current "roles" mechanism is also > necessary. For example, one of my intended applications for rST > currently has a frequently-used tag to refer to another user of the > system:: > > I had lunch with <lj user="tikva" text="Rachel"> today. > > It's a nice semantically meaningful output-format-neutral element. It's also markup-heavy and very application specific. If you want that level of markup, you'll have to live with markup that looks like markup. > The extensibility problem is relatively easy to address if you allow > attributes or arguments in roles:: > > I had lunch with :lj user=tikva:`Rachel` today. Instead, think of the role as the tag name; you can parse the text between the backquotes however you like. > But to me, that syntax looks confusing. I was never too fond of the > `` :rolename:`text` `` syntax in the first place, and adding spaces > makes it more confusing. It's a last resort, for application-specific constructs. 
It's useful in moderation, but I hope that the ``:role:`text``` syntax is never abused. > I'd also like to require that it be postfix, both for simplicity, > and to follow the general principle that the text should be primary > and the way it's marked up should be an aside. :: > > I had lunch with `Rachel`{lj user=tikva} today. As Ueli pointed out, postfix is OK too. [Ueli] > And it would be entirely up to you to use some mail-header inspired > quoting mechanism, i.e.:: > > I had lunch with `Rachel <tikva>`:lj: today. [Alan] > That works well in this case with only one argument that isn't too > distracting when placed inline and has an evocative shorthand. And if this is the main or only use of interpreted text in your application, you could drop the role altogether:: I had lunch with `Rachel <tikva>` today. Maybe you could do a text to username lookup in your app:: I had lunch with `Rachel` today. [Alan] > But imagine:: > > The `text="biohazard" src="biohazard.png" height=20 > width=20`:img: symbol must be used on containers used to dispose > of medical waste. > > That really needs to get placed yanked out of the flow of text. What may be needed is the equivalent of SGML/XML's named entities. Something like this (syntax arbitrary and subject to debate):: The #biohazard# symbol must be used on containers used to dispose of medical waste. .. substitution:: biohazard text="biohazard" source="biohazard.png" height=20 width=20 (Although the "text=" attribute may be redundant and unnecessary.) ``#biohazard#`` would be replaced by whatever the ``substitution:: biohazard`` directive generates. Instead of ``#name#``, the syntax could be a variant of interpreted text too. Something like:: `name`@ or `name`# or `name`& [Alan] > If we think of these as inline directives, we can steal a play from > the hyperlink book, and drag the role out-of-line:: > > I had lunch with `Rachel`{_} today. > > .. 
_Rachel: {lj user=tikva icon="badger.png"} > > There's even the possibility of:: > > I had lunch with `Rachel`{__} today. > > __ {lj user=tikva icon="badger.png"} You do like curly braces, don't you? ;-) What is the end result of this markup? Is it going to end up as a hyperlink? Please show us how you would mark it up in HTML. [Ueli] > How about:: > > I had lunch with `Rachel`_ today. > > .. _Rachel: > > .. lj:: user=tikva icon="badger.png" > > I'm not sure about the idented directive -- I could find no explicit > exclusion, so I assume that it is correct syntax. It's not included in the three things allowed in a link block: nothing or empty (internal hyperlink target), URI (external target), reference (indirect target). I suppose a hyperlink target's link block could be generalized to allow a directive also ("custom hyperlink target"?), but I'm not sure if it's warranted for the general case. And even if directives were allowed inside hyperlink targets' link blocks, they would still need to produce a hyperlink target. Non-hyperlink cases wouldn't benefit. (Note that the way the syntax works now, the link block would have to immediately follow the target marker, with no blank line in-between, unlike Ueli's example.) [Ueli] > But is it necessary, or would this do as well (similar to multiple > targets for a hyperlink):: > > I had lunch with `Rachel`_ today. > > .. _Rachel: > .. lj:: user=tikva icon="badger.png" That depends on what the "lj" construct actually does. Looking at the above, I'd expect "Rachel" to be a reference to a picture or object where the ``.. _Rachel:`` target points. [Alan] > Now we're getting into using that target syntax for not only hrefs > and anchors, but also macro inclusion. rST doesn't currently have > anything which includes referenced text at the current position in > the document. Not necessarily a bad idea, and macro-including > directives would solve the inline markup issue, but it's getting > into new territory. 
It may be the right territory though. Generalizing further, the "substitution" directive could contain text, or a directive which resolves to text or an image or some other inline-compatible object. > it looks odd when you make the interpreted text into a hyperlink:: > > The `biohazard`:_:_ symbol ... > > My brain tokenizes that as ``:_ :_``, while it does the right thing > with ``{_}_``, turning it into ``{_} _``. That's *horrible*! This is supposed to be a readable markup. We must exercise restraint! [Ueli] > Doesn't the current spec_ cover your wishes better than you seem to > think? The spec says clearly that the role may be prefix or postfix, > and that the interpretation is domain-dependent. > > .. _spec: *reStructuredText Markup Specification*, CVS version 1.22 > of 2001/10/27 [Alan, referring to Ueli's ``.. _spec:`` construct above] > Is that valid? The reference processor says:: > > Warning: [level 1] Hyperlink target at line 5 contains > whitespace. Perhaps a footnote was intended? It probably should be a footnote. Notice how helpful the parser is? -- David Goodger goodger@users.sourceforge.net Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From tony@lsl.co.uk Wed Oct 31 10:20:46 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Wed, 31 Oct 2001 10:20:46 -0000 Subject: [Doc-SIG] reStructuredText inline markup In-Reply-To: <Pine.LNX.4.21.0110291527450.19098-100000@starchild.astral.net> Message-ID: <004601c161f5$b4885580$545aa8c0@lslp862.int.lsl.co.uk> Despite his saying he won't get involved, I like Ueli's analysis of possible ways around your problem. I'd add that I think Alan is maybe working to the wrong model - I think of reST as more like a simple text formatting language, rather than more like a data description language. 
Consider how one would derive (for instance) TeX code from it, and what that would look like on the page. Alan Jaffray said: > The only comments I can find, searching on "nested inline doc-sig" or > "nested inline restructuredtext", are David's remark of "that way lies > madness" with no explanation and Tony's note that it'd be a pain to > implement and he wasn't certain how important it was. As David says, there's more history than that. My failing memory (isn't it always) hints that ST has never supported nested inline markup (or at least, not predictably) - that may have been despite claims to the contrary. Don't know about setext. My own "prototype" system was very simple, and was re based, and I was leaving such stuff for later - whilst it is clearly *possible* to do nested inline markup with simple re usage (you recognise one level, split the text out and leave "markers" behind, then recognise the next level, etc., taking care not to introduce ordering problems), it's a pain (I knew broadly what to do, but the implementation would be that sort of fun that isn't, if you see what I mean, and there were other things to do). Trying to get three people (e.g., me, Edward Loper and David) to agree on what made *sense* in terms of nesting was difficult, too - what about edge cases like ``*Emphasis enclosing **bold***``? I'm sure David could write his parser to cope with nested inline markup, but the nest of snakes isn't so much in the implementation as in the description of what is allowed - it looked like it was always going to introduce potential unexpectedness. The "solution" of dropping **strong** emphasis was not acceptable, which leaves the other "solution" of leaving it alone until a strong case for the need for it is made, possibly with appropriate patches (to code *and* documentation). 
My initial reaction to that approach was one of dismay, but some intense thought quickly showed me that, in practice, I could easily live with the restriction, at least for the initial roll-out of reST and DPS. "simple is better than complex" Tibs -- BikeCode0.2 http://www.tibsnjoan.co.uk/bikecode.html P: [Tibs] Tc B10 K:++ i29:30" h1.65m n1960 H+:~ v~ A+ M+ Rg- B: [AnthroTech] 3tRu U1c w37" Wr19:406 Mfr SAf bDh[Sachs]:C G3x7 8s Lrr1B Cb[Michael] VjsX col[MidnightBlue] T: [BurleyD'Lite] 2c2[Thomas] f++ VsX My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.)
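The marker technique Tibs describes above (recognise one level, split the text out and leave "markers" behind, then recognise the next level) might look roughly like the sketch below. It is not the code from any of the prototypes discussed in this thread; the pass order, the NUL-delimited marker format and the ``parse_inline`` name are assumptions chosen for illustration. It handles only the simple case of emphasis inside interpreted text and makes no attempt at edge cases such as ``*Emphasis enclosing **bold***``.

```python
import re

# Hypothetical passes, innermost construct first; the order matters.
PASSES = [
    ("emphasis", re.compile(r"\*([^*`]+)\*")),
    ("interpreted", re.compile(r"`([^`]+)`")),
]
# Assumed marker format: NUL, index into the store, NUL.
MARKER = re.compile(r"\x00(\d+)\x00")


def parse_inline(text):
    """Return a list of plain strings and (kind, children) tuples."""
    store = []  # recognised spans, in the order they were split out

    def stash(kind):
        def replace(match):
            # Remember the span and leave a numbered marker in its place.
            store.append((kind, match.group(1)))
            return "\x00%d\x00" % (len(store) - 1)
        return replace

    for kind, pattern in PASSES:
        text = pattern.sub(stash(kind), text)

    def expand(fragment):
        # Turn markers back into nested nodes, recursing into stored spans.
        parts = []
        position = 0
        for match in MARKER.finditer(fragment):
            if match.start() > position:
                parts.append(fragment[position:match.start()])
            kind, inner = store[int(match.group(1))]
            parts.append((kind, expand(inner)))
            position = match.end()
        if position < len(fragment):
            parts.append(fragment[position:])
        return parts

    return expand(text)


if __name__ == "__main__":
    print(parse_inline("based on `*USA vs. Miller* (1939)`."))
```

Even this small version shows where the ordering problems come from: swapping the two passes would make the interpreted-text pattern swallow the asterisks before the emphasis pass ever sees them.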