From fdrake@beowolf.digicool.com Mon Jul 2 16:37:45 2001
From: fdrake@beowolf.digicool.com (Fred Drake)
Date: Mon, 2 Jul 2001 11:37:45 -0400 (EDT)
Subject: [Doc-SIG] [maintenance doc updates]
Message-ID: <20010702153745.B304B28929@beowolf.digicool.com>

The development version of the documentation has been updated:

http://python.sourceforge.net/maint-docs/

Updated to reflect the current state of the Python 2.1.1 maintenance release branch.

From fdrake@acm.org Sat Jul 7 00:40:37 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Fri, 6 Jul 2001 19:40:37 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010706234037.432972892B@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

http://python.sourceforge.net/devel-docs/

Lots of updates! Mostly small style adjustments. Documentation for some new markup of the documentation has been added.

There is a bunch of new content in the Python/C API manual. I have started describing the new interface to support high-performance profiling and tracing. Some of the PyObject_*() functions which are used in creating objects have been described and some related reference count information has been added as well. Some small corrections have also been made in the C API manual. The updates to this manual have not yet been checked in.

From ralph@inputplus.demon.co.uk Sat Jul 7 11:17:44 2001
From: ralph@inputplus.demon.co.uk (Ralph Corderoy)
Date: Sat, 7 Jul 2001 11:17:44 +0100
Subject: [Doc-SIG] Re: PEP 258: DPS Generic Implementation Details
In-Reply-To:
Message-ID: <200107071017.LAA04868@inputplus.demon.co.uk>

Hi,

> Choice of Docstring Format
> ==========================
>
> Rather than force everyone to use a single docstring format,
> multiple input formats are allowed by the processing system. A
> special variable, __docformat__, may appear at the top level of a
> module before any function or class definitions.
> Over time or
> through decree, a standard format or set of formats should emerge.

I read in another Doc PEP of the difficulty extracting structure from the plain text of a docstring. Of the various Wikis I found TWiki's plain text input format allows quite a lot of structure to be expressed whilst still being readable as plain text. Perhaps it would present some ideas.

One thing that it uses to good effect is significant whitespace. Something familiar to all :-)

Allowing multiple docstring formats, whilst perhaps least contentious, does seem a little like trying to please everyone.

Ralph.

PS. Not on mailing list. Sent there at request of Usenet posting. Please CC.

From edcjones@erols.com Mon Jul 9 04:21:00 2001
From: edcjones@erols.com (Edward C. Jones)
Date: Sun, 08 Jul 2001 23:21:00 -0400
Subject: [Doc-SIG] Essay on reference counts (long)
Message-ID: <3B49231C.3040404@erols.com>

To help me understand reference counts, I put together the following. You are welcome to include any form of any part of this in the Python documentation.

Ed Jones

EE is "Extending and Embedding the Python Interpreter".
API is "Python/C API Reference Manual".

Paragraphs starting with expressions like "{EE 1.10}" are very similar to paragraphs in the Python documentation.

==========================================================================
==========================================================================

Metasummary

Some of the Python source code documentation can be usefully duplicated in the API documentation.

I find the "own", "borrow", "steal" metaphor confusing. I prefer to restrict the use of "reference" to the meaning "pointer to a PyObject". If Py_INCREF() has been called for a reference, the reference is "protected". From the Python coder's point of view, a reference is an object. Therefore it is reasonable to use "object" interchangeably with "reference".

1. Summary

1.
Every Python object contains a reference counter which is incremented by Py_INCREF() and decremented by Py_DECREF(). If the counter becomes zero, Python might delete the object.

2. For every call to Py_INCREF(), there should eventually be a call to Py_DECREF(). Call Py_INCREF() for objects that you want to keep around for a while. Call Py_DECREF() when you are done with them. A pointer to an object that has been INCREFed is said to be "protected".

3. Most functions INCREF any object that should continue to exist after the function exits. This includes objects returned via arguments or a return statement. In these cases the calling function usually has responsibility for calling Py_DECREF(). The calling function can, in turn, pass responsibility for the DECREF to _its_ caller.

4. Some functions violate rule 3 by not INCREFing objects they return. The standard example is PyTuple_GetItem().

5. It is not necessary to increment an object's reference count for every local variable that contains a pointer to an object. But don't leave objects unprotected if there is any chance that the object will be DECREFed elsewhere.

6. Most functions assume that each argument passed to them is a protected object. Therefore the code for the function does not INCREF the argument.

7. There are exactly two important functions that are exceptions to rule 6: PyTuple_SetItem() and PyList_SetItem(). These functions take over responsibility for the item passed to them -- even if they fail!

2. Background

2.1 Memory Problems with C/C++

{EE 1.10} In languages like C or C++, the programmer is responsible for dynamic allocation and deallocation of memory on the heap. In C, this is done using the functions malloc() and free(). In C++, the operators new and delete are used with essentially the same meaning; they are actually implemented using malloc() and free(), so we'll restrict the following discussion to the latter.
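
As a rough illustration of the malloc()/free() pairing, here is a minimal sketch in plain C. The helper doubled_length() and its behaviour are invented for this example and appear in no Python source file; the point is only that every exit taken after a successful malloc() calls free() exactly once, including an early error exit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, for illustration only: builds a doubled copy of
 * `text` in a heap buffer and returns its length, or -1 on error. */
static long doubled_length(const char *text)
{
    size_t n;
    char *buf;
    long result;

    assert(text != NULL);
    n = strlen(text);
    buf = malloc(2 * n + 1);        /* one malloc() ... */
    if (buf == NULL)
        return -1;                  /* nothing was allocated, nothing to free */

    if (n == 0) {                   /* an error exit added after the fact */
        free(buf);                  /* the call that is easy to forget */
        return -1;
    }

    memcpy(buf, text, n);
    memcpy(buf + n, text, n + 1);   /* second copy, including the final '\0' */
    result = (long)strlen(buf);

    free(buf);                      /* ... and one free() on the normal path */
    return result;
}
```

Dropping the free() in the early exit would still compile and still return the right value; only a long-running caller would eventually notice the lost memory.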
{EE 1.10} Every block of memory allocated with malloc() should eventually be returned to the pool of available memory by exactly one call to free(). It is important to call free() at the right time. If a block's address is forgotten but free() is not called for it, the memory it occupies cannot be reused until the program terminates. This is called a memory leak. On the other hand, if a program calls free() for a block and then continues to use the block, it creates a conflict with re-use of the block through another malloc() call. This is called using freed memory. It has the same bad consequences as referencing uninitialized data -- core dumps, wrong results, mysterious crashes.

{EE 1.10} Common causes of memory leaks are unusual paths through the code. For instance, a function may allocate a block of memory, do some calculation, and then free the block again. Now a change in the requirements for the function may add a test to the calculation that detects an error condition and can return prematurely from the function. It's easy to forget to free the allocated memory block when taking this premature exit, especially when it is added later to the code. Such leaks, once introduced, often go undetected for a long time: the error exit is taken only in a small fraction of all calls, and most modern machines have plenty of virtual memory, so the leak only becomes apparent in a long-running process that uses the leaking function frequently. Therefore, it's important to prevent leaks from happening by having a coding convention or strategy that minimizes this kind of error.

{EE 1.10} Since Python makes heavy use of malloc() and free(), it needs a strategy to avoid memory leaks as well as the use of freed memory. The chosen method is called reference counting. The principle is simple: every object contains a counter, which is incremented when a pointer to the object is stored somewhere, and which is decremented when the pointer is deleted.
When the counter reaches zero, the last pointer to the object has been deleted and the object is freed.

2.2 Python Objects

{object.h} Python objects are structures allocated on the heap. They have type "PyObject". They are accessed through pointers of type "PyObject*". Special rules apply to the use of objects to ensure they are properly garbage-collected. Objects are never allocated statically or on the stack; they must be accessed through special macros and functions only. (Type objects are exceptions to the first rule; the standard types are represented by statically initialized type objects.)

{object.h} An object has a 'reference count' that is increased or decreased when a pointer to the object is copied or deleted; when the reference count reaches zero there are no references to the object left and it can be removed from the heap.

Every Python object has a reference count and a type (and possibly more). Here is the relevant code from "object.h", one of the Python header files. "PyObject" is a C struct. Each instance of "PyObject" contains the variable "ob_refcnt" where the "reference count" is kept and "ob_type" where the type is kept.

    #define PyObject_HEAD \
        int ob_refcnt; \
        struct _typeobject *ob_type;

    typedef struct _object {
        PyObject_HEAD
    } PyObject;

{object.h} The "type" of an object determines what it represents and what kind of data it contains. An object's type is fixed when it is created. Types themselves are represented as objects; an object contains a pointer to the corresponding type object. (The type itself has a type pointer pointing to the object representing the type 'type', which contains a pointer to itself!)

{object.h} Objects do not float around in memory; once allocated an object keeps the same size and address. Objects that must hold variable-size data can contain pointers to variable-size parts of the object. Not all objects of the same type have the same size; but the size cannot change after allocation.
(These restrictions are made so a reference to an object can be simply a pointer -- moving an object would require updating all the pointers, and changing an object's size would require moving it if there was another object right next to it.)

{object.h} Objects are always accessed through pointers of the type 'PyObject *'. The type 'PyObject' is the structure shown above that only contains the reference count and the type pointer. The actual memory allocated for an object contains other data that can only be accessed after casting the pointer to a pointer to a longer structure type. This longer type must start with the reference count and type fields; the macro PyObject_HEAD above should be used for this (to accommodate future changes). The implementation of a particular object type can cast the object pointer to the proper type and back.

{object.h} A standard interface also exists for objects that contain an array of items whose size is determined when the object is allocated. It looks like

    #define PyObject_VAR_HEAD \
        PyObject_HEAD \
        int ob_size; /* Number of items in variable part */

    typedef struct {
        PyObject_VAR_HEAD
    } PyVarObject;

If the reference count for some object becomes zero, Python will usually reclaim the memory the object uses, making the object disappear. If an object is used in a piece of code, the code should make sure that the reference count cannot drop to zero, usually by adding one to it.

If the reference count never becomes zero, the memory is never reclaimed. If it was not intended for the object to be permanent, there is a memory leak.

3. Py_INCREF() and Py_DECREF()

{object.h} The macros Py_INCREF(op) and Py_DECREF(op) are used to increment or decrement reference counts. When the count reaches zero, Py_DECREF() also calls the object's deallocator function; for objects that don't contain references to other objects or heap memory this can be the standard function free(). Both macros can be used wherever a void expression is allowed.
The argument shouldn't be a NULL pointer.

{object.h} We assume that the reference count field can never overflow; this can be proven when the size of the field is the same as the pointer size but even with a 16-bit reference count field it is pretty unlikely so we ignore the possibility.

To prevent memory leaks, corresponding to each call to Py_INCREF(), there must be a call to Py_DECREF(): for each call to Py_INCREF(), there is a "responsibility" to call Py_DECREF(). This responsibility can be passed around between functions. The last user of the reference must call Py_DECREF().

4. When to Use Py_INCREF() and Py_DECREF()

4.1 Objects Returned from Functions

Most Python objects in C code are created by calls to functions in the "Python/C API". These functions have prototypes that look like:

    PyObject* Py_Something(arguments);

These functions usually (but not always!) call Py_INCREF() before returning (a pointer to) the new object. Generally, the function that called Py_Something has the responsibility to call Py_DECREF(). If, in turn, the function returns the object to its caller, it passes on the responsibility for the reference. Some things are best understood by example pseudo-code:

    void MyCode(arguments) {
        PyObject* pyo;
        ...
        pyo = Py_Something(args);

MyCode has responsibility for the reference passed to it by Py_Something. When MyCode is done with "pyo", it must call:

    Py_DECREF(pyo);

On the other hand, if MyCode returns "pyo", there must not be a call to Py_DECREF().

    PyObject* MyCode(arguments) {
        PyObject* pyo;
        ...
        pyo = Py_Something(args);
        ...
        return pyo;
    }

In this situation, MyCode has "passed on the responsibility" for DECREFing the reference.

Note: if a function is to return None, the C code should look like:

    Py_INCREF(Py_None);
    return Py_None;

Remember to INCREF Py_None!

So far, only the most common case has been discussed, where "Py_Something" creates a reference and passes responsibility for it to the caller.
In the Python documentation, this is called a "new reference". For example the documentation says:

    PyObject* PyList_New(int len)
    Return value: New reference.
    Returns a new list of length len on success, or NULL on failure.

The documentation uses the word "reference" in two closely related ways: a pointer to a PyObject and the responsibility to DECREF the object. I will try to use "reference" in the first sense only. When a reference has been INCREFed, I prefer to say that the reference is "protected". I will often get sloppy and say "object" when I mean "reference". (After all, what a C programmer sees as a "PyObject*" the Python programmer sees as an "object".)

But sometimes the Python source code DOES NOT CALL Py_INCREF() on the object it returns:

    PyObject *
    PyTuple_GetItem(register PyObject *op, register int i)
    {
        if (!PyTuple_Check(op)) {
            PyErr_BadInternalCall();
            return NULL;
        }
        if (i < 0 || i >= ((PyTupleObject *)op) -> ob_size) {
            PyErr_SetString(PyExc_IndexError, "tuple index out of range");
            return NULL;
        }
        return ((PyTupleObject *)op) -> ob_item[i];
    }

In the documentation, this is referred to as "borrowing" a reference:

    PyObject* PyTuple_GetItem(PyObject *p, int pos)
    Return value: Borrowed reference.
    Returns the object at position pos in the tuple pointed to by p. If pos is out of bounds, returns NULL and sets an IndexError exception.

I prefer to say that the reference (in the sense I use it) is left "unprotected".

Functions returning unprotected references (borrowing a reference) are:

PyTuple_GetItem(), PyList_GetItem(), PyList_GET_ITEM(), PyList_SET_ITEM(), PyDict_GetItem(), PyDict_GetItemString(), PyErr_Occurred(), PyFile_Name(), PyImport_GetModuleDict(), PyModule_GetDict(), PyImport_AddModule(), PyObject_Init(), Py_InitModule(), Py_InitModule3(), Py_InitModule4(), and PySequence_Fast_GET_ITEM().
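
The difference between a "new" and an "unprotected" reference can be sketched with a toy model in plain C. Everything here (ToyObject, ToySlot, toy_new() and the other toy_* functions) is hypothetical and only imitates the counting discipline; it is not how CPython implements its objects. The getter hands back an unprotected pointer, in the manner of PyTuple_GetItem(), and the caller protects it before doing anything that could make the container drop its own reference.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for PyObject: a reference count plus a payload. */
typedef struct {
    int refcnt;
    long value;
} ToyObject;

static int toy_live_objects = 0;      /* objects allocated and not yet freed */

/* Returns a "new reference": the caller is responsible for one toy_decref(). */
static ToyObject *toy_new(long value)
{
    ToyObject *op = malloc(sizeof *op);
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    op->value = value;
    toy_live_objects++;
    return op;
}

static void toy_incref(ToyObject *op)
{
    assert(op != NULL);
    op->refcnt++;
}

static void toy_decref(ToyObject *op)
{
    assert(op != NULL && op->refcnt > 0);
    if (--op->refcnt == 0) {
        toy_live_objects--;
        free(op);
    }
}

/* A one-item container that holds a protected reference to its item. */
typedef struct {
    ToyObject *item;
} ToySlot;

/* Returns an unprotected ("borrowed") pointer: no incref is done. */
static ToyObject *toy_slot_get(ToySlot *slot)
{
    return slot->item;
}

/* Takes over responsibility for new_item and releases the old item. */
static void toy_slot_set(ToySlot *slot, ToyObject *new_item)
{
    ToyObject *old = slot->item;
    slot->item = new_item;
    if (old != NULL)
        toy_decref(old);
}

/* Reads the current item's value even though the slot is overwritten first. */
static long toy_read_across_replace(ToySlot *slot, long replacement)
{
    ToyObject *item = toy_slot_get(slot);     /* unprotected */
    long value;

    toy_incref(item);                         /* protect it */
    toy_slot_set(slot, toy_new(replacement)); /* slot drops its reference */
    value = item->value;                      /* safe: we still hold one */
    toy_decref(item);                         /* done with it */
    return value;
}
```

Without the toy_incref()/toy_decref() pair, the read of item->value would touch memory that toy_slot_set() had already freed.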
{EE 10.1.2} The function PyImport_AddModule() does not INCREF the reference it returns even though it may actually create the object the reference refers to: this is possible because the object is INCREFed when it is stored in sys.modules.

See also PyArg_ParseTuple() in Extending and Embedding, 1.7. This function sometimes returns PyObjects back to the caller through its arguments.

An example from sysmodule.c is:

    static PyObject *
    sys_getrefcount(PyObject *self, PyObject *args)
    {
        PyObject *arg;
        if (!PyArg_ParseTuple(args, "O:getrefcount", &arg))
            return NULL;
        return PyInt_FromLong(arg->ob_refcnt);
    }

"arg" is an unprotected object. It should not be DECREFed before leaving the function (because it was never INCREFed!).

{API 1.2.1.1} Here is an example of how you could write a function that computes the sum of the items in a list of integers, here using PySequence_GetItem() and later using PyList_GetItem().

    long sum_sequence(PyObject *sequence)
    {
        int i, n;
        long total = 0;
        PyObject *item;

        n = PySequence_Length(sequence);
        if (n < 0)
            return -1; /* Has no length. */
        /* Caller should use PyErr_Occurred() if a -1 is returned. */
        for (i = 0; i < n; i++) {
            /* PySequence_GetItem INCREFs item. */
            item = PySequence_GetItem(sequence, i);
            if (item == NULL)
                return -1; /* Not a sequence, or other failure */
            if (PyInt_Check(item))
                total += PyInt_AsLong(item);
            Py_DECREF(item);
        }
        return total;
    }

4.1.1 When to be sloppy with unINCREFed objects

{API 1.2.1} It is not necessary to increment an object's reference count for every local variable that contains a pointer to an object. In theory, the object's reference count should be increased by one when the variable is made to point to it and decreased by one when the variable goes out of scope. However, these two cancel each other out, so at the end the reference count hasn't changed. The only real reason to use the reference count is to prevent the object from being deallocated as long as our variable is pointing to it.
If we know that there is at least one other reference to the object that lives at least as long as our variable, there is no need to increment the reference count temporarily.

{API 1.2.1} An important situation where this arises is for objects that are passed as arguments to C functions in an extension module that are called from Python; the call mechanism guarantees to hold a reference to every argument for the duration of the call.

Here is the "sum_list" example again, this time using PyList_GetItem().

    long sum_list(PyObject *list)
    {
        int i, n;
        long total = 0;
        PyObject *item;

        n = PyList_Size(list);
        if (n < 0)
            return -1; /* Not a list */
        /* Caller should use PyErr_Occurred() if a -1 is returned. */
        for (i = 0; i < n; i++) {
            /* PyList_GetItem does not INCREF "item". "item" is unprotected. */
            item = PyList_GetItem(list, i); /* Can't fail */
            if (PyInt_Check(item))
                total += PyInt_AsLong(item);
        }
        return total;
    }

4.1.2 Thin Ice: When not to be sloppy with INCREF

Don't leave an object unprotected if there is any chance that it will be DECREFed.

{API 1.2.1} Subtle bugs can occur when a reference is unprotected. A common example is to extract an object from a list and hold on to it for a while without incrementing its reference count. Some other operation might conceivably remove the object from the list, decrementing its reference count and possibly deallocating it. The real danger is that innocent-looking operations may invoke arbitrary Python code which could do this; there is a code path which allows control to flow back to the Python user from a Py_DECREF(), so almost any operation is potentially dangerous. For example:

    void bug(PyObject *list)
    {
        PyObject *item = PyList_GetItem(list, 0);
        PyList_SetItem(list, 1, PyInt_FromLong(0L));
        PyObject_Print(item, stdout, 0); /* BUG! */
    }

{EE 1.10.3} The PyObject "item" is gotten using PyList_GetItem and left unprotected. The code then replaces list[1] with the value 0, and finally prints "item". Looks harmless, right?
But it's not!

{EE 1.10.3} Let's follow the control flow into PyList_SetItem(). The list has protected references to all its items, so when item 1 is replaced, it has to dispose of the original item 1 by DECREFing it. Now let's suppose the original item 1 was an instance of a user-defined class, and let's further suppose that the class defined a __del__() method. If this class instance has a reference count of 1, disposing of it will call its __del__() method.

{EE 1.10.3} Since it is written in Python, the __del__() method can execute arbitrary Python code. Could it perhaps do something to invalidate the reference to item in bug()? You bet! Assuming that the list passed into bug() is accessible to the __del__() method, it could execute a statement to the effect of "del list[0]", and assuming this was the last reference to that object, it would free the memory associated with it, thereby invalidating item.

{EE 1.10.3} The solution, once you know the source of the problem, is easy: temporarily increment the reference count. The correct version of the function reads:

    void no_bug(PyObject *list)
    {
        PyObject *item = PyList_GetItem(list, 0);
        Py_INCREF(item); /* Protect item. */
        PyList_SetItem(list, 1, PyInt_FromLong(0L));
        PyObject_Print(item, stdout, 0);
        Py_DECREF(item);
    }

{EE 1.10.3} This is a true story. An older version of Python contained variants of this bug and someone spent a considerable amount of time in a C debugger to figure out why his __del__() methods would fail...

{EE 1.10.3} The second case of problems with unprotected objects is a variant involving threads. Normally, multiple threads in the Python interpreter can't get in each other's way, because there is a global lock protecting Python's entire object space. However, it is possible to temporarily release this lock using the macro Py_BEGIN_ALLOW_THREADS, and to re-acquire it using Py_END_ALLOW_THREADS.
This is common around blocking I/O calls, to let other threads use the CPU while waiting for the I/O to complete. Obviously, the following function has the same problem as the previous one:

    void bug(PyObject *list)
    {
        PyObject *item = PyList_GetItem(list, 0);
        Py_BEGIN_ALLOW_THREADS
        ...some blocking I/O call...
        Py_END_ALLOW_THREADS
        PyObject_Print(item, stdout, 0); /* BUG! */
    }

4.2 Objects Passed to Functions

So far, we have looked at references to objects returned from functions. Now consider what happens when an object is passed to a function. To fix the ideas consider:

    int Caller(void)
    {
        PyObject* pyo;
        Function(pyo);

Most functions assume that the arguments passed to them are already protected. Therefore Py_INCREF() is not called inside "Function" unless "Function" wants the argument to continue to exist after "Caller" exits. In the documentation, "Function" is said to "borrow" a reference:

{EE 10.1.2} When you pass an object reference into another function, in general, the function borrows the reference from you -- if it needs to store it, it will use Py_INCREF() to become an independent owner.

"PyDict_SetItem()" can serve as an example of normal behavior. Putting something in a dictionary is "storing" it. Therefore "PyDict_SetItem()" INCREFs both the key and the value.

{EE 10.1.2} There are exactly two important functions that do not behave in this normal way: PyTuple_SetItem() and PyList_SetItem(). These functions take over responsibility for the item passed to them -- even if they fail! The Python documentation uses the phrase "steal a reference" to mean "takes over responsibility".

Here is what PyTuple_SetItem(atuple, i, item) does: If "atuple[i]" currently contains a PyObject, that PyObject is DECREFed. Then "atuple[i]" is set to "item". "item" is *not* INCREFed. If PyTuple_SetItem() fails to insert "item", it *decrements* the reference count for "item". Similarly, PyTuple_GetItem() does not increment the returned item's reference count.
Metaphorically, PyTuple_SetItem() grabs responsibility for a reference to "item" from you. If "item" is unprotected, PyTuple_SetItem() might DECREF it anyway, which can crash Python.

Other exceptions are PyList_SET_ITEM(), PyModule_AddObject(), and PyTuple_SET_ITEM().

Look at this piece of code:

    PyObject *t;
    PyObject *x;
    x = PyInt_FromLong(1L);

At this point x has a reference count of one. When you are done with it, you normally would call Py_DECREF(x). But if PyTuple_SetItem is called:

    PyTuple_SetItem(t, 0, x);

you must not call Py_DECREF(). PyTuple_SetItem() will call it for you: when the tuple is DECREFed, the items will be also.

{API 1.2.1.1} PyTuple_SetItem(), et al., were designed to take over responsibility for a reference because of a common idiom for populating a tuple or list with newly created objects; for example, the code to create the tuple (1, 2, "three") could look like this (forgetting about error handling for the moment). It is better coding practice to use the less confusing PySequence family of functions as below.

    PyObject *t;
    t = PyTuple_New(3);
    PyTuple_SetItem(t, 0, PyInt_FromLong(1L));
    PyTuple_SetItem(t, 1, PyInt_FromLong(2L));
    PyTuple_SetItem(t, 2, PyString_FromString("three"));

{API 1.2.1.1} Incidentally, PyTuple_SetItem() is the only way to set tuple items; PySequence_SetItem() and PyObject_SetItem() refuse to do this since tuples are an immutable data type. You should only use PyTuple_SetItem() for tuples that you are creating yourself.

{API 1.2.1.1} Equivalent code for populating a list can be written using PyList_New() and PyList_SetItem().
Such code can also use PySequence_SetItem(); this illustrates the difference between the two (the extra Py_DECREF() calls):

    PyObject *l, *x;
    l = PyList_New(3);
    x = PyInt_FromLong(1L);
    PySequence_SetItem(l, 0, x); Py_DECREF(x);
    x = PyInt_FromLong(2L);
    PySequence_SetItem(l, 1, x); Py_DECREF(x);
    x = PyString_FromString("three");
    PySequence_SetItem(l, 2, x); Py_DECREF(x);

{API 1.2.1.1} You might find it strange that the 'better coding practice' takes more code. However, you are unlikely to often use these ways of creating and populating a tuple or list. There's a generic function, Py_BuildValue(), that can create most common objects from C values, directed by a format string. For example, the above two blocks of code could be replaced by the following (which also takes care of the error checking):

    PyObject *t, *l;
    t = Py_BuildValue("(iis)", 1, 2, "three");
    l = Py_BuildValue("[iis]", 1, 2, "three");

{API 1.2.1.1} It is much more common to use PyObject_SetItem() and friends with protected objects (i.e., the reference count was incremented before passing the item to you), a typical example being arguments that were passed in to the function you are writing. In that case, their behaviour regarding reference counts has a simpler appearance, since you don't have to do anything at all with reference counts. For example, this function sets all items of a list (actually, any mutable sequence) to a given item:

    int set_all(PyObject *target, PyObject *item)
    {
        int i, n;

        n = PyObject_Length(target);
        if (n < 0)
            return -1;
        for (i = 0; i < n; i++) {
            /* PySequence_SetItem() takes an int index; PyObject_SetItem()
               would need the index as a PyObject* key. */
            if (PySequence_SetItem(target, i, item) < 0)
                return -1;
        }
        return 0;
    }

5. Two Examples

Example 1.

This is a pretty standard example of C code using the Python API.
    PyObject* MyFunction(void)
    {
        PyObject* temporary_list=NULL;
        PyObject* return_this=NULL;

        temporary_list = PyList_New(1); /* Note 1 */
        if (temporary_list == NULL)
            return NULL;

        return_this = PyList_New(1); /* Note 1 */
        if (return_this == NULL) {
            Py_DECREF(temporary_list); /* Note 2 */
            return NULL;
        }

        Py_DECREF(temporary_list); /* Note 2 */
        return return_this;
    }

Note 1: The object returned by PyList_New has a reference count of 1.

Note 2: Since "temporary_list" should disappear when MyFunction exits, it must be DECREFed before any return from the function. If a return can be reached either before or after "temporary_list" is created, then initialize "temporary_list" to NULL and use "Py_XDECREF()".

Example 2.

This is the same as Example 1 except PyTuple_GetItem() is used.

    PyObject* MyFunction(void)
    {
        PyObject* temporary=NULL;
        PyObject* return_this=NULL;
        PyObject* tup;
        int err;

        tup = PyTuple_New(2);
        if (tup == NULL)
            return NULL;

        err = PyTuple_SetItem(tup, 0, PyInt_FromLong(222L)); /* Note 1 */
        if (err) {
            Py_DECREF(tup);
            return NULL;
        }
        err = PyTuple_SetItem(tup, 1, PyInt_FromLong(333L)); /* Note 1 */
        if (err) {
            Py_DECREF(tup);
            return NULL;
        }

        temporary = PyTuple_GetItem(tup, 0); /* Note 2 */
        if (temporary == NULL) {
            Py_DECREF(tup);
            return NULL;
        }

        return_this = PyTuple_GetItem(tup, 1); /* Note 2 */
        if (return_this == NULL) {
            Py_DECREF(tup);
            return NULL;
        }
        Py_INCREF(return_this); /* Note 4 */

        Py_DECREF(tup); /* Note 3 */
        return return_this;
    }

Note 1: If "PyTuple_SetItem" fails or if the tuple it created is DECREFed to 0, then the object returned by "PyInt_FromLong" is DECREFed.

Note 2: "PyTuple_GetItem" does not increment the reference count for the object it returns.

Note 3: You have no responsibility for DECREFing "temporary"; it is released along with "tup".

Note 4: "return_this" is unprotected and "tup" holds the only protected reference to it, so it must be INCREFed before "tup" is DECREFed. Otherwise the caller would receive a pointer to an object that has already been freed.
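
The advice in Note 2 of Example 1 (initialize to NULL, then use Py_XDECREF()) can be sketched as a single cleanup path. The code below is a toy model in plain C: Obj, obj_new(), obj_decref(), obj_xdecref() and the fail_at parameter are all invented for illustration and are not part of the Python API. obj_xdecref() plays the role of Py_XDECREF() by ignoring NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy stand-in for PyObject: just a reference count. */
typedef struct {
    int refcnt;
} Obj;

static int obj_live = 0;            /* objects allocated and not yet freed */

/* Returns a "new reference" with a count of 1, or NULL on failure. */
static Obj *obj_new(void)
{
    Obj *op = malloc(sizeof *op);
    if (op == NULL)
        return NULL;
    op->refcnt = 1;
    obj_live++;
    return op;
}

static void obj_decref(Obj *op)
{
    assert(op != NULL && op->refcnt > 0);
    if (--op->refcnt == 0) {
        obj_live--;
        free(op);
    }
}

/* Like obj_decref(), but a NULL pointer is silently ignored. */
static void obj_xdecref(Obj *op)
{
    if (op != NULL)
        obj_decref(op);
}

/* fail_at simulates a failing call: 0 = none, 1 = first, 2 = second. */
static Obj *make_result(int fail_at)
{
    Obj *temporary = NULL;          /* NULL until actually created */
    Obj *result = NULL;

    temporary = (fail_at == 1) ? NULL : obj_new();
    if (temporary == NULL)
        goto error;

    result = (fail_at == 2) ? NULL : obj_new();
    if (result == NULL)
        goto error;

    obj_decref(temporary);          /* done with the temporary */
    return result;                  /* responsibility passes to the caller */

error:
    obj_xdecref(temporary);         /* safe whether or not it was created */
    obj_xdecref(result);
    return NULL;
}
```

With one error label, adding a third temporary means adding one more obj_xdecref() line instead of revisiting every early return.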
From dgoodger@bigfoot.com Mon Jul 9 05:08:19 2001 From: dgoodger@bigfoot.com (David Goodger) Date: Mon, 09 Jul 2001 00:08:19 -0400 Subject: [Doc-SIG] Re: PEP 258: DPS Generic Implementation Details In-Reply-To: <200107071017.LAA04868@inputplus.demon.co.uk> Message-ID: on 2001-07-07 6:17 AM, Ralph Corderoy (ralph@inputplus.demon.co.uk) wrote: > Of the various Wikis I found TWiki's > plain text input format allows quite a lot of structure to be expressed > whilst still being readable as plain text. Perhaps it would present > some ideas. Link? > One thing that it uses to good effect is significant whitespace. > Something familiar to all :-) Old argument; see the archives. For a tongue-in-cheek summary of a recent thread, see: http://structuredtext.sf.net/spec/indentedsections.txt > Allowing multiple docstring formats, whilst perhaps least contentious, > does seem a little like trying to please everyone. Read through the Doc-SIG archives and you'll see that there's never been agreement on any one syntax. The DPS isn't concerned with which syntax is used, but is a generic framework. Saying, "the standard syntax will be X" is a surefire guarantee that the system would never get anywhere. I'm working on one syntax (reStructuredText), which may or may not eventually become *the* syntax, and simultaneously working out the details of the DPS interfaces. "Over time or through decree, a standard format or set of formats should emerge." By allowing multiple formats/syntaxes, we separate the components. In this context, it's "together we fail, divided we may just have a chance." 
--
David Goodger    dgoodger@bigfoot.com    Open-source projects:
 - Python Docstring Processing System: http://docstring.sf.net
 - reStructuredText: http://structuredtext.sf.net
 - The Go Tools Project: http://gotools.sf.net

From ralph@inputplus.demon.co.uk Mon Jul 9 10:47:54 2001
From: ralph@inputplus.demon.co.uk (Ralph Corderoy)
Date: Mon, 09 Jul 2001 10:47:54 +0100
Subject: [Doc-SIG] Re: PEP 258: DPS Generic Implementation Details
In-Reply-To: Message from David Goodger of "Mon, 09 Jul 2001 00:08:19 EDT."
Message-ID: <200107090947.KAA04719@inputplus.demon.co.uk>

Hi David,

> > Of the various Wikis I found TWiki's plain text input format allows
> > quite a lot of structure to be expressed whilst still being
> > readable as plain text. Perhaps it would present some ideas.
>
> Link?

http://twiki.org/
http://TWiki.org/cgi-bin/view/TWiki/TextFormattingRules

Ignoring its acceptance of HTML it provides ASCII ways to specify tables, bullet lists, etc.

Thanks for the references, I'll have a read.

Ralph.

From martin@loewis.home.cs.tu-berlin.de Mon Jul 9 16:09:51 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 9 Jul 2001 17:09:51 +0200
Subject: [Doc-SIG] A DTD for Python documentation
Message-ID: <200107091509.f69F9p204117@mira.informatik.hu-berlin.de>

--Multipart_Mon_Jul__9_17:09:51_2001-1
Content-Type: text/plain; charset=US-ASCII

Being back from the European Python Meeting, I found some time to look into writing a DTD for the Python documentation. As proposed in Bordeaux, I ran the current sgmlconv tools on the documentation, and then ran DDbE on the resulting XML files. The resulting DTD is attached; it shall be known as

+//IDN python.org//DTD Python Documentation 1.0//EN//XML

In the near future, I plan to make a few simplifications on this DTD, in particular simplify the section and subsection content models, and remove the DDbE annotations.
In addition, I'll try introducing ID-valued attributes for a number of elements, to allow better tracking of documentation changes - in particular for the purposes of translation.

Regards,
Martin

P.S. If anybody wants the Schema generated by DDbE, please let me know.

--Multipart_Mon_Jul__9_17:09:51_2001-1
Content-Type: application/octet-stream
Content-Disposition: attachment; filename="pydoc.dtd"
Content-Transfer-Encoding: 8bit

--Multipart_Mon_Jul__9_17:09:51_2001-1
Content-Type: text/plain; charset=US-ASCII

--Multipart_Mon_Jul__9_17:09:51_2001-1--

From martin@loewis.home.cs.tu-berlin.de Mon Jul 9 16:32:09 2001
From: martin@loewis.home.cs.tu-berlin.de (Martin v. Loewis)
Date: Mon, 9 Jul 2001 17:32:09 +0200
Subject: [Doc-SIG] Translation of Python documentation
Message-ID: <200107091532.f69FW9304591@mira.informatik.hu-berlin.de>

At the European Python Meeting, we had a few sessions on translating Python documentation. These were initiated by Benoit Lacherez, who currently manages the French Python translations (frpython.sf.net).

The French translation group currently uses a script to mark up the original documentation, which copies the English text into commented regions. The translator inserts the French translations in-between these regions.

We have discussed versioning of the documentation to some extent, and found the following issues:

1. it is still unclear if and when the documentation will be converted to XML. Having XML might simplify the translation process to some degree, but it will also mean that the existing translations need to be converted, as well.

2. version tracking is quite a challenge. So far, the French translators had problems when documentation moved from one file to another after 1.5.2. However, we anticipate further problems with version changes, like:
   - the order of paragraphs or sections may change
   - changes might merely affect formatting (e.g. line breaking), but a plain diff will display the entire paragraph as changed

3.
it might be desirable to offer "incomplete" translations, which only offers translation when they are available, and English documentation for the rest. To solve these issues, we propose that a) the conversion to XML is done rather sooner than later, b) in the original documents, unique identifications of sections and *desc elements are introduced. These identifications can then be used in the translations to specify correlate the translations with the original text. This might look like capitalize word Capitalize the first character of the argument. A script would need to check whether these are truly unique, and whether they are present in all places (and assign them if they aren't). I assume they can be used for cross-referencing, also. c) some sort of versioning is used in the translations. It is not clear to me what the best approach would be, options include: - attribute each element with an ID also with the CVS version number where this element was last changed. - attribute each such element with a hash value for its contents. Regards, Martin From edcjones@erols.com Mon Jul 9 23:42:14 2001 From: edcjones@erols.com (Edward C. Jones) Date: Mon, 09 Jul 2001 18:42:14 -0400 Subject: [Doc-SIG] Essay on Reference Counts (long), second try Message-ID: <3B4A3346.1020305@erols.com> I have always been confused by reference counts. I put this together to try to clear up my confusion. You are welcome to use any part of this document in writing Python documentation. Ed Jones EE is "Extending and Embedding the Python Interpreter". API is "Python/C API Reference Manual". Paragraphs starting with expressions like "{EE 1.10}" are very similar to paragraphs in the Python documentation. ========================================================================== ========================================================================== Metasummary Some of the Python source code documentation can be usefully duplicated in the API documentation. 
I find the "own", "borrow", "steal" metaphor confusing. I prefer to
restrict the use of "reference" to the meaning "pointer to a
PyObject". If Py_INCREF() has been called for a reference, the
reference is "protected". From the Python coder's point of view, a
reference is an object. Therefore it is reasonable to use "object"
interchangeably with "reference".

1. Summary

1. Every Python object contains a reference counter which is
   incremented by Py_INCREF() and decremented by Py_DECREF(). If the
   counter becomes zero, Python might delete the object.

2. For every call to Py_INCREF(), there should eventually be a call to
   Py_DECREF(). Call Py_INCREF() for objects that you want to keep
   around for a while. Call Py_DECREF() when you are done with them. A
   pointer to an object that has been INCREFed is said to be
   "protected".

3. Most functions INCREF any object that should continue to exist
   after the function exits. This includes objects returned via
   arguments or a return statement. In these cases the calling
   function usually has responsibility for calling Py_DECREF(). The
   calling function can, in turn, pass responsibility for the DECREF
   to _its_ caller.

4. Some functions violate rule 3 by not INCREFing objects they return.
   The standard example is PyTuple_GetItem().

5. It is not necessary to increment an object's reference count for
   every local variable that contains a pointer to an object. But
   don't leave objects unprotected if there is any chance that the
   object will be DECREFed elsewhere.

6. Most functions assume that each argument passed to them is a
   protected object. Therefore the code for the function does not
   INCREF the argument.

7. There are exactly two important functions that are exceptions to
   rule 6: PyTuple_SetItem() and PyList_SetItem(). These functions
   take over responsibility for the item passed to them -- even if
   they fail!

2. Background

2.1 Memory Problems with C/C++

{EE 1.10} In languages like C or C++, the programmer is responsible
for dynamic allocation and deallocation of memory on the heap. In C,
this is done using the functions malloc() and free(). In C++, the
operators new and delete are used with essentially the same meaning;
they are actually implemented using malloc() and free(), so we'll
restrict the following discussion to the latter.

{EE 1.10} Every block of memory allocated with malloc() should
eventually be returned to the pool of available memory by exactly one
call to free(). It is important to call free() at the right time. If a
block's address is forgotten but free() is not called for it, the
memory it occupies cannot be reused until the program terminates. This
is called a memory leak. On the other hand, if a program calls free()
for a block and then continues to use the block, it creates a conflict
with re-use of the block through another malloc() call. This is called
using freed memory. It has the same bad consequences as referencing
uninitialized data -- core dumps, wrong results, mysterious crashes.

{EE 1.10} Common causes of memory leaks are unusual paths through the
code. For instance, a function may allocate a block of memory, do some
calculation, and then free the block again. Now a change in the
requirements for the function may add a test to the calculation that
detects an error condition and can return prematurely from the
function. It's easy to forget to free the allocated memory block when
taking this premature exit, especially when it is added later to the
code. Such leaks, once introduced, often go undetected for a long
time: the error exit is taken only in a small fraction of all calls,
and most modern machines have plenty of virtual memory, so the leak
only becomes apparent in a long-running process that uses the leaking
function frequently.
Therefore, it's important to prevent leaks from happening by having a
coding convention or strategy that minimizes this kind of error.

{EE 1.10} Since Python makes heavy use of malloc() and free(), it
needs a strategy to avoid memory leaks as well as the use of freed
memory. The chosen method is called reference counting. The principle
is simple: every object contains a counter, which is incremented when
a pointer to the object is stored somewhere, and which is decremented
when the pointer is deleted. When the counter reaches zero, the last
pointer to the object has been deleted and the object is freed.

2.2 Python Objects

{object.h} Python objects are structures allocated on the heap. They
have type "PyObject". They are accessed through pointers of type
"PyObject*". Special rules apply to the use of objects to ensure they
are properly garbage-collected. Objects are never allocated statically
or on the stack; they must be accessed through special macros and
functions only. (Type objects are exceptions to the first rule; the
standard types are represented by statically initialized type
objects.)

{object.h} An object has a 'reference count' that is increased or
decreased when a pointer to the object is copied or deleted; when the
reference count reaches zero there are no references to the object
left and it can be removed from the heap.

Every Python object has a reference count and a type (and possibly
more). Here is the relevant code from "object.h", one of the Python
header files. "PyObject" is a C struct. Each instance of "PyObject"
contains the variable "ob_refcnt" where the "reference count" is kept
and "ob_type" where the type is kept.

    #define PyObject_HEAD \
        int ob_refcnt; \
        struct _typeobject *ob_type;

    typedef struct _object {
        PyObject_HEAD
    } PyObject;

{object.h} The "type" of an object determines what it represents and
what kind of data it contains. An object's type is fixed when it is
created.
Types themselves are represented as objects; an object contains a
pointer to the corresponding type object. (The type itself has a type
pointer pointing to the object representing the type 'type', which
contains a pointer to itself!)

{object.h} Objects do not float around in memory; once allocated an
object keeps the same size and address. Objects that must hold
variable-size data can contain pointers to variable-size parts of the
object. Not all objects of the same type have the same size; but the
size cannot change after allocation. (These restrictions are made so a
reference to an object can be simply a pointer -- moving an object
would require updating all the pointers, and changing an object's size
would require moving it if there was another object right next to it.)

{object.h} Objects are always accessed through pointers of the type
'PyObject *'. The type 'PyObject' is the structure shown above that
only contains the reference count and the type pointer. The actual
memory allocated for an object contains other data that can only be
accessed after casting the pointer to a pointer to a longer structure
type. This longer type must start with the reference count and type
fields; the macro PyObject_HEAD above should be used for this (to
accommodate for future changes). The implementation of a particular
object type can cast the object pointer to the proper type and back.

{object.h} A standard interface also exists for objects that contain
an array of items whose size is determined when the object is
allocated. It looks like

    #define PyObject_VAR_HEAD \
        PyObject_HEAD \
        int ob_size; /* Number of items in variable part */

    typedef struct {
        PyObject_VAR_HEAD
    } PyVarObject;

If the reference count for some object becomes zero, Python will
usually reclaim the memory the object uses, making the object
disappear. If an object is used in a piece of code, the code should
make sure that the reference count cannot drop to zero, usually by
adding one to it.
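The effect described above can be watched from the Python side. The
following is only a sketch, assuming a standard CPython build, where
sys.getrefcount() reports the counter and an object is reclaimed as
soon as its count reaches zero. The class "Tracked" and all variable
names are hypothetical and exist only for this illustration.

```python
import sys


class Tracked:
    """Hypothetical class that records when an instance is reclaimed."""

    reclaimed = 0

    def __del__(self):
        # Runs when the interpreter reclaims the instance.
        Tracked.reclaimed += 1


first = Tracked()
count_one_name = sys.getrefcount(first)

second = first                       # a second pointer to the same object
count_two_names = sys.getrefcount(first)

del first                            # one pointer gone, one still left
alive_after_first_del = Tracked.reclaimed == 0

del second                           # last pointer gone: the count reaches zero
reclaimed_after_second_del = Tracked.reclaimed == 1

# Only the difference between the two reported counts is meaningful;
# the absolute numbers include temporary references.
print(count_two_names - count_one_name)
print(alive_after_first_del, reclaimed_after_second_del)
```

The absolute values returned by sys.getrefcount() vary between
interpreter versions, which is why the sketch only compares two
readings taken the same way.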
If the reference count never becomes zero, the memory is never
reclaimed. If it was not intended for the object to be permanent,
there is a memory leak.

3. Py_INCREF() and Py_DECREF()

{object.h} The macros Py_INCREF(op) and Py_DECREF(op) are used to
increment or decrement reference counts. When the count reaches zero,
Py_DECREF() also calls the object's deallocator function; for objects
that don't contain references to other objects or heap memory this can
be the standard function free(). Both macros can be used wherever a
void expression is allowed. The argument shouldn't be a NULL pointer.

{object.h} We assume that the reference count field can never
overflow; this can be proven when the size of the field is the same as
the pointer size but even with a 16-bit reference count field it is
pretty unlikely so we ignore the possibility.

To prevent memory leaks, corresponding to each call to Py_INCREF(),
there must be a call to Py_DECREF(): for each call to Py_INCREF(),
there is a "responsibility" to call Py_DECREF(). This responsibility
can be passed around between functions. The last user of the reference
must call Py_DECREF().

4. When to Use Py_INCREF() and Py_DECREF()

4.1 Objects Returned from Functions

Most Python objects in C code are created by calls to functions in the
"Python/C API". These functions have prototypes that look like:

    PyObject* Py_Something(arguments);

These functions usually (but not always!) call Py_INCREF() before
returning (a pointer to) the new object. Generally, the function that
called Py_Something has the responsibility to call Py_DECREF(). If, in
turn, the function returns the object to its caller, it passes on the
responsibility for the reference. Some things are best understood by
example pseudo-code:

    void MyCode(arguments) {
        PyObject* pyo;
        ...
        pyo = Py_Something(args);

MyCode has responsibility for the reference passed to it by
Py_Something.
When MyCode is done with "pyo", it must call:

        Py_DECREF(pyo);
    }

On the other hand, if MyCode returns "pyo", there must not be a call
to Py_DECREF().

    PyObject* MyCode(arguments) {
        PyObject* pyo;
        ...
        pyo = Py_Something(args);
        ...
        return pyo;
    }

In this situation, MyCode has "passed on the responsibility" for
DECREFing the reference.

Note: if a function is to return None, the C code should look like:

    Py_INCREF(Py_None);
    return Py_None;

Remember to INCREF Py_None!

So far, only the most common case has been discussed, where
"Py_Something" creates a reference and passes responsibility for it to
the caller. In the Python documentation, this is called a "new
reference". For example the documentation says:

    PyObject* PyList_New(int len)
    Return value: New reference.
    Returns a new list of length len on success, or NULL on failure.

The documentation uses the word "reference" in two closely related
ways: a pointer to a PyObject and the responsibility to DECREF the
object. I will try to use "reference" in the first sense only. When a
reference has been INCREFed, I prefer to say that the reference is
"protected". I will often get sloppy and say "object" when I mean
"reference". (After all, what a C programmer sees as a "PyObject*" the
Python programmer sees as an "object".)

But sometimes the Python source code DOES NOT CALL Py_INCREF() before
returning an object:

    PyObject *
    PyTuple_GetItem(register PyObject *op, register int i)
    {
        if (!PyTuple_Check(op)) {
            PyErr_BadInternalCall();
            return NULL;
        }
        if (i < 0 || i >= ((PyTupleObject *)op) -> ob_size) {
            PyErr_SetString(PyExc_IndexError, "tuple index out of range");
            return NULL;
        }
        return ((PyTupleObject *)op) -> ob_item[i];
    }

In the documentation, this is referred to as "borrowing" a reference:

    PyObject* PyTuple_GetItem(PyObject *p, int pos)
    Return value: Borrowed reference.
    Returns the object at position pos in the tuple pointed to by p.
    If pos is out of bounds, returns NULL and sets an IndexError
    exception.
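What a borrowed return value implies can be sketched from Python, with
one loud assumption: a weak reference is used here as a stand-in for
the pointer returned by PyTuple_GetItem(), because both observe an
object without incrementing its count. The analogy is loose. A weak
reference safely turns into None when the object is freed, whereas the
C pointer is simply left dangling. The class "Item" and the variable
names are hypothetical, and the immediate reclamation is CPython
behaviour.

```python
import weakref


class Item:
    """Hypothetical payload; plain object() instances cannot be weakly referenced."""


container = (Item(),)                # the tuple holds the only counted reference

# The weak reference observes the item without incrementing its count.
probe = weakref.ref(container[0])
valid_while_tuple_lives = probe() is not None

del container                        # the tuple is freed and DECREFs its item
valid_after_tuple_freed = probe() is not None

print(valid_while_tuple_lives, valid_after_tuple_freed)
```

The item stays usable only for as long as the tuple keeps its own
reference, which is the situation a C caller is in after
PyTuple_GetItem() unless it calls Py_INCREF() itself.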
I prefer to say that the reference (in the sense I use it) is left
"unprotected".

Functions returning unprotected references (borrowing a reference)
are: PyTuple_GetItem(), PyList_GetItem(), PyList_GET_ITEM(),
PyList_SET_ITEM(), PyDict_GetItem(), PyDict_GetItemString(),
PyErr_Occurred(), PyFile_Name(), PyImport_GetModuleDict(),
PyModule_GetDict(), PyImport_AddModule(), PyObject_Init(),
Py_InitModule(), Py_InitModule3(), Py_InitModule4(), and
PySequence_Fast_GET_ITEM().

{EE 10.1.2} The function PyImport_AddModule() does not INCREF the
reference it returns even though it may actually create the object the
reference refers to: this is possible because the object is INCREFed
when it is stored in sys.modules.

See also PyArg_ParseTuple() in Extending and Embedding, 1.7. This
function sometimes returns PyObjects back to the caller through its
arguments. An example from sysmodule.c is:

    static PyObject *
    sys_getrefcount(PyObject *self, PyObject *args)
    {
        PyObject *arg;
        if (!PyArg_ParseTuple(args, "O:getrefcount", &arg))
            return NULL;
        return PyInt_FromLong(arg->ob_refcnt);
    }

"arg" is an unprotected object. It should not be DECREFed before
leaving the function (because it was never INCREFed!).

{API 1.2.1.1} Here is an example of how you could write a function
that computes the sum of the items in a list of integers, here using
PySequence_GetItem() and later using PyList_GetItem().

    long sum_sequence(PyObject *sequence)
    {
        int i, n;
        long total = 0;
        PyObject *item;

        n = PySequence_Length(sequence);
        if (n < 0)
            return -1; /* Has no length. */
        /* Caller should use PyErr_Occurred() if a -1 is returned. */
        for (i = 0; i < n; i++) {
            /* PySequence_GetItem INCREFs item. */
            item = PySequence_GetItem(sequence, i);
            if (item == NULL)
                return -1; /* Not a sequence, or other failure */
            if (PyInt_Check(item))
                total += PyInt_AsLong(item);
            Py_DECREF(item);
        }
        return total;
    }

4.1.1 When to be sloppy with unINCREFed objects

{API 1.2.1} It is not necessary to increment an object's reference
count for every local variable that contains a pointer to an object.
In theory, the object's reference count should be increased by one
when the variable is made to point to it and decreased by one when the
variable goes out of scope. However, these two cancel each other out,
so at the end the reference count hasn't changed. The only real reason
to use the reference count is to prevent the object from being
deallocated as long as our variable is pointing to it. If we know that
there is at least one other reference to the object that lives at
least as long as our variable, there is no need to increment the
reference count temporarily.

{API 1.2.1} An important situation where this arises is for objects
that are passed as arguments to C functions in an extension module
that are called from Python; the call mechanism guarantees to hold a
reference to every argument for the duration of the call.

Here is the summing example again, as "sum_list", this time using
PyList_GetItem().

    long sum_list(PyObject *list)
    {
        int i, n;
        long total = 0;
        PyObject *item;

        n = PyList_Size(list);
        if (n < 0)
            return -1; /* Not a list */
        /* Caller should use PyErr_Occurred() if a -1 is returned. */
        for (i = 0; i < n; i++) {
            /* PyList_GetItem does not INCREF "item".
               "item" is unprotected. */
            item = PyList_GetItem(list, i); /* Can't fail */
            if (PyInt_Check(item))
                total += PyInt_AsLong(item);
        }
        return total;
    }

4.1.2 Thin Ice: When not to be sloppy with INCREF

Don't leave an object unprotected if there is any chance that it will
be DECREFed.

{API 1.2.1} Subtle bugs can occur when a reference is unprotected.
A common example is to extract an object from a list and hold on to it
for a while without incrementing its reference count. Some other
operation might conceivably remove the object from the list,
decrementing its reference count and possibly deallocating it. The
real danger is that innocent-looking operations may invoke arbitrary
Python code which could do this; there is a code path which allows
control to flow back to the Python user from a Py_DECREF(), so almost
any operation is potentially dangerous. For example:

    void bug(PyObject *list)
    {
        PyObject *item = PyList_GetItem(list, 0);

        PyList_SetItem(list, 1, PyInt_FromLong(0L));
        PyObject_Print(item, stdout, 0); /* BUG! */
    }

{EE 1.10.3} The PyObject "item" is obtained using PyList_GetItem and
left unprotected. The code then replaces list[1] with the value 0, and
finally prints "item". Looks harmless, right? But it's not!

{EE 1.10.3} Let's follow the control flow into PyList_SetItem(). The
list has protected references to all its items, so when item 1 is
replaced, it has to dispose of the original item 1 by DECREFing it.
Now let's suppose the original item 1 was an instance of a
user-defined class, and let's further suppose that the class defined a
__del__() method. If this class instance has a reference count of 1,
disposing of it will call its __del__() method.

{EE 1.10.3} Since it is written in Python, the __del__() method can
execute arbitrary Python code. Could it perhaps do something to
invalidate the reference to item in bug()? You bet! Assuming that the
list passed into bug() is accessible to the __del__() method, it could
execute a statement to the effect of "del list[0]", and assuming this
was the last reference to that object, it would free the memory
associated with it, thereby invalidating item.

{EE 1.10.3} The solution, once you know the source of the problem, is
easy: temporarily increment the reference count.
The correct version of the function reads:

    void no_bug(PyObject *list)
    {
        PyObject *item = PyList_GetItem(list, 0);

        Py_INCREF(item); /* Protect item. */
        PyList_SetItem(list, 1, PyInt_FromLong(0L));
        PyObject_Print(item, stdout, 0);
        Py_DECREF(item);
    }

{EE 1.10.3} This is a true story. An older version of Python contained
variants of this bug and someone spent a considerable amount of time
in a C debugger to figure out why his __del__() methods would fail...

{EE 1.10.3} The second case of problems with unprotected objects is a
variant involving threads. Normally, multiple threads in the Python
interpreter can't get in each other's way, because there is a global
lock protecting Python's entire object space. However, it is possible
to temporarily release this lock using the macro
Py_BEGIN_ALLOW_THREADS, and to re-acquire it using
Py_END_ALLOW_THREADS. This is common around blocking I/O calls, to let
other threads use the CPU while waiting for the I/O to complete.
Obviously, the following function has the same problem as the previous
one:

    void bug(PyObject *list)
    {
        PyObject *item = PyList_GetItem(list, 0);

        Py_BEGIN_ALLOW_THREADS
        ...some blocking I/O call...
        Py_END_ALLOW_THREADS
        PyObject_Print(item, stdout, 0); /* BUG! */
    }

4.2 Objects Passed to Functions

So far, we have looked at references to objects returned from
functions. Now consider what happens when an object is passed to a
function. To fix the ideas consider:

    int Caller(void)
    {
        PyObject* pyo;
        ...
        Function(pyo);
        ...
    }

Most functions assume that the arguments passed to them are already
protected. Therefore Py_INCREF() is not called inside "Function"
unless "Function" wants the argument to continue to exist after
"Caller" exits. In the documentation, "Function" is said to "borrow" a
reference:

{EE 10.1.2} When you pass an object reference into another function,
in general, the function borrows the reference from you -- if it needs
to store it, it will use Py_INCREF() to become an independent owner.
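The difference between a function that only uses its argument and one
that stores it can be observed from Python. This is a sketch only,
assuming a standard CPython build where sys.getrefcount() exposes the
counter. The functions "only_uses" and "stores" and the list "kept"
are hypothetical names made up for the illustration.

```python
import sys

item = object()                      # fresh object that nothing else points to
kept = []


def only_uses(obj):
    """Looks at its argument but keeps no pointer to it."""
    return repr(obj)


def stores(obj):
    """Keeps a pointer to its argument, so the list takes its own reference."""
    kept.append(obj)


before = sys.getrefcount(item)

only_uses(item)
after_use = sys.getrefcount(item)    # nothing was kept by the callee

stores(item)
after_store = sys.getrefcount(item)  # the list now holds a reference

kept.clear()
after_clear = sys.getrefcount(item)  # the list gave its reference back

# Differences relative to the first reading.
print(after_use - before, after_store - before, after_clear - before)
```

On CPython the three differences are expected to be 0, 1 and 0: the
count returns to its starting value after a call that merely borrows,
and stays one higher for as long as the callee keeps what it stored.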
"PyDict_SetItem()" can serve as an example of normal behavior. Putting something in a dictionary is "storing" it. Therefore "PyDict_SetItem()" INCREFs both the key and the value. {EE 10.1.2} There are exactly two important functions that do not behave in this normal way: PyTuple_SetItem() and PyList_SetItem(). These functions take over responsibility of the item passed to them -- even if they fail! The Python documentation uses the phrase "steal a reference" to mean "takes over responsibility". Here is what PyTuple_SetItem(atuple, i, item) does: If "atuple[i]" currently contains a PyObject, that PyObject is DECREFed. Then "atuple[i]" is set to "item". "item" is *not* INCREFed. If PyTuple_SetItem() fails to insert "item", it *decrements* the reference count for "item". Similarly, PyTuple_GetItem() does not increment the returned item's reference count. Metaphorically, PyTuple_SetItem() grabs responsibility for a reference to "item" from you. If "item" is unprotected, PyTuple_SetItem() might DECREF it anyway which can crash Python. Other exceptions are PyList_SET_ITEM(), PyModule_AddObject(), and PyTuple_SET_ITEM(). Look at this piece of code: PyObject *t; PyObject *x; x = PyInt_FromLong(1L); At this point x has a reference count of one. When you are done with it, you normally would call Py_DECREF(x). But if if PyTuple_SetItem is called: PyTuple_SetItem(t, 0, x); you must not call Py_DECREF(). PyTuple_SetItem() will call it for you: when the tuple is DECREFed, the items will be also. {API 1.2.1.1} PyTuple_SetItem(), et. al, were designed to take over responsibility for a reference because of a common idiom for populating a tuple or list with newly created objects; for example, the code to create the tuple (1, 2, "three") could look like this (forgetting about error handling for the moment). It is better coding practice to use the less confusing PySequence family of functions as below. 
PyObject *t; t = PyTuple_New(3); PyTuple_SetItem(t, 0, PyInt_FromLong(1L)); PyTuple_SetItem(t, 1, PyInt_FromLong(2L)); PyTuple_SetItem(t, 2, PyString_FromString("three")); {API 1.2.1.1} Incidentally, PyTuple_SetItem() is the only way to set tuple items; PySequence_SetItem() and PyObject_SetItem() refuse to do this since tuples are an immutable data type. You should only use PyTuple_SetItem() for tuples that you are creating yourself. {API 1.2.1.1} Equivalent code for populating a list can be written using PyList_New() and PyList_SetItem(). Such code can also use PySequence_SetItem(); this illustrates the difference between the two (the extra Py_DECREF() calls): PyObject *l, *x; l = PyList_New(3); x = PyInt_FromLong(1L); PySequence_SetItem(l, 0, x); Py_DECREF(x); x = PyInt_FromLong(2L); PySequence_SetItem(l, 1, x); Py_DECREF(x); x = PyString_FromString("three"); PySequence_SetItem(l, 2, x); Py_DECREF(x); {API 1.2.1.1} You might find it strange that the 'better coding practice' takes more code. However, you are unlikely to often use these ways of creating and populating a tuple or list. There's a generic function, Py_BuildValue(), that can create most common objects from C values, directed by a format string. For example, the above two blocks of code could be replaced by the following (which also takes care of the error checking): PyObject *t, *l; t = Py_BuildValue("(iis)", 1, 2, "three"); l = Py_BuildValue("[iis]", 1, 2, "three"); {API 1.2.1.1} It is much more common to use PyObject_SetItem() and friends with protected objects (ie, the reference count was incremented before passing the item to you), a typical example being arguments that were passed in to the function you are writing. In that case, their behaviour regarding reference counts has a simpler appearance, since you don't have to do anything at all with reference counts. 
For example, this function sets all items of a list (actually, any
mutable sequence) to a given item:

    int set_all(PyObject *target, PyObject *item)
    {
        int i, n;

        n = PyObject_Length(target);
        if (n < 0)
            return -1;
        for (i = 0; i < n; i++) {
            /* PySequence_SetItem() takes an integer index and INCREFs
               "item" itself; PyObject_SetItem() would need a PyObject*
               key instead. */
            if (PySequence_SetItem(target, i, item) < 0)
                return -1;
        }
        return 0;
    }

5. Two Examples

Example 1.

This is a pretty standard example of C code using the Python API.

    PyObject* MyFunction(void)
    {
        PyObject* temporary_list=NULL;
        PyObject* return_this=NULL;

        temporary_list = PyList_New(1);  /* Note 1 */
        if (temporary_list == NULL)
            return NULL;

        return_this = PyList_New(1);  /* Note 1 */
        if (return_this == NULL) {
            Py_DECREF(temporary_list);  /* Note 2 */
            return NULL;
        }

        Py_DECREF(temporary_list);  /* Note 2 */
        return return_this;
    }

Note 1: The object returned by PyList_New has a reference count of 1.

Note 2: Since "temporary_list" should disappear when MyFunction exits,
it must be DECREFed before any return from the function. If a return
can be reached both before and after "temporary_list" is created, then
initialize "temporary_list" to NULL and use "Py_XDECREF()".

Example 2.

This is the same as Example 1 except PyTuple_GetItem() is used.

    PyObject* MyFunction(void)
    {
        PyObject* temporary=NULL;
        PyObject* return_this=NULL;
        PyObject* tup;
        PyObject* num;
        int err;

        tup = PyTuple_New(2);
        if (tup == NULL)
            return NULL;

        err = PyTuple_SetItem(tup, 0, PyInt_FromLong(222L));  /* Note 1 */
        if (err) {
            Py_DECREF(tup);
            return NULL;
        }
        err = PyTuple_SetItem(tup, 1, PyInt_FromLong(333L));  /* Note 1 */
        if (err) {
            Py_DECREF(tup);
            return NULL;
        }

        temporary = PyTuple_GetItem(tup, 0);  /* Note 2 */
        if (temporary == NULL) {
            Py_DECREF(tup);
            return NULL;
        }

        return_this = PyTuple_GetItem(tup, 1);  /* Note 3 */
        if (return_this == NULL) {
            Py_DECREF(tup);  /* Note 3 */
            return NULL;
        }

        /* Protect "return_this" before "tup" goes away: freeing the
           tuple DECREFs its items, and the caller expects to receive
           a protected reference. */
        Py_INCREF(return_this);

        /* Note 3 */
        Py_DECREF(tup);
        return return_this;
    }

Note 1: If "PyTuple_SetItem" fails, or if the tuple the item was
stored in is DECREFed to 0, then the object returned by
"PyInt_FromLong" is DECREFed.
Note 2: "PyTuple_GetItem" does not increment the reference count for
the object it returns.

Note 3: You have no responsibility for DECREFing "temporary".


From dgoodger@bigfoot.com Tue Jul 10 02:43:41 2001
From: dgoodger@bigfoot.com (David Goodger)
Date: Mon, 09 Jul 2001 21:43:41 -0400
Subject: [Doc-SIG] Re: reStructuredText
In-Reply-To:
Message-ID:

Hi Simon,

on 2001-07-09 5:08 PM, Simon Hefti (hefti@netcetera.ch) wrote:
> I have been reading your reStructuredText Specs and I like it a lot.

Thank you. I've copied my response to doc-sig; I hope you don't mind.

> I have a few comments:
>
> 1) what is the reason to choose ".." as comment tag ?
>    I should think that "#" would be more intuitive for most of us.

It comes from Setext. I find ".." less obtrusive than "#". Plus, it's
not just a comment marker; it handles hyperlinks, footnotes, and
directives too.

>    Also: is it "^..", or can comments start everywhere in the line ?

They start at the left edge of the current indentation. Indent further
and they start a block quote. They can end an indent too (correctly,
of course).

> 2) If I understood this correctly, your `some text here`_ syntax
>    implies that each "key" must be unique. That may not be true
>    for section titles.

From the working reStructuredText.txt (not yet on the web site), end
of "Section Structure":

    Each section title automatically generates a hyperlink target
    pointing to the section. The text of the hyperlink target is the
    same as that of the section title. See `Implicit Hyperlink
    Targets`_ for a complete description.

And further on::

    Implicit Hyperlink Targets
    ``````````````````````````

    Implicit hyperlink targets are generated by section titles, and
    may also be generated by extension constructs. Implicit hyperlink
    targets behave identically to explicit `hyperlink targets`_.

    Problems of ambiguity due to conflicting duplicate implicit and
    explicit hyperlink names are avoided because the reStructuredText
    parser follows these rules:

    1. Explicit hyperlink targets override any implicit targets having
       the same hyperlink name. The implicit hyperlink targets are
       removed, and level-0 system warnings are inserted.

    2. If two or more sections have the same title (such as
       "Introduction" subsections of a rigidly-structured document),
       there will be duplicate implicit target hyperlink names.
       Solution: all duplicate hyperlink targets are removed, and
       level-0 system warnings inserted.

    3. If there are duplicate explicit hyperlink target names, all
       duplicates are removed, and level-1 system warnings inserted.

    System warnings are inserted where target links have been removed.
    See 'Error Handling' in `PEP 258`_.

    The parser must return a set of *unique* hyperlink targets. The
    calling software (such as the `Python Docstring Processing
    System`_) can warn of unresolvable links, giving reasons for the
    warnings.

> 3) how do you handle line breaks in link definitions ?

Link names are whitespace-normalized. Multiple spaces or newlines are
converted to a single space. I'll spell this out better in the spec.

I'm also toying with the idea of removing leading numbers from
implicit link names, so a section titled "3. Conclusion" can be
referred to by "Conclusion_" (i.e., without the "3.").

>    Is `this is a very very very very very very very very
>    very very long tag`_ a valid tag ?

Yes.

>    Section title ?

The idea of multi-line section titles has come up. I've yet to be
convinced they're necessary (80 characters ought to be enough). I'm
currently working on a parser, which recognizes single-line titles;
multi-line titles would be harder to parse (both by software and by
the human eye/brain). Arguments pro/con anybody?

> 4) I guess that an "include" statement could be very
>    helpful, e.g. for books, bibliography, commonly used definitions
>    and so on. If you plan to delegate it to the directives, I think
>    it should be a required one.

Easy enough to implement.

> 5) More directives I would wish to see:
>    - TOC

Implementable.

>    - glossary (in addition to footnotes)

Please elaborate. Why not just a section called "Glossary" containing
a definition list?

> 6) I like your table tag. Still, I think in some cases
>    one could appreciate an additional, simplified syntax like::
>
>    +----------
>    | key | value
>    +----------
>    | foo | bar
>    | bar | long line follows here ...............................
>    +----------
>
>    From my experience, most tables are very simple: one key, a few
>    values.

If you're saying, let the right-hand border be optional, it's
possible. Opinions?

> 7) Also from experience, sometimes one wishes to force a line
>    or page break (like the <br> tag).

Use literal blocks, or implement a directive. No obvious syntax for
<br> comes to mind. I'd be reluctant to add it anyhow (purely
presentational). This isn't XML, after all; we have to live with some
limitations.

> 8) why should URLs be case insensitive ? Servlet Engines
>    typically are not, as well as querystrings.

URLs aren't specified as case-insensitive. Internal/indirect
hyperlinks are. All would be case-preserving regardless. Link_ and
link_ and LinK_ would all refer to the same element. The case and
spacing of the processed, visible text wouldn't be altered.

> 9) a typo: "hhere"

Wwhere? :-)

Thank you for your feedback.

--
David Goodger    dgoodger@bigfoot.com    Open-source projects:
 - Python Docstring Processing System: http://docstring.sf.net
 - reStructuredText: http://structuredtext.sf.net
 - The Go Tools Project: http://gotools.sf.net


From fdrake@acm.org Tue Jul 10 17:27:12 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Tue, 10 Jul 2001 12:27:12 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010710162712.D2A8F2892B@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Updated to reflect the recent checkins for the Python/C API manual,
which cover a number of the object creation and initialization
functions.


From wwwjessie@21cn.com Thu Jul 12 11:01:48 2001
From: wwwjessie@21cn.com (wwwjessie@21cn.com)
Date: Thu, 12 Jul 2001 18:01:48 +0800
Subject: [Doc-SIG] =?gb2312?B?xvPStcnPzfijrNK7sr21vc67KFlvdXIgb25saW5lIGNvbXBhbnkp?=
Message-ID: <34e8a01c10ab9$aa80f490$9300a8c0@ifood1gongxing>

This is a multi-part message in MIME format.
------=_NextPart_000_34E8B_01C10AFC.B8A43490 Content-Type: text/plain; charset="gb2312" Content-Transfer-Encoding: base64 1/C+tLXEu+HUsaOsxPq6w6Oh0rzKs8a31tC5+s34t/7O8dDFz6K5qcT6ss6/vKO6ICANCg0K07XT 0NfUvLq1xM34yc+5q8u+o6zVucq+uavLvrL6xre6zbf+zvGjrMzhuN/G89K1vrrV+cGmLMT609DB vdbW0aHU8aO6DQoNCjEvIM341b62qNbGIDxodHRwOi8vd3d3Lmlmb29kMS5jb20vYWJvdXR1cy9v dXJzZXJ2aWNlcy93ZWIuYXNwPiAgOg0K19S8us6su6S4/NDCo6y53MDtx7DMqLrzzKijrLj5vt3G 89K10OjSqqOsvajBotfUvLq1xM34yc+5q8u+o6zK/b7dv+LEo7/pyM7E+tGh1PGjusnMx+nQxc+i t6KyvCzN+MnPsvrGt9W5yr6jrL/Nu6e3/s7x1tDQxCzN+MnPubrO78+1zbMsv827p7nYDQrPtbnc wO0szfjJz8LbzLMszfjJz7vh0unW0NDELM34yc/V0Ma4LM22xrHPtc2zLNfKwc/PwtTY1tDQxCzO yr7ttfey6Swg1dCx6rLJubrPtc2zLLfDzsrV382zvMa31s72LCDBxMzsytIovbvB96GizLjF0Cmh raGtDQoNCs/rwcu94sr9vt2/4sSjv+nR3cq+1tDQxKO/x+vBqs+1o7ogc2FsZXNAaWZvb2QxLmNv bSA8bWFpbHRvOnNhbGVzQGlmb29kMS5jb20+DQqhobXnu7CjujA3NTUtMzc4NjMwOaGhz/rK27K/ yfLQob3jDQoNCjIvINK8zfjNqCA8aHR0cDovL29uZXQuaWZvb2QxLmNvbS8+DQot19TW+sq9vajN +KOsstnX97zytaWjrLy0vai8tNPDo7q/ydW5yr4zMNXFu/K4/Lbg1dXGrKOs19TW+sq9zqy7pKOs v8nL5sqxuPzQws28xqy6zc7E19bE2sjdo6zU2s/ft6KyvLL6xrfQxc+ioaK5q8u+tq/MrLXIo6zU +cvNtv68trn6vMrT8sP7KA0KyOdodHRwOi8veW91cm5hbWUuaWZvb2QxLmNvbSmjrNPr0rzKs8a3 1tC5+s34KNKzw+bkr8DAwb/UwtPiMjAwzfK0zim99MPcway906OszOG438LyvNK6zbnLv823w87K wb+jrLaoxtrK1bW90rzKsw0KxrfW0Ln6zfjM4bmptcS/zbun0OjH87rNssm5utDFz6Khow0KDQoN Cg0KN9TCMzDI1cewyerH67KiuLa/7sq508PSvM34zaijrMzYsfDTxbvdvNszODAw1KovxOqjrNT5 y83M9cLrueO45rKiw+K30dTayrPGt9eo0rXU09a+v6+1x7mpo6zH86OstPrA7aOsus/X99DFz6IN Cs/rwcu94rj8tuA/IKGhx+vBqs+1o7ogc2FsZXNAaWZvb2QxLmNvbSA8bWFpbHRvOnNhbGVzQGlm b29kMS5jb20+DQqhobXnu7CjujA3NTUtMzc4NjMwOaGhoaHP+srbsr/J8tChveMNCrvyILfDzsrO 0sPHtcTN+NKzIDxodHRwOi8vd3d3Lmlmb29kMS5jb20vYWJvdXR1cy9vdXJzZXJ2aWNlcy9jcHNl cnZpY2UuYXNwPg0KOnd3dy5pZm9vZDEuY29tDQoNCrvY1rSjqMfrtKvV5qO6MDc1NS0zMjM5MDQ3 u/K3orXn19PTyrz+o7ogc2FsZXNAaWZvb2QxLmNvbSA8bWFpbHRvOnNhbGVzQGlmb29kMS5jb20+ IKOpDQoNCqH1ILG+uavLvrbUzfjVvrao1sa40NDLyKShoaGhICAgICAgICAgICAgICAgICAgICAg 
ofUgsb65q8u+ttTSvM34zai3/s7xuNDQy8ikDQoNCrmry77D+7PGo7pfX19fX19fX19fX19fX19f X19fX19fX19fX19fX19fX19fX19fX19fX19fX1/Bqs+1yMujul9fX19fX19fX19fX19fX19fXw0K X19fX18gDQoNCrXnu7Cjul9fX19fX19fX19fX19fX19fX19fX7Sr1eajul9fX19fX19fX19fX19f X19fX19fX19FLW1haWyjul9fX19fX19fX19fX19fX18NCl9fX19fXyANCg0K ------=_NextPart_000_34E8B_01C10AFC.B8A43490 Content-Type: text/html; charset="gb2312" Content-Transfer-Encoding: base64 PEhUTUw+DQo8SEVBRD4NCjxUSVRMRT5VbnRpdGxlZCBEb2N1bWVudDwvVElUTEU+IDxNRVRBIEhU VFAtRVFVSVY9IkNvbnRlbnQtVHlwZSIgQ09OVEVOVD0idGV4dC9odG1sOyBjaGFyc2V0PWdiMjMx MiI+IA0KPC9IRUFEPg0KDQo8Qk9EWSBCR0NPTE9SPSIjRkZGRkZGIiBURVhUPSIjMDAwMDAwIj4N CjxUQUJMRSBXSURUSD0iOTglIiBCT1JERVI9IjAiIENFTExTUEFDSU5HPSIwIiBDRUxMUEFERElO Rz0iMCI+PFRSPjxURD48UCBDTEFTUz1Nc29Ob3JtYWwgU1RZTEU9J21hcmdpbi1yaWdodDotMTcu ODVwdDtsaW5lLWhlaWdodDoxNTAlJz48Rk9OVCBTSVpFPSIyIj7X8L60tcS74dSxo6zE+rrDo6HS vMqzxrfW0Ln6zfi3/s7x0MXPormpxPqyzr+8o7ombmJzcDs8L0ZPTlQ+IA0KPC9QPjxQIENMQVNT PU1zb05vcm1hbCBTVFlMRT0nbWFyZ2luLXJpZ2h0Oi0xNy44NXB0O2xpbmUtaGVpZ2h0OjE1MCUn PjxGT05UIFNJWkU9IjIiPtO109DX1Ly6tcTN+MnPuavLvqOs1bnKvrmry76y+sa3us23/s7xo6zM 4bjfxvPStb661fnBpizE+tPQwb3W1tGh1PGjujxCUj48QlI+MS8gDQo8QQ0KSFJFRj0iaHR0cDov L3d3dy5pZm9vZDEuY29tL2Fib3V0dXMvb3Vyc2VydmljZXMvd2ViLmFzcCI+zfjVvrao1sY8L0E+ IDog19S8us6su6S4/NDCo6y53MDtx7DMqLrzzKijrLj5vt3G89K10OjSqqOsvajBotfUvLq1xM34 yc+5q8u+o6zK/b7dv+LEo7/pyM7E+tGh1PGjusnMx+nQxc+it6KyvCzN+MnPsvrGt9W5yr6jrL/N u6e3/s7x1tDQxCzN+MnPubrO78+1zbMsv827p7nYz7W53MDtLM34yc/C28yzLM34yc+74dLp1tDQ xCzN+MnP1dDGuCzNtsaxz7XNsyzXysHPz8LU2NbQ0MQszsq+7bX3suksIA0K1dCx6rLJubrPtc2z LLfDzsrV382zvMa31s72LCDBxMzsytIovbvB96GizLjF0CmhraGtPC9GT05UPjwvUD48UCBDTEFT Uz1Nc29Ob3JtYWwgU1RZTEU9J2xpbmUtaGVpZ2h0OjIwLjBwdCc+PEI+PEZPTlQgQ09MT1I9IiNG RjAwMDAiPs/rwcu94sr9vt2/4sSjv+nR3cq+1tDQxKO/PC9GT05UPjwvQj48Rk9OVCBTSVpFPSIy Ij7H68Gqz7WjujxBIEhSRUY9Im1haWx0bzpzYWxlc0BpZm9vZDEuY29tIj5zYWxlc0BpZm9vZDEu Y29tPC9BPiANCqGhtee7sKO6MDc1NS0zNzg2MzA5oaHP+srbsr/J8tChveM8L0ZPTlQ+PC9QPjxQ 
IENMQVNTPU1zb05vcm1hbCBTVFlMRT0nbGluZS1oZWlnaHQ6MjAuMHB0Jz48L1A+PFAgQ0xBU1M9 TXNvTm9ybWFsIFNUWUxFPSdsaW5lLWhlaWdodDoyMC4wcHQnPjxGT05UIFNJWkU9IjIiPjIvIA0K PEEgSFJFRj0iaHR0cDovL29uZXQuaWZvb2QxLmNvbS8iPtK8zfjNqDwvQT4t19TW+sq9vajN+KOs stnX97zytaWjrLy0vai8tNPDo7q/ydW5yr4zMNXFu/K4/Lbg1dXGrKOs19TW+sq9zqy7pKOsv8nL 5sqxuPzQws28xqy6zc7E19bE2sjdo6zU2s/ft6KyvLL6xrfQxc+ioaK5q8u+tq/MrLXIo6zU+cvN tv68trn6vMrT8sP7KMjnaHR0cDovL3lvdXJuYW1lLmlmb29kMS5jb20po6zT69K8yrPGt9bQufrN +CjSs8Pm5K/AwMG/1MLT4jIwMM3ytM4pvfTD3MGsvdOjrMzhuN/C8rzSus25y7/Nt8POysG/o6y2 qMbaytW1vdK8yrPGt9bQufrN+Mzhuam1xL/Nu6fQ6Mfzus2yybm60MXPoqGjPEJSPjwvRk9OVD48 L1A+PFAgQ0xBU1M9TXNvTm9ybWFsIFNUWUxFPSdtYXJnaW4tcmlnaHQ6LTE3Ljg1cHQ7bGluZS1o ZWlnaHQ6MTUwJSc+PEZPTlQgU0laRT0iMiI+PEJSPjwvRk9OVD4gDQo8Qj48Rk9OVCBDT0xPUj0i I0ZGMDAwMCI+NzwvRk9OVD48L0I+PEZPTlQgQ09MT1I9IiNGRjAwMDAiPjxCPtTCMzDI1cewyerH 67KiuLa/7sq508PSvM34zaijrMzYsfDTxbvdvNszODAw1KovxOqjrNT5y83M9cLrueO45rKiw+K3 0dTayrPGt9eo0rXU09a+v6+1x7mpo6zH86OstPrA7aOsus/X99DFz6I8L0I+PEJSPjwvRk9OVD4g DQo8Rk9OVCBTSVpFPSIyIj7P68HLveK4/LbgPyChocfrwarPtaO6PEEgSFJFRj0ibWFpbHRvOnNh bGVzQGlmb29kMS5jb20iPnNhbGVzQGlmb29kMS5jb208L0E+IA0KoaG157uwo7owNzU1LTM3ODYz MDmhoaGhz/rK27K/yfLQob3jPEJSPjwvRk9OVD48Rk9OVCBTSVpFPSIyIj678jxBDQpIUkVGPSJo dHRwOi8vd3d3Lmlmb29kMS5jb20vYWJvdXR1cy9vdXJzZXJ2aWNlcy9jcHNlcnZpY2UuYXNwIj63 w87KztLDx7XEzfjSszwvQT46d3d3Lmlmb29kMS5jb208L0ZPTlQ+PC9QPjxQIENMQVNTPU1zb05v cm1hbCBTVFlMRT0nbGluZS1oZWlnaHQ6MjAuMHB0JyBBTElHTj0iTEVGVCI+PC9QPjxQIENMQVNT PU1zb05vcm1hbCBBTElHTj1MRUZUIFNUWUxFPSdsaW5lLWhlaWdodDoyMC4wcHQnPjxGT05UIFNJ WkU9IjIiPjxCPrvY1rSjqMfrtKvV5qO6MDc1NS0zMjM5MDQ3u/K3orXn19PTyrz+o7o8L0I+PEEN CkhSRUY9Im1haWx0bzpzYWxlc0BpZm9vZDEuY29tIj5zYWxlc0BpZm9vZDEuY29tIDwvQT48Qj6j qTwvQj48L0ZPTlQ+PC9QPjxQPjxGT05UIFNJWkU9IjIiPqH1IA0Ksb65q8u+ttTN+NW+tqjWxrjQ 0MvIpKGhoaEmbmJzcDsmbmJzcDsgJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7 Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7IA0KJm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7Jm5ic3A7 
Jm5ic3A7IKH1ILG+uavLvrbU0rzN+M2ot/7O8bjQ0MvIpDwvRk9OVD48L1A+PFAgQ0xBU1M9TXNv Tm9ybWFsIFNUWUxFPSdsaW5lLWhlaWdodDoyMC4wcHQnPjxGT05UIFNJWkU9IjIiPrmry77D+7PG o7pfX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX19fX1/Bqs+1yMujul9f X19fX19fX19fX19fX19fX19fX19fIA0KPEJSPiA8QlI+ILXnu7Cjul9fX19fX19fX19fX19fX19f X19fX7Sr1eajul9fX19fX19fX19fX19fX19fX19fX19FLW1haWyjul9fX19fX19fX19fX19fX19f X19fX18gDQo8L0ZPTlQ+PC9QPjxQIENMQVNTPU1zb05vcm1hbCBTVFlMRT0nbGluZS1oZWlnaHQ6 MjAuMHB0Jz48L1A+PC9URD48L1RSPjwvVEFCTEU+IA0KPC9CT0RZPg0KPC9IVE1MPg0K ------=_NextPart_000_34E8B_01C10AFC.B8A43490-- From fdrake@acm.org Thu Jul 12 15:48:03 2001 From: fdrake@acm.org (Fred L. Drake, Jr.) Date: Thu, 12 Jul 2001 10:48:03 -0400 (EDT) Subject: [Doc-SIG] Docs for 2.1.1c1 frozen Message-ID: <15181.47267.392908.120928@cj42289-a.reston1.va.home.com> I'm freezing the Doc/ tree on the release21-maint branch until the 2.1.1c1 release is out. If you find a bug in that version of the docs, please report it via the SourceForge bug tracker, even if you have checkin permission, at least until the freeze is lifted. Thanks! -Fred -- Fred L. Drake, Jr. PythonLabs at Digital Creations From fdrake@cj42289-a.reston1.va.home.com Thu Jul 12 22:37:23 2001 From: fdrake@cj42289-a.reston1.va.home.com (Fred Drake) Date: Thu, 12 Jul 2001 17:37:23 -0400 (EDT) Subject: [Doc-SIG] [maintenance doc updates] Message-ID: <20010712213723.CE4202892B@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/maint-docs/ Final documentation build for Python 2.1.1 release candidate 1. This version is also available at the Python FTP site: ftp://ftp.python.org/pub/python/doc/2.1.1c1/ From fdrake@acm.org Fri Jul 13 00:50:43 2001 From: fdrake@acm.org (Fred L. 
 Drake)
Date: Thu, 12 Jul 2001 19:50:43 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Lots of small updates.

Added Eric Raymond's documentation for the XML-RPC module added to
the standard library.

From esr@thyrsus.com Fri Jul 13 01:04:23 2001
From: esr@thyrsus.com (Eric S. Raymond)
Date: Thu, 12 Jul 2001 20:04:23 -0400
Subject: [Doc-SIG] Re: [Python-Dev] [development doc updates]
In-Reply-To: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com>; from fdrake@acm.org on Thu, Jul 12, 2001 at 07:50:43PM -0400
References: <20010712235043.5A42D2892B@cj42289-a.reston1.va.home.com>
Message-ID: <20010712200423.A13553@thyrsus.com>

Fred L. Drake :
> The development version of the documentation has been updated:
>
>     http://python.sourceforge.net/devel-docs/
>
> Lots of small updates.
>
> Added Eric Raymond's documentation for the XML-RPC module added to
> the standard library.

Calling the effbot! Calling the effbot! Fredrik, please proofread my
stuff and fill in any important bits you think are missing.
--
Eric S. Raymond

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
    -- Benjamin Franklin, Historical Review of Pennsylvania, 1759.

From fdrake@acm.org Sat Jul 14 04:20:22 2001
From: fdrake@acm.org (Fred L.
 Drake)
Date: Fri, 13 Jul 2001 23:20:22 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010714032022.74B0B28927@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

From garth@deadlybloodyserious.com Tue Jul 17 03:20:36 2001
From: garth@deadlybloodyserious.com (Garth T Kidd)
Date: Tue, 17 Jul 2001 12:20:36 +1000
Subject: [Doc-SIG] RE: reStructuredText
In-Reply-To:
Message-ID:

>> Given that there's no parser yet, I suppose now is a good time to
>> consider Wiki markup requirements and how they might be added as
>> extensions or
>> incorporated into the specification?
>
> I'm not a Wiki user, so I don't know what special Wiki
> requirements might look like. Can you elaborate?

See QuickWikiBackground_ below for a quick background on what the hell
Wiki is. Skip to WikiModificationsToReStructuredText_ if you just want
to find out what modifications might be required to the
reStructuredText specification.

Just for a play, I'm going to try writing this mail in
reStructuredText.

.. _QuickWikiBackground:

QuickWikiBackground
===================

Wiki fans, my apologies if any of this is a tad confrontational or
bordering on flamebait [FlameFlame]_. I'm in a hurry. :/

A Wiki is a web-based content management system that makes building
web sites as easy as tunnelling with a magic shovel. Three key reasons
for this ease of use are:

* Wiki is browser based,
* Most of what you type in will display properly, and
* You create links with WordsOfManyCapitalLetters [7]_.

If I'd typed that into a Wiki, what I'd get when I hit the Submit
would be some normal text, a bullet list, and a link called
WordsOfManyCapitalLetters. If I clicked on it, I'd be fed the edit
form for a newly created WordsOfManyCapitalLetters page.

.. _[7]: Properly called WikiNames.
.. _[8]: Purists might want blank lines between those, but I want
   this to be as close to copy-and-paste from email as I can get it.
.. _[9]: I first noticed the need for automatic footnote numbering in
   footnote [1]_. Now I've finally hit the tangle. I've also just
   noticed that if I were rendering this for printing I'd want a
   different format for the explicit link to the footnote --
   specifically, not a subscript.
.. _[FlameFlame]: This document has so much markup and so many bloody
   footnotes [cough]_ it'll be a miracle if anyone manages to read far
   enough to find anything worth flaming.

If I referred to DocstringProcessingSystem, that'd become a link, and
if I clicked on it I'd be editing the DocstringProcessingSystem page.
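
As a rough illustration of the WikiName rule described above, the
following Python sketch picks out words made of two or more
capitalised runs. The regular expression and the function name
find_wiki_names are assumptions made for this example; each real Wiki
has its own, slightly different, rule.

```python
import re

# Assumed rule: a WikiName is two or more runs of "one capital letter
# followed by lowercase letters or digits", with nothing else attached.
WIKI_NAME = re.compile(r"\b(?:[A-Z][a-z0-9]+){2,}\b")


def find_wiki_names(text):
    """Return the words in text that the assumed rule would turn into links."""
    return WIKI_NAME.findall(text)


if __name__ == "__main__":
    sample = "If I referred to DocstringProcessingSystem, that'd become a link"
    print(find_wiki_names(sample))
```

Under this particular rule a name such as reStructuredText is not
picked up, because it starts with a lowercase letter, while MoinMoin
is.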
Creating a new Wiki has so little friction the process can become a
non stop brain dump. I created 150+ pages on a ZWiki in spare time
between meetings in a week. I'm sure others have done more.

To have a play with a Python-based Wiki, check out MoinMoin_.

My biggest problem with MoinMoin is that the default markup drives me
nuts (sorry, Jurgen). I have some other issues (see BlogBlog_
[LameLame]_), but my biggest source of angst is probably the parser
issue. MoinMoin supports plug-in parsers, but none of them grab me
yet. See below.

I'm slightly more comfortable with the markup used in ZWiki_, which is
heavily based on StructuredText. StructuredText has its own problems,
though. In short, it's bloody unpredictable. StructuredTextNG might
fix it, and the BizarStructuredText plug-in contributed to MoinMoin by
RichardJones [1]_

reStructuredText_ seems a lot cleaner to me than either StructuredText
or WikiWiki markup. I have marginal concerns about how "normal" people
will cope with the underscore suffix [2]_ for links, but my reading of
the spec was pleasant enough.

What I'd like to do is develop a reStructuredText plug-in parser for
MoinMoin, also one for MoinWiki:BlogBlog [3]_ (which I've decided is
going to pre-parse pages to XML for various reasons)

.. _MoinMoin: http://moin.sourceforge.net/cgi-bin/moin/moin/
.. _BlogBlog: http://moin.sourceforge.net/cgi-bin/moin/moin/BlogBlog
.. _ZWiki: http://zwiki.org/FrontPage
.. _[LameLame]: I've been promising to finish work on a Wiki clone
   for years now. I did some work on ZWiki modifications (NooZWiki),
   which was fine until I got sick of a) edit conflict error handling
   in Zope, and b) the incredible costs of Zope posting at the time.
   I'm now caught between extending MoinMoin to suit my purposes --
   it's really quite amazing -- or writing my own. I sympathise with
   anyone tempted to consider me a mere meddling troublemaker until I
   finally cough up some code.
.. _NooZWiki: http://zwiki.org/NooZWiki
.. _reStructuredText: http://structuredtext.sf.net
.. _[1]: As you can see, automatically dropping in WikiNames is an
   easy habit to fall into -- you'll find yourself doing it
   accidentally in email. Whilst I'm in a reStructuredText footnote,
   however, surely they should be automatically numbered? [9]_
.. _[2]: The problem is that the underscore becomes a prefix when
   finally pointed. If it's confusing me, it's definitely going to
   confuse my users. They're going to have enough trouble just with
   .. `identifier`: URL.

Is this another paragraph in the [2]_ footnote, or is this in the main
text?

.. _[3]: This is an example of an InterWiki_ link, a streamlined way
   of pointing to a particular page at another known Wiki.
.. _InterWiki: http://moin.sourceforge.net/cgi-bin/moin/moin/InterWiki

.. _WikiModificationsToReStructuredText:

Pant. Wheeze. This is hard work. Especially the footnotes.

WikiModificationsToReStructuredText [4]_ [5]_
===================================

Wiki would need the following from reStructuredText:

* Suddenly, there's markup that doesn't look like punctuation.
  WikiName might well end up being a link.

* As written, a reStructuredText document will always parse the same
  way. Once you introduce WikiNature, it'll parse differently
  depending on which other WikiNames are defined. [BlogBlogXML]_
  [SickOfRememberingNumbers]_ [InternalWikiNames]_

* Square brackets are extremely useful in ZWiki markup to force a
  word that doesn't look like a WikiName to be treated as a link to a
  page of that name. For example, [David]. [10]_

* I dimly remember square brackets also being useful for an inline
  linking representation I don't recall from the reStructuredText
  markup. It's something like [here is some text: URL]. A more
  reStructuredText way of doing it would be something like `here is
  some text`:URL. [11]_ [12]_

* MoinMoin further overloads square brackets for an EXTREMELY [6a]_
  useful macro system.

.. _[BlogBlogXML]: ... which is why I want to preparse documents to
   XML in BlogBlog. , here we come. That brings us back to
   predictable output, which I can then reparse to produce
   appropriate HTML.
.. _[SickOfRememberingNumbers]: Bah! And now, I'm suddenly wondering
   whether underscored link destinations in reStructuredText
   specifically use square brackets to say "this is a footnote", or
   whether that's handled by the presence of non-URL text after the
   colon without an intervening newline.
.. _[InternalWikiNames]: Wouldn't it be nice for other pages in the
   Wiki to be able to refer to this footnote as
   WhateverThisDocumentIsNamed.InternalWikiNames?
.. _[4]: The `MoinMoin heading style`_ seems to have a lot to offer
   reStructuredText, which seems unable to do sub-headings. At least,
   having read the spec, I don't remember how to do it -- and the
   markup really has to be that simple, hence my problems with the
   underscores [1]_ [6]_.
.. _[5]: Another confusion: should the equals signs extend all the
   way, or not? I hope not, otherwise users editing with proportional
   fonts are going to have an awful time.
.. _[6]: Aha! Sometimes, you need [6a]_ to be able to explicitly
   target a previous footnote. Now I'm thinking "named, automatically
   numbered footnotes".
.. _[6a]: Hang on, how do I embolden again? Suddenly, the
   StructuredText *embolden this* format looks lovely. `embolden
   this`*? *`embolden this`*?
.. _`MoinMoin heading style`: http://moin.sourceforge.net/cgi-bin/moin/moin/HelpOnHeadlines
.. _[10]: Some Wikis scan any word with an initial capital to see
   whether or not there's a page of that name, in which case you only
   need the square brackets to force a link to the uncreated page
   when you're first creating it.
.. _[11]: I like the backquotes!
.. _[12]: We need to also consider `here is some text`:WikiName.

My brain hurts. This is harder than it looks. I should never [6a]_
have started with the footnotes. :)

Regards,
Garth.

.. _[cough]: You are in a twisty maze of little passages, all alike.
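
The "named, automatically numbered footnotes" idea from footnote [6]
could work roughly as follows. This is only a sketch in Python: the
function name number_footnotes, the label pattern, and the rule that
labels are numbered in order of first reference are all assumptions
for illustration, not part of the reStructuredText specification.

```python
import re

# Assumed reference syntax: [Label]_ where Label starts with a letter.
FOOTNOTE_REF = re.compile(r"\[([A-Za-z][A-Za-z0-9]*)\]_")


def number_footnotes(text):
    """Replace [Label]_ references with numbers, in order of first use.

    Returns the rewritten text and the label-to-number mapping, so the
    footnote bodies can later be emitted in the same order.
    """
    numbers = {}

    def replace(match):
        label = match.group(1)
        if label not in numbers:
            numbers[label] = len(numbers) + 1
        return "[%d]_" % numbers[label]

    return FOOTNOTE_REF.sub(replace, text), numbers


if __name__ == "__main__":
    new_text, mapping = number_footnotes(
        "See [FlameFlame]_ and [LameLame]_, then [FlameFlame]_ again."
    )
    print(new_text)
    print(mapping)
```

With a scheme like this the writer never has to remember a number,
only a name, and a repeated reference to the same label reuses the
number it was given the first time.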
From fdrake@acm.org Tue Jul 17 19:21:29 2001
From: fdrake@acm.org (Fred L. Drake, Jr.)
Date: Tue, 17 Jul 2001 14:21:29 -0400 (EDT)
Subject: [Doc-SIG] Docs for 2.2a1 frozen
Message-ID: <15188.33321.534672.664230@cj42289-a.reston1.va.home.com>

Please do not make any documentation checkins on the trunk or
descr-branch; we're getting things ready for the 2.2a1 release.

Thanks!

  -Fred

--
Fred L. Drake, Jr.
PythonLabs at Digital Creations

From fdrake@acm.org Wed Jul 18 00:31:07 2001
From: fdrake@acm.org (Fred L. Drake)
Date: Tue, 17 Jul 2001 19:31:07 -0400 (EDT)
Subject: [Doc-SIG] [development doc updates]
Message-ID: <20010717233107.C0EAE2892B@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/devel-docs/

Final update of the 2.2a1 documentation.

From fdrake@beowolf.digicool.com Wed Jul 18 21:10:31 2001
From: fdrake@beowolf.digicool.com (Fred Drake)
Date: Wed, 18 Jul 2001 16:10:31 -0400 (EDT)
Subject: [Doc-SIG] [maintenance doc updates]
Message-ID: <20010718201031.F34352892C@beowolf.digicool.com>

The development version of the documentation has been updated:

    http://python.sourceforge.net/maint-docs/

Current status of the 2.1.1 documentation -- very few changes since
the 2.1.1c1 release.

From garth@deadlybloodyserious.com Thu Jul 19 02:12:56 2001
From: garth@deadlybloodyserious.com (Garth T Kidd)
Date: Thu, 19 Jul 2001 11:12:56 +1000
Subject: [Doc-SIG] RE: reStructuredText
In-Reply-To:
Message-ID:

It just occurred to me:

The lowest friction implementation for a Wiki using reStructuredText
is to implicitly assume that if a _ target doesn't exist in the
current document, the link must be intended to refer to another
document in the system.

This would still require people to add the _ suffix for anything they
want linked, but would eliminate the concerns about non-punctuation
markup and ways of forcing links to pages whose names aren't that
WikiLike.
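
The fallback rule just described might look something like the
following Python sketch. The function names, the wiki_base URL and the
idea of keeping local targets in a plain dictionary are all
hypothetical; the name normalisation follows the case-insensitive
matching described for internal hyperlinks earlier in this thread.

```python
from urllib.parse import quote


def normalize(name):
    """Fold case and collapse whitespace so Link_ and link_ compare equal."""
    return " ".join(name.lower().split())


def resolve(name, local_targets, wiki_base="http://wiki.example/"):
    """Resolve a name_ reference.

    A target defined in the current document wins. Otherwise the
    reference is assumed to name another page in the same Wiki
    (wiki_base is a made-up address for this example).
    """
    key = normalize(name)
    if key in local_targets:
        return local_targets[key]
    return wiki_base + quote(name)


if __name__ == "__main__":
    targets = {normalize("MoinMoin"): "http://moin.sourceforge.net/cgi-bin/moin/moin/"}
    print(resolve("MoinMoin", targets))     # defined in this document
    print(resolve("FnarzleDocs", targets))  # falls through to the Wiki
```

Whether the fallback page is created on demand, or refused because
its name is unreasonable, is then a decision for the Wiki rather than
for the markup.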
I'm concerned, however, that people could end up with some truly
weird names::

    See `the rest of my documentation on the fnarzle system`_ for more
    details.

... would imply a page named "the rest of my... system". I think it's
a tad ugly, and that it might encourage slackness that
WikiNamesOfTooManyWordsLookVerySilly tends to implicitly suppress. :)

On the other hand, the system can always refuse to create pages of
such names.

This brings me back to the potential need for a "short form" of the
link syntax. Consider an attempt to have the above example link to a
more reasonably named FnarzleDocs::

    See `the rest of my documentation on the fnarzle system`_ for more
    details.

    .. _`the rest of my documentation on the fnarzle system`: FnarzleDocs_

... with FnarzleDocs_ being unresolved within the current document,
resolving to an in-system URL, and collapsed by the parser into a
single link with the appropriate text.

I think my users would prefer::

    See `the rest of my documentation on the fnarzle system`:FnarzleDocs
    for more details.

BTW, I just found the section in the spec that permits::

    I think you should `download Python 2.1`:http://www.python.org/2.1/
    before you touch that ugly Perl code.

... which makes my suggestion look fairly reasonable.

Miscellaneous concerns:

- embedding directives like includes or macro calls mid-paragraph.

- being able to recognise indented paragraphs solely by their first
line, so that people can lazily just keep typing (like I'm doing now)
without having to manually terminate lines and indent the next one.
The behaviour is arguably implied by the specification ("This is a
paragraph continuation, not a sublist"), but I'd love explicit
confirmation because it might be argued that an outdent implicitly
terminates the previous paragraph but an indent doesn't. Needless to
say, I'd argue against that.

- underlined style headings; I *really* like MoinMoin's use of a
number of equals signs on each side of a paragraph: if the number of
equals signs is the same, the paragraph is a heading of ident level
equal to the number of equals signs. For example::

    = this is a level 1 heading =

    == this is a level 2 heading ==

    == this is also a level 2 heading despite the fact that it has
    been wrapped for some reason ==

    === this is not a heading because the number of equals signs
    isn't equal . ==

- I would hope that the reStructuredText parser is smart enough to
figure out that the example text above is part of the <li> above?

- What's wrong with an automatically wrapped comment block like the
following? ::

    .. This would be a comment block except this line hasn't been
    indented because some client re-wrapped the lines. Damn!

- I'm trying to figure out how to refer to a target within another
document:

  - \`hyperlink targets`_ in the spec refers to the appropriate
    heading;

  - \`the specification`_ could refer to the spec if there was an
    appropriate target specification later in the document::

        .. `the specification`: reStructuredText

    (implicitly targeting the reStructuredText peer of the referrer);

  - \`the specification`:reStructuredText could do the same given the
    [mild] extension I suggested earlier;

  - Using a dot as a delimiter could be treated by the link resolver
    as an implied sub-target::

        See `hyperlinks`:reStructuredText.`hyperlink targets`.

    ... but that starts to get ugly.

  You can see why it's worrying me. :)

- I hope the parser is able to deal with paragraphs inconveniently
wrapped in the middle of some emphasis::

    This is *emphasised
    text* with an inconvenient wrap point.

... otherwise I'd need an option to require delimiting the text::

    This is *`emphasised text`* with an...

... a shortcut for which could be::

    This is `emphasized text`* with an...

... which matches nicely with the link syntax, at least if you don't
need to embold the link, which sometimes I do::

    This is an `emphasized link`*_ with an...

That's not as ugly as I thought it would be.

- Leading and trailing whitespace should be trimmed from inline
literals so that someone can do this::

    Find the `` `interpreted text` `` in this paragraph!

If I should just make up my own mind on each and submit a diff per to
be discussed|debated|mangled, someone let me know, okay? :)

Regards,
Garth.
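
The MoinMoin-style heading rule described in the list above (equal
runs of equals signs on both sides of a paragraph give a heading of
that level) can be sketched in Python as below. The function name
heading_level and the exact regular expression are assumptions for
illustration; this is not MoinMoin's own parser.

```python
import re

# A run of "=", whitespace, the title, whitespace, a run of "=".
# DOTALL lets the title span a wrapped (multi-line) paragraph.
HEADING = re.compile(r"^(=+)\s+(.*?)\s+(=+)$", re.DOTALL)


def heading_level(paragraph):
    """Return (level, title) if paragraph is a heading, else None."""
    match = HEADING.match(paragraph.strip())
    if match is None:
        return None
    opening, title, closing = match.groups()
    if len(opening) != len(closing):
        return None  # unequal runs of equals signs: ordinary paragraph
    # Collapse any line wrapping inside the title to single spaces.
    return len(opening), " ".join(title.split())


if __name__ == "__main__":
    print(heading_level("= this is a level 1 heading ="))
    print(heading_level("== wrapped\nlevel 2 heading =="))
    print(heading_level("=== not a heading =="))
```

Because the rule looks at a whole paragraph rather than at column
positions, it does not depend on the width of the font used in the
editor.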
From dgoodger@bigfoot.com Thu Jul 19 06:11:15 2001
From: dgoodger@bigfoot.com (David Goodger)
Date: Thu, 19 Jul 2001 01:11:15 -0400
Subject: [Doc-SIG] RE: reStructuredText
In-Reply-To:
Message-ID:

on 2001-07-18 9:12 PM, Garth T Kidd (garth@deadlybloodyserious.com)
wrote:
> It just occurred to me:
>
> The lowest friction implementation for a Wiki using reStructuredText is
> to implicitly assume that if a _ target doesn't exist in the current
> document, the link must be intended to refer to another document in the
> system.

The "create a new page if the link doesn't exist" mechanism is an
application issue. If a Wiki uses reStructuredText, it's free to do
whatever it likes. The markup doesn't need to know about it though.

> This would still require people to add the _ suffix for anything they
> want linked, but would eliminate the concerns about non-punctuation
> markup

Wiki ImplicitLinksUsingCamelCase are so ambiguous, they must cause
lots of problems. Like a discussion about "Old MacDonald"... Better to
have some unambiguous syntax saying "this is a link".

> and ways of forcing links to pages whose names aren't that
> WikiLike. I'm concerned, however, that people could end up with some
> truly weird names::
...
> On the other hand, the system can always refuse to create pages of such names.

Again, an application issue.

> This brings me back to the potential need for a "short form" of the link
> syntax. Consider an attempt to have the above example link to a more
> reasonably named FnarzleDocs::
>
>     See `the rest of my documentation on the fnarzle system`_ for more
>     details.
>
>     .. _`the rest of my documentation on the fnarzle system`: FnarzleDocs_

Multiply-indirect hyperlinks? Interesting idea. I don't know if it's
worth the trouble though. Of course, Wiki users could just write::

    See the rest of my documentation on the fnarzle system
    (FnarzleDocs_) for more details.
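
For what it's worth, following a chain of multiply-indirect targets
need not be much code. The Python sketch below is purely illustrative:
the function name resolve_indirect, the dictionary of targets, and the
convention that a target value ending in an underscore is itself a
reference are assumptions, not anything the specification defines.

```python
def resolve_indirect(name, targets):
    """Follow name_ -> other_ -> ... until something other than a reference.

    targets maps lowercased target names to either a URL or another
    reference written as "Other_". A name with no entry is returned
    unchanged, leaving it to the application (a Wiki, say) to decide
    what it means.
    """
    seen = set()
    current = name
    while True:
        key = current.lower()
        if key in seen:
            raise ValueError("circular hyperlink chain at %r" % current)
        seen.add(key)
        value = targets.get(key)
        if value is None:
            return current        # unresolved in this document
        if not value.endswith("_"):
            return value          # a direct target such as a URL
        current = value[:-1].strip("`")  # drop the "_" and any backquotes


if __name__ == "__main__":
    targets = {"the rest of my documentation on the fnarzle system": "FnarzleDocs_"}
    print(resolve_indirect("the rest of my documentation on the fnarzle system", targets))
```

The set of names already visited is what keeps a pair of targets that
point at each other from looping forever.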
> I think my users would prefer:: > > See `the rest of my documentation on the fnarzle system`:FnarzleDocs > for more details. > > BTW, I just found the section in the spec that permits:: > > I think you should `download Python 2.1`:http://www.python.org/2.1/ > before you touch that ugly Perl code. Which section of which spec? That looks like StructuredText hyperlink markup, which I rejected for reStructuredText. I chose a modified Setext indirect hyperlink style because of WYSIWYG. In the processed page, we don't want the URL of the hyperlink to be visible. In the raw text, having the URL immediately after the link text is distracting; it breaks the flow of the text. > Miscellaneous concerns: > > - embedding directives like includes or macro calls mid-paragraph. Could be done with interpreted text. But is it really necessary? Could you provide some examples? > - being able to recognise indented paragraphs solely by their first > line, so that people can lazily just keep typing (like I'm doing now) > without having to manually terminate lines and indent the next one. The > behaviour is arguably implied by the specification ("This is a paragraph > continuation, not a sublist"), but I'd love explicit confirmation > because it might be argued that an outdent implicitly terminates the > previous paragraph but an indent doesn't. Needless to say, I'd argue > against that. Unfortunately, that syntax is ambiguous if blank lines between list items are optional, which reStructuredText allows. You can have one or the other, not both. For example, if a list item's paragraph containing text "x = x - 1" were to word wrap badly, you'd end up with:: - This is list item 1. Here's a formula: "x = x - 1". - Here's list item 2. Sure looks like item 3 though. And that's too dangerous to allow. I agree that the lazy typing style is convenient, but reStructuredText has to avoid ambiguity as much as possible.
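To make the hazard with the wrapped formula concrete, here is a minimal Python sketch. It is not the actual reStructuredText parser; it assumes a purely line-based bullet test, and the names BULLET and count_list_items are hypothetical, made up for this illustration:

```python
import re

# Assumption: a bullet is "-", "*" or "+" at column zero followed by a space.
# This is a simplification for illustration, not the real parser's rule.
BULLET = re.compile(r"^[-*+] ")

def count_list_items(lines):
    """Count the lines a line-based parser would take as starting a list item."""
    return sum(1 for line in lines if BULLET.match(line))

# What the author meant: two list items.
intended = [
    '- This is list item 1. Here\'s a formula: "x = x - 1".',
    "- Here's list item 2.",
]

# What arrives after a mail client re-wraps the first item at an unlucky column.
badly_wrapped = [
    '- This is list item 1. Here\'s a formula: "x = x',
    '- 1".',
    "- Here's list item 2. Sure looks like item 3 though.",
]

print(count_list_items(intended))
print(count_list_items(badly_wrapped))
```

With lazy (unindented) continuation lines allowed and no blank lines required between items, nothing in the text distinguishes the wrapped remainder of the formula from a genuine new item.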
The Doc-SIG historical record shows that allowing intra-list-item blank lines to be optional is more in demand. Opinions or counter-arguments anyone? > - underlined style headings; I *really* like MoinMoin's use of a number > of equals signs on each side of a paragraph: if the number of equals > signs is the same, the paragraph is a heading of ident level equal to > the number of equals signs. It's a workable alternate syntax. > - I would hope that the reStructuredText parser is smart enough to > figure out that the example text above is part of the
  • above? Once you indent the list item's paragraph, and further indent the example text, yes. :-) > - What's wrong with an automatically wrapped comment block like the > following? :: > > .. This would be a comment block except > this line hasn't been indented > because some client re-wrapped the > lines. Damn! Same as list items: ambiguity. > - I'm trying to figure out how to refer to a target within another > document: There's HTML's fragment syntax, which could be used:: .. _link to 'refname' within another file: fileURL#refname > - I hope the parser is able to deal with paragraphs inconveniently > wrapped in the middle of some emphasis:: > > This is *emphasised > text* with an > inconvenient wrap > point. Yes, already implemented. > ... otherwise I'd need an option to require delimiting the text:: Ugh. Thank goodness, unnecessary. > ... which matches nicely with the link syntax, at least if you don't > need to embold the link, which sometimes I do:: > > This is an `emphasized link`*_ with an... reStructuredText doesn't support nested inline markup. That way lies madness... > That's not as ugly as I thought it would be. Ugly enough ;-) > - Leading and trailing whitespace should be trimmed from inline > literals so that someone can do this:: > > Find the `` `interpreted text` `` in this paragraph! Inline markup start-strings must be followed by non-whitespace, end-strings preceded by non-whitespace, so that won't work. What will work, though, is:: Find the ```interpreted text``` in this paragraph! or:: Find the \`interpreted text` in this paragraph! > If I should just make up my own mind on each and submit a diff per to be > discussed|debated|mangled, someone let me know, okay? :) Please discuss reStructuredText syntax issues here. If you'd like to start a variant syntax, or a completely unrelated syntax, you're free to do so; indeed I'd encourage it. You're welcome to use my codebase.
Please make it compatible with the DPS (whose API is in its infancy, a blank slate). As I mentioned in private email, I'll be posting version 0.3 of both reStructuredText and the DPS by the end of the weekend. Look for: several thousand lines of code; most constructs implemented; warning & error generation; many unittests (over 90 & counting, just for the parser); DOM generation; oodles of fun for all ages. -- David Goodger dgoodger@bigfoot.com Open-source projects: - Python Docstring Processing System: http://docstring.sf.net - reStructuredText: http://structuredtext.sf.net - The Go Tools Project: http://gotools.sf.net From tony@lsl.co.uk Thu Jul 19 10:36:34 2001 From: tony@lsl.co.uk (Tony J Ibbs (Tibs)) Date: Thu, 19 Jul 2001 10:36:34 +0100 Subject: [Doc-SIG] RE: reStructuredText In-Reply-To: Message-ID: <009301c11036$4cb1e590$f05aa8c0@lslp7o.int.lsl.co.uk> David Goodger wrote: > ... I'll be posting version 0.3 of both reStructuredText > and the DPS by the end of the weekend. Look for: several > thousand lines of code; most constructs implemented; > warning & error generation; many unittests (over 90 & > counting, just for the parser); DOM generation; oodles > of fun for all ages. Aagh! OK. Congratulations, that's great, definitely a Good Thing, and I'm seriously envious. (shades of "heh, look folks, Python Doc-SIG has a *product*, all that wait and effort *was* worth doing".) But... I'm off for a week of holiday (visiting relatives in Germany) on Sunday, and was planning to take the various spec documents with me to peruse and mark up with any comments (yes, I know, but there's more chance of doing it next week than during "normal" time, so far). And now there's going to be a new release whilst I'm away (possibly even whilst in the air). Curses, foiled again. 
More seriously - David, is there any chance of getting a copy of the updated specs (both DPS and reStructuredText) before then, or is it worth my commenting on the documents "as is", assuming that changes in the "what it provides" will be minor? It would have to be before mid-Friday afternoon my time... Tibs -- Tony J Ibbs (Tibs) http://www.tibsnjoan.co.uk/ "How fleeting are all human passions compared with the massive continuity of ducks." - Dorothy L. Sayers, "Gaudy Night" My views! Mine! Mine! (Unless Laser-Scan ask nicely to borrow them.) From garth@deadlybloodyserious.com Fri Jul 20 01:42:27 2001 From: garth@deadlybloodyserious.com (Garth T Kidd) Date: Fri, 20 Jul 2001 10:42:27 +1000 Subject: [Doc-SIG] ImplicitLinksUsingCamelCase In-Reply-To: Message-ID: .. The subject line used to be ``re: reStructuredText``, but we're getting so much point-bloat that I'm splitting it up. > Wiki ImplicitLinksUsingCamelCase are so ambiguous, they must > cause lots of problems. Like a discussion about "Old MacDonald"... You can escape the word with a bang (!), but, because the bang is also used to escape whole lines it turns out to be a mind-bender to escape just the first word in a line. Maybe you can break the paragraph, maybe not -- I forget, and that's the whole problem. It's something you have to *think* about. Ugh. > Better to have some unambiguous syntax saying "this is a link". The explicit but still simple reStructuredText link format is growing on me, for sure. Regards, Garth. -- Garth T Kidd Mobile: +61-411-596-593 Consulting Systems Engineer, Aust/NZ Direct: +61-2-9779-5614 Network Appliance http://www.netapp.com/ From garth@deadlybloodyserious.com Fri Jul 20 03:24:39 2001 From: garth@deadlybloodyserious.com (Garth T Kidd) Date: Fri, 20 Jul 2001 12:24:39 +1000 Subject: [Doc-SIG] Lazy paragraph identation In-Reply-To: Message-ID: .. 
This also used to be in the ``re: reStructuredText`` thread, but it's sufficiently contentious that I think it deserves its own subject line. ... being able to recognise indented paragraphs solely by their first line, so that people can lazily just keep typing (like I'm doing now) without having to manually terminate lines and indent the next one. Unfortunately, that syntax is ambiguous if blank lines between list items are optional, which reStructuredText allows. You can have one or the other, not both. .. The spec supports nested block quotes, right? Consider: * People who want to use reStructuredText in docstrings need the blank lines between list items to be optional, and will be using proper programming editors that can handle indentation for them. This group is quite obvious at the moment. * People who want to write reStructuredText in mail clients and web browsers will be constantly frustrated if they are forced to manually indent everything, and won't mind at all if they're forced to put blank lines between list items. Nobody appears to have spent much time considering the requirements of this group yet. Those are both sizable target groups, right? Now, I believe the following: * We don't want to exclude either of those target groups. * On the other hand, we don't want to make reStructuredText ambiguous. It sure looks to me like a requirement for a switchable mode in the parser. Different applications can choose different defaults. Or, the parser could attempt to automatically figure it out. If the very first bullet point or indented paragraph you see looks like this, you probably want to select for lazy paragraph indenting:: * If the line after the first bullet point or indented paragraph starts at column zero and is not empty, lazy paragraph indenting can be assumed by applications that expect that some users might be using crummy editors. Docstring processors would explicitly suppress such automatic selection. 
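The automatic selection described above might be sketched like this in Python. It is only an illustration of the heuristic as worded here: it covers bullet points but not indented paragraphs, guess_lazy_indentation is a hypothetical name, and refusing to count a following bullet line as evidence is an added assumption, not something stated in this message:

```python
import re

# Assumption: a bullet is "-", "*" or "+" followed by a space, possibly indented.
BULLET = re.compile(r"^\s*[-*+] ")

def guess_lazy_indentation(lines):
    """Guess the author's style from the line after the first bullet point.

    Returns True when that line is non-empty, starts at column zero, and is
    not itself a bullet; returns False otherwise, including when the text
    contains no bullet or the bullet is on the last line.
    """
    for i, line in enumerate(lines[:-1]):
        if BULLET.match(line):
            nxt = lines[i + 1]
            return (
                bool(nxt.strip())            # not a blank line
                and not nxt[0].isspace()     # starts at column zero
                and not BULLET.match(nxt)    # added assumption: not a new item
            )
    return False

lazy_text = [
    "* If the line after the first bullet point",
    "starts at column zero, lazy indenting is assumed.",
]
strict_text = [
    "* A list item whose continuation line",
    "  is indented under the bullet text.",
]

print(guess_lazy_indentation(lazy_text))
print(guess_lazy_indentation(strict_text))
```

A docstring processor would simply never call such a function and would keep strict indentation, while a Wiki front end could use the guess as its default.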
You point out ambiguity in your example of a badly wrapped paragraph containing the bullet selector:: - This is list item 1. Here's a formula: "x = x - 1". - Here's list item 2. Sure looks like item 3 though. .. _abuse of the word "ambiguous": To me, that's not ambiguous. The bad wrapping makes it explicitly a three item list. It's not what the user intended [2]_, but there are so many ways for the user to unambiguously fix it I don't think it's a problem: * Manually wrap it closer to column zero:: - This is list item 1. Here's a formula: "x = x - 1" - Here's list item 2... * Use a different bullet:: * This is list item 1. Here's a formula: "x = x - 1". * Here's list item 2. Implication: a rule in the parser that says that blank lines are required between adjacent but different lists at the same indentation level, even if lazy paragraph formatting is turned on. That nicely matches the * Use an inline literal:: - This is list item 1. Here's a formula: ``x = x - 1``. - Here's list item 2, as the parser considers the second line in this example part of the literal started in line 1. > The Doc-SIG historical record shows that allowing intra-list-item > blank lines to be optional is more in demand. I can *readily* imagine that intra-list-item blank lines being optional is more in demand at the moment. The majority of the people discussing this specification are probably Python programmers who want to use it for Python code (in docstrings) and the documentation for their Python code which they'll probably be editing in the same indentation-smart text editor they use for their code. > Opinions or counter-arguments anyone? I'm not sure we should dig our heels in and assert that reStructuredText should *only* be useful for Python programmers with an indentation-smart text editor.
There are hundreds of billions [1]_ of frustrated Wiki users out there pounding their heads against the Wiki markup syntax, and almost as many ZWiki users ripping their hair out because StructuredText is just as bad or worse. Telling them we're not going to throw them a line and rescue them from shark infested water because they might get our precious rope wet seems a tad... stingy. Getting into the mud on ambiguity: .. _explicit discussion of ambiguity: I'm going to come under some well deserved flack for my `abuse of the word "ambiguous"` above, so I'm going to break it out a little. If the specification is changed as I suggest, *and* the parser is implemented as I'm saying, *and* the user tries to do what David suggests, *and* their text gets badly wrapped in the position David indicates, then: * The *specification* is not ambiguous, and * The *parser* won't find the input ambiguous, but * The *user* might be a little confused for a moment. The user is going to spend a lot of time confused regardless. Every time I try and represent a bullet list for which each item owns a literal block, for example, I forget to indent the literal block and have to go back and fix it. Users are going to spend a lot of time going back and fixing things that they got wrong. Going back and fixing the list won't be any additional hassle. I'm wary of insisting upon serious inconvenience to a large segment of the user population for [3]_ to save inconvenience to the occasional user who stumbles across the edge case of a list item that happens to have a list delimiter just after the wrap column. More glibly put: two out of three ain't bad. I think they'll cope. :) .. _[1] I counted them. Really! .. _[2] Before firing missiles on my use of the word "ambiguous", please see my `explicit discussion of ambiguity`, upon which you can unload entire batteries if you want. :) .. _[3] e.g. 
having to manually indent every single list item as punishment for using an editor that doesn't handle indentation properly and that wraps long paragraphs with newlines. Regards, Garth. From dgoodger@bigfoot.com Fri Jul 20 04:37:59 2001 From: dgoodger@bigfoot.com (David Goodger) Date: Thu, 19 Jul 2001 23:37:59 -0400 Subject: [Doc-SIG] Lazy paragraph identation In-Reply-To: Message-ID: One point I hadn't made explicitly was: if lazy paragraph indentation on list items is enabled, each list item may contain only one paragraph. Second elements of any kind (including sublists) are not possible without significantly reworking other aspects of the markup. For example, here's a list in 'strict' reStructuredText:: - List 1, item 1, para 1. Item 1, para 2. - List 1, item 2, para 1. Item 2, para 2. * Sublist A of item 2. With lazy indentation, that structure is impossible to represent:: - List 1, item 1, para 1. Item 1, para 2. - List 1, item 2, para 1. Item 2, para 2. * Sublist A of item 2. If we don't indent, like 'Item 1, para 2', we get a one-item list, followed by a paragraph, followed by a second, separate list. If we do indent, like 'Item 2, para 2', we have a block quote containing a paragraph followed by a list. But lazy indenters are loath to indent, so the point is moot. Lazy indentation would only be useful for the simplest of documents: flat, limited to one paragraph per list item, no nested lists possible. This seriously limits the expressive power of the markup. Is the lazy variant sufficiently powerful to be useful to anyone? If anyone can come up with or refer to a self-consistent scheme to combine lazy indentation with powerful expressivity, please do chime in.
-- David Goodger dgoodger@bigfoot.com Open-source projects: - Python Docstring Processing System: http://docstring.sf.net - reStructuredText: http://structuredtext.sf.net - The Go Tools Project: http://gotools.sf.net From garth@netapp.com Fri Jul 20 07:07:48 2001 From: garth@netapp.com (Garth Kidd) Date: Fri, 20 Jul 2001 16:07:48 +1000 Subject: [Doc-SIG] Comments within lists Message-ID: DISCLAIMER: I only just checked back and noticed that there was a particular DTD the output should conform to. Cool, but this message might be redundant when I finally read the DTD. Or maybe not. It just occurred to me that we could well end up producing an "edge cases" appendix to the reStructuredText specification by pulling reStructuredText docstrings out of the unit tests for the parser. :) Anyway, on with the story. This input:: A paragraph. .. A comment with two paragraphs. Yep. Another paragraph. ... should render to XML as something like::

    A paragraph.

    A comment with two paragraphs.

    Yep.

    Another paragraph.

    ... so a single-paragraph comment, for consistency::

    A comment with two paragraphs.

    ... and a comment between two paragraphs in a list item::

    The first paragraph of the list item.

    A comment with two paragraphs.

    Yep.

    Another paragraph.

    I'm asking because I wanted to drop a comment in the middle of a list, and it occurred to me that I didn't want to break the spec any more than I wanted to break the list. What I ended up with was a little odd: A paragraph. .. A comment between two paragraphs. Another paragraph. .. A comment between a paragraph and a list. 1. List item 1. .. A comment between list items 1 and 2. 2. List item 2. The second paragraph of list item 2. .. A comment between two paragraphs in list item 2. 3. List item 3. .. A comment between a list and a paragraph. The last paragraph. List item 2 is the scorcher. To be consistent with comments between paragraphs in column 0, comments within list items must be at the same indent level as the paragraphs (if any) around them. That implies that comments between list items must be at the same indent level as their tags or bullets, hence the above. Hmmm:: .. this is a comment this is the comment's second line How is that rendered? One ``<comment>`` tag containing two paragraphs? Two comment tags containing a paragraph each? One comment tag containing three lines of CDATA of which one is blank? Regards, Garth. PS: Something tells me I'm going to have a lot of fun writing test cases. :/ From garth@deadlybloodyserious.com Fri Jul 20 07:17:57 2001 From: garth@deadlybloodyserious.com (Garth T Kidd) Date: Fri, 20 Jul 2001 16:17:57 +1000 Subject: [Doc-SIG] RE: reStructuredText In-Reply-To: Message-ID: More time-wasting goodness. Thankfully, a short "yup" will straighten most of it out. - `subjects with momentum`_ - `multiply indirect links`_ - `general coding attitude`_ - `embedding directives mid-paragraph`_ - `alternative heading format`_ - `block quotes and literal blocks in list items`_ - `miscellaneous`_ - `tying up loose ends`_ I'm leading with the items that have a bit of momentum. I've bumped to the tail end of the message all of the knotted loose ends: the yeps, the uh-huhs, and the oopses.
Paragraph indentation, I put in a completely different message; it looks like it could take a while. If reading my email in reStructuredText 0.2.2 is driving anyone nuts, please let me know. .. _subjects with momentum: Kicking off with the momentum: .. is that English? :) .. _multiply indirect links: Multiply-indirect hyperlinks? Interesting idea. I don't know if it's worth the trouble though. Collapsing multiply indirect links seems clean and trivial. Why not? :) That said, it's not necessarily *important*, so feel free to leave it in the pile of features labelled "Garth can code these if he wants them so bloody badly." .. _my general coding attitude: In general, if it's easier to make the parser accommodate a user behaviour than to persuade the users to select another behaviour, I'll consistently be in favour of changing the parser. That doesn't override my insistence on clean code, especially if it's hanging out there in public. If I can't do it without tangling the parser, I'll hold off until I can figure out a way to refactor it cleanly. Finally, I don't mind putting my code where my mouth is. .. That almost ended up "... don't mind putting my keyboard where my mouse is." I clearly need more sleep than I'm getting. Summary: * If it's easier to change the code's behaviour than change the users' behaviour, change the code. * Writing dirty, ugly code is harder than changing user behaviour. * If it's not obvious whether the code is going to be clean or not, I'll find out by trying to write it. .. _embedding directives mid-paragraph: Embedding directives mid-paragraph... Could be done with interpreted text. But is it really necessary? Could you provide some examples? If I had an example in mind, I've since forgotten it. How about we nail down how to hand a block of text to a directive, and then leave messy stuff like doing something unusual mid-paragraph to application-specific directives. If it turns out to be mind-blowingly popular, it can be factored in later. ..
_alternative heading format: I *really* like MoinMoin's use of a number of equals signs on each side of a paragraph [to indicate a heading]: if the number of equals signs is the same, the paragraph is a heading of ident level equal to the number of equals signs. It's a workable alternate syntax. Yep. .. _block quotes and literal blocks in list items: On block quotes and literal blocks in list items, which I still find mildly confusing (at least, when I'm typing -- my fingers aren't used to it yet, if you know what I mean): I would hope that the reStructuredText parser is smart enough to figure out that the example text above is part of the
  • above? Once you indent the list item's paragraph, and further indent the example text, yes. :-) I keep forgetting the extra indents. Just to confirm:: - list item new paragraph in list item:: literal block in list item .. outdented comment to force the end of the literal block block quote in list item block quote outside of list item (oops!) - another list item .. _miscellaneous: A few quick questions: * If I accidentally indent a bullet list, does that become a bullet list inside a block quote? :: I'm not sure whether the following is a bullet list at the same level as this paragraph, or a bullet list inside a block quote: * Anyone? Anyone? Bueller? I suspect the answer is: yes. * Does the specification implicitly or explicitly support the use of an outdented comment to force the end of an indented block, as above? I suspect the behaviour isn't yet defined. .. tying up loose ends: tying up loose ends ------------------- We turn out to completely agree on the following: * Treating unresolved links as opportunities to create a new page is up to Wiki, not the parser. * Suppressing page creation for unresolved link destinations that would make weird page names is also up to Wiki. I've discovered the following blunders: * Inline URL specification for links is too ugly to bear:: `download Python 2.1`:http://www.python.org/2.1/ I must have missed the word "rejected" at the time. My blunder. * Interpreted literals and literal interpretation:: Find the `` `interpreted text` `` in this paragraph! What was I on? Regards, Garth. From Juergen Hermann" Message-ID: On Thu, 19 Jul 2001 11:12:56 +1000, Garth T Kidd wrote: > - underlined style headings; I *really* like MoinMoin's use of a number >of equals signs on each side of a paragraph: if the number of equals >signs is the same, the paragraph is a heading of ident level equal to >the number of equals signs.
For example:: > > = this is a level 1 heading = > > == this is a level 2 heading == > > == this is also a level 2 heading > despite the fact that it has been > wrapped for some reason == > > === this is not a heading because the number of equals signs isn't > equal . == These are not exactly the rules MoinMoin implements, but close. ;) From Juergen Hermann" Message-ID: On Thu, 19 Jul 2001 01:11:15 -0400, David Goodger wrote: >As I mentioned in private email, I'll be posting version 0.3 of both >reStructuredText and the DPS by the end of the weekend. Look for: several >thousand lines of code; most constructs implemented; warning & error >generation; many unittests (over 90 & counting, just for the parser); DOM >generation; oodles of fun for all ages. You did not _explicitly_ mention documentation (docstrings & docs). If those are included, I'll try integration into MoinMoin. From fdrake@acm.org Fri Jul 20 22:02:42 2001 From: fdrake@acm.org (Fred L. Drake) Date: Fri, 20 Jul 2001 17:02:42 -0400 (EDT) Subject: [Doc-SIG] [development doc updates] Message-ID: <20010720210242.487322892E@cj42289-a.reston1.va.home.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Some additional information for extension writers. From dgoodger@bigfoot.com Fri Jul 20 23:25:32 2001 From: dgoodger@bigfoot.com (David Goodger) Date: Fri, 20 Jul 2001 18:25:32 -0400 Subject: [Doc-SIG] Lazy paragraph identation In-Reply-To: Message-ID: I've had some time to respond to individual comments: on 2001-07-19 10:24 PM, Garth T Kidd (garth@deadlybloodyserious.com) wrote: > .. The spec supports nested block quotes, right? Yes. So? I don't get your point. > It sure looks to me like a requirement for a switchable mode in the > parser. Different applications can choose different defaults. This is workable. If you can come up with consistent, unambiguous, safe rules for lazy indentation, then Wikis and other apps could use the lazy variant.
> Or, the > parser could attempt to automatically figure it out. That's a dangerous path. Explicit is better. > You point out ambiguity in your example of a badly wrapped paragraph > containing the bullet selector:: > > - This is list item 1. Here's a formula: "x = x > - 1". > - Here's list item 2. Sure looks like item 3 though. I think intervening blank lines are an absolute requirement for lazy indentation. So the example would be like this:: - This is list item 1. Here's a formula: "x = x - 1". - Here's list item 2. Sure looks like item 3 though. If *only* lazy indentation is used, no problem. If the parser tries to infer the author's style, it would mistakenly infer strict indentation. [Garth lists workarounds:] > * Manually wrap it closer to column zero:: Yes, but we are trying to avoid surprises when accidental bad wrapping takes place. The user doesn't always have control. My email client wraps my paragraphs, even if I don't want it to. > * Use a different bullet:: Change the example to "x = (x + 1) * 3 - 2" (all possible bullets included), and this workaround won't always work. > Implication: a rule in the parser that says that blank lines > are required between adjacent but different lists at the same > indentation level, even if lazy paragraph formatting is turned > on. My parser actually does this. I'll add mention of it to the spec. > * Use an inline literal:: > > - This is list item 1. Here's a formula: ``x = x > - 1``. > - Here's list item 2, as the parser considers the second > line in this example part of the literal started in line 1. Although not explicitly stated in the spec (yet), the way I've implemented the parser is to do line/block parsing first, then inline markup parsing afterwards (standalone URI parsing last). So in the case above, the "- 1``." would be recognized as a new list item before being examined for inline literals. 
The "\``x = x" at the end of the first line would generate a warning, "Inline literal start-string without end-string." > There are hundreds of billions [1]_ of frustrated Wiki users out there > pounding their heads against the Wiki markup syntax, and almost as many > ZWiki users ripping their hair out because StructuredText is just as bad > or worse. Telling them we're not going to throw them a line and rescue > them from shark infested water because they might get our precious rope > wet seems a tad... stingy. I'm all in favor of throwing them a line. But (to extend your analogy further) I want the line to be strong and well anchored, so they don't get tangled up in it and drown. :-) > * The *user* might be a little confused for a moment. > > The user is going to spend a lot of time confused regardless. Confusion is OK, as long as it stems from ignorance; education/experience fixes that. Confusion stemming from surprising (even if *very occasionally* surprising) side-effects of the markup, that's not acceptable. > I'm wary of insisting upon serious inconvenience to a large segment of > the user population for [3]_ to save inconvenience to the occasional > user who stumbles across the edge case of a list item that happens to > have a list delimiter just after the wrap column. In putting together these specs and the parser software, I've always kept this in mind: If it can go wrong, it will. Writing the spec and implementing the parser, I've tried to avoid surprises and ambiguity wherever possible. If avoidance is not possible, then the possible surprises have to be minimized, explicitly documented, and warned of by the parser. Also, there has to be an "out" or workaround (which is where backslash-escapes come in handy). > More glibly put: two out of three ain't bad. I think they'll cope. :) You're a programmer. Imagine if Python had funny edge cases. Would you *cope*? Or would you scream bloody murder?
Out of respect for the eventual users of reStructuredText, we can't allow *any* surprises. It will be great if you can come up with a consistent indentation-minimized syntax; I'm all for it. All you need to do is devise an alternative representation of hierarchical structures, one that doesn't use indentation or begin/end markers. If it *does* use begin/end markers, we'll call it something else ;-), and start another parser component project for it. From dgoodger@bigfoot.com Fri Jul 20 23:27:15 2001 From: dgoodger@bigfoot.com (David Goodger) Date: Fri, 20 Jul 2001 18:27:15 -0400 Subject: [Doc-SIG] Re: Comments within lists In-Reply-To: Message-ID: on 2001-07-20 2:07 AM, Garth Kidd (garth@netapp.com) wrote: > This input:: > > A paragraph. > > .. A comment with two paragraphs. > > Yep. > > Another paragraph. > > ... should render to XML as something like:: > >

    A paragraph.

    > >

    A comment with two paragraphs.

    >

    Yep.

    >
    >

    Another paragraph.

    In the DTD, <comment> contains only #PCDATA. The entire comment block is treated as a text blob. None of it is processed further. I'll add mention of this to the spec. The above example would actually turn into the following (pretty-printed)::

    A paragraph.

    A comment with two paragraphs. Yep.

    Another paragraph.

    > I'm asking because I wanted to drop a comment in the middle of a list, > and it occurred to me that I didn't want to break the spec any more than > I wanted to break the list. What I ended up with was a little odd: > > A paragraph. > .. A comment between two paragraphs. > Another paragraph. > .. A comment between a paragraph and a list. > 1. List item 1. > .. A comment between list items 1 and 2. > 2. List item 2. > The second paragraph of list item 2. > .. A comment between two paragraphs in list item 2. > 3. List item 3. > .. A comment between a list and a paragraph. > The last paragraph. > > List item 2 is the scorcher. To be consistent with comments between > paragraphs in column 0, comments within list items must be at the same > indent level as the paragraphs (if any) around them. That implies that > comments between list items must be at the same indent level as their > tags or bullets, hence the above. Except lists cannot contain comments; they contain only list items. List *items* can contain comments. I was wrestling with exactly this issue the other day. In SGML, we could always use an inclusion exception:: (If my SGML isn't too rusty ;-) But those are dangerous and unwieldy at the best of times. I used an XML DTD to specify the document model, and so far it's working out well. Shortcomings, like not being able to have comments just anywhere, are annoying at first. But in hindsight these "shortcomings" almost invariably turn out to be the only sane way to do it. > Hmmm:: > > .. this is a comment > > this is the comment's second line > > How is that rendered? One ``<comment>`` tag containing two paragraphs? > Two comment tags containing a paragraph each? No to both. > One comment tag containing three lines of CDATA of which one is blank? Yes. > PS: Something tells me I'm going to have a lot of fun writing test > cases. :/ Unit testing is *definitely* the way to go. And fun too.
-- David Goodger dgoodger@bigfoot.com Open-source projects: - Python Docstring Processing System: http://docstring.sf.net - reStructuredText: http://structuredtext.sf.net - The Go Tools Project: http://gotools.sf.net From dgoodger@bigfoot.com Fri Jul 20 23:29:12 2001 From: dgoodger@bigfoot.com (David Goodger) Date: Fri, 20 Jul 2001 18:29:12 -0400 Subject: [Doc-SIG] RE: reStructuredText In-Reply-To: Message-ID: on 2001-07-20 2:17 AM, Garth T Kidd (garth@deadlybloodyserious.com) wrote: > If reading my email in reStructuredText 0.2.2 is driving anyone nuts, > please let me know. Looks OK to me! :-) New in the spec and parser (which will be posted to the web site this weekend) is implicit hyperlink targets in titles, so you could use titles instead of ".. _internal hyperlink targets:". > Multiply-indirect hyperlinks? Interesting idea. I don't know > if it's worth the trouble though. > > Collapsing multiply indirect links seems clean and trivial. Why not? :) But is the gain worth the added complexity? > That said, it's not necessarily *important*, so feel free to leave it > in the pile of features labelled "Garth can code these if he wants > them so bloody badly." Consider it so left. And it's good to hear the Queen's English. [Summary of Garth's coding attitude:] > * If it's easier to change the code's behaviour than change the > users' behaviour, change the code. > > * Writing dirty, ugly code is harder than changing user behaviour. So write clean code! :> > * If it's not obvious whether the code is going to be clean or not, > I'll find out by trying to write it. Sounds reasonable. > Howabout we nail down how to hand a block of text to a directive, and > then leave messy stuff like doing something unusual mid-paragraph to > application-specific directives. If it turns out to be mind-blowingly > popular, it can be factored in later. Yes. The parser currently *parses* directives just fine, but doesn't actually *do* anything with them yet. 
I'll have to code up a directive or two to see what they'll need. Any suggestions? > I keep forgetting the extra indents. Just to confirm:: > > - list item > > new paragraph in list item:: > > literal block in list item > > .. outdented comment to force the end of the literal block > > block quote in list item > > block quote outside of list item (oops!) > > - another list item Correct on all counts. > * If I accidentally indent a bullet list, does that become a bullet > list inside a block quote? :: ... > I suspect the answer is: yes. Your suspicions are well-founded. (Yes.) > * Does the specification implicitly or explicitly support the use of > an outdented comment to force the end of an indented block, as > above? > > I suspect the behaviour isn't yet defined. Right again. The spec doesn't explicitly or implicitly say anything. The parser *does* support such (ab)use of an unindented comment. I'll modify the spec. Thanks for the feedback. -- David Goodger dgoodger@bigfoot.com Open-source projects: - Python Docstring Processing System: http://docstring.sf.net - reStructuredText: http://structuredtext.sf.net - The Go Tools Project: http://gotools.sf.net From dgoodger@bigfoot.com Fri Jul 20 23:29:55 2001 From: dgoodger@bigfoot.com (David Goodger) Date: Fri, 20 Jul 2001 18:29:55 -0400 Subject: [Doc-SIG] RE: reStructuredText In-Reply-To: Message-ID: on 2001-07-20 6:44 AM, Juergen Hermann (jh@web.de) wrote: > This is not exactly the rules MoinMoin implements, but close. ;) Could you post a link to the rules that MoinMoin *does* implement please? (I visited the site but couldn't find the markup rules.) 
From dgoodger@bigfoot.com Fri Jul 20 23:31:17 2001
From: dgoodger@bigfoot.com (David Goodger)
Date: Fri, 20 Jul 2001 18:31:17 -0400
Subject: [Doc-SIG] RE: reStructuredText
In-Reply-To:
Message-ID:

[replying to me about the upcoming post of spec & code]

on 2001-07-20 6:48 AM, Juergen Hermann (jh@web.de) wrote:
> You did not _explicitely_ mention documentation (docstrings & docs). If
> those are included, I'll try integration into MoinMoin.

I didn't *explicitly* mention documentation because, outside of the
specs, there is none yet. Apart from dps.statemachine, the docstrings
are nowhere near complete or acceptable.

Please beware, the code is not in any semblance of completion yet.
There are many holes in it. Examples: I haven't tested the parser's
external API at all; there's no output formatter (except for raw XML).
The code should be considered experimental.

Having said that, please do take a look. I'd like to hear what
MoinMoin (or any other potential client application) would need from
the DPS and/or reStructuredText.

--
David Goodger dgoodger@bigfoot.com
Open-source projects:
- Python Docstring Processing System: http://docstring.sf.net
- reStructuredText: http://structuredtext.sf.net
- The Go Tools Project: http://gotools.sf.net

From jwt@OnJapan.net Sat Jul 21 06:06:57 2001
From: jwt@OnJapan.net (Jim Tittsler)
Date: Sat, 21 Jul 2001 14:06:57 +0900
Subject: [Doc-SIG] RE: reStructuredText (MoinMoin rules)
In-Reply-To: ; from dgoodger@bigfoot.com on Fri, Jul 20, 2001 at 06:29:55PM -0400
References:
Message-ID: <20010721140657.A21807@server.onjapan.net>

On Fri, Jul 20, 2001 at 06:29:55PM -0400, David Goodger wrote:
> on 2001-07-20 6:44 AM, Juergen Hermann (jh@web.de) wrote:
> > This is not exactly the rules MoinMoin implements, but close. ;)
>
> Could you post a link to the rules that MoinMoin *does* implement please? (I
> visited the site but couldn't find the markup rules.)
http://moin.sourceforge.net/cgi-bin/moin/moin/HelpOnEditing From garth@deadlybloodyserious.com Sat Jul 21 09:43:57 2001 From: garth@deadlybloodyserious.com (Garth T Kidd) Date: Sat, 21 Jul 2001 18:43:57 +1000 Subject: [Doc-SIG] Directive idea Message-ID: For content management systems in news organisations: .. pullquote:: This quote is so important it should be rendered nice and large somewhere near this point, but not necessarily immediately next to it. Drawing the user to the article is more important than to the specific text. Regards, Garth. From garth@deadlybloodyserious.com Sat Jul 21 10:04:20 2001 From: garth@deadlybloodyserious.com (Garth T Kidd) Date: Sat, 21 Jul 2001 19:04:20 +1000 Subject: [Doc-SIG] Lazy paragraph identation In-Reply-To: Message-ID: > > It sure looks to me like a requirement for a switchable mode in the > > parser. Different applications can choose different defaults. > > This is workable. If you can come up with consistent, unambiguous, safe > rules for lazy indentation, then Wikis and other apps could use the lazy > variant. Wicked. > > Or, the parser could attempt to automatically figure it out. > > That's a dangerous path. Explicit is better. Fair enough. This quote is out of order because it's more important: > Although not explicitly stated in the spec (yet), the way > I've implemented the parser is to do line/block parsing first, > then inline markup parsing afterwards (standalone URI parsing last). "Sorry, changing the parser order is just too hard at this stage to relax the requirement for blank lines between list entries in lazy mode" is a perfectly reasonable argument in favour of that requirement, and I'm entirely happy to accept it. [3]_ .. _[3] You have no idea how frustrated my fiance gets when, after attempting to justify a decision with several sadly illogical [4]_ arguments in its favour and listening to me patiently dissect and dismiss each one, discovers that "I feel like it, okay?" was all that she needed to say. 
Well, maybe you have a slight idea. :) One of these days, I'm going
to clue up and ask right after the first one whether she just feels
like it. It'll save a lot of time and angst. Similarly, I should have
asked up front whether the implementation of my proposal was going to
be difficult.

.. _[4] No, this is not an attempt to slyly call your arguments
   illogical. Misguided, much Frowned upon by God, and if not
   abandoned sure to lead to your Eternal Damnation in Hell, but not
   illogical by any shake of the stick. :)

The now sadly irrelevant argument in favour of a less strict lazy mode
follows anyway.

Summarizing the issue of badly wrapped lists and lazy mode:

* We're either in lazy mode, or not. No automatic selection. Cool.

* The following example is still contentious::

      - This is list item 1. Here's a formula: "x = x
      - 1".
      - Here's list item 2. Sure looks like item 3 though.

* The strict approach to the example:

  * A parser permitting lazy indentation without insisting upon
    blank lines would interpret the above as "three lists", and

  * A human reader strictly reading the specification would reach a
    similar conclusion, but

  * That's obviously not what the user intended when they wrote ::

        - This is list item 1. Here's a formula: "x = x - 1".

    before their editor badly wrapped the line.

  * We can call this disconnect "ambiguity",

  * In the parser world, "ambiguity" is a bad word,

  * Therefore blank lines **must** be insisted upon between list
    items in lazy mode.

* Arguments in favour of being more forgiving:

  *Ambiguity ain't always that ambiguous*: The kind of ambiguity
  we're most worried about is *circumstances for which the parser's
  behaviour is undefined*. The parser needs to be able to
  consistently make a decision, and programmers implementing parsers
  need to be able to make a decision.

  This clearly isn't such a case. The user will be typing a list.
  When they see the results, they'll mutter dark words about the
  stupid editor their company insist they use, and they'll fix the
  markup somehow (see below).

  If asked "hey, do you consider what just happened ambiguous?", I
  don't imagine many users would reply in the affirmative. They
  explicitly typed something. Their editor explicitly stuffed it up.
  The parser explicitly interpreted the text, and the user explicitly
  said expletives and explicitly fixed the problem. Any confusion in
  the user's mind when seeing the output will disappear when the
  system sends them their text back for editing and they see what
  their text editor did.

  *Consider the user impact*: This kind of a strict "never suffer
  ambiguity to live" attitude imposes a heavy burden on the user
  every time they use a list (probably quite often) in order to save
  them from something untoward that might happen to them only once a
  year, if ever.

  A comparison might be made to money handling. If your current cash
  register techniques occasionally let minor mistakes be made, you
  could well lose hundreds of dollars per year. Insisting that all
  totals are manually verified by a supervisor will save those
  hundreds of dollars, but cost tens of thousands in additional
  salary. Moreover, all of your customers might abandon your store
  because they're sick of the hassle.

  *Users can avoid the problem very, very easily*: Any user aware
  that their editor wraps lines for them, and aware that a copy of
  the list delimiter unfortunately wrapped to the beginning of the
  line will cause the parser to start a new list item, will do one of
  the following:

  * Manually wrap such a long item well before the wrap point::

        - This is list item 1. Here's a formula:
          "x = x - 1".
        - Here's list item 2.

  * Choose a different list delimiter.

  * Use literals (assuming the parser is changed so that literals
    bind harder than the beginning of list items).

  * Drop into strict mode temporarily: [2]_ ::

        .. strict::

        - This is list item 1, which contains a formula that I'm
          not sure will wrap appropriately, so I'm going to drop
          into strict mode and manually wrap each and every line
          well before the wrap point. Anyway, here's the formula:
          "x = x - 1".
        - Here's list item 2.

        .. lazy::

  I suspect the first two will be slightly more popular. :)

  Any user waking up regularly dripping with sweat because of
  recurring nightmares about having to go back and fix their markup
  will, I think, go to the effort of finding an editor that will
  write their markup for them.

  *What would the user choose?*: Given a choice between the
  following:

  * a *strict* mode that insists that users manually wrap each and
    every line well before their editor's wrap point *and* manually
    indent those lines as well,

  * a *strictly lazy* mode that relaxes the requirements for manual
    wrapping and indentation but insists upon blank lines between
    all list items, and

  * a hypothetical *bloody lazy* [1]_ mode that doesn't insist upon
    those blank lines but that requires users to consider editor
    wrap points when putting list delimiters in the middle of list
    items,

  I somewhat suspect that many users would end up being bloody lazy.
  Certainly, if bloody laziness were the default, I sincerely doubt
  that many people would bother switching to a stricter mode, even
  if they got caught out once or twice.

.. _[1] There's the `Queen's English` again.

.. _[2] Well, there's an example of a parser directive, if we need
   one.

> Yes, but we are trying to avoid surprises when accidental bad
> wrapping takes place. The user doesn't always have control.
> My email client wraps my paragraphs, even if I don't want it to.

Well, exactly, but there's nothing wrong with surprises if the user
can figure out how to respond to the surprise.
Users are going to be stuffing up quite often, will be surprised to
see that what they did didn't work, and will look at their markup
again and maybe refer to the specification to figure out what happened
and what to do about it.

If we're not worried about that (leading to directives like: "users
must never write their own markup, but must use an editor that doesn't
let them make mistakes"), why are we worried about this wrapping and
list items issue? The user has enough control over the wrapping to
force a wrap earlier than the parser did, which is more than s/he
needs to either dodge or fix the problem.

> > * The *user* might be a little confused for a moment.
> >
> > The user is going to spend a lot of time confused regardless.
>
> Confusion is OK, as long as it stems from ignorance;
> education/experience fixes that. Confusion stemming from surprising
> (even if *very occasionally* surprising) side-effects of the markup,
> that's not acceptable.

Call it a side-effect of the editor. If anyone gets particularly
detail-oriented and angst-ridden about the whole thing, direct them to
the list archives (of which I'm sure I'm going to be sufficiently
embarrassed), point out that it's all my fault, and give them my email
address. :)

> Writing the spec and implementing the parser, I've tried to
> avoid surprises and ambiguity wherever possible. If avoidance is
> not possible, then the possible surprises have to be minimized,
> explicitly documented, and warned of by the parser. Also, there
> has to be an "out" or workaround (which is where
> backslash-escapes come in handy).

Let's say that it were impossible to insist on the blank lines for
non-technical reasons (the managing director hates them). I think the
possible surprises are minimal, I'll write the documentation, I'll try
and figure out a way to warn about the situation (spotting a broken
literal is the easiest way until we climb into the ordered list
rat-hole), and there's an easy out. Close enough?
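The two lazy variants argued over in this thread can be made concrete
with a small sketch. This is only an illustration under stated
assumptions, not the DPS or reStructuredText parser: ``split_items``
and its ``require_blank`` flag are hypothetical names, and the
splitter handles nothing but flat bullet lists.

```python
import re

# Hypothetical helper, not part of the DPS/reStructuredText code base.
BULLET = re.compile(r"^[-*+] ")


def split_items(lines, require_blank=False):
    """Group the lines of a flat bullet list into list items.

    require_blank=False models the "bloody lazy" mode: any line that
    starts with a bullet opens a new item.
    require_blank=True models the "strictly lazy" mode: a bullet only
    opens a new item at the start of input or after a blank line.
    """
    items = []
    previous_blank = True  # the start of input counts as a blank line
    for line in lines:
        if not line.strip():
            previous_blank = True
            continue
        starts_item = BULLET.match(line) and (previous_blank or not require_blank)
        if starts_item or not items:
            items.append(BULLET.sub("", line, count=1))
        else:
            # Continuation text is joined onto the current item.
            items[-1] += " " + line.strip()
        previous_blank = False
    return items


# The contentious example, after the editor wrapped it badly.
wrapped = [
    '- This is list item 1. Here\'s a formula: "x = x',
    '- 1".',
    "- Here's list item 2.",
]
# The same text with a blank line between the real list items.
spaced = [wrapped[0], wrapped[1], "", wrapped[2]]

print(len(split_items(wrapped)))                     # 3
print(len(split_items(spaced, require_blank=True)))  # 2
```

In the bloody lazy case the stray ``- 1".`` becomes an item of its
own, which is the surprise under discussion; in the strictly lazy
case it is folded back into item 1, at the price of a mandatory blank
line between items.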
> > More glibly put: two out of three ain't bad. I think
> > they'll cope. :)
>
> You're a programmer. Imagine if Python had funny edge cases. Would
> you *cope*? Or would you scream bloody murder?

Python surprises me every week. Then I figure out that my editor broke
the indentation. I fix what my editor broke, and keep working. I cope.
:)

> Out of respect for the eventual users of reStructuredText, we can't
> allow *any* surprises. We're doing it for your own good!

Out of respect for people already suffering crummy editors, I'm trying
to cut them as many breaks as I can. Users who absolutely cannot stand
surprises can always turn on strictness or strict laziness, eh?

It just occurred to me that I've spent more time discussing this than
I could possibly have spent as a user swearing about needing to put
blank lines in. Sorry about that. I'm mainly worried about people
cutting and pasting mail into their web browser (it'll happen). Saving
them the effort of breaking the bullet lists apart seems like a fair
thing.

> It will be great if you can come up with a consistent
> indentation-minimized syntax; I'm all for it.

Still working on it! Oh, the shame: a Python programmer trying to
figure out how to avoid indenting...

> All you need to do is devise an alternative representation of
> hierarchical structures, one that doesn't use indentation
> or begin/end markers. If it *does* use begin/end markers,
> we'll call it something else ;-), and start another parser
> component project for it.

If it had to use begin and end markers, we may as well write it in
*Perl*. Ewwwww...

Regards,
Garth.

From garth@deadlybloodyserious.com Sat Jul 21 10:04:26 2001
From: garth@deadlybloodyserious.com (Garth T Kidd)
Date: Sat, 21 Jul 2001 19:04:26 +1000
Subject: [Doc-SIG] Re: Comments within lists
In-Reply-To:
Message-ID:

> Except lists cannot contain comments; they contain only list items.

Hmmm. ::

    1. List item.
    2. Another.
    .. Guys, should we remove this one? It's angry.
    3. Angry.
    4.
Fnee. I just ended up with two lists, didn't I? :) > List *items* can contain comments. Cool. I can think of circumstances for which people will want to use comments to dedent. Undent? Oh, I give up. :) > > PS: Something tells me I'm going to have a lot of fun writing test > > cases. :/ > > Unit testing is *definitely* the way to go. And fun too. Already submitted a patch with more tests. :) Regards, Garth. From dgoodger@bigfoot.com Mon Jul 23 05:08:06 2001 From: dgoodger@bigfoot.com (David Goodger) Date: Mon, 23 Jul 2001 00:08:06 -0400 Subject: [Doc-SIG] Docstring Processing System release 0.3 posted Message-ID: Release 0.3 of the Python Docstring Processing System (DPS) has been posted to the web site, http://docstring.sourceforge.net. Quick link to the download: http://prdownloads.sourceforge.net/docstring/dps.0.3.tar.gz Lots of progress has been made, driven by the reStructuredText parser. The project files are now accessible via CVS; I will be making regular checkins. Thanks to Garth Kidd for his pushing and prodding. If you would like to be added as a developer, please let me know. -- David Goodger dgoodger@bigfoot.com Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From dgoodger@bigfoot.com Mon Jul 23 05:08:18 2001 From: dgoodger@bigfoot.com (David Goodger) Date: Mon, 23 Jul 2001 00:08:18 -0400 Subject: [Doc-SIG] reStructuredText release 0.3 posted Message-ID: Release 0.3 of the reStructuredText input parser (a component of the Python Docstring Processing System) has been posted to the web site, http://structuredtext.sourceforge.net. Quick link to the download: http://prdownloads.sourceforge.net/structuredtext/rst.0.3.tar.gz Most syntax constructs have been implemented, but there is no front-end yet. The project files are now accessible via CVS; I will be making regular checkins. 
Thanks to Garth Kidd for his pushing and prodding. If you would like to be added as a developer, please let me know. -- David Goodger dgoodger@bigfoot.com Open-source projects: - Python Docstring Processing System: http://docstring.sourceforge.net - reStructuredText: http://structuredtext.sourceforge.net - The Go Tools Project: http://gotools.sourceforge.net From fdrake@acm.org Mon Jul 23 23:04:50 2001 From: fdrake@acm.org (Fred L. Drake) Date: Mon, 23 Jul 2001 18:04:50 -0400 (EDT) Subject: [Doc-SIG] [development doc updates] Message-ID: <20010723220450.33D4428932@beowolf.digicool.com> The development version of the documentation has been updated: http://python.sourceforge.net/devel-docs/ Various minor updates. From garth@netapp.com Fri Jul 27 06:53:19 2001 From: garth@netapp.com (Garth Kidd) Date: Fri, 27 Jul 2001 15:53:19 +1000 Subject: [Doc-SIG] usage docstrings and bibliographic field lists Message-ID: I think we need to support field list option 3 (``reStructuredText.txt`` revision 1.1.1.1 line 478) to support PEP-0257 style usage docstrings in Python scripts without preventing script authors from specifying bibliographic field lists. Background: The reStructuredText spec supports field lists of RFC-822 style headers:: Special case: use unadorned RFC822_ for the very first or very last text block of a docstring:: """ Author: Me Version: 1 The rest of the docstring... """ Note that it says *docstring*, not *document*. :) Later, the spec says:: One special context is defined for field lists. A field list as the very first non-comment block, or the second non-comment block immediately after a title, is interpreted as document bibliographic data. No special syntax is required, just unadorned RFC822_. The first ... Making the least effort to eliminate ambiguity, I come up with: * When using reStructuredText in docstrings, you can have as many field lists as you like, in any docstring, so long as they're either the first or last block in the docstring. 
- In a document, you can have only two: one at the beginning, and one at the end. * A field list as the first non-comment block in a document becomes ``document bibliographic data``. - In docstrings, this becomes tricky. To me, it ends up saying that a field list at the top of the *first docstring* (the second if the first is a title) becomes bibliographic data for the file. PEP-0257 says:: The docstring of a script (a stand-alone program) should be usable as its "usage" message, printed when the script is invoked with incorrect or missing arguments (or perhaps with a "-h" option, for "help"). Such a docstring should document the script's function ... To cut a long story short, when these two lock horns it looks like we can't have reStructuredText bibliographic field lists in scripts because otherwise the script couldn't print its usage with a simple ``print __doc__``. If by this point you've already figured out how to resolve the issue by straightening out wording in the spec, not changing any code, please let me know. :) The simplest fix I can see is to accept the previously-rejected "option 3" of a ``fields`` directive:: .. fields:: If specified, it would signal to the parser that the next text block was a field list, not a paragraph. Furthermore; if it was the first field list in the document (ie there wasn't one at the top), the field list would be treated as bibliographic data. The ``fields`` directive would not be required for field lists at the beginning or end of a document or docstring. We might generate a warning. Regards, Garth. PS: The usage for many scripts looks like this:: """\ Usage: cvs [cvs-options] command [command-options-and-arguments] where ... """ Perhaps we should suppress recognition of bibliographic field lists without generating warnings if the first field name seems to be ``Usage``. 
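The ``Usage`` suggestion in the postscript above could be sketched
roughly as follows. This is a minimal illustration, not code from the
reStructuredText parser: ``leading_fields`` and
``bibliographic_fields`` are hypothetical names, the field pattern is
a simplification of RFC 822 headers (no continuation lines), and only
the first text block of the docstring is examined.

```python
import re

# Simplified "Name: value" pattern; an assumption, not the spec's grammar.
FIELD = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(.*)$")


def leading_fields(docstring):
    """Return (name, value) pairs if the first block is a field list.

    An empty list means the first block is an ordinary paragraph.
    """
    fields = []
    for line in docstring.strip().splitlines():
        if not line.strip():
            break  # a blank line ends the first block
        match = FIELD.match(line.strip())
        if match is None:
            return []  # any non-field line makes the block a paragraph
        fields.append((match.group(1), match.group(2)))
    return fields


def bibliographic_fields(docstring):
    """Treat a leading field list as bibliographic data, unless it
    looks like a usage message (first field name is ``Usage``)."""
    fields = leading_fields(docstring)
    if fields and fields[0][0].lower() == "usage":
        return []
    return fields


module_doc = "Author: Me\nVersion: 1\n\nThe rest of the docstring...\n"
script_doc = "Usage: prog [options] file\n\nOptions:\n  -h  show help\n"

print(bibliographic_fields(module_doc))  # [('Author', 'Me'), ('Version', '1')]
print(bibliographic_fields(script_doc))  # []
```

With a rule like this a script can keep ``print __doc__`` as its
usage message, while modules that open with ``Author:`` or
``Version:`` still yield bibliographic data. An explicit ``fields``
directive would remain the unambiguous alternative.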
From sorifu_info@ec-shock.com Sat Jul 28 06:03:51 2001
From: sorifu_info@ec-shock.com (=?ISO-2022-JP?B?GyRCJWEhPCVrJSIlcyUxITwlSDt2TDM2SRsoQg==?=)
Date: Sat, 28 Jul 2001 14:03:51 +0900
Subject: [Doc-SIG] =?ISO-2022-JP?B?GyRCPi5AdEZiM1UbKEIgGyRCO1k7fUlUO1k7fTZbNV4bKEI=?= =?ISO-2022-JP?B?GyRCJSIlcyUxITwlSBsoQg==?=
Message-ID: <20010728.1403500034.babaq@sorifu_info-ec-shock.com>

Koizumi Cabinet: support or non-support, urgent survey

We are sorry to trouble you when you are busy, but please click the
URL below and take part in the survey.

http://211.9.37.210/koizumi/koizumi_an.asp?id=530263