[Doc-SIG] docutils feedback

David Goodger goodger@users.sourceforge.net
Wed, 12 Jun 2002 21:52:05 -0400


Simon Hefti wrote:
> - one problem resulted in this error message:
>
> Traceback (most recent call last):
...
>   File ".../docutils/writers/__init__.py", line 86, in recordfile
>     output = output.encode('raw-unicode-escape')    # @@@ temporary
> UnicodeError: ASCII decoding error: ordinal not in range(128)
>
> ... where I wished to see the line number of the rst document more
> than the stack trace. In some error cases, the line number is shown,
> though.

This is a bug in the Docutils code, not a problem with the data, so
it's appropriate to see a Python traceback.  The code crashed!

However, I believe the problem is solved in current CVS.  Try
installing from the CVS snapshot:
http://docutils.sf.net/docutils-snapshot.tgz

If the bug persists, please open a bug report on SourceForge, and
include a minimal set of data and instructions to reproduce the error.

> - just realized that the --version option is missing:
>   python html.py --version
>   should, in my opinion, write some version info to stdout.

If you're using the docutils-0.1 release, the front ends have no
command line options *at all*.  I've recently added a command-line
interface to the front ends (install the CVS snapshot).  By
coincidence, I added a "--version" option just before I got your
message.  The changes will be checked in soon.

I've been working on adding some features to Optik_ (which Docutils
uses for command-line option processing), which is why Docutils has
been quiet lately.

.. _Optik: http://optik.sf.net/

> - cannot handle umlauts !
>
>   I can see that you do not want to deal with exotic characters.

Docutils *does* want to deal with non-ASCII characters; it's on the
to-do list (quite high-priority).  Support for encodings just hasn't
been implemented yet.  I haven't needed it personally, and no one has
stepped forward to implement it.  I'd be happy for your help!

>   I'm a Tcl guy, where the programming language itself supports all
>   kinds of encodings. I assume python can do this as well.

Yes, Python has very good Unicode & encoding support.  I just don't
understand it yet.  And I'm not sure how best to implement such
support for Docutils.

> - I'm not sure I like the table markup.

You're welcome to implement another!

You have to realize, this is a volunteer open source project.  Nobody
is paying me to do any of this; telling me "I'm not sure I like the
table markup" won't make me want to change it.  You're welcome to
implement new features; I encourage you to, and I'll be happy to help.
If you do write some code, I hope you'll give it back to the project
for everyone's benefit.

There's a saying that's applicable to all open-source projects:

    If you want something done, hire someone -- if you want something
    done right, do it yourself.  (John Jacob Astor)

>   I think that it is very useful and complete, but perhaps
>   sometimes a simpler variant could be helpful, e.g. like
>   ---+-----
>   id | desc
>   ---+-----
>   1  | foo
>   2  | a rather long line which is not matched by the others
>   ---+-----

I don't see that as significantly easier than the existing markup.
See http://docutils.sf.net/spec/rst/problems.html#tables, alternative
2, for a much simpler syntax.  Your idea of leaving the rightmost
column unbounded may be useful (I remember you suggested this almost a
year ago), but you need some way to indicate row separations.  Using
the syntax from alternative 2, slightly modified to incorporate your
idea, your table could look like this::

    == ====
    id desc
    == ====
    1  foo
    -- ----
    2  a rather long line which is not matched by the others
    == ====

Using a directive, any (reasonable) syntax you like is feasible.  You
could implement a table where each line is assumed to be a row, or
each entry in the first column implies a new row, or anything else.
Here's another syntax idea, combining simple table syntax with bullet
lists to indicate new rows, resulting in very simple and compact
tables::

    col 1 col 2
    ===== =====
    - 1   Second column of row 1.
    - 2   Second column of row 2.
          Second line of paragraph.
    - 3   Second column of row 3.

          Second paragraph of row 3,
          column 2
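
To make the "each line is assumed to be a row" option concrete, here is
a minimal sketch in Python.  It is not Docutils code: the function name
``parse_line_per_row_table`` and the exact layout (a header line, one
border line of "=" runs, then one row per line, with the rightmost
column unbounded) are assumptions made only for illustration.

```python
import re


def parse_line_per_row_table(lines):
    """Parse a hypothetical line-per-row table.

    lines[0] is the header, lines[1] is a border made of runs of "=",
    and every following line is exactly one row.  Column boundaries
    come from where each "=" run starts; the last column extends to
    the end of the line, however long the line is.
    """
    header, border, *body = lines
    starts = [match.start() for match in re.finditer(r"=+", border)]
    ends = starts[1:] + [None]  # None: the last column is unbounded

    def split(line):
        return [line[start:end].strip() for start, end in zip(starts, ends)]

    return [split(header)] + [split(row) for row in body]


table = parse_line_per_row_table([
    "id desc",
    "== ====",
    "1  foo",
    "2  a rather long line which is not matched by the others",
])
```

With this layout no row separators are needed, at the cost of not
allowing multi-line cells; the bullet-list variant above lifts that
restriction.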

>   One problem I had was with a "...." in the table, which was
>   interpreted as a title rather. I was confused because I was not
>   thinking of writing titles within tables.

What is "...."?  Are you thinking of ellipsis_?  That's "..." (3
periods, not 4 [*]_), in English anyhow.  The reStructuredText spec
explicitly requires at least 4 repeated non-alphanumeric characters,
"to avoid mistaking ellipses ["..."] for overlines".
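
The rule can be sketched in a few lines of Python.  This is only an
illustration of the rule as quoted from the spec, not the parser's
actual code; the function name ``looks_like_title_line`` and the
``min_length`` parameter are hypothetical.

```python
def looks_like_title_line(line, min_length=4):
    """Return True if `line` is a single non-alphanumeric character
    repeated at least `min_length` times (trailing whitespace ignored).

    Hypothetical helper for illustration; not taken from the parser.
    """
    stripped = line.rstrip()
    if len(stripped) < min_length:
        return False
    first = stripped[0]
    if first.isalnum() or first.isspace():
        return False
    # Every character must be the same as the first one.
    return stripped == first * len(stripped)
```

Under this check "..." (an ellipsis) is rejected because it is too
short, while "...." qualifies as a possible underline or overline.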

Do other languages use 4 periods to mean something significant?  If
so, I'll consider changing the spec & parser.  Please provide an
authoritative reference, online if possible.

.. _ellipsis: http://webster.commnet.edu/grammar/marks/ellipsis.htm

.. [*] Four periods *can* be seen at the end of a sentence: three
   ellipsis points plus the sentence-ending full stop.  However, the
   ellipsis should be immediately adjacent to the final word; there
   should never be any space between the final word and the ellipsis,
   therefore the ellipsis cannot wrap to the next line and be
   misinterpreted as a title underline or overline.

>   Another problem of the table syntax is that monospaced chars are
>   often not appropriate to write complex tables (this is one reason
>   why spreadsheets are successful), and long text in cells tends
>   to blow the ASCII art up so much that it is not readable, or
>   editable easily.

So use a spreadsheet, or a word processor.  reStructuredText is
designed for plain text, which has limitations.  We have to live with
them.  To write or read any kind of ASCII-art tables, a monospaced
typeface is a must.

There's table-mode for emacs: http://table.sourceforge.net/.  It
doesn't support header row separators yet ("=" instead of "-"; any
Elispers out there?).

If you have another solution, I'd like to hear it.

> - how to mark-up examples ?
>
>   .. example:: would be nice

This could be done.  But what are the semantics?  What is the
behaviour?  (What should the "example" directive *do*?)

> - literal blocks:
>
>   require no indentation, please ! (but end-of-block)
...
>   I would consider something like:
>
>   previous para::my_end_tag
>
> select foo from bar where id=1;
>
> .. my_end_tag
>
>   at least as a variant.

It would be easy enough to write a directive with semantics equivalent
to the shell's "here-file" I/O redirection
(e.g., ``command <<EndOfFileMarker``).  It could look like this::

    An ordinary paragraph.

    .. here-literal:: [[END]]
    This text belongs to
    the literal block.
    [[END]]

    Another ordinary paragraph.

(I used "[[END]]" only because it's distinctive.  Could be anything.)
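
The core of such a directive would be a scan for the end marker.  A
minimal sketch in Python follows; it is not an existing Docutils
directive, and the function name ``extract_here_literal`` and its
arguments are hypothetical.

```python
def extract_here_literal(lines, start):
    """Collect the body of a hypothetical "here-literal" directive.

    lines[start] is the directive line, for example
    ".. here-literal:: [[END]]".  Returns a tuple of the literal lines
    and the index of the first line after the end marker.
    """
    marker = lines[start].split("::", 1)[1].strip()
    if not marker:
        raise ValueError("here-literal directive needs an end marker")
    literal = []
    for index in range(start + 1, len(lines)):
        if lines[index].strip() == marker:
            return literal, index + 1
        literal.append(lines[index])  # kept verbatim, no indentation needed
    raise ValueError("end marker %r not found" % marker)


document = [
    "An ordinary paragraph.",
    "",
    ".. here-literal:: [[END]]",
    "This text belongs to",
    "the literal block.",
    "[[END]]",
    "",
    "Another ordinary paragraph.",
]
literal, next_index = extract_here_literal(document, 2)
```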

This seems like a long way to go just to avoid a little indentation.
What's the use case?  (Why do you need it?  Why is indentation bad?)
Perhaps you have a better suggestion for syntax?

A similar issue was raised and resolved regarding comments.  See the
Doc-SIG thread "Comments on the reST specification" beginning
2001-08-03.  A summary of the issues and final decision is at:
http://docutils.sf.net/spec/rst/alternatives.html#comments

>   This was one of the main reasons for POD: start a document,
>   copy-paste code or whatever, and process it into man pages, html
>   and so forth.

I don't follow.  Please explain.

> - relative URLs ?
>
>   I would like to add relative URLs, e.g.:
>
>   see also /some/where/more.html
>
>   Is there a mark-up for that ? Could docutils provide one ?

There is no direct syntax now.  You could re-code as::

    see also more.html_.

    .. _more.html: /some/where/more.html

Or you could use interpreted text with a "relative reference" role::

    see also `/some/where/more.html`:rel:

(Note: this is not implemented yet.)

If you use it a lot, you could specify that "rel" is the default role
for interpreted text in your documents.  (Not implemented yet,
either.)
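
To show what such a role might produce, here is a rough sketch in
Python that rewrites the proposed ``:rel:`` syntax as HTML links.
Everything in it is an assumption for illustration: the role does not
exist yet, and the names ``REL_ROLE`` and ``render_rel_roles`` are
hypothetical.  A real implementation would live in the parser and
produce document-tree nodes rather than HTML strings.

```python
import re
from html import escape

# Matches interpreted text followed by the proposed "rel" role suffix.
REL_ROLE = re.compile(r"`([^`]+)`:rel:")


def render_rel_roles(text):
    """Replace each `target`:rel: with an HTML link to `target`.

    Only the link itself is escaped; the surrounding text is passed
    through unchanged, which is enough for this illustration.
    """
    def replace(match):
        target = match.group(1)
        return '<a href="%s">%s</a>' % (escape(target, quote=True),
                                        escape(target))

    return REL_ROLE.sub(replace, text)


html_line = render_rel_roles("see also `/some/where/more.html`:rel:")
```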

In conclusion, you have some interesting ideas, but most are not
compelling enough for me to do anything with them any time soon.
I'll put some of them on the to-do list.  If you'd like to see
something done about them in the short term, I welcome your
participation.  I'll be happy to help you to understand the inner
workings of the code (and those discussions will contribute to the
implementation docs).

Thanks for the feedback, and I hope to see some code out of you!

-- 
David Goodger  <goodger@users.sourceforge.net>  Open-source projects:
  - Python Docutils: http://docutils.sourceforge.net/
    (includes reStructuredText: http://docutils.sf.net/rst.html)
  - The Go Tools Project: http://gotools.sourceforge.net/