From john_sips_tea at yahoo.com  Mon Mar  6 19:07:32 2006
From: john_sips_tea at yahoo.com (John M. Gabriele)
Date: Mon, 6 Mar 2006 10:07:32 -0800 (PST)
Subject: [Doc-SIG] docutils, ReST, and pydoc
Message-ID: <20060306180732.7242.qmail@web80902.mail.scd.yahoo.com>

Holy mackerel! I just read a good portion of PEP 287
http://www.python.org/peps/pep-0287.html and it seems
clear to me that I should be able to put reStructured
text right into my docstrings and then read them nicely
rendered with the pydoc command.

Is it a foregone conclusion that this functionality
will soon be built into standard Python?

If so, how long until that happens? What sticking points
are we currently facing?

I took a brief look at the docutils page. Is it possible
that the project has bitten off more than it can chew? That
is, it looks like they're building a very general tool,
whereas Python may simply need to have pydoc properly render
ReST in docstrings so I can run "pydoc some_module" and
get some nice manpage-style (perldoc style) documentation
right there in my terminal.

Thanks,
---John



From goodger at python.org  Tue Mar  7 04:19:53 2006
From: goodger at python.org (David Goodger)
Date: Mon, 06 Mar 2006 22:19:53 -0500
Subject: [Doc-SIG] docutils, ReST, and pydoc
In-Reply-To: <20060306180732.7242.qmail@web80902.mail.scd.yahoo.com>
References: <20060306180732.7242.qmail@web80902.mail.scd.yahoo.com>
Message-ID: <440CFBD9.2070108@python.org>

[John M. Gabriele]
> Holy mackerel! I just read a good portion of PEP 287
> http://www.python.org/peps/pep-0287.html and it seems
> clear to me that I should be able to put reStructured
> text right into my docstrings and then read them nicely
> rendered with the pydoc command.

(note: it's spelled "reStructuredText", all one word,
abbreviated reST or ReST or RST, but not REST)

> Is it a foregone conclusion that this functionality
> will soon be built into standard Python?

No.  PEP stands for "Python Enhancement *Proposal*", and PEP 287 is
still "Status: Draft".

> What sticking points are we currently facing?

Completion of the necessary features, especially the Docutils PySource
Reader.

> I took a brief look at the docutils page. Is it possible
> that the project has bitten off more than it can chew?

It's an ambitious project, certainly.  What's the point if it's not a
challenge? (0.5 ;-)  Many lesser attempts have fallen by the wayside.
Docutils has had a lot of success so far.

> That is, it looks like they're building a very general tool,
> whereas Python may simply need to have pydoc properly render
> ReST in docstrings so I can run "pydoc some_module" and
> get some nice manpage-style (perldoc style) documentation
> right there in my terminal.

Such tools already exist, such as Epydoc, Pudge, and Endo.  Simply
rendering reST is easy.  Adding hyperlinks and the correct context is
the challenge.  Doing it without importing the code you're documenting
is important too.
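
As a rough illustration of the "without importing" point, here is a
minimal sketch that collects docstrings by parsing source text with the
standard-library ast module.  It assumes a modern Python 3 interpreter;
SOURCE and collect_docstrings are hypothetical names made up for this
example, and this is not how Epydoc, Pudge, or Endo actually work.

```python
import ast

# Hypothetical module source; it is parsed, never imported or executed.
SOURCE = '''
"""Module docstring."""

def wc(filename):
    """Count the words in a file."""
    raise RuntimeError("never executed: the source is only parsed")
'''


def collect_docstrings(source):
    """Map names to docstrings by parsing the source text only."""
    tree = ast.parse(source)
    found = {"<module>": ast.get_docstring(tree)}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            found[node.name] = ast.get_docstring(node)
    return found


docs = collect_docstrings(SOURCE)
for name, doc in docs.items():
    print(name, "->", doc)
```

Because nothing is executed, side effects in the documented module
(such as the RuntimeError above) never run.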

Read PEP 258, especially the "Python Source Reader" section for more
on the vision behind the tool *I* want, and that I'll build
(eventually) if no one beats me to it.  (Note: I haven't taken a good
look at Pudge or Endo yet; they may have already done the hard
lifting.)

-- 
David Goodger <http://python.net/~goodger>


From nico at tekNico.net  Tue Mar  7 08:21:32 2006
From: nico at tekNico.net (Nicola Larosa)
Date: Tue, 07 Mar 2006 08:21:32 +0100
Subject: [Doc-SIG] docutils, ReST, and pydoc
In-Reply-To: <440CFBD9.2070108@python.org>
References: <20060306180732.7242.qmail@web80902.mail.scd.yahoo.com>
	<440CFBD9.2070108@python.org>
Message-ID: <dujca1$et2$1@sea.gmane.org>

>> That is, it looks like they're building a very general tool,
>> whereas Python may simply need to have pydoc properly render
>> ReST in docstrings so I can run "pydoc some_module" and
>> get some nice manpage-style (perldoc style) documentation
>> right there in my terminal.

> Such tools already exist, such as Epydoc, Pudge, and Endo.  Simply
> rendering reST is easy.  Adding hyperlinks and the correct context is
> the challenge.  Doing it without importing the code you're documenting
> is important too.

For reference:

Epydoc
http://epydoc.sourceforge.net/

Pudge
http://pudge.lesscode.org/

Endo (part of the Enthought Tool Suite)
http://code.enthought.com/ets/

Did not know about this, thanks David.

There's also docextractor:

http://codespeak.net/svn/user/mwh/docextractor/trunk/

API docs
http://radeex.blogspot.com/2006/02/api-docs.html

-- 
Nicola Larosa - http://www.tekNico.net/

It will always be true that people that drive slower than me are morons,
and people that drive faster than me are idiots. :)
 -- Matthew Carlisle on Slashdot, December 2005


From fredrik at pythonware.com  Tue Mar  7 09:18:33 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 7 Mar 2006 09:18:33 +0100
Subject: [Doc-SIG] docutils, ReST, and pydoc
References: <20060306180732.7242.qmail@web80902.mail.scd.yahoo.com><440CFBD9.2070108@python.org>
	<dujca1$et2$1@sea.gmane.org>
Message-ID: <dujfkr$of7$1@sea.gmane.org>

Nicola Larosa wrote:

> There's also docextractor:
>
> http://codespeak.net/svn/user/mwh/docextractor/trunk/

and pythondoc:

    http://www.effbot.org/zone/pythondoc.htm

</F>


From manlio_perillo at libero.it  Fri Mar 10 12:18:14 2006
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Fri, 10 Mar 2006 09:18:14 -0200
Subject: [Doc-SIG] coordination for translation of the Python Documentation
Message-ID: <44116076.7030308@libero.it>

Greetings.

Is there an effort to coordinate the translations of
the Python Documentation?


Thanks  Manlio Perillo

From aahz at pythoncraft.com  Fri Mar 10 18:34:07 2006
From: aahz at pythoncraft.com (Aahz)
Date: Fri, 10 Mar 2006 09:34:07 -0800
Subject: [Doc-SIG] coordination for translation of the Python
	Documentation
In-Reply-To: <44116076.7030308@libero.it>
References: <44116076.7030308@libero.it>
Message-ID: <20060310173407.GB7625@panix.com>

On Fri, Mar 10, 2006, Manlio Perillo wrote:
> 
> Is there an effort to coordinate the translations of
> the Python Documentation?

AFAIK, there are no formal efforts, though there are many people willing
to help.  You'll probably get more useful responses if you ask a more
specific question.
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis

From manlio_perillo at libero.it  Sun Mar 12 17:45:47 2006
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Sun, 12 Mar 2006 14:45:47 -0200
Subject: [Doc-SIG] coordination for translation of the Python
	Documentation
In-Reply-To: <20060310173407.GB7625@panix.com>
References: <44116076.7030308@libero.it> <20060310173407.GB7625@panix.com>
Message-ID: <4414503B.6070701@libero.it>

Aahz wrote:
> On Fri, Mar 10, 2006, Manlio Perillo wrote:
>> Is there an effort to coordinate the translations of
>> the Python Documentation?
> 
> AFAIK, there are no formal efforts, though there are many people willing
> to help.  You'll probably get more useful responses if you ask a more
> specific question.

Well, I'm the new maintainer of python.it, the Italian Python web site.

I was hoping for some formal guidelines or support to aid the translations.

Currently I'm porting all the documents to a Subversion repository
(http://svn.python.it),
and I'm trying to write some scripts to automate the translation of the
Standard Documentation (or at least to help merge changes in the
original documents into the translations)
http://svn.python.it/admin/trunk/scripts/update-version.py

Are there any similar efforts?


Thanks and regards  Manlio Perillo

From rsenra at acm.org  Tue Mar 14 00:06:12 2006
From: rsenra at acm.org (Rodrigo Senra)
Date: Mon, 13 Mar 2006 20:06:12 -0300
Subject: [Doc-SIG] coordination for translation of the
	Python	Documentation
In-Reply-To: <20060310173407.GB7625@panix.com>
References: <44116076.7030308@libero.it> <20060310173407.GB7625@panix.com>
Message-ID: <AD11A9AA-B83F-470D-A4C7-DC4604C8EC63@acm.org>


On 10 Mar 2006, at 2:34 PM, Aahz wrote:

> On Fri, Mar 10, 2006, Manlio Perillo wrote:
>>
>> Is there an effort to coordinate the translations
>> of the Python Documentation?
>
> AFAIK, there are no formal efforts, though there are many people  
> willing to help.  You'll probably get more useful responses if
> you ask a more specific question.

If you are planning to target the Italian language, you can check what
other people are doing to coordinate their own efforts.
Some examples:

French - [1] (Translation effort)
Brazilian Portuguese - [2] (Translation effort)
Italian - [3] (actual documentation translated)
Others - [4] (where to dig further)

[1] http://frpython.sourceforge.net/
[2] http://www.pythonbrasil.com.br/moin.cgi/PythonDoc
[3] http://www.python.it/doc/Python-Docs/html/
[4] http://www.python.org/doc/nonenglish/

best regards,
Senra


Rodrigo Senra
______________
rsenra @ acm.org
http://rodrigo.senra.nom.br


From manlio_perillo at libero.it  Tue Mar 14 11:55:20 2006
From: manlio_perillo at libero.it (Manlio Perillo)
Date: Tue, 14 Mar 2006 08:55:20 -0200
Subject: [Doc-SIG] coordination for translation of the
	Python	Documentation
In-Reply-To: <AD11A9AA-B83F-470D-A4C7-DC4604C8EC63@acm.org>
References: <44116076.7030308@libero.it> <20060310173407.GB7625@panix.com>
	<AD11A9AA-B83F-470D-A4C7-DC4604C8EC63@acm.org>
Message-ID: <4416A118.70803@libero.it>

Rodrigo Senra wrote:
> 
> On 10 Mar 2006, at 2:34 PM, Aahz wrote:
> 
>> On Fri, Mar 10, 2006, Manlio Perillo wrote:
>>>
>>> Is there an effort to coordinate the translations of
>>> the Python Documentation?
>>
>> AFAIK, there are no formal efforts, though there are many people
>> willing to help.  You'll probably get more useful responses if
>> you ask a more specific question.
> 
> If you are planning to target the Italian language, you can check what
> other people are doing to coordinate their own efforts.
> Some examples:
> 
> French - [1] (Translation effort)
> Brazilian Portuguese - [2] (Translation effort)
> Italian - [3] (actual documentation translated)

Well, I'm indeed the new maintainer of python.it.

The problem is that the translated documentation is not "integrated"
into the CPython distribution.

I would like, at least, to have it.python.org/docs or
docs.python.org/it point to the translated documentation.

Moreover, the i18n project is dead...

The only translations "integrated" in the main site are those of the FAQ
(BTW, who is the maintainer of this part of the site?)


Thanks and regards  Manlio Perillo

From edloper at gradient.cis.upenn.edu  Fri Mar 17 18:07:08 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Fri, 17 Mar 2006 12:07:08 -0500
Subject: [Doc-SIG] How should variables' docstrings be written?
Message-ID: <441AECBC.702@gradient.cis.upenn.edu>

Epydoc 3 supports extracting information about Python modules by
parsing.  As a result, it can extract "docstrings" for variables.
There are several possible ways these docstrings could be expressed in
the Python source file, and I wanted to get some feedback on which
ways people prefer.  It's my hope that some consensus can be reached
on this, so that any tools that extract variable docstrings can use
the same conventions.

The conventions I've seen are:

     class A:

         a = 1
         """string literal following the assignment"""

         ##
         # Comment whose first line starts with a double-hash,
         # preceding the assignment.
         b = 2

         #: Comment that begins with a special marker string on all
         #: lines, preceding the assignment
         c = 3

         d = 4  #: Comment w/ marker on the same line as the assignment

         e = [
             #: Comment w/ special marker, within the value expression.
             1,
             2,
             3]

String literal:
   This is the closest form to existing docstrings.  I think it works
   well if the assignment line is fairly short, but for multiline
   values (eg large dictionaries or multiline strings), the docstring
   can become too far from the name of the variable it describes.
   Also, if the value is a multiline string, the division between
   the end of the value and the start of the docstring isn't obvious.

Comment whose first line is a double hash:
   This is used by doxygen and Fredrik Lundh's PythonDoc.  In doxygen,
   if there's text on the line with the double hash, it is treated as
   a summary string.  I dislike this convention because it seems too
   likely to result in false positives.  E.g., if you comment-out a
   region with a comment in it, you get a double-hash.

Comment that begins with a special marker string:
   This is my current favorite.  But there's a question of what the
   special marker string should be.  Enthought proposes "#*", partially
   because it works well with line wrapping for some versions of emacs.
   But if a different marker string is decided on, then python-mode.el
   could certainly be made aware of it.  The markers that look
   reasonably good to my eye are:

     #: #| #*

Currently, epydoc supports both string literals and comments with the
special marker "#:".  The comment-docstrings can be placed before the
assignment, after it on the same line, or within the value (or any
combination thereof).
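
To make the marker-comment convention concrete, here is a minimal
sketch of how a tool might collect "#:" comments using the
standard-library tokenize module.  It assumes Python 3, handles only
simple "name = value" assignments, and is not epydoc's actual
implementation; variable_docs, MARKER, and SOURCE are hypothetical
names used only for this example.

```python
import io
import tokenize

MARKER = "#:"  # assumed marker; "#|" or "#*" would work the same way

# Hypothetical source to scan; it is tokenized, never executed.
SOURCE = '''\
class A:
    #: Number of retries
    #: before giving up.
    c = 3

    d = 4  #: Same-line comment.

    # ordinary comment, ignored
    e = 5
'''


def variable_docs(source, marker=MARKER):
    """Return {variable name: doc text} for marker comments."""
    docs = {}
    pending = []    # marker comments seen before the statement starts
    trailing = []   # marker comments on the same line or inside the value
    statement = []  # significant tokens of the current logical line
    skip = (tokenize.NL, tokenize.INDENT, tokenize.DEDENT)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            if tok.string.startswith(marker):
                text = tok.string[len(marker):].strip()
                (trailing if statement else pending).append(text)
        elif tok.type in skip:
            continue
        elif tok.type == tokenize.NEWLINE:
            # End of a logical line: attach comments to simple assignments.
            is_assignment = (len(statement) >= 2
                             and statement[0].type == tokenize.NAME
                             and statement[1].string == "=")
            if is_assignment and (pending or trailing):
                docs[statement[0].string] = " ".join(pending + trailing)
            pending, trailing, statement = [], [], []
        else:
            statement.append(tok)
    return docs


docs = variable_docs(SOURCE)
print(docs)
```

Passing a different marker argument is all it takes to try one of the
other candidate markers.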

So..  Which conventions do people prefer?

-Edward


From john_sips_tea at yahoo.com  Fri Mar 17 19:31:18 2006
From: john_sips_tea at yahoo.com (John M. Gabriele)
Date: Fri, 17 Mar 2006 10:31:18 -0800 (PST)
Subject: [Doc-SIG] How should variables' docstrings be written?
In-Reply-To: <441AECBC.702@gradient.cis.upenn.edu>
Message-ID: <20060317183118.45440.qmail@web80908.mail.scd.yahoo.com>

--- Edward Loper <edloper at gradient.cis.upenn.edu> wrote:

> [snip]
> 
> Comment that begins with a special marker string:
>    This is my current favorite.  [snip]
>    The markers that look
>    reasonably good to my eye are:
> 
>      #: #| #*
> 
> [snip]
> 
> So..  Which conventions do people prefer?
> 
> -Edward
> 

I like that one too because it saves space and also b/c
it comes before the variable. I never understood why
docstrings come *after* the thing they're describing...

Anyway, I think any of your suggestions look pretty good
to the eye, and with proper syntax highlighting they will
all stand out nicely in one's editor.

I'd prefer #@ or #% though, since they're easier to type.
I can hold down the shift key with my right pinky and
just hit those two keys (3-2 or 3-5) very quickly without
missing a beat.

---John



From edloper at gradient.cis.upenn.edu  Fri Mar 17 22:10:17 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Fri, 17 Mar 2006 16:10:17 -0500
Subject: [Doc-SIG] non-ascii docstrings
Message-ID: <441B25B9.6080509@gradient.cis.upenn.edu>

I've been working on epydoc, and the question has come up of how I 
should treat non-unicode docstrings that contain non-ascii characters. 
An example of such a file is "python2.4/encodings/string_escape.py", 
whose module docstring contains an 'o' with an umlaut.

In particular, the question is whether I should assume that the 
docstring is encoded with the encoding specified by the "-*- coding -*-" 
directive at the top of the file.

The reason why we *wouldn't* use the encoding is that PEP 263 [1], which 
defines the coding directive, says that it does *not* apply to 
non-unicode string literals.  In particular, PEP 263 says that the 
entire file should be read & tokenized using the specified coding, but 
once string objects are created, they should be reencoded back into 
8-bit strings using the file encoding.

So the "correct" fix is for the author of the module to use unicode 
literals instead of string literals for docstrings that contain 
non-ascii characters.  This has the advantage that if a user tries to 
look at the docstring via introspection, it will be correct.

On the other hand, epydoc is often used by people other than the author 
of a module, and requiring them to go through and replace all string 
literal docstrings with unicode literals seems a bit unreasonable.

In a way, this is similar to the mistake I've seen many times of using 
non-escaped backslashes inside docstrings.  e.g.:

def wc(filename):
     """
     Count the number of words in the given file. E.g.:
         >>> wc("c:\test\new.txt")
         100
     """

Which looks fine in the source file, but looks quite broken if you print 
its __doc__:

 >>> print wc.__doc__
     Count the number of words in the given file. E.g.:
          >>> wc("c:     est
ew.txt")
     100

(The right fix in that case is probably to use a raw-string.)
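
A minimal sketch of that fix, written for Python 3 (where print is a
function); wc_plain and wc_raw are illustrative names only, and the
function bodies are deliberately empty apart from the docstrings.

```python
def wc_plain(filename):
    """
    Count the number of words in the given file. E.g.:
        >>> wc("c:\test\new.txt")
        100
    """


def wc_raw(filename):
    r"""
    Count the number of words in the given file. E.g.:
        >>> wc("c:\test\new.txt")
        100
    """


# In the plain docstring, \t and \n were turned into a tab and a newline.
print(repr(wc_plain.__doc__))
# In the raw docstring, the backslashes survive exactly as typed.
print(repr(wc_raw.__doc__))
```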

So the question is..  Should epydoc (and other tools like it) be 
compliant with PEP 263 (and consistent with Python); or should they "do 
what I mean, not what I say" and treat non-ascii docstrings as if they 
were encoded using the module's encoding?

-Edward

[1] http://www.python.org/doc/peps/pep-0263/

From fuzzyman at voidspace.org.uk  Fri Mar 17 22:49:49 2006
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 17 Mar 2006 21:49:49 +0000
Subject: [Doc-SIG] How should variables' docstrings be written?
In-Reply-To: <441AECBC.702@gradient.cis.upenn.edu>
References: <441AECBC.702@gradient.cis.upenn.edu>
Message-ID: <441B2EFD.2090807@voidspace.org.uk>

Edward Loper wrote:
> Epydoc 3 supports extracting information about Python modules by
> parsing.  As a result, it can extract "docstrings" for variables.
>   
Fantastic news.

> There are several possible ways these docstrings could be expressed in
> the Python source file, and I wanted to get some feedback on which
> ways people prefer.  It's my hope that some consensus can be reached
> on this, so that any tools that extract variable docstrings can use
> the same conventions.
>
> The conventions I've seen are:
>
>      class A:
>
>          a = 1
>          """string literal following the assignment"""
>
>          ##
>          # Comment whose first line starts with a double-hash,
>          # preceding the assignment.
>          b = 2
>
>   
My preference. :-)

>          #: Comment that begins with a special marker string on all
>          #: lines, preceding the assignment
>   
As the colon is a character with special significance within Python
syntax, I would find this distracting when reading code.

>          c = 3
>
>          d = 4  #: Comment w/ marker on the same line as the assignment
>
>   
Inline comments are generally uglier.

>          e = [
>              #: Comment w/ special marker, within the value expression.
>              1,
>              2,
>              3]
>
> String literal:
>    This is the closest form to existing docstrings.  I think it works
>    well if the assignment line is fairly short, but for multiline
>    values (eg large dictionaries or multiline strings), the docstring
>    can become too far from the name of the variable it describes.
>    Also, if the value is a multiline string, the division between
>    the end of the value and the start of the docstring isn't obvious.
>
> Comment whose first line is a double hash:
>    This is used by doxygen and Fredrik Lundh's PythonDoc.  In doxygen,
>    if there's text on the line with the double hash, it is treated as
>    a summary string.  I dislike this convention because it seems too
>    likely to result in false positives.  E.g., if you comment-out a
>    region with a comment in it, you get a double-hash.
>
> Comment that begins with a special marker string:
>    This is my current favorite.  But there's a question of what the
>    special marker string should be.  Enthought proposes "#*", partially
>    because it works well with line wrapping for some versions of emacs.
>    But if a different marker string is decided on, then python-mode.el
>    could certainly be made aware of it.  The markers that look
>    reasonably good to my eye are:
>
>      #: #| #*
>
>   
Bearable. :-)

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml

> Currently, epydoc supports both string literals and comments with the
> special marker "#:".  The comment-docstrings can be placed before the
> assignment, after it on the same line, or within the value (or any
> combination thereof).
>
> So..  Which conventions do people prefer?
>
> -Edward


From goodger at python.org  Sat Mar 18 03:54:16 2006
From: goodger at python.org (David Goodger)
Date: Fri, 17 Mar 2006 21:54:16 -0500
Subject: [Doc-SIG] How should variables' docstrings be written?
In-Reply-To: <441AECBC.702@gradient.cis.upenn.edu>
References: <441AECBC.702@gradient.cis.upenn.edu>
Message-ID: <441B7658.50807@python.org>

[Edward Loper]
> So..  Which conventions do people prefer?

I prefer string literals after the assignments.  If you support doc
comments, I suggest allowing the marker to be specified per-module.

-- 
David Goodger <http://python.net/~goodger>


From edloper at gradient.cis.upenn.edu  Sat Mar 18 07:12:14 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Sat, 18 Mar 2006 01:12:14 -0500
Subject: [Doc-SIG] How should variables' docstrings be written?
In-Reply-To: <441B7658.50807@python.org>
References: <441AECBC.702@gradient.cis.upenn.edu> <441B7658.50807@python.org>
Message-ID: <441BA4BE.5010701@gradient.cis.upenn.edu>

David Goodger wrote:
> [Edward Loper]
>> So..  Which conventions do people prefer?
> 
> I prefer string literals after the assignments.  If you support doc
> comments, I suggest allowing the marker to be specified per-module.
> 

How would it be specified?  Some __special_variable__?

-Edward

From goodger at python.org  Sat Mar 18 15:34:12 2006
From: goodger at python.org (David Goodger)
Date: Sat, 18 Mar 2006 09:34:12 -0500
Subject: [Doc-SIG] How should variables' docstrings be written?
In-Reply-To: <441BA4BE.5010701@gradient.cis.upenn.edu>
References: <441AECBC.702@gradient.cis.upenn.edu> <441B7658.50807@python.org>
	<441BA4BE.5010701@gradient.cis.upenn.edu>
Message-ID: <441C1A64.9000007@python.org>

> David Goodger wrote:
>> I prefer string literals after the assignments.  If you support doc
>> comments, I suggest allowing the marker to be specified per-module.

[Edward Loper]
> How would it be specified?  Some __special_variable__?

Sure, __docmarker__ perhaps.

-- 
David Goodger <http://python.net/~goodger>


From goodger at python.org  Fri Mar 24 14:53:47 2006
From: goodger at python.org (David Goodger)
Date: Fri, 24 Mar 2006 08:53:47 -0500
Subject: [Doc-SIG] non-ascii docstrings
In-Reply-To: <441B25B9.6080509@gradient.cis.upenn.edu>
References: <441B25B9.6080509@gradient.cis.upenn.edu>
Message-ID: <4423F9EB.1020803@python.org>

[Edward Loper]
> I've been working on epydoc, and the question has come up of how I
> should treat non-unicode docstrings that contain non-ascii
> characters.  An example of such a file is
> "python2.4/encodings/string_escape.py", whose module docstring
> contains an 'o' with an umlaut.
>
> In particular, the question is whether I should assume that the
> docstring is encoded with the encoding specified by the "-*- coding
> -*-" directive at the top of the file.

I think that although it's the only possible assumption, it's also
potentially a wrong assumption.  IOW, don't assume anything.

> The reason why we *wouldn't* use the encoding is that PEP 263 [1],
> which defines the coding directive, says that it does *not* apply to
> non-unicode string literals.  In particular, PEP 263 says that the
> entire file should be read & tokenized using the specified coding,
> but once string objects are created, they should be reencoded back
> into 8-bit strings using the file encoding.

One reason is that the module code may expect such string literals to
have their original encoding.  String literals can contain arbitrary
8-bit data (strings are bytes, not characters).  Attempting to decode
such strings is inviting misinterpretation.

Another reason is simple: "In the face of ambiguity, refuse the
temptation to guess."

> So the "correct" fix is for the author of the module to use unicode
> literals instead of string literals for docstrings that contain
> non-ascii characters.  This has the advantage that if a user tries
> to look at the docstring via introspection, it will be correct.
>
> On the other hand, epydoc is often used by people other than the
> author of a module, and requiring them to go through and replace all
> string literal docstrings with unicode literals seems a bit
> unreasonable.

Yes, it's unreasonable.  But such code is buggy IMO.  It's also
unreasonable to expect Epydoc to correctly interpret garbage input.
Don't do it.

> So the question is..  Should epydoc (and other tools like it) be
> compliant with PEP 263 (and consistent with Python); or should they
> "do what I mean, not what I say" and treat non-ascii docstrings as
> if they were encoded using the module's encoding?

Be compliant with PEP 263, issue a warning (PEP 263, Implementation,
step 1), and either ignore such string literals or represent them as
strings of bytes (using "\xYY" notation).
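
A minimal sketch of the "represent them as strings of bytes" option,
assuming Python 3.5 or later, where a docstring's raw content would be
held as a bytes object.  The function name render_bytes_docstring and
its parameters are hypothetical, and the warning text is made up for
this example.

```python
import warnings


def render_bytes_docstring(raw, name="<docstring>"):
    """Return docstring bytes as text, escaping non-ASCII bytes as \\xYY."""
    try:
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # Refuse to guess an encoding: warn and show the raw byte values.
        warnings.warn(
            "non-ASCII bytes in %s; showing them as \\xYY escapes" % name
        )
        return raw.decode("ascii", errors="backslashreplace")


rendered = render_bytes_docstring(b"Welcome to Link\xf6ping", name="welcome.py")
print(rendered)
```

Nothing is guessed here: pure-ASCII docstrings pass through unchanged,
and anything else stays visibly marked as undecoded bytes.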

-- 
David Goodger <http://python.net/~goodger>


From edloper at gradient.cis.upenn.edu  Sat Mar 25 05:32:57 2006
From: edloper at gradient.cis.upenn.edu (Edward Loper)
Date: Fri, 24 Mar 2006 23:32:57 -0500
Subject: [Doc-SIG] non-ascii docstrings
In-Reply-To: <4423F9EB.1020803@python.org>
References: <441B25B9.6080509@gradient.cis.upenn.edu>
	<4423F9EB.1020803@python.org>
Message-ID: <4424C7F9.9000300@gradient.cis.upenn.edu>

David Goodger wrote:
>> In particular, the question is whether I should assume that the
>> docstring is encoded with the encoding specified by the "-*- coding
>> -*-" directive at the top of the file.
> 
> I think that although it's the only possible assumption, it's also
> potentially a wrong assumption.  IOW, don't assume anything.

That was my inclination at first, but it appears that there are a large 
number of python files out there that use non-ascii docstrings.  Asking 
the epydoc user (who is very often not the package author) to go through 
and add a 'u' in front of every docstring (but *not* any other string -- 
that might break the program) seems unreasonable.  And I have yet to see 
a single python module where the -*- coding -*- directive is *not* the 
right encoding for the docstrings.
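
For illustration, trusting the coding directive might look like the
sketch below.  It assumes Python 3 and uses the standard-library
tokenize.detect_encoding to read the declaration; FILE_BYTES is a
made-up two-line file, and this is not a description of what epydoc
actually does.

```python
import io
import tokenize

# Hypothetical file content: a latin-1 coding line plus a docstring
# containing the byte 0xF6 (o with an umlaut in latin-1).
FILE_BYTES = b'# -*- coding: latin-1 -*-\n"""Welcome to Link\xf6ping"""\n'

# detect_encoding reads at most two lines and returns the normalized
# encoding name together with the raw lines it consumed.
encoding, _ = tokenize.detect_encoding(io.BytesIO(FILE_BYTES).readline)

# "Do what I mean": decode the whole file with the declared encoding.
text = FILE_BYTES.decode(encoding)
print(encoding)
print(text)
```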

> Another reason is simple: "In the face of ambiguity, refuse the
> temptation to guess."

Practicality beats purity. :)

> Yes, it's unreasonable.  But such code is buggy IMO.  It's also
> unreasonable to expect Epydoc to correctly interpret garbage input.

Small consolation to the user who's just trying to learn how to use a 
package that they didn't write.

-Edward


From jerdonek at gmail.com  Sat Mar 25 06:16:46 2006
From: jerdonek at gmail.com (Chris Jerdonek)
Date: Fri, 24 Mar 2006 21:16:46 -0800
Subject: [Doc-SIG] non-ascii docstrings
In-Reply-To: <4424C7F9.9000300@gradient.cis.upenn.edu>
References: <441B25B9.6080509@gradient.cis.upenn.edu>
	<4423F9EB.1020803@python.org>
	<4424C7F9.9000300@gradient.cis.upenn.edu>
Message-ID: <045d618b1f8b76c6615c4e0bab09ca04@gmail.com>

On Mar 24, 2006, at 8:32 PM, Edward Loper wrote:

> David Goodger wrote:
>>> In particular, the question is whether I should assume that the
>>> docstring is encoded with the encoding specified by the "-*- coding
>>> -*-" directive at the top of the file.
>>
>> Yes, it's unreasonable.  But such code is buggy IMO.  It's also
>> unreasonable to expect Epydoc to correctly interpret garbage input.
>
> Small consolation to the user who's just trying to learn how to use a
> package that they didn't write.

Can't you make it an option (messy/pure)?

--Chris


From lac at strakt.com  Sat Mar 25 06:22:18 2006
From: lac at strakt.com (Laura Creighton)
Date: Sat, 25 Mar 2006 06:22:18 +0100
Subject: [Doc-SIG] non-ascii docstrings
In-Reply-To: Message from Edward Loper <edloper@gradient.cis.upenn.edu> 
	of "Fri,
	24 Mar 2006 23:32:57 EST." <4424C7F9.9000300@gradient.cis.upenn.edu> 
References: <441B25B9.6080509@gradient.cis.upenn.edu>
	<4423F9EB.1020803@python.org>
	<4424C7F9.9000300@gradient.cis.upenn.edu> 
Message-ID: <200603250522.k2P5MIfE025596@theraft.strakt.com>


I have never seen a module where the -*- coding -*- is not the same as
the encoding of the docstrings, either.  And the greatest number of
times I have seen this is where people are using some company-wide tool
(possibly third-party, and possibly there to integrate with Java code)
to extract the docstrings, and also have a requirement that the
docstring contain the name of the person who wrote and/or modified the
code.

Indeed, in the matter of encoding, I wish that python would guess 
a whole lot more.  One of the most common 'first python programs'
that non-computer people write is 'my phonelist manager' and another is
'my cd collection manager'.  I think that they have plenty enough
to worry about without needing to find out about encodings before 
their first python program runs.  Most places _have_ a locale sort
of setting, and I would be in favour of trying whatever is there
when encountering something that is not ascii.

Laura


From fredrik at pythonware.com  Tue Mar 28 18:48:37 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Tue, 28 Mar 2006 18:48:37 +0200
Subject: [Doc-SIG] non-ascii docstrings
References: <441B25B9.6080509@gradient.cis.upenn.edu><4423F9EB.1020803@python.org><4424C7F9.9000300@gradient.cis.upenn.edu>
	<200603250522.k2P5MIfE025596@theraft.strakt.com>
Message-ID: <e0bpd6$n29$1@sea.gmane.org>

Laura Creighton wrote:

> Indeed, in the matter of encoding, I wish that python would guess
> a whole lot more.  One of the most common 'first python programs'
> that non-computer people write is 'my phonelist manager' and another is
> 'my cd collection manager'.  I think that they have plenty enough
> to worry about without needing to find out about encodings before
> their first python program runs.  Most places _have_ a locale sort
> of setting, and I would be in favour of trying whatever is there
> when encountering something that is not ascii.

as long as the interpreter prints a warning when it falls back on the
default...  oh, wait.

$ python2.2 welcome.py
Welcome to Linköping

$ python2.3 welcome.py
sys:1: DeprecationWarning: Non-ASCII character '\xf6' in file welcome.py on line 1, but no encoding declared; see
http://www.python.org/peps/pep-0263.html for details
Welcome to Linköping

$ python2.4 welcome.py
sys:1: DeprecationWarning: Non-ASCII character '\xf6' in file welcome.py on line 1, but no encoding declared; see
http://www.python.org/peps/pep-0263.html for details
Welcome to Linköping

$ python2.5 welcome.py
  File "welcome.py", line 1
SyntaxError: Non-ASCII character '\xf6' in file /users/fredrik/welcome.py on line 1, but no encoding declared; see
http://www.python.org/peps/pep-0263.html for details

guess this means that newbies should make sure to run their first
program under multiple Python versions...

</F>


From fredrik at pythonware.com  Wed Mar 29 21:56:57 2006
From: fredrik at pythonware.com (Fredrik Lundh)
Date: Wed, 29 Mar 2006 21:56:57 +0200
Subject: [Doc-SIG] introducing the experimental pytut wiki
Message-ID: <e0eoqc$c6k$1@sea.gmane.org>

without further ado, here's

    http://pytut.infogami.com/

inspired by

    http://article.gmane.org/gmane.comp.python.general/455724

and related to several earlier threads on this mailing list.

</F>