[PYTHON DOC-SIG] Documenting frameworks: suggestions?

Sun, 11 May 1997 12:08:57 -0400 (EDT)

I've written a first cut at documentation for SocketServer.py, and
think it raises some issues that are worth discussing here.  The TeX
source for the docs is included below.  (Comments on the text are
welcome, though that's not why I'm posting it.)  

Most sections in the library reference simply have to document
wrappers around a few functions, or an object that won't usually be
subclassed.  SocketServer, by contrast, is a fairly simple framework
for writing servers of various kinds.  There are several base classes,
and several kinds of items to document:

	* class variables
	* instance variables
	* functions for external callers
	* functions to be overridden by subclasses
	* functions to be used by those overriding functions, but which
probably aren't worth overriding themselves.

	The documentation is thus broken up into sections; right now
I've written little fragments of text like "The server classes support
the following class variables" and then the class variables are listed
and explained.  Ditto for instance variables and all the different
types of functions.  That breaks up the text into a lot of different
sections, and makes it difficult to scan through, so a better approach
might be in order.

Texinfo has special environments for this sort of thing, and I like
its rendering; the output looks like:

    request_queue_size				Class variable 
    	bla bla bla text goes here bla bla bla

    server_address				Instance variable
    	bla bla bla

Perhaps for each function, we could indicate whether it's public or
not, and overridable or not.  Or we could use a tabular format: a table
that lists all the attributes and methods, marking which ones are public
and which are overridable, followed by everything in one big
alphabetical list.  Or there could be two subsections, the first
covering the public interface and the second covering subclassing...

	Should the documentation be sufficient to use the module, or
is it intended that users will look at the code for SocketServer.py?
For example, the default implementation of functions like
handle_error() is mentioned; should it perhaps be assumed that users
will go look at that function's code and decide if it suits them?

There are other frameworks, like BaseHTTPServer, that need to be
documented for 1.5, so now's the time to improve the structure of the
documentation.  

	A.M. Kuchling
	amk@magnet.com
	http://people.magnet.com/%7Eamk/

	Note to Guido: this isn't finished, so please don't add it to
the docs yet.  Thanks!

====================================================================
\section{Standard Module \sectcode{SocketServer}}
\stmodindex{SocketServer}

The \code{SocketServer} module simplifies the task of writing network
servers.  Classes are available to write servers that use Internet
protocols or Unix domain sockets, and to use them as either streams or
datagrams.  

There are four basic server classes: \code{TCPServer} uses the
Internet TCP protocol, which provides continuous streams of data
between the client and server.  \code{UDPServer} uses datagrams, which
are discrete packets of information that may arrive out of order or be
lost while in transit.  The more infrequently used
\code{UnixStreamServer} and \code{UnixDatagramServer} are similar, but
use Unix domain sockets.  For more details on network programming,
consult a book such as W. Richard Stevens's _XXX_ or XXX (Windows
version?).

These four classes process requests \dfn{synchronously}; each request
must be completed before the next request can be started.  This isn't
suitable if each request takes a long time to complete; this may occur
if a request requires a lot of computation, or if the request returns
a lot of data and the client has a slow connection.  The solution is
to create a separate process or thread to handle each request; the
\code{ForkingMixIn} and \code{ThreadingMixIn} mix-in classes can be
used to support asynchronous behaviour.  
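
As a minimal sketch of how a mix-in might be combined with a server
class, the following defines a threaded TCP server and checks that the
handler runs outside the calling thread.  The names
\code{ThreadedTCPServer}, \code{WhoHandler} and \code{demo} are
hypothetical, and the import assumes that newer Python versions spell
the module name \code{socketserver}.

```python
# Assumption: newer Python versions name the module "socketserver";
# the text above describes it under the name "SocketServer".
try:
    import socketserver
except ImportError:
    import SocketServer as socketserver
import socket
import threading


class ThreadedTCPServer(socketserver.ThreadingMixIn, socketserver.TCPServer):
    """Hypothetical server: the mix-in is listed first so that its
    process_request() takes precedence over the synchronous one."""


handler_threads = []        # threads in which handle() has run
done = threading.Event()    # set once the handler has finished


class WhoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler that records the thread it runs in."""

    def handle(self):
        handler_threads.append(threading.current_thread())
        self.request.sendall(b'ok')
        done.set()


def demo():
    # Port 0 asks the operating system for any free port.
    server = ThreadedTCPServer(('127.0.0.1', 0), WhoHandler)
    try:
        client = socket.create_connection(server.server_address)
        try:
            server.handle_request()   # returns once the thread is started
            reply = client.recv(16)   # blocks until the handler replies
        finally:
            client.close()
        done.wait(5)
    finally:
        server.server_close()
    in_other_thread = handler_threads[0] is not threading.current_thread()
    return reply, in_other_thread
```

Substituting \code{ForkingMixIn} for \code{ThreadingMixIn} should give
one process per request instead, on platforms that support forking.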

Creating a server requires several steps.  First, you must create a
request handler class by subclassing the \code{BaseRequestHandler}
class and overriding its \code{handle()} method; this method processes
incoming requests.  Second, you must instantiate one of the server
classes, passing it the server's address and the request handler
class.  Finally, call the \code{serve_forever()} or
\code{handle_request()} method of the server object.
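
The three steps above can be sketched as a small echo server.  This is
an illustration under stated assumptions rather than part of the module:
\code{EchoHandler} and \code{echo_once} are hypothetical names, the
client runs in the same process purely for demonstration, and the
import assumes that newer Python versions spell the module name
\code{socketserver}.

```python
# Assumption: newer Python versions name the module "socketserver";
# the text above describes it under the name "SocketServer".
try:
    import socketserver
except ImportError:
    import SocketServer as socketserver
import socket
import threading


# Step 1: subclass BaseRequestHandler and override handle().
class EchoHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: send back whatever the client sent."""

    def handle(self):
        # self.request is the socket connected to the client.
        data = self.request.recv(1024)
        self.request.sendall(data)


def echo_once(message):
    """Serve exactly one request and return the echoed bytes."""
    # Step 2: instantiate a server class with an address and the
    # handler class.  Port 0 asks the system for any free port.
    server = socketserver.TCPServer(('127.0.0.1', 0), EchoHandler)
    # Step 3: call handle_request(); it runs in a helper thread here
    # only so that this function can also act as the client.
    worker = threading.Thread(target=server.handle_request)
    worker.start()
    try:
        client = socket.create_connection(server.server_address)
        try:
            client.sendall(message)
            reply = client.recv(1024)
        finally:
            client.close()
    finally:
        worker.join(5)
        server.server_close()
    return reply
```

A long-running server would call \code{serve_forever()} in place of
\code{handle_request()}.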

Server classes have the same external methods and attributes, no
matter what network protocol they use:

%XXX should data and methods be intermingled, or separate?
% how should the distinction between class and instance variables be
% drawn?

\begin{funcdesc}{fileno}{}
Return the file descriptor for the socket on which the server is
listening; this is just an integer.  The return value is most commonly
passed to \code{select.select()}, to allow monitoring multiple servers
in the same process.  
\end{funcdesc}
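
One possible way to monitor several servers in one process is sketched
below: the descriptors returned by \code{fileno()} are handed to
\code{select.select()}, and \code{handle_request()} is called only on
the servers that have a request waiting.  \code{TagHandler},
\code{serve_ready} and \code{demo} are hypothetical names, and the
import assumes that newer Python versions spell the module name
\code{socketserver}.

```python
# Assumption: newer Python versions name the module "socketserver";
# the text above describes it under the name "SocketServer".
try:
    import socketserver
except ImportError:
    import SocketServer as socketserver
import select
import socket


class TagHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler: reply with the port the client reached."""

    def handle(self):
        port = self.server.server_address[1]
        self.request.sendall(str(port).encode('ascii'))


def serve_ready(servers, timeout):
    """Handle one request on each server whose socket is readable."""
    by_fd = {}
    for server in servers:
        by_fd[server.fileno()] = server
    readable, _, _ = select.select(list(by_fd), [], [], timeout)
    for fd in readable:
        by_fd[fd].handle_request()
    return len(readable)


def demo():
    # Port 0 asks the operating system for any free port.
    first = socketserver.TCPServer(('127.0.0.1', 0), TagHandler)
    second = socketserver.TCPServer(('127.0.0.1', 0), TagHandler)
    expected = str(second.server_address[1]).encode('ascii')
    try:
        # Only the second server receives a connection.
        client = socket.create_connection(second.server_address)
        try:
            handled = serve_ready([first, second], 5.0)
            reply = client.recv(64)
        finally:
            client.close()
    finally:
        first.server_close()
        second.server_close()
    return handled, reply == expected
```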

\begin{funcdesc}{handle_request}{}
Process a single request.  This function calls the following methods,
in order: \code{get_request()}, \code{verify_request()}, and
\code{process_request()}.  If the user-provided \code{handle()} method
of the handler class raises an exception, the server's
\code{handle_error()} method will be called.
\end{funcdesc}

\begin{funcdesc}{serve_forever}{}
Handle an infinite number of requests.  This simply calls
\code{handle_request()} inside an infinite loop.
\end{funcdesc}

% XXX should class variables be covered before instance variables, or
% vice versa?
% Perhaps it would be more understandable and compact to do something
% like Texinfo does?

The server classes support the following class variables:

\begin{datadesc}{address_family}
The family of protocols to which the server's socket belongs.
\code{socket.AF_INET} and \code{socket.AF_UNIX} are possible values.
\end{datadesc}

\begin{datadesc}{request_queue_size}
The size of the request queue.  If it takes a long time to process a
single request, any requests that arrive while the server is busy are
placed into a queue, up to \code{request_queue_size} requests.  Once
the queue is full, further requests from clients will get a
``Connection refused'' error.
The default value is usually 5, but this can be overridden by subclasses.
\end{datadesc}

\begin{datadesc}{socket_type}
The type of socket used by the server; \code{socket.SOCK_STREAM} and
\code{socket.SOCK_DGRAM} are possible values.
\end{datadesc}

Instances of server classes have the following instance variables:

\begin{datadesc}{RequestHandlerClass}
The user-provided request handler class; an instance of this class is
created for each request.
\end{datadesc}

\begin{datadesc}{server_address}
The address on which the server is listening.  This is a tuple
containing a string giving the IP address, and an integer port number:
\code{('127.0.0.1', 80)}, for example.
\end{datadesc}

\begin{datadesc}{socket}
The socket object on which the server will listen.
\end{datadesc}

There are various server methods that can be overridden by subclasses
of base server classes like \code{TCPServer}; these methods aren't
useful to external users of the server object.

% should the default implementations of these be documented, or should
% it be assumed that the user will look at SocketServer.py?

\begin{funcdesc}{finish_request}{}
Actually processes the request by instantiating
\code{RequestHandlerClass} and calling its \code{handle()} method.   
% XXX 
\end{funcdesc}

\begin{funcdesc}{get_request}{}
Must accept a request from the socket, and return a 2-tuple containing
the \emph{new} socket object to be used to communicate with the
client, and the client's address (which is itself a 2-tuple).  
\end{funcdesc}

\begin{funcdesc}{handle_error}{request\, client_address}
This function is called if the \code{RequestHandlerClass}'s
\code{handle()} method raises an exception.  The default action is to print
the traceback to standard output and continue handling further requests.
\end{funcdesc}
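
A subclass might replace the default action, for instance by recording
failures instead of printing them.  The sketch below assumes that
\code{handle_error()} is called while the exception is still being
handled, so that \code{sys.exc_info()} describes it;
\code{FailingHandler}, \code{RecordingServer} and \code{demo} are
hypothetical names, and the import assumes that newer Python versions
spell the module name \code{socketserver}.

```python
# Assumption: newer Python versions name the module "socketserver";
# the text above describes it under the name "SocketServer".
try:
    import socketserver
except ImportError:
    import SocketServer as socketserver
import socket
import sys


class FailingHandler(socketserver.BaseRequestHandler):
    """Hypothetical handler whose handle() always raises."""

    def handle(self):
        raise ValueError('cannot process request')


class RecordingServer(socketserver.TCPServer):
    """Hypothetical server that records failures instead of printing."""

    def __init__(self, server_address, handler_class):
        socketserver.TCPServer.__init__(self, server_address, handler_class)
        self.failures = []

    def handle_error(self, request, client_address):
        # Assumed to run while the exception is being handled, so
        # sys.exc_info() still describes it.
        self.failures.append((client_address, sys.exc_info()[0]))


def demo():
    # Port 0 asks the operating system for any free port.
    server = RecordingServer(('127.0.0.1', 0), FailingHandler)
    try:
        client = socket.create_connection(server.server_address)
        try:
            server.handle_request()   # the handler raises; server survives
        finally:
            client.close()
    finally:
        server.server_close()
    return [exc_type for (address, exc_type) in server.failures]
```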

\begin{funcdesc}{process_request}{request\, client_address}
Calls \code{finish_request()} to create an instance of the
\code{RequestHandlerClass}.  If desired, this function can create a new
process or thread to handle the request; the \code{ForkingMixIn} and
\code{ThreadingMixIn} classes do this.
\end{funcdesc}

% Is there any point in documenting the following two functions?
% What would the purpose of overriding them be: initializing server
% instance variables, adding new network families?

\begin{funcdesc}{server_activate}{}
Called by the server's constructor to activate the server. 
May be overridden.
\end{funcdesc}

\begin{funcdesc}{server_bind}{}
Called by the server's constructor to bind the socket to the desired
address.  May be overridden.
\end{funcdesc}

\begin{funcdesc}{verify_request}{request\, client_address}
Must return a Boolean value; if the value is true, the request will be
processed, and if it's false, the request will be denied.  
This function can be overridden to implement access controls for a server.
\end{funcdesc}
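
As an illustration of such an access control, the sketch below accepts
requests only from a fixed set of client hosts.  The class name
\code{LocalOnlyServer} and its \code{allowed_hosts} attribute are
hypothetical, and the import assumes that newer Python versions spell
the module name \code{socketserver}.

```python
# Assumption: newer Python versions name the module "socketserver";
# the text above describes it under the name "SocketServer".
try:
    import socketserver
except ImportError:
    import SocketServer as socketserver


class LocalOnlyServer(socketserver.TCPServer):
    """Hypothetical server that only accepts listed client hosts."""

    # Hypothetical class variable; subclasses may extend the tuple.
    allowed_hosts = ('127.0.0.1',)

    def verify_request(self, request, client_address):
        # client_address is a (host, port) 2-tuple for Internet sockets.
        host = client_address[0]
        return host in self.allowed_hosts
```

Requests for which \code{verify_request()} returns false are dropped
before \code{process_request()} is ever called, so the handler class
needs no checks of its own.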

_______________
DOC-SIG  - SIG for the Python Documentation Project

send messages to: doc-sig@python.org
administrivia to: doc-sig-request@python.org
_______________