[Python-3000] PEP 3108: Standard Library Reorganization

Brett Cannon brett at python.org
Tue Jan 2 01:14:45 CET 2007


As part of my New Year's resolution to get all of my current and planned PEPs
actually written, accepted, and implemented for 2007, here is the stdlib
reorg PEP.  I have already checked it into svn but I have inlined it below
for comments.

The PEP is separated into two parts: removal and renaming.  For removal I
tried to keep the list to what I thought was very reasonable.  If you look
in Open Issues there are some possible removals that I am not so sure on so
I would like feedback on those.  I also have not touched the various Mac
modules since I never use them and a bunch of them are undocumented.  If Bob
or Donald (or anyone else for that matter) could help me identify Mac
modules that could go I would really appreciate it.

For the renaming I listed all of the modules that either violate PEP 8, have
both a Python and C implementation, or do not have a public API and thus
should be hidden.  The only possibly controversial renamings are of _winreg,
dummy_threading, and repr.  _winreg should be renamed since the higher-level
version never materialized.  I personally want dummy_threading renamed
mockthreading since dummythreading just looks bad to me, it really is more
of a mock implementation, and I wrote the module to begin with.  =)  And
repr should be renamed reprlib (if it is not flagged for removal of its
public interface) so it doesn't shadow a built-in.

The Open Issues section also has a bunch of suggested groupings of modules
that seemed extremely obvious to me or that other people pointed out.  I
know Guido didn't OK this but they were just staring me in the face so I
just went ahead and listed what I thought could be reasonable packages to
create or modules that really don't need to be separate.  Since we are doing
renames anyway I figured it wouldn't hurt to at least consider these.  They
are also not extensive in terms of putting the entire stdlib into a handful
of packages, just stuff that is obviously the same theme (e.g., collections)
or are tightly coupled (e.g., urllib/urllib2).  I leave it to Guido and the
rest of you guys to either tell me to ignore these possibilities or to
actually move forward with some of them.

Anyway, have at it!

-Brett


-----------------------------------------

Abstract
========

Just like the language itself, Python's standard library (stdlib) has
grown over the years to be very rich.  But over time some modules
have lost their need to be included with Python.  There has also been
an introduction of a naming convention for modules since Python's
inception that not all modules follow.

With Python 3.0 a chance to remove modules that do not have long-term
usefulness has presented itself.  This chance also allows for the
renaming of modules so that they follow the Python style guide
[#pep-0008]_.  This PEP lists modules that should not be included in
Python 3.0 and what modules need to be renamed.


Modules to Remove
=================

Guido pronounced that "silly old stuff" is to be deleted from the
stdlib for Py3K [#silly-old-stuff]_.  This is open-ended on purpose.
Each module to be removed needs to have a justification as to why it
should no longer be distributed with Python.  This can range from the
module being deprecated in Python 2.x to being for a platform that is
no longer widely used.

This section of the PEP lists the various modules to be removed. Each
subsection represents a different reason for modules to be
removed.  Each module must have a specific justification on top of
being listed in a specific subsection so as to make sure only modules
that truly deserve to be removed are in fact removed.

When a reason mentions how long it has been since a module has been
"uniquely edited", it is in reference to how long it has been since a
checkin was done specifically for the module and not for a change that
applied universally across the entire stdlib.  If an edit time is not
denoted as "unique" then it is the last time the file was edited,
period.


Previously deprecated
---------------------

Modules in this section have been deprecated at some point in the
Python 2.x release series but are currently still distributed with
Python.  Deprecation information is gathered either from PEP 4 or the
Global Module Index [#pep-0004]_, [#module-index]_.  Each module is
listed with the Python version that the deprecation started in.

* buildtools
    2.3
* cfmfile
    2.4
* gopherlib
    2.5
* macfs
    2.3
* md5
    2.5
* mimetools
    2.3
* MimeWriter
    2.3
* mimify
    2.3
* multifile
    2.3
* posixfile
    1.5
* rfc822
    2.3
* rgbimg
    2.5
* sha
    2.5


Platform-specific with minimal use
----------------------------------

Python supports many platforms, some of which are not widely used.
And on some of these platforms there are modules that have limited use
to people on those platforms.  Because of their limited usefulness it
would be better to no longer burden the Python development team with
their maintenance.

* IRIX (which is no longer produced [#irix-retirement]_)
    + AL/al
        - Provides sound support on Indy and Indigo workstations.
            * Both workstations are no longer available.
        - Code has not been uniquely edited in three years.
    + cd
        - CD drive control for SGI systems.
            * SGI no longer sells machines with IRIX on them.
        - Code has not been uniquely edited in 14 years.
    + DEVICE/GL/gl/cgen/cgensupport
        - OpenGL access.
        - Has not been edited in at least eight years.
        - Third-party libraries provide better support.
            * PyOpenGL [#pyopengl]_
    + FL/fl/flp
        - Wrapper for the FORMS library [#irix-forms]_
            * FORMS has not been edited in 12 years.
        - Library is not widely used.
            * First eight hits on Google are for Python docs for fl.
    + fm
        - Wrapper to the IRIS Font Manager library.
            * Only available on SGI machines which no longer come with
              IRIX.
    + imgfile
        - Wrapper for SGI libimage library for imglib image files
          (``.rgb`` files).
        - Python Imaging Library provides read-only support [#pil]_.
        - Not uniquely edited in 13 years.
    + jpeg
        - Wrapper for JPEG (de)compressor.
        - Code not uniquely edited in nine years.
        - Third-party libraries provide better support.
            * Python Imaging Library [#pil]_
    + sv
        - Wrapper for Indigo video card.
            * Hardware is no longer manufactured.
        - Undocumented.
        - Code not uniquely edited in 13 years.
* Solaris
    + SUNAUDIODEV/sunaudiodev
        - Access to the sound card on Sun machines.
        - Code not uniquely edited in over eight years.
* Mac
    + applesingle
        - Undocumented.
            * AppleSingle is a binary file format for A/UX.
                + A/UX no longer distributed.
* UNIX
    + nis
        - Wrapper for NIS.
            * NIS has been replaced by LDAP, DNS, and Kerberos.


Minimal usage
-------------

Some modules that are platform-independent have minimal usage.  This
can be from how easy it is to implement the functionality from scratch
or because the audience for the code is small.

* audiodev
    + Undocumented.
    + Not edited in five years.
    + If removed, sunaudio should go as well.
        - Undocumented.
        - Not edited in over seven years.
* fileinput
    + Basic functionality handled by ``itertools.chain``.
    + Using ``enumerate`` (for the line number in the file),
      ``itertools.repeat`` (for returning the filename with each
      line), and ``zip`` (for connecting the ``enumerate`` object and
      ``itertools.repeat`` object)  provides 95% of other unique
      abilities of fileinput.
* imputil
    + Undocumented.
    + Never updated to support absolute imports.
* mutex
    + Easy to implement using a semaphore and a queue.
    + Cannot block on a lock attempt.
    + Not uniquely edited since its addition 15 years ago.
* repr
    + Controls output of the repr of objects.
        - String slicing and string interpolation can do similar work.
    + Used by pdb, but do not need to expose API.
* symtable/_symtable
    + Undocumented.
* telnetlib
    + Telnet is not used very much anymore.
        - Telnet is unsafe.
        - Most people use SSH instead.
* toaiff
    + Undocumented.
    + Requires ``sox`` library to be installed on the system.
* user
    + Easily handled by allowing the application to specify its own
      module name, check for existence, and import if found.
* new
    + Just a rebinding of names from the 'types' module.
    + Can also call ``type`` built-in to get most types easily.
* pure
    + Written before Pure Atria was bought by Rational which was then
      bought by IBM (in other words, very old).
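
As a rough sketch of the fileinput entry above, the listed building
blocks can be combined as follows.  The helper name ``numbered_lines``
is hypothetical, and this covers only the basic iteration case, not
the whole fileinput API.

.. code-block:: python

    import itertools


    def numbered_lines(paths):
        """Yield (filename, line number in that file, line) tuples."""
        def one_file(path):
            with open(path) as handle:
                # enumerate gives the per-file line number and
                # itertools.repeat gives the filename; zip connects them.
                numbered = enumerate(handle, 1)
                names = itertools.repeat(path)
                for (lineno, line), name in zip(numbered, names):
                    yield name, lineno, line

        # itertools.chain joins the per-file streams into one stream.
        return itertools.chain.from_iterable(one_file(p) for p in paths)

Iterating over ``numbered_lines(["a.txt", "b.txt"])`` would then walk
both files in order, restarting the line count for each file.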

Obsolete
--------

Becoming obsolete signifies that either another module in the stdlib
or a widely distributed third-party library provides a better solution
for what the module is meant for.

* base64/quopri/uu
    + Support exists in the codecs module.
    + If removed (along with binhex), also remove binascii.
        - C implementation of base64, binhex, and uu modules.
* asynchat/asyncore
    + Third-party libraries provide better solutions.
        - twisted [#twisted]_
    + Deprecation previously supported [#py-dev-summary-2004-11-01]_
* Bastion/rexec
    + Restricted execution / security.
    + Turned off in Python 2.3.
    + Modules deemed unsafe.
* dl
    + ctypes provides better support for same functionality.
* fpformat
    + All functionality is supported by string interpolation.
* getopt
    + optparse provides better functionality.
* ihooks
    + Undocumented except for saying that module might be obsolete.
    + For use with rexec which has been turned off since Python 2.3.
* imageop
    + Better support by third-party libraries.
        - Python Imaging Library [#pil]_.
* linuxaudiodev
    + Replaced by ossaudiodev.
* stat
    + ``os.stat`` now returns a tuple with attributes.
* statvfs
    + ``os.statvfs`` now returns a tuple with attributes.
* strop
    + Implements functions used by 'string' module that have now
      become methods on the str type.
* thread
    + People should use 'threading' instead.
        - Rename 'thread' to _thread.
        - Deprecate dummy_thread.
            * Rename to _mockthread.
            * Change of name better reflects module's purpose.
    + Guido has previously supported the deprecation
      [#thread-deprecation]_.
* timing
    + Use timeit or time.
    + Documentation says the module is obsolete [#timing-module]_.
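
To illustrate the stat entry above, here is a small sketch of reading
a field from the result of ``os.stat`` by attribute rather than by
tuple index.  The temporary file and the variable names are only for
the example.

.. code-block:: python

    import os
    import tempfile

    # Write a five-byte file so the expected size is known in advance.
    with tempfile.NamedTemporaryFile(delete=False) as handle:
        handle.write(b"hello")
        path = handle.name

    info = os.stat(path)
    size_by_attribute = info.st_size  # named attribute on the result
    size_by_index = info[6]           # same field by tuple position
    os.remove(path)

Both variables hold the same value; the attribute form does not need
the index constants that the stat module supplies.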


Modules to Rename
=================

Along with the stdlib gaining some modules that are no longer
relevant, there is also the issue of naming.  Many modules existed in
the stdlib before PEP 8 came into existence [#pep-0008]_.  This has
led to some naming inconsistencies that should be addressed.

Any module that has been suggested for removal and does not meet the
required naming scheme is *not* listed below.


PEP 8 violations
----------------

PEP 8 specifies that modules "should have short, lowercase names,
without underscores" [#pep-0008]_.  There is no mention, though, if
this rule extends to modules contained within a package.  The
assumption is that underscores are acceptable in module names when
they are contained within a package but that uppercase letters are
not.

* _winreg
    winreg (rename also because module has a public interface).
* autoGIL
    autogil
* BaseHTTPServer
    basehttpserver
* Carbon
    carbon
* CGIHTTPServer
    cgihttpserver
* ColorPicker
    colorpicker
* ConfigParser
    configparser
* Cookie
    cookie
* copy_reg
    copyreg
* cProfile
* DocXMLRPCServer
    docxmlrpcserver
* dummy_threading
    mockthreading (rename because "mock" makes more sense than
    "dummy" and rename already required).
* EasyDialogs
    easydialogs
* FrameWork
    framework
* HTMLParser
    htmlparser
* MacOS
    macos
* MiniAEFrame
    miniaeframe
* Nav
    nav
* PixMapWrapper
    pixmapwrapper
* py_compile
    pycompile
* Queue
    queue
* repr
    reprlib (rename because module name shadows a built-in).
* ScrolledText
    scrolledtext
* SimpleHTTPServer
    simplehttpserver
* SimpleXMLRPCServer
    simplexmlrpcserver
* SocketServer
    socketserver
* StringIO
    stringio
* Tix
    tix
* Tkinter
    tkinter
* UserDict
    userdict
* UserList
    userlist
* UserString
    userstring
* W
    w
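
Most of the renames listed above follow mechanically from the PEP 8
rule.  A minimal sketch, assuming a hypothetical helper called
``pep8_module_name``; it does not cover the renames made for other
reasons (dummy_threading and repr).

.. code-block:: python

    def pep8_module_name(name):
        """Lowercase a top-level module name and drop its underscores."""
        return name.replace("_", "").lower()


    # A few of the old names from the list above.
    for old in ("BaseHTTPServer", "copy_reg", "py_compile", "Queue"):
        print(old, "->", pep8_module_name(old))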


Merging C and Python implementations of the same interface
----------------------------------------------------------

Several interfaces have both a Python and C implementation.  While it
is great to have a C implementation for speed with a Python
implementation as fallback, there is no need to expose the two
implementations independently in the stdlib.  For Python 3.0 all
interfaces with two implementations will be merged into a single
public interface.

The C module is to be given a leading underscore to delineate the fact
that it is not the reference implementation (the Python implementation
is).  This means that any semantic difference between the C and Python
versions must be dealt with before Python 3.0 or else the C
implementation will be removed until it can be fixed.

One interface that is not listed below is xml.etree.ElementTree.  This
is an externally maintained module and thus is not under the direct
control of the Python development team for renaming.  See `Open
Issues`_ for a discussion on this.

* pickle/cPickle
    + Rename cPickle to _pickle.
    + Semantic completeness of C implementation *not* verified.
* profile/cProfile
    + Rename cProfile to _profile.
    + Semantic completeness of C implementation *not* verified.
* StringIO/cStringIO
    + Rename StringIO to stringio.
    + Rename cStringIO to _stringio.
    + Semantic completeness of C implementation *not* verified.
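
One way a merged public module could pick its implementation is
sketched below.  This is only an illustration of the idea: the
``_checksum`` accelerator module and the ``checksum`` function are
hypothetical names, not part of the stdlib or of this PEP.

.. code-block:: python

    def _py_checksum(data):
        """Reference implementation, written in pure Python."""
        return sum(data) % 256


    try:
        # Hypothetical C accelerator, named with a leading underscore.
        from _checksum import checksum
    except ImportError:
        # No C version available, so fall back to the Python one.
        checksum = _py_checksum

Callers only ever use ``checksum`` and never need to know which
implementation was selected.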


No public, documented interface
-------------------------------

There are several modules in the stdlib that have no defined public
interface.  These modules exist as support code for other modules that
are exposed.  Because they are not meant to be used directly they
should be renamed to reflect this fact.

* bdb
    _bdb
* markupbase
    _markupbase
* opcode
    _opcode
* dummy_thread
    _mockthread (assuming the deprecation of 'thread' occurs).


Transition Plan
===============

For modules to be removed
-------------------------

A PendingDeprecationWarning will be set in Python 2.6 for all modules
slated to be removed in Python 3.0.  This will allow people to know
which modules will not exist but without being overly noisy since
PendingDeprecationWarning is by default silenced.
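
A minimal sketch of what such a warning could look like inside a
module slated for removal.  The helper name ``warn_pending_removal``
and the message wording are assumptions for illustration only.

.. code-block:: python

    import warnings


    def warn_pending_removal(module_name):
        """Flag a module as slated for removal in Python 3.0."""
        warnings.warn(
            "the %s module will be removed in Python 3.0" % module_name,
            PendingDeprecationWarning,
            stacklevel=2,
        )

Because the warning is silenced by default, users would only see it
after enabling it, for example by running Python with ``-W always``.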


For modules to be renamed
-------------------------

Modules will be renamed in Python 2.6.  The original names of the
modules will still work but will issue an ImportWarning upon import.  The
refactoring tool for transitioning to Python 3.0 will refactor imports
that use the original names to the new names.
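
The old-name behaviour could be approximated along these lines.  This
is a sketch under assumptions, not the planned mechanism: the helper
``import_renamed`` is hypothetical.

.. code-block:: python

    import importlib
    import sys
    import warnings


    def import_renamed(old_name, new_name):
        """Load new_name, warn about old_name, and alias the two."""
        warnings.warn(
            "%s has been renamed to %s" % (old_name, new_name),
            ImportWarning,
            stacklevel=2,
        )
        module = importlib.import_module(new_name)
        # Later imports of the old name get the same module object.
        sys.modules[old_name] = module
        return module

For example, ``import_renamed("Queue", "queue")`` would return the
module under its new name while still letting the old name resolve.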


Open Issues
===========

Consolidate dependent modules together into a single module or package?
-----------------------------------------------------------------------

The stdlib has several modules that have a level of dependency between
them (e.g., urllib and urllib2).  Usually one is a low-level module
that provides basic abilities, with a corresponding higher-level API
given in another module for simple use-cases.  In Python 3.0 we could
group these dependent modules together into a single module or package
to better reflect their relationship.

Keep in mind when looking at the groupings that deprecation or removal
is also a possibility if there is enough overlap or a module is
obsolete.

* Cookie/cookielib
* urllib/urllib2
    + urlparse?
    + httplib?
* cgi/cgitb
* Tix/Tkinter
* getpass/pwd/spwd/grp
* mailbox/mhlib
* anydbm/whichdb
* bsddb/dbhash
* pickle/pickletools
* HTMLParser/htmllib
* ftplib/netrc
* parser/symbol


Consolidate certain modules with similar themes together in a package?
----------------------------------------------------------------------

Packages are often used to group together modules that have a similar
theme but do not have any direct relationship or dependency upon each
other.  For Python 3.0 obvious groupings could be done since renaming
of various modules is already occurring.

* collections
    + heapq
    + Queue
    + sets
    + UserDict
    + UserList
    + What to do with UserString?
        - Have a package for Python implementations of built-in types
          instead of putting the User* modules into 'collections'?
* mac
    + Various Mac-specific modules.
    + Same can be done for other platform-specific code.
* Profiling
    + cProfile
    + profile
    + hotshot
    + pstats
* email
    + mailbox
    + mhlib
* Databases
    + anydbm
    + dbhash
    + dbm
    + bsddb
    + dumbdbm
    + gdbm
    + whichdb
* Audio
    + aifc
    + audioop
    + chunk
    + ossaudiodev
    + sunau
    + wave
    + winsound
* Servers
    + BaseHTTPServer
    + CGIHTTPServer
    + DocXMLRPCServer
    + SimpleHTTPServer
    + SimpleXMLRPCServer
    + SocketServer


Modules reliant on obsolete/rarely used file formats?
-----------------------------------------------------

Several modules in the stdlib work on a specific file format.  It is
possible some of these formats are no longer used and thus the stdlib
modules for them can go.  Below is a list of some modules which rely
on a file format that may be obsolete.

* aifc
    AIFF and AIFF-C audio files.  Appears to be only user of the cl
    module (which is undocumented).
* audioop
    Raw (8|16|32)-bit wide audio files (as generated by the al and
    sunaudiodev modules).
* binhex
    binhex4 encoding.
* chunk
    AIFF, AIFF-C, and RMFF audio files.
* sunau
    Sun AU audio files [#sun-au]_.


Renaming of modules maintained outside of the stdlib
----------------------------------------------------

xml.etree.ElementTree not only does not meet PEP 8 naming standards
but it also has an exposed C implementation [#pep-0008]_.  It is an
externally maintained package, though [#pep-0360]_.  A request will be
made for the maintainer to change the name so that it matches PEP 8
and hides the C implementation.


References
==========

.. [#pep-0004] PEP 4: Deprecation of Standard Modules
    (http://www.python.org/dev/pep-0004/)

.. [#pep-0008] PEP 8: Style Guide for Python Code
    (http://www.python.org/dev/pep-0008/)

.. [#pep-0360] PEP 360: Externally Maintained Packages
    (http://www.python.org/peps/pep-0360/)

.. [#module-index] Python Documentation: Global Module Index
    (http://docs.python.org/modindex.html)

.. [#timing-module] Python Library Reference: Obsolete
    (http://docs.python.org/lib/obsolete-modules.html)

.. [#silly-old-stuff] Python-Dev email: "Py3k release schedule worries"
    (http://mail.python.org/pipermail/python-3000/2006-December/005130.html)

.. [#thread-deprecation] Python-Dev email: Autoloading?
    (http://mail.python.org/pipermail/python-dev/2005-October/057244.html)

.. [#py-dev-summary-2004-11-01] Python-Dev Summary: 2004-11-01
    (http://www.python.org/dev/summary/2004-11-01_2004-11-15/#id10)

.. [#pyopengl] PyOpenGL
    (http://pyopengl.sourceforge.net/)

.. [#pil] Python Imaging Library (PIL)
    (http://www.pythonware.com/products/pil/)

.. [#twisted] Twisted
    (http://twistedmatrix.com/trac/)

.. [#irix-retirement] SGI Press Release:
    End of General Availability for MIPS IRIX Products -- December 2006
    (http://www.sgi.com/support/mips_irix.html)

.. [#irix-forms] FORMS Library by Mark Overmars
    (ftp://ftp.cs.ruu.nl/pub/SGI/FORMS)

.. [#sun-au] Wikipedia: Au file format
    (http://en.wikipedia.org/wiki/Au_file_format)