PyCon 2010: Poster sessions
===============================================================
Due date: November 30, 2009
PyCon 2010 introduces a new type of presentation, the poster session.
Poster sessions consist of two pieces:
* A display space where you can put up information about a topic
* Live Q&A during a plenary timeslot where people can get more
information from you while you stand next to your display
For more information and to submit a poster proposal, visit
http://us.pycon.org/2010/conference/posters/
--
Aahz (aahz(a)pythoncraft.com) <*> http://www.pythoncraft.com/
[on old computer technologies and programmers] "Fancy tail fins on a
brand new '59 Cadillac didn't mean throwing out a whole generation of
mechanics who started with model As." --Andrew Dalke
I propose the following PEP for inclusion to Python 3.1.
Please comment.
Regards,
Martin
Abstract
========
Namespace packages are a mechanism for splitting a single Python
package across multiple directories on disk. In current Python
versions, an algorithm to compute the package's __path__ must be
formulated. With the enhancement proposed here, the import machinery
itself will construct the list of directories that make up the
package.
Terminology
===========
Within this PEP, the term package refers to Python packages as defined
by Python's import statement. The term distribution refers to
separately installable sets of Python modules as stored in the Python
package index, and installed by distutils or setuptools. The term
vendor package refers to groups of files installed by an operating
system's packaging mechanism (e.g. Debian or Red Hat packages installed
on Linux systems).
The term portion refers to a set of files in a single directory (possibly
stored in a zip file) that contribute to a namespace package.
Namespace packages today
========================
Python currently provides pkgutil.extend_path to denote a package as
a namespace package. The recommended way of using it is to put::

    from pkgutil import extend_path
    __path__ = extend_path(__path__, __name__)

in the package's ``__init__.py``. Every distribution needs to provide
the same contents in its ``__init__.py``, so that extend_path is
invoked independent of which portion of the package gets imported
first. As a consequence, the package's ``__init__.py`` cannot
practically define any names, because which portion is imported first
depends on the order of the package fragments on sys.path. As a special
feature, extend_path reads files named ``*.pkg``, which allow declaring
additional portions.
setuptools provides a similar function pkg_resources.declare_namespace
that is used in the form::

    import pkg_resources
    pkg_resources.declare_namespace(__name__)

In the portion's __init__.py, no assignment to __path__ is necessary,
as declare_namespace modifies the package __path__ through sys.modules.
As a special feature, declare_namespace also supports zip files, and
registers the package name internally so that future additions to sys.path
by setuptools can properly add additional portions to each package.
setuptools allows declaring namespace packages in a distribution's
setup.py, so that distribution developers don't need to put the
magic __path__ modification into __init__.py themselves.
Rationale
=========
The current imperative approach to namespace packages has led to
multiple slightly-incompatible mechanisms for providing namespace
packages. For example, pkgutil supports ``*.pkg`` files; setuptools
doesn't. Likewise, setuptools supports inspecting zip files, and
supports adding portions to its _namespace_packages variable, whereas
pkgutil doesn't.
In addition, the current approach causes problems for system vendors.
Vendor packages typically must not provide overlapping files, and an
attempt to install a vendor package that has a file already on disk
will fail or cause unpredictable behavior. As vendors might choose to
package distributions such that they will end up all in a single
directory for the namespace package, all portions would contribute
conflicting __init__.py files.
Specification
=============
Rather than using an imperative mechanism for importing packages, a
declarative approach is proposed here, as an extension to the existing
``*.pkg`` mechanism.
The import statement is extended so that it directly considers ``*.pkg``
files during import; a directory is considered a package if it either
contains a file named __init__.py, or a file whose name ends with
".pkg".
In addition, the format of the ``*.pkg`` file is extended: a line with
the single character ``*`` indicates that the entire sys.path will
be searched for portions of the namespace package at the time the
namespace package is imported.
Importing a package will immediately compute the package's __path__;
the ``*.pkg`` files are not considered anymore after the initial import.
If a ``*.pkg`` file contains an asterisk, this asterisk is prepended
to the package's __path__ to indicate that the package is a namespace
package (and that thus further extensions to sys.path might also
want to extend __path__). At most one such asterisk gets prepended
to the path.
extend_path will be extended to recognize namespace packages according
to this PEP, and avoid adding directories twice to __path__.
No other change to the importing mechanism is made; searching
modules (including __init__.py) will continue to stop at the first
module encountered.
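As a rough illustration of the rules above, the one-time computation of
__path__ might be sketched as follows. This is not the PEP's reference
implementation: the helper names (has_pkg_file, declares_asterisk,
compute_namespace_path, demo) are hypothetical, the sketch assumes that
a portion is any directory named after the package that contains at
least one ``*.pkg`` file, and it ignores the directory names that
``*.pkg`` files may also list.

```python
import os
import tempfile


def has_pkg_file(directory):
    """Return True if `directory` contains at least one ``*.pkg`` file."""
    return any(entry.endswith(".pkg") for entry in os.listdir(directory))


def declares_asterisk(directory):
    """Return True if a ``*.pkg`` file in `directory` has a line that is just ``*``."""
    for entry in sorted(os.listdir(directory)):
        if entry.endswith(".pkg"):
            with open(os.path.join(directory, entry)) as pkg_file:
                if any(line.strip() == "*" for line in pkg_file):
                    return True
    return False


def compute_namespace_path(name, first_dir, search_path):
    """Sketch of the __path__ computation for package `name`.

    `first_dir` is the package directory that the import found first;
    `search_path` stands in for sys.path.
    """
    path = [first_dir]
    if not declares_asterisk(first_dir):
        return path  # ordinary package: no further search
    for entry in search_path:
        candidate = os.path.join(entry, name)
        if (candidate not in path and os.path.isdir(candidate)
                and has_pkg_file(candidate)):
            path.append(candidate)
    # Assumption: a single leading asterisk marks the package as a namespace package.
    return ["*"] + path


def demo():
    """Create two portions of a package ``ns`` and return its computed path."""
    with tempfile.TemporaryDirectory() as root:
        search_path = []
        for dist in ("dist_a", "dist_b"):
            portion = os.path.join(root, dist, "ns")
            os.makedirs(portion)
            with open(os.path.join(portion, dist + ".pkg"), "w") as pkg_file:
                pkg_file.write("*\n")
            search_path.append(os.path.join(root, dist))
        first = os.path.join(search_path[0], "ns")
        result = compute_namespace_path("ns", first, search_path)
        # Report paths relative to the temporary root so the result is stable.
        return [p if p == "*" else os.path.relpath(p, root) for p in result]
```

Calling demo() builds two distributions that each contribute a portion
of ``ns`` and returns the resulting path with the asterisk marker in
front, followed by both portion directories in search-path order.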
Discussion
==========
With the addition of ``*.pkg`` files to the import mechanism, namespace
packages can stop filling out the namespace package's __init__.py.
As a consequence, extend_path and declare_namespace become obsolete.
It is recommended that distributions put a file <distribution>.pkg
into their namespace packages, with a single asterisk. This allows
vendor packages to install multiple portions of a namespace package
into a single directory, with no risk of overlapping files.
Namespace packages can start providing non-trivial __init__.py
implementations; to do so, it is recommended that a single distribution
provides a portion with just the namespace package's __init__.py
(and potentially other modules that belong to the namespace package
proper).
The mechanism is mostly compatible with the existing namespace
mechanisms. extend_path will be adjusted to this specification;
any other mechanism might cause portions to get added twice to
__path__.
Copyright
=========
This document has been placed in the public domain.
Another summit, another potential time to see if people want to change
anything about the issue tracker. I would bring up:
- Dropping Stage in favor of some keywords (e.g. 'needs unit test', 'needs
docs')
- Adding a freestyle text box to delineate which, if any, stdlib module is
the cause of a bug and tie that into Misc/maintainers.rst; would potentially
scale back the Component box
-Brett
Just wanted to publicly thank everyone who has been causing all the
checkins to fix and stabilize the test suite (I think it's mostly
Antoine and Mark, but I could be missing somebody; I'm under a
deadline so only have marginal higher brain functionality).
-Brett
At present, configuration of Python's logging package can be done in one of two
ways:
1. Create a ConfigParser-readable configuration file and use
logging.config.fileConfig() to read and implement the configuration therein.
2. Use the logging API to programmatically configure logging using getLogger(),
addHandler() etc.
The first of these works for simple cases but does not cover all of the logging
API (e.g. Filters). The second of these provides maximal control, but besides
requiring users to write the configuration code, it fixes the configuration in
Python code and does not facilitate changing it easily at runtime.
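To make the second approach concrete, here is a minimal sketch of
programmatic configuration that attaches a Filter, one of the features
mentioned above as not covered by fileConfig(). The logger name
"demo.programmatic" and the DropNoisy filter are made up for
illustration; the output goes to an in-memory stream so the result can
be inspected.

```python
import io
import logging


class DropNoisy(logging.Filter):
    """Hypothetical filter that discards records mentioning 'noisy'."""

    def filter(self, record):
        return "noisy" not in record.getMessage()


# Write log output to an in-memory buffer instead of stderr.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(message)s"))
handler.addFilter(DropNoisy())

logger = logging.getLogger("demo.programmatic")
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the example isolated from the root logger
logger.addHandler(handler)

logger.info("kept")
logger.info("noisy detail")  # rejected by the filter

output = stream.getvalue()
```

The configuration here lives entirely in Python code, which is the
drawback described above: changing it means editing and redeploying
the code.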
In addition, the ConfigParser format appears to engender dislike (sometimes
strong dislike) in some quarters. Though it was chosen because it was the only
configuration format supported in the stdlib at that time, many people regard it
(or perhaps just the particular schema chosen for logging's configuration) as
'crufty' or 'ugly', in some cases apparently on purely aesthetic grounds. Recent
versions of Python of course support an additional battery-included format which
can be used for configuration - namely, JSON. Other options, such as YAML, are
also possible ways of configuring systems, Google App Engine-style, and PyYAML
has matured nicely.
There has also been talk on the django-dev mailing list about providing better
support for using Python logging in Django. When it happens (as of course I hope
it does) this has the consequence that many new users who use Django but are
relatively inexperienced in Python (e.g. in PHP shops which are moving to
Django) will become exposed to Python logging. As Django is configured using a
Python module and use of ConfigParser-style files is not a common approach in
that ecosystem, users will find either of the two approaches outlined above a
particular pain point when configuring logging for their Django applications and
websites, unless something is done to avoid it.
All three of the contenders for the title of "commonly found configuration
mechanism" - JSON, YAML and Python code - will be expressible, in Python, as
Python dicts. So it seems to make sense to add, to logging.config, a new
callable bound to "dictConfig" which will take a single dictionary argument and
configure logging from that dictionary.
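As a sketch of what this could look like from the user's side, the
following loads a JSON document into a dict and hands it to
logging.config.dictConfig. The schema is not specified in this post;
the keys used here ("version", "formatters", "handlers", "loggers")
follow the schema accepted by the dictConfig that later shipped in the
standard library (Python 2.7 and 3.2 onward), and may differ from what
was envisaged at the time of writing. The logger name "demo" is made up
for illustration.

```python
import json
import logging
import logging.config

# A configuration that could equally come from a .json file on disk.
CONFIG_JSON = """
{
  "version": 1,
  "disable_existing_loggers": false,
  "formatters": {
    "brief": {"format": "%(levelname)s:%(name)s:%(message)s"}
  },
  "handlers": {
    "console": {
      "class": "logging.StreamHandler",
      "formatter": "brief",
      "stream": "ext://sys.stdout"
    }
  },
  "loggers": {
    "demo": {"level": "DEBUG", "handlers": ["console"], "propagate": false}
  }
}
"""

# JSON, YAML and Python code all end up as a plain dict at this point.
config = json.loads(CONFIG_JSON)
logging.config.dictConfig(config)

logging.getLogger("demo").debug("configured from a dict")
```

A YAML file or a dict defined in a Django settings module could be
passed to the same call once it has been loaded into a dict.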
An important facet of implementing such a scheme will be the format or schema
which the dictionary has to adhere to. I have started working on what such a
schema would look like, and if people here think it's a good idea to go ahead
with this, I'll provide the details of the schema in a separate post which I'll
also cross-post on comp.lang.python so that I can get feedback from there, too.
In outline, the scheme I have in mind will look like this, in terms of the new
public API:
import copy

class DictConfigurator:
    def __init__(self, config):  # config is a dict-like object (duck-typed)
        self.config = copy.deepcopy(config)

    def configure(self):
        # actually do the configuration here using self.config
        pass

dictConfigClass = DictConfigurator

def dictConfig(config):
    dictConfigClass(config).configure()
This allows easy replacement of DictConfigurator with a suitable subclass where
needed.
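The replacement mechanism can be sketched as follows. This is only an
illustration: DefaultingConfigurator is a hypothetical subclass, and
configure() here returns the stored dict purely so that the effect can
be observed, which the outline above does not do.

```python
import copy


class DictConfigurator:
    def __init__(self, config):
        self.config = copy.deepcopy(config)

    def configure(self):
        # Placeholder: a real implementation would set up logging here.
        return self.config


class DefaultingConfigurator(DictConfigurator):
    """Hypothetical subclass that fills in a default before configuring."""

    def configure(self):
        self.config.setdefault("version", 1)
        return super().configure()


dictConfigClass = DictConfigurator


def dictConfig(config):
    # The class is looked up at call time, so rebinding the name takes effect.
    return dictConfigClass(config).configure()


# Swap in the subclass without touching dictConfig itself.
dictConfigClass = DefaultingConfigurator
original = {"handlers": {}}
result = dictConfig(original)
```

Because the configurator deep-copies its argument, the caller's dict is
left unchanged even though the subclass adds a key.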
What's the general feeling here about this proposal? All comments and
suggestions will be gratefully received.
Regards,
Vinay Sajip
Would it be worth spending some time discussing the buildbot situation
at the PyCon 2010 language summit? In the past, I've found the
buildbots to be an incredibly valuable resource; especially when
working with aspects of Python or C that tend to vary significantly
from platform to platform (for me, this usually means floating-point,
and platform math libraries, but there are surely many other things it
applies to). But more recently there seem to have been some
difficulties keeping a reasonable number of buildbots up and running.
A secondary problem is that it can be awkward to debug some of the
more obscure test failures on buildbots without having direct access
to the machine. From conversations on IRC, I don't think I'm alone in
wanting to find ways to make the buildbots more useful.
So the question is: how best to invest time and possibly money to
improve the buildbot situation (and as a result, I hope, improve the
quality of Python)? What could be done to make maintenance of build
slaves easier? Or to encourage interested third parties to donate
hardware and time? Are there good alternatives to Buildbot that might
make a difference? What do other projects do?
These are probably the wrong questions; I'm hoping that a discussion
would help produce the right questions, and possibly some answers.
Mark
Hello,
I wondered if someone had a clue about the following behaviour.
While debugging an erratic test_mailbox failure on RDM's buildbot (and other
machines), it turned out that the system sometimes set the wrong mtime on a
directory:
$ date && python -c 'import os; os.link("setup.py", "t/c")' && stat t && date
Sun Nov 1 09:49:04 EST 2009
File: `t'
Size: 144 Blocks: 0 IO Block: 4096 directory
Device: 811h/2065d Inode: 223152 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 1001/ pitrou) Gid: ( 1005/ pitrou)
Access: 2009-11-01 09:10:11.000000000 -0500
Modify: 2009-11-01 09:49:03.000000000 -0500
Change: 2009-11-01 09:49:03.000000000 -0500
Sun Nov 1 09:49:04 EST 2009
As you see above, the mtime for directory 't' is set to a full second before the
actual modification has happened.
Sprinkling traces of time.time() and os.path.getmtime() on Lib/mailbox.py shows
this is exactly what trips up test_mailbox. I've posted a patch to fix it
(see issue #6896), but I would like to know if such OS behaviour is normal.
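For anyone who wants to check their own system, the shell one-liner
above can be approximated in Python along these lines. This is a rough
sketch, not the test from the patch: the function name and file names
are made up, and it only reports the timestamps without interpreting
them.

```python
import os
import tempfile
import time


def link_and_compare():
    """Create a hard link in a fresh directory and sample the clocks around it."""
    with tempfile.TemporaryDirectory() as root:
        source = os.path.join(root, "source.txt")
        with open(source, "w") as f:
            f.write("x")
        target_dir = os.path.join(root, "t")
        os.mkdir(target_dir)

        before = time.time()
        os.link(source, os.path.join(target_dir, "c"))  # modifies target_dir
        mtime = os.path.getmtime(target_dir)
        after = time.time()
    return before, mtime, after


before, mtime, after = link_and_compare()
print("time before link: %.3f" % before)
print("directory mtime:  %.3f" % mtime)
print("time after link:  %.3f" % after)
```

If the directory mtime printed is earlier than the time sampled before
the link, the system shows the same behaviour as described above.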
Regards
Antoine.