Python-checkins
August 2005
- 28 participants
- 485 discussions
python/nondist/sandbox/setuptools pkg_resources.py, 1.56, 1.57 setuptools.txt, 1.24, 1.25
by pje@users.sourceforge.net 05 Aug '05
Update of /cvsroot/python/python/nondist/sandbox/setuptools
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv4617
Modified Files:
pkg_resources.py setuptools.txt
Log Message:
Performance boosts: don't create environment during require()/resolve()
if all requirements can be met with items already in the working set.
Don't eagerly determine whether a path is a directory. Avoid redundant
path operations, etc. These changes dropped the test suite runtime from
over 3.4 seconds to around .34 seconds.
Index: pkg_resources.py
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/setuptools/pkg_resources.py,v
retrieving revision 1.56
retrieving revision 1.57
diff -u -d -r1.56 -r1.57
--- pkg_resources.py 3 Aug 2005 13:18:50 -0000 1.56
+++ pkg_resources.py 6 Aug 2005 02:30:52 -0000 1.57
@@ -460,8 +460,6 @@
already-installed distribution; it should return a ``Distribution`` or
``None``.
"""
- if env is None:
- env = AvailableDistributions(self.entries)
requirements = list(requirements)[::-1] # set up the stack
processed = {} # set of processed requirements
@@ -477,6 +475,8 @@
dist = best.get(req.key)
if dist is None:
# Find the best distribution and add it to the map
+ if env is None:
+ env = AvailableDistributions(self.entries)
dist = best[req.key] = env.best_match(req, self, installer)
if dist is None:
raise DistributionNotFound(req) # XXX put more info here
@@ -1232,8 +1232,6 @@
"""PEP 302 Importer that wraps Python's "normal" import algorithm"""
def __init__(self, path=None):
- if path is not None and not os.path.isdir(path):
- raise ImportError
self.path = path
def find_module(self, fullname, path=None):
@@ -1269,6 +1267,8 @@
return mod
+
+
def get_importer(path_item):
"""Retrieve a PEP 302 "importer" for the given path item
@@ -1357,9 +1357,8 @@
def find_on_path(importer, path_item, only=False):
"""Yield distributions accessible on a sys.path directory"""
- if not os.path.exists(path_item):
- return
path_item = normalize_path(path_item)
+
if os.path.isdir(path_item):
if path_item.lower().endswith('.egg'):
# unpacked egg
@@ -1370,10 +1369,10 @@
)
else:
# scan for .egg and .egg-info in directory
- for entry in os.listdir(path_item):
- fullpath = os.path.join(path_item, entry)
+ for entry in os.listdir(path_item):
lower = entry.lower()
if lower.endswith('.egg-info'):
+ fullpath = os.path.join(path_item, entry)
if os.path.isdir(fullpath):
# development egg
metadata = PathMetadata(path_item, fullpath)
@@ -1382,16 +1381,17 @@
path_item, metadata, project_name=dist_name
)
elif not only and lower.endswith('.egg'):
- for dist in find_distributions(fullpath):
+ for dist in find_distributions(os.path.join(path_item, entry)):
yield dist
elif not only and lower.endswith('.egg-link'):
- for line in file(fullpath):
+ for line in file(os.path.join(path_item, entry)):
if not line.strip(): continue
for item in find_distributions(line.rstrip()):
yield item
register_finder(ImpWrapper,find_on_path)
+
_namespace_handlers = {}
_namespace_packages = {}
Index: setuptools.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/setuptools/setuptools.txt,v
retrieving revision 1.24
retrieving revision 1.25
diff -u -d -r1.24 -r1.25
--- setuptools.txt 25 Jul 2005 03:12:51 -0000 1.24
+++ setuptools.txt 6 Aug 2005 02:30:52 -0000 1.25
@@ -1597,6 +1597,10 @@
containing ``setup.py``, not the highest revision number in the project.
* Added ``eager_resources`` setup argument
+
+ * Enhanced performance of ``require()`` and related operations when all
+ requirements are already in the working set, and enhanced performance of
+ directory scanning for distributions.
* Fixed some problems using ``pkg_resources`` w/PEP 302 loaders other than
``zipimport``, and the previously-broken "eager resource" support.
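
The log message above credits part of the speedup to building the environment only when a requirement cannot be met from the working set. A minimal sketch of that lazy-construction idea follows; `Catalog`, `resolve`, and the dictionary-based working set are hypothetical stand-ins for illustration and are not the pkg_resources API.

```python
class Catalog:
    """Hypothetical stand-in for an index that is expensive to build."""

    builds = 0  # counts how often the expensive constructor runs

    def __init__(self, entries):
        Catalog.builds += 1
        # A real implementation would scan directories here.
        self.by_name = dict(entries)

    def best_match(self, name):
        return self.by_name.get(name)


def resolve(requirements, working_set, entries, catalog=None):
    """Map each requirement name to a location, building the catalog lazily."""
    resolved = {}
    for name in requirements:
        found = working_set.get(name)
        if found is None:
            # Only pay for the catalog once something is actually missing.
            if catalog is None:
                catalog = Catalog(entries)
            found = catalog.best_match(name)
            if found is None:
                raise LookupError(name)
        resolved[name] = found
    return resolved
```

When every requirement is already in `working_set`, `Catalog` is never constructed, which mirrors moving the `env is None` check inside the loop in the diff above.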
python/dist/src/Doc/lib libpoplib.tex, 1.17, 1.17.4.1
by birkenfeld@users.sourceforge.net 05 Aug '05
Update of /cvsroot/python/python/dist/src/Doc/lib
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19559/dist/src/Doc/lib
Modified Files:
Tag: release24-maint
libpoplib.tex
Log Message:
backport patch [ 1252706 ] poplib list() docstring fix (and docs too)
Index: libpoplib.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libpoplib.tex,v
retrieving revision 1.17
retrieving revision 1.17.4.1
diff -u -d -r1.17 -r1.17.4.1
--- libpoplib.tex 11 Jan 2004 23:00:16 -0000 1.17
+++ libpoplib.tex 5 Aug 2005 21:02:43 -0000 1.17.4.1
@@ -108,8 +108,8 @@
\begin{methoddesc}{list}{\optional{which}}
Request message list, result is in the form
-\code{(\var{response}, ['mesg_num octets', ...])}. If \var{which} is
-set, it is the message to list.
+\code{(\var{response}, ['mesg_num octets', ...], \var{octets})}.
+If \var{which} is set, it is the message to list.
\end{methoddesc}
\begin{methoddesc}{retr}{which}
05 Aug '05
Update of /cvsroot/python/python/dist/src/Lib
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19559/dist/src/Lib
Modified Files:
Tag: release24-maint
poplib.py
Log Message:
backport patch [ 1252706 ] poplib list() docstring fix (and docs too)
Index: poplib.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/poplib.py,v
retrieving revision 1.23
retrieving revision 1.23.4.1
diff -u -d -r1.23 -r1.23.4.1
--- poplib.py 12 Feb 2004 17:35:06 -0000 1.23
+++ poplib.py 5 Aug 2005 21:02:43 -0000 1.23.4.1
@@ -219,7 +219,7 @@
"""Request listing, return result.
Result without a message number argument is in form
- ['response', ['mesg_num octets', ...]].
+ ['response', ['mesg_num octets', ...], octets].
Result when a message number argument is given is a
single response: the "scan listing" for that message.
05 Aug '05
Update of /cvsroot/python/python/dist/src/Doc/lib
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19402/Doc/lib
Modified Files:
libpoplib.tex
Log Message:
bug [ 1252706 ] poplib list() docstring fix (and docs too)
Index: libpoplib.tex
===================================================================
RCS file: /cvsroot/python/python/dist/src/Doc/lib/libpoplib.tex,v
retrieving revision 1.17
retrieving revision 1.18
diff -u -d -r1.17 -r1.18
--- libpoplib.tex 11 Jan 2004 23:00:16 -0000 1.17
+++ libpoplib.tex 5 Aug 2005 21:01:57 -0000 1.18
@@ -108,8 +108,8 @@
\begin{methoddesc}{list}{\optional{which}}
Request message list, result is in the form
-\code{(\var{response}, ['mesg_num octets', ...])}. If \var{which} is
-set, it is the message to list.
+\code{(\var{response}, ['mesg_num octets', ...], \var{octets})}.
+If \var{which} is set, it is the message to list.
\end{methoddesc}
\begin{methoddesc}{retr}{which}
Update of /cvsroot/python/python/dist/src/Lib
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv19402/Lib
Modified Files:
poplib.py
Log Message:
bug [ 1252706 ] poplib list() docstring fix (and docs too)
Index: poplib.py
===================================================================
RCS file: /cvsroot/python/python/dist/src/Lib/poplib.py,v
retrieving revision 1.23
retrieving revision 1.24
diff -u -d -r1.23 -r1.24
--- poplib.py 12 Feb 2004 17:35:06 -0000 1.23
+++ poplib.py 5 Aug 2005 21:01:58 -0000 1.24
@@ -219,7 +219,7 @@
"""Request listing, return result.
Result without a message number argument is in form
- ['response', ['mesg_num octets', ...]].
+ ['response', ['mesg_num octets', ...], octets].
Result when a message number argument is given is a
single response: the "scan listing" for that message.
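
The corrected docstring says that a listing without a message number has three parts: the response, a list of 'mesg_num octets' entries, and an octets value. A small sketch of consuming a result of that shape is below. The helper name `parse_pop3_listing` and the sample values are hypothetical, and the sample uses bytes on the assumption of a Python 3 `poplib`, where responses are bytes.

```python
def parse_pop3_listing(result):
    """Turn a list()-style result into {message_number: size_in_octets}."""
    response, lines, octets = result  # the three-part shape from the docstring
    sizes = {}
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("ascii")
        number, size = line.split()[:2]
        sizes[int(number)] = int(size)
    return sizes


# Hypothetical sample data in the documented shape; no server is contacted.
sample = (b"+OK 2 messages", [b"1 120", b"2 340"], 14)
```

Unpacking into three names is the point of the fix: code written against the old two-element description would fail on the unpacking step.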
05 Aug '05
Update of /cvsroot/python/python/dist/src/Objects
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv3308/Objects
Modified Files:
setobject.c
Log Message:
* Improve a variable name: entry0 --> table.
* Give set_lookkey_string() a fast alternate path when no dummy entries
are present.
* Have set_swap_bodies() reset the hash field to -1 whenever either of
the bodies is not a frozenset. This maintains the invariant that regular
sets always have -1 in the hash field; otherwise, any mutation would make
the hash value invalid.
* Use an entry pointer to simplify the code in frozenset_hash().
Index: setobject.c
===================================================================
RCS file: /cvsroot/python/python/dist/src/Objects/setobject.c,v
retrieving revision 1.40
retrieving revision 1.41
diff -u -d -r1.40 -r1.41
--- setobject.c 5 Aug 2005 00:01:15 -0000 1.40
+++ setobject.c 5 Aug 2005 17:19:54 -0000 1.41
@@ -46,7 +46,7 @@
register unsigned int perturb;
register setentry *freeslot;
register unsigned int mask = so->mask;
- setentry *entry0 = so->table;
+ setentry *table = so->table;
register setentry *entry;
register int restore_error;
register int checked_error;
@@ -55,7 +55,7 @@
PyObject *startkey;
i = hash & mask;
- entry = &entry0[i];
+ entry = &table[i];
if (entry->key == NULL || entry->key == key)
return entry;
@@ -74,7 +74,7 @@
cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
if (cmp < 0)
PyErr_Clear();
- if (entry0 == so->table && entry->key == startkey) {
+ if (table == so->table && entry->key == startkey) {
if (cmp > 0)
goto Done;
}
@@ -93,7 +93,7 @@
least likely outcome, so test for that last. */
for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
i = (i << 2) + i + perturb + 1;
- entry = &entry0[i & mask];
+ entry = &table[i & mask];
if (entry->key == NULL) {
if (freeslot != NULL)
entry = freeslot;
@@ -114,7 +114,7 @@
cmp = PyObject_RichCompareBool(startkey, key, Py_EQ);
if (cmp < 0)
PyErr_Clear();
- if (entry0 == so->table && entry->key == startkey) {
+ if (table == so->table && entry->key == startkey) {
if (cmp > 0)
break;
}
@@ -153,7 +153,7 @@
register unsigned int perturb;
register setentry *freeslot;
register unsigned int mask = so->mask;
- setentry *entry0 = so->table;
+ setentry *table = so->table;
register setentry *entry;
/* Make sure this function doesn't have to handle non-string keys,
@@ -165,31 +165,47 @@
return set_lookkey(so, key, hash);
}
i = hash & mask;
- entry = &entry0[i];
+ entry = &table[i];
if (entry->key == NULL || entry->key == key)
return entry;
- if (entry->key == dummy)
- freeslot = entry;
- else {
- if (entry->hash == hash && _PyString_Eq(entry->key, key))
- return entry;
- freeslot = NULL;
- }
+ if (so->fill != so->used) {
+ if (entry->key == dummy)
+ freeslot = entry;
+ else {
+ if (entry->hash == hash && _PyString_Eq(entry->key, key))
+ return entry;
+ freeslot = NULL;
+ }
- /* In the loop, key == dummy is by far (factor of 100s) the
- least likely outcome, so test for that last. */
- for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
- i = (i << 2) + i + perturb + 1;
- entry = &entry0[i & mask];
- if (entry->key == NULL)
- return freeslot == NULL ? entry : freeslot;
- if (entry->key == key
- || (entry->hash == hash
- && entry->key != dummy
- && _PyString_Eq(entry->key, key)))
+ /* In the loop, key == dummy is by far (factor of 100s) the
+ least likely outcome, so test for that last. */
+ for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
+ i = (i << 2) + i + perturb + 1;
+ entry = &table[i & mask];
+ if (entry->key == NULL)
+ return freeslot == NULL ? entry : freeslot;
+ if (entry->key == key
+ || (entry->hash == hash
+ && entry->key != dummy
+ && _PyString_Eq(entry->key, key)))
+ return entry;
+ if (entry->key == dummy && freeslot == NULL)
+ freeslot = entry;
+ }
+ } else {
+ /* Simplified loop that can assume are no dummy entries */
+ if (entry->hash == hash && _PyString_Eq(entry->key, key))
return entry;
- if (entry->key == dummy && freeslot == NULL)
- freeslot = entry;
+ for (perturb = hash; ; perturb >>= PERTURB_SHIFT) {
+ i = (i << 2) + i + perturb + 1;
+ entry = &table[i & mask];
+ if (entry->key == NULL)
+ return entry;
+ if (entry->key == key
+ || (entry->hash == hash
+ && _PyString_Eq(entry->key, key)))
+ return entry;
+ }
}
}
@@ -377,10 +393,8 @@
setentry small_copy[PySet_MINSIZE];
#ifdef Py_DEBUG
int i, n;
-#endif
-
assert (PyAnySet_Check(so));
-#ifdef Py_DEBUG
+
n = so->mask + 1;
i = 0;
#endif
@@ -841,7 +855,13 @@
memcpy(b->smalltable, tab, sizeof(tab));
}
- h = a->hash; a->hash = b->hash; b->hash = h;
+ if (PyType_IsSubtype(a->ob_type, &PyFrozenSet_Type) &&
+ PyType_IsSubtype(b->ob_type, &PyFrozenSet_Type)) {
+ h = a->hash; a->hash = b->hash; b->hash = h;
+ } else {
+ a->hash = -1;
+ b->hash = -1;
+ }
}
static int
@@ -1301,19 +1321,18 @@
frozenset_hash(PyObject *self)
{
PySetObject *so = (PySetObject *)self;
- long hash = 1927868237L;
- int i, j;
+ long h, hash = 1927868237L;
+ setentry *entry;
+ int i;
if (so->hash != -1)
return so->hash;
hash *= set_len(self) + 1;
- for (i=0, j=so->used ; j ; j--, i++) {
- setentry *entry;
- long h;
-
- while ((entry = &so->table[i])->key == NULL || entry->key==dummy)
- i++;
+ entry = &so->table[0];
+ for (i=so->used ; i ; entry++, i--) {
+ while (entry->key == NULL || entry->key==dummy)
+ entry++;
/* Work to increase the bit dispersion for closely spaced hash
values. The is important because some use cases have many
combinations of a small number of elements with nearby
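
The fast path added to set_lookkey_string() relies on `fill == used` meaning that the table holds no dummy (deleted) entries, so the probe loop can skip the dummy checks. The toy Python model below sketches that idea under simplifying assumptions: the `ToySet` class, its fixed size, and the lack of resizing are illustrative only and are not how setobject.c is organised.

```python
PERTURB_SHIFT = 5
DUMMY = object()  # marks a deleted slot, like the C code's dummy key


class ToySet:
    """Toy fixed-size open-addressing table (no resizing)."""

    def __init__(self, size=8):
        assert size & (size - 1) == 0, "size must be a power of two"
        self.mask = size - 1
        self.table = [None] * size  # each slot: None, DUMMY, or a key
        self.fill = 0               # active slots plus dummy slots
        self.used = 0               # active slots only

    def _lookup(self, key):
        """Return the index holding key, or the index where it would go."""
        h = hash(key) & 0xFFFFFFFF  # treat the hash as unsigned
        i = h & self.mask
        slot = self.table[i]
        if slot is None:
            return i
        perturb = h
        if self.fill != self.used:
            # General path: dummies may exist, remember the first one seen.
            free = None
            if slot is DUMMY:
                free = i
            elif slot == key:
                return i
            while True:
                i = (5 * i + perturb + 1) & self.mask
                perturb >>= PERTURB_SHIFT
                slot = self.table[i]
                if slot is None:
                    return i if free is None else free
                if slot is DUMMY:
                    if free is None:
                        free = i
                elif slot == key:
                    return i
        else:
            # Fast path: no dummies anywhere, so fewer tests per probe.
            if slot == key:
                return i
            while True:
                i = (5 * i + perturb + 1) & self.mask
                perturb >>= PERTURB_SHIFT
                slot = self.table[i]
                if slot is None or slot == key:
                    return i

    def add(self, key):
        i = self._lookup(key)
        slot = self.table[i]
        if slot is None:
            # Keep at least one empty slot so every probe terminates.
            assert self.fill < self.mask, "toy table is full"
            self.fill += 1
            self.used += 1
            self.table[i] = key
        elif slot is DUMMY:
            self.used += 1
            self.table[i] = key

    def discard(self, key):
        i = self._lookup(key)
        slot = self.table[i]
        if slot is not None and slot is not DUMMY:
            self.table[i] = DUMMY  # fill stays the same, used drops
            self.used -= 1

    def __contains__(self, key):
        slot = self.table[self._lookup(key)]
        return slot is not None and slot is not DUMMY
```

A set that has never had an element removed stays on the fast path; the first `discard()` leaves a dummy behind and lookups fall back to the general loop.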
python/nondist/sandbox/mailbox mailbox.py, 1.6, 1.7 test_mailbox.py, 1.5, 1.6 libmailbox.tex, 1.7, 1.8
by gregorykjohnson@users.sourceforge.net 05 Aug '05
Update of /cvsroot/python/python/nondist/sandbox/mailbox
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv27444
Modified Files:
mailbox.py test_mailbox.py libmailbox.tex
Log Message:
* Implement Babyl, except labels. Needs more tests.
* Introduce _singlefileMailbox class and refactor much of _mboxMMDF into
it for use by Babyl.
* Make various tweaks and rearrangements in mbox, MMDF, and _mboxMMDF.
Index: mailbox.py
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/mailbox/mailbox.py,v
retrieving revision 1.6
retrieving revision 1.7
diff -u -d -r1.6 -r1.7
--- mailbox.py 3 Aug 2005 23:50:43 -0000 1.6
+++ mailbox.py 5 Aug 2005 16:34:45 -0000 1.7
@@ -12,6 +12,7 @@
import email.Message
import email.Generator
import rfc822
+import StringIO
try:
import fnctl
except ImportError:
@@ -190,8 +191,6 @@
else:
raise TypeError, "Invalid message type"
-
-
class Maildir(Mailbox):
"""A qmail-style Maildir mailbox."""
@@ -410,11 +409,11 @@
raise KeyError, "No message with key '%s'" % key
-class _mboxMMDF(Mailbox):
- """An mbox or MMDF mailbox."""
+class _singlefileMailbox(Mailbox):
+ """A single-file mailbox."""
def __init__(self, path, factory=None):
- """Initialize an mbox or MMDF mailbox."""
+ """Initialize a single-file mailbox."""
Mailbox.__init__(self, path, factory)
try:
f = file(self._path, 'r+')
@@ -447,37 +446,9 @@
def __setitem__(self, key, message):
"""Replace the keyed message; raise KeyError if it doesn't exist."""
self._lookup(key)
- start, stop = self._append_message(message)
- self._toc[key] = (start, stop)
+ self._toc[key] = self._append_message(message)
self._pending = True
- def get_message(self, key):
- start, stop = self._lookup(key)
- self._assert_mtime()
- self._file.seek(start)
- from_line = self._file.readline()
- msg = self._message_factory(self._file.read(stop - self._file.tell()))
- msg.set_from(from_line[5:-1])
- return msg
-
- def get_string(self, key, from_=False):
- """Return a string representation or raise a KeyError."""
- start, stop = self._lookup(key)
- self._assert_mtime()
- self._file.seek(start)
- if not from_:
- self._file.readline()
- return self._file.read(stop - self._file.tell())
-
- def get_file(self, key, from_=False):
- """Return a file-like representation or raise a KeyError."""
- start, stop = self._lookup(key)
- self._assert_mtime()
- self._file.seek(start)
- if not from_:
- self._file.readline()
- return _PartialFile(self._file, self._file.tell(), stop)
-
def iterkeys(self):
"""Return an iterator over keys."""
self._lookup()
@@ -512,11 +483,12 @@
try:
_lock_file(f, self._path, dot=False)
try:
+ self._pre_mailbox_hook(f)
new_toc = {}
for key in sorted(self._toc.keys()):
- start, stop = self._toc[key]
+ start, stop = self._toc[key][:2]
self._file.seek(start)
- self._pre_write_hook(f)
+ self._pre_message_hook(f)
new_start = f.tell()
while True:
buffer = self._file.read(
@@ -524,8 +496,9 @@
if buffer == '':
break
f.write(buffer)
- new_toc[key] = (new_start, f.tell())
- self._post_write_hook(f)
+ new_toc[key] = (new_start, f.tell()) + \
+ self._toc[key][2:] # XXX: Wrong!
+ self._post_message_hook(f)
finally:
_unlock_file(f, self._path)
except:
@@ -536,17 +509,30 @@
self._file.close()
self._toc = new_toc
self._file = f
+ self._file.flush()
self._set_mtime()
os.remove(self._path + '~')
self._pending = False
+ def _pre_mailbox_hook(self, f):
+ """Called before writing the mailbox to file f."""
+ return
+
+ def _pre_message_hook(self, f):
+ """Called before writing each message to file f."""
+ return
+
+ def _post_message_hook(self, f):
+ """Called after writing each message to file f."""
+ return
+
def close(self):
"""Flush and close the mailbox."""
self.flush()
self._file.close()
def _lookup(self, key=None):
- """Return (start, stop) for given key, or raise a KeyError."""
+ """Return (start, stop), possibly with more info, or raise KeyError."""
if self._toc is None:
self._generate_toc()
if key is not None:
@@ -555,8 +541,64 @@
except KeyError:
raise KeyError, "No message with key '%s'" % key
+ def _assert_mtime(self):
+ """Raise an exception if the file has been externally modified."""
+ if self._mtime != os.fstat(self._file.fileno()).st_mtime:
+ raise Error, 'External modifications detected: use refresh()'
+
+ def _set_mtime(self):
+ """Store the current mtime."""
+ self._mtime = os.fstat(self._file.fileno()).st_mtime
+
def _append_message(self, message):
- """Append message to mailbox and return (start, stop) offset."""
+ """Append message to mailbox and return (start, stop, ...) offsets."""
+ _lock_file(self._file, self._path)
+ try:
+ self._assert_mtime()
+ self._file.seek(0, 2)
+ self._pre_message_hook(self._file)
+ offsets = self._install_message(message)
+ self._post_message_hook(self._file)
+ self._file.flush()
+ self._set_mtime()
+ finally:
+ _unlock_file(self._file, self._path)
+ return offsets
+
+
+class _mboxMMDF(_singlefileMailbox):
+ """An mbox or MMDF mailbox."""
+
+ def get_message(self, key):
+ """Return a Message representation or raise a KeyError."""
+ start, stop = self._lookup(key)
+ self._assert_mtime()
+ self._file.seek(start)
+ from_line = self._file.readline()
+ msg = self._message_factory(self._file.read(stop - self._file.tell()))
+ msg.set_from(from_line[5:-1])
+ return msg
+
+ def get_string(self, key, from_=False):
+ """Return a string representation or raise a KeyError."""
+ start, stop = self._lookup(key)
+ self._assert_mtime()
+ self._file.seek(start)
+ if not from_:
+ self._file.readline()
+ return self._file.read(stop - self._file.tell())
+
+ def get_file(self, key, from_=False):
+ """Return a file-like representation or raise a KeyError."""
+ start, stop = self._lookup(key)
+ self._assert_mtime()
+ self._file.seek(start)
+ if not from_:
+ self._file.readline()
+ return _PartialFile(self._file, self._file.tell(), stop)
+
+ def _install_message(self, message):
+ """Format a message and blindly write to self._file."""
from_line = None
if isinstance(message, str) and message[:5] == 'From ':
newline = message.find(os.linesep)
@@ -569,36 +611,16 @@
elif isinstance(message, _mboxMMDFMessage):
from_line = 'From ' + message.get_from()
elif isinstance(message, email.Message.Message):
- from_line = message.get_unixfrom()
+ from_line = message.get_unixfrom() # May be None.
if from_line is None:
from_line = 'From MAILER-DAEMON %s' % \
time.strftime('%a %b %d %H:%M:%S %Y', time.gmtime())
- _lock_file(self._file, self._path)
- try:
- self._assert_mtime()
- self._file.seek(0, 2)
- self._pre_write_hook(self._file)
- start = self._file.tell()
- self._file.write('%s%s' % (from_line, os.linesep))
- self._dump_message(message, self._file)
- stop = self._file.tell()
- self._post_write_hook(self._file)
- self._file.flush()
- self._set_mtime()
- finally:
- _unlock_file(self._file, self._path)
+ start = self._file.tell()
+ self._file.write('%s%s' % (from_line, os.linesep))
+ self._dump_message(message, self._file)
+ stop = self._file.tell()
return (start, stop)
- def _assert_mtime(self):
- """Raise an exception if the file has been externally modified."""
- if self._mtime != os.fstat(self._file.fileno()).st_mtime:
- raise Error, 'External modifications detected: use refresh()'
-
- def _set_mtime(self):
- """Store the current mtime."""
- self._mtime = os.fstat(self._file.fileno()).st_mtime
-
-
class mbox(_mboxMMDF):
"""A classic mbox mailbox."""
@@ -608,34 +630,26 @@
self._message_factory = mboxMessage
_mboxMMDF.__init__(self, path, factory)
- def _pre_write_hook(self, f):
- """Called by close before writing each message."""
+ def _pre_message_hook(self, f):
+ """Called before writing each message to file f."""
if f.tell() != 0:
f.write(os.linesep)
- def _post_write_hook(self, f):
- """Called by close after writing each message."""
- return
-
def _generate_toc(self):
"""Generate key-to-(start, stop) table of contents."""
starts, stops = [], []
- self._assert_mtime()
self._file.seek(0)
- prev_line = ''
+ self._assert_mtime()
while True:
- pos = self._file.tell()
+ line_pos = self._file.tell()
line = self._file.readline()
if line[:5] == 'From ':
- starts.append(pos)
- # The preceeding newline is part of the separator, e.g.,
- # "\nFrom .*\n", not part of the previous message. Ignore it.
- if prev_line != '':
- stops.append(pos - len(os.linesep))
+ if len(stops) < len(starts):
+ stops.append(line_pos - len(os.linesep))
+ starts.append(line_pos)
elif line == '':
- stops.append(pos)
+ stops.append(line_pos)
break
- prev_line = line
self._toc = dict(enumerate(zip(starts, stops)))
self._next_key = len(self._toc)
@@ -648,31 +662,35 @@
self._message_factory = MMDFMessage
_mboxMMDF.__init__(self, path, factory)
- def _pre_write_hook(self, f):
- """Called by close before writing each message."""
+ def _pre_message_hook(self, f):
+ """Called before writing each message to file f."""
f.write('\001\001\001\001\n')
- def _post_write_hook(self, f):
- """Called by close after writing each message."""
+ def _post_message_hook(self, f):
+ """Called after writing each message to file f."""
f.write('\n\001\001\001\001\n')
def _generate_toc(self):
"""Generate key-to-(start, stop) table of contents."""
starts, stops = [], []
- self._assert_mtime()
self._file.seek(0)
+ next_pos = 0
+ self._assert_mtime()
while True:
+ line_pos = next_pos
line = self._file.readline()
+ next_pos = self._file.tell()
if line[:4 + len(os.linesep)] == '\001\001\001\001' + os.linesep:
- starts.append(self._file.tell())
+ starts.append(next_pos)
while True:
- pos = self._file.tell()
+ line_pos = next_pos
line = self._file.readline()
+ next_pos = self._file.tell()
if line == '\001\001\001\001' + os.linesep:
- stops.append(pos - len(os.linesep))
+ stops.append(line_pos - len(os.linesep))
break
elif line == '':
- stops.append(pos)
+ stops.append(line_pos)
break
elif line == '':
break
@@ -912,6 +930,7 @@
os.rename(os.path.join(self._path, str(key)),
os.path.join(self._path, str(prev + 1)))
prev += 1
+ self._next_key = prev + 1
if len(changes) == 0:
return
keys = self.keys()
@@ -936,6 +955,160 @@
self.set_sequences(all_sequences)
+class Babyl(_singlefileMailbox):
+ """An Rmail-style Babyl mailbox."""
+
+ def get_message(self, key):
+ """Return a Message representation or raise a KeyError."""
+ start, stop = self._lookup(key)
+ self._assert_mtime()
+ self._file.seek(start)
+ self._file.readline() # XXX: parse this '1,' line for labels
+ original_headers = StringIO.StringIO()
+ while True:
+ line = self._file.readline()
+ if line == '*** EOOH ***' + os.linesep or line == '':
+ break
+ original_headers.write(line)
+ visible_headers = StringIO.StringIO()
+ while True:
+ line = self._file.readline()
+ if line == os.linesep or line == '':
+ break
+ visible_headers.write(line)
+ body = self._file.read(stop - self._file.tell())
+ msg = BabylMessage(original_headers.getvalue() + body)
+ msg.set_visible(visible_headers.getvalue())
+ return msg
+
+ def get_string(self, key):
+ """Return a string representation or raise a KeyError."""
+ start, stop = self._lookup(key)
+ self._assert_mtime()
+ self._file.seek(start)
+ self._file.readline() # Skip '1,' line.
+ original_headers = StringIO.StringIO()
+ while True:
+ line = self._file.readline()
+ if line == '*** EOOH ***' + os.linesep or line == '':
+ break
+ original_headers.write(line)
+ while True:
+ line = self._file.readline()
+ if line == os.linesep or line == '':
+ break
+ return original_headers.getvalue() + \
+ self._file.read(stop - self._file.tell())
+
+ def get_file(self, key):
+ """Return a file-like representation or raise a KeyError."""
+ return StringIO.StringIO(self.get_string(key))
+
+ def list_labels(self):
+ """Return a list of user-defined labels in the mailbox."""
+ raise NotImplementedError, 'Method not yet implemented'
+
+ def _generate_toc(self):
+ """Generate key-to-(start, stop, eooh, body) table of contents."""
+ starts, stops = [], []
+ self._file.seek(0)
+ next_pos = 0
+ self._assert_mtime()
+ while True:
+ line_pos = next_pos
+ line = self._file.readline()
+ next_pos = self._file.tell()
+ if line == '\037\014' + os.linesep:
+ if len(stops) < len(starts):
+ stops.append(line_pos - len(os.linesep))
+ starts.append(next_pos)
+ elif line == '\037' or line == '\037' + os.linesep:
+ if len(stops) < len(starts):
+ stops.append(line_pos - len(os.linesep))
+ elif line == '':
+ stops.append(line_pos - len(os.linesep))
+ break
+ self._toc = dict(enumerate(zip(starts, stops)))
+ self._next_key = len(self._toc)
+
+ def _pre_mailbox_hook(self, f):
+ """Called before writing the mailbox to file f."""
+ f.write('BABYL OPTIONS:%sVersion: 5%s\037' % (os.linesep, os.linesep))
+ # XXX: write "Labels:" line too
+
+ def _pre_message_hook(self, f):
+ """Called before writing each message to file f."""
+ f.write('\014\n')
+
+ def _post_message_hook(self, f):
+ """Called after writing each message to file f."""
+ f.write('\n\037')
+
+ def _install_message(self, message):
+ """Write message contents and return (start, stop, ...)."""
+ start = self._file.tell()
+ self._file.write('1,,\n') # XXX: check for labels and add them
+ if isinstance(message, email.Message.Message):
+ pseudofile = StringIO.StringIO()
+ ps_generator = email.Generator.Generator(pseudofile, False, 0)
+ ps_generator.flatten(message)
+ pseudofile.seek(0)
+ while True:
+ line = pseudofile.readline()
+ self._file.write(line)
+ if line == os.linesep or line == '':
+ break
+ self._file.write('*** EOOH ***' + os.linesep)
+ if isinstance(message, BabylMessage):
+ generator = email.Generator.Generator(self._file, False, 0)
+ generator.flatten(message.get_visible())
+ else:
+ pseudofile.seek(0)
+ while True:
+ line = pseudofile.readline()
+ self._file.write(line)
+ if line == os.linesep or line == '':
+ break
+ while True:
+ buffer = pseudofile.read(4069) # Buffer size is arbitrary.
+ if buffer == '':
+ break
+ self._file.write(buffer)
+ elif isinstance(message, str):
+ body_start = message.find(os.linesep + os.linesep) + \
+ 2 * len(os.linesep)
+ if body_start - 2 != -1:
+ self._file.write(message[:body_start])
+ self._file.write('*** EOOH ***' + os.linesep)
+ self._file.write(message[:body_start])
+ self._file.write(message[body_start:])
+ else:
+ self._file.write('*** EOOH ***%s%s' % (os.linesep, os.linesep))
+ self._file.write(message)
+ elif hasattr(message, 'readline'):
+ original_pos = message.tell()
+ first_pass = True
+ while True:
+ line = message.readline()
+ self._file.write(line)
+ if line == os.linesep or line == '':
+ self._file.write('*** EOOH ***' + os.linesep)
+ if first_pass:
+ first_pass = False
+ message.seek(original_pos)
+ else:
+ break
+ while True:
+ buffer = message.read(4096) # Buffer size is arbitrary.
+ if buffer == '':
+ break
+ self._file.write(buffer)
+ else:
+ raise TypeError, "Invalid message type"
+ stop = self._file.tell()
+ return (start, stop)
+
+
class Message(email.Message.Message):
"""Message with mailbox-format-specific properties."""
Index: test_mailbox.py
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/mailbox/test_mailbox.py,v
retrieving revision 1.5
retrieving revision 1.6
diff -u -d -r1.5 -r1.6
--- test_mailbox.py 2 Aug 2005 21:46:13 -0000 1.5
+++ test_mailbox.py 5 Aug 2005 16:34:45 -0000 1.6
@@ -675,6 +675,11 @@
_factory = lambda self, path, factory=None: mailbox.MH(path, factory)
+class TestBabyl(TestMailbox):
+
+ _factory = lambda self, path, factory=None: mailbox.Babyl(path, factory)
+
+
class TestMessage(TestBase):
_factory = mailbox.Message # Overridden by subclasses to reuse tests
@@ -1445,7 +1450,7 @@
def test_main():
- tests = (TestMaildir, TestMbox, TestMMDF, TestMH, TestMessage,
+ tests = (TestMaildir, TestMbox, TestMMDF, TestMH, TestBabyl, TestMessage,
TestMaildirMessage, TestMboxMessage, TestMHMessage,
TestBabylMessage, TestMMDFMessage, TestMessageConversion,
TestProxyFile, TestPartialFile)
Index: libmailbox.tex
===================================================================
RCS file: /cvsroot/python/python/nondist/sandbox/mailbox/libmailbox.tex,v
retrieving revision 1.7
retrieving revision 1.8
diff -u -d -r1.7 -r1.8
--- libmailbox.tex 3 Aug 2005 23:50:43 -0000 1.7
+++ libmailbox.tex 5 Aug 2005 16:34:45 -0000 1.8
@@ -506,19 +506,34 @@
\method{open()} function.
\end{classdesc}
-Babyl is a single-file mailbox format invented for use with the Rmail mail
-reading application that ships with Emacs. A Babyl mailbox begins with a
-so-called options section that indicates the format of the mailbox. Messages
-follow the options section, with the beginning and end of each message
-indicated by control characters. Each message in a Babyl mailbox has an
-accompanying list of \dfn{labels}, or short strings that record extra
-information about the message.
+Babyl is a single-file mailbox format invented for the \program{Rmail} mail
+reading application included with Emacs. A Babyl mailbox begins with an options
+section that indicates the format of the mailbox and contains a list of
+user-defined labels that appear in the mailbox. Messages follow the options
+section. The beginning of a message is indicated by a line containing exactly
+two control characters, namely Control-Underscore
+(\character{\textbackslash037}) followed by Control-L
+(\character{\textbackslash014}). The end of a message is indicated by the start
+of the next message or, in the case of the last message, a line containing only
+a Control-Underscore (\character{\textbackslash037}) character. Each message in
+a Babyl mailbox has an accompanying list of \dfn{labels}, or short strings that
+record extra information about the message.
+
+\class{Babyl} instances have all of the methods of \class{Mailbox} in addition
+to the following:
+
+\begin{methoddesc}{list_labels}{}
+Return a list of all user-defined labels in the mailbox.
+\end{methoddesc}
Some \class{Mailbox} methods implemented by \class{Babyl} deserve special
remarks:
\begin{methoddesc}{get_file}{key}
-XXX
+In Babyl mailboxes, the headers of a message are not stored contiguously with
+the body of the message. To generate a file-like representation, they are
+copied together into a \class{StringIO} instance (from the \module{StringIO}
+module), which may be used like a file.
\end{methoddesc}
\begin{seealso}
@@ -941,7 +956,7 @@
Each message in a Babyl mailbox has two sets of headers, original headers and
visible headers. Visible headers are typically a subset of the original
-headers reformatted to be more attractive. By default, \program{rmail} displays
+headers reformatted to be more attractive. By default, \program{Rmail} displays
only visible headers. \class{BabylMessage} uses the original headers because
they are more complete, though the visible headers may be accessed explicitly
if desired.
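The string branch of the mailbox.py change above writes the headers twice, once as original headers and once as visible headers, with an "*** EOOH ***" line between them. The following is a minimal sketch of that layout, not the sandbox module's code: split_headers, babyl_layout and the "\n" default separator are hypothetical names and choices made for illustration. It tests the result of find() directly, so the "no blank line" case is detected for any separator length.

```python
# Sketch only: these names and the "\n" default are assumptions,
# not part of the mailbox module's API.
EOOH = "*** EOOH ***"


def split_headers(message, linesep="\n"):
    """Split a message string into (headers, body).

    The returned headers include the blank line that terminates them.
    If no blank line is found, the headers are empty and the whole
    string is treated as the body.
    """
    index = message.find(linesep + linesep)
    if index == -1:
        return "", message
    body_start = index + 2 * len(linesep)
    return message[:body_start], message[body_start:]


def babyl_layout(message, linesep="\n"):
    """Return the text stored for one message: original headers,
    the EOOH marker line, visible headers, then the body."""
    headers, body = split_headers(message, linesep)
    if not headers:
        # No header block: write the marker and a blank line first.
        return EOOH + linesep + linesep + message
    return headers + EOOH + linesep + headers + body
```

In this sketch the visible headers are simply a copy of the original headers; a real reader such as Rmail may later rewrite the visible copy.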
Update of /cvsroot/python/python/nondist/peps
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv26233
Modified Files:
pep-0347.txt
Log Message:
Add copyright.
Index: pep-0347.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0347.txt,v
retrieving revision 1.2
retrieving revision 1.3
diff -u -d -r1.2 -r1.3
--- pep-0347.txt 5 Aug 2005 00:16:49 -0000 1.2
+++ pep-0347.txt 5 Aug 2005 07:26:32 -0000 1.3
@@ -169,6 +169,11 @@
remains available. If desired, write access to the python and
distutils modules can be disabled through a CVS commitinfo entry.
+Copyright
+---------
+
+This document has been placed in the public domain.
+
..
Local Variables:
Update of /cvsroot/python/python/nondist/peps
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv11195
Modified Files:
pep-0348.txt
Log Message:
editing pass
Index: pep-0348.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0348.txt,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -d -r1.3 -r1.4
--- pep-0348.txt 4 Aug 2005 03:41:38 -0000 1.3
+++ pep-0348.txt 5 Aug 2005 05:31:44 -0000 1.4
@@ -13,135 +13,157 @@
Abstract
========
-Python, as of version 2.4, has 38 exceptions (including warnings) in the built-in namespace in a rather shallow hierarchy.
-This list of classes has grown over the years without a chance to learn from mistakes and cleaning up the hierarchy.
-This PEP proposes doing a reorganization for Python 3.0 when backwards-compatibility is not an issue.
-Along with this reorganization, adding a requirement that all objects passed to
-a ``raise`` statement must inherit from a specific superclass is proposed.
-Lastly, bare ``except`` clauses will catch only exceptions inheriting from
-Exception.
+Python, as of version 2.4, has 38 exceptions (including warnings) in
+the built-in namespace in a rather shallow hierarchy. This list of
+classes has grown over the years without a chance to learn from
+mistakes and clean up the hierarchy. This PEP proposes doing a
+reorganization for Python 3.0 when backwards-compatibility is not an
+issue. Along with this reorganization, adding a requirement that all
+objects passed to a ``raise`` statement must inherit from a specific
+superclass is proposed. Lastly, bare ``except`` clauses will catch
+only exceptions inheriting from Exception.
Rationale
=========
-Exceptions are a critical part of Python.
-While exceptions are traditionally used to signal errors in a program, they have also grown to be used for flow control for things such as iterators.
-There importance is great.
+Exceptions are a critical part of Python. While exceptions are
+traditionally used to signal errors in a program, they have also grown
+to be used for flow control for things such as iterators. Their
+importance is great.
-But the organization of the exception hierarchy is suboptimal to serve the
-multiple uses of exceptions.
-Mostly for backwards-compatibility reasons, the hierarchy has stayed very flat and old exceptions who usefulness have not been proven have been left in.
-Making exceptions more hierarchical would help facilitate exception handling by making catching exceptions using inheritance much more logical.
-This should also help lead to less errors from being too broad in what exceptions are caught in an ``except`` clause.
+But the organization of the exception hierarchy is suboptimal to serve
+the multiple uses of exceptions. Mostly for backwards-compatibility
+reasons, the hierarchy has stayed very flat and old exceptions whose
+usefulness has not been proven have been left in. Making exceptions
+more hierarchical would help facilitate exception handling by making
+exception catching using inheritance much more logical. This should
+also help lead to fewer errors from overly broad exception catching in
+``except`` clauses.
-A required superclass for all exceptions is also being proposed [Summary2004-08-01]_.
-By requiring any object that is used in a ``raise`` statement to inherit from a specific superclass, certain attributes (such as those laid out in PEP 344 [PEP344]_) can be guaranteed to exist.
-This also will lead to the planned removal of string exceptions.
+A mandatory superclass for all exceptions is also being proposed
+[#Summary2004-08-01]_. By requiring any object that is used in a
+``raise`` statement to inherit from a specific superclass, certain
+attributes (such as those laid out in PEP 344 [#PEP344]_) can be
+guaranteed to exist. This also will lead to the planned removal of
+string exceptions.
-Lastly, bare ``except`` clauses are to catch only exceptions that inherit from
-Exception [python-dev3]_.
-While currently used to catch all exceptions, that use is rather far reaching
-and typically not desired.
-Catching only exceptions inheriting from Exception allows exceptions that
-should not be caught unless explicitly desired to continue to propagate up the
-execution stack.
+Lastly, bare ``except`` clauses are to catch only exceptions that
+inherit from ``Exception`` [#python-dev3]_. While currently used to
+catch all exceptions, that use is too far-reaching and typically not
+desired. Catching only exceptions that inherit from ``Exception``
+allows other exceptions (those that should not be caught unless
+explicitly desired) to continue to propagate up the execution stack.
Philosophy of Reorganization
============================
-There are several goals in this reorganization that defined the philosophy used to guide the work.
-One goal was to prune out unneeded exceptions.
-Extraneous exceptions should not be left in since it just serves to clutter the built-in namespace.
-Unneeded exceptions also dilute the importance of other exceptions by splitting uses between several exceptions when all uses should have been under a single exception.
+There are several goals in this reorganization that defined the
+philosophy used to guide the work. One goal was to prune out unneeded
+exceptions. Extraneous exceptions should not be left in since they
+just serve to clutter the built-in namespace. Unneeded exceptions
+also dilute the importance of other exceptions by splitting uses
+between several exceptions when all uses should have been under a
+single exception.
-Another goal was to introduce any exceptions that were deemed needed to fill any holes in the hierarchy.
-Most new exceptions were done to flesh out the inheritance hierarchy to make it easier to catch a category of exceptions with a simpler ``except`` clause.
+Another goal was to introduce exceptions that were deemed necessary to
+fill holes in the hierarchy. Most new exceptions were added to flesh
+out the inheritance hierarchy to make it easier to catch a category of
+exceptions with a simpler ``except`` clause.
-Changing inheritance to make it more reasonable was a goal.
-As stated above, having proper inheritance allows for more accurate ``except`` statements when catching exceptions based on the inheritance tree.
+Changing inheritance to make the hierarchy more reasonable was a goal.
+As stated above, having proper inheritance allows for more accurate
+``except`` statements when catching exceptions based on the
+inheritance tree.
+
+Lastly, some renaming was done to make the usage of certain exceptions
+more obvious. Having to look up an exception due to the name not
+accurately reflecting its intended use is annoying and slows down
+debugging. Having accurate names also makes debugging easier for new
+programmers. But for simplicity, for the convenience of existing
+users, and for the sake of transitioning to Python 3.0, only
+exceptions whose names were significantly out of alignment with their
+stated purpose have been renamed. All exceptions dealing with errors
+will be named with an "Error" suffix.
-Lastly, any renaming to make an exception's use more obvious from its name was done.
-Having to look up what an exception is meant to be used for because the name does not proper reflect its usage is annoying and slows down debugging.
-Having a proper name also makes debugging easier on new programmers.
-But for simplicity of existing user's and for transitioning to Python 3.0, only exceptions whose names were fairly out of alignment with their stated purpose have been renamed.
-It was also made sure the exceptions dealing with errors had the "Error"
-suffix.
New Hierarchy
=============
-.. note:: exceptions flagged as "stricter inheritance" means that the class no
- longer inherits from a certain class; "broader inheritance" means a class has
- been added to the exception's inheritance tree
+.. Note:: Exceptions flagged with "stricter inheritance" will no
+ longer inherit from a certain class. A "broader inheritance" flag
+ means a class has been added to the exception's inheritance tree.
-::
+.. parsed-literal::
- BaseException
- +-- CriticalError (new)
- +-- KeyboardInterrupt (stricter inheritance)
- +-- MemoryError (stricter inheritance)
- +-- SystemError (stricter inheritance)
- +-- ControlFlowException (new)
- +-- GeneratorExit (defined in PEP 342 [PEP342]_)
- +-- StopIteration (stricter inheritance)
- +-- SystemExit (stricter inheritance)
- +-- Exception
- +-- StandardError
- +-- ArithmeticError
- +-- DivideByZeroError
- +-- FloatingPointError
- +-- OverflowError
- +-- AssertionError
- +-- AttributeError
- +-- EnvironmentError
- +-- IOError
- +-- EOFError (broader inheritance)
- +-- OSError
- +-- ImportError
- +-- LookupError
- +-- IndexError
- +-- KeyError
- +-- NamespaceError (rename of NameError)
- +-- UnboundFreeError (new)
- +-- UnboundGlobalError (new)
- +-- UnboundLocalError
- +-- NotImplementedError (stricter inheritance)
- +-- SyntaxError
- +-- IndentationError
- +-- TabError
- +-- TypeError
- +-- UserError (rename of RuntimeError)
- +-- UnicodeError
- +-- UnicodeDecodeError
- +-- UnicodeEncodeError
- +-- UnicodeTranslateError
- +-- ValueError
- +-- Warning
- +-- AnyDeprecationWarning (new; broader inheritance for subclasses)
- +-- PendingDeprecationWarning
- +-- DeprecationWarning
- +-- FutureWarning
- +-- SyntaxWarning
- +-- SemanticsWarning (rename of RuntimeWarning)
- +-- UserWarning
- +-- WeakReferenceError (rename of ReferenceError)
+ BaseException
+ +-- CriticalError (new)
+ +-- KeyboardInterrupt (stricter inheritance)
+ +-- MemoryError (stricter inheritance)
+ +-- SystemError (stricter inheritance)
+ +-- ControlFlowException (new)
+ +-- GeneratorExit (defined in PEP 342 [#PEP342]_)
+ +-- StopIteration (stricter inheritance)
+ +-- SystemExit (stricter inheritance)
+ +-- Exception
+ +-- StandardError
+ +-- ArithmeticError
+ +-- DivideByZeroError
+ +-- FloatingPointError
+ +-- OverflowError
+ +-- AssertionError
+ +-- AttributeError
+ +-- EnvironmentError
+ +-- IOError
+ +-- EOFError (broader inheritance)
+ +-- OSError
+ +-- ImportError
+ +-- LookupError
+ +-- IndexError
+ +-- KeyError
+ +-- NamespaceError (renamed from NameError)
+ +-- UnboundFreeError (new)
+ +-- UnboundGlobalError (new)
+ +-- UnboundLocalError
+ +-- NotImplementedError (stricter inheritance)
+ +-- SyntaxError
+ +-- IndentationError
+ +-- TabError
+ +-- TypeError
+ +-- UserError (renamed from RuntimeError)
+ +-- UnicodeError
+ +-- UnicodeDecodeError
+ +-- UnicodeEncodeError
+ +-- UnicodeTranslateError
+ +-- ValueError
+ +-- Warning
+ +-- AnyDeprecationWarning (new; broader inheritance for subclasses)
+ +-- PendingDeprecationWarning
+ +-- DeprecationWarning
+ +-- FutureWarning
+ +-- SyntaxWarning
+ +-- SemanticsWarning (renamed from RuntimeWarning)
+ +-- UserWarning
+ +-- WeakReferenceError (renamed from ReferenceError)
Differences Compared to Python 2.4
==================================
-Changes to exceptions from Python 2.4 can take shape in three forms: removal, renaming, or change in the inheritance tree.
-There are also new exceptions introduced in the proposed hierarchy.
+Changes to exceptions from Python 2.4 can take shape in three forms:
+removal, renaming, or change of position in the hierarchy. There are
+also new exceptions introduced in the proposed hierarchy.
-In terms of new exceptions, almost all are to flesh out the inheritance tree.
-Those that are leaf classes are to alleaviate overloading the use of another exception.
+In terms of new exceptions, almost all are added to flesh out the
+inheritance tree. Those that are leaf classes are added to alleviate
+the overloading of another exception.
-Inheritance change can be broader or more restrictive. The broader inheritance
-typically occurred to allow for a more reasonable superclass to group related
-exceptions together. Stricter inheritance happened when the pre-existing
-inheritance was deemed incorrect and needed correction.
+Positional changes result in either broader or more restrictive
+inheritance. The broader inheritance typically occurred to allow for
+a more reasonable superclass to group related exceptions together.
+Stricter inheritance happened when the pre-existing inheritance was
+deemed incorrect and needed correction.
New Exceptions
@@ -156,20 +178,20 @@
CriticalError
'''''''''''''
-The superclass for exceptions for which a severe error has occurred that one
-would not want to recover from.
-The name is meant to reflect that these exceptions are raised asynchronously by
-the interpreter when a critical event has occured that one would most likely
-want the interpreter to halt over.
+The superclass for severe error exceptions; typically, one would not
+want to recover from such an exception. The name is meant to reflect
+that these exceptions are raised asynchronously by the interpreter
+when a critical event has occurred.
ControlFlowException
''''''''''''''''''''
-This exception exists as a superclass for all exceptions that directly deal
-with control flow.
-Inheriting from BaseException instead of Exception prevents them from being caught accidently when one wants to catch errors.
-The name, by not mentioning "Error", does not lead to one to confuse the subclasses as errors.
+This exception exists as a superclass for all exceptions that directly
+deal with control flow. Inheriting from BaseException instead of
+Exception prevents them from being caught accidentally when one wants
+to catch errors. The name, by not mentioning "Error", does not lead
+one to confuse the subclasses with errors.
UnboundGlobalError
@@ -187,13 +209,12 @@
AnyDeprecationWarning
'''''''''''''''''''''
-A common superclass for all deprecation-related exceptions.
-While having DeprecationWarning inherit from PendingDeprecationWarning was
-suggested because a DeprecationWarning can be viewed as a
-PendingDeprecationWarning that is happening now, the logic was not agreed upon
-by a majority.
-But since the exceptions are related, creating a common superclass is
-warranted.
+A common superclass for all deprecation-related exceptions. While
+having DeprecationWarning inherit from PendingDeprecationWarning was
+suggested (a DeprecationWarning can be viewed as a
+PendingDeprecationWarning that is happening now), the logic was not
+agreed upon by a majority. But since the exceptions are related,
+creating a common superclass is warranted.
Removed Exceptions
@@ -211,85 +232,91 @@
RuntimeError
''''''''''''
-Renamed UserError.
+Renamed to UserError.
-Meant for use as a generic exception to be used when one does not want to
-create a new exception class but do not want to raise an exception that might
-be caught based on inheritance, RuntimeError is poorly named.
-It's name in Python 2.4 seems to suggest an error that occurred at runtime,
-possibly an error in the VM.
-Renaming the exception to UserError more clearly states the purpose for
-the exception as quick-and-dirty error exception for the user to use.
-The name also keeps it in line with UserWarning.
+Meant as a generic exception for use when neither a new exception
+class nor inheritance-based exception catching is desired,
+RuntimeError is poorly named. Its name in Python 2.4 seems to suggest
+an error that occurred at runtime, possibly an error in the VM.
+Renaming the exception to UserError more clearly states the purpose of
+the exception as a quick-and-dirty error exception. The name also
+keeps it in line with UserWarning.
-If a user wants an exception that is not to be used as an error, raising
-BaseException directly should be sufficient as Exception, as UserError inherits
-from, is only used for errors.
+If a user wants a non-error exception, raising BaseException directly
+should be sufficient since Exception, which UserError inherits from,
+is only used for errors.
ReferenceError
''''''''''''''
-Renamed WeakReferenceError.
+Renamed to WeakReferenceError.
-ReferenceError was added to the built-in exception hierarchy in Python 2.2
-[exceptions-stdlib]_.
-Taken directly from the ``weakref`` module, its name comes directly from its original name when it resided in the module.
-Unfortunately its name does not suggest its connection to weak references and thus deserves a renaming.
+ReferenceError was added to the built-in exception hierarchy in Python
+2.2 [#exceptions-stdlib]_. Its name comes directly from the time when
+it resided in the ``weakref`` module. Unfortunately its name does not
+suggest its connection to weak references and thus deserves a
+renaming.
NameError
'''''''''
-Renamed NamespaceError.
+Renamed to NamespaceError.
While NameError suggests its common use, it is not entirely apparent.
-Making it more of a superclass for namespace-related exceptions warrants a
-renaming to make it abundantly clear its use.
-Plus the documentation of the exception module[exceptions-stdlib]_ states that it is actually meant for global names and not for just any exception.
+Making it a superclass for namespace-related exceptions warrants a
+renaming to make its use abundantly clear. Plus the documentation of
+the exception module [#exceptions-stdlib]_ states that it was actually
+meant for global names and not for just any exception.
RuntimeWarning
''''''''''''''
-Renamed SemanticsWarning.
+Renamed to SemanticsWarning.
-RuntimeWarning is to represent semantic changes coming in the future.
-But while saying that affects "runtime" is true, flat-out stating it is a semantic change is much clearer, eliminating any possible association of "runtime" with the virtual machine specifically.
+RuntimeWarning is supposed to represent semantic changes coming in the
+future. But while saying that it affects the "runtime" is true,
+flat-out stating that it is a semantic change is much clearer,
+eliminating any possible association of the term "runtime" with the
+virtual machine.
-Changed Inheritance
--------------------
+Change of Position in the Exception Hierarchy
+---------------------------------------------
KeyboardInterrupt, MemoryError, and SystemError
'''''''''''''''''''''''''''''''''''''''''''''''
-Inherit from CriticalError instead of Exception.
+Inherit from CriticalError instead of from Exception.
-The three above-mentioned exceptions are not standard errors by any means.
-They are raised asynchronously by the interpreter when something specific has
+These three exceptions are not standard errors by any means. They are
+raised asynchronously by the interpreter when something specific has
occurred. Thus they warrant not inheriting from Exception but from an
-entirely separate exception that will not be caught by a bare ``except``
-clause.
+entirely separate exception that will not be caught by a bare
+``except`` clause.
StopIteration and SystemExit
''''''''''''''''''''''''''''
-Inherit from ControlFlowException instead of Exception.
+Inherit from ControlFlowException instead of from Exception.
-By having these exceptions no longer inherit from Exception they will not be
-accidentally caught by a bare ``except`` clause.
+By having these exceptions no longer inherit from Exception they will
+not be accidentally caught by a bare ``except`` clause.
NotImplementedError
'''''''''''''''''''
-Inherits from Exception instead of RuntimeError (renamed UserError).
+Inherits from Exception instead of from RuntimeError (renamed to
+UserError).
-Originally inheriting from RuntimeError, NotImplementedError does not have any
-direct relation to the exception meant for use in user code as a
-quick-and-dirty exception. Thus it now directly inherits from Exception.
+Originally inheriting from RuntimeError, NotImplementedError does not
+have any direct relation to the exception meant for use in user code
+as a quick-and-dirty exception. Thus it now directly inherits from
+Exception.
EOFError
@@ -297,49 +324,53 @@
Subclasses IOError.
-Since an EOF comes from I/O it only makes sense that it be considered an I/O error.
+Since an EOF comes from I/O it only makes sense that it be considered
+an I/O error.
Required Superclass for ``raise``
=================================
-By requiring all objects passed to a ``raise`` statement inherit from a specific superclass, one is guaranteed that all exceptions will have certain attributes.
-If PEP 342 [PEP344]_ is accepted, the attributes outlined there will be guaranteed to be on all exceptions raised.
-This should help facilitate debugging by making the querying of information from exceptions much easier.
+By requiring all objects passed to a ``raise`` statement to inherit
+from a specific superclass, all exceptions are guaranteed to have
+certain attributes. If PEP 344 [#PEP344]_ is accepted, the attributes
+outlined there will be guaranteed to be on all exceptions raised.
+This should help facilitate debugging by making the querying of
+information from exceptions much easier.
-The proposed hierarchy has BaseException as the required class that one must inherit from.
+The proposed hierarchy has BaseException as the required base class.
Implementation
--------------
-Enforcement is straight-forward.
-Modifying ``RAISE_VARARGS`` to do an inheritance check first before raising
-an exception should be enough. For the C API, all functions that set an
-exception will have the same inheritance check.
+Enforcement is straightforward. Modifying ``RAISE_VARARGS`` to do an
+inheritance check first before raising an exception should be enough.
+For the C API, all functions that set an exception will have the same
+inheritance check applied.
-Bare ``except`` Clauses Catching Exception Only
-===============================================
+Bare ``except`` Clauses Catching ``Exception`` Only
+===================================================
-While Python does have its "explicit is better than implicit" tenant, it is not
-necessary if a default behavior is reasonable. In the case of a bare
-``except`` clause, changing the behavior makes it quite reasonable to have
-around.
+While Python does have its "explicit is better than implicit" tenet,
+it is not necessary if there is a reasonable default behavior.
+Changing the behavior of a bare ``except`` clause makes its existence
+quite reasonable.
-In Python 2.4, a bare ``except`` clause will catch all exceptions. Typically,
-though, this is not what is truly desired.
-More often than not one wants to catch all error exceptions that do not signify
-a bad state of the interpreter. In the new exception hierarchy this is
-embodied by Exception. Thus bare ``except`` clauses will catch only
-exceptions inheriting from Exception.
+In Python 2.4, a bare ``except`` clause will catch any and all
+exceptions. Typically, though, this is not what is truly desired.
+More often than not one wants to catch all error exceptions that do
+not signify a "bad" interpreter state. In the new exception hierarchy
+this condition is embodied by Exception. Thus bare ``except``
+clauses will catch only exceptions inheriting from Exception.
Implementation
--------------
-In the compiler, when a bare ``except`` clause is reached, the code for
-``except Exception`` will be emitted.
+In the compiler, when a bare ``except`` clause is reached, the code
+for ``except Exception`` will be emitted.
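The two rules in this hunk (a required superclass for ``raise``, and handlers that stop at Exception) can be tried out in Python 3 as it eventually shipped, which has BaseException and Exception but not the CriticalError or ControlFlowException classes proposed here. The following is a minimal sketch under that assumption; the function names are hypothetical, and an explicit ``except Exception`` stands in for the proposed meaning of a bare ``except``.

```python
def swallow_errors(action):
    """Run action and report any exception that inherits from Exception.

    Exceptions that inherit only from BaseException (SystemExit, for
    example) are not caught here and keep propagating.
    """
    try:
        action()
    except Exception as exc:
        return "caught " + type(exc).__name__
    return "no error"


def divide():
    return 1 / 0  # raises ZeroDivisionError, an Exception subclass


def exit_now():
    raise SystemExit(1)  # inherits from BaseException, not Exception


def raise_non_exception():
    # A str does not inherit from BaseException, so Python 3 refuses
    # to raise it and raises TypeError instead.
    raise "a string"
```

Here swallow_errors(divide) and swallow_errors(raise_non_exception) both return a "caught ..." string, while swallow_errors(exit_now) lets SystemExit continue up the stack.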
Transition Plan
@@ -351,53 +382,54 @@
New Exceptions
''''''''''''''
-New exceptions can simply be added to the built-in namespace.
-Any pre-existing objects with the same name will mask the new exceptions,
+New exceptions can simply be added to the built-in namespace. Any
+pre-existing objects with the same name will mask the new exceptions,
preserving backwards-compatibility.
Renamed Exceptions
''''''''''''''''''
-Renamed exceptions will directly subclass the new names.
-When the old exceptions are instantiated (which occurs when an exception is
-caught, either by a ``try`` statement or by propagating to the top of the
+Renamed exceptions will directly subclass the new names. When the old
+exceptions are instantiated (which occurs when an exception is caught,
+either by a ``try`` statement or by propagating to the top of the
execution stack), a PendingDeprecationWarning will be raised.
-This should properly preserve backwards-compatibility as old usage won't change
-and the new names can be used to also catch exceptions using the old name.
-The warning of the deprecation is also kept simple.
+This should properly preserve backwards-compatibility as old usage
+won't change and the new names can also be used to catch exceptions
+using the old names. The warning of the deprecation is also kept
+simple.
New Inheritance for Old Exceptions
''''''''''''''''''''''''''''''''''
-Using multiple inheritance to our advantage, exceptions whose inheritance is
-now more resrictive can be made backwards-compatible.
+Using multiple inheritance to our advantage, exceptions whose
+inheritance is now more restrictive can be made backwards-compatible.
By inheriting from both the new superclasses as well as the original
-superclasses existing ``except`` clauses will continue to work as before while
-allowing the new inheritance to be used for new clauses.
+superclasses, existing ``except`` clauses will continue to work as
+before while allowing the new inheritance to be used for new code.
-A PendingDeprecationWarning will be raised based on whether the bytecode
-``COMPARE_OP(10)`` results in an exception being caught that would not have
-under the new hierarchy. This will require hard-coding in the implementation
-of the bytecode.
+A PendingDeprecationWarning will be raised based on whether the
+bytecode ``COMPARE_OP(10)`` results in an exception being caught that
+would not have under the new hierarchy. This will require hard-coding
+in the implementation of the bytecode.
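The two class-level mechanisms of the transition plan (old names subclassing new names, and dual inheritance for moved exceptions) can be sketched with ordinary classes. Every class name below is a hypothetical stand-in, not one of the proposed built-ins, and the PendingDeprecationWarning side of the plan is left out because it would need interpreter support.

```python
class Critical(BaseException):
    """Hypothetical stand-in for a new superclass outside Exception."""


class OldParent(Exception):
    """Hypothetical stand-in for the superclass used before the move."""


class Transitional(Critical, OldParent):
    """A moved exception that keeps its original superclass for now."""


class NewName(Exception):
    """Hypothetical new name of a renamed exception."""


class OldName(NewName):
    """The old name, kept as a subclass of the new name."""


def caught_by(exc_class, handler_class):
    """Return True if raising exc_class() is caught by handler_class."""
    try:
        raise exc_class()
    except handler_class:
        return True
    except BaseException:
        return False
```

With these definitions, caught_by(Transitional, OldParent) and caught_by(Transitional, Critical) are both true, so old and new handlers keep working. Likewise caught_by(OldName, NewName) is true, but caught_by(NewName, OldName) is false: code that raises the new name is not seen by handlers written for the old one.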
Removed Exceptions
''''''''''''''''''
-Exceptions scheduled for removal will be transitioned much like the old names
-of renamed exceptions.
-Upon instantiation a PendingDeprecationWarning will be raised stating the the
-exception is due to be removed by Python 3.0 .
+Exceptions scheduled for removal will be transitioned much like the
+old names of renamed exceptions. Upon instantiation a
+PendingDeprecationWarning will be raised stating that the exception is
+due for removal in Python 3.0.
Required Superclass for ``raise``
---------------------------------
-A SemanticsWarning will be raised when an object is passed to ``raise`` that
-does not have the proper inheritance.
+A SemanticsWarning will be raised when an object is passed to
+``raise`` that does not have the proper inheritance.
Removal of Bare ``except`` Clauses
@@ -410,89 +442,91 @@
==============
Threads on python-dev discussing this PEP can be found at
-[python-dev-thread1]_, [python-dev-thread2]_
+[#python-dev-thread1]_ and [#python-dev-thread2]_
KeyboardInterrupt inheriting from ControlFlowException
------------------------------------------------------
KeyboardInterrupt has been a contentious point within this hierarchy.
-Some view the exception as more control flow being caused by the user.
-But with its asynchronous cause thanks to the user being able to trigger the
-exception at any point in code it has a more proper place inheriting from
-CriticalException. It also keeps the name of the exception from being "CriticalError".
+Some view the exception more as control flow being caused by the user.
+But with its asynchronous cause (the user is able to trigger the
+exception at any point in code) its proper place is inheriting from
+CriticalException. It also keeps the name of the exception from being
+"CriticalError".
Other Names for BaseException and Exception
-------------------------------------------
-Alternative names for BaseException/Exception have been Raisable/Exception and
-Exception/StandardError. The former has been rejected on the basis that
-Raisable does not reflect how it is an exception well enough. The latter was
-rejected based on the fact that it did not reflect current use as the chosen
-names do.
+Alternative names for BaseException and Exception have been
+Raisable/Exception and Exception/StandardError. The former
+alternatives were rejected because "Raisable" does not reflect its
+exception nature well enough. The latter alternatives were rejected
+because they do not reflect current use.
DeprecationWarning Inheriting From PendingDeprecationWarning
------------------------------------------------------------
-Originally proposed because a DeprecationWarning can be viewed as a
-PendingDeprecationWarning that is being removed in the next version.
-But enough people thought the inheritance could logically work the other way
-the idea was dropped.
+This was originally proposed because a DeprecationWarning can be
+viewed as a PendingDeprecationWarning that is being removed in the
+next version. But since enough people thought the inheritance could
+logically work the other way around, the idea was dropped.
AttributeError Inheriting From TypeError or NameError
-----------------------------------------------------
-Viewing attributes as part of the interface of a type caused the idea of
-inheriting from TypeError.
-But that partially defeats the thinking of duck typing and thus was dropped.
+Viewing attributes as part of the interface of a type caused the idea
+of inheriting from TypeError. But that partially defeats the thinking
+of duck typing and thus the idea was dropped.
-Inheriting from NameError was suggested because objects can be viewed as having
-their own namespace that the attributes lived in and when they are not found it
-is a namespace failure. This was also dropped as a possibility since not
-everyone shared this view.
+Inheriting from NameError was suggested because objects can be viewed
+as having their own namespace where the attributes live and when an
+attribute is not found it is a namespace failure. This was also
+dropped as a possibility since not everyone shared this view.
Removal of EnvironmentError
---------------------------
-Originally proposed based on the idea that EnvironmentError was an unneeded
-distinction, the BDFL overruled this idea [python-dev4]_.
+Originally proposed based on the idea that EnvironmentError was an
+unneeded distinction, the BDFL overruled this idea [#python-dev4]_.
Introduction of MacError and UnixError
--------------------------------------
-Proposed to add symmetry to WindowsError, the BDFL said they won't be used
-enough [python-dev4]_. The idea of then removing WindowsError was proposed and
-accepted as reasonable, thus completely negating the idea of adding these
-exceptions.
+Proposed to add symmetry to WindowsError, the BDFL said they won't be
+used enough [#python-dev4]_. The idea of then removing WindowsError
+was proposed and accepted as reasonable, thus completely negating the
+idea of adding these exceptions.
SystemError Subclassing SystemExit
----------------------------------
-Proposed because a SystemError is meant to lead to a system exit, the idea was
-removed since CriticalException signifies this better.
+Proposed because a SystemError is meant to lead to a system exit, the
+idea was removed since CriticalException indicates this better.
ControlFlowException Under Exception
------------------------------------
-It has been suggested that ControlFlowException inherit from Exception.
-This idea has been rejected based on the thinking that control flow exceptions
-are typically not desired to be caught in a generic fashion as Exception
-will usually be used.
+It has been suggested that ControlFlowException should inherit from
+Exception. This idea has been rejected based on the thinking that
+control flow exceptions typically should not be caught by bare
+``except`` clauses, whereas Exception subclasses should be.
Removal of Bare ``except`` Clauses
----------------------------------
-The suggestion has been made to remove bare ``except`` clauses in the name of
-"explicit is better than implicit". But Guido has said this is too weak of an
-argument since other things in Python has default behavior [python-dev3]_.
+The suggestion has been made to remove bare ``except`` clauses
+altogether, in the name of "explicit is better than implicit". But
+Guido has said this is too weak of an argument since other areas of
+Python have default behavior [#python-dev3]_.
Open Issues
@@ -501,63 +535,76 @@
Remove ControlFlowException?
----------------------------
-It has been suggested that ControlFlowException is not needed.
-Since the desire to catch any control flow exception will be atypical, the
+It has been suggested that ControlFlowException is not needed. Since
+the desire to catch any control flow exception will be atypical, the
suggestion is to just remove the exception and let the exceptions that
-inherited from it inherit directly from BaseException. This still preserves the
-separation from Exception which is one of the driving factors behind the
-introduction of the exception.
+inherited from it inherit directly from BaseException. This still
+preserves the separation from Exception which is one of the driving
+factors behind the introduction of ControlFlowException.
Acknowledgements
================
-Thanks to Robert Brewer, Josiah Carlson, Nick Coghlan, Timothy Delaney, Jack Diedrich, Fred L. Drake, Jr., Philip J. Eby, Greg Ewing, James Y. Knight, MA Lemburg, Guido van Rossum, Stephen J. Turnbull and everyone else I missed for participating in the discussion.
+Thanks to Robert Brewer, Josiah Carlson, Nick Coghlan, Timothy
+Delaney, Jack Diedrich, Fred L. Drake, Jr., Philip J. Eby, Greg Ewing,
+James Y. Knight, MA Lemburg, Guido van Rossum, Stephen J. Turnbull and
+everyone else I missed for participating in the discussion.
References
==========
-.. [PEP342] PEP 342 (Coroutines via Enhanced Generators)
- (http://www.python.org/peps/pep-0342.html)
-
-.. [PEP344] PEP 344 (Exception Chaining and Embedded Tracebacks)
- (http://www.python.org/peps/pep-0344.html)
+.. [#PEP342] PEP 342 (Coroutines via Enhanced Generators)
+ (http://www.python.org/peps/pep-0342.html)
-.. [exceptionsmodules] 'exceptions' module
- (http://docs.python.org/lib/module-exceptions.html)
+.. [#PEP344] PEP 344 (Exception Chaining and Embedded Tracebacks)
+ (http://www.python.org/peps/pep-0344.html)
-.. [Summary2004-08-01] python-dev Summary (An exception is an exception, unless it doesn't inherit from Exception)
- (http://www.python.org/dev/summary/2004-08-01_2004-08-15.html#an-exception-i…)
+.. [#Summary2004-08-01] python-dev Summary (An exception is an
+ exception, unless it doesn't inherit from Exception)
+ (http://www.python.org/dev/summary/2004-08-01_2004-08-15.html#an-exception-i…)
-.. [Summary2004-09-01] python-dev Summary (Cleaning the Exception House)
- (http://www.python.org/dev/summary/2004-09-01_2004-09-15.html#cleaning-the-e…)
+.. [#Summary2004-09-01] python-dev Summary (Cleaning the Exception House)
+ (http://www.python.org/dev/summary/2004-09-01_2004-09-15.html#cleaning-the-e…)
-.. [python-dev1] python-dev email (Exception hierarchy)
- (http://mail.python.org/pipermail/python-dev/2004-August/047908.html)
+.. [#python-dev1] python-dev email (Exception hierarchy)
+ (http://mail.python.org/pipermail/python-dev/2004-August/047908.html)
-.. [python-dev2] python-dev email (Dangerous exceptions)
- (http://mail.python.org/pipermail/python-dev/2004-September/048681.html)
+.. [#python-dev2] python-dev email (Dangerous exceptions)
+ (http://mail.python.org/pipermail/python-dev/2004-September/048681.html)
-.. [python-dev3] python-dev email (PEP, take 2: Exception Reorganization for Python 3.0)
- (http://mail.python.org/pipermail/python-dev/2005-August/055116.html)
+.. [#python-dev3] python-dev email (PEP, take 2: Exception
+ Reorganization for Python 3.0)
+ (http://mail.python.org/pipermail/python-dev/2005-August/055116.html)
-.. [exceptions-stdlib] exceptions module
- (http://www.python.org/doc/2.4.1/lib/module-exceptions.html)
+.. [#exceptions-stdlib] exceptions module
+ (http://docs.python.org/lib/module-exceptions.html)
-.. [python-dev-thread1] python-dev thread (Pre-PEP: Exception Reorganization for Python 3.0)
- (http://mail.python.org/pipermail/python-dev/2005-July/055020.html
- ,
- http://mail.python.org/pipermail/python-dev/2005-August/055065.html)
+.. [#python-dev-thread1] python-dev thread (Pre-PEP: Exception
+ Reorganization for Python 3.0)
+ (http://mail.python.org/pipermail/python-dev/2005-July/055020.html,
+ http://mail.python.org/pipermail/python-dev/2005-August/055065.html)
-.. [python-dev-thread2] python-dev thread (PEP, take 2: Exception Reorganization for Python 3.0)
- (http://mail.python.org/pipermail/python-dev/2005-August/055103.html)
+.. [#python-dev-thread2] python-dev thread (PEP, take 2: Exception
+ Reorganization for Python 3.0)
+ (http://mail.python.org/pipermail/python-dev/2005-August/055103.html)
-.. [python-dev4] python-dev email (Pre-PEP: Exception Reorganization for Python 3.0)
- (http://mail.python.org/pipermail/python-dev/2005-July/055019.html)
+.. [#python-dev4] python-dev email (Pre-PEP: Exception Reorganization
+ for Python 3.0)
+ (http://mail.python.org/pipermail/python-dev/2005-July/055019.html)
Copyright
=========
This document has been placed in the public domain.
+
+
+..
+ Local Variables:
+ mode: indented-text
+ indent-tabs-mode: nil
+ sentence-end-double-space: t
+ fill-column: 70
+ End:
python/nondist/peps pep-0349.txt, NONE, 1.1 pep-0000.txt, 1.338, 1.339
by nascheme@users.sourceforge.net 04 Aug '05
Update of /cvsroot/python/python/nondist/peps
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv23371
Modified Files:
pep-0000.txt
Added Files:
pep-0349.txt
Log Message:
Add PEP 349.
--- NEW FILE: pep-0349.txt ---
PEP: 349
Title: Generalised String Coercion
Version: $Revision: 1.1 $
Last-Modified: $Date: 2005/08/05 02:59:00 $
Author: Neil Schemenauer <nas(a)arctrix.com>
Status: Draft
Type: Standards Track
Content-Type: text/plain
Created: 02-Aug-2005
Post-History:
Python-Version: 2.5
Abstract
This PEP proposes the introduction of a new built-in function,
text(), that provides a way of generating a string representation
of an object. This function would make it easier to write library
code that processes string data without forcing the use of a
particular string type.
Rationale
Python has had a Unicode string type for some time now but use of
it is not yet widespread. There is a large amount of Python code
that assumes that string data is represented as str instances.
The long term plan for Python is to phase out the str type and use
unicode for all string data. Clearly, a smooth migration path
must be provided.
We need to upgrade existing libraries, written for str instances,
so that they can operate in an all-unicode string world. We can't
change to an all-unicode world until all essential libraries are
capable of it. Upgrading the libraries in one shot does not seem
feasible. A more realistic strategy is to make the libraries
capable of operating on unicode strings one at a time, while
preserving their current behaviour in an all-str environment.
First, we need to be able to write code that can accept unicode
instances without attempting to coerce them to str instances. Let
us label such code as Unicode-safe. Unicode-safe libraries can be
used in an all-unicode world.
Second, we need to be able to write code that, when provided only
str instances, will not create unicode results. Let us label such
code as str-stable. Libraries that are str-stable can be used by
libraries and applications that are not yet Unicode-safe.
Sometimes it is simple to write code that is both str-stable and
Unicode-safe. For example, the following function just works:
    def appendx(s):
        return s + 'x'
That's not too surprising since the unicode type is designed to
make the task easier. The principle is that when str and unicode
instances meet, the result is a unicode instance. One notable
difficulty arises when code requires a string representation of an
object, an operation traditionally accomplished by using the str()
built-in function.
Using str() makes the code not Unicode-safe. Replacing a str()
call with a unicode() call makes the code not str-stable. Using a
string format almost accomplishes the goal but not quite.
Consider the following code:
    def text(obj):
        return '%s' % obj
It behaves as desired except if 'obj' is not a basestring instance
and needs to return a Unicode representation of itself. In that
case, the string format will attempt to coerce the result of
__str__ to a str instance. Defining a __unicode__ method does not
help since it will only be called if the left-hand operand (the
format string) is a unicode instance. Using a unicode instance for
the format string does not work because the function is no longer
str-stable (i.e. it will coerce everything to unicode).
Specification
A Python implementation of the text() built-in follows:
    def text(s):
        """Return a nice string representation of the object. The
        return value is a basestring instance.
        """
        if isinstance(s, basestring):
            return s
        r = s.__str__()
        if not isinstance(r, basestring):
            raise TypeError('__str__ returned non-string')
        return r
Note that it is currently possible, although not very useful, to
write __str__ methods that return unicode instances.
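As a rough illustration of how the function above is meant to behave,
here is a runnable sketch. It assumes the TypeError check is meant to
apply to the value returned by __str__. The fallback that binds
basestring to str on Python 3, and the Celsius and Broken classes, are
hypothetical additions for this example and are not part of the
proposal.

```python
try:
    basestring  # Python 2: common base class of str and unicode
except NameError:
    basestring = str  # assumption: stand-in so the sketch runs on Python 3


def text(s):
    """Return a string representation of s without forcing a string type."""
    if isinstance(s, basestring):
        # Strings of either type pass through unchanged.
        return s
    r = s.__str__()
    if not isinstance(r, basestring):
        raise TypeError('__str__ returned non-string')
    return r


class Celsius(object):
    """Hypothetical class with an ordinary __str__ method."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __str__(self):
        return '%d C' % self.degrees


class Broken(object):
    """Hypothetical class whose __str__ wrongly returns an integer."""

    def __str__(self):
        return 42


def is_rejected(obj):
    """Report whether text() refuses obj with a TypeError."""
    try:
        text(obj)
    except TypeError:
        return True
    return False
```

Under these assumptions, text(Celsius(21)) yields '21 C', an existing
string is returned as the same object, and is_rejected(Broken()) is
true.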
The %s format specifier for str objects would be changed to call
text() on the argument. Currently it calls str() unless the
argument is a unicode instance (in which case the object is
substituted as is and the % operation returns a unicode instance).
The following function would be added to the C API and would be the
equivalent of the text() function:
    PyObject *PyObject_Text(PyObject *o);
A reference implementation is available on Sourceforge [1] as a
patch.
Backwards Compatibility
The change to the %s format specifier would result in some %
operations returning a unicode instance rather than raising a
UnicodeDecodeError exception. It seems unlikely that the change
would break currently working code.
Alternative Solutions
Rather than adding the text() built-in, if PEP 246 were
implemented then adapt(s, basestring) could be equivalent to
text(s). The advantage would be one less built-in function. The
problem is that PEP 246 is not implemented.
Fredrik Lundh has suggested [2] that perhaps a new slot should be
added (e.g. __text__), that could return any kind of string that's
compatible with Python's text model. That seems like an
attractive idea but many details would still need to be worked
out.
Instead of providing the text() built-in, the %s format specifier
could be changed and a string format could be used instead of
calling text(). However, it seems like the operation is important
enough to justify a built-in.
Instead of providing the text() built-in, the basestring type
could be changed to provide the same functionality. That would
possibly be confusing behaviour for an abstract base type.
Some people have suggested [3] that an easier migration path would
be to change the default encoding to be UTF-8. Code that is not
Unicode-safe would then encode Unicode strings as UTF-8 and
operate on them as str instances, rather than raising a
UnicodeDecodeError exception. Other code would assume that str
instances were encoded using UTF-8 and decode them if necessary.
While that solution may work for some applications, it seems
unsuitable as a general solution. For example, some applications
get string data from many different sources and assuming that all
str instances were encoded using UTF-8 could easily introduce
subtle bugs.
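One way such a bug could arise is sketched below. This is only an
illustration, written for Python 3, where the bytes type stands in for
the Python 2 str type discussed in this PEP. The sample strings and the
decodes_as_utf8 helper are hypothetical.

```python
# Two different texts from two different sources.
latin1_data = '\u00c3\u00a9'.encode('latin-1')  # two characters, stored as Latin-1
utf8_data = '\u00e9'.encode('utf-8')            # one character, stored as UTF-8

# The raw bytes are identical, so nothing signals which encoding was used.
same_bytes = latin1_data == utf8_data

# Assuming UTF-8 silently turns the Latin-1 text into different text.
misread = latin1_data.decode('utf-8')


def decodes_as_utf8(data):
    """Report whether data happens to be valid UTF-8."""
    try:
        data.decode('utf-8')
    except UnicodeDecodeError:
        return False
    return True


# Other Latin-1 data is not valid UTF-8 at all and fails outright.
cafe_latin1 = 'caf\u00e9'.encode('latin-1')
cafe_is_utf8 = decodes_as_utf8(cafe_latin1)
```

The first case is the subtle one: the decode succeeds, so the wrong
text propagates without any exception being raised.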
References
[1] http://www.python.org/sf/1159501
[2] http://mail.python.org/pipermail/python-dev/2004-September/048755.html
[3] http://blog.ianbicking.org/illusive-setdefaultencoding.html
Copyright
This document has been placed in the public domain.
Local Variables:
mode: indented-text
indent-tabs-mode: nil
sentence-end-double-space: t
fill-column: 70
End:
Index: pep-0000.txt
===================================================================
RCS file: /cvsroot/python/python/nondist/peps/pep-0000.txt,v
retrieving revision 1.338
retrieving revision 1.339
diff -u -d -r1.338 -r1.339
--- pep-0000.txt 5 Aug 2005 00:18:51 -0000 1.338
+++ pep-0000.txt 5 Aug 2005 02:59:00 -0000 1.339
@@ -105,6 +105,7 @@
S 345 Metadata for Python Software Packages 1.2 Jones
I 347 Migrating the Python CVS to Subversion von Löwis
S 348 Exception Reorganization for Python 3.0 Cannon
+ S 349 Generalized String Coercion Schemenauer
S 754 IEEE 754 Floating Point Special Values Warnes
Finished PEPs (done, implemented in CVS)
@@ -392,6 +393,7 @@
SR 346 User Defined ("with") Statements Coghlan
I 347 Migrating the Python CVS to Subversion von Löwis
S 348 Exception Reorganization for Python 3.0 Cannon
+ S 349 Generalized String Coercion Schemenauer
SR 666 Reject Foolish Indentation Creighton
S 754 IEEE 754 Floating Point Special Values Warnes
I 3000 Python 3.0 Plans Kuchling, Cannon