Python 2.1 alpha 2 released
Jeremy Hylton
jeremy@alum.mit.edu
Fri, 2 Feb 2001 18:40:08 -0500 (EST)
While Guido was working the press circuit at the LinuxWorld Expo in New
York City, the Python developers, including the many volunteers and
the folks from PythonLabs, were busy finishing the second alpha
release of Python 2.1.
The release is currently available from SourceForge and will also be
available from python.org later today. You can find the source
release at:
http://sourceforge.net/project/showfiles.php?group_id=5470
The Windows installer will be ready shortly.
Fred Drake announced the documentation release earlier today. You can
browse the new docs online at
http://python.sourceforge.net/devel-docs/
or download them from
ftp://ftp.python.org/pub/python/doc/2.1a2/
Please give it a good try! The only way Python 2.1 can become a
rock-solid product is if people test the alpha releases. If you are
using Python for demanding applications or on extreme platforms, we
are particularly interested in hearing your feedback. Are you
embedding Python or using threads? Please test your application using
Python 2.1a2! Please submit all bug reports through SourceForge:
http://sourceforge.net/bugs/?group_id=5470
Here's the NEWS file:
What's New in Python 2.1 alpha 2?
=================================
Core language, builtins, and interpreter
- Scopes nest. If a name is used in a function or class, but is not
local, the definition in the nearest enclosing function scope will
be used. One consequence of this change is that lambda expressions
can reference variables in the namespaces where the lambda is
defined. In some unusual cases, this change will break code.
In all previous versions of Python, names were resolved in exactly
three namespaces -- the local namespace, the global namespace, and
the builtin namespace. According to this old definition, if a
function A is defined within a function B, the names bound in B are
not visible in A. The new rules make names bound in B visible in A,
unless A contains a name binding that hides the binding in B.
Section 4.1 of the reference manual describes the new scoping rules
in detail. The test script in Lib/test/test_scope.py demonstrates
some of the effects of the change.
The new rules will cause existing code to break if it defines nested
functions where an outer function has local variables with the same
name as globals or builtins used by the inner function. Example:
    def munge(str):
        def helper(x):
            return str(x)
        if type(str) != type(''):
            str = helper(str)
        return str.strip()
Under the old rules, the name str in helper() is bound to the
builtin function str(). Under the new rules, it will be bound to
the argument named str and an error will occur when helper() is
called.
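The breakage described above can be reproduced on any interpreter that has nested scopes. The sketch below assumes a modern Python; munge_fixed and its parameter name text are hypothetical names used only for illustration, showing one possible way to avoid the clash by renaming the parameter.

```python
def munge(str):
    # The parameter 'str' is visible inside helper() under nested scopes,
    # so it hides the builtin str() there as well.
    def helper(x):
        return str(x)
    if type(str) != type(''):
        str = helper(str)
    return str.strip()


def munge_fixed(text):
    # Hypothetical fix: with the parameter renamed, helper() finds the
    # builtin str() again.
    def helper(x):
        return str(x)
    if type(text) != type(''):
        text = helper(text)
    return text.strip()


try:
    munge(42)
    broke = False
except TypeError:
    # helper() tried to call the integer argument as if it were str()
    broke = True

print(broke)
print(munge_fixed(42))
```

Passing an actual string to munge() still works, because helper() is never called in that case; only non-string arguments trigger the error.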
- The compiler will report a SyntaxError if "from ... import *" occurs
in a function or class scope. The language reference has documented
that this case is illegal, but the compiler never checked for it.
The recent introduction of nested scope makes the meaning of this
form of name binding ambiguous. In a future release, the compiler
may allow this form when there is no possibility of ambiguity.
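A quick way to see the check in action is to hand the compiler a small piece of source. This is a sketch assuming a modern interpreter, where the function-scope form is always rejected; the variable names are illustrative only.

```python
# Source text with "from ... import *" inside a function body.
source = "def f():\n    from os import *\n"

try:
    compile(source, "<example>", "exec")
    rejected = False
except SyntaxError:
    rejected = True

# The same statement at module level compiles without complaint.
module_code = compile("from os import *\n", "<example>", "exec")

print(rejected)
```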
- repr(string) is easier to read, now using hex escapes instead of octal,
and using \t, \n and \r instead of \011, \012 and \015 (respectively):
>>> "\texample \r\n" + chr(0) + chr(255)
'\texample \r\n\x00\xff' # in 2.1
'\011example \015\012\000\377' # in 2.0
- Functions are now compared and hashed by identity, not by value, since
the func_code attribute is writable.
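A small illustration of identity-based comparison, written as a sketch for a modern interpreter: two functions with identical bodies are distinct objects, so they compare unequal and occupy separate dictionary slots.

```python
def f(x):
    return x + 1


def g(x):
    # Same body as f(), but a different function object.
    return x + 1


same_identity = (f == f)
equal_by_value = (f == g)

# Both functions can be dictionary keys without colliding.
table = {f: "first", g: "second"}

print(same_identity, equal_by_value, len(table))
```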
- Weak references (PEP 205) have been added. This involves a few
changes in the core, an extension module (_weakref), and a Python
module (weakref). The weakref module is the public interface. It
includes support for "explicit" weak references, proxy objects, and
mappings with weakly held values.
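The three facilities named above can be sketched roughly as follows. The Node class is hypothetical, and the example assumes CPython, where an object is reclaimed as soon as its last strong reference goes away; gc.collect() is called only as a precaution.

```python
import gc
import weakref


class Node(object):
    def __init__(self, name):
        self.name = name


node = Node("a")

ref = weakref.ref(node)                 # "explicit" weak reference
proxy = weakref.proxy(node)             # proxy object
cache = weakref.WeakValueDictionary()   # mapping with weakly held values
cache["a"] = node

alive_name = ref().name    # calling the reference returns the object
proxy_name = proxy.name    # the proxy forwards attribute access

del node                   # drop the only strong reference
gc.collect()

dead = ref() is None       # the reference now yields None
cache_size = len(cache)    # the mapping entry has disappeared

print(alive_name, proxy_name, dead, cache_size)
```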
- A 'continue' statement can now appear in a try block within the body
of a loop. It is still not possible to use continue in a finally
clause.
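As an illustration, the hypothetical helper below uses continue inside a try block to skip unwanted items while still catching conversion errors.

```python
def parse_non_negative(items):
    """Return the non-negative integers found in items (hypothetical helper)."""
    result = []
    for item in items:
        try:
            value = int(item)
            if value < 0:
                continue          # continue inside the try block
            result.append(value)
        except ValueError:
            # Not a number at all; ignore it.
            pass
    return result


print(parse_non_negative(["1", "-2", "x", "3"]))
```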
Standard library
- mailbox.py now has a new class, PortableUnixMailbox, which is
identical to UnixMailbox but uses a more portable scheme for
determining From_ separators. Also, the constructors for all the
classes in this module have a new optional `factory' argument, which
is a callable used when new message classes must be instantiated by
the next() method.
- random.py is now self-contained, and offers all the functionality of
the now-deprecated whrandom.py. See the docs for details. random.py
also supports new functions getstate() and setstate(), for saving
and restoring the internal state of the generator; and jumpahead(n),
for quickly forcing the internal state to be the same as if n calls to
random() had been made. The latter is particularly useful for multi-
threaded programs, creating one instance of the random.Random() class for
each thread, then using .jumpahead() to force each instance to use a
non-overlapping segment of the full period.
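Saving and restoring generator state might look like the sketch below. It assumes a modern interpreter and deliberately leaves out jumpahead(), which later Python versions no longer provide; the seed value is arbitrary.

```python
import random

gen = random.Random(12345)    # one private generator instance
gen.random()                  # advance the generator a little

saved = gen.getstate()        # snapshot of the internal state
first = [gen.random() for _ in range(3)]

gen.setstate(saved)           # rewind to the snapshot
replay = [gen.random() for _ in range(3)]

print(first == replay)
```

Because each random.Random() instance carries its own state, one instance per thread keeps the threads from disturbing one another's sequences.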
- random.py's seed() function is new. For bit-for-bit compatibility with
prior releases, use the whseed function instead. The new seed function
addresses two problems: (1) The old function couldn't produce more than
about 2**24 distinct internal states; the new one about 2**45 (the best
that can be done in the Wichmann-Hill generator). (2) The old function
sometimes produced identical internal states when passed distinct
integers, and there was no simple way to predict when that would happen;
the new one guarantees to produce distinct internal states for all
arguments in [0, 27814431486576L).
- The socket module now supports raw packets on Linux. The socket
family is AF_PACKET.
- test_capi.py is a start at running tests of the Python C API. The tests
are implemented by the new Modules/_testmodule.c.
- A new extension module, _symtable, provides provisional access to the
internal symbol table used by the Python compiler. A higher-level
interface will be added on top of _symtable in a future release.
Windows changes
- Build procedure: the zlib project is built in a different way that
ensures the zlib header files used can no longer get out of synch with
the zlib binary used. See PCbuild\readme.txt for details. Your old
zlib-related directories can be deleted; you'll need to download fresh
source for zlib and unpack it into a new directory.
- Build: New subproject _test for the benefit of test_capi.py (see above).
- Build: subproject ucnhash is gone, since the code was folded into the
unicodedata subproject.
What's New in Python 2.1 alpha 1?
=================================
Core language, builtins, and interpreter
- There is a new Unicode companion to the PyObject_Str() API
called PyObject_Unicode(). It behaves in the same way as the
former, but assures that the returned value is a Unicode object
(applying the usual coercion if necessary).
- The comparison operators support "rich comparison overloading" (PEP
207). C extension types can provide a rich comparison function in
the new tp_richcompare slot in the type object. The cmp() function
and the C function PyObject_Compare() first try the new rich
comparison operators before trying the old 3-way comparison. There
is also a new C API PyObject_RichCompare() (which also falls back on
the old 3-way comparison, but does not constrain the outcome of the
rich comparison to a Boolean result).
The rich comparison function takes two objects (at least one of
which is guaranteed to have the type that provided the function) and
an integer indicating the opcode, which can be Py_LT, Py_LE, Py_EQ,
Py_NE, Py_GT, Py_GE (for <, <=, ==, !=, >, >=), and returns a Python
object, which may be NotImplemented (in which case the tp_compare
slot function is used as a fallback, if defined).
Classes can overload individual comparison operators by defining one
or more of the methods __lt__, __le__, __eq__, __ne__, __gt__,
__ge__. There are no explicit "reflected argument" versions of
these; instead, __lt__ and __gt__ are each other's reflection,
likewise for __le__ and __ge__; __eq__ and __ne__ are their own
reflection (similar at the C level). No other implications are
made; in particular, Python does not assume that == is the Boolean
inverse of !=, or that < is the Boolean inverse of >=. This makes
it possible to define types with partial orderings.
Classes or types that want to implement (in)equality tests but not
the ordering operators (i.e. unordered types) should implement ==
and !=, and raise an error for the ordering operators.
It is possible to define types whose rich comparison results are not
Boolean; e.g. a matrix type might want to return a matrix of bits
for A < B, giving elementwise comparisons. Such types should ensure
that any interpretation of their value in a Boolean context raises
an exception, e.g. by defining __nonzero__ (or the tp_nonzero slot
at the C level) to always raise an exception.
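A partial ordering of the kind mentioned above can be sketched with a hypothetical Subset class ordered by set inclusion. The example assumes a modern interpreter; only __lt__ and __le__ are defined, so > and >= are answered through the reflected methods.

```python
class Subset(object):
    """Hypothetical wrapper that orders instances by set inclusion."""

    def __init__(self, items):
        self.items = frozenset(items)

    def __eq__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items == other.items

    def __ne__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items != other.items

    def __lt__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items < other.items

    def __le__(self, other):
        if not isinstance(other, Subset):
            return NotImplemented
        return self.items <= other.items


a = Subset([1, 2])
b = Subset([2, 3])
c = Subset([1, 2, 3])

print(a < c)     # a is a proper subset of c
print(c > a)     # answered by the reflected a.__lt__(c)
print(a < b)     # neither set contains the other...
print(a >= b)    # ...so this is false as well
```

The last two results show why < is not treated as the Boolean inverse of >=: for incomparable values both can be false.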
- Complex numbers use rich comparisons to define == and != but raise
an exception for <, <=, > and >=. Unfortunately, this also means
that cmp() of two complex numbers raises an exception when the two
numbers differ. Since it is not mathematically meaningful to compare
complex numbers except for equality, I hope that this doesn't break
too much code.
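The behaviour can be checked directly. This is a sketch for a modern interpreter, where the exception raised for an ordering comparison is TypeError.

```python
a = 1 + 2j
b = 1 + 2j
c = 3 - 1j

equal = (a == b)        # equality is defined
different = (a != c)    # so is inequality

try:
    a < c               # ordering is not
    ordered = True
except TypeError:
    ordered = False

print(equal, different, ordered)
```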
- Functions and methods now support getting and setting arbitrarily
named attributes (PEP 232). Functions have a new __dict__
(a.k.a. func_dict) which holds the function attributes. Methods get
and set attributes on their underlying im_func. It is a TypeError
to set an attribute on a bound method.
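Function attributes might be used as in the sketch below. The names greet and Widget are hypothetical, and the example assumes a modern interpreter, where setting an attribute on a bound method raises AttributeError rather than TypeError; both are caught here.

```python
def greet(name):
    greet.calls += 1            # attribute stored on the function itself
    return "hello " + name


greet.calls = 0
greet("a")
greet("b")


class Widget(object):
    def render(self):
        return "widget"


# Attach an attribute to the underlying function of the method.
Widget.render.label = "draw"

w = Widget()
label_via_method = w.render.label   # read through the bound method

try:
    w.render.label = "other"        # writing through a bound method fails
    settable = True
except (AttributeError, TypeError):
    settable = False

print(greet.calls, greet.__dict__, label_via_method, settable)
```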
- The xrange() object implementation has been improved so that
xrange(sys.maxint) can be used on 64-bit platforms. There's still a
limitation that in this case len(xrange(sys.maxint)) can't be
calculated, but the common idiom "for i in xrange(sys.maxint)" will
work fine as long as the index i doesn't actually reach 2**31.
(Python uses regular ints for sequence and string indices; fixing
that is much more work.)
- Two changes to from...import:
1) "from M import X" now works even if M is not a real module; it's
basically a getattr() operation with AttributeError exceptions
changed into ImportError.
2) "from M import *" now looks for M.__all__ to decide which names to
import; if M.__all__ doesn't exist, it uses M.__dict__.keys() but
filters out names starting with '_' as before. Whether or not
__all__ exists, there's no restriction on the type of M.
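The effect of __all__ on "from M import *" can be sketched with two throwaway modules built in memory. The module names demo_with_all and demo_without_all are hypothetical, and the example assumes a modern interpreter.

```python
import sys
import types

# A module that declares __all__: exactly those names are imported,
# even one that starts with an underscore.
with_all = types.ModuleType("demo_with_all")
with_all.public = 1
with_all.also_public = 2
with_all._private = 3
with_all.__all__ = ["public", "_private"]
sys.modules["demo_with_all"] = with_all

# A module without __all__: every name not starting with '_' is imported.
without_all = types.ModuleType("demo_without_all")
without_all.public = 1
without_all._private = 3
sys.modules["demo_without_all"] = without_all

ns_all = {}
exec("from demo_with_all import *", ns_all)

ns_plain = {}
exec("from demo_without_all import *", ns_plain)

print(sorted(k for k in ns_all if k != "__builtins__"))
print(sorted(k for k in ns_plain if k != "__builtins__"))
```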
- File objects have a new method, xreadlines(). This is the fastest
way to iterate over all lines in a file:
    for line in file.xreadlines():
        ...do something to line...
See the xreadlines module (mentioned below) for how to do this for
other file-like objects.
- Even if you don't use file.xreadlines(), you may expect a speedup on
line-by-line input. The file.readline() method has been optimized
quite a bit in platform-specific ways: on systems (like Linux) that
support flockfile(), getc_unlocked(), and funlockfile(), those are
used by default. On systems (like Windows) without getc_unlocked(),
a complicated (but still thread-safe) method using fgets() is used by
default.
You can force use of the fgets() method by #define'ing
USE_FGETS_IN_GETLINE at build time (it may be faster than
getc_unlocked()).
You can force fgets() not to be used by #define'ing
DONT_USE_FGETS_IN_GETLINE (this is the first thing to try if std test
test_bufio.py fails -- and let us know if it does!).
- In addition, the fileinput module, while still slower than the other
methods on most platforms, has been sped up too, by using
file.readlines(sizehint).
- Support for run-time warnings has been added, including a new
command line option (-W) to specify the disposition of warnings.
See the description of the warnings module below.
- Extensive changes have been made to the coercion code. This mostly
affects extension modules (which can now implement mixed-type
numerical operators without having to use coercion), but
occasionally, in boundary cases the coercion semantics have changed
subtly. Since this was a terrible gray area of the language, this
is considered an improvement. Also note that __rcmp__ is no longer
supported -- instead of calling __rcmp__, __cmp__ is called with
reflected arguments.
- In connection with the coercion changes, a new built-in singleton
object, NotImplemented is defined. This can be returned for
operations that wish to indicate they are not implemented for a
particular combination of arguments. From C, this is
Py_NotImplemented.
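A typical use of NotImplemented in a mixed-type operator is sketched below with a hypothetical Meters class; the example assumes a modern interpreter.

```python
class Meters(object):
    """Hypothetical length type supporting addition with plain numbers."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        if isinstance(other, Meters):
            return Meters(self.value + other.value)
        if isinstance(other, (int, float)):
            return Meters(self.value + other)
        # Tell the interpreter this combination is not handled here,
        # so it can try the other operand or raise TypeError.
        return NotImplemented

    # Addition is commutative for this type, so reuse the same logic.
    __radd__ = __add__


total = (Meters(2) + 3).value
reflected = (3 + Meters(2)).value

try:
    Meters(2) + "three"
    supported = True
except TypeError:
    supported = False

print(total, reflected, supported)
```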
- The interpreter now accepts bytecode files on the command line even
if they do not have a .pyc or .pyo extension. On Linux, after executing
echo ':pyc:M::\x87\xc6\x0d\x0a::/usr/local/bin/python:' > /proc/sys/fs/binfmt_misc/register
any byte code file can be used as an executable (i.e. as an argument
to execve(2)).
- %[xXo] formats of negative Python longs now produce a sign
character. In 1.6 and earlier, they never produced a sign,
and raised an error if the value of the long was too large
to fit in a Python int. In 2.0, they produced a sign if and
only if too large to fit in an int. This was inconsistent
across platforms (because the size of an int varies across
platforms), and inconsistent with hex() and oct(). Example:
>>> "%x" % -0x42L
'-42' # in 2.1
'ffffffbe' # in 2.0 and before, on 32-bit machines
>>> hex(-0x42L)
'-0x42L' # in all versions of Python
The behavior of %d formats for negative Python longs remains
the same as in 2.0 (although in 1.6 and before, they raised
an error if the long didn't fit in a Python int).
%u formats don't make sense for Python longs, but are allowed
and treated the same as %d in 2.1. In 2.0, a negative long
formatted via %u produced a sign if and only if too large to
fit in an int. In 1.6 and earlier, a negative long formatted
via %u raised an error if it was too big to fit in an int.
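The signed output can be checked as follows. This sketch assumes a modern interpreter, which has a single integer type and no L suffix, but formats negative values with a sign in the same way.

```python
value = -0x42             # decimal -66

as_hex = "%x" % value     # lowercase hex with a sign
as_HEX = "%X" % value     # uppercase hex with a sign
as_oct = "%o" % value     # octal with a sign
as_dec = "%d" % value
as_u = "%u" % value       # treated the same as %d

print(as_hex, as_HEX, as_oct, as_dec, as_u)
```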
- Dictionary objects have an odd new method, popitem(). This removes
an arbitrary item from the dictionary and returns it (in the form of
a (key, value) pair). This can be useful for algorithms that use a
dictionary as a bag of "to do" items and repeatedly need to pick one
item. Such algorithms normally end up running in quadratic time;
using popitem() they can usually be made to run in linear time.
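A "to do" bag built on popitem() might look like the sketch below. The function reachable() and the sample graph are hypothetical; the order in which nodes are visited is arbitrary, so the result is sorted before it is returned.

```python
def reachable(graph, start):
    """Return the nodes reachable from start (hypothetical helper)."""
    todo = {start: None}      # dictionary used as a bag of pending nodes
    seen = {}
    while todo:
        node, _ = todo.popitem()      # pick and remove an arbitrary item
        seen[node] = None
        for nxt in graph.get(node, ()):
            if nxt not in seen and nxt not in todo:
                todo[nxt] = None
    return sorted(seen)


graph = {
    "a": ["b", "c"],
    "b": ["d"],
    "c": ["d"],
    "d": [],
    "e": ["a"],
}

print(reachable(graph, "a"))
```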
Standard library
- In the time module, the time argument to the functions strftime,
localtime, gmtime, asctime and ctime is now optional, defaulting to
the current time (in the local timezone).
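With the time argument omitted, the calls reduce to the following sketch (written for a modern interpreter; the format string is just an example).

```python
import time

stamp = time.strftime("%Y-%m-%d %H:%M:%S")   # formats the current local time
local_now = time.localtime()                  # same as localtime(time.time())
utc_now = time.gmtime()
text = time.asctime()                         # same as asctime(localtime())
also_text = time.ctime()

print(stamp)
print(text)
```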
- The ftplib module now defaults to passive mode, which is deemed a
more useful default given that clients are often inside firewalls
these days. Note that this could break if ftplib is used to connect
to a *server* that is inside a firewall, from outside; this is
expected to be a very rare situation. To fix that, you can call
ftp.set_pasv(0).
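Switching back to active mode is a one-line call. The sketch below creates an FTP object without a host, so no connection is attempted; it reads the passiveserver attribute only to show the effect, which relies on an implementation detail of ftplib as found in a modern interpreter.

```python
from ftplib import FTP

ftp = FTP()                  # no host given, so nothing is connected

# Implementation detail: the current mode is kept in 'passiveserver'.
default_passive = bool(ftp.passiveserver)

ftp.set_pasv(0)              # request active mode for later transfers
passive_after = bool(ftp.passiveserver)

print(default_passive, passive_after)
```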
- The site module now uses .pth files not only for path configuration
but also to extend the initialization code: lines starting with
import are executed.
- There's a new module, warnings, which implements a mechanism for
issuing and filtering warnings. There are some new built-in
exceptions that serve as warning categories, and a new command line
option, -W, to control warnings (e.g. -Wi ignores all warnings, -We
turns warnings into errors). warnings.warn(message[, category])
issues a warning message; this can also be called from C as
PyErr_Warn(category, message).
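Issuing and filtering a warning from Python code can be sketched as below. The function old_api() is hypothetical, and warnings.catch_warnings() is a later addition to the module that is used here only to keep the filter changes local to the example.

```python
import warnings


def old_api():
    # Issue a warning in the DeprecationWarning category.
    warnings.warn("old_api() is deprecated", DeprecationWarning)
    return 1


# Comparable to running with -Wi: the warning is ignored.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    quiet_result = old_api()

# Comparable to running with -We: the warning becomes an exception.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    try:
        old_api()
        raised = False
    except DeprecationWarning:
        raised = True

print(quiet_result, raised)
```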
- A new module xreadlines was added. This exports a single factory
function, xreadlines(). The intention is that this code is the
absolutely fastest way to iterate over all lines in an open
file(-like) object:
    import xreadlines
    for line in xreadlines.xreadlines(file):
        ...do something to line...
This is equivalent to the previous speed record holder, which used
file.readlines(sizehint). Note that if file is a real file object
(as opposed to a file-like object), this is equivalent:
    for line in file.xreadlines():
        ...do something to line...
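The readlines(sizehint) technique that xreadlines is compared to can be approximated as in the sketch below. This is not the xreadlines implementation; iter_lines() is a hypothetical helper written as a generator for a modern interpreter, and the buffer size is an arbitrary choice.

```python
import io


def iter_lines(fileobj, sizehint=8192):
    """Yield lines from fileobj, reading them in batches (hypothetical helper)."""
    while True:
        # Read roughly sizehint characters' worth of complete lines at a time.
        lines = fileobj.readlines(sizehint)
        if not lines:
            break
        for line in lines:
            yield line


sample = io.StringIO("one\ntwo\nthree\n")
collected = [line.rstrip("\n") for line in iter_lines(sample, sizehint=4)]

print(collected)
```

Any object with a readlines(sizehint) method can be passed in, which is the file-like case the module is aimed at.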
- The bisect module has new functions bisect_left, insort_left,
bisect_right and insort_right. The old names bisect and insort
are now aliases for bisect_right and insort_right. The XXX_right
and XXX_left functions differ in what happens when the new element
compares equal to one or more elements already in the list: the
XXX_left functions insert to the left, the XXX_right functions to the
right. Code that doesn't care where equal elements end up should
continue to use the old, short names ("bisect" and "insort").
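The difference between the left and right variants shows up only when equal elements are present, as in this sketch. The integer 20 and the float 20.0 compare equal, which makes the insertion position visible.

```python
import bisect

scores = [10, 20, 20, 30]

left = bisect.bisect_left(scores, 20)     # index before the equal run
right = bisect.bisect_right(scores, 20)   # index after the equal run
alias = bisect.bisect(scores, 20)         # old name, same as bisect_right

a = [10, 20, 30]
bisect.insort_left(a, 20.0)               # 20.0 lands before the existing 20

b = [10, 20, 30]
bisect.insort_right(b, 20.0)              # 20.0 lands after the existing 20

print(left, right, alias)
print(a, b)
```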
- The new curses.panel module wraps the panel library that forms part
of SYSV curses and ncurses. Contributed by Thomas Gellekum.
- The SocketServer module now sets the allow_reuse_address flag by
default in the TCPServer class.
- A new function, sys._getframe(), returns the stack frame pointer of
the caller. This is intended only as a building block for
higher-level mechanisms such as string interpolation.
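One small building block of that kind is a helper that reports the name of its caller. The functions caller_name() and outer() are hypothetical; passing a depth of 1 to sys._getframe() selects the frame one level above the current one.

```python
import sys


def caller_name():
    # Frame 0 is caller_name() itself; frame 1 is whoever called it.
    return sys._getframe(1).f_code.co_name


def outer():
    return caller_name()


print(outer())
```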
Build issues
- For Unix (and Unix-compatible) builds, configuration and building of
extension modules is now greatly automated. Rather than having to
edit the Modules/Setup file to indicate which modules should be
built and where their include files and libraries are, a
distutils-based setup.py script now takes care of building most
extension modules. All extension modules built this way are built
as shared libraries. Only a few modules that must be linked
statically are still listed in the Setup file; you won't need to
edit their configuration.
- Python should now build out of the box on Cygwin. If it doesn't,
mail to Jason Tishler (jlt63 at users.sourceforge.net).
- Python now always uses its own (renamed) implementation of getopt()
-- there's too much variation among C library getopt()
implementations.
- C++ compilers are better supported; the CXX macro is always set to a
C++ compiler if one is found.
Windows changes
- select module: By default under Windows, a select() call
can specify no more than 64 sockets. Python now boosts
this Microsoft default to 512. If you need even more than
that, see the MS docs (you'll need to #define FD_SETSIZE
and recompile Python from source).
- Support for Windows 3.1, DOS and OS/2 is gone. The Lib/dos-8x3
subdirectory is no more!
-- Jeremy Hylton <http://www.python.org/~jeremy/>